Zheyuan Zhang¹*, Keerthiram Murugesan³
¹University of Notre Dame
{zzhang42,yli62,zwang43,tma2,vgalassi,nmoniz2,nchawla,yye7}@nd.edu, nhihlle@brandeis.edu, keerthiram.murugesa@ibm.com, werner.geyer@us.ibm.com, chuxu.zhang@uconn.edu

arXiv:2412.15547v1 [cs.CL] 20 Dec 2024

Abstract

Diet plays a critical role in human health, yet tailoring dietary reasoning to individual health conditions remains a major challenge. Nutrition Question Answering (QA) has emerged as a popular method for addressing this problem. However, current research faces two critical limitations. On the one hand, the absence of datasets involving user-specific medical information severely limits personalization. This challenge is further compounded by the wide variability in individual health [...] this task, demonstrate strong reasoning abilities, they struggle with the domain-specific complexities of personalized healthy dietary reasoning, and existing benchmarks fail to capture these challenges. To address these gaps, we introduce the Nutritional Graph Question Answering (NGQA) benchmark, the first graph question answering dataset designed for personalized nutritional health reasoning. NGQA leverages data from the National Health and Nutrition [...]

Figure 1: An Overview of NGQA Benchmark (a) along with a data showcase: (b) an example of the knowledge graph used for a standard level question and (c) the [...]

1 Introduction

Diet is a cornerstone of human health, playing a pivotal role in both maintaining well-being and preventing disease. Despite the well-documented benefits of balanced nutrition, unhealthy eating habits remain alarmingly prevalent in modern society (WHO, 2021). In the United States alone, approximately 42.4% of adults are classified as obese (CDC, 2020a), and in 2017, poor dietary habits contributed to over 11 million deaths and a substantial number of disability-adjusted life-years (DALYs), often linked to factors such as excessive sodium intake (Afshin et al., 2019; WHO, 2023). These statistics underscore an urgent need to promote healthier eating habits on a societal scale. However, nutritional health requires complex domain knowledge, and there is no one-size-fits-all solution for healthy diets, as the nutritional needs of individuals can vary widely based on their health [...]

Why this benchmark matters: Numerous efforts have sought to address the challenges in personalized nutritional health, with Nutrition Question Answering (QA) emerging as a popular task (Min et al., 2022; Bondevik et al., 2024). Recent advancements in large language models (LLMs) have demonstrated significant potential in this domain, offering sophisticated reasoning capabilities to analyze and interpret nutritional information (Mavromatis and Karypis, 2024). However, these efforts remain constrained by two major limitations. First, to the best of our knowledge, no existing benchmark truly personalizes answers based on users’ specific health conditions, primarily due to the inaccessibility of individual medical data (Bölz et al., 2023). This lack of user-specific datasets [...]

To address these critical gaps and advance the understanding of healthy diet personalization, we propose the Nutritional Graph Question Answering (NGQA) benchmark. This is the first benchmark in the personalized nutritional health domain to evaluate whether a specific food is healthy for a user, supported by detailed reasoning of the key contributing nutrients. By recognizing the intricate interplay between a user’s medical conditions, dietary behaviors, and the nutrition of foods, we frame this task as a knowledge graph question answering problem. Specifically, using data from the National Health and Nutrition Examination Survey (NHANES) and the Food and Nutrient Database for Dietary Studies (FNDDS), we construct the NGQA benchmark and categorize questions into three complexity settings: sparse, standard, and [...]

[...] follows:

• Novel Benchmark for Personalized Nutrition. We present NGQA, the first benchmark to incorporate users’ medical information in a nutritional question answering task, addressing a significant research gap in the domain.

• Advancing the GraphQA Ecosystem. NGQA introduces a domain-specific benchmark and extends GraphQA benchmarks beyond datasets like WebQSP and ExplaGraphs in the general domain. This addition broadens the scope of GraphQA research, [...]

• Comprehensive Resource and Evaluation. Through extensive experiments, NGQA provides a challenging benchmark, a complete codebase supporting the full pipeline from data preprocessing to model evaluation, and [...]

2 Related Work

Question Answering in Nutritional Health Domain. Question answering has become an essential tool in the nutritional and health domain, offering a flexible framework for applications such as food recommendation (Min et al., 2022; Bondevik et al., 2024). Knowledge graphs (KGs) have been widely used to model relationships between foods, ingredients, and health, supporting tasks like ingredient substitution and adaptive dietary recommendations (Haussmann et al., 2019; Chen et al.