A new equation, developed using data from an IAEA nutrition database, is helping researchers assess the accuracy of self-reported dietary information in studies and surveys.
This equation, developed using machine learning, has revealed that close to a third of records in two widely used nutritional datasets were likely to be misreported, according to a recent scientific article published in Nature Food.
This revelation underlines the need for better methods to measure what people really eat.
Nutritional epidemiology, a field that examines the link between diet and human diseases, commonly relies on tools such as questionnaires and food diaries to assess dietary intake. However, these methods are prone to misreporting, as participants may inaccurately estimate portion sizes, misremember what they ate, intentionally misstate their consumption, or even alter their eating habits during the reporting period.
"Many nutritional epidemiology studies that try to link dietary exposure to disease outcomes are based on unreliable data, which can explain why many findings contradict each other," said John Speakman, one of the paper's authors and a professor at the Shenzhen Institute of Advanced Technology in China and the University of Aberdeen in the United Kingdom.
While the issue of misreporting and its impact on metabolic research has been recognized since the 1980s, studies continue to use these tools due to their perceived utility and the lack of practical and accessible alternatives for collecting dietary data.
Energy Expenditure Measured Using Stable Isotopic Techniques
The doubly labelled water (DLW) technique, which uses stable isotopes of hydrogen and oxygen to track their use and elimination by the human body, offers a solution. This method, considered the 'gold standard' for measuring energy expenditure in non-laboratory conditions, is non-invasive, applicable to diverse populations, and has been used to evaluate the accuracy and precision of other measurement techniques.
The IAEA Doubly Labelled Water Database, which pools data from multiple studies using this technique, now features over 12 000 measurements of daily energy expenditure from individuals ranging from pre-term infants to 90-year-olds across 45 countries. This resource has enabled groundbreaking research on human energy metabolism, including a popular and highly cited 2021 article in Science.
"The IAEA Doubly Labelled Water Database is an unparalleled resource that provides invaluable insights into human energy expenditure across the lifespan, playing a critical role in advancing our understanding of metabolism and health," noted Alexia Alford, a nutrition specialist in the IAEA's Division of Human Health.
Developing the New Predictive Equation

IAEA Section Head Cornelia Loechl (left) and IAEA nutritionist Alexia Alford (right) discuss the new predictive equation made possible by the IAEA's Doubly Labelled Water (DLW) Database (Photo: P. Lee/IAEA)
Nearly 100 scientists from across the globe used the DLW Database - the largest dataset of its kind - to derive the predictive equation. Using classical general linear regression and machine learning models, researchers analysed part of the dataset and validated their findings with another portion. They identified key predictors such as body weight, height, age, elevation and sex, refining the equation's accuracy and applicability.
This equation now enables researchers to predict a range of energy expenditure values for an individual, which can then be compared with that individual's reported energy intake. A reported intake that falls outside the predicted range is flagged as likely misreported, making the equation a screening tool for dietary studies.
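The screening step can be illustrated with a minimal sketch. It does not reproduce the published equation or its coefficients; it assumes the predicted range for each person has already been calculated. The `Record` class, the function names, the "under-reported" and "over-reported" labels, the MJ/day unit and all numbers below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One dietary report paired with a predicted expenditure range.

    All values are assumed to be in MJ/day (an illustrative choice).
    """
    reported_intake: float   # self-reported energy intake
    predicted_low: float     # lower bound of predicted energy expenditure
    predicted_high: float    # upper bound of predicted energy expenditure


def classify(record: Record) -> str:
    """Label a report by where its intake falls relative to the range."""
    if record.reported_intake < record.predicted_low:
        return "under-reported"
    if record.reported_intake > record.predicted_high:
        return "over-reported"
    return "plausible"


def misreporting_share(records: list[Record]) -> float:
    """Return the percentage of reports that fall outside their range."""
    if not records:
        raise ValueError("at least one record is required")
    flagged = sum(1 for r in records if classify(r) != "plausible")
    return 100.0 * flagged / len(records)


# Made-up example records, not survey data.
records = [
    Record(reported_intake=5.0, predicted_low=8.0, predicted_high=12.0),
    Record(reported_intake=9.5, predicted_low=8.0, predicted_high=12.0),
    Record(reported_intake=14.0, predicted_low=9.0, predicted_high=13.0),
    Record(reported_intake=10.0, predicted_low=7.5, predicted_high=11.5),
]

labels = [classify(r) for r in records]
share = misreporting_share(records)
print(labels, share)
```

In this toy example two of the four reports fall outside their predicted range, so the share of likely misreporting is 50 per cent. The survey percentages reported in the study are shares of this kind, calculated across all records in a dataset.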
When applied to external datasets such as the National Diet and Nutrition Survey - which includes over 12 000 records on diet, nutrient intake and nutritional status from the United Kingdom's general population - researchers found that only 66.8 per cent of adult dietary reports fell within the predictive range, indicating that 33.2 per cent were likely misreported. For children, 83.4 per cent of reports were within the predictive range, leaving 16.6 per cent outside it.
Similar discrepancies were found in the United States of America's National Health and Nutrition Examination Survey. Of the 5 873 available records on adults and children in the United States, 32.1 per cent of the adult reports and 18.3 per cent of the children's reports reflected misreporting.
"While new methods of dietary intake reporting are actively being developed, none are ready for large-scale implementation just yet. In the meantime, the prediction equation based on DLW data can help researchers estimate the extent of misreporting in their studies," said Cornelia Loechl, Head of the Nutritional and Health-Related Environmental Studies Section in the IAEA's Division of Human Health.