Fast and Accurate Approaches for Large-Scale, Automated Mapping of Food Diaries on Food Composition Tables

Marc Lamarine, Jorg Hager, Wim H. M. Saris, Arne Astrup, Armand Valsesia*

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review


Aim of Study: The use of weighed food diaries in nutritional studies provides a powerful method to quantify food and nutrient intakes. Yet, mapping these records onto food composition tables (FCTs) is a challenging, time-consuming and error-prone process. Experts make this effort manually and no automation has been previously proposed. Our study aimed to assess automated approaches to map food items onto FCTs.

Methods: We used food diaries (approximately 170,000 records pertaining to 4,200 unique food items) from the DiOGenes randomized clinical trial. We attempted to map these items onto six FCTs available from the EuroFIR resource. Two approaches were tested: the first was based solely on food name similarity (fuzzy matching); the second used a machine learning approach (C5.0 classifier) combining both fuzzy matching and food energy. We tested mapping food items using their original names and also an English translation. Top matching pairs were reviewed manually to derive performance metrics: precision (the percentage of correctly mapped items) and recall (the percentage of mapped items).
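The name-similarity step can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the scoring function (`difflib.SequenceMatcher` from the standard library), the helper names `similarity` and `best_match`, and the three FCT entries are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two food names.

    Assumption: names are compared case-insensitively after trimming
    whitespace; the published method may normalize names differently.
    """
    return 100.0 * SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def best_match(item: str, fct_names: list[str]) -> tuple[str, float]:
    """Return the FCT entry with the highest similarity to `item`, with its score."""
    scored = [(name, similarity(item, name)) for name in fct_names]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical FCT entries, for illustration only.
fct_names = ["Apple, raw", "Bread, whole wheat", "Milk, semi-skimmed"]

match, score = best_match("apple raw", fct_names)
print(match, round(score, 1))
```

In practice, each diary item would be scored against every entry of the chosen FCT, and the top pair kept only if its score passes a chosen threshold.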

Results: The simpler approach, fuzzy matching, provided very good performance. Under a relaxed threshold (a score of 50%), this approach remapped 99.49% of the items with a precision of 88.75%. With a slightly more stringent threshold (score > 63%), the precision could be significantly improved to 96.81% while keeping a recall rate > 95% (i.e., fewer than 5% of the queried items would not be mapped). The machine learning approach did not lead to any improvements compared to the fuzzy matching. However, it could substantially increase the recall rate for food items without any clear equivalent in the FCTs (+7 and +20% when mapping items using their original or English-translated names, respectively). Our approaches have been implemented as R packages and are freely available from GitHub.
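The trade-off between precision and recall as the score threshold changes can be sketched as follows, using the definitions given in the Methods (recall as the percentage of queried items that are mapped, precision as the percentage of mapped items that are correct). The function name `precision_recall` and the six reviewed pairs are hypothetical; they are not data from the study.

```python
def precision_recall(reviewed: list[tuple[float, bool]], threshold: float) -> tuple[float, float]:
    """Compute (precision, recall) in percent at a given score threshold.

    `reviewed` holds one (score, is_correct) pair per queried item, where
    `is_correct` is the manual verdict on the top matching pair.
    Assumption: an item counts as mapped when its score exceeds the threshold.
    """
    mapped = [is_correct for score, is_correct in reviewed if score > threshold]
    recall = 100.0 * len(mapped) / len(reviewed) if reviewed else 0.0
    precision = 100.0 * sum(mapped) / len(mapped) if mapped else 0.0
    return precision, recall


# Hypothetical manual review results, for illustration only.
reviewed = [(95, True), (80, True), (70, True), (60, False), (55, True), (40, False)]

for threshold in (50, 63):
    precision, recall = precision_recall(reviewed, threshold)
    print(f"threshold {threshold}: precision {precision:.1f}%, recall {recall:.1f}%")
```

On this toy data, raising the threshold removes an incorrect match and so improves precision, at the cost of leaving more items unmapped.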

Conclusion: This study is the first to provide automated approaches for large-scale food item mapping onto FCTs. We demonstrate that both high precision and recall can be achieved. Our solutions can be used with any FCT and do not require any programming background. These methodologies and findings are useful to any small or large nutritional study (observational as well as interventional).

Original language: English
Article number: 38
Number of pages: 11
Journal: Frontiers in Nutrition
Publication status: Published - 9 May 2018


  • fuzzy matching
  • food composition tables
  • food diaries
  • macronutrient
  • food mapping
  • dietary studies

