Food Computation
Contents
- 1 Motivation and Background
- 2 Neurosymbolic Explainable Diet AI Model
- 2.1 The system consists of several modules with models performing specific tasks as listed below
- 2.2 mDiabetes: A mHealth application to monitor and track carbohydrate intake
- 2.3 Team Members
- 3 Press Coverage
- 4 Publications
- 5 References
Motivation and Background
"Can I eat this food or not?" is a question that millions of people consider before each meal. This is particularly significant for individuals with chronic conditions such as diabetes, obesity, or hypertension, who need to monitor their diet closely to manage their health. It is equally relevant for those with dietary restrictions like lactose-free, gluten-free, or nut-free diets, as well as those with dietary preferences such as vegan, kosher, or vegetarian. For people with chronic conditions or dietary limitations, the importance of this question is heightened, as it can have serious health implications. According to the World Health Organization (WHO), over 830 million people worldwide suffer from diabetes [1], and more than half a billion people [2] are affected by cardiovascular disease, with high cholesterol being a known risk factor. These numbers continue to grow.
Many medical organizations have published dietary guidelines tailored for specific health conditions. However, these guidelines are often extensive, complex, and not easily applicable to immediate meal choices, particularly when someone is in a hurry or hungry. Additionally, these guidelines serve as general recommendations rather than comprehensive databases, requiring significant time for research and continuous updates. For instance, while potatoes are considered a healthy carbohydrate, they have a high glycemic index. Similarly, while it is advised to avoid trans fats, individuals may not always know which foods contain them. While branded foods often list trans fats in their nutritional information, certain cooking methods, such as deep frying, introduce trans fats implicitly, adding to the complexity of making informed dietary choices.
Neurosymbolic Explainable Diet AI Model
To answer the question, "Can I eat this food or not? Why?", we build an assistive, explainable diet recommendation model using a neurosymbolic approach. The system consists of several models and knowledge graphs handling different tasks. It takes food images or cooking instructions as input. For a given food image, a model generates the ingredients and cooking instructions. From the cooking instructions, a model extracts the cooking actions. The ingredients and cooking actions are mapped to construct a recipe graph that captures which cooking action is performed on which ingredient. This is essential for generating inferences such as grilling + meat --> traces of carcinogens. Then, knowledge from several sources is applied to the ingredients and cooking actions to determine whether the food is suitable for the user's health conditions and food preferences. The system also explains why the food is or is not suitable, using knowledge gathered from several sources. Further, alternative ingredients and cooking methods can be suggested wherever applicable, along with explanations. The system leverages the generalization and pattern-mining ability of deep learning models and the reasoning ability of knowledge graphs to explain the recommendations or decisions made by the model. The notable features of the proposed approach are:
- Multi-contextual grounding: The ingredients and cooking actions are grounded with knowledge in several contexts. For example, in the context of diabetes categories, potato is a healthy carbohydrate; in the context of glycemic index, it has a high glycemic index. Further, we also capture the nutrition, nutrient retention, and visual representation of entities.
- Alignment: The recommendation reasoning is aligned with dietary guidelines for diabetes from medical sources.
- Attribution: Each piece of reasoning provided by the model can be attributed to medical sources or published papers.
- Explainability: The model employs several kinds of reasoning, such as counterfactual reasoning, chain-of-evidence reasoning, path-based reasoning, procedural reasoning, and analogical reasoning, to explain its results.
- Instructability: The model can take inputs from medical experts to adjust the meal analysis process.
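The stages described above can be sketched end to end. Every function here is a hypothetical stub standing in for a trained model or knowledge graph, not the project's actual API; the single grilling rule is the example inference from the text.

```python
# Illustrative end-to-end sketch of the meal-analysis pipeline. All function
# names and the rule table are hypothetical placeholders, not the project's
# actual APIs; the real system uses trained models at each stage.

def generate_recipe(food_image):
    # Stage 1: a vision model would return ingredients and instructions.
    return {"ingredients": ["meat"], "instructions": ["Grill the meat at high heat."]}

def extract_actions(instructions):
    # Stage 2: a model such as CookGen extracts cooking actions;
    # a fixed lexicon stands in for it here.
    known = {"grill", "fry", "boil", "steam"}
    return [w.strip(".,").lower() for step in instructions
            for w in step.split() if w.strip(".,").lower() in known]

def build_recipe_graph(ingredients, actions):
    # Stage 3: map actions to ingredients as (action, ingredient) edges.
    return [(a, i) for a in actions for i in ingredients]

def apply_knowledge(graph):
    # Stage 4: ground each edge in knowledge, e.g. grill + meat -> carcinogens.
    rules = {("grill", "meat"): "traces of carcinogens"}
    return [(edge, rules[edge]) for edge in graph if edge in rules]

recipe = generate_recipe("meal_photo.jpg")
graph = build_recipe_graph(recipe["ingredients"], extract_actions(recipe["instructions"]))
print(apply_knowledge(graph))
# -> [(('grill', 'meat'), 'traces of carcinogens')]
```

Each stub corresponds to one module described in the sections below.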
We aim to build a custom, compact, neurosymbolic model to incorporate the abilities mentioned above. While general-purpose generative models are trained on extensive data from the internet, including medical guidelines, extracting disease-specific dietary information from vast embedding spaces remains a significant challenge. Effective meal analysis requires a comprehensive understanding of various contexts, including medical guidelines, nutritional content, types of ingredients, the impact of cooking methods, and the user’s health condition and food preferences. These systems should be able to reason over the food by attributing their explanations to medical guidelines. Such systems should be custom trained for specific use cases. A neurosymbolic approach can harness rich knowledge sources, facilitating accountable and explainable reasoning.
Proposal defense slides of Revathy Venkataramanan, the project coordinator
The system consists of several modules with models performing specific tasks as listed below
1. Learning Representations to Retrieve Instructions from Images
1.1 Knowledge-infused model to retrieve instructions from images
This method uses a knowledge-infused representation learning approach to retrieve ingredients and cooking instructions given a food image. In this work, we propose a metric that measures semantic similarity between recipes, which is infused as knowledge to cluster and arrange recipes in the embedding space according to their semantic similarity. The model performed better than the baselines and than a version trained without knowledge. More on this model can be read in the full paper here
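The retrieval step itself can be sketched minimally: rank recipe embeddings by cosine similarity to an image embedding. In the actual model, the learned recipe semantic-similarity metric additionally shapes this embedding space; the 2-D vectors below are toy stand-ins, not real embeddings.

```python
# Minimal cross-modal retrieval sketch: rank recipes by cosine similarity
# between an image embedding and recipe embeddings. Toy 2-D vectors only.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(image_emb, recipe_embs, recipe_ids):
    # Return recipe ids sorted by similarity to the image, best first.
    scores = [cosine(image_emb, r) for r in recipe_embs]
    return [rid for _, rid in sorted(zip(scores, recipe_ids), reverse=True)]

embs = [np.array([0.9, 0.1]), np.array([0.0, 1.0])]
print(retrieve(np.array([1.0, 0.0]), embs, ["pasta", "soup"]))  # -> ['pasta', 'soup']
```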

1.2 Generative Models to generate Instructions from Images
In this approach, generative models are used to generate instructions and ingredients from food images. The process of generating cooking instructions and ingredient lists from images relies on a structured multimodal framework that combines visual and textual data to produce comprehensive outputs. This framework consists of two main stages: synthetic image generation, and instruction and ingredient modeling. Together, these stages transform input data into accurate cooking ingredients and instructions. To build a robust model, synthetic images are created using data from the Recipe1M dataset, which includes one million recipes with associated titles, ingredients, instructions, and images. Recipe titles are input into a Stable Diffusion model, generating diverse images for each recipe. This visual diversity enhances the model's ability to generalize and adapt to various real-world cooking scenarios.
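The synthetic-image stage can be sketched as prompt construction: each Recipe1M title becomes several diffusion-model prompts. Only the prompt-building step is shown, since the Stable Diffusion call itself requires model weights and a GPU; the style phrases are illustrative assumptions, not the project's actual prompt templates.

```python
# Sketch of the synthetic-image stage: turn a recipe title into several
# text prompts for a diffusion model, one per visual style, so that each
# recipe yields diverse generated images. Style phrases are illustrative.

STYLES = ["overhead food photo", "close-up, natural light", "restaurant plating"]

def build_prompts(title: str) -> list[str]:
    # One prompt per style; the diffusion model would be called on each.
    return [f"{title}, {style}" for style in STYLES]

print(build_prompts("Chicken Tikka Masala")[0])
# -> Chicken Tikka Masala, overhead food photo
```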
2. Generating Cooking Actions from Cooking Instructions
Extracting cooking actions (saute, fry, boil) from cooking instructions is required for two reasons: (i) to analyse the cooking actions for any harmful substances. For example, deep frying introduces trans fats, and boiling loses vitamin C while steaming retains it. (ii) to determine whether a cooking action is harmful for a given ingredient. For example, grilling meat at high temperature produces traces of carcinogens. However, extracting cooking actions from cooking instructions is challenging due to irregular word distributions. In addition, there are no benchmark datasets that can be used to train the model. We developed CookGen, a robust generative model of cooking actions from recipes. It is a small generative model with 20k parameters that outperformed XLNet and Electra. The model is compact and trained specifically for our custom use case of extracting cooking actions. The full paper can be read here
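A naive lexicon baseline makes the difficulty concrete. The action list below is an illustrative assumption; CookGen replaces this kind of brittle matching with a small generative model precisely because irregular wording defeats fixed lexicons.

```python
# Naive lexicon baseline for cooking-action extraction. Illustrative only:
# it shows the failure mode that motivates a learned model like CookGen.
import re

ACTIONS = {"saute", "fry", "boil", "steam", "grill", "bake"}

def extract_actions(instruction: str) -> list[str]:
    tokens = re.findall(r"[a-z]+", instruction.lower())
    return [t for t in tokens if t in ACTIONS]

print(extract_actions("Boil the pasta, then fry the garlic."))  # -> ['boil', 'fry']
# Irregular wording defeats the lexicon: the inflected "sauteed" is missed.
print(extract_actions("Add the sauteed onions."))  # -> []
```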
3. Mapping Ingredients and Cooking Actions to Construct Recipe Graphs
Once the cooking actions are extracted using the CookGen model, they need to be mapped to ingredients to identify which cooking actions are performed on which ingredients. Certain cooking actions are harmful to certain ingredients. For example, grilling meat produces traces of carcinogens, and steaming is suggested for broccoli because it loses its vitamins quickly when boiled. Therefore, we are benchmarking several approaches to map ingredients to cooking actions. This will aid in constructing a recipe graph for each recipe, and in analysing the ingredients and cooking actions of a recipe through several knowledge sources. Further, using the recipe graph, similar recipes can be identified in several contexts, such as diabetes or vegan-friendly dishes.
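One way the resulting recipe graphs support similarity search can be sketched as follows: represent each recipe as a set of (action, ingredient) edges and compare recipes by edge overlap. The edges and the Jaccard measure are illustrative assumptions; the project benchmarks learned mappers rather than hand-written ones.

```python
# Sketch: recipes as sets of (action, ingredient) edges, compared by
# Jaccard similarity over edges. Toy data; the actual mapping is learned.

def recipe_edges(mapping):
    # mapping: {action: [ingredients]} -> set of (action, ingredient) edges
    return {(a, i) for a, ings in mapping.items() for i in ings}

def jaccard(g1, g2):
    return len(g1 & g2) / len(g1 | g2)

grilled_steak = recipe_edges({"grill": ["meat"], "chop": ["onion"]})
steamed_broccoli = recipe_edges({"steam": ["broccoli"], "chop": ["onion"]})
print(round(jaccard(grilled_steak, steamed_broccoli), 2))  # -> 0.33
```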
4. Multicontextual and Multimodal Food Knowledge Graphs
To support the neurosymbolic recommender outlined in Project 1, various knowledge sources are combined to create a comprehensive, multicontextual, and multimodal food knowledge graph. The construction of the knowledge graph consists of the following modules:
4.1 Mapping USFDA Ingredients to Recommend/Not Recommend for Diabetes
This module focuses on gathering and mapping ingredient data from the United States Food and Drug Administration (USFDA) database and aligning it with dietary guidelines for diabetes from the Mayo Clinic and the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The resulting mappings assist in designing diets for diabetes patients based on expert recommendations. Below are some example mappings:
- Potato → Categorized as Vegetables (USFDA) → Labeled as Healthy Carbohydrate (MayoClinic) → Recommend for Diabetes (MayoClinic)
- Pickled Sausage → Categorized as Sausages (USFDA) → Identified with Saturated Fat, Animal Protein (MayoClinic) → Avoid (MayoClinic)

The mapping process uses a set of rules and keywords provided by MayoClinic to ensure accurate categorization. For instance, MayoClinic's guidelines specify keywords and examples for both "recommend" and "avoid" categories. Nutritional profiles are also analyzed to classify ingredients effectively. Ingredients high in polyunsaturated and monounsaturated fats, such as Omega-3 fatty acids, are marked as "good fats" and recommended for diabetes, while those containing saturated fat are marked as "avoid."
- Example
Salmon → Categorized as Finfish and Shellfish Products (USFDA) → Classified as Good Fats (MayoClinic) → Recommend (MayoClinic). More information can be found in the slides given below. Each ingredient can be associated with multiple diabetes categories. In the slides below, Salmon is associated with Medium Fat, High Cholesterol, Heart Healthy Items, Low Sodium, and Animal Protein. Each category is marked with the suggested decision from MayoClinic or NIDDK: Recommend, Avoid, or Caution (have it with caution). For each ingredient and its paths, the following information is present:
- Path-Based Reasoning: For each path, say, Salmon --> Medium Fat or Salmon --> Heart Healthy Fish, the reason behind the association is stored as shown in the slides.
- Explanation: For each ingredient, the explanation behind the classification is also stored. For example, Salmon is a keyword given by MayoClinic for Heart Healthy Fish, and the nutrition profile of Salmon shows high cholesterol or medium fat as per source1, source2, etc.
- Attribution/Provenance: For each ingredient, the source of the information is also stored.
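The keyword-driven mapping with stored path, explanation, and provenance can be sketched as below. The keyword table is a simplified illustration of the MayoClinic/NIDDK rules, not the actual guideline content.

```python
# Sketch of keyword-based ingredient classification with provenance.
# KEYWORD_RULES is a toy stand-in for the MayoClinic/NIDDK rule set.

KEYWORD_RULES = {
    "salmon": ("Heart Healthy Fish", "Recommend", "MayoClinic"),
    "sausage": ("Saturated Fat", "Avoid", "MayoClinic"),
}

def classify(ingredient: str):
    for kw, (category, decision, source) in KEYWORD_RULES.items():
        if kw in ingredient.lower():
            return {
                # Path-based reasoning: the chain from ingredient to decision.
                "path": f"{ingredient} -> {category} -> {decision}",
                # Explanation: why this association was made.
                "explanation": f"'{kw}' is a {source} keyword for {category}",
                # Attribution/provenance: where the rule came from.
                "source": source,
            }
    return None

print(classify("Pickled Sausage")["path"])
# -> Pickled Sausage -> Saturated Fat -> Avoid
```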
Diabetes Specific Diet Knowledge Graph
4.2 Other Modules
Several other modules are under development. These involve integrating the smoke point of fats and oils, incorporating the glycemic index from the University of Sydney Glycemic Index Database, capturing the nutrient retention of ingredients, and integrating with the ingredient substitution knowledge graph described in the next project.
5. Ingredient Substitution: Identifying suitable ingredients for health condition and food preferences
Food is a fundamental part of life, and personalizing dietary choices is crucial due to the varying health conditions and food preferences of individuals. A significant aspect of this personalization involves adapting recipes to specific user needs, primarily achieved through ingredient substitution. However, the challenge arises because one ingredient may have multiple substitutes depending on the context, and existing works have not adequately captured this variability. We introduce a Multimodal Ingredient Substitution Knowledge Graph (MISKG) that captures a comprehensive and contextual understanding of 16,077 ingredients and 80,110 substitution pairs. The KG integrates semantic, nutritional, and flavor data, allowing both text- and image-based querying for ingredient substitutions. Utilizing various sources such as ConceptNet, Wikidata, Edamam, and FlavorDB, this dataset supports personalized recipe adjustments based on dietary constraints, health labels, and sensory preferences. This work addresses gaps in existing datasets by including visual representations, nutrient information, and contextual ingredient relationships, providing a valuable resource for culinary research and digital gastronomy.
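Context-dependent substitution lookup can be sketched as below. The entries are toy stand-ins for MISKG's 80,110 substitution pairs; real queries would additionally filter on nutrition and flavor data.

```python
# Sketch: the same ingredient maps to different substitutes depending on
# context, the variability MISKG is built to capture. Toy entries only.

SUBSTITUTIONS = {
    ("butter", "vegan"): ["coconut oil", "olive oil"],
    ("butter", "baking"): ["applesauce", "margarine"],
    ("milk", "lactose-free"): ["almond milk", "oat milk"],
}

def substitutes(ingredient: str, context: str) -> list[str]:
    return SUBSTITUTIONS.get((ingredient, context), [])

print(substitutes("butter", "vegan"))   # -> ['coconut oil', 'olive oil']
print(substitutes("butter", "baking"))  # -> ['applesauce', 'margarine']
```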
Dataset
The dataset can be downloaded from
mDiabetes: A mHealth application to monitor and track carbohydrate intake
In this work, we developed a mobile health application to track the carbohydrate intake of type-1 diabetes patients. The user enters the food item name and its quantity in convenient units. The app queries the nutrition database and performs the necessary computation to convert the user-entered quantity into an estimate of the carbohydrates.
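The computation reduces to a unit conversion followed by scaling the database's per-100 g carbohydrate value. The nutrition entries and unit weights below are illustrative assumptions, not actual database values.

```python
# Sketch of the carbohydrate estimate: user quantity -> grams -> carbs.
# CARBS_PER_100G and UNIT_GRAMS are toy stand-ins for the nutrition database.

CARBS_PER_100G = {"rice, cooked": 28.0, "apple": 14.0}   # g carbs per 100 g food
UNIT_GRAMS = {("rice, cooked", "cup"): 158.0, ("apple", "medium"): 182.0}

def carbs(food: str, quantity: float, unit: str) -> float:
    # Convert the user-entered quantity to grams, then scale per-100 g carbs.
    grams = quantity * UNIT_GRAMS[(food, unit)]
    return grams * CARBS_PER_100G[food] / 100.0

print(round(carbs("rice, cooked", 1, "cup"), 1))  # -> 44.2
```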
Resources
The app is under development to include food image-based carbohydrate estimation, which requires food image-based volume estimation.
Team Members
Led by - Revathy Venkataramanan
Advised By - Amit Sheth
AIISC Collaborators:
- Kaushik Roy
- Yuxin Zi
- Renjith Prasad
- Hyunwook Kim
- Chathurangi Shyalika
External Collaborators
Students/Interns
- Kanak Raj (BIT Mesra, Microsoft Research)
- Aditya Luthra
- Deeptansh (IIIT Hyderabad)
Past Students
- Jinendra Malekar (AIISC)
- Ishan Rai (Amazon)
- Jayati Srivastava (Google)
- Rohan Rao (TU Munich)
- Ronak Kaoshik (UCLA)
- Anirudh Sundar Rajan (University of Wisconsin)
- Dhruv Makwana (Ignitarium)
- Akshit (IIIT Hyderabad)
Professors/Leads
- Victor Penev (Edamam - Industry collaborator)
- Dr. Ramesh Jain
- Dr.Lisa Knight (Prisma Health - Endocrinologist)
- Dr. James Herbert (USC - Epidemiologist and Nutritionist)
- Dr. Ponnurangam Kumaraguru (Professor - IIIT Hyderabad)
Press Coverage
- Edamam Provides Data for the Creation of an AI Model for Personalized Meal Recommendations. Read More:https://whnt.com/business/press-releases/ein-presswire/744188562/edamam-provides-data-for-the-creation-of-an-ai-model-for-personalized-meal-recommendations/
Publications
- Revathy Venkataramanan, Swati Padhee, Saini Rohan Rao, Ronak Kaoshik, Anirudh Sundara Rajan, and Amit Sheth. "Ki-Cook: Clustering Multimodal Cooking Representations through Knowledge-infused Learning." Frontiers in Big Data 6: 1200840. link
- Revathy Venkataramanan, Kaushik Roy, Kanak Raj, Renjith Prasad, Yuxin Zi, Vignesh Narayanan, and Amit Sheth. "Cook-gen: Robust generative modeling of cooking actions from recipes." In 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 981-986. IEEE, 2023. link
- Amit Sheth, Manas Gaur, Kaushik Roy, Revathy Venkataramanan, and Vedant Khandelwal. "Process Knowledge-Infused AI: Toward User-Level Explainability, Interpretability, and Safety." IEEE Internet Computing 26, no. 5 (2022): 76-84.
References
[1] World Health Organization. Diabetes. https://www.who.int/health-topics/diabetes#tab=tab_1. Accessed on Nov 19, 2024.
[2] World Heart Federation. World Heart Report. https://world-heart-federation.org/wp-content/uploads/World-Heart-Report-2023.pdf. Accessed on Nov 19, 2024.