Focus: KGs for Food
Dan Barber, the James Beard award-winning chef who pioneered the farm-to-table movement, has a single, elegant term for the complexity of molecules, processes, and sequences that underpins great food: flavor. Like great chefs, computer scientists grapple with complexity, seeking elegant ways to model and analyze increasingly complex, or perhaps flavorful, phenomena.
Across many business verticals, we are seeing knowledge graphs being leveraged to capture and model complexity, and food supply is no exception. The emerging and relatively large interest area of knowledge graphs in food spans a wide range of interconnected domains: weather, soil science, plant breeding, farming, food processing, distribution, farmers markets, restaurants, groceries, schools, and so on. Knowledge graphs have the power to enable a more resilient, distributed food network, matching small-scale food producers and regional food processors directly to demand, and producing cascading benefits for local communities and the environment.
To do so, we would need to link the relevant food data, understand and leverage the inherent context of these nested systems, and enable critical information sharing and communication.
The opportunity here is immense: transforming food consumption patterns, a market measured in trillions of dollars, would also affect other areas of concern such as energy use, carbon footprint, rural economics, chronic illness rates, desertification, and the rise of epidemics.
The Trouble with Food Today
Barber is featured in S1E2 of the highly recommended Netflix series Chef’s Table, which examines his career and especially his work in the Hudson Valley to rebuild a culture of food that supports family farms, resilient regional food processing, and NYC eaters. We see him working “backwards” from the restaurant plate to solve problems: partnering with farmers markets, for instance, to help individual farmers develop demand for seasonal foods, which in turn helps restore soils, and with plant breeders to develop varieties, such as small squash, that maximize nutrition and flavor while minimizing transportation costs.
Acclaimed food journalist Michael Pollan, who has his own highly recommended Netflix series, Cooked (based on his popular book of the same title), examines how humans as a species have evolved to rely on cooked foods for the energy required to support our relatively large brains. He details the fascinating origins of food processing, from sprouting and malts to fermentation and enzymes, producing favorites like cheese, sourdough, and chocolate, dating back through tens of thousands of years of human history and cultures. The resulting geopolitical significance of grains (wheat, rice, etc.) in terms of international trade and food security is featured in S1E3 of the Netflix miniseries.
Pollan also calls our attention to more recent changes in food supply: following the end of WWII, there was a big push to ramp up consumer use of highly processed foods. This has led to a spectrum of health problems, both for those who consume these foods (as Pollan puts it, “The food industry creates patients for the healthcare industry”) and within agricultural environments themselves. In the US, soil depletion is measured in billions of dollars, increasingly large monocultural swathes of hybridized annual varieties of perennial plants perpetuate reliance on petrochemicals, and food processing consumes more freshwater worldwide than cities do (70% of global freshwater withdrawals). The result? The nutritional value of common staples, such as breads, has become severely degraded, while the sudden rise of food allergies and the growing links between diet and prevalent diseases paint an even more sobering picture. Quoting Pollan again, “The more you process any food, the more profitable it becomes,” with an inverse relationship for the well-being of local communities and eaters.
These interconnected issues were echoed in the recent Google Food Lab Summit – see “5 Takeaways From Google Food Lab” with chef Erik Oberholtzer: “Currently, we get 60% of our calories from just 4 ingredients … industrialized corn, soy, wheat, and rice … heavily sprayed on mono-crop farms, then stripped of nutrients for processed foods that make our planet and people sick.”
Graph Algorithms
As an example of where KGs come into play with food, a student research project in computer science, “Cooking up Food Embeddings: Understanding Flavors in the Recipe-Ingredient Graph” by Christopher Sauer, Alex Haigh, and Jake Rachleff, explores how recipes are explicitly structured as linked data in terms of the ingredients they require. Moreover, the underlying drivers of flavor (aka complexity) and nutrition imply linked data and relations. “We explore the bipartite graph between recipes and ingredients. Previous work primarily used local, count-based metrics to explore the relationships between ingredients. We conduct a more complete exploration of the underlying network structure of ingredients through more rigorous graphical analysis and a novel graph analysis metric for finding substitute ingredients from edge weights.” They use embeddings to produce high-quality ingredient-substitute pairs, understand the graph at a more global scale, and automatically generate new recipes.
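To make the bipartite structure concrete, here is a minimal sketch, not the paper’s exact method, that builds a toy recipe-ingredient graph with networkx and ranks candidate substitutes by how much their co-ingredient neighborhoods overlap. The recipes, ingredients, and scoring function are all illustrative.

```python
# A minimal sketch, not the paper's exact method: build a toy
# recipe-ingredient bipartite graph and rank candidate substitutes by the
# overlap of their co-ingredient neighborhoods. Data is illustrative.
import networkx as nx
from networkx.algorithms import bipartite

recipes = {
    "pancakes":       ["flour", "milk", "egg", "sugar"],
    "vegan_pancakes": ["flour", "almond_milk", "banana", "sugar"],
    "oatmeal":        ["oats", "milk", "honey"],
    "vegan_oatmeal":  ["oats", "almond_milk", "maple_syrup"],
}

B = nx.Graph()
for recipe, ings in recipes.items():
    B.add_node(recipe, kind="recipe")
    for ing in ings:
        B.add_node(ing, kind="ingredient")
        B.add_edge(recipe, ing)

ingredients = [n for n, d in B.nodes(data=True) if d["kind"] == "ingredient"]
# One-mode projection: two ingredients are linked if they share a recipe.
P = bipartite.weighted_projected_graph(B, ingredients)

def substitute_score(a, b):
    """Jaccard overlap of co-ingredient neighborhoods in the projection.
    Good substitutes tend to appear in similar contexts without co-occurring."""
    na, nb = set(P[a]), set(P[b])
    return len(na & nb) / len(na | nb) if na | nb else 0.0

print(substitute_score("milk", "almond_milk"))  # higher: similar contexts
print(substitute_score("milk", "egg"))          # lower: co-occur, different roles
```

Even on a toy dataset, the projected ingredient graph surfaces the kind of contextual similarity that the authors exploit at much larger scale with embeddings.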
See also the 2013 paper “From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews” by Julian McAuley (UCSD) and Jure Leskovec (Stanford), who leveraged graphs to analyze datasets from two popular beer-rating social networks. Beyond the popular assumptions of marketing and social influence, these researchers were able to develop models of “user evolution” based on graph analysis, tracking how consumers gain knowledge and experience. For example, what do “acquired tastes” imply, and how can they be discovered using graph data? The goal was to create recommender systems capable of making good suggestions based on where a user’s tastes are at a given point in time, recognizing that those tastes may change in the future. Check out the video interview with Julian McAuley.
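The paper’s latent-expertise model is more sophisticated than anything shown here, but this hedged sketch illustrates the core intuition: track a user’s taste profile as it drifts over time and recommend whatever sits closest to where that profile is now. The beers and their feature values are invented for illustration.

```python
# A simplified sketch of the intuition only, not the paper's model:
# track a user's evolving taste as an exponentially decayed average of
# item feature vectors, then recommend items near the current profile.
import numpy as np

# Illustrative features: [bitterness, maltiness, sourness]
beers = {
    "light_lager":    np.array([0.2, 0.3, 0.1]),
    "double_ipa":     np.array([0.9, 0.4, 0.1]),
    "lambic":         np.array([0.3, 0.2, 0.9]),
    "imperial_stout": np.array([0.6, 0.9, 0.1]),
}

def taste_profile(history, decay=0.5):
    """history: list of (beer, rating), ordered oldest to newest.
    Recent, highly rated beers dominate the profile."""
    profile, total = np.zeros(3), 0.0
    for i, (beer, rating) in enumerate(history):
        w = rating * decay ** (len(history) - 1 - i)
        profile += w * beers[beer]
        total += w
    return profile / total if total else profile

def recommend(history, k=2):
    profile = taste_profile(history)
    tried = {beer for beer, _ in history}
    scores = {b: float(v @ profile) for b, v in beers.items() if b not in tried}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# A drinker who started with light lagers but now rates hoppy beers highly:
print(recommend([("light_lager", 4.0), ("double_ipa", 5.0)]))
```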
Translated: resilient markets at scale must match supply with demand and provide effective contingencies and failover, preferably letting consumer demand drive how the markets take shape. To do this, it becomes necessary to understand and model what consumers need.
Many Food KG Projects
Food is a popular project topic in the KGC community. For example, MIT had a project running Sep 2015 through May 2017: “The Foodome: Building a Comprehensive Knowledge Graph of Food” which addressed “how to create deeper understanding and predictive intelligence about the relationships between how we talk and learn about food, and what we actually eat. Our aim is to build a food learning machine that comprehensively maps, for any given food, its form, function, production, distribution, marketing, science, policy, history, and culture (as well as the connections among all of these aspects).”
KGC co-founder François Scharffe and other colleagues worked on the Food Ontology to model the food domain, describing ingredients and food products; it is used by the Open Food Facts dataset. Another increasingly popular resource is the FoodOn farm-to-fork ontology. There’s also the Smart Recipe Project by Condé Nast and RES Group, which aims to develop “AI services able to extract information from food recipes. The leading idea is that such implementations might fit many and various business use-cases like recommendation engines, vocal assistant, smart kitchen tools, and much more.”
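To show what “describing ingredients and food products” looks like in practice, here is a minimal linked-data sketch using rdflib. The example.org namespace and all term IRIs are placeholders, not actual Food Ontology, FoodOn, or Open Food Facts identifiers.

```python
# A minimal linked-data sketch with rdflib. The example.org namespace and
# all term IRIs are illustrative placeholders, not real ontology terms.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/food/")
g = Graph()
g.bind("ex", EX)

# A recipe and its ingredients expressed as triples.
g.add((EX.margherita_pizza, RDF.type, EX.Recipe))
g.add((EX.margherita_pizza, EX.hasIngredient, EX.tomato))
g.add((EX.margherita_pizza, EX.hasIngredient, EX.mozzarella))
g.add((EX.tomato, RDF.type, EX.Ingredient))
g.add((EX.tomato, RDFS.label, Literal("tomato")))
g.add((EX.mozzarella, RDF.type, EX.Ingredient))
g.add((EX.mozzarella, RDFS.label, Literal("mozzarella")))

# The same SPARQL pattern would work against a real food ontology once
# recipe data is mapped onto its terms.
query = """
SELECT ?label WHERE {
  ?recipe a ex:Recipe ;
          ex:hasIngredient ?ing .
  ?ing rdfs:label ?label .
}
"""
for row in g.query(query, initNs={"ex": EX, "rdfs": RDFS}):
    print(row.label)
```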
One of our KGC 2020 presenters, Luigi Assom, applied knowledge landscaping approaches in Uncovering Food Trade impact by using knowledge graph. Another notable KGC 2020 presenter, Deborah McGuinness, leads the FoodKG project at Rensselaer Polytechnic Institute, sponsored in part by IBM Healthcare Group in Decision Science, which makes use of FoodOn: “We combine multiple sources of data into a single cohesive knowledge graph, forming linkages to relate similar concepts.”
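As one hypothetical illustration of “forming linkages to relate similar concepts” (not the FoodKG team’s actual pipeline), free-text ingredient mentions could be aligned to ontology concept labels by string similarity, with low-scoring cases set aside for human review.

```python
# A hedged sketch of one simple linkage step, not the FoodKG pipeline:
# align free-text ingredient mentions to ontology concept labels by
# string similarity, flagging low-scoring cases for human review.
from difflib import SequenceMatcher

ontology_labels = ["tomato", "mozzarella cheese", "whole wheat flour", "olive oil"]
recipe_mentions = ["tomatoes", "mozarella", "extra-virgin olive oil"]

def best_match(mention, labels, threshold=0.6):
    """Return (label, score) for the closest label, or (None, score) if
    no label clears the threshold."""
    scored = [(SequenceMatcher(None, mention.lower(), label.lower()).ratio(), label)
              for label in labels]
    score, label = max(scored)
    return (label if score >= threshold else None, round(score, 2))

for mention in recipe_mentions:
    print(mention, "->", best_match(mention, ontology_labels))
```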
Internet of Food
We recently spoke with Matthew Lange at UC Davis about IC-FOODS. Their mission is to create a standardized language that both industry and government will use, and they collaborate with industry and academic partners around the world to build the semantic platform for the Internet of Food (IoF). “We’re creating technical and computational capabilities on an underlying data architecture of connected ontologies … What we haven’t seen yet is the food equivalent of the web browser.” Here’s a conference video of Matthew – presenting immediately before a keynote by Steve Wozniak (no pressure).
Matthew notes that their project recently transitioned from a focus on building ontologies to a modular build-out of a KG of the global food system. See the Nature article “Review of the sustainability of food systems and transition using the Internet of Food,” which explores the issues introduced in this article in much more detail.
An immediate concern is that food supply, and especially food security, faces substantial risks due to the pandemic. Regulatory efforts and agribusiness pressures in the US over the past several decades have pushed to centralize food processing, resulting in higher transportation and energy costs, negative impacts on family farms, and the market distortions of centralized food distribution. As we’ve seen with the Covid-19 outbreaks at large meat processing plants, there are serious public health concerns as well. Even at a time when grocery stores struggle to keep their shelves stocked with staple foods, and when “food deserts” in inner cities contribute to hunger, obesity, diabetes, and related illnesses, centralized bottlenecks can incentivize food going to waste instead of going to market, leading so many farmers to dump sorely needed milk, eggs, and meats.
The Role for KGs
Knowledge graph work can be a major factor in the decentralization of food supply chains. Consider how much effort in machine learning at scale has been dedicated to decentralizing advertising: making ad network markets more fluid by matching advertisers with publishers, as with Google Ads. Much the same can be said for Facebook, LinkedIn, Tinder, Uber, etc.
What if that kind of AI technology were instead leveraged to disrupt legacy systems for centralized food processing and distribution? Instead of current practices, we could match small-scale food producers and regional food processors directly to demand at grocery stores, restaurants, schools, etc. To do so, we would need to understand and leverage the context of the linked data involved. Because knowledge graphs excel at representing context, that requirement places knowledge graph work at the center of this effort.

The opportunity for increased knowledge in the food industry is gargantuan. To quote a recent McKinsey Global Institute report about the pandemic: “Some of these vulnerabilities are inherent to a given industry; the perishability of food and agricultural products, for example, means that the associated value chains are highly vulnerable to delivery delays and spoilage.” Even minor enhancements to global value chains measured in trillions of dollars can result in large revenue streams, not to mention increased nutritional availability. How’s that for flavor?
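As a closing thought experiment, and assuming a knowledge graph that already exposes producers, buyers, and transport distances (the entities and numbers below are invented), even an off-the-shelf assignment solver hints at how such matching could work; real markets would of course add constraints such as perishability, capacity, certifications, and price.

```python
# A toy producer-to-buyer matching sketch. All entities and distances are
# invented; a real system would draw them from the knowledge graph and add
# constraints such as perishability, capacity, certifications, and price.
import numpy as np
from scipy.optimize import linear_sum_assignment

producers = ["hudson_dairy", "valley_eggs", "riverside_greens"]
buyers = ["school_district", "local_grocer", "farm_to_table_restaurant"]

# Transport distance in km from each producer (row) to each buyer (column).
distance = np.array([
    [120, 30, 45],
    [90, 25, 60],
    [150, 40, 20],
])

# Find the assignment that minimizes total transport distance.
rows, cols = linear_sum_assignment(distance)
for r, c in zip(rows, cols):
    print(f"{producers[r]} -> {buyers[c]} ({distance[r, c]} km)")
```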