Co-Ingredient Networks, part 2

Previously, I wrote about the problems with co-ingredient network analyses of recipes once the ingredients become too clustered. On a tip from Jonathan Goodwin (thanks!), I tried visualizing them not as a force-directed graph, but as a matrix. I also did a tiny bit of cleanup on the data to emphasize the “community” detection and the opacity of individual items (otherwise they all appear 100% filled). Immediately, it helped me see into my data better.
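For anyone who wants to try the matrix idea, here is a minimal sketch of how co-ingredient counts could be built and laid out as a symmetric matrix. It is not the code behind the visualizations in this post; the function names `co_occurrence` and `to_matrix` and the three recipes are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(recipes):
    """Count how often each unordered pair of ingredients shares a recipe."""
    pair_counts = Counter()
    for ingredients in recipes:
        # De-duplicate within a recipe and sort so each pair has one canonical order.
        unique = sorted(set(ingredients))
        pair_counts.update(combinations(unique, 2))
    return pair_counts


def to_matrix(pair_counts):
    """Expand pair counts into a symmetric matrix (list of rows) plus its labels."""
    labels = sorted({name for pair in pair_counts for name in pair})
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for (a, b), count in pair_counts.items():
        matrix[index[a]][index[b]] = count
        matrix[index[b]][index[a]] = count
    return labels, matrix


# Made-up recipes, for illustration only.
recipes = [
    ["salt", "butter", "flour"],
    ["salt", "butter", "sugar"],
    ["salt", "olive oil"],
]
pair_counts = co_occurrence(recipes)
labels, matrix = to_matrix(pair_counts)
```

Each cell holds the number of recipes in which the row and column ingredients appear together, so a cell's opacity could be scaled by its count relative to the largest count in the matrix.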

Rather than the cluttered graph, I was able to pull up this (click image to go to the page itself): 

First and foremost, it made the data much easier to read. Having a grid that allowed me to hover over a particular ingredient and see its relation to other ingredients was much easier than rooting through the cluttered mess of dots in the force-directed graph. Furthermore, being able to sort the matrix by frequency, name, or cluster allowed me to search for a specific item with ease, to see the ingredients used together most often, or simply to see what communities were formed. All of this was much more accessible than with the force-directed graph. However, while I was happy to see these communities more clearly, I’m still unsure what to do with them.
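The three sort orders amount to reordering the rows and columns by a different key. A rough sketch, assuming each ingredient is a record with hypothetical `name`, `count`, and `group` fields (the counts and group numbers below are invented):

```python
def orderings(nodes):
    """Return the three row/column orders as lists of ingredient names."""
    by_name = sorted(n["name"] for n in nodes)
    # Most frequent first; ties broken alphabetically so the order is stable.
    by_frequency = [
        n["name"] for n in sorted(nodes, key=lambda n: (-n["count"], n["name"]))
    ]
    # Ingredients in the same community end up next to each other.
    by_cluster = [
        n["name"] for n in sorted(nodes, key=lambda n: (n["group"], n["name"]))
    ]
    return {"name": by_name, "frequency": by_frequency, "cluster": by_cluster}


# Invented values, for illustration only.
nodes = [
    {"name": "salt", "count": 9, "group": 0},
    {"name": "honey", "count": 2, "group": 1},
    {"name": "butter", "count": 7, "group": 0},
    {"name": "ground beef", "count": 3, "group": 1},
]
orders = orderings(nodes)
```

Sorting by cluster is what makes communities show up as blocks along the diagonal, since members of one group sit in adjacent rows and columns.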

One community consists of: ground beef; onion, finely chopped; chicken broth; honey; salt and pepper; ground black pepper; garlic, minced; kosher salt; heavy cream; all-purpose flour; mayonnaise; canola oil; chopped onion; and butter. Except for the ground beef, mayonnaise, and honey, the ingredients in this cluster are, generally speaking, foundational elements of any Western dish. Off the top of my head, though, I can only imagine ground beef and mayonnaise working together in something like a hamburger. This wouldn’t be too shocking; a search for the string “burger” in my master file brings up 11,743 matches. For most of those, the string will show up twice (once in the URL, once in the title), so cutting that number in half gives 5871.5. In other words, roughly 1% of all of the site’s recipes are burger recipes (with some rounding off for recipes that use “hamburger buns” but are not, actually, hamburger recipes). Also, at least looking at the different subdomains the site uses (Australian, baking, Chinese, etc.), I’d expect more communities.
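The back-of-the-envelope burger estimate can be written out as follows. This is only a sketch of the arithmetic: the sample text and URL are invented, and the total corpus size is not stated here, so the last line only shows what the 1% figure would imply if it holds.

```python
def count_matches(text, needle="burger"):
    """Count case-insensitive occurrences of needle in text."""
    return text.lower().count(needle.lower())


def estimated_recipes(matches, hits_per_recipe=2):
    """Assume each recipe contributes hits_per_recipe matches (URL plus title)."""
    return matches / hits_per_recipe


# Invented two-line record: one match in the URL, one in the title.
sample = "http://example.com/recipes/best-burger\nBest Burger\n"
sample_matches = count_matches(sample)

burger_recipes = estimated_recipes(11_743)  # the match count reported above

# If that is roughly 1% of the corpus, the corpus would be about 100 times larger.
implied_total = burger_recipes * 100
```

A plain substring count also picks up things like “hamburger buns” in non-burger recipes, which is why the result is best treated as a rough upper bound.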

The biggest bonus of the matrix layout, however, is that it makes the frequency of specific items easy to see at a glance. Sorting by frequency, it’s easy to see that salt, sugar, butter, water, flour, olive oil, and eggs are used with nearly all other ingredients, and used repeatedly. “Salt and pepper,” as well, shows up frequently, and would likely be even more prevalent if I had taken the time to clean the data more thoroughly so that “salt” and “pepper” were the same as “salt and pepper.” Interestingly, the most frequently co-occurring ingredients in the Australian subset of recipes aren’t too different from those found in the all-encompassing corpus:

Top 15 Co-Ingredients (items marked * appear in both lists)

Rank  Australia          All
1     Olive Oil*         Salt*
2     Butter*            Sugar*
3     Salt*              Butter*
4     Milk*              Water*
5     Eggs*              Flour*
6     Sugar*             Olive Oil*
7     Water*             Eggs*
8     Lemon Juice*       Milk*
9     Plain Flour        All-Purpose Flour
10    Caster Sugar       Pepper
11    Brown Sugar*       Lemon Juice*
12    Honey              Egg
13    Cream              Baking Powder
14    Flour*             Cinnamon
15    Salt and Pepper    Brown Sugar*
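
The overlap between the two lists can be checked mechanically. A small sketch using the names exactly as they appear in the table, with exact string matching (so “Egg” and “Eggs” count as different items):

```python
australia = [
    "Olive Oil", "Butter", "Salt", "Milk", "Eggs", "Sugar", "Water",
    "Lemon Juice", "Plain Flour", "Caster Sugar", "Brown Sugar", "Honey",
    "Cream", "Flour", "Salt and Pepper",
]
all_recipes = [
    "Salt", "Sugar", "Butter", "Water", "Flour", "Olive Oil", "Eggs", "Milk",
    "All-Purpose Flour", "Pepper", "Lemon Juice", "Egg", "Baking Powder",
    "Cinnamon", "Brown Sugar",
]

australia_set = set(australia)
all_set = set(all_recipes)

# Keep each list's own rank order when reporting the overlap and the leftovers.
shared = [item for item in australia if item in all_set]
only_australia = [item for item in australia if item not in all_set]
only_all = [item for item in all_recipes if item not in australia_set]
```

With exact matching, ten of the fifteen items are shared, and each list has five items of its own.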






Obviously this continues to suffer from dirty data problems: egg and eggs are almost certainly the same ingredient, and “plain flour” is probably the same as “all-purpose flour.” Most of these ingredients make sense – they’re all relatively generic Western ingredients – but the inclusion of caster sugar seemed odd. After some quick and deep research (by which I mean I looked at the first few Google hits for “caster sugar”), I discovered that outside of the United States, superfine sugar is more commonly used. This does not, of course, explain the presence of baking powder in the corpus of “all” recipes.
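
A minimal sketch of the kind of cleanup that would merge these duplicates. The alias table and the decision to split “salt and pepper” into two ingredients are assumptions made for illustration, not something that was applied to the data behind these matrices.

```python
# Hypothetical alias table: maps a variant spelling to one canonical name.
ALIASES = {
    "egg": "eggs",
    "plain flour": "all-purpose flour",
}

# Hypothetical compound entries that could be split into their parts.
COMPOUNDS = {
    "salt and pepper": ["salt", "pepper"],
}


def normalize(name):
    """Lowercase, collapse whitespace, then apply the alias table."""
    cleaned = " ".join(name.lower().split())
    return ALIASES.get(cleaned, cleaned)


def normalize_recipe(ingredients):
    """Normalize every ingredient, split compounds, and drop repeats in order."""
    result = []
    for raw in ingredients:
        name = normalize(raw)
        for part in COMPOUNDS.get(name, [name]):
            if part not in result:
                result.append(part)
    return result


cleaned = normalize_recipe(["Egg", "Plain  Flour", "salt and pepper", "Salt"])
```

Whether “plain flour” and “all-purpose flour” should really be merged is a judgment call, so a table like this is best kept small and explicit.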

Between this and the last attempt at visualizing ingredient networks, I’ve come to the conclusion I was expecting: its use is limited. I was able to get some quick and dirty ideas of how ingredients tend to work together, at least for recipes on the site, and I now know more about caster sugar than I did before, but I’m not too sure I’ve learned much else beyond different options for visualizing network analyses.

See also: the Australian matrix and the matrix for All recipes (or rather a random selection from those recipes).

An aside on caster sugar: Caster or castor (the OED prefers castor, with an ‘o’) sugar, also known as superfine, has a finer grain than the typical granulated sugar found in the United States, but not quite as fine as powdered or icing sugar (which usually contains corn starch). It’s also great for cocktails because superfine will dissolve in relatively cold water due to its finer grain whereas American granulated sugar needs to be heated a bit. The name comes from putting the sugar in castors – small, saltshaker-like things.

I'll be putting all the code and data up on GitHub sometime soon so others can play with it if they so desire.

