Data modeling: What does a recipe need to include?


Here are some thoughts I’ve been chewing on about what a recipe needs to include in order to be worthy of the name. @rbaynes, @spaghet, and @webbhm, this is part of the ongoing discussion about data modeling in relation to designing a cheaper and simpler food computer (aka MVP food computer).

What is the nature and role of a recipe?

In cooking, the essential nature of a recipe is that it helps the person following it to reproduce a particular dish. If someone prepares food that you like, and you ask them for the recipe, the idea is that, by following the recipe well, you can re-create some aspect of your enjoyment from the original meal–that’s the point of it. A challenge with cooking is that reproducibility of results relies a lot on the training, experience, and judgement of the cook. In cooking–similar to the difference between a jazz performance and classical symphony–exact reproducibility may not be a priority at all compared to achieving other desirable traits.

Without going off on too much of a philosophical tangent, I’ll just make the claim that it’s often not possible to describe the truly important aspects of a thing using words–people much wiser than me say as much. My point is that you quickly run into challenges of diminishing returns if you chase after too much fidelity in a specification. Great cooks don’t necessarily use recipes, and even the best recipe might give poor results in a badly equipped kitchen or in the hands of an unskilled cook.

In practical terms, that means that a key part of writing a good recipe is making good assumptions about the baseline skills, capabilities, and desires of the intended audience who might follow it. Deciding on the content and level of detail for a recipe is a Goldilocks problem.

What does that imply for food computers?

  1. The mechanisms we build for working with food computer recipes need to allow for differing goals. History shows that it’s a hard problem for people who build software systems to anticipate what people who use them will want to do. Systems where designers make too many hard-coded policy assumptions tend to become obsolete quickly. In The Art of UNIX Programming, Eric S. Raymond talks about the benefits of separating mechanism and policy–we could learn from that.

    For example, suppose a recipe includes the policy of maintaining temperature at 24C plus or minus 3C. That could be expressed using the mechanism of a plaintext policy file, it could be interpreted using the mechanism of a policy file parser, and it could be applied using the mechanism of a temperature controller.

    Maybe your food computer implements temperature control with a PID loop and an air conditioner while my food computer just uses a fan with a thermostat. But, as long as both of our food computers provide the mechanisms of a recipe parser and an acceptably good temperature controller, the implementation details don’t matter–we can both follow the recipe.

  2. Assuming that the essential distinguishing feature of a food computer is that it facilitates repeatable phenotype expression in plants, a food computer recipe should be capable of articulating policies that lead to repeatable expression. That’s a hard problem, and the smaller you set the margin of error for repeatability, the harder it gets.

    But, there is a lot of horticultural knowledge for us to draw on. Watering schedules matter, seeds matter, temperature matters, light spectrum matters, light timing and duration matter, nutrient formulation matters, oxygenation matters, pruning matters, sanitation matters, etc. What we need is a general mechanism for selectively articulating policies about those sorts of things.
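To make the temperature example from item 1 a bit more concrete, here is a minimal Python sketch of keeping policy and mechanism apart. The policy line syntax (`temperature: 24C +/- 3C`), the function `parse_temperature_policy`, and both controller classes are made up for illustration; none of this is an existing food computer API, and the "fancier" controller is only a proportional term, not a real PID loop.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TemperaturePolicy:
    """Policy: what the recipe wants, with no mention of hardware."""
    target_c: float
    tolerance_c: float

    def in_range(self, measured_c: float) -> bool:
        return abs(measured_c - self.target_c) <= self.tolerance_c


# Hypothetical plaintext policy syntax, e.g. "temperature: 24C +/- 3C"
_POLICY_RE = re.compile(
    r"^temperature:\s*(?P<target>-?\d+(?:\.\d+)?)C\s*\+/-\s*(?P<tol>\d+(?:\.\d+)?)C$"
)


def parse_temperature_policy(line: str) -> TemperaturePolicy:
    """Mechanism 1: turn a line of policy text into a policy object."""
    match = _POLICY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized policy line: {line!r}")
    return TemperaturePolicy(float(match["target"]), float(match["tol"]))


class FanThermostat:
    """Mechanism 2a: a crude on/off controller (a fan can only cool)."""

    def command(self, policy: TemperaturePolicy, measured_c: float) -> str:
        return "fan_on" if measured_c > policy.target_c else "fan_off"


class ProportionalController:
    """Mechanism 2b: stand-in for a fancier controller (P term only)."""

    def __init__(self, gain: float = 0.5) -> None:
        self.gain = gain

    def command(self, policy: TemperaturePolicy, measured_c: float) -> str:
        # Positive output means "cool harder"; clamp to the range 0..1.
        output = max(0.0, min(1.0, self.gain * (measured_c - policy.target_c)))
        return f"cooling_power={output:.2f}"


# The same policy drives two different machines.
policy = parse_temperature_policy("temperature: 24C +/- 3C")
for controller in (FanThermostat(), ProportionalController()):
    print(type(controller).__name__, controller.command(policy, 26.0))
```

The point of the sketch is that the recipe text only ever produces a `TemperaturePolicy`. Which controller consumes it is a property of the machine, not of the recipe.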

How to apply these ideas in practice?

Here’s my first attempt at describing a data model for a food computer recipe. As I discussed above, writing a cooking recipe requires that you make assumptions about the cook’s capabilities. The same concept seems applicable to food computers, so, along with the recipe, I’m describing a set of minimum capabilities for an abstract food computer.

I’m stating the stuff below in absolute language to avoid the hassle of typing lots of qualifying weasel words. What I’m proposing is likely incomplete and perhaps outright wrong–I welcome comments.

Proposed abstract machine model for a Food Computer

At minimum, a food computer must provide concurrent processes to implement these things:

  1. Event loop: A mechanism of triggering actions in response to events according to mapping rules specified in a policy.

  2. Sensing: A mechanism for sending events to the event loop with sensor measurements or human input about observed conditions.

  3. Actuation: A mechanism of actions for effecting physical changes in the growth chamber. This could include things like turning on a relay or giving instructions to a human to flip a switch.

  4. Storage: A mechanism of actions for recording images, sensor measurements, and other events to disk.

  5. Data Export: A mechanism of actions to retrieve data from storage and send it somewhere.

  6. Timing: A mechanism of actions that send events based on clock time or interval timers.

  7. Policy Loading: A mechanism of actions for loading new policy rules into the event loop’s mapping of events to actions. This could include things like starting a recipe, enabling or disabling a sensor driver, switching actuation for a fan from machine-controlled to human-controlled, etc.
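As a rough sketch of how the pieces in the list above might fit together, here is a toy event loop in Python. It collapses the concurrent processes into a single in-memory queue for brevity, and every name in it (`EventLoop`, `load_policy`, `send`, `ask_person`, the event names) is hypothetical rather than part of any existing food computer code.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Action = Callable[["EventLoop", dict], None]


class EventLoop:
    """Toy event loop: maps event names to lists of actions."""

    def __init__(self) -> None:
        self.rules: Dict[str, List[Action]] = defaultdict(list)
        self.queue: Deque[Tuple[str, dict]] = deque()
        self.log: List[Tuple[str, dict]] = []  # in-memory stand-in for storage

    def load_policy(self, rules: Dict[str, List[Action]]) -> None:
        """Policy loading: add event --> action mappings."""
        for event, actions in rules.items():
            self.rules[event].extend(actions)

    def send(self, event: str, **data) -> None:
        """Sensing and timing would both post events through here."""
        self.queue.append((event, data))

    def run_until_idle(self) -> None:
        while self.queue:
            event, data = self.queue.popleft()
            self.log.append((event, data))  # storage: record every event
            for action in self.rules.get(event, []):
                action(self, data)


def ask_person(message: str) -> Action:
    """Actuation via a human: print the instruction."""
    def action(loop: EventLoop, data: dict) -> None:
        print(f"PLEASE: {message}")
    return action


def too_hot_check(loop: EventLoop, data: dict) -> None:
    """Turn a raw sensor reading into a higher-level event."""
    if data["celsius"] > 27.0:
        loop.send("too_hot")


loop = EventLoop()
loop.load_policy({
    "init": [ask_person("sanitize the growth chamber")],
    "temperature_reading": [too_hot_check],
    "too_hot": [ask_person("turn on the fan")],
})
loop.send("init")
loop.send("temperature_reading", celsius=28.5)
loop.run_until_idle()
```

A real implementation would need actual concurrency, persistent storage, and a timer source, but the shape stays the same: actions can post new events, and loading a policy only changes the mapping.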

Proposed abstract model for a Food Computer Recipe

A food computer recipe specifies policy rules that map events to actions. The building blocks for a policy file are predefined events like init and predefined actions like ask a person to, or schedule a repeating timer to.

As an example (using the format of event --> action_block):

init --> {
    ask a person to "obtain a specific type of seeds"
    ask a person to "sanitize the growth chamber"
    ask a person to "mix the nutrient solution"
    ask a person to "plant the seeds"
    ask a person to "send start_germination_phase using the sensing UI when you are ready"
}
start_germination_phase --> {
    configure temperature control to maintain "25.0C"
    schedule a repeating timer to "send change_nutrients every 2 weeks"
    schedule a repeating timer to "send check_for_sprouts every day"
}
check_for_sprouts --> {
    ask a person to "send start_growth_phase using the sensing UI when you see sprouts"
}
start_growth_phase --> {
    configure temperature control to maintain "24.0C"
    schedule a repeating timer to "send daily_check every day"
}
change_nutrients --> {
    ask a person to "change the nutrient solution"
}
daily_check --> {
    ask a person to "prune the plants if needed"
    ask a person to "weigh the plant tissue and send a harvested_grams event from the storage UI if they pruned"
}
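To check that the format above is regular enough for a machine to read, here is a small Python parser sketch. It assumes each action line is an action name followed by one double-quoted argument, and it treats a closing brace as optional. The function name `parse_recipe` and the output shape are my own choices for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple

# An action line looks like:  <action name> "<quoted argument>"
_ACTION_RE = re.compile(r'^(?P<name>[^"]+?)\s*"(?P<arg>[^"]*)"$')
# A block header looks like:  <event name> --> {
_EVENT_RE = re.compile(r"^(?P<event>\w+)\s*-->\s*\{$")


def parse_recipe(text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Parse 'event --> { actions }' text into {event: [(action, argument)]}.

    A closing '}' is accepted but not required; a new 'event --> {' line
    also ends the previous block.
    """
    rules: Dict[str, List[Tuple[str, str]]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "}":
            continue
        event_match = _EVENT_RE.match(line)
        if event_match:
            current = event_match["event"]
            rules.setdefault(current, [])
            continue
        action_match = _ACTION_RE.match(line)
        if action_match is None or current is None:
            raise ValueError(f"cannot parse line: {raw!r}")
        rules[current].append((action_match["name"], action_match["arg"]))
    return rules


example = '''
init --> {
    ask a person to "plant the seeds"
    ask a person to "send start_germination_phase using the sensing UI when you are ready"
}
start_germination_phase --> {
    configure temperature control to maintain "25.0C"
    schedule a repeating timer to "send check_for_sprouts every day"
}
'''

rules = parse_recipe(example)
for event, actions in rules.items():
    print(event, actions)
```

Note that the quoted arguments are still free-form prose here. Something like "send check_for_sprouts every day" would need its own structure before a machine could act on it.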

@rbaynes, Do you have any thoughts about that?


You call it a Goldilocks Problem, I call it context. There are always assumptions and a larger context to any issue. This is also the problem of ‘part of’ versus ‘related to’. My Joy of Cooking cookbook has several sections that are not recipes - descriptions of spices and cuts of meat, conversion tables and basic procedures. This is the difficulty of defining what is ‘a part of’ and what is ‘related to’. It is all needed information, the question is where to put it. I think we want to include all the parts in a recipe, but separate the relations to protocols and background.
I like your temperature example. We want the ‘instructions’ to say the target temperature is 24C, with a min of 21C and a max of 27C (though we do need to agree on how this is worded, and on what happens if the temperature exceeds the limits). Whether this is controlled by a PID or mechanical thermostat is a separate issue (but needs testing to see if they are equivalent). Basically the recipe defines ‘what’ not ‘how’.

From my data background I am a strong believer in normalized data models; though I take a lot of exceptions with implementation. Normalization is great for driving out problems of data capture, but it is a real pain for query performance (always a trade off). I run the risk of driving everyone (other than myself) crazy with the modeling issues. One of my favorite quotes is from Ludwig Wittgenstein’s Tractatus: “Whatever defines all the members of a set is not itself a member of the set”.
I am also a big fan of XML: human readable and machine understandable. It tends to be heavy weight and complex, but that is the price to pay for validation and precision. JSON is a lazy version of XML that sacrifices most of the validation. It is much more performant, but gets sloppy very quickly. In my experience programmers love it (“we will put the validation in the code”), but never follow through with the promises (and let the data degenerate very quickly).


It is good to start getting examples of recipes, as we can begin to go through them looking at the details of structure and wording. The definition of ‘recipe’ tends to be oriented toward food, but applies for our use: “A set of instructions for preparing a particular dish, including a list of the ingredients”, “Something that is likely to lead to a particular outcome.”

The trick is knowing how much to include (and leave out) of the instructions. A chocolate chip cookie recipe may give the sequence for adding ingredients together, but usually says “mix the ingredients” without specifying whether I am using my fingers, a spoon, or an electric mixer. I think plant recipes should be equally agnostic to environment/context, and not assume that there is minimal equipment, or a fully automated system. It should state what I want, not how to achieve it. How it is achieved is important, but I think that goes more toward the instructions (protocols) for operating the environment than the recipe. The recipe should state things like “Air temperature: 27C, min 24C, max 29C”.

You have caught the idea that we need to break this up into phases (init, germination, growth, …), and that some things are ongoing (daily check).

Will this always be a person, or could it be a robot? Human -vs- automation is on the environment/context, and I think the recipe should be agnostic to these issues.

Rather than ask “obtain a specific type of seed”, the recipe should state what it is for: “Plant: Burpee Buttercrunch Lettuce”. We may want to break this out more formally as:
Genus-Species: Lactuca sativa
Name: Buttercrunch Lettuce
Supplier: Burpee
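One way to picture the "what, not how" part of a recipe is as plain declarative records. The Python sketch below is only an illustration under that assumption. The class names `Plant` and `Setpoint` and their field names are invented here, and the values are the ones quoted in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    """Identifies what is being grown."""
    genus_species: str
    name: str
    supplier: str


@dataclass(frozen=True)
class Setpoint:
    """A target with allowed limits; says what is wanted, not how to get it."""
    target: float
    minimum: float
    maximum: float
    unit: str

    def __post_init__(self) -> None:
        if not (self.minimum <= self.target <= self.maximum):
            raise ValueError("expected minimum <= target <= maximum")

    def status(self, measured: float) -> str:
        """Classify a measurement against the limits."""
        if measured < self.minimum:
            return "below_min"
        if measured > self.maximum:
            return "above_max"
        return "ok"


plant = Plant("Lactuca sativa", "Buttercrunch Lettuce", "Burpee")
air_temperature = Setpoint(target=27.0, minimum=24.0, maximum=29.0, unit="C")
print(plant)
print(air_temperature.status(30.0))
```

What should happen when `status` is not "ok" is deliberately left out. That seems like a protocol or environment question rather than a recipe question.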

General protocols may best be pulled out of the recipe (since they apply to all recipes), though the procedure needs to be clearly defined somewhere. This should get detailed (“spray with 10% Clorox solution”, “Let dry for two days before using”, …)

“Mix the nutrient solution”: This can be a bit tricky. Is this done only during the initialization, or can it be done at other times? I am inclined to have the recipe state what the nutrient ingredients are, and defer the mixing to a protocol. This can also vary depending on if I am making up the solution manually, or using a dosing system to automate the process. The protocol will vary depending on my environment, and less on the recipe.EC (if dosing is automated). There is a part of me that would like a bit of this in the recipe, so that searching for similar data sets has something to query against; say a manual value stating to change every week, and an automated value of the EC to maintain.

Things like pruning and checking plants also need to have protocols, giving them procedural definitions.


@webbhm That’s good stuff. I think your idea of making the distinction between recipes and protocols could be very helpful. In particular, if we use “state what I want, not how to achieve it” in recipes, and supplement that with documentation about equivalent manual and automated protocols, that should make it easier to build a modular food computer ecosystem.

Some recipes would presumably need protocols that would be difficult for a human to implement with acceptable fidelity, but we don’t need to focus on that sort of thing to begin with–just leave that door open so we aren’t creating obstacles that would prevent it.

To help people get started, we could focus on recipes that work well using manual protocols or automated protocols. Said another way, rather than thinking about modularity and upgradability at the level of ROS configuration files (too low level), we could think about protocols–temperature regulation, equipment sanitization, preparing nutrient solutions, etc.

[edit: here’s an example of some protocol stuff on the wiki that could be expanded on:]


@webbhm Would you be interested in drafting a few example recipes [edit: and/or protocols] in the style you’ve got in mind, or maybe using different styles to illustrate alternate approaches you’re considering?

For the purpose of exploring these ideas, how do you think we should write example recipes? My inclination would be to start with prose, then work towards something machine parsable once we understand better what the content should be.


Those are good examples of protocols. Since these are most likely to be manually performed, it makes sense to have protocols in prose.
Here is what I started a while back. The JSON is a bit of a pain to try and work in (even with an editor). Maybe we could just do block indent for now (to define the sections). I just noticed I left out the section on plant description on this one.


I like the content of what you started there, but, as you observed, formatting it as JSON leaves a lot to be desired.

Here are a few ideas that come to mind for a potentially more convenient workflow and approach:

  1. What about finding a cookbook you like and copying its style of prose?

  2. I’ve seen the recommendation of writing design or procedure specifications with the intention of being able to teach somebody how to do it by reading to them over the phone. Alternately, trying to explain a procedure over the phone can be used to identify what needs improvement.

  3. What do you think about starting an example recipe repository on GitHub, or perhaps as a wiki section? I mention GitHub because version control is nice, and they have a great online markdown editor that lets you easily switch between plaintext and preview modes–it’s great for simple formatting like bolding, headings, lists, etc. Here’s GitHub’s markdown guide: A lot of the same syntax works here, but a GitHub repo for example recipes would have the advantage of integrating nicely as we start moving toward automatic recipe parsing. Forum posts or wiki pages would be fine too though.


I like the design and thought process. In the formalization of the recipe “language” I think we should have explicit types for time. e.g.
schedule a repeating timer to "send daily_check" repeat 24 hours
The parser (human or machine) will understand the time series syntax: repeat, once, days, hours, minutes, etc.
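A minimal Python sketch of what an explicit time type could look like on the parsing side, assuming a `<mode> <count> <unit>` suffix such as `repeat 24 hours`. The function name `parse_schedule` and the exact unit vocabulary are assumptions for illustration, not an agreed syntax.

```python
import re
from datetime import timedelta
from typing import Tuple

# Assumed unit vocabulary, mapped to datetime.timedelta keyword names.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days", "week": "weeks"}
# <mode> <count> <unit>, where the unit may be singular or plural.
_SCHEDULE_RE = re.compile(r"^(repeat|once)\s+(\d+)\s+([a-z]+?)s?$")


def parse_schedule(text: str) -> Tuple[str, timedelta]:
    """Parse e.g. 'repeat 24 hours' into ('repeat', timedelta(hours=24))."""
    match = _SCHEDULE_RE.match(text.strip().lower())
    if match is None or match.group(3) not in _UNITS:
        raise ValueError(f"bad schedule: {text!r}")
    mode, count, unit = match.groups()
    return mode, timedelta(**{_UNITS[unit]: int(count)})


for spec in ("repeat 24 hours", "repeat 1 day", "once 30 minutes"):
    print(spec, "->", parse_schedule(spec))
```

Because the result is a real duration type rather than a string, `repeat 24 hours` and `repeat 1 day` compare equal once parsed.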

I would also add a #8. Messaging. Writing to a log, sending an email, popping up a dialog.

Great work, Will! I know @Webb.Peter and @webbhm are both thinking about this issue too.


@webbhm I am fine with either JSON or XML, both are widely used and understood. I hope to never have to type either by hand, but reading and adjusting one for development / debugging is reasonable. In my mind the advantage of XML is a more heavy weight parser and stricter syntax, but also the ability to have COMMENTS in it :slight_smile:


@rbaynes, @wsnook
Yes, we need interval, frequency data; Definitely we need explicit types; No, it is not (technically) part of the recipe.
The recipe should be portable, and I don’t think frequency of operation is likely to be a defining characteristic of a recipe (Check temperature every minute, check water level daily).
I could see “frequency” being an attribute of a protocol, specific to an environment; and the attribute would have two parts: value and unit. We could write it something like:
Action: Check_Temperature
Frequency: value: 1; unit: minute

We will need to have a convention so we don’t end up with…

Action: "send daily_check"
Frequency: value: 24; unit: hour
And also…
Action: "send daily_check"
Frequency: value: 1; unit: day
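One possible convention, sketched in Python: always re-express a frequency in the largest unit that divides it evenly, so the two spellings above collapse to the same thing. The helper names `parse_attribute` and `canonical_frequency`, and the limited unit set, are assumptions for illustration only.

```python
from typing import Dict, Tuple

# Assumed unit set for this sketch.
_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


def parse_attribute(line: str) -> Tuple[str, Dict[str, str]]:
    """Parse 'Frequency: value: 24; unit: hour' into ('Frequency', {...})."""
    name, _, rest = line.partition(":")
    fields: Dict[str, str] = {}
    for part in rest.split(";"):
        key, sep, value = part.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value' in {part!r}")
        fields[key.strip()] = value.strip()
    return name.strip(), fields


def canonical_frequency(value: int, unit: str) -> Tuple[int, str]:
    """Re-express a frequency in the largest unit that divides it evenly."""
    seconds = value * _SECONDS[unit.rstrip("s")]  # accept 'minutes' too
    for name in ("day", "hour", "minute"):
        if seconds % _SECONDS[name] == 0:
            return seconds // _SECONDS[name], name
    raise ValueError("not a whole number of minutes")


for line in ("Frequency: value: 24; unit: hour", "Frequency: value: 1; unit: day"):
    _, fields = parse_attribute(line)
    print(line, "->", canonical_frequency(int(fields["value"]), fields["unit"]))
```

Other conventions would work as well (for example, always storing seconds). The point is only that one is picked and applied when the data is captured.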

We will also have time intervals, so something x period after an event…
Action: pollinate
Interval: value: 2; unit: day; event: flowering

Again, we need some conventions so we don’t end up defining regulators as interval events rather than responses to events (i.e., the water level sensor gets triggered).
Action: turn water_off
Interval: value: 5; unit: minutes; event: water_on