"Backend" openag-cloud-v1 Q&A for PFC 3.0


#21

Thank you for taking the time to explain your thoughts and providing links to supporting material! I gave the two items you mentioned a read and I think I understand where you’re coming from.

I guess I’m having trouble applying the “Separate Research and Development Phases” idea to the data set itself. In summary, the data set will change as designs and prototypes improve, but “sharing data” as a general concept isn’t tied to either research or development. Sharing data isn’t something you need to research, develop, and optimize - you just do it.

I’ll leave it there, as you’ve been gracious enough to discuss it this far. My next step is actually contributing to a solution instead of running my mouth :face_with_hand_over_mouth:! Perhaps I’ll lead an effort to migrate the data to an offline file system, but I doubt that will scale well.

Once again, thanks for all the insights and thanks for being a part of the discussion!


#22

@Drew I like your suggestion of trying an experiment to validate your ideas. If you haven’t read back carefully through Howard’s posts for the past 2 years, you might find useful ideas and observations there. The forum’s user profile activity tab is a good way to find such things.

Sorry to keep dragging this out, but it sounds like you have a genuine interest…

As an analogy, research applies to OpenAg’s sharing of data in the way that having something coherent to say and sharing a common language with readers applies to writing. Sharing data is essentially an instance of human written communication, so all the typical problems around syntax, semantics, grammar, vocabulary, semiotics, logic, and style apply. Things like pictures, timestamps, and temperature measurements are comparable to letters in an alphabet or syllables in spoken words. We need layers of commonly agreed upon structure on top of those things to make them useful.

You and I are able to have this conversation because we share a lot of common understanding about the use of the English language. OpenAg did not inherit a common language that is suitable for exchanging data sets that can be used to create optimized growth recipes with machine learning. Various companies and researchers have their own internal systems, but those are the linguistic equivalent of local dialects that outsiders would not be able to understand.

The Growth Chamber Handbook from NCERA-101 is an example of a common language for exchanging data about controlled environment agriculture, but it’s meant for writing papers in journals rather than feeding machine learning algorithms. Caleb has mentioned genome research as an example he would like to follow. We could look more closely at how the genetics folks share their data. As I write this, I’m thinking that needs to go on my todo list.


#23

Thanks for helping me revisit this from a different angle. I hope we’re on the same page in the end.

I respectfully disagree with the statement that we need layers of commonly agreed upon structure to make these things useful, at least as you’ve put it. While common structure or semantics tend to make things more useful, they are in no sense required for something to be of use. As an example, there have been recent advances in cancer screening techniques using ML trained against previously collected data, specifically images of gastric cancer. I hope this illustrates my point that data lacking some special-purpose structure can be very useful.

In our case I would say we already have enough structure to make any data collected by these systems useful. JavaScript Object Notation does the job of scoping serialized objects and converting them into readable text. I could easily pick out attribute-value pairs that are useful for analysis today. Thankfully we already have a local dialect that outsiders will be able to understand! If the schema changes often, then quality analysis is much harder to maintain, but this doesn’t make the data useless.
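To make that concrete, here is a rough sketch of what picking out attribute-value pairs could look like. The field names and sample records below are made up for illustration; they are not the actual openag-cloud-v1 schema. The list of fallback keys is one possible way to cope with a schema that changes over time.

```python
import json

# Hypothetical sample records. The field names are assumptions for
# illustration only, not the real openag-cloud-v1 schema.
RAW_RECORDS = [
    '{"device": "pfc-001", "timestamp": "2019-03-01T12:00:00Z", "air_temp_c": 23.5}',
    '{"device": "pfc-001", "timestamp": "2019-03-01T12:05:00Z", "air_temp_c": 23.9, "humidity_pct": 61}',
    '{"device": "pfc-002", "timestamp": "2019-03-01T12:00:00Z", "temperature": 22.1}',
]


def extract_values(raw_records, keys):
    """Return (timestamp, value) pairs using the first matching key per record.

    Passing several candidate keys lets the same analysis run over records
    written under older and newer versions of a schema. Records that contain
    none of the keys are skipped.
    """
    results = []
    for line in raw_records:
        record = json.loads(line)
        for key in keys:
            if key in record:
                results.append((record.get("timestamp"), record[key]))
                break
    return results


# Temperature appears under two different (hypothetical) names.
temps = extract_values(RAW_RECORDS, ["air_temp_c", "temperature"])
for ts, value in temps:
    print(ts, value)
```

The more often key names drift, the longer the fallback list gets, which is the maintenance cost mentioned above. The data stays usable either way.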

As PFCs mature, the body of useful data collected will increase… in usefulness! In my opinion this warrants establishing a data sharing expectation or contract. Until this exists I’m worried that using any data store (including my own in isolation) will work against the combined benefits of data sharing on this platform.

To be very frank, I’m concerned that data we collect using open source tools will be walled off and monetized at some point. I see this threat as potentially detrimental to the benefit of people in general. My sentiments are unlikely to change without a clear direction from the holding companies that most likely fund, and thus own, OpenAg data resources moving forward.

My sincere regrets if this has already been broached elsewhere. I’ve spent a chunk of time today using the forum search and did not find any relevant posts.


#24

@Drew I hear you. Those are reasonable questions to wonder about.

For whatever it’s worth, I’ve come around to the view that Caleb has good intentions, he pretty much has his act together, and his team is taking a credible shot at worthwhile and challenging goals. Success certainly isn’t a sure thing, but, in this life, what is? I’ve been watching OpenAg closely for a couple years. 2017 was a rough patch. But, particularly since last fall, it seems like things are starting to come together nicely and solidify.


#25

@wsnook You give me hope! I really want to avoid sounding like a skeptic. The straight truth is that OpenAg is the best shot I’ve found to collaborate. I really want to build faith and invest some time contributing. I have a different take on the client-side open source hardware (which I’m developing now), but this consolidated data thing is the next big step. I can feel myself going off topic, so unless there is more to be discussed on the OT, I’ll conclude by thanking you again for the great dialogue!


#26

@Drew I like the designs on your site.

Hey @Webb.Peter, I think you would like this stuff.