The following is written for circulation in the “data science” research communities, to share some advances in scientific methods of system recognition. It is taken from a letter to the authors of the very nice nine-year-old paper published by Google, “Detecting Influenza Epidemics using search engine query data”, and starts by mentioning that work. Read the reference as if it were to your own work, though, insofar as your work involves system recognition, whether in life or exposed by streams of incoming data.
I expect a lot of new work has followed your seminal paper on detecting epidemics as natural systems.
But are there people starting to focus on more general “system recognition”,
studying “shapes of data” that expose “design patterns” for the systems producing it?
Any individual “epidemic” is a bit like a fire running its course, and sometimes innovating in the way it spreads. That change in focus directs attention to how epidemics operate as emergent growth systems, with sometimes shifting designs that may be important and discoverable, if you ask the right questions. You sometimes hear doctors talking about epidemics that way. In most fields there may be no one thinking like doctors, even though in a changing world that way of thinking really would apply to any kind of naturally changing system.
Turning the focus to the systems helps one discover transformations taking place, exposed in data of all sorts. One technique allows data curves to be made differentiable, without distortion. That lets you display evidence of underlying systems perhaps entering periods of convergence, divergence or oscillation, for example, prompting questions about what evidence would confirm it or hint at how and why.
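As a rough illustration of the idea, here is a minimal Python sketch. It is not the technique referred to above: it assumes a generic local polynomial fit (in the style of a Savitzky-Golay filter) as a stand-in for making a data curve differentiable, and the function names `local_derivatives` and `describe_regime`, the window size, and the simple labelling rule are all hypothetical choices made for this example.

```python
import numpy as np


def local_derivatives(y, window=7, order=3):
    """Fit a low-order polynomial around each point (an assumed stand-in
    technique) and return the fitted value and first derivative there."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    half = window // 2
    value = np.empty(n)
    d1 = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        x = np.arange(lo, hi) - i          # positions relative to point i
        poly = np.poly1d(np.polyfit(x, y[lo:hi], order))
        value[i] = poly(0.0)               # fitted value at point i
        d1[i] = poly.deriv(1)(0.0)         # fitted slope at point i
    return value, d1


def describe_regime(d1):
    """Crude, hypothetical labelling rule: repeated sign changes in the
    slope suggest oscillation; otherwise compare early and late slopes."""
    sign_changes = int(np.sum(np.diff(np.sign(d1)) != 0))
    if sign_changes >= 2:
        return "oscillation"
    half = len(d1) // 2
    early = np.mean(np.abs(d1[:half]))
    late = np.mean(np.abs(d1[half:]))
    return "convergence" if late < early else "divergence"


if __name__ == "__main__":
    t = np.arange(60)
    series = {
        "growth": np.exp(0.1 * t),
        "saturation": 1.0 - np.exp(-0.1 * t),
        "cycle": np.sin(0.3 * t),
    }
    for name, y in series.items():
        value, d1 = local_derivatives(y)
        print(name, describe_regime(d1))
```

The label is only a prompt for further questions about the underlying system, not a conclusion. The ratio `d1 / value` gives a local growth rate, which can be a more telling signal than the slope alone. Real data would also call for care with noise and with the choice of window.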
Focusing on “the system” uses “data” as a “proxy” for the systems producing it, like using a differentiable “data equation” to closely examine a system’s natural behavior. In the past we would have substituted a statistic or an equation instead. By prompting better questions, this approach makes data more meaningful, whether you find answers right away or not. I think over the years I’ve made quite a lot of progress, with new methods and recognized data signatures for recurrent patterns, and would like to find a way to share it with the IT community and collaborate on some research.
Where it came from is very briefly summarized with a few links below. Another quick overview is in 16 recent Tweets that got a lot of attention this past weekend, collected as an overview of concepts for reading living systems with big data.
I hope to find research groups I can contribute to. If you’re interested, you might look at my consulting resume too. If you have questions and want to talk by phone or Skype, please just email a suggested time.
Thanks for listening! – Jessie Henshaw