Spalba - What can migratory birds tell us about creating better AI-based predictive analytical models?

Veery © Jean-Pierre Marcil/Flickr via Creative Commons license

In July 2018, an ornithologist named Christopher Heckscher tweeted a prediction that the Atlantic hurricane season would be more severe than average. This prediction was at odds with the forecasts generated by most sophisticated computer models, which draw on decades of meteorological data. It is worth pointing out that Heckscher is an ornithologist, not a data or computer scientist. His prediction was based on a correlation he had observed over two decades between the timing of the veery's migration from the southern USA to the Amazon forest in Brazil and the severity of the Atlantic hurricane season.

When the hurricane season ended about five months later, Heckscher’s prediction turned out to be spot on. “The birds were saying bad season, and everyone else was saying below-average season,” says Heckscher, an associate professor at Delaware State University.

To understand this, we need to look at how AI-based prediction models work. In simple terms, an AI model predicts a specific outcome from a set of underlying factors (known as features in AI parlance). Using one of several alternative techniques, the model establishes a statistical relationship between the features and the outcome from a training data set. Once trained, the model is given a new set of feature values and returns a prediction for the outcome.
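As a minimal sketch of this train-then-predict cycle, the snippet below fits a straight line to a single feature with ordinary least squares and then predicts the outcome for an unseen feature value. The data and the names `fit_line` and `predict` are made up for illustration; real models use many features and more elaborate techniques.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Apply the trained model to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: the outcome happens to follow y = 2x + 1.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)   # training step
prediction = predict(model, 5.0)     # prediction step
print(model, prediction)
```

The training step sees only the historical pairs; the prediction step reuses the learned slope and intercept on a value the model has never seen.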

Now, imagine that you are a domain expert in a specific field and want to create such a prediction model. You have data scientists and AI experts who can build the model for you. However, they look to you to help them identify the relevant factors (or features) that need to be considered while creating the model. After all, you understand your field much better than they do.

In such a scenario, it is natural that you would give them every factor you think might have a role to play. At times, I have seen data sets with upwards of 1,000 such features for predicting a single outcome. The data scientists in your team would then use this data set to try to create a prediction model. However, narrowing the list down to the features that truly matter requires very close coordination between the domain experts and the data scientists (possibly months of painful effort), and I have seen this step overlooked in most cases.

This is where most prediction models start to go awry. With many irrelevant or cross-correlated features in the training data set, a lot of so-called noise is introduced into the model, which reduces its accuracy. To compensate, the model creators then turn to hyperparameter tuning (“hyper parameterisation”): running the learning algorithm across a wide array of settings and picking the set of parameters that suffers least from the noise. This is fundamentally like repairing the car engine when the fault lies with the fuel being injected into it.
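To make the effect of an irrelevant feature concrete, here is a deliberately contrived sketch using a 1-nearest-neighbour classifier. The data set is hand-made for illustration: feature 0 determines the label (1 when it is above 0.5), while feature 1 is unrelated to the label but has a much larger numeric range, so it dominates the distance calculation.

```python
def nearest_label(train, query, use):
    """1-nearest-neighbour: label of the closest training row, using
    squared Euclidean distance over the feature indices in `use`."""
    def dist(row):
        return sum((row[i] - query[i]) ** 2 for i in use)
    _, label = min(train, key=lambda pair: dist(pair[0]))
    return label


def accuracy(train, test, use):
    """Fraction of test rows whose predicted label matches the true one."""
    hits = sum(nearest_label(train, x, use) == y for x, y in test)
    return hits / len(test)


# Hypothetical rows: ((relevant feature, irrelevant feature), label)
train = [((0.1, 5.0), 0), ((0.2, 9.0), 0), ((0.8, 1.0), 1), ((0.9, 6.0), 1)]
test = [((0.15, 1.2), 0), ((0.3, 5.8), 0), ((0.7, 8.7), 1), ((0.85, 4.9), 1)]

acc_all = accuracy(train, test, use=[0, 1])   # both features
acc_relevant = accuracy(train, test, use=[0])  # relevant feature only
print(acc_all, acc_relevant)  # prints 0.0 1.0
```

No setting of the algorithm changes the fact that the second column carries no information about the label; removing it fixes the input rather than the engine.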

Given the limited brain size and capacity of veery birds, it is safe to assume that these migratory birds use very few factors to predict the weather conditions that will prevail a few months later. Yet they gave better predictions because, in their case, they act as both the domain experts and the data scientists. The birds zero in on a few relevant factors out of the possibly thousands that might affect the weather, and therefore create a better model. Compare this with the computer models, which, one would assume, use hundreds or thousands of such factors and, as a result, create noise.

However, using this data set, our trained models gave predictions that were sometimes off by more than 40%, even with the best hyperparameter tuning, which led to extensive domain-specific discussions and introspection. After weeks of deliberation and back and forth, we were able to reduce the input data set to about 15 key features. This has given us fantastic results, with the average error reduced to less than 7%.
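The article does not say how the key features were chosen beyond crediting weeks of domain discussion, so the following is only one simple automated screen that could feed such a discussion: rank each candidate feature by the absolute value of its Pearson correlation with the outcome and keep the top few. The feature names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_features(features, outcome):
    """Feature names sorted by |correlation with the outcome|, strongest first."""
    return sorted(features,
                  key=lambda name: abs(pearson(features[name], outcome)),
                  reverse=True)


# Hypothetical candidate features and the outcome they should explain.
outcome = [1, 2, 3, 4, 5]
features = {
    "f_noise_a": [3, 1, 4, 1, 5],
    "f_signal": [2, 4, 6, 8, 10],
    "f_noise_b": [2, 7, 1, 8, 2],
    "f_inverse": [10, 8, 6, 4, 3],
}

ranked = rank_features(features, outcome)
kept = ranked[:2]  # keep only the two strongest candidates
print(kept)
```

A screen like this does not catch cross-correlated features (here `f_signal` and `f_inverse` carry much the same information), which is one reason the discussion with domain experts still matters.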

Needless to say, we are now looking at more such nature-based inspirations to drive our software development efforts. Do share if you have any.

Happy Predicting!!!