Fellow programmers, could you point me to some good material on automatic trend detection in data?
For example, suppose you have a log of events – say, temperature readings from 10,000 sensors – and you need to identify which sensor has suddenly started rising rapidly.
The first idea that comes to mind is to compute trends over short windows and analyze micro-trends across two or three periods, but this approach has plenty of downsides. First, there can be fluctuations unrelated to real growth; second, some sensors report rarely relative to the analysis window, which makes it hard to choose a sensible period for computing the average. Essentially, this approach only works when the sensor data is very dense, and here the density varies: thick at times, then sparse, and differently for each sensor. Sure, you could build dynamic groups and somehow tag sensors as “frequent” and “rare”, but all of that complicates things, and I feel like my thinking is heading the wrong way.
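For what it's worth, one common way around the fixed-window problem is a time-decayed exponential moving average, which handles irregular sampling without per-sensor window tuning: a sparse sensor simply decays more between samples. A minimal sketch (the function name and the `tau` parameter are my own choices for illustration, not from any particular library):

```python
import math

def ewma_irregular(samples, tau):
    """Time-decayed exponential moving average over irregularly
    sampled (timestamp, value) pairs.

    tau is the decay time constant, in the same units as the
    timestamps. The weight of old data falls off with elapsed
    time, so dense and sparse sensors can share one tau instead
    of each needing its own window size."""
    avg = None
    t_prev = None
    for t, v in samples:
        if avg is None:
            avg = float(v)          # first sample seeds the average
        else:
            # the longer the gap since the last sample,
            # the more weight the new value gets
            alpha = 1.0 - math.exp(-(t - t_prev) / tau)
            avg += alpha * (v - avg)
        t_prev = t
    return avg
```

Comparing such a smoothed value against a second average with a longer `tau` gives a crude but sampling-rate-independent trend signal.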
Essentially, what is needed is to construct first- and second-order derivatives of the readings over time and analyze their shapes. Another complication is that the number of sensors is effectively unbounded: some appear, others disappear, and newly appeared sensors should be eligible for trend detection too.
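On irregular timestamps, a least-squares slope over a sliding time window is a more noise-tolerant first-derivative estimate than pairwise finite differences; the difference between slopes of two consecutive windows then approximates the second derivative ("is the rise accelerating?"). A sketch under those assumptions (names are mine, not from the original post):

```python
def window_slope(samples, t_now, width):
    """Least-squares slope (first-derivative estimate) over the
    (timestamp, value) pairs falling inside [t_now - width, t_now].
    Returns None if the window holds fewer than two points, which
    naturally happens for sparse sensors."""
    pts = [(t, v) for t, v in samples if t_now - width <= t <= t_now]
    if len(pts) < 2:
        return None
    n = len(pts)
    mt = sum(t for t, _ in pts) / n   # mean timestamp
    mv = sum(v for _, v in pts) / n   # mean value
    num = sum((t - mt) * (v - mv) for t, v in pts)
    den = sum((t - mt) ** 2 for t, _ in pts)
    return num / den if den else None
```

Here `window_slope(samples, t_now, width) - window_slope(samples, t_now - width, width)` acts as a rough second-derivative signal; a sensor whose slope is both positive and growing is the "suddenly began to rise" case.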
What to read?
