Data access: History Service vs. Influx direct access

andylytical · June 27, 2019, 2:13am

In the post "Custom sampling rate for brewblox graph widget?" you mentioned the History Service and direct access to Influx data.

Can you explain:
What are the different databases for: autogen, downsample_1m, downsample_10m, downsample_1h, downsample_6h?

When / how / why does data move between them?

Are any of them ever emptied?

My ultimate goal is exporting data for safekeeping and external visualization.

What is the best way and/or database to export data for:

An event that gathers data for a few hours (e.g. mashing)?
An event that gathers data for days or weeks (e.g. fermentation)?

Are there any plans to create a front-end for data access or are the methods mentioned (cmdline, API, Grafana) intended for end-user consumption?
In other words, should I write quick & dirty, throw-away scripts for data access until something official is released OR should I spend more time on something I will be using for years to come?

Bob_Steers · June 27, 2019, 6:15am

The various databases (technically: retention policies in the same database) are used for server-side downsampling.

There are timed queries running on the database: one for each policy. Each one grabs new data from the previous policy, and inserts the mean of that data as a single point.

New points are inserted in autogen.
Every minute, downsample_1m selects points from autogen where time > now - 1m, and inserts the mean value as a single point in downsample_1m.

The same thing is done by downsample_10m, downsample_1h, and downsample_6h, except that downsample_10m selects from downsample_1m, downsample_1h from downsample_10m, and downsample_6h from downsample_1h.
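To make the cascade concrete, here is a minimal sketch in plain Python. It is not the actual database query: it groups points into fixed, aligned windows instead of running a timed "time > now - 1m" query, and the sample data is made up. It only illustrates how each policy is the mean of the one before it.

```python
from statistics import mean

def downsample(points, window_s):
    """Collapse (timestamp_s, value) points into one mean point per window."""
    buckets = {}
    for ts, value in points:
        # Align each point to the start of its window.
        start = ts - (ts % window_s)
        buckets.setdefault(start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

# Hypothetical raw data: one reading every 10 s for 20 minutes.
autogen = [(t, float(t)) for t in range(0, 1200, 10)]

# Each policy reads from the previous one, not from the raw data.
downsample_1m = downsample(autogen, 60)
downsample_10m = downsample(downsample_1m, 600)
```

Note that a point in downsample_10m is a mean of means, so short spikes in autogen get flattened further at every step.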

When a graph in the UI subscribes to data, we route the query to the most appropriate policy, to avoid casually emitting a few million points. You can check the selected downsampling policy in the graph settings menu. (Realtime = autogen).
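The routing idea can be sketched as follows. This is an illustration, not the History Service's real selection logic: the pick_policy function, the 200-point limit, and the 5 s autogen interval are all assumptions made up for the example.

```python
# Policy name -> approximate seconds between points. The autogen interval
# is an assumption; it depends on how often your devices publish.
POLICIES = [
    ("autogen", 5),
    ("downsample_1m", 60),
    ("downsample_10m", 600),
    ("downsample_1h", 3600),
    ("downsample_6h", 21600),
]

def pick_policy(duration_s, max_points=200):
    """Return the finest policy that yields at most max_points for the range."""
    for name, interval_s in POLICIES:
        if duration_s / interval_s <= max_points:
            return name
    # Nothing is coarse enough: fall back to the coarsest policy.
    return POLICIES[-1][0]
```

With these made-up numbers, a 10 minute graph would read from autogen, a 4 hour graph from downsample_10m, and a two week graph from downsample_6h.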

Autogen only contains points from the last 24h. The other policies keep their data forever.

If you want to export data I’d always use autogen or downsample_1m, depending on whether you need data older than 24h. Only use the other policies if you have performance issues.
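For a throw-away export script, that choice could look like the sketch below. It only builds an InfluxQL query string. The database name "brewblox" and the measurement name "spark-one" are assumptions: substitute whatever your own setup uses. Field names in the downsampled policies may also differ from those in autogen, so check before relying on them.

```python
from datetime import datetime, timedelta, timezone

# autogen only keeps the last 24h, as described above.
AUTOGEN_RETENTION = timedelta(hours=24)

def export_query(measurement, start, end, now=None, database="brewblox"):
    """Build an InfluxQL SELECT for the finest policy that still holds `start`."""
    now = now or datetime.now(timezone.utc)
    # Anything older than the autogen retention must come from downsample_1m.
    policy = "autogen" if now - start <= AUTOGEN_RETENTION else "downsample_1m"
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_s = start.astimezone(timezone.utc).strftime(fmt)
    end_s = end.astimezone(timezone.utc).strftime(fmt)
    return (
        f'SELECT * FROM "{database}"."{policy}"."{measurement}" '
        f"WHERE time >= '{start_s}' AND time <= '{end_s}'"
    )

# Hypothetical usage: a 4 hour mash and a 2 week fermentation.
now = datetime(2019, 6, 27, 12, 0, tzinfo=timezone.utc)
mash = export_query("spark-one", now - timedelta(hours=4), now, now=now)
ferment = export_query("spark-one", now - timedelta(days=14), now, now=now)
```

The resulting string can be handed to the influx command-line client or to the InfluxDB HTTP query endpoint. For a mash, run the export within 24h of the start, or the beginning of the session will only be available at 1 minute resolution.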

The Graph and Metric widgets are our front-end for end user data consumption. Grafana can’t be integrated in our UI, and anyone seriously looking for advanced command-line tools may as well access Influx directly.

For now I’d err on the side of quick and dirty scripts. We’re still in beta: breaking changes do happen.

I hope that answers your questions. If you have more, feel free to post them here, or PM me for a Slack invite.