Brewblox release 2020/06/23

It looks like the equally new USB drive on my Pi just died as well. As a development device, that one saw a lot more disk I/O than the typical install, but six months is still way too soon.

We already included some tweaks to prevent unnecessary disk writes, but we’ll do some more digging to identify what is doing the most disk I/O on a running system.

Could you please run dmesg | nc termbin.com 9999? That will show more info on device-level errors.

https://termbin.com/mg21

By some miracle, my system is working again after almost 5 days of downtime… I tried to run a fsck on reboot (without repair) and after one reboot Brewblox showed the graph again… No idea why and tbh I’ve lost all confidence at the moment. Let’s see

Understandable.

dmesg output doesn’t show any IO read errors, but there’s a lot of noise about virtual network interfaces (used by docker containers).
Issues with docker network management can explain all symptoms (unreachable services, unpredictable crashes). There’s no smoking gun, but I’ll look into that.

The absence of dmesg IO errors does not fully rule out card issues, but makes them much less likely to be the (sole) problem.
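For anyone following along, here is a rough sketch of how the two kinds of dmesg messages mentioned above could be counted separately. The function name `summarize_dmesg` and the grep patterns are assumptions for illustration; they are not part of brewblox-ctl, and the patterns are not an exhaustive list of kernel messages.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads kernel log lines from stdin and prints two counts.
# The patterns below are assumptions; adjust them to what your kernel logs.
summarize_dmesg() {
    local log
    log="$(cat)"
    # Block-device problems usually contain the phrase "I/O error".
    printf 'io_errors=%s\n' "$(printf '%s\n' "$log" | grep -ciE 'i/o error' || true)"
    # Docker's virtual network interfaces show up as veth* devices.
    printf 'veth_events=%s\n' "$(printf '%s\n' "$log" | grep -ci 'veth' || true)"
}
```

Typical use would be `dmesg | summarize_dmesg`. A zero for io_errors next to a large veth_events count matches the situation described here, but it is a hint rather than proof either way.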

I can try moving the brewblox directory to a USB thumbdrive and have it run for a couple of days?

That could eliminate disk issues, yes.

To check: have you already run brewblox-ctl disable-ipv6? IPv6 + Docker is the source of some related issues, where restarting a container interrupts the network for all other containers.

Yes, ipv6 is disabled (and I also disabled swap)

@Bob_Steers unfortunately another ‘crash’ or at least a “No data for selected period”

Logs: https://termbin.com/ag0d
Dmesg: https://termbin.com/5zg6

I moved the Brewblox directory to a USB thumb drive, so it is not running from the main SD card

Edit: Could this error be related?

The active endpoints could very well be related to the veth messages.

dmesg mentions multiple out-of-memory process kills on influxdb. If you restart the system, and run htop, what is the stable memory load?
We typically see a total memory usage of ~900 MB on a Pi, with docker stats indicating InfluxDB normally using ~140 MB.
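As a sketch of how to see which processes the kernel killed for running out of memory, the helper below tallies the victims named in dmesg. The function name `oom_victims` is made up for this example, and the pattern assumes the kernel's usual "Out of memory: Kill process" / "Killed process" wording, which may differ between kernel versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads kernel log lines from stdin and prints
# "<process name>=<number of OOM kills>" for each killed process.
# Assumes lines of the form:
#   Out of memory: Kill process 812 (influxd) ...
#   Out of memory: Killed process 900 (influxd) ...
oom_victims() {
    sed -nE 's/.*Out of memory: Kill(ed)? process [0-9]+ \(([^)]+)\).*/\2/p' \
        | sort \
        | uniq -c \
        | awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print $0 "=" n }'
}
```

Run it as `dmesg | oom_victims`, then compare against current usage with `free -m` and `docker stats --no-stream`.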

@Bob_Steers after 15 mins: Looks like InfluxDB is using a lot more memory than expected?


Yes, that’s very much not supposed to happen.
To get a working system, I recommend backing up / resetting the influx dir.

brewblox-ctl down
zip -r influxdb.zip influxdb

# Check whether zip was created ok
# Stop here if this doesn't show a list of files
unzip -l influxdb.zip

sudo rm -rf influxdb/
mkdir influxdb/
# Update always reconfigures InfluxDB settings
brewblox-ctl update --prune --migrate --no-pull --no-update-ctl
brewblox-ctl up

If you want me to inspect / fix the existing influx data, you can send or upload the zip (depending on size).
Either way, if this theory is correct, the new system should have a normal memory footprint, and avoid weird crashes. Let’s see whether the 6th time is the charm.

Should I empty the couchDb as well to start with a clean slate?

I emptied the influx dir (was 195 MB) and now memory usage is down to 22 MB. No need to fix the history, but I am happy to send it to you if you’d like to figure out what went wrong. I have been using Brewblox for quite a few versions, so maybe something went wrong in one update or another.

Please do send. I have no clue why it decided to triple its footprint, and we’d rather get in front of it if it can affect other users.
