I’ve noticed a few times now that when I’m running the brewblox update script, the Raspberry Pi becomes unresponsive after it stops all of the services and before it removes them. Not sure if this is a Pi issue or something else going on. After resetting power, the update runs as expected. Any suggestions on what to check? I pulled the log file for review if it helps: https://termbin.com/y4fx I also see some console error messages when pulling the log:
INFO Writing Spark blocks…
HTTPSConnectionPool(host='localhost', port=443): Max retries exceeded with url: /spark-one/blocks/all/read (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x75cff2d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
HTTPSConnectionPool(host='localhost', port=443): Max retries exceeded with url: /spark-sim/blocks/all/read (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x75c812b0>: Failed to establish a new connection: [Errno 111] Connection refused'))
INFO Writing dmesg output…
Was poking through the message log, and it looks like the Pi fell off the network when the network interfaces were being removed. I found I still had IPv6 enabled, so I disabled it with brewblox-ctl disable-ipv6. Will check back and run an update tomorrow night to see if it behaves better now.
IPv6 does sound possible, but the symptoms are not something we’ve seen before.
The known bug is that (re)starting containers with IPv6 enabled triggers network resets for all other containers.
If that htop screenshot is from when it went unresponsive, then network issues do seem more likely: CPU is practically idling, and RAM is only half used.
To confirm, you could ping the pi while it’s busy. Pings are handled in the kernel, and take very few resources. If the pi suddenly stops responding to that, then it lost either power or network.
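If you'd rather have timestamps than watch a terminal, the ping check could be scripted along these lines. This is only a rough sketch, not part of brewblox-ctl: the function names (`ping_once`, `find_outages`, `monitor`) are made up for illustration, and it assumes the Linux `ping` command, where `-c` is the packet count and `-W` is the reply timeout in seconds.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send a single ping using the system `ping` command (Linux flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_outages(samples):
    """Return (start, end) pairs for each run of failed pings.

    `samples` is a list of (timestamp, ok) tuples in time order.
    An outage that is still ongoing at the end of the list has end=None.
    """
    outages = []
    start = None
    for stamp, ok in samples:
        if not ok and start is None:
            start = stamp  # first missed reply of a new outage
        elif ok and start is not None:
            outages.append((start, stamp))  # host answered again
            start = None
    if start is not None:
        outages.append((start, None))
    return outages


def monitor(host: str, count: int, interval_s: float = 1.0, probe=ping_once):
    """Ping `host` `count` times, print each result, and return the outages."""
    samples = []
    for _ in range(count):
        ok = probe(host)
        stamp = datetime.now()
        print(f"{stamp:%H:%M:%S} {host} {'ok' if ok else 'NO REPLY'}", flush=True)
        samples.append((stamp, ok))
        time.sleep(interval_s)
    return find_outages(samples)
```

Run something like `monitor("raspberrypi.local", 600)` from another machine while the update is in progress (substitute your Pi's hostname or IP address). If the replies stop at the same moment the update output stalls, that points at power or network rather than load.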
In my experience it is not just the containers that are affected. The whole network stack on the host is reset. On my dev system, a docker restart in the background can cause page load issues in the browser on unrelated websites.
Not sure if this was the same issue, but the symptoms were somewhat similar. While running an update (after flashing buster and installing from a snapshot), it hung at the following step:
INFO Starting services...
Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Creating network "brewblox_default" with the default driver
Creating brewblox_redis_1 ...
Creating brewblox_history_1 ...
Creating brewblox_traefik_1 ...
Creating brewblox_influx_1 ...
Creating brewblox_spark-sim_1 ...
Creating brewblox_ui_1 ...
Creating brewblox_spark-one_1 ...
Creating brewblox_eventbus_1 ...
I tried connecting a second ssh session to see what was up but my ssh client couldn’t find the ssh host (ssh: Could not resolve hostname raspberrypi.local: Name or service not known).
After seeing this thread, I power cycled the pi, reconnected, disabled IPv6, and ran brewblox-ctl update successfully.
I probably should have tried updating without disabling IPv6 first to narrow down the cause, but I was in a bit of a hurry…
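For anyone who wants to check whether IPv6 is still enabled on the host before running an update, here is a minimal sketch. It assumes the setting shows up in the kernel's standard sysctl entries under /proc/sys/net/ipv6/conf; I haven't verified that this is what brewblox-ctl disable-ipv6 changes, and the `ipv6_disabled` helper is just an illustrative name.

```python
from pathlib import Path
from typing import Optional

# Standard Linux location for per-interface IPv6 sysctl settings.
SYSCTL_ROOT = Path("/proc/sys/net/ipv6/conf")


def ipv6_disabled(interface: str = "all", root: Path = SYSCTL_ROOT) -> Optional[bool]:
    """Report whether the kernel has IPv6 disabled for `interface`.

    Returns True for "1", False for any other value, and None if the
    entry does not exist (so the state cannot be determined this way).
    """
    flag = root / interface / "disable_ipv6"
    try:
        return flag.read_text().strip() == "1"
    except FileNotFoundError:
        return None
```

Calling `ipv6_disabled()` on the Pi returns `True` once IPv6 is off for all interfaces. A `False` result after running the disable command might mean a reboot is still needed, though that is a guess on my part.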