Spark intermittent connection issues after update

Doug_Little · March 9, 2020, 1:18am

Hi,

I have been using brewblox for a while now on a really early version that was stable as long as my network remained stable (my spark needs to connect through a repeater, as the signal where the chamber sits is poor). I can get a connection to the spark, but it’s not the best: pings are typically below 10ms but can sometimes exceed 100ms. This all worked fine on the old version. With the latest version I continually get timeout errors.

2020/03/09 01:13:35 ERROR brewblox_service.repeater error during runtime: CommandTimeout(ListObjectsCommand)

Is there a configuration setting somewhere that I can set to increase the time before an error is thrown?

Logs are here: https://termbin.com/z6hu

Cheers, Doug.

Bob_Steers · March 9, 2020, 7:21am

We indeed decreased the timeout (to 10s). If commands are routinely taking longer than that, we may want to first check what causes that.

Could you please export your blocks? (Spark service page, top right corner)

Doug_Little · March 9, 2020, 8:26am

Here you go. I’ve been using the follow option and there seems to be a lot of resetting, etc. going on. Hopefully it’s just a bad block or something.

brewblox-blocks-spark-one.json (8.2 KB)

Cheers, Doug

Bob_Steers · March 9, 2020, 8:46am

You indeed still have data for deprecated blocks on your controller.
Try importing the file you just sent (same menu). This will recreate all non-deprecated blocks on your controller.

Edit: the import will show warnings about being unable to import those old blocks. That’s fine and expected.

Doug_Little · March 10, 2020, 4:08am

Yeah, tried that. It seemed to make it slightly more stable, but I’m still getting timeouts and lots of resets. I cleared everything out and started from scratch, and even with the bare minimum of blocks I was still getting the same timeouts. I even completely removed my dashboard so it was only the blocks, and still the same thing. What else could it be? Is it possible to just increase the timeout interval so I can see if that gets me through?

Cheers, Doug.

Doug_Little · March 10, 2020, 4:57am

Update:

OK, so I logged on with another terminal and ran ping on the pi (pinging the spark) concurrently with “brewblox-ctl follow”, and noticed that each timeout coincided with packet loss in the ping to the spark. So it’s either something dragging the spark down or the network dropping out. I can check that once I pull the current beer out of the chamber, by testing the spark on a better network. Worst case, I need to increase the timeout. I’m not really concerned about refresh rate, I just want something stable.

Cheers, Doug.

Bob_Steers · March 10, 2020, 8:14am

I made a feature release you can use to tweak timeout values.
Edit your docker-compose.yml with either nano or brewblox-ctl service editor. For your spark-one service, change the “image” line from

image: brewblox/brewblox-devcon-spark:rpi-${BREWBLOX_RELEASE}

to

image: brewblox/brewblox-devcon-spark:rpi-broadcast-timeout

You now have three options you can tweak by adding arguments to the command line in docker-compose.yml (leave the existing arguments in place):

--broadcast-interval=<SECONDS>: Interval between the service querying the controller for blocks. Default value: 5.
--command-timeout=<SECONDS>: Timeout period for individual commands, including those sent by the broadcaster. Default value: 10.
--broadcast-timeout=<SECONDS>: Timeout period for the broadcaster. If it hasn’t successfully broadcast data in this period, it will restart the service. Default value: 60.

After editing the compose file, run brewblox-ctl up to apply the changes.
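
For illustration, the edited spark-one entry might look roughly like the sketch below. The top-level services key and the multi-line command layout are assumptions about how your compose file is structured, and <existing arguments> is a placeholder for whatever your command line already contains. The three values shown are simply the defaults listed above; raise them as needed.

```yaml
services:
  spark-one:
    image: brewblox/brewblox-devcon-spark:rpi-broadcast-timeout
    # "<existing arguments>" is a placeholder: keep whatever is already on
    # your command line, and append the new flags after it.
    # The folded scalar (>-) joins these lines into one space-separated string.
    command: >-
      <existing arguments>
      --broadcast-interval=5
      --command-timeout=10
      --broadcast-timeout=60
```

Other keys already present under spark-one are left untouched.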

Doug_Little · March 11, 2020, 6:57am

Thanks for the quick update. Boy did I have to blow those times out to get it stable. The update interval is set to 30 seconds, command timeout set to 60 and broadcast timeout set to 90 and it isn’t throwing any timeouts or rebooting. Now I have to figure out why. Still a nice addition for anybody who does have a flaky network… I think. I’ll let you know what I find out once I can play around with the spark directly.

Cheers, Doug.

Bob_Steers · March 11, 2020, 7:40am

Happy to hear it’s more stable. We’ll add the timeout arguments to next release, in case they’re relevant for others.

Meanwhile, we’re digging through some semi-related issues.
We confirmed this issue:
https://github.com/docker/for-linux/issues/914

Note that that issue triggers when a service restarts, so it’s not the cause of your problem, merely a knock-on result.

You can disable IPv6 by following these instructions:
https://support.purevpn.com/how-to-disable-ipv6-linuxubuntu