New V3 keeps rebooting

A bit more investigation: I have moved the Spark next to me so I can see what happens when it roams between APs.

The hard fault does not happen during AP switch, but the Web interface does lose connection.

Jul 28 2017 20:42:51 Connecting to controller...
Jul 28 2017 20:42:51 Opening serial port
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
Jul 28 2017 20:43:01 Errors while opening serial port: 

This is happening continuously even though I can ping the Spark.

If I manually reboot the Spark it all comes back and is working again.

“The hard fault does not happen during AP switch, but the Web interface does lose connection.”

Do you mean the script? Does it hard fault soon after?

Yes, sorry, the script. And no, the hard fault does not happen soon after. I have just left it after an AP switch, and 15 minutes later it still will not recover unless I power cycle the Spark.
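Since the Spark still answers pings while the script cannot open its port, it may help to test the TCP port directly rather than rely on ping alone. Below is a minimal sketch using only the Python standard library. The function name `port_is_open` and the 5 second default timeout are my own choices for illustration, not part of the BrewPi script. The address and port are the ones shown in the logs in this thread.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it is not part of the BrewPi script.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out: the port is not usable right now.
        return False
    conn.close()
    return True


# Example call, using the address from the logs in this thread:
#   port_is_open("192.168.0.240", 6666)
```

If ping succeeds but this returns False after an AP switch, that would suggest the controller is on the network but its TCP server on port 6666 is no longer accepting connections.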

I think we might have a couple of issues that seem to be related but might not be.

This is the relevant code on the Spark:

0.7.0-rc2 has some interesting bug fixes that could be related:

@elco would you like me to install the pre-release firmware?

That’s only the system firmware from Particle. I don’t think our user firmware is compatible, because it’s compiled against 0.5.3.

I think it would be a good test, but I think I’ll need to recompile the firmware for you. I expect breaking changes in the particle system firmware update, so I expect this to be non-trivial.

I’m deep in financial figures for the quarterly tax report, so I won’t have time for that this weekend.

OK, I won’t bother. It does not seem to be V3 specific: during this testing I didn’t look at my V2 Spark, which is also set up for Wi-Fi, and that too had lost its connection until I rebooted it.

I assume it was moving between the APs during this testing as well.

I’ll temporarily configure a separate SSID that is only broadcast on one AP until you find a solution/fix.

If you need me to do any more testing let me know.

Rich

Great, thanks for the tests! You gave me a lot of info and I should be able to reproduce the scenario by setting up an extra router myself.

Still getting the hard fault. This is with no devices plugged in, as I had left it on in the living room, still in Beer Constant mode.

There has been no roaming between APs since 20:34 yesterday.

Terminating due to fatal serial error
Jul 29 2017 05:08:05 Notification: Script started for beer 'Cellar-01'
Jul 29 2017 05:08:05 Connecting to controller...
Jul 29 2017 05:08:05 Opening serial port
DEBUG:pySerial.socket:enabled logging
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored _update_dtr_state(True)
INFO:pySerial.socket:ignored _update_rts_state(True)
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored reset_output_buffer
Jul 29 2017 05:08:05 Checking software version on controller... 
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored port configuration change
Jul 29 2017 05:08:05 Found BrewPi v0.5.2 build 0.5.2-0-g72e633171, running on a Particle p1 with a V3 shield on port socket://192.168.0.240:6666?logging=debug

INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored port configuration change
Jul 29 2017 05:31:04 Serial Error: read failed: [Errno 104] Connection reset by peer)
DEBUG:pySerial.socket:enabled logging
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored _update_dtr_state(True)
INFO:pySerial.socket:ignored _update_rts_state(True)
INFO:pySerial.socket:ignored reset_output_buffer
DEBUG:pySerial.socket:enabled logging
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored _update_dtr_state(True)
INFO:pySerial.socket:ignored _update_rts_state(True)
INFO:pySerial.socket:ignored reset_output_buffer
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/home/brewpi/backgroundserial.py", line 110, in __listen_thread
if self.writeln("") > 0:
File "/home/brewpi/backgroundserial.py", line 56, in writeln
return self.write(data + "\n")
File "/home/brewpi/backgroundserial.py", line 59, in write
self.exit_on_fatal_error()
File "/home/brewpi/backgroundserial.py", line 75, in exit_on_fatal_error
self.stop()
File "/home/brewpi/backgroundserial.py", line 38, in stop
self.thread.join() # wait for background thread to terminate
File "/usr/lib/python2.7/threading.py", line 931, in join
raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

Jul 29 2017 05:31:05 Lost serial connection. Cannot write to serial
Terminating due to fatal serial error
Jul 29 2017 05:31:35 Notification: Script started for beer 'Cellar-01'
Jul 29 2017 05:31:35 Connecting to controller...
Jul 29 2017 05:31:35 Opening serial port
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored _update_dtr_state(True)
INFO:pySerial.socket:ignored _update_rts_state(True)
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored reset_output_buffer
Jul 29 2017 05:31:36 Checking software version on controller... 
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored port configuration change
Jul 29 2017 05:31:37 Found BrewPi v0.5.2 build 0.5.2-0-g72e633171, running on a Particle p1 with a V3 shield on port socket://192.168.0.240:6666?logging=debug

INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored port configuration change
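A side note on the traceback in the log above: it shows `stop()` being reached from inside the listener thread itself (`__listen_thread` → `writeln` → `write` → `exit_on_fatal_error` → `stop`), and Python does not allow a thread to join itself, hence the `RuntimeError`. This looks like a separate issue from the connection loss that triggers it. Here is a minimal sketch of one way to guard against it. The class `BackgroundWorker` and its methods are hypothetical stand-ins, not the actual `backgroundserial.py` code.

```python
import threading


class BackgroundWorker(object):
    """Hypothetical stand-in for a background serial reader (not BrewPi's class)."""

    def __init__(self):
        self.run_event = threading.Event()
        self.error = None
        self.thread = None

    def start(self):
        self.run_event.set()
        self.thread = threading.Thread(target=self._listen)
        self.thread.daemon = True
        self.thread.start()

    def stop(self):
        # Ask the loop to finish first.
        self.run_event.clear()
        # Only join when called from a different thread; joining the current
        # thread raises RuntimeError("cannot join current thread").
        if self.thread is not None and threading.current_thread() is not self.thread:
            self.thread.join()

    def _listen(self):
        while self.run_event.is_set():
            try:
                self._poll()
            except IOError as exc:
                self.error = exc
                self.stop()  # called from the worker thread itself

    def _poll(self):
        # Simulates the "Connection reset by peer" failure from the log.
        raise IOError("simulated lost connection")
```

With the guard in place, the worker records the error and exits its loop cleanly, and the main thread can still call `stop()` afterwards without an exception.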

@elco, further update.

This thing just rebooted even while in off mode. What does that tell us?

As you can see, the first line is “Temperature control disabled”, then an hour later there are two reboots in quick succession.

Jul 29 2017 06:27:00 Notification: Temperature control disabled
Jul 29 2017 06:27:00 Controller debug message: INFO MESSAGE 12: Received new setting: mode = o
Jul 29 2017 07:30:25 Error: controller is not responding anymore. Exiting script.
Jul 29 2017 07:30:55 Notification: Script started for beer 'None'
Jul 29 2017 07:30:55 Connecting to controller...
Jul 29 2017 07:30:55 Opening serial port
DEBUG:pySerial.socket:enabled logging
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored _update_dtr_state(True)
INFO:pySerial.socket:ignored _update_rts_state(True)
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored reset_output_buffer
Jul 29 2017 07:30:55 Checking software version on controller... 
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored port configuration change
Jul 29 2017 07:30:55 Found BrewPi v0.5.2 build 0.5.2-0-g72e633171, running on a Particle p1 with a V3 shield on port socket://192.168.0.240:6666?logging=debug

INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored port configuration change
Jul 29 2017 07:31:14 Serial Error: socket disconnected)
DEBUG:pySerial.socket:enabled logging
DEBUG:pySerial.socket:enabled logging
Jul 29 2017 07:31:24 Lost serial connection. Error: Could not open port socket://192.168.0.240:6666?logging=debug: timed out)
Terminating due to fatal serial error
Jul 29 2017 07:31:54 Notification: Script started for beer 'None'
Jul 29 2017 07:31:54 Connecting to controller...
Jul 29 2017 07:31:54 Opening serial port
DEBUG:pySerial.socket:enabled logging
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored _update_dtr_state(True)
INFO:pySerial.socket:ignored _update_rts_state(True)
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored reset_output_buffer
Jul 29 2017 07:31:54 Checking software version on controller... 
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored port configuration change
Jul 29 2017 07:31:55 Found BrewPi v0.5.2 build 0.5.2-0-g72e633171, running on a Particle p1 with a V3 shield on port socket://192.168.0.240:6666?logging=debug

INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored port configuration change

That rules out a big part of the code. So it makes it more likely that it’s WiFi related. To rule out more, can you test with no devices connected and installed and in off mode?

You can do a factory reset by clicking the button at the bottom of advanced settings to go to that state.

OK, factory reset done and set to off. Let’s see how it gets on.

The whole system was reset yesterday at around 11:00 and has been in off mode all night with no issues, then there was a reboot at 07:06.

INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored port configuration change
Jul 29 2017 09:30:14 Notification: Beer temperature set to 12.0 degrees in web interface
Jul 29 2017 09:30:14 Controller debug message: INFO MESSAGE 12: Received new setting: mode = b
Jul 29 2017 09:30:14 Controller debug message: INFO MESSAGE 12: Received new setting: beerSet = 12.0
Jul 29 2017 09:30:40 Notification: Restarted logging for beer 'Cellar-03'.
Jul 29 2017 11:13:42 Resetting controller to factory defaults
Jul 29 2017 11:13:43 Controller debug message: INFO MESSAGE 15: EEPROM initialized
Jul 29 2017 11:13:48 Resetting controller to factory defaults
Jul 29 2017 11:13:48 Controller debug message: INFO MESSAGE 15: EEPROM initialized
Jul 29 2017 11:13:54 Notification: Temperature control disabled
Jul 29 2017 11:13:55 Controller debug message: INFO MESSAGE 12: Received new setting: mode = o
Jul 30 2017 00:00:00 Notification: New day, creating new JSON file.
Jul 30 2017 07:06:15 Error: controller is not responding anymore. Exiting script.
Jul 30 2017 07:06:50 Notification: Script started for beer 'Cellar-03'
Jul 30 2017 07:06:50 Connecting to controller...
Jul 30 2017 07:06:50 Opening serial port
DEBUG:pySerial.socket:enabled logging
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored _update_dtr_state(True)
INFO:pySerial.socket:ignored _update_rts_state(True)
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored reset_output_buffer
Jul 30 2017 07:06:50 Checking software version on controller... 
INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored reset_output_buffer
INFO:pySerial.socket:ignored port configuration change
Jul 30 2017 07:06:50 Found BrewPi v0.5.2 build 0.5.2-0-g72e633171, running on a Particle p1 with a V3 shield on port socket://192.168.0.240:6666?logging=debug

INFO:pySerial.socket:ignored port configuration change
INFO:pySerial.socket:ignored port configuration change

I assume this is pointing more towards a Wi-Fi related issue than before.
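To compare restart times against AP roaming events, the script restarts can be pulled out of the log automatically. The sketch below assumes every log entry starts with a timestamp in the form seen in this thread (`Jul 29 2017 05:08:05`) and that a restart is marked by `Notification: Script started`. The function names are made up for illustration.

```python
from datetime import datetime

# Timestamp layout as it appears in the logs in this thread (assumed constant).
STAMP_FORMAT = "%b %d %Y %H:%M:%S"


def script_starts(log_lines):
    """Return the timestamps of all 'Script started' notifications."""
    starts = []
    for line in log_lines:
        if "Notification: Script started" not in line:
            continue  # skips pySerial debug lines and other messages
        stamp = " ".join(line.split()[:4])  # month, day, year, time
        starts.append(datetime.strptime(stamp, STAMP_FORMAT))
    return starts


def gaps_in_minutes(times):
    """Return the time between consecutive restarts, in minutes."""
    return [(b - a).total_seconds() / 60.0 for a, b in zip(times, times[1:])]
```

Feeding it the lines of the script's log file gives a list of restart times that can be put next to the access point's roaming log.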

I have found a way to broadcast an SSID from only one AP using Zones. I will set up a new SSID and reset the P1 so that it knows only this SSID and keep you updated.

Mine appears to be doing the same thing. The WiFi signal in the room I have my equipment in is weak, but stable. Here are the logs for the past 30 minutes:

Jul 30 2017 11:49:14 Fresh start! Log files erased.
Jul 30 2017 12:10:09 Error: controller is not responding anymore. Exiting script.
Jul 30 2017 12:10:41 Notification: Script started for beer 'Batch 51+52'
Jul 30 2017 12:10:41 Connecting to controller…
Jul 30 2017 12:10:41 Opening serial port
Jul 30 2017 12:10:41 Checking software version on controller…
Jul 30 2017 12:10:41 Found BrewPi v0.5.2 build 0.5.2-0-g72e633171, running on a Particle p1 with a V3 shield on port socket://192.168.86.35:6666

Jul 30 2017 12:13:36 Error: controller is not responding anymore. Exiting script.
Jul 30 2017 12:14:08 Notification: Script started for beer 'Batch 51+52'
Jul 30 2017 12:14:08 Connecting to controller…
Jul 30 2017 12:14:08 Opening serial port
Jul 30 2017 12:14:08 Checking software version on controller…
Jul 30 2017 12:14:08 Found BrewPi v0.5.2 build 0.5.2-0-g72e633171, running on a Particle p1 with a V3 shield on port socket://192.168.86.35:6666

@elco, it looks like it definitely happens when the P1 loses Wi-Fi or switches APs: since setting up a new SSID and broadcasting it on one AP only, the V3 has not rebooted.

@bpascucci, I would suggest that this is more than likely causing your issue too, as I believe you have already worked out.

Later I’m going to add the devices again and put it back in Beer constant mode.

I’m not 100% sure it is the Spark that is losing WiFi; it may be the Pi. I’m monitoring pings to both devices, and the Pi is slower to respond and is losing some packets here and there (1% or so). I’m wondering if the communication between the devices is the issue and a higher socket timeout is needed to address that. I may move my Pi out of the room with the Spark, since they don’t have to be physically connected, and see if that helps, but I won’t make any changes until my fermentation is done. Don’t want to mess up my Oktoberfest.

I have compiled a new firmware version with the latest particle framework (0.7.0-rc.2).

Because of the system firmware update (0.5.3 -> 0.7.0) I need to make some changes to the update script to make sure you can update smoothly. I’ll try to do that as soon as possible. I am currently testing this version on my fridge.

There are some wifi related fixes in 0.7.0 that might resolve this. I have not been able to recreate the issue myself yet, so @rbpalmer if you want to be my guinea pig, that would be great. Just note that in case you want to downgrade the system firmware, a special order of commands is required.

@elco, I really don’t mind being the guinea pig here. As you cannot recreate it and I can, it will make for a more stable platform across many different setups.

I have put a version online based on 0.7.0-rc2.

To flash it, first update the script. Then flash the new system binaries and the brewpi binary with:

sudo python flashDfu.py --tag=0.5.3

I have been running this on my fridge without problems, but the previous version didn’t cause any issues here either.

You need to update with DFU; I have not been able to fix the update process for flashing the system firmware without DFU.

I’m running in Docker. How do I do the update? I tried the command above, but got ‘/bin/sh: 1: downloads/dfu-util: not found’.

I did not start the container with the USB command (mainly because the command in the wiki set up the container, but it wouldn’t start).