Spark 3 hangs every few hours

See Particle's "Status LED and Device Modes - Photon" tutorial for how to wipe the current credentials, then connect over USB and use the UI or brewblox-ctl wifi to connect to the new network.

You can force USB by adding --discovery=usb to the command in docker-compose.yml.

Set up a dedicated AP just for the Spark. USB cable removed. Spark power cycled.

I just want to mention that I have a Ubiquiti Wifi setup as well. It might be a coincidence. I was expecting the Spark to act independently and to be resilient to losing the Wifi connection to the Brewblox controller. Did I get something wrong?

It’s designed to be independent, yes. These hangups are not expected or desired behavior.

It lasted about 5 hours. Hung, stuck on heat. Back to the USB cable.

I might have a reliable way of causing these hangups. Is there a serial output or something that I can hook into to get some additional information?

Not for the Spark 2/3, but you could reset it immediately after triggering the hangup. It will then show a short log of what it was doing in the spark service logs when it reconnects.

(We’re also very interested in how you manage to reproduce it)

When you say reset it, I assume it is something other than power cycling? I don’t see a reset button on the Spark 3 as far as I can tell.

Also, it is definitely Wifi related. I moved my routine update to 6am Saturday, and it reliably hangs.

Last night my AP disconnected for some reason, and the Spark hung again.

I am working around this by using a smart switch, Home Assistant and Node-RED to power cycle the Spark if it is offline for more than 10 minutes.
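
For anyone who wants to script a similar workaround, here is a minimal sketch of just the timeout logic, not my actual Node-RED flow. The `Watchdog` class and its names are hypothetical. How you detect that the Spark is online, and how you toggle the smart switch, depends on your own setup and is left out.

```python
from dataclasses import dataclass
from typing import Optional

OFFLINE_LIMIT_S = 10 * 60  # 10 minutes, matching the workaround described above


@dataclass
class Watchdog:
    """Hypothetical helper: decides when a device has been offline too long."""

    offline_limit_s: float = OFFLINE_LIMIT_S
    offline_since: Optional[float] = None  # time of the first failed check

    def update(self, online: bool, now: float) -> bool:
        """Record one connectivity check; return True if a power cycle is due."""
        if online:
            # Device is reachable again: clear the offline timer.
            self.offline_since = None
            return False
        if self.offline_since is None:
            # First failed check: start counting from here.
            self.offline_since = now
        if now - self.offline_since > self.offline_limit_s:
            # Restart the timer so the device gets time to boot
            # before another power cycle is requested.
            self.offline_since = now
            return True
        return False
```

Call `update()` once per connectivity check, passing the check result and a timestamp in seconds such as `time.monotonic()`, and power cycle the switch whenever it returns True.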

I do mean power cycling, but there’s also a reset button. You need a toothpick or safety pin to access it. The label on the back indicates which of the two is the reset button.

Power cycling by unplugging it works just as well - the important thing is that it logs whatever it was doing when it rebooted.

The trace is stored in memory that is not cleared on reset, but it is cleared by power cycling.

Have found a way to trigger a hang on demand.

By changing the Wifi channel.

My router's default option is to auto-allocate a channel based on some unstated rule.
By switching to manual and selecting a new channel, the Spark hangs. Was able to reproduce 3 hangs in 10 mins.
Have switched the router to manual to prevent future changes; will see how it lasts overnight.

Does this still happen with today’s firmware release?

Updated last night, no problems so far, looking good.

Forced wifi to change channel and the Spark hung.

If you switched to the earlier experimental release, did you switch back to edge (or ${BREWBLOX_RELEASE})?

Am using the following.

Then we won’t feel silly later at least.

For now I’d recommend keeping a fixed channel, and we’ll continue digging through the Wifi code.
Alternatively, if you have a spare Pi of any kind, I can write a quick guide for setting it up as a Wifi-to-USB repeater. This lets you bypass the Wifi stack on the Spark itself.

The Wifi AP was and still is set to fixed channel.

I have found something. It seems that Particle made a change to the hard fault handler, without ever testing that it still worked. Simply unacceptable.

So this is caused by 2 Particle bugs:

  • An old bug in the wifi stack causes a hard fault (unrecoverable error) when the WiFi is lost. The code that contains the bug is closed-source, so I cannot help them fix it.
  • A recently introduced bug in the hard fault handler causes the device to completely lock up instead of rebooting.

I have created a PR with a fix at Particle:

And reported the WiFi bug again:

The hard fault handler bug is in the system layer, which the Spark 2 and 3 download directly from Particle.
We can build our own system binaries to flash over USB, but not for over-the-air updates. Therefore, I’ll try to push Particle to do a hotfix release tomorrow.
