Many ESP8266 enthusiasts still struggle with unwanted resets and non-recoverable system crashes. That comes from my recent review of the most often expressed issues noted on-line. After investing significant time with this module, many have simply given up and moved on with other options.
That is too bad…
And while I have written several articles on the subject, I have to admit that my long-term ESP8266 reliability test also failed months ago. But the test had not been revisited. That is. not until now. The problem was that the original tests were using an older, less reliable sketch structure.
Now, with some much-needed improvements, my basic setup now provides a reliable and stable platform for IoT projects. That is the point of this writing. The following information should prove useful for anyone wanting to use the ESP8266 on a 24/7 basis with minimal or no down-time…
Here’s a reference to my past writings on this topic:
Past Monitoring the Esp8266 Performance
My original reliability test used a ThingSpeak channel to monitor the ESP8266 performance. This same channel will now be reused to evaluate the updated sketch structure. The longest run-time recorded during the original test was 28 days. Then, the ESP8266 crashed fatally and could not recover. Yet every 24 hour period in the 28 day span recorded at least 10 ESP8266 resets. Fortunately, the sketch was structured to recover from a reset. That is, until the fatal crash occurred during day 28.
This Time Should Be Better
The original sketch used polling to check for http server requests. This was executed each loop() cycle. I used this structure based on project examples found on-line. But the design is flawed. A much more reliable approach is to setup a separate event-driven callback function outside the sketch loop() function to respond to http requests. This post presents more detailed description of this http server sketch structure using the ESP8266. Operating in a separate thread, the responsiveness of the callback is not dependent upon the time required to execute the sequential steps in the loop() function.
I also discovered that my USB to serial device was causing frequent ESP8266 resets. It also was failing frequently during sketch uploads.
This was replaced, and then removed once the final sketch was installed. The operating unit now only uses the 5V and Gnd wires from a USB cable. The 5V is fed through a voltage
regulator to provide the 3.3v for ESP operation.
Monitoring the ESP8266
In this case, the ESP8266 functions as an http server. In addition, it periodically reads any attached sensors. These sensor values are returned as an http reply to a request. The current ESP8266 system time (seconds since the last reset) and the number of WIFI disconnect/reconnections are also returned; all within a JSON string.
So the ESP8266 simply waits for requests and monitors sensors.
I have set up a CRON script, written in php, to request and record the current values from the ESP8266. The script also records the values both to the ThingSpeak channel and a separate mySQL database. This is repeated once every hour, on the hour. A description of this process and the ThingSpeak channel is detailed in this post.
So far, the ESP8266 has been running continuously for almost 8 days without a single WIFI drop-out or ESP8266 reset. There is no reason to doubt that this system will run indefinitely. That is, until the power company delivers a disruption in service or my internet connection goes down.
I hope this experiment offers encouragement to anyone using the ESP8266 that has been frustrated with unreliable performance. Check back periodically to see how long this module performs with crashing with a reset. Here is a quick link to the ThingSpeak channel monitoring the unit.
The Santa Ana winds kicked up this afternoon and with rising temperatures, ACs were cranking everywhere here in Southern California. This triggered a one-minute power outage which also shut down the ESP8266 after a bit over 8 hours of continuous operation. A UPS on the ESP8266 per source is needed to prevent this from happening again.
Five days ago I discovered that the ESP8266 data feed into the ThingSpeak channel was no longer updating values. After some troubleshooting, the root cause was isolated to my 24 port Cisco Ethernet Switch. The problem was not with the switch, but rather one of my devices connected to it. I removed all the connections and re-introduced the essential devices, one-by-one with the system restored to full operational capability. One of the devices was a WIFI access point. This device is used in my home network to extend the range of my WIFI coverage. And that device is what the ESP8266 connects to for network access. Unfortunately, while troubleshooting, a disturbance to the ESP8266 power source occured, reseting the device after 17 days of continuous operation. I used this opportunity to install my new UPS unit to power the ESP and the network modem/router during power outages. This will hopefully eliminate this disruption source to my on-going ESP8266 up-time stress test.