-->
Page 1 of 2

Abusive WebSocket test getting an Exception

PostPosted: Thu Mar 03, 2022 7:15 am
by Inq720
I am running some overly abusive tests on my project. The current test in question is a server, serving four WebSockets. I am trying to push data out (small messages < 50 bytes each) as fast as possible. At any rate below say around 3 KB/sec each (12 KB/sec total) the sever and clients will merrily work far longer than I have patience to test... I use 100 hours as a cutoff for a successful test and of my patience.

I've bumped up the send rate to 1000 sends/socket/second. I don't know if its just coincidence, but the the byte rate per socket is hovering right around 2^15 = 32KB/sec. I've run this test a couple of times since it fails in relatively short order. The last one took only 47 minutes to fail. I'm well past trivial things like memory leaks as over 270 MB have been pushed out.

I started searching the Internet about the exception I get and ran across - Ets intr lock nest (#6484)
https://git.riper.fr/ext/esp8266/commit/d8531cb2c4aec39d917fed8d642afe86af59e1c6

I'm not sure what this is supposed to be telling me... Its a little more Advanced than I know.

Here is the Exception Decoder Output. The first line of code that I actually wrote is about 5 lines up from the bottom. What I noted was the article listed above mentioned some arbitrary count of 7 that sounds like they didn't expect this to ever reach it and yet I see three blocks of 7 calls to this function. So, I think I hitting this magic number. I just don't know what that is suppose to be telling me or how I go about fixing it.

Any help is greatly appreciated.

Code: Select allException 29: StoreProhibited: A store referenced a page mapped with an attribute that does not permit stores
PC: 0x4000e1c3
EXCVADDR: 0x00000018

Decoding stack results
0x40100a30: malloc(size_t) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\umm_malloc\umm_malloc.cpp line 821
0x40100c9e: calloc(size_t, size_t) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\umm_malloc\umm_malloc.cpp line 1181
0x40100c90: calloc(size_t, size_t) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\umm_malloc\umm_malloc.cpp line 1176
0x40100827: umm_free_core(umm_heap_context_t*, void*) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\umm_malloc\umm_malloc.cpp line 549
0x402053fa: loop_task(ETSEvent*) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 213
0x401009e0: umm_init() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\umm_malloc\umm_malloc.cpp line 476
0x401000ac: app_entry_redefinable() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 325
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x40100164: ets_intr_unlock() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 167
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4020dd9a: __d2b at /workdir/repo/newlib/newlib/libc/stdlib/mprec.c line 809
0x4020c454: _dtoa_r at /workdir/repo/newlib/newlib/libc/stdlib/dtoa.c line 515
0x4020dd46: __d2b at /workdir/repo/newlib/newlib/libc/stdlib/mprec.c line 779
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x402091fb: _printf_float at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf_float.c line 330
0x4020ca61: _dtoa_r at /workdir/repo/newlib/newlib/libc/stdlib/dtoa.c line 853
0x4020df54: __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 182
0x4020e018: __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 232
0x4020e018: __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 232
0x4020df54: __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 182
0x40209908: _printf_i at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf_i.c line 246
0x4020df54: __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 182
0x4020e018: __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 232
0x4020e469: _svfprintf_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 662
0x4020df54: __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 182
0x4020e469: _svfprintf_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c line 662
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x40206d4d: run_scheduled_functions() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266/core_esp8266_features.h line 66
0x40205579: __loop_end() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 188
0x401002a1: millis() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_wiring.cpp line 193
0x4022c210: InqPortal::send(char const*, ...) at A:\libraries\Inq\src\InqPortal.cpp line 373
0x4020126a: sendTemperature(void*) at A:\WSTest/WSTest.ino line 51
0x4010018c: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 181
0x401001ad: esp_schedule() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 128
0x402055c2: loop_wrapper() at C:\Users\Dennis\Documents\ArduinoIDE\arduino-1.8.16\portable\packages\esp8266\hardware\esp8266\3.0.2\cores\esp8266\core_esp8266_main.cpp line 205

Re: Abusive WebSocket test getting an Exception

PostPosted: Thu Mar 03, 2022 4:32 pm
by davydnorris
You wrote to memory that's not writeable - possibly you tried to get some memory, which failed and then you didn't check a null pointer return

Re: Abusive WebSocket test getting an Exception

PostPosted: Fri Mar 04, 2022 5:26 am
by Inq720
davydnorris wrote:You wrote to memory that's not writeable - possibly you tried to get some memory, which failed and then you didn't check a null pointer return


I am an Advanced user... just not advanced enough to understand the sited issue #6484.

If you are an Advanced User and sincerely want to help, you'll need to read beyond the first line of the my OP and of the exception. Although your response is technically correct... the author of the code that failed is not me. Unfortunately my code is 54 calls up the stack! I also checked all values in my function. That exact same method call (of mine) has been called over a billion times (literally) with the same values and processed over a 300 GB of data... at a lower rate in a successful 100 hour test. At this higher rate, it now fails... after only millions of calls. The quickest failure took 45 minutes. This last one, took nearly four hours and had processed almost 2GB before giving the same exception.

So, at the moment, I am asking for advice about how I, at 54 calls up the stack in my routine, can compensate for the failing of what I think is related to the #6484 issue - https://git.riper.fr/ext/esp8266/commit/d8531cb2c4aec39d917fed8d642afe86af59e1c6. It distinctly says they had only seen 2 nested calls, but they put in an arbitrary, hard-coded 7 to be "on the safe side". On this one exception of mine it shows 7 nested calls in three different places. Way too coincidental to not be at least part of the issue. I surmise that due to my edge-case abusive test, it actually saw seven "nestings" happening and returned an un-checked NULL pointer that finally failed as you describe.

This is what I know about my conditions:
  1. This is a WebSocket library of my writing. It does not use the Arduino Core's WiFi library web socket or AsyncWebSocket. I am as close to the metal as I can get using lwIP's tcp_write() method (at time of failure)
  2. The bug only occurs at high rates... 1000 sends/second. Doesn't fail at 100 sends/second.
  3. It is also random (45 minutes one time, 4 hours the next). I am assuming this might be due to external network traffic (or contention)
  4. Nominally, throughout the test, the Heap memory is readily available (as the history graph shows). At time of failure, yes, the heap is near fully used. I have not determined what causes the low memory spikes (shown in the Memory History). My current assumption is my message buffering routine is filling up the heap while there is some lower level TCP ACK/NAK going on because of network contention. I currently don't allow my buffering to bring the heap below 6KB. The failure does occur at one of these low spikes.
  5. As a side note, the #6484 issue talks a lot about interrupts. None of the 25K lines of code in my library has interrupts explicitly used. I can't speak for Arduino Core and Espressif code that mine might be calling.
Memory.png

Re: Abusive WebSocket test getting an Exception

PostPosted: Fri Mar 04, 2022 7:02 am
by Inq720
Correction on "technically"

An exception that is writing to a malloc() pointer that failed... aka a NULL pointer looks like:

Exception 0: Illegal instruction
PC: 0x402013e5
EXCVADDR: 0x00000000

My exception (in the OP) indicates some piece of code is either trying to write to or call a function in sector 0 - Which is an Espressif sector and is off-limits to Arduino Core and us.

Exception 29: StoreProhibited: A store referenced a page mapped with an attribute that does not permit stores
PC: 0x4000e1c3
EXCVADDR: 0x00000018