net module - memory access error found?
Posted: Tue Mar 31, 2015 12:22 pm
Hi,
A simple Lua TCP server application on an ESP-01 using build 20150318 crashed for me with nil accesses
when called in a loop.
To debug that I:
- checked out esp-opensdk/ 31ef9133b581788a9766245566da3154f84a9683
- checked out nodemcu-firmware/d8f2c2ba3455f7e132f46b41c97deb5df24c04ec (2015-03-29)
- enabled the debug module and the readline method
- module net: replaced lua_call with a wrapper around lua_pcall that prints the callback and the line number in net.c
- found out that the problem seems to be caused by net_server_disconnected calling a nil callback
- found out after some time that no callback had ever been installed, but:
Log output from c_printf at 115200 baud:
net_delete: cb clear: 3fff8520
...
net_server_disconnected: dis cb vanished: 3fff8520/ffffffff
The hex values are pointers to the nud structure.
c_printf("net_delete: cb clear: %08x\n", nud);
I interpret that as net_delete freeing the data structure and net_server_disconnected accessing the data structure
afterwards, finding garbage and interpreting it as a callback.
I haven't found or looked into the SDK docs on callbacks yet (does anyone have a hint where to look?) to see
what is documented about the order of SDK callbacks.
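To make the suspected ordering concrete, here is a minimal, self-contained sketch of it together with one possible guard. It is not the actual net.c code: all demo_* names, the reverse back-pointer field and the DEMO_NO_CALLBACK sentinel are hypothetical and only for illustration. The idea is that the delete path clears the back-pointer before freeing, so a disconnect callback that fires later sees NULL instead of freed memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sentinel meaning "no callback installed". */
#define DEMO_NO_CALLBACK (-1)

/* Stand-in for the per-connection userdata (the "nud" of the post). */
typedef struct {
    int cb_disconnect_ref;   /* callback reference, or DEMO_NO_CALLBACK */
} demo_userdata_t;

/* Stand-in for the SDK connection object holding a back-pointer. */
typedef struct {
    void *reverse;           /* points at demo_userdata_t, or NULL */
} demo_conn_t;

/* Allocate userdata and hook it to the connection. Returns 0 on success. */
static int demo_attach(demo_conn_t *c, int cb_ref)
{
    demo_userdata_t *ud = malloc(sizeof *ud);
    if (ud == NULL)
        return -1;
    ud->cb_disconnect_ref = cb_ref;
    c->reverse = ud;
    return 0;
}

/* Delete path: clear the back-pointer BEFORE freeing, so later
 * callbacks on the same connection cannot reach the freed block. */
static void demo_delete(demo_conn_t *c)
{
    demo_userdata_t *ud = c->reverse;
    if (ud == NULL)
        return;              /* already deleted */
    printf("demo_delete: cb clear: %p\n", (void *)ud);
    c->reverse = NULL;
    free(ud);
}

/* Disconnect path: returns 1 if a callback would be invoked, 0 if skipped. */
static int demo_on_disconnected(demo_conn_t *c)
{
    demo_userdata_t *ud = c->reverse;
    if (ud == NULL) {
        printf("demo_on_disconnected: userdata already gone, skipping\n");
        return 0;
    }
    if (ud->cb_disconnect_ref == DEMO_NO_CALLBACK)
        return 0;            /* nothing installed, nothing to call */
    return 1;
}
```

Whether the real SDK can deliver the disconnect callback after the delete is exactly the open question here, so this only shows how such an ordering could be made harmless.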
I found it very useful to see the line number in the C file that makes the bogus lua_call and to get stack traces (surely not integrated well). I saw that debug.debug() does not work, but it could be replaced with something that prints "not supported yet".
Other than that tracebacks worked well for me.
I didn't look at how much memory or performance it costs.
I assume adding an image to the builds that includes the debug module and provides tracebacks on PANIC would
improve bug reports a lot.
That holds even if all you find out is that something tried to execute a callback that was never installed.
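The call-site tagging idea can be sketched without any Lua dependency. This is not the wrapper I put into net.c; demo_call_checked, DEMO_CALL and the function-pointer callback type are hypothetical stand-ins. In the firmware the guarded call would be lua_pcall instead of a plain function pointer, but the __FILE__/__LINE__ trick is the same.

```c
#include <stdio.h>

/* Hypothetical callback type standing in for a Lua callback. */
typedef int (*demo_cb_t)(void *arg);

/* Line of the last rejected call, kept so it can be inspected later. */
static int demo_last_bad_line = 0;

/* Guarded call: refuses to call a missing callback and reports
 * which call site in the C file attempted it. */
static int demo_call_checked(demo_cb_t cb, void *arg, const char *name,
                             const char *file, int line)
{
    if (cb == NULL) {
        printf("%s:%d: callback '%s' is not set, skipping\n",
               file, line, name);
        demo_last_bad_line = line;
        return -1;
    }
    return cb(arg);
}

/* The macro records the call site automatically at each use. */
#define DEMO_CALL(cb, arg) \
    demo_call_checked((cb), (arg), #cb, __FILE__, __LINE__)

/* Trivial callback used for demonstration. */
static int demo_ok(void *arg)
{
    (void)arg;
    return 0;
}
```

With this, a call such as DEMO_CALL(on_disconnect, NULL) on an unset pointer prints the file and line instead of crashing, which is the information that helped me most.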
Keep on going,
Carsten