* * * Work in Progress * * *
I have started a thread on the ESP8266 forum, Discussions on my nodeMCU Lua unofficial FAQ. Please use this to discuss any issues that you have with this FAQ; any areas where you feel that the explanation is unclear or needs further expansion; or any Qs that you feel need answering and would help others if they were included here. Thank-you. Terry Ellison
This FAQ does not aim to help you to learn to program or even how to program in Lua. There are plenty of resources on the Internet for this, some of which are listed in Where to start. What this FAQ does is to answer some of the common questions that a competent Lua developer would ask in learning how to develop Lua applications for the ESP8266 based boards running the nodeMCU firmware.
The nodeMCU firmware implements Lua 5.1 over the Espressif SDK for its ESP8266 SoC and the IoT modules based on this.
nodeMCU Lua is an implementation of eLua over the ESP8266 SDK. eLua is a full-featured implementation of Lua 5.1 that is optimized for embedded system development and execution to provide a scripting framework that can be used to deliver useful applications within the limited RAM and Flash memory resources of embedded processors such as the ESP8266.
A key goal of eLua is to reduce the RAM requirements for the Lua runtime system. One of the key techniques used in this implementation is to use read-only tables and constants wherever practical for library modules. On a typical build this approach reduces the RAM footprint by some 20-25KB and this makes a Lua implementation for the ESP8266 feasible. This technique is called LTR and is documented in detail in an eLua technical paper: Lua Tiny RAM.
The Espressif SDK is the interface that is freely available, albeit in closed format, to developers building applications for the ESP8266. The nodeMCU eLua implementation must therefore use this SDK as its kernel layer and work within any design constraints that the SDK API imposes. In particular, the SDK employs an event and task-oriented structure, where individual events can trigger an associated task; this task then runs to completion uninterrupted, at which point the next event queued can be initiated. (Note that the SDK contains device drivers which are interrupt driven. However, these are internal to the SDK, so it treats all event triggered application tasks as atomic.)
The API calls for each type of event typically use a callback parameter to bind a C function implementing a given task to a given event. In the case of the nodeMCU Lua implementation, this task is a wrapper around a developer-provided Lua function. This event-driven model imposed by the SDK is very different to a conventional procedural implementation of Lua. Some standard Lua modules and eLua platform modules don't fit well within this structure, and so the nodeMCU implementation replaces these by ESP8266-specific versions. For example, the standard io and os libraries don't work, but have been largely replaced by the nodeMCU node and file libraries.
The debug and math libraries have also been omitted to reduce the runtime footprint.
* The core, coroutine, string and table libraries are implemented.
* File input/output is through the file library.
* There is no debug library support. So you have to use 1980s-style "binary-chop" to locate errors and use print statement diagnostics through the system's UART interface. (This omission was largely because of the Flash memory footprint of this library, but there is no reason in principle why we couldn't make this library available in the near future as a custom build option.)
* An attempt to define function table.pack() will cause a runtime error because you can't write to the global table library table, which LTR makes read-only. (Yes, there are standard sand-boxing techniques to achieve the same effect by using metatable based inheritance, but if you try to use this type of approach within a real application, then you will find that you run out of RAM before you implement anything useful.)
* On startup the firmware executes the init.lua script. It then "listens" to the serial port for input Lua chunks, and executes them once syntactically complete. There is no luac or batch support, although automated embedded processing is normally achieved by setting up the necessary event triggers in the init.lua script.
* The libraries (net, tmr, wifi, etc.) use the SDK callback mechanism to bind Lua processing to individual events (for example a timer alarm firing). Developers should make full use of these events to keep Lua execution sequences short. If any individual task takes too long to execute then other queued tasks can time out and bad things start to happen.
* It is easy to assume that if two socket:send() calls are on consecutive lines in a Lua programme, then the first has completed by the time the second is executed. This is wrong. Each socket:send() request simply queues the send operation for dispatch. Neither will start to process until the Lua code has returned to its calling C function. Stacking up such requests in a single Lua task function burns scarce RAM and can trigger a PANIC. This is true for timer, network, and other callbacks. It is even the case for actions such as requesting a system restart, as can be seen by the following example:
<code Lua>
node.restart(); for i = 1, 20 do print("not quite yet -- ",i); end
</code>
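One way to avoid stacking up send requests is to issue each send only from the completion callback of the previous one. The following is a minimal sketch, not the firmware's own method: it assumes the net socket raises a "sent" event through sock:on(), and send_in_sequence, chunks and on_done are hypothetical names introduced for illustration.

```lua
-- Send a list of strings one at a time. Each send is issued from the "sent"
-- callback of the previous one, so only one request is ever queued.
-- ASSUMPTION: sock:on("sent", fn) fires once per completed sock:send().
local function send_in_sequence(sock, chunks, on_done)
  local i = 0
  local function send_next()
    i = i + 1
    if i <= #chunks then
      sock:send(chunks[i])          -- queue exactly one send, then return
    elseif on_done then
      on_done(sock)                 -- everything has been sent
    end
  end
  sock:on("sent", send_next)
  send_next()
end
```

Inside a connection handler this would be called as send_in_sequence(c, {"line 1\n", "line 2\n"}), with control returning to the SDK between each send.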
Any SDK-based application for the ESP8266 uses a startup hook void user_init(void). The system invokes this hook on boot. It is defined by convention in the C module user_main.c. The user_init() function can be used to do any initialisation required and to call the necessary timer alarms or system functions to bind the callback routines that implement the tasks needed in response to any system events. Individual task callbacks need to implement their actions and return control to the SDK as soon as practical, as the SDK framework is not pre-emptive so any further event tasks are queued on a pending list within the SDK kernel.
Excessively long-running tasks can therefore cause other system functions and services to time out, or allocate memory to buffer queued data, which can then trigger either the watchdog timer or memory exhaustion, both of which will ultimately cause the system to reboot.
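At the Lua level, the same constraint means that a long loop is better split into short slices, each run as its own event task. A minimal sketch follows; run_in_slices and its parameters are hypothetical names, and the defer argument stands for whatever mechanism schedules a later task (on the ESP8266 probably a one-shot timer alarm, whose exact signature is an assumption here).

```lua
-- Process items in slices of per_slice entries, returning control to the
-- caller (and so to the SDK) between slices.
-- defer(fn) must arrange for fn to run in a later event task, for example
-- (ASSUMED timer API, timer id 1 chosen arbitrarily):
--   function(fn) tmr.alarm(1, 10, 0, fn) end
local function run_in_slices(items, per_slice, work, defer)
  local next_index = 1
  local function slice()
    local last = next_index + per_slice - 1
    if last > #items then last = #items end
    for i = next_index, last do work(items[i]) end
    next_index = last + 1
    if next_index <= #items then defer(slice) end   -- more left: reschedule
  end
  slice()
end
```

The math library is avoided deliberately, since it is omitted from the firmware described here.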
SDK callbacks include network events (the socket:on() callbacks in Lua).
The eLua implementation sits within this framework:
* app/user/user_main.c contains the user_init() entry point. This reinitialises the UART, the volatile sections of flash memory (if necessary), the RomFS and SPIFFS before calling lua_main() with the command-line lua -i.
* The Lua RTS (see app/lua/lua.c) then sets up a timer to poll the input UART every 80 mSec to assemble a complete execution chunk which it then executes with a lua_pcall().
* The running Lua script can initialise one or more callbacks associated with events such as a timer. The module code will typically store the link to this Lua callback function in the Lua registry. When the callback hook is subsequently invoked, this hook code then retrieves this function reference from the registry and executes it with a lua_call().
* There are no concurrency or interlock issues with this approach as the SDK will only initiate a callback after the previously running task has completed, and in the case of Lua when the previous Lua chunk has completed – Lua chunks are executed one-at-a-time.
Consider a simple telnet example given in examples/fragment.lua:
<code Lua>
s=net.createServer(net.TCP)
s:listen(23,function(c)
  con_std = c
  function s_output(str)
    if(con_std~=nil) then
      con_std:send(str)
    end
  end
  node.output(s_output, 0)
  c:on("receive",function(c,l)
    node.input(l)
  end)
  c:on("disconnection",function(c)
    con_std = nil
    node.output(nil)
  end)
end)
</code>
This example doesn't use upvalues and all declarations are global, so we can reorder this code for clarity (though doing this adds a few extra globals):
<code Lua>
function c_receive(c,l)
  node.input(l)
end
function c_disconnection(c)
  con_std = nil
  node.output(nil)
end
function s_output(str)
  if(con_std~=nil) then
    con_std:send(str)
  end
end
function s_listen(c)
  con_std = c
  node.output(s_output, 0)
  c:on("receive",c_receive)
  c:on("disconnection",c_disconnection)
end
s=net.createServer(net.TCP)
s:listen(23,s_listen)
</code>
So let us consider how this is executed:
* The main routine declares the four global functions c_receive, c_disconnection, s_output and s_listen; the server s is bound to port 23, registering s_listen as the initialisation callback. The main routine then exits, with the global variables retained and the main routine code garbage collected.
* When another computer connects to port 23, the listener handler retrieves the reference to s_listen from the registry and calls it with the socket parameter. This function then binds s_output to the node.output hook, registering it in the registry, and likewise c_receive and c_disconnection are bound and registered to the respective on handlers. We now have four routines registered in the registry associated with four events, and this routine then exits with only the routine's execution frame garbage collected.
* When a record is received, the on-receive handler retrieves the reference to c_receive from the registry and calls it, passing it the record. This routine then passes this to node.input() and exits. (The node input handler marshals these records into a complete Lua chunk.)
* The node.input handler is polling on an 80 mSec alarm and if a complete Lua chunk is available, it executes it. Any output is then passed to the node.output handler, which retrieves and calls s_output, which exits on completion. Any pending sends are then processed.
* This cycle repeats until the other computer disconnects, which triggers the on-disconnect handler. This retrieves the c_disconnection reference from the registry and calls it. This routine dereferences the connected socket and closes the node.output hook and exits, returning control to the disconnect handler which garbage collects any associated sockets and registered on handlers.
The SDK can and will often schedule other event tasks in between these Lua executions (e.g. to do the actual TCP stack processing). The longest individual Lua execution in this example is only 20 bytecode instructions (in the main routine). The original version was a few instructions shorter in that temporary locals were used to hold the closure references instead of globals, but the runtime and memory footprint aren't materially different.
Understanding how the system executes your code can help you structure it better and improve memory usage. Each event task is established by a callback in an API call in an earlier task.
* Each event task's Lua function is executed from the C library code using a lua_call(). Even system initialisation, which executes the dofile("init.lua"), can be treated as a special case of this. Each function can invoke other functions and so on, but it must ultimately return control to the C library code.
* By their very nature Lua local variables only exist within the context of an executing Lua function, and so all locals are destroyed between these lua_call() actions. No locals are retained across events.
* A global persists until you dereference it by assigning nil to it. Globals can be readily enumerated by a for k,v in pairs(_G) do so their use is transparent.
So all Lua callbacks are called by C wrapper functions that are themselves callback activated by the SDK as a result of a given event. Such C wrapper functions themselves frequently need to store state for passing between calls or to other wrapper C functions. The Lua registry is simply another Lua table which is used for this purpose, except that it is hidden from direct Lua access. Any content that needs to be saved is created with a unique key. Using a standard Lua table enables standard garbage collection algorithms to operate on its content.
Note that we have identified a number of cases where library code does not correctly clean up Registry content when closing out an action, leading to memory leaks.
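Unlike the registry, globals are visible from Lua, so it is easy to check what your own code is retaining between events. A minimal sketch of the pairs(_G) enumeration mentioned earlier follows; list_globals is a hypothetical helper name and the optional env parameter exists only so that the function can be exercised on a plain table.

```lua
-- Return a sorted list of "name (type)" strings for every entry in env,
-- which defaults to the global table _G.
local function list_globals(env)
  local names = {}
  for k, v in pairs(env or _G) do
    names[#names + 1] = tostring(k) .. " (" .. type(v) .. ")"
  end
  table.sort(names)
  return names
end

-- Print every global currently retained by the Lua runtime.
for _, line in ipairs(list_globals()) do print(line) end
```

Anything in this list that your application no longer needs can be released by assigning nil to it.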
If you are used to coding in a procedural paradigm then it is understandable that you consider using tmr.delay() to time sequence your application. However as discussed in the previous section, with nodeMCU Lua you are coding in an event-driven paradigm.
If you look at the app/modules/tmr.c code for this function, then you will see that it executes a low level ets_delay_us(delay). This function isn't part of the nodeMCU code or the SDK; it's actually part of the xtensa-lx106 boot ROM, and is a simple timing loop which polls against the internal CPU clock. It does this with interrupts disabled, because if they are enabled then there is no guarantee that the delay will be as requested.
tmr.delay() is appropriate if you want exact timing control on an external hardware I/O (e.g. lifting a GPIO pin high for 20 μSec). It will achieve no functional purpose in pretty much every other use case, as any other system code-based activity will be blocked from execution; at worst it will break your code and create hard-to-diagnose timeout errors. A good indication here is if you want a delay of more than 10 mSec or so, then using tmr.delay() is the wrong approach. You should be using a timer alarm or other library callback, to allow the other processing to take place. As the nodeMCU documentation correctly advises (translating Chinese English into English): tmr.delay() will make the CPU work in non-interrupt mode, so other instructions and interrupts will be blocked. Take care in using this function.
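A procedural "do A, delay, do B" sequence can usually be rewritten so that each step runs in its own event task, with a timer alarm between steps. The sketch below illustrates the idea under stated assumptions: run_sequence, steps and after are hypothetical names, and the timer call shown in the comment assumes the signature tmr.alarm(id, interval_ms, repeat, callback) with repeat = 0 meaning one-shot.

```lua
-- Run steps[1], wait delay_ms, run steps[2], and so on, without blocking.
-- after(ms, fn) must call fn once, roughly ms milliseconds later, e.g.
-- (ASSUMED timer API, timer id 0 chosen arbitrarily):
--   function(ms, fn) tmr.alarm(0, ms, 0, fn) end
local function run_sequence(steps, delay_ms, after)
  local i = 0
  local function step()
    i = i + 1
    steps[i]()                                  -- one short task per event
    if i < #steps then after(delay_ms, step) end
  end
  if #steps > 0 then step() end
end
```

Between steps the Lua code has returned to the SDK, so network and other callbacks continue to be serviced, which is exactly what a tmr.delay() loop prevents.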
Most of us have fallen into the trap of creating an init.lua that has a bug in it, which then causes the system to reboot and hence gets stuck in a reboot loop. If you haven't then you probably will do so at least once.
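One way to leave yourself an escape route is to have init.lua do nothing except schedule the real application a few seconds after boot. A minimal sketch, assuming the timer signature tmr.alarm(id, interval_ms, repeat, callback) with repeat = 0 for a one-shot alarm; the file name app.lua and the use of timer 0 are hypothetical choices made for illustration.

```lua
-- init.lua (sketch): defer the real start-up so that a broken application
-- cannot lock you out of the serial console.
local function start_app()
  -- "app.lua" is a hypothetical name for the application entry point.
  dofile("app.lua")
end

print("app.lua starts in 3 s; issue file.remove(\"init.lua\") now to abort")
-- ASSUMED API: tmr.alarm(id, interval_ms, repeat, callback); 0 = one-shot.
tmr.alarm(0, 3000, 0, start_app)
```

This script only runs on the ESP8266 itself, since it depends on the firmware's tmr module.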
* Keep init.lua as simple as possible: say, configure the wifi and then start your app on a one-shot tmr.alarm() after a 2-3 sec delay. This delay is long enough to issue a file.remove("init.lua") through the serial port and recover control that way.
* Test init.lua by creating it as init_test.lua, say, and manually issuing a dofile("init_test.lua") through the serial port, and then only rename it when you are certain it is working as you require.
Note that there are two methods of saving compiled Lua to SPIFFS:
1. Use node.compile() on the .lua source file, which generates the equivalent bytecode .lc file. This approach strips out all the debug line and variable information.
2. Use loadfile() to load the source file into memory, followed by string.dump() to convert it in-memory to a serialised load format which can then be written back to a .lc file. This approach creates a bytecode file which retains the debug information.
The memory footprint of the bytecode created by method (2) is the same as when executing source files directly, but the footprint of bytecode created by method (1) is typically 60% of this size, because the debug information is almost as large as the code itself. So using .lc files generated by node.compile() considerably reduces code size in memory, albeit with the downside that the information reported for any runtime error is extremely limited.
In general consider method (1) if you have stable production code that you want to run in as low a RAM footprint as possible. Yes, method (2) can be used if you are still debugging, but you will probably be changing this code quite frequently, so it is easier to stick with .lua files for code that you are still developing.
Note that if you use require("XXX") to load your code then this will automatically search for XXX.lc then XXX.lua so you don't need to include the conditional logic to load the bytecode version if it exists, falling back to the source version otherwise.
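The two methods can be sketched as follows. Method (1) is a single call, shown as a comment. For method (2), compile_with_debug is a hypothetical helper, and the file.open(), file.write() and file.close() calls assume the firmware's single-open-file API in which file.open() returns nil on failure.

```lua
-- Method (1): strip the debug information; writes foo.lc next to foo.lua.
--   node.compile("foo.lua")

-- Method (2): keep the debug information by dumping the loaded chunk.
-- ASSUMPTION: file.open(name, mode) returns nil on failure, and file.write()
-- and file.close() act on the currently open file.
local function compile_with_debug(src, dst)
  local chunk, err = loadfile(src)        -- compile the source in memory
  if not chunk then return nil, err end
  local bytecode = string.dump(chunk)     -- serialise, debug info retained
  if not file.open(dst, "w+") then
    return nil, "cannot open " .. dst
  end
  file.write(bytecode)
  file.close()
  return #bytecode                        -- size of the .lc file in bytes
end
```

A call such as compile_with_debug("foo.lua", "foo.lc") returns the number of bytes written, or nil and an error message.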
* Running luac against your source on your PC with the -l -s option will give you a good idea of what your code will generate. The main difference between the PC and ESP8266 variants is that size_t for the ESP8266 is 4 bytes rather than the 8 bytes found on modern 64-bit development PCs; and the eLua variants generate different access references for ROM data types. If you want to see what the string.dump() version generates then drop the -s option to retain the debug information.
* Upload your .lc files to the PC and disassemble them there. There are a number of Lua code disassemblers which can list off the compiled code that your application modules will generate, if you have a script to upload files from your ESP8266 to your development PC. I use ChunkSpy, but you will need to apply the following patch so that ChunkSpy understands eLua data types:
<code diff>
--- a/ChunkSpy-0.9.8/5.1/ChunkSpy.lua 2015-05-04 12:39:01.267975498 +0100
+++ b/ChunkSpy-0.9.8/5.1/ChunkSpy.lua 2015-05-04 12:35:59.623983095 +0100
@@ -2193,6 +2193,9 @@
       config.AUTO_DETECT = true
     elseif a == "--brief" then
       config.DISPLAY_BRIEF = true
+    elseif a == "--elua" then
+      config.LUA_TNUMBER = 5
</code>
* Call node.heap() regularly through your code.
Consider the output of dofile("test1a.lua") on the following code compared to the equivalent where the function pnh() is removed and the extra print(heap()) statement is placed inline:
<code Lua>
-- test1a.lua
collectgarbage()
local heap = node.heap
print(heap())
local function pnh() print(heap()) end
pnh()
print(heap())
</code>
Heap Value | Function Call | Inline |
---|---|---|
1 | 20712 | 21064 |
2 | 20624 | 21024 |
3 | 20576 | 21024 |
Here bigger means less RAM used.
Of course you should still use functions to structure your code and encapsulate common repeated processing, but just bear in mind that each function definition has a relatively high overhead for its header record and stack frame (compared to the 20 odd KB RAM available). So try to avoid overusing functions. If there are fewer than a dozen or so lines in the function then you should consider putting this code inline if it makes sense to do so.
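To make this kind of comparison repeatable, the heap can be sampled before and after a piece of code. The following is a rough sketch only: heap_cost is a hypothetical helper, it assumes node.heap() returns the free heap in bytes, and the figure it reports includes whatever the measured function returns and keeps alive.

```lua
-- Return the number of heap bytes still in use after fn() has run, together
-- with fn's result. heap defaults to node.heap (ASSUMED to return the free
-- heap in bytes) and can be overridden, e.g. for testing off the device.
local function heap_cost(fn, heap)
  heap = heap or node.heap
  collectgarbage()                 -- settle the heap before measuring
  local before = heap()
  local result = fn()              -- result stays referenced while we measure
  collectgarbage()
  return before - heap(), result
end
```

For example, heap_cost(function() return require("connector") end) gives an indication of what loading that module costs in RAM.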
* Use node.compile() to pre-compile any production code. This removes the debug information from the compiled code, reducing its size by roughly 40%. (However this is still perhaps 1.5-2x larger than a LuaSrcDiet-compressed source format, so if SPIFFS is tight then you might consider leaving less frequently run modules in Lua format. If you do a compilation, then you should consider removing the Lua source copy from the file system as there's little point in keeping both on the ESP8266.)
* Assigning nil to a variable dereferences the previous content of that variable. (Note that reference-based variables such as tables, strings and functions can have multiple variables referencing the same object, but once the last reference has been set to nil, the collector will recover the storage.)
* The require() function creates a reference for the loaded module in the package.loaded table, and this reference prevents the module being garbage collected. To make a module volatile, you should remove this reference by setting it to nil. You can't do this in the outermost level of the module (since the reference is only created once execution has returned from the module code), but you can in any module function, and typically an initialisation function for the module, as in the following example:
<code Lua>
local s=net.createServer(net.TCP)
s:listen(80,function(c) require("connector").init(c) end)
</code>
connector.lua would be a standard module pattern except that the M.init() routine must include the lines
<code Lua>
local M, module = {}, ...
function M.init(csocket)
  package.loaded[module]=nil
end
return M
</code>
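The effect of this pattern can be checked on any Lua 5.1 interpreter. In the sketch below the module is registered through package.preload purely so that the example is self-contained; connector_demo is a hypothetical name, and on the ESP8266 the module body would live in its own file instead.

```lua
-- Register a hypothetical module in package.preload. require() calls this
-- loader with the module name, which the chunk receives as "...".
package.preload["connector_demo"] = function(...)
  local M, module = {}, ...
  function M.init(csocket)
    package.loaded[module] = nil   -- drop the reference held by require()
    return csocket
  end
  return M
end

local conn = require("connector_demo")
print(package.loaded["connector_demo"] ~= nil)  -- module is still referenced
conn.init("socket placeholder")
print(package.loaded["connector_demo"] == nil)  -- reference has been removed
```

Once the local conn also goes out of scope, nothing references the module table and the collector can reclaim it.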
require() will automatically search for connector.lc followed by connector.lua, so the code will work for both source and compiled variants.
* Use luac on your development PC to produce a bytecode listing of your code and to syntactically validate new code before downloading it to the ESP8266. Having Lua on your PC will also allow you to develop server-side applications and embedded applications in a common language.