The goal is defined: a robust, self-repairing mesh that is not hostage to the failure of any single ESP device.
The strategy is less obvious.
What may be true
a)
A time base is established by a creator device. If two devices simultaneously try to create a time base, the device acting as the wifi server wins and establishes it.
The time base is measured in ticks held in a 32-bit number that rolls over after about 71 minutes. This rollover confuses past and present since, as best I know, the code defines past time as a tick count less than the current tick count.
b)
The universal time base is presumably used to stop devices talking over each other. Devices squawk at a random interval relative to the mesh network time.
In other words, collision avoidance.
Now with other networks, e.g. Ethernet, the rare collisions can be detected by the absence of an acknowledgement from the recipient.
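On the rollover in a): a 71 minute period is what a 32-bit count of microseconds gives (2^32 microseconds is roughly 4295 seconds, or about 71.6 minutes), so the ticks are presumably microseconds. If the time base is kept, the past/present confusion can be avoided by comparing differences rather than raw values. A minimal sketch follows; tickBefore and ticksElapsed are made-up names for illustration and are not taken from the easyMesh code.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of easyMesh.

// True if tick value `a` is earlier than `b`. The unsigned subtraction wraps
// modulo 2^32, and reading the result as signed gives the right ordering as
// long as the two values are less than 2^31 ticks apart (about 35 minutes if
// one tick is one microsecond, which is an assumption here).
inline bool tickBefore(uint32_t a, uint32_t b) {
    return static_cast<int32_t>(a - b) < 0;
}

// Ticks elapsed from `since` to `now`. Unsigned subtraction gives the correct
// distance even when the counter rolled over between the two readings.
inline uint32_t ticksElapsed(uint32_t now, uint32_t since) {
    return now - since;
}
```

For example, with a = 0xFFFFFFF0 (just before rollover) and b = 0x00000010 (just after), the plain test a < b says a is not in the past, while tickBefore(a, b) says it is, and ticksElapsed(b, a) is 32.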
We need a defined strategy and then a push to improve the code to fit the strategy.
Probably the devices on the mesh should be considered asynchronous and be allowed to squawk when they choose. This removes the synchronization of devices that the code currently appears to enforce via the time base. It also means an asynchronous design needs to detect collisions and resend if needed.
Wifi imposes a server/client design, so there is a need to re-establish a unique server if the server goes offline. Apart from this client/server requirement, the devices are essentially peers of each other. Any device can squawk to one or more other devices when it chooses and accepts the risk that it can collide with other transmissions. If it does collide, it backs off a random amount of time and re-sends a finite number of times before assuming the recipient went offline.
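The squawk, back off, and re-send idea above could look roughly like the sketch below. This is plain C++ rather than easyMesh code: sendWithBackoff, BackoffConfig, and the sendOnce and wait callbacks are all hypothetical, and the numbers are placeholders. A missing acknowledgement is treated as a collision. Doubling the random window after each failure is an extra assumption borrowed from the usual exponential backoff schemes, not something the current code does.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical settings; the values are placeholders, not tuned for ESP hardware.
struct BackoffConfig {
    int maxAttempts = 5;        // give up after this many sends
    uint32_t baseDelayMs = 10;  // first random window is 0..baseDelayMs
    uint32_t maxDelayMs = 500;  // the window never grows past this
};

// sendOnce: transmits the message once and returns true if an acknowledgement
//           came back in time (a missing ack is treated as a collision).
// wait:     pauses for the given number of milliseconds.
// Returns false when every attempt failed, i.e. the caller should assume the
// recipient went offline.
template <typename Rng>
bool sendWithBackoff(const std::function<bool()>& sendOnce,
                     const std::function<void(uint32_t)>& wait,
                     Rng& rng,
                     const BackoffConfig& cfg = BackoffConfig()) {
    uint32_t window = cfg.baseDelayMs;
    for (int attempt = 0; attempt < cfg.maxAttempts; ++attempt) {
        if (sendOnce()) {
            return true;  // acknowledged, done
        }
        if (attempt + 1 == cfg.maxAttempts) {
            break;  // no point waiting after the final attempt
        }
        // Pick a random delay so two colliding senders are unlikely to
        // retry at the same moment.
        std::uniform_int_distribution<uint32_t> dist(0, window);
        wait(dist(rng));
        window = std::min(window * 2, cfg.maxDelayMs);
    }
    return false;
}
```

On a device the wait callback would wrap whatever delay or timer mechanism the sketch already uses, and sendOnce would wrap the actual transmit plus a timed wait for the recipient's acknowledgement.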
Nodes talking to nodes, mimicking nets and subnets, is a bit of overkill since the mesh is most useful for several sensors attached to ESPs connecting via wifi. A single node might suffice. Nodes spanning to other nodes could come later.
The ESP radio section is half duplex so, as best I know, there is no way for an ESP to detect a talkover by listening to its own transmission.
sfranzyshen wrote: changes to the way destId was being set have been made in both nodesync and timesync functions and a fix in easyMeshConnection to correct data corruption that might correct some of the problems we have been having ... have a look at the changes in https://github.com/sfranzyshen/easyMesh/tree/devel
I ran this code over night with no resets or mesh loss ... can anyone else confirm this?
I had seven modules running. I reduced the message frequency so it wasn't as hard to follow, closer to once per second, though that varied.
It seemed to be working reliably. Then there were a lot of messages going back and forth, something like ten times the normal frequency. After a couple of minutes it slowed down and got back to normal. And then there was a crash.
Three resets. One restart after the WDT reset.