By picstart
#56951 Possible issues with latest development code
"This node: 9406A82F Received from 7F138CFE msg=Hello from node 940c2823"
I don't understand this: if each node sends its own unique message "Hello from node xxxxxxxx", how can a message received from one node contain the text of another node's message?
I have probably misunderstood the design. I thought it was peer to peer, but from the output below it appears to be client to server, then repeated from the server to the destination client.
The original code lacks in-code documentation, which will make it difficult even for the original coder to follow once some time has elapsed.
There are 3 ESPs in this test.
[quote]
startHere: New Connection, adopt=1

This node: 9406A82F Received from 7F138CFE msg=Hello from node 940c2823

This node: 9406A82F Received from 7F138CFE msg=Hello from node 7f138cfe

This node: 9406A82F Received from 7F138CFE msg=Hello from node 940c2823

This node: 9406A82F Received from 7F138CFE msg=Hello from node 7f138cfe

This node: 9406A82F Received from 7F138CFE msg=Hello from node 940c2823

This node: 9406A82F Received from 7F138CFE msg=Hello from node 940c2823

This node: 9406A82F Received from 7F138CFE msg=Hello from node 7f138cfe[/quote]


The "start here" code below was modified to show the node IDs in hex:
[code]// if the time is ripe, send everyone a message!
if ( sendMessageTime != 0 && sendMessageTime < mesh.getNodeTime() ){
  String msg = "Hello from node ";
  msg += String(mesh.getNodeId(), HEX); // dk mod: append the node ID in hex; the ID is derived from the MAC (aka BSSID)
  // msg += mesh.getNodeId();           // previous line: appended the node ID in decimal
  mesh.sendBroadcast( msg );
  sendMessageTime = 0;
}[/code]
[code] Serial.printf("This node: %X Received from %X msg=%s\n\r",mesh.getNodeId(), from, msg.c_str());[/code]
By sfranzyshen
#56966
picstart wrote: Possible issues with latest development code
"This node: 9406A82F Received from 7F138CFE msg=Hello from node 940c2823"
I don't understand this: if each node sends its own unique message "Hello from node xxxxxxxx", how can a message received from one node contain the text of another node's message?
I have probably misunderstood the design. I thought it was peer to peer, but from the output below it appears to be client to server, then repeated from the server to the destination client.
The original code lacks in-code documentation, which will make it difficult even for the original coder to follow once some time has elapsed.


All TCP/IP communication takes place only between a node's STA and its parent node's AP; no IP traffic is routed across other nodes. Each node sets up its own AP with its own private little class C wireless network, and since there is no IP forwarding on the AP, any node connected to that AP only sees packets from that AP and not from any other connected node. So a message from any node to any other node has to be handed off along these parent/child connections, unless of course it is sent directly from a parent to its child or vice versa. What is not clear to me is how the nodesync and timesync stuff is expected to function: the nodesync control messages seem to spread across parents (child to all other parents and children), while the timesync only happens between a parent and its child ... and one of them keeps kicking off WDT resets ...
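
To make the hand-off concrete, here is a minimal sketch of that relay idea. It is a toy model, not easyMesh code: the `Delivery`, `Links`, `relay` and `broadcast` names are made up for illustration, and it assumes the links form a tree (no loops). Each node passes a broadcast on to every direct neighbour except the one it came from, so the receiver sees its direct neighbour as the sender while the payload still carries the originator's text, which is consistent with the "Received from 7F138CFE msg=Hello from node 940c2823" line in the log.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: one record per receive callback that would fire.
struct Delivery {
    uint32_t receiver;  // node whose receive callback fires
    uint32_t from;      // direct neighbour that handed the message over
    std::string msg;    // payload, unchanged since the originator built it
};

// Direct STA<->AP links only; there is no IP routing between nodes.
using Links = std::map<uint32_t, std::vector<uint32_t>>;

// Hand msg from `node` to every neighbour except the one it came from.
// Assumes a tree topology, so the recursion terminates.
void relay(const Links& links, uint32_t node, uint32_t cameFrom,
           const std::string& msg, std::vector<Delivery>& out) {
    auto it = links.find(node);
    if (it == links.end()) return;
    for (uint32_t next : it->second) {
        if (next == cameFrom) continue;
        out.push_back({next, node, msg});
        relay(links, next, node, msg, out);
    }
}

// Start a broadcast at `origin` and collect every delivery it causes.
std::vector<Delivery> broadcast(const Links& links, uint32_t origin,
                                const std::string& msg) {
    std::vector<Delivery> out;
    relay(links, origin, origin, msg, out);
    return out;
}
```

With three nodes chained as 940C2823, 7F138CFE, 9406A82F, a broadcast started at 940C2823 reaches 9406A82F with `from` equal to 7F138CFE and the text "Hello from node 940c2823".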

If all that didn't just confuse the sh*t out of you ... please chime in ;)
By sfranzyshen
#57014 I took the example mentioned by muratdemirtas and removed all of the timesync stuff, leaving just the nodesync in. I then started two nodes. NodeA starts first, scans the network and finds no other nodes. NodeB starts second, scans the network, finds NodeA and makes a STA connection. Now here is where things differ. Since NodeA didn't find an AP during its scan, a timer is started that calls startStationScan() to rescan the Wi-Fi again in SCAN_INTERVAL (10000) [and it never gets turned off again, so it is called over and over even if a connection is made]. NodeB never has a timer set: its scanning stops once the Wi-Fi connection is established, and several nodesync messages are exchanged successfully. NodeA, however, has no STA connection and keeps scanning for an AP. It is during one of these scans that NodeA's AP reaches its NODE_TIMEOUT (3000000 //uSecs) for NodeB's TCP connection and drops it. NodeB sees the lost TCP connection and calls meshDisconCb(), then wifi_station_disconnect(), which kicks off connectToBestAP(); the STA scans and connects again, and then everything repeats.

STA:
[code]0x20\0x09meshRecvCb(): data={"dest":10469232,"from":10417291,"type":5,"subs":[]} fromId=10417291
0x10\0x09handleNodeSync(): with 10417291
0x10\0x09handleNodeSync(): valid NODE_SYNC_REQUEST 10417291 sending NODE_SYNC_REPLY
0x20\0x09sendMessage(conn): conn-chipId=10417291 destId=10417291 type=6 msg=[]
0x20\0x09Sending to 10417291-->{"dest":10417291,"from":10469232,"type":6,"subs":[]}<--
0x20\0x09meshRecvCb(): lastRecieved=215338692 fromId=10417291
0x8\0x09meshDisconCb(): 0x8\0x09Station Connection! Find new node. local_port=17370
0x8\0x09wifiEventCb(): EVENT_STAMODE_DISCONNECTED
0x8\0x09connectToBestAP():0x8\0x09connectToBestAP(): no nodes left in list, rescanning
0x4\0x09stationStatus Changed to STATION_IDLE
0x8\0x09manageConnections(): dropping 10417291 ESPCONN_CLOSE
0x8\0x09closeConnection(): conn-chipId=10417291
0x8\0x09-->scan started @ 227302978<--
0x8\0x09stationScanCb():-- > scan finished @  229429473 < --[/code]

AP:
[code]0x10\0x09manageConnections(): start nodeSync with 10469232
0x10\0x09startNodeSync(): with 10469232
0x20\0x09sendMessage(conn): conn-chipId=10469232 destId=10469232 type=5 msg=[]
0x20\0x09Sending to 10469232-->{"dest":10469232,"from":10417291,"type":5,"subs":[]}<--
0x20\0x09meshRecvCb(): data={"dest":10417291,"from":10469232,"type":6,"subs":[]} fromId=10469232
0x10\0x09handleNodeSync(): with 10469232
0x10\0x09handleNodeSync(): valid NODE_SYNC_REPLY from 10469232
0x20\0x09meshRecvCb(): lastRecieved=198334261 fromId=10469232
0x8\0x09-->scan started @ 199169317<--
0x10\0x09manageConnections(): start nodeSync with 10469232
0x10\0x09startNodeSync(): with 10469232
0x20\0x09sendMessage(conn): conn-chipId=10469232 destId=10469232 type=5 msg=[]
0x20\0x09Sending to 10469232-->{"dest":10469232,"from":10417291,"type":5,"subs":[]}<--
0x8\0x09stationScanCb():-- > scan finished @  201307844 < --
0x8\0x09\0x09found : Mesh10469232, -19dBm0x8\0x09 MESH_PRE< ---0x8\0x09
0x8\0x09\0x09found : Sasquatch Sighting, -75dBm0x8\0x09
0x8\0x09\0x09Found  1 nodes with _meshPrefix = "Mesh"
0x8\0x09connectToBestAP():0x8\0x09connectToBestAP(): no nodes left in list, rescanning
0x8\0x09manageConnections(): dropping 10469232 NODE_TIMEOUT last=198334261 node=202334800
0x8\0x09closeConnection(): conn-chipId=10469232
0x8\0x09meshDisconCb(): 0x8\0x09AP connection.  No new action needed. local_port=5555
0x8\0x09wifiEventCb(): EVENT_SOFTAPMODE_STADISCONNECTED[/code]

After changing NodeA to NOT perform rescans (by disabling the timer), the NODE_TIMEOUT is never reached and everything stays connected, exchanging nodesync messages indefinitely. My conclusion is that we need to handle scans better, or not time out while a scan is in progress.
here's the diff between the original and what I did here for testing ...
https://github.com/sfranzyshen/easyMesh ... g?expand=1
and here is the branch ...
https://github.com/sfranzyshen/easyMesh/tree/no-timing

UPDATE: after running for around 10 hours ... with two nodes just handing nodesync messages back and forth (and nothing else since all the time stuff was ripped out) ... I still have NODE_TIMEOUTS being reached on the AP ... resulting in the STA being dropped... then repeating the connection process.

UPDATE UPDATE: I was mistaken ... the repeat_flag is being set to false (0), so the rescan timer should not keep firing, and disabling it makes (or at least should make) no difference. I too am coming to the conclusion that this whole protocol (code) is fun for the demo, but it is becoming obvious just how inefficient and impractical it is in real use. The basic building blocks of this whole thing need to be redesigned and tightened up. I'm stepping back even further on this and starting over ...

UPDATE UPDATE UPDATE: I have raised the timeout to 10 sec. So far (a couple of hours) there have been no timeouts, so my guess is that this is more about how efficient or speedy the code is than about messages being lost or dropped. We just need a better handshake mechanism, and/or at the very least not fail so hard at the first sign of a timeout ...
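
For what it's worth, here is a rough sketch of what "handle scans better" could look like. It is not the easyMesh implementation: `shouldDrop` and its parameters are hypothetical names, and the policy (never drop during a scan, and restart the timeout clock when a scan ends) is only one possible choice. In the AP log above the drop happens about 4.0 s after the last receive (last=198334261, node=202334800) but only about 1.0 s after the scan finished (201307844), so a check that is merely suppressed while the scan is running would still have dropped the connection.

```cpp
#include <algorithm>
#include <cstdint>

// Timeout value quoted in the post, in microseconds.
constexpr uint32_t NODE_TIMEOUT_US = 3000000;

// Hypothetical helper: should this connection be dropped for inactivity?
// - never drop while a scan is in progress (the radio is busy elsewhere)
// - after a scan, count silence from the end of the scan, not from the last receive
// Unsigned subtraction keeps the elapsed times correct when the 32-bit
// microsecond counter wraps around.
// If no scan has happened yet, pass lastReceivedUs as lastScanEndUs.
bool shouldDrop(uint32_t nowUs, uint32_t lastReceivedUs, uint32_t lastScanEndUs,
                bool scanInProgress, uint32_t timeoutUs = NODE_TIMEOUT_US) {
    if (scanInProgress) return false;
    const uint32_t sinceReceive = nowUs - lastReceivedUs;
    const uint32_t sinceScanEnd = nowUs - lastScanEndUs;
    return std::min(sinceReceive, sinceScanEnd) > timeoutUs;
}
```

Fed with the numbers from the AP log, `shouldDrop(202334800, 198334261, 201307844, false)` keeps the connection, whereas ignoring the scan (passing 198334261 as the scan end as well) drops it with the 3 s timeout and keeps it with a 10 s timeout.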
Last edited by sfranzyshen on Mon Oct 24, 2016 2:20 pm, edited 3 times in total.
By picstart
#57040 OK here goes.
A possible design for scanning
All devices are equal, but one needs to be the AP, and up front no device can be considered the designated AP.

1) A device scans to see if there is already an AP.
If there is an AP, it meshes with that AP.
If there is no AP, it backs off for a random amount of time and rescans; if it again finds no AP, it establishes itself as the AP.
2) The random wait lets the device that drew the shortest wait time win the AP role,
but only in the situation where there is no existing AP and more than one device is competing for it.
3) If the existing AP drops out, the same method is used to establish a new AP.
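
The three steps above can be sketched roughly as follows. This is only an illustration of the idea, not tested on hardware: `Role`, `chooseRole` and the two callables are hypothetical names, with `scanFindsAp` standing in for a Wi-Fi scan and `waitMs` for whatever delay mechanism the firmware uses.

```cpp
#include <cstdint>

enum class Role { Station, AccessPoint };

// Hypothetical election step for one device.
// scanFindsAp: callable returning true if a scan sees an existing mesh AP
// waitMs:      callable that waits for the given number of milliseconds
// backoffMs:   this device's back-off time (random, or derived from its ID)
template <typename Scan, typename Wait>
Role chooseRole(Scan scanFindsAp, Wait waitMs, uint32_t backoffMs) {
    if (scanFindsAp()) return Role::Station;  // step 1: an AP exists, join it
    waitMs(backoffMs);                        // no AP: back off ...
    if (scanFindsAp()) return Role::Station;  // ... someone else won the race
    return Role::AccessPoint;                 // still no AP: take the role
}
```

When two devices start together, the one with the shorter back-off finishes its second scan first and becomes the AP; the other one then finds that AP on its rescan and joins as a station. Step 3 would simply mean running the same function again whenever the AP is lost.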

I'm not sold that we need a synchronizing time base. TCP/IP, if I have it right, will take care of resending broken transmissions, so the issue is getting a design that establishes and, if needed, re-establishes the mesh AP.
There is the special case where a device finds itself alone; we need to consider what it should do while it waits for the company of another device.

Randomness is established by using the analog pin voltage as the seed value.
Since we are considering the BSSID (MAC) as the unique mesh ID, its uniqueness also makes it a good source for the back-off time (which frees up the analog pin): take the BSSID value modulo some large number to create a unique back-off time that decides which device wins the AP role. This is almost the same as having a designated AP device, but not quite, since the device with the lowest BSSID only wins if it is competing for the AP role at the same time.
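
A minimal sketch of the modulo idea, assuming the node ID is a 32-bit value derived from the MAC. The function name `backoffFromId` and the 100 ms / 5000 ms bounds are placeholders chosen for illustration, not values from easyMesh.

```cpp
#include <cstdint>

// Hypothetical helper: map a node ID (derived from the MAC/BSSID) to a
// back-off time in milliseconds within [minBackoffMs, maxBackoffMs).
constexpr uint32_t backoffFromId(uint32_t nodeId,
                                 uint32_t maxBackoffMs = 5000,
                                 uint32_t minBackoffMs = 100) {
    return minBackoffMs + nodeId % (maxBackoffMs - minBackoffMs);
}
```

One caveat with this design: the result is deterministic per device, but two different IDs that are congruent modulo the range get the same delay, and once the modulus is smaller than the ID it is no longer the lowest BSSID that wins but the lowest remainder. A larger modulus makes collisions less likely at the cost of longer worst-case waits.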