-->

Bug hunt

Posted: Fri Jun 09, 2017 3:41 pm
by Electroguard
I think it's worth mentioning some 'bug' observations from a different perspective in case they help pinpoint the problem, because otherwise the bug could remain forever.

Stage 1.
The first sign of problems is when a script has grown sufficiently to cause a reboot when clicking SAVE.
This may not seem much of a problem at the time because the script says it has Saved ok in the browser status bar, and indeed may pop up the "Saved ok" confirmation window after the device has rebooted and reconnected, but it is just the beginning of an inevitable impossibly slippery slope.

Stage 2.
Eventually the rebooted interpreter fails to return a "Saved OK" confirmation, forcing a second SAVE.

Initially the rebooting only occurs with an interpreter which has done a previous SAVE, therefore a newly powered-up or re-booted interpreter will probably SAVE ok without a reboot, but subsequent SAVEs will likely cause a reboot, resulting in every alternate SAVE rebooting or succeeding.

This difference of behaviour suggests that additional (virgin or non-virgin) 'interpreter state' info is also being saved for subsequent interpreter use along with the actual script, else if it was just the script being SAVEd the results should be identical every time.

If the saved 'state' data is part of the problem it's possible (even likely) that all other symptoms such as quantum weirdness, script corruption and eventually interpreter corruption, are all just different extremes of the same root cause.

So if the interpreter is saving 'state' data as well as the script, it could suggest a problem with the writing of the 'state' data, ie: saving too big a block... causing it to outgrow its allocated area and bleed over things like the browser cache, and perhaps even the interpreter cache or program counter or REBOOT code (or whatever else might be able to actually trigger a reboot?).

It's quite possible that the interpreter may have been incorrectly saving its 'state' data from day 1 without being noticed, until the saved state data grew sufficiently to trespass out-of-bounds and bleed over the 'reboot' trigger and browser cache areas etc. Presumably the virgin state data is smaller, which would explain why a virgin SAVE doesn't trigger a reboot.

Stage 3.
Quantum weirdness starts when the browser fails to display some of the web components correctly, but with strange intermittent symptoms which may change each time the script is RUN. Simply doing a browser Back then re-RUN may cause previous problems to now display correctly but other web components to be wrong, often with different results for many re-RUNs, and perhaps eventually everything might display correctly.
This is not script corruption, else items that didn't display first time could never display correctly, so it would appear to be the browser cache that is being corrupted, and apparently different areas each time.

Stage 4.
Permanent script corruptions. Clicking EDIT would now load the corrupted script, but the corruption may not necessarily be obvious or noticeable. The unwary could chase many ghosts.

Stage 5.
Corruption of the Esp-Basic interpreter. Madness. Nothing can be trusted any more.


If a script is needed for demonstrating the quantum weirdness please ask, cos I have many such abandoned projects which have hit that brick wall.

If I had sufficient knowledge I would try to find out the resulting saved 'state data' memory differences between a virgin save and a non-virgin save, then perhaps try to populate the memory past their ends with test data to see if it got overwritten when doing another SAVE, and by how much.
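
To make the idea concrete, here is a minimal sketch of that "test data past the end" check, simulated on a plain Python bytearray rather than on a real device. Everything in it is an assumption for illustration: the names `fill_guard`, `simulated_save` and `count_overrun`, the `CANARY` value, and the sizes are all hypothetical, and nothing here reflects how Esp-Basic actually lays out or writes its memory.

```python
# Hypothetical simulation only: a bytearray stands in for device memory.
CANARY = 0xA5  # arbitrary marker byte, assumed not to occur in the saved data


def fill_guard(memory, guard_start, guard_len):
    """Fill the region just past the state area with the marker byte."""
    memory[guard_start:guard_start + guard_len] = bytes([CANARY]) * guard_len


def simulated_save(memory, start, payload):
    """Stand-in for a SAVE: copy payload into memory with no bounds check."""
    memory[start:start + len(payload)] = payload


def count_overrun(memory, guard_start, guard_len):
    """Return how far into the guard region the last damaged byte reaches."""
    last_damaged = -1
    for offset in range(guard_len):
        if memory[guard_start + offset] != CANARY:
            last_damaged = offset
    return last_damaged + 1


if __name__ == "__main__":
    state_start, state_size, guard_len = 0, 16, 16  # assumed sizes
    memory = bytearray(64)

    # A "virgin" save that fits inside the state area leaves the guard intact.
    fill_guard(memory, state_start + state_size, guard_len)
    simulated_save(memory, state_start, bytes(12))
    print("small save overrun:", count_overrun(memory, state_size, guard_len))

    # A larger save spills past the state area and damages the guard bytes.
    fill_guard(memory, state_start + state_size, guard_len)
    simulated_save(memory, state_start, bytes(21))
    print("large save overrun:", count_overrun(memory, state_size, guard_len))
```

The marker byte is the weak point of this kind of test: if the data being written happens to contain the same value, damage at that position goes unnoticed, so the count is a lower bound rather than an exact figure.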

I can visualise the situation as putting one foot in front of the other, then laying a carton of eggs in front. I could raise and lower my front foot as much as I liked without problems... until I put on bigger boots - which would then stomp on anything in front.
My gut feeling is that all those problem stages mentioned above are just different numbers of broken eggs caused by the same growing footprint as it gets bigger.

I hope some of this might make some sense, and hopefully could help isolate the crippling bug once and for all.

Re: Bug hunt

Posted: Fri Jun 09, 2017 4:54 pm
by heckler
Post Moved to viewtopic.php?f=42&t=14885
My intention was not to derail your efforts.


Are you aware of any tools that show the memory map of the various parts of the espBASIC OS??
it might provide more troubleshooting tools to see how much space is allocated for
program code
webpage contents
variable codespace
etc.

I'd be happy to give some feedback by trying some bit of code that causes more strange errors than I have so far encountered.

we definitely do not want to just have to "live with" the current state of espBASIC.
I believe in its current state many newcomers will try it and find it unstable enough to just leave it behind.

dwight

Re: Bug hunt

Posted: Sat Jun 10, 2017 4:25 am
by Electroguard
Yes, thanks for the enthusiastic response heckler, and there is nothing wrong with your 'method', but it was a bit misplaced unfortunately, belonging under the category of How To Live With The Bug Problem Forever.
It rather dilutes my efforts of trying to offer mmiscool possible ideas to help root out the cause of the long-term persistent bug rather than keep mitigating its effects like we've been doing since last year, cos it gives the false impression that you've found the answer to the bug you subsequently named, which unfortunately is not the case.
Cos when you think about it, your 'method' does not avoid or prevent the bug, it just does its rebooting/power-recycling dirty work for it.
Nor is there anything you can do to prevent or reduce the quantum weirdness effects resulting in an unstable and unusable browser display once the script gets large enough for the interpreter to start corrupting different parts of the browser cache.
If you can ignore browser instability and push on regardless, you will inevitably start suffering script corruptions. You can avoid using EDIT to reload the corruption into your browser by continually pasting in a constant supply of sacrificial lamb scripts, but each newly downloaded and corrupted esp script can still cause obscure program errors and malfunctions when RUN.
If you keep suffering everything thrown at you, eventually the Esp-Basic interpreter will corrupt and plunge you into a world of illogical madness without even any clue as to when all the old madness decayed into the latest inescapable insanity. Eventual realisation leaves just the one inevitable 'final solution' to reformat and reflash - then abandon your project, else retrace steps back to the same conclusion again.

The only way for Esp-Basic to break free from the bug's 'tractor-pulling sledge' shackles is to get rid of 'The Bug', else Esp-Basic is always going to keep suffering from crippling restrictions whatever steps are taken to mitigate them. There are reasons why cicciocb won't be rushing to the rescue, so I was hoping I might be able to offer mmiscool some alternative clues to follow which he may not have yet tried, cos I don't think anyone else is likely to have the required intimate knowledge and expertise to find the problem and fix it. But he is a man of few words who seems to prefer avoiding them, so I can only hope for everyone's sake that our 'barrage' of words doesn't deter him from considering any clues along with any possible opportunity for fixing things.
Time will tell.

Re: Bug hunt

Posted: Sun Jun 11, 2017 12:08 pm
by Electroguard
No I'm not familiar with the developer side of things heckler, but I've thought of a way we laymen might be able to help pin it down...

By trolling back through history to discover what version it was first introduced in (which may be earlier than when it was first noticed).

Cos if we can pin down when the bug was introduced, it should allow mmiscool to focus on that version's code changes which he'll know for sure must cause the bug.

The practical problem with that is that almost every script developed since then is likely to utilise new features which prevent it from running on the earlier versions, making it hard to find a 'test' script that exhibits the bug symptoms now but would still try to run ok on earlier versions.

Edit: Also, all of the earlier scripts were buggy anyway one way or another, so if the reboot bug is just a mutation of an earlier bug it's not going to be easy to pin down its origin.

So this might involve creating a pointless generic script which triggers the reboot bug now but which will still run on the earlier V3 versions, then starting as far back as possible, trying to progressively utilise any new features as they were introduced... but it's likely to take some time.
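
If a generic test script like that did exist, the hunt through old versions would not have to go one version at a time; it could be organised as a bisection instead. The sketch below is only an illustration of that swapped-in technique, not a description of any existing tool: `first_bad_version` and `shows_bug` are hypothetical names, the build labels are made up, and `shows_bug` stands for a person flashing that version, running the test script and reporting whether the reboot appeared. It also assumes that once the bug is present it stays present in every later version, which may not hold if the reboot bug is a mutation of an earlier one.

```python
def first_bad_version(versions, shows_bug):
    """Return the earliest version for which shows_bug is True.

    versions: labels ordered oldest to newest.
    shows_bug: callable taking a label and returning True if the bug appears.
    Assumes every version after the first bad one is also bad.
    Returns None if even the newest version does not show the bug.
    """
    if not versions or not shows_bug(versions[-1]):
        return None
    low, high = 0, len(versions) - 1
    while low < high:
        middle = (low + high) // 2
        if shows_bug(versions[middle]):
            high = middle      # bug already present, look earlier
        else:
            low = middle + 1   # bug not yet present, look later
    return versions[low]


if __name__ == "__main__":
    # Made-up labels; a real run would list the actual firmware builds.
    builds = ["build-1", "build-2", "build-3", "build-4", "build-5", "build-6"]
    known_bad = {"build-4", "build-5", "build-6"}  # pretend test results
    print(first_bad_version(builds, lambda label: label in known_bad))
```

With a few dozen versions this needs only five or six flash-and-test rounds rather than one per version, which matters when each round means reflashing a device by hand.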


In answer to another comment, I am trying to patch up a rickety script that displays both quantum weirdness and more persistent script corruption symptoms to post for you to look at, but it's like trying to catch a bunch of rats in a barn using a bag with holes in - it'll take me a bit of time, but I will send it.