To keep things short:
The layout of my PCB itself seems good; the devices have been working for months.
On some occasions, though, the ESP8266 (ESP-12F) dies after some time. It either no longer boots and draws 200-300 mA, well above normal,
or it still boots but draws ~170 mA from the start and can no longer do I2C communication (I found that in that case GPIO4 (SDA) cannot go low when set as an output). There is no defect on the PCB itself, because after fitting a new ESP-12F everything works again.
I am trying to find the root cause and hope someone has an idea where to start.
Tantalum decoupling capacitors sit close to each IC on the PCB, larger electrolytic capacitors provide bulk buffering, and the main supply comes from a step-down converter set to 3.41 V.
(That is because I put a 3.3 V LDO in parallel, which supplies the board during deep sleep while the step-down converter is switched off for better efficiency.)
No input pin voltage exceeds the 3.41 V VCC, at least by design. (No 5 V is involved at all.)
All pull-ups/pull-downs needed for the correct boot mode are present. Only three things seem worth mentioning:
1. Currently I do not have a power-good circuit, so the EN pin of the ESP8266 is simply pulled up to VCC.
2. I am using serial swap to connect a GSM/LTE modem to the alternative HW serial pins (with a 5k pull-down on GPIO15 so the chip can still boot from flash).
3. Besides 3 onboard I2C ICs, I also have (optional) external I2C devices connected to the same I2C bus via external cables (~1 m, shielded Ethernet cable). The total I2C bus capacitance is OK (I confirmed the rise time with an oscilloscope).
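For anyone who wants to cross-check the rise time on paper, here is a rough Python sketch. It uses the usual RC estimate for the 30%-70% rise time and the 1.8k pull-ups mentioned further down. The onboard capacitance and the cable capacitance per metre are assumed placeholder values, not measurements from my board.

```python
import math

# 30%..70% rise time of an RC-charged bus line: t_r = R * C * ln(0.7 / 0.3)
RISE_FACTOR = math.log(0.7 / 0.3)  # about 0.847


def rise_time_s(r_pullup_ohm, c_bus_farad):
    """Estimated 30%-70% rise time for a given pull-up and bus capacitance."""
    return r_pullup_ohm * c_bus_farad * RISE_FACTOR


def max_bus_capacitance_f(r_pullup_ohm, t_rise_max_s):
    """Largest bus capacitance that still meets a given rise-time limit."""
    return t_rise_max_s / (r_pullup_ohm * RISE_FACTOR)


R_PULLUP = 1800.0        # ohm, from the post
C_ONBOARD = 50e-12       # farad, ASSUMED: 3 onboard ICs plus traces
C_CABLE_PER_M = 52e-12   # farad per metre, ASSUMED typical twisted-pair value
CABLE_LEN_M = 1.0        # metre, from the post

c_total = C_ONBOARD + C_CABLE_PER_M * CABLE_LEN_M
print(f"estimated rise time: {rise_time_s(R_PULLUP, c_total) * 1e9:.0f} ns")
print(f"max C for a 300 ns limit: {max_bus_capacitance_f(R_PULLUP, 300e-9) * 1e12:.0f} pF")
```

This only covers the static RC behaviour. It says nothing about ringing or transients on the cable, which is exactly the part I am unsure about.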
The ESP-12F modules keep dying at irregular intervals. The PCBs always work when I flash them for the first time (they are assembled by a professional PCBA service, by the way, so assembly is definitely not the issue) and they also run for, let's say, 1-5 weeks on my desk without any issues.
I found out that two things seem to be a problem:
1. Doing a "fast" power cycle (say within 5 s): the ESP might then not come up again until power has been disconnected for some more seconds (say about 10-20), or it may even be defective after that one short power cycle.
They never ever die when I wait 10 s or more before reconnecting power.
2. "Hot-swapping" my external I2C devices: the ESP8266 seems to die in about 1 out of 10 cases when I connect, let's say, 20 cm of Ethernet cable with the other (slave) I2C PCB(s).
The Ethernet cable carries 12 V, GND, and the 3.41 V signals SCL, SDA, 1-Wire and an interrupt line to the ESP8266. The slave PCB has larger decoupling caps (~470 uF x4) on 12 V (and its own LDO to supply 3.3 V for the I2C slave chip).
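To give a feeling for what hot-plugging those caps means, here is a small Python sketch of the energy and inrush involved when the discharged 4 x 470 uF are connected to a live 12 V rail. The series resistance (cable, connector and capacitor ESR together) is an assumed value, so the peak current is an order-of-magnitude figure only.

```python
def stored_energy_j(c_farad, v_volt):
    """Energy in a capacitor charged to v_volt: E = 0.5 * C * V^2."""
    return 0.5 * c_farad * v_volt ** 2


def peak_inrush_a(v_volt, r_series_ohm):
    """Initial charging current into a fully discharged capacitor."""
    return v_volt / r_series_ohm


def time_constant_s(r_series_ohm, c_farad):
    """RC time constant of the charging pulse."""
    return r_series_ohm * c_farad


C_SLAVE = 4 * 470e-6   # farad, from the post
V_RAIL = 12.0          # volt, from the post
R_SERIES = 0.5         # ohm, ASSUMED total of cable, connector and ESR

print(f"stored energy: {stored_energy_j(C_SLAVE, V_RAIL) * 1e3:.0f} mJ")
print(f"peak inrush:   {peak_inrush_a(V_RAIL, R_SERIES):.0f} A")
print(f"time constant: {time_constant_s(R_SERIES, C_SLAVE) * 1e3:.2f} ms")
```

Whether a current pulse like this can actually disturb GND or the signal lines depends on the order in which the connector pins make contact, which I have not measured.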
1-Wire and I2C (SCL and SDA) do NOT have series resistors to the ESP8266, while the interrupt line does have one (1k).
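To illustrate why the missing series resistors worry me, here is a Python sketch of the current that would flow into a pin if a transient pushed the line above VCC. It assumes the pin clamps to VCC through a diode-like protection structure with about 0.6 V drop, and it assumes a worst-case transient up to the 12 V rail with a 1 ohm source impedance on the unprotected lines. All three values are assumptions for illustration, not measurements or datasheet figures.

```python
def clamp_current_a(v_transient, v_cc, r_series_ohm, v_clamp_drop=0.6):
    """Current into a pin clamp for a transient above VCC.

    v_clamp_drop is an ASSUMED forward drop of the protection structure.
    Returns 0.0 if the transient is too small to forward-bias the clamp.
    """
    overdrive = v_transient - v_cc - v_clamp_drop
    if overdrive <= 0:
        return 0.0
    return overdrive / r_series_ohm


V_CC = 3.41          # volt, from the post
V_TRANSIENT = 12.0   # volt, ASSUMED worst case: line briefly sees the 12 V rail
R_INTERRUPT = 1000.0 # ohm, series resistor on the interrupt line (from the post)
R_SOURCE = 1.0       # ohm, ASSUMED source impedance on SDA/SCL/1-Wire

i_int = clamp_current_a(V_TRANSIENT, V_CC, R_INTERRUPT)
i_bus = clamp_current_a(V_TRANSIENT, V_CC, R_SOURCE)
print(f"interrupt line (1k in series): {i_int * 1e3:.2f} mA")
print(f"bus lines (no resistor):       {i_bus:.2f} A")
```

Under these assumptions the 1k resistor keeps the current in the low mA range, while the lines without a resistor are limited only by whatever impedance the source happens to have.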
So I have an ESP8266 at 3.41 V, with the I2C pull-ups (1.8k) and the 1-Wire data pull-up (4.7k) to 3.41 V on the main PCB, and an I2C slave IC at 3.3 V without I2C pull-ups on a second PCB.
That voltage difference is within spec for the slave IC.
Also, an actively powered DS18B20 1-Wire temperature sensor (@ 3.3 V) may be connected to the second PCB.
Everything shares a common GND of course. All PCBs are SMD and professionally made, with a good GND plane design, and the simulation results for crosstalk, stray capacitance etc. were very good.
So maybe I am missing something about voltage spikes (down or up) with my setup, but I don’t really know where to start searching.
I cannot see what could cause an ESP8266 defect that results in increased current consumption together with being unable to drive GPIO4 (SDA) low (all other pins are OK) - GPIO4 measures 2 Ohm to VCC on the bare chip. I also cannot explain the second failure mode, where the ESP8266 does not start at all and draws 200-300 mA.
Any tips appreciated.