So I moved the WLAN configuration (SSID, password) into my own config. That worked and still works.
To get the proper values into the config, I added a "quick bootstrap" mode. When the ESP starts, you have 30 seconds to enter the SSID and the password over the UART. Enter the SSID (no spaces allowed!), then a space, then the password, then a linefeed (no carriage return! Use ^J instead of Enter if necessary). The SSID and the password are stored in the config and written to non-volatile storage, and the ESP tries to connect immediately. Watch the UART output for progress.
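
If you would rather script this than type it by hand, the line format above can be built on the host side. This is only a sketch of the format as described here, not code from the firmware: the names build_bootstrap_line and send_bootstrap are made up for illustration, and restricting the text to ASCII is my assumption. Whether the password may contain spaces is not stated above, so the sketch does not check for them.

```python
def build_bootstrap_line(ssid: str, password: str) -> bytes:
    """Return the bytes to send: SSID, one space, password, one linefeed."""
    # The SSID must not contain spaces (or any other whitespace).
    if not ssid or any(ch.isspace() for ch in ssid):
        raise ValueError("SSID must be non-empty and contain no whitespace")
    # The line ends with a bare LF, so the password must not carry its own
    # line breaks (in particular no carriage return).
    if not password or "\n" in password or "\r" in password:
        raise ValueError("password must be non-empty and contain no line breaks")
    # ASSUMPTION: plain ASCII; non-ASCII input raises UnicodeEncodeError.
    return f"{ssid} {password}\n".encode("ascii")


def send_bootstrap(port, ssid: str, password: str) -> int:
    """Write the bootstrap line to any object that has a write(bytes) method."""
    return port.write(build_bootstrap_line(ssid, password))
```

Any object with a write(bytes) method will do as the port, for example an opened pyserial Serial object. The device name and baud rate depend on your setup. Remember that the line has to arrive within the 30 seconds after the ESP starts.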
If you don't want the messages on the UART, set print_debug to 0 in the config. The quick bootstrap mode will continue to work, but in silence. After 30 seconds, the channel is completely transparent again.