
Crash when using FunctionalInterrupt.h handler #8887


Open
5 of 6 tasks
petersohn opened this issue Mar 12, 2023 · 9 comments

@petersohn

petersohn commented Mar 12, 2023

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: ESP-01, but it can also be reproduced on other devices (e.g. ESP-12)
  • Core Version: 3.1.1
  • Development Env: Sloeber IDE, but it can also be reproduced with the Arduino IDE
  • Operating System: [Windows|Ubuntu|MacOS]

Settings in IDE

  • Module: Generic ESP8266 Module
  • Flash Mode: DOUT
  • Flash Size: 1 MB
  • lwip Variant: v2 Lower Memory
  • Reset Method: dtr
  • Flash Frequency: 40 MHz
  • CPU Frequency: 80 MHz
  • Upload Using: SERIAL
  • Upload Speed: 921600

Problem Description

If I use an interrupt handler that triggers frequently, the program crashes after some time. When the ESP resets after the exception, the Wi-Fi connection can no longer be established.

To reproduce, I connected an oscillator to the pin where the interrupt is attached. I use GPIO 2 in this example, which is available on an ESP-01, but I could also reproduce it with other pins, for example GPIO 3 (AKA serial RXD), or on an ESP-12 with GPIO 5. I run the oscillator at about 20 Hz. The problem also occurs at lower frequencies, but it may take longer to appear.

MCVE Sketch

#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <FunctionalInterrupt.h>

unsigned long lastPrint = 0;
volatile unsigned counter = 0;

void setup() {
  Serial.begin(115200);
  WiFi.begin("<ssid>", "<password>"); // fill with real values
  attachInterrupt(2, [&]() { ++counter; }, RISING);
}

void loop() {
  auto now = millis();
  if (now - lastPrint > 1000) {
    Serial.println(now);
    Serial.println(WiFi.status());
    Serial.println(counter);
    lastPrint = now;
  }
  delay(1);
}

Stack Trace

Exception 0: Illegal instruction
PC: 0x40201020
EXCVADDR: 0x00000000

Decoding stack results
0x40100630: interrupt_handler(void*, void*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_wiring_digital.cpp line 167
0x40100f06: check_poison_block(umm_block*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 86
0x401014d6: umm_poison_calloc(size_t, size_t) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 189
0x4010056c: interrupt_handler(void*, void*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_wiring_digital.cpp line 138
0x40100f06: check_poison_block(umm_block*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 86
0x40100f06: check_poison_block(umm_block*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 86
0x40101160: umm_malloc_core(umm_heap_context_t*, size_t) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_local.c line 47
0x401003d0: ets_post(uint8, ETSSignal, ETSParam) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 238
0x40101490: umm_malloc(size_t) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_malloc.cpp line 912
0x40212b9d: sys_timeout_abs at core/timeouts.c line 189
0x401003d0: ets_post(uint8, ETSSignal, ETSParam) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 238
0x4020346a: loop_task(ETSEvent*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 273
0x40100094: app_entry() at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 392

Debug Messages

The program runs fine for about 10 minutes in my scenario (the timing is random, and the crash occurs earlier if the oscillator runs at a higher frequency). Then I get an exception (stack trace above).

Fatal exception 0(IllegalInstructionCause):
epc1=0x40201020, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000

Then the ESP reboots, but the Wi-Fi connection cannot be established.

scandone
no <ssid> found, reconnect after 1s
wifi evt: 1
STA disconnect: 201
reconnect
@d-a-v
Collaborator

d-a-v commented Mar 12, 2023

Can you try with your handler specifically declared in IRAM:

IRAM_ATTR void handler ()
{
    counter++;
}

The & is superfluous, and it also gets in the way: it prevents the lambda from being of the right type to be placed in IRAM by the linker.
But without the & there's an ambiguity that we should probably resolve. With [](){counter++;} instead, I get:

intr/intr.ino:13:49: error: call of overloaded 'attachInterrupt(int, setup()::<lambda()>, int)' is ambiguous
cores/esp8266/Arduino.h:186:6: note: candidate: 'void attachInterrupt(uint8_t, void (*)(), int)'
cores/esp8266/FunctionalInterrupt.h:31:6: note: candidate: 'void attachInterrupt(uint8_t, std::function<void()>, int)'
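The ambiguity reported above can be illustrated off-target with a minimal sketch. The two `attachSketch` overloads are hypothetical stand-ins that only mirror the parameter types of the two candidates in the error message; they are not the core's implementation. The sketch suggests that a `[&]` lambda can only match the `std::function` overload, while a named function or a capture-less lambda converted with unary `+` selects the function-pointer overload. It says nothing about whether the resulting code ends up in IRAM.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for the two candidates named in the compiler error.
// They only report which overload the compiler selected.
enum class Overload { FunctionPointer, StdFunction };

Overload attachSketch(uint8_t, void (*)(), int) {
  return Overload::FunctionPointer;
}

Overload attachSketch(uint8_t, std::function<void()>, int) {
  return Overload::StdFunction;
}

volatile unsigned counter = 0;

void namedHandler() { counter = counter + 1; }

// A lambda with a capture-default has no conversion to a function pointer,
// so only the std::function overload is viable.
Overload withCapturingLambda() {
  return attachSketch(2, [&]() { counter = counter + 1; }, 1);
}

// A plain function is an exact match for the function-pointer overload.
Overload withNamedFunction() {
  return attachSketch(2, namedHandler, 1);
}

// Unary plus converts the capture-less lambda to a function pointer before
// overload resolution, which removes the ambiguity.
Overload withForcedPointer() {
  return attachSketch(2, +[]() { counter = counter + 1; }, 1);
}
```

Passing `[]() { ... }` directly, without the unary plus, is the case that fails to compile here, because both overloads need one user-defined conversion and neither is preferred.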

@dok-net
Contributor

dok-net commented Mar 13, 2023

@d-a-v Not pitching my PR (OK, just a little bit), but I remembered that I had done work on FunctionalInterrupt etc., and I've verified that #6047 fixes the ambiguity error you found.
If you consider it, I would be grateful for a careful review to confirm that I haven't cluelessly introduced any out-of-IRAM issues.

@petersohn
Author

I could not reproduce the issue with the native handler. However, in the application where the problem originally appeared, I tried getting rid of FunctionalInterrupt, and the problem still persists. I'm still trying to come up with a minimal reproduction, but it looks like the problem is not in FunctionalInterrupt itself; it just makes the problem manifest.

@dok-net
Contributor

dok-net commented Mar 13, 2023

@petersohn Not saying it's related to your crash, but you should call pinMode(2, INPUT);
Oscillator at high(er) frequency - are you perhaps triggering more interrupts than the MCU can handle without important internal functions beginning to fail?
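One way to check the interrupt-rate question is to derive the observed rate from two samples of the counter, as the MCVE's once-per-second print already makes possible. The helper below is a hypothetical sketch, not part of the core; `interruptsPerSecond` and its parameter names are assumptions made for illustration. It uses unsigned subtraction so that a counter wrap-around between samples does not distort the result.

```cpp
#include <cstdint>

// Hypothetical helper: average interrupt rate between two counter samples.
// previousCount and currentCount are snapshots of the ISR counter, and
// elapsedMs is the millis() difference between the two snapshots.
double interruptsPerSecond(uint32_t previousCount, uint32_t currentCount,
                           uint32_t elapsedMs) {
  if (elapsedMs == 0) {
    return 0.0;  // no time has passed, so no meaningful rate
  }
  // Unsigned subtraction stays correct across a counter wrap-around.
  const uint32_t delta = currentCount - previousCount;
  return delta * 1000.0 / elapsedMs;
}
```

With the roughly 20 Hz oscillator described in the report, the value printed each second should stay near 20; a much larger number would point to contact bounce or noise on the pin rather than the intended signal.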

@petersohn
Author

Yes, I missed the pinMode() call, but putting it in doesn't help.

I could reproduce the issue without FunctionalInterrupt, by using a simple function call.

void interruptHandlerImpl() {
	++counter;
}

IRAM_ATTR void interruptHandler() {
	interruptHandlerImpl();
}

@dok-net
Contributor

dok-net commented Mar 13, 2023

@petersohn Every bit of code that may run inside the interrupt service routine must be in IRAM.
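Applied to the handler/helper pair posted earlier in this thread, that rule would look roughly like the sketch below, where the helper carries `IRAM_ATTR` as well. The `#if defined(ARDUINO)` guard and the empty fallback definition of `IRAM_ATTR` are assumptions added only so the snippet also compiles off-target; on the device the macro comes from the core.

```cpp
#if defined(ARDUINO)
#include <Arduino.h>
#else
// Host build only: no-op stand-in so the sketch compiles off-target.
#define IRAM_ATTR
#endif

volatile unsigned counter = 0;

// The helper is reachable from the ISR, so it is placed in IRAM too.
IRAM_ATTR void interruptHandlerImpl() {
  counter = counter + 1;
}

// The ISR itself; everything it calls must also live in IRAM.
IRAM_ATTR void interruptHandler() {
  interruptHandlerImpl();
}
```

On the device, this handler would be registered in `setup()` with something like `pinMode(2, INPUT);` followed by `attachInterrupt(digitalPinToInterrupt(2), interruptHandler, RISING);`. Whether this alone removes the crash reported here is not established by the thread.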

@petersohn
Author

If that's true, the bug in FunctionalInterrupt is obvious. That is weird at the very least: FunctionalInterrupt has been part of the core for a long time, so it is strange that nobody has ever noticed that it should not work.

@dok-net
Contributor

dok-net commented Mar 14, 2023

@petersohn

> If that's true

It is a well-documented fact.

What about the question of whether your interrupt frequency is just too high to handle?

@petersohn
Author

> It is a well-documented fact.

Then why is FunctionalInterrupt there? It should never have been able to work.

> What about the question of whether your interrupt frequency is just too high to handle?

In the real application, the problem comes up regardless of interrupt frequency. It just takes more time to reproduce. I used the frequency I did because it can reproduce the problem relatively fast.

@mcspr changed the title from "Crash when using interrupt handler" to "Crash when using FunctionalInterrupt.h handler" on Apr 13, 2023