Kernel oops in vc4_overflow_mem_work #2217
Comments
Clusters of apparently unrelated errors like this are usually either due to collateral damage caused by the first crash, memory corruption or an inadequate power supply. Check that your power supply is good - |
Thank you! Looks like someone else has this problem too: anholt#114 |
Be advised - Eric isn't always very responsive, but he will get round to it eventually. |
@anyc Did my patch fix the issue for you? Edit: Sorry, I didn't look closely enough at the trace. I assume your issue never occurs during boot, so the patch won't help. It looks more like memory corruption, because the spin lock shouldn't be NULL once VC4 init has completed. |
It is now running for over an hour without errors - only some "Resetting GPU" messages. Looks good already but I'll keep it running non-stop for an afternoon on the weekend to be sure. Thank you! |
Does the issue occur without the Qt app? |
I will keep it running without the app. It's running now for about 1.5 hours. According to my current backlog, the longest time without issues was 2.5 hours. It varies randomly, I'd say. Only a TV, WiFi usb stick and a uart2usb adapter to another RPi. The other RPi uses kernel 4.9.36 that I used on the faulty RPi before, too. |
If the issue doesn't occur without the app, try removing the WiFi USB stick and the uart2usb adapter, but run the app again. |
It also stopped with an oops eventually. :/ The GUI app did run for a short time at the beginning, though, as it is started automatically. I'll swap the SD cards now and check whether the issue also occurs on the other RPi. |
Since your issue doesn't occur during boot, I don't think it's the same issue. In your case we have some kind of corruption that overwrites the spin lock structure. |
I just noticed in the last two oopses that ~50 ms before the initial "Unable to handle kernel NULL pointer dereference" message there is a "[drm] Resetting GPU." message. |
What would be the best way to debug this? Setting a watchpoint on the structure? Would this be possible on the RPi over UART? I don't have much experience with kernel debugging beyond printk debug output. |
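To check whether that timing pattern holds across more logs, a small script can pair each oops with the most recent GPU reset before it. This is only a rough sketch: it assumes the kernel log has printk timestamps enabled, the function name `resets_before_oops` and the 100 ms window are made up for illustration, and the sample lines are invented rather than taken from the actual report.

```python
import re

# Kernel log lines look like "[ 1234.567890] message" when printk
# timestamps are enabled (assumed here).
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

RESET_MARKER = "[drm] Resetting GPU"
OOPS_MARKER = "Unable to handle kernel NULL pointer dereference"


def resets_before_oops(log_text, window_s=0.1):
    """Return (reset_time, oops_time) pairs where the latest GPU reset
    happened at most window_s seconds before an oops message."""
    last_reset = None
    pairs = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines without a timestamp prefix
        ts, msg = float(m.group(1)), m.group(2)
        if RESET_MARKER in msg:
            last_reset = ts
        elif OOPS_MARKER in msg and last_reset is not None:
            if ts - last_reset <= window_s:
                pairs.append((last_reset, ts))
    return pairs


# Invented sample lines, for illustration only.
sample = """\
[ 100.000000] [drm] Resetting GPU.
[ 5000.100000] [drm] Resetting GPU.
[ 5000.150000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
"""

print(resets_before_oops(sample))
```

If every oops in the backlog shows up paired with a reset, that would point at the GPU reset path rather than at random corruption; resets that are not followed by an oops (like the first one in the sample) are simply ignored.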
The same thing also happens with an unmodified raspbian. Shall I open an issue in Eric's repo? Maybe he has an idea? |
Hi,
I am using an RPi2 with a Qt5+QML app to display a slideshow with large moving images on an HDMI TV. This works well for quite some time - I'd say at least 30 minutes - but I always get a kernel oops eventually. I compiled the kernel myself and I use an Ubuntu userspace. I managed to get two oops reports that (as far as I can tell) occurred at the same place in the vc4_overflow_mem_work function. Here is one of them:
From time to time, there are also "[drm] Resetting GPU" messages in the kernel log. In the config.txt, gpu_mem is set to 128.