NVME SSD boot: initramfs corrupted #1731

Closed
aurel32 opened this issue Jul 15, 2022 · 12 comments

aurel32 commented Jul 15, 2022

Is this the right place for my bug report?
The issue is linked to the NVME boot so it seems to be the right place.

Describe the bug
When booting from an NVMe SSD, the initramfs loaded with the initramfs directive in config.txt gets corrupted. The kernel issues the following error:
Initramfs unpacking failed: junk within compressed archive

To reproduce
Take a working system that uses an initramfs on an SD card (in my case Debian Bullseye). On the SSD, create the same partitions as on the SD card. Copy the partitions with dd if=/dev/mmcblk0pX of=/dev/nvme0n1pX to make sure the exact same thing is being tested. Remove the SD card and try to boot from the SSD.

(This is one way to reproduce it. I also tried copying the content of the partitions instead and also using an MSDOS partition table instead of the GPT partition table).
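Before blaming the firmware, it can help to confirm that the dd copy really is byte-identical. The following is a minimal Python sketch of such a check, not something used in this report; the function name first_difference is made up for illustration, and it assumes both paths can be read sequentially (reading block devices normally requires root).

```python
from typing import Optional


def first_difference(path_a: str, path_b: str, chunk_size: int = 1 << 20) -> Optional[int]:
    """Return the byte offset of the first difference, or None if identical.

    Hypothetical helper: compares two files (or block devices) chunk by chunk.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the common prefix: one input is shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both inputs reached end-of-file at the same offset.
                return None
            offset += len(a)
```

For example, first_difference("/dev/mmcblk0p1", "/dev/nvme0n1p1") would return None for an exact copy, provided the two partitions have the same size.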

Expected behaviour
The system should boot on the NVME SSD the same way it boots from the SD card.

Actual behaviour
The system fails to boot with
Initramfs unpacking failed: junk within compressed archive

System

  • Which model of Raspberry Pi?
    Compute Module 4 2GB without wifi

  • Which OS and version (cat /etc/rpi-issue)?
    Debian Bullseye

  • Which firmware version (vcgencmd version)?
    Mar 24 2022 13:19:26
    Copyright (c) 2012 Broadcom
    version e5a963efa66a1974127860b42e913d2374139ff5 (clean) (release) (start)

  • Which kernel version (uname -a)?

Linux pascal 5.18.0-0.bpo.1-arm64 #1 SMP Debian 5.18.2-1~bpo11+1 (2022-06-14) aarch64 GNU/Linux

Logs
I have attached the boot logs, including the firmware logs (uart_2ndstage=1), for both the SD card and the SSD. I have removed the timestamps from both the firmware and the kernel output to make the comparison easier.

boot_sd.txt
boot_nvme.txt

Additional context
Diffing the two logs, I noticed the DT, kernel and initramfs have the same size and are loaded at the exact same addresses in both cases.

Contributor

pelwell commented Jul 18, 2022

A quick test worked for me. Can you upload your initramfs somewhere for me to try?

Author

aurel32 commented Jul 18, 2022

> A quick test worked for me. Can you upload your initramfs somewhere for me to try?

Thanks for testing. You can download the initramfs here: https://temp.aurel32.net/rpi/initramfs.img

Contributor

pelwell commented Jul 18, 2022

Thanks - calculating sha256 sums shows that even the firmware is seeing the corruption when loading from NVME, but loading from SD gives the correct result. Leave this with me.
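For anyone wanting to run a similar comparison from a booted system, here is a minimal Python sketch that computes the SHA-256 of a file in chunks. It is only an illustration: the helper name sha256_of and the path in the usage note are assumptions, and it says nothing about how the firmware itself computes or reports its sums.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of the file at `path`.

    Hypothetical helper: reads in chunks so a large initramfs
    does not have to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the empty bytes object returned at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of("/boot/initramfs.img") (path assumed) on the SD card copy and on the NVMe copy should give the same digest if the files are identical on both media.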

Contributor

pelwell commented Jul 20, 2022

This has been a confusing issue to debug - by the time I had run some tests and assembled some patches to aid debugging, the problem had gone away and it had started working. However, in the process I found something which could explain the failure, and since it depends on the layout of data on the SSD, it is also a problem which could go away after rewriting files.

Here's a trial firmware implementing a fix for the limitation I found (just start4.elf and fixup4.dat): https://drive.google.com/file/d/1yhlXzgj0PczcqQcwmkWXKFxQsPyU3nkg/view?usp=sharing
Extract the files from the .zip archive and copy them into /boot (you could back up the originals just in case), then see if the situation improves.

As a debugging aid, there's also a new config setting - sha256 - which logs the sha256 hashes for files loaded by the firmware. Note that this can take many seconds for large files, but if you need it you won't mind waiting. Enable with sha256=1. Note that adding enable_uart=1 and uart_2ndstage=1 to config.txt is a good idea if you have a way of accessing the UART.
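Put together, the debugging settings mentioned above would look roughly like this in config.txt. This is a sketch that assumes a firmware new enough to know the sha256 setting (the trial build or later); the comments are added here for illustration.

```ini
# Log the sha256 hash of each file the firmware loads
# (can take many seconds for large files)
sha256=1

# Send the firmware (second stage) log output to the UART
enable_uart=1
uart_2ndstage=1
```

Remove sha256=1 again once debugging is done to avoid the extra boot delay.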

Author

aurel32 commented Jul 20, 2022

Thanks a lot for the fix. I confirm that it works fine. I tested it making sure not to move the initramfs to avoid modifying its data layout on the SSD. After a successful boot, I reverted to the original firmware, still without touching the initramfs, and it got corrupted again.

Contributor

pelwell commented Jul 21, 2022

Cool - the fix will be in all future firmware releases.

timg236 added a commit to timg236/rpi-eeprom that referenced this issue Jul 22, 2022
* NVMe fix large file reads - see raspberrypi/firmware#1731
  The firmware fix is also relevant for the bootloader when loading large boot.img files.
popcornmix added a commit that referenced this issue Jul 22, 2022
kernel: dtoverlays: Add nohdmi options to vc4-kms-v3d overlays
raspberrypi/linux#5099

kernel: overlays: Make more overlays runtime-capable
See: raspberrypi/linux#5101

kernel: overlays: Mark more overlays as Pi4-specific

kernel: Revert ext4: make mb_optimize_scan performance mount option work with extents
See: raspberrypi/linux#5097

kernel: configs: Enable IIO software trigger modules
See: raspberrypi/linux#4984

kernel: configs: Enable IP_VS_IPV6 (for loadbalancing)
See: raspberrypi/linux#2860

kernel: configs: Enable CEPH_FS=m
See: raspberrypi/linux#2916

firmware: arm_loader: initramfs over NVME fix
See: #1731
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this issue Jul 22, 2022
firmware: arm_loader: initramfs over NVME fix
See: raspberrypi/firmware#1731
Contributor

pelwell commented Jul 22, 2022

There is an official beta release of a firmware containing this fix available now via sudo rpi-update, but if you are happy with the trial version I provided then I suggest you wait for the production release installable using sudo apt upgrade.

Author

aurel32 commented Jul 23, 2022

Thanks for the info. I just tried the latest beta release and can confirm it also works fine.

ww898 commented Sep 21, 2022

Just for info: gpu_mem=16 in config.txt prevents Ubuntu 22.04.1 from booting with the start4.elf and fixup4.dat from https://drive.google.com/file/d/1yhlXzgj0PczcqQcwmkWXKFxQsPyU3nkg/view?usp=sharing. The error is the same: Initramfs unpacking failed: Decoding failed.

Contributor

pelwell commented Sep 21, 2022

If I understand you correctly, and it boots as expected without gpu_mem=16, then I'm going to close this issue. The fix is in the kernel apt package and the latest RPiOS images.

ww898 commented Sep 21, 2022

Hi @pelwell,

Yes, without gpu_mem everything works

Contributor

pelwell commented Sep 21, 2022

Thanks. I guess it just needs a bit more RAM than 16MB.

pelwell closed this as completed Sep 21, 2022