Debugging and Profiling
DriveOS Linux has a dedicated eMMC partition (gos0-crashlogs) where Linux Guest OS kernel panic/OOPS logs are stored. When kernel logs are not available over UART, for example in a production system, this feature stores kernel panic/OOPS logs in a non-volatile storage partition so that they can be retrieved and analyzed on subsequent boots.
This feature relies on:
pstore functionality in Linux Guest OS kernel
systemd-pstore service in Linux Guest OS
eMMC partition (accessed by Linux Guest OS as a virtual storage partition)
nvidia,tegra-hv-oops-storage driver
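As a quick sanity check of the first two dependencies, the helper below reports whether a pstore filesystem is mounted. It is a minimal sketch: the function name `pstore_mounted` is hypothetical, and it assumes that DriveOS Linux lists the pstore mount in /proc/mounts with filesystem type `pstore`, as standard Linux systems do.

```shell
# Hypothetical helper: succeed (exit status 0) if a filesystem of type
# "pstore" appears in the given mounts table (default: /proc/mounts).
pstore_mounted() {
    mounts="${1:-/proc/mounts}"
    # Field 3 of each /proc/mounts line is the filesystem type.
    awk '$3 == "pstore" { found = 1 } END { exit !found }' "$mounts"
}
```

On the target, this could be run as `pstore_mounted && echo "pstore is mounted"`, followed by `systemctl status systemd-pstore.service` to inspect the service state.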
Because of how virtual storage partitions work, there are cases in which panic/OOPS logs cannot be stored. At minimum, the following components must be functional when the Linux Guest OS kernel tries to store panic/OOPS logs: IVC, Mempool, Hypervisor, and Storage Server. If the Linux Guest OS generates OOPS/panic logs as an indirect effect of any of these components not functioning, the logs will not be stored. For example, suppose Storage Server stops responding to data requests for user storage partitions (which the Linux Guest OS also accesses as virtual storage partitions) because of an internal error. The Linux Guest OS kernel can then generate OOPS/panic logs, but they will not be stored in the gos0-crashlogs partition.
For additional information regarding how Linux Guest OS panic/OOPS logs are stored and retrieved in a subsequent boot cycle, refer to Storage Server.
By default, when the Linux Guest OS is operating normally (and has been operating normally since DriveOS was flashed), there are no panic/OOPS logs:
root@tegra-ubuntu:/home/nvidia# ls -al /var/lib/systemd/pstore/
ls: cannot access '/var/lib/systemd/pstore/': No such file or directory
Each time a Linux Guest OS kernel panic/OOPS occurs, the pstore kernel subsystem automatically stores the panic/OOPS logs in the gos0-crashlogs partition, with a separate file for each instance. These files become visible only after a reboot; they are not visible in the current boot cycle, even if the system remains functional after an OOPS. On the subsequent boot, the systemd-pstore service, in conjunction with the pstore kernel subsystem, retrieves the logs and stores them in the rootfs. Users do not need to read or write the gos0-crashlogs partition directly; they only need to access the rootfs directory maintained by the systemd-pstore service.
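A post-boot check for retrieved logs could be scripted along the following lines. This is only a sketch: the function name `list_crash_logs` is hypothetical, and the `dmesg-*` file name pattern is an assumption based on the file names in the example listing on this page. A missing directory is treated as "no logs", matching the default state described above.

```shell
# Hypothetical helper: print the path of each stored kernel log dump,
# one per line. Prints nothing if the directory does not exist.
list_crash_logs() {
    dir="${1:-/var/lib/systemd/pstore}"
    # No directory means no panic/OOPS has been recorded yet.
    [ -d "$dir" ] || return 0
    # Assumed naming: per-instance dumps start with "dmesg-".
    find "$dir" -type f -name 'dmesg-*' | sort
}
```

Because the stored files are readable only by root, the function would normally be run as root, for example `list_crash_logs` with no argument to scan the default directory.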
Here is an example in which a kernel panic was followed by a DriveOS reboot. After the reboot, files are present under the /var/lib/systemd/pstore/ directory:
root@tegra-ubuntu:/home/nvidia# ls -al /var/lib/systemd/pstore/
total 168
drwxr-xr-x 2 root root 4096 Jan 10 04:56 .
drwxr-xr-x 10 root root 4096 Jan 10 04:56 ..
-rw------- 1 root root 78611 Mar 18 18:23 dmesg-tegra_hv_vblk_oops-0
-rw-r----- 1 root root 78639 Jan 10 04:56 dmesg.txt
root@tegra-ubuntu:/home/nvidia# tail -20 /var/lib/systemd/pstore/dmesg-tegra_hv_vblk_oops-0
<4>[ 181.650674] dump_backtrace+0x0/0x1d0
<4>[ 181.650683] show_stack+0x2c/0x40
<4>[ 181.650686] dump_stack+0xd8/0x138
<4>[ 181.650692] panic+0xd0/0x3a4
<4>[ 181.650694] sysrq_reset_seq_param_set+0x0/0xa0
<4>[ 181.650700] __handle_sysrq+0x90/0x1a0
<4>[ 181.650702] write_sysrq_trigger+0x144/0x250
<4>[ 181.650704] proc_reg_write+0xc4/0x110
<4>[ 181.650708] vfs_write+0xc0/0x3f0
<4>[ 181.650711] ksys_write+0x78/0x100
<4>[ 181.650713] __arm64_sys_write+0x24/0x30
<4>[ 181.650716] el0_svc_common.constprop.0+0x7c/0x1c0
<4>[ 181.650719] do_el0_svc+0x34/0xa0
<4>[ 181.650722] el0_svc+0x1c/0x30
<4>[ 181.650724] el0_sync_handler+0xa8/0xb0
<4>[ 181.650725] el0_sync+0x16c/0x180
<2>[ 181.756669] SMP: stopping secondary CPUs
<0>[ 181.757182] Kernel Offset: disabled
<0>[ 181.757615] CPU features: 0x0040006,4800a238
<0>[ 181.758168] Memory Limit: none
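Each line in the stored dump starts with a `<N>` prefix whose low three bits encode the kernel log level (0 is emergency, 4 is warning). To pull only the most severe lines out of a large dump, a filter such as the following may help. The function name `filter_by_level` is hypothetical, and the sketch assumes every line of interest carries the `<N>` prefix shown in the output above.

```shell
# Hypothetical helper: print lines of a stored dump whose log level is
# numerically at or below MAX (lower numbers are more severe).
# Usage: filter_by_level FILE MAX
filter_by_level() {
    awk -v max="$2" '
        match($0, /^<[0-9]+>/) {
            # Strip the angle brackets; keep only the level bits (0-7).
            level = (substr($0, 2, RLENGTH - 2) + 0) % 8
            if (level <= max) print
        }
    ' "$1"
}
```

For example, `filter_by_level /var/lib/systemd/pstore/dmesg-tegra_hv_vblk_oops-0 2` would keep the critical and emergency lines and drop the warning-level backtrace frames.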