Linux Security Hardening#
This section describes Linux security hardening guidelines.
Acronyms
Acronym |
Definition |
AES |
Advanced Encryption Standard |
ASLR |
Address Space Layout Randomization |
CVE |
Common Vulnerabilities and Exposures |
CVSS |
Common Vulnerability Scoring System |
DMESG |
Diagnostic Messages |
DoS |
Denial of Service |
EL |
Exception Level |
ESSIV |
Encrypted Salt-Sector Initialization Vector |
GCC |
The GNU Compiler Collection |
ICMP |
Internet Control Message Protocol |
IP |
Internet Protocol |
KASLR |
Kernel Address Space Layout Randomization |
KPTI |
Kernel Page Table Isolation |
LSM |
Linux Security Module |
MAC |
Mandatory Access Control |
MMAP |
Memory MAP |
NX |
Never eXecute |
PA |
Pointer Authentication |
PAC |
Pointer Authentication Code |
PAN |
Privileged Access Never |
PIC |
Position Independent Code |
PIE |
Position Independent Executable |
RELRO |
RElocation ReadOnly |
RO |
Read Only |
ROP |
Return Oriented Programming |
TCP |
Transmission Control Protocol |
TTBR |
Translation Table Base Register |
VDSO |
Virtual Dynamically linked Shared Object |
Toolchain Versions
Toolchain |
Version |
---|---|
GCC |
13.2 |
Binutils |
2.42 |
Ubuntu |
24.04 |
Linux Kernel |
6.1.119 |
Building Holistic Security Story for DriveOS Based Products#
DriveOS, a foundational software platform developed by NVIDIA, is built using components sourced from third parties, such as the Linux operating system, and is further extended by customers who integrate their own software components into the final product.
To achieve holistic security, every component in the software supply chain must be secure and compliant. Security cannot be guaranteed by any single stakeholder alone; instead, each contributor must identify and protect their own assets.
NVIDIA is committed to protecting all assets it owns or delivers as part of DriveOS. These assets, such as binaries, processes and services, are safeguarded using well-established access control mechanisms, including:
Discretionary Access Control (DAC): to manage ownership-based access rights.
Mandatory Access Control (MAC): to enforce system-wide security policies beyond user discretion.
Similarly, NVIDIA expects third-party OS vendors and OEMs to follow the same principles. Each party is responsible for:
Defining their software assets (e.g., binaries, configuration files, data stores).
Specifying the required security controls to protect those assets.
Ensuring those controls are implemented correctly in the final integrated system.
This collaborative and transparent approach ensures that, when the product reaches end users, every software component is traceably secured, contributing to the overall trustworthiness and resilience of the final system.
Based on their system integration and product configuration, OEMs must work closely with the OS vendor to identify all critical assets, including those originating from the OS layer. They must then assess potential threats in the context of their use case and deploy appropriate security controls to mitigate those threats. This includes aligning with platform-level policies, hardening runtime environments, and validating the enforcement of access control mechanisms across both their own and inherited software components.
The following sections outline the essential security controls required to ensure that the Linux virtual machine (VM) is securely deployed within the DriveOS platform.
Linux Kernel Hardening#
Kernel Configurations Hardening#
The following table describes enabled DriveOS Linux kernel configurations.
Kernel configuration |
Description |
Why enable? |
CONFIG_BUG |
Enables BUG() macro |
Recommended to be enabled to report BUG() conditions and kill the offending process. Disabling this option may lead to ignoring fatal conditions. |
CONFIG_STRICT_KERNEL_RWX |
When enabled, kernel text and rodata memory will be made read-only |
Provides memory safety by preventing tampering with kernel code and read-only critical kernel data. |
CONFIG_STACKPROTECTOR |
Enables stack protector |
Detects stack smashing and can potentially prevent hijacking execution flow. |
CONFIG_STACKPROTECTOR_STRONG |
Enables strong stack protector. This is the same as using GCC option “-fstack-protector-strong” |
Detects stack smashing, and can potentially prevent hijacking execution flow as detection leads to execution termination. |
CONFIG_STRICT_DEVMEM |
Enables strict access to “/dev/mem” meaning that only memory-mapped peripherals are allowed to be accessed by user space. |
When enabled, this kernel configuration prevents user space from accessing kernel and user space memory through “/dev/mem”. |
CONFIG_SYN_COOKIES |
Enables TCP SYN flooding attack mitigation by using TCP SYN cookies |
Disabling this kernel configuration makes TCP SYN DoS (Denial of Service) attacks possible as victims consume resources for half established TCP connections. |
CONFIG_SECURITY_YAMA |
Enables YAMA Linux LSM (Linux Security Module). YAMA |
ptrace is a Linux debug capability used for tracing processes and allows reading/writing process memory/execution state. Yama enables controlling ptracing processes and the feature can be controlled during runtime with “ |
CONFIG_SECURITY_APPARMOR |
Enables AppArmor module as MAC (Mandatory Access Control). |
AppArmor enforces access control restrictions by attaching profiles to running processes, limiting their capabilities and file access privileges. The effectiveness of AppArmor depends on its policies and enforcement mode. With suitable policies, it can restrict both privileged and unprivileged processes. |
CONFIG_LSM=”landlock,lockdown,yama,loadpin,safesetid,integrity,apparmor,bpf” |
Enables the listed LSMs. |
The above security configurations for various LSM modules only take effect if specified in this kernel configuration. |
CONFIG_HARDENED_USERCOPY |
Checks memory regions when copying memory to/from the kernel (via copy_to_user() and copy_from_user() functions) and to make sure that
|
Enabling this kernel configuration option can mitigate a class of vulnerabilities that involve buffer overflows and writing to arbitrary kernel addresses. |
CONFIG_SLAB_FREELIST_RANDOM |
Randomizes the freelist order used on creating new pages. |
This security feature reduces the predictability of the kernel slab allocator against heap overflows. |
CONFIG_SLAB_FREELIST_HARDENED |
Hardens the kernel slab allocator against common freelist exploit methods |
Makes kernel heap attacks difficult against slab cache metadata. Slab caches are memory locations where kernel structures (e.g. task structure) are kept. |
CONFIG_SHUFFLE_PAGE_ALLOCATOR |
Randomizes high-order page allocation freelist. |
Randomization of the page allocator improves the average utilization of a direct-mapped memory-side-cache. |
CONFIG_FORTIFY_SOURCE |
Hardens common str/mem functions against buffer overflows during build time and runtime by checking memory copies that might overflow a structure in str() and mem() functions. |
Effectively detects and prevents buffer overflows. |
CONFIG_SECURITY_DMESG_RESTRICT |
Restricts unprivileged access to the kernel syslog. The dmesg logs can only be accessed by root processes. Can also be controlled with “kernel.dmesg_restrict” sysctl setting. |
Prevents kernel memory address exposures via dmesg. |
CONFIG_STRICT_MODULE_RWX |
Sets loadable kernel module data as NX and text as RO |
Mitigates runtime attacks that involve code and data tampering. |
CONFIG_MODULE_SIG=y CONFIG_MODULE_SIG_ALL=y CONFIG_MODULE_SIG_SHA512=y CONFIG_MODULE_SIG_HASH=“sha512” CONFIG_MODULE_SIG_KEY=“certs/signing_key.pem” |
Enables kernel module signing. Each kernel module is digitally signed. Kernel module signature is checked by the kernel while loading kernel modules |
Kernel module signing ensures integrity and authenticity of kernel modules while loading them. When enforced, the kernel loads kernel modules only from known origin(s) and if their integrity is verified. |
CONFIG_DEFAULT_MMAP_MIN_ADDR=32768 |
Disallows allocating the first 32k of memory. This is the portion of low virtual memory that should be protected from user space allocation. |
Keeping a user from writing to low pages can help reduce the impact of kernel NULL pointer bugs |
CONFIG_RANDOMIZE_BASE |
Enables KASLR (Kernel Address Space Layout Randomization). Randomizes the virtual address at which the kernel image is loaded |
Deters exploit attempts relying on knowledge of the location of kernel internals. |
CONFIG_ARM64_SW_TTBR0_PAN |
Emulates PAN (Privileged Access Never) using TTBR0_EL1 switching |
Enabling this option prevents the kernel from accessing user space memory directly by pointing TTBR0_EL1 to a reserved zeroed area and reserved ASID. |
CONFIG_UNMAP_KERNEL_AT_EL0 |
Enables Kernel Page Table Isolation (KPTI) to remove an entire class of cache timing side-channels. Unmaps the kernel when running in userspace, mapping it back in on exception entry via a trampoline page in the vector table. |
Can mitigate Meltdown like vulnerabilities. |
CONFIG_GCC_PLUGINS |
GCC plugins are loadable modules that provide extra features to the compiler. |
Allows the use of GCC plugin based kernel security hardening such as preventing struct and stack leakages. |
CONFIG_CRYPTO_SHA2_ARM64_CE |
Enables ARMv8 Crypto Extensions SHA-256 support. |
Needed for accelerating dm-verity |
CONFIG_OVERLAY_FS |
Builds overlay file system support into the kernel |
Needed to support dm-verity |
CONFIG_DM_VERITY |
Enables dm-verity support |
dm-verity provides transparent integrity protection of read-only block-based file systems. dm-verity is based on a Merkle hash tree; hashes are verified by the kernel during disk access. Integrity failures lead to I/O failures. |
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG |
Enables verification of PKCS#7 signature of dm-verity root hash signature against the builtin trusted keyring by default |
Allows the use of PKCS#7 signatures to verify authenticity and integrity of dm-verity protected file system |
CONFIG_CRYPTO_USER_API_HASH |
Allows accessing the kernel crypto hash API from user space. |
When veritysetup tool is included in initramfs, it is necessary to reduce initramfs by getting rid of OpenSSL backend dependency. This requires enabling CONFIG_CRYPTO_USER_API_HASH kernel configuration option |
CONFIG_ARM64_PTR_AUTH |
Enables the Arm PA (Pointer Authentication) instructions at EL0 (i.e. for userspace). Choosing this option will cause the kernel to initialize secret keys for each process at exec() time, with these keys being context-switched along with the process |
Pointer authentication (part of the ARMv8.3 Extensions) provides instructions for signing and authenticating pointers against secret keys, which can be used to mitigate Return Oriented Programming (ROP) and other attacks. |
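The options in the table above can be audited against a saved kernel configuration file. The following is a minimal sketch, not part of DriveOS: the function name check_kconfig_enabled is hypothetical, and it assumes the configuration is available as a plain text file (for example one extracted from /proc/config.gz, which exists only when the kernel was built with CONFIG_IKCONFIG_PROC).

```sh
# Report whether each given option is built in (=y) in a kernel config file.
# Usage: check_kconfig_enabled CONFIG_FILE OPTION...
# Returns 0 only if every listed option is set to "y".
# NOTE: hypothetical helper for illustration; not shipped with DriveOS.
check_kconfig_enabled() {
    config_file=$1
    shift
    status=0
    for opt in "$@"; do
        # Anchor the match so CONFIG_BUG does not match CONFIG_BUG_ON_xxx.
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "OK      $opt"
        else
            echo "MISSING $opt"
            status=1
        fi
    done
    return "$status"
}
```

A possible invocation is `check_kconfig_enabled /tmp/kconfig CONFIG_STRICT_KERNEL_RWX CONFIG_DM_VERITY`. Options that take a value, such as CONFIG_LSM or CONFIG_DEFAULT_MMAP_MIN_ADDR, need a separate comparison because this sketch only recognizes "=y".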
The following table describes DRIVE Linux kernel configurations for debug builds.
Kernel Configuration |
Description |
Why Enable? |
---|---|---|
CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL |
Zero-initializes any stack variables that may be passed by reference. Enabled only in the debug builds. |
Eliminates all classes of uninitialized stack variable exploits and information exposures |
CONFIG_GCC_PLUGIN_STACKLEAK |
Scrubs the kernel stack before returning from system calls. |
Eliminates information leakages |
CONFIG_DEBUG_NOTIFIERS |
Enables sanity checking for notifier call chains |
Useful for kernel developers to make sure that modules properly unregister themselves from notifier chains |
CONFIG_DEBUG_SG |
Enables checks on scatter gather tables |
Helps to find problems with drivers that do not properly initialize their sg tables |
CONFIG_DEBUG_LIST |
Enables checks in the linked list walking routines |
Prevents some exploits that involve manipulating a linked list |
The following table describes disabled DriveOS Linux kernel configurations.
Kernel Configuration |
Description |
Why Disable It? |
---|---|---|
CONFIG_COMPAT_BRK |
Disable heap randomization. |
Enabling this configuration breaks ASLR. |
CONFIG_DEVKMEM |
Enables “/dev/kmem” |
“/dev/kmem” provides read and write access to the kernel address space. |
CONFIG_PROC_KCORE |
Enables “/proc/kcore” |
“/proc/kcore” represents the physical memory of the system and is stored in the core file format. For debug purposes only |
CONFIG_COMPAT_VDSO |
The VDSO (Virtual Dynamically linked Shared Object) memory page is used by a process to access kernel features without making a system call. |
VDSO memory page is always located at the same address and it contains ROP gadgets. If enabled, it allows bypassing address space layout randomization (ASLR). |
CONFIG_HIBERNATION |
Enables the suspend to disk (STD) functionality. |
Potential security issue when secure boot is enabled. |
The following table describes recommended DriveOS Linux kernel configurations for production.
Recommendation |
Reason |
---|---|
Disable CONFIG_KEXEC |
Disables the insecure kexec syscall, which allows loading a custom kernel without any signature checks. |
Disable CONFIG_DEVKMEM |
CONFIG_DEVKMEM enables “ |
Disable CONFIG_PROC_KCORE |
CONFIG_PROC_KCORE enables “ |
Disable CONFIG_COMPAT_VDSO |
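The VDSO (Virtual Dynamically linked Shared Object) memory page is used by a process to access kernel features without making a system call. With this option enabled, the page is always located at the same address and contains ROP gadgets, which allows bypassing ASLR. |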
Disable CONFIG_HIBERNATION |
Enables the suspend to disk (STD) functionality. Can potentially allow bypassing secure boot. |
Enable CONFIG_MODULE_SIG_FORCE |
Please see the kernel documentation for more information on signing kernel modules and securing signing keys. How to test signed kernel module loading:
00059b40 ce c2 36 8d 37 ee 77 9b d6 dd aa 32 39 58 25 99 |..6.7.w....29X%.|
00059b50 5e 08 73 3a 0e aa 61 21 0b be 74 9b b8 cf 9e ab |^.s:..a!..t.....|
00059b60 b8 c1 c8 06 63 fe 8a d2 0c a3 fa b1 0c ff fb 5c |....c..........\|
00059b70 ef 1f ef 38 49 cd d7 77 15 e7 f0 8e 40 5a 7c 58 |...8I..w....@Z|X|
00059b80 13 17 e7 ae ea 8f fd 10 73 d1 d5 26 43 8d af 75 |........s..&C..u|
00059b90 16 e7 bd 58 b4 9a ad 7c 70 1b fc 7d 27 57 d3 54 |...X...|p..}'W.T|
00059ba0 24 00 00 02 00 00 00 00 00 00 00 02 a9 7e 4d 6f |$............~Mo|
00059bb0 64 75 6c 65 20 73 69 67 6e 61 74 75 72 65 20 61 |dule signature a|
00059bc0 70 70 65 6e 64 65 64 7e 0a |ppended~.|
|
Enable CONFIG_IO_STRICT_DEVMEM |
Prevents userspace (root) access to all io-memory by filtering I/O access to /dev/mem. If enabled, the /dev/mem file only allows userspace access to idle io-memory ranges. |
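The hexdump in the CONFIG_MODULE_SIG_FORCE row above ends with the marker string "~Module signature appended~" followed by a newline (28 bytes in total). As a rough sketch, the presence of that marker can be checked as follows. The function name module_has_signature_trailer is hypothetical, and the check only shows that a signature trailer is present; it does not verify the signature, which the kernel does at load time.

```sh
# Usage: module_has_signature_trailer MODULE_FILE
# Returns 0 if the file ends with "~Module signature appended~\n".
# NOTE: hypothetical helper; detects the trailer only, verifies nothing.
module_has_signature_trailer() {
    marker='~Module signature appended~'
    # Take the last 28 bytes and drop NUL bytes so the shell can hold them.
    # Command substitution strips the trailing newline of the marker.
    trailer=$(tail -c 28 "$1" | tr -d '\000')
    [ "$trailer" = "$marker" ]
}
```

For example, `module_has_signature_trailer my_driver.ko && echo signed` prints "signed" for a module file that carries the trailer (my_driver.ko is a placeholder name).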
Disabling the Spectre-BHB Mitigations#
Spectre-BHB (Branch History Buffer) is a variant of cross-privilege Spectre-v2 attacks that allows leaking arbitrary kernel memory and can be initiated by running untrusted code. It is tracked with CVE-2022-23960. The Linux kernel has already been patched.
Performance testing by NVIDIA indicated significant performance degradation when the Spectre-BHB mitigations are enabled. As a result, the nospectre_bhb kernel boot parameter is used to disable the Spectre-BHB mitigations.
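Whether nospectre_bhb is actually present on a running system can be confirmed from the kernel command line. The sketch below is an illustration only; the function name cmdline_has_param is hypothetical, and it assumes parameters are separated by single spaces, as in the command line examples in this document.

```sh
# Usage: cmdline_has_param "CMDLINE STRING" PARAM
# Returns 0 if PARAM appears as a whole word, either bare (nospectre_bhb)
# or as the key of a key=value pair (verity=1).
# NOTE: hypothetical helper for illustration.
cmdline_has_param() {
    # Pad with spaces so the first and last parameters match like the others.
    case " $1 " in
        *" $2 "* | *" $2="*) return 0 ;;
    esac
    return 1
}
```

On a target, a possible check is `cmdline_has_param "$(cat /proc/cmdline)" nospectre_bhb && echo "Spectre-BHB mitigations disabled by boot parameter"`.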
Linux Toolchain Hardening#
The following table describes enabled compiler/linker security-related options in the release.
Option |
Description |
Additional information |
---|---|---|
-fstack-clash-protection |
Stack clash protection |
Prevents stack clash attacks |
-fstack-protector-strong |
Stack smashing protector |
Protects functions that call alloca(), define local arrays, or take references to local frame addresses |
-D_FORTIFY_SOURCE=1 |
Fortify sources with checks for calls to unsafe libc functions and potential buffer overflows. |
The _FORTIFY_SOURCE macro adds buffer overflow checks for the following libc functions: memcpy, memset, stpcpy, strcpy, strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, gets. -D_FORTIFY_SOURCE=1 adds checks that shouldn’t change the behavior of conforming programs. -D_FORTIFY_SOURCE=2 adds additional runtime checks, which may cause some conforming programs to fail. See feature_test_macros(7) - Linux manual page (man7.org). DriveOS Linux enables mode 1 by default, but customers or component owners can change this. |
-Wl,-z,relro; -Wl,-z,now |
Full RELRO (Relocation Read Only) protection |
Marks relocation table entries resolved at load time as read-only.
|
-fpie/-fPIE (with -pie when linking) and -fpic/-fPIC |
PIE (Position Independent Executable) and PIC (Position Independent Code) |
For randomizing standalone executable text and data segments. Complements ASLR (Address Space Layout Randomization). |
-Wformat=2 |
Enable -Wformat plus additional format checks |
Enable -Wformat plus additional format checks. See Warning Options (Using the GNU Compiler Collection (GCC)) |
-mbranch-protection=pac-ret+b-key |
Enable branch protection (return address signing) |
The GCC compiler flag -mbranch-protection=pac-ret+b-key is used to enable branch protection for the compiled code. This flag specifically activates the “pac-ret” and “b-key” features. The pac-ret feature refers to “Pointer Authentication Codes for return addresses”, it is a security feature that tries to mitigate the risk of “Return Oriented Programming” (ROP) attacks by signing and authenticating the return addresses before they’re popped from the stack. The b-key feature is referring to the selection of the key “B” for Pointer Authentication. This adds another layer of security by using a different key for different purposes, avoiding potential key reuse attacks. Thus, using this flag will enhance the security of the compiled code, making it harder for potential attackers to exploit it. |
-Warray-bounds=1, -Wformat-overflow, -Wstringop-overflow |
Detect overflows at compilation |
Warns about overflows detected at compile time in arrays, format strings, or string operations |
-Wfree-nonheap-object |
Trigger warning on non-heap object |
The GCC compiler flag, -Wfree-nonheap-object, is an option that triggers a warning if an attempt is made to free an object that was not allocated on the heap. This could include objects located on the stack, global or static objects, or other non-dynamically allocated memory. This warning helps detect incorrect uses of the ‘free’ function, which could potentially lead to serious runtime errors or memory corruption in a program. |
-Wmaybe-uninitialized, -Wuninitialized |
Trigger warning on use of uninitialized object |
Warns at compile time about objects that are, or may be, used uninitialized |
-mharden-sls=all |
Fix for Spectre-V2 BHB [CVE-2022-25368] vulnerability and Straight Line Speculation [CVE-2020-13844] vulnerability |
The -mharden-sls= flag is a GCC compiler flag in Armv8.3-A or later versions. It enables or disables the generation of code that mitigates the Straight-Line Speculation (SLS) vulnerability. This vulnerability refers to the potential for speculatively executing instructions linearly in memory beyond an unconditional change in control flow. It accepts a list of mitigation strategies, specified as “all” or “none”. Using ‘-mharden-sls=all’ enables all available mitigations, while ‘-mharden-sls=none’ turns off all Straight-Line Speculation hardening. |
-Wenum-int-mismatch |
Detect use of int instead of an enum |
The GCC compiler flag -Wenum-int-mismatch warns about mismatches between an enumerated type and an integer type in declarations, for example when the same function is declared once as returning int and once as returning an enum. Such mismatches can hide bugs because the integer type underlying an enumeration is chosen by the implementation. This flag assists in identifying potential problems of this nature, enhancing code correctness. |
-Wanalyzer-allocation-size -Wanalyzer-deref-before-check -Wanalyzer-exposure-through-uninit-copy -Wanalyzer-imprecise-fp-arithmetic -Wanalyzer-infinite-recursion -Wanalyzer-jump-through-null -Wanalyzer-out-of-bounds -Wanalyzer-putenv-of-auto-var -Wanalyzer-tainted-assertion -Wanalyzer-fd-access-mode-mismatch -Wanalyzer-fd-double-close -Wanalyzer-fd-leak -Wanalyzer-fd-phase-mismatch -Wanalyzer-fd-type-mismatch -Wanalyzer-fd-use-after-close -Wanalyzer-fd-use-without-check -Wanalyzer-va-list-leak -Wanalyzer-va-list-use-after-va-end -Wanalyzer-va-arg-type-mismatch -Wanalyzer-va-list-exhausted |
Enable and enforce the static analysis diagnostics to detect and mitigate potential security vulnerabilities |
Enable and enforce the static analysis diagnostics to detect and mitigate potential security vulnerabilities |
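The effect of the linker options in the table above can be spot-checked on a built binary. The following sketch looks for full RELRO in the text output of GNU readelf. The function name has_full_relro is hypothetical, and the matching relies on the assumption that readelf prints a GNU_RELRO program header for -z relro and a BIND_NOW flag (or a FLAGS_1 entry containing NOW) for -z now.

```sh
# Usage: readelf -W -l -d BINARY | has_full_relro
# Returns 0 if the readelf output on stdin shows both a GNU_RELRO segment
# (from -z relro) and an immediate-binding flag (from -z now).
# NOTE: hypothetical helper; depends on GNU readelf's text output format.
has_full_relro() {
    output=$(cat)
    printf '%s\n' "$output" | grep -q 'GNU_RELRO' || return 1
    printf '%s\n' "$output" | grep -Eq 'BIND_NOW|FLAGS_1.*NOW'
}
```

A binary with only GNU_RELRO and no immediate binding has partial RELRO and is reported as failing by this sketch.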
Linux Userspace Hardening#
Security Hardened Sysctl Configurations#
Option |
Description |
---|---|
net.ipv4.conf.default.accept_redirects=0 net.ipv4.conf.all.accept_redirects=0 net.ipv6.conf.default.accept_redirects=0 net.ipv6.conf.all.accept_redirects=0 |
Disable accepting ICMP redirect messages. The settings are defined in /etc/sysctl.d/10-nv_network_security.conf |
net.ipv4.conf.all.send_redirects=0 |
Disable sending ICMP redirect messages. The setting is defined in /etc/sysctl.d/10-nv_network_security.conf |
net.ipv4.conf.default.accept_source_route=0 net.ipv4.conf.all.accept_source_route=0 |
Disable accepting IPv4 source-routed packets. The setting is defined in /etc/sysctl.d/10-nv_network_security.conf |
net.ipv4.ip_forward=0 net.ipv6.conf.all.forwarding=0 net.ipv6.conf.default.forwarding=0 |
Disable IP forwarding. The settings are defined in /etc/sysctl.d/10-nv_network_security.conf |
net.ipv4.tcp_syncookies=1 |
Enable TCP SYN flood protection. The setting is defined in /etc/sysctl.d/10-nv_network_security.conf |
kernel.kptr_restrict = 1 |
The %pK format specifier is designed to hide exposed kernel pointers, specifically via /proc interfaces. Exposing these pointers leads to information leakage. If kptr_restrict is set to 1, kernel pointers printed using %pK are replaced with 0s unless the current user has CAP_SYSLOG. If kptr_restrict is set to 2, kernel pointers printed using %pK are replaced with 0s regardless of privileges. The setting is defined in /etc/sysctl.d/10-kernel-hardening.conf |
kernel.randomize_va_space = 2 |
Fully enables ASLR (Address Space Layout Randomization). A value of 2 means that the stack, heap, mmap base, and VDSO pages are randomized. See https://www.kernel.org/doc/Documentation/sysctl/kernel.txt |
kernel.unprivileged_bpf_disabled=1 |
Prevents unprivileged users from being able to use eBPF via the kernel. |
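The sysctl values in the table above can be compared with the running system by reading the corresponding files under /proc/sys. This is a minimal sketch with a hypothetical function name, sysctl_matches. It maps dots in the key to directory separators, so it is not suitable for keys that contain an interface name with a dot in it.

```sh
# Usage: sysctl_matches KEY EXPECTED [PROC_SYS_DIR]
# Reads the value for KEY (for example kernel.kptr_restrict) from
# PROC_SYS_DIR (default /proc/sys) and compares it with EXPECTED.
# Returns 0 on match, 1 on mismatch, 2 if the key cannot be read.
# NOTE: hypothetical helper for illustration.
sysctl_matches() {
    key=$1
    expected=$2
    base=${3:-/proc/sys}
    path=$base/$(printf '%s' "$key" | tr '.' '/')
    [ -r "$path" ] || return 2
    actual=$(cat "$path")
    [ "$actual" = "$expected" ]
}
```

For example, `sysctl_matches kernel.randomize_va_space 2 || echo "ASLR is not fully enabled"`. The optional third argument exists so the function can be exercised against a directory tree other than /proc/sys.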
Enabling and Disabling Debug Capabilities Further#
Debug capabilities can be enabled or disabled by running the sysctl command or by updating sysctl values in the relevant files under /etc/sysctl.d.
The following sysctl commands must be executed as root to enable or disable the debug features described below. Refer to the kernel documentation for more information about the settings.
Enabling Debug Capabilities
$ cat enable_debug.sh
#!/usr/bin/env bash
if [[ $EUID -ne 0 ]]; then
    echo "Must run as root" >&2
    exit 1
fi
# See
# https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
# https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings
# disable hiding kernel pointers
sysctl -w kernel.kptr_restrict=0
# disable ptrace_scope
sysctl -w kernel.yama.ptrace_scope=0
# enable the sysrq function selected by bitmask 256 (nicing of all RT tasks);
# a value of 1 would enable all sysrq functions
sysctl -w kernel.sysrq=256
# no dmesg restrictions
sysctl -w kernel.dmesg_restrict=0
# allow use of all performance events by all users.
sysctl -w kernel.perf_event_paranoid=-1
Run enable_debug.sh as root:
# enable_debug.sh
Disabling Debug Capabilities
$ cat disable_debug.sh
#!/usr/bin/env bash
if [[ $EUID -ne 0 ]]; then
    echo "Must run as root" >&2
    exit 1
fi
# See
# https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
# https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings
# kernel pointers printed using the %pK format
# specifier are replaced with 0s unless the user
# has CAP_SYSLOG
sysctl -w kernel.kptr_restrict=1
# no processes may use ptrace with PTRACE_ATTACH
# nor via PTRACE_TRACEME. Once set, this sysctl
# value cannot be changed
sysctl -w kernel.yama.ptrace_scope=3
# disable the sysrq functions
sysctl -w kernel.sysrq=0
# insecure kexec syscall cannot be used
sysctl -w kernel.kexec_load_disabled=1
# restrict access to the dmesg log
sysctl -w kernel.dmesg_restrict=1
# disallow kernel profiling by users without
# CAP_PERFMON
sysctl -w kernel.perf_event_paranoid=3
Run disable_debug.sh as root:
# disable_debug.sh
Linux System Hardening#
DM-VERITY#
dm-verity provides transparent integrity checking of block devices using a cryptographic digest provided by the kernel crypto API. It uses “Merkle Hash Trees”.
dm-verity provides protection against “offline” tampering attacks. For this protection to hold, the Merkle hash tree root hash must be part of the secure boot chain.
Enabling and Disabling dm-verity#
Kernel Command Line |
Description |
---|---|
verity |
DM-Verity Enable or Disable flag for Bootburn and Initramfs.
verity=1
verity=0 |
verityinfo |
Provides ext4 device and root hash. verityinfo=<ext4_device>:<root_hash>:<hash_offset> where:
|
The following is an example of the Linux kernel command line parameters when dm-verity is enabled:
aurixfw=AFW root=/dev/vblkdev0p1 loglevel=3 ip=off usr_fs=/dev/vblkdev1:/mnt/persistent/metadata usr_fs2=/dev/vblkdev3:/mnt/persistent/data rw_overlay=/dev/vblkdev4:/rw_overlay ufs_fs=/dev/vblkdev2 gpt rootwait ro gpt tegra_keep_boot_clocks disable_android_paranoid_network sdhci_tegra.en_boot_part_access=1 console=ttyS2,115200n8 pci=ecrc=on verity=1 verityinfo=/dev/vblkdev0p1:12ad433befe7a2319f3cb8b53d93305cf82ce20463f94ac10f79af0d3de6d16d:2809126400
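To make the verityinfo=<ext4_device>:<root_hash>:<hash_offset> layout concrete, the sketch below splits that parameter into its three fields. The function name parse_verityinfo is hypothetical; it only splits the string on colons and does not validate the device, the hash, or the offset.

```sh
# Usage: parse_verityinfo "CMDLINE STRING"
# Prints "device root_hash hash_offset" from the first verityinfo= word.
# Returns 1 if the parameter is missing or does not have three fields.
# NOTE: hypothetical helper for illustration.
parse_verityinfo() {
    info=$(printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^verityinfo=//p' | head -n 1)
    case $info in
        *:*:*) ;;
        *) return 1 ;;
    esac
    device=${info%%:*}       # text before the first colon
    rest=${info#*:}          # text after the first colon
    root_hash=${rest%%:*}
    hash_offset=${rest#*:}
    printf '%s %s %s\n' "$device" "$root_hash" "$hash_offset"
}
```

Applied to the example command line above, the root hash field is 64 hexadecimal characters long, which is the length of a SHA-256 digest.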
Enabling and Verifying dm-verity#
Platform bind options when enabling dm-verity
Option |
Value |
---|---|
OS_ARGS_ENABLE_DM_VERITY |
1 |
OS_ARGS_ROOT_MOUNT_PER |
ro |
Sample usage:
bind_partitions -b p3663-a01 linux OS_ARGS_ENABLE_DM_VERITY=1 OS_ARGS_ROOT_MOUNT_PER=ro
veritysetup Tool
The recommended version of the veritysetup tool must be installed on both the host and in the initramfs. The DriveOS Linux initramfs comes with veritysetup preinstalled.
version: >=2.0.2
Package: cryptsetup-bin
Linux Kernel Configurations for dm-verity
Refer to the Kernel Configurations Hardening section.
Verification of dm-verity
Verification step |
Additional information |
---|---|
The kernel command line includes “verity=1” and “verityinfo” information |
Run the following command and confirm that the output includes “verity=1” and “verityinfo”. As an example:
aurixfw=AFW root=/dev/vblkdev0p1 loglevel=3 ip=off
usr_fs=/dev/vblkdev1:/mnt/persistent/metadata
usr_fs2=/dev/vblkdev3:/mnt/persistent/data rw_overlay=/dev/vblkdev4:/rw_overlay
ufs_fs=/dev/vblkdev2 gpt rootwait ro gpt tegra_keep_boot_clocks
disable_android_paranoid_network sdhci_tegra.en_boot_part_access=1
console=ttyS2,115200n8 pci=ecrc=on verity=1
verityinfo=/dev/vblkdev0p1:c2e2c13f1a04d413a8d0da8c3da95659db08063e9c84d132cb71c6d3a517e0cc:2809178624 tegraid=23.4.1.0.0 bl_debug_data=65536@0x7fe97f0000
bl_prof_dataptr=2048@0x7fe0cb7000
|
rootfs is mounted as read-only and dm-verity protected |
Run the following commands and confirm that rootfs is mounted as read-only and that the kernel enables dm-verity with Arm Crypto Extensions (CE) SHA256 acceleration:
$ sudo dmesg | grep -i verity
[ 1.971875] device-mapper: verity: sha256 using implementation "sha256-ce"
$ mount
/dev/mapper/vroot on / type ext4 (ro,relatime,seclabel,norecovery)
…
|
“/var”, “/home” and “/tmp” are overlayfs mounted as read-write to ensure write operations to these directories are routed to gos0-rw-overlay partition |
overlay on /etc type overlay
(rw,relatime,seclabel,lowerdir=/new_root/etc,upperdir=/new_root/rw_overlay/etc,workdir=/new_root/rw_overlay/tmp_ov/etc)
overlay on /home type overlay
(rw,relatime,seclabel,lowerdir=/new_root/home,upperdir=/new_root/rw_overlay/home,workdir=/new_root/rw_overlay/tmp_ov/home)
overlay on /var type overlay
(rw,relatime,seclabel,lowerdir=/new_root/var,upperdir=/new_root/rw_overlay/var,workdir=/new_root/rw_overlay/tmp_ov/var)
|
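The read-only rootfs check in the table above can be automated by parsing the output of mount. The following is a sketch under the assumption that the root device appears under /dev/mapper/, as in the sample output; the function name root_is_verity_readonly is hypothetical. A device-mapper root alone does not prove that the target is dm-verity, so the dmesg check from the table remains necessary.

```sh
# Usage: mount | root_is_verity_readonly
# Returns 0 if the mount listing on stdin has a root entry of the form
# "/dev/mapper/<name> on / type <fs> (ro,...)".
# NOTE: hypothetical helper; if "/" is listed more than once, the last
# entry decides the result.
root_is_verity_readonly() {
    awk '
        $2 == "on" && $3 == "/" {
            found = ($1 ~ /^\/dev\/mapper\// && $6 ~ /^\(ro[,)]/)
        }
        END { exit(found ? 0 : 1) }
    '
}
```

For example, `mount | root_is_verity_readonly || echo "rootfs is not a read-only device-mapper mount"`.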
Performance of dm-verity
The use of Arm Crypto Extensions (CE) for SHA256 acceleration provides ~4x better dm-verity performance.
DM-CRYPT#
DM-Crypt in the Linux kernel provides transparent encryption of block devices. DM-Crypt uses hardware-accelerated Armv8 crypto AES instructions. DriveOS Linux NSR provides users with the means to set up a sample Encrypted File System (EFS) that demonstrates DM-Crypt based block-device encryption.
DM-CRYPT based EFS Setup
The following instructions outline the steps to configure a DM-Crypt based encrypted file system using AES CBC-ESSIV (Encrypted Salt-Sector Initialization Vector) as the symmetric cipher. These instructions are provided for reference purposes only.
The encrypted partition setup steps have been scripted at path /etc/systemd/scripts/nv_cpu_encrypt_run_once.sh.
To configure the encrypted file system (EFS) partition, execute the following command:
bash /etc/systemd/scripts/nv_cpu_encrypt_run_once.sh
Access Control#
Access control mechanisms are essential for maintaining system security and ensuring that users and processes can only interact with resources in authorized ways. These controls determine who can read, write, or execute files, and how information flows between different users and processes.
Linux primarily supports two core models of access control:
Discretionary Access Control (DAC): Under DAC, the owner of a file or directory has the discretion to set permissions for other users. This model uses file permission bits (read, write, execute) and ownership information (user and group) to regulate access.
Mandatory Access Control (MAC): Unlike DAC, MAC enforces a more rigid security policy defined by the system administrator. MAC systems use security labels and policies to restrict access based on defined rules.
Together, DAC and MAC provide a layered approach to security, with DAC offering user-level flexibility and MAC enabling strict policy enforcement where needed.
All NVIDIA owned assets are protected with tailored DAC and MAC policies in DriveOS Linux, whereas assets owned by third party vendors are protected with default permissions provided by the Vendors. However, the OEM or customer is encouraged to perform due diligence in further hardening the permissions and access controls of assets provided by third party vendors, based on the security criticality and contextual relevance to their specific application use case.
Discretionary Access Control (DAC) in Linux VM#
Introduction#
Discretionary access control is the minimal default security protection enabled in most Linux and Unix-like operating systems. It allows the owner of an ‘asset’ (a file, device node, sys node, and so on) to define who has access to the asset.
In Linux, everything is treated as a file, and hence all assets are files. Each file in Linux has:
An owner (usually the user who created it)
An associated group
Permission bits for:
The owner (user)
The group
Others (everyone else, a.k.a world permissions)
Example permissions:
# ls -l /home/ubuntu/nvidia.txt
-rw-r----- 1 ubuntu nvidia 0 Apr 24 17:40 /home/ubuntu/nvidia.txt
In the example above:
User ‘ubuntu’ has read and write permissions on /home/ubuntu/nvidia.txt.
Any user in the ‘nvidia’ group can read the file. This usually includes the nvidia user, unless that user was created without a matching group.
‘Other’ users who do not belong to the above two categories have no permissions for the file.
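The permission string in the example can also be inspected from a script, for instance to flag assets that are accessible to ‘other’ users. This is a minimal sketch with a hypothetical function name, others_have_access; it reads the mode string printed by ls -ld, where characters 8 to 10 hold the permissions for others.

```sh
# Usage: others_have_access PATH
# Returns 0 if the "others" permission triplet of PATH grants any access,
# 1 if it grants none, and 2 if PATH does not exist.
# NOTE: hypothetical helper for illustration.
others_have_access() {
    [ -e "$1" ] || return 2
    # Mode string example: "-rw-r-----"; characters 8-10 apply to others.
    others=$(ls -ld -- "$1" | cut -c8-10)
    case $others in
        ---|--T) return 1 ;;   # "T" is the sticky bit without execute
    esac
    return 0
}
```

For the file in the example above, whose others triplet is "---", the function returns 1.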
Concept of Supplementary Groups
Every Linux user has two categories of groups:
Primary Group: This is the group name assigned to any file created by the user.
Supplementary Groups: These are additional groups a user can belong to in order to gain access to resources that are usually restricted.
Example of Supplementary groups:
‘sudo’: allows users to execute privileged applications without having to be root.
‘video’: allows applications to connect to video devices without needing root privileges.
Supplementary Groups are of two types: ‘Permanent’ and ‘Temporary’ Supplementary Groups.
Permanent Supplementary Groups are created by modifying the target filesystem’s /etc/group file, where groups are registered. Commands like usermod -aG edit this file to grant a user permanent membership of a group, which persists across boots and is available to any process launched by the user.
Temporary Supplementary Groups are groups associated with a user for a particular process or session. Other processes launched by the same user do not have access to these groups.
DriveOS Linux uses Temporary Supplementary Groups to grant access to restricted assets. To reduce the attack surface, no user has permanent access to any asset configured in the filesystem; instead, access to each asset is granted to a user on a per-process basis.
Capabilities
Linux Capabilities break down the all-powerful root privileges into smaller, more specific chunks. Instead of granting a process full root access, each capability provides only the necessary permissions for a given task. For example, a process may be given the ability to bind to network ports or modify system time, without granting it complete control over the system. This approach enhances security by limiting the scope of what each process can do, following the principle of least privilege.
If an application must perform a privileged action, DriveOS users are advised to start the process as non-root and grant it only the minimal capabilities required for the privileged operation. See the capabilities(7) man page for details.
User Management
DriveOS recommends that each process launched in the Linux VM be launched as a unique user to isolate privileges.
Similarly, it is recommended to protect each asset with a unique group so that no asset can be compromised through a malicious exploit in an unrelated application. For each reserved group, DriveOS also reserves a user with the same name and ID, so that user IDs and group IDs never accidentally overlap, which would weaken access control isolation. It is recommended that DriveOS users do the same.
DriveOS delivers the NV UID and GID Reservation Tool to help DriveOS customers reserve users and groups in the reserved ranges. The tool ensures that no users accidentally overlap, and that DriveOS UIDs/GIDs are never used by DriveOS users and vice versa.
Systemd
DriveOS in production does not have an interface or shell to launch applications, and hence needs a startup manager to launch applications on boot. DriveOS uses systemd, the Ubuntu default, for this purpose. DriveOS creates systemd service units to launch applications on boot; systemd provides several DAC-related parameters that ensure each application can access the resources it needs, following the principle of least privilege:
SupplementaryGroups: This field inside a systemd unit is used to set all the additional groups the process will need to be part of to access the restricted resources
User: (Recommended to be unique) The unique user which launches the process configured in the systemd unit. Must not be “root”.
CapabilityBoundingSet: All the Linux ‘Capabilities’ the process can ever obtain during its lifetime.
AmbientCapabilities: All the Linux ‘Capabilities’ the process acquires on startup.
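The following Python sketch shows how these settings might appear together in a service unit. It is illustrative only: the binary path, user name, and group names are hypothetical, and in DriveOS the unit files are generated by the NVIDIA tooling rather than written by hand. Note that systemd's directive for the bounding set is spelled `CapabilityBoundingSet=`.

```python
def render_service(name, exec_start, user, groups, bounding, ambient):
    """Render a minimal systemd service unit with DAC-related settings."""
    if user == "root":
        # The User field must not be root.
        raise ValueError("service must not run as root")
    lines = [
        "[Unit]",
        f"Description={name}",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
        f"User={user}",
        f"SupplementaryGroups={' '.join(groups)}",
        f"CapabilityBoundingSet={' '.join(bounding)}",
        f"AmbientCapabilities={' '.join(ambient)}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


unit = render_service(
    name="example daemon",
    exec_start="/usr/bin/example_daemon",   # hypothetical binary
    user="example_user",                    # hypothetical unique user
    groups=["video", "example_asset"],      # hypothetical asset group
    bounding=["CAP_NET_BIND_SERVICE"],
    ambient=["CAP_NET_BIND_SERVICE"],
)
print(unit)
```

Keeping the ambient set no larger than the bounding set reflects the roles described above: the bounding set caps what the process can ever hold, while the ambient set is what it starts with.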
/proc/device-tree protection
DriveOS Linux protects the device-tree as a whole using a single group - nv_device_tree. Any applications accessing any path under /proc/device-tree/ will be granted the group nv_device_tree by the tool discussed below.
Users are not expected to grant this group to their applications explicitly. Instead, they should document the /proc/device-tree/ paths used by their applications in the input documentation of the tool described below.
DAC Policies and Tools#
The sections above describe the interfaces through which applications are run on the target and DAC is configured on the Linux filesystem. To simplify determining the exact supplementary groups each process needs, and to confirm that all security requirements hold, NVIDIA delivers the Security Configuration Tool. The tool takes security documentation YAML files from both the customer and NVIDIA as input, generates the files to be added to the Linux filesystem, and copies them over via NVIDIA Build-FS (the image generation tool).
Steps to Adopt DAC for New Processes and Assets#
Users must document all their assets in the input YAML documentation provided by the tool so that the assets are given minimal privileges.
Users must also document every process that needs to run on the target, including its ordering dependencies and the paths it needs to access. The tool uses this information to create a .service file for the process, which is launched automatically on startup.
Steps to Bypass DAC#
DriveOS enforces DAC for all its assets and recommends that users do the same. However, if the DAC requirements must be bypassed, users can use the ‘All NVIDIA DAC Memberships’ feature of NVIDIA Build-FS (Optional Config fields) to add the user to all the DAC groups that NVIDIA has created, which bypasses the protections on shared NVIDIA assets.
Note: This does not grant access to all Canonical restricted files, so the method currently used to access those files must remain in place. It also does not grant access to non-shared assets protected by DAC; any such asset must be changed to shared before the nvidia user can access it.
Mandatory Access Control (MAC) in Linux VM#
Introduction#
DriveOS has adopted AppArmor as the MAC mechanism. AppArmor works by optionally defining a profile for each process that runs on the system. Each profile defines the capabilities and restrictions enforced on the associated process. This includes paths that the attached process can access with specific access modes (read, write, execute, etc.), along with capabilities it is allowed to acquire, and other privileges (network access, mounting, signals, etc.).
Systemd launches processes with the profile specified in the “AppArmorProfile” field. DriveOS supplies profiles for all daemons and executables that are supplied by NVIDIA. It also supplies abstraction files for all libraries supplied by NVIDIA. These abstraction files can be included by any custom profiles that the DriveOS user wishes to create.
DriveOS does not include any AppArmor profiles/policy for upstream Linux distro components. It is recommended that the DriveOS user defines AppArmor profiles for any high-risk components. Any process that runs without an associated AppArmor profile will by default run in “unconfined” mode, which disables any MAC enforcement for that process.
AppArmor profiles are stored in /etc/apparmor.d/. The abstraction files for DriveOS libraries are stored in /etc/apparmor/abstractions/driveos/.
By default, all AppArmor profiles run in enforcing mode. Any permissions denied by AppArmor are logged in the kernel dmesg. For development, a profile can be set to complain mode, which only logs permission errors but otherwise allows access. This is useful because it allows the full list of needed permissions to be collected in one pass instead of one by one.
Kernel Configuration#
Refer to section Kernel Configurations Hardening.
AppArmor Policies and Tools#
DriveOS leverages the same YAML-based tools for AppArmor profile generation as for the DAC mechanism. Each process defined in process_information.yaml gets its own generated AppArmor profile, which gives it access to every file listed as a dependency, and the generated systemd service files launch the process with that profile. Refer to section DAC Policies and Tools.
It is possible to customize the AppArmor profiles generated for processes defined in process_information.yaml with the following attributes:
AppArmorFlags (List):
Flags for AppArmor profiles can be defined here. The following flags are supported:
complain: Sets the profile in complain mode. Useful for debugging and development. Must not be set in production.
attach_disconnected: Allows the use of POSIX message queues. This is required due to a lack of support in Linux kernel 6.1. Note that libnvsciipc.so relies on POSIX mqueues, so this flag is required for any process using libnvsciipc.so.
AdditionalAppArmorRules (List):
Additional statements for the AppArmor profile can be added here. This allows customization and inclusion of rules or permissions beyond file dependencies captured in the YAML files.
These rules must follow the AppArmor profile syntax exactly. See the apparmor(7) man page for details.
Rules must not be terminated with a comma; the tool automatically adds them.
#include statements are not supported here.
Due to dm-verity, all paths specified must begin with /new_root/ instead of /. For example: use /new_root/proc/stat instead of /proc/stat.
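The constraints on AdditionalAppArmorRules entries can be checked before the YAML file is handed to the tool. The Python sketch below is a hypothetical pre-check, not part of the NVIDIA tooling: it flags trailing commas, #include statements, and absolute paths that do not begin with /new_root/. It does not validate AppArmor syntax in general.

```python
def check_rule(rule: str) -> list:
    """Return a list of problems found in one AdditionalAppArmorRules entry."""
    problems = []
    stripped = rule.strip()
    if stripped.endswith(","):
        # The tool adds the terminating comma itself.
        problems.append("trailing comma")
    if stripped.startswith("#include"):
        problems.append("#include not supported")
    elif stripped.startswith("/") and not stripped.startswith("/new_root/"):
        problems.append("path must begin with /new_root/")
    return problems


rules = [
    "/new_root/proc/stat r",   # acceptable
    "/proc/stat r,",           # wrong prefix and trailing comma
    "#include <abstractions/base>",
]
report = {rule: check_rule(rule) for rule in rules}
for rule, problems in report.items():
    print(rule, "->", problems or "ok")
```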
Steps to Adopt AppArmor for New Processes#
The recommended method is to define the new process in the YAML file as described above, and to list its dependencies. This will ensure that DriveOS dependencies are captured fully, and the associated profile will be automatically generated.
It is also possible to manually define profiles in /etc/apparmor.d/ with a handwritten set of rules. Any profile file placed in this folder will be loaded in the kernel at boot. Systemd can be configured to use this profile for a particular process by changing the AppArmorProfile field.
It is highly recommended to use complain mode for debugging and development, and later switch to enforcing mode.
It is possible to validate that a particular process is running with a particular AppArmor profile by either running “sudo aa-status” or the “ps -eZ” command.
Debugging Tips#
Any process that is bound to an AppArmor profile will only be able to perform actions allowed by that profile. Actions that are disallowed will be logged using the kernel audit framework, which will appear in dmesg and/or system log. An example error message:
audit: type=1400 audit(1743805584.548:595): apparmor="DENIED" operation="file_mmap" profile="/new_root/etc/systemd/scripts/nv_tacp_init.sh" name="/new_root/usr/lib/aarch64-linux-gnu/libc.so.6" pid=2340 comm="bash" requested_mask="rm" denied_mask="rm" fsuid=2434 ouid=0
This message has the necessary information to add any missing rules to the AppArmor profile. See the AppArmor man page to see what the different file access masks mean. In complain mode, the DENIED section will instead show ALLOWED, and the action will be permitted.
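As a rough aid for reading such messages, the Python sketch below splits an audit line into its key=value fields and prints a candidate file rule built from the name and denied_mask fields. The candidate is only a starting point and should be reviewed against the AppArmor documentation before being added to a profile; the helper name `parse_denial` is hypothetical.

```python
import re

# Matches key=value pairs where the value is either quoted or a bare token.
AUDIT_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_denial(line: str) -> dict:
    """Extract key=value fields from an AppArmor audit message."""
    return {key: value.strip('"') for key, value in AUDIT_FIELD.findall(line)}


message = (
    'audit: type=1400 audit(1743805584.548:595): apparmor="DENIED" '
    'operation="file_mmap" profile="/new_root/etc/systemd/scripts/nv_tacp_init.sh" '
    'name="/new_root/usr/lib/aarch64-linux-gnu/libc.so.6" pid=2340 comm="bash" '
    'requested_mask="rm" denied_mask="rm" fsuid=2434 ouid=0'
)

fields = parse_denial(message)
candidate = f"{fields['name']} {fields['denied_mask']}"
print(fields["profile"])
print(candidate)
```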
Note that if a process is launched early during boot, the audit messages may be lost due to the kernel’s built-in rate limiting. If AppArmor is suspected to interfere with a process, but no audit messages are seen, it is recommended to restart the service using systemd and check the logs again.
Steps to Disable AppArmor#
AppArmor can be disabled by passing apparmor=0 in the kernel command line. The kernel command line can be modified in the PCT, using the linux_storage.cfg file.
Alternatively, AppArmor can also be disabled by disabling the kernel configuration: CONFIG_SECURITY_APPARMOR.
Filesystem Hardening#
Security for Untrusted Partitions in Linux VM#
Introduction#
Write access to untrusted partitions introduces a potential attack vector for executing untrusted code. To mitigate this risk, the security controls detailed in this section are implemented.
Mount Partitions with noexec, nosuid, nodev#
Untrusted writable partitions are mounted with the flags noexec, nosuid, and nodev to enforce the following restrictions:
Files shall not have the SUID bit enabled for privileged operations. This reduces the risk of privilege escalation through compromised or untrusted executables.
Executables shall not be allowed to run from the partition. This prevents the execution of unverified or malicious code and protects system integrity.
Device nodes shall not be created within the partition. This prevents unauthorized access to hardware and helps enforce security policies.
Similarly, the following security measures are applied to the kernel-managed partitions located at /proc, /sys, and /dev:
Mount /proc, /sys, and /dev/shm with the nodev, noexec, and nosuid options to restrict execution, device node creation, and privileged operations.
Mount /proc with the hidepid=2 option to prevent users (other than root) from viewing metadata of processes owned by other users, thereby improving confidentiality.
Mount /dev/pts with the noexec and nosuid options to restrict execution and privileged operations.
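One way to confirm these options on a running system is to inspect /proc/mounts. The Python sketch below parses text in that format and reports missing flags per mount point. It runs on a hard-coded sample so it can be tried anywhere; the sample lines are invented for illustration. The hidepid option is deliberately not checked, on the assumption that its displayed form can differ between kernel versions.

```python
REQUIRED = {
    "/proc": {"nodev", "noexec", "nosuid"},
    "/sys": {"nodev", "noexec", "nosuid"},
    "/dev/shm": {"nodev", "noexec", "nosuid"},
    "/dev/pts": {"noexec", "nosuid"},
}


def missing_options(mounts_text: str) -> dict:
    """Map each checked mount point to the required options it lacks."""
    result = {}
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        mount_point, options = parts[1], set(parts[3].split(","))
        if mount_point in REQUIRED:
            missing = REQUIRED[mount_point] - options
            if missing:
                result[mount_point] = sorted(missing)
    return result


# Invented sample in /proc/mounts format; on a target, read the real file.
sample = """\
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime 0 0
"""
gaps = missing_options(sample)
print(gaps)
```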
Steps to Disable Untrusted Partitions Security#
To maintain backward compatibility, an option is provided to disable security hardening and restore legacy mount parameters. To do so, create an empty file at the following path in the rootfs partition:
/etc/nvidia/disable_security_hardening
Endpoint Security#
NvSciIpc Endpoint Security#
Introduction#
Note
It is recommended to avoid running processes as the root user. However, if an NvSciIpc client process must be run as root, endpoint security configuration is not applicable for that process.
NvSciIpc client processes may use multiple endpoints or channels, but each endpoint or channel must be assigned to a single user process. In cases where multiple user processes require access, additional endpoints or channels must be defined.
NvSciIpc enforces DAC-based endpoint security by verifying the UID of a client process against the UID mapped to the endpoint in the nvsciipc.cfg file. Access is granted or denied based on this authentication. DriveOS requires all NvSciIpc client processes to have appropriate permissions to access their endpoints, regardless of backend type.
DriveOS recommends launching each user process with a unique username and UID. NvSciIpc endpoint security is enabled by mapping the UID of the client process to its associated endpoints in the nvsciipc.cfg file. This mapping links each UID to a specific endpoint or channel entry.
INTER_PROCESS backends require two UIDs to be defined, one for each communicating process. INTER_THREAD, INTER_VM, and INTER_CHIP backends require only one UID, corresponding to the client process.
Example of NvSciIpc config file format with DAC
DriveOS provides a separate configuration file for each hardware platform. The correct configuration file must be modified based on the target platform. To support the new endpoint security format, the configuration file includes the version tag #CFGVER:1.
/drive-linux/lib-target/nvsciipc_t23x.cfg
/drive-linux/lib-target/nvsciipc_t26x.cfg
Example of configuration file entries:
# INTER_PROCESS <Endpoint-name1> <Endpoint-name2> <backend-specific-info> <UID of Endpoint-name1 owner process: UID1> <UID of Endpoint-name2 owner process: UID2>
# INTER_THREAD <Endpoint-name1> <Endpoint-name2> <backend-specific-info> <UID of Endpoint-name1/2 owner process>
# INTER_VM <Endpoint-name> <backend-specific-info> <UID of Endpoint-name owner process>
# INTER_CHIP <Endpoint-name> <backend-specific-info> <UID of Endpoint-name owner process>
#CFGVER:1
INTER_PROCESS ipc_test_0 ipc_test_1 64 1536 5000 5001
INTER_THREAD itc_test_0 itc_test_1 64 1536 5002
INTER_VM ivm_test 255 5003
INTER_CHIP nvscic2c_pcie_s1_c5_1 0000 5004
For the INTER_PROCESS backend, two UIDs must be defined—one for each communicating process. The second process must include the GID of the first UID in its supplementary group list, as NvSciIpc resources are shared between both processes.
Group membership can be updated using the NV UID and GID Reservation Tool.
For example, if the first process runs under UID 5000 and the second under UID 5001, then the process with UID 5001 must be added to the group associated with UID 5000. This additional group assignment is required only for channels using the INTER_PROCESS backend.
In other example configurations, the process with UID 5002 owns both itc_test_0 and itc_test_1 endpoints, the process with UID 5003 owns the ivm_test endpoint, and the process with UID 5004 owns the nvscic2c_pcie_s1_c5_1 endpoint.
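The ownership rules above can be made concrete with a short Python sketch that reads entries in this format and maps each endpoint name to its owning UID. The number of backend-specific fields per backend (two for INTER_PROCESS and INTER_THREAD, one for INTER_VM and INTER_CHIP) is an assumption taken from the example entries only; real configuration files may differ. An endpoint without a UID is reported as None.

```python
# backend: (endpoint names, backend-info fields, UIDs) -- assumed from the example
BACKEND_LAYOUT = {
    "INTER_PROCESS": (2, 2, 2),
    "INTER_THREAD": (2, 2, 1),
    "INTER_VM": (1, 1, 1),
    "INTER_CHIP": (1, 1, 1),
}


def endpoint_owners(cfg_text: str) -> dict:
    """Map endpoint names to owner UIDs (None when no UID is configured)."""
    owners = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments and the #CFGVER tag
        tokens = line.split()
        layout = BACKEND_LAYOUT.get(tokens[0])
        if layout is None:
            continue
        n_names, n_info, n_uids = layout
        names = tokens[1:1 + n_names]
        uids = tokens[1 + n_names + n_info:]
        if len(uids) != n_uids:
            owners.update({name: None for name in names})
            continue
        if n_uids == 1:
            uids = uids * n_names  # one process owns every endpoint of the entry
        for name, uid in zip(names, uids):
            owners[name] = int(uid)
    return owners


example = """\
#CFGVER:1
INTER_PROCESS ipc_test_0 ipc_test_1 64 1536 5000 5001
INTER_THREAD itc_test_0 itc_test_1 64 1536 5002
INTER_VM ivm_test 255 5003
INTER_CHIP nvscic2c_pcie_s1_c5_1 0000 5004
"""
owners = endpoint_owners(example)
print(owners)
```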
If a UID is not defined for an endpoint or channel in nvsciipc.cfg, NvSciIpc security is disabled for that entry. The Linux journal or system log will report the following error when a required security configuration is missing:
# journalctl | grep nvsciipc_init
Or
# cat /var/log/syslog | grep nvsciipc_init
"ERROR: security configuration is missing in nvsciipc.cfg, please review the configuration. Unsecured nvsciipc configuration shall be used at your own risk"
Steps to Adopt Security for NvSciIpc Endpoints#
The UID of the client process that owns each endpoint must be added to the corresponding entry in nvsciipc.cfg (refer to the example above).
For channels using the INTER_PROCESS backend, the second process must be added to the group of the first process, as it shares NvSciIpc resources. Group membership can be updated using the NV UID and GID Reservation Tool.
The process_information.yaml file must also be updated to reflect the correct UID and security settings for each client process. See the ‘Process Information’ section in Security Configuration Tool.
For Intra-VM endpoint use cases, NvSciIpc uses mqueue for IVC signaling. In some cases, client processes may fail to access mqueue due to kernel-level restrictions. As a workaround for systems running kernel version 6.1, the following AppArmor flag must be added to the client process entry in the process_information.yaml file.
This flag is required due to limitations in the upstream AppArmor driver for kernel 6.1. Later kernel versions include proper support for mqueue access.
Rebuild and flash the image, or reboot the system, to apply all configuration changes.
Example of setting up AppArmor flag:
AppArmorFlags:
- attach_disconnected
Steps to Bypass NvSciIpc Endpoint Security#
If no UID is specified for a channel in the nvsciipc.cfg file, endpoint security is disabled for that specific NvSciIpc channel. NvSciIpc does not provide a global option to disable endpoint security across all channels.
FAQs#
Q1. How to use endpoint security in a test or user application without systemd?
A1. NvSciIpc client processes are expected to be launched by systemd, which ensures that all required environment variables, group memberships, and AppArmor profiles are correctly applied. Launching a client process outside of systemd is not officially supported and may lead to access permission or AppArmor errors that require manual resolution.
In unsupported scenarios, such as testing, a client process may be started manually using the following command format:
sudo -u <UID> {environment variable settings} test_program <program options>
Manual execution requires proper group membership and an appropriately configured AppArmor profile to avoid runtime errors.
Q2. How to check which endpoint/channel is missing the security configuration?
A2. The ‘sudo getipccfg’ command can be used to inspect the current NvSciIpc configuration. Endpoints that do not have a UID assigned will appear with a value of -1, indicating that the UID is missing.
[00001] 2 ivc_test 2000(nvsciipc)
[00002] 2 loopback_tx 2000(nvsciipc)
[00003] 2 loopback_rx 2000(nvsciipc)
[00004] 2 latency 2000(nvsciipc)
[00005] 2 nvscistream_ivm_0 -1(missing UID)
[00006] 2 nvscistream_ivm_1 -1(missing UID)
[00007] 2 aaos_nvsci_display_ivm_0 -1(missing UID)
[00008] 2 aaos_nvsci_display_ivm_1 -1(missing UID)
[00009] 1 ipc_test_0 2000(nvsciipc)
[00010] 1 ipc_test_1 2001(nvsciipc2)
[00011] 1 ipc_test_a_0 2000(nvsciipc)
[00012] 1 ipc_test_a_1 2001(nvsciipc2)
...skipped...
[01360] 1 nvsf_ipc_j_1 -1(missing UID)
[01361] 1 nvsf_ipc_k_0 -1(missing UID)
[01362] 1 nvsf_ipc_k_1 -1(missing UID)
[01363] 1 nvsf_ipc_l_0 -1(missing UID)
[01364] 1 nvsf_ipc_l_1 -1(missing UID)
[01365] 1 nvsf_ipc_m_0 -1(missing UID)
[01366] 1 nvsf_ipc_m_1 -1(missing UID)
BT:0, count:46
BT:1, count:1220
BT:2, count:54
BT:3, count:46
BT:4, count:0
count:1366
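For long listings, the entries with a missing UID can be filtered out programmatically. The Python sketch below assumes the line layout shown in the output above (index in brackets, a numeric column, the endpoint name, then the UID followed by a label in parentheses) and collects the names whose UID is -1. The second column is not interpreted.

```python
import re

# [index] <number> <endpoint name> <uid>(<label>)
ENTRY = re.compile(r"^\[(\d+)\]\s+(\d+)\s+(\S+)\s+(-?\d+)\((.*)\)\s*$")


def unsecured_endpoints(output: str) -> list:
    """Return endpoint names whose UID column is -1."""
    missing = []
    for line in output.splitlines():
        match = ENTRY.match(line.strip())
        if match and int(match.group(4)) == -1:
            missing.append(match.group(3))
    return missing


sample_output = """\
[00004] 2 latency 2000(nvsciipc)
[00005] 2 nvscistream_ivm_0 -1(missing UID)
[00006] 2 nvscistream_ivm_1 -1(missing UID)
[00009] 1 ipc_test_0 2000(nvsciipc)
BT:0, count:46
"""
unsecured = unsecured_endpoints(sample_output)
print(unsecured)
```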
Q3. A process triggers a permission error when opening an endpoint. How to debug?
A3. AppArmor denial messages related to a client process should be checked in the system logs. If any DENIED messages are observed, the AppArmor profile for the process must be updated to grant the necessary permissions.
Examples of access denied messages:
Apr 23 02:46:00 tegra-ubuntu kernel: [TS:69168966090] audit: type=1400 audit(1745376360.040:512270): apparmor="DENIED" operation="open" profile="/new_root/usr/bin/<your_binary>" name="/new_root/dev/shm/sem.<sem_node>" pid=2806 comm="<your_process>" requested_mask="wr" denied_mask="wr" fsuid=2001 ouid=2000
Apr 23 02:46:00 tegra-ubuntu kernel: [TS:69168994738] audit: type=1400 audit(1745376360.040:512271): apparmor="DENIED" operation="open" profile="/new_root/usr/bin/<your_binary>" name="/new_root/dev/shm/<shm_node>" pid=2806 comm="<your_process>" requested_mask="wr" denied_mask="wr" fsuid=2001 ouid=2000
Examples of AppArmor profile updates:
AdditionalAppArmorRules:
- /new_root/dev/shm/sem.<sem_node> rw
- /new_root/dev/shm/<shm_node> rw
DriveOS Chain-C Linux Security Hardening#
The Recovery Boot Chain (Boot Chain-C) resides in QSPI and serves primarily as a recovery partition in case the system is bricked for any reason. Chain-C is also extended by customers who integrate their own software components. To achieve holistic security, every component in the software supply chain must be secure and compliant. NVIDIA is committed to protecting all assets it owns or delivers. These assets, such as binaries, processes and services, are safeguarded using well-established discretionary access control mechanisms. Customers that update Chain-C are expected to protect their assets in the same way that NVIDIA has protected its own.
OS and Toolchain Versions
Component | Version |
---|---|
GCC | 13.2 |
Binutils | 2.42 |
Linux kernel | 6.1.119 |
Linux Kernel Hardening#
Kernel Configurations#
The following table describes enabled DriveOS Linux kernel configurations.
Kernel Configuration | Description | Why Enable? |
---|---|---|
CONFIG_STRICT_KERNEL_RWX | When enabled, kernel text and rodata memory are made read-only. | Provides memory safety; prevents tampering with code and read-only critical kernel data. |
CONFIG_STACKPROTECTOR | Enables the stack protector. | Detects stack smashing and can potentially prevent hijacking of execution flow. |
CONFIG_STACKPROTECTOR_STRONG | Enables the strong stack protector. This is the same as using the GCC option “-fstack-protector-strong”. | Detects stack smashing and can potentially prevent hijacking of execution flow, as detection leads to execution termination. |
CONFIG_HARDENED_USERCOPY | Checks memory regions when copying memory to/from the kernel (via the copy_to_user() and copy_from_user() functions) to make sure that (1) the size of the data being copied to a kernel object does not exceed the object’s allocated size, and (2) the target address is not within the kernel text. | Mitigates a class of vulnerabilities that involve buffer overflows and writing to arbitrary kernel addresses. |
CONFIG_STRICT_MODULE_RWX | Sets loadable kernel module data as NX and text as RO. | Mitigates runtime attacks that involve code and data tampering. |
CONFIG_MODULE_SIG=y | Enables kernel module signing. Each kernel module is digitally signed, and the kernel checks the signature while loading the module. | Kernel module signing ensures the integrity and authenticity of kernel modules while loading them. |
CONFIG_MODULE_SIG_ALL=y | When enforced, the kernel loads kernel modules only from known origin(s) and only if their integrity is not broken. | |
CONFIG_DEFAULT_MMAP_MIN_ADDR=32768 | Disallows allocating the first 32k of memory. This is the portion of low virtual memory that should be protected from user space allocation. | Keeping a user from writing to low pages can help reduce the impact of kernel NULL pointer bugs. |
CONFIG_RANDOMIZE_BASE | Enables KASLR (Kernel Address Space Layout Randomization). Randomizes the virtual address at which the kernel image is loaded. | Deters exploit attempts that rely on knowledge of the location of kernel internals. |
CONFIG_ARM64_PTR_AUTH | Enables the Arm PA (Pointer Authentication) instructions at EL0 (i.e. for userspace). Choosing this option causes the kernel to initialise secret keys for each process at exec() time, with these keys being context-switched along with the process. | Pointer authentication (part of the ARMv8.3 Extensions) provides instructions for signing and authenticating pointers against secret keys, which can be used to mitigate Return Oriented Programming (ROP) and other attacks. |
CONFIG_MODULE_SIG_FORCE | The Linux kernel loads only signed kernel modules. | CONFIG_MODULE_SIG_FORCE shall be enforced by the customers. |
The following table describes disabled DriveOS Linux kernel configurations.
Kernel Configuration | Description | Why Disable? |
---|---|---|
CONFIG_COMPAT_BRK | Disables heap randomization. | Enabling this configuration breaks ASLR. |
CONFIG_DEVMEM | Enables “/dev/mem”. | “/dev/mem” provides direct read and write access to physical memory, including memory used by the kernel. |
CONFIG_COMPAT_VDSO | The VDSO (Virtual Dynamically-linked Shared Object) memory page is used by a process to access kernel features without using a system call. When this option is enabled, the VDSO memory page is always located at the same address, and it contains ROP gadgets. | If enabled, it allows bypassing ASLR. |
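A kernel build configuration can be compared against the two tables above with a short script. The Python sketch below is an illustration under stated assumptions: it checks only a subset of the listed options, treats an option as disabled when it is absent or set to n, and runs on an invented sample rather than a real .config file.

```python
EXPECTED_ENABLED = [
    "CONFIG_STRICT_KERNEL_RWX",
    "CONFIG_STACKPROTECTOR_STRONG",
    "CONFIG_HARDENED_USERCOPY",
    "CONFIG_RANDOMIZE_BASE",
]
EXPECTED_DISABLED = ["CONFIG_COMPAT_BRK", "CONFIG_DEVMEM", "CONFIG_COMPAT_VDSO"]


def audit_kernel_config(config_text: str) -> list:
    """Return human-readable findings for options that deviate from the tables."""
    values = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as '# CONFIG_X is not set' and are skipped here.
        if line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            values[key] = value
    findings = []
    for option in EXPECTED_ENABLED:
        if values.get(option) != "y":
            findings.append(f"{option} is not enabled")
    for option in EXPECTED_DISABLED:
        if values.get(option, "n") != "n":
            findings.append(f"{option} should be disabled")
    return findings


sample_config = """\
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_HARDENED_USERCOPY=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_DEVMEM=y
# CONFIG_COMPAT_BRK is not set
"""
findings = audit_kernel_config(sample_config)
for finding in findings:
    print(finding)
```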
Linux Toolchain Hardening#
The following table describes enabled compiler/linker security-related options in the release.
Option | Description | Additional Information |
---|---|---|
-fstack-clash-protection | Stack clash protection | Prevents stack clash attacks. |
-fstack-protector-strong | Stack smashing protection | Protects functions that call alloca() and/or have local array definitions. |
-D_FORTIFY_SOURCE=1 | Fortify sources with checks for unsafe libc functions and potential buffer overflows. | The _FORTIFY_SOURCE macro adds buffer overflow checks for libc functions including memcpy, memset, stpcpy, strcpy, strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, and gets. -D_FORTIFY_SOURCE=1 adds checks that shouldn’t change the behavior of conforming programs. See feature_test_macros(7) - Linux manual page. -D_FORTIFY_SOURCE=2 adds additional runtime checks, which may cause some conforming programs to fail. DRIVE OS Linux enables mode 1 by default, but this can be changed by customers or component owners. |
-Wl,-z,relro; -Wl,-z,now | Full RELRO (Relocation Read Only) protection | Marks relocation table entries resolved at load time as read-only. |
-pie/-PIE and -pic/-PIC | PIE (Position Independent Executable) and PIC (Position Independent Code) | Randomizes standalone executable text and data segments. Complements ASLR (Address Space Layout Randomization). |
-Wformat=2 | Enables -Wformat plus additional format checks | See Warning Options. |
-mbranch-protection=pac-ret+leaf and -march=armv8.3-a | -march=armv8.3-a enables PAC. “-mbranch-protection=pac-ret” signs and authenticates return addresses to protect against Return-Oriented Programming (ROP) attacks. | Authentication is also performed on leaf functions with the “+leaf” option. |
Linux Userspace Hardening#
Security Hardened Sysctl Configurations#
Option | Description |
---|---|
kernel.kptr_restrict = 1 | The %pK format specifier is designed to hide exposed kernel pointers, specifically via /proc interfaces. Exposing these pointers leads to information leakage. If kptr_restrict is set to 1 and the current user does not have CAP_SYSLOG, kernel pointers using %pK are printed as 0’s. If kptr_restrict is set to 2, kernel pointers using %pK are printed as 0’s regardless of privileges. The setting is defined in |
kernel.randomize_va_space = 2 | Fully enables ASLR (Address Space Layout Randomization). A value of 2 means that the stack/heap, mmap base, and VDSO pages are randomized. See kernel.txt. |
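The effective values can be read back from /proc/sys, where each dotted sysctl name maps to a path. The Python sketch below compares the two settings in the table against the live values; the reader function is injectable so the logic can be exercised without access to a target, and the fake values used in the demonstration are invented.

```python
EXPECTED = {
    "kernel.kptr_restrict": "1",
    "kernel.randomize_va_space": "2",
}


def sysctl_path(name: str) -> str:
    """Translate a dotted sysctl name into its /proc/sys path."""
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name: str) -> str:
    """Read the current value of a sysctl from /proc/sys."""
    with open(sysctl_path(name)) as handle:
        return handle.read().strip()


def find_mismatches(reader=read_sysctl) -> dict:
    """Return {name: (expected, actual)} for settings that differ."""
    mismatches = {}
    for name, expected in EXPECTED.items():
        actual = reader(name)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


# Demonstration with a fake reader; call find_mismatches() on a target instead.
fake_values = {"kernel.kptr_restrict": "0", "kernel.randomize_va_space": "2"}
result = find_mismatches(fake_values.get)
print(result)
```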
Discretionary Access Control (DAC)#
Introduction#
Discretionary Access Control in Recovery VM allows access to objects - such as files or directories - to be controlled at the discretion of their owners, who can set or modify permissions for users (UID) and groups (GID). This includes management of supplementary group IDs (SGIDs), which grant users secondary group memberships beyond their primary GID.
This granular approach allows:
Least-privilege access to libraries/utilities via SGIDs
Primary group isolation for core functionality
Dynamic permission stacking without modifying file ownership
DAC policies and Tools#
The DAC policy for Recovery VM is to assign a unique Group ID (GID) to all NVIDIA owned assets. The current implementation approach hardcodes these GIDs in a Yocto recipe, and further assigns them as supplementary GIDs to the user that requires a subset of these assets. This allows every new user to have access to specific resources via the GIDs which are assigned to it as the supplementary GID during build time itself.
There are no additional tools required for setting DAC policies in Recovery VM.
Steps to adopt DAC for new processes and assets#
To adopt DAC in the Recovery VM for new processes and assets, add new users by adding useradd entries to the EXTRA_USERS_PARAMS field in the image recipe below.
tegra-initramfs-recovery.bb
The following example of the “nvsciipc” user demonstrates how it is assigned UID, GID and SGIDs:
# groupadd -g 2000 nvsciipc;
# useradd -u 2000 -g nvsciipc -G 1018,1020,1021,1100,1101 nvsciipc;
On the first line, a new group “nvsciipc” is created with Group ID (GID) 2000, and on the second line a new user “nvsciipc” is created with User ID (UID) 2000 with primary group as “nvsciipc” and the supplementary groups as 1018,1020,1021,1100, and 1101. These GIDs are of the libraries and configuration files required by the “nvsciipc” user - libnvos, libnvscievent, libnvsciipc, nvsciipc_t23x.cfg, nvsciipc_t26x.cfg, respectively.
The full list of all the NVIDIA owned resources can be found under the GROUPADD_PARAM:${PN} field in the below file:
${YOCTO_TOP}/drive-linux_src/yocto/layers/meta-tegra/recipes-bsp/tegra-drivers/nv-tegra-drivers_1.0.bb
To add a new user, for example “du_user”, with access to all the DU libraries, use the following:
# useradd -g nvsciipc -G 1010,1011,1012,1013,1014,1015,1016 du_user;
In the example above, “nvsciipc” group is assigned as the primary group for the “du_user” user, and the supplementary GIDs are 1010:libdu_mcc, 1011:libnvdubhc_api, 1012:libnvducc, 1013:libnvdulink, 1014:libnvduplugin, 1015:libnvdusclient, and 1016:libnvdutransport.
Steps to bypass DAC#
Before launching the Yocto build for recovery initramfs, set the environment variable IS_DAC_DISABLED to “1” like the following:
# export IS_DAC_DISABLED="1"
# source oss/scarthgap/poky/oe-init-build-env
# bitbake tegra-initramfs-recovery
This bypasses DAC, and all NVIDIA owned assets are installed as “root” user with default filesystem permissions.
Device Node Protection#
Access to the following device nodes is restricted so that they cannot be accessed by non-root users.
Device Node | Permissions | Owner | Group |
---|---|---|---|
/dev/autofs | rw-r----- | root | root |
/dev/ptmx | rw-rw---- | root | root |
/dev/tty | rw-rw---- | root | root |
/dev/urandom | rw-rw---- | root | root |
/dev/zero | rw-rw---- | root | root |
/dev/random | rw-rw---- | root | root |
/dev/null | rw-rw---- | root | root |
/dev/full | rw-rw---- | root | root |
/dev/kmsg | rw-r----- | root | root |
/dev/nvsciipc | rw-rw---- | root | nvsciipc |
/dev/ivc* | rw-rw---- | root | nvsciipc |
One of the methods below should be selected to allow non-root users access to the device nodes above.
Change the default permissions as the root user.
This method is not recommended for device nodes such as /dev/nvsciipc and /dev/ivc* whose access needs to be controlled.
However, it can be used for common device nodes such as /dev/zero.
Example command to allow all users access to /dev/zero:
chmod 666 /dev/zero
Add the user to the appropriate groups.
This is the recommended method for device nodes whose access needs to be controlled, for example, /dev/nvsciipc and /dev/ivc*.
Example command to allow a <user> access to /dev/nvsciipc and /dev/ivc*:
useradd -G nvsciipc <user>