Enhancing Anomaly Detection in Linux Audit Logs with AI

In cybersecurity, identifying threats swiftly and accurately is paramount to the success of the modern enterprise. Linux audit logs, which record system activities, offer a goldmine of data for spotting unusual activities that could signify security breaches and insider threats.

NVIDIA Morpheus, an AI-driven cybersecurity framework, is at the forefront of enhancing anomaly detection in these logs. This post explores how NVIDIA Morpheus can be used to identify threats in Linux audit logs.

Challenges with current SIEM tools

Traditional security information and event management (SIEM) tools are focused on predefined rules for alert generation. They face several issues:

Identifying new threats: Limited to known threat patterns, they often miss novel or complex attacks.
High false positive rates: Strict rules lead to numerous false alerts, burdening security teams with unnecessary investigations.
Limited contextual insight: Lack of context in evaluating events can cause overlooked threats or misinterpretations.

What are Linux audit logs and what information do they hold?

Linux audit logs play a pivotal role in monitoring system activities within Linux environments, offering comprehensive insights into various aspects:

User activity: Details on logins, commands, and system setting changes.
System events: Records of startups/shutdowns, time changes, and kernel actions.
File access: Information on file interactions, modifications, or deletions.
Network activities: Data on network connections, transfers, and security events.
Authentication and authorization: Logs of login attempts and permission adjustments.

This data is vital for system oversight, security assessments, and troubleshooting. It helps administrators and security experts ensure system integrity, identify vulnerabilities, and meet compliance requirements.

Here’s an example of what an entry in a Linux audit log might look like:

type=SYSCALL msg=audit(1614699353.204:12345): arch=c000003e syscall=59 success=yes exit=0 a0=555555554000 a1=5555555562a0 a2=5555555568e0 a3=7fffd9f71c20 items=2 ppid=2914 pid=2915 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts0 ses=1 comm="cat" exe="/usr/bin/cat" key=(null)
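To make the structure of such an entry concrete, here is a minimal parsing sketch. It assumes the simple space-separated key=value layout shown above and is not part of the Morpheus workflow; the function name and the abbreviated sample line are ours. In this record, arch=c000003e identifies x86_64, where syscall 59 is execve, so the entry records the cat binary being executed.

```python
import re
import shlex
from datetime import datetime, timezone

# Matches the "audit(<epoch seconds>.<fraction>:<serial>)" token in the msg field.
MSG_RE = re.compile(r"audit\((\d+)\.(\d+):(\d+)\)")

# Abbreviated copy of the sample entry (some fields dropped for brevity).
SAMPLE = (
    'type=SYSCALL msg=audit(1614699353.204:12345): arch=c000003e syscall=59 '
    'success=yes exit=0 ppid=2914 pid=2915 auid=1000 uid=1000 tty=pts0 ses=1 '
    'comm="cat" exe="/usr/bin/cat" key=(null)'
)


def parse_audit_line(line: str) -> dict:
    """Split one audit record into a dict of key=value fields."""
    record = {}
    # shlex strips the double quotes around values such as comm="cat".
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    match = MSG_RE.search(record.get("msg", ""))
    if match:
        seconds, fraction, serial = match.groups()
        # Pad the fractional part to microseconds ("204" -> 204000).
        micros = int(fraction.ljust(6, "0")[:6])
        record["timestamp"] = datetime.fromtimestamp(
            int(seconds), tz=timezone.utc
        ).replace(microsecond=micros)
        record["serial"] = int(serial)
    return record


record = parse_audit_line(SAMPLE)
print(record["type"], record["syscall"], record["exe"], record["timestamp"])
```

Real audit logs also contain hex-encoded values and multi-record events that share a serial number, which this sketch does not handle.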

Understanding anomaly detection in Linux audit logs

Anomalies, or outliers, are incidents in data that deviate from the normal behavior in a given context.

Anomaly detection in Linux audit logs refers to the process of identifying unusual patterns or activities that deviate from the norm within system logs. These anomalies can signify potential security threats, such as unauthorized access attempts, malware infections, or insider threats.

By analyzing the vast amount of data contained in audit logs, anomaly detection algorithms can pinpoint irregularities that might otherwise go unnoticed.

NVIDIA Morpheus cybersecurity AI framework

NVIDIA Morpheus, part of NVIDIA AI Enterprise, is an AI application framework designed to enhance cybersecurity capabilities. It enables you to create AI-driven solutions that can filter, process, and classify large volumes of data for cybersecurity purposes.

Harnessing GPU acceleration, Morpheus enhances the speed of data inspection and analysis, facilitating real-time detection and response to security threats.

NVIDIA Morpheus presents a compelling solution for analyzing Linux audit logs, an area where the volume and the dynamic nature of the data can overwhelm traditional monitoring tools.

Here’s why Morpheus is well suited for this task:

Handling high-volume data
Ease of integration and use
Built-in support for anomaly detection: digital fingerprinting

Handling high-volume data

Linux audit logs generate a massive amount of data, capturing detailed records of system events which can be critical for security. Traditional tools may struggle with this volume, leading to delayed or missed insights.

Morpheus leverages GPU acceleration, enabling systems to process data up to 600x faster than conventional, non-GPU accelerated servers. This capability ensures that even the densest audit logs can be analyzed efficiently, identifying anomalies without lag.

Ease of integration and use

Morpheus supports common deep learning frameworks and model formats, allowing cybersecurity developers to easily integrate their existing models or deploy pre-trained models.

It offers a range of stages and pipeline examples for quick deployment and customization. This ensures that teams can concentrate on tackling security threats instead of getting bogged down in engineering tasks.

Built-in support for anomaly detection: Digital fingerprinting

The Digital Fingerprinting AI workflow included with Morpheus offers a sophisticated approach to anomaly detection in cybersecurity. This workflow employs unsupervised learning algorithms to create unique identifiers, or fingerprints, for each entity on a network.

By analyzing these fingerprints, Morpheus can detect deviations from normal behavior patterns, signaling potential security threats or anomalies.

The Digital Fingerprinting AI workflow runs within the Morpheus framework, featuring both training and inference pipelines that communicate through a shared model store. This enables dynamic and efficient anomaly detection and scaling to manage tens of thousands of fingerprints without having to develop a custom workflow from scratch.

Creating an anomaly detection workflow for Linux audit logs using the Morpheus framework

We built an anomaly detection workflow for Linux audit logs based on the existing Morpheus Digital Fingerprinting AI workflow with some modifications.

Diagram shows the end-to-end training and inference workflow. — *Figure 1. Anomaly detection workflow*

Preprocessing and feature engineering

In the preprocessing stage, logs generated by the system are filtered out to reduce noise in the dataset.

From the Linux logs, we designed and developed a set of features by aggregating data over a rolling window of 5 minutes. The following are features used in the model:

User activity: Boolean features of modification or access to certain sensitive configuration files such as sshd_config or passwd files.
Process activity: Spikes in process activity, package install, or removal.
File access patterns: Count of files deleted, moved, or copied.
Network activity: Inbound and outbound activities.
Authentication: Count of user login failures, sudo command execution, and SSH connection.
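The count-based features above can be illustrated with a trailing 5-minute window. The following is a minimal sketch in plain Python, not the feature engineering stage we used; the event names (login_failure, sudo) and the input shape are assumptions made for illustration.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60  # the 5-minute window described above


def rolling_counts(events, window=WINDOW_SECONDS):
    """Yield (timestamp, counts) where counts covers the trailing window.

    `events` is an iterable of (epoch_seconds, event_name) sorted by time.
    The window is half-open: (timestamp - window, timestamp].
    """
    recent = deque()  # events still inside the window
    counts = {}       # event_name -> count inside the window
    for ts, name in events:
        recent.append((ts, name))
        counts[name] = counts.get(name, 0) + 1
        # Evict events that have fallen out of the trailing window.
        while recent and recent[0][0] <= ts - window:
            _, old_name = recent.popleft()
            counts[old_name] -= 1
        yield ts, dict(counts)


# Hypothetical, already-parsed events: (epoch seconds, event name).
events = [
    (0, "login_failure"),
    (60, "login_failure"),
    (120, "sudo"),
    (290, "login_failure"),
    (310, "login_failure"),
    (700, "sudo"),
]
features = list(rolling_counts(events))
for ts, counts in features:
    print(ts, counts)
```

Each emitted dictionary is one feature row. Boolean features, such as whether sshd_config was touched, could be derived the same way by checking whether a count is greater than zero.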

Model training and evaluation

While assembling the training and validation datasets, we ensured that they accurately depicted the baseline behavior of the given server, excluding any anomalous data points.

Conversely, for the test dataset, we incorporated anomalous data points, created with the assistance of the security operations center (SOC) analyst.

We trained a separate autoencoder model for each server, using only that server's Linux audit logs. Each trained model is stored in MLflow along with its metadata.

Training pipeline

The training pipeline includes steps such as retrieving data from the Delta Lake source stage, preprocessing, and feature extraction. Following feature engineering, the model is trained on the provided logs, and the trained model is then saved in MLflow.

Figure 2 shows the training pipeline we developed using the Morpheus framework.

Diagram shows security log data from Delta Lake moving through the pipeline stages to the MLflow model repo: Delta Lake source, preprocessing and feature engineering, the training model autoencoder, and the MLflow writer. — *Figure 2. Anomaly detection training pipeline*

Using the training pipeline, we trained a model on a single server’s log data, a month of activity totaling 100M log lines, in just 8 minutes thanks to GPU-accelerated compute. Traditional approaches would take hours to days.

Bar chart shows training times for 100M logs, 300M logs, and 500M logs. — *Figure 3. Model training pipeline metrics*

Inference pipeline

The Delta Lake source and feature engineering stages are shared by the training and inference pipelines. Model weights are fetched from MLflow by machine name, and Morpheus can cache the fetched model.
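The idea of looking up a model by machine name and caching it can be sketched as follows. This is an illustration only: it is neither the Morpheus caching feature nor the MLflow API, and the in-memory store, host names, and get_model function are hypothetical stand-ins.

```python
from functools import lru_cache

# Hypothetical in-memory stand-in for the model registry. In the workflow
# described here, this role is played by MLflow.
FAKE_MODEL_STORE = {
    "web-01": {"weights": "weights-for-web-01"},
    "db-01": {"weights": "weights-for-db-01"},
}
FETCH_LOG = []  # records every real fetch so that cache hits are visible


@lru_cache(maxsize=128)
def get_model(machine_name: str) -> dict:
    """Fetch the model for one machine, caching the result per name."""
    FETCH_LOG.append(machine_name)
    return FAKE_MODEL_STORE[machine_name]


# Four batches arrive, three of them from the same machine.
for host in ["web-01", "db-01", "web-01", "web-01"]:
    get_model(host)

print(FETCH_LOG)               # only the first request per machine is fetched
print(get_model.cache_info())  # hit and miss counters
```

Note that lru_cache never expires entries on its own, so a production cache would also need a way to pick up retrained models, for example a time-to-live.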

After performing inference on the given logs, alerts are filtered and post-processed based on a predefined threshold. When post-processing is complete, the alerts are dispatched to Splunk.
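As a rough sketch of the filtering and post-processing step, the snippet below keeps only the windows whose anomaly score exceeds a threshold and serializes them as JSON. The ScoredWindow fields, the 0.7 threshold, and the payload layout are assumptions for illustration, not the format we send to Splunk.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ScoredWindow:
    host: str
    window_start: int     # epoch seconds of the 5-minute window
    anomaly_score: float  # for example, autoencoder reconstruction error


def filter_alerts(scored, threshold):
    """Keep windows whose score exceeds the threshold, highest score first."""
    alerts = [s for s in scored if s.anomaly_score > threshold]
    return sorted(alerts, key=lambda s: s.anomaly_score, reverse=True)


# Hypothetical inference output for two hosts.
scored = [
    ScoredWindow("web-01", 0, 0.12),
    ScoredWindow("web-01", 300, 0.93),
    ScoredWindow("db-01", 0, 0.71),
]
alerts = filter_alerts(scored, threshold=0.7)

# Post-processing: one JSON document per alert, ready to hand to a SIEM client.
payloads = [json.dumps(asdict(alert)) for alert in alerts]
for payload in payloads:
    print(payload)
```

In practice the threshold would be chosen per host from validation data, because each server has its own model and its own score distribution.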

Figure 4 shows the inference pipeline we developed using the Morpheus framework.

Workflow diagram shows security log data moving through the inference pipeline stages: Delta Lake source, preprocessing and feature engineering, inference, filter, post-processing, and the Delta Lake writer. — *Figure 4. Anomaly detection inference pipeline*

The inference pipeline can process 100K log lines in approximately 120 seconds.

Bar chart shows inference times for 50K logs, 100K logs, and 500K logs. — *Figure 5. Model inference pipeline metrics*
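To put the quoted timings in perspective, the short calculation below converts them to lines per second. It also compares the inference rate with the average rate at which one server produces logs. That comparison rests on our own simplifying assumption of a 30-day month and a steady log rate, which real servers do not have.

```python
# Figures quoted in this post.
TRAIN_LINES, TRAIN_SECONDS = 100_000_000, 8 * 60
INFER_LINES, INFER_SECONDS = 100_000, 120

train_rate = TRAIN_LINES / TRAIN_SECONDS  # lines per second during training
infer_rate = INFER_LINES / INFER_SECONDS  # lines per second during inference

# Assumption: one server emits 100M lines evenly over a 30-day month.
server_rate = 100_000_000 / (30 * 24 * 3600)
headroom = infer_rate / server_rate

print(f"training:  {train_rate:,.0f} lines/s")
print(f"inference: {infer_rate:,.0f} lines/s")
print(f"one server: {server_rate:,.1f} lines/s, headroom {headroom:.1f}x")
```

Under these assumptions, training runs at roughly 208,000 lines per second and inference at about 830 lines per second, which is around 20 times the average rate at which a single server generates log lines.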

Monitoring and alerting

The anomaly detection workflow is designed to forward alerts directly to a SIEM tool such as Splunk. This enables security operations center (SOC) analysts to promptly take appropriate measures based on these notifications.

Screenshot showcasing detected anomalies with their respective anomaly score. — *Figure 6. Alert dashboard*

Anomalies detected by the model: External intrusion and insider threat

The anomaly detection workflow is finely tuned to detect two predominant types of security threats within Linux audit logs: unauthorized access attempts and unusual system behavior.

Unauthorized access attempts

These are attempts by unauthorized users to gain access to the system or sensitive data, often signaling external intrusion efforts. They can manifest as repeated login failures or the use of sudo commands by non-privileged users.

By aggregating and analyzing login failure counts and sudo command executions within a certain time frame, the model can effectively detect such unauthorized access attempts. Patterns deviating significantly from the baseline behavior, such as a spike in failed login attempts or sudo commands from unusual user accounts, trigger alerts, indicating potential security breaches.
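The notion of a spike relative to baseline can be illustrated with a plain z-score. This is a deliberately simplified stand-in and not the autoencoder used in the workflow: the baseline counts, the floor of one event on the standard deviation, and the threshold of 3 are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def spike_score(baseline_counts, observed, min_sigma=1.0):
    """Return how many standard deviations `observed` lies above baseline.

    `min_sigma` floors the standard deviation so that a very quiet
    baseline does not turn every single event into a huge score.
    """
    mu = mean(baseline_counts)
    sigma = max(stdev(baseline_counts), min_sigma)
    return (observed - mu) / sigma


# Hypothetical login-failure counts per 5-minute window during normal operation.
baseline = [0, 1, 0, 2, 1, 0, 1, 1]
THRESHOLD = 3.0  # assumed alerting threshold

for observed in (1, 12):
    score = spike_score(baseline, observed)
    status = "ALERT" if score > THRESHOLD else "ok"
    print(f"{observed} failures -> score {score:.2f} ({status})")
```

An autoencoder generalizes this idea by learning the joint baseline across all features, so that unusual combinations can be flagged even when no single count is extreme.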

Unusual system behavior

This category involves a broad range of anomalous activities within the network, suggesting possible malware infections or insider threats.

For instance, a sudden increase in file manipulation activities, such as deletions, movements, or copying of sensitive files, could indicate a malware infection.

Similarly, abnormal spikes in network activity, especially outbound transfers, might suggest data exfiltration attempts by malicious insiders.

With features that monitor file access patterns, network activities, and changes in critical system files, the model is capable of detecting these unusual behaviors. As the model monitors each host individually and triggers alerts only upon significant deviations from established patterns, it offers a reduced false positive rate and enhanced explainability compared to traditional rule-based SIEM anomaly detectors.

Business outcomes

Anomaly detection in Linux audit logs is a critical component of a comprehensive cybersecurity strategy. This post showed how the Morpheus framework can be used to build an anomaly detection pipeline that uncovers potential threats in those logs.

By integrating a Linux audit log anomaly detection workflow, SOC analysts can achieve substantial benefits, such as improved security and risk management, through the early detection of threats.

To enhance AI transparency, demonstrate accountability in the model development process, and promote ethical considerations for frameworks like Morpheus, NVIDIA has up-leveled all of its model cards to Model Card++s. For more information about Digital Fingerprinting and other models that are part of the Morpheus framework, see the Model Card++s published on GitHub.

To get hands-on experience with the Digital Fingerprinting AI workflow, apply to try Morpheus in NVIDIA LaunchPad. You can also get started with Morpheus in the /nv-morpheus GitHub repo or through the 90-day free trial.
