Traditional cybersecurity methods include creating barriers around your infrastructure to protect it from intruders with ill intentions. However, as enterprises continue along the path of digital transformation, faced with a proliferation of devices, more sophisticated cybersecurity attacks, and an incredibly vast network of data to protect, new cybersecurity methodologies must be explored.
An alternative approach is to address cybersecurity as a data science problem. Aim to better understand all the users and activities across your network so that you can identify which transactions are typical and which are potentially nefarious.
Traditional cybersecurity solutions, including security information and event management (SIEM), collect logs that can be analyzed if a threat is detected. Real-time monitoring is done, but many enterprises monitor only a small fraction of the data they generate. It’s simply too large and computationally expensive to do otherwise.
Next-generation tools and solutions must be distributed, relying not only on central analysis and collection but on edge compute as well. They must also aim to recognize potential threats before they become disruptive, by looking at real-time behavior and alerting security operations teams of issues immediately.
Enterprises can use the massive amounts of data within their infrastructure to create this type of proactive cybersecurity posture, but doing so requires powerful and complex tools. The NVIDIA Morpheus AI application framework, built on NVIDIA RAPIDS and NVIDIA AI, gives cybersecurity developers and practitioners the means to build solutions that perform at a scale never before possible. Combined with NVIDIA GPU and DPU accelerators and DOCA telemetry in NVIDIA-Certified servers, it can bring a new level of security to data centers.
Unsupervised learning enables the detection of threats you can’t see
The NVIDIA Morpheus AI framework lets you harness the power of GPU computing to create fine-grained models and deploy them at scale. We demonstrate this capability with a new, pretrained workflow designed to analyze the behavior of every human and machine across the network to detect anomalous behavior. On any typical day, users across the entire enterprise are accessing multiple accounts and applications to complete their daily work. In addition to these human actions, there are hundreds of thousands of automated actions initiated by machines, which create a massive volume of data that makes up most of the traffic on today’s networks.
However, if a human were to masquerade as a machine and take control of a machine account to perform unauthorized actions, it would be nearly impossible to detect with traditional cybersecurity measures.
With NVIDIA Morpheus, however, you can implement unsupervised learning on a massive scale that was previously impossible, enabling you to learn from the data on the network. This means that you can watch all the actions on the network and learn what is good or bad, without having to label the actions this way in advance. For every single user, account, token, and machine interacting with a system, you can learn the typical behavioral patterns across multiple dimensions.
You create two models for each of these combinations: a time-series model and a sequential model. After cleaning the data, you build the time-series model, which captures the expected periodicity of activity for a given combination of user, machine, and account. The second model captures sequential activity using an autoencoder. In effect, it learns the set and series of actions that a given combination of user, machine, and account typically performs on the network.
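As a rough illustration of the two-model idea, the sketch below pairs a temporal profile with a sequence profile for one combination of user, machine, and account. It is not the Morpheus implementation: an hour-of-day frequency table stands in for the time-series model, and a smoothed transition-count model stands in for the autoencoder. The class names, entity names, and action names are all hypothetical.

```python
from collections import Counter, defaultdict
import math


class HourlyProfile:
    """Stand-in for the time-series model: activity frequency per hour of day."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def fit(self, hours):
        for hour in hours:
            self.counts[hour % 24] += 1
            self.total += 1

    def surprise(self, hour):
        # Laplace-smoothed negative log-probability of seeing activity at this hour.
        p = (self.counts[hour % 24] + 1) / (self.total + 24)
        return -math.log(p)


class TransitionProfile:
    """Stand-in for the sequence model: which action usually follows which."""

    def __init__(self):
        self.next_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, actions):
        self.vocab.update(actions)
        for prev, cur in zip(actions, actions[1:]):
            self.next_counts[prev][cur] += 1

    def surprise(self, actions):
        # Mean smoothed negative log-probability per observed transition.
        pairs = list(zip(actions, actions[1:]))
        if not pairs:
            return 0.0
        v = len(self.vocab) + 1  # +1 reserves probability mass for unseen actions
        total = 0.0
        for prev, cur in pairs:
            row = self.next_counts.get(prev, Counter())
            p = (row[cur] + 1) / (sum(row.values()) + v)
            total += -math.log(p)
        return total / len(pairs)


# Hypothetical combination: a backup service account that runs at night.
key = ("svc-backup", "host-17", "acct-42")
temporal = HourlyProfile()
temporal.fit([2, 2, 3, 2, 3, 2])
sequence = TransitionProfile()
sequence.fit(["login", "list", "read", "logout"] * 5)

# Higher scores mean the behavior is less like what was learned.
print(key, temporal.surprise(14), sequence.surprise(["login", "delete", "export"]))
```

Keeping the two scores separate mirrors the text: a change in either the timing or the order of actions can be flagged on its own.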
You set the learning time to a 72-hour period, for example, and then flip to inference mode. Morpheus now can deploy and orchestrate this large number of digital fingerprints to detect behavioral changes in one of the two models for a given combination. Then, if a human is trying to take over a machine account, it’s instantly flagged for further inspection by security operations. This use case demonstrates how much data Morpheus can potentially handle, but also the number of models involved: hundreds of thousands or even millions of individual models must be managed.
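The switch from learning to inference can be sketched with a small registry that keeps one fingerprint per combination. This is a minimal sketch under stated assumptions, not Morpheus code: the learning window is assumed to start when a combination is first seen, the per-entity model is reduced to a set of known actions, and every class and field name is hypothetical.

```python
from collections import Counter

LEARNING_WINDOW_S = 72 * 3600  # the 72-hour example from the text


class EntityFingerprint:
    """Toy per-entity model: learns for a fixed window, then scores."""

    def __init__(self, first_seen):
        self.first_seen = first_seen
        self.known_actions = Counter()

    def in_learning(self, ts):
        return ts - self.first_seen < LEARNING_WINDOW_S

    def observe(self, ts, action):
        """Return None while learning; afterwards True if the action is unfamiliar."""
        if self.in_learning(ts):
            self.known_actions[action] += 1
            return None
        return action not in self.known_actions


class FingerprintRegistry:
    """Holds one fingerprint per (user, machine, account) combination."""

    def __init__(self):
        self.models = {}

    def observe(self, user, machine, account, ts, action):
        key = (user, machine, account)
        if key not in self.models:
            self.models[key] = EntityFingerprint(first_seen=ts)
        return self.models[key].observe(ts, action)


registry = FingerprintRegistry()
registry.observe("svc-backup", "host-17", "acct-42", 0, "read")
registry.observe("svc-backup", "host-17", "acct-42", 3600, "write")
# After the window closes, an unfamiliar action is flagged.
flag = registry.observe("svc-backup", "host-17", "acct-42", LEARNING_WINDOW_S + 1, "delete")
print(flag)
```

A dictionary keyed by combination is the simplest way to show why the model count grows with the number of actors on the network.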
Continuous monitoring of models and their performance is critical. Morpheus now includes the ability to automatically monitor concept drift (a type of model drift). Using the new concept drift node, you can look for concept drift and send the results to MLflow, a common MLOps platform.
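To make the idea of concept drift concrete, the sketch below flags a shift in a monitored metric, such as a model's reconstruction error, relative to a reference period. It does not reproduce the Morpheus concept drift node, whose method is not described here. The swapped-in technique is a simple z-test on the mean of a sliding window, and the class name, window size, and threshold are illustrative assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class MeanShiftDriftDetector:
    """Flags drift when the recent window mean departs from the reference mean."""

    def __init__(self, reference, window=20, threshold=3.0):
        self.ref_mean = mean(reference)
        self.ref_std = pstdev(reference) or 1e-9  # avoid division by zero
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def update(self, value):
        """Add one observation; return True if drift is currently detected."""
        self.recent.append(value)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough recent data yet
        # z-score of the window mean under the reference distribution
        std_err = self.ref_std / len(self.recent) ** 0.5
        z = abs(mean(self.recent) - self.ref_mean) / std_err
        return z > self.threshold


# Hypothetical error values: a stable reference period, then a shifted regime.
reference_errors = [0.10, 0.12, 0.09, 0.11] * 25
detector = MeanShiftDriftDetector(reference_errors, window=20)

stable_flags = [detector.update(v) for v in [0.10, 0.12, 0.09, 0.11] * 5]
drift_flags = [detector.update(0.30) for _ in range(20)]
print(any(stable_flags), any(drift_flags))
```

In a setup that uses MLflow, the drift flag or z-score could be recorded per step with `mlflow.log_metric`; that call is left out here so the sketch runs without extra dependencies.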
Accelerating cybersecurity AI on a massive scale
The latest release of NVIDIA Morpheus features an updated pipeline engineered specifically for cybersecurity, which enables the analysis of data two orders of magnitude faster than previously possible, providing improved accuracy and detection of threats. Morpheus runs on an NVIDIA GPU-accelerated server. For example, a server with a single A100 would provide performance up to 600X faster than a server without a GPU.
Additional pipeline improvements include asynchronous computation and a fiber-based programming approach that mitigates both I/O and GPU blocking. Morpheus is no longer bottlenecked by the Python Global Interpreter Lock (GIL).
Because the volume of security feeds is unpredictable, backpressure support is implemented throughout, with concurrent blocking queues between stages. Morpheus also supports distributed computation, using remote direct memory access (RDMA) and Unified Communication X (UCX) to transfer messages quickly and efficiently.
For the first time, Morpheus now also supports dynamic reconfiguration, all handled by a centralized orchestration service. This allows seamless scale-up and scale-out changes at runtime. With the Morpheus Python and C++ APIs, you can write programs as if they were sequential and still benefit from massive parallelism.
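The backpressure pattern described above, blocking queues between stages, can be sketched with the Python standard library. This is only an illustration of the general technique, not the Morpheus pipeline engine. The function names and queue sizes are hypothetical, and Python threads are used purely to keep the example self-contained.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def producer(out_q, items):
    for item in items:
        # put() blocks while the queue is full: this is the backpressure.
        out_q.put(item)
    out_q.put(SENTINEL)


def stage(in_q, out_q, fn):
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)
            return
        out_q.put(fn(item))


def run_pipeline(items, fn, maxsize=4):
    """Run source -> stage -> sink with bounded queues between stages."""
    q1 = queue.Queue(maxsize=maxsize)
    q2 = queue.Queue(maxsize=maxsize)
    threads = [
        threading.Thread(target=producer, args=(q1, items)),
        threading.Thread(target=stage, args=(q1, q2, fn)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = q2.get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


doubled = run_pipeline(range(10), lambda x: x * 2, maxsize=2)
print(doubled)
```

Because each queue is bounded, a burst at the source cannot grow memory without limit. The producer simply waits until the downstream stage has caught up.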
NVIDIA Morpheus lets you harness the power of GPU computing to both create these fine-grained models and deploy them at scale. You’re essentially creating customized AI for every actor on a network. Ingesting every piece of data on these networks is challenging, as is creating, using, and maintaining hundreds of thousands of models and signatures in real time. Morpheus enables you to build these complex pipelines and deploy models quickly, helping to protect networks in a way never before possible.
Threat detection with ever-improving accuracy and speed
NVIDIA Morpheus now also includes pretrained models for phishing detection. Phishing continues to be a large problem, affecting 75% of organizations across the globe, and in 2020 a staggering 74% of phishing attacks targeting US businesses were successful.
Traditional methods for detecting phishing emails rely on URL-only detection, complex lookups against known attacks, and following suspicious links in a sandbox environment. A better way of protecting the environment from phishing attacks would be to analyze the entire raw body of the email, including syntax and semantics of the text, and the structure of the email, in addition to the words and links used. This wasn’t previously possible, due to compute limits and the lack of generic tools to enable natural language processing (NLP) models to be deployed in cybersecurity environments seamlessly.
With NVIDIA Morpheus, this is now possible. The Morpheus phishing detection model analyzes the entire raw body of an email, just the URLs, or both, and feeds the data into a custom deep neural network (DNN) sequence classifier: the BERT model from Hugging Face. The fine-tuned model is then converted to TensorRT and loaded into Morpheus for inference.
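The three input options mentioned above (raw body, URLs only, or both) can be illustrated with a small preprocessing helper. This sketch covers only the preparation of text for a sequence classifier and does not load or run a BERT model. The function names, the URL pattern, and the way body and URLs are joined are assumptions, not the Morpheus preprocessing.

```python
import re

# Simplified URL pattern for illustration; real email parsing is more involved.
URL_RE = re.compile(r"https?://[^\s<>()]+", re.IGNORECASE)


def extract_urls(body):
    """Return URLs found in the text, with trailing punctuation stripped."""
    return [url.rstrip(".,;:!?") for url in URL_RE.findall(body)]


def build_classifier_input(body, mode="both"):
    """Build the text to hand to a sequence classifier.

    mode: "body" for the raw body, "urls" for URLs only, "both" for the
    body followed by the extracted URLs on a new line.
    """
    urls = " ".join(extract_urls(body))
    if mode == "body":
        return body
    if mode == "urls":
        return urls
    if mode == "both":
        return body + "\n" + urls
    raise ValueError("mode must be 'body', 'urls', or 'both'")


# Hypothetical email text using a reserved example domain.
email = (
    "Your account is locked. Verify at http://example.test/login now "
    "or visit https://example.test/help."
)
print(extract_urls(email))
print(build_classifier_input(email, mode="urls"))
```

The returned string would then be tokenized and passed to whichever fine-tuned classifier is in use.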
The new pipeline in Morpheus supports workflows that can run up to 67x faster than previous versions. Figure 3 shows the speedups for both the sensitive information detection (SID) workflow and the abnormal behavior profiling for crypto (ABP) workflow. The speedups result in fewer false positives, which translates to less time and fewer cycles wasted investigating them.
We’ve also improved the accuracy of existing models, such as the pretrained model used in the SID workflow. Using a larger and more diverse training set, this model now has a macro-F1 of 0.96 compared to a macro-F1 of 0.74 in the previous version. This represents an absolute gain of 0.22 in macro-F1 (22 points), or roughly a 30% relative improvement.
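For readers unfamiliar with the metric, macro-F1 is the unweighted mean of the per-class F1 scores. The sketch below computes it on a small made-up label set and then works through the arithmetic for the two reported scores, 0.74 and 0.96. The labels and predictions are illustrative only and are not SID data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels: "pii" marks sensitive text, "none" marks everything else.
y_true = ["pii", "pii", "none", "none", "none", "none"]
y_pred = ["pii", "none", "none", "none", "none", "pii"]
example_score = macro_f1(y_true, y_pred)

# Arithmetic on the two macro-F1 values reported in the text.
old_f1, new_f1 = 0.74, 0.96
absolute_gain = new_f1 - old_f1
relative_gain = absolute_gain / old_f1

print(example_score, round(absolute_gain, 2), round(relative_gain * 100, 1))
```

The difference between the two reported scores is 0.22 in absolute terms, which corresponds to a relative gain of about 29.7% over the earlier score.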
Summary
The new humans-as-machines, machines-as-humans workflow is available now through the Morpheus Early Access program. Also included are new pretrained models for phishing detection. Apply for early access.
NVIDIA Morpheus development partners are collaborating with NVIDIA to enable AI-based cybersecurity solutions. Join us at NVIDIA GTC this week to hear about how some of these partners are integrating NVIDIA-accelerated AI with their cybersecurity solutions.