Over 300M computed tomography (CT) scans are performed globally, 85M in the US alone. Radiologists are looking for ways to speed up their workflow and generate accurate reports, so a foundation model that can segment all organs and diseases would be helpful. Ideally, you'd also have an optimized way to run this model in production at scale.
NVIDIA Research has created a new foundation model to segment full-body CT images and has packaged it into a highly optimized container that can scale in deployment. In this post, we discuss the VISTA-3D foundation model, NVIDIA NIM, and how to run the model on your data.
VISTA-3D
Vision foundation models have attracted increasing interest. In the context of medical image analysis, two essential features make these models particularly practical:
- Fast and precise inference for common tasks
- Effective adaptation or zero-shot ability to novel tasks
NVIDIA has been focusing on 3D CT segmentation and recently developed VISTA-3D (Versatile Imaging SegmenTation and Annotation). The model is trained systematically on more than 12K volumes encompassing 127 types of human anatomical structures and various lesions (lung nodules, liver tumors, pancreatic tumors, colon tumors, bone lesions, and kidney tumors).
It provides accurate out-of-box segmentation as well as state-of-the-art, zero-shot interactive segmentation. The novel model design and training recipe represent a promising step toward developing a versatile medical image foundation model.
VISTA-3D is a domain-specialized interactive foundation model that combines semantic segmentation with interactivity, offering high accuracy and adaptability across diverse anatomical areas for medical imaging. It has the following core workflows:
- Segment everything: Enables whole body exploration, crucial for understanding complex diseases affecting multiple organs and for holistic treatment planning.
- Segment using class: Provides detailed sectional views based on specific classes, essential for targeted disease analysis or organ mapping, such as tumor identification in critical organs.
- Segment point prompts: Enhances segmentation precision through user-directed, click-based selection. This interactive approach accelerates the creation of accurate ground-truth data, essential in medical imaging analysis (see the payload sketch after this list).
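As a rough illustration of how these three workflows might map onto inference requests, the following sketch builds on the payload format used later in this post. The prompts, classes, and points fields and the point format are assumptions made for illustration only; check the VISTA-3D NIM API reference for the authoritative schema.

# The field names under "prompts" below are hypothetical, for illustration only.
image_url = "https://assets.ngc.nvidia.com/products/api-catalog/vista3d/example-1.nii.gz"

# Segment everything: no prompts, the model returns all supported classes.
segment_everything = {
    "image": image_url,
    "output": {"extension": ".nii.gz", "dtype": "uint8"},
}

# Segment using class: restrict the output to the classes of interest.
segment_by_class = {
    "image": image_url,
    "prompts": {"classes": ["liver", "spleen"]},  # assumed field name
    "output": {"extension": ".nii.gz", "dtype": "uint8"},
}

# Segment using point prompts: user clicks expressed as voxel coordinates.
segment_by_points = {
    "image": image_url,
    "prompts": {"points": {"liver": [[128, 160, 75]]}},  # assumed format
    "output": {"extension": ".nii.gz", "dtype": "uint8"},
}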
Figure 1 shows a high-level diagram of the VISTA-3D architecture, with an encoder layer followed by two decoder layers in parallel. One decoder is for the automatic segmentation while the other decoder is for the point prompt.
Each decoder takes its corresponding input, either class prompts or user point clicks, to guide the segmentation, and each produces its own segmentation result. The two results are then combined into the final segmentation by a merging algorithm.
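As a conceptual illustration only, not VISTA-3D's actual merging algorithm, the following sketch combines an automatic segmentation with a point-prompt mask by letting the user-guided result override the automatic labels wherever the two overlap:

import numpy as np

def merge_segmentations(auto_seg: np.ndarray, point_seg: np.ndarray) -> np.ndarray:
    # Toy merge: where the point-prompt decoder produced a label (> 0),
    # prefer it over the automatic segmentation; elsewhere keep the
    # automatic result. VISTA-3D's real merging algorithm is more involved.
    merged = auto_seg.copy()
    merged[point_seg > 0] = point_seg[point_seg > 0]
    return merged

# Small example with random label volumes (label 0 = background).
auto_seg = np.random.randint(0, 3, size=(4, 4, 4))
point_seg = np.zeros_like(auto_seg)
point_seg[1:3, 1:3, 1:3] = 5  # region refined by a user click
print(merge_segmentations(auto_seg, point_seg).max())  # prints 5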
For more information about the architecture, see the VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography research paper.
VISTA-3D NIM microservice
All NVIDIA NIM microservices are hosted on the NVIDIA API Catalog for you to test out different microservices and find out about their capabilities.
Find VISTA-3D under Healthcare and test it out with sample data. View the test dataset in the axial, coronal, or sagittal view. VISTA-3D can segment over 100 organs, or you can select specific classes of interest.
Using NIM microservices hosted by NVIDIA
You can run VISTA-3D on your data using the NIM microservice hosted by NVIDIA. Sign up to get a personal key. NVIDIA gives users 1000 free credits to try out any of the NIM microservices.
Give your personal key a name and an expiration date, and include AI Foundation Models and Endpoints in the included services. You use this key for all API calls. For more information, see Optionally Generate an NGC Key.
To check on your credits, log in to the NVIDIA API Catalog and check your profile. Your credits are listed at the upper right.
To make your first VISTA-3D NIM microservice call, copy the code for your favorite language (shell script, Python, or Node.js) and use the recently generated API_KEY value to run inference on the same sample that you used on the NVIDIA AI solution page, similar to the following Python example.
import requests

invoke_url = "https://health.api.nvidia.com/v1/medicalimaging/nvidia/vista-3d"
headers = {"Authorization": "Bearer <place your key here, it should start with nvapi->"}

filename = "example-1"
payload = {
    "image": f"https://assets.ngc.nvidia.com/products/api-catalog/vista3d/{filename}.nii.gz",
    "output": {"extension": ".nii.gz", "dtype": "uint8"},
}

session = requests.Session()
response = session.post(invoke_url, headers=headers, json=payload)
response.raise_for_status()

file_name = f"{filename}_seg.nii.gz"
with open(file_name, "wb") as file:
    file.write(response.content)
print("File downloaded successfully!")
Congratulations, you just ran your first VISTA-3D NIM microservice call on the sample data. This also confirms that your API_KEY value works and that you have enough credits, so you are now ready to test with your own data.
Each inference call uses one credit and your credits are shared across all NIM microservices.
Running VISTA-3D with your data
To run VISTA-3D inference on your own data, you must set up an HTTP server to serve your medical images.
Unlike LLMs, which take a small, compressible text payload, medical images are typically large. Instead of sending large images in the API payload, you send the URL of the image on your HTTP server. The VISTA-3D NIM microservice then downloads the image from that server, runs inference, and sends back the result (Figure 2).
Share files on GitHub
The simplest and fastest way to share sample data is to use GitHub:
- Log into your GitHub account.
- Create a new project and make sure that it is public.
- Choose Upload file and upload some NIfTI files with the .nii or .nii.gz extension. Make sure that each file is <25 MB.
- Select your NIfTI file, right-click Raw, and then choose Copy link address.
The resulting URL should look something like the following example:
https://github.com/<your_user_name>/Nifti_samples/raw/main/filename.nii.gz
You can now change the URL in the earlier code example to point to your file and VISTA-3D downloads it.
payload = {
    "image": "https://github.com/<your_user_name>/Nifti_samples/raw/main/filename.nii.gz",
    "output": {"extension": ".nii.gz", "dtype": "uint8"},
}
Share your sample data in the cloud
If you already have some data in the cloud, you can expose a small number of NIfTI images to test VISTA-3D. In that case, start a simple NGINX server or a simple Python HTTP server that serves a directory on your local file system. The server or port must be publicly accessible so that the VISTA-3D microservice hosted by NVIDIA can download the data.
First, connect to the server using SSH, move some sample data, and then start a simple Python HTTP file server:
python -m http.server <port>
You should be able to access the files from a browser at http://<server_ip>:<port>/.
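Before calling the NIM microservice, you can also confirm from a script that a file is reachable at its URL. A minimal check, assuming you replace the placeholder server address and file name with your own:

import requests

# Hypothetical URL: replace with your server's public IP, port, and file name.
url = "http://<server_ip>:<port>/example.nii.gz"
response = requests.head(url)
print(response.status_code)  # expect 200 if the file is being served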
In the inference code from earlier, change the following lines to point to your HTTP server.
filename="<file_on_http"
payload = {
"image": f"http://<your server ip:port /{filename}.nii.gz",
You should see the HTTP file server respond to the NIM microservice's request to download the image file, and then you receive the inference result.
Running NIM microservices locally
To get started running NIM microservices locally, apply for NVIDIA NIM access. After you've applied, the NVIDIA team will contact you to schedule an onboarding meeting. You need an NVIDIA AI Enterprise (NVAIE) license or a trial license, but wait for instructions before applying for one.
After completing these steps and receiving approval, you’ll gain access to the VISTA-3D NIM microservice Docker container. This enables you to run the microservice on your preferred hardware, whether locally or in the cloud.
In the following sections, we show you an example setup using Docker Compose to help you get up and running quickly.
Prerequisites
You should have the Docker, Docker Compose, and NVIDIA drivers installed. To check, run the following command, which pulls a small image of the latest CUDA release, starts the Docker container, and then checks that you can access the GPU from within the container.
docker run --rm --gpus all nvidia/cuda:12.5.0-base-ubuntu20.04 nvidia-smi
You should see an output showing the GPUs that you have on your system.
Docker Compose file
The following Docker Compose example starts two containers:
- nim-vista to run inference
- nim-nginx to serve your images
# docker-compose.yml
version: "3.9"
services:
  nim:
    container_name: nim-vista
    image: nvcr.io/nvidia/nim/medical_imaging_vista3d:24.03
    ports:
      - 8008:8008
  ##############################
  nginx:
    container_name: nim-nginx
    image: nginx:1.19-alpine-perl
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - <local/folder/containing/nifti/files>:/files
    ports:
      - 8009:8009
Change <local/folder/containing/nifti/files> to the local directory that contains your NIfTI files.
Nginx basic configuration
In the same folder, create the nginx.conf file:
worker_processes auto;
pid /etc/nginx/.nginx.pid;

events {
    worker_connections 768;
}

http {
    server {
        listen 8009;     # internal port
        root /files;     # static file directory
        autoindex on;    # enable directory listing

        location / {
            try_files $uri $uri/ =404;
        }
    }
}
Run the inference call
To confirm that your setup is correct, run the following curl command:
curl http://localhost:8008/health/ready
This command should return true, indicating that the VISTA-3D NIM microservice is up and running.
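If you script your setup, you can poll the same readiness endpoint instead of running curl by hand. A minimal sketch using only the endpoint shown above:

import time
import requests

def wait_until_ready(url="http://localhost:8008/health/ready", timeout=120):
    # Poll the VISTA-3D NIM readiness endpoint until it responds successfully.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=5).ok:
                return True
        except requests.ConnectionError:
            pass  # container is still starting
        time.sleep(5)
    return False

print(wait_until_ready())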
To confirm that nginx is serving your data correctly, open http://localhost:8009 in your browser to see the folders and files that you are sharing.
Now you can reuse the same code from earlier to run inference, with the minor change of pointing to nginx to serve your data instead of pointing to NVIDIA sample data or your sample files on GitHub.
import requests

invoke_url = "http://localhost:8008/vista3d/inference"
headers = {}

filename = "test_w_nginx"
payload = {
    "image": "http://localhost:8009/<file path as shown on the nginx page>.nii.gz",
    "output": {"extension": ".nii.gz", "dtype": "uint8"},
}

session = requests.Session()
response = session.post(invoke_url, headers=headers, json=payload)
response.raise_for_status()

file_name = f"{filename}_seg.nii.gz"
with open(file_name, "wb") as file:
    file.write(response.content)
print("File downloaded successfully!")
The invoke_url value now points to your local host, and headers is empty because you don't need to authenticate with the NGC key.
After this runs, you should have the organ segmentation file with all the labels.
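To sanity-check the result, you can open the returned NIfTI file and list the label values it contains. This example uses the nibabel package, which is not part of the NIM setup and is assumed to be installed separately (pip install nibabel):

import nibabel as nib
import numpy as np

seg = nib.load("test_w_nginx_seg.nii.gz")  # file written by the snippet above
labels = np.unique(seg.get_fdata().astype(np.uint8))
print(f"Segmentation shape: {seg.shape}")
print(f"Label values present: {labels}")  # 0 is the background label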
Security tip
When you have this running, you probably don't want to leave the NGINX port open, serving your data to anyone with access to the host.
Instead, limit access to the VISTA-3D NIM microservice only. Edit the docker-compose.yml file and remove the nginx port mapping by deleting the ports: and - 8009:8009 lines.
Next, change the inference call to use nim-nginx instead of localhost, which is no longer accessible.
"image":fhttps://nim-nginx:8809/{filename}.nii.gz
The internal name nim-nginx is accessible to the NIM container because Docker Compose places all services on the same network.
Conclusion
In this post, we introduced the new NVIDIA AI Foundation model, VISTA-3D, which can segment over 100 organs and multiple diseases in a CT image. We also showed you how NVIDIA NIM simplifies using this model.
Interested in enhancing your CT image analysis? Apply for access to the VISTA-3D NIM microservice. After being approved, you can run this powerful model on your own hardware, improving your segmentation accuracy and streamlining your workflow.