That Awesome AI Model You Found? You Can Run It in 5 Minutes.
Ever felt the excitement of discovering a powerful new AI model on a site like Hugging Face, followed by the familiar dread of actually trying to run it? You’re not alone. The journey from “Wow, this is cool!” to a working model on your machine is often a frustrating maze of dependency hell, environment conflicts, and cryptic error messages.
What if you could skip all that? What if running a sophisticated Large Language Model (LLM) was as simple as running any other standard piece of software?
That’s the exact problem Docker Model Runner is built to solve.
Think of it this way:
Before, getting an AI model to work was like hiring a world-class chef to cook in your home kitchen. You had to find the right chef (the model), then rush out to buy all the specific, exotic ingredients and weird-looking utensils they needed. If you bought the wrong brand of olive oil (a dependency mismatch), the whole recipe would fail. It’s complicated, messy, and you spend more time setting up than actually cooking.
Docker Model Runner is like a “Chef-in-a-Box” service
It delivers the master chef and their entire professional, pre-configured kitchen right to your computer in a single, neat package.
- No More Shopping for Weird Ingredients: The model comes packaged with every single library, tool, and dependency it needs to work perfectly. You just ask for it.
- A Simple Way to Place Your Order: It automatically sets up a standard “service window” (an OpenAI-compatible API) so your other applications can easily talk to the model without needing a special translator.
- Keeps Your Space Tidy: It manages all these AI “kitchens” in one place, so you don’t have different projects with conflicting tools making a mess of your system.
In short, Docker Model Runner takes the complex, frustrating setup process of running local AI models and makes it simple, standardized, and secure.
Availability: Beta
Requires: Docker Engine, Docker Desktop 4.41+ (Windows), or Docker Desktop 4.40+ (macOS)
Picture this:
You’ve built something amazing – maybe a cool web app or a data analysis script. You’re excited to share it with your friend or colleague. You send them the code and say, “Just run it!”
Then comes the inevitable text: “It’s not working on my computer.”
Sound familiar?
Your friend has a different operating system, different versions of Python libraries, or maybe they’re missing some obscure dependency you installed months ago and forgot about. Suddenly, your “simple” project becomes a debugging nightmare.
Docker Model Runner solves the exact same problem, but for AI models.
The AI Model Nightmare (Before Docker Model Runner)
Imagine you discover an incredible AI model that can write poetry, analyze your photos, or help you code. You’re excited to try it out, so you:
- Clone the repository – Easy enough
- Install Python dependencies – pip install -r requirements.txt – Still okay
- Download the model weights – 5GB download, but manageable
- Install CUDA drivers – Wait, you need an NVIDIA GPU?
- Install PyTorch with CUDA support – Different version than what you have
- Install additional libraries – Some conflict with your existing setup
- Configure environment variables – More setup
- Run the model – Finally! But it crashes with a cryptic error about missing libraries
Two hours later, you’re still debugging instead of using the AI model.
The Docker Model Runner Solution
With Docker Model Runner, it’s like ordering from a restaurant that delivers the entire meal, including the chef, the kitchen, and all the ingredients – ready to serve.
Docker Model Runner makes it easy to manage, run, and deploy AI models using Docker. Designed for developers, Docker Model Runner streamlines the process of pulling, running, and serving large language models (LLMs) and other AI models directly from Docker Hub or any OCI-compliant registry.
With seamless integration into Docker Desktop and Docker Engine, you can serve models via OpenAI-compatible APIs, package GGUF files as OCI Artifacts, and interact with models from both the command line and graphical interface.
Whether you’re building generative AI applications, experimenting with machine learning workflows, or integrating AI into your software development lifecycle, Docker Model Runner provides a consistent, secure, and efficient way to work with AI models locally.
Key features
- Pull and push models to and from Docker Hub
- Serve models on OpenAI-compatible APIs for easy integration with existing apps
- Package GGUF files as OCI Artifacts and publish them to any Container Registry
- Run and interact with AI models directly from the command line or from the Docker Desktop GUI
- Manage local models and display logs
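These features come together in the OpenAI-compatible API. As a quick sketch, here’s what a chat request looks like with curl, assuming Model Runner is listening on its default host TCP port 12434 and the ai/smollm2 model has already been pulled (both are assumptions; your port and model may differ):

```shell
# Hedged example: query Docker Model Runner's OpenAI-compatible endpoint.
# Assumes the default TCP port 12434 and an already-pulled ai/smollm2 model.
curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2",
    "messages": [
      {"role": "user", "content": "Say hello in one short sentence."}
    ]
  }'
```

Because the API follows the OpenAI schema, existing OpenAI client libraries can simply be pointed at this base URL without code changes.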
Let’s Get It Running on Ubuntu 24.04
To get this “Chef-in-a-Box” service working, we first need to install the core Docker platform; that’s the essential first step to prepare your system. The commands below update your system’s software list and install the basic tools needed to securely connect to Docker’s package repository.
Oh, wait. Did you already create an Ubuntu server?
Let’s get that sorted first. I’m using my AWS account, where I created a simple Ubuntu 24.04 server with a key pair for SSH.
Hopefully everything is working fine so far.
Let’s SSH into your server
ssh root@<ip>
ssh -i <pem-file/path> ubuntu@<ip>   # use whichever method you normally use to connect

Once you’re on the server, run the usual update command:
sudo apt update && sudo apt upgrade -y

Docker Model Runner is supported on the following platforms:
- macOS (Apple Silicon)
- Windows
- Linux
How it works
Models are pulled from Docker Hub the first time they’re used and stored locally. They’re loaded into memory only at runtime when a request is made, and unloaded when not in use to optimize resources. Since models can be large, the initial pull may take some time — but after that, they’re cached locally for faster access.
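You can observe this lifecycle from the CLI. A rough sketch (these are beta commands, so the exact output format may change):

```shell
docker model status            # is the Model Runner backend up?
docker model pull ai/smollm2   # first pull downloads and caches the model locally
docker model list              # shows cached models and their sizes
```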
Ensure you have Docker Engine installed. Let’s run the commands to install it:
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Install the Docker packages.
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y
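Before moving on, it’s worth a quick sanity check that the Engine is actually up (a standard verification step, not specific to Model Runner):

```shell
sudo docker --version             # client installed?
sudo systemctl is-active docker   # daemon running? should print "active"
sudo docker run --rm hello-world  # optional end-to-end smoke test
```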

Use root (or add the ubuntu user to the docker group):
sudo su   # or: sudo usermod -aG docker ubuntu

Log in to Docker Hub with your credentials:
docker login -u username

Docker Model Runner (DMR) is available as a package. To install it, run:
sudo apt-get update
sudo apt-get install docker-model-plugin -y
Check DMR version
docker model version
docker model --help # for available commands

Pull a model and run it
docker model run ai/smollm2
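`docker model run` opens an interactive chat session by default. If you just want a one-shot answer, you can pass the prompt inline:

```shell
# One-shot prompt: prints the model's reply and exits
docker model run ai/smollm2 "Explain Docker in one sentence."
```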

Ask questions and learn something new from your own locally served model (the ai/ namespace models on Docker Hub are curated by Docker).

Check the list of local models:
docker model list

Tag model and push into your hub
docker model tag ai/smollm2 sevenajay/smollm2
# Output: Model "ai/smollm2" tagged successfully with "index.docker.io/sevenajay/smollm2:latest"

Push the model to Docker Hub so you (and others) can reuse it:
docker model push sevenajay/smollm2

Pull and test different models; you can reference GGUF repositories on Hugging Face directly, or pin a specific tag on Docker Hub:
docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
docker model pull ai/smollm2:360M-Q4_K_M
You can also directly package a model file in GGUF format as an OCI Artifact and publish it to Docker Hub.
# Download a model file in GGUF format, e.g. from HuggingFace
curl -L -o model.gguf https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/resolve/main/mistral-7b-v0.1.Q4_K_M.gguf
# Package it as OCI Artifact and push it to Docker Hub
docker model package --gguf "$(pwd)/model.gguf" --push myorg/mistral-7b-v0.1:Q4_K_M
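Once pushed, the packaged artifact is just another model reference, so pulling and running it looks the same as before (myorg/mistral-7b-v0.1 is the placeholder repository name from the push above):

```shell
docker model pull myorg/mistral-7b-v0.1:Q4_K_M
docker model run myorg/mistral-7b-v0.1:Q4_K_M "Write a haiku about containers."
```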
Integrate Docker Model Runner into your software development lifecycle
You can now start building your Generative AI application powered by the Docker Model Runner.
If you want to try an existing GenAI application, follow these instructions.
1. Set up the sample app. Clone the repository:

git clone https://github.com/docker/hello-genai.git

2. In your terminal, navigate to the hello-genai directory.
3. Run run.sh to pull the chosen model and start the app(s).
4. Open the app in your browser at <server-public-ip>:8081 (open the port in your security group first).
You’ll see the GenAI app’s interface where you can start typing your prompts.
You can now interact with your own GenAI app, powered by a local model. Try a few prompts and notice how fast the responses are — all running on your machine with Docker.


Note: the sample runs multiple app variants, each on a different port.

The Future of AI Development is Here
As we wrap up this journey into Docker Model Runner, it’s clear that we’re witnessing a fundamental shift in how we interact with AI models. What used to be a complex, time-consuming process of environment setup and dependency management has been transformed into a simple, streamlined experience.
What We’ve Accomplished
By following this guide, you’ve:
- ✅ Installed Docker Engine on Ubuntu 24.04
- ✅ Set up Docker Model Runner for seamless AI model management
- ✅ Learned how to pull and run AI models with a single command
- ✅ Discovered how to serve models via OpenAI-compatible APIs
- ✅ Gained the ability to package and share your own AI models