How to Run Docker Model Runner on Ubuntu 24.04


That Awesome AI Model You Found? You Can Run It in 5 Minutes.

Ever felt the excitement of discovering a powerful new AI model on a site like Hugging Face, followed by the familiar dread of actually trying to run it? You’re not alone. The journey from “Wow, this is cool!” to a working model on your machine is often a frustrating maze of dependency hell, environment conflicts, and cryptic error messages.

What if you could skip all that? What if running a sophisticated Large Language Model (LLM) was as simple as running any other standard piece of software?

That’s the exact problem Docker Model Runner is built to solve.

Think of it this way:

Before, getting an AI model to work was like hiring a world-class chef to cook in your home kitchen. You had to find the right chef (the model), then rush out to buy all the specific, exotic ingredients and weird-looking utensils they needed. If you bought the wrong brand of olive oil (a dependency mismatch), the whole recipe would fail. It’s complicated, messy, and you spend more time setting up than actually cooking.

Docker Model Runner is like a “Chef-in-a-Box” service.

It delivers the master chef and their entire professional, pre-configured kitchen right to your computer in a single, neat package.

  • No More Shopping for Weird Ingredients: The model comes packaged with every single library, tool, and dependency it needs to work perfectly. You just ask for it.
  • A Simple Way to Place Your Order: It automatically sets up a standard “service window” (an OpenAI-compatible API) so your other applications can easily talk to the model without needing a special translator.
  • Keeps Your Space Tidy: It manages all these AI “kitchens” in one place, so you don’t have different projects with conflicting tools making a mess of your system.

In short, Docker Model Runner takes the complex, frustrating setup process of running local AI models and makes it simple, standardized, and secure.

Availability: Beta

Requires: Docker Engine, Docker Desktop 4.41+ (Windows), or Docker Desktop 4.40+ (macOS)

Remember That Time You Tried to Share Your Code Project?

Picture this:

You’ve built something amazing – maybe a cool web app or a data analysis script. You’re excited to share it with your friend or colleague. You send them the code and say, “Just run it!”

Then comes the inevitable text: “It’s not working on my computer.”

Sound familiar?

Your friend has a different operating system, different versions of Python libraries, or maybe they’re missing some obscure dependency you installed months ago and forgot about. Suddenly, your “simple” project becomes a debugging nightmare.

Docker Model Runner solves the exact same problem, but for AI models.

The AI Model Nightmare (Before Docker Model Runner)

Imagine you discover an incredible AI model that can write poetry, analyze your photos, or help you code. You’re excited to try it out, so you:

  1. Clone the repository – Easy enough
  2. Install Python dependencies – pip install -r requirements.txt – Still okay
  3. Download the model weights – 5GB download, but manageable
  4. Install CUDA drivers – Wait, you need an NVIDIA GPU?
  5. Install PyTorch with CUDA support – Different version than what you have
  6. Install additional libraries – Some conflict with your existing setup
  7. Configure environment variables – More setup
  8. Run the model – Finally! But it crashes with a cryptic error about missing libraries

Two hours later, you’re still debugging instead of using the AI model.

The Docker Model Runner Solution

With Docker Model Runner, it’s like ordering from a restaurant that delivers the entire meal, including the chef, the kitchen, and all the ingredients – ready to serve.

Docker Model Runner makes it easy to manage, run, and deploy AI models using Docker. Designed for developers, Docker Model Runner streamlines the process of pulling, running, and serving large language models (LLMs) and other AI models directly from Docker Hub or any OCI-compliant registry.

With seamless integration into Docker Desktop and Docker Engine, you can serve models via OpenAI-compatible APIs, package GGUF files as OCI Artifacts, and interact with models from both the command line and graphical interface.

Whether you’re building generative AI applications, experimenting with machine learning workflows, or integrating AI into your software development lifecycle, Docker Model Runner provides a consistent, secure, and efficient way to work with AI models locally.

Key features

  • Pull and push models to and from Docker Hub
  • Serve models on OpenAI-compatible APIs for easy integration with existing apps
  • Package GGUF files as OCI Artifacts and publish them to any Container Registry
  • Run and interact with AI models directly from the command line or from the Docker Desktop GUI
  • Manage local models and display logs

Let’s Get It Running on Ubuntu 24.04

To get this “Chef-in-a-Box” service working, we first need to install the main Docker platform. The commands below will update your system’s software list and install the basic tools needed to securely connect to Docker’s software library.

Oh, wait! Have you created an Ubuntu server yet?

Let’s get started with that, then. I’m using my AWS account and have created a simple Ubuntu 24.04 server with a key pair for SSH.

Hope things are working fine so far.

Let’s SSH into your server

ssh root@<ip>
ssh -i <pem-file/path> ubuntu@<ip> # use whichever method you normally use to connect to the server

Once you’ve SSHed into the server, run the usual update command:

sudo apt update && sudo apt upgrade -y

Docker Model Runner is supported on the following platforms:

  • macOS: Apple Silicon only (M1 or later)
  • Windows
  • Linux

How it works

Models are pulled from Docker Hub the first time they’re used and stored locally. They’re loaded into memory only at runtime when a request is made, and unloaded when not in use to optimize resources. Since models can be large, the initial pull may take some time — but after that, they’re cached locally for faster access.
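In practice (using the commands covered later in this guide), that means the first pull is the slow step, and everything afterwards hits the local cache:

docker model pull ai/smollm2 # first use: downloads the model from Docker Hub
docker model run ai/smollm2 # later uses: loads the cached model into memory on demand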

Ensure you have installed Docker Engine.

Let’s run the commands to install Docker Engine:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Install the Docker packages.

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y
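Optionally, you can verify that Docker Engine installed correctly with the standard hello-world check:

sudo docker run hello-world # prints a confirmation message if the engine is working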

Use root, or add the ubuntu user to the docker group:

sudo su # switch to root
sudo usermod -aG docker ubuntu # or add the ubuntu user to the docker group (log out and back in to apply)

Log in to Docker using your Docker Hub credentials:

docker login -u username

DMR is available as a package. To install it, run:

sudo apt-get update
sudo apt-get install docker-model-plugin -y

Check DMR version

docker model version
docker model --help # for available commands

Pull a model and run it

docker model run ai/smollm2

Ask questions and learn something new from your own model, published under Docker’s ai namespace.
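You can also pass a one-shot prompt as an argument instead of starting an interactive chat, for example:

docker model run ai/smollm2 "Give me a fun fact about whales."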

Check model list
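To see which models are stored locally:

docker model list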

Tag the model so it can be pushed to your own Docker Hub namespace:

docker model tag ai/smollm2 sevenajay/smollm2
Model "ai/smollm2" tagged successfully with "index.docker.io/sevenajay/smollm2:latest"

Push the model to Docker Hub so you can reuse it anywhere:

docker model push sevenajay/smollm2

Pull different models and test them:

docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
docker model pull ai/smollm2:360M-Q4_K_M

You can also directly package a model file in GGUF format as an OCI Artifact and publish it to Docker Hub.

# Download a model file in GGUF format, e.g. from HuggingFace
curl -L -o model.gguf https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/resolve/main/mistral-7b-v0.1.Q4_K_M.gguf

# Package it as OCI Artifact and push it to Docker Hub
docker model package --gguf "$(pwd)/model.gguf" --push myorg/mistral-7b-v0.1:Q4_K_M
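Once pushed, the packaged model can be pulled and run like any other (myorg is the example organization name from the command above):

docker model pull myorg/mistral-7b-v0.1:Q4_K_M
docker model run myorg/mistral-7b-v0.1:Q4_K_M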

Integrate Docker Model Runner into your software development lifecycle

You can now start building your Generative AI application powered by the Docker Model Runner.
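For example, once a model is running you can call it through the OpenAI-compatible endpoint. This is a minimal sketch assuming the default host-side TCP endpoint on port 12434 used by Docker Engine installs; on Docker Desktop you may need to enable host-side TCP support first, and the exact path can vary by version:

curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2",
    "messages": [
      {"role": "user", "content": "Say hello from Docker Model Runner."}
    ]
  }'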

If you want to try an existing GenAI application, follow these instructions.

  1. Set up the sample app. Clone the following repository:

git clone https://github.com/docker/hello-genai.git

2. In your terminal, navigate to the hello-genai directory:

cd hello-genai

3. Run run.sh to pull the chosen model and start the app(s):

./run.sh

4. Open your app in the browser at http://<server-public-ip>:8081 (open the port in your security group)

You’ll see the GenAI app’s interface where you can start typing your prompts.

You can now interact with your own GenAI app, powered by a local model. Try a few prompts and notice how fast the responses are — all running on your machine with Docker.

The sample apps run on different ports.

The Future of AI Development is Here

As we wrap up this journey into Docker Model Runner, it’s clear that we’re witnessing a fundamental shift in how we interact with AI models. What used to be a complex, time-consuming process of environment setup and dependency management has been transformed into a simple, streamlined experience.

What We’ve Accomplished

In this guide, you have:

  • ✅ Installed Docker Engine on Ubuntu 24.04
  • ✅ Set up Docker Model Runner for seamless AI model management
  • ✅ Learned how to pull and run AI models with a single command
  • ✅ Discovered how to serve models via OpenAI-compatible APIs
  • ✅ Gained the ability to package and share your own AI models


Ajay Kumar Yegireddi is a DevSecOps Engineer and System Administrator with a passion for sharing real-world DevSecOps projects and tasks. Mr. Cloud Book provides hands-on tutorials and practical insights to help others master DevSecOps tools and workflows. The content is designed to bridge the gap between development, security, and operations, making complex concepts easy to understand for both beginners and professionals.
