
How to Secure Ollama in Kubernetes: A Guide to Network Policies and RBAC
Ollama has rapidly become a favorite tool for developers looking to run large language models (LLMs) locally. Its simplicity is a major advantage—with just a single command, you can have a powerful model like Llama 3 or Mistral running on your machine. However, this same simplicity can introduce significant security risks when you move from your local machine to a shared environment like a Kubernetes cluster.
The primary issue is that the default Ollama deployment exposes an unauthenticated API endpoint. This means anyone with network access to the Ollama pod can interact with it, creating a major security vulnerability. In a multi-tenant cluster, this could allow other applications or users to hijack your GPU resources, access sensitive data being processed, or disrupt your service.
Fortunately, you can lock down your Ollama deployments using standard Kubernetes tools. By implementing a layered security approach with Network Policies and Role-Based Access Control (RBAC), you can run LLMs confidently and securely.
The Core Problem: The Open and Unauthenticated API
When you deploy Ollama in a Kubernetes cluster, it typically runs as a service accessible within the cluster’s network. Without any security controls, any other pod in any namespace might be able to send requests to your Ollama service.
This presents several threats:
- Resource Hijacking: Unauthorized services could run intensive tasks on your models, consuming valuable and expensive GPU cycles.
- Model Tampering: Malicious actors could potentially pull, remove, or otherwise tamper with the models you have loaded.
- Data Exposure: If you are using Ollama to process sensitive information, an open endpoint could lead to a data leak within the cluster.
To mitigate these risks, we need to control who can talk to Ollama and who can manage its deployment.
Step 1: Isolate Ollama with Kubernetes Network Policies
The first line of defense is at the network layer. Kubernetes Network Policies act as a firewall for your pods, allowing you to define explicit rules about what traffic is allowed to enter or leave. The best practice here is to adopt a zero-trust security model.
A zero-trust approach means you start by denying all traffic and then explicitly allow only the connections you need.
Implement a “Deny-by-Default” Policy: First, apply a network policy that denies all incoming (ingress) traffic to every pod in the namespace where Ollama is running. This immediately closes the door to all unauthorized connections.
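A deny-by-default policy can be expressed as a short manifest. This is a minimal sketch; the namespace name `ollama-namespace` and the policy name are illustrative assumptions, not values from the original article:

```yaml
# Denies all ingress traffic to every pod in the namespace.
# The namespace name "ollama-namespace" is illustrative.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ollama-namespace
spec:
  podSelector: {}        # empty selector = applies to all pods in the namespace
  policyTypes:
    - Ingress            # no ingress rules are listed, so all ingress is denied
```

An empty `podSelector` matches every pod in the namespace, and declaring `Ingress` in `policyTypes` with no accompanying rules denies all inbound traffic.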
Create a Specific “Allow” Policy for Ollama: Next, create a new network policy that specifically allows traffic to the Ollama pod, but only from trusted sources. You can define these trusted sources using pod labels. For example, you can create a rule that only allows ingress traffic to Ollama from pods with the label `app: my-frontend`.
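Such an allow rule might look like the following sketch. The labels `app: ollama` and `app: my-frontend` and the namespace name are assumptions to be adapted to your deployment; port 11434 is Ollama's default API port:

```yaml
# Allows ingress to Ollama pods only from pods labeled app: my-frontend.
# Label values and the namespace name are illustrative assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-ollama
  namespace: ollama-namespace
spec:
  podSelector:
    matchLabels:
      app: ollama          # selects the Ollama pods this policy protects
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: my-frontend   # only these pods may connect
      ports:
        - protocol: TCP
          port: 11434            # Ollama's default API port
```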
By using this strategy, you ensure that only your intended applications can communicate with the Ollama service. All other pods in the cluster, even those in the same namespace, will be blocked by the default deny rule, effectively isolating your AI workload.
Step 2: Implement Role-Based Access Control (RBAC) for Management
Network security is crucial for runtime traffic, but you also need to control who has administrative access to your Ollama resources. Who can delete the Ollama pod? Who can change its configuration or view its logs? This is where Role-Based Access Control (RBAC) comes in.
RBAC is guided by the Principle of Least Privilege, which states that any user or process should only have the minimum permissions necessary to perform its function.
Instead of giving developers or CI/CD pipelines broad `cluster-admin` permissions, you should create a tightly scoped role for managing the Ollama deployment. This involves three key Kubernetes objects:
- ServiceAccount: An identity for your application or automation script.
- Role: A set of permissions defining what actions are allowed (e.g., `get`, `list`, `patch` pods) on specific resources within a namespace.
- RoleBinding: This connects the `ServiceAccount` (the “who”) to the `Role` (the “what”).
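The three objects above fit together as in this sketch; the names (`ollama-operator`, `ollama-pod-manager`) and namespace are hypothetical placeholders:

```yaml
# Illustrative names; adjust to your environment.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ollama-operator
  namespace: ollama-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ollama-pod-manager
  namespace: ollama-namespace
rules:
  - apiGroups: [""]            # "" is the core API group, where Pods live
    resources: ["pods"]
    verbs: ["get", "list", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ollama-operator-binding
  namespace: ollama-namespace
subjects:
  - kind: ServiceAccount       # the "who"
    name: ollama-operator
    namespace: ollama-namespace
roleRef:
  kind: Role                   # the "what"
  name: ollama-pod-manager
  apiGroup: rbac.authorization.k8s.io
```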
For an Ollama deployment, you could create a `Role` that only allows managing `Deployments`, `Services`, and `Pods` within the specific `ollama-namespace`. By assigning this narrow role to the `ServiceAccount` used by your deployment pipeline, you prevent unauthorized or accidental changes to your critical AI infrastructure. A compromised key or a buggy script would be restricted to this small blast radius, unable to affect other parts of the cluster.
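A narrowly scoped pipeline role along those lines could be sketched as follows (the role name and verb list are assumptions; grant only the verbs your pipeline actually needs):

```yaml
# A tightly scoped Role for a deployment pipeline. Names are illustrative.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ollama-deployer
  namespace: ollama-namespace
rules:
  - apiGroups: ["apps"]        # Deployments live in the "apps" API group
    resources: ["deployments"]
    verbs: ["get", "list", "create", "update", "patch"]
  - apiGroups: [""]            # core group: Services and Pods
    resources: ["services", "pods"]
    verbs: ["get", "list", "create", "update", "patch"]
```

Because this is a `Role` rather than a `ClusterRole`, every permission it grants is confined to `ollama-namespace`.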
A Layered Strategy for Robust Security
Neither Network Policies nor RBAC alone is a complete solution. True security comes from layering these controls together:
- Network Policies control runtime traffic between pods, preventing unauthorized applications from accessing the Ollama API.
- RBAC controls administrative access, preventing unauthorized users or processes from managing the Kubernetes resources themselves.
Ollama is an incredibly powerful tool, but like any service deployed in a shared environment, it must be properly secured. By taking these proactive steps to implement Network Policies and fine-grained RBAC, you can harness the power of local LLMs confidently, ensuring your deployments are stable, protected from abuse, and ready for production.
Source: https://collabnix.com/securing-ollama-deployments-networkpolicies-and-rbac-2/