Ollama Model Management - Pull, List, and Delete Models via Python API
Author: Venkata Sudhakar
ShopMax India runs multiple AI services on local infrastructure across data centres in Mumbai and Hyderabad. Manually managing which Ollama models are installed on each server is error-prone. The Ollama Python library provides programmatic model management, so deployment scripts can ensure the right models are present before inference services start.
The Ollama Python client exposes methods for pulling new models, listing installed models with their sizes, and deleting unused models. These calls communicate with the Ollama REST API running on localhost or a remote server. You can integrate model management into CI/CD pipelines and startup scripts to guarantee model availability before serving requests.
The example below shows a ShopMax India deployment helper that checks for required models, pulls any that are missing, and reports disk usage.
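A minimal sketch of such a helper, assuming ollama-python 0.4 or later, where client.list() returns a response whose .models entries carry .model and .size attributes. The human_gb formatting helper is an illustrative name, not part of the library:

```python
REQUIRED_MODELS = ["llama3.2", "nomic-embed-text"]

def human_gb(size_bytes: int) -> str:
    """Format a byte count as gigabytes with one decimal place."""
    return f"{size_bytes / 1024 ** 3:.1f} GB"

def ensure_models(client) -> None:
    """Report installed models, then pull any required model that is missing."""
    models = client.list().models
    installed = {m.model for m in models}  # e.g. {"llama3.2:latest", ...}
    print("Installed models:")
    for m in models:
        print(f"{m.model} - {human_gb(m.size)}")
    for name in REQUIRED_MODELS:
        # Installed tags default to ":latest", so match the bare name too.
        if name not in installed and f"{name}:latest" not in installed:
            client.pull(name)
        print(f"Model ready: {name}")

if __name__ == "__main__":
    import ollama  # imported here so the helpers above stay dependency-free
    # Client() should pick up OLLAMA_HOST when set; defaults to localhost:11434.
    ensure_models(ollama.Client())
```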
It gives the following output:
Installed models:
llama3.2:latest - 2.0 GB
nomic-embed-text:latest - 0.3 GB
Model ready: llama3.2
Model ready: nomic-embed-text
Use stream=True when pulling large models to show download progress and avoid connection timeouts. For ShopMax India's multi-server deployments, set OLLAMA_HOST to point at the right server and run ensure_models() as a pre-flight check in your service startup script. Schedule periodic cleanup with client.delete() for model versions no longer in your REQUIRED_MODELS list to reclaim disk space.
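The streamed pull and periodic cleanup could be sketched as follows, assuming ollama-python 0.4+, where a streamed pull yields progress objects with status, completed, and total fields. The helper names stale_models and pull_with_progress are hypothetical:

```python
REQUIRED_MODELS = {"llama3.2:latest", "nomic-embed-text:latest"}

def stale_models(installed):
    """Installed model names that are no longer in REQUIRED_MODELS."""
    return [name for name in installed if name not in REQUIRED_MODELS]

def pull_with_progress(client, name):
    """Stream a pull so long downloads show progress instead of timing out."""
    for update in client.pull(name, stream=True):
        if update.total:  # layer download updates carry completed/total bytes
            pct = 100 * (update.completed or 0) / update.total
            print(f"\r{name}: {update.status} {pct:.0f}%", end="")
    print(f"\r{name}: pull complete")

def cleanup(client):
    """Delete installed models that have dropped out of REQUIRED_MODELS."""
    for name in stale_models(m.model for m in client.list().models):
        client.delete(name)  # reclaims disk space on the Ollama host
        print(f"Deleted: {name}")

if __name__ == "__main__":
    import ollama
    client = ollama.Client()  # honours OLLAMA_HOST when set
    cleanup(client)
```

Running cleanup() on a schedule (for example from cron) keeps each server's disk usage bounded to the current required set.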