Want to run AI models on your own computer, for free, in a super simple and fast way? Ollama is the answer.
Ollama is a command-line tool for running and managing LLMs locally with commands like run, stop, ps, pull, and push (yes, it’s like managing Docker containers).
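A quick tour of the everyday commands, to show the Docker parallel (the model tag below is just an example, and the push target is a placeholder for your own namespace):

ollama pull deepseek-r1:7b    # download a model without starting a chat
ollama list                   # show the models you have downloaded
ollama ps                     # show models currently loaded in memory
ollama stop deepseek-r1:7b    # unload a running model
ollama push <namespace>/<model>    # publish a model to a registry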
How to get started:
- Install Ollama from the official website.
- Run ollama run <model-name>, for example:
ollama run deepseek-r1:7b
- Choose a model with more or fewer parameters depending on your machine's capacity (the 7b tag means 7 billion parameters; smaller tags need less RAM and run on more modest hardware).
- If the model is not installed yet, Ollama will automatically download and install it.
That’s it! A chat session will start in your terminal with the installed model.
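Inside the session, Ollama's interactive prompt also accepts a few slash commands, for example:

>>> /?      # list the available commands
>>> /bye    # exit the chat session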
Want a Local HTTP API to Integrate with a Frontend?
Ollama also runs a local HTTP server on port 11434. By default, the /api/generate endpoint streams the response back as a series of JSON objects, one per line:
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Who was Alan Turing?"
}'
Need to disable streaming? Just add "stream": false to the request body.
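For example, this variant of the request above returns one JSON object instead of a stream:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Who was Alan Turing?",
  "stream": false
}'

The generated text arrives in the response field of the returned object, which is convenient when the consumer is a simple frontend fetch call rather than a stream reader.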
Why is this tool so useful?
- Great for developing and testing AI projects.
- Useful for building local productivity tools where privacy matters, since nothing leaves your machine.
- You can even set up a local Copilot-style assistant in VS Code, backed by whichever models you prefer.
- You can also run it inside a container and deploy it to the cloud, making it available outside your local environment (see the sketch below).
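As a minimal sketch of the container route, following the pattern from the official ollama/ollama Docker image (the volume and container names are up to you):

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run deepseek-r1:7b

From there, the same HTTP API shown above is exposed on port 11434 of the host or cloud instance, so a deployed frontend can talk to it directly. GPU support needs extra flags (e.g. --gpus=all with the NVIDIA container toolkit), so check the image docs for your setup.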