Run AI models locally for free with Ollama

Learn how to run AI models on your computer for free, easily, and quickly with Ollama. This command-line tool lets you manage LLMs locally with simple commands.

By Maurício Cantú · Feb 10, 2025

Want to run AI models on your own computer, for free, in a super simple and fast way? Ollama is the answer.

Ollama is a command-line tool for running and managing LLMs locally with commands like run, stop, ps, pull, and push (yes, it’s like managing Docker containers).
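Here is what a few of those commands look like in practice (the model name is just an example):

# download a model ahead of time
ollama pull deepseek-r1:7b

# list the models currently loaded
ollama ps

# stop a running model
ollama stop deepseek-r1:7b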

How to get started:

  1. Install Ollama from the official website.
  2. Run ollama run <model-name>, for example:
ollama run deepseek-r1:7b
  • Choose a model with more or fewer parameters, depending on your machine’s capacity.
  • If the model is not installed yet, Ollama will automatically download and install it.

That’s it! A chat session will start in your terminal with the installed model.

Want a Local HTTP API to Integrate with a Frontend?

Ollama runs a local server on port 11434 and returns responses as a text stream:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Who was Alan Turing?"
}'

Need to disable streaming? Just add "stream": false to the request body.
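For instance, the same request with streaming disabled returns the whole answer as a single JSON response:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Who was Alan Turing?",
  "stream": false
}'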

Why is this tool so useful?

  • Great for developing and testing AI projects.
  • Useful for building local productivity tools that keep your data private.
  • You can even set up a local Copilot-style assistant in VS Code with different models.
  • You can also run it inside a container and deploy it to the cloud to make it available outside your local environment (see the sketch below)!
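A minimal sketch of that container route, using the official ollama/ollama image (this is the CPU-only variant; GPU setups need additional flags):

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# then start a model inside the container
docker exec -it ollama ollama run deepseek-r1:7b

The server is then reachable on port 11434 of the host, just like a native install.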