Follow these steps to run an Ollama server and set up a conda environment to use it with Python.
Initial setup
Perform these steps on the head node.
- Download and install the Ollama server
mkdir ~/ollama
cd ~/ollama
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
tar xvf ollama-linux-amd64.tgz
- Set up a conda environment and install the ollama Python library. If you have not yet set up Anaconda for the first time, please follow the steps outlined here: Initialize Anaconda
conda create --name ollama-test python=3.12 anaconda
conda activate ollama-test
pip install ollama
conda deactivate
At this point, everything is ready to use. Each time you want to use it, first initialize the environment as follows.
Environment Initialization
- Start a Slurm job
srun --time=1-00:00:00 --partition=gpu --ntasks=1 --gres=gpu:1 --x11 --pty /bin/bash -i
- Start the Ollama server
cd ~/ollama
nohup bin/ollama serve > ollama.log 2>&1 &
- Pull any models you want to use. For example:
bin/ollama pull llama3.2
- Activate the previously created conda environment
conda activate ollama-test
Now everything is ready for you to use.
You can watch what the server is doing in the ollama.log file, and you can interact with the server from Python using the ollama library installed in the setup steps.
Since the server was started inside a Slurm job on a GPU partition, it should be able to find and use the GPU; the startup messages in ollama.log will indicate whether it did.