How to run an Ollama server for large language models

Tags Ollama LLM

Follow these steps to run an Ollama server and set up either a python virtual environment (recommended) or a conda environment to interface with the server.

Start up the ollama server

  1. Start an interactive Slurm session. Choose your partition, the number of CPUs per task, and the amount of RAM you need (this is normal system RAM, not GPU memory).
    srun --x11 --time=1-00:00:00 --partition=firefly-gpu --gres=gpu:1 --ntasks=1 --cpus-per-task=8 --mem=32G --pty /bin/bash -l
    
  2. It is best to run ollama in a container. We provide a module on the HPC to assist with this task.
    module load ollama
    
  3. Start the ollama server.
    apptainer exec --nv $OLLAMA_SIF ollama serve &
  4. Wait for the server to fully start (it will stop printing to the screen) and then hit the Enter key to get a CLI prompt back.
  5. You can now either run a model in ollama manually from the CLI, or communicate with the server from python (or other software that can talk to ollama)
    1. If you choose to interface with the server through code, you need to know the port number your server is running on. Use the following command to obtain it; the port number is the part after the colon.
      echo $APPTAINERENV_OLLAMA_HOST
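
For example, a quick sketch to print just the port number (assuming the variable holds a host:port pair such as 127.0.0.1:11434; the actual port varies per session):

```shell
# APPTAINERENV_OLLAMA_HOST holds the server address as host:port.
# Print only the port (the part after the last colon):
echo "$APPTAINERENV_OLLAMA_HOST" | awk -F: '{print $NF}'
```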

Option 1: Run a model manually

  1. Run a model in ollama. Replace <model name> with the model of your choosing.
    apptainer exec --nv $OLLAMA_SIF ollama run <model name>
  2. You can now type prompts and chat with the model. Type /bye to exit the session.

Option 2: Use python to interface with ollama server

  1. Create or activate a python virtual environment for ollama
    1. To create a new python virtual environment (you can store it wherever you want; this example uses ~/ollama-test, which creates a folder in your home directory):
      module load python
      mkdir ~/ollama-test
      cd ~/ollama-test
      python -m venv .venv
      source .venv/bin/activate
    2. To activate a previously created python virtual environment
      cd ~/ollama-test
      source .venv/bin/activate
  2. If you haven't already, install the ollama python package into your environment
    pip install ollama
  3. You can now write python code that interfaces with the ollama server.

Option 3: Use conda to interface with ollama server

  1. Set up a conda environment and install the ollama library. You can name the environment whatever you want; we are using ollama-test in this example.
    module load anaconda
    conda create --name ollama-test python=3.14
    conda activate ollama-test
    pip install ollama
  2. You can now write python code that interfaces with the ollama server.
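
Whichever environment you used, the sketch below is one way to talk to the server. It reads the address from APPTAINERENV_OLLAMA_HOST and calls the server's REST API using only the standard library; the model name llama3.2 is an assumption, so substitute a model you have actually pulled.

```python
import json
import os
import urllib.request

# The ollama module exports the server address as host:port in this variable;
# fall back to ollama's default address if it is not set.
OLLAMA_HOST = os.environ.get("APPTAINERENV_OLLAMA_HOST", "127.0.0.1:11434")


def ask(prompt, model="llama3.2"):
    """Send one non-streaming generate request to the Ollama REST API."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    request = urllib.request.Request(
        f"http://{OLLAMA_HOST}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


print(f"Ollama server address: {OLLAMA_HOST}")
# With the server running and a model pulled, for example:
# print(ask("Why is the sky blue?"))
```

The pip-installed ollama package offers a higher-level client over the same API, if you prefer not to build requests by hand.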

 

Additional Support 

Open an IT Helpdesk request ticket.
Send an email to ITHelp@utc.edu.

 
