2025 · Archived
dspy_llamacpp
A lightweight Python wrapper I built for my thesis work that starts, configures, and cleanly shuts down llama.cpp's llama-server for DSPy workflows.

Overview
dspy_llamacpp is a small Python package I built while working on my master's thesis, where I needed repeatable local DSPy experiments on top of llama.cpp. Instead of manually starting llama-server, wiring up an LM endpoint, and remembering to shut the server down afterward, the package wraps that whole lifecycle behind an AutoLlamaCpp class.
The implementation starts llama-server in its own process group, accepts keyword-style server options, configures DSPy to talk to the local endpoint, and registers cleanup through atexit and signal handling. The goal was to avoid the memory and cleanup issues I had run into with other local llama.cpp approaches.
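The lifecycle described above can be sketched roughly as follows. This is an illustrative reconstruction, not the package's actual source: the option-to-flag mapping, method names, and defaults are assumptions.

```python
import atexit
import os
import signal
import subprocess

class AutoLlamaCpp:
    """Sketch of the wrapper; option names and defaults are assumptions."""

    def __init__(self, model_path, port=8080, **server_opts):
        args = self._build_args(model_path, port, **server_opts)
        # start_new_session=True places llama-server in its own process
        # group/session, so cleanup can signal the whole group at once.
        self.proc = subprocess.Popen(args, start_new_session=True)
        atexit.register(self.stop)
        for sig in (signal.SIGINT, signal.SIGTERM):
            signal.signal(sig, self._handle_signal)

    @staticmethod
    def _build_args(model_path, port, **server_opts):
        # Keyword-style options map onto llama-server CLI flags,
        # e.g. ctx_size=4096 -> --ctx-size 4096 (mapping assumed).
        args = ["llama-server", "-m", model_path, "--port", str(port)]
        for key, value in server_opts.items():
            args += [f"--{key.replace('_', '-')}", str(value)]
        return args

    def _handle_signal(self, signum, frame):
        self.stop()
        raise SystemExit(128 + signum)

    def stop(self):
        # Idempotent: only signal if the server is still running.
        if self.proc.poll() is None:
            os.killpg(os.getpgid(self.proc.pid), signal.SIGTERM)
            self.proc.wait(timeout=10)
```

Signaling the process group (os.killpg) rather than the single child PID is what prevents orphaned llama-server workers from lingering after the Python process exits.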
The package is installable directly from GitHub and includes examples for configuring the model path through environment variables, starting the server, and using DSPy calls against the running local model.
Highlights
- Built thesis-driven tooling around llama.cpp's llama-server so DSPy experiments could spin up a local model endpoint with minimal boilerplate.
- Handled process groups, atexit cleanup, and signal handling to avoid lingering llama-server processes.
- Exposed server options through a small Python API while automatically configuring DSPy's LM client.