# LLM Providers

The provider abstraction layer and supported LLM backends.
AutoFlow uses a provider abstraction so all agents can work with any supported LLM backend. Providers are registered in src/autoflow/llm/registry.py and implement the BaseLLMClient interface.
## Provider Interface
All providers implement a single method:

```python
class BaseLLMClient:
    def complete(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 2000,
        model: str | None = None,
    ) -> dict:
        """Returns {"content": str, "model": str, "usage": dict}"""
```

## Supported Providers
| Provider | Module | Requires |
|---|---|---|
| mock | mock_client.py | Nothing (default) |
| openai | openai_client.py | OPENAI_API_KEY |
| anthropic | anthropic_client.py | ANTHROPIC_API_KEY |
| gemini | gemini_client.py | GOOGLE_API_KEY |
| ollama | ollama_client.py | Local Ollama server |
## Provider Selection
```python
from autoflow.llm.registry import get_client

client = get_client("openai")  # Returns OpenAIClient instance
client = get_client("mock")    # Returns MockClient instance
```

Or via CLI:

```shell
python -m autoflow "Your request" --provider anthropic
```

## Ollama Provider
The Ollama provider connects to a local Ollama server and supports automatic model discovery.
### Model Selection
When using `--provider ollama`, AutoFlow will:

- Query your local Ollama server for installed models
- Validate the requested model (via `--model`) or the default (`llama3`)
- Prompt you to choose from available models if the default isn't installed
- Suggest install commands if no models are found
```shell
# Use a specific installed model
python -m autoflow "Classify emails" --provider ollama --model llama3.2:latest

# Base name works too (resolves to llama3.2:latest)
python -m autoflow "Classify emails" --provider ollama --model llama3.2

# If the default model (llama3) is installed, it's used automatically
python -m autoflow "Classify emails" --provider ollama
# Using Ollama model: llama3:latest

# If the default isn't installed, you'll be prompted:
python -m autoflow "Classify emails" --provider ollama
# Default Ollama model 'llama3' is not installed locally.
#
# Available Ollama models:
#   1. llama3.2:latest
#   2. mistral-small:24b-instruct-2501-q8_0
#   3. bge-m3:latest
#
# Select a model [1-3] (or press Enter for 1):
```

If Ollama is not running or no models are installed, you'll see a helpful message with install instructions:
```
WARNING: Could not connect to Ollama or no models installed.
Make sure Ollama is running (ollama serve) and pull a model:
  ollama pull llama3.2
```

### Programmatic Usage
```python
from autoflow.llm.ollama_client import OllamaClient

client = OllamaClient(model="llama3.2:latest")

# List locally available models
models = client.list_models()
# ["llama3.2:latest", "mistral-small:24b-instruct-2501-q8_0", ...]

# Validate a model name
client.validate_model("llama3.2")  # True
client.validate_model("gpt-4")     # False

# Resolve to a full model name
client.resolve_model("llama3.2")   # "llama3.2:latest"
```

### Environment Variables
| Variable | Default | Description |
|---|---|---|
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server address |
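A client would typically resolve the server address from this variable, falling back to the documented default. A minimal sketch (the helper name `ollama_base_url` is assumed, not part of AutoFlow's API):

```python
import os


def ollama_base_url() -> str:
    """Resolve the Ollama server address from the environment.

    Falls back to the documented default when OLLAMA_BASE_URL is
    unset. Trailing slashes are stripped so endpoint paths can be
    appended safely.
    """
    url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
    return url.rstrip("/")
```

For a remote server, set the variable before running AutoFlow, e.g. `export OLLAMA_BASE_URL=http://gpu-box:11434`.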
## Mock Provider
The MockClient is the default provider and is used for all testing. It returns deterministic canned responses by detecting the current pipeline stage from the system prompt (persona text).
Key behaviors:
- Stage detection — Matches keywords in the persona text to determine which stage is running
- Iteration tracking — The evaluator returns FAIL on first call and PASS on subsequent calls, simulating the critique loop
- Generate mode — Returns a 6-node workflow with decision_tree and variable_transform nodes
- Deterministic — Same inputs always produce the same outputs
This makes it possible to run the full test suite (287 tests) without any API keys or external services.
## Model Recommendations
The GeneratorAgent enriches prompts with model recommendations based on the detected workflow pattern. These are defined in src/autoflow/models.py:
| Task Type | Recommended Model | Temperature | Max Tokens |
|---|---|---|---|
| Classification | gpt-4o-mini | 0.0 | 500 |
| Extraction | gpt-4o-mini | 0.1 | 500 |
| Analysis | gpt-4o | 0.3 | 2000 |
| Generation | gpt-4o | 0.5 | 4000 |
| Evaluation | gpt-4o | 0.1 | 1000 |
| Formatting | gpt-4o-mini | 0.1 | 500 |
| Creative | gpt-4o | 0.7 | 2000 |
| Summarization | gpt-4o-mini | 0.2 | 500 |
These recommendations are hints included in the generated workflow's LLM node configurations, not the model used to run the pipeline itself.
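As a sketch of how such hints might be looked up when building an LLM node config (the dict and function names are illustrative; the real definitions are in src/autoflow/models.py):

```python
# Illustrative lookup table mirroring the documented recommendations.
RECOMMENDATIONS = {
    "classification": {"model": "gpt-4o-mini", "temperature": 0.0, "max_tokens": 500},
    "extraction":     {"model": "gpt-4o-mini", "temperature": 0.1, "max_tokens": 500},
    "analysis":       {"model": "gpt-4o",      "temperature": 0.3, "max_tokens": 2000},
    "generation":     {"model": "gpt-4o",      "temperature": 0.5, "max_tokens": 4000},
    "evaluation":     {"model": "gpt-4o",      "temperature": 0.1, "max_tokens": 1000},
    "formatting":     {"model": "gpt-4o-mini", "temperature": 0.1, "max_tokens": 500},
    "creative":       {"model": "gpt-4o",      "temperature": 0.7, "max_tokens": 2000},
    "summarization":  {"model": "gpt-4o-mini", "temperature": 0.2, "max_tokens": 500},
}


def recommend(task_type: str) -> dict:
    """Return the recommended LLM node config for a task type.

    Unknown task types fall back to the analysis defaults here;
    the fallback choice is an assumption for this sketch.
    """
    return RECOMMENDATIONS.get(task_type.lower(), RECOMMENDATIONS["analysis"])
```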