
Orchestrator Framework


Overview

Orchestrator is a powerful, flexible AI pipeline orchestration framework that simplifies the creation and execution of complex AI workflows. By combining YAML-based configuration with intelligent model selection and automatic ambiguity resolution, Orchestrator makes it easy to build sophisticated AI applications without getting bogged down in implementation details.

Key Features

  • 🎯 YAML-Based Pipelines: Define complex workflows in simple, readable YAML with full template variable support
  • 🤖 Multi-Model Support: Seamlessly work with OpenAI, Anthropic, Google, Ollama, and HuggingFace models
  • 🧠 Intelligent Model Selection: Automatically choose the best model based on task requirements
  • 🔄 Automatic Ambiguity Resolution: Use <AUTO> tags to let AI resolve configuration ambiguities
  • 📦 Modular Architecture: Extend with custom models, tools, and control systems
  • 🛡️ Production Ready: Built-in error handling, retries, checkpointing, and monitoring
  • ⚡ Parallel Execution: Efficient resource management and parallel task execution
  • 🐳 Sandboxed Execution: Secure code execution in isolated environments
  • 💾 Lazy Model Loading: Models are downloaded only when needed, saving disk space
  • 🔧 Reliable Tool Execution: Guaranteed execution of file operations with LangChain structured outputs
  • 📝 Advanced Templates: Support for nested variables, filters, and Jinja2-style templates (see the sketch below)

Quick Start

Installation

pip install py-orc

For additional features:

pip install py-orc[ollama]      # Ollama model support
pip install py-orc[cloud]        # Cloud model providers
pip install py-orc[dev]          # Development tools
pip install py-orc[all]          # Everything

Note: on zsh, quote the extras to prevent glob expansion, e.g. pip install "py-orc[all]".

Basic Usage

  1. Create a simple pipeline (hello_world.yaml):
id: hello_world
name: Hello World Pipeline
description: A simple example pipeline

steps:
  - id: greet
    action: generate_text
    parameters:
      prompt: "Say hello to the world in a creative way!"
      
  - id: translate
    action: generate_text
    parameters:
      prompt: "Translate this greeting to Spanish: {{ greet.result }}"
    dependencies: [greet]

outputs:
  greeting: "{{ greet.result }}"
  spanish: "{{ translate.result }}"
  2. Run the pipeline:
# Using the CLI script
python scripts/run_pipeline.py hello_world.yaml

# With inputs
python scripts/run_pipeline.py hello_world.yaml -i name=World -i language=Spanish

# From a JSON file
python scripts/run_pipeline.py hello_world.yaml -f inputs.json -o output_dir/

Or run it programmatically from Python:

import orchestrator as orc

# Initialize models (auto-detects available models)
orc.init_models()

# Compile and run the pipeline
pipeline = orc.compile("hello_world.yaml")
result = pipeline.run()

print(result)

Using AUTO Tags

Orchestrator's <AUTO> tags let AI decide configuration details:

steps:
  - id: analyze_data
    action: analyze
    parameters:
      data: "{{ input_data }}"
      method: <AUTO>Choose the best analysis method for this data type</AUTO>
      visualization: <AUTO>Decide if we should create a chart</AUTO>
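
A minimal sketch of the programmatic flow, assuming (as the Basic Usage example suggests) that <AUTO> tags are resolved when the pipeline is compiled; the analyze.yaml filename and input value are illustrative:

import orchestrator as orc

orc.init_models()

# By assumption, <AUTO> parameters such as `method` and `visualization`
# receive concrete values during compilation, so the compiled pipeline
# is fully specified before it runs.
pipeline = orc.compile("analyze.yaml")
result = pipeline.run(input_data="sales_2024.csv")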

Model Configuration

Configure available models in models.yaml:

models:
  # Local models (via Ollama) - downloaded on first use
  - source: ollama
    name: llama3.1:8b
    expertise: [general, reasoning, multilingual]
    size: 8b
    
  - source: ollama
    name: qwen2.5-coder:7b
    expertise: [code, programming]
    size: 7b

  # Cloud models
  - source: openai
    name: gpt-4o
    expertise: [general, reasoning, code, analysis, vision]
    size: 1760b  # Estimated

defaults:
  expertise_preferences:
    code: qwen2.5-coder:7b
    reasoning: deepseek-r1:8b
    fast: llama3.2:1b

Models are downloaded only when first used, saving disk space and initialization time.
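
Individual steps can then request a matching model via requires_model (the mechanism used throughout the examples below). Given the configuration above, a sketch like this should route to qwen2.5-coder:7b, assuming the selector matches on the declared expertise lists and sizes:

steps:
  - id: refactor
    action: generate_text
    parameters:
      prompt: "Refactor this function for readability: {{ code }}"
    requires_model:
      expertise: [code]
      min_size: 7b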

Advanced Example

Here's a more complex example showing model requirements and parallel execution:

id: research_pipeline
name: AI Research Pipeline
description: Research a topic and create a comprehensive report

inputs:
  - name: topic
    type: string
    description: Research topic
    
  - name: depth
    type: string
    default: <AUTO>Determine appropriate research depth</AUTO>

steps:
  # Parallel research from multiple sources
  - id: web_search
    action: search_web
    parameters:
      query: "{{ topic }} latest research 2025"
      count: <AUTO>Decide how many results to fetch</AUTO>
    requires_model:
      expertise: [research, web]
      
  - id: academic_search
    action: search_academic
    parameters:
      query: "{{ topic }}"
      filters: <AUTO>Set appropriate academic filters</AUTO>
    requires_model:
      expertise: [research, academic]
      
  # Analyze findings with specialized model
  - id: analyze_findings
    action: analyze
    parameters:
      web_results: "{{ web_search.results }}"
      academic_results: "{{ academic_search.results }}"
      analysis_focus: <AUTO>Determine key aspects to analyze</AUTO>
    dependencies: [web_search, academic_search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 20b  # Require large model for complex analysis
      
  # Generate report
  - id: write_report
    action: generate_document
    parameters:
      topic: "{{ topic }}"
      analysis: "{{ analyze_findings.result }}"
      style: <AUTO>Choose appropriate writing style</AUTO>
      length: <AUTO>Determine optimal report length</AUTO>
    dependencies: [analyze_findings]
    requires_model:
      expertise: [writing, general]

outputs:
  report: "{{ write_report.document }}"
  summary: "{{ analyze_findings.summary }}"

Complete Example: Research Report Generator

Here's a fully functional pipeline that generates research reports:

# research_report.yaml
id: research_report
name: Research Report Generator
description: Generate comprehensive research reports with citations

inputs:
  - name: topic
    type: string
    description: Research topic
  - name: instructions
    type: string
    description: Additional instructions for the report

outputs:
  - pdf: <AUTO>Generate appropriate filename for the research report PDF</AUTO>

steps:
  - id: search
    name: Web Search
    action: search_web
    parameters:
      query: <AUTO>Create effective search query for {topic} with {instructions}</AUTO>
      max_results: 10
    requires_model:
      expertise: fast
      
  - id: compile_notes
    name: Compile Research Notes
    action: generate_text
    parameters:
      prompt: |
        Compile comprehensive research notes from these search results:
        {{ search.results }}
        
        Topic: {{ topic }}
        Instructions: {{ instructions }}
        
        Create detailed notes with:
        - Key findings
        - Important quotes
        - Source citations
        - Relevant statistics
    dependencies: [search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 7b
      
  - id: write_report
    name: Write Report
    action: generate_document
    parameters:
      content: |
        Write a comprehensive research report on "{{ topic }}"
        
        Research notes:
        {{ compile_notes.result }}
        
        Requirements:
        - Professional academic style
        - Include introduction, body sections, and conclusion
        - Cite sources properly
        - {{ instructions }}
      format: markdown
    dependencies: [compile_notes]
    requires_model:
      expertise: [writing, general]
      min_size: 20b
      
  - id: create_pdf
    name: Create PDF
    action: convert_to_pdf
    parameters:
      markdown: "{{ write_report.document }}"
      filename: "{{ outputs.pdf }}"
    dependencies: [write_report]

Run it with:

import orchestrator as orc

# Initialize models
orc.init_models()

# Compile pipeline
pipeline = orc.compile("research_report.yaml")

# Run with inputs
result = pipeline.run(
    topic="quantum computing applications in medicine",
    instructions="Focus on recent breakthroughs and future potential"
)

print(f"Report saved to: {result}")

Documentation

Comprehensive documentation is available at orc.readthedocs.io.

Available Models

Orchestrator supports a wide range of models:

Local Models (via Ollama)

  • Gemma3 27B: Google's powerful general-purpose model
  • Llama 3.x: General purpose, multilingual support
  • DeepSeek-R1: Advanced reasoning and coding
  • Qwen2.5-Coder: Specialized for code generation
  • Mistral: Fast and efficient general purpose

Cloud Models

  • OpenAI: GPT-4.1 (latest)
  • Anthropic: Claude Sonnet 4 (claude-sonnet-4-20250514)
  • Google: Gemini 2.5 Flash (gemini-2.5-flash)

HuggingFace Models

  • Mistral 7B Instruct v0.3: High-quality instruction-following model
  • Llama, Qwen, Phi, and many more
  • Automatically downloaded on first use

Requirements

  • Python 3.8+
  • Optional: Ollama for local model execution
  • Optional: API keys for cloud providers (OpenAI, Anthropic, Google)
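
Cloud credentials are conventionally supplied through environment variables. The names below are each provider's standard variables and are assumed to be what Orchestrator's integrations read; consult the documentation to confirm:

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="..."
export GOOGLE_API_KEY="..."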

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Support

For questions or bug reports, please open an issue on the GitHub repository.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use Orchestrator in your research, please cite:

@software{orchestrator2025,
  title = {Orchestrator: AI Pipeline Orchestration Framework},
  author = {Manning, Jeremy R. and {Contextual Dynamics Lab}},
  year = {2025},
  url = {https://github.com/ContextLab/orchestrator},
  organization = {Dartmouth College}
}

Acknowledgments

Orchestrator is developed and maintained by the Contextual Dynamics Lab at Dartmouth College.


Built with ❤️ by the Contextual Dynamics Lab
