
Orchestrator Framework


Overview

Orchestrator is a powerful, flexible AI pipeline orchestration framework that simplifies the creation and execution of complex AI workflows. By combining YAML-based configuration with intelligent model selection and automatic ambiguity resolution, Orchestrator makes it easy to build sophisticated AI applications without getting bogged down in implementation details.

Key Features

  • 🎯 YAML-Based Pipelines: Define complex workflows in simple, readable YAML with full template variable support
  • 🤖 Multi-Model Support: Seamlessly work with OpenAI, Anthropic, Google, Ollama, and HuggingFace models
  • 🧠 Intelligent Model Selection: Automatically choose the best model based on task requirements
  • 🔄 Automatic Ambiguity Resolution: Use <AUTO> tags to let AI resolve configuration ambiguities
  • 📦 Modular Architecture: Extend with custom models, tools, and control systems
  • 🛡️ Production Ready: Built-in error handling, retries, checkpointing, and monitoring
  • ⚡ Parallel Execution: Efficient resource management and parallel task execution
  • 🐳 Sandboxed Execution: Secure code execution in isolated environments
  • 💾 Lazy Model Loading: Models are downloaded only when needed, saving disk space
  • 🔧 Reliable Tool Execution: Guaranteed execution of file operations with LangChain structured outputs
  • 📝 Advanced Templates: Support for nested variables, filters, and Jinja2-style templates (see the sketch below)

Quick Start

Installation

pip install py-orc

For additional features:

pip install py-orc[ollama]      # Ollama model support
pip install py-orc[cloud]        # Cloud model providers
pip install py-orc[dev]          # Development tools
pip install py-orc[all]          # Everything

Note: on zsh, quote the extras to prevent glob expansion, e.g. pip install "py-orc[all]".

Basic Usage

  1. Create a simple pipeline (hello_world.yaml):
id: hello_world
name: Hello World Pipeline
description: A simple example pipeline

steps:
  - id: greet
    action: generate_text
    parameters:
      prompt: "Say hello to the world in a creative way!"
      
  - id: translate
    action: generate_text
    parameters:
      prompt: "Translate this greeting to Spanish: {{ greet.result }}"
    dependencies: [greet]

outputs:
  greeting: "{{ greet.result }}"
  spanish: "{{ translate.result }}"
  2. Run the pipeline:
# Using the CLI script
python scripts/run_pipeline.py hello_world.yaml

# With inputs
python scripts/run_pipeline.py hello_world.yaml -i name=World -i language=Spanish

# From a JSON file
python scripts/run_pipeline.py hello_world.yaml -f inputs.json -o output_dir/

Or run it programmatically from Python:

import orchestrator as orc

# Initialize models (auto-detects available models)
orc.init_models()

# Compile and run the pipeline
pipeline = orc.compile("hello_world.yaml")
result = pipeline.run()

print(result)

Using AUTO Tags

Orchestrator's <AUTO> tags let AI decide configuration details:

steps:
  - id: analyze_data
    action: analyze
    parameters:
      data: "{{ input_data }}"
      method: <AUTO>Choose the best analysis method for this data type</AUTO>
      visualization: <AUTO>Decide if we should create a chart</AUTO>
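
A minimal sketch of the programmatic flow, assuming (as the Basic Usage example suggests) that <AUTO> tags are resolved when the pipeline is compiled; the analyze.yaml filename and input value are illustrative:

import orchestrator as orc

orc.init_models()

# By assumption, <AUTO> parameters such as `method` and `visualization`
# receive concrete values during compilation, so the compiled pipeline
# is fully specified before it runs.
pipeline = orc.compile("analyze.yaml")
result = pipeline.run(input_data="sales_2024.csv")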

Model Configuration

Configure available models in models.yaml:

models:
  # Local models (via Ollama) - downloaded on first use
  - source: ollama
    name: llama3.1:8b
    expertise: [general, reasoning, multilingual]
    size: 8b
    
  - source: ollama
    name: qwen2.5-coder:7b
    expertise: [code, programming]
    size: 7b

  # Cloud models
  - source: openai
    name: gpt-4o
    expertise: [general, reasoning, code, analysis, vision]
    size: 1760b  # Estimated

defaults:
  expertise_preferences:
    code: qwen2.5-coder:7b
    reasoning: deepseek-r1:8b
    fast: llama3.2:1b

Models are downloaded only when first used, saving disk space and initialization time.
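
Individual steps can then request a matching model via requires_model (the mechanism used throughout the examples below). Given the configuration above, a sketch like this should route to qwen2.5-coder:7b, assuming the selector matches on the declared expertise lists and sizes:

steps:
  - id: refactor
    action: generate_text
    parameters:
      prompt: "Refactor this function for readability: {{ code }}"
    requires_model:
      expertise: [code]
      min_size: 7b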

Advanced Example

Here's a more complex example showing model requirements and parallel execution:

id: research_pipeline
name: AI Research Pipeline
description: Research a topic and create a comprehensive report

inputs:
  - name: topic
    type: string
    description: Research topic
    
  - name: depth
    type: string
    default: <AUTO>Determine appropriate research depth</AUTO>

steps:
  # Parallel research from multiple sources
  - id: web_search
    action: search_web
    parameters:
      query: "{{ topic }} latest research 2025"
      count: <AUTO>Decide how many results to fetch</AUTO>
    requires_model:
      expertise: [research, web]
      
  - id: academic_search
    action: search_academic
    parameters:
      query: "{{ topic }}"
      filters: <AUTO>Set appropriate academic filters</AUTO>
    requires_model:
      expertise: [research, academic]
      
  # Analyze findings with specialized model
  - id: analyze_findings
    action: analyze
    parameters:
      web_results: "{{ web_search.results }}"
      academic_results: "{{ academic_search.results }}"
      analysis_focus: <AUTO>Determine key aspects to analyze</AUTO>
    dependencies: [web_search, academic_search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 20b  # Require large model for complex analysis
      
  # Generate report
  - id: write_report
    action: generate_document
    parameters:
      topic: "{{ topic }}"
      analysis: "{{ analyze_findings.result }}"
      style: <AUTO>Choose appropriate writing style</AUTO>
      length: <AUTO>Determine optimal report length</AUTO>
    dependencies: [analyze_findings]
    requires_model:
      expertise: [writing, general]

outputs:
  report: "{{ write_report.document }}"
  summary: "{{ analyze_findings.summary }}"

Complete Example: Research Report Generator

Here's a fully functional pipeline that generates research reports:

# research_report.yaml
id: research_report
name: Research Report Generator
description: Generate comprehensive research reports with citations

inputs:
  - name: topic
    type: string
    description: Research topic
  - name: instructions
    type: string
    description: Additional instructions for the report

outputs:
  - pdf: <AUTO>Generate appropriate filename for the research report PDF</AUTO>

steps:
  - id: search
    name: Web Search
    action: search_web
    parameters:
      query: <AUTO>Create effective search query for {topic} with {instructions}</AUTO>
      max_results: 10
    requires_model:
      expertise: fast
      
  - id: compile_notes
    name: Compile Research Notes
    action: generate_text
    parameters:
      prompt: |
        Compile comprehensive research notes from these search results:
        {{ search.results }}
        
        Topic: {{ topic }}
        Instructions: {{ instructions }}
        
        Create detailed notes with:
        - Key findings
        - Important quotes
        - Source citations
        - Relevant statistics
    dependencies: [search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 7b
      
  - id: write_report
    name: Write Report
    action: generate_document
    parameters:
      content: |
        Write a comprehensive research report on "{{ topic }}"
        
        Research notes:
        {{ compile_notes.result }}
        
        Requirements:
        - Professional academic style
        - Include introduction, body sections, and conclusion
        - Cite sources properly
        - {{ instructions }}
      format: markdown
    dependencies: [compile_notes]
    requires_model:
      expertise: [writing, general]
      min_size: 20b
      
  - id: create_pdf
    name: Create PDF
    action: convert_to_pdf
    parameters:
      markdown: "{{ write_report.document }}"
      filename: "{{ outputs.pdf }}"
    dependencies: [write_report]

Run it with:

import orchestrator as orc

# Initialize models
orc.init_models()

# Compile pipeline
pipeline = orc.compile("research_report.yaml")

# Run with inputs
result = pipeline.run(
    topic="quantum computing applications in medicine",
    instructions="Focus on recent breakthroughs and future potential"
)

print(f"Report saved to: {result}")

Documentation

Comprehensive documentation is available at orc.readthedocs.io.

Available Models

Orchestrator supports a wide range of models:

Local Models (via Ollama)

  • Gemma3 27B: Google's powerful general-purpose model
  • Llama 3.x: General purpose, multilingual support
  • DeepSeek-R1: Advanced reasoning and coding
  • Qwen2.5-Coder: Specialized for code generation
  • Mistral: Fast and efficient general purpose

Cloud Models

  • OpenAI: GPT-4.1 (latest)
  • Anthropic: Claude Sonnet 4 (claude-sonnet-4-20250514)
  • Google: Gemini 2.5 Flash (gemini-2.5-flash)

HuggingFace Models

  • Mistral 7B Instruct v0.3: High-quality instruction-following model
  • Llama, Qwen, Phi, and many more
  • Automatically downloaded on first use

Requirements

  • Python 3.8+
  • Optional: Ollama for local model execution
  • Optional: API keys for cloud providers (OpenAI, Anthropic, Google)
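
Cloud credentials are conventionally supplied through environment variables. The names below are each provider's standard variables and are assumed to be what Orchestrator's integrations read; consult the documentation to confirm:

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="..."
export GOOGLE_API_KEY="..."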

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Support

For questions or bug reports, please open an issue on the GitHub repository.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use Orchestrator in your research, please cite:

@software{orchestrator2025,
  title = {Orchestrator: AI Pipeline Orchestration Framework},
  author = {Manning, Jeremy R. and {Contextual Dynamics Lab}},
  year = {2025},
  url = {https://github.com/ContextLab/orchestrator},
  organization = {Dartmouth College}
}

Acknowledgments

Orchestrator is developed and maintained by the Contextual Dynamics Lab at Dartmouth College.


Built with ❤️ by the Contextual Dynamics Lab
