Repositories & Workspaces

Dagster repositories and workspaces make pipelines easier to manage at scale - e.g., across multiple teams within an organisation that all share the same Dagster cluster.

Repositories can be defined in code as follows:

"""This module defines the Dagster pipeline repository."""
from dagster import repository

from pipelines.example_pipeline import cereal_data_pipeline


@repository
def team_one():
    # Map each pipeline name to a callable so the pipeline is only loaded on demand.
    return {
        "pipelines": {
            "cereal_data_pipeline": lambda: cereal_data_pipeline
        }
    }

And workspaces are configured via:

load_from:
  - python_file:
      executable_path: ".venv/bin/python"

At a basic level, the example above shows how to associate an execution environment (i.e., a Python virtual environment) with a given team's pipeline repository. This lets each team specify its own Python requirements - e.g., an ML engineering team may want to use a newer version of NumPy than the one used by an adjacent data engineering team.
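To make the per-team idea concrete, here is a minimal sketch of a workspace file for two teams, each with its own interpreter. It is an illustration under stated assumptions rather than part of Dagster: `TeamLocation`, `render_workspace`, the team directories, and the repository file paths are all hypothetical. The sketch also assumes each `python_file` entry names its repository module with a `relative_path` key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamLocation:
    """One team's repository file and the interpreter that should load it."""
    relative_path: str    # hypothetical path to the team's repository module
    executable_path: str  # Python interpreter inside the team's virtualenv


def render_workspace(locations):
    """Render workspace YAML text with one python_file entry per team."""
    lines = ["load_from:"]
    for loc in locations:
        lines += [
            "  - python_file:",
            f"      relative_path: {loc.relative_path}",
            f'      executable_path: "{loc.executable_path}"',
        ]
    return "\n".join(lines) + "\n"


# Hypothetical layout: each team keeps its own virtualenv and repository file.
workspace_yaml = render_workspace([
    TeamLocation("team_one/repository.py", "team_one/.venv/bin/python"),
    TeamLocation("team_two/repository.py", "team_two/.venv/bin/python"),
])
print(workspace_yaml)
```

Because each entry carries its own `executable_path`, the two teams can pin different versions of a library such as NumPy in their own virtual environments without affecting one another.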