Virtual environments

Both Catalyst and pcore clusters use Python virtual environments (venvs) to run all pipelines. Once a venv is created, it can be linked to a pipeline when the pipeline is uploaded. You can create as many venvs as you like, and one venv can be attached to multiple pipelines.

Concepts

  • Requirements - Python packages to be included in your virtual environment

Environment creation

To create a new venv on Catalyst or a pcore cluster, import the create_environment function from the pipeline-ai library (replace <username> with your username):

from pipeline.cloud.environments import create_environment

env_id = create_environment(name="<username>/numpy", python_requirements=["numpy==1.25.2"])

This environment can then be attached to a pipeline when uploading, like so:

from pipeline.cloud.pipelines import upload_pipeline

...

upload_pipeline(
    "<username>/env-example",
    environment_id_or_name="<username>/numpy",
)

We highly recommend pinning specific package versions as published on pypi.org.
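As a rough illustration of the pinning advice, the sketch below flags requirement strings that are not pinned to an exact version with ==. The helper name unpinned_requirements is hypothetical and not part of pipeline-ai. The pattern covers only simple PyPI-style entries, so git URLs are reported as unpinned.

```python
import re

# Matches an exact pin such as "numpy==1.25.2", optionally with extras
# (e.g. "pkg[extra]==1.0"). Wildcard pins such as "pkg==1.*" are rejected.
_PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[A-Za-z0-9._,\s-]+\])?==[^\s=*]+$"
)


def unpinned_requirements(requirements):
    """Return the entries that are not pinned to an exact version with '=='."""
    return [req for req in requirements if not _PINNED.match(req.strip())]


# Hypothetical requirements list, used for illustration only.
reqs = ["numpy==1.25.2", "pandas>=2.0", "requests"]
print(unpinned_requirements(reqs))
```

Running a check like this before calling create_environment can catch loose pins early, which matters because an environment's requirements are fixed once it is created.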

🚧

Fixed requirements

Once an environment is created, its requirements cannot be changed. If you need to modify the requirements, you must create a new environment.
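Since an existing environment cannot be edited, one possible workflow is to derive a new requirements list from the old one and pass it to create_environment under a different name. The sketch below shows only the list-building step. The helper with_updated_pins and the name "<username>/numpy-v2" are hypothetical, and the helper assumes every entry is an exact == pin.

```python
def with_updated_pins(requirements, updates):
    """Return a new requirements list with some '==' pins replaced or added.

    `requirements` is a list such as ["numpy==1.25.2"]; `updates` maps
    package names to new versions, e.g. {"numpy": "1.26.0"}.
    """
    pins = {}
    for req in requirements:
        name, sep, version = req.partition("==")
        if not sep:
            raise ValueError(f"not an exact '==' pin: {req!r}")
        pins[name.strip()] = version.strip()
    pins.update(updates)  # existing names keep their position, new ones append
    return [f"{name}=={version}" for name, version in pins.items()]


old_requirements = ["numpy==1.25.2", "scipy==1.11.2"]
new_requirements = with_updated_pins(old_requirements, {"numpy": "1.26.0"})

# The new list can then be used to create a separate environment, e.g.:
# env_id = create_environment(
#     name="<username>/numpy-v2",  # hypothetical name
#     python_requirements=new_requirements,
# )
```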

Git repos

It is possible to pass in a git repo as a package. We strongly recommend pointing to a specific commit, as follows:

from pipeline.cloud.environments import create_environment

env_id = create_environment(
    name="<username>/numpy",
    python_requirements=[
        "git+https://github.com/numpy/numpy.git@ea677928332c37e8052b4d599bf6ee52cf363cf9",
    ],
)
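To check that a git requirement really points at a specific commit rather than a branch or tag, a small helper along these lines may be useful. The function is_commit_pinned is hypothetical and not part of pipeline-ai. It assumes the git+<url>@<ref> form shown above and does not handle trailing URL fragments such as #egg=....

```python
import re

# A full Git commit SHA-1 is 40 hexadecimal characters.
_FULL_SHA = re.compile(r"^[0-9a-fA-F]{40}$")


def is_commit_pinned(requirement):
    """Return True if a 'git+<url>@<ref>' requirement ends in a full commit SHA."""
    if not requirement.startswith("git+"):
        return False
    # Split on the last '@' so that '@' inside the URL (e.g. ssh users) is ignored.
    _, sep, ref = requirement.rpartition("@")
    return bool(sep) and _FULL_SHA.match(ref) is not None


pinned = "git+https://github.com/numpy/numpy.git@ea677928332c37e8052b4d599bf6ee52cf363cf9"
print(is_commit_pinned(pinned))
```

A branch name such as main can move over time, so only a full commit SHA gives the same code on every build.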

Current limitations

  • We do not currently support OS-level packages (no apt install), but support will be coming in a future release. The only OS-level package you can expect at the moment is ffmpeg. Please reach out to the team on Discord if you have an urgent need for an OS-level package.
  • Only python==3.10 venvs are currently supported, and you must upload new pipelines from python==3.10.
  • The only supported source language for uploading and running the venv is Python.
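Given the Python 3.10 limitation above, it may help to check the local interpreter version before uploading. The following is a minimal sketch using only the standard library. The helper is_supported_python is hypothetical and not part of pipeline-ai.

```python
import sys

SUPPORTED_PYTHON = (3, 10)  # taken from the limitation listed above


def is_supported_python(version_info=None):
    """Return True if the given (or current) interpreter is Python 3.10.x."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == SUPPORTED_PYTHON


if not is_supported_python():
    print(
        f"Warning: running Python {sys.version_info.major}.{sys.version_info.minor}; "
        "pipelines must be uploaded from Python 3.10."
    )
```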