Knowing what went wrong with your pipeline when it fails is essential to fixing it. Debugging mode gives you verbose information about your pipelines when they execute remotely.
Debugging mode is used:
- When triggering a remote run with `run_pipeline`
- Coming soon: when creating a new environment with `create_environment`
You can activate debugging mode by adding the following code at the top of your scripts:
```python
from pipeline.configuration import current_configuration

current_configuration.set_debug_mode(True)
```
Make sure you're logged in.
More info on this in the Getting Started guide!
What's shown with debugging mode enabled:
- The current state of the run, along with any updates
- Run outputs
- Run ID
- Run logs
- Run errors
To test debugging out, let's create a basic pipeline that prints some things and shows a basic progress bar using
tqdm. You can find the source code for the example below on our GitHub.
```python
import time

from tqdm import tqdm

from pipeline import Pipeline, Variable, pipe
from pipeline.cloud.compute_requirements import Accelerator
from pipeline.cloud.environments import create_environment
from pipeline.cloud.pipelines import run_pipeline, upload_pipeline
from pipeline.configuration import current_configuration

# Activate debugging mode
current_configuration.set_debug_mode(True)


@pipe
def test(i: int) -> str:
    for i in tqdm(range(i)):
        time.sleep(0.5)
    print("I'm done now, goodbye!")
    return "Done"


with Pipeline() as builder:
    input_var = Variable(
        int,
        gt=0,
        lt=20,
    )

    b = test(input_var)

    builder.output(b)

pl = builder.get_pipeline()

# Create a basic environment with tqdm installed
env_id = create_environment(
    name="basic",
    python_requirements=[
        "tqdm",
    ],
    allow_existing=True,
)

# Upload our pipeline
result = upload_pipeline(
    pl,
    "debugging-pipeline:latest",
    environment_id_or_name="basic",
    accelerators=[
        Accelerator.cpu,
    ],
)

# Run the pipeline remotely
output = run_pipeline(
    result.id,
    5,
)
```
When we run this we'll see the following happen:
When performing a run with debugging mode, the `async_run` field is set to `True` and all runs are submitted in a non-blocking way. This means a run API request is sent to the `/v3/runs` endpoint with `async_run=True`, and the API immediately responds with the `run_id` without waiting for the run to complete. The Python SDK then polls the state of that run until it completes, while opening a websocket to the logs endpoint to stream all logs back.
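To make that flow concrete, here is a minimal sketch of the submit-then-poll pattern. Note that `submit_run` and `get_run_state` below are hypothetical stand-ins that simulate the server in memory, not real SDK calls; only the overall non-blocking pattern mirrors what the SDK does.

```python
import itertools
import time

# In-memory stand-in for server-side run state (an assumption for this sketch).
_states = {}


def submit_run(pipeline_id: str, *inputs) -> str:
    """Model POST /v3/runs with async_run=True: return a run_id immediately,
    without waiting for the run to complete."""
    run_id = f"run_{pipeline_id}_{len(_states)}"
    # Simulate a run that reaches "completed" after a few polls.
    _states[run_id] = itertools.chain(
        ["created", "running", "running"], itertools.repeat("completed")
    )
    return run_id


def get_run_state(run_id: str) -> str:
    """Model a GET request for the run's current state."""
    return next(_states[run_id])


def wait_for_run(run_id: str, poll_interval: float = 0.01) -> str:
    """Poll the run's state until it reaches a terminal state,
    as the SDK does after the initial non-blocking submit."""
    while True:
        state = get_run_state(run_id)
        if state == "completed":
            return state
        time.sleep(poll_interval)


run_id = submit_run("debugging-pipeline", 5)  # returns immediately
print(run_id, wait_for_run(run_id))
```

In the real SDK the polling loop runs alongside a websocket connection to the logs endpoint, so logs stream in while the state is being polled.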
You can view historical run logs via the CLI. This will be coming to the Catalyst dashboard soon, but is currently only available on the CLI.
To view a run's logs, simply enter:
```shell
pipeline logs run <run_id>
```
You can only view logs from runs in the past hour.
This will soon be increased to the past 7 days!