Keyword Schemas
Overview
Passing many keyword arguments is very common in ML, as seen in the transformers library. Pipeline supports defining, maintaining, and versioning these arguments in production setups through schemas (similar to pydantic). A schema is a subclass of InputSchema with one or more associated InputField variables:
from pipeline.objects.graph import InputField, InputSchema

class MyInputSchema(InputSchema):
    # Required integer in the open interval (-5, 5)
    in_1: int = InputField(lt=5, gt=-5, description="kwarg 1", title="my_int")
    # Optional integer in [-5, 5), defaulting to 0
    in_2: int | None = InputField(
        default=0, lt=5, ge=-5, description="kwarg 2", title="my_optional_int"
    )
These InputSchema objects can be used as a variable type in a Pipeline:
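# Import path assumed (not shown in the original snippet)
from pipeline import Pipeline, Variable, pipe

@pipe
def my_func(in_1: int, other_schema: MyInputSchema) -> int:
    # Schema fields are read as attributes
    return in_1 + other_schema.in_2 + other_schema.in_1

with Pipeline() as builder:
    var_1 = Variable(int, lt=10, ge=0)
    var_2 = Variable(MyInputSchema)
    output = my_func(var_1, var_2)
    builder.output(output)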
You can only use InputField objects in the schema definition.
Some important notes when using InputField:
Optional fields and default values
There are two ways to define an optional field (a default value must always be provided): either use typing.Optional, or use the | (or) operator with None (int | None).
Note: None can be the default value. The ellipsis (...) is the Pythonic default that represents the literal absence of an input.
from typing import Optional

from pipeline.objects.graph import InputField, InputSchema

class MyInputSchema(InputSchema):
    in_1: Optional[int] = InputField(default=1)
    # OR
    in_2: int | None = InputField(default=2)
Runs
When performing a run with an InputSchema variable, you pass its value as a dictionary. The conversion and validation into the full schema class object is handled for you:
my_pl = builder.get_pipeline()

rm_pipeline = upload_pipeline(
    my_pl,
    "schema-demo",
    "numpy",
    minimum_cache_number=1,
)

result = run_pipeline(
    "schema-demo:v1",
    1,
    {"in_1": 2},
)
Operating this way ensures that any client can send an API request over HTTP without needing any Python-specific objects.
Validation
The InputField object takes the following kwargs for validation (all are optional):
default (type: any) - The default value of the variable
title (type: str) - The name of the variable
description (type: str) - Basic description of the variable
examples (type: list) - List of possible inputs
gt (type: int) - Greater than (int/float)
ge (type: int) - Greater than or equal to (int/float)
lt (type: int) - Less than (int/float)
le (type: int) - Less than or equal to (int/float)
multiple_of (type: int) - Must be a multiple of this number (int/float)
allow_inf_nan (type: bool) - Whether to allow infinities or nan values (int/float)
max_digits (type: int) - Maximum number of digits in the number to allow (int/float)
decimal_places (type: int) - Maximum number of decimal places to allow in the number (int/float)
min_length (type: int) - Minimum length of an input string (string)
max_length (type: int) - Maximum length of the input string (string)
choices (type: list) - A list of the only inputs that can be entered
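The numeric and string constraints listed above can be read as simple predicates. The sketch below spells out a few of them in plain Python. check_value is a hypothetical helper written for this illustration, not part of Pipeline, and it covers only gt, ge, lt, le, multiple_of, min_length, max_length, and choices.

```python
def check_value(value, *, gt=None, ge=None, lt=None, le=None,
                multiple_of=None, min_length=None, max_length=None,
                choices=None):
    """Return a list of violated constraints (an empty list means valid)."""
    errors = []
    if gt is not None and not value > gt:
        errors.append(f"must be > {gt}")
    if ge is not None and not value >= ge:
        errors.append(f"must be >= {ge}")
    if lt is not None and not value < lt:
        errors.append(f"must be < {lt}")
    if le is not None and not value <= le:
        errors.append(f"must be <= {le}")
    if multiple_of is not None and value % multiple_of != 0:
        errors.append(f"must be a multiple of {multiple_of}")
    if min_length is not None and len(value) < min_length:
        errors.append(f"length must be >= {min_length}")
    if max_length is not None and len(value) > max_length:
        errors.append(f"length must be <= {max_length}")
    if choices is not None and value not in choices:
        errors.append(f"must be one of {choices}")
    return errors


# in_1 from the Overview uses lt=5, gt=-5: the open interval (-5, 5).
print(check_value(4, lt=5, gt=-5))    # no violations
print(check_value(5, lt=5, gt=-5))    # violates lt
# in_2 uses lt=5, ge=-5: -5 itself is allowed.
print(check_value(-5, lt=5, ge=-5))   # no violations
# A string that is too short and not among the allowed choices.
print(check_value("abc", min_length=4, choices=["abcd", "wxyz"]))
```

Note the difference between gt and ge at the boundary: -5 passes ge=-5 but would fail gt=-5.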