Creates a standard inference pipeline with the following steps in order:

  • Train preprocessor

  • Create preprocessor model

  • Transform input data using preprocessor model

  • Train estimator

  • Create estimator model

  • Create endpoint configuration

  • Deploy estimator model

Super class

stepfunctions::WorkflowTemplate -> InferencePipeline

Methods

Public methods

Inherited methods

Method new()

Initialize InferencePipeline class

Usage

InferencePipeline$new(
  preprocessor,
  estimator,
  inputs,
  s3_bucket,
  role,
  client = NULL,
  compression_type = NULL,
  content_type = NULL,
  pipeline_name = NULL
)

Arguments

preprocessor

(sagemaker.estimator.EstimatorBase): The estimator used to preprocess and transform the training data.

estimator

(sagemaker.estimator.EstimatorBase): The estimator to use for training. Can be a BYO estimator, Framework estimator or Amazon algorithm estimator.

inputs

Information about the training data. Please refer to the `fit()` method of the associated estimator, as this can take any of the following forms:

  • (str) - The S3 location where training data is saved.

  • (list[str, str] or list[str, `sagemaker.inputs.TrainingInput`]) - If using multiple channels for training data, you can specify a list mapping channel names to strings or `sagemaker.inputs.TrainingInput` objects.

  • (`sagemaker.inputs.TrainingInput`) - Channel configuration for S3 data sources that can provide additional information about the training dataset. See `sagemaker.inputs.TrainingInput` for full details.

  • (`sagemaker.amazon.amazon_estimator.RecordSet`) - A collection of Amazon `Record` objects serialized and stored in S3. For use with an estimator for an Amazon algorithm.

  • (list[`sagemaker.amazon.amazon_estimator.RecordSet`]) - A list of `sagemaker.amazon.amazon_estimator.RecordSet` objects, where each instance is a different channel of training data.

s3_bucket

(str): S3 bucket under which the output artifacts from the training job will be stored. The parent path used is built using the format: ``s3://s3_bucket/pipeline_name/models/job_name/``. In this format, `pipeline_name` refers to the keyword argument provided to InferencePipeline. If a `pipeline_name` argument was not provided, one is auto-generated by the pipeline as `inference-pipeline-<timestamp>`. Likewise, `job_name` refers to the job name provided when calling the `InferencePipeline$execute()` method.

role

(str): An AWS IAM role (either name or full Amazon Resource Name (ARN)). This role is used to create, manage, and execute the Step Functions workflows.

client

(SFN.Client, optional): Step Functions client to use for creating and interacting with the inference pipeline in Step Functions. (default: NULL)

compression_type

(str, optional): Compression type (Gzip/None) of the file for the TransformJob. (default: NULL)

content_type

(str, optional): Content type (MIME) of the document to be used with the preprocessing script. See the SageMaker documentation for more details. (default: NULL)

pipeline_name

(str, optional): Name of the pipeline. This name will be used to name jobs (if not provided when calling `execute()`), models, endpoints, and S3 objects created by the pipeline. If a `pipeline_name` argument is not provided, one is auto-generated by the pipeline as `inference-pipeline-<timestamp>`. (default: NULL)
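A minimal construction sketch. The preprocessor and estimator objects, bucket name, and role ARN below are hypothetical placeholders; any pair of SageMaker `EstimatorBase` instances works. Running this requires AWS credentials and the relevant SageMaker resources.

```r
library(stepfunctions)

# `sklearn_preprocessor` and `linear_learner_estimator` are assumed to be
# previously constructed SageMaker estimator objects (placeholders here).
pipeline <- InferencePipeline$new(
  preprocessor = sklearn_preprocessor,
  estimator = linear_learner_estimator,
  inputs = "s3://example-bucket/train/train.csv",
  s3_bucket = "example-bucket",
  role = "arn:aws:iam::123456789012:role/StepFunctionsWorkflowExecutionRole",
  compression_type = "Gzip",
  content_type = "text/csv",
  pipeline_name = "my-inference-pipeline"
)
```

With `pipeline_name` supplied, output artifacts land under `s3://example-bucket/my-inference-pipeline/models/<job_name>/`.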


Method build_workflow_definition()

Build the workflow definition for the inference pipeline with all the states involved.

Usage

InferencePipeline$build_workflow_definition()

Returns

`stepfunctions.steps.states.Chain`: Workflow definition as a chain of states involved in the inference pipeline.
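A short sketch of inspecting the definition before creating the workflow, assuming `pipeline` is an `InferencePipeline` instance constructed as shown for `new()`:

```r
# Build the Chain of states (train preprocessor -> create model ->
# transform -> train estimator -> create model -> endpoint config -> deploy)
# without creating or executing anything in AWS.
definition <- pipeline$build_workflow_definition()
```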


Method execute()

Run the inference pipeline.

Usage

InferencePipeline$execute(job_name = NULL, hyperparameters = NULL)

Arguments

job_name

(str, optional): Name for the training job. This is also used as the suffix for the preprocessing job, which is named `preprocess-<job_name>`. If one is not provided, a job name will be auto-generated. (default: NULL)

hyperparameters

(list, optional): Hyperparameters for the estimator training. (default: NULL)

Returns

`stepfunctions.workflow.Execution`: Running instance of the inference pipeline.
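A hedged usage sketch, assuming `pipeline` was constructed as shown for `new()` and that the state machine is created before execution (the job name and hyperparameters are illustrative):

```r
# Register the Step Functions state machine, then start a run.
pipeline$create()
execution <- pipeline$execute(
  job_name = "demo-run-001",
  hyperparameters = list(epochs = 10, mini_batch_size = 32)
)
```

The training job is named `demo-run-001` and the preprocessing job `preprocess-demo-run-001`; omitting `job_name` lets the pipeline auto-generate one.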


Method clone()

The objects of this class are cloneable with this method.

Usage

InferencePipeline$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.