Creates a Task state to execute a SageMaker Transform Job (https://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateTransformJob.html).

Super classes

stepfunctions::Block -> stepfunctions::State -> stepfunctions::Task -> TransformStep

Methods


Method new()

Initialize TransformStep class

Usage

TransformStep$new(
  state_id,
  transformer,
  job_name,
  model_name,
  data,
  data_type = "S3Prefix",
  content_type = NULL,
  compression_type = NULL,
  split_type = NULL,
  experiment_config = NULL,
  wait_for_completion = TRUE,
  tags = NULL,
  input_filter = NULL,
  output_filter = NULL,
  join_source = NULL,
  ...
)

Arguments

state_id

(str): State name whose length must be less than or equal to 128 Unicode characters. State names must be unique within the scope of the whole state machine.

transformer

(sagemaker.transformer.Transformer): The SageMaker transformer to use in the TransformStep.

job_name

(str or Placeholder): Specify a transform job name. We recommend using an ExecutionInput placeholder collection to pass the value dynamically on each execution (see the example following the argument list).

model_name

(str or Placeholder): Specify a model name for the transform job to use. We recommend using an ExecutionInput placeholder collection to pass the value dynamically on each execution.

data

(str): Input data location in S3.

data_type

(str): What the S3 location defines (default: 'S3Prefix'). Valid values:

  • 'S3Prefix' - the S3 URI defines a key name prefix. All objects with this prefix will be used as inputs for the transform job.

  • 'ManifestFile' - the S3 URI points to a single manifest file listing each S3 object to use as an input for the transform job.

content_type

(str): MIME type of the input data (default: None).

compression_type

(str): Compression type of the input data, if compressed (default: None). Valid values: 'Gzip', None.

split_type

(str): The record delimiter for the input object (default: 'None'). Valid values: 'None', 'Line', 'RecordIO', and 'TFRecord'.

experiment_config

(list, optional): Specify the experiment config for the transform. (default: None)

wait_for_completion

(bool, optional): Set to `TRUE` if the Task state should wait for the transform job to complete before proceeding to the next step in the workflow. Set to `FALSE` if the Task state should submit the transform job and proceed to the next step. (default: TRUE)

tags

(list[list], optional): List of tags (https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html) to associate with the resource.

input_filter

(str): A JSONPath to select a portion of the input to pass to the algorithm container for inference. If you omit the field, it gets the value '$', representing the entire input. For CSV data, each row is taken as a JSON array, so only index-based JSONPaths can be applied, e.g. $[0], $[1:]. CSV data should follow the RFC 4180 format. See Supported JSONPath Operators for a table of supported JSONPath operators. For more information, see the SageMaker API documentation for CreateTransformJob. Some examples: "$[1:]", "$.features" (default: None).

output_filter

(str): A JSONPath to select a portion of the joined/original output to return as the output. For more information, see the SageMaker API documentation for CreateTransformJob. Some examples: "$[1:]", "$.prediction" (default: None).

join_source

(str): The source of data to be joined to the transform output. It can be set to 'Input', meaning the entire input record will be joined to the inference result. You can use output_filter to select the useful portion before uploading to S3 (default: None). Valid values: 'Input', None.

...

: Extra fields passed to the Task class.
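
Examples

A minimal sketch of constructing a TransformStep. It assumes `transformer` is an existing SageMaker Transformer object created elsewhere (this class does not build one), and the S3 URI, job name, and model name below are illustrative placeholders, not real resources.

library(stepfunctions)

transform_step <- TransformStep$new(
  state_id = "SageMaker Batch Transform",
  transformer = transformer,              # pre-built SageMaker Transformer (assumed)
  job_name = "my-transform-job",          # or an ExecutionInput placeholder
  model_name = "my-trained-model",
  data = "s3://my-bucket/input-prefix/",
  data_type = "S3Prefix",                 # use every object under the prefix as input
  content_type = "text/csv",
  split_type = "Line",                    # treat each line as one record
  input_filter = "$[1:]",                 # drop the first CSV column before inference
  join_source = "Input",                  # join each inference result to its input record
  output_filter = "$[1:]",                # keep the joined record minus its first column
  wait_for_completion = TRUE              # block until the transform job finishes
)

Set wait_for_completion = FALSE instead if the workflow should submit the job and proceed to the next state without waiting.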


Method clone()

The objects of this class are cloneable with this method.

Usage

TransformStep$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
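
For example, a deep copy of the hypothetical step from the example above:

step_copy <- transform_step$clone(deep = TRUE)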