Create Profile Job
gluedatabrew_create_profile_job | R Documentation
Creates a new job to analyze a dataset and create its data profile¶
Description¶
Creates a new job to analyze a dataset and create its data profile.
Usage¶
gluedatabrew_create_profile_job(DatasetName, EncryptionKeyArn,
EncryptionMode, Name, LogSubscription, MaxCapacity, MaxRetries,
OutputLocation, Configuration, ValidationConfigurations, RoleArn, Tags,
Timeout, JobSample)
Arguments¶
DatasetName
[required] The name of the dataset that this job is to act upon.
EncryptionKeyArn
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
EncryptionMode
The encryption mode for the job, which can be one of the following:
- SSE-KMS: Server-side encryption with KMS-managed keys.
- SSE-S3: Server-side encryption with keys managed by Amazon S3.
Name
[required] The name of the job to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space.
LogSubscription
Enables or disables Amazon CloudWatch logging for the job. If logging is enabled, CloudWatch writes one log stream for each job run.
MaxCapacity
The maximum number of nodes that DataBrew can use when the job processes data.
MaxRetries
The maximum number of times to retry the job after a job run fails.
OutputLocation
[required] The Amazon S3 location (Bucket, Key, BucketOwner) where the job writes its output.
Configuration
Configuration for profile jobs. Used to select columns, do evaluations, and override default parameters of evaluations. When configuration is null, the profile job will run with default settings.
ValidationConfigurations
List of validation configurations that are applied to the profile job.
RoleArn
[required] The Amazon Resource Name (ARN) of the Identity and Access Management (IAM) role to be assumed when DataBrew runs the job.
Tags
Metadata tags to apply to this job.
Timeout
The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.
JobSample
Sample configuration for profile jobs only. Determines the number of rows on which the profile job will be executed. If a JobSample value is not provided, the default value will be used. The default value is CUSTOM_ROWS for the mode parameter and 20000 for the size parameter.
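The constraints on Name and the defaults for JobSample described above can be checked or assembled locally before any request is sent. The following is a minimal sketch in base R; the helper names is_valid_job_name and make_job_sample are hypothetical and are not part of paws. Dropping Size when Mode is FULL_DATASET is an assumption, not something stated on this page.

```r
# Hypothetical helper: check a job name against the documented character set
# (A-Z, a-z, 0-9, hyphen, period, space). Length limits are not checked.
is_valid_job_name <- function(name) {
  is.character(name) && length(name) == 1 && !is.na(name) &&
    grepl("^[A-Za-z0-9. -]+$", name, perl = TRUE)
}

# Hypothetical helper: build a JobSample list. The defaults mirror the
# documented ones (Mode = "CUSTOM_ROWS", Size = 20000).
make_job_sample <- function(mode = c("CUSTOM_ROWS", "FULL_DATASET"),
                            size = 20000) {
  mode <- match.arg(mode)
  if (mode == "FULL_DATASET") {
    # Assumption: Size is only meaningful for CUSTOM_ROWS, so it is omitted.
    return(list(Mode = mode))
  }
  list(Mode = mode, Size = size)
}
```

For example, is_valid_job_name("nightly-profile.v2") passes, while a name containing an underscore is rejected because underscores are not in the documented character set.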
Value¶
A list with the following syntax:
Request syntax¶
svc$create_profile_job(
  DatasetName = "string",
  EncryptionKeyArn = "string",
  EncryptionMode = "SSE-KMS"|"SSE-S3",
  Name = "string",
  LogSubscription = "ENABLE"|"DISABLE",
  MaxCapacity = 123,
  MaxRetries = 123,
  OutputLocation = list(
    Bucket = "string",
    Key = "string",
    BucketOwner = "string"
  ),
  Configuration = list(
    DatasetStatisticsConfiguration = list(
      IncludedStatistics = list(
        "string"
      ),
      Overrides = list(
        list(
          Statistic = "string",
          Parameters = list(
            "string"
          )
        )
      )
    ),
    ProfileColumns = list(
      list(
        Regex = "string",
        Name = "string"
      )
    ),
    ColumnStatisticsConfigurations = list(
      list(
        Selectors = list(
          list(
            Regex = "string",
            Name = "string"
          )
        ),
        Statistics = list(
          IncludedStatistics = list(
            "string"
          ),
          Overrides = list(
            list(
              Statistic = "string",
              Parameters = list(
                "string"
              )
            )
          )
        )
      )
    ),
    EntityDetectorConfiguration = list(
      EntityTypes = list(
        "string"
      ),
      AllowedStatistics = list(
        list(
          Statistics = list(
            "string"
          )
        )
      )
    )
  ),
  ValidationConfigurations = list(
    list(
      RulesetArn = "string",
      ValidationMode = "CHECK_ALL"
    )
  ),
  RoleArn = "string",
  Tags = list(
    "string"
  ),
  Timeout = 123,
  JobSample = list(
    Mode = "FULL_DATASET"|"CUSTOM_ROWS",
    Size = 123
  )
)
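As an illustration of the request syntax, the sketch below assembles a small argument list and passes it to create_profile_job. Every value (dataset name, job name, role ARN, bucket, key, column selectors) is a made-up placeholder, and submit_profile_job is a hypothetical wrapper, not a paws function. The actual call is left commented out because it needs the paws package and valid AWS credentials.

```r
# Placeholder arguments; only the four required parameters plus a few
# optional ones are set. All values are illustrative assumptions.
profile_job_args <- list(
  DatasetName = "example-dataset",
  Name = "example-profile-job",
  RoleArn = "arn:aws:iam::123456789012:role/ExampleDataBrewRole",
  OutputLocation = list(Bucket = "example-bucket", Key = "profiles/"),
  EncryptionMode = "SSE-S3",
  LogSubscription = "ENABLE",
  MaxRetries = 0,
  Timeout = 60,
  # Profile one column by exact name and a group of columns by regex.
  Configuration = list(
    ProfileColumns = list(
      list(Name = "order_id"),
      list(Regex = "^price_.*")
    )
  ),
  JobSample = list(Mode = "CUSTOM_ROWS", Size = 20000)
)

# Hypothetical wrapper: works with any client object that exposes a
# create_profile_job function, such as the one returned by paws::gluedatabrew().
submit_profile_job <- function(svc, args) {
  do.call(svc$create_profile_job, args)
}

# Not run: requires paws and AWS credentials.
# svc <- paws::gluedatabrew()
# resp <- submit_profile_job(svc, profile_job_args)
```

Keeping the arguments in a plain list makes them easy to inspect or reuse across environments before anything is sent to the service.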