Create Profile Job
gluedatabrew_create_profile_job | R Documentation |
Creates a new job to analyze a dataset and create its data profile
Description
Creates a new job to analyze a dataset and create its data profile.
Usage
gluedatabrew_create_profile_job(DatasetName, EncryptionKeyArn,
  EncryptionMode, Name, LogSubscription, MaxCapacity, MaxRetries,
  OutputLocation, Configuration, ValidationConfigurations, RoleArn, Tags,
  Timeout, JobSample)
Arguments
DatasetName |
[required] The name of the dataset that this job is to act upon. |
EncryptionKeyArn |
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job. |
EncryptionMode |
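The encryption mode for the job, which can be one of the following: SSE-KMS (server-side encryption with keys managed by KMS) or SSE-S3 (server-side encryption with keys managed by Amazon S3). |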
Name |
[required] The name of the job to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space. |
LogSubscription |
Enables or disables Amazon CloudWatch logging for the job. If logging is enabled, CloudWatch writes one log stream for each job run. |
MaxCapacity |
The maximum number of nodes that DataBrew can use when the job processes data. |
MaxRetries |
The maximum number of times to retry the job after a job run fails. |
OutputLocation |
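[required] The Amazon S3 location (Bucket, Key, and BucketOwner, as listed in the request syntax) for the job's output. |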
Configuration |
Configuration for profile jobs. Used to select columns, do evaluations, and override default parameters of evaluations. When configuration is null, the profile job will run with default settings. |
ValidationConfigurations |
List of validation configurations that are applied to the profile job. |
RoleArn |
[required] The Amazon Resource Name (ARN) of the Identity and Access Management (IAM) role to be assumed when DataBrew runs the job. |
Tags |
Metadata tags to apply to this job. |
Timeout |
The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT. |
JobSample |
Sample configuration for profile jobs only. Determines the number of rows on which the profile job will be executed. If a JobSample value is not provided, the default value will be used. The default value is CUSTOM_ROWS for the mode parameter and 20000 for the size parameter. |
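As a rough illustration of how the required arguments fit together, the sketch below assembles them into a list and checks the job name against the character set described for Name. The helper is_valid_job_name and every example value (dataset name, bucket, role ARN) are hypothetical, and the check covers only the documented characters, not any other limits the service may enforce.

```r
# Hypothetical helper: TRUE when `name` is a single non-empty string made up
# only of the characters documented for Name (A-Z, a-z, 0-9, hyphen, period,
# space). Other service-side limits, if any, are not checked here.
is_valid_job_name <- function(name) {
  is.character(name) && length(name) == 1 &&
    grepl("^[A-Za-z0-9. -]+$", name)
}

# Placeholder values for illustration only.
profile_job_args <- list(
  DatasetName = "example-dataset",
  Name = "example-profile-job",
  OutputLocation = list(Bucket = "example-bucket", Key = "profiles/"),
  RoleArn = "arn:aws:iam::123456789012:role/ExampleDataBrewRole",
  # Same as the documented defaults, spelled out for clarity.
  JobSample = list(Mode = "CUSTOM_ROWS", Size = 20000)
)

# Fail early if the name contains a character outside the documented set.
stopifnot(is_valid_job_name(profile_job_args$Name))
```

Keeping the arguments in a plain list makes it possible to validate them before anything is sent to the service.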
Value
A list with the following syntax:
list(
  Name = "string"
)
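Since the return value is a list with a single Name element, reading the created job's name is a one-line lookup. The sketch below shows one way to do that; the wrapper create_profile_job_name and the stand-in fake_svc are hypothetical and exist only so the example runs without AWS credentials. With a real client (assumed here to be created with paws::gluedatabrew()), svc would replace fake_svc.

```r
# Hypothetical wrapper: submits the job and returns the created job's name.
# `svc` is assumed to expose create_profile_job(), as a DataBrew client does.
create_profile_job_name <- function(svc, args) {
  resp <- do.call(svc$create_profile_job, args)
  resp$Name
}

# Stand-in client for illustration: echoes the Name argument back in the
# documented response shape instead of contacting AWS.
fake_svc <- list(
  create_profile_job = function(...) list(Name = list(...)$Name)
)

# Placeholder argument values.
job_name <- create_profile_job_name(
  fake_svc,
  list(
    DatasetName = "example-dataset",
    Name = "example-profile-job",
    OutputLocation = list(Bucket = "example-bucket"),
    RoleArn = "arn:aws:iam::123456789012:role/ExampleDataBrewRole"
  )
)
```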
Request syntax
svc$create_profile_job(
  DatasetName = "string",
  EncryptionKeyArn = "string",
  EncryptionMode = "SSE-KMS"|"SSE-S3",
  Name = "string",
  LogSubscription = "ENABLE"|"DISABLE",
  MaxCapacity = 123,
  MaxRetries = 123,
  OutputLocation = list(
    Bucket = "string",
    Key = "string",
    BucketOwner = "string"
  ),
  Configuration = list(
    DatasetStatisticsConfiguration = list(
      IncludedStatistics = list(
        "string"
      ),
      Overrides = list(
        list(
          Statistic = "string",
          Parameters = list(
            "string"
          )
        )
      )
    ),
    ProfileColumns = list(
      list(
        Regex = "string",
        Name = "string"
      )
    ),
    ColumnStatisticsConfigurations = list(
      list(
        Selectors = list(
          list(
            Regex = "string",
            Name = "string"
          )
        ),
        Statistics = list(
          IncludedStatistics = list(
            "string"
          ),
          Overrides = list(
            list(
              Statistic = "string",
              Parameters = list(
                "string"
              )
            )
          )
        )
      )
    ),
    EntityDetectorConfiguration = list(
      EntityTypes = list(
        "string"
      ),
      AllowedStatistics = list(
        list(
          Statistics = list(
            "string"
          )
        )
      )
    )
  ),
  ValidationConfigurations = list(
    list(
      RulesetArn = "string",
      ValidationMode = "CHECK_ALL"
    )
  ),
  RoleArn = "string",
  Tags = list(
    "string"
  ),
  Timeout = 123,
  JobSample = list(
    Mode = "FULL_DATASET"|"CUSTOM_ROWS",
    Size = 123
  )
)
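The nested list-of-lists shape used for ProfileColumns can be tedious to write by hand. As a minimal sketch, the hypothetical helper profile_columns below builds it from a character vector of column names; the column names and the sample size are placeholders, and only the Name selector is used (Regex is left out).

```r
# Hypothetical helper: converts a character vector of column names into the
# unnamed list of list(Name = ...) entries shown for ProfileColumns.
profile_columns <- function(names) {
  lapply(names, function(n) list(Name = n))
}

# Restrict profiling to two example columns.
configuration <- list(
  ProfileColumns = profile_columns(c("order_id", "amount"))
)

# Profile a custom number of rows instead of the default sample size.
job_sample <- list(Mode = "CUSTOM_ROWS", Size = 5000)
```

These two lists would then be passed as the Configuration and JobSample arguments alongside the required ones.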