Create Profile Job
| gluedatabrew_create_profile_job | R Documentation |
Creates a new job to analyze a dataset and create its data profile
Description
Creates a new job to analyze a dataset and create its data profile.
Usage
gluedatabrew_create_profile_job(DatasetName, EncryptionKeyArn,
  EncryptionMode, Name, LogSubscription, MaxCapacity, MaxRetries,
  OutputLocation, Configuration, ValidationConfigurations, RoleArn, Tags,
  Timeout, JobSample)
Arguments
DatasetName: [required] The name of the dataset that this job is to act upon.
EncryptionKeyArn: The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
EncryptionMode: The encryption mode for the job, which can be one of the following:
  SSE-KMS - Server-side encryption with KMS-managed keys.
  SSE-S3 - Server-side encryption with keys managed by Amazon S3.
Name: [required] The name of the job to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space.
LogSubscription: Enables or disables Amazon CloudWatch logging for the job. If logging is enabled, CloudWatch writes one log stream for each job run.
MaxCapacity: The maximum number of nodes that DataBrew can use when the job processes data.
MaxRetries: The maximum number of times to retry the job after a job run fails.
OutputLocation: [required]
Configuration: Configuration for profile jobs. Used to select columns, do evaluations, and override default parameters of evaluations. When configuration is null, the profile job will run with default settings.
ValidationConfigurations: List of validation configurations that are applied to the profile job.
RoleArn: [required] The Amazon Resource Name (ARN) of the Identity and Access Management (IAM) role to be assumed when DataBrew runs the job.
Tags: Metadata tags to apply to this job.
Timeout: The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.
JobSample: Sample configuration for profile jobs only. Determines the number of rows on which the profile job will be executed. If a JobSample value is not provided, the default value will be used. The default value is CUSTOM_ROWS for the mode parameter and 20000 for the size parameter.
Value
A list with the following syntax:
Request syntax
svc$create_profile_job(
  DatasetName = "string",
  EncryptionKeyArn = "string",
  EncryptionMode = "SSE-KMS"|"SSE-S3",
  Name = "string",
  LogSubscription = "ENABLE"|"DISABLE",
  MaxCapacity = 123,
  MaxRetries = 123,
  OutputLocation = list(
    Bucket = "string",
    Key = "string",
    BucketOwner = "string"
  ),
  Configuration = list(
    DatasetStatisticsConfiguration = list(
      IncludedStatistics = list(
        "string"
      ),
      Overrides = list(
        list(
          Statistic = "string",
          Parameters = list(
            "string"
          )
        )
      )
    ),
    ProfileColumns = list(
      list(
        Regex = "string",
        Name = "string"
      )
    ),
    ColumnStatisticsConfigurations = list(
      list(
        Selectors = list(
          list(
            Regex = "string",
            Name = "string"
          )
        ),
        Statistics = list(
          IncludedStatistics = list(
            "string"
          ),
          Overrides = list(
            list(
              Statistic = "string",
              Parameters = list(
                "string"
              )
            )
          )
        )
      )
    ),
    EntityDetectorConfiguration = list(
      EntityTypes = list(
        "string"
      ),
      AllowedStatistics = list(
        list(
          Statistics = list(
            "string"
          )
        )
      )
    )
  ),
  ValidationConfigurations = list(
    list(
      RulesetArn = "string",
      ValidationMode = "CHECK_ALL"
    )
  ),
  RoleArn = "string",
  Tags = list(
    "string"
  ),
  Timeout = 123,
  JobSample = list(
    Mode = "FULL_DATASET"|"CUSTOM_ROWS",
    Size = 123
  )
)
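
The request syntax above lists every field, but only DatasetName, Name, OutputLocation, and RoleArn are marked required. The following is a minimal sketch of such a call, with JobSample added to override the default sample size. It assumes the paws package is installed and AWS credentials are configured. The dataset name, job name, role ARN, and bucket are hypothetical placeholders, and the service call is commented out so the snippet runs without network access.

```r
# Hypothetical placeholder values; replace with your own resources.
job_args <- list(
  DatasetName = "example-dataset",
  Name = "example-profile-job",
  RoleArn = "arn:aws:iam::123456789012:role/ExampleDataBrewRole",
  OutputLocation = list(
    Bucket = "example-bucket",
    Key = "profile-output/"
  ),
  # Profile 5000 rows instead of the documented default of 20000.
  JobSample = list(Mode = "CUSTOM_ROWS", Size = 5000)
)

# Check the job name against the documented character set
# (alphanumeric, hyphen, period, and space) before sending the request.
name_ok <- grepl("^[A-Za-z0-9. -]+$", job_args$Name)
stopifnot(name_ok)

# Uncomment to create the job (requires paws and valid AWS credentials):
# svc <- paws::gluedatabrew()
# result <- do.call(svc$create_profile_job, job_args)
```

Building the arguments as a named list and passing them with do.call keeps the optional fields easy to add or drop without rewriting the call.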