Create Crawler
glue_create_crawler
Creates a new crawler with specified targets, role, configuration, and optional schedule
Description
Creates a new crawler with specified targets, role, configuration, and optional schedule. At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.
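For illustration only, a minimal sketch of such a call is shown below, creating a paws Glue client and registering a crawler with a single S3 target; the region, crawler name, role ARN, database name, and S3 path are placeholder assumptions, not values taken from this page.
# Minimal sketch: one S3 target satisfies the "at least one crawl target" requirement.
library(paws)

svc <- glue(config = list(region = "us-east-1"))  # assumed region

svc$create_crawler(
  Name = "example-crawler",                                 # hypothetical crawler name
  Role = "arn:aws:iam::123456789012:role/ExampleGlueRole",  # hypothetical role ARN
  DatabaseName = "example_db",                              # hypothetical Glue database
  Targets = list(
    S3Targets = list(
      list(Path = "s3://example-bucket/data/")              # hypothetical S3 path
    )
  )
)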
Usage
glue_create_crawler(Name, Role, DatabaseName, Description, Targets,
Schedule, Classifiers, TablePrefix, SchemaChangePolicy, RecrawlPolicy,
LineageConfiguration, LakeFormationConfiguration, Configuration,
CrawlerSecurityConfiguration, Tags)
Arguments
Name
[required] Name of the new crawler.
Role
[required] The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.
DatabaseName
The Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.
Description
A description of the new crawler.
Targets
[required] A collection of targets to crawl.
Schedule
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *). See the sketch after this argument list for a worked example.
Classifiers
A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
TablePrefix
The table prefix used for catalog tables that are created.
SchemaChangePolicy
The policy for the crawler's update and deletion behavior.
RecrawlPolicy
A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
LineageConfiguration
Specifies data lineage configuration settings for the crawler.
LakeFormationConfiguration
Specifies Lake Formation configuration settings for the crawler.
Configuration
Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.
CrawlerSecurityConfiguration
The name of the SecurityConfiguration structure to be used by this crawler.
Tags
The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
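As noted under Schedule, the sketch below illustrates typical formats for the schedule, schema change policy, configuration, and tag arguments; the crawler name, role, database, S3 path, configuration options, and tag values are placeholder assumptions rather than documented defaults.
# Illustrative argument formats (all names and option values below are assumptions).
svc$create_crawler(
  Name = "scheduled-crawler",
  Role = "ExampleGlueRole",
  DatabaseName = "example_db",
  Targets = list(
    S3Targets = list(list(Path = "s3://example-bucket/data/"))
  ),
  # Run every day at 12:15 UTC, per the cron format described above.
  Schedule = "cron(15 12 * * ? *)",
  # Update changed schemas in place; deprecate (rather than delete) removed tables.
  SchemaChangePolicy = list(
    UpdateBehavior = "UPDATE_IN_DATABASE",
    DeleteBehavior = "DEPRECATE_IN_DATABASE"
  ),
  # Versioned JSON configuration string; this particular setting is included
  # as an illustrative assumption (see Setting crawler configuration options).
  Configuration = '{"Version":1.0,"CrawlerOutput":{"Partitions":{"AddOrUpdateBehavior":"InheritFromTable"}}}',
  # Tags are passed as a named list of strings.
  Tags = list(team = "data-platform")
)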
Value
An empty list.
Request syntax
svc$create_crawler(
Name = "string",
Role = "string",
DatabaseName = "string",
Description = "string",
Targets = list(
S3Targets = list(
list(
Path = "string",
Exclusions = list(
"string"
),
ConnectionName = "string",
SampleSize = 123,
EventQueueArn = "string",
DlqEventQueueArn = "string"
)
),
JdbcTargets = list(
list(
ConnectionName = "string",
Path = "string",
Exclusions = list(
"string"
),
EnableAdditionalMetadata = list(
"COMMENTS"|"RAWTYPES"
)
)
),
MongoDBTargets = list(
list(
ConnectionName = "string",
Path = "string",
ScanAll = TRUE|FALSE
)
),
DynamoDBTargets = list(
list(
Path = "string",
scanAll = TRUE|FALSE,
scanRate = 123.0
)
),
CatalogTargets = list(
list(
DatabaseName = "string",
Tables = list(
"string"
),
ConnectionName = "string",
EventQueueArn = "string",
DlqEventQueueArn = "string"
)
),
DeltaTargets = list(
list(
DeltaTables = list(
"string"
),
ConnectionName = "string",
WriteManifest = TRUE|FALSE,
CreateNativeDeltaTable = TRUE|FALSE
)
),
IcebergTargets = list(
list(
Paths = list(
"string"
),
ConnectionName = "string",
Exclusions = list(
"string"
),
MaximumTraversalDepth = 123
)
),
HudiTargets = list(
list(
Paths = list(
"string"
),
ConnectionName = "string",
Exclusions = list(
"string"
),
MaximumTraversalDepth = 123
)
)
),
Schedule = "string",
Classifiers = list(
"string"
),
TablePrefix = "string",
SchemaChangePolicy = list(
UpdateBehavior = "LOG"|"UPDATE_IN_DATABASE",
DeleteBehavior = "LOG"|"DELETE_FROM_DATABASE"|"DEPRECATE_IN_DATABASE"
),
RecrawlPolicy = list(
RecrawlBehavior = "CRAWL_EVERYTHING"|"CRAWL_NEW_FOLDERS_ONLY"|"CRAWL_EVENT_MODE"
),
LineageConfiguration = list(
CrawlerLineageSettings = "ENABLE"|"DISABLE"
),
LakeFormationConfiguration = list(
UseLakeFormationCredentials = TRUE|FALSE,
AccountId = "string"
),
Configuration = "string",
CrawlerSecurityConfiguration = "string",
Tags = list(
"string"
)
)
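Beyond S3, the same request shape accepts other target types. The following sketch pairs a JDBC target with a DynamoDB target; the connection name, include path, table name, and scan rate are placeholder assumptions.
# Sketch of a crawler over a JDBC source and a DynamoDB table
# (connection, path, and table names are hypothetical).
svc$create_crawler(
  Name = "jdbc-and-dynamodb-crawler",
  Role = "ExampleGlueRole",
  DatabaseName = "example_db",
  Targets = list(
    JdbcTargets = list(
      list(
        ConnectionName = "example-jdbc-connection",  # assumed Glue connection
        Path = "exampledb/%",                        # assumed include path
        Exclusions = list("tmp_*")                   # skip temporary tables
      )
    ),
    DynamoDBTargets = list(
      list(
        Path = "example-table",  # DynamoDB table name (hypothetical)
        scanAll = TRUE,
        scanRate = 0.5           # illustrative read-capacity percentage
      )
    )
  )
)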