Update Crawler

glue_update_crawler

R Documentation

Updates a crawler

Description

Updates a crawler. If a crawler is running, you must stop it using stop_crawler before updating it.
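The stop-before-update requirement can be sketched as a small wrapper. This is a minimal illustration, not part of the package: `update_crawler_safely`, `poll_seconds`, and `max_polls` are hypothetical names, and `svc` is assumed to be a Glue client created with `paws::glue()`. It relies on the crawler state reported by `get_crawler` being one of `READY`, `RUNNING`, or `STOPPING`.

```r
# Hypothetical helper: stop a running crawler, wait until it is READY,
# then forward the remaining arguments to update_crawler().
update_crawler_safely <- function(svc, name, ..., poll_seconds = 5, max_polls = 60) {
  state <- svc$get_crawler(Name = name)$Crawler$State

  # Only request a stop when the crawler is actually running.
  if (identical(state, "RUNNING")) {
    svc$stop_crawler(Name = name)
  }

  # Poll until the crawler reports READY (covers RUNNING and STOPPING).
  polls <- 0
  while (!identical(state, "READY")) {
    if (polls >= max_polls) {
      stop("Crawler '", name, "' did not reach the READY state in time")
    }
    Sys.sleep(poll_seconds)
    state <- svc$get_crawler(Name = name)$Crawler$State
    polls <- polls + 1
  }

  svc$update_crawler(Name = name, ...)
}

# Usage (requires AWS credentials; the crawler name is a placeholder):
# svc <- paws::glue()
# update_crawler_safely(svc, "example-crawler", Description = "Nightly crawl")
```

The state check and the stop request are separate calls, so a crawler that starts between them would still cause the update to fail; treat the wrapper as a convenience rather than a guarantee.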

Usage

glue_update_crawler(Name, Role, DatabaseName, Description, Targets,
  Schedule, Classifiers, TablePrefix, SchemaChangePolicy, RecrawlPolicy,
  LineageConfiguration, LakeFormationConfiguration, Configuration,
  CrawlerSecurityConfiguration)

Arguments

Name: [required] Name of the crawler to update.
Role: The IAM role, or the Amazon Resource Name (ARN) of an IAM role, that the crawler uses to access customer resources.
DatabaseName: The Glue database where results are stored, such as: arn:aws:daylight:us-east-1::database/sometable/*.
Description: A description of the crawler.
Targets: A list of targets to crawl.
Schedule: A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
Classifiers: A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
TablePrefix: The table prefix used for catalog tables that are created.
SchemaChangePolicy: The policy for the crawler's update and deletion behavior.
RecrawlPolicy: A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
LineageConfiguration: Specifies data lineage configuration settings for the crawler.
LakeFormationConfiguration: Specifies Lake Formation configuration settings for the crawler.
Configuration: Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.
CrawlerSecurityConfiguration: The name of the SecurityConfiguration structure to be used by this crawler.
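As an illustration of how several of these arguments fit together, the sketch below collects them in a named list that can be passed with `do.call`. The crawler name and table prefix are placeholders, and the `Configuration` value is only one example of the versioned JSON string described above; check the option names against the Glue crawler configuration documentation before relying on them.

```r
# Arguments for an update, collected in a named list.
# All concrete values here are placeholders chosen for illustration.
crawler_args <- list(
  Name = "example-crawler",
  # Every day at 12:15 UTC, as in the Schedule description above.
  Schedule = "cron(15 12 * * ? *)",
  TablePrefix = "raw_",
  SchemaChangePolicy = list(
    UpdateBehavior = "UPDATE_IN_DATABASE",
    DeleteBehavior = "LOG"
  ),
  RecrawlPolicy = list(
    RecrawlBehavior = "CRAWL_EVERYTHING"
  ),
  # Configuration is passed as a single JSON string, not as an R list.
  Configuration = paste0(
    '{"Version":1.0,',
    '"Grouping":{"TableGroupingPolicy":"CombineCompatibleSchemas"}}'
  )
)

# Usage (requires AWS credentials):
# svc <- paws::glue()
# do.call(svc$update_crawler, crawler_args)
```

Keeping the arguments in a list makes it easy to inspect or log the request before sending it.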

Value

An empty list.

Request syntax

svc$update_crawler(
  Name = "string",
  Role = "string",
  DatabaseName = "string",
  Description = "string",
  Targets = list(
    S3Targets = list(
      list(
        Path = "string",
        Exclusions = list(
          "string"
        ),
        ConnectionName = "string",
        SampleSize = 123,
        EventQueueArn = "string",
        DlqEventQueueArn = "string"
      )
    ),
    JdbcTargets = list(
      list(
        ConnectionName = "string",
        Path = "string",
        Exclusions = list(
          "string"
        ),
        EnableAdditionalMetadata = list(
          "COMMENTS"|"RAWTYPES"
        )
      )
    ),
    MongoDBTargets = list(
      list(
        ConnectionName = "string",
        Path = "string",
        ScanAll = TRUE|FALSE
      )
    ),
    DynamoDBTargets = list(
      list(
        Path = "string",
        scanAll = TRUE|FALSE,
        scanRate = 123.0
      )
    ),
    CatalogTargets = list(
      list(
        DatabaseName = "string",
        Tables = list(
          "string"
        ),
        ConnectionName = "string",
        EventQueueArn = "string",
        DlqEventQueueArn = "string"
      )
    ),
    DeltaTargets = list(
      list(
        DeltaTables = list(
          "string"
        ),
        ConnectionName = "string",
        WriteManifest = TRUE|FALSE,
        CreateNativeDeltaTable = TRUE|FALSE
      )
    ),
    IcebergTargets = list(
      list(
        Paths = list(
          "string"
        ),
        ConnectionName = "string",
        Exclusions = list(
          "string"
        ),
        MaximumTraversalDepth = 123
      )
    ),
    HudiTargets = list(
      list(
        Paths = list(
          "string"
        ),
        ConnectionName = "string",
        Exclusions = list(
          "string"
        ),
        MaximumTraversalDepth = 123
      )
    )
  ),
  Schedule = "string",
  Classifiers = list(
    "string"
  ),
  TablePrefix = "string",
  SchemaChangePolicy = list(
    UpdateBehavior = "LOG"|"UPDATE_IN_DATABASE",
    DeleteBehavior = "LOG"|"DELETE_FROM_DATABASE"|"DEPRECATE_IN_DATABASE"
  ),
  RecrawlPolicy = list(
    RecrawlBehavior = "CRAWL_EVERYTHING"|"CRAWL_NEW_FOLDERS_ONLY"|"CRAWL_EVENT_MODE"
  ),
  LineageConfiguration = list(
    CrawlerLineageSettings = "ENABLE"|"DISABLE"
  ),
  LakeFormationConfiguration = list(
    UseLakeFormationCredentials = TRUE|FALSE,
    AccountId = "string"
  ),
  Configuration = "string",
  CrawlerSecurityConfiguration = "string"
)
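The nesting in `Targets` is easy to get wrong: each target type (for example `S3Targets`) is a list whose elements are themselves lists, one per target. The sketch below builds such a structure with a small helper. `s3_target` is a hypothetical function, not part of the package, and the bucket paths and exclusion pattern are placeholders.

```r
# Hypothetical helper: build one element of Targets$S3Targets.
# Fields left as NULL are dropped so only the fields you set are included.
s3_target <- function(path, exclusions = NULL, sample_size = NULL) {
  target <- list(
    Path = path,
    Exclusions = if (!is.null(exclusions)) as.list(exclusions),
    SampleSize = sample_size
  )
  Filter(Negate(is.null), target)
}

# Targets is a named list; S3Targets is a list of per-target lists.
targets <- list(
  S3Targets = list(
    s3_target("s3://example-bucket/raw/", exclusions = "**.tmp"),
    s3_target("s3://example-bucket/curated/")
  )
)

# Usage (requires AWS credentials; the crawler name is a placeholder):
# svc <- paws::glue()
# svc$update_crawler(Name = "example-crawler", Targets = targets)
```

The same pattern applies to the other target types in the request syntax, with the field names changed accordingly.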