This creates a file-system-like API based on the fs package
(e.g. dir_ls, file_copy, etc.) for AWS S3 storage.
s3_cache
Cache of AWS S3 objects
s3_cache_bucket
Cached S3 bucket
s3_client
paws S3 client
region_name
AWS region to use when creating new connections
profile_name
The name of a profile to use
multipart_threshold
Threshold at which multipart methods are used instead of standard copy and upload methods
request_payer
Confirms that the requester knows that they will be charged for the request
pid
Process ID of the R session
retries
Number of retries
new()
Initialize S3FileSystem class
S3FileSystem$new(
aws_access_key_id = NULL,
aws_secret_access_key = NULL,
aws_session_token = NULL,
region_name = NULL,
profile_name = NULL,
endpoint = NULL,
disable_ssl = FALSE,
multipart_threshold = fs_bytes("2GB"),
request_payer = FALSE,
anonymous = FALSE,
...
)
aws_access_key_id
(character): AWS access key ID
aws_secret_access_key
(character): AWS secret access key
aws_session_token
(character): AWS temporary session token
region_name
(character): Default region when creating new connections
profile_name
(character): The name of a profile to use. If not given, then the default profile is used.
endpoint
(character): The complete URL to use for the constructed client.
disable_ssl
(logical): Whether or not to use SSL. By default, SSL is used.
multipart_threshold
(fs_bytes): Threshold to use multipart instead of standard copy and upload methods.
request_payer
(logical): Confirms that the requester knows that they will be charged for the request.
anonymous
(logical): Set up anonymous credentials when connecting to AWS S3.
...
Other parameters passed to the paws client.
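For illustration, a minimal construction might look like the following sketch (the region is a placeholder; credentials are resolved by the usual AWS default chain):

```r
library(s3fs)

# Placeholder region; credentials fall back to the standard AWS chain
# (environment variables, shared credentials file, instance profile).
s3 <- S3FileSystem$new(
  region_name = "us-east-1",
  multipart_threshold = fs::fs_bytes("2GB")
)
```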
file_chmod()
Change file permissions
S3FileSystem$file_chmod(
path,
mode = c("private", "public-read", "public-read-write", "authenticated-read",
"aws-exec-read", "bucket-owner-read", "bucket-owner-full-control")
)
file_copy()
Copy files
S3FileSystem$file_copy(
path,
new_path,
max_batch = fs_bytes("100MB"),
overwrite = FALSE,
...
)
path
(character): Path to a local directory or file, or an S3 uri.
new_path
(character): Path to a local directory or file, or an S3 uri.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
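A hedged sketch of copying between two S3 locations (both uris are placeholders; requires valid AWS credentials):

```r
s3 <- S3FileSystem$new()

# Both uris are placeholders; overwrite = TRUE replaces an existing destination.
s3$file_copy(
  "s3://my-bucket/data.csv",
  "s3://my-bucket/backup/data.csv",
  overwrite = TRUE
)
```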
file_create()
Create file on AWS S3, if file already exists it will be left unchanged.
path
(character): A character vector of path or s3 uri.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
file_delete()
Delete files in AWS S3
path
(character): A character vector of paths or s3 uris.
...
parameters to be passed to s3_delete_objects
file_download()
Downloads AWS S3 files to the local file system
path
(character): A character vector of paths or uris
new_path
(character): A character vector of paths to the new locations.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_get_object
file_info()
Returns file information for AWS S3 files
A data.table with metadata for each file. Columns returned are as follows.
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
type (character): file type (file or directory)
etag (character): Entity tag, an opaque identifier for a specific version of the file's contents
last_modified (POSIXct): Date and time the file was last modified
delete_marker (logical): Specifies whether the object retrieved was (TRUE) or was not (FALSE) a delete marker
accept_ranges (character): Indicates that a range of bytes was specified.
expiration (character): File expiration
restore (character): If file is archived
archive_status (character): Archive status
missing_meta (integer): Number of metadata entries not returned in "x-amz-meta" headers
version_id (character): version id of file
cache_control (character): caching behaviour for the request/reply chain
content_disposition (character): presentational information of file
content_encoding (character): file content encodings
content_language (character): what language the content is in
content_type (character): file MIME type
expires (POSIXct): date and time the file is no longer cacheable
website_redirect_location (character): URL to which requests for the file are redirected, if the bucket is configured as a website
server_side_encryption (character): File server side encryption
metadata (list): metadata of file
sse_customer_algorithm (character): server-side encryption with a customer-provided encryption key
sse_customer_key_md5 (character): server-side encryption with a customer-provided encryption key
ssekms_key_id (character): ID of the Amazon Web Services Key Management Service
bucket_key_enabled (logical): whether the file uses an S3 Bucket Key for server-side encryption with SSE-KMS
storage_class (character): file storage class information
request_charged (character): indicates that the requester was successfully charged for the request
replication_status (character): returned if the request involves a bucket that is either a source or a destination in a replication rule (see https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.head_object)
parts_count (integer): number of parts the file has
object_lock_mode (character): the file lock mode
object_lock_retain_until_date (POSIXct): date and time of when object_lock_mode expires
object_lock_legal_hold_status (character): file legal hold status
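A brief usage sketch (the uri is a placeholder; requires valid AWS credentials):

```r
s3 <- S3FileSystem$new()

# Placeholder uri; returns a data.table with the columns listed above.
info <- s3$file_info("s3://my-bucket/data.csv")
info$size          # file size in bytes
info$last_modified # POSIXct timestamp
```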
file_move()
Move files to another location on AWS S3
S3FileSystem$file_move(
path,
new_path,
max_batch = fs_bytes("100MB"),
overwrite = FALSE,
...
)
path
(character): A character vector of s3 uris.
new_path
(character): A character vector of s3 uris.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_copy_object
file_stream_in()
Streams in AWS S3 file as a raw vector
path
(character): A character vector of paths or s3 uri
...
parameters to be passed to s3_get_object
file_stream_out()
Streams out raw vector to AWS S3 file
S3FileSystem$file_stream_out(
obj,
path,
max_batch = fs_bytes("100MB"),
overwrite = FALSE,
...
)
obj
(raw|character): A raw vector, rawConnection, url to be streamed up to AWS S3.
path
(character): A character vector of paths or s3 uri
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
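One way to use this is to serialize an R object to a raw vector first, as in this sketch (the uri is a placeholder):

```r
s3 <- S3FileSystem$new()

# Serialize an R object to a raw vector and stream it to a placeholder uri.
obj <- serialize(mtcars, connection = NULL)
s3$file_stream_out(obj, "s3://my-bucket/mtcars.rds", overwrite = TRUE)
```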
file_temp()
Returns a name which can be used as a temporary file
file_tag_delete()
Delete file tags
path
(character): A character vector of paths or s3 uri
...
parameters to be passed to s3_put_object
file_tag_info()
Get file tags
data.table of file tag metadata
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
version_id (character): version id of file
tag_key (character): name of tag
tag_value (character): tag value
file_tag_update()
Update file tags
file_touch()
Similar to fs::file_touch
this does not create the file if
it does not exist. Use s3fs$file_create()
to do this if needed.
path
(character): A character vector of paths or s3 uri
...
parameters to be passed to s3_copy_object
file_upload()
Uploads files to AWS S3
S3FileSystem$file_upload(
path,
new_path,
max_batch = fs_bytes("100MB"),
overwrite = FALSE,
...
)
path
(character): A character vector of local file paths to upload to AWS S3
new_path
(character): A character vector of AWS S3 paths or uris of the new locations.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
and s3_create_multipart_upload
file_url()
Generate presigned url for S3 object
path
(character): A character vector of paths or uris
expiration
(numeric): The number of seconds the presigned url is valid for. By default it expires in an hour (3600 seconds)
...
parameters passed to s3_get_object
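A hedged sketch of generating a short-lived presigned link (the uri is a placeholder):

```r
s3 <- S3FileSystem$new()

# Placeholder uri; the link expires after 15 minutes (900 seconds).
url <- s3$file_url("s3://my-bucket/report.pdf", expiration = 900)
```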
file_version_info()
Get file versions
path
(character): A character vector of paths or uris
...
parameters to be passed to s3_list_object_versions
Returns a data.table with file version info, with the columns below:
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
version_id (character): version id of file
owner (character): file owner
etag (character): Entity tag, an opaque identifier for a specific version of the file's contents
last_modified (POSIXct): Date and time the file was last modified
is_bucket()
Test whether the path is a bucket
path
(character): A character vector of paths or uris
...
parameters to be passed to s3_list_objects_v2
bucket_chmod()
Change bucket permissions
S3FileSystem$bucket_chmod(
path,
mode = c("private", "public-read", "public-read-write", "authenticated-read")
)
bucket_create()
Create bucket
S3FileSystem$bucket_create(
path,
region_name = NULL,
mode = c("private", "public-read", "public-read-write", "authenticated-read"),
versioning = FALSE,
...
)
path
(character): A character vector of path or s3 uri.
region_name
(character): AWS region the bucket is to be created in
mode
(character): Bucket permission mode
versioning
(logical): Whether to enable versioning on the bucket.
...
parameters to be passed to s3_create_bucket
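A minimal sketch (the bucket name is a placeholder and must be globally unique; requires valid AWS credentials):

```r
s3 <- S3FileSystem$new()

# Placeholder bucket name; creates a versioned private bucket.
s3$bucket_create(
  "s3://my-example-bucket",
  region_name = "us-east-1",
  versioning = TRUE
)
```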
dir_copy()
Copies the directory recursively to the new location.
S3FileSystem$dir_copy(
path,
new_path,
max_batch = fs_bytes("100MB"),
overwrite = FALSE,
...
)
path
(character): Path to a local directory or file, or an S3 uri.
new_path
(character): Path to a local directory or file, or an S3 uri.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
and s3_create_multipart_upload
dir_create()
Create empty directory
path
(character): A character vector of directories or uris to be created in AWS S3
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
dir_download()
Downloads AWS S3 files to the local file system
path
(character): A character vector of paths or uris
new_path
(character): A character vector of paths to the new locations.
Please ensure directories end with a /.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_get_object
dir_info()
Returns file information within AWS S3 directory
S3FileSystem$dir_info(
path = ".",
type = c("any", "bucket", "directory", "file"),
glob = NULL,
regexp = NULL,
invert = FALSE,
recurse = FALSE,
refresh = FALSE,
...
)
path
(character): A character vector of one or more paths. Can be path or s3 uri.
type
(character): File type(s) to return. Default ("any") returns all AWS S3 object types.
glob
(character): A wildcard pattern (e.g. *.csv), passed on to grep() to filter paths.
regexp
(character): A regular expression (e.g. [.]csv$), passed on to grep() to filter paths.
invert
(logical): If TRUE, return files which do not match the glob or regexp.
recurse
(logical): Returns all AWS S3 objects in lower subdirectories.
refresh
(logical): Refresh the cache in s3_cache.
...
parameters to be passed to s3_list_objects_v2
data.table with directory metadata
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
version_id (character): version id of file
etag (character): Entity tag, an opaque identifier for a specific version of the file's contents
last_modified (POSIXct): Date and time the file was last modified
dir_ls()
Returns file names within an AWS S3 directory
S3FileSystem$dir_ls(
path = ".",
type = c("any", "bucket", "directory", "file"),
glob = NULL,
regexp = NULL,
invert = FALSE,
recurse = FALSE,
refresh = FALSE,
...
)
path
(character): A character vector of one or more paths. Can be path or s3 uri.
type
(character): File type(s) to return. Default ("any") returns all AWS S3 object types.
glob
(character): A wildcard pattern (e.g. *.csv), passed on to grep() to filter paths.
regexp
(character): A regular expression (e.g. [.]csv$), passed on to grep() to filter paths.
invert
(logical): If TRUE, return files which do not match the glob or regexp.
recurse
(logical): Returns all AWS S3 objects in lower subdirectories.
refresh
(logical): Refresh the cache in s3_cache.
...
parameters to be passed to s3_list_objects_v2
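A hedged sketch of listing with a glob filter (the path is a placeholder; requires valid AWS credentials):

```r
s3 <- S3FileSystem$new()

# Placeholder path: list csv files recursively, filtering with a glob.
csv_files <- s3$dir_ls(
  "s3://my-bucket/data",
  type = "file",
  glob = "*.csv",
  recurse = TRUE
)
```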
dir_ls_url()
Generate presigned url to list S3 directories
path
(character): A character vector of paths or uris
expiration
(numeric): The number of seconds the presigned url is valid for. By default it expires in an hour (3600 seconds)
recurse
(logical): Returns all AWS S3 objects in lower subdirectories
...
parameters passed to s3_list_objects_v2
dir_tree()
Print contents of directories in a tree-like format
path
(character): A path to print the tree from
recurse
(logical): Returns all AWS S3 objects in lower subdirectories
...
Additional arguments passed to s3_dir_ls.
dir_upload()
Uploads local directory to AWS S3
S3FileSystem$dir_upload(
path,
new_path,
max_batch = fs_bytes("100MB"),
overwrite = FALSE,
...
)
path
(character): A character vector of local file paths to upload to AWS S3
new_path
(character): A character vector of AWS S3 paths or uris of the new locations.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE
and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
and s3_create_multipart_upload
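A minimal sketch (both paths are placeholders; requires valid AWS credentials):

```r
s3 <- S3FileSystem$new()

# Placeholder paths: upload a local directory to a placeholder bucket prefix.
s3$dir_upload("path/to/local/dir", "s3://my-bucket/dir")
```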
path()
Constructs an s3 uri path