Utilises AWS Athena to convert AWS S3 backend file types. It also allows more efficient file types, i.e. "parquet" and "orc", to be created from SQL queries.
dbConvertTable(conn, obj, name, ...)

# S4 method for AthenaConnection
dbConvertTable(
  conn,
  obj,
  name,
  partition = NULL,
  s3.location = NULL,
  file.type = c("NULL", "csv", "tsv", "parquet", "json", "orc"),
  compress = TRUE,
  data = TRUE,
  ...
)
conn | An |
---|---|
obj | Athena table or |
name | Name of destination table |
... | Extra parameters, currently not used |
partition | Partition Athena table |
s3.location | Location to store the output file; must be in S3 URI format, for example "s3://mybucket/data/". |
file.type | File type for |
compress | Compress |
data | If |
dbConvertTable() returns TRUE, invisibly.
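Because the value is returned invisibly, it is not printed at the console unless it is assigned or otherwise inspected. A minimal base R sketch of that behaviour follows. It uses a hypothetical stand-in function, `convert_stub()`, which is not part of RAthena and does not contact Athena, so it runs without an AWS account:

```r
# Hypothetical stand-in that mimics the documented return value of
# dbConvertTable(): TRUE, returned invisibly. It does not talk to Athena.
convert_stub <- function() {
  invisible(TRUE)
}

convert_stub()                        # prints nothing at top level
res <- convert_stub()                 # the value can still be captured
isTRUE(res)                           # TRUE
withVisible(convert_stub())$visible   # FALSE
```

In practice this means a call such as `ok <- dbConvertTable(...)` can be used when a script needs to check that the call returned.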
if (FALSE) {
  # Note:
  # - Requires an AWS account to run the example below.
  # - Different connection methods can be used; please see the `RAthena::dbConnect` documentation

  library(DBI)
  library(RAthena)

  # Demo connection to Athena using profile name
  con <- dbConnect(athena())

  # write iris table to Athena in default delimited format
  dbWriteTable(con, "iris", iris)

  # convert delimited table to parquet
  dbConvertTable(con,
                 obj = "iris",
                 name = "iris_parquet",
                 file.type = "parquet")

  # Create partitioned table from non-partitioned
  # iris table using SQL DML query
  dbConvertTable(con,
                 obj = SQL("select
                              iris.*,
                              date_format(current_date, '%Y%m%d') as time_stamp
                            from iris"),
                 name = "iris_orc_partitioned",
                 file.type = "orc",
                 partition = "time_stamp")

  # disconnect from Athena
  dbDisconnect(con)
}
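The calls above do not set `s3.location`. The sketch below shows how an explicit output location might be passed, with a small pre-check that the value is in S3 URI format. `is_s3_uri()` is a hypothetical helper written for this illustration (it is not part of RAthena and is not the validation the package performs), and `mybucket` is a placeholder bucket name. The Athena call is wrapped in `if (FALSE)` because it requires an AWS account.

```r
# Hypothetical helper: checks that a single string looks like an S3 URI,
# e.g. "s3://mybucket/data/". Illustrative only.
is_s3_uri <- function(x) {
  is.character(x) && length(x) == 1 && grepl("^s3://[^/]+(/.*)?$", x)
}

target <- "s3://mybucket/data/"  # placeholder bucket and prefix
stopifnot(is_s3_uri(target))

if (FALSE) {
  library(DBI)
  library(RAthena)

  con <- dbConnect(athena())

  # Convert the existing "iris" table to parquet, writing output under `target`
  dbConvertTable(con,
                 obj = "iris",
                 name = "iris_parquet",
                 s3.location = target,
                 file.type = "parquet",
                 compress = TRUE)

  dbDisconnect(con)
}
```

Checking the URI before connecting lets a malformed location fail locally instead of after a query has been submitted.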