Utilises AWS Athena to convert AWS S3 backend file types. It also allows the creation of more efficient file types, i.e. "parquet" and "orc", from SQL queries.
dbConvertTable(conn, obj, name, ...)
# S4 method for AthenaConnection
dbConvertTable(
  conn,
  obj,
  name,
  partition = NULL,
  s3.location = NULL,
  file.type = c("NULL", "csv", "tsv", "parquet", "json", "orc"),
  compress = TRUE,
  data = TRUE,
  ...
)

conn: An AthenaConnection object, produced by [DBI::dbConnect()]

obj: Athena table or SQL DML query to be converted. For SQL, the query needs to be wrapped with DBI::SQL() and follow AWS Athena DML format.

name: Name of destination table.

...: Extra parameters, currently not used.

partition: Partition Athena table.

s3.location: Location to store the output file; must be in S3 URI format, for example "s3://mybucket/data/".

file.type: File type for name; currently supports "NULL", "csv", "tsv", "parquet", "json" and "orc". "NULL" will let Athena set the file type for you.

compress: Compress name; currently only "parquet" and "orc" can be compressed (AWS Athena CTAS).

data: Whether name should be created with data or not.
dbConvertTable() returns TRUE (invisibly).
if (FALSE) {
# Note:
# - Requires an AWS account to run the below example.
# - Different connection methods can be used; please see the `RAthena::dbConnect` documentation.
library(DBI)
library(RAthena)
# Demo connection to Athena using profile name
con <- dbConnect(athena())
# write iris table to Athena in default delimited format
dbWriteTable(con, "iris", iris)
# convert delimited table to parquet
dbConvertTable(con,
  obj = "iris",
  name = "iris_parquet",
  file.type = "parquet"
)
# Create partitioned table from non-partitioned
# iris table using SQL DML query
dbConvertTable(con,
  obj = SQL("select
               iris.*,
               date_format(current_date, '%Y%m%d') as time_stamp
             from iris"),
  name = "iris_orc_partitioned",
  file.type = "orc",
  partition = "time_stamp"
)
# disconnect from Athena
dbDisconnect(con)
}