Creates an interface to compose CREATE EXTERNAL TABLE
.
# S4 method for AthenaConnection
sqlCreateTable(
con,
table,
fields,
field.types = NULL,
partition = NULL,
s3.location = NULL,
file.type = c("tsv", "csv", "parquet", "json"),
compress = FALSE,
...
)
A database connection.
The table name, passed on to dbQuoteIdentifier()
. Options are:
a character string with the unquoted DBMS table name,
e.g. "table_name"
,
a call to Id()
with components to the fully qualified table name,
e.g. Id(schema = "my_schema", table = "table_name")
a call to SQL()
with the quoted and fully qualified table name
given verbatim, e.g. SQL('"my_schema"."table_name"')
Either a character vector or a data frame.
A named character vector: Names are column names, values are types.
Names are escaped with dbQuoteIdentifier()
.
Field types are unescaped.
A data frame: field types are generated using
dbDataType()
.
Additional field types used to override derived types.
Partition Athena table (needs to be a named list or vector) for example: c(var1 = "2019-20-13")
s3 bucket to store Athena table, must be set as a s3 uri for example ("s3://mybucket/data/").
By default s3.location is set s3 staging directory from AthenaConnection
object.
What file type to store data.frame on s3, RAthena currently supports ["tsv", "csv", "parquet", "json"]. Default delimited file type is "tsv", in previous versions
of RAthena (=< 1.6.0)
file type "csv" was used as default. The reason for the change is that columns containing Array/JSON
format cannot be written to
Athena due to the separating value ",". This would cause issues with AWS Athena.
Note: "parquet" format is supported by the arrow
package and it will need to be installed to utilise the "parquet" format.
"json" format is supported by jsonlite
package and it will need to be installed to utilise the "json" format.
FALSE | TRUE
To determine if to compress file.type. If file type is ["csv", "tsv"] then "gzip" compression is used, for file type "parquet"
"snappy" compression is used. Currently RAthena
doesn't support compression for "json" file type.
Other arguments used by individual methods.
sqlCreateTable
returns data.frame's DDL
in the SQL
format.
if (FALSE) {
# Note:
# - Require AWS Account to run below example.
# - Different connection methods can be used please see `RAthena::dbConnect` documnentation
library(DBI)
# Demo connection to Athena using profile name
con <- dbConnect(RAthena::athena())
# Create DDL for iris data.frame
sqlCreateTable(con, "iris", iris, s3.location = "s3://path/to/athena/table")
# Create DDL for iris data.frame with partition
sqlCreateTable(con, "iris", iris,
partition = "timestamp",
s3.location = "s3://path/to/athena/table")
# Create DDL for iris data.frame with partition and file.type parquet
sqlCreateTable(con, "iris", iris,
partition = "timestamp",
s3.location = "s3://path/to/athena/table",
file.type = "parquet")
# Disconnect from Athena
dbDisconnect(con)
}