RBloggers|RBloggers-feedburner Intro: After developing the package RAthena, I stumbled quite by accident upon paws, the R SDK for AWS. As RAthena utilises Python’s SDK boto3, I thought the development of another AWS Athena package couldn’t hurt. As mentioned in my previous blog post, the paws syntax is very similar to boto3, so a lot of my RAthena code was very portable, and this gave me my final excuse to develop my next R package.
RBloggers|RBloggers-feedburner Intro: For a long time I have found it difficult to appreciate the benefits of “cloud compute” in my R model builds. This was due to my initial lack of understanding and to having to set up R on cloud compute environments. When I noticed that AWS was bringing out a new product, Amazon SageMaker, the possibilities of what it could provide seemed like a dream come true. Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly.
RBloggers|RBloggers-feedburner Recap: RAthena is an R package that interfaces with Amazon Athena. Unlike AWR.Athena and metis, however, it doesn’t use the standard ODBC or JDBC drivers. Instead, RAthena utilises Boto3, the Python SDK (software development kit) for AWS. It does this by using the reticulate package, which provides an interface into Python. This means that RAthena doesn’t require any driver installation or setup, which can be particularly difficult when you are setting up the ODBC drivers and are not familiar with how ODBC works on your operating system.
RBloggers|RBloggers-feedburner Intro: Currently there are two key ways of connecting to Amazon Athena from R: the ODBC and JDBC drivers. To access the ODBC driver, R users can use the excellent odbc package supported by RStudio. To access the JDBC driver, R users can use either the RJDBC R package or the helpful wrapper package AWR.Athena, which wraps RJDBC to simplify connecting to Amazon Athena through the JDBC driver.