RBloggers|RBloggers-feedburner I am happy to announce that RAthena-1.9.0 and noctua-1.7.0 have been released to CRAN. They both bring two key features: (1) more stability when working with AWS Athena, focusing on AWS Rate Exceeded throttling errors; (2) a new helper function to convert AWS S3 backend files to save cost. NOTE: RAthena and noctua features correspond to each other, so I will refer to the two packages interchangeably. Stability, Throttling AWS: One of the main problems when working with the AWS API is stumbling into the Rate Exceeded throttling error.
RAthena 1.7.1 and noctua 1.5.1 have now been released to CRAN. They both bring several improvements to the connection with AWS Athena, notably in performance, along with several creature comforts. These packages have been designed to mirror one another, even down to how they connect to AWS Athena. This means that all features going forward will exist in both packages. I will refer to these packages as one, as they basically work in the same way.
Intro: After developing the package RAthena, I stumbled quite accidentally into paws, the R SDK for AWS. As RAthena utilises Python’s SDK boto3, I thought the development of another AWS Athena package couldn’t hurt. As mentioned in my previous blog, the paws syntax is very similar to boto3, so a lot of my RAthena code was very portable, and this gave me my final excuse to develop my next R package.
Intro: For a long time I have found it difficult to appreciate the benefits of “cloud compute” in my R model builds. This was due to my initial lack of understanding and the difficulty of setting up R on cloud compute environments. When I noticed that AWS was bringing out a new product, Amazon SageMaker, the possibilities of what it could provide seemed like a dream come true. Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly.
Recap: RAthena is an R package that interfaces with Amazon Athena. However, it doesn’t use the standard ODBC and JDBC drivers like AWR.Athena and metis. Instead, RAthena utilises Boto3, Python’s SDK (software development kit) for AWS. It does this through the reticulate package, which provides an interface into Python. This means that RAthena doesn’t require any driver installation or setup. Driver setup can be particularly difficult when you are setting up the ODBC drivers and are not familiar with how ODBC works on your current operating system.
Intro: Currently there are two key ways of connecting to Amazon Athena from R: using the ODBC driver or the JDBC driver. To access the ODBC driver, R users can use the excellent odbc package supported by RStudio. To access the JDBC driver, R users can use either the RJDBC R package or the helpful wrapper package AWR.Athena, which wraps RJDBC to simplify the connection to Amazon Athena through the JDBC driver.