Cloudy with a chance of Caffeinated Query Orchestration – New rJava Wrappers for AWS Athena SDK for Java
This article is originally published at https://rud.is/b
There are two fledgling rJava-based R packages that enable working with the AWS SDK for Athena:
They’re both needed to conform with the way CRAN likes rJava-based packages that also have large JAR dependencies to be submitted. The goal is to eventually have wrappers for anything R folks need under the AWS Java SDK menu.
All package pairs will eventually cohabitate under the Cloudy R Project once each gets to 90% API coverage, passes CRAN checks and has passing Travis checks.
One thing I did get working right up front was the asynchronous dplyr chain query execution collect_async(), so if you need that and would rather not use reticulated wrappers, now’s your chance.
You would be correct in assuming this is an offshoot of the recent work on updating metis. My primary impetus for this is to remove the reticulate dependency from our Dockerized production setups, but I have also discovered I like the Java libraries more than the boto3-based ones (not really a shocker there if you know my views on Python). As a result I should be able to quickly wrap most any library you may need (see below).
The next major wrapper coming is S3 (there are bits of it implemented in awsathena now but that’s temporary) and, for now, you can toss a comment here or file an issue on any of the social coding sites you like for priority wrapping of other AWS Java SDK libraries. Also, if you want some experience working with rJava packages in a judgement-free zone, drop a note into these or any new AWS rJava-based package repos and I’ll gladly walk you through your first PR.