Using JDBC Driver to access Athena from Vortex

This tutorial was created by Jackson Pullman (jackson.pullman@yale.edu)

Getting Started

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. Athena is easy to use. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.

This guide will show you how to create a connection to Athena from Vortex and pull in your desired data through an sql query. Before we get started here are a few prerequisites. 

Make sure you have downloaded the JDBC Driver here: https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html.

You may have to upgrade your JDK to the newest version as well. If you do not have Java installed on your system you can do so here: https://www.oracle.com/java/technologies/javase-downloads.html.

Move this newly downloaded Athena jar file to a ‘libs’ folder in your Vortex Folder. Create a new document called AthenaQuery and save it as a .vpy file in your \vortex\scripts folder. Vortex script extensions must be saved in your \vortex\scripts folder to work properly. The body of your AthenaQuery.vpy file should be copy and pasted from below. You will need to have your Amazon Public and Private Access keys handy, and an s3 location where your output result set can be stored. The script will first ask the user to input this information as well as a few other parameters such as what Athena database you would like to connect to and your SQL query, and then create a connection to Athena to properly query the data. The remainder of the script properly formats the result set so that it can be displayed in Vortex.

Athena_Query.vpy

Running the script

From Vortex, click the scripts tab from the header menu. If you have saved your Athena_Query.vpy file in the correct \vortex\scripts folder you should be able to select the script and it will automatically run. You will be prompted with input boxes to run your query, and depending on the size of your set, should see your data imported soon.

Last updated 14 June 2021

Related Posts