Analysts have long looked for ways to make data analysis easier, with only partial success. Many technologies can analyze data, but only recently has a tool shown significant promise in streamlining the work: Amazon Web Services’ Athena.
Amazon Athena is an interactive query service that uses standard ANSI SQL to analyze big data stored in Amazon Simple Storage Service (Amazon S3), and its launch has sparked considerable interest.
In this article, we will go over the following:
- Amazon Athena in a Nutshell
- Authentication Options for Amazon Athena
- Athena Authentication Best Practices
- Conclusion
This is a part of our guide to Amazon Athena Security.
Amazon Athena in a Nutshell
Amazon Athena is an interactive query service that makes it fast to run complex queries and get results. It is serverless, so there is no setup effort and no infrastructure to manage. Because Athena is not a database service, you pay only for the queries you run. You simply point Athena at your data in Amazon S3 and query it using standard SQL.
Even with a huge dataset and sophisticated queries, Amazon Athena scales automatically by processing queries in parallel, which speeds up the generation of query results.
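To make the "point at S3 and query with SQL" workflow concrete, here is a minimal Python sketch. It assumes you pass in a boto3 Athena client; the database name, bucket, and the helper name `run_query` are placeholders invented for illustration, and real code would add timeouts and error handling.

```python
import time

# States after which Athena will not change the query status any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_query(athena_client, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query and poll until it reaches a terminal state."""
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes result files to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return query_id, state
        time.sleep(poll_seconds)


# Example wiring (requires boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   athena = boto3.client("athena", region_name="us-east-1")
#   run_query(athena, "SELECT 1", "example_db", "s3://example-results-bucket/athena/")
```

Passing the client in as a parameter keeps the function easy to test with a stub and leaves the choice of credentials to the caller.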
Authentication Options for Amazon Athena
With the latest release of the JDBC and ODBC drivers for Amazon Athena, you can use Microsoft’s Azure Active Directory (AD) or Ping Identity’s PingFederate to authenticate from compatible business intelligence, SQL, or embedded analytics applications.
This release gives customers who use the Athena drivers a wider variety of authentication choices. In every case, for your application to access AWS resources such as Athena, you must supply credentials to the JDBC or ODBC driver.
Both the Athena JDBC drivers and Athena ODBC drivers support authentication based on SAML 2.0, which includes the following identity providers:
- Active Directory Federation Services (AD FS)
- Azure Active Directory (AD)
- Okta
- PingFederate
Connect to Athena with the JDBC Driver
You can connect Athena to other applications and business intelligence tools over a JDBC connection. To do so, download, install, and configure the Athena JDBC driver. Beginning with version 2.0.24, the driver is available in two versions: one includes the AWS SDK, and the other does not.
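Configuring the driver mostly comes down to assembling a connection URL. The sketch below builds one in Python. It assumes the semicolon-separated `jdbc:awsathena://` property format and the property names `AwsRegion`, `S3OutputLocation`, and `AwsCredentialsProviderClass` used by the 2.x driver, and a provider class name that applies to the variant bundling the AWS SDK. Check all of these against the documentation for your driver version. The helper name and bucket are hypothetical.

```python
def build_athena_jdbc_url(region, s3_output_location, extra_properties=None):
    """Assemble a JDBC URL from key=value properties (assumed 2.x driver format)."""
    properties = {"AwsRegion": region, "S3OutputLocation": s3_output_location}
    properties.update(extra_properties or {})
    for key, value in properties.items():
        # A semicolon inside a value would be read as a property separator.
        if ";" in str(value):
            raise ValueError(f"property {key!r} must not contain ';'")
    joined = ";".join(f"{key}={value}" for key, value in properties.items())
    return f"jdbc:awsathena://{joined};"


# Assumption: this provider class resolves credentials from the default chain
# (environment, profile, role) so no long-term keys appear in the URL.
url = build_athena_jdbc_url(
    "us-east-1",
    "s3://example-results-bucket/athena/",
    {"AwsCredentialsProviderClass":
         "com.simba.athena.amazonaws.auth.DefaultAWSCredentialsProviderChain"},
)
print(url)
```

Keeping credentials out of the URL itself means the connection string can be shared or logged without exposing secrets.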
Connect to Athena with the ODBC Driver
You can also connect Athena to other applications and business intelligence tools over an ODBC connection. To do so, download the Amazon Athena ODBC driver, along with its license agreement and documentation, and then install and configure the driver.
Athena Authentication Best Practices
The JDBC and ODBC drivers require permission to perform the actions listed in the AWSQuicksightAthenaAccess managed policy, so make sure you have the correct AWS Identity and Access Management (IAM) permissions policy in place. Beyond that, IAM users can follow these guidelines for Athena authentication to keep their AWS resources secure.
Lock up Your AWS Account's Root User Access Keys
To make programmatic requests to AWS, you need an access key, which consists of an access key ID and a secret access key. However, do not use your AWS account’s root user access key. The root user’s access key grants full access to your resources in all AWS services, including billing information. Instead, use an IAM administrative user for everyday work, and never share the root user access key within your organization.
Use Roles to Delegate Permissions
You can obtain temporary session credentials for a role using the AWS Security Token Service (AWS STS) or the AWS Management Console. Using temporary credentials, for example through federated access to Athena, is safer than using your long-term password or access key credentials: because a session has a limited duration, the risk is reduced if the credentials are compromised.
As a best practice, use the temporary credentials of an IAM role to access only the resources you need to complete your job.
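As a rough illustration of obtaining temporary credentials, the following Python sketch calls the STS `AssumeRole` operation through a boto3 STS client that you pass in. The helper name, role ARN, and session name are placeholders. The 900-second default is the shortest session AWS STS allows, chosen here to keep the exposure window small.

```python
def assume_role_credentials(sts_client, role_arn, session_name, duration_seconds=900):
    """Assume an IAM role and return short-lived credentials plus their expiry."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=duration_seconds,
    )
    creds = response["Credentials"]
    # Keys are named so they can be passed straight to boto3.client(...).
    client_kwargs = {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
    return client_kwargs, creds["Expiration"]


# Example wiring (requires boto3; the ARN and names are placeholders):
#   import boto3
#   kwargs, expires = assume_role_credentials(
#       boto3.client("sts"),
#       "arn:aws:iam::111122223333:role/ExampleAthenaAnalyst",
#       "athena-session",
#   )
#   athena = boto3.client("athena", region_name="us-east-1", **kwargs)
```

The returned expiry lets the caller refresh the session before it lapses instead of falling back to long-term keys.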
Grant the Least Privilege User Access
When creating IAM policies, follow the standard security advice of granting only the permissions necessary to complete a task. Create policies that allow users and roles to perform only the tasks they have been assigned.
Start with a minimal set of permissions and grant more only when necessary. This approach is more secure than starting with permissions that are too lax and trying to tighten them later.
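A least-privilege policy for an Athena analyst might look something like the sketch below, written in Python so the ARNs can be filled in programmatically. The workgroup, bucket names, and account ID are placeholders, and the statement is illustrative rather than complete: a working setup also needs AWS Glue Data Catalog read permissions, which are omitted here for brevity.

```python
import json


def athena_query_policy(region, account_id, workgroup, data_bucket, results_bucket):
    """Build an identity policy scoped to one workgroup and two buckets."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Run and inspect queries, but only in the named workgroup.
                "Sid": "RunQueriesInOneWorkgroup",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:StopQueryExecution",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}",
            },
            {   # Read-only access to the source data.
                "Sid": "ReadSourceData",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{data_bucket}",
                             f"arn:aws:s3:::{data_bucket}/*"],
            },
            {   # Athena needs to write result files to the results bucket.
                "Sid": "WriteQueryResults",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:GetObject",
                           "s3:ListBucket", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{results_bucket}",
                             f"arn:aws:s3:::{results_bucket}/*"],
            },
        ],
    }


policy = athena_query_policy(
    "us-east-1", "111122223333", "analysts", "example-data", "example-results")
print(json.dumps(policy, indent=2))
```

Note that no action or resource uses a wildcard service prefix such as `athena:*`, so widening access later requires a deliberate policy change.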
Review and Validate Your Policies
You should validate the policies you create. Policy validation is available when you create and edit JSON policies. IAM identifies any JSON syntax errors, while IAM Access Analyzer provides over 100 policy checks and actionable recommendations to help policy authors write effective and practical policies.
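Validation can also be scripted. The sketch below assumes a boto3 `accessanalyzer` client is passed in and uses its `validate_policy` operation. The helper name is hypothetical, and the function simply collects each finding's type, issue code, and description across result pages.

```python
import json


def validate_identity_policy(analyzer_client, policy):
    """Return (findingType, issueCode, findingDetails) for every finding."""
    kwargs = {
        "policyDocument": json.dumps(policy),
        "policyType": "IDENTITY_POLICY",
    }
    findings = []
    while True:
        page = analyzer_client.validate_policy(**kwargs)
        findings.extend(
            (f["findingType"], f["issueCode"], f["findingDetails"])
            for f in page.get("findings", [])
        )
        token = page.get("nextToken")
        if not token:
            return findings
        # Ask for the next page of findings.
        kwargs["nextToken"] = token


# Example wiring (requires boto3 and AWS credentials):
#   import boto3
#   analyzer = boto3.client("accessanalyzer", region_name="us-east-1")
#   for finding in validate_identity_policy(analyzer, {"Version": "2012-10-17", "Statement": []}):
#       print(finding)
```

One reasonable use is to run this in a deployment pipeline and block the change when any finding has the type `ERROR` or `SECURITY_WARNING`.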
Utilize Customer-Managed Policies Rather than Inline Policies
For custom policies, use customer-managed policies rather than inline policies. A key advantage of managed policies is that you can view all of them in one place in the console. You can also retrieve the same information with a single AWS CLI or AWS API operation.
Inline policies exist only on the IAM identity they are embedded in, such as a user, user group, or role. Managed policies, on the other hand, are standalone IAM resources that you can attach to multiple identities.
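The difference shows up in how you enumerate the two kinds of policy. In the Python sketch below, which assumes a boto3 IAM client is passed in, customer-managed policies come from one account-wide listing (`list_policies` with `Scope="Local"`), whereas inline policies must be looked up identity by identity (here with `list_role_policies`). The helper names are hypothetical.

```python
def _collect(list_call, result_key, **kwargs):
    """Follow IAM's Marker-based pagination and gather items from each page."""
    items = []
    while True:
        page = list_call(**kwargs)
        items.extend(page[result_key])
        if not page.get("IsTruncated"):
            return items
        kwargs["Marker"] = page["Marker"]


def customer_managed_policy_names(iam_client):
    """One account-wide listing covers every customer-managed policy."""
    policies = _collect(iam_client.list_policies, "Policies", Scope="Local")
    return [p["PolicyName"] for p in policies]


def inline_policy_names(iam_client, role_name):
    """Inline policies are only visible per identity, here for a single role."""
    return _collect(iam_client.list_role_policies, "PolicyNames", RoleName=role_name)


# Example wiring (requires boto3 and AWS credentials; the role name is a placeholder):
#   import boto3
#   iam = boto3.client("iam")
#   print(customer_managed_policy_names(iam))
#   print(inline_policy_names(iam, "ExampleAthenaAnalyst"))
```

Auditing inline policies this way means repeating the call for every user, group, and role, which is part of why managed policies are easier to review.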
Conclusion
With the transition to cloud storage in particular, the world is experiencing a period of abundant data. However, the technologies used to evaluate and interpret this data are not always user-friendly, readily available, or even effective. Data must live somewhere, and most businesses must consider how to store it. Amazon Athena can help considerably in this regard.
With Amazon Athena, you can run interactive ad hoc SQL queries against Amazon S3 data without configuring or managing servers, clusters, or other infrastructure.
Protecting Your Amazon Athena Data Lake with Satori
Satori enables you to enforce security policies such as dynamic data masking and row-level security on your Athena data access. In addition, Satori continuously discovers and classifies sensitive data, enables self-service data access to datasets on Athena and other data platforms, and keeps an enriched audit log on all data accessed through Athena.