
Does the connector support Parquet? I'm getting a Forbidden error. Any suggestions on the privileges the account requires?

HIVE_CANNOT_OPEN_SPLIT - com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden;

The query runs fine in the AWS Athena console.


The query appears to run, but now we are getting an error on the output bucket. I know the user has access to the bucket: I just ran another FME workspace that uploads to the same bucket using the same user. Does this reader require List Bucket access on the output bucket?

An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: Unable to verify/create output bucket


Hi @jbradfor,

I'm not an expert on this connector, but this appears to be an Amazon Athena error. The AWS guidance recommends checking permissions and confirming whether the query result location already exists: https://aws.amazon.com/premiumsupport/knowledge-center/athena-output-bucket-error/ It might be worth experimenting with additional permissions if you can.

Are you running a query on a public dataset by chance?

We fought through the permissions issue. We needed List Bucket and proper Athena DB access, as well as read/write access to the Athena query results output bucket. It works when we run the workspace from within Workbench.
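For anyone hitting the same 403 and "Unable to verify/create output bucket" errors, here is a rough sketch of how the permissions described above might look as an IAM policy, built as a Python dict. The bucket names are hypothetical placeholders, and mapping "Athena DB access" to the Glue Data Catalog actions is an assumption on my part, not something confirmed in this thread. Treat it as a starting point to narrow down, not a definitive policy.

```python
import json


def build_athena_policy(data_bucket, results_bucket):
    """Return an IAM policy document (as a dict) for running Athena queries.

    data_bucket and results_bucket are hypothetical names supplied by the
    caller; the exact set of actions your account needs may differ.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Start queries and fetch their status and results.
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                "Resource": "*",
            },
            {
                # Assumption: the Athena database lives in the Glue Data Catalog.
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetPartitions"],
                "Resource": "*",
            },
            {
                # List Bucket (and bucket location) on both buckets.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [
                    f"arn:aws:s3:::{data_bucket}",
                    f"arn:aws:s3:::{results_bucket}",
                ],
            },
            {
                # Read the source (for example Parquet) objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{data_bucket}/*"],
            },
            {
                # Read/write on the query results bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{results_bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    policy = build_athena_policy("example-data-bucket", "example-athena-results")
    print(json.dumps(policy, indent=2))
```

The resource wildcards on the Athena and Glue statements are deliberately broad for illustration; in practice you would likely scope them to your workgroup and catalog.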

Now when our scheduler software logs on to run the workspace we are getting this error:

ERROR |Python Exception <ModuleNotFoundError>: No module named 'fmepy_amazon_athena' Wed Apr 1 15:30:05 2020


Hi @jbradfor,

Glad to hear that you got past the permissions error. I think the tricky part here is that the account your scheduling software uses needs access to the package. Packages, including the AmazonAthenaConnector, are installed in

%appdata%\Safe Software\FME\Packages

My colleague suggested that, depending on the account your task scheduling software uses to run the workspace, FME may not have access to this directory, which contains the module that isn't being found. Hope that offers some ideas to work with!

Update: 

Some more advice from the team: the account running the FME process needs to have the package installed in its own app data directory. I'm not 100% sure on this, but if it's difficult to get a GUI login as the scheduler user, I think it should be possible to install the package from the command line using

fme package install <path to fpkg file>
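To confirm whether a given account actually has the package, a small check like the one below could be run as that account. It is only a sketch: it assumes the layout `Safe Software\FME\Packages\<build>\python\safe.amazon-athena` under the account's app data directory, and the function name `find_package_dirs` is made up for illustration.

```python
import os
from pathlib import Path


def find_package_dirs(appdata, package="safe.amazon-athena"):
    """Return package folders found under <appdata>/Safe Software/FME/Packages.

    Assumes one sub-folder per FME build, each with a python/<package> folder.
    """
    packages_root = Path(appdata) / "Safe Software" / "FME" / "Packages"
    if not packages_root.is_dir():
        return []
    # Match any build folder, e.g. <build>/python/safe.amazon-athena
    return sorted(p for p in packages_root.glob(f"*/python/{package}") if p.is_dir())


if __name__ == "__main__":
    # %APPDATA% resolves per user, so run this as the scheduler account.
    appdata = os.environ.get("APPDATA")
    hits = find_package_dirs(appdata) if appdata else []
    if hits:
        for hit in hits:
            print("Found:", hit)
    else:
        print("Package folder not found for this account.")
```

An empty result for the scheduler account would be consistent with the ModuleNotFoundError, since the module is looked up in that account's own Packages folder.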

I uninstalled the package. Then I dragged the transformer back onto the workspace. Here is the response.

==============================
FME Package install summary
==============================
Name : safe.amazon-athena
Version : 0.2.4
Author : Safe Software
Transformers : AmazonAthenaConnector
Python Packages : fme-amazon-athena
Adding folder `C:\\Users\\myuserid\\AppData\\Roaming\\Safe Software\\FME\\Packages\\19822-win64\\python\\safe.amazon-athena' to the python path

So, how do I install this package globally (in a location such as the FME install dir) so everyone can see it?


I tried to install the package in the FME install dir using the command line. It installed, but when I opened the workspace with the transformer, it installed another copy of the package in my AppData.

Probably best to keep the transformers at the user level anyway, since they would be wiped out by new FME installs.

We ended up copying the package folder over to the scheduler user ID's AppData. It worked!
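For anyone wanting to script that copy, here is a minimal sketch. It assumes both accounts use the `Safe Software\FME\Packages` layout under their roaming app data folders and that you have rights to write into the target profile; it copies the whole Packages tree rather than a single package folder, which is a simplification. The function name and the example paths in the comment are hypothetical. It needs Python 3.8 or later for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path

# Relative location of FME packages inside a user's roaming app data folder.
PACKAGES_REL = Path("Safe Software") / "FME" / "Packages"


def copy_fme_packages(source_appdata, target_appdata):
    """Copy the FME Packages tree from one account's app data to another's.

    Existing files in the target are overwritten; other files are left alone.
    Returns the target Packages path.
    """
    src = Path(source_appdata) / PACKAGES_REL
    dst = Path(target_appdata) / PACKAGES_REL
    if not src.is_dir():
        raise FileNotFoundError(f"No FME Packages folder at {src}")
    # copytree creates missing parent folders of dst as needed.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Example call with hypothetical profile paths:
# copy_fme_packages(r"C:\Users\myuserid\AppData\Roaming",
#                   r"C:\Users\scheduler_user\AppData\Roaming")
```

As noted above, anything copied this way sits at the user level, so it would need to be repeated if the package is upgraded.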


The other option would be to loop in the scheduling team and have them log in and open the workspace, which would automatically trigger the package install.


Excellent! Thanks for sharing what worked for you!

