Question

Azure blob storage

  • 16 March 2021
  • 4 replies
  • 36 views

Hi, I asked about the parquet format yesterday and got great feedback on my problem (https://community.safe.com/s/feed/0D54Q00008VE8CZSA1)

 

Right now I have a bigger problem - but I think my knowledge of this topic is too limited.

 

I have to connect to Azure Blob Storage (a data lake) using SAS authentication. My SAS key includes permissions like:

ss=bfqt

srt=sco

sp=rl

 

So I can list and read files, and my SignedResourceTypes include Service, Container, and Object.
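For reference, the compact flags above follow the standard Azure account-SAS encoding. A small sketch decoding them into readable names (the mappings are from the account-SAS parameter documentation):

```python
# Standard Azure account-SAS flag mappings.
SERVICES = {"b": "Blob", "f": "File", "q": "Queue", "t": "Table"}
RESOURCE_TYPES = {"s": "Service", "c": "Container", "o": "Object"}
PERMISSIONS = {"r": "Read", "l": "List", "w": "Write", "d": "Delete",
               "a": "Add", "c": "Create", "u": "Update", "p": "Process"}

def decode(value, table):
    """Expand a compact SAS flag string into readable names."""
    return [table[ch] for ch in value]

print(decode("bfqt", SERVICES))       # ss=bfqt  -> all four services
print(decode("sco", RESOURCE_TYPES))  # srt=sco  -> Service, Container, Object
print(decode("rl", PERMISSIONS))      # sp=rl    -> Read, List
```

So the token grants Read and List on all resource types, which is enough to enumerate blobs and fetch their properties.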

 

With these parameters I can see the DB and the parquet files, and I can also download files via Chrome (link + SAS).

But if I want to get only the attributes (and nothing else, without downloading the files, because the data is too big), I used these parameters: [screenshot]

 

and I get empty results: [screenshot]

 

My question:

 

Is it possible to get only the attributes of each parquet file that exists in Azure?

Or is downloading the data required?

 

What I have:

  • a SAS key, which I can "modify", but as I mentioned, I think I already have the required permissions
  • the container name
  • the account name for SAS auth

What I want:

  • attributes from Azure, aggregated by filename (like FME basename in a new column)

 


4 replies

Also to be noted: with the AzureFileStorageConnector and the same parameters as in the 3rd picture, I get an error: [screenshot]


Hi @lukaszmarciniak​ 

 

The AzureBlobStorageConnector uses the Blob Storage REST API to interact with files stored there.

 

My recommendation would be to set this up yourself in the HTTPCaller and use the API calls that you need.

There are API calls like Get Blob Properties, Get Blob Metadata, and Query Blob Contents... I'm not sure whether these will give you the response that you need.

The documentation for all of these API calls includes example responses, so you should be able to look through those API requests to see whether there is an API call that will return the information you need. If there isn't one, then unfortunately I think you would have to download the data.
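One call worth trying first is List Blobs: a single GET on the container (with `restype=container&comp=list` plus the SAS query string) returns the name, size, and last-modified date of every blob without transferring any file content, which matches the "aggregated by filename" goal. A sketch of parsing such a response; the sample XML below is a trimmed, hypothetical example of the documented response shape:

```python
import xml.etree.ElementTree as ET

def parse_blob_list(xml_text):
    """Extract per-blob attributes from a List Blobs XML response."""
    root = ET.fromstring(xml_text)
    blobs = []
    for blob in root.iter("Blob"):
        props = blob.find("Properties")
        blobs.append({
            "name": blob.findtext("Name"),
            "size_bytes": int(props.findtext("Content-Length")),
            "last_modified": props.findtext("Last-Modified"),
        })
    return blobs

# Trimmed, hypothetical sample of a List Blobs response body:
sample = """<?xml version="1.0" encoding="utf-8"?>
<EnumerationResults>
  <Blobs>
    <Blob>
      <Name>db/table/part-0001.parquet</Name>
      <Properties>
        <Content-Length>104857600</Content-Length>
        <Last-Modified>Tue, 16 Mar 2021 10:00:00 GMT</Last-Modified>
      </Properties>
    </Blob>
  </Blobs>
</EnumerationResults>"""

for b in parse_blob_list(sample):
    print(b["name"], b["size_bytes"], b["last_modified"])
```

In the HTTPCaller the request URL would look like `https://{account}.blob.core.windows.net/{container}?restype=container&comp=list&{sas}` (account, container, and sas being your own values), and an XMLFragmenter or similar could do the parsing step.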

 

Within the HTTPCaller it should be easy to still use SAS. This page on service SAS examples shows how to include the token in the request URL.
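For a single file, Get Blob Properties is just a HEAD request against the blob URL with the SAS query string appended; the attributes (size, last-modified, any `x-ms-meta-*` metadata) come back as response headers, so nothing is downloaded. A sketch of building that URL; the account, container, path, and `sig=...` values below are placeholders, not real credentials:

```python
from urllib.parse import quote

def blob_properties_url(account, container, blob_path, sas):
    """Build the URL for a Get Blob Properties call (a HEAD request).

    `sas` is the SAS token's query string, without a leading '?'.
    """
    return (f"https://{account}.blob.core.windows.net/"
            f"{container}/{quote(blob_path)}?{sas}")

url = blob_properties_url("myaccount", "mycontainer",
                          "db/table/part-0001.parquet",
                          "ss=bfqt&srt=sco&sp=rl&sig=...")
# In the HTTPCaller: set the HTTP method to HEAD and use this URL;
# the blob's properties are returned as HTTP response headers.
print(url)
```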

@jlutherthomas​ - thanks for the response. I have checked your link - it should be useful in the future 😃

 

But right now what I have is a set of connected transformers with which I can "automatically" download data from Azure - a small success.

 

Right now I'm trying to add something else; I'm still fighting with reading the data rather than downloading it.

If I add a FeatureReader this way: [screenshot]

After configuration I see a "tree" with folders: [screenshot]

But if I want to open the folder I need, I get an error like this: [screenshot]

Question: is this a problem with:

  • the DB?
  • my permissions?
  • the config of FME/something on my VM?

If something is wrong with the DB, I can contact the people responsible for this database - but I need to know what is wrong.

Right now I have a little workaround which satisfies me:

[screenshot] The AzureBlobStorageConnector gets me the link; after that, in the reader, I add the link from Azure Blob + the SAS key. If I understood the logic correctly, I don't download the files to my local drive - they are read dynamically.

Right now I'm looking into the links you mentioned - maybe they'll work better than what I have. I think I will ask another question on this topic in a few days :D

 
