Hi @csuarez,
To read the link http://www1.kaiho.mlit.go.jp/TUHO/keiho/cgi/disp_warnings.cgi?TYPE=NAVAREA11&TANA=170795&lang=EG and get the value, I started with the Creator and HTTPCaller transformers to communicate with the website, using the GET method.
A new attribute _response_body was created:
Afterwards I used the custom transformer HTMLStripper to clean the _response_body attribute.
In FME Data Inspector:
Thanks,
Danilo
Hi @danilo_fme, thanks a lot for your answer and the tip on the HTMLStripper. I was actually about to edit my question, as I realized I wasn't very clear in explaining my issue. My end result is exactly what you have in the inspector, but to get there (see attached workspace) I have to pass some parameters in the URL:
http://www1.kaiho.mlit.go.jp/TUHO/keiho/cgi/disp_warnings.cgi?TYPE=NAVAREA11&TANA=17 +
Counter +
&lang=EG
Please note that the "TANA" variable starts with 17 (the year) followed by a 4-digit number. I have created a somewhat working version, which creates 10k records, adds a counter, and concatenates that attribute with the URL as above to extract the value. I am not happy with this workaround, as it has to create 10k records, process them, and then select the ones that actually exist.
The source URL is still http://www1.kaiho.mlit.go.jp/TUHO/keiho/navtex_en.html
Any ideas?
Thanks
Cesar
xi-fme.fmw
Hi @csuarez, it seems that the URL returns valid HTML for any TANA number, so if you want to ignore responses that have no actual content, I think you will have to test the content after parsing the HTML document, unless you know the valid range of TANA.
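Outside FME, the same "strip the HTML, then test what is left" check could be sketched in Python as below. This is only an illustration: the names `TextExtractor`, `visible_text` and `has_content` are made up for this example, and the `min_chars` threshold is an assumption. A page for a non-existent TANA may still carry boilerplate text such as a title, so the threshold would need tuning against real responses.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is not inside script/style.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text of an HTML document with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def has_content(html, min_chars=1):
    """True if the stripped text has at least min_chars characters (assumed threshold)."""
    return len(visible_text(html)) >= min_chars
```

In a workspace, the equivalent would be a Tester placed after the HTMLStripper that checks the cleaned attribute.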
Also, the StringFormatter or the StringPadder may be useful to create a 4-digit number with zero padding.
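As a rough illustration of the padding step, here is a Python sketch that builds the TANA value (two-digit year plus a zero-padded 4-digit counter) and the request URL. The function names are hypothetical, and the query parameter names are taken from the URL quoted earlier in this thread, so treat them as assumptions rather than a documented API.

```python
from urllib.parse import urlencode

BASE_URL = "http://www1.kaiho.mlit.go.jp/TUHO/keiho/cgi/disp_warnings.cgi"


def tana_value(year, counter):
    """Two-digit year followed by the counter zero-padded to 4 digits."""
    if not 0 <= counter <= 9999:
        raise ValueError("counter must fit in 4 digits")
    return f"{year % 100:02d}{counter:04d}"


def warning_url(year, counter):
    """Build the request URL for one TANA value (parameter names assumed)."""
    query = urlencode({
        "TYPE": "NAVAREA11",
        "TANA": tana_value(year, counter),
        "lang": "EG",
    })
    return f"{BASE_URL}?{query}"
```

For example, `tana_value(2017, 795)` gives `"170795"`, which matches the value in the first URL of this thread.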
Thanks a lot @danilo_fme and @takashi for your insights. I don't have a way to retrieve the TANA values, which is why I created those 10k records (Creator attribute). My main concern is whether it is actually possible to use HTTPCaller to retrieve such data from this link:
http://www1.kaiho.mlit.go.jp/TUHO/keiho/navarea11_en.html. This link shows the only records I can (or should) read, so the processing time would be reduced.
The alternative is to go to the backend and get the data by passing the TANA values generated by the Creator transformer. That option works for sure but takes a lot of time to process, hence my preference for the initial URL (http://www1.kaiho.mlit.go.jp/TUHO/keiho/navarea11_en.html).
Once again, thank you,
Cesar
Hello All,
After getting a better understanding of how to pass specific values using HTTPCaller, I was able to pass and retrieve the right values.
The key was to identify the URL of the form that makes the call after the initial URL is accessed (http://www1.kaiho.mlit.go.jp/TUHO/keiho/navarea11_en.html). Once I had the correct URL, some parameters needed to be uploaded as Multipart / Form Data (Year and Type). This retrieved the TANA records; a second HTTPCaller then calls the URL containing each TANA value.
Once the above is done (see the attached workspace for reference), the fun part begins, but that's for another adventure.
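For anyone reading without FME at hand, the middle step (pulling the TANA values out of the form response so that only existing records are requested) might look roughly like this in Python. It is a sketch under a loud assumption: that the response contains the values as `TANA=` followed by six digits, as in the URLs earlier in this thread. The form URL and its field names are not given here, so the request itself is left out, and `extract_tana_values` is a hypothetical helper name.

```python
import re

# Assumption: each record appears in the response as "TANA=" plus six digits.
TANA_PATTERN = re.compile(r"TANA=(\d{6})")


def extract_tana_values(listing_html):
    """Return the distinct TANA values found in the HTML, in order of appearance."""
    seen = set()
    ordered = []
    for value in TANA_PATTERN.findall(listing_html):
        if value not in seen:
            seen.add(value)
            ordered.append(value)
    return ordered
```

Each extracted value can then be substituted into the disp_warnings.cgi URL, which replaces the 10k-record brute-force loop with one request per existing record.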
Thanks so much @danilo_fme and @takashi for your help on this.
Cesar
xi-fme-v3.fmw
Hi @csuarez
I saw your workspace and the HTTPCaller configuration using Multipart Upload. It was a great solution!
Thanks,
Danilo