Question

Extracting png files from website


gisgeek
Contributor

I am trying to extract PNG files from the website https://spark-viewer.wivolo.com/. I can extract the PNG files individually using the PNG raster reader, but there are several I want to extract, so I was trying to build a list of them rather than handle each one by hand. Unfortunately I am going around in circles and not making progress. The images don't appear to be in a list: I tried reading the page as an HTML table to get a list of all the PNG images, and I also tried the HTML extractor, but neither worked. I'm hoping someone can point me in the right direction to generate a list so I can extract all the PNG files at once.

3 replies

redgeographics
Celebrity

If you read the entire page as HTML, search it for the PNG filenames, and then download those separately using the ImageFetcher, it should work, although it depends on how the PNGs are referenced in the page source. If they are in <IMG> tags it should work, but if a scripting language is involved it might be trickier.

First step would be to inspect the page source though and see what you can find.
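Outside FME, the same idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the class and function names are made up for the example, the sample HTML is hypothetical, and it only finds images referenced in plain <IMG> tags, not ones loaded by a script.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PngCollector(HTMLParser):
    """Collect absolute URLs of <img> tags whose src path ends in .png."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.png_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <IMG SRC=...> matches too.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # Test the URL path only, so a query string such as ?v=2 does not hide the extension.
        if src and urlparse(src).path.lower().endswith(".png"):
            self.png_urls.append(urljoin(self.base_url, src))


def find_png_urls(html_text, base_url):
    """Return every PNG image URL referenced by an <img> tag in html_text."""
    parser = PngCollector(base_url)
    parser.feed(html_text)
    return parser.png_urls


# Hypothetical page source, for illustration only.
sample = (
    '<html><body><img src="/img/logo.png"><IMG SRC="a.jpg">'
    "<script>load()</script></body></html>"
)
print(find_png_urls(sample, "https://example.com/"))
```

If the resulting list is empty for a real page, the images are most likely requested by script, and this approach will not find them.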

 

--edit--

Just looked at the page source myself: there don't appear to be any (useful) PNGs referenced in it, so it looks like you're out of luck.


nielsgerrits
VIP

I think you want to scrape all the tiles from the tile layer to recreate the map? First I should say that this is not a quick job, so it would not be my first option. Requesting a copy of the dataset is a better alternative. But if you really want to, it is doable.

Possible steps are:

  • Understand how the URL is constructed. Example:
https://spark-viewer.wivolo.com/tiles/render?dataset=u850.0108&fmt=png32&nocache=1545350950&index=16/63416/42528

dataset is the layer.

index=16/63416/42528 describes zoomlevel/x/y, where x is the tile column and y is the tile row.

  • Read and understand how the Bing/Google Tile Scheme relates to geographic coordinates.
  • Determine the zoom levels you need. I see a range from 5 to 16; you probably only need the deepest level (16). For developing and iterating through the data I would start with the coarsest level, 5, which has far fewer tiles. When it works as desired, switch to level 16.
  • Determine which datasets you need and look them up. 4G is dataset=lte1800.0028, 3G = dataset=u850.0108, etc.
  • Determine the min / max rows and columns you need.
  • Generate the urls based on dataset / zoomlevel / tiles.
  • Request the images using an HTTPCaller.
  • Georeference the images.
  • Merge the georeferenced images (RasterMosaicker).
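The tile arithmetic and URL generation in the steps above can be sketched in Python before building it in FME. This is a rough sketch, not a tested recipe for this server: it assumes the viewer uses the standard Google/XYZ numbering (y counted from the top of the Web Mercator square), and that the nocache parameter can be left out. The function names are invented for the example.

```python
import math

BASE_URL = "https://spark-viewer.wivolo.com/tiles/render"


def lonlat_to_tile(lon, lat, zoom):
    """Return (x, y) of the XYZ tile that contains a WGS84 lon/lat (degrees)."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_bounds(x, y, zoom):
    """Return (west, south, east, north) in degrees, for georeferencing one tile."""
    n = 2 ** zoom

    def lat_of(row):
        # Latitude of the top edge of the given tile row.
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    west = x / n * 360.0 - 180.0
    east = (x + 1) / n * 360.0 - 180.0
    return west, lat_of(y + 1), east, lat_of(y)


def tile_urls(dataset, zoom, x_min, x_max, y_min, y_max):
    """Yield one URL per tile in the inclusive column/row range."""
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            # Assumption: nocache is optional, so it is omitted here.
            yield f"{BASE_URL}?dataset={dataset}&fmt=png32&index={zoom}/{x}/{y}"


# Example: a 2 x 3 block of level-16 tiles starting at the tile from the URL above.
for url in tile_urls("u850.0108", 16, 63416, 63417, 42528, 42530):
    print(url)
```

To find the min / max columns and rows, call lonlat_to_tile on the north-west and south-east corners of the area of interest. The bounds from tile_bounds are in degrees; the tiles themselves are drawn in Web Mercator, so the images still need to be georeferenced in that projection before mosaicking.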

I created something similar in the past for reading WMS. You might be able to reuse bits and pieces. See https://knowledge.safe.com/questions/1694/wmts-reading.html 


gisgeek
Contributor
  • May 21, 2019
redgeographics wrote:

If you read the entire page as HTML and then search through it for the png filenames and download those separately using the ImageFetcher it should work, although it kinda depends on how the png's are referenced in the page source. If it's <IMG> tags it should work but if there's a scripting language involved it might be trickier.

First step would be to inspect the page source though and see what you can find.

 

--edit--

Just looked at the page source myself, it doesn't look like there's any (useful) png's referenced in there so it looks like you're out of luck.

Yes, I went down this route too but didn't get anywhere. Thanks for the info though; this might work for images I am trying to scrape from other websites.

