Question

How to keep full number accuracy in JSON contents, when using JSONiq to query a JSON Array?

3 years ago
November 28, 2021
6 replies
54 views

+10

thijsknapen
Contributor
154 replies

Hi folks,

I started using the Tygron API to extract building information for a specific location/site. However, the response body is a JSON Array of JSON Objects, of which only approximately 1/3 are actually about buildings. To determine whether a JSON Object is about a building, I check whether it contains a "BAG_ID" attribute.

Below is a sample_json Array that illustrates the data structure I'm working with:

[
	{
		"id": 1,
		"name": null,
		"attributes": {
			"type": "Bridge"
		}
	},
	{
		"id": 2,
		"name": "just a shed",
		"attributes": {
			"type": "Shed"
		}
	},
	{
		"id": 3,
		"name": "just a house",
		"attributes": {
			"type": "Building",
			"BAG_ID": 1.5010000000035E14
		}
	}
]

In the real world data, the JSON Array contains a lot more features and data.

Initially I filtered out the JSON Objects for buildings using the following steps:

JSONFragmenter (create a feature for each individual JSON Object)

JSONExtractor (extract the BAG_ID attribute)

TestFilter (keep the features where the BAG_ID attribute exists, i.e. has a value)

Now this method works just fine, but I thought that a direct query on the data (using JSONiq) would probably be more efficient. When I try this on the real data, this indeed seems to be more efficient (i.e. quicker).

In this setup I'm using an XMLXQueryExtractor transformer with the following query to store the desired features in a list:

for $item in jn:members(fme:get-json-attribute("sample_json"))
where exists($item("attributes")("BAG_ID"))
return $item

Then I'm using an AttributeRemover to remove the original sample_json attribute (which contains all the features and is quite large).

After that, I'm using a ListExploder to create individual features for the JSON Objects that I selected in the previous step. Note that I want to keep the ListExploder's default setting of merging incoming attributes; without the preceding AttributeRemover, the whole original JSON Array would be copied onto each exploded list element/feature.

Conceptually this method works well for me (most importantly, it seems to be quicker), but I'm encountering one issue. With this JSONiq query method, the BAG_ID value loses digits: instead of the original value '1.5010000000035E14', the output contains '1.501E14'. Since this BAG_ID attribute is also a key on which I need to join/merge data later on, this unfortunately makes the more efficient JSONiq query unusable in the end.
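To make the impact on the join concrete, here is a small Python sketch (run outside FME, using only the two values quoted above) that writes both numbers in plain notation. It only illustrates why the shortened value can no longer serve as a key; it does not explain why the XMLXQueryExtractor shortens it.

```python
from decimal import Decimal

# The two values quoted in this post, parsed exactly from their text form.
original = Decimal("1.5010000000035E14")   # as delivered in the API response
shortened = Decimal("1.501E14")            # as returned by the JSONiq step

# Plain (non-exponent) notation makes the lost digits visible.
original_text = format(original, "f")      # 150100000000350
shortened_text = format(shortened, "f")    # 150100000000000

# The two keys are different numbers, so a join on BAG_ID will not match.
keys_match = original == shortened         # False
difference = original - shortened          # 350

print(original_text, shortened_text, keys_match, difference)
```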

Is there any way to use the more efficient JSONiq query method, without the BAG_ID getting truncated?

Any ideas are greatly appreciated.

Kind regards,

Thijs

See below screenshot for an overview of this workspace. I also added the workspace itself to this case.

+39

ebygomm
Influencer
3308 replies
3 years ago
November 29, 2021

I don't know how it would compare performance-wise, but if you just need the BAG_ID, couldn't you do this in a JSONFragmenter?

json[*]["attributes"]["BAG_ID"]

+10

thijsknapen
Author
Contributor
154 replies
3 years ago
November 29, 2021

ebygomm wrote:

I don't know how it would compare performance-wise, but if you just need the BAG_ID, couldn't you do this in a JSONFragmenter?

json[*]["attributes"]["BAG_ID"]

Hi @ebygomm,

Thanks for your reply. If I only needed the BAG_ID, that would indeed be a good option.

In my case, however, I need to extract all (whole) JSON Objects in the array that contain a BAG_ID attribute (and then I also need to retain the full accuracy of the BAG_ID).

So basically I need to thin out the array to only the relevant JSON Objects, since in the real datasets these contain several attributes/objects that need to be processed (like building height, related addresses, building function, geometry, etc.).

+39

ebygomm
Influencer
3308 replies
3 years ago
November 29, 2021

thijsknapen wrote:

Hi @ebygomm,

Thanks for your reply. If I only needed the BAG_ID, that would indeed be a good option.

In my case, however, I need to extract all (whole) JSON Objects in the array that contain a BAG_ID attribute (and then I also need to retain the full accuracy of the BAG_ID).

Probably one for Safe, I think. It doesn't seem to matter how you process the data within an XMLXQueryExtractor; you always end up with the truncated value. I'd be interested to know the outcome.

If performance was essential, I'd probably end up resorting to Python in a PythonCaller:

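import fme
import fmeobjects
import json

class FeatureProcessor(object):
    """Emit one feature per JSON object that has a BAG_ID attribute."""
    def __init__(self):
        pass

    def input(self, feature):
        # Parse the JSON array stored on the incoming feature. Handling it
        # here (rather than in close) also works when several features arrive.
        items = json.loads(feature.getAttribute('sample_json'))
        for item in items:
            # A missing or null "attributes" member counts as "no BAG_ID".
            attrs = item.get("attributes") or {}
            if "BAG_ID" in attrs:
                out = fmeobjects.FMEFeature()
                out.setAttribute("BAG_ID", attrs["BAG_ID"])
                out.setAttribute("json_fragment", json.dumps(attrs))
                self.pyoutput(out)

    def close(self):
        pass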
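For anyone who wants to try the filtering logic outside FME first, below is a minimal standalone sketch of the same idea. It assumes BAG_ID is always numeric, keeps the whole JSON object instead of only its attributes, and parses floating-point literals as `Decimal` so the key can be written out in plain notation. The names `select_bag_objects` and `_to_json_number` are made up for this illustration and are not part of FME.

```python
import json
from decimal import Decimal


def _to_json_number(value):
    """Fallback for json.dumps, which cannot serialise Decimal by itself."""
    if isinstance(value, Decimal):
        # Whole numbers become int so they are written without an exponent.
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    raise TypeError(f"Cannot serialise {type(value).__name__}")


def select_bag_objects(json_text):
    """Return (bag_id, json_fragment) pairs for objects that have a BAG_ID."""
    # parse_float=Decimal keeps the digits exactly as written in the text.
    items = json.loads(json_text, parse_float=Decimal)
    selected = []
    for item in items:
        attrs = item.get("attributes") or {}
        if "BAG_ID" in attrs:
            # Assumption: BAG_ID is a JSON number (int or float literal).
            bag_id = format(Decimal(attrs["BAG_ID"]), "f")
            fragment = json.dumps(item, default=_to_json_number)
            selected.append((bag_id, fragment))
    return selected
```

In a PythonCaller, each returned pair could then be written to a new feature with `setAttribute`, in the same way as the code above; storing the key as a string sidesteps any number formatting further down the workspace.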

+10

thijsknapen
Author
Contributor
154 replies
3 years ago
November 29, 2021

Hi @ebygomm,

I'm a bit of a Python noob, but I am interested in learning and doing more with it, specifically in cases like these where you just want to do a bit more with your data. That code snippet is really insightful, and indeed it works!

One minor thing: in my case I wanted the json_fragment to be the 'item' and not the 'attrs' object, but that's an easy fix.

Although I still somewhat prefer the JSONiq code (a. I'm just more familiar with it, and b. I think the code is a bit shorter/cleaner), it's really nice to see and learn that you have options. 🙂

Although you provided a very good solution just now, I hope Safe (or a fellow FME'er) can have a look at the main question 'How to keep full number accuracy in JSON contents, when using JSONiq to query a JSON Array?', as it's more about JSONiq and its possibilities in FME. So I'm not selecting it as best answer yet, if you don't mind.

+39

ebygomm
Influencer
3308 replies
3 years ago
November 29, 2021

Even if you just put the following in a JSONTemplater, that value gets truncated in the output:

[
	{
		"id": 1,
		"name": null,
		"attributes": {
			"type": "Bridge"
		}
	},
	{
		"id": 2,
		"name": "just a shed",
		"attributes": {
			"type": "Shed"
		}
	},
	{
		"id": 3,
		"name": "just a house",
		"attributes": {
			"type": "Building",
			"BAG_ID": 1.5010000000035E14
		}
	}
]

+10

thijsknapen
Author
Contributor
154 replies
3 years ago
November 29, 2021

Yes, I noticed; see also my earlier screenshot with the disabled JSONTemplater (and its annotation).

However, that problem can quite easily be circumvented by adding a 'type declaration' in the JSONTemplater, i.e. replace line 21 of the template above with:

"BAG_ID": xs:decimal(1.5010000000035E14)
