Solved

JSON file that is dynamic and separated between data and headers

  • 11 March 2024
  • 6 replies
  • 121 views

Badge +2

Hello,
 

I have the following JSON file (sample) where by te attributes names are under “Channels” while the data values are under records. This JSON come to me as a API call and have no way of modifying the source.

 

Using FME 2019

 

{
  "deviceID": 12345,
  "channels": [
    "RecordTime",
    "Status",
    "Current Total Average Main",
    "Current Total Average Load1",
    "Voltage Total Average Total",
    "PhaseAngle Total Average Main",
    "PhaseAngle Total Average Load1",
    "Voltage Total Instantaneous Total",
    "Current Total Instantaneous Main",
    "Current Total Instantaneous Load1",
    "PhaseAngle Total Instantaneous Main",
    "PhaseAngle Total Instantaneous Load1",
    "ModemSignalStrength Total Average Total"
  ],
  "records": [
    {
      "recordingTime": "2002-03-03T00:00:00",
      "status": "None",
      "values": [
        2.781,
        0,
        237.95,
        37.4,
        0,
        238.27,
        2.804,
        0,
        37.66,
        0,
        14.82
      ]
    },
    {
      "recordingTime": "2002-03-03T00:05:00",
      "status": "None",
      "values": [
        2.351,
        0,
        238.14,
        42.7,
        0,
        238.19,
        2.267,
        0,
        44.55,
        0,
        15
      ]
    }
  ]
}

 

Ideally i would like to have the output as a table format. The data also changes with different channels both in names and counts depending on the IoT device. Any hints would be greatly apricated.

 

 

I am down the track with python at the moment 

        
        data = json.loads(feature.getAttribute('json'))
        channels = data['channels'] 
        records = []
        deviceID =  data['deviceID'] 
        for record in data['records']:
            row = {channels[i+2]: record['values'][i] for i in range(len(record['values']))}
            row['RecordTime'] = record['recordingTime']
            row['Status'] = record['status']
            row['DeviceID'] = deviceID         
            f = fmeobjects.FMEFeature()
            for K, V in row.items():
                f.setAttribute(K, V)
            self.pyoutput(f)
            
            #self.pyoutput(feature)
            
 

icon

Best answer by geomancer 11 March 2024, 12:58

View original

6 replies

Userlevel 6
Badge +34

In my approach, I would use the SchemaScanner transformer, but I think that was not available in 2019?

  • Split the JSON in two streams, one for the channels, one for the data.
  • Fragment both into features and then merge the channels on the records using a FeatureMerger and the json_index.
  • Create unexposed attributes using @Value(_attr_name) as attribute name.
  • Merge all columns into one feature per record using an Aggregator.
  • Get the Schema using the SchemaScanner.
  • Write to Excel using a dynamic FeatureWriter.

I attached a workspace demonstrating this, but it is 2021 as I do not have 2019 anymore,

Userlevel 5
Badge +36

I came up with a slightly different approach, using lists and the ListKeyValuePairExtractor from FME Hub. This workspace creates unexposed attributes, and uses a SchemaScanner before dynamically writing to Excel.

  • Extract [“channels"] and [“records”] (JSONExtractor)
  • Write the channels to a list, excluding 'RecordTime’ and Status, as they are not part of ["values"] later on (2 StringReplacers and an AttributeSplitter)
  • Split ["records"] into fragments, writing ["values"] to a list (JSONFragmenter)
  • Combine both lists into (unexposed) attributes (ListKeyValuePairExtractor)
  • Remove unnecessary attributes (AttributeRemover)
  • Get the schema (SchemaScanner), and export to Excel
Badge +2

Thankyou for two slightly different approaches. Looks like the SchemaScanner is a key ingredient which is not available in 2019. I hope our IT department will upgrade to the latest server version.

Userlevel 6
Badge +34

Thankyou for two slightly different approaches. Looks like the SchemaScanner is a key ingredient which is not available in 2019. I hope our IT department will upgrade to the latest server version.

I think it was introduced in 2020 or 2021.

Userlevel 3
Badge +18

Before the SchemaScanner transformer was introduced, the deprecated SchemaSetter custom transformer was used to create schema attributes for dynamic workflows and this custom transformer used Python in the background.

As you’re already using Python to extract data from the JSON, you can implement something similar in your PythonCaller. Attached is a simple example of setting schema attributes using Python and your example data.

There is also a bookmark containing a more JSON approach using JSONTemplater, JSONFragmenter, and XMLXQueryExtractor with JSONiq. Unfortunately, a bug (FMEENGINE-73347) that affects the flowing point precision of certain real64 values and “Current Total Instantaneous Main” becomes 2.8040000000000003 instead of 2.804.

I hope this information helps.

Badge +2

Before the SchemaScanner transformer was introduced, the deprecated SchemaSetter custom transformer was used to create schema attributes for dynamic workflows and this custom transformer used Python in the background.

As you’re already using Python to extract data from the JSON, you can implement something similar in your PythonCaller. Attached is a simple example of setting schema attributes using Python and your example data.

There is also a bookmark containing a more JSON approach using JSONTemplater, JSONFragmenter, and XMLXQueryExtractor with JSONiq. Unfortunately, a bug (FMEENGINE-73347) that affects the flowing point precision of certain real64 values and “Current Total Instantaneous Main” becomes 2.8040000000000003 instead of 2.804.

I hope this information helps.

Wow!!!! I am just going to spend the next days learning more about the cool stuff that is in your JSON Transformers, especially “JSONig”. Appreciate the education :) 

Reply