
I'm extracting data nightly from JSON services which have no validation. These datasets have over 100 columns. Is there a way to import a schema into the AttributeValidator so that for every column it would test data type and max length? It would take forever to set up tests for data type, structure, and max length for every column in every dataset, but my FME processes keep failing.

Hi @rclarknxf, I think the solution would be different depending on how the desired schema (column names, data type for each column, max length for each column, and other restrictions if needed) is given. Could you please elaborate on this? It would be ideal if you could post a sample JSON document and the schema definition (maybe an external table?).



Example of JSON:

{"success":true,"message":null,"result":{"pageNum":1,"numPages":31,"records":[{"NAME":"VALUE","NAME":"VALUE","NAME":"VALUE","NAME":"VALUE",.....}]}}

Ideally I would like to have a transformer that works for anything; I'm not sure if the solution needs to be different for each dataset. I used the AttributeManager to make the names of the input columns the same as the output columns, so all it needs to do is ask, for value X, can I be inserted into this column successfully? So for a string: does the length of the value fit into the column? For a numeric: does the precision/scale of the value fit into the column? The writers have the schema (i.e. the definition of each column).
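
To put it in code terms, the per-column check I have in mind is roughly this (the column definitions below are made-up examples, not my real schema):

from decimal import Decimal, InvalidOperation

# Hypothetical column definitions, in the shape a writer schema describes them:
# a type, plus max length for strings or precision/scale for numerics.
SCHEMA = {
    "NAME":   {"type": "string",  "max_length": 50},
    "AMOUNT": {"type": "numeric", "precision": 10, "scale": 2},
}

def fits_column(value, col):
    """Return True if the value could be inserted into the column."""
    if col["type"] == "string":
        # Does the length of the value fit into the column?
        return len(str(value)) <= col["max_length"]
    if col["type"] == "numeric":
        # Does the precision/scale of the value fit into the column?
        try:
            t = Decimal(str(value)).as_tuple()
        except InvalidOperation:
            return False
        scale = max(-t.exponent, 0)
        precision = max(len(t.digits), scale)
        return precision <= col["precision"] and scale <= col["scale"]
    return False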


Are the attribute names (the actual values that appear as "NAME" in your example) known in advance?


If the JSON document looks like this:

{
    "success" : true,
    "message" : null,
    "result" : {
        "pageNum" : 1,
        "numPages" : 31,
        "records" : [
            {
                "NAME1" : "VALUE1",
                "NAME2" : "VALUE2",
                "NAME3" : "VALUE3",
                "NAME4" : "VALUE4"
            }
        ]
    }
}
you can extract each record containing the four attributes NAME1, NAME2, NAME3, and NAME4 with the JSONFragmenter.


  • JSON Query: json["result"]["records"]


You can then validate attribute values with the AttributeValidator.


Hey takashi, appreciate the help, but that's not quite what I'm asking - the AttributeValidator is the bit I want to populate automatically. Creating a validation rule in that transformer for every column is going to take ages.


Can you elaborate on how you want to set the validation rules?


From the schema of the output dataset.


It depends on how you can get the schema definition from the output dataset.

If the output dataset exists already, you can read the schema from the dataset with the Schema reader and validate the attributes of each record based on the schema definition.
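
For example, if I remember the Schema reader's output correctly, each schema feature exposes its columns as list attributes (attribute{}.name, attribute{}.native_data_type - treat these names as an assumption and verify them in your FME version). A PythonCaller could collect them into a lookup dictionary, roughly like this:

import fmeobjects

# PythonCaller sketch: collect the destination schema from Schema reader
# features into a dictionary. The list attribute names used here are an
# assumption to verify against your FME version.
class SchemaCollector(object):
    def __init__(self):
        self.schema = {}  # e.g. {"NAME1": "varchar(50)", ...}

    def input(self, feature):
        i = 0
        while True:
            name = feature.getAttribute("attribute{%d}.name" % i)
            if name is None:
                break
            self.schema[name] = feature.getAttribute(
                "attribute{%d}.native_data_type" % i)
            i += 1
        self.pyoutput(feature)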


If the output dataset doesn't exist yet, there is no way to get the schema definition at run-time unless you have a resource (e.g. an external lookup table) that defines the destination schema.


In any case, Python scripting may be necessary to validate the attributes for each record extracted from the JSON document.
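
As a rough illustration, a PythonCaller along these lines could test each attribute against a schema dictionary and tag failures so a downstream Tester can route the bad records. This is a minimal sketch assuming the schema is already available as a Python dictionary (the contents below are placeholders):

import fmeobjects

# Placeholder schema: attribute name -> (data type, max length).
# In practice this would come from the Schema reader or a lookup table.
SCHEMA = {
    "NAME1": ("string", 50),
    "NAME2": ("string", 100),
}

class FeatureProcessor(object):
    def input(self, feature):
        failed = []
        for attr, (data_type, max_length) in SCHEMA.items():
            value = feature.getAttribute(attr)
            if value is None:
                continue  # decide separately whether a missing value is an error
            if data_type == "string" and len(str(value)) > max_length:
                failed.append(attr)
        # Record the failing columns so a downstream Tester can route the feature.
        feature.setAttribute("_validation_failed", ",".join(failed))
        self.pyoutput(feature)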
