This is an open question to the FME Community. FME workspaces are very individual, but I believe there are fairly universal ways to test them (FME Testing Framework, rTest, DatasetValidator, Test Custom Transformer). This topic is about data quality, which is crucial! How do you test your data in FME?

As you say, it is very individual. In broad terms, I would first check data structure (i.e. the schema), then content.

The one thing that I think is hard to automate is the "sanity check" on the data: looking at the volume and considering whether or not it is an expected amount. If you often process data for local governments, you get an idea of the relationship between, for example, the number of inhabitants and the number of addresses. I.e. if I process data for a municipality with 100,000 inhabitants but I only have 5,000 addresses, I can quite confidently say my data is incomplete.
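The comparison itself can be scripted once you have settled on a rule of thumb; it is only choosing the expected ratio that stays a human judgement. Here is a minimal Python sketch (e.g. for a PythonCaller or a small standalone check); the 1-address-per-10-inhabitants threshold and the figures are assumptions for illustration, not values from real workspaces:

```python
# Rough volume sanity check: flag a dataset as suspicious when the
# number of addresses is implausibly low for the population size.
# The 1-address-per-10-inhabitants ratio below is an assumed rule of
# thumb for illustration only; tune it to your own region and data.

def addresses_look_complete(num_inhabitants: int,
                            num_addresses: int,
                            min_ratio: float = 0.1) -> bool:
    """Return True if the address count is at least min_ratio * inhabitants."""
    if num_inhabitants <= 0:
        return False  # cannot judge completeness without a population figure
    return num_addresses >= min_ratio * num_inhabitants


if __name__ == "__main__":
    # Example from above: 100,000 inhabitants but only 5,000 addresses
    # fails the check, so the dataset is probably incomplete.
    print(addresses_look_complete(100_000, 5_000))   # False -> investigate
    print(addresses_look_complete(100_000, 45_000))  # True  -> plausible volume
```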


Okay,

  1. How do you test the schema and content? Manually?
  2. How do you prepare expected data based on requirements?
  3. How do you report it? I mean, do you have test cases with execution results (passed/failed)?

 


Re. #1: We've used the ChangeDetector on Schema features with some good results, comparing a supplied schema with an expected schema and then reporting the differences.
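For anyone who prefers to see the idea outside FME: this is roughly the comparison we let the ChangeDetector do for us, sketched in plain Python. The attribute names and types are invented for illustration; they are not a real supplied schema:

```python
# Compare an expected schema against the schema actually supplied and
# report the differences, roughly what ChangeDetector does on Schema
# features. The attribute names/types below are illustrative only.

expected_schema = {"id": "int", "name": "varchar(100)", "geom": "point"}
supplied_schema = {"id": "int", "name": "varchar(50)", "geometry": "point"}

def diff_schemas(expected: dict, supplied: dict) -> dict:
    """Return missing, unexpected and type-mismatched attributes."""
    return {
        "missing":    sorted(set(expected) - set(supplied)),
        "unexpected": sorted(set(supplied) - set(expected)),
        "type_mismatch": {
            attr: (expected[attr], supplied[attr])
            for attr in set(expected) & set(supplied)
            if expected[attr] != supplied[attr]
        },
    }

print(diff_schemas(expected_schema, supplied_schema))
# {'missing': ['geom'], 'unexpected': ['geometry'],
#  'type_mismatch': {'name': ('varchar(100)', 'varchar(50)')}}
```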


Okay, I had the same idea (see the image below). For now, my FME Testing Framework is just an FME workspace which you can call (instead of a ChangeDetector) via the WorkspaceRunner, with two parameters: the actual data in CSV and the expected data in CSV. The output is a report of test cases (based on the CSV columns), passed/failed, in HTML (pytest) and XLS. I plan to do the same for SHP.
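To make the comparison step concrete, here is a minimal sketch of the actual-vs-expected check as a parametrised pytest, which the pytest-html plugin (run with --html=report.html) can turn into the kind of HTML report I mentioned. The file names and the column-by-column comparison are assumptions about my setup, not the workspace itself:

```python
# test_csv_compare.py -- sketch: compare actual vs expected CSV column by
# column, one pytest test case per column, so a pytest-html report shows
# passed/failed per column. The file names below are assumptions.
import csv

import pytest


def read_columns(path):
    """Read a CSV into {column_name: [values...]}."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    columns = rows[0].keys() if rows else []
    return {col: [row[col] for row in rows] for col in columns}


EXPECTED = read_columns("expected.csv")
ACTUAL = read_columns("actual.csv")


@pytest.mark.parametrize("column", sorted(EXPECTED))
def test_column_matches_expected(column):
    assert column in ACTUAL, f"column '{column}' missing from actual output"
    assert ACTUAL[column] == EXPECTED[column], f"values differ in column '{column}'"
```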

Thanks

