Question

How many of us are testing data in FME?

  • November 10, 2022
  • 4 replies
  • 34 views

michpil
Contributor

This is an open question to the FME Community. FME workspaces are very individual, but I believe there is a fairly universal way to test them (FME Testing Framework, rTest, DatasetValidator, Test Custom Transformer). This topic is about data quality, which is crucial. How do you test your data in FME?

4 replies

redgeographics
Celebrity

As you say, it is very individual. In broad terms, I would first check data structure (i.e. the schema), then content.

The one thing that I think is hard to automate is the "sanity check" on the data: looking at the volume and considering whether it is the expected amount. If you often process data for local governments, you get a feel for the relationship between, for example, the number of inhabitants and the number of addresses. For instance, if I process data for a municipality with 100,000 inhabitants but only have 5,000 addresses, I can quite confidently say my data is incomplete.
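For what it's worth, a rough version of that volume check can be scripted once you have a rule of thumb. The sketch below is plain Python, not an FME transformer, and the function name and the 0.2 to 0.8 ratio bounds are hypothetical placeholders; the real bounds would have to come from your own experience with the data.

```python
def check_address_volume(inhabitants, addresses, min_ratio=0.2, max_ratio=0.8):
    """Flag a dataset whose addresses-per-inhabitant ratio looks implausible.

    min_ratio and max_ratio are assumed rule-of-thumb bounds, not official values.
    Returns (ok, ratio).
    """
    if inhabitants <= 0:
        raise ValueError("inhabitants must be positive")
    ratio = addresses / inhabitants
    return (min_ratio <= ratio <= max_ratio), ratio


# The example from the post: 100,000 inhabitants but only 5,000 addresses.
ok, ratio = check_address_volume(100_000, 5_000)
print(ok, ratio)  # False 0.05
```

It will never replace actually looking at the data, but it can catch an obviously incomplete delivery before anyone opens it.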


michpil
Contributor
  • Author
  • November 16, 2022

Okay,

  1. How do you test schema and content? Manually?
  2. How do you prepare expected data based on requirements?
  3. How do you report it? I mean, do you have test cases with execution results (passed/failed)?

 


redgeographics
Celebrity

Re. #1: We've used the ChangeDetector on Schema features with some good results, comparing a supplied schema with an expected schema and then reporting the differences.
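To illustrate the idea outside of FME: the comparison boils down to three kinds of differences, namely missing attributes, unexpected attributes and changed types. The sketch below is plain Python and does not reproduce how the ChangeDetector works internally; the function name and the attribute names in the example are made up for illustration.

```python
def diff_schema(expected, actual):
    """Compare two schemas given as {attribute_name: type_name} dicts."""
    expected_names = set(expected)
    actual_names = set(actual)
    return {
        # Attributes the expected schema requires but the supplied data lacks.
        "missing": sorted(expected_names - actual_names),
        # Attributes present in the supplied data but not expected.
        "unexpected": sorted(actual_names - expected_names),
        # Attributes present in both, with differing types: (expected, actual).
        "type_mismatch": {
            name: (expected[name], actual[name])
            for name in sorted(expected_names & actual_names)
            if expected[name] != actual[name]
        },
    }


# Hypothetical schemas, purely for illustration.
expected = {"id": "int", "street": "string", "postcode": "string"}
actual = {"id": "string", "street": "string", "city": "string"}
print(diff_schema(expected, actual))
```

An empty result in all three categories means the supplied schema matches the expected one.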


michpil
Contributor
  • Author
  • November 17, 2022

Okay, I had the same idea (see the image below). My FME Testing Framework is now just an FME workspace that you can add (instead of the ChangeDetector) via a WorkspaceRunner with two parameters: the actual data in CSV and the expected data in CSV. The output is a test case report (test cases are based on the CSV columns) in HTML (pytest) and XLS. I plan to do the same for SHP.
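As a minimal sketch of the "one test case per CSV column" idea, in plain Python: this is not the framework itself, the function names are hypothetical, and it assumes both files list their rows in the same order.

```python
import csv
import io


def read_csv(text):
    """Parse CSV text into (column_names, rows as dicts)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return reader.fieldnames or [], rows


def compare_csv_columns(expected_text, actual_text):
    """Return {column: "passed" | "failed"}, one test case per expected column."""
    columns, expected_rows = read_csv(expected_text)
    _, actual_rows = read_csv(actual_text)
    results = {}
    for column in columns:
        expected_values = [row.get(column) for row in expected_rows]
        actual_values = [row.get(column) for row in actual_rows]
        # A column passes only if every value matches, row by row.
        results[column] = "passed" if expected_values == actual_values else "failed"
    return results


# Illustrative data: the second city name differs in the actual output.
expected = "id,city\n1,Utrecht\n2,Delft\n"
actual = "id,city\n1,Utrecht\n2,Delf\n"
print(compare_csv_columns(expected, actual))  # {'id': 'passed', 'city': 'failed'}
```

The resulting dictionary could then be handed to whatever writes the report.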

Thanks

