Question

Querying data within a table

  • 9 August 2017
  • 8 replies
  • 3 views

Badge

Hi, I have a query please.

The scenario is this - employees working for a national company are based at different offices. Each employee has a card number, which is unique to them. Every time an employee uses their card to enter an office by swiping their card, it is registered...and likewise when they leave the office too. And so the table I have includes (for each employee) has a field for "Card Number" and also a field called "Event" which records whether an employee has "Swiped In" or "Swiped Out."

The question I have is this - I want to be able to compare, for each employee, the number of times they have "Swiped In" and "Swiped Out" You would expect if an employee has swiped in say 10 times, there would also be 10 occurrences of swiping out. However, a casual glance at the data does not always suggest this....I can see for example that sometimes employees have swiped in more times than they have swiped out.

Therefore, is there a good way in FME whereby I can report for each employee (using their unique Card Number) where there is a discrepancy between the number of times they have swiped in and swiped out ?

Any help appreciated, thanks


8 replies

Userlevel 4
Badge +25

Assuming your events table has just 2 columns: cardnumber and eventtype, which can be in or out, this is fairly simple:

The ListBuilder groups all entries by unique cardnumber and then the ListHistogrammer counts how many times each value for eventtype occurs. I've introduced some errors in the data and am testing for an equal number of checkins and checkouts.

Badge

Assuming your events table has just 2 columns: cardnumber and eventtype, which can be in or out, this is fairly simple:

The ListBuilder groups all entries by unique cardnumber and then the ListHistogrammer counts how many times each value for eventtype occurs. I've introduced some errors in the data and am testing for an equal number of checkins and checkouts.

I have tried this approach on some test data, and did a combination of even "ins" and "outs" as well as some un-even "ins" and "outs", and all my records went to the "Unfiltered" port from the TestFilter Transformer....?

 

 

Userlevel 4
Badge +25

I don't suppose you accidentally mangled the data so that it turned out to be a valid set again? In other words, are you sure there's errors in there? What do you see if you inspect the data after the ListHistogrammer, can you find the features that are supposed to be bad and confirm the histogram counts are not the same?

Here's my workspace and sample data: xlsxr2none.fmwt

Badge +3

To compare in and out you can list build by card number. Then list build by event. Count the sub lists using a listelementcounter on each and compare the results.

To see where they go asynchronic you can sort by card number and swipe time.

Using attribute creator you can use adjacent feature by 1 under advanced attribute handling

to check whether or not an in-swipe is followed by an out-swipe.

Userlevel 2
Badge +12

Assuming this is stored in a SQL enabled database, you can use the SQLCreator transformer with the query:

Select t1.Cardnumber

, Select Count(*) as SWIPEDIN from table t2 where t2.Cardnumber = t1.Cardnumber) and t2.Event = 'Swiped In'

, Select Count(*) as SWIPEDOUT from table t3 where t3.Cardnumber = t1.Cardnumber) and t3.Event = 'Swiped Out'

From table t1

Group by t1.Cardnumber;

This query returns per Cardnumber the number of SWIPEDIN and SWIPEDOUT.

You can use the Tester transformer to see if these numbers are the same.

Userlevel 3
Badge +13
Hi @cliff_wade, is this the same question as: Replicate in FME ?
Badge
Hi @cliff_wade, is this the same question as: Replicate in FME ?
Yes, I did not feel I explained the issue clearly enough the first time. Thanks very much for your reply on the second thread, that should indeed work - but I do have 2 further questions on that solution, and would appreciate your thoughts on those please...

 

 

Userlevel 4

Assuming this is stored in a SQL enabled database, you can use the SQLCreator transformer with the query:

Select t1.Cardnumber

, Select Count(*) as SWIPEDIN from table t2 where t2.Cardnumber = t1.Cardnumber) and t2.Event = 'Swiped In'

, Select Count(*) as SWIPEDOUT from table t3 where t3.Cardnumber = t1.Cardnumber) and t3.Event = 'Swiped Out'

From table t1

Group by t1.Cardnumber;

This query returns per Cardnumber the number of SWIPEDIN and SWIPEDOUT.

You can use the Tester transformer to see if these numbers are the same.

Completely agree. If there's a lot of data in the event table (and the referenced columns are properly indexed), this will be an order of magnitude faster than doing it in FME.

Reply