Cleanup OCRd PDF and break into attributes

  • June 5, 2020
  • 2 replies
  • 10 views

jayqueue

Hello,

We OCR'd an invoice and want to split it into attributes for further analysis.

The input is:

Overzicht mobiele bundels Bundel Verbruik / Beschikbaar Eenheid
0498/00.00.00 Mobile Bis Pack 10 / 10000 Minutes
Mobile Bis Pack 45 / 2048 Mega Bytes
0498/00.00.01 Mobile Bis Pack 64 / 10000 Minutes
Mobile Bis Pack 25 I 10000 SMS
Mobile Bis Pack 892 I 2048 Mega Bytes
0498/00.00.02 Mobile Bis Pack 81 I 10000 Minutes
Mobile Bis Pack 32 I 10000 SMS
Mobile Bis Pack 577 I 2048 Mega Bytes
0498/00.00.03 Mobile Bis Pack 38 / 10000 Minutes
Mobile Bis Pack 53 / 10000 SMS
Mobile Bis Pack 276 / 2048 Mega Bytes

This is what I'm trying to achieve first (after that I can split it up further myself):

Overzicht mobiele bundels Bundel Verbruik / Beschikbaar Eenheid
0498/00.00.00 Mobile Bis Pack 10 / 10000 Minutes
0498/00.00.00 Mobile Bis Pack 45 / 2048 Mega Bytes
0498/00.00.01 Mobile Bis Pack 64 / 10000 Minutes
0498/00.00.01 Mobile Bis Pack 25 I 10000 SMS
0498/00.00.01 Mobile Bis Pack 892 I 2048 Mega Bytes
0498/00.00.02 Mobile Bis Pack 81 I 10000 Minutes
0498/00.00.02 Mobile Bis Pack 32 I 10000 SMS
0498/00.00.02 Mobile Bis Pack 577 I 2048 Mega Bytes
0498/00.00.03 Mobile Bis Pack 38 / 10000 Minutes
0498/00.00.03 Mobile Bis Pack 53 / 10000 SMS
0498/00.00.03 Mobile Bis Pack 276 / 2048 Mega Bytes
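Once every record carries its phone number, the further split into attributes could look roughly like the sketch below. It is plain Python rather than an FME workspace, and the field names (phone, bundle, used, available, unit) and the function name parse_record are made up for illustration. The pattern assumes the phone number always has the shape seen in the sample, and it accepts a capital I, a lowercase l, or a pipe in place of the slash, since the OCR output above shows the slash misread as I on some lines.

```python
import re

# Pattern inferred from the sample records only; the group names are
# illustrative and not taken from the original workspace.
RECORD = re.compile(
    r"^(?P<phone>\d{4}/\d{2}\.\d{2}\.\d{2})\s+"  # e.g. 0498/00.00.01
    r"(?P<bundle>.+?)\s+"                         # e.g. Mobile Bis Pack
    r"(?P<used>\d+)\s*[/I|l]\s*"                  # separator may be misread by OCR
    r"(?P<available>\d+)\s+"
    r"(?P<unit>.+)$"                              # e.g. Minutes, SMS, Mega Bytes
)


def parse_record(line):
    """Return the record as a dict of attributes, or None if it does not match."""
    match = RECORD.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["used"] = int(fields["used"])
    fields["available"] = int(fields["available"])
    return fields


print(parse_record("0498/00.00.01 Mobile Bis Pack 25 I 10000 SMS"))
print(parse_record("0498/00.00.03 Mobile Bis Pack 276 / 2048 Mega Bytes"))
```

Lines that do not fit the pattern, such as the header line, come back as None, so they can be filtered out or routed elsewhere for inspection.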

I use a TestFilter to keep only the lines that start with 0 or M; everything else is ignored.

Then a Tester checks whether the line starts with 0. If it does, a VariableSetter stores the phone number.

If the line does not start with 0, a VariableRetriever should fetch that phone number, but that doesn't work, and that's where I'm stuck.

Can anyone point me in the right direction please?

2 replies

redgeographics
  • Best Answer
  • June 5, 2020

I've used a StringSearcher to look for the phone number, then set it as a variable and retrieve it on the features where no phone number was found. If the order of features remains the same (and you can enforce that by setting that parameter on the StringSearcher), it'll assign the correct phone numbers.

cleanup.fmwt
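For readers who cannot open the workspace, the set-then-retrieve idea described in this answer can be sketched in plain Python. This is only an illustration of the logic, not the contents of cleanup.fmwt: the function name carry_phone_forward and the phone number pattern are assumptions based on the sample in the question. As in the answer, it only works if the lines arrive in their original order.

```python
import re

# Assumed phone number shape, inferred from the sample (e.g. 0498/00.00.00).
PHONE = re.compile(r"^\d{4}/\d{2}\.\d{2}\.\d{2}")


def carry_phone_forward(lines):
    """Prefix each bundle line with the most recent phone number seen."""
    current = None  # plays the role of the stored variable
    result = []
    for line in lines:
        line = line.strip()
        match = PHONE.match(line)
        if match:
            # Line already has a phone number: remember it ("set").
            current = match.group(0)
            result.append(line)
        elif line.startswith("Mobile") and current is not None:
            # Line has no phone number: reuse the remembered one ("retrieve").
            result.append(f"{current} {line}")
        # Anything else (header, blank lines) is ignored.
    return result


sample = [
    "Overzicht mobiele bundels Bundel Verbruik / Beschikbaar Eenheid",
    "0498/00.00.00 Mobile Bis Pack 10 / 10000 Minutes",
    "Mobile Bis Pack 45 / 2048 Mega Bytes",
    "0498/00.00.01 Mobile Bis Pack 64 / 10000 Minutes",
    "Mobile Bis Pack 25 I 10000 SMS",
]
for row in carry_phone_forward(sample):
    print(row)
```

Because the phone number is held in a single variable, any reordering of the lines between the "set" and "retrieve" steps would attach bundles to the wrong number, which is why the feature order matters.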


jayqueue
  • Author
  • June 5, 2020

Thanks @redgeographics, but I got it working; I just came back to say so, haha.

This is my solution