
I am attempting to filter a table for any records containing characters that fall between ASCII codes 128 and 255.

In a Tester, the regex [^\x00-\x7F] correctly identifies matches when I paste values (copied from an Inspector) into the ‘Test String’ box. However, when I run the full table through it, every record fails.

This seems like a bug. Or am I just missing something about Regex in a Tester?
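For reference, here is a minimal Python sketch of what the pattern does outside of any Tester. The sample records and the helper name `has_non_ascii` are made up for illustration; they are not from the table in question.

```python
import re

# Matches any single character whose code point is above 127,
# i.e. anything outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def has_non_ascii(text: str) -> bool:
    """Return True if the text contains at least one non-ASCII character."""
    return NON_ASCII.search(text) is not None


# Hypothetical sample records: a plain apostrophe, a curly apostrophe
# (U+2019), and an accented capital E (U+00C9).
records = ["TODAY'S DATE", "TODAY\u2019S DATE", "CAF\u00c9"]
flagged = [r for r in records if has_non_ascii(r)]
```

Note that the pattern flags everything above 127, not only codes 128 to 255, so a curly apostrophe (U+2019) is caught as well. A class such as `[\x80-\xFF]` would be the stricter equivalent of the range described above.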

Are you able to provide a sample value that you are testing?



Below are a few of the test strings in question. But now I'm even more confused. I pushed these values out to an .FFS file (thinking I'd be able to upload it here, but I guess there are no attachments), and when I pass the FFS through the Tester, it works. Is the behaviour of the Tester somehow different depending on the source format? The source format is PostGIS.

 

SERVICE REPORT ? PREVENTATIVE MAINTENANCE APPT:

2014 RELINE PROGRAM LETTER, INFORMATION, CONTACT SHEET AND AUTHORIZATION MAILED TO HOME OWNER ?MAY 8, 2014, REQUESTING RESPONSE BY MAY 23, 2014.

2013 FEE INCREASE LETTER SENT OUT TODAY?S DATE TO MAKE HOME OWNER(S) AWARE OF CHANGES TO BYLAW C20-12 FOR THE 2013 PREVENTATIVE MAINTENANCE SEASON.

AT ECOLINER?S REQUEST, A RELINE UPDATE LETTER HAS BEEN MAILED OUT TO HOME OWNER, TODAY?S DATE. COPY OF LETTER IS ON FILE. LETTER WILL DETAIL THE FOLLOWING REQUESTS:

AS PER GREAT PLAINS, CHANGE OF TITLE ON ?28-01-14?.

 

 



I suspect it's something to do with the encoding.
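To illustrate how encoding alone could produce this, here is a small Python sketch. It assumes, purely for illustration, that the original text contained a Windows-1252 curly apostrophe (byte 0x92) and that some step in the chain converted it lossily to ASCII. Nothing in this thread confirms where or whether that happened.

```python
import re

NON_ASCII = re.compile(r"[^\x00-\x7F]")

# Assumed raw bytes: byte 0x92 is the right single quote in Windows-1252.
raw = b"TODAY\x92S DATE"

# Decoded with the correct code page, the apostrophe becomes U+2019,
# which is outside ASCII, so the pattern matches.
decoded = raw.decode("cp1252")
matches_decoded = NON_ASCII.search(decoded) is not None

# A lossy conversion to ASCII substitutes "?" for the unmappable
# character. The result is pure ASCII, so the pattern finds nothing.
lossy = decoded.encode("ascii", errors="replace").decode("ascii")
matches_lossy = NON_ASCII.search(lossy) is not None
```

Under those assumptions `lossy` is `TODAY?S DATE`, which looks like the sample strings above and would fail the test even though the source value was non-ASCII.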


@bigclyde If this is a text file you're reading with the Text File reader, then one of the reader parameters is the Character Encoding.

 

(you should be able to upload files - try zipping the sample data)



Thanks Mark,

No, it wasn't a text file (sorry, I should have been more specific: it's PostGIS), but you correctly surmised that encoding was the root of it. The database I was working from drew its records from another one (SQL Server) that used a different encoding. I ultimately got around it by changing the SQL of the view to replace the dodgy characters with spaces.
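The actual fix was done in the SQL of the view, which isn't shown here. As a rough Python analogue of the same idea (the function name `blank_non_ascii` is hypothetical), replacing every non-ASCII character with a space could look like this:

```python
import re

NON_ASCII = re.compile(r"[^\x00-\x7F]")


def blank_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with a single space."""
    return NON_ASCII.sub(" ", text)


# Hypothetical input with a curly apostrophe (U+2019).
cleaned = blank_non_ascii("AT ECOLINER\u2019S REQUEST")
```

This keeps the string length unchanged, since each offending character becomes exactly one space.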

So I understand why the Tester couldn't see anything in the raw feed, but why was the Inspector able to produce a version that the Tester did recognise?