Question

StringSearcher bug?

  • 18 February 2019
  • 6 replies
  • 3 views

Hi.

I'm using the StringSearcher to fetch a list of positions from a large JavaScript data string on an HTML page.

The matches themselves are correct, but at some point the "startIndex" property on each match stops showing the correct value, so many of my found positions and names end up with the same value: 32572 and 32518 respectively.

These values are erroneous, and they're uncomfortably close to the infamous value 32768, so I suspect some kind of overflow has happened. Can anyone confirm this?

I'm using 2017 x64, and haven't been able to verify it in 2018.

Cheers


6 replies

Userlevel 5
Badge +25

It does look like an overflow, and it's present in 2018.1 as well. As soon as the index of a match goes over 32768, it starts failing and reports the previous index instead.

@Mark2AtSafe, could you file this as a bug? I've attached a workspace that recreates the issue.

listindex.fmw

Badge +14

@redgeographics Here's what it returned in 2019 beta. First match was b. Is that what it should have been?

overflow.txt

Userlevel 5
Badge +25

Sort of. It's really the match list that comes out of it that shows the problem: when a match's position in the string is past character 32768, it reports the wrong number.
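
For anyone who wants to check their own match list, here is a minimal sketch in plain Python (not FME code) that flags entries whose reported index does not actually point at the matched text. The function names, the test string and the "stuck" indices are all made up for illustration; only the symptom comes from this thread.

```python
import re

def true_positions(text, pattern):
    """Return (matched_text, start_index) pairs computed directly by Python's re module."""
    return [(m.group(0), m.start()) for m in re.finditer(pattern, text)]

def suspicious_matches(text, reported):
    """Return reported (matched_text, start_index) pairs whose index does not
    point at the matched text in the source string."""
    return [(s, i) for s, i in reported if text[i:i + len(s)] != s]

# Hypothetical data: one marker before position 32768 and two after it.
text = "x" * 32000 + "pos=1;" + "x" * 1000 + "pos=2;" + "x" * 1000 + "pos=3;"
expected = true_positions(text, r"pos=\d;")

# Simulated faulty match list: every index is stuck at the first value.
reported = [(s, expected[0][1]) for s, _ in expected]

print(suspicious_matches(text, reported))
```

If the list of suspicious matches is non-empty and the bad entries all share one index, that matches the behaviour described above.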

Userlevel 4
Badge +25

I will do that, yes. Thanks for checking it out and letting me know about it.

Userlevel 4
Badge +25

It's filed with the developers as FMEENGINE-58429. I'm imagining that it should be a simple-ish fix, unless there is a regex restriction that forces use of short as the data type.

Userlevel 4
Badge +25

Fixed in 2019 (build 19214 or greater). Indeed, it was an incorrect data type being used. It was Int16 and should have been Int32.
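
To illustrate why the data type matters, here is a rough Python sketch of the difference between the two integer widths. It is only a model: the thread does not say how the failed conversion was handled internally, so the "keep the previous value" logic in `store_indices` is an assumption chosen to mimic the reported symptom, and the index values are examples.

```python
import struct

INT16_MAX = 2**15 - 1   # 32767
INT32_MAX = 2**31 - 1   # 2147483647

def fits(fmt, value):
    """True if value can be stored in the given struct format
    ('<h' = 16-bit signed, '<i' = 32-bit signed)."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False

def store_indices(indices, fmt):
    """Hypothetical model of the symptom: when an index does not fit
    in the target type, the previously stored value is kept instead."""
    stored, last = [], 0
    for i in indices:
        if fits(fmt, i):
            last = i
        stored.append(last)
    return stored

indices = [32518, 32572, 33006, 34012]
print(store_indices(indices, "<h"))  # 16-bit: later values cannot be stored
print(store_indices(indices, "<i"))  # 32-bit: all values are stored
```

With the 16-bit format, every index above 32767 falls back to the last value that fit, which is the kind of repeated index the original post describes. With the 32-bit format the same list comes back unchanged.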
