This is a problem that keeps coming up: we need to report text differences between two versions of reports. I can access the before and after fields and read them into FME, but I am looking for a way to analyze the strings and report the differences (additions and deletions).
The ultimate goal is to format a report with additions (green) and deletions (red) clearly marked. As a start, though, I have to find some method to analyze the two strings and break them down enough to re-assemble a properly formatted string, perhaps with HTML. Nothing is jumping out right now as a simple answer, so I suspect it is not that easy :)
I have looked into the custom workspaces "FuzzyStringComparer" and "FuzzyStringCompareFrom2Datasets", but I don't think they will help me much with the above. I thought about chopping each string into individual words and looping over them with regular expressions to determine which chunks existed before, and from that identify additions and deletions, but it is starting to look more like a thesis project than something easily achieved. So I thought I'd ask here to see if anyone has other ideas that might jog my brain and set it on a potential path to success! Thanks in advance for your insight.
PS: I know of several online text diff tools, and I even found a very good PDF compare tool that retains the original formatting (which is desirable, but not crucial for this task). What I am really after is a way to report data differences visually while keeping some control over the layout. BeyondCompare does a very good job too, but it lacks the ability to create a single variance report with all the differences.
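
To make the word-chopping idea above a bit more concrete, here is a rough sketch of what I have in mind, assuming Python is an option (for example inside a PythonCaller). It is not a tested FME solution. It leans on `difflib.SequenceMatcher` from the Python standard library instead of hand-rolled regex loops, and the function name `html_word_diff` and the inline red/green styles are just placeholders of mine:

```python
import difflib
import html
import re


def html_word_diff(before: str, after: str) -> str:
    """Return an HTML string marking deletions in red and additions in green."""
    # Split into words AND whitespace runs so spacing survives re-assembly.
    a = re.findall(r"\s+|\S+", before)
    b = re.findall(r"\s+|\S+", after)

    # autojunk=False keeps frequent tokens (such as spaces) from being
    # treated as junk when the texts are long.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)

    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Escape the text so characters like < and & cannot break the HTML.
        old = html.escape("".join(a[i1:i2]))
        new = html.escape("".join(b[j1:j2]))
        if tag == "equal":
            parts.append(old)
        if tag in ("delete", "replace"):
            parts.append(f'<del style="color:red">{old}</del>')
        if tag in ("insert", "replace"):
            parts.append(f'<ins style="color:green">{new}</ins>')
    return "".join(parts)


if __name__ == "__main__":
    print(html_word_diff("The quick brown fox", "The quick red fox jumps"))
```

The idea would be to pass the before and after attribute values into a function like this and write the result to a new attribute for the HTML report. Diffing on word tokens rather than characters should keep the markup readable, though I have not checked how well it copes with long reports or heavily rearranged paragraphs.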