Does anyone know of a good way to analyze attribute values and determine whether a given value is an English word?
Or maybe even check it against a custom dictionary of names? I'm trying to clean up a bunch of values that picked up stray spaces when the PDFs were converted to data via OCR, so they look like this:
Attribute Value: Thi s is a sent e nc e.
What I want attribute to be corrected to: This is a sentence.
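To make the goal concrete, here is a rough Python sketch of one possible approach, not a tested solution: drop every space, then re-split the run of letters wherever the pieces match a word list, preferring the split with the fewest words. The `WORDS` set and the `segment` and `repair` names are made up for illustration; a real run would load a full English dictionary plus the custom list of names.

```python
import re

# Hypothetical word list for illustration only; in practice load a full
# English dictionary and/or a custom list of names.
WORDS = {"this", "is", "a", "sentence"}


def segment(chars, words=WORDS):
    """Split a run of letters into dictionary words, or return None."""
    n = len(chars)
    # best[i] holds the fewest-words split of chars[:i], or None if impossible
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            piece = chars[start:end]
            # Dictionary lookup is case-insensitive; original casing is kept
            if prefix is not None and piece.lower() in words:
                candidate = prefix + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


def repair(value, words=WORDS):
    """Remove OCR spaces and re-insert them at dictionary word boundaries."""
    # Split off trailing punctuation so it does not block dictionary lookups
    match = re.match(r"^(.*?)([.!?,;:]*)$", value.strip(), re.DOTALL)
    body, tail = match.group(1), match.group(2)
    letters = re.sub(r"\s+", "", body)
    pieces = segment(letters, words)
    # Leave the value untouched when no full segmentation exists
    return value if pieces is None else " ".join(pieces) + tail


print(repair("Thi s is a sent e nc e."))
```

With this toy word list the example value comes back as "This is a sentence.". Values that cannot be fully split into known words (including ones with punctuation in the middle) are returned unchanged, so they can be routed to manual review. With a large dictionary several splits may be possible, and the fewest-words rule is only a heuristic.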