Hi all,
I have one dataset (Record) containing features with attributes such as ID, Name, Date and Location.
None of the individual attributes is unique, but the combination of all of them is.
I have another dataset (Text) of features with a single attribute containing freeform multiline text.
Each Text feature contains ALL of the values of ONE feature of the Record dataset, but in no particular order, and a value is generally not an exact match for a whole line.
I need to identify which Text feature corresponds to which Record feature. Each Record feature should match zero or one Text feature.
I am assuming that Python and regex are the way to go, but I'm not sure what the most efficient way to process the data is.
Record Features
ID   Name   Date          Location
24   AAA    23 MAY 2019   X
32   AAA    07 JUN 2019   Y
24   BBB    07 JUN 2019   Z
A sample text feature could contain something like:
SEE 24
2926m
7000'
Search shelter X
32
500 2000 2500
800 32 200
AAA/ABC
07 JUN 2019
Y
The correct record in this case is 32-AAA.
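
To make the matching rule concrete, here is a minimal sketch of one possible approach, not a tested solution for the real data. It assumes that every attribute value appears in the text as a standalone token (so "24" should not match inside "2400"), that matching is case sensitive, and that the only variation inside a value such as the date is whitespace. The names value_pattern, compile_records and matching_records are made up for this illustration.

```python
import re

# Sample Record features from the table above: (ID, Name, Date, Location).
records = [
    ("24", "AAA", "23 MAY 2019", "X"),
    ("32", "AAA", "07 JUN 2019", "Y"),
    ("24", "BBB", "07 JUN 2019", "Z"),
]

# The sample Text feature from above.
text = """SEE 24
2926m
7000'
Search shelter X
32
500 2000 2500
800 32 200
AAA/ABC
07 JUN 2019
Y"""


def value_pattern(value):
    """Compile a regex that finds `value` as a standalone token.

    Assumption: a value must not be glued to other letters or digits,
    and internal spaces may be any run of whitespace (including newlines).
    """
    tokens = [re.escape(token) for token in value.split()]
    return re.compile(r"(?<!\w)" + r"\s+".join(tokens) + r"(?!\w)")


def compile_records(records):
    """Pre-compile one pattern per attribute so each is built only once."""
    return [(rec, [value_pattern(v) for v in rec]) for rec in records]


def matching_records(text, compiled):
    """Return every record whose attribute values ALL occur in the text."""
    return [rec for rec, patterns in compiled
            if all(p.search(text) for p in patterns)]


compiled = compile_records(records)
matches = matching_records(text, compiled)
print(matches)  # [('32', 'AAA', '07 JUN 2019', 'Y')]
```

In this sketch 24-AAA is rejected because its date is missing from the text, and 24-BBB because "BBB" never appears. Every record is still tested against every text, so on large datasets it may be worth checking the most selective attribute first (probably the date or the name) to rule records out early. A Text feature that returns more than one record would need a tie-break rule that I have not defined.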