Question

Identifying and removing repeated block of text in multiline string

  • June 25, 2019

jdh

I have an attribute containing multiline text in which an arbitrary block of lines is repeated near the end of the data. I can't just remove duplicate lines, because an individual line might have legitimate duplicates.

In the example below, lines 2 and 15 are identical but correct, whereas lines 17-23 are a repetition of lines 10-16 and need to be removed. The number of repeated lines varies, but there will always be at least one, and the repeated block will always include the last REV line before the footer. The footer data varies in length and content, but will never include a line beginning with REV.
SITE 1
REV 10-1P
REV 10-3D
SITE 2
REV 20-3V3
REV 20-3V4
REV 20-3V5
REV 20-3V6
REV 20-3W
REV 20-3X
REV 20-3X4
REV 20-3X5
REV 20-3X6
SITE 3
REV 10-1P
REV 10-1P1
REV 20-3X
REV 20-3X4
REV 20-3X5
REV 20-3X6
SITE 3
REV 10-1P
REV 10-1P1
Footer Data Here

The data is not in a text file that can be read from the bottom up, though I suppose it could be written to one and read back in if required.
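
For illustration, here is a minimal Python sketch of one way the problem described above might be approached without writing the data to a file. It rests on assumptions that go beyond what is stated: the attribute value is available as an ordinary Python string, the repeated block immediately follows the block it copies (as it does in the example), and when more than one block length would match, the longest is taken to be the repetition. The function name `remove_trailing_repeat` and the `SAMPLE` variable are hypothetical names used only for this example.

```python
def remove_trailing_repeat(text: str) -> str:
    """Drop a block of lines that is repeated immediately before the footer.

    Assumptions (not confirmed by the original post): the repeat directly
    follows the block it copies, and the longest matching block is the
    one to remove.
    """
    lines = text.splitlines()

    # The footer never contains a line beginning with REV, so the last
    # REV line marks the end of the body.
    rev_indexes = [i for i, line in enumerate(lines) if line.startswith("REV")]
    if not rev_indexes:
        return text  # nothing to compare against; leave the text unchanged

    end = rev_indexes[-1] + 1
    body, footer = lines[:end], lines[end:]

    # Try candidate block lengths from longest to shortest. A length k
    # matches when the last k body lines equal the k lines before them.
    for k in range(len(body) // 2, 0, -1):
        if body[-k:] == body[-2 * k:-k]:
            body = body[:-k]  # keep the first copy, drop the repeat
            break

    # Note: splitlines() discards a trailing newline, so one is not restored.
    return "\n".join(body + footer)


SAMPLE = """SITE 1
REV 10-1P
REV 10-3D
SITE 2
REV 20-3V3
REV 20-3V4
REV 20-3V5
REV 20-3V6
REV 20-3W
REV 20-3X
REV 20-3X4
REV 20-3X5
REV 20-3X6
SITE 3
REV 10-1P
REV 10-1P1
REV 20-3X
REV 20-3X4
REV 20-3X5
REV 20-3X6
SITE 3
REV 10-1P
REV 10-1P1
Footer Data Here"""

if __name__ == "__main__":
    print(remove_trailing_repeat(SAMPLE))
```

Because the comparison is made on whole blocks rather than on single lines, the legitimate duplicate of `REV 10-1P` on lines 2 and 15 is left alone. If the repeat can be separated from its original by other lines, or if a shorter match should win, the search loop would need to change accordingly.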
