Question

Validate, read and export thousands of .txt files to CSV (the .txt files are in 2-3 different formats)

  • 31 August 2016
  • 5 replies
  • 1 view

Hi, I would like to know how I can import multiple .txt files, currently in 2-3 different formats, and export them to CSV.

I also want to validate each file (whether it failed to read or succeeded) and write a .txt file with the results.

I have thousands of .txt files in a directory.

The goal is to convert them all into one common format, in this case CSV.

The majority of the files are in the format shown in the sample below. A line starting with "/" must be skipped: good points have no "/", and lines that begin with a number are good.

In this case I want to extract points 127, 126 and 115.

I cannot rely on the header, because it never has the same number of lines.

 

//-----------------JOBFILE INFORMATION------------------------
/
/ Instrument: I 3" R1000 No.: 1610178
/ Date: 2016-04-07
/ Hours: 09H32
/ By:
/ Job: 60407P3DICF
/ Description: ;
/ Location: ;
/--------------------END OF JOBFILE INFO---------------------
127(TAB)1005.4345(TAB)10.2440(TAB)477.0120(SOME SPACES WITH TAB AT THE END);2016-04-07(TAB)##FILENAME##
126 1005.4517 -0.0040 476.9552 ; 2016-04-07 ##FILENAME##
115 1005.5490 -40.1908 477.9312 ; 2016-04-07 ##FILENAME##
//127 1005.4343 10.2444 477.0120 ;
//115 1005.5490 -40.1910 477.9320 ;

 

// A 1000.9592 -12.6230 476.3533 ;

The other format is just CSV saved as .txt, like this:

55,975.3467,-41.4636,9.0881,""_null"",2016-05-15,P3COAF
55,975.3474,-41.4667,9.0881,""_null"",2016-05-15,P3COAF
55,975.3454,-41.4638,9.0871,""_null"",2016-05-15,P3COAF
132,927.6054,-126.9099,7.1666,""_null"",2016-05-15,P3COAF
132,927.6058,-126.9067,7.1662,""_null"",2016-05-15,P3COAF
132,927.6065,-126.9092,7.1686,""_null"",2016-05-15,P3COAF
133,1051.5018,27.6244,5.6883,""_null"",2016-05-15,P3COAF
133,1051.5006,27.6245,5.6886,""_null"",2016-05-15,P3COAF
133,1051.5020,27.6238,5.6871,""_null"",2016-05-15,P3COAF
51000,1041.9476,-25.3123,3.3441,"FW",2016-05-15,P3COAF
51001,1041.3908,-25.2814,3.3415,"FW",2016-05-15,P3COAF
51002,1041.3800,-24.8944,3.4774,"FW",2016-05-15,P3COAF

I would like to validate whether each file is in one of these 2 formats or something else, make a list of all rejected files, and move them to a folder if possible. Thanks.


5 replies

Badge +22

I would create one workspace that converts your "//Job file" format and one that converts your CSV format.

Then create a controller workspace with a text file reader, with max features set to 1, and two WorkspaceRunners (one for each format).

Analyse that first line to determine which format the file is in, and send it to the appropriate WorkspaceRunner or log it as rejected.

But how can I read it line by line and skip the lines that have "/" at the beginning? Do I make a splitter on the return character, and then read and analyse the first character?

Badge +22

But how can I read it line by line and skip the lines that have "/" at the beginning? Do I make a splitter on the return character, and then read and analyse the first character?

Use a text_line reader followed by a Tester, with this test:

text_line_data Begins With / (Negate)
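For anyone prototyping outside FME, the same test can be sketched in plain Python. This is only an illustration of the logic, not FME code; the function name `split_comment_lines` is made up for the example, and the sample lines are shortened from the question.

```python
def split_comment_lines(lines):
    """Mimic a Tester with 'Begins With /' negated.

    Returns (passed, failed): lines that do NOT start with "/" and
    lines that do. Blank lines are dropped entirely.
    """
    passed, failed = [], []
    for line in lines:
        text = line.rstrip("\r\n")
        if not text.strip():
            continue  # ignore empty lines
        if text.lstrip().startswith("/"):
            failed.append(text)  # header or commented-out point
        else:
            passed.append(text)  # data line to keep
    return passed, failed


sample = [
    "//-----------------JOBFILE INFORMATION------------------------\n",
    "/ Date: 2016-04-07\n",
    "127\t1005.4345\t10.2440\t477.0120   \t;2016-04-07\n",
    "//127\t1005.4343\t10.2444\t477.0120\t;\n",
    "126\t1005.4517\t-0.0040\t476.9552\t;2016-04-07\n",
]
passed, failed = split_comment_lines(sample)
print(len(passed), "kept,", len(failed), "skipped")
```

Because the test looks only at the first character, it works no matter how many header lines a file has.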

 

 

Userlevel 2
Badge +17

Hi @rogerm, can you determine the format by looking at only the first line? e.g.

  • if the first line starts with a slash, the file is a 'JOBFILE',
  • else if the first line contains a comma, the format is CSV,
  • otherwise, the file is in some other format.

If that's possible, you can read the first line of every file with a Text File reader (as @jdh mentioned), filter the features with the TestFilter, and then move the files to separate folders according to the file format using the File Copy writer.
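As a rough sketch of that first-line rule outside FME, here is a small Python version that classifies each file and moves it into a folder named after its format. The function names and the folder labels ("jobfile", "csv", "other") are assumptions made for the example, and it relies on the first line being enough to identify the format.

```python
import shutil
from pathlib import Path


def detect_format(first_line):
    """Classify a file from its first line, using the rule described above."""
    text = first_line.strip()
    if text.startswith("/"):
        return "jobfile"
    if "," in text:
        return "csv"
    return "other"  # treated as rejected


def sort_files(source_dir, target_root):
    """Move each .txt file into target_root/<format>/ and return a report."""
    report = {}
    for path in sorted(Path(source_dir).glob("*.txt")):
        # Read only the first line; the file is closed before it is moved.
        with path.open(encoding="utf-8", errors="replace") as handle:
            first_line = handle.readline()
        fmt = detect_format(first_line)
        dest_dir = Path(target_root) / fmt
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(dest_dir / path.name))
        report[path.name] = fmt
    return report
```

Calling `sort_files("incoming", "sorted")` (both folder names are placeholders) returns a dictionary of file name to detected format, which could be written out as the list of accepted and rejected files.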

Userlevel 4
Badge +25

Hmmm. It would be quite funky if we could add the Tester dialog (or something similar) into the CSV parameters, so that the header and footer (or other lines to skip) can be defined by a special character.

I pasted that as an idea here - so please vote or comment upon it if you would like to see this functionality.

Other than that - as others have said here - the only way to remove comment lines is to read the file as one feature per line and then use a Tester (or other filter transformer) to drop the features representing comments.
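To illustrate the "one feature per line, then filter" idea outside FME, here is a minimal Python sketch that drops the comment lines of a JOBFILE and writes the remaining points as CSV rows. The column layout (point, x, y, z, then a date after the semicolon) is inferred from the sample in the question, and the function names are hypothetical.

```python
import csv
import io


def parse_jobfile_line(line):
    """Return [point, x, y, z, date] for a data line, or None if the
    line is a comment, blank, or does not match the assumed layout."""
    text = line.strip()
    if not text or text.startswith("/"):
        return None
    left, _, right = text.partition(";")
    fields = left.split()  # splits on tabs and spaces alike
    if len(fields) != 4 or not fields[0].isdigit():
        return None
    try:
        for value in fields[1:]:
            float(value)  # validate only; keep the original text
    except ValueError:
        return None
    extra = right.split()
    date = extra[0] if extra else ""
    return fields + [date]


def jobfile_to_csv(lines, out):
    """Write every valid data line to `out` as CSV; return the row count."""
    writer = csv.writer(out)
    count = 0
    for line in lines:
        row = parse_jobfile_line(line)
        if row is not None:
            writer.writerow(row)
            count += 1
    return count


sample = [
    "//-----------------JOBFILE INFORMATION------------------------",
    "/ Date: 2016-04-07",
    "127\t1005.4345\t10.2440\t477.0120   \t;2016-04-07\t##FILENAME##",
    "126\t1005.4517\t-0.0040\t476.9552\t; 2016-04-07\t##FILENAME##",
    "//127\t1005.4343\t10.2444\t477.0120\t;",
]
buffer = io.StringIO()
rows_written = jobfile_to_csv(sample, buffer)
print(rows_written, "rows")
print(buffer.getvalue())
```

A file that yields zero rows could be treated as rejected, which gives a simple validation step alongside the conversion.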
