Tuesday 25 December 2012

COPY: How to skip first column that has a different delimiter

I had this very interesting problem sometime back wherein customer had a data file with first column delimiter different from rest of the columns. Customer wanted to load this file using COPY command and wanted to skip first column.

The data file looked like this
The first column of the data file has ~ as a field delimeter
The rest of the columns have pipe (|) as a filed delimeter

$ cat load.out
SkipMe1 ~ A1 | B1 | C1 
SkipMe2 ~ A2 | B2 | C2 

The table definition is below

CREATE TABLE t ( 
AColumn VARCHAR(10), 
BColumn VARCHAR(10), 
CColumn VARCHAR(10) 
); 
CREATE PROJECTION tp ( 
AColumn, 
BColumn, 
CColumn 

AS SELECT * from t; 

The data once loaded was supposed to be as shown below:

AColumn | BColumn | CColumn 
------+-----------+------- 
A1 | B1| C1 
A2 | B2| C2


Skipping a column in COPY command is easy using "Filler", and so is specifying column delimiter. But I had never tried before if we can specify delimiter for a specific column and that too in conjunction with Filler option. So I tested the solution out and voila...

The COPY command I used is shown below. Note the use of "FILLER" and specific use of "DELIMITER" only for first column.

$ cat load.out | vsql -c "copy t(c1 FILLER varchar(10) delimiter '~',c2 FILLER varchar(10), c3 FILLER varchar(10),c4 FILLER varchar(10), AColumn as c2, BColumn as c3, CColumn as c4) from stdin direct"; 


No comments:

Post a Comment