Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
CSV files with comma and quotes
Message
From
12/11/2018 13:19:59
Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia
 
 
To
09/11/2018 08:46:59
General information
Forum:
Visual FoxPro
Category:
Coding, syntax & commands
Environment versions
Visual FoxPro:
VFP 9 SP2
Miscellaneous
Thread ID:
01663238
Message ID:
01663286
Views:
80
This message has been marked as a message which has helped to the initial question of the thread.
>Hi All
>
>I have a CSV file which uses comma as the primary delimiter between fields. However, some fields might contain a comma (e.g. a description field) and the supplier provides those fields enclosed in quotes. For example:
>
>AAA,BBB,"XXX YYY, ZZZ",CCC,"X1, Y1, Z1",DDD,
>
>I cannot do an import on this CSV file as it contains too many fields. I need to parse the file, line by line.
>
>How can I get each field so that the result would be:
>
>Field 1: AAA
>Field 2: BBB
>Field 3: XXX YYY, ZZZ
>Field 4: CCC
>Field 5: X1, Y1, Z1
>Field 6: DDD
>
>GETWORDNUM() works for fields separated with a comma but then it picks up individual sections in the description fields which are enclosed in quotes. I can write a manual parsing routine but was wondering if there was a faster way.

I tried to copy and paste a piece of code I use, but found that it's too involved with what I had in its history... So a brief description instead
c=filetostr("your.csv")
IF llUTF8   && a parameter passed, I'm not reading BOM because it's often missing.
	c=STRCONV(STRCONV(c,11),2)
ENDIF
lnRows=alines(this.aRows, c)
Now parse the first line into this.aFldNames (and check for any rubbish, as in repeated names, names in multiple lines etc etc, this array is your source of field names); keep the column count somewhere.
For rows 2 to lnRows, make a function like this
function getrow(ii)
lcRow=this.aRows[ii]
do while ["] $ lcRow
* now here do a strextract of anything between quotation marks, with end delimiters
* replace it back without quotes and with commas replaced with, say, pipes
	lcRow=strtran(lcRow, lcCut, lcReplace)
enddo
* now split into fields
lnCols=alines(aValues, lcRow, -1, -1, ",")
o=createobject("empty")
for ij=1 to lncols
	lcVal=chrtran( aValues[ij], "|", ",")
	addproperty(o, this.aFldNames[ij], lcVal)
endif
* lots of error handling needed here - lnCols may differ from the size of aFldNames etc etc.
return o

back to same old

the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform