Plateforme Level Extreme
Abonnement
Profil corporatif
Produits & Services
Support
Légal
English
CSV files with comma and quotes
Message
De
12/11/2018 16:41:06
 
 
À
12/11/2018 15:04:59
Information générale
Forum:
Visual FoxPro
Catégorie:
Codage, syntaxe et commandes
Versions des environnements
Visual FoxPro:
VFP 9 SP2
Divers
Thread ID:
01663238
Message ID:
01663288
Vues:
57
Such things crop up with one forgotten kink at moments where you are already stressed.

My first action (after overcoming the programmers itch to fix myself) is to contact originator of the file and change the structure of the file if possible, bracketing ALL char fields with almost-never-used character if possible or at least get consistent handling of char fields.
Most of the time that is enough for handling with /with character clauses of append from type delimited.

>>>Hi All
>>>
>>>I have a CSV file which uses comma as the primary delimiter between fields. However, some fields might contain a comma (e.g. a description field) and the supplier provides those fields enclosed in quotes. For example:
>>>
>>>AAA,BBB,"XXX YYY, ZZZ",CCC,"X1, Y1, Z1",DDD,
>>>
>>>I cannot do an import on this CSV file as it contains too many fields. I need to parse the file, line by line.
>>>
>>>How can I get each field so that the result would be:
>>>
>>>Field 1: AAA
>>>Field 2: BBB
>>>Field 3: XXX YYY, ZZZ
>>>Field 4: CCC
>>>Field 5: X1, Y1, Z1
>>>Field 6: DDD
>>>
>>>GETWORDNUM() works for fields separated with a comma but then it picks up individual sections in the description fields which are enclosed in quotes. I can write a manual parsing routine but was wondering if there was a faster way.
>>
>>I tried to copy and paste a piece of code I use, but found that it's too involved with what I had in its history... So a brief description instead
>>
>>
c=filetostr("your.csv")
>>IF llUTF8   && a parameter passed, I'm not reading BOM because it's often missing.
>>	c=STRCONV(STRCONV(c,11),2)
>>ENDIF
>>lnRows=alines(this.aRows, c)
>>
>>
>>Now parse the first line into this.aFldNames (and check for any rubbish, as in repeated names, names in multiple lines etc etc, this array is your source of field names); keep the column count somewhere.
>>For rows 2 to lnRows, make a function like this
>>
>>
>>function getrow(ii)
>>lcRow=this.aRows[ii]
>>do while ["] $ lcRow
>>* now here do a strextract of anything between quotation marks, with end delimiters
>>* replace it back without quotes and with commas replaced with, say, pipes
>>	lcRow=strtran(lcRow, lcCut, lcReplace)
>>enddo
>>* now split into fields
>>lnCols=alines(aValues, lcRow, -1, -1, ",")
>>o=createobject("empty")
>>for ij=1 to lncols
>>	lcVal=chrtran( aValues[ij], "|", ",")
>>	addproperty(o, this.aFldNames[ij], lcVal)
>>endif
>>* lots of error handling needed here - lnCols may differ from the size of aFldNames etc etc.
>>return o
>
>
>Thanks, Dragan. I will give that a deeper look tomorrow.
>
>Regards
Précédent
Suivant
Répondre
Fil
Voir

Click here to load this message in the networking platform