>I'm working on a project where I need to shred a string (a la shredding an XML string) into a set of tables. The string is passed to me by a call to an *.OCX. The typical length of the string is 43K+ bytes. The string structure looks like this:
>
>Data-Pair /pair delimiter/ Data-Pair /pair delimiter/ etc.
>There are additional delimiters inside each data-pair. Essentially a data-pair consists of a delimited string that defines the schema and also separates the schema from the actual value. A Data-Pair ends up looking something like this:
>Grand-ParentField/Delimiter/ParentField/Delimiter/Field/Value Delimiter/Value.
>
>My "parsing" engine works, but it's very slow (about 2-3 seconds per string on a PII 300 MHz notebook). I need to get down to sub-second times without simply buying a faster notebook. I'm working on enhancements to make the string shorter, but I'm also looking for a more efficient way to pass through the string as I shred it. That's what this post is for. So far the most efficient method of parsing the string is this:
>
>Do While .T.
>DataPair = SubString( LongString, 1, At( , LongString ) - 1 )
>* Parsing code goes here
>LongString = SubString( LongString, At( , LongString ) + 1 )
>If Len( LongString ) = 0
> Exit
>EndIf
>EndDo
>
>I've tried NextWord from FoxTools (I'm running VFP 6), but it's actually slower than this approach.
>
>Ideas anyone on a better way to work my way through the string?
Without knowing exactly what the delimiter looks like it's a bit tough, but here are some suggestions:
1. Replace the DO WHILE .T. with a FOR...NEXT
2. Use OCCURS() to determine the number of occurrences of the delimiter
3. Use the optional occurrence number parameter of the AT() function.
It might look something like:
lncount = OCCURS(lcdelimiter, lcstring)
FOR lni = 1 TO lncount - 1
lnstart = AT(lcdelimiter, lcstring, lni) + 1
lnfinish = AT(lcdelimiter, lcstring, lni + 1) - 1
* The data pair is SUBSTR(lcstring, lnstart, lnfinish - lnstart + 1)
NEXT
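All that will be left is to parse the portion before the first delimiter (unless the string starts with one) and the ending portion after the last delimiter.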
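Pulling those suggestions together, here is a rough sketch of a complete pass that also picks up the first and last pieces. It is a variation on the loop above rather than a tested drop-in: it remembers where the previous piece ended so each iteration needs only one AT() call. The names lcString, lcDelimiter and lcPair, the sample data, and the "|" delimiter are all assumptions for illustration, since the real delimiter isn't shown.

```foxpro
* Sketch only: sample data and the "|" delimiter are assumed.
LOCAL lcString, lcDelimiter, lnCount, lnI, lnStart, lnFinish, lcPair
lcString = "A=1|B=2|C=3"
lcDelimiter = "|"

* Count the delimiters once, up front.
lnCount = OCCURS(lcDelimiter, lcString)

lnStart = 1
FOR lnI = 1 TO lnCount
	* Last character before the lnI-th delimiter.
	lnFinish = AT(lcDelimiter, lcString, lnI) - 1
	lcPair = SUBSTR(lcString, lnStart, lnFinish - lnStart + 1)
	? lcPair	&& parsing code goes here
	* The next piece begins just past this delimiter.
	lnStart = lnFinish + 1 + LEN(lcDelimiter)
NEXT

* Whatever follows the last delimiter is the ending portion.
lcPair = SUBSTR(lcString, lnStart)
IF LEN(lcPair) > 0
	? lcPair	&& parsing code goes here
ENDIF
```

The main point is that lcString is never rebuilt inside the loop, so the 43K string isn't copied on every pass. Using LEN(lcDelimiter) instead of a hard-coded 1 keeps the offsets right if the delimiter turns out to be more than one character.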
George
Ubi caritas et amor, deus ibi est