>I need to convert binary strings to ASCII by changing nonprintable and non-ASCII
>characters to \\nnn strings, where nnn is the octal representation.
>
>I created the following routine. For a 381 KB string, the conversion takes 5 (!) minutes, and Task Manager shows that my application uses many MB of memory.
>
>Is there any way to speed up this conversion?
>
>
>FUNCTION toBYTEA( cStr )
>
>LOCAL i, j, cRes, nChr, cDig
>
>cRes = ''
>FOR i=1 TO LEN(m.cStr)
> nChr = ASC( SUBSTR( m.cStr, m.i,1))
> * The following four lines can be removed if that speeds up the conversion.
> IF BETWEEN( nChr, 32,126 ) AND !INLIST( nChr,39,92)
> cRes = m.cRes + SUBSTR( m.cStr, m.i,1)
> LOOP
> ENDIF
>
> cDig = ''
> FOR j=1 TO 3
> cDig= CHR(ASC('0')+ m.nChr%8) + m.cDig
> nChr = INT(m.nChr/8)
> ENDFOR
> cRes = m.cRes + '\\'+ m.cDig
> ENDFOR
>RETURN m.cRes
>ENDFUNC
Andrus,
Think like humans. I tend to visualize strings as trains with N wagons (bytes). What you are doing could be visualized this way:
You have a train with 380K wagons and another one parallel to it with more wagons (the octal escapes expand the data). You have maybe thousands of fast runners equipped with walkie-talkies.
The first worker runs to the first train's 1st wagon and reports the Asc() value over his walkie-talkie. Another worker runs and marks the second train's 1st wagon (and, if need be, the next 2 wagons) with the octal value (suppose the octal value is supplied to him while he is running).
Now, this is how Substr() works: another worker starts running from the starting point and gets to the 2nd wagon...
This process continues for 380K wagons (and even assuming the last worker is a good sprinter, the poor man needs to sprint past 380K wagons). In computer memory it is not exactly the same thing, but it can be modeled roughly this way (especially the allocation of memory for the second train).
Humans are slow and would solve this by eliminating the unnecessary runs: get to the 1st wagon, read it, turn to the parallel train and mark it, move to the next wagon... IOW, one-pass work.
Actually, I started writing this while running your exact code on a 500K+ file, and it was still running (in the meantime I made myself a coffee). Oh, it just finished (2085.059 seconds).
Calling your toByteA for single bytes, 380K times, could make it faster (the converted string is short, at most a 3-digit octal escape). Beyond that, it might be faster still if the octal conversion is done once for all 256 characters and kept in a lookup array (on my computer that part takes only 2 milliseconds, so I didn't bother to optimize it further).
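The lookup array can also be filled without calling toByteA at all. The following is only a sketch of that idea, not code from the thread: the name BuildOctetTable and its parameter are made up for illustration, and it assumes the same rules as toByteA (bytes 32..126 pass through, except 39 and 92, and everything else becomes '\\' plus three octal digits).

```foxpro
* Sketch only: BuildOctetTable is a hypothetical helper, not part of the original code.
Local Array aOctet[256]
BuildOctetTable(@aOctet)
? aOctet[Asc('A')+1]    && A
? aOctet[1]             && \\000 (escape for Chr(0))

Procedure BuildOctetTable
Lparameters taTable
Local lnByte
For lnByte = 0 To 255
    If Between(m.lnByte, 32, 126) And !Inlist(m.lnByte, 39, 92)
        * Printable ASCII other than quote and backslash is kept as-is
        taTable[m.lnByte+1] = Chr(m.lnByte)
    Else
        * Three octal digits: bits 6-7, bits 3-5, bits 0-2
        taTable[m.lnByte+1] = '\\' + ;
            Chr(48 + Bitrshift(m.lnByte, 6)) + ;
            Chr(48 + Bitand(Bitrshift(m.lnByte, 3), 7)) + ;
            Chr(48 + Bitand(m.lnByte, 7))
    Endif
Endfor
Endproc
```

Element n+1 holds the conversion for byte value n, which is the same indexing the lookup in ConvertToOctet expects.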
Here are both approaches. The worse one took 5.25 seconds and the second 1.649 seconds on my computer (Athlon 2500+, 512 MB RAM, 2*40 GB IDE). I didn't change your toByteA a bit; it's the same:
* Approach 1: call toByteA once for every byte read from the file
Local lcFile
lcFile = Getfile()
Strtofile(ConvertToOctet2(m.lcFile), Forceext(m.lcFile,'oc1'))
Function ConvertToOctet2
Lparameters tcFile
Local handleIn, handleOut, lcTemp, lcResult
lcTemp = Sys(2015)+'.tmp'
handleIn = Fopen(m.tcFile)
handleOut = Fcreate(m.lcTemp)
Do While !Feof(handleIn)
=Fwrite(handleOut, toByteA(Fread(handleIn,1)))
Enddo
Fclose(handleIn)
Fclose(handleOut)
lcResult = Filetostr(m.lcTemp)
Erase (m.lcTemp)
Return m.lcResult
endfunc
* Approach 2: convert all 256 byte values once, then use the lookup array
Local lcFile, ix
lcFile = Getfile()
Local Array aOctet[256]
For ix=0 To 255
aOctet[m.ix+1] = toByteA(Chr(m.ix))
Endfor
Strtofile(ConvertToOctet(m.lcFile, @aOctet), Forceext(m.lcFile,'oct'))
Function ConvertToOctet
Lparameters tcFile, taConversion
Local handleIn, handleOut, lcTemp, lcResult
lcTemp = Sys(2015)+'.tmp'
handleIn = Fopen(m.tcFile)
handleOut = Fcreate(m.lcTemp)
Do While !Feof(handleIn)
=Fwrite(handleOut, taConversion[Asc(Fread(handleIn,1))+1])
Enddo
Fclose(handleIn)
Fclose(handleOut)
lcResult = Filetostr(m.lcTemp)
Erase (m.lcTemp)
Return m.lcResult
Endfunc
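Since the original question was about a string already in memory rather than a file, here is a different technique that skips the temporary file: one Strtran() call per byte value that needs escaping. This is not the method used above, only a sketch. The names ToByteaStrtran and OctalEscape are hypothetical, it assumes the same escaping rules as toByteA, and I have not benchmarked it.

```foxpro
* Sketch only: ToByteaStrtran and OctalEscape are hypothetical names.
? ToByteaStrtran('A' + Chr(0) + '\' + "'")    && A\\000\\134\\047

Function ToByteaStrtran
Lparameters tcStr
Local lcRes, lnByte
* Escape the backslash first, so that the backslashes
* inserted by the later escapes are not escaped again.
lcRes = Strtran(m.tcStr, Chr(92), OctalEscape(92))
For lnByte = 0 To 255
    * Same rule as toByteA: keep 32..126 except the single quote (39)
    If m.lnByte <> 92 And !(Between(m.lnByte, 32, 126) And m.lnByte <> 39)
        lcRes = Strtran(m.lcRes, Chr(m.lnByte), OctalEscape(m.lnByte))
    Endif
Endfor
Return m.lcRes
Endfunc

Function OctalEscape
Lparameters tnByte
* '\\' followed by the three octal digits of tnByte
Return '\\' + ;
    Chr(48 + Bitrshift(m.tnByte, 6)) + ;
    Chr(48 + Bitand(Bitrshift(m.tnByte, 3), 7)) + ;
    Chr(48 + Bitand(m.tnByte, 7))
Endfunc
```

The order matters only for the backslash. Every other replacement inserts backslashes and the digits 0 to 7, none of which are searched for later.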
Cetin