>>>>Hi All
>>>>
>>>>I am receiving text (TXT) files from a 3rd party entity. I read the file in using FILETOSTR(), amend it, and then write it out to another TXT file using STRTOFILE(). Someone has now asked me what character set we are using for the output file. How do I find that out or answer this question?
>>>>
>>>>TIA
>>>
>>>FILETOSTR() / STRTOFILE() work on plain bytes. They do not do any conversion. IOW you can read a binary as well.
>>>Unless you convert the string yourself in between, the right answer is: the output file has the same codepage as the input, if it has one at all.
>>
>>IOW, what Borislav said is still true, not because of anything that these two functions do, but because of how the app composed the string. If there were any characters in it which were codepage specific, they were done according to those settings, and written into the file as bytes.
>
>So are we all saying that if I:
>
>1) Read in the file - FILETOSTR()
>2) Calculate a checksum on it - SYS(2007)
>3) Add the checksum to the file - concatenate the strings, and
>4) Write it back out to a new file - STRTOFILE()
>
>... then the codepage / character set of the new file is identical to the codepage / character set of the original file?
Jos,
A TXT file carries no code page information. The only hint it may contain is a BOM (byte order mark) at the beginning of the file, which identifies a Unicode encoding.
Create a TXT file with Notepad, type abcd, then do Save As once for each of these encodings:
(1) ANSI (no BOM)
(2) UTF-8 (BOM: EF BB BF)
(3) Unicode (UTF-16 little endian, BOM: FF FE)
(4) Unicode big endian (UTF-16 big endian, BOM: FE FF)
Then look at each of the files with a hex editor.
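If you do not have a hex editor at hand, the same experiment can be sketched in a few lines of Python (not VFP, just a quick way to look at the bytes). The sample data below is built by hand to mimic what Notepad writes, and sniff_bom is a made-up helper name, not a library function.

```python
import codecs

# BOMs that Notepad writes, checked in this order.
# Note: a UTF-32 LE BOM (FF FE 00 00) would be reported as UTF-16 LE here;
# Notepad does not write UTF-32, so this sketch ignores it.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 little endian"),
    (codecs.BOM_UTF16_BE, "UTF-16 big endian"),
]

def sniff_bom(data: bytes) -> str:
    """Return the encoding named by a leading BOM, or a note that there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return "no BOM (ANSI or unknown)"

# Hand-built equivalents of the four Notepad "Save As" variants of "abcd".
samples = {
    "ANSI": "abcd".encode("cp1252"),
    "UTF8": codecs.BOM_UTF8 + "abcd".encode("utf-8"),
    "Unicode": codecs.BOM_UTF16_LE + "abcd".encode("utf-16-le"),
    "Unicode big endian": codecs.BOM_UTF16_BE + "abcd".encode("utf-16-be"),
}

for label, data in samples.items():
    # Show the raw bytes in hex next to what the BOM says.
    print(f"{label:20} {data.hex(' '):30} {sniff_bom(data)}")
```

To check a real file, read it in binary mode (open(path, "rb").read()) and pass the result to sniff_bom.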
>... then the codepage / character set of the new file is identical to the codepage / character set of the original file?
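It depends. If the first bytes of the file are a BOM and you did not change it, then yes. Without a BOM the bytes you pass through are still unchanged, but nothing in the file says which code page they were written in; only the sender can tell you that.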
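To illustrate the read / checksum / append / write sequence at the byte level, here is a rough Python sketch. It is only an analogy for FILETOSTR() and STRTOFILE(): zlib.crc32 stands in for SYS(2007) and will not give the same value, and append_checksum and the file names are invented for the example.

```python
import os
import tempfile
import zlib

def append_checksum(src_path: str, dst_path: str):
    """Read a file as raw bytes, append a hex CRC, write the result as raw bytes."""
    with open(src_path, "rb") as f:          # binary mode: no conversion on read
        data = f.read()
    # Stand-in checksum (CRC-32 as 8 hex digits), NOT the SYS(2007) algorithm.
    checksum = format(zlib.crc32(data), "08X").encode("ascii")
    with open(dst_path, "wb") as f:          # binary mode: no conversion on write
        f.write(data + checksum)
    return data, checksum

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")
    original = b"\xef\xbb\xbfabcd\r\n"       # UTF-8 BOM followed by "abcd" and CRLF
    with open(src, "wb") as f:
        f.write(original)
    data, checksum = append_checksum(src, dst)
    with open(dst, "rb") as f:
        result = f.read()

print(result.startswith(original))           # original bytes, BOM included, are untouched
print(result[len(original):] == checksum)    # only the checksum was added at the end
```

One caveat: the appended checksum here is plain ASCII. That fits an ANSI or UTF-8 file, but in a UTF-16 file the added characters would have to be written as UTF-16 as well, otherwise the tail of the file no longer matches its BOM.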
PS: If you receive a TXT file without a BOM (= ANSI) that contains Russian characters, you will not see them as Russian on a computer that is set to code page 1252.
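A small Python sketch of that last point, assuming the sender wrote the file in Windows code page 1251 (Cyrillic). The word used is just an example; the bytes never change, only the code page used to interpret them.

```python
# "Привет" as a Russian sender on code page 1251 would write it: one byte per letter.
raw = "Привет".encode("cp1251")

print(raw.hex(" "))          # the bytes actually stored in the file
print(raw.decode("cp1251"))  # read with the sender's code page: Привет
print(raw.decode("cp1252"))  # same bytes read on a 1252 machine: Ïðèâåò
```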
Gregory