Getting accented chars to Adobe form


Message

19/08/2009 11:41:36

Albert Gostick
Kincardine, Ontario, Canada

19/08/2009 10:24:00

Dragan Nedeljkovich
Now officially retired
Zrenjanin, Serbia

General information

Forum: Visual FoxPro
Category: Coding, syntax and commands
Title: Re: Getting accented chars to Adobe form

Environment versions

Visual FoxPro: VFP 9 SP1
OS: Windows XP SP2
Network: Windows 2003 Server
Database: Visual FoxPro
Application: Desktop

Miscellaneous

Thread ID: 01418540
Message ID: 01418988

Hi guys,

I ran the tests on my data, and it appears that the conversion from 8-bit to DBCS does NOT change the string: it was the same length before and after the conversion, and comparing the two with the "==" operator says they are equal. When I sent it through STRCONV(lcXMLString,9) to convert to UTF-8, the length grew from 5334 to 5335, so that conversion did indeed do something. So I am leaning towards the theory that the STRCONV(,1) conversion to DBCS is not necessary in this case, where we are just dealing with North American and Western European code pages... but I might leave the code in to do both conversions anyhow :-)
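
For readers who want to check these numbers outside VFP, the following is a minimal Python sketch (not VFP code, and not what STRCONV does internally) of why the UTF-8 step changes the length. It assumes the source data is in Windows code page 1252; the sample bytes and the helper name `utf8_growth` are made up for illustration. In that code page every character above 127 takes 2 bytes in UTF-8 (3 for a few, such as the euro sign), so a growth of exactly one byte, as in 5334 to 5335, would be consistent with a single accented character in the data.

```python
def utf8_growth(raw: bytes, codepage: str = "cp1252") -> int:
    """Return how many bytes longer the data becomes when re-encoded as UTF-8.

    Assumption: `raw` is text stored in a single-byte Windows code page.
    """
    text = raw.decode(codepage)
    return len(text.encode("utf-8")) - len(raw)


# "Montreal" with an e-acute (byte 0xE9 in code page 1252): 8 bytes in cp1252.
sample = b"Montr\xe9al"
print(len(sample), utf8_growth(sample))

# Plain ASCII text is byte-for-byte identical in UTF-8, so it does not grow.
print(utf8_growth(b"Toronto"))
```

Running a check like this over the real data before and after conversion is a quick way to confirm how many characters were actually affected.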

Thanks to you both for all the help.

Albert Gostick

>>Hi Dragan,
>>
>>> this DBCS (which is, AFAIK, the full 16-bit Unicode set) is a necessary intermediate step. UTF-8 is simply a way to encode Unicode strings so that they pass as 8-bit strings, but thanks to the lead bytes are properly recognized and interpreted by the presentation layer. Two bytes header and occasionally two bytes per character surely beats six or seven bytes per character when inserting one of these in HTML or in a character field.
>>
>>I'm getting a bit confused with this part... DBCS uses 1 byte and occasionally 2 bytes to encode Chinese, Japanese and Korean text. Entering, handling and displaying DBCS strings depends on having a compatible OS with the appropriate regional settings set up. DBCS has lead bytes, whereas UTF-8 uses bit patterns, rather than a lead byte, to identify the number of bytes per character. UTF-8 encodes in 1 to 4 bytes depending on the character (the original design allowed up to 6).
>
>I'm sure of one thing: I haven't got all the details, and those I got I am not quite sure of :). That's what the AFAIK was intended for. Unfortunately, what text I found on the subject came from Microsoft, so nothing was called in clear terms that relate to the real world, but rather in their solipsistic (or should I say autistic) terminology. Some of it possibly had negative information (i.e. after you read it you actually know less than before).
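
The difference discussed in the quoted exchange can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function name `utf8_sequence_length` is hypothetical, it covers the 1 to 4 byte forms that current UTF-8 (RFC 3629) permits, and it skips validity checks such as overlong encodings. The point is that in UTF-8 the high bits of the first byte alone tell you how long the character is, whereas in a DBCS code page such as cp932 (Japanese) the lead-byte ranges depend on the code page in use.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return the length of a UTF-8 sequence from the bit pattern of its first byte."""
    if first_byte < 0x80:            # 0xxxxxxx: plain ASCII, 1 byte
        return 1
    if 0xC0 <= first_byte < 0xE0:    # 110xxxxx: 2-byte sequence
        return 2
    if 0xE0 <= first_byte < 0xF0:    # 1110xxxx: 3-byte sequence
        return 3
    if 0xF0 <= first_byte < 0xF8:    # 11110xxx: 4-byte sequence
        return 4
    # 10xxxxxx is a continuation byte; anything else is not a valid start.
    raise ValueError("not the first byte of a UTF-8 sequence")


# A, e-acute, a CJK character, and a character outside the 16-bit range.
for ch in ["A", "\u00e9", "\u65e5", "\U0001F600"]:
    encoded = ch.encode("utf-8")
    print(hex(ord(ch)), len(encoded), utf8_sequence_length(encoded[0]))

# The same CJK text in a DBCS code page: 2 bytes per character in cp932,
# versus 3 bytes per character in UTF-8.
japanese = "\u65e5\u672c"
print(len(japanese.encode("cp932")), len(japanese.encode("utf-8")))
```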
