Saving unicode data - Level Extreme

Saving unicode data

Message

27/12/2011 14:05:03

Naomi Nosonovsky
Wisconsin, United States

27/12/2011 13:47:47

Gregory Adam
Belgium

General information

Forum:

Visual FoxPro

Category:

ActiveX controls in VFP

Title:

Re: Saving unicode data

Environment versions

Visual FoxPro:

VFP 9 SP2

OS:

Windows 7

Network:

Windows 2003 Server

Database:

MS SQL Server

Miscellaneous

Thread ID:

01531693

Message ID:

01531738


>>>You know about SYS(987,.T.) in VFP 9, right? It will map ANSI to Unicode and vice versa for SQL Passthrough and remote SQL connections. So as long as you can represent the captured Unicode input as ANSI text you'll be OK (IOW, if the character set you're running VFP in can represent the entered characters).
>>>
>>>Unicode usage generally only makes sense if you need to display/edit multiple different character sets simultaneously.
>>>
>>>+++ Rick ---
>>
>>I want to be able to type in any language. The form I am working on represents available languages for the Kiosk interface. I don't want to apply any restrictions.
>>
>>In any case, I found that MS Forms 2 Textbox can display any language (anything I tried so far, at least). Now I just need to figure out how to properly capture its value.
>>
>>Also, I tried using this setting, but I don't see any change in the behavior.
>
>
>If you want to use unicode, you'll need to understand some basics
>
>(1) A single byte charset uses one byte per char, so it has at most 256 possible chars (all western char sets, I believe)
>
>In a double byte char set, some byte values are lead bytes. When a lead byte is encountered, the next byte is fetched as well
>So, some chars take 1 byte whilst others take 2 (e.g. Chinese and Japanese char sets)
>
>The SBCS and DBCS (single and double byte char sets) were invented because the ASCII table has no room for all the possible chars
>
>Each of these char sets is identified by a code page.
>
>The same byte value (mostly 0x80 and above) can be a different char in another code page
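
A small illustration of point (1), sketched in Python rather than VFP because the byte values are easy to inspect there. The helper name `decode_in` is made up for this example; the code page names are Python codec names. It decodes the same byte under two single byte code pages, and then a lead byte plus trail byte under a Japanese double byte code page.

```python
def decode_in(raw: bytes, code_page: str) -> str:
    """Decode raw bytes using the named code page (a Python codec name)."""
    return raw.decode(code_page)

# One byte above 0x80, two single byte code pages, two different chars.
western = decode_in(b"\xc0", "cp1252")   # Western European code page
cyrillic = decode_in(b"\xc0", "cp1251")  # Cyrillic code page

# In code page 932 (Shift-JIS) 0x82 is a lead byte,
# so the two bytes 0x82 0xA0 together form ONE character.
japanese = decode_in(b"\x82\xa0", "cp932")

print(western, cyrillic, japanese, len(japanese))
```

The byte 0xC0 comes out as a Latin letter in one code page and as a Cyrillic letter in the other, which is why the code page has to travel with the data.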
>
>(2) Then comes UTF - to get rid of all those code pages
>
>There's UTF-32, UTF-16 and UTF-8
>
>UTF-32 can represent all the chars of all the code pages. The downside is that a char occupies 4 bytes
>
>Mostly, UTF-16 is sufficient. It uses 2 bytes for most chars (those in the Basic Multilingual Plane) and 4 bytes, a surrogate pair, for the rest. .Net uses UTF-16 internally
>In Windows parlance, UTF-16 is called Wide Character
>
>UTF-8 uses 1 to 4 bytes per character. So, functions like substr(s, 49, 4) have to process all the bytes before the offset, since they do not know in advance how many bytes each character occupies. UTF-32 does not have that disadvantage, and neither does UTF-16 as long as no surrogate pairs are involved
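
To make the sizes concrete, here is a minimal Python sketch (not VFP) that counts the bytes a few sample characters take in each UTF encoding. The function name `byte_lengths` and the choice of sample characters are assumptions for illustration only.

```python
def byte_lengths(ch: str) -> dict:
    """Return how many bytes one character takes in each UTF encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # -le variant: no byte order mark
        "utf-32": len(ch.encode("utf-32-le")),
    }

latin = byte_lengths("A")               # U+0041
cyrillic = byte_lengths("\u042f")       # U+042F, Cyrillic "Ya"
chinese = byte_lengths("\u4e2d")        # U+4E2D
emoji = byte_lengths("\U0001F600")      # U+1F600, outside the BMP

for name, sizes in [("latin", latin), ("cyrillic", cyrillic),
                    ("chinese", chinese), ("emoji", emoji)]:
    print(name, sizes)
```

Only UTF-32 has the same size for every character; UTF-16 is fixed at 2 bytes only while the text stays inside the Basic Multilingual Plane.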
>
>
>For code pages and UTF, see http://msdn.microsoft.com/en-us/goglobal/bb964654
>
>
>(3) FoxPro can only work with code pages
>
>(4) All (I think) ActiveX controls work internally in UTF-16.
>
>Now, when passing a string to an ActiveX (like regex), the ActiveX transforms the incoming chars to UTF-16
>Likewise, UTF-16 going out is converted to a code page
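
The conversion between UTF-16 and a code page only round-trips if the code page contains the characters. A rough Python sketch of that effect follows; it is not what the COM layer literally executes. Python's `errors="replace"` stands in for the "?" substitution that typically shows up when a character has no slot in the target code page.

```python
text = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"  # the word "Русский"

# Unicode -> Cyrillic code page -> Unicode: nothing is lost.
via_1251 = text.encode("cp1251").decode("cp1251")

# Unicode -> Western code page: there are no Cyrillic chars in it,
# so every character is replaced by "?".
via_1252 = text.encode("cp1252", errors="replace").decode("cp1252")

print(via_1251 == text, via_1252)
```

This is the situation Rick describes: the mapping works as long as the active code page can represent what was typed, and it degrades to question marks when it cannot.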
>
>How does the ActiveX know which code page to convert from and to?
>
>(a) It uses the code page of sys(3101), which you can change. But I have not checked whether you can use that to specify a different code page for multiple ActiveX controls at the same time
>
>
>
>(b) There's also ComProp() - I have not used that
>
>Some further reading and a couple of functions: Re: Automating Excel extracting russian characters from cell Thread #1523701 Message #1523724
>
>
>Strategy
>
>(A)
>
>Ok, now you have to find out how SQL Server keeps its unicode. Is it in UTF-8, UTF-16 or UTF-32?
>Then, you want to retrieve the data in VFP WITHOUT conversion to ANSI (see sys(987) Rick mentioned)
>
>Best way to find out is to put some Cyrillic chars in unicode in SQL Server. Make sure to use chars that need two bytes - somewhere in the range above U+0410
>
>If you have a field of, say, 3 unicode chars in SQL Server, get the field over into a VFP cursor via ODBC. Then examine the number of bytes you have in VFP
>
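>if you have 3 - you have a problem (the data was converted to a code page)
>if you have 6, most likely you have received utf-16
>if you have 12, it is utf-32
>
>any other count between 3 and 12 would indicate utf-8. Note that Cyrillic chars take 2 bytes each in utf-8 as well, so 6 bytes can also be utf-8; look at the byte values to tell the two apart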
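
The byte counts for a 3 character Cyrillic value can be checked ahead of time. The sketch below is Python, not VFP, and the sample string "Рус" is just an assumed test value. It prints the count for each candidate encoding together with the raw bytes, since for Cyrillic the count alone does not separate UTF-8 from UTF-16.

```python
sample = "\u0420\u0443\u0441"  # "Рус", 3 Cyrillic characters

# Number of bytes the same 3 characters occupy in each encoding.
counts = {
    "cp1251": len(sample.encode("cp1251")),
    "utf-8": len(sample.encode("utf-8")),
    "utf-16": len(sample.encode("utf-16-le")),
    "utf-32": len(sample.encode("utf-32-le")),
}

# The byte values differ even where the counts match.
utf8_hex = sample.encode("utf-8").hex()
utf16_hex = sample.encode("utf-16-le").hex()

print(counts)
print("utf-8 :", utf8_hex)
print("utf-16:", utf16_hex)
```

In the UTF-16 little-endian bytes every second byte is 04 for this range, while the UTF-8 bytes start with D0 or D1, so a quick look at the hex settles it.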
>
>But best of all is to know in which format SQL Server keeps its unicode, so you don't have to guess from the number of bytes you have on the VFP side
>
>(B) Since you can only feed the ActiveX with a code page and a char (sometimes two for DBCS) - to the best of my knowledge - you have to convert the unicode to a code page before passing it to the ActiveX
>
>For that you use some of the functions from the message Re: Automating Excel extracting russian characters from cell Thread #1523701 Message #1523724
>
>
>(C) Retrieving the value from the activeX will give you data in a codepage, which you then convert to the unicode format of sqlserver before ...
>
>(D) transferring it back

Gregory,

I'm retrieving the data in binary or varbinary format. I found the following so far:

update dbo.Prefs_sl 
set language00 = N'Русский', 
language01 = N'ʕʒϿ'

select language00, CAST(language00 as varbinary(100)), cast(language00 as binary(100)),
CAST(language01 as varbinary(100)) 
 from dbo.prefs_sl

One Russian character corresponds to 4 hex digits (2 bytes) in varbinary / binary.
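
That ratio is what UTF-16 little-endian would produce. A quick check in Python (not VFP) is sketched below. The hex string was worked out by hand from the code points of "Русский" under the assumption that nvarchar data is stored as UTF-16 little-endian; it is not copied from the actual query output.

```python
# Hex as a query tool would display CAST(N'Русский' AS varbinary(100)),
# ASSUMING the column data is UTF-16 little-endian.
hex_from_sql = "0x20044304410441043A0438043904"

raw = bytes.fromhex(hex_from_sql[2:])  # drop the "0x" prefix
text = raw.decode("utf-16-le")         # interpret the bytes as UTF-16LE

hex_digits_per_char = len(hex_from_sql[2:]) // len(text)
print(text, len(raw), hex_digits_per_char)
```

Seven characters give 14 bytes, shown as 28 hex digits, so 4 hex digits per character.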

If I retrieve these characters back to VFP either as varbinary(100) or binary(100), I can directly assign them to the text property of the MS Forms2 ActiveX control and it displays the word Русский correctly.

So, my problem is to get what I typed in this control back to SQL Server. And here is where I'm failing so far.
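
For the return trip, the idea I am after looks roughly like this in Python: turn the typed text into its UTF-16 little-endian bytes and hand those to SQL Server as a binary value. This is only a sketch of the byte handling. The function names are made up, it assumes the text is already available as Unicode, and it does not show how to read the value out of the MS Forms control in VFP, which is the part I am stuck on.

```python
def to_varbinary_literal(text: str) -> str:
    """Return text as a 0x... hex literal of its UTF-16LE bytes."""
    return "0x" + text.encode("utf-16-le").hex().upper()

def from_varbinary_literal(literal: str) -> str:
    """Inverse of to_varbinary_literal: hex literal back to text."""
    return bytes.fromhex(literal[2:]).decode("utf-16-le")

typed = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"  # "Русский"
literal = to_varbinary_literal(typed)
round_trip = from_varbinary_literal(literal)

print(literal)
print(round_trip == typed)
```

On the SQL Server side, something like CAST(0x... AS nvarchar(100)) should turn such a value back into text, though passing the bytes as a varbinary parameter is safer than building the statement as a string.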

You're saying you don't have this ActiveX at all? Which ActiveX do you have and can you help me a bit more with the exact code of saving data?

I.e., I feel like half of the problem (the simplest half) is solved and turned out to be easy enough. However, the other half (the hard one) is not solvable - not by my feeble brain so far.

If it's not broken, fix it until it is.
