Check if a UTF-8 string is valid
Message
From
30/11/2012 07:55:53
Emerson Reed
Folhamatic Tecnologia Em Sistemas
Americana - São Paulo, Brazil
To
29/11/2012 07:05:35
Emerson Reed
Folhamatic Tecnologia Em Sistemas
Americana - São Paulo, Brazil
General information
Forum:
Visual FoxPro
Category:
Other
Environment versions
Visual FoxPro:
VFP 9
OS:
Windows 7
Database:
Visual FoxPro
Application:
Desktop
Miscellaneous
Thread ID:
01558316
Message ID:
01558462
Views:
60
I've finally found a solution...
Take a look at the following functions: SafeConvStrToUtf8 and CheckIfStrIsSafeToUtf8Conv.
The SafeConvStrToUtf8 function converts your string to UTF-8 in a safe way (you can choose whether to remove the invalid characters or replace them with "?").
The CheckIfStrIsSafeToUtf8Conv function verifies that your string contains only valid characters (ones that can be converted to UTF-8).
Enjoy!
* References...
* http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/displaying-and-saving-unicode-data
* http://www.news2news.com/vfp/?function=557
* http://msdn.microsoft.com/en-us/library/windows/desktop/dd319072%28v=vs.85%29.aspx
*
Function SafeConvStrToUtf8
   Lparameters lcString, llRemoveInvalidChars
   Local lcNewString
   If Empty(m.lcString)
      m.lcNewString = m.lcString
   Else
      #Define CP_UTF8 65001
      #Define MB_ERR_INVALID_CHARS 0x8
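      Local lnLength, llValid
      m.lnLength = Len(m.lcString)
      * With MB_ERR_INVALID_CHARS set, the API returns 0 when the input holds an invalid UTF-8 sequence
      m.llValid = Not apiMultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, @m.lcString, m.lnLength, Null, 0)==0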
      If m.llValid
         m.lcNewString = Strconv(m.lcString,9)
      Else
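         * Output buffer for the UTF-16 result: at most two bytes per input byte
         m.lcNewString = Space(Len(m.lcString)*2)
         * Without the flag, Windows Vista and later replace each invalid sequence with U+FFFD
         apiMultiByteToWideChar(CP_UTF8, 0, @m.lcString,  m.lnLength, @m.lcNewString, Len(m.lcNewString))
         m.lcNewString = Strtran(m.lcNewString,Chr(0),"")
         If m.llRemoveInvalidChars
            * "ýÿ" is how the UTF-16LE bytes of U+FFFD (0xFD 0xFF) appear in code page 1252
            m.lcNewString = Strtran(m.lcNewString,"ýÿ","")
         Endif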
      Endif
   Endif
   Return m.lcNewString
Endfunc
*
Function apiMultiByteToWideChar
   Lparameters CodePg, dwFlags, lpMultiByteStr, cbMultiByte, lpWideCharStr, cchWideChar
   Declare Integer MultiByteToWideChar In kernel32 As apiMultiByteToWideChar ;
      Integer CodePg, ;
      Long dwFlags, ;
      String lpMultiByteStr, ;
      Integer cbMultiByte, ;
      String @ lpWideCharStr, ;
      Integer cchWideChar
   Return apiMultiByteToWideChar(m.CodePg, m.dwFlags, @m.lpMultiByteStr, m.cbMultiByte, @m.lpWideCharStr, m.cchWideChar)
Endfunc

* References...
* http://stackoverflow.com/questions/1031645/how-to-detect-utf-8-in-plain-c
* http://www.w3.org/International/questions/qa-forms-utf-8.en.php
*
Function CheckIfStrIsSafeToUtf8Conv
   Lparameters lcString
   Local llSafe, loRegExp
   m.loRegExp = Newobject("_regexp",Addbs(Home(1))+"Ffc\_regexp.vcx")
   With m.loRegExp
      .Clear()
      .Pattern = "^([\x09\x0A\x0D\x20-\x7E]|[\xC2-\xDF][\x80-\xBF]|\xE0[\xA0-\xBF][\x80-\xBF]|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}|\xED[\x80-\x9F][\x80-\xBF]|\xF0[\x90-\xBF][\x80-\xBF]{2}|[\xF1-\xF3][\x80-\xBF]{3}|\xF4[\x80-\x8F][\x80-\xBF]{2})*$"
      m.llSafe = .Execute(m.lcString,.F.)>0 && alternative test: .Execute(m.lcString,.F.)==1
   Endwith
   m.loRegExp = Null
   Return m.llSafe
Endfunc
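
A minimal usage sketch follows, assuming both functions above are reachable (same PRG, or loaded with SET PROCEDURE TO). The variable names and test strings are made up for illustration and are not part of the original code. The idea is to run the check first and call the conversion only when the check fails.

```foxpro
* Hypothetical test data, not from the original post
Local lcValid, lcInvalid
m.lcValid   = "caf" + Chr(0xC3) + Chr(0xA9)   && "café" encoded as UTF-8 bytes
m.lcInvalid = "caf" + Chr(0xE9)               && ANSI "é": a lone 0xE9 byte is not valid UTF-8

* Print the result of the regular-expression check for each string
? CheckIfStrIsSafeToUtf8Conv(m.lcValid)
? CheckIfStrIsSafeToUtf8Conv(m.lcInvalid)

* Convert only when the check fails; .T. asks for invalid characters to be removed
If Not CheckIfStrIsSafeToUtf8Conv(m.lcInvalid)
   ? SafeConvStrToUtf8(m.lcInvalid, .T.)
Endif
```

Passing .F. (or omitting the second parameter) keeps the substituted characters in the result instead of stripping them.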
Emerson Santon Reed
"One Developer CAN Make a Difference. A community CAN make a future." - Craig Boyd