What would be the easiest way to replace chars. in a str

08/02/2000 18:46:03

Ed Rauh
Trumbull, Connecticut, United States

08/02/2000 17:46:32

Dragan Nedeljkovich
Now officially retired
Zrenjanin, Serbia

General information

Forum: Visual FoxPro
Category: Coding, syntax and commands
Title: Re: What would be the easiest way to replace chars. in a str
Thread ID: 00327059
Message ID: 00329235

>>oRegExp = CREATEOBJ('Vbscript.RegExp')
>>oRegExp.pattern = "\s(\d{4}/\d{2}/\d{2})"
>>s=oRegExp.replace(s," {^$1}")
>>
>>This requires VBSCRIPT.DLL to be installed; this comes in with the WSH or the script component runtime; I'm not sure which.
>
>Lucky you - not only you got this nice toy, but you also have the time to play with it. How do you do if you want to remove some characters from the found selection? Like, if you had a series of strings in a ddd-00dddd format (d for digits, 0 for zeros), and you wanted to remove the leading zeros after the dash? In VFP, I'd simply s=strtran(s, '-0', '-') as long as i can find -0 string, but I have a feeling this vbaby can do better.

A regular expression can be divided into groups by wrapping parts of it in parentheses. In the replacement string each group is addressable by position: the first group is $1, the second $2, and so on. Let's take the following example:

ABCDEF,"GHIJK",22,000,000.00, 222, 333

We want to strip the commas out of the numeric expression, but not from anything else. So we need a regular expression that matches a comma with a digit immediately on either side, and a replacement that drops that comma:

s = 'ABCDEF,"GHIJK",22,000,000.00, 222, 333 ,444'
oRegExp = CREATEOBJECT('VBScript.RegExp')
oRegExp.Pattern = "(\d),(\d)"  && match a digit, then a comma, then a digit
oRegExp.Global = .T.  && match all occurrences, not just the first
s = oRegExp.Replace(s, "$1$2")  && keep the two digits, drop the comma
? s
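
As a cross-check outside VFP, the same substitution can be sketched with Python's standard `re` module. Python is only a convenient stand-in here: it is a different engine from VBScript.RegExp, it writes group references as `\1` and `\2` rather than `$1` and `$2`, and both function names are hypothetical. The second function shows one possible way to handle an edge case the two-group pattern misses, namely commas separated by a single digit.

```python
import re

def strip_digit_commas(text):
    """Drop each comma that has a digit directly before and after it."""
    # Same idea as the VFP pattern: two captured digits, comma left out.
    return re.sub(r"(\d),(\d)", r"\1\2", text)

def strip_digit_commas_lookahead(text):
    """Variant that only peeks at the digit after the comma.

    The lookahead does not consume the following digit, so that digit
    can still serve as the "digit before" for the next comma.
    """
    return re.sub(r"(\d),(?=\d)", r"\1", text)

s = 'ABCDEF,"GHIJK",22,000,000.00, 222, 333 ,444'
print(strip_digit_commas(s))                  # ABCDEF,"GHIJK",22000000.00, 222, 333 ,444
print(strip_digit_commas("1,2,3"))            # 12,3  (second comma survives)
print(strip_digit_commas_lookahead("1,2,3"))  # 123
```

For the sample string both versions give the same result, because every comma inside the number has three digits after it. Whether the installed VBScript engine accepts the `(?=...)` lookahead depends on its version, so treat that variant as something to test rather than rely on.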

You could do that on a whole file in one pass using FILETOSTR() and STRTOFILE(). You can also have it build a collection of matches that shows what matched, where each match starts, and how long it is:

s = 'ABCDEF,"GHIJK",22,000,000.00, 222, 333 ,444'
oRegExp = CREATEOBJECT('VBScript.RegExp')
oRegExp.Pattern = "(\s\d{3})"  && a white-space character followed by three digits
oRegExp.Global = .T.
oc = oRegExp.Execute(s)  && returns a Matches collection
? oc.Count
FOR EACH match IN oc
   ? match.Value, match.FirstIndex, match.Length  && FirstIndex is zero-based
ENDFOR
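
To see what such a match listing contains without running VFP, here is a minimal sketch of the same loop using Python's `re.finditer`. It is an approximation under stated assumptions: Python's engine stands in for VBScript.RegExp, and the helper name `list_matches` is made up for illustration. Each tuple mirrors the Value, FirstIndex and Length properties printed above.

```python
import re

def list_matches(pattern, text):
    """Return (value, zero-based start, length) for each non-overlapping match."""
    return [(m.group(0), m.start(), len(m.group(0)))
            for m in re.finditer(pattern, text)]

s = 'ABCDEF,"GHIJK",22,000,000.00, 222, 333 ,444'
matches = list_matches(r"(\s\d{3})", s)

print(len(matches))                      # 2
for value, first_index, length in matches:
    print(repr(value), first_index, length)
# ' 222' 29 4
# ' 333' 34 4
```

The trailing `,444` is not reported because the white space there comes before the comma, not directly before the digits.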

And I've barely scratched the surface here. Expressions can look for specific sets of characters, variable patterns, case-insensitive matches, and lots more. It looks like it could be useful, especially for data imports and for parsing complex structures. VBSCRIP5.CHM has the details on the RegExp object and the Matches collection. Just to whet your appetite a bit:

Description
Sets or returns the regular expression pattern being searched for.
Syntax
object.Pattern [= "searchstring"]
The Pattern property syntax has these parts:

Part          Description
object        Required. Always a RegExp object variable.
searchstring  Optional. Regular string expression being searched for. May include any of the regular expression characters defined in the table in the Settings section.

Settings
Special characters and sequences are used in writing patterns for regular expressions. The following table describes and gives an example of the characters and sequences that can be used.
Character Description
\ Marks the next character as either a special character or a literal. For example, "n" matches the character "n". "\n" matches a newline character. The sequence "\\" matches "\" and "\(" matches "(".
^ Matches the beginning of input.
$ Matches the end of input.
* Matches the preceding character zero or more times. For example, "zo*" matches either "z" or "zoo".
+ Matches the preceding character one or more times. For example, "zo+" matches "zoo" but not "z".
? Matches the preceding character zero or one time. For example, "a?ve?" matches the "ve" in "never".
. Matches any single character except a newline character.
(pattern) Matches pattern and remembers the match. The matched substring can be retrieved from the resulting Matches collection, using Item [0]...[n]. To match parentheses characters ( ), use "\(" or "\)".
x|y Matches either x or y. For example, "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food".
{n} n is a nonnegative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob," but matches the first two o's in "foooood".
{n,} n is a nonnegative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob" and matches all the o's in "foooood." "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{n,m} m and n are nonnegative integers. Matches at least n and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood." "o{0,1}" is equivalent to "o?".
[xyz] A character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^xyz] A negative character set. Matches any character not enclosed. For example, "[^abc]" matches the "p" in "plain".
[a-z] A range of characters. Matches any character in the specified range. For example, "[a-z]" matches any lowercase alphabetic character in the range "a" through "z".
[^m-z] A negative range of characters. Matches any character not in the specified range. For example, "[^m-z]" matches any character not in the range "m" through "z".
\b Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".
\B Matches a nonword boundary. "ea*r\B" matches the "ear" in "never early".
\d Matches a digit character. Equivalent to [0-9].
\D Matches a nondigit character. Equivalent to [^0-9].
\f Matches a form-feed character.
\n Matches a newline character.
\r Matches a carriage return character.
\s Matches any white space including space, tab, form-feed, etc. Equivalent to "[ \f\n\r\t\v]".
\S Matches any nonwhite space character. Equivalent to "[^ \f\n\r\t\v]".
\t Matches a tab character.
\v Matches a vertical tab character.
\w Matches any word character including underscore. Equivalent to "[A-Za-z0-9_]".
\W Matches any nonword character. Equivalent to "[^A-Za-z0-9_]".
\num Matches num, where num is a positive integer. A reference back to remembered matches. For example, "(.)\1" matches two consecutive identical characters.
\n Matches n, where n is an octal escape value. Octal escape values must be 1, 2, or 3 digits long. For example, "\11" and "\011" both match a tab character. "\0011" is the equivalent of "\001" & "1". Octal escape values must not exceed 256. If they do, only the first two digits comprise the expression. Allows ASCII codes to be used in regular expressions.
\xn Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" & "1". Allows ASCII codes to be used in regular expressions.
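
Coming back to the question quoted at the top (removing the leading zeros after the dash in ddd-00dddd strings), a capture group handles it in one pass. The sketch below uses Python's `re` module as a stand-in, and the function name is hypothetical. With the VBScript.RegExp object the same pattern, "-0+(\d)", would presumably be paired with the replacement string "-$1", but that has not been run here.

```python
import re

def strip_leading_zeros_after_dash(text):
    """Remove the run of zeros that directly follows a dash.

    "-0+" consumes the dash and the zeros; (\\d) captures the next digit
    so it can be put back. Because a digit must follow the zeros, an
    all-zero suffix keeps one zero instead of vanishing.
    """
    return re.sub(r"-0+(\d)", r"-\1", text)

for sample in ("123-001234", "123-000000", "123-1234"):
    print(sample, "->", strip_leading_zeros_after_dash(sample))
# 123-001234 -> 123-1234
# 123-000000 -> 123-0
# 123-1234 -> 123-1234
```

Unlike repeating STRTRAN(s, '-0', '-') until nothing changes, this touches each string once and never leaves a bare dash when the suffix is all zeros.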

EMail: EdR@edrauh.com
"See, the sun is going down..."
"No, the horizon is moving up!"
- Firesign Theater

The Surgeon General has determined that prolonged exposure to the Windows Script Host may be addictive to laboratory mice and codemonkeys
