Index keep corrupting - Level Extreme

Index keep corrupting

Message

From

29/09/2018 13:02:57

Naoto Kimura
Jantek Electronics, Inc.
Temple City, California, United States

28/09/2018 15:57:01

Sylvain Larin
Montréal, Quebec, Canada

General information

Forum:

Visual FoxPro

Category:

Databases, Tables, Views, Indexing and SQL syntax

Title:

Re: Index keep corrupting

Environment versions

Visual FoxPro:

VFP 9 SP2

OS:

Windows 10

Network:

Windows Server 2016

Database:

Visual FoxPro

Application:

Desktop

Miscellaneous

Thread ID:

01662237

Message ID:

01662397


>>>>>We have an index that keeps corrupting all the time. Even if I delete the table, rebuild it from scratch and recreate the indexes, the same index will eventually corrupt itself again. Always the same index and always the same table.
>>>>>
>>>>>Except for a power failure or an application shutdown in the middle of a write, what can cause this? And how can we avoid it?
>>>>
>>>>I don't know if this has already been suggested, but I've seen cases where bad characters in table data (e.g. CHR( 0 )) can cause index corruption. These are sometimes caused by power failures or app/workstation crashes as you've mentioned but they persist in the table data, so they cause problems with indexes.
>>>>
>>>>You may want to scan the table(s) in question (char and memo columns) for CHR( 0 ) or other characters that shouldn't be there.
>>>>
>>>>If you're deleting the table, the backup you're restoring or APPENDing from may have this sort of corruption.
>>>
>>>I've done a scan of the table for any CHR(0) in any string fields and didn't find a thing. All the numeric and date fields seem OK also.
>>
>>Did you do this programmatically or did someone just eyeball it?
>>
>>Generally when I scan a table for unexpected characters I flag all ASCII characters below 32 (space), except (optionally) for TAB in memo columns. I also optionally flag characters over 127; English typically doesn't use them, I don't know about French.
>>
>>On a slightly different note, how do you know the index keeps corrupting? Can you do manual SEEKs or similar at the VFP command window and not get the results you know you should get? If that's the case, what is SET COLLATE when this table is used or accessed? http://fox.wikis.com/wc.dll?Wiki~NonMachineCollation~WIN_COM_API
>>
>>I also found a CDX corruption checklist on the Fox Wiki: http://fox.wikis.com/wc.dll?Wiki~CDXCorruptionChecklist~VB
>>
>>IMO a bit dated and some of the suggestions seem to be extreme edge cases. OTOH something is causing your issues, maybe your case is an edge case.
>
>I did a SCAN programmatically for every character below 32 and over 122 in all character fields.
>
>I notice the index corruption by setting the index (SET ORDER TO), SEEKing the record I want, browsing the table and scrolling down the grid. At one point, the cursor jumps back a couple of records and keeps looping. For example, it will reach record #100, and on down arrow go back to record #95, then keep looping between 95 and 100 indefinitely as long as I keep pressing the down arrow.
>
>SET COLLATE is set to General everywhere (in code and in all tables).

Have you checked whether this might be codepage related? I've seen some weirdness when computers configured for different languages on the same network write to shared files. You can sometimes get odd behavior when the DBFs have one codepage and the computers accessing them use a different one.
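
One quick way to compare the codepage stamped on a table against what each workstation expects is to read the code page mark stored in the DBF header. The following is a minimal sketch, assuming the Visual FoxPro table layout in which byte 29 of the header holds that mark; the function names and the mapping table are illustrative only, and the table lists just a handful of values rather than the full set.

```python
# Partial map of DBF "code page mark" values (header byte 29) to codepages.
# Only a few common entries are listed; treat this table as incomplete.
CODEPAGE_MARKS = {
    0x01: 437,   # U.S. MS-DOS
    0x02: 850,   # International MS-DOS
    0x03: 1252,  # Windows ANSI
    0x7B: 932,   # Japanese Windows (Shift JIS)
    0xC8: 1250,  # Eastern European Windows
    0xC9: 1251,  # Russian Windows
}


def dbf_codepage(header: bytes):
    """Return (mark, codepage) for a DBF header; codepage is None if the
    mark is 0 (no codepage stamped) or simply not in the partial table."""
    if len(header) < 32:
        raise ValueError("a DBF header is at least 32 bytes long")
    mark = header[29]
    return mark, CODEPAGE_MARKS.get(mark)


def read_dbf_codepage(path: str):
    """Read the first 32 bytes of the file at `path` and decode the mark."""
    with open(path, "rb") as f:
        return dbf_codepage(f.read(32))


# Hypothetical header for illustration: a VFP table (type byte 0x30)
# stamped with code page mark 0x03.
fake_header = bytearray(32)
fake_header[0] = 0x30
fake_header[29] = 0x03
mark, codepage = dbf_codepage(bytes(fake_header))
```

Inside VFP itself, CPDBF() reports the codepage of an open table and CPCURRENT() reports the codepage VFP is currently running under, so the two can be compared directly on each workstation.
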
Codepage problems are often acute with East Asian languages (e.g. Chinese, Japanese, or Korean) because they generally involve double-byte characters. What's worse is that *each* of these languages may have more than one double-byte encoding scheme (e.g. the most common encodings for Japanese are EUC and JIS, with JIS having sub-variants that are almost-but-not-quite compatible). Character codes in the 128-255 range are typically used for lead-in, shift-in/shift-out and page-select functions ("lead-in" signals the beginning of a double-byte sequence, "shift-in" and "shift-out" switch into and out of an implied double-byte mode, and "page-select" chooses between different "pages" of characters). Codepage problems with East Asian languages can cause head-scratchers: a program that runs fine in English, Spanish or French ends up with runtime errors, such as syntax errors, when run in an East Asian language configuration. This is often the result of a string literal somewhere containing a character in the 128-255 range (or more specifically, a literal that ends with such a character, so that the closing quote is "eaten" as the second byte of a double-byte character and the string is never terminated). Tracking these down in a PRG is relatively easy; finding the problem within an SCX or FRX is sometimes a bit harder.
Aside from CHR(0), another character that can sometimes cause problems is CHR(26) (Ctrl-Z, often written ^Z), which was used as an EOF marker in older versions of DOS (a carryover from CP/M). Although it's an old "feature", it's still interpreted that way in some contexts: if you TYPE a file at the command prompt, output stops as soon as it encounters this character.
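
To make the "eaten quote" problem concrete, here is a minimal sketch of a double-byte-aware scanner looking for the end of a string literal. It is a simplified model, not VFP's actual parser: the function names are hypothetical, and the lead-byte ranges assume Shift JIS (codepage 932). The source line is encoded in codepage 1252, where "é" is the single byte 0xE9; under codepage 932 that same byte is a lead byte, so the scanner consumes the following quote as its trail byte.

```python
def is_sjis_lead(b: int) -> bool:
    """True if byte value b is a Shift JIS (cp932) lead byte."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def closing_quote_index(literal: bytes, dbcs: bool) -> int:
    """Return the index of the quote that closes a literal starting at
    literal[0], or -1 if the literal is never terminated."""
    if literal[:1] != b'"':
        raise ValueError("literal must start with a double quote")
    i = 1
    while i < len(literal):
        b = literal[i]
        if dbcs and is_sjis_lead(b):
            i += 2          # skip the lead byte AND whatever follows it
            continue
        if b == 0x22:       # ASCII double quote
            return i
        i += 1
    return -1


# 'x = "café"' written on a Western (cp1252) workstation.
line = 'x = "café"'.encode("cp1252")
literal = line[4:]          # b'"caf\xe9"'

western = closing_quote_index(literal, dbcs=False)   # quote found
japanese = closing_quote_index(literal, dbcs=True)   # quote swallowed
```

With single-byte rules the closing quote is found at index 5; with double-byte rules the scan runs off the end of the line, which is the kind of unterminated-string error described above.
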
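
A scan for these characters can be sketched as follows. This is an illustration of the rule discussed in the thread (flag everything below 32, optionally allow TAB, optionally flag everything above 127), written against raw field bytes; the function name and parameters are hypothetical, and how the bytes are pulled out of the table is left open.

```python
def find_suspect_bytes(data: bytes, allow_tab: bool = False,
                       flag_high: bool = False):
    """Return a list of (offset, byte_value) for suspicious bytes.

    Control characters below 32 are always flagged, including CHR(0)
    and CHR(26); TAB (9) is skipped when allow_tab is True, and bytes
    above 127 are flagged only when flag_high is True.
    """
    hits = []
    for offset, value in enumerate(data):
        if value < 32:
            if allow_tab and value == 9:
                continue
            hits.append((offset, value))
        elif flag_high and value > 127:
            hits.append((offset, value))
    return hits


# Sample field content containing CHR(0) and CHR(26).
sample = b"ABC\x00DEF\x1aG"
control_hits = find_suspect_bytes(sample)

# A memo-style value: TAB tolerated, high-bit byte flagged on request.
memo = b"col1\tcol2\xe9"
memo_hits = find_suspect_bytes(memo, allow_tab=True, flag_high=True)
```

Reporting the offset along with the byte value makes it easier to tell whether the bad bytes cluster at the end of a field (typical of an interrupted write) or are scattered through the data.
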

As an aside on the topic of character sets: years ago a coworker asked me why his C program was outputting strange characters (as I recall, an upside-down question mark). He had stumbled across the language feature of trigraphs, three-character sequences beginning with "??" that the compiler replaces with a single character. He was in the habit of writing warning and error messages with repeated punctuation marks, and wherever a run of question marks followed by another punctuation mark happened to form a trigraph, the compiler dutifully translated it. I mainly remember this because it was a language feature that came in handy back when I was dealing with a mix of computers, in particular IBM minicomputers and mainframes that used EBCDIC alongside other computers using ASCII. There was often a problem with translation tables in which some characters were incorrectly mapped. Characters that often got mistranslated included square brackets, curly braces, backslash and caret, all characters that figure significantly in the C language. What's worse, horizontal tabs weren't always translated consistently, so any hint of program structure from indentation (which could have been used to work out where the curly braces used to be) was completely gone. Trigraphs let me get around the translation table problems (although they did make the code look really weird).
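
For anyone who hasn't run into trigraphs, the replacement rule can be sketched like this. It is a small model of the left-to-right substitution a C preprocessor performs for the nine standard trigraphs, not a real compiler phase, and the function name is made up for illustration. Note that a bare run of question marks is left alone; it only changes when the character after "??" completes a trigraph.

```python
# The nine trigraphs defined by the C standard: "??" + key -> value.
TRIGRAPHS = {
    "=": "#", "(": "[", ")": "]", "<": "{", ">": "}",
    "/": "\\", "'": "^", "!": "|", "-": "~",
}


def replace_trigraphs(src: str) -> str:
    """Replace trigraph sequences in src, scanning left to right."""
    out = []
    i = 0
    while i < len(src):
        if (src[i:i + 2] == "??" and i + 2 < len(src)
                and src[i + 2] in TRIGRAPHS):
            out.append(TRIGRAPHS[src[i + 2]])
            i += 3          # consume the whole trigraph
        else:
            out.append(src[i])
            i += 1
    return "".join(out)


# An error message with enthusiastic punctuation.
message = replace_trigraphs("What???!")

# Trigraphs standing in for brackets and braces that a bad
# EBCDIC/ASCII translation table would otherwise mangle.
code = replace_trigraphs("int a??(3??) = ??< 1, 2, 3 ??>;")
```

Here "What???!" becomes "What?|" because the last two question marks and the exclamation mark form the trigraph for the vertical bar. Trigraphs were later removed from C++17 and C23, and some compilers only honor them when asked to.
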
