Most strange corruption ever
General information
Forum: Visual FoxPro
Category: Databases, Tables, Views, Indexing and SQL syntax
Miscellaneous
Thread ID: 00692378
Message ID: 00692785
>Peter,
>
>That **IS** a nasty nasty one!! And I regret that I can't help with a solution. However, the following additional information may help others who can. . .
>
>1) Are the workstations all shutdown properly nightly or are they left alive (assuming the application itself to be shut down regardless)?
>

All workstations are shut down properly. But you are right, this was about the first thing to investigate.
Sad or happy side note: the customer who had it all the time was the only affected site for about a year. Then we moved from Novell 3.12 to 5.x, and we ourselves became the second. Before this "great happening", things like email downloads in the morning were investigated too, and we found nothing.

>2) Does your FPD application use TTS in its processing? Does the VFP5 application use VFP TRANSACTIONS?
>

No, neither. However:
In both versions we use transactions implicitly, by locking records and not releasing them until the logical transaction has been completed (UNLOCK ALL). I know, that is rather worthless as far as rollback (etc.) is concerned. It is also an expected influencing factor, knowing that records can't be held back from flushing once the cache of the PC is full (yeah, learned that from this experience ... and stressed PCs for this to no avail).


>3) When you say nulls, am I correct in assuming that you mean hex00?
>

Exactly right, and to be found by CHR(0) $ ...
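
For anyone who wants to look for the same symptom from outside FoxPro, here is a minimal sketch in Python that scans the raw records of a table file for hex 00 bytes. It is not the check we run ourselves (that is the CHR(0) test above); the function name `find_null_records` is made up for illustration. It relies only on the standard DBF header layout (record count at bytes 4-7, header size at bytes 8-9, record size at bytes 10-11) and assumes a table with text-type fields only, as in FPD. VFP binary field types such as integer, currency or datetime can legitimately contain hex 00, so for those tables a hit proves nothing.

```python
import struct


def find_null_records(path):
    """Return the 1-based numbers of records whose raw bytes contain 0x00.

    Assumption: every field is stored as text, so 0x00 never occurs in
    healthy data. Binary field types would give false positives.
    """
    suspects = []
    with open(path, "rb") as table:
        header = table.read(32)
        if len(header) < 32:
            raise ValueError("file too short to be a DBF table")
        # Little-endian: record count (4 bytes), header size (2), record size (2).
        n_records, header_len, record_len = struct.unpack("<IHH", header[4:12])
        table.seek(header_len)
        for recno in range(1, n_records + 1):
            record = table.read(record_len)
            if len(record) < record_len:
                break  # file is shorter than the header claims
            if b"\x00" in record:
                suspects.append(recno)
    return suspects
```

Run against a copy of the table, not the live file, since the sketch takes no locks and knows nothing about what other workstations have open.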


>4) Has the problem been found to be related to any specific workstation OS? If so, what OS, and if not, what OSes are known to see this problem happen?
>

This is the toughest one to answer. Please note this:
I am around 90 % sure that somewhere within the network a "too many files open" is involved around the time the whole thing is initiated. Mind you, 90 %, so not 100. For our own situation I can prove this fairly well, but not for the situations where the error wasn't trapped properly. Please note that this can show up as an LLF4 error (I think FPD only), File does not exist (not enough handles available), File not in use (same) and a few more.
This is an area I've been working on for many months, trying to prove that NT4 wasn't dealing with that properly.
Under the hood, I can say a few things here:

1. The customer having it all the time had NT4s only.
2. We had 2 NT4s, and otherwise W95s only (versions a and b showing different behaviour in related aspects).
3. I was never able to prove (formally) that NT4 can deal properly with over 180 file handles, since there is no explicit registry entry for that (W95 has one).
4. Though I state that in (at least) 90 % of cases a too many files (i.e. no handles available) is at play, I could not prove that the specific PC caused this error. And I say again: I am 100 % sure that it is all caused by something happening the day before the problem really shows (like you say: nasty nasty).
5. I can fairly say that the OS on which it occurs the next morning is random (proven).
6. Right now we don't have NT4 anymore, but we still have NT (2000), and the problem still occurs.
7. I've been stress testing the Too many files situation, focused on getting some internal handle (??) of FPD / VFP to bomb. Though various situations could be emulated (you don't want to meet them), nothing looked like this nasty one. One thing: how on earth do you test something proven to occur overnight (what night, what time, what what?). So my testing was to no avail from the start.
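
To give an idea of the kind of handle probing meant in point 7, here is a rough sketch in Python, not the FPD / VFP test code itself. The function name `count_openable_files` and the `limit` parameter are made up for illustration. It only measures how many files one process can hold open on the machine it runs on; it says nothing about handles internal to FPD / VFP or about limits on the Novell server side.

```python
import os
import tempfile


def count_openable_files(limit=10000):
    """Open temp files until the OS refuses or `limit` is reached.

    Returns how many files were open at the same time.
    """
    handles = []
    with tempfile.TemporaryDirectory() as folder:
        try:
            for i in range(limit):
                name = os.path.join(folder, f"probe{i}.tmp")
                handles.append(open(name, "wb"))
        except OSError:
            pass  # typically "Too many open files": the limit was hit
        finally:
            opened = len(handles)
            # Close everything before the temp folder is cleaned up.
            for handle in handles:
                handle.close()
    return opened
```

If the returned number is lower than `limit`, the process ran out of handles at that count. A probe like this only shows where the ceiling is, not what the application does when it hits the ceiling in the middle of a transaction.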

Additional info: since our transaction log shows exactly in what (though random) transaction situations it could happen, we wrote a "user emulator" by means of a sequence of KEYBOARD commands, and ran that for several months. One sequence was run as the last thing of the day, and the other as the first thing of the next day. It was configured so that it would always overflow the block. Guess what? No problem ever, but in between we regularly had the problem anyhow from a normal live user sequence. That's why I don't have my picture at UT :-)


>5) Do the servers in question support and use ECC RAM?

Yes. But this makes me think of the fact that at the move from 3.12 to 5 we switched the server hardware as well. In the old server there was no ECC.
This by itself makes me think of 5 auto-configuring its memory, while 3 (etc.) had to be explicitly configured when more than 16 MB is installed. I just learned that 5 still allows the explicit configuration, so we are doing that right now.

>
>6) Do the servers in question all support L2 cache for the full extent of the RAM installed?
>

A good question, I think, but how to know the answer? The only thing I can say is that there is 256 MB of memory in there, and (I think) 512 KB of L2 cache. Whether that supports the 256 MB, and how to calculate that, I don't know. OTOH, the 6 different servers on which we have been running Novell from April 2000 up till now would all have been different in this respect, I think.

>FYI, I was once involved with a most nasty problem on a Novell 3.12 system that we chased for several months and which caused bad index corruption regularly (daily or more) causing application shutdown to "fix". In THAT case it turned out to be (believe it or not) that the server (Tricord) was not (at least that model, at that time) certified for Novell!!! Once we learned this we replaced the server and never saw the problem again.
>

I have already learned to use good-brand servers, for the sake of the memory in them. Get a server (or PC) from around the corner, and you end up in trouble. For that matter: we ourselves always use Compaq, and this customer uses IBM. And okay, in the server-jacked situation we used some Fuji-Siemens (got the same day from around the corner :), which gave the problem too. And yes, again, this Fuji-Siemens (btw rather well known over here) turned out to stall around the hour, in its later use as a normal workstation.


>
>good luck
>Jim

Thanks Jim !