Data, cache, networked I/O, FLUSH processing factors
Message
From
05/02/2003 17:12:52
 
 
To
All
General information
Forum:
Visual FoxPro
Category:
Other
Title:
Data, cache, networked I/O, FLUSH processing factors
Miscellaneous
Thread ID:
00749598
Message ID:
00749598
It is difficult to know precisely what is happening between and within processes in a Windows system. I suppose that this is as much a result of Microsoft's need to protect its intellectual property as anything else. There are likely other documents, which I don't have access to or cannot find, that tell us more. Or possibly it all becomes clear once one examines the capabilities of all the APIs and functions surrounding these issues.
In any case I summarize below some of the readings I have encountered as I try to learn more.
The following excerpts all tend to focus on files as opposed to records. While I think it is fair to assume that it all applies equally to records, it is worth keeping in mind that some of the performance implications mentioned need re-consideration when it is records that are concerned.

1.0 File Caching
The following (bold and italics are mine) is taken from: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/file_caching.asp
By default, Windows caches file data that is read from disks and written to disks. This implies that read operations read file data from an area in system memory known as the system file cache, rather than from the physical disk. Correspondingly, write operations write file data to the system file cache rather than to the disk, and this type of cache is referred to as a write-back cache. Caching is managed per file object.
Caching occurs under the direction of the cache manager, which operates continuously while Windows is running. File data in the system file cache is written to the disk at intervals determined by the operating system, and the memory previously used by that file data is freed—this is referred to as flushing the cache. The policy of delaying the writing of the data to the file and holding it in the cache until the cache is flushed is called lazy writing, and it is triggered by the cache manager at a determinate time interval. The time at which a block of file data is flushed is partially based on the amount of time it has been stored in the cache and the amount of time since the data was last accessed in a read operation. This ensures that file data that is frequently read will stay accessible in the system file cache for the maximum amount of time.
The frequency at which flushing occurs is an important consideration that balances system performance with system reliability. If the system flushes the cache too often, the number of large write operations flushing incurs will degrade system performance significantly. If the system is not flushed often enough, then the likelihood is greater that either system memory will be depleted by the cache, or a sudden system failure (such as a loss of power to the computer) will happen before the flush. In the latter instance, the cached data will be lost.
To ensure that the right amount of flushing occurs, the cache manager spawns a process every second called a lazy writer. The lazy writer process queues one-eighth of the pages that have not been flushed recently to be written to disk. It constantly reevaluates the amount of data being flushed for optimal system performance, and if more data needs to be written it queues more data. Lazy writers do not flush temporary files, because the assumption is that they will be deleted by the system.
A process can also force a flush of a file it has opened by calling the FlushFileBuffers function.

Personally I'm a bit confused by this. It says that the "lazy writer" always FLUSHES (writes and then frees) whatever cache it writes, and this hardly sounds sensible to me. I sure hope that it can do a physical write based on the 'age' component but still leave the data "active" in the cache based on the 'last used time'. In a similar vein, it seems inappropriate to free the cache space occupied by a file when FlushFileBuffers is called on it, because the file could still be referenced by many processes here and on other systems.
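To make the distinction I am hoping for concrete, here is a toy model in Python. It is only a sketch of the idea, not a description of how the Windows cache manager is actually implemented: the names ToyCache, lazy_writer_pass and evict_clean are my own inventions, and the only figure taken from the excerpt is the one-eighth quota per pass. The point it illustrates is that writing a dirty page to disk and freeing its cache space can be two separate steps.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Page:
    data: bytes
    dirty: bool = True
    dirtied_at: int = 0  # tick at which the page was last modified


@dataclass
class ToyCache:
    """Hypothetical write-back cache; NOT the real Windows cache manager."""
    pages: dict = field(default_factory=dict)  # page number -> Page (in memory)
    disk: dict = field(default_factory=dict)   # page number -> bytes ("on disk")
    tick: int = 0

    def write(self, page_no, data):
        # A write only touches the cache and marks the page dirty.
        self.pages[page_no] = Page(data=data, dirty=True, dirtied_at=self.tick)

    def read(self, page_no):
        # Serve from the cache when possible, otherwise from "disk".
        page = self.pages.get(page_no)
        if page is not None:
            return page.data
        return self.disk.get(page_no)

    def lazy_writer_pass(self):
        """One pass: write back one-eighth of the dirty pages, oldest first."""
        self.tick += 1
        dirty = sorted(
            (n for n, p in self.pages.items() if p.dirty),
            key=lambda n: self.pages[n].dirtied_at,
        )
        quota = math.ceil(len(dirty) / 8)
        for n in dirty[:quota]:
            self.disk[n] = self.pages[n].data
            self.pages[n].dirty = False  # now clean, but still cached
        return quota

    def evict_clean(self, page_no):
        # Freeing cache space is a separate decision, allowed only for clean pages.
        page = self.pages.get(page_no)
        if page is not None and not page.dirty:
            del self.pages[page_no]
            return True
        return False
```

With 16 dirty pages, one pass writes two of them to "disk" and all 16 remain readable from the cache. Only a later evict_clean call gives the memory back.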
2.0 Network I/O Concepts
The following (bold and italics are mine) is taken from:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/network_i_o_concepts.asp
There are some notable differences between local I/O and network I/O on Windows:
  • Network I/O performance depends on how many network I/O operations are taking place and the speed of the network connection. Your application must be able to handle network I/O operations with servers that may be much faster or slower than your local machine, as well as transient changes in network capacity. In these cases, your application may need to allow more time for the operation to complete.
  • The functions you use to perform local file I/O may behave differently over the network. For example, a network I/O operation that takes a long time to complete may time out. In some situations, file handles may be left open indefinitely because of this. Another example is that functions may return error codes for your application to process that are specific to network I/O.
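The two points above (allow more time, and expect network-specific errors) suggest wrapping network file access in a retry loop. The following Python sketch shows one way this might look. The function name read_with_retry and its parameters are my own, and it retries on any OSError for simplicity, whereas a real application would probably look at the specific error code before deciding to retry.

```python
import time


def read_with_retry(path, attempts=3, delay_seconds=0.5):
    """Read a whole file, retrying when the OS reports an I/O error.

    Hypothetical helper: retries on ANY OSError, which is a simplification.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # 'with' closes the handle even when the read fails part-way,
            # so a failed attempt does not leave a handle open.
            with open(path, "rb") as handle:
                return handle.read()
        except OSError as error:
            last_error = error
            if attempt < attempts:
                # Wait a little longer after each failure.
                time.sleep(delay_seconds * attempt)
    raise last_error
```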
    2.1 Network Redirectors
    The following (bold and italics are mine) is taken from:
    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/network_redirectors.asp
    A network redirector is a file system driver (or FSD) that functions in the following manner:
  • As a client in a network I/O operation by sending I/O requests to servers and processing the responses from the servers.
  • As a server in a network I/O operation by receiving I/O requests from clients and processing the requests.
    It performs all of the low-level interaction with the server in resolving the file name provided by the application with the location of the resource on the remote server. In this way, the redirector enables the application to access and manipulate resources on remote servers as if they were located on the local machine.
    Redirectors operate entirely within kernel mode. This provides the following performance advantages over user-mode alternatives:
  • It can interact with kernel-mode FSDs running on the server, such as the server FSD, without the need for user-to-kernel mode and kernel-to-user mode context switches.
  • It can interact in kernel mode with the cache manager on the server to cache I/O data that the server cache manager sends on the client.
  • API functions custom-made for remote I/O requests, and changes to the standard file I/O functions to provide this functionality, are not necessary.
    2.2 Description of a Network I/O Operation
    The following (bold and italics are mine) is taken from:
    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/description_of_a_network_i_o_operation.asp
    When an application calls a file I/O function to access a file on a remote machine, the following events occur:
  • The I/O request is intercepted by a network redirector, also referred to simply as a redirector, on the local computer.
  • The redirector constructs a data packet containing all of the information about the request, and sends it to the server where the file is located.
  • The redirector on the server receives the packet from the client, authenticates the access to the file required by the I/O request, and, if authenticated, executes the request on behalf of the client. If not, it returns an error code to the redirector on the client.
  • When the request has been executed, the redirector on the server sends any data resulting from the I/O request to the redirector on the client along with a success notification.
  • The redirector on the client receives the packet from the server and passes the data in the packet to the application along with a success notification.
    Obviously, for the last two bullets above, there could be a failure notification instead of 'success' if that was the case. Similarly, should there be a network (connection) failure somewhere along the line, a time-out or some other threshold would generate an error condition. The important thing is that notifications occur at each step on both systems.
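The request/response sequence above can be pictured with a small Python simulation. This is purely illustrative: the class names, the dictionary used as a "packet" and the user list are all assumptions of mine, and a direct method call stands in for the network. It does show the steps in order: build the packet, authenticate on the server, execute, and return either data with a success status or an error.

```python
class ServerRedirector:
    """Hypothetical stand-in for the server side of the exchange."""

    def __init__(self, files, allowed_users):
        self.files = files                  # file name -> bytes
        self.allowed_users = allowed_users  # set of user names

    def handle(self, packet):
        # Authenticate first; refuse with an error status if that fails.
        if packet["user"] not in self.allowed_users:
            return {"status": "error", "reason": "access denied"}
        if packet["operation"] == "read":
            data = self.files.get(packet["file"])
            if data is None:
                return {"status": "error", "reason": "file not found"}
            return {"status": "success", "data": data}
        return {"status": "error", "reason": "unsupported operation"}


class ClientRedirector:
    """Hypothetical stand-in for the client side of the exchange."""

    def __init__(self, server, user):
        self.server = server
        self.user = user

    def read_file(self, name):
        # Build the request packet and "send" it to the server.
        packet = {"user": self.user, "operation": "read", "file": name}
        reply = self.server.handle(packet)  # stands in for the network round trip
        # Pass either the data or the failure on to the application.
        if reply["status"] != "success":
            raise OSError(reply["reason"])
        return reply["data"]
```

The application only ever calls read_file, which is the sense in which the remote file looks local to it.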
    2.3 Local Caching
    The following (bold and italics are mine) is taken from:
    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/local_caching.asp
    Local caching of data is a technique used to speed network access to data files. It involves caching data on clients rather than on servers when possible.
    The effect of local caching is that it allows multiple write operations on the same region of a file to be combined into one write operation across the network. Local caching reduces network traffic because the data is written once. Such caching improves the apparent response time of applications because the applications do not wait for the data to be sent across the network to the server.

    [much omitted]
    A hazard of local caching is that written data only has as much integrity as the client itself for as long as the data is cached on the client. In general, locally cached data should be flushed to the server as soon as possible. With modern operating systems and hardware support such as uninterruptible power supplies, the risk of losing locally cached data is reduced. But the risk still exists, and you should consider both the trade-off between data integrity and apparent response speed and the trade-off between data integrity and reduced network traffic.
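Here is a minimal Python sketch of the write-combining effect described above. It assumes a "region" is identified simply by its starting offset (partial overlaps are ignored), and the class name and the send_to_server callback are hypothetical. It also makes the hazard visible: until flush is called, the data exists only in the client's pending dictionary.

```python
class CoalescingWriteCache:
    """Hypothetical client-side cache that combines writes to the same region."""

    def __init__(self, send_to_server):
        self.send_to_server = send_to_server  # callable(offset, data)
        self.pending = {}                     # offset -> latest data for that region

    def write(self, offset, data):
        # A later write to the same region replaces the earlier one,
        # so nothing crosses the network yet.
        self.pending[offset] = data

    def flush(self):
        # Send each region once, then forget the pending data.
        sent = 0
        for offset in sorted(self.pending):
            self.send_to_server(offset, self.pending[offset])
            sent += 1
        self.pending.clear()
        return sent
```

Three writes to offset 0 followed by one to offset 512 result in just two sends when flush is called.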
    2.4 Data Coherency
    The following (bold and italics are mine) is taken from:
    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/data_coherency.asp
    Data that is coherent is data that is the same across the network. In other words, if data is coherent, data on the server and all the clients is synchronized.
    [much omitted]
    A more common situation is one thread modifying the file, and a lot of other threads reading it. The moment a write operation occurs, all of the local caches of that file are obsolete. The server must notify each client to abandon its cache. Any subsequent read operations for the file must be performed across the network.
    In another common situation, multiple threads on one or more network clients might try to write to the same file. This situation is similar to one in which several RCS users all want to make changes to the same file. Each user in sequence must check out the file, make changes, and then check the file back in. Similarly, in a local caching scheme the server must hand off the privilege of writing to a file to one client thread at a time.
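Both situations can be sketched in a few lines of Python. This is a toy model under my own assumptions (one file, whole-file caching, invented class and method names) and not the actual protocol Windows uses. It shows a server that tells every other client to abandon its cache when a write occurs, and that hands the write privilege to one client at a time.

```python
class CoherencyServer:
    """Hypothetical server holding one file and tracking its clients."""

    def __init__(self, contents=b""):
        self.contents = contents
        self.clients = []
        self.writer = None  # client currently holding the write privilege

    def register(self, client):
        self.clients.append(client)

    def acquire_write(self, client):
        # Only one client may hold the write privilege at a time.
        if self.writer is None or self.writer is client:
            self.writer = client
            return True
        return False

    def release_write(self, client):
        if self.writer is client:
            self.writer = None

    def write(self, client, data):
        if self.writer is not client:
            raise PermissionError("client does not hold the write privilege")
        self.contents = data
        # The moment a write occurs, every other local cache is obsolete.
        for other in self.clients:
            if other is not client:
                other.invalidate()


class CachingClient:
    """Hypothetical client that caches the whole file locally."""

    def __init__(self, server):
        self.server = server
        self.cache = None
        self.network_reads = 0
        server.register(self)

    def invalidate(self):
        self.cache = None

    def read(self):
        # Go across the "network" only when there is no valid local copy.
        if self.cache is None:
            self.cache = self.server.contents
            self.network_reads += 1
        return self.cache
```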

    3.0 Flushing System-Buffered I/O Data to Disk
    The following (bold and italics are mine) is taken from:
    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/flushing_system_buffered_i_o_data_to_disk.asp
    Windows stores the data in file read and write operations in system-maintained data buffers to optimize disk performance. When an application writes to a file, the system usually buffers the data and writes the data to the disk on a regular basis. An application can force the operating system to write the contents of these data buffers to the disk by using the FlushFileBuffers function. Alternatively, an application can specify that write operations are to bypass the data buffer and write directly to the disk by setting the FILE_FLAG_NO_BUFFERING flag when the file is created or opened using the CreateFile function.
    If there is data in the internal buffer when the file is closed, the operating system does not automatically write the contents of the buffer to the disk before closing the file. If the application does not force the operating system to write the buffer to disk before closing the file, the caching algorithm determines when the buffer is written.
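For comparison, here is what a forced flush looks like from a high-level language. This Python sketch uses os.fsync, which the Python documentation says is implemented on Windows with the Microsoft _commit() function. My understanding is that this ends up requesting the same kind of flush as FlushFileBuffers, but treat that as an assumption rather than something the excerpts above confirm. The function name write_durably is my own.

```python
import os


def write_durably(path, data):
    """Write data and ask the OS to push it to the device before closing.

    Hypothetical helper; how far the data really gets (for example, past a
    drive's own write cache) depends on the platform and hardware.
    """
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()             # Python's own buffer -> operating system
        os.fsync(handle.fileno())  # OS file cache -> storage device
```

Note that there are two layers of buffering here: the runtime's own buffer, emptied by flush(), and the system file cache that the excerpt is talking about, which is what fsync addresses.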


    All in all, the above leaves me with more questions than answers. But I do think it serves adequately to describe the overall processes in a generic way that is useful for discussion purposes. I hope you find it of some value too.