10 Things to Avoid in VFP Development
Message
From: 01/01/2000 08:54:35
To: Walter Meester (Hoogkarspel, Netherlands), 01/01/2000 04:46:11
General information
Forum: Visual FoxPro
Category: Other - Miscellaneous
Thread ID: 00310318
Message ID: 00311126
Views: 39
>>In the shipping industry, it's not unusual to deal with mapping tables based on 5 digit postal code to 5 digit postal code rate mapping; there are over 78K valid 5 digit postal codes. Typical rate charts from the national-level trucking firms and LTL freight carriers run in the 800K-1M record range. I provide rating systems for some of them - I may have 10-15 national-level carriers on-hand now, and a similar number of regionals. Even end-users with a single warehouse will have 70-80K records per carrier in the base rating tables they use when arranging shipments, and typically they'll have business with 4-8 different carriers.
>
>I would not regard this as a mid-sized business anymore.
>

Walter, any business that sends out pallet-sized or larger shipments needs at least the rates from their point of shipment to other locations in the continental United States, and if they contract their inbound freight, from all points to their location - roughly 100K-150K pairs combined for a single point of origin if the carrier resolves to a 5-digit Zip, and a tenth of that if they resolve to 4 digits. And that's one carrier; with the deregulation of the trucking industry, anyone who doesn't shop their freight is probably paying too much for LTL services. A typical small to medium business using Pitney Bowes' Ascent system for package-level shipment with the 5 major package/express carriers in the US is probably carrying ~70MB of carrier data before considering their manifests - a single shipping station can probably handle volumes of up to 400 packages per week. Probably more now, since the package carriers have implemented delivery commitments; I have a pretty good idea of the dataset sizes there, having designed engines to handle that. In some areas like rural Alaska, 5-digit Zips don't give fine enough resolution to determine the maximum delay in package arrival.

Shipping is just one of many aspects of business; anyone who sends goods to customers needs this kind of rate data. Very small businesses can use a Web-based system, but those generally don't offer competitive rate shopping.

Prospecting by email and mass snail mail is extremely prevalent, with very large databases of potential contacts, even targeted ones, available for relatively low cost. I get spammed by them routinely. UT's members who express an interest in receiving vendor mail represent perhaps 10K names for a very narrow, focused set of prospects. And UT is tiny.

Stock analysis, clearly a database-oriented application, deals with at least daily datapoints for hundreds or thousands of discrete offerings over very long time periods.
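
As a rough illustration (my figures, picked just to show the scale): tracking 2,000 issues at one closing price per day, over roughly 250 trading days a year, adds about 500K rows a year - a single decade of nothing but daily closes is already in 1M+ territory before anyone asks for intraday data.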

There is a whole class of businesses that function as product reps - they have limited or no on-hand stock, a very wide potential inventory, and may be in a position to handle thousands of items among similar numbers of prospective customers. Some of the businesses that fit this model include executive placement and temporary agencies which prospect among employers and track potential candidates, ticket agents, travel agents and barter brokers. Mail-order catalog businesses. Pharmacies. Even the door-to-door sales organizations like Amway.

>>With the advent of mass mailing and emailing, it's not unusual for people with even one-man shows to have lists of hundreds of thousands of potential contact names and addresses. Charities, churches, political organizations, the guy selling Ronco(tm)'s latest handy-dandy widget...
>
>Note that I was talking about 1+ million records.
>

How much spam do you get each week offering millions of email addresses for a small fee? I probably see a half-dozen a week - and that's with Outlook dropping the obvious ones off into the bit bucket without my ever seeing them. And where do you think all those twits got your email address? And how do they manage them? Hmmmm...

>My eyes are open for the business I'm making a living off. In my business I don't often see 1M+ records in a table. I wonder if the majority of the VFP community does. Also note that I was talking about MAIN tables, by which I mean tables such as articles, employees, clients, etc., not about tables for OLAP purposes.
>

OLAP consolidates and distills huge data stores into logical groupings organized in a way that makes sense to the user, so they don't have to deal with the huge collections of detail directly. OLAP reduces data set sizes to something a user can deal with, organized in a fashion that makes sense to them, not the way the programmer planned to view things. It does this by maintaining (shudder) summarized and often denormalized data and structuring the backing store and queries needed to handle the OLAP-managed data views. With many backend OLAP implementations, the backend can be given the opportunity to study the datasets when idle and preprocess detail into OLAP datasets or optimize query strategies according to a set of priorities outlining the relative costs of added storage requirements, database reorganization, timeliness of presentation and currency of the business representation - all without the programmer being constantly involved in the data retrieval strategy.

We've all done some OLAP-ish things in our systems - snapshots of summarized data, perhaps dumped into spreadsheets or other business modelling tools. OLAP just makes it easier for the user to specify his needs without getting me involved. The user gets to play with different models of his data without getting a programmer involved, my data stays normalized, and I don't have to explicitly set aside time to perform the necessary preprocessing. Of course, OLAP needs a way of expressing targeted result sets rather than procedures for creating the desired output... xBASE doesn't address this well.
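
To put that in code terms, here's a minimal sketch of the kind of snapshot I mean - the table and field names are made up for illustration, not from any particular system of mine:

* Quarterly sales rollup dumped into a cursor - the sort of summarized,
* denormalized snapshot a user would otherwise build by hand in a spreadsheet.
SELECT cust.region, ;
       inv.salesrep, ;
       YEAR(inv.invdate) AS nYear, ;
       QUARTER(inv.invdate) AS nQtr, ;
       SUM(inv.total) AS nSales, ;
       COUNT(*) AS nInvoices ;
   FROM invoices inv ;
   INNER JOIN customers cust ON inv.custid = cust.custid ;
   GROUP BY 1, 2, 3, 4 ;
   ORDER BY 1, 2, 3, 4 ;
   INTO CURSOR csrSalesCube

A real OLAP engine would maintain and refresh something like csrSalesCube for me, instead of my scheduling the query and dumping the result somewhere.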

My perspective here is that I like having tools that take advantage of otherwise idle cycles to analyse the history of user requests and try to anticipate what will be asked for next. It's inevitable that in a system that is not overburdened, there will be some useful and usable idle time available to spend on lower-priority tasks that can be performed in part or in whole asynchronously, in anticipation of future user requirements. If you guess wrong about what will be needed in the future, you're no worse off than if the system had just sat there and twiddled its thumbs.

>>>It still depends on the selectivity of the filter. If only a few rows are going to be displayed, yep, you've got a performance problem. But if you want to display all 1,000,000 records, p-views are definitely not an attractive alternative. Since the UI design says that all rows must be accessible by default, SET FILTER is the better choice.
>

P-views, especially remote p-views, are a godsend in these circumstances, because the user can deal with a subset of the data required while the backend continues the filtration and extraction process! It's nice when a horrendously large dataset can be offered to the user even while the backend continues to get the required data - and it only offers the data on request, so less data moves over the wire. Drilldown - isolating entities based on known attributes and refining the set of data of interest for the task - is typically of greater use.
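
A minimal sketch of what I mean, assuming a remote connection named ShipData, a hypothetical rates table, and an already-open database container (the names are invented for illustration):

* Parameterized remote view: only rows for the origin Zip the user is working
* with come across the wire, and FetchSize lets rows trickle back progressively
* while the backend keeps working.
CREATE SQL VIEW rv_Rates ;
   REMOTE CONNECTION ShipData ;
   AS SELECT * FROM rates WHERE rates.origin_zip = ?pcOriginZip

=DBSETPROP("rv_Rates", "View", "FetchSize", 100)   && fetch 100 rows at a time

pcOriginZip = "02134"
USE rv_Rates IN 0
* When the user switches origins, just requery - no new view definition needed.
pcOriginZip = "99501"
=REQUERY("rv_Rates")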

I'll use my company as an example. Someone calls and wonders where their books are. Normally, they know who they are, or can provide enough ancillary detail to help identify them to us - name, address, zipcode, phone number. Once we've narrowed the scope of the domain of all customer orders to orders for one customer (or in the case of wholesalers and chains, perhaps a small set of distinct customers) we can refine the data of interest - an invoice number, approximate purchase date, the item, a PO number, a check or credit card used to pay for them. The result of this refinement is identification of some entity or set of entities related to the customer that can further narrow the scope of things.
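
In code, that drilldown amounts to a couple of progressively narrower queries - the table and field names here are hypothetical stand-ins for ours:

* Step 1: narrow the whole customer base down to a handful of candidates
* from whatever the caller can give us - a phone number or a Zip.
SELECT custid, company, phone, zipcode ;
   FROM customers ;
   WHERE phone = m.pcPhone OR zipcode = m.pcZip ;
   INTO CURSOR csrCandidates

* Step 2: the operator picks one; from here on we only touch that customer's orders.
lnCustId = csrCandidates.custid
SELECT * ;
   FROM orders ;
   WHERE custid = m.lnCustId ;
   ORDER BY orderdate DESC ;
   INTO CURSOR csrOrders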

The single biggest advantage to SQL over xBASE implementations is that SQL specifies what you want, where xBASE describes the process to follow to get it. SQL deals with a description of the desired result. The choice of how to satisfy the requirement is left to the data engine; and a backend can do things like examine key distributions, system contention, and implement alternative index strategies to enhance performance and take advantage of data patterns and organizations that I didn't anticipate.
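
The contrast in a nutshell, with a hypothetical invoices table and assuming an index tag on custid:

* xBASE: I spell out the access path myself - pick the order, seek, walk the index.
SELECT invoices
SET ORDER TO TAG custid
SEEK m.lnCustId
SCAN REST WHILE custid = m.lnCustId
   * ... do something with each matching invoice ...
ENDSCAN

* SQL: I say what I want; the engine decides how to get it.
SELECT * FROM invoices WHERE custid = m.lnCustId INTO CURSOR csrInv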

We have different philosophies here - I think the user should be able to address his needs either through data specification or by the scope of the task at hand. I find that drilldown is better received by my users than exhaustive data sets which must be narrowed progressively, with a corresponding increase in retrieval cost as filters are refined. YMMV. My users don't like hundreds of items in view at once, much less thousands or millions, and generally have either narrowed the scope of the data domain in question or don't want to deal with the domain in detail.

>>If I had an end-user who really needed to deal with all 1M records, or even a significant fraction of those 1M records, through a browse a dozen or two records at a time, I'd be looking into psychiatric care for the individual or some general help for him in terms of clarifying the data needed to do the task at hand.
>
>In my case the grid displays all possible items (let's say persons) on the first page. The other pages show detail data of the selected item in the grid. The user is able to display the grid, sort the table, and use an incremental search to find the record he/she needs. This way, it doesn't matter if there are only a few hundred or a few hundred thousand items in the grid. Now let's say I've got a persons table of 50,000 records and the user knows that the person he/she is looking for lives in Amsterdam, is male and works for company X. The user applies the filter, the grid lists all possible candidates, and the user picks one out and makes the changes. The time needed to list these candidates would probably not differ much from the time to execute a SQL query, but it has the advantage that you're working with live data.
>
>The performance drop will not be a significant problem because this feature is only meant for the more exceptional search cases. If a specific search becomes structural, I might look for another solution, depending on the performance problems arising.
>
>>p-views and SQL SELECTs work well for my applications, letting the user deal with convenient sets of records, they scale well, and are easily moved to a backend environment when data set size or network performance becomes a major issue.
>
>I fully understand the advantages and disadvantages of p-views; they have different properties, and performance can favor either one depending on the circumstances. Personally I'm very comfortable using p-views on the many side of a one-to-many relation. I've found some circumstances where filters are much more convenient than p-views. From my POV they're different animals which have only one thing in common: filtering data. Working with live data, index usage, persistence, buffering and performance characteristics are other properties that are also important in deciding which to use.
>
>>I'm sure some people will continue to use xBASE effectively; for me, and the people I work with, it doesn't make sense, and won't put us in a position to exploit the information technologies now reaching the marketplace.
>
>Interesting.. Recently I read an article translated from a document released by the Gartner Group (November 1998?), saying that they predict a great future for aggregated query results. From what I understood of it, this means that rather than copying data to a query result cursor, it uses live data to construct the query results, resulting in a huge performance increase. I'll bet MS is at least researching this for SQL Server; maybe they've already implemented it in some form. I can see big parallels with the xBase equivalents achieving the same. From my POV this means that, among the other xBase features, the SET FILTER command is an important feature that is also showing up in the internals of server RDBMSs.
>
>>If I had to page down through a million records in a grid or browse, my fingers would cramp up if nothing else.
>
>Using an incremental search would not have this problem.
>
>Walter,
EMail: EdR@edrauh.com
"See, the sun is going down..."
"No, the horizon is moving up!"
- Firesign Theater


NT and Win2K FAQ | cWashington WSH/ADSI/WMI site
MS WSH site | WSH FAQ Site
Wrox Press | Win32 Scripting Journal
eSolutions Services, LLC

The Surgeon General has determined that prolonged exposure to the Windows Script Host may be addictive to laboratory mice and codemonkeys