ya, that makes sense!
i have made so far the decission, to seperate data, based on the bussiness logic, not on the 'waste' of space. since i don't have the problem with 2GB and most likely won't get there in my app's, i havn't bothered with it.
of course there are some things to be considered; and that's where you function comes in very handy:
- this evaluation cane only be done with existing apps, which have run for a long time, so you have some data to analyse
- There is a fine line which will have to be tetermined on a field by field basis.
- you could end up with a lot of new tables
- you need to determine at what point you use more space adding a new table with 2 fields instead of using 1 field
- more tables to be maintained, and more tables which could develope problems
- the more files you have in a folder, the more your app will slow down
of course if you've reached 2GB - how cares - you have no choice, and that's where the owner of this thread comes from.
I guess my question here is: when do you start taking those considerations? When do you start spending time to plan, design, and implement a more complex database architecture just because i
might happen?
>As per Tamar's comment - the goal is to determine which fields are wasting space in several contexts:
>
> - Within the field, because too much space has been set aside, this typically applies to character fields.
>
> - Fields that are not used very often, if 90% of a particular column is empty, I would recommend a separate table.
>
> - Range-of-values, if a field has a range of values 3,000,176 to 3,000,987 and you (subsequently) discover that the values never fall below 3,000,000 or go above 3,000,999 then the 3 million is superfluous. The field can be shortened considerably. For small tables this is not probably worth the effort. Also in the range-of-values for text fields, it can be worth doing a SQL Select Distinct, to determine whether your 4 million rows of a 50 character field, only contain few hundred thousand variations. In this case the field should be normalized out into a child table. Range-of-values applies pretty much to all data types even Boolean.
>
> - Etc.