Blog posts written during May 2010

Addition to free applications page

Sunday Sunday, May 23, 2010 by Ian Pettman

Ever so often we update the free applications page.

There is a new addition to the list of useful, worthwhile freely available products.

This update is for graphics packages.

 

 

Use of foreign keys

Friday Friday, May 21, 2010 by Ian Pettman

I partake in several forums related to software development, database design and problem resolution. It ultimately helps improve our service to our Temp Agency and Staff Bank software customers.

Occasionally there are comments or threads that are (in my opinion) really well balanced appraisals of fundamentals. Kenneth Wilhelmsson recently wrote one such contribution and has kindly given permission for its reproduction here. I believe I am correct in saying Kenneth has been contributing wisdom to at least one of the major forums in the SQL community for around a decade.  I have taken the liberty of correcting the very occasional typo as this was initially contributed 'on the fly' to a SQL forum. 

Thank you Kenneth

 

Kenneth writes on use of foreign keys

Well, this is indeed an interesting subject. :o)
I'll try to ramble a bit in the philosophical vein...
This is also just my .02, so feel free to agree or disagree =;o)
First and foremost, I want to make clear that we don't 'use' keys.. They are there, whether we can see them or not.
Keys are a part of the model, so the concept of a key is abstract. A key is *not* an index or a constraint.
Regardless of prefix, such as 'primary', 'foreign', 'alternative' etc etc, a key is - a key - nothing less and nothing more. They are all the same in this respect.
 
Now, usually the debate is around how to *implement* keys in the best way. This is where we stumble into the realms of indexes, constraints et al. 
A quick demo: consider my table here:

 
CREATE table myTab ( col1 int not null, col2 int not null, col3 not null )

 

Now, if I were to say that this table has one primary key and one foreign key... Would I be lying?
No, I wouldn't. It is so, because I have designed the table and have also decided where my keys are.
The problem is though, that one can't see this without directly asking the designer (me).
This isn't very helpful, so it'd be better to document the keys in some way.
Also, there isn't anything visible at the table level that actually helps to enforce that my keys really do behave like keys.
How to do it then?
 
If nothing more is done, the keys are still there.  (They are part of the model/design remember?)
It's however a great risk that the table will become corrupted very quickly.
'Corruption' in this sense is when you end up with data that isn't conforming to the keys present, i.e. a duplicate or an orphan or any anomaly not intended.
 
So, one better need to implement these keys in some way, by placing the rules wanted at some level between the data and the 'other end' - i.e. input/output side.

 

At this point there may be a debate concerning 'performance', and you now have the two camps noted earlier.
I would argue that performance is always secondary to integrity, so better do it 'right'... or 'corruption' will happen.

 

Now, if we decide to place the rules far away from the data, then this will probably work for the purpose of keeping integrity...  as long as this path is *ALWAYS* used! In reality though, the farther away from the data DRI are placed, the higher the risk that someday, something, willingly or unwillingly something will slip into our database without passing those rules. This is one of the major downsides to 'application enforced DRI' - distance. Another downside is that the tables usually look like my example above. It doesn't tell you much information:  you have to go to the upper layer and start to unwind the app to figure out what the intentions really are. And, there's also the chance of actually *missing* on performance gains (regarding FK's, more on this later).
 
Perhaps it make for better sleep at night if the DRI is placed as close to the actual data as possible?
There's usually no debate regarding the key that is designated as 'primary'.
 
CREATE table myTab ( col1 int not null, col2 int not null, col3 not null )
CREATE unique index myInd on myTab(col1)
 
This would suffice as far as DRI on my designated PK is concerned.
It's however still not very informative. One can't really be sure what this is, except that it now behaves like a key.
Another thing that is now demonstrated is that our most efficient tool to implement a key is with the help of a unique index. (Clustered or not has absolutely no relevance in this regard.)
 
So, to better communicate our intentions, we can declare a constraint.
 
CREATE table myTab ( col1 int not null, col2 int not null, col3 not null )
ALTER TABLE myTab ADD CONSTRAINT myInd PRIMARY KEY (col1)
 
Now there's no question about what kind of key it is.
The declaration clearly informs that it's designated as 'primary' it is a 'key' and it uses an index to enforce the behaviour such a key should have.
 
Ok, so similar concept for other kinds of keys, such as Foreign Keys.
ALTER TABLE myTab ADD CONSTRAINT myFK FOREIGN KEY (col3) REFERENCES someOtherTable(colX)
 
This tells us clearly that there is a relation here with a foreign key (information) and it's tightly coupled to the table (reliable)
In essence a FK constraint is a CHECK constraint, and this is where it gets interesting performance-wise...
 
A check constraint (i.e. our FK here) can be one of two things: 'trusted' or 'not trusted'.
Try issuing this query:

 

SELECT OBJECTPROPERTY(object_id('FK_name'), 'CnstIsNotTrusted')
 
If it returns a zero (false), all is well. If it returns a 1 (true) then we have some issues.
Very briefly, if you create/enable a FK constraint *with validation* of existing data, then it will be 'trusted'
In all other cases it will be 'NOT trusted'
 
This is a huge issue for the optimizer among other things, and if done properly, may yield performance benefits as an added bonus.
For the absolute most parts, one would like to have all FK constraints to be trusted.
It means we can rely on it, and that there are no orphans present, for sure.

If it's not trusted, then we won't know for sure, and neither will the optimizer. It may even disregard the constraint altogether when it looks for a plan, resulting in a less optimal plan overall than otherwise.

If it is trusted, the optimizer also trusts this intelligence, and as a side effect may produce a plan that doesn't even touch the table that the FK references, even though the actual query may say something like:

 

SELECT a.col2 FROM myTab a JOIN someOtherTab b on a.col3 = b.colX

 

It depends on what is needed to be retrieved.
If a.col2 is sufficient, then why bother go see if a.col3 ties up with b.colX if the constraint which is present has already done this check? (This is assuming it can be trusted of course.)

So, for possible performance benefits, this is one that may happen in certain circumstances. At any rate, it's a benefit. It also would never ever occur when the DRI is placed at the application level. So, to end this, if anyone has persisted this far.. =;o)

I'm pro the camp that proclaims DRI enforcement at the database level.
It makes me sleep better =;o)
 
/Kenneth

Temp Agency or Staff Bank software and Microsoft SQL 2008

Wednesday Wednesday, May 19, 2010 by Ian Pettman

Microsoft have just released SQL Express 2008 R2. This contains a number of features refined from SQL 2008. Perhaps the most important to Temporary Staff Agencies or Staff Banks is the increase in capacity from 4GB of data to 10GB of data.  When Ava first launched their SQL version of Ava Pa, the limit was a mere five connections whilst we implemented this in a way that allowed 20 of 30 concurrent users, in those days this was sufficient capacity to run a small to medium size agency for five years. Microsoft then set an initial capacity limit of 2GB then 4GB with subsequent versions of SQL Express. We have now got to 10GB and our product has evolved becoming more feature rich, storing more data (emails, text or SMS messages and the like).
The following links provide direct access to download
MS SQL Express 2008 SP1
And
MS SQL Express 2008 R2

For the sake of completeness if you really must install MS SQL 2005 Express

MS SQL Express 2005

 

Please note that for first time installs: the following prerequisites are necessary. we recommend that you download them and install them first.

These are:

Net framework version 3.5 SP1

Windows Installer 4.5

Power shell 1.0 (There is Power shell 2.0 - do not be mislead)

This process may involve a number of reboots: I seem to average about 4.

 

What does this mean for free capacity?

Contact Information

To find out more about Ava solutions you can contact us in a number of ways:
Follow Us...