Tuesday, December 2, 2014

Wide clustered key

I did not pay attention this thread until now.


As for the original question, I would agree with Dan and David. Kalman argued that you should have a surrogate key to facilitate foreign keys. From that perspective, this is the worst you can do. I've been working lately with some tables where the person who designed has worked according to this principle. The tables are difficult to understand, and moreover, the surrogate keys opens for anomalies that would not be possible if he had used composite keys.


There is however one argument for using a clustered surrogate key that is monnoically growing: this prevents page splits, and improves INSERT performance and prevents fragmentation. But in this case you would lose performance for the SELECT, so this is a trade-off.


Now you suddenly say that col1-col4 is not the full key, but there is an optional col5. I prefer to use NULL to mean "not applicable", and with this decision, I would add a surrogate key as a primary key and a UNIQUE constraint on col1-col5. Everything else the same as said above, I would make the PK non-clustrered and the UNIQUE constraint clustered. It follows from this that if I would later have to add a child table, I would need to use the surrogate key for the PK/FK-relation, with the pain that this incurs.


Now, since this column is a character column, I would not object too much against making the column NOT NULL with a default value of the empty string. In practice, there are not very many situations where the empty string and NULL means different things for a string. I would be a different matter for a numeric column or a date/time column where the "nothing" values are less obvious.


I like to stress that the question on how to design the keys is independent from the question on what index should be clustered.





Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

No comments:

Post a Comment