Welcome to EMC Consulting Blogs Sign in | Join | Help

Data Based

Just some thoughts of mine

Distributed Keys and Disjoint Subtypes

Entity subtyping is a very common data modelling scenario that gets extensive coverage in books on relational designStar but it isn’t necessarily well supported by SQL DBMSs. The following is a simple technique I have found useful for implementing some kinds of entity subtypes. Specifically it applies to the case of Disjoint Subtypes that require what Hugh Darwen calls a Distributed Key – a uniqueness constraint spanning multiple tables.

First, here’s an example of a disjoint subtype problem. A retailer sells three types of product: Books, CDs and DVDs. In their product database all these product types have two attributes in common: SKU and Title. They also have other attributes that are unique to one type or the other. The unique attributes are: Number of Pages (for Books), Number of Tracks (for CDs) and Duration (DVDs). Here’s an E/R model prototype. SKU is the common candidate key and is also the identifying foreign key in each of the referencing tables:

So far so good. But the constraints implied by this E/R diagram are not enough to enforce the business rule that Books, CDs and DVDs are disjoint subtypes – i.e. that no SKU can be of more than one type. Most textbook examples don’t seem to address the problem of how to enforce such a constraint in SQL. The solution I like to use is to add a ProductType attribute to both the supertype and the subtypes and then include a compound foreign key on (SKU, ProductType). Here are the additional constraints for the Books table:

ALTER TABLE Books ADD CONSTRAINT Books_Products_ProductType_FK
FOREIGN KEY (SKU, ProductType) REFERENCES Products (SKU, ProductType);

ALTER TABLE Books ADD CONSTRAINT Books_ProductType_CK
CHECK (ProductType ='B');

Similar constraints apply to CDs and DVDs. The following is the DDL for the complete schema. To aid readability I’ve omitted the constraint names. I’ve also added some defaults:

CREATE TABLE Products
 (SKU INT NOT NULL PRIMARY KEY,
  ProductType CHAR(1) NOT NULL
  CHECK (ProductType IN ('B','C','D' /* Book, CD or DVD */)),
  Title VARCHAR(50) NOT NULL,
  UNIQUE (SKU,ProductType));

CREATE TABLE Books
 (SKU INT NOT NULL PRIMARY KEY,
  ProductType CHAR(1) DEFAULT 'B' NOT NULL, CHECK (ProductType ='B'),
  Pages SMALLINT NOT NULL,
  FOREIGN KEY (SKU,ProductType) REFERENCES Products (SKU,ProductType));

CREATE TABLE CDs
 (SKU INT NOT NULL PRIMARY KEY,
  ProductType CHAR(1) DEFAULT 'C' NOT NULL, CHECK (ProductType ='C'),
  Tracks SMALLINT NOT NULL,
  FOREIGN KEY (SKU,ProductType) REFERENCES Products (SKU,ProductType));

CREATE TABLE DVDs
 (SKU INT NOT NULL PRIMARY KEY,
  ProductType CHAR(1) DEFAULT 'D' NOT NULL, CHECK (ProductType ='D'),
  Duration SMALLINT NOT NULL,
  FOREIGN KEY (SKU,ProductType) REFERENCES Products (SKU,ProductType));
 

That’s it. The disjoint requirement is always enforced by the combination of the compound foreign key constraints and check constraints. Although it’s extremely simple I think this is worth documenting here because I have never seen such a solution mentioned in print. I don't claim any originality for the above and I know of others who have used it. If anyone is aware of a good reference that describes the same technique then I’d be grateful to hear about it so that I can attribute a source in future.

One other tiny point of interest. A colleague of mine once objected to this design because he said that the table I've called Products violated Boyce-Codd Normal Form. I am confident that he was wrong however - there is no violation of BCNF. In fact this schema comfortably satisfies 5NF. For the moment I’ll leave further analysis open to anyone who wants to comment on it here...

--

Star See: Halpin - Information Modeling and Relational Databases; Pascal - Practical Issues in Database Management; Date - An Introduction to Database Systems.

 

Published 08 January 2007 22:35 by David.Portas
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

tkoppela said:

I've seen this before... will let you know if I can find it again.

What is still not enforced though (and is worth mentioning I think) is the 'mandatory entries' in one of the subtypes requirement:

- For B Products there must be an entry in Books.

- For C Products there must be an entry in CDs.

- For D Products there must be an entry in DVDs.

Toon

February 13, 2007 21:24
 

Johan said:

As you had to get the job done in SQL, I think you did a good job. But of

course, it's a pity that SQL forces us to these kinds of workarounds.

When it comes to the discussion you had with your colleague, I don't agree

with him when he said that the table you have called Products would

violate BCNF. But I wonder if your other relvars (Books, CDs and DVDs)

really satisfy... 2NF! Each attribute should depend on the WHOLE key. But

if we take relvar Books as an example, then its attribute ProductType

doesn't accept any other value than 'B' so it does not depend on the whole

key { SKU }. It just depends on the proper subset { } of that key (an

empty subset as you see! - SQL doesn't understand that but I am confident

you do). With a truly relational language as Tutorial D, we could safely

choose to vertically decompose Books in two base relvars:

- one with the attribute set { SKU, Pages } and the key { SKU }

- the other one with the attribute set { ProductType } and a key with no

components { }

Then we could recompose your relvar Books (and define it as a view if we

wanted) by joining these two base relvars.

Another way would be to drop the second base relvar and obtain the view

Books just by extending the only one remaining base relvar with a new

attribute, ProductType. Even SQL would allow us to do that but then it

would certainly not allow us to define a FK-constraint for that view...

So, as I wrote first: As you had to get the job done in SQL, I think you

did a good job!

February 13, 2007 23:44
 

David Portas' Blog said:

My post on Disjoint Subtypes generated some interesting feedback. I made a point of saying that the Products

February 19, 2007 13:33
 

jamieuk said:

DP: "I have never seen such a solution mentioned in print. I don't claim any originality for the above and I know of others who have used it. If anyone is aware of a good reference that describes the same technique then I’d be grateful to hear about it so that I can attribute a source in future."

Celko has been posting such a schema (CREATE TABLE Vehicles ...) some years now as one of his 'cut & paste' newsgroup specials. This is the earliest appearance I could find (2004-05-29):

http://groups.google.com/group/comp.databases/msg/a23ffb19bfde2f20

Jamie.

--

March 21, 2007 16:25
 

Hunchback said:

Here is a reference, see slide 17.

http://www.cbe.wwu.edu/misclasses/mis421s04/presentations/supersubtype.ppt

Hope it helps.

April 5, 2007 20:29
 

database design --> working database question - dBforums said:

January 7, 2009 21:04
 

database design --> working database question - dBforums said:

January 7, 2009 21:10
 

How to implement multiple assignment? | keyongtech said:

January 22, 2009 02:27
 

Table partitioning to improve performancce - dBforums said:

February 12, 2009 15:33
 

Table partitioning to improve performancce - dBforums said:

February 12, 2009 16:29
 

Table partitioning to improve performancce - dBforums said:

February 14, 2009 07:15
 

inheriting tables (sql server 2008) | keyongtech said:

March 31, 2009 15:11
 

Need help - dBforums said:

April 25, 2009 22:33

Leave a Comment

(required) 
(optional)
(required) 
Submit
Powered by Community Server (Personal Edition), by Telligent Systems