Part of the definition of a true relational database product is that it supports referential integrity. The principle of referential integrity may be stated as:
Every non-null foreign key value must match an existing primary key value.
The non-null part means that a value is not required in the foreign key field, but every one that does exist must match a value in the primary key field of the table to which it is related. Referential integrity is all about preserving the validity of foreign key values.
Depending upon the product that you are using, you, as the database designer, will usually have to do something to invoke the referential integrity between database tables. It may be as part of the table relationship definition or in the way you join the tables. It is important for you to know how the product that you use enforces referential integrity.
Observe the following example of two tables with a one - to - many relationship; Departments is the primary table and Employees is the related table.
|RI||Receiving & Inventory|
Tables with a one-to-many relationship
In the Employees table, DeptCode is the foreign key. What referential integrity is saying is that, if a value is entered in that field, it must match a value in the primary key field (DeptCode) of the Departments table.
Why is this important?
Referential integrity helps to ensure that the database contains valid and usable records by preserving the connections between tables. Without it, the relationships between tables could quickly become meaningless and the queries on the data would return unreliable results. Let's work through the implications of this principle when it is enforced by the RDBMS.
Look at the tables above and think about how the principle of referential integrity would apply.
Examine the data in the tables. If, in the Departments table, you changed the code for the Accounting department from AC to AG, would that violate the principle of referential integrity?
Yes, it would as there would be entries in the DeptCode field of the Employees table (the foreign key field of the related table) that no longer matched a value in the primary key of the primary table.
Another way of expressing this is to say that referential integrity will not allow you to change values in the primary or parent table that would create orphan records in the related or child table.
Examine the above tables once again. What if the Marketing and Sales departments were merged and you wanted to delete the Sales department record in the Departments table?
Would referential integrity allow you to do that?
No, because then you would have orphan records in the Employees table. There would be foreign key values that would not match a primary key value.
Now let's look at a couple of different tables.
Examine the data below; Apperence_Types is the primary or parent table, and Appearances is the related or child table.
|6/30/99||1:00 PM||2:30 PM||104||BS|
|7/13/99||1:00 PM||3:00 PM||112||BS|
|7/14/99||9:00 AM||11:30 AM||112||BS|
|9/1/99||4:00 PM||5:30 PM||106||MA|
|9/1/99||7:30 PM||9:00 PM||106||MA|
|11/20/99||7:00 PM||9:00 PM||111||SA|
|1/5/00||8:00 PM||9:30 PM||102||MG|
|MA||Meet the Author|
|MG||Musical Groupl Performance|
|SA||Solo Artist Appearance|
|SP||Solo Artist Perfomance|
If you deleted the record for "MA Meet the author" in the primary table, how many records could be orphaned in the related (child) table?
This could orphan 2 records
Would referential integrity allow you to change the value in the TypeCode field for the record for 11/20/99 from SA to SS?
No, because it would not match a record in the primary table.
In the real world, coding systems are revised and departments do get merged. So you might be wondering how you can maintain data when the change you need to make violates referential integrity. RDBMS products generally handle this through cascading updates and cascading deletes (different products may have different names and techniques for invoking this concept). In some products, you may have to write a rule or trigger or use an operator.
What a cascading update does is this: If you change the code for the Accounting department in the Departments table from AC to AG, all the occurrences of AC in the Employees table will automatically be changed to AG. The change "cascades" to related tables. This keeps the data valid and in compliance with referential integrity.
If you deleted the SL or Sales department record in the Departments table, a cascading delete would delete all records in the Employees table that contained SL as a value in the DeptCode field. (Most RDBMS products most of the time will warn the user before making this deletion.) This cascade effect could travel even further if there were other tables related to the Employees table that would then be orphaned by the deletion of employee records. This is a little scary but the intention is good. At least the data that's left is valid.
Once you understand how this works, you can make changes to the data in an appropriate sequence so that the data change does not have any undesired or inadvertent effect. Using the example of the merger of the Marketing and Sales departments, say you wanted to change the MK Marketing record in the Departments table to MS Marketing & Sales and then delete the SL Sales record. You would first change the code in the Departments table to MS (and change the department name) and let that change cascade to the related records that had MK as a value. You would then change all the occurrences of SL in the Employees table to MS; this would comply with referential integrity. You could then delete the SL Sales record in the Departments table without deleting any records in the Employees table.
Referential integrity is enforced at the database level in that it controls the integrity of data between tables. You, as the database designer, can also do things at both the field and table levels to help ensure data integrity. This is where the knowledge you gained of business rules can be implemented and you can help ensure that the data entered meets the requirements of the particular setting for the database.
As you implement the business rules, you should document each rule and how and where it is enforced in your design. Rules do change and the documentation will make it much easier to find what part of the design needs to be modified. As you work through them, you may find some that cannot be built in at the field or table level in the RDBMS in which you are working. These would have to be done at the application level using queries, forms or views, reports, or with code-areas that are beyond the scope of this article.
As you implement a rule, it is important to test it.
Do you get the intended result when the rule is met?
What about when the rule is violated?
Good application design will give the users feedback when they break a rule.