databasedev.co.uk - database solutions and downloads for microsoft access

Relational Database Design

Normalize the Data

Once you’ve decided upon and drafted the tables and fields you believe that you will require in your database, the next stage is to enter some sample data in the tables and look for any potential problems with maintaining the data.

Identifying possible data maintenance problems before you enter your real data and start working with your database will save you many headaches and extra work later on. Improperly designed tables can also result in inaccurate data, and you certainly don’t want to be making decisions based on data that you can’t trust.

Data Normalization Guidelines

Data normalization ensures that, when you put into practice the tables you have planned, the resultant database will be able to efficiently provide the database queries and reports desired and ongoing data maintenance will be as straightforward as possible.

Common data normalization procedures are as follows:

  • Each field should contain the smallest meaningful value.
  • There should be no repeated groups of fields (similar data belongs in the same field).
  • There should be no unnecessarily repeated data values.
  • Where practical, all fields in a table should be relevant to every record.

Non-normalized Example:

In the following example the chosen fields are not correctly normalized. The Name field is not the smallest meaningful value. There are two email address fields, and the second will not always have a value.

  • Full Name
  • Work Phone Number
  • Work Email Address 1
  • Work Email Address 2
  • National Insurance Number
  • Start Date

Normalized Example:

In this example, each field has the smallest meaningful value, there are no repeated groups of fields or repeated data values, and every record in the table will have a value in each of these fields.

  • Employee First Name
  • Employee Last Name
  • Work Phone Number
  • Work Email Address
  • National Insurance Number
  • Start Date

Blank Values

Though not a strict rule, in many cases blank values in fields should appear because that value is temporarily missing or unknown—not because that field does not pertain to a record.

So, for example, if you have a table listing Company details that contains a web site field, that field may possibly contain a blank value for some Companies (assuming that not all of them may have a Web site). The value isn’t temporarily missing or unknown; the field just doesn’t apply to every record. That’s a sign that the field might belong in a separate table.

Data Denormalization

Definition:

Denormalization occurs when a database designer combines data into one table that the normalization process indicated should be in two tables. After a database designer has fully normalized the design of a database, he or she may, on occasion, choose to do this for any number of reasons:

  • Performance reasons, such as making queries run faster against very large tables.
  • In order to keep similar data together.
  • To keep the overall table structure simple, particularly when it is clear that breaking normalization guidelines will not negatively affect the ability to run queries or create reports.

Denormalization Example

A common example of denormalization is allowing an occasional field to be blank for some records. For instance, a database may contain a middle initial field, but not all employees have or use a middle initial. Strict normalization rules would state that the field should not be null—but it is common practice to keep this field in the same table with the first and last names.