Relational Database Design Requirements

Microsoft Access Articles

GUI Design

Resources

Online Shop

Info

Designer for Microsoft Access
Create complex MS Access databases without being an expert in relational database design! Designer for Microsoft Access asks you plain-language questions about what you want to manage with your database, and creates the tables and relationships automatically. Free trial available

Database Design

A database is a collection of information related to a particular subject or purpose (e.g. tracking customer orders or maintaining a music collection). If your database isn't stored on a computer, or only parts of it are, you may be tracking information from a variety of sources that you have to coordinate and organize yourself.

NOTE: It doesn’t matter how simple the system is, a flat-file database design (only one table) should not be employed, because it has severe limitations and inefficiencies. Your database should contain at least two tables and set relationship between them.

A well-designed database is a prerequisite for fast data retrieval and updates.

Before you build the tables and other objects that will make up your system, it is important to take time to design it. A good design is the keystone to creating a system that does what you want it to do effectively, accurately, and efficiently. If you are looking for professional database design, a service such as Approved Database Developers is great online resource.

The basic steps in designing a database system:

To determine the purpose of your system.
To determine the tables you need in the system.
To determine the fields you need in the tables.
To identify fields with unique values.
To determine the relationships between tables.
To refine the design.
To add data (populate tables) and create other system objects.

To determine the purpose of your system you need to know what information you want from the database (detailed scenario). From that, you can determine what subjects you need to store facts about (the tables) and what facts you need to store about each subject (the fields in the tables).

So, make sure that you do answer the following questions:

What types of things should the system keep track off? (System objects -> purpose and use)
What would a user want to know about these things and what restrictions should be placed on changing the attributes of these things? (System properties -> features, attributes, nature)
What would a user want to do to these things? (System methods -> actions to be taken)
What is the relationship between the different (things) types? (System Object Model)

NOTES:

Modelling the structures on the paper before opening computer and start coding is highly recommendable. Planning may seem time-consuming up front, but not planning is twice as time-consuming later in the application's life cycle.
At this point it's important to establish and employ naming conventions. Each table and table field should have short but meaningful and descriptive name according to its purpose and function, and incorporated prefix in front of the name to indicate its type. Although you are allowed to include spaces in the table and field names, most examples in the Microsoft documentation show field and control names without spaces because spaces in names can produce naming conflicts in Visual Basic for Applications in some circumstances.

To determine the tables can be the trickiest step in the database design process. That is because the results you want from your database (e.g. the reports you want to print, the forms you want to use, the questions you want answered) don't necessarily provide clues about the structure of the tables that produce them. In fact, it may be better to sketch out and rework your design on paper first. When you design your tables, divide up pieces of information by keeping following fundamental design principles in mind:

A table should not contain duplicate information, and information should not be duplicated between tables (e.g. Store each customer address and phone number once, in one table).
When each piece of information is stored in only one table, you update it in one place. This is more efficient, and also eliminates the possibility of duplicate entries that contain different information.
Each table should contain information about one subject. When each table contains facts about only one subject, you can maintain information about each subject independently from other subjects (e.g. You would store customer addresses in a different table from the customers' orders, so that you could delete one order and still maintain the customer information).

There are several things you can do to optimise your tables:

Design tables without redundant data.
Choose appropriate data types for fields; you can save space in your database and improve join operations.
Create indexes for fields you sort, join, or set criteria. This will make dramatic improvements in the speed of queries by indexing fields on both sides of joins, or by creating a relationship between those fields and indexing any field used to set criteria for the query. Finding records through the Find dialog box is also much faster when searching an indexed field.

NOTE: Indexes aren't appropriate in all cases. However, indexes add to the size of the .mdb file, reduce concurrency (the ability of more than one user to modify a page at the same time) in multi-user applications, and decrease performance when you update data in fields that are indexed, or when you add or delete records. It's a good idea to experiment to determine which fields should be indexed. Adding an index may speed up a query one second, but slow down adding a row of data by two seconds and cause locking problems. Or it may add negligible gains depending on which other fields are indexed (e.g. Adding an index to a PostalCode field may provide very little performance gain if a CompanyName field and LastName field in the table are already indexed). Regardless of the types of queries you create, you should only index fields that have mostly unique values. In a multiple-field index, use only as many fields in the index as necessary.

To determine the fields you need - Each table contains information about the same subject, and each field in a table contains individual facts about the table's subject (e.g. A customer table may include company name, address, city, state, and phone number fields). When sketching out the fields for each table, keep following tips in mind:

Relate each field directly to the subject of the table.
Don't include derived or calculated data (data that is the result of an expression).
Include all the information you need.
Store information in its smallest logical parts (e.g. First Name and Last Name, rather than Name.)

To identify fields with unique values - In order to connect information stored in separate tables (e.g. To connect a customer with all the customer's orders) each table in your database must include a field or set of fields that uniquely identifies each individual record in the table. Such a field or set of fields is called a primary key.

NOTE: The power of a relational database system comes from its ability to quickly find and bring together information stored in separate tables using queries, forms, and reports. In order to do this, each table should include a field or set of fields that uniquely identify each record stored in the table. This information is called the primary key of the table. Once you designate a primary key for a table, to ensure uniqueness, the system will prevent any duplicate or Null values from being entered in the primary key fields. To set relationships between tables, you must establish a link between fields that contain common information. The fields themselves do not need to have the same name (although it is a good practice), but the field’s data type and length must be the same, and (more importantly) the information contained within both fields for a specific record must be the same in both tables for the link to work. Generally, a relationship is established by linking these key fields between tables – the primary key in one table and a foreign key in another table. Every table should have a primary key – one or more fields whose contents are unique to each record. This is called entity integrity in the database management. By having a different primary key in each record you can tell two records apart. The goal of setting primary keys is to create individual records in a table that will guarantee uniqueness. Database usually has more than one table, and these tend to be related in some manner. When linking tables you link the primary key field from one table to a field in the second table that has the same structure and data type. The link field in the second table is known as a foreign key field. By matching the values from the primary key to foreign key in both tables, you can relate two records.

There are three kinds of primary keys that can be defined:

AutoNumber primary keys - An AutoNumber field can be set to automatically enter a sequential number as each record is added to the table. Designating such a field as the primary key for a table is the simplest way to create a primary key. If you don't set a primary key before saving a newly created table, the system will ask if you want it to create a primary key for you. If you answer ‘Yes’, the system will create an AutoNumber primary key.
Single-field primary keys - If you have a field that contains unique values such as unique serial numbers or part numbers, you can designate that field as the primary key. If the field you select as primary key does have duplicate or Null values, the system won't set the primary key. You can run a Find Duplicates query to determine which records contain duplicate data. If you can't readily eliminate duplicate entries by editing your data, you can either add an AutoNumber field and set it as the primary key or define a multiple-field primary key.
Multiple-field primary keys - In situations where you can't guarantee the uniqueness of any single field, you may be able to designate two or more fields as the primary key. The most common situation where this arises is in the table used to relate two other tables in a many-to-many relationship

IMPORTANT: The process of designing a relational database includes making sure fields containing only one item of data, eliminating redundant data, and making sure a table contains only data directly related to the primary key. The task of a database designer is to structure the data in a way that eliminates unnecessary duplication and provides a rapid search path to all necessary information. The process of refining tables, keys, columns, and relationships to create an efficient database is called normalization (minimizing the duplication of information in a relational database through effective table design).

Normalizing is not just for relational files: it's also a common design activity for indexed files. Each step is called a form, and forms range from the first normal form (1NF) to fifth normal form (5NF). There is also one higher level, called domain key normal form (DK/NF).

Normalization is a complex process with many specific rules and different levels intensity. In its full definition, normalization is the process of discarding repeating groups, minimizing redundancy, eliminating composite keys for partial dependency, and separating non-key attributes. In simple terms, the rules for normalization can be summed up in a single phrase: "Each attribute (column) must be a fact about the key, the whole key, and nothing but the key." Said another way, each table should describe only one type of entity (such as a person, place, customer order, or product item). Some normalization benefits are:

Data integrity (because there is no redundant, neglected data).
Optimised queries (because normalized tables produce rapid, efficient joins).
Faster index creation and sorting (because the tables have fewer columns).
Faster UPDATE performance (because there are fewer indexes per table).
Improved concurrency resolution (because table locks will affect less data).

You can normalize most simple databases by following a simple rule of thumb: tables that contain repeated information should be divided into separate tables to eliminate the duplication.

To determine the relationships between tables - Now that you've divided your information into tables and identified primary key fields, you need a way to tell the system how to bring related information back together again in meaningful ways.

To do this, you define relationships between tables. Relationship is an association between common fields (columns) in two tables. A relationship works by matching data in key fields. In most cases, these matching fields are the primary key from one table, which provides a unique identifier for each record, and a foreign key in the other table.

The kind of relationship that the system creates depends on how the related fields are defined. When you physically join two tables by connecting fields with like information, you create a relationship that is recognized by Access. The specified relationship is important. It tells Access how to find and display information from fields in two or more tables. The program needs to know whether to look for only one record in a table or to look for several records on the basis of the relationship.

Relationships between tables are grouped into 4 groups:

One-to-one (1:1) - each record in Table A can have only one matching record in Table B, and each record in Table B can have only one matching record in Table A. This type of relationship is not common and it is very rarely used in database systems, but it can be very useful way to link two tables together. The information related in this way could be in one table. However, you might use a one-to-one relationship to divide a table with many fields in order to isolate part of a table for security reasons, or to store information that applies only to a subset of the main table, or for efficient use of space. A one-to-one relationship is created if both of the related fields are primary keys or have unique indexes.
One-to-many (1:M) - is the most common type of relationship and it is used to relate one record in a table with many records in another table. In a one-to-many relationship, a record (parent) in Table A can have many matching records (children) in Table B, but a record (child) in Table B has only one matching record (parent) in Table A. This kind of relationship is created if only one of the related fields is a primary key or has a unique index.
Many-to-one (M:1) - is used to relate many records in a table with only one (single) record in another table. It is often called the lookup table relationship. Normally, this kind of relationship is not based on a primary key field in either table. Although in theory this relationship is one-to-one, it is known as many-to-one because it does not use a primary key field for the link, and many records from the primary table link to a single record in another table.
Many-to-many (M:M) - is used to relate many records in a table with many records in another table. A record (parent) in Table A can have many matching records (children) in Table B, and a record (child) in Table B can have many matching records (parents) in Table A. It is the hardest relationship to understand and it is not correct. By breaking it into two one-to-many relationships and creating a new (junction/link) table to stand between the two existing tables (e.g. whose primary key may consist of two fields - the foreign keys from both tables A and B) will enable correct and appropriate relationship setting. A many-to-many relationship is really two one-to-many relationships with a junction/link table.

In addition to specifying relationships between two tables in a database, you also set up referential integrity rules that will help in maintaining a degree of accuracy between tables. It would prevent unwanted and accidental deletions of records in a parent (primary) table that relate to records in the child table. This type of problem could be catastrophic. These rules keep the relationships between tables intact and unbroken in a relational database management system, because the referential integrity prohibits you from changing existing data in ways that invalidate the links between tables. Referential integrity operates strictly on the basis of the tables’ key fields. It checks each time a key field, whether primary or foreign, is added, changed or deleted. If any of these listed actions creates an invalid relationship between two tables, it is said to violate referential integrity. Referential integrity is a system of rules that Microsoft Access uses to ensure that relationships between records in related tables are valid, and that you don't accidentally delete or incorrectly change related data.

NOTE: When tables are linked together, one table is usually called parent table (always ‘one end’ of an existing relationship) and another table is called child table (always ‘many end’ of an existing relationship). This is known as a parent-child relationship between tables. Referential integrity guarantees that there will not be an orphan, a child record without a parent record.

To refine the design - After designing the tables, fields, and relationships you need the time to study the design and detect any flaws that might remain. It is easier to change your database design now, rather than after you have populated the tables with data or created the relevant screens.

To enter data and create other system objects - When you are satisfied that the table structures meet the design goals described here, then it's time to go ahead and add all your existing data to the tables. You can then create any queries, forms, reports, macros, and modules that you may want.

Additional reading:

Data Normalization - minimizing the duplication of information in a relational database through effective table design (making sure that fields contain only one item of data that is directly related to the primary key, and eliminating redundant data). More »

Referential Integrity Rules - preserving the defined relationships between tables when records are entered or deleted, and ensuring that key values are consistent, in order to prevent users from: adding records to a related table when there is no associated record in the primary table, changing values in a primary table that would result in "orphaned" records in a related table and deleting records from a primary table when there are matching related records. More »