Normalization
Normalization is the process of efficiently organizing
data in a database. There are two goals of the
normalization process: eliminating redundant data
(for example, storing the same data in more than one
table) and ensuring data dependencies make sense
(only storing related data in a table). Both of these
are worthy goals as they reduce the amount of space
a database consumes and ensure that data is
logically stored.
The Normal Forms:
The database community has developed a series of
guidelines for ensuring that databases are
normalized. These are referred to as normal forms
and are numbered from one (the lowest form of
normalization, referred to as first normal form or 1NF)
through five (fifth normal form or 5NF). In practical
applications, you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form
is very rarely seen and won't be discussed in this article.
Before we begin our discussion of the normal forms, it's important to point out that they are
guidelines and guidelines only. Occasionally, it becomes necessary to stray from them to meet
practical business requirements. However, when variations take place, it's extremely important
to evaluate any possible ramifications they could have on your system and account for possible
inconsistencies. That said, let's explore the normal forms.
First Normal Form (1NF)
First normal form (1NF) sets the very basic rules for an organized database:
For more details, read Put t ing your Dat abase in First Normal Form
Second Normal Form (2NF)
Second normal form (2NF) further addresses the
concept of removing duplicative data:
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique
column or set of columns (the primary key).
of a table and place them in separate tables.
Create relationships between these new tables and
their predecessors through the use of foreign keys.
Third Normal Form (3NF)
Third normal form (3NF) goes one large step further:
Meet all the requirements of the second normal form.
Remove columns that are not dependent upon the primary key.
Boyce-Codd Normal Form (BCNF or 3.5NF)
The Boyce-Codd Normal Form, also referred to as the "third and half (3.5) normal form", adds
one more requirement:
Meet all the requirements of the third normal form.
Every determinant must be a candidate key.
Fourth Normal Form (4 NF)
Finally, fourth normal form (4NF) has one additional requirement:
Meet all the requirements of the third normal form.
A relation is in 4NF if it has no multi-valued dependencies.
Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must
first fulfill all the criteria of a 1NF database.
Meet all the requirements of the first normal form.
Remove subsets of data that apply to multiple rows