Database Normalization
Paul Brown · 07/02/2024 · 1 min read
This post makes no effort to be the final word on normalization, nor does it aim to be exhaustive. It is, rather, a synthesis of existing resources written in plain English as a reference point for the author. More comprehensive coverage of the topic can be found on the corresponding Wikipedia page and in a video on the Decomplexify YouTube channel.
Normalization
Normalization is the process of organizing information in a database. Creating tables and establishing relationships in a way that eliminates redundancy and are flexible for future extension are all part of database normalization.
Sets of rules exist to guide users through database normalization. These rules, when grouped, are called 'normal forms'. The most commonly discussed normal forms range from the first normal form through to the fifth normal form. Adhering to each successive form provides diminishing returns: regarding normalization, the most important thing for database design is that it conforms to the first normal form; the second most important things is that it conforms to the second normal form and so on.
The following sections detail these rules
First normal form
- Database rows are not inherently ordered: they are not permittted to convey order information
- Each database column (sometimes called a 'field') must hold only one type of data. A 'type' in this case is something like an integer (there are various types -- tinyint, int, bigint) or a datetime or a string (of various specified lengths).
- Each table must have a primary key.
- Repeating groups are not permitted. That is, columns with names like 'customer01', 'customer02', 'customer03' and so on are not allowed.
Discussions
Login to Post Comments