SQL Database Design » 14. Normalization » 14.6 Third Normal Form

14.6 Third Normal Form

It is the third normal form (3NF) that makes a difference in database design. It defines another specific type of functional dependencies. Pascal defines 3NF as following: If all [non-key] columns are functionally dependent directly on the whole key ..., a table is in third normal form (3NF).

This means, a 2NF table might still contain indirect or transitive functional dependencies. Such dependencies occurr, if the table contains functional dependencies between non-key columns, which by themselves are dependent on the whole key (as demanded by 2NF). These non-key functional dependencies further hint distinct entity types. You get rid of the dependencies by splitting the table in two, with one table referencing the other, usually via a one-to-many relationship.

As mentioned before, there are even more normal forms, which I will not go into. Usually, reaching 3NF is sufficient, even for larger systems. As Pascal put it: Otherwise put, it is the physical implementation of the DBMS that slows down performance, not the logical normalization of tables.

Performance concerns can still be addressed after design. The cause of bad performance is not necessarily normalization itself, but rather notoriously slow SQL techniques like SELECT * ..., the SELECT DISTINCT ... (because the database is full of duplicates), correlated (dependent) sub selects (use derived tables instead), use of non-deterministic functions (those that create values depending on the DBMSs current state) like CURRENT_DATE(), or the use of aggregate functions on indexed columns (indices cannot be used with them).

Many DBMSs offer optimization techniques beyond those few described. One technique worth mentioning are clustered tables, which are basically stored joins. There are definitely many more techniques of which I have not heard before.

References for Normalization:

Last updated: 2010-08-04