SQL Database Design » 10. Constraints

10. Constraints

Table of Contents

Real world rules are represented in databases as integrity constraints. Constraints are an integral part of the logical model. As Pascal writes: Their name derives from their function, which is to constrain database values to only those that yield rows representing true propositions (facts). In other words, constraints restrict the values a single or a set of columns can take.

Relational constraints can be classified into four categories (by scope):

  • Domain constraints
  • Column constraints
  • Table constraints
  • Database constraints

Note, that these are the constraint categories as the relational theory and not SQL knows them. These categories are distinguished by their execution scope, that is, the time when the defined constraints are checked. I will try to be brief on explaining them.

Domain constraints are the definition of the domain (data type). They can only take the values as specified via domain or inbuilt type. Column constraints define the application of a domain (data type) to a concrete column: "Column X is derives values from domain Y". Table constraints affect one to n rows of one and the same table. The two types of table constraints are based on the scope of the check that must be performed to enforce them: single-row, e.g. NOT NULL, or multi-row, e.g. primary key constraints (think how many rows in a table must be checked to determine if a violation has occurred). Database constraints involve checking rows across multiple tables, e.g. foreign key constraints.

Practitioners will only be familiar with the SQL CONSTRAINT keyword. Do not confuse constraint types based of relational theory with those in SQL or the products based on them - they are not properly defined and confuse. The following constraints are available in SQL:

  • Primary keys
  • Foreign keys
  • Unique constraints
  • Check constraints
  • Not-null constraints

In SQL, you can only formally associate PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and NOT NULL constraints with an SQL CONSTRAINT clause. The DEFAULT keyword does not belong to them. As stated before, relational and SQL constraints cannot be easily assigned/mapped to each other. I will give it a try nontheless.

Not-null constraints are either domain or column constraints, because they affect the domain or column, depending on where the constraint is specified. Primary keys, unique and check constraints are column or table constraints, because they affect one or more columns and are inside the scope of the table (this could be inexact). Foreign keys definitely are database constraints, because they affect several tables.

Column, table and database constraints are the ones you specify with the SQL CREATE TABLE statement. They come in two flavors:

  • Inline declarations
  • Out-of-line declarations

These two types are often referred to as column constraints and table constraints, which is, strictly speaking, wrong. Constraints apply to logical models, so they are semantic features. Constraint declarations in a CREATE TABLE statement are purely syntactical. If at all, they should be called column-level constraints and table-level constraints.

The difference between inline- and out-of-line constraints is the location where they are specified. Inline constraints are specified right along with the column definitions in a CREATE TABLE statement, like ..., id INTEGER [CONSTRAINT name] PRIMARY KEY, .... Out-of-line constraints are declared inbetween columns or at the end of the table, like ... , [CONSTRAINT name] PRIMARY KEY (id), .... The (standard) SQL syntax for specifying inline constraints is:

  [ CONSTRAINT constraint_name ]
  { NOT NULL |
    UNIQUE |
    PRIMARY KEY |
    DEFAULT value |
    CHECK ( condition ) |
    REFERENCES table [ ( column ) ]
               [ MATCH FULL | MATCH PARTIAL ]
               [ ON DELETE action ]
               [ ON UPDATE action ]
               [ DEFERRABLE | NOT DEFERRABLE ]
               [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
  }

The (standard SQL) syntax for out-of-line constraints is:

  [ CONSTRAINT constraint_name ]
  { UNIQUE ( column [, ...] ) |
    PRIMARY KEY ( column [, ...] ) |
    CHECK ( condition ) |
    FOREIGN KEY ( column [, ... ] )
                REFERENCES table [ ( column [, ... ] ) ]
                [ MATCH FULL | MATCH PARTIAL ]
                [ ON DELETE action ]
                [ ON UPDATE action ]
                [ DEFERRABLE | NOT DEFERRABLE ]
                [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]

The above code samples wer taken from the PostgreSQL manual. As you can see, the NOT NULL constraint and the DEFAULT value can only be specified inline. Note again, even though the DEFAULT clause is specified here, it is not a constraint as described in the 9. Default Values section. Of course, constraints may span multiple columns, but they can only be specified at the table level. Column-level constraints can also be written as table constraints, while the reverse is not necessarily possible.

The following source further states: The difference is rarely noticed by users, column-level constraints are checked only if a value in the column is being modified, while the table-level constraints are checked if there is any modification to a row, regardless of whether or not it changes the column in question. I am not sure if this is Oracle-specific behavior.

The constraint keyword and name are always optional. I cling to remember sources, that claimed constraint names are mandatory for multi-column constraints, although I do not know why this should be the case. A constraint can only be altered or dropped if you have previously given it a name. Many DBMSs generate a constraint name if you do not specify one. It is always good practice to name them yourself, although I have never seen anyone giving names to NOT NULL constraints (or default values, if the DBMS allows it at all, which I have seen).

Last updated: 2010-08-04