SQL Database Design » 2. Terminology » 2.1 Related Database Terms

2.1 Related Database Terms

When creating database designs, you deal with recurring concepts over and over: data types, properties, entity types, relationships etc. When modeling, the following abstract features are usually dealt with:

  • Data types
  • Properties
  • Instance formats
  • Instance types
  • Instances
  • Instance sets

A data type represents a range of values. A property is the combination of an identifier plus a data type. An instance format is a sorted set of properties. An instance type is approximately a name plus format plus miscellaneous meta information. An instance is a single element of a kind (entity or relationship). An instance set is a full collection of all existing elements of a kind. These abstract terms allow systemization of all terminology concerning databases in books, magazines, scripts, and other Internet texts.

There are a number of "domains" in which these abstract terms have concrete terms. The following domains are relevant:

  • Relational databases (here vaguely the same as SQL)
  • Relational model
  • Entity-Relationship Diagrams (ERDs)
  • Unified Modeling Language (UML)

The entity-relationship (ER) notation is the most complete term-wise. I borrowed some abstractions for others from it. Some domains don't necessarily define equivalent terms at all, e.g. the object-oriented UML defines concepts, which the relational model or ER notation don't. For each undefined term, the closest equivalent is denoted by enclosing it within parentheses.

Comparison of Related Database Terms
FeatureRelational Database (SQL)Relational ModelEntity Relationship NotationUnified Modeling Language
Data TypeDomainDomainDomainClass or simple type
Property[1]ColumnAttributeAttributeAttribute
Instance FormatSet of all columnsSet of all attributesEntity format (set of all attributes)(Set of all attributes)
Instance Type(Table name + format)[2]Relation type?Entity type / declaration [name + entity format + key]Class
Single InstanceRowTupleEntityObject
Multiple InstancesSet of rows (table contents)[6]Relation (set of tuples)[5]Entity set(Set of objects)[7]
Reference FormatSet of all columnsSet of all attributesRelationship format (set of all attributes)(Set of all attributes)
Reference Type(Table name + format)[2]Relation type?Relationship type / declaration [name + relationship format + set of related entity types]Associative class
Single ReferenceRow with foreign key/s[3]Tuple with foreign key/s[3]RelationshipAssociation, aggregation, composition, generalization[4] or associative object
Multiple ReferencesSet of rows (table contents)[6]Relation (set of tuples)[5]Relationship set (set of relationships)(Set of associations, aggregations, compositions, generalizations[4] or set of associative objects)[7]

Notes:

  1. A property (attribute) is an identifier plus a data type.
  2. The closest equivalent of a type in a relational database is the table, as it keeps the table name and attributes. (To be precise, the data dictionary does.)
  3. For each reference row/tuple, there are "degree # of foreign keys".
  4. The UML also knows "realizations", which are interface-based generalizations. However, the relational model and SQL have no means to represent interfaces, making realizations inapplicable/irrelevant.
  5. A relation is a mathematically defined set of tuples.
  6. A table's contents is approximately equivalent to a relation (which is the optimum SQL implementation of a relation). The difference between them is, that the relational model defines relations as real sets, which do not allow duplicates, in contrast to SQL tables.
  7. In object-oriented programming, there are no natural containers for all objects that have been instantiated from a specific class. You would have to create and manage a collection (array, set, list, ...) of all instances manually.

Note that even though I laid out the differences between terms in this section, I caught myself mixing certain terms. The following terms are regularly used interchangably:

  • Entity and row
  • Entity set, entity type, and table
  • Property, attribute, and column

Using entities and tables interchangably is probably the most prominent case, even though it's technically inaccurate. It's not easy to stay consistent at all times, as you might discover yourself.

Last updated: 2010-10-13