5.2 Type

The type of a relationship encapsulates the semantics and the existence dependency of related entities. There are two basic types of relationships, which are distinguished in modeling:

  • Referential relationships (horizontal)
  • Hierarchical relationships (vertical)

5.2.1 Referential Relationships

Referential relationships are horizontal, non-hierarchical, "regular" relationships. In UML, they're called associations. An association describes a general relationship between entities and is the weakest type of referential relationship. The following image displays the ERD symbol generally used for referential relationships (in Chen notation):

Referential Relationship (Chen Notation)

General referential relationships and associations represent a loose coupling between subjects. However, there are two additional referential relationship sub types (originally defined by the UML), which define tightly coupled relationships with whole-part semantics:

  • Aggregation (shared)
  • Composition (unshared)

Aggregation can best be described as a "shared, existence-independent part-of" relationship: the referencing entities are semantically "part-of", but they can exist independently of each other. This also means that the parts generally aren't deleted when the parent is. Thus, aggregate parts can have one, several, or no parents at all. If the last parent is deleted, the parts might be deleted as well, but that is just a policy, not a rule that always holds.

Composition, on the other hand, defines an "unshared, existence-dependent part-of" relationship: a part entity can have only one associated whole entity, and if the whole is deleted, the part is deleted too. The existence dependency is the only difference between aggregation and composition, but it's key. It's also why composite parts can have only one whole. Consequently, parts can't exist on their own: they must have an owning entity.
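
The deletion behavior of the two sub types can be sketched in the relational model. The following is a minimal sketch, not a definitive implementation, using Python's standard sqlite3 module. The tables invoice/invoice_line (composition) and team/player/team_player (aggregation), as well as the function run_demo, are hypothetical names chosen for illustration. Mapping composition to a mandatory foreign key with ON DELETE CASCADE and aggregation to a join table is an assumption here, one common convention among several.

```python
import sqlite3


def run_demo() -> dict:
    """Delete one whole of each kind and report which parts survive."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Composition (assumed mapping): a line belongs to exactly one
        -- invoice and is deleted together with it.
        CREATE TABLE invoice (id INTEGER PRIMARY KEY);
        CREATE TABLE invoice_line (
            id         INTEGER PRIMARY KEY,
            invoice_id INTEGER NOT NULL
                       REFERENCES invoice (id) ON DELETE CASCADE
        );

        -- Aggregation (assumed mapping): a player may belong to zero, one,
        -- or several teams; only the membership rows depend on the team.
        CREATE TABLE team (id INTEGER PRIMARY KEY);
        CREATE TABLE player (id INTEGER PRIMARY KEY);
        CREATE TABLE team_player (
            team_id   INTEGER NOT NULL REFERENCES team (id) ON DELETE CASCADE,
            player_id INTEGER NOT NULL REFERENCES player (id),
            PRIMARY KEY (team_id, player_id)
        );
        """
    )

    conn.execute("INSERT INTO invoice (id) VALUES (1)")
    conn.executemany(
        "INSERT INTO invoice_line (id, invoice_id) VALUES (?, 1)", [(1,), (2,)]
    )
    conn.execute("INSERT INTO team (id) VALUES (1)")
    conn.execute("INSERT INTO player (id) VALUES (1)")
    conn.execute("INSERT INTO team_player (team_id, player_id) VALUES (1, 1)")

    # Delete both wholes.
    conn.execute("DELETE FROM invoice WHERE id = 1")
    conn.execute("DELETE FROM team WHERE id = 1")

    def count(table: str) -> int:
        # Table names are fixed literals in this sketch, never user input.
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

    result = {
        "invoice_lines": count("invoice_line"),  # composite parts
        "players": count("player"),              # aggregate parts
        "memberships": count("team_player"),     # links to the deleted team
    }
    conn.close()
    return result


if __name__ == "__main__":
    print(run_demo())
```

In this sketch, the invoice lines disappear with their invoice, while the player survives the deletion of the team and simply ends up with no parent.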

Aggregation is really hard to distinguish from association and composition, which are relatively well-defined. Martin Fowler, the author of the book "UML Distilled", writes the following about aggregation:

One of the most frequent sources of confusion in the UML is aggregation and composition. It's easy to explain glibly: Aggregation is the part-of relationship. It's like saying that a car has an engine and wheels as its parts. This sounds good, but the difficult thing is considering what the difference is between aggregation and association.

In pre-UML days, people were rather vague on what was aggregation and what was association. Whether vague or not, they were always inconsistent with everyone else. As a result, many modelers think that aggregation is important, although for different reasons. So the UML included aggregation (Figure 5.3) but with hardly any semantics. As Jim Rumbaugh says, "Think of it as a modeling placebo" [Rumbaugh, UML Reference].

Aggregation and Composition

As well as aggregation, the UML has the more defined property of composition. In figure 5.4, an instance of Point may be part of a polygon or may be the center of a circle, but it cannot be both. The general rule is that, although a class may be a component of many other classes, any instance must be a component of only one owner. The class diagram may show multiple classes of potential owners, but any instance has only a single object as its owner.

You'll note that I don't show the reverse multiplicities in Figure 5.4. In most cases, as here, it's 0..1. Its only other possible value is 1, for cases in which the component class is designed so that it can have only one [other] class as its owner.

The "no sharing" rule is the key to composition. Another assumption is that if you delete the polygon, it should automatically ensure that any owned Points also are deleted.

Composition is a good way of showing properties that own by value, properties to value objects (page 73), or properties that have a strong and somewhat exclusive ownership of particular other components. Aggregation is strictly meaningless; as a result, I recommend that you ignore it in your own diagrams. If you see it in other people's diagrams, you'll need to dig deeper to find out what they mean by it. Different authors and teams use it for very different purposes.

Note that I really dislike the composition example Mr. Fowler presents. It's not a real-world example: you would never create two distinct classes named Polygon and Circle without a Shape super class, which would hold the Points. As a consequence, the statement that "in most cases, as here, it's 0..1" isn't true. It only holds when two distinct classes or entity types reference another one. Such designs are the exception rather than the rule. Thus, for compositions, the multiplicity is usually 1..1 (multiplicity is the combination of cardinality and optionality).

Technically, aggregation is closer to association than to composition, even though aggregation and composition share the UML diamond symbol. The diamond simply means that the entities on one side are the wholes and those on the other side are the parts. In an association, entities from both entity sets are treated equally. The fill, which represents existence dependence, has much more profound consequences for the creation and deletion of entities than the "part of" semantics alone, which basically just define the direction of the references (navigability).

At this point, you might ask: what about a fill for associations? If the fill means "existence dependence" and the diamond means "part of", where are existence-dependent associations? Well, existence dependence automatically means that one entity is the whole and another is the part. An existence-dependent association either doesn't exist or it automatically becomes a composition, whichever way you think about it:

Association, Aggregation, and Composition (View 1)

Semantics     Existence-Dependence
              Shared         Unshared
No Part Of    Association    Composition
Part Of       Aggregation    Composition

Because associations are always loosely coupled and on equal levels, existence-dependence can't really be applied to associations. Thus, it's better to imagine existence-dependence as a sub-property of the "part of" property:

Association, Aggregation, and Composition (View 2)

Semantics           Type
No Part Of          Association
Shared Part Of      Aggregation
Unshared Part Of    Composition

In the end, it's up to you to decide whether you really need aggregations or not. I use them, even though they complicate the decision between association and aggregation, because they help differentiate whole-part relationships from associations whose entities are treated equally. In section 13.4 Modeling Relationships, I describe a pretty sophisticated heuristic for finding out which relationships really are associations, aggregations, or compositions.

Note that I decided to use the ER notation in this work, even though it has no symbols to represent aggregation and composition. If you really need these on your ER diagrams, you might want to mark them in the relationship names by using a naming convention.

5.2.2 Hierarchical Relationships

Hierarchical relationships define vertical inheritance relationships between two or more entity types. There are two basic types of hierarchical relationships: single and multiple inheritance. Only a few programming languages support multiple inheritance, and even among programmers who use those languages, not all make use of it. Multiple inheritance has several implications that complicate development. It gets even more complicated when applying multiple inheritance to the relational model and/or SQL.

The SQL standard doesn't support multiple inheritance per se: ... SQL:1999 ... supports only single inheritance. SQL:1999 doesn't provide a good workaround for this issue, and it remains a problem for some applications to resolve in other ways. There are solutions that implement multiple inheritance via an intermediate table, much like a many-to-many join table. However, many developers neither like nor need multiple inheritance. I don't use it, so I will only consider single inheritance in this paper.

The UML defines two inheritance relationship types called generalization and realization. A generalization is a hierarchical relationship between (abstract) classes, while a realization denotes a (partial) implementation of an interface. Interfaces cannot be represented by plain ERDs, the relational model, or plain SQL, because interfaces have no attributes. Thus, only generalizations can be modeled (they in turn usually do have attributes). The following image shows the symbol used for hierarchical relationships (in Chen notation):

Hierarchical Relationship (Chen Notation)

Hierarchical relationships have two additional, important properties:

  • Partial vs. total (vertical)
  • Disjoint vs. non-disjoint (horizontal)

"Partial" and "total" define a vertical inheritance property. In a partial hierarchical relationship, the super type's entities can exist independently of the sub types, while in a total inheritance relationship, the sub and super entities cannot exist without each other. In object-oriented terms, a total inheritance relationship's super table represents an abstract class. A partial inheritance relationship's super table would resemble a normal, instantiable super class.

"Disjoint" and "non-disjoint" define a horizontal inheritance property, which concerns the sub entities only. In a disjoint hierarchical relationship, sub entities can only exist in an "exclusive or" (XOR) fashion: they can exist in one sub table, but not in another at the same time. In a non-disjoint hierarchical relationship, there can be sub entities in one, some, or all of the sub tables, sharing the same super entity. There's no direct object-oriented equivalent for the disjoint/non-disjoint inheritance property.

All Total/Partial and Disjoint/Non-Disjoint Combinations (Chen Notation)
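
Both properties can be expressed as simple checks on the primary keys stored in the super table and the sub tables. The following is a minimal sketch under stated assumptions: each table is represented only by the set of its keys, and the function names is_total and is_disjoint are hypothetical, chosen for illustration.

```python
from typing import Dict, Set


def is_total(super_keys: Set[int], sub_tables: Dict[str, Set[int]]) -> bool:
    """Total: every super entity also appears in at least one sub table.

    If this returns False, the hierarchy is partial.
    """
    covered = set().union(*sub_tables.values())
    return super_keys <= covered


def is_disjoint(sub_tables: Dict[str, Set[int]]) -> bool:
    """Disjoint: no super entity appears in more than one sub table.

    If this returns False, the hierarchy is non-disjoint.
    """
    seen: Set[int] = set()
    for keys in sub_tables.values():
        if seen & keys:  # a key was already found in an earlier sub table
            return False
        seen |= keys
    return True
```

For example, with the super keys {1, 2, 3, 4} and the sub tables {"person": {1, 2}, "club": {3}}, the hierarchy is partial (entity 4 has no sub entity) and disjoint (no key appears twice).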

Sub classes or sub tables are created by what the UML knows as a discriminator. A discriminator is the logical feature by which sub entity types are differentiated. A discriminator can only be set for disjoint inheritance (total or partial alike).

Example: consider the contactable entities persons, clubs, and facilities. A contactable is either a person, a club, or a facility, but never two of them at the same time, so you would store the sub type as a discriminator column in the super table. When selecting a super entity (contactable), you can deduce the respective entity sub type from the discriminator.
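
The contactable example can be sketched with Python's standard sqlite3 module. This is a minimal sketch, not a definitive implementation. The column names (type, last_name, name), the functions build_schema and subtype_of, and the use of a composite foreign key on (id, type) to enforce disjointness are all assumptions made for illustration; the text above only prescribes the discriminator column in the super table.

```python
import sqlite3
from typing import Optional


def build_schema() -> sqlite3.Connection:
    """Create the super table and its three sub tables in memory."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Super table: "type" is the discriminator column.
        CREATE TABLE contactable (
            id   INTEGER PRIMARY KEY,
            type TEXT NOT NULL
                 CHECK (type IN ('person', 'club', 'facility')),
            UNIQUE (id, type)  -- target of the composite foreign keys
        );

        -- Each sub table pins its own discriminator value, so a sub row
        -- can only reference a super row of the matching type.
        CREATE TABLE person (
            id        INTEGER PRIMARY KEY,
            type      TEXT NOT NULL DEFAULT 'person' CHECK (type = 'person'),
            last_name TEXT NOT NULL,
            FOREIGN KEY (id, type) REFERENCES contactable (id, type)
        );
        CREATE TABLE club (
            id   INTEGER PRIMARY KEY,
            type TEXT NOT NULL DEFAULT 'club' CHECK (type = 'club'),
            name TEXT NOT NULL,
            FOREIGN KEY (id, type) REFERENCES contactable (id, type)
        );
        CREATE TABLE facility (
            id   INTEGER PRIMARY KEY,
            type TEXT NOT NULL DEFAULT 'facility' CHECK (type = 'facility'),
            name TEXT NOT NULL,
            FOREIGN KEY (id, type) REFERENCES contactable (id, type)
        );
        """
    )
    return conn


def subtype_of(conn: sqlite3.Connection, contactable_id: int) -> Optional[str]:
    """Deduce the sub type of a super entity from its discriminator."""
    row = conn.execute(
        "SELECT type FROM contactable WHERE id = ?", (contactable_id,)
    ).fetchone()
    return row[0] if row else None
```

With this schema, a contactable stored as a person can get a row in the person table, while an attempt to add a club row with the same id fails the foreign key check, because no (id, 'club') pair exists in the super table.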

Note that the crow's foot notation and other notations that target logical and physical models don't have any symbols for hierarchical relationships. You can read more about the differences between ERD relationship notations in section 5.6 Epilog on Relationship Notations.

Last updated: 2010-10-13