SQL Database Design » 7. Standard Data Types

7. Standard Data Types

Table of Contents

By "standard" I mean two things. First, I mean the data types that developers usually deal with: strings, integral numbers, fractional numbers, booleans, dates, times, binary types, enumerated types, collective, and structured types. Second, the SQL standard defines keywords for most of these types that make them portable. In databases, there are often several keywords available for similar or exactly the same data types. This section will help you decide for the right data types, taking into account the actual DBMS and software requirements.

SQL is a high-level language whose types were originally meant to be conceptual. There is basic truth to this, but in practice, some of them actually include concrete sizes or a relative order: data types such as SMALLINT and BIGINT are defined to be smaller and bigger than a regular INTEGER. Other examples include VARCHAR and CLOB, which represent smaller and larger strings. There seems to be only a vague agreement on what is a conceptual, a logical, or a physical type.

SQL differs from regular programming languages where usually exactly one keyword implements a certain data type. Due to its history SQL has many synonyms for one and the same data type. If you're a hawk eye you will already have noticed, that not only SQL, but many programming languages have data type naming inconsistencies: Java uses short and long forms int and boolean, C# (and PHP5 type hinting documentation) use both short forms int and bool, and Pascal uses both long forms integer and boolean. (I was too lazy to find a language that eventually uses the long and short forms integer and bool.)

In SQL, the consequences of having to decide between so many, often equivalent, data types causes so many code to be unportable when the decision was bad. Overall, choosing a more portable type over a proprietary data type has more advantages than disadvantages. Portability isn't only about the data itself, but it's also about the people that use it in the long run. Using portable data types makes it easier for new developers to make the transition from other products, so standard SQL actually saves money.

Of course, nothing has only advantages. Portability usually comes at the price of storage optimization and/or performance. Even though it's sometimes impossible to find a 100 percent portable solution, the goal is to maximize portability by avoiding DBMS-specific types as much as possible, which this chapter is all about.

In each section, I will evaluate the pros and cons to find the most portable solution. The final decision isn't always easy, so the final decision will be up to you. After reading through the following sections, you should be able to define tables and columns, that can be implemented on any modern DBMS with a maximum of portability.

Last updated: 2010-10-13