PostgreSQL data types: what are they, and when to use each

Enforcing strict data types is one of the major advantages of relational databases, and PostgreSQL is one of the most popular open-source relational database options. In this article, we’ll look at many of the most commonly used data types in Postgres, how they’re used, and even how they map to more advanced distributed SQL databases.

(Note that when in doubt, you should always refer to Postgres’s official documentation for the latest information).

Boolean data types

The boolean data type stores true/false values. Note that this datatype also allows inputs of yes, on, and 1 (all equivalent to true) and no, off, and 0 (all equivalent to false).

Character data types in Postgres

All of these data types are used for storing strings. Generally speaking, varchar(n) is the right choice if you need to store strings that you know will remain constrained to a fixed length, and text is the right choice otherwise.

The difference between char(n) (and its aliases) and varchar(n) (and its aliases) is that char(n) is blank-padded, meaning that any strings with fewer than n characters will have spaces appended to the end until they reach n length.

So, for example, if we created two columns in a Postgres table, one with the data type char(3) and one varchar(3), and then we added the string hi to a row in both columns, the char(3) column would store that as hi (with a space appended to the end to make it three characters), whereas the varchar(3) column would store it as hi (no spaces appended). Numeric data types

Integer data types

As you can see, all of these data types store integer values. The difference between them is the range of values they can store, and consequently the amount of storage space they take up.

For most use cases, integer is probably the right choice, but smallint will save space if you know your values will all fit in that range, and bigint is there for those who need to store truly massive numbers.

Auto-incrementing number data types

These data types are all used to generate auto-incrementing numbers. Again, the only difference between them is the amount of space they take up, and again, there are specific small and big options, but serial is probably the best choice for most use cases.

Decimal data types

These data types all allow for the storage of numbers with decimals (i.e., not integers). decimal and numeric are the same, and should be used when precision is required, such as in financial applications. real and double precision take up less space, but also allow for fewer decimal places and thus may introduce imprecision, depending on the values you’re storing.

Time and date data

These data types store time and duration data. Note that timetz and timestamptz can also be written as time with timezone and timestamp with timezone, respectively.

All time data types except date allow for microsecond-level precision, and can accept an optional provision value p (e.g. timetz [(p)]) to specify the number of fractional digits to store in the seconds field.

date, for reasons that are probably obvious, stores values with a precision of one day.

Note also that in addition to traditional dates and times, date and timestamp can accept some special (and self-explanatory) values such as now, yesterday, tomorrow, etc. now will be stored as the current time; yesterday, today, and tomorrow are stored as 0:00 UTC on the relevant date.

UUIDs

The uuid data type is a special type of ID. We’ve got a whole blog post on UUIDs with details on what they are, how they’re used, the different types of UUIDs, etc., so we won’t repeat all that here. The short version, however, is this: UUIDs provide ID values that are highly likely to be unique, which is useful for distributed systems (see that blog post for lots of details), or any system that is likely to merge new data at any point.

(Imagine, for example, your company has a database with sequential IDs using serial. You then acquire another company and need to merge the two databases, but discover the other company has also used serial IDs. Now you have a bit of a mess to sort out since you can’t have the same row ID for two separate rows, and each database will have a 1 row, a 2 row, etc. Conversely, if both databases used UUIDs, you could easily merge the two with essentially no chance that two rows could have the same id value.)

JSON data types

Postgres can store JSON data in two formats: json and jsonb.

These two formats are effectively identical, except that jsonb is stored in a decomposed binary format that makes it slower to write but faster to read. jsonb also supports indexing, whereas json does not.

Arrays

Postgres allows you to store most data types as arrays – in other words, lists of values of that data type. Arrays are declared using [], so for example, here’s how to create a table of users that allows for multiple phone numbers to be input:

CREATE TABLE users ( user_id uuid PRIMARY KEY, name TEXT phones TEXT [] );

Array values would be inserted into this table like so:

INSERT INTO users (name, phones) VALUES('Ellen Ripley', ARRAY ['(555)-555-5555','(555)-555-5556']);

To declare array values, we can also use{} like so:

INSERT INTO users (name, phones) VALUES('Ellen Ripley', '{“(555)-555-5555”,”(555)-555-5556”}’);

Special data types in Postgres

Postgres also contains quite a few other data types, although they’re typically only needed in specific circumstances. For example, the tsvector and tsquery data types store lexemes (words normalized such that different variants of the word are stored as one word). These data types are useful for efficient text search.

Postgres also provides specific data types for network addresses, geometric shapes, enumerated types, XML types, and more. There are also a number of pseudo-types, which cannot be used as the data type for a column but can be used to define a function’s argument or results.

Data types in Postgres vs. CockroachDB

CockroachDB is a distributed SQL database that’s Postgres compatible – imagine Postgres that was designed for the cloud, that auto-shards and can accept writes on every shard, and that can scale elastically. (Sound interesting? Check it out!)

CockroachDB supports Postgres’s data types, although in some cases it uses different names. For example, varchar and char in PostgreSQL are equivalent to STRING in CockroachDB.

For the full list of data types in CockroachDB, refer to our documentation.

Note also that because CockroachDB is a distributed database, the storage size and performance notes in this article may not be the same. Since CockroachDB is Postgres-compatible, migrating a Postgres database onto CockroachDB is quite straightforward, but some schema adjustment may be required to maximize performance (But don’t worry, we have an automated migration tool to help with this).