PostgreSQL data types: what are they, and when to use each

PostgreSQL data types: what are they, and when to use each

Cloud-native Postgres?

What if you could get all the advantages of Postgres, PLUS automated sharding, elastic scaling, and a multi-active setup that allows every node to accept reads AND writes for extreme resilience?

You can.

PostgreSQL data types: what are they, and when to use each

Enforcing strict data types is one of the major advantages of relational databases, and PostgreSQL is one of the most popular open-source relational database options. In this article, we’ll look at many of the most commonly used data types in Postgres, how they’re used, and even how they map to more advanced distributed SQL databases.

(Note that when in doubt, you should always refer to Postgres’s official documentation for the latest information).

Boolean data types

Data type Size Values allowed
boolean 1 byte true and false (plus aliases, see below)

The boolean data type stores true/false values. Note that this datatype also allows inputs of yes, on, and 1 (all equivalent to true) and no, off, and 0 (all equivalent to false).

Character data types in Postgres

Data type Values allowed
character varying(n), varchar(n) Any characters, maximum length of n characters
character(n), char(n), bpchar(n) Any characters, maximum length of n characters, blank-padded
bpchar Any characters, no length limit, blank-trimmed
text Any characters, no length limit

All of these data types are used for storing strings. Generally speaking, varchar(n) is the right choice if you need to store strings that you know will remain constrained to a fixed length, and text is the right choice otherwise.

The difference between char(n) (and its aliases) and varchar(n) (and its aliases) is that char(n) is blank-padded, meaning that any strings with fewer than n characters will have spaces appended to the end until they reach n length.

So, for example, if we created two columns in a Postgres table, one with the data type char(3) and one varchar(3), and then we added the string hi to a row in both columns, the char(3) column would store that as hi (with a space appended to the end to make it three characters), whereas the varchar(3) column would store it as hi (no spaces appended).

Numeric data types

Integer data types

Data type Size Values allowed
smallint 2 bytes -32768 to +32767
integer 4 bytes -2147483648 to +2147483647
bigint 8 bytes -9223372036854775808 to +9223372036854775807

As you can see, all of these data types store integer values. The difference between them is the range of values they can store, and consequently the amount of storage space they take up.

For most use cases, integer is probably the right choice, but smallint will save space if you know your values will all fit in that range, and bigint is there for those who need to store truly massive numbers.

Auto-incrementing number data types

Data type Size Values allowed
smallserial 2 bytes 1 to 32767
serial 4 bytes 1 to 2147483647
bigserial 8 bytes 1 to 9223372036854775807

These data types are all used to generate auto-incrementing numbers. Again, the only difference between them is the amount of space they take up, and again, there are specific small and big options, but serial is probably the best choice for most use cases.

Decimal data types

Data type Size Values allowed
decimal variable 131,072 digits before the decimal point; 16,383 after it
numeric variable 131,072 digits before the decimal point; 16,383 after it
real 4 bytes 6 decimal digits
double precision 8 bytes 15 decimal digits

These data types all allow for the storage of numbers with decimals (i.e., not integers). decimal and numeric are the same, and should be used when precision is required, such as in financial applications. real and double precision take up less space, but also allow for fewer decimal places and thus may introduce imprecision, depending on the values you’re storing.

Time and date data

Data type Size Values allowed
timestamp (and timestamptz) 8 bytes 4713 BCE to 294276 ACE
date 4 bytes 4713 BCE 5874897 ACE
time 8 bytes 00:00:00 to 24:00:00
timetz 12 bytes 00:00:00+1559 to 24:00:00-1559
interval 16 bytes -178000000 years to 178000000 years

These data types store time and duration data. Note that timetz and timestamptz can also be written as time with timezone and timestamp with timezone, respectively.

All time data types except date allow for microsecond-level precision, and can accept an optional provision value p (e.g. timetz [(p)]) to specify the number of fractional digits to store in the seconds field.

date, for reasons that are probably obvious, stores values with a precision of one day.

Note also that in addition to traditional dates and times, date and timestamp can accept some special (and self-explanatory) values such as now, yesterday, tomorrow, etc. now will be stored as the current time; yesterday, today, and tomorrow are stored as 0:00 UTC on the relevant date.

UUIDs

The uuid data type is a special type of ID. We’ve got a whole blog post on UUIDs with details on what they are, how they’re used, the different types of UUIDs, etc., so we won’t repeat all that here. The short version, however, is this: UUIDs provide ID values that are highly likely to be unique, which is useful for distributed systems (see that blog post for lots of details), or any system that is likely to merge new data at any point.

(Imagine, for example, your company has a database with sequential IDs using serial. You then acquire another company and need to merge the two databases, but discover the other company has also used serial IDs. Now you have a bit of a mess to sort out since you can’t have the same row ID for two separate rows, and each database will have a 1 row, a 2 row, etc. Conversely, if both databases used UUIDs, you could easily merge the two with essentially no chance that two rows could have the same id value.)

JSON data types

Postgres can store JSON data in two formats: json and jsonb.

These two formats are effectively identical, except that jsonb is stored in a decomposed binary format that makes it slower to write but faster to read. jsonb also supports indexing, whereas json does not.

Arrays

Postgres allows you to store most data types as arrays – in other words, lists of values of that data type. Arrays are declared using [], so for example, here’s how to create a table of users that allows for multiple phone numbers to be input:

CREATE TABLE users (
	user_id uuid PRIMARY KEY,
	name TEXT
	phones TEXT []
);

Array values would be inserted into this table like so:

INSERT INTO users (name, phones)
VALUES('Ellen Ripley', ARRAY ['(555)-555-5555','(555)-555-5556']);

To declare array values, we can also use{} like so:

INSERT INTO users (name, phones)
VALUES('Ellen Ripley', '{“(555)-555-5555”,”(555)-555-5556”}’);

Special data types in Postgres

Postgres also contains quite a few other data types, although they’re typically only needed in specific circumstances. For example, the tsvector and tsquery data types store lexemes (words normalized such that different variants of the word are stored as one word). These data types are useful for efficient text search.

Postgres also provides specific data types for network addresses, geometric shapes, enumerated types, XML types, and more. There are also a number of pseudo-types, which cannot be used as the data type for a column but can be used to define a function’s argument or results.

Data types in Postgres vs. CockroachDB

CockroachDB is a distributed SQL database that’s Postgres compatible – imagine Postgres that was designed for the cloud, that auto-shards and can accept writes on every shard, and that can scale elastically. (Sound interesting? Check it out!)

CockroachDB supports Postgres’s data types, although in some cases it uses different names. For example, varchar and char in PostgreSQL are equivalent to STRING in CockroachDB.

For the full list of data types in CockroachDB, refer to our documentation.

Note also that because CockroachDB is a distributed database, the storage size and performance notes in this article may not be the same. Since CockroachDB is Postgres-compatible, migrating a Postgres database onto CockroachDB is quite straightforward, but some schema adjustment may be required to maximize performance (But don’t worry, we have an automated migration tool to help with this).

About the author

Charlie Custer github link linkedin link

Charlie is a former teacher, tech journalist, and filmmaker who’s now combined those three professions into writing and making videos about databases and application development (and occasionally messing with NLP and Python to create weird things in his spare time).

Keep Reading

High CPU usage in Postgres: how to detect it, and how to fix it

High CPU usage can bring your database – and with it, your application – grinding to a halt.

This is, unfortunately, a …

Read more
SQL cheat sheet for developers, with examples (2023)

Most SQL content on the web seems to be written with data analysts in mind. And that’s fine, but developers need …

Read more
The limitations of PostgreSQL in financial services

PostgreSQL has more than 35 years of active development under its belt making it one of the most powerful and reliable …

Read more