We here at Cockroach Labs have been busy getting CockroachDB more in line with PostgreSQL for v20.1. This allows you to be able to use Cockroach as a drop-in replacement for PostgreSQL (whilst using your favourite existing ORM or driver), but with the powerful, scalable backend we provide.
Time, Timestamp and TimestampTZ have been three data types Cockroach has supported, but were not well matched with PostgreSQL functionality. We were also missing the TimeTZ datatype, as well as precision for time and interval types. Even worse, our results did not match PostgreSQL in some cases for the existing data types we do support.
We’ve spent a lot of the v20.1 release mastering time data types, fixing what was broken through ORM tests and community reports whilst adding new features to bridge that gap. A significant amount of time was spent scratching our heads as we figured out why existing time-related ORM tests were failing against CRDB. We’ve found a lot of devils in the details, with knowledge over these edge cases sparsely located over old SQL mailing lists and StackOverflow.
In this blog post, we’ll explore some of the intricacies of time in PostgreSQL and look at how we reproduce these features using Go and its ecosystem. We’ll share our recommendations for using time along the way.
Are you excited to learn about time in the PostgreSQL world? Join us as a companion in our TARDIS (Time and Related Data Types In SQL…blog post) and hop into an adventure into the world of time. Allons-y!
When you open a connection to Postgres/CRDB, you will have a session between you (the client) and the database (the server). The session will contain a time zone, which you can set with the SET TIME ZONE
command for both PostgreSQL and CRDB.
When connecting using the shell, the PostgreSQL shell (psql) will try to connect and set the session to the local time zone as defined by your computer settings, whereas CRDB will default to UTC. We can observe this with CURRENT_TIMESTAMP
, which returns the current timestamp in the default session time zone:
otan=# SELECT CURRENT_TIMESTAMP; |
root@127.0.0.1:61879/defaultdb> SELECT CURRENT_TIMESTAMP; |
psql shell | CRDB shell |
You can specify your time zone to be a location (which is daylight savings aware), or a UTC offset. This will change the CURRENT_TIMESTAMP to be in the time zone your session is set to:
otan=# SET TIME ZONE 'Australia/Sydney'; |
root@127.0.0.1:61929/defaultdb> SET TIME ZONE 'Australia/Sydney'; |
psql shell | CRDB shell |
In the above example, CRDB and psql now output the same data as SET TIME ZONE is explicitly set to the same things.
Note your ORM or driver may have different default behaviour than the psql shell or the CRDB shell, but should allow you to set a default time zone.
Setting your session time zone will affect some time operations in some fun and exciting ways (which is not necessarily what you want as a developer) which we’ll explore soon.
In the above examples, we have used SET TIME ZONE
with an integer offset (-11) or a location (Australia/Sydney). However, Postgres also supports the syntax for SET TIME ZONE
to have GMT
or UTC
at the front, e.g. SET TIME ZONE 'UTC+3'
. Let’s see how it behaves:
otan=# SET TIME ZONE ‘UTC+3’; | root@127.0.0.1:63094/defaultdb> SET TIME ZONE ‘UTC+3’; |
psql shell | CRDB shell |
Hold on a second, why does it have a time zone of -3 when setting the time zone to UTC+3
?
It turns out that this is because having UTC
or GMT
in front uses the POSIX definition for time zones, which defines zones to be hours west of the GMT line. In essence, the sign is reversed from the time zone structure we know and love, where positive integers represent that the timestamp is east of the GMT line (also known as ISO8601 standard). Note that the POSIX standard also applies if you just add colon separators to the offset in SET TIME ZONE:
otan=# SET TIME ZONE ‘+3:00’; | root@127.0.0.1:51387/defaultdb> SET TIME ZONE ‘+3:00’; |
psql shell | CRDB shell |
The takeaway here is that integers are treated as ISO8601 format, locations do what you expect and anything else is POSIX standard.
Thinking this is a little weird? We’ve got another surprise with this coming later!
The TIMESTAMP (also known as TIMESTAMP WITHOUT TIME ZONE
) and TIMESTAMPTZ (also known as TIMESTAMP WITH TIME ZONE
) types stored as a 64-bit integer as a microsecond offset since 1970-01-01 in CRDB and as a 64-bit integer microsecond offset since 2000-01-01 in PostgreSQL (by default).
As both data types are stored using only 64-bit integers, it is important to note that neither store any time zone information. So where does TIMESTAMPTZ get the time zone from? That’s right — the session time zone. As a corollary, that means storing the TIMESTAMPTZ and fetching the results from a different session time zone results in a different printed result, with the same equivalent UTC offset underneath.
Let’s compare storing a TIMESTAMP / TIMESTAMPTZ and fetching it from a different time zone:
otan=# CREATE TABLE time_comparison(t TIMESTAMP, ttz TIMESTAMPTZ); | root@127.0.0.1:56384/defaultdb> CREATE TABLE time_comparison(t TIMESTAMP, ttz TIMESTAMPTZ); |
psql shell | CRDB shell |
In the psql shell, the time we’ve inserted these entries as 2020-05-13 14:05:23.801845
with offset -07
. The TIMESTAMP column does not regard the time zone, so drops that information. When we switched time zones to UTC-3
, only the ttz
column will change its result by transforming the time into the +3
offset, which is 10 hours ahead, hence now displaying 2020-05-13 18:05:23.801845-03
.
In the CRDB shell, things look largely the same. However, the TIMESTAMP column has an extraneous +00:00 at the front. This is a UX quirk of using the Go driver lib/pq (which is not related to the C library libpq) that powers the CRDB shell — it always displays a time zone when representing any time object since it uses the same format string for all time related types. We’re looking to fix this UX issue in a future release — but have comfort that it represents the same value underneath.
If you’re still confused, don’t worry — it’s confusing to us too. Let’s make it crystal clear.
TIMESTAMP is an absolute value time offset. It does not store time zones, or change based on the session time zone. Another way to think about it is that it is always in UTC.
TIMESTAMPTZ is also an absolute value time offset with no time zone metadata set, but it displays timestamps and performs operations in the session time zone. You’ve seen the “displays timestamp” bit already — we’ll get to the “performs operations” component later.
We’ll omit CRDB shell output for the remainder of this section to avoid duplicate information — but it is worth noting we’ve had bugs in all aspects in areas below which is how we got to explaining it.
Let’s parse the opening ceremony of the Sydney Olympics in PostgreSQL with a fresh new psql shell in California:
otan=# SELECT ‘2000-09-15 19:00’::TIMESTAMP, ‘2000-09-15 19:00’::TIMESTAMPTZ; |
psql shell |
So parsing a timestamp string without any time zone information in the string will automatically default to the current time zone for TIMESTAMPTZ.
Let’s append Sydney’s time zone in September (+11:00) into the string before we cast:
otan=# SELECT ‘2000-09-15 19:00+11:00’::TIMESTAMP, ‘2000-09-15 19:00+11:00’::TIMESTAMPTZ; |
psql shell |
As we see above:
SET TIME ZONE
) but we are displaying TIMESTAMPTZ in the time zone -07:00 as we are in California. With an 18 hour time difference, that places the opening ceremony at 1am local time in California, as you can see from the output. American readers, youse must have been tired watching the Olympics in 2000.Of course, explicitly changing the session time zone with the same strings will change the output for TIMESTAMPTZ:
otan=# SET TIME ZONE ‘Asia/Tokyo’; |
psql shell |
Again, the time zone when parsing TIMESTAMP is completely ignored, but for Tokyo’s time zone of +09:00, we’ve had to rewind the clock from 19:00 to 17:00 to display correctly in the session time zone.
Let’s look at a few casts between TIMESTAMP and TIMESTAMPTZ with the time one of the greatest cricket test match finishes of all time:
otan=# SET TIME ZONE ‘Australia/Adelaide’; |
psql shell |
When we cast from a TIMESTAMP to a TIMESTAMPTZ in Case 1, we have to set the time zone of +10:30. However, this actually changes the UTC time, as “displaying it correctly” in the current session time zone involves subtracting 10:30 from the underlying offset value.
Conversely, when we convert from TIMESTAMPTZ to TIMESTAMP in Case 2, TIMESTAMP represents an absolute UTC value but without the time zone information. As such, we would have to add 10:30 to the underlying offset value for the correct equivalent value to make this equal to the value 17:00.
This behaviour gets tricky, especially when the data is stored and fetched in a session with another session time zone whilst expecting the cast to TIMESTAMP to give you the same result:
otan=# SET TIME ZONE ‘Australia/Adelaide’; |
psql shell |
In the above example, we changed the time zone above from +10:30 to Chicago’s -06:00, so we have a 16:30 time difference. Since we cast to TIMESTAMP using the TIMESTAMPTZ data type, we evaluate it as of the session time zone at the point of evaluation, hence getting 2006-12-05 00:30:00 when casting it in America/Chicago. Note if you want a result to always evaluate to the same TIMESTAMP no matter the session time zone, use the AT TIME ZONE syntax discussed below.
For many of these operations, we need to move timestamps to a different time zone but whilst keeping the same timestamp underneath. Go does not natively do this, as time.In only changes the location, but has the same UTC offset underneath. This means that something at 10:00 cast to a +3 time zone without any offset changes would report as 13:00+3 instead of 10:00+3. This was a source of quite a few of our time bugs!
To do this correctly, we have to do an awkward dance: read the second argument from Zone to get the zone offset in seconds from the timezones before and after and then use time.Add to subtract that duration offset to the new time value.
TIMESTAMP and TIMESTAMPTZ supports microsecond precision. However, there is an option for “rounding” of fractional digits for the seconds component.
To do this, we can specify a number between 0 and 6 inclusive in parenthesis after TIMESTAMP and TIMESTAMPTZ, e.g. TIMESTAMP(3) for TIMESTAMP in millisecond precision allowing 3 fractional digits, or TIMESTAMPTZ(0) for TIMESTAMPTZ with no fractional digits. This will get the datums rounded to the specified precision.
Let’s look at examples using the time when Super Over rules were unfair:
otan=# SELECT ‘2019-07-14 17:00:00.545454’::TIMESTAMP(0); – rounded up |
psql shell |
The equivalent functionality in Go is available using time.Round in the time library.
Functions (e.g. extract
, date_trunc
) and Operators (e.g. =
, +
, -
, >
) are fairly easy to comprehend with TIMESTAMP. However they are a little nuanced with TIMESTAMPTZ, causing a variety of bugs internally in CRDB. As we alluded to earlier, it’s important to remember that TIMESTAMPTZ performs operations in the session time zone.
Let’s look at some time operators, with timestamps around the release of the movie Crocodile Dundee:
otan=# SET TIME ZONE ‘America/New_York’; |
psql shell |
Remember — in all cases, the TIMESTAMP is converted to the TIMESTAMPTZ of the current session time zone, which is ‘1986-09-26 10:00-04’ in New York. Taking a look at each case:
For builtins, let’s look at extract
:
otan=# SET TIME ZONE ‘America/New_York’; |
psql shell |
From the above:
SELECT '1986-09-26 10:00-06'::TIMESTAMPTZ
, we’d see 1986-09-26 12:00-04 as -04 is the session time zone. As such, when moving it to from -02 to -04, the time becomes 12:00 — and since we perform the operation in the session time zone — extract will hence return 12.AT TIME ZONE will convert a TIMESTAMPTZ to a TIMESTAMP at a given time zone, or a TIMESTAMP to a TIMESTAMPTZ at a given time zone (which will be transposed into the session time zone). It is worth noting that TIMESTAMP AT TIME ZONE
This can be useful if you expect users to cast from TIMESTAMPTZ to TIMESTAMP or vice versa from different session time zones but you need consistent offsets from the epoch time (see “Casting” for an example for how this could be a problem).
Confusing? Let’s look at the real examples using the release date of the song Friday:
otan=# SET TIME ZONE ‘Australia/Sydney’; |
psql shell |
From the above:
Remember the surprise earlier with the POSIX standard being used for strings when using SET TIME ZONE? In AT TIME ZONE, omitting the UTC/GMT prefix and having just a bare integer offset (e.g. +3
, -3
, 3
) also behaves as POSIX timestamp (unlike SET TIME ZONE where integers were special and behave as ISO8601):
otan=# SET TIME ZONE ‘+3’; |
psql shell |
We would expect the above case to be 2011-03-14 10:00:00+03:00
if it were ISO8601 when using AT TIME ZONE. However, as +3
is POSIX for AT TIME ZONE, it really means “at time zone 3 hours west of GMT”, to be displayed as “3 hours east of UTC”, hence adding 6 hours to the result.
Now you may be wondering when to use a location versus when to use an absolute offset. Offsets already seem tricky given how it uses POSIX style offsets.
This may help you decide — locations can infer changing time zone information. Let’s look at a dates which traverse time zones in Chicago:
root@127.0.0.1:64900/defaultdb> SET TIME ZONE ‘America/Chicago’; |
psql shell |
With the daylight savings boundary change, we can see that the time zone offset changes. That’s neat, isn’t it?
You may now be wondering — how are these time zone changes encoded? IANA maintains a database of daylight savings changes for each time zone (some of which date back a very long time). But which version of this database do we use? Well:
This means that time and daylight savings behaviours can change between computers when using CRDB if a newer version of the IANA database is on your system. This is currently tracked for a fix, and is one of the reasons we recommend always using UTC as your session time zone.
Let’s look at how daylight savings impacts interval math:
otan=# SET TIME ZONE ‘America/Chicago’; |
psql shell |
Huh — are there not 24 hours in a day? Intervals in Postgres are represented as “months”, “days” and “seconds”, which plays a role in what we see here. Let’s see how this applies to the cases above:
Does 2038-01-19 03:14:07 spark any Y2K vibes? This is the time is 2147483647 (max int32) seconds after the unix offset of 1970-01-01, known as the Y2K38 date. This was an issue in Go when handling tzdata past 2038 which has only just been resolved.
As Go 1.13 does not understand the “extended” format of tzdata which handles time zones past Y2K38, our handling of daylight savings is broken after the Y2K38 date in v20.1 (which ships with Go 1.13). We cannot easily parse it ourselves either, as the Go Location struct is not an interface, and the relevant variables to change are private and inaccessible without forking Go’s time library.
As such, if we pass Y2K38, the time zone in CRDB will be the same as the daylight savings in your current time zone, as that is what it is evaluated to in Go. This can be a problem if using a session time zone that has daylight savings:
otan=# SET TIME ZONE ‘America/Chicago’; | root@127.0.0.1:57882/defaultdb> SET TIME ZONE ‘America/Chicago’; |
psql shell | CRDB shell |
This is tracked for a future fix.
As a side note — if you’re curious about tzdata, it makes for an interesting read. Check out some of the weird offsets such as America/Chicago pre-1900 with a time offset of -5:50:36
.
Feeling confident about time? Feeling you may be turning half human, half time lord?
Let’s see — can you explain the following behaviour:
otan=# SET TIME ZONE ‘-9’; |
psql shell |
The answer is near the end of this blog post.
Like many others, CRDB recommends usage of TIMESTAMPTZ as encoding time zone data is valuable. However, we recommend always setting the session time zone to UTC. This allows the user to not worry about not losing time zone information whilst parsing, whilst allaying concerns that if a user decides to use session time zones that they perform with intended daylight-savings aware behaviour.
TIME (also known as TIME WITHOUT TIME ZONE) and TIMETZ (also known as TIME WITH TIME ZONE) both only store the time of day component of a TIMESTAMP. But:
Let’s look at using CURRENT_TIME (the equivalent of CURRENT_TIMESTAMP) in psql (with California time as default) and CRDB:
otan=# CREATE TABLE timetz_example (t time, ttz timetz); | root@127.0.0.1:56478/defaultdb> CREATE TABLE timetz_example (t time, ttz timetz); |
psql shell | CRDB shell |
As you can see, changing the time zone does not affect table results. Since TimeTZ stores the offset, they stay the same between session time zones shifts.
It is worth noting here that CRDB outputs an extra “0000-01-01” for time types, as well as an extraneous “+00:00” for the time type. This data has no meaning and is the way our driver lib/pq for the CRDB shell displays this data. This is tracked for a future fix.
When performing interval math with time, times past 23:59:59.999999, automatically overflows back to 00:00:00 as there is no “date” component.
otan=# select ‘10:00’::time + ‘14 hours’::interval; |
psql shell |
Parsing TIME and TIMETZ largely behaves the same as the parsing for TIMESTAMP and TIMESTAMPTZ:
otan=# SET TIME ZONE ‘Australia/Sydney’; |
psql shell |
Similar to TIMESTAMP/TIMESTAMPTZ, you can specify precision in parenthesis for TIME/TIMETZ types, which round to specified precision of fractional digits for the seconds component:
otan=# select ‘17:00:00.545454’::time(0), ‘17:00:00.545454+03’::timetz(0); – rounded up |
psql shell |
When casting TIME to TIMETZ, the TIME will get promoted to the time zone of your current session. However, when casting TIMETZ to TIME, we will lose time zone offset:
otan=# SET TIME ZONE ‘Australia/Sydney’; |
psql shell |
Losing the time zone offset and reinterpreting it in the session time zone can be surprising, as casting that TIME back to TIMETZ will give you a different result. In other words, casting the inverse of the inverse does not yield the identity. You can see an example of this below:
otan=# SET TIME ZONE ‘Australia/Sydney’; |
psql shell |
Here, our +03 in our TIMETZ has been reinterpreted in the session time zone of +11 after casting it to TIME, which represents a wholly different result. If you need inverses to match, AT TIME ZONE between TIME and TIMETZ will yield the desired effects.
Consider the comparison of these two equivalent times (10:00+03 and 11:00+04), one in the same time zone and one in a different time zone:
otan=# SELECT ‘10:00+03’::timetz = ‘10:00+03’::timetz; – Case 1 |
psql shell |
That’s interesting — they’re the same time in the real world, aren’t they? Well not in the realm of TimeTZ. Recall that TimeTZ stores both microsecond offset AND offset representing seconds west of UTC. If the microsecond offset is the same, then we compare seconds WEST of UTC (i.e. the negative of the offset we see above). As such:
An interesting feature that is only supported for TIME and TIMETZ is 24:00:00 time. This is a time that can be parsed, but you cannot use arithmetic to achieve the value. Adding to 24:00:00 overflows the value back to 00:00:00.
otan=# SELECT ‘24:00’::time; – 24:00 time can be parsed as such |
psql shell |
Unfortunately, Go’s time.Parse does not handle 24:00 — so our TIME string parsers all have wrappers to regex match and handle the 24:00 case.
Parsing and displaying 24:00 time with the Go time library is also a challenge, as can be demonstrated in the lib/pq driver. Unfortunately, the lib/pq driver we use displays the 24:00 with the exact same representation as 00:00:
root@127.0.0.1:57021/defaultdb> SELECT ‘24:00’::time, ‘00:00’::time; |
psql shell |
We have submitted a PR as a fix, but until then, 24:00 time is broken in the lib/pq driver, even against PostgreSQL.
Whilst TIME and TIMETZ store only the time component, TIME takes the same amount of space — and even more for TimeTZ. Furthermore, TIMETZ also does not keep track of location with time offsets. 24:00 time is an interesting case that is handled but may have limited practical uses.
It is worth also noting that PostgreSQL advises against using the TIMETZ data type. TIMETZ was originally implemented to follow the SQL standard. See the note just above section 8.1 in PostgreSQL’s own documentation.
So our recommendation is to use … neither! If you need time, you are most likely better off using TIMESTAMPTZ which takes care of these shortfalls in all these cases. If time zone information is required, use a separate column to encode that information.
The session time zone is set at -9. This means 1947-12-13 13:00+11
would be translated to 1947-12-12 17:00:00-09
.
Now UTC+3 is POSIX standard, meaning it really means 3 hours west of UTC ("-3" in ISO8601). This is six hours ahead of the session time zone of “-9”, hence translating to 23:00:00. Since AT TIME ZONE translates a TIMESTAMPTZ to a TIMESTAMP, the time zone data is gone.
(And if you are curious, 1947-12-13 is the date of the infamous Mankad incident.)
Each of the time types have interesting nuances:
As mentioned above, our advice is to always use TIMESTAMPTZ with your session time zone set to UTC. If you need time zone information, use a separate column to store this information. At the end of the day, make sure what you pick works for you.
Whew, that was confusing. Maybe I should write a strongly worded letter. In any case, working with time data types is indeed interesting — and we haven’t even touched intervals and leap seconds!
Did you enjoy our adventures and deep dive into the internals and nuances of databases? As always, we’re hiring!
Foreign keys are an important element of any relational database. But when you’re setting up your database schema, it’s …
Read More
Randomized testing is a way for programmers to automate the discovery of interesting test cases that would be …
Read More
When our VP of engineering, Peter Mattis, made the decision …
Read More