Step 1. Install CockroachDB
Download the CockroachDB archive for your OS, and extract the binary:
Mac:
$ curl https://binaries.cockroachdb.com/cockroach-v19.2.4.darwin-10.9-amd64.tgz \
| tar -xz
Linux:
$ wget -qO- https://binaries.cockroachdb.com/cockroach-v19.2.4.linux-amd64.tgz \
| tar xvz
Move the binary into your $PATH so you can execute it from any shell:
Mac:
$ mv cockroach-v19.2.4.darwin-10.9-amd64/cockroach /usr/local/bin/
Linux:
$ mv cockroach-v19.2.4.linux-amd64/cockroach /usr/local/bin/
Note: If you get a permissions error, prefix the command with sudo.
Clean up the directory where you unpacked the binary:
Mac:
$ rm -rf cockroach-v19.2.4.darwin-10.9-amd64
Linux:
$ rm -rf cockroach-v19.2.4.linux-amd64
You can also execute the cockroach binary directly from its download location, but the rest of the training documentation assumes you have the binary in your $PATH.
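Before moving on, it can help to confirm that the shell actually resolves the binary. The following is a minimal sketch, not part of the official instructions: `on_path` is a hypothetical helper name, and it relies only on the POSIX `command -v` builtin.

```shell
# Report whether a command can be found through $PATH.
# `command -v` is a POSIX builtin that prints the resolved location on success.
on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 is not on PATH" >&2
    return 1
  fi
}

# Check the binary installed in Step 1. The `|| true` lets the script
# continue even when the binary is missing.
on_path cockroach || true
```

If the check reports that cockroach is not on PATH, repeat the move into /usr/local/bin/ or run the binary by its full path.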
Step 2. Start a node
Use the cockroach start command to start a node:
$ cockroach start \
--insecure \
--store=node1 \
--listen-addr=localhost:26257 \
--http-addr=localhost:8080 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background
You'll see the following message:
*
* WARNING: RUNNING IN INSECURE MODE!
*
* - Your cluster is open for any client that can access localhost.
* - Any user, even root, can log in without providing a password.
* - Any user, connecting as root, can read or write any data in your cluster.
* - There is no network encryption nor authentication, and thus no confidentiality.
*
* Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v19.2/secure-a-cluster.html
*
*
* INFO: initial startup completed, will now wait for `cockroach init`
* or a join to a running cluster to start accepting clients.
* Check the log file(s) for progress.
*
Step 3. Understand the flags you used
Before moving on, take a moment to understand the flags you used with the cockroach start command:
| Flag | Description |
|---|---|
| --insecure | Indicates that the node will communicate without encryption. You'll start all other nodes with this flag and pass it to all other cockroach commands as well. Without this flag, cockroach expects to find security certificates for encrypting and authenticating communication. |
| --store | The location where the node stores its data and logs. Since you'll be running all nodes on your computer, you need to specify a unique storage location for each node. In contrast, in a real deployment, with one node per machine, it's fine to let cockroach use its default storage location. |
| --listen-addr, --http-addr | The IP address/hostname and port to listen on for connections from other nodes and clients and for Admin UI HTTP requests, respectively. Again, since you'll be running all nodes on your computer, you need to specify unique ports for each node. In contrast, in a real deployment, with one node per machine, it's fine to let cockroach use its default ports. |
| --join | The addresses and ports of all of your initial nodes. You'll use this exact --join flag when starting the other nodes as well. |
| --background | The node will run in the background. |
You can run cockroach start --help to get help on this command directly in your terminal and cockroach --help to get help on other commands.
Step 4. Start two more nodes
Start two more nodes, using the same cockroach start command as earlier but with unique --store, --listen-addr, and --http-addr flags for each new node.
Start the second node:
$ cockroach start \
--insecure \
--store=node2 \
--listen-addr=localhost:26258 \
--http-addr=localhost:8081 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background
Start the third node:
$ cockroach start \
--insecure \
--store=node3 \
--listen-addr=localhost:26259 \
--http-addr=localhost:8082 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background
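The three start commands differ only in the store name and the two ports, and each value follows from the node number. As an illustration of that pattern, here is a small sketch that prints (rather than runs) the command for a given node. The helper name `node_start_cmd` is hypothetical, and the numbering scheme is inferred from the commands in this tutorial, not a CockroachDB convention.

```shell
# Print the `cockroach start` command for node number $1 without running it.
# Scheme inferred from this tutorial's commands:
#   store = node<N>, listen port = 26256 + N, HTTP port = 8079 + N.
node_start_cmd() {
  idx=$1
  echo "cockroach start --insecure --store=node${idx}" \
    "--listen-addr=localhost:$((26256 + idx))" \
    "--http-addr=localhost:$((8079 + idx))" \
    "--join=localhost:26257,localhost:26258,localhost:26259" \
    "--background"
}

# Show the commands for the second and third nodes.
for node in 2 3; do
  node_start_cmd "$node"
done
```

Printing the command first makes it easy to compare against the tutorial before executing anything. Note that the --join list stays fixed no matter which node is being started.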
Step 5. Initialize the cluster
Use the cockroach init command to perform a one-time initialization of the cluster, sending the request to any node:
$ cockroach init --insecure --host=localhost:26257
You'll see the following message:
Cluster successfully initialized
Look at the startup details in the server log:
$ grep 'node starting' node1/logs/cockroach.log -A 11
The output will look something like this:
CockroachDB node starting at 2019-10-01 20:14:55.358954 +0000 UTC (took 27.9s)
build:               CCL v19.2.4 @ 2019/09/25 15:18:08 (go1.12.6)
webui:               http://localhost:8080
sql:                 postgresql://root@localhost:26257?sslmode=disable
client flags:        cockroach <client cmd> --host=localhost:26257 --insecure
logs:                /Users/<username>/cockroachdb-training/node1/logs
temp dir:            /Users/<username>/cockroachdb-training/node1/cockroach-temp462678173
external I/O path:   /Users/<username>/cockroachdb-training/node1/extern
store[0]:            path=/Users/<username>/cockroachdb-training/node1
status:              initialized new cluster
clusterID:           fdc056a4-0cc0-4b29-b435-60e1db239f82
nodeID:              1
| Field | Description |
|---|---|
| build | The version of CockroachDB you are running. |
| webui | The URL for accessing the Admin UI. |
| sql | The connection URL for your client. |
| client flags | The flags to use when connecting to the node via cockroach client commands. |
| logs | The directory containing debug log data. |
| temp dir | The temporary store directory of the node. |
| external I/O path | The external I/O directory with which the local file access paths are prefixed while performing backup and restore operations using local node directories or NFS drives. |
| store[n] | The directory containing store data, where [n] is the index of the store, e.g., store[0] for the first store, store[1] for the second store. |
| status | Whether the node is the first in the cluster (initialized new cluster), joined an existing cluster for the first time (initialized new node, joined pre-existing cluster), or rejoined an existing cluster (restarted pre-existing node). |
| clusterID | The ID of the cluster. |
| nodeID | The ID of the node. |
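If you want a single startup detail rather than the whole block, the "name: value" layout is easy to pick apart. The sketch below is one possible approach, assuming the detail lines appear in the log exactly as in the sample output (a field name, a colon, then the value). `startup_field` is a hypothetical helper, not a cockroach command.

```shell
# Print the value of one field from a "node starting" block read on stdin.
# Lines look like "name:   value". Only the first colon is treated as the
# separator, so values such as http://localhost:8080 stay intact.
startup_field() {
  awk -v key="$1" '
    {
      pos = index($0, ":")
      if (pos == 0) next
      name = substr($0, 1, pos - 1)
      if (name == key) {
        value = substr($0, pos + 1)
        sub(/^[ \t]+/, "", value)
        print value
        exit
      }
    }'
}

# Sample lines copied from the output in this step. Against a real node you
# would pipe in: grep 'node starting' node1/logs/cockroach.log -A 11
startup_sample='webui:               http://localhost:8080
sql:                 postgresql://root@localhost:26257?sslmode=disable
nodeID:              1'

printf '%s\n' "$startup_sample" | startup_field webui
printf '%s\n' "$startup_sample" | startup_field nodeID
```

Splitting on the first colon only is the key design choice here, since both the webui and sql values contain colons of their own.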
Step 6. Verify that the cluster is live
Use the cockroach node status command to check that all 3 nodes are part of the cluster:
$ cockroach node status --insecure --host=localhost:26257
  id |     address     |   sql_address   |                  build                  |            started_at            |            updated_at            | locality | is_available | is_live
+----+-----------------+-----------------+-----------------------------------------+----------------------------------+----------------------------------+----------+--------------+---------+
   1 | localhost:26257 | localhost:26257 | v19.2.0                                 | 2019-10-01 20:14:55.249457+00:00 | 2019-10-01 20:16:07.283866+00:00 |          | true         | true
   2 | localhost:26258 | localhost:26258 | v19.2.0                                 | 2019-10-01 20:14:55.445079+00:00 | 2019-10-01 20:16:02.972943+00:00 |          | true         | true
   3 | localhost:26259 | localhost:26259 | v19.2.0                                 | 2019-10-01 20:14:55.857631+00:00 | 2019-10-01 20:16:03.389338+00:00 |          | true         | true
(3 rows)
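To check liveness from a script instead of by eye, you could count the rows whose is_live column reads true. The sketch below parses the pipe-separated table layout shown in this step. `count_live_nodes` is a hypothetical helper. One assumption to verify: when its output is piped, the CLI may switch to a different format, so you would likely need to request table output explicitly (for example with a `--format=table` flag; confirm it with cockroach node status --help).

```shell
# Count the rows of `cockroach node status` table output (read on stdin)
# whose last column, is_live, is "true".
count_live_nodes() {
  awk -F'|' '
    # Data rows are the ones whose first column is a numeric node ID.
    $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
      live = $NF
      gsub(/[ \t]/, "", live)
      if (live == "true") n++
    }
    END { print n + 0 }'
}

# The rows from the output in this step, with column padding removed.
status_sample='id | address | sql_address | build | started_at | updated_at | locality | is_available | is_live
1 | localhost:26257 | localhost:26257 | v19.2.0 | 2019-10-01 20:14:55.249457+00:00 | 2019-10-01 20:16:07.283866+00:00 | | true | true
2 | localhost:26258 | localhost:26258 | v19.2.0 | 2019-10-01 20:14:55.445079+00:00 | 2019-10-01 20:16:02.972943+00:00 | | true | true
3 | localhost:26259 | localhost:26259 | v19.2.0 | 2019-10-01 20:14:55.857631+00:00 | 2019-10-01 20:16:03.389338+00:00 | | true | true
(3 rows)'

printf '%s\n' "$status_sample" | count_live_nodes
```

The header, the separator line, and the row-count footer are skipped because none of them starts with a numeric ID.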
Use the cockroach sql command to query the cluster:
$ cockroach sql \
--insecure \
--host=localhost:26257 \
--execute="SHOW DATABASES;"
  database_name
+---------------+
  defaultdb
  postgres
  system
(3 rows)
You just queried the node listening on port 26257, but every other node is a SQL gateway to the cluster as well. We'll learn more about CockroachDB SQL and the built-in SQL client in a later module.
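Because every node is a gateway, each one has its own connection URL. As a rough illustration, the sketch below prints the URL for each of the three local nodes. The URL shape is copied from the sql line of the startup details, the ports are the --listen-addr values used in this tutorial, and `sql_urls` is a hypothetical helper name.

```shell
# Print the SQL connection URL for each of the three local nodes.
# URL shape: taken from the `sql:` line of the node startup details.
# Ports: the --listen-addr ports used for node1, node2, and node3.
sql_urls() {
  for port in 26257 26258 26259; do
    echo "postgresql://root@localhost:${port}?sslmode=disable"
  done
}

sql_urls
```

To try a different gateway, change --host=localhost:26257 in the cockroach sql command to one of the other two ports; the query should list the same databases.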
Step 7. Look at the current state of replication
To understand replication in CockroachDB, it's important to review a few concepts from the architecture:
| Concept | Description |
|---|---|
| Range | CockroachDB stores all user data (tables, indexes, etc.) and almost all system data in a giant sorted map of key-value pairs. This keyspace is divided into "ranges", contiguous chunks of the keyspace, so that every key can always be found in a single range. From a SQL perspective, a table and its secondary indexes initially map to a single range, where each key-value pair in the range represents a single row in the table (also called the primary index because the table is sorted by the primary key) or a single row in a secondary index. As soon as a range reaches 64 MiB in size, it splits into two ranges. This process continues as the table and its indexes continue growing. |
| Replica | CockroachDB replicates each range 3 times by default and stores each replica on a different node. In a later module, you'll learn how to control replication. |
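The 64 MiB split threshold gives a quick way to reason about how many ranges a table needs. The sketch below computes only a lower bound: if no range may exceed the threshold, the data cannot fit in fewer ranges than this. The real count will usually be higher, since a split produces two smaller ranges. `min_ranges` is a hypothetical helper for illustration.

```shell
# Lower bound on the number of ranges needed to hold SIZE_MIB of data
# when no range may exceed MAX_MIB (64 by default, as described above).
min_ranges() {
  size_mib=$1
  max_mib=${2:-64}
  # Ceiling division using integer arithmetic.
  echo $(( (size_mib + max_mib - 1) / max_mib ))
}

min_ranges 10    # small table: fits in a single range
min_ranges 200   # larger table: needs several ranges
```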
With those concepts in mind, open the Admin UI at http://localhost:8080 and view the Node List:
Note that the Replicas count is the same on all three nodes. This indicates:
- There are this many initial "ranges" of data in the cluster. These are all internal "system" ranges since you haven't added any table data yet.
- Each range has been replicated 3 times (according to the CockroachDB default).
- For each range, each replica is stored on a different node.
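The reasoning in the list above comes down to simple arithmetic: total replicas equal ranges times the replication factor, spread across the nodes. Here is a minimal sketch of that calculation. The range count of 20 is a made-up figure for illustration (read the real one from the Admin UI), and `avg_replicas_per_node` is a hypothetical helper.

```shell
# Average replicas per node = ranges * replication factor / nodes.
# Integer arithmetic, so the result is rounded down.
avg_replicas_per_node() {
  ranges=$1
  factor=$2
  nodes=$3
  echo $(( ranges * factor / nodes ))
}

# 20 ranges is a hypothetical count chosen for illustration.
avg_replicas_per_node 20 3 3   # 3 nodes: each node holds one replica of every range
avg_replicas_per_node 20 3 5   # for comparison: the same replicas spread over 5 nodes
```

With three nodes and three replicas per range, the per-node count equals the range count, which is why the Replicas column is identical on all three nodes.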
Step 8. Scale the cluster
Adding more nodes to your cluster is even easier than starting the cluster. Just like before, you use the cockroach start command with unique --store, --listen-addr, and --http-addr flags for each new node. But this time, you do not have to follow up with the cockroach init command or any other commands.
Start the fourth node:
$ cockroach start \
--insecure \
--store=node4 \
--listen-addr=localhost:26260 \
--http-addr=localhost:8083 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background
Start the fifth node:
$ cockroach start \
--insecure \
--store=node5 \
--listen-addr=localhost:26261 \
--http-addr=localhost:8084 \
--join=localhost:26257,localhost:26258,localhost:26259 \
--background
As soon as you run these commands, the nodes join the cluster. There's no need to run the cockroach init command or any other commands.
Step 9. Watch data rebalance across all 5 nodes
Go back to the Live Nodes list in the Admin UI and watch how the Replicas are automatically rebalanced to utilize the additional capacity of the new nodes:
Another way to observe this is to click Metrics in the upper left and scroll down to the Replicas per Node graph: