Collect & deliver data with Vector by Datadog & CockroachDB

In my technical journey, I struggled to find a simple tool that gave me the flexibility to collect and deliver data to other monitoring tools. Until, one day, I found it.

The year was 2007, during a rainy season in Florida. Heavy rain pounded on the roof. Earth-shattering thunder clapped me out of a light sleep. I was tired and frustrated. For days I had been struggling to create the logic, with Bash scripts and CSV files, for what was to become a paging alert system for Solaris (which executed critical C-based code for our customers). I had been entrusted to design a script that would mitigate monetary losses for the company, despite not having any experience designing monitoring and alerting tools at the time.

The task of collecting keywords from log data and fetching metrics such as CPU, disk size, and number of running processes was quite overwhelming! At that point, there were not many robust tools on the market. The few that existed were out of reach for many reasons, including company policies, pricing, requirements for multiple alerting tools, and complexity.

DevOps, SREs, and DBAs know this pain.

The miracle tool

Fast forward a few years, to my days at Datadog as a Solutions Engineer, where I came across a recently acquired, versatile tool called Vector that some Datadog customers had started to adopt. The tool seemed to have all of the characteristics of the miracle tool: open source, flexible, configurable, unobtrusive, and able to deliver data just about anywhere.

In conversations, former colleagues and customers stressed the need for a flexible data collector that could deliver filtered, sanitized, and custom data sets to multiple monitoring tools. A few months later, at Cockroach Labs, I learned that Datadog can be natively integrated to monitor self-hosted and Dedicated CockroachDB clusters.

Why choose Vector?

So, why Vector by Datadog and not the Datadog/Splunk/Dynatrace/New Relic/Elastic/other agent? One potential caveat with running multiple agents is that each agent is designed to take full control of a log. If the requirement is to send the same log to multiple providers (e.g., Splunk, Datadog, and Elasticsearch), the agents end up fighting for resources on the same file.

Vector had matured at Datadog, and I saw a good fit between Vector and CockroachDB for a use case that many users have asked for in the past: delivery of data to multiple endpoints from self-managed systems, with control over the data collected.

Vector uses “Sources” to collect data, “Transforms” to cleanse/filter/customize data, and “Sinks” to deliver data.
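To make the three building blocks concrete, here is a minimal sketch of one source, one transform, and one sink wired together through their inputs fields. The component names (app_logs, drop_debug, print_out) and the log path are hypothetical and chosen only for illustration; they are not part of the CockroachDB setup described in this post.

```toml
# Hypothetical component names and path, for illustration only.

# Source: tail log files from an assumed location.
[sources.app_logs]
type = "file"
include = ["/var/log/example/*.log"]

# Transform: keep only events whose message does not contain "DEBUG".
[transforms.drop_debug]
type = "filter"
inputs = ["app_logs"]
condition.type = "vrl"
condition.source = '!contains(string!(.message), "DEBUG")'

# Sink: print the remaining events to stdout as JSON.
[sinks.print_out]
type = "console"
inputs = ["drop_debug"]
encoding.codec = "json"
```

Each component names its upstream components in inputs, so sending the same data to a second destination is a matter of adding another sink that lists the same input.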

Collection & delivery process with Vector by Datadog from a CockroachDB node

The steps to achieve a quick installation are not difficult if we’re willing to invest a bit of time on the custom solution. The TOML example provided is for each host but can also be used on Kubernetes deployments.

Get started with Vector and CockroachDB

Step 1: Install Vector

curl -1sLf 'https://repositories.timber.io/public/vector/cfg/setup/bash.deb.sh' | sudo -E bash
sudo apt-get install -y vector

Step 2: Configuration

Configure the Vector TOML/YAML/JSON file with Sources, Transforms, and Sinks. See the final TOML config file sample in my GitHub repo.

Step 3: Collect logs

Collect CockroachDB logs with sources (select the logs you need to collect)

[sources.crdb_logs]
type = "file"
ignore_older_secs = 600
include = [
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-health.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-security.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-audit.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-auth.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-exec.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-slow.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-schema.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-pebble.log",
  "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-telemetry.log",
]

Step 4: Collect data

Collect CockroachDB metrics from the node's HTTP endpoint with the prometheus_scrape source

[sources.crdb_ui_console]
type = "prometheus_scrape"
endpoints = [ "http://localhost:8080/_status/vars" ]
scrape_interval_secs = 15
instance_tag = "instance"
endpoint_tag = "endpoint"

Step 5: Collect server metrics

Collect CPU, memory, process, and other host metrics with sources
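One way to do this is with Vector's host_metrics source. The sketch below is an assumption rather than the exact configuration from my repo: the component name crdb_node_metrics is borrowed from the sink inputs used later in this post, and the collector list and scrape interval are illustrative choices.

```toml
# Sketch only: the collector list and interval are assumptions, adjust to your needs.
[sources.crdb_node_metrics]
type = "host_metrics"
# Pick the collectors you need; these cover CPU, memory, disk, and network.
collectors = ["cpu", "memory", "disk", "filesystem", "load", "network"]
# Match the interval used for the CockroachDB endpoint scrape.
scrape_interval_secs = 15
```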

Step 6: Customize metrics

Customize CockroachDB metrics (e.g., CRDB to Splunk) with Transforms

[transforms.crdb_metrics_to_logs_splunk]
type = "metric_to_log"
inputs = [ "crdb_ui_console" ]
host_tag = "hostname"
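Transforms can also reshape metrics before they are delivered. As a hedged example, the remap transform below adds a tag to every metric scraped from CockroachDB; the component name crdb_tag_metrics and the cluster tag with its value crdb-dev are hypothetical. A sink would need to list crdb_tag_metrics in its inputs to receive the tagged metrics.

```toml
# Hypothetical transform: tag every scraped metric with a cluster name.
[transforms.crdb_tag_metrics]
type = "remap"
inputs = [ "crdb_ui_console" ]
# VRL program; the tag name and value are assumptions for illustration.
source = '''
.tags.cluster = "crdb-dev"
'''
```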

Step 7: Endpoint alerting tool

Decide which alerting tools or platforms will receive the data (many providers offer free trials), then configure them as sinks. The example below sends data to Splunk and Datadog.

## Splunk
[sinks.logs_to_splunk_hec]
type = "splunk_hec_logs"
inputs = [ "crdb_metrics_to_logs_splunk" ]
endpoint = "https://192.168.86.30:8088"
host_key = "hostname"
indexed_fields = [ "field1" ]
compression = "gzip"
default_token = "${SPLUNK_HEC_TOKEN}"
tls.verify_certificate = false
   [sinks.logs_to_splunk_hec.encoding]
   codec = "json"
[sinks.metrics_to_splunk_hec]
type = "splunk_hec_metrics"
inputs = [ "crdb_node_metrics" ]
endpoint = "https://192.168.86.30:8088"
host_key = "hostname"
index = "{{ host }}"
source = "{{ file }}"
compression = "gzip"
default_token = "${SPLUNK_HEC_TOKEN}"
sourcetype = "{{ sourcetype }}"
tls.verify_certificate = false

## Datadog
[sinks.crdb_dd_metrics]
type = "datadog_metrics"
inputs = [ "crdb_ui_console", "crdb_node_metrics" ]
default_api_key = "${DATADOG_API_KEY}" 
[sinks.crdb_dd_logs]
type = "datadog_logs"
inputs = [ "crdb_logs" ]
default_api_key = "${DATADOG_API_KEY}"
region = "us"
compression = "gzip"
site = "datadoghq.com"
tags = ["source:vector","env:dev","collector:live process"]

Step 8: Test

Test the Vector configuration by running Vector in the foreground before handing it over to systemctl

sudo vector --config vector.toml
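While testing, it can help to temporarily add a console sink so the collected events are printed to the terminal before anything is sent to a provider. This is a minimal sketch; the sink name debug_console is hypothetical, and the block should be removed once the pipeline looks right.

```toml
# Temporary debugging sink (hypothetical name): print collected log events to stdout.
[sinks.debug_console]
type = "console"
inputs = [ "crdb_logs" ]
encoding.codec = "json"
```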

Step 9: Deployment

Move the vector.toml file to the Vector configuration directory and start Vector as a service with systemctl

sudo mv vector.toml /etc/vector/vector.toml
sudo systemctl start vector

While there are paid and free tools on the market that offer many capabilities, Vector is community maintained and offers the flexibility to capture custom data and logs. The downside is a learning curve and some trial and error when converting data (which might arrive as logs) into metrics. The upside of Vector is its flexibility to deliver parsed data to different provider endpoints.

So what do we gain by using Vector in our self-hosted CockroachDB clusters? We gain the flexibility to send cleansed, customized metrics to more than one provider without being tied to a specific one. The all-or-nothing approach to collecting logs and metrics does not apply to Vector. Long gone are the days in which we were limited to collecting metrics and alerting on anomalies in our systems.

So what can Vector by Datadog do for my CockroachDB self-hosted cluster? It helps me provide monitoring data delivery for multiple teams without overburdening my infrastructure with agents fighting for the same resource.

About the author

Julian Levi Hernandez

Data Streaming, Engineering, Implementation SME. Always tinkering with new technologies and analyzing features for performance.
