How To Install an Elasticsearch Cluster on Ubuntu 18.04

Elasticsearch is a platform for distributed, real-time search and data analysis. It offers a multitenant-capable full-text search engine with an HTTP interface and schema-free JSON documents, and it is simple to install.

With Elasticsearch you can execute and combine many types of searches, and together with tools like Kibana, Logstash, X-Pack, etc., it can collect, analyze, and monitor data at large scale. The cluster we will build consists of three data nodes, which lets us maintain a quorum of master-eligible nodes and avoid a split-brain.

In this tutorial, we will learn how to install and configure a 3-node Elasticsearch cluster on Ubuntu 18.04, and then walk through some API examples: creating indexes, ingesting documents, running searches, and more.

What is Elasticsearch?

Elasticsearch is a highly scalable, open-source, RESTful search and analytics engine built on top of Apache Lucene and released under the Apache license. It is one of the most popular search engines and is commonly used for full-text search, log analytics, security intelligence, and business analytics, letting you analyze large volumes of data quickly and in near real-time. Elasticsearch is Java-based and can search and index document files in different formats.

Features of Elasticsearch

Before we get to the main topic, let's cover some Elasticsearch basics:

Basic Concepts of Elasticsearch

  • An Elasticsearch Cluster is made up of a number of nodes;
  • Each Node holds Indexes, where an Index is a Collection of Documents;
  • Master nodes are responsible for cluster-wide tasks: creating/deleting indexes, tracking nodes, and allocating shards to nodes;
  • Data nodes are responsible for hosting the actual shards that contain the indexed data, and they handle data-related operations like CRUD, search, and aggregations;
  • Indexes are split into Multiple Shards;
  • Shards come in two flavors: Primary Shards and Replica Shards;
  • A Replica Shard is a Copy of a Primary Shard that is used for HA/Redundancy;
  • Shards get placed on random nodes throughout the cluster;
  • A Replica Shard will NEVER be allocated to the same node as its associated Primary Shard.

Representation of Nodes, Index and Shards on 2 Nodes (as an example)

Note on Master Elections

The minimum number of master-eligible nodes that need to join a newly elected master for an election to complete is configured via the setting discovery.zen.minimum_master_nodes. This setting is very important, as it makes each master-eligible node aware of the minimum number of master-eligible nodes that must be visible in order to form a cluster.

Without this setting, or with an incorrect value, the cluster is prone to split-brain: if something goes wrong and nodes rejoin the cluster, they may form 2 separate clusters, which we want to avoid at all costs.

According to the Elasticsearch documentation, to avoid a split-brain, this setting should be set to a quorum of master-eligible nodes using the following formula:

(master_eligible_nodes / 2) + 1
# in our case:
(3/2) + 1 = 2

It is advised to avoid having only two master-eligible nodes, since a quorum of two is two: losing either node means losing quorum. To read more on the Elasticsearch master election process, take a look at their documentation.

Prerequisites

We need to set the internal IP addresses of our nodes either in our hosts file or on a DNS server. To keep it simple and straightforward, I will add them to the hosts file. This needs to be applied on all three nodes:

$ sudo su - 
$ cat > /etc/hosts << EOF
127.0.0.1 localhost
172.31.0.77 es-node-1
172.31.0.45 es-node-2
172.31.0.48 es-node-3
EOF
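
To confirm that the names resolve as expected, you can do a quick sanity check on each node (getent ships with a stock Ubuntu install):

$ getent hosts es-node-2
172.31.0.45     es-node-2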

Now that our host entries are set, we can start with the fun stuff.

Installing Elasticsearch on Ubuntu

The following instructions should be applied on all three nodes.

Get the Elasticsearch repositories and update your system so that your servers are aware of the newly added Elasticsearch repository:

$ apt update && apt upgrade -y
$ apt install software-properties-common python-software-properties apt-transport-https -y
$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
$ apt update

Elasticsearch relies on Java, so install the java development kit:

$ apt install default-jdk -y

Verify that java is installed:

$ java -version
openjdk version "11.0.3" 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)

Install Elasticsearch:

$ apt install elasticsearch -y

Once Elasticsearch is installed, repeat these steps on the remaining nodes. Once that is done, move on to the configuration section.

Configuring Elasticsearch

For nodes to join the same cluster, they should all share the same cluster name.

We also need to specify the discovery hosts so that the nodes can discover each other. Since we are installing a 3-node cluster, all nodes will take on both the master-eligible and data node roles.

Feel free to inspect the Elasticsearch cluster configuration, but I will be overwriting the default configuration with the config that I need.

Make sure to apply the configuration on all three nodes:

$ cat > /etc/elasticsearch/elasticsearch.yml << EOF
cluster.name: es-cluster
node.name: \${HOSTNAME}
node.master: true
node.data: true
path.logs: /var/log/elasticsearch
path.data: /usr/share/elasticsearch/data
bootstrap.memory_lock: true
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["es-node-1", "es-node-2"]
EOF

Important settings for your Elasticsearch cluster are described in their docs:

  • Disable swapping
  • Increase file descriptors
  • Ensure sufficient virtual memory
  • Ensure sufficient threads
  • JVM DNS cache settings
  • Temporary directory not mounted with noexec

Increase the file descriptors on the nodes, as instructed by the documentation:

$ cat > /etc/default/elasticsearch << EOF
ES_STARTUP_SLEEP_TIME=5
MAX_OPEN_FILES=65536
MAX_LOCKED_MEMORY=unlimited
EOF

Ensure that memory is not swapped out to disk by letting the JVM lock the heap in memory; at the service level this is done with LimitMEMLOCK=infinity.

Also set the maximum number of open file descriptors for the process with LimitNOFILE, and increase the number of allowed threads with LimitNPROC:

$ vim /usr/lib/systemd/system/elasticsearch.service
[Service]
LimitMEMLOCK=infinity
LimitNOFILE=65535
LimitNPROC=4096
...

Allow the elasticsearch user to lock memory, and raise its limit on open file descriptors to 65536 or higher, by appending the following to /etc/security/limits.conf:

$ cat >> /etc/security/limits.conf << EOF
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
EOF

Increase the maximum number of memory map areas, as Elasticsearch uses a mmapfs directory by default to store its indices:

$ sysctl -w vm.max_map_count=262144

For a permanent setting, update vm.max_map_count in /etc/sysctl.conf and run:

$ sysctl -p /etc/sysctl.conf
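
For example, assuming nothing else on your nodes manages /etc/sysctl.conf, you could append the setting and reload it like this:

$ echo "vm.max_map_count=262144" >> /etc/sysctl.conf
$ sysctl -p /etc/sysctl.conf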

Change the permissions of the Elasticsearch data path, so that the elasticsearch user and group have permissions to read and write from the configured path:

$ chown -R elasticsearch:elasticsearch /usr/share/elasticsearch

Make sure that you have applied these steps to all the nodes before continuing.

Start Elasticsearch

Enable Elasticsearch on boot time and start the Elasticsearch service:

$ systemctl enable elasticsearch
$ systemctl start elasticsearch

Verify that Elasticsearch is running:

$ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 :::9200                 :::*                    LISTEN      278/java
tcp6       0      0 :::9300                 :::*                    LISTEN      278/java
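
You can also confirm that Elasticsearch answers HTTP requests; the root endpoint returns a small JSON banner with the node name, cluster name, and version (the exact fields depend on your version):

$ curl http://localhost:9200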

Using Elasticsearch Restful API

In this section we will get comfortable with the Elasticsearch API by covering the following examples:

  • Cluster Overview;
  • How to view Node, Index, and Shard information;
  • How to Ingest Data into Elasticsearch;
  • How to Search data in Elasticsearch;
  • How to Delete your Index.

View Cluster Health

From any node, use an HTTP client such as curl to investigate the current health of the cluster by looking at the cluster API:

$ curl -XGET http://localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "es-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

As you can see the cluster status is Green, which means everything works as expected.

In Elasticsearch you get Green, Yellow and Red statuses. Yellow would essentially mean that one or more replica shards are in an unassigned state. Red status means that some or all primary shards are unassigned which is really bad.

From this output, we can also see the number of data nodes, primary shards, unassigned shards, etc.

This is a good place to get an overall view of your Elasticsearch cluster’s health.
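
If you are scripting against the cluster (for example in a deploy pipeline), the health API can also block until a desired status is reached; this is a minimal sketch using the documented wait_for_status and timeout parameters:

$ curl -XGET 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=30s&pretty'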

View the Number of Nodes in your Cluster

By looking at the /_cat/nodes API we can get information about the nodes that are part of our cluster:

$ curl -XGET http://localhost:9200/_cat/nodes?v
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.31.0.77              10          95   0    0.00    0.00     0.00 mdi       -      es-node-1
172.31.0.45              11          94   0    0.00    0.00     0.00 mdi       -      es-node-2
172.31.0.48              25          96   0    0.07    0.02     0.00 mdi       *      es-node-3

Here we can see details such as JVM heap usage, CPU, load averages, the role of each node, and which node is the elected master.

As we are not running dedicated masters, we can see that node es-node-3 got elected as master.
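
You can also ask for the elected master directly via the _cat/master API (a quick check you can run from any node):

$ curl -XGET http://localhost:9200/_cat/master?v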

Create your first Elasticsearch Index

Note that when you create an index, the default number of primary shards is 5 and the default number of replicas is 1 (per primary shard). You can change the replica count after an index has been created, but not the primary shard count; that has to be set at index creation time.
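
If you need different values, you can pass them as index settings at creation time. The sketch below is only for illustration (the index name anotherindex and the shard counts are arbitrary; we will just use the defaults in the rest of this tutorial):

$ curl -XPUT -H 'Content-Type: application/json' \
http://localhost:9200/anotherindex -d '
{"settings": {"number_of_shards": 3, "number_of_replicas": 2}}'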

Let’s create an Elasticsearch index named myfirstindex:

$ curl -XPUT http://localhost:9200/myfirstindex
{"acknowledged":true,"shards_acknowledged":true,"index":"myfirstindex"}

Now that your index has been created, let’s have a look at the /_cat/indices API to get information about our indices:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   1          0            0      1.2kb           650b

From the output you can see that we have 5 primary shards with 1 replica each, that there are 0 documents in our index, and that our cluster is in a green state, meaning that all primary and replica shards have been assigned to the nodes in our cluster.

Note that a replica shard will NEVER reside on the same node as the primary shard for HA and Redundancy.

Let’s go a bit deeper and have a look at the shards, to see how our shards are distributed through our cluster, using the /_cat/shards API:

$ curl -XGET http://localhost:9200/_cat/shards?v
index        shard prirep state   docs store ip             node
myfirstindex 1     r      STARTED    0  230b 172.31.0.45    es-node-2
myfirstindex 1     p      STARTED    0  230b 172.31.0.48    es-node-3
myfirstindex 4     p      STARTED    0  230b 172.31.0.48    es-node-3
myfirstindex 4     r      STARTED    0  230b 172.31.0.77    es-node-1
myfirstindex 2     r      STARTED    0  230b 172.31.0.45    es-node-2
myfirstindex 2     p      STARTED    0  230b 172.31.0.77    es-node-1
myfirstindex 3     p      STARTED    0  230b 172.31.0.45    es-node-2
myfirstindex 3     r      STARTED    0  230b 172.31.0.48    es-node-3
myfirstindex 0     p      STARTED    0  230b 172.31.0.45    es-node-2
myfirstindex 0     r      STARTED    0  230b 172.31.0.77    es-node-1

As you can see, each replica shard is placed on a different node than its primary.

Replicating a Yellow Cluster Status

A yellow cluster status means that one or more replica shards are in an unassigned state.

So let's reproduce that behavior by scaling our replica count to 3. Each shard then has 4 copies (1 primary and 3 replicas), but we only have 3 nodes to place them on, so 5 replica shards will be left in an unassigned state:

$ curl -XPUT -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_settings -d \
'{"number_of_replicas": 3}'

Now we have scaled the replica count to 3, but since we only have 3 nodes, we will have a yellow state cluster:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   3          0            0      3.3kb           1.1kb

The cluster health API will now report the unassigned shards, and we can verify which ones by looking at the shards API again:

$ curl -XGET http://localhost:9200/_cat/shards?v
index        shard prirep state      docs store ip             node
myfirstindex 1     r      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 1     p      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 1     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 1     r      UNASSIGNED
myfirstindex 4     r      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 4     p      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 4     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 4     r      UNASSIGNED
myfirstindex 2     r      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 2     r      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 2     p      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 2     r      UNASSIGNED
myfirstindex 3     p      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 3     r      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 3     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 3     r      UNASSIGNED
myfirstindex 0     p      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 0     r      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 0     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 0     r      UNASSIGNED

At this point in time, we could either add another node to the cluster or scale the replication factor back to 1 to get the cluster health to green again.

I will scale it back down to a replication factor of 1:

$ curl -XPUT -H 'Content-Type: application/json' http://localhost:9200/myfirstindex/_settings -d '{"number_of_replicas": 1}'

Ingest Data into Elasticsearch

We will ingest 3 documents into our index. These will be simple documents consisting of a name, a country, and a gender, for example:

{ 
  "name": "james", 
  "country": "south africa", 
  "gender": "male"
}

First, we will ingest a document using the PUT HTTP method; when using PUT, we need to specify the document ID ourselves.

PUT is used to create or update a document. To create one:

$ curl -XPUT -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_doc/1 -d '
{"name":"james", "country":"south africa", "gender": "male"}'

Now you will find we have one document in our index:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   1          1            0     11.3kb          5.6kb

Since we know that the document ID is "1", we can do a GET on the document ID to read the document from the index:

$ curl -XGET http://localhost:9200/myfirstindex/_doc/1?pretty
{
  "_index" : "myfirstindex",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "james",
    "country" : "south africa",
    "gender" : "male"
  }
}
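
If you later need to update this document, you can PUT to the same ID again; the document is overwritten and its _version is incremented. A minimal sketch (the extra age field is just an illustration, not used later in this tutorial):

$ curl -XPUT -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_doc/1 -d '
{"name":"james", "country":"south africa", "gender": "male", "age": 30}'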

If we ingest documents with a POST request, Elasticsearch generates the document ID for us automatically. Let’s create 2 documents:

$ curl -XPOST -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_doc/ -d '
{"name": "kevin", "country": "new zealand", "gender": "male"}'

$ curl -XPOST -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_doc/ -d '
{"name": "sarah", "country": "ireland", "gender": "female"}'

When we have a look again at our index, we can see that we now have 3 documents in our index:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   1          3            0       29kb         14.5kb

Search Queries

Now that we have 3 documents in our Elasticsearch index, let's explore the search APIs to get data out of it. First, let's search for the keyword "sarah" using the q query string parameter:

$ curl -XGET 'http://localhost:9200/myfirstindex/_search?q=sarah&pretty'
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "cvU96GsBP0-G8XdN24s4",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "sarah",
          "country" : "ireland",
          "gender" : "female"
        }
      }
    ]
  }
}

We can also narrow our search query down to a specific field; for example, show me all the documents with the name kevin:

$ curl -XGET 'http://localhost:9200/myfirstindex/_search?q=name:kevin&pretty'
{
...
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "gPU96GsBP0-G8XdNHoru",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "kevin",
          "country" : "new zealand",
          "gender" : "male"
        }
      }
    ]
  }
}

With Elasticsearch we can also send the query in the request body; a query similar to the one above would look like this:

$ curl -XPOST -H 'Content-Type: application/json' \
'http://localhost:9200/myfirstindex/_search?pretty' -d '
{
  "query": {
    "match": {
      "name": "kevin"
    }
  }
}'

{
...
        "_index" : "myfirstindex",
        "_source" : {
          "name" : "kevin",
          "country" : "new zealand",
          "gender" : "male"
        }
...
}

We can use wildcard queries:

$ curl -XPOST -H 'Content-Type: application/json' \
'http://172.31.0.77:9200/myfirstindex/_search?pretty' -d '
{
  "query": {
    "wildcard": {
      "country": "*land"
    }
  }
}'

{
...
    "hits" : [
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "cvU96GsBP0-G8XdN24s4",
        "_score" : 1.0,
        "_source" : {
          "name" : "sarah",
          "country" : "ireland",
          "gender" : "female"
        }
      },
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "gPU96GsBP0-G8XdNHoru",
        "_score" : 1.0,
        "_source" : {
          "name" : "kevin",
          "country" : "new zealand",
          "gender" : "male"
        }
      }
    ]
...
}
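
We can also combine clauses with a bool query. For example, the following sketch matches documents whose gender is female and whose country ends in "land", so it should return only the sarah document:

$ curl -XPOST -H 'Content-Type: application/json' \
'http://localhost:9200/myfirstindex/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "gender": "female" } },
        { "wildcard": { "country": "*land" } }
      ]
    }
  }
}'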

Have a look at their documentation for more information on the Search API.

Delete your Index

To wrap this up, we will go ahead and delete our index:

$ curl -XDELETE http://localhost:9200/myfirstindex

Going Further

If this got you curious, then definitely have a look at this Elasticsearch Cheatsheet that I have put together, and if you want to generate lots of data to ingest into your Elasticsearch cluster, have a look at this python script.

Our other links related to ELK:

Complete MySQL dashboard with Grafana & Prometheus


If you are a system administrator or a database administrator, monitoring your MySQL server is a crucial step towards diagnosing problems. You need real-time visibility into the active connections, locks, and queries running on your database, along with the active users and average query times.

To monitor such metrics, we are going to build a complete MySQL dashboard with the help of Grafana and Prometheus.

If you follow this tutorial to the end, you will learn how to build this dashboard and will end up with a collection of 10+ MySQL dashboards designed by Percona. You can bet that there will be a dashboard for your requirements!

What You Will Learn

Here is what you will learn in this tutorial:

  • What a basic Prometheus monitoring architecture looks like;
  • How to install and configure a Prometheus server on your Linux machine;
  • How to configure Grafana to import Percona’s MySQL dashboards in minutes;
  • How to set up the MySQL server exporter and how to bind it to Prometheus.

Enthusiastic to start learning?

MySQL, Grafana & Prometheus Architecture

Before diving into MySQL database monitoring with Grafana and Prometheus, you should be aware of what a Prometheus monitoring architecture looks like:

Our definitive guide on Prometheus will give you a clear idea of how Prometheus works with exporters. Exporters bind to existing data sources and retrieve metrics from them. Examples of exporters are the MongoDB exporter, the Elasticsearch exporter, or, in our case, the MySQL exporter.

Exporters are shipped as Docker images or as standalone binaries that you can run as background tasks or services. Here, we are using the MySQL exporter that is available on Prometheus’s official GitHub page.

The MySQL exporter binds to our MySQL instance and exposes metrics directly for Prometheus to consume.

As part of its configuration, Prometheus will bind to the exporter and scrape metrics from it. From there, the metrics will be visible in Grafana.


MySQL, Grafana & Prometheus architecture (diagram)

Quite easy, right?

Now that you better understand how we will build it, let’s install all the tools you need to create your MySQL dashboard with Grafana.

Steps to Install Different Tools for MySQL Database Monitoring with Grafana and Prometheus

  • Install and configure Grafana
  • Install and configure Prometheus
  • Install the MySQL exporter
  • Create a MySQL exporter service

Installing Prometheus

For those who are new to Prometheus, we covered Prometheus installation in our previous tutorial.

From this point, your Prometheus instance should be up and running, with its configuration file at /etc/prometheus/prometheus.yml.

To verify it, head over to http://localhost:9090. You should see the Prometheus web interface running. If not, something is definitely wrong with your installation.
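
You can also check it from the command line; recent Prometheus versions expose a simple health endpoint (adjust the port if you changed the default):

> curl http://localhost:9090/-/healthy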

For now, no metrics are stored in your Prometheus instance, because we have not set up the MySQL exporter yet.

That’s what we are going to do next.

Installing the MySQL exporter

As mentioned before, the MySQL exporter is available here. It is an official exporter maintained under the Prometheus GitHub organization.

The MySQL exporter comes as a standalone binary, but we are going to configure it as a service.

First, create a prometheus user on your instance if it does not already exist.

> sudo useradd -rs /bin/false prometheus

As a quick reminder, this command creates a system (-r) user named prometheus with no shell access (-s /bin/false). This is the user you will use to run your MySQL exporter service.

Next, download the latest MySQL exporter binaries onto your instance.

All distributions are available here. Find the one that suits your needs and run:

> wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.11.0/mysqld_exporter-0.11.0.linux-amd64.tar.gz

Now that the binaries are downloaded, extract them in your current folder.

> tar xvzf mysqld_exporter-0.11.0.linux-amd64.tar.gz

Move the binary to the /usr/local/bin folder, as we are going to build a service out of it. You need sudo rights to perform this operation.

> cd mysqld_exporter-0.11.0.linux-amd64/
> sudo mv mysqld_exporter /usr/local/bin/

From there, you should be able to create a user for the exporter on your MySQL database.

Run the MySQL shell, and configure your database as follows (you should have the rights to grant permissions on your database).

> sudo mysql
> CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'password' WITH MAX_USER_CONNECTIONS 3;
> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';

By running those commands, you create an exporter user on MySQL whose password is 'password'. Next, you are going to put those credentials in a configuration file.

In your /etc folder, create a new configuration file named .exporter.cnf and write the following credentials to it.

> cd /etc
> sudo vi .exporter.cnf

Set the credentials in the configuration file as follows:

[client]
user=exporter
password=password

(If you set different users or passwords in the step before, you need to reflect the changes in the configuration file.)
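
Since this file contains database credentials, it is a good idea to restrict it to the user that will run the exporter (here we assume the prometheus user created earlier):

> sudo chown prometheus:prometheus /etc/.exporter.cnf
> sudo chmod 600 /etc/.exporter.cnf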

Creating a MySQL exporter service

Now that everything is ready, it is time to create a service for your MySQL exporter. Head over to /lib/systemd/system and create a new service file.
> sudo vi /lib/systemd/system/mysql_exporter.service

Paste the following configuration into it:

[Unit]
Description=MySQL Exporter

[Service]
Type=simple
User=prometheus
Restart=always
ExecStart=/usr/local/bin/mysqld_exporter \
--config.my-cnf /etc/.exporter.cnf \
--collect.auto_increment.columns \
--collect.binlog_size \
--collect.engine_innodb_status \
--collect.engine_tokudb_status \
--collect.global_status \
--web.listen-address=0.0.0.0:9104

[Install]
WantedBy=multi-user.target

Reload the systemd daemon and start your service:

> sudo systemctl daemon-reload
> sudo systemctl start mysql_exporter.service

Check that your service is running by issuing the following command:

> sudo systemctl status mysql_exporter
● mysql_exporter.service - MySQL Exporter
   Loaded: loaded (/lib/systemd/system/mysql_exporter.service; disabled; vendor preset: enabled)
   Active: active (running) since Sat 2019-06-08 15:11:12 UTC; 5min ago
 Main PID: 3136 (mysqld_exporter)
    Tasks: 8 (limit: 4704)
   CGroup: /system.slice/mysql_exporter.service
           └─3136 /usr/local/bin/mysqld_exporter --config.my-cnf /etc/.exporter.cnf

Congratulations!

Your MySQL exporter is all set. Now it is time to configure Prometheus to scrape it.

Quick note: your MySQL exporter runs on port 9104 and MySQL runs on port 3306 by default.
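
To see the raw metrics the exporter exposes, you can curl its metrics endpoint directly; the output is long, so we only peek at the first few lines:

> curl -s http://localhost:9104/metrics | head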

Configuring Prometheus

Prometheus scrapes targets via its configuration file. As we added a new exporter, let’s add it to the Prometheus configuration.

Head over to the location of your Prometheus configuration file and edit it as follows:

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090', 'localhost:9104']
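
Adding the exporter to the existing 'prometheus' job works, but if you prefer a dedicated job (so the MySQL metrics get their own job label), a sketch replacing the snippet above could look like this:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'mysql'
    static_configs:
      - targets: ['localhost:9104']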

Restart Prometheus, and head over to Prometheus Web Interface (http://localhost:9090).

Go to the targets tab, and make sure that Prometheus is correctly scraping the MySQL exporter target.

Back to the ‘graph’ tab, in the expression text field, type ‘mysql_exporter_scrapes_total‘. If you see a result there, it means that your Prometheus instance is correctly configured to extract metrics from the exporter.
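
The same check can be scripted against Prometheus's HTTP query API; mysql_up is another metric exposed by the exporter and equals 1 when the exporter can reach MySQL (a quick sketch):

> curl 'http://localhost:9090/api/v1/query?query=mysql_up'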

Congratulations! The entire configuration is now done!

Installing Grafana

Grafana will be used to display our metrics. The steps to install Grafana were detailed in one of our articles already; make sure to read it before you continue.


If you are looking for a tutorial to install it on Ubuntu 18.04, I wrote a detailed Grafana installation guide for Ubuntu users.

How to Create the MySQL dashboard with Grafana

In this case, you have essentially two choices:

  • Create your own customized Grafana dashboard: tailored to your needs, you can choose what metrics you want to display and how you want to display them.
  • Use existing community dashboards and save some time.

In this tutorial, we are going to go for option two: we will use the excellent Percona MySQL dashboards and import them right into our Grafana instance.

Configuring Prometheus data source

Before starting, and if you did not do it already, you need to configure Prometheus as a Grafana data source.

Create a new data source, and configure it as follows:

Prometheus data source configuration in Grafana (screenshot)

If your Prometheus instance is not running on port 9090 by default, make sure to write the correct port in the configuration.

Now that Prometheus is configured, we can browse Percona’s existing dashboards and import one of the existing MySQL dashboards in Grafana.

Percona dashboards are available here. You can also play with the existing dashboards on Percona’s own Grafana instance here.


In the dashboards folder of Percona’s GitHub, download the json file that you are interested in.

> wget https://raw.githubusercontent.com/percona/grafana-dashboards/master/dashboards/MySQL_Overview.json

Now that your dashboard is downloaded, in Grafana, go to Dashboards > Import > Upload .json file.


If you press “Import”, your entire MySQL dashboard will be created automatically and it will start displaying data immediately!
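
If you prefer to script this step instead of clicking through the UI, Grafana's dashboard API can push the JSON for you. This is a rough sketch that assumes a local Grafana on port 3000 and an API key stored in the GRAFANA_API_KEY environment variable; dashboards exported with template inputs (like Percona's) may need their data source references adjusted in the JSON before they import cleanly:

> curl -X POST -H "Authorization: Bearer $GRAFANA_API_KEY" \
  -H "Content-Type: application/json" \
  http://localhost:3000/api/dashboards/db \
  -d "{\"dashboard\": $(cat MySQL_Overview.json), \"overwrite\": true}"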

The imported MySQL Overview dashboard in Grafana (screenshot)

This is the MySQL Overview dashboard, but there are more than 20 dashboards for you to choose from.

Here is the complete list of dashboards created by Percona. It is a set of Grafana dashboards for database and system monitoring using the Prometheus data source:

  • Amazon RDS OS metrics (CloudWatch data source)
  • Cross-Server Graphs
  • Disk Performance
  • Disk Space
  • MongoDB Cluster Summary
  • MongoDB Overview
  • MongoDB ReplSet
  • MongoDB RocksDB
  • MongoDB Wired Tiger
  • MongoDB MMAPv1
  • MongoDB InMemory
  • MySQL InnoDB Metrics
  • MySQL InnoDB Metrics Advanced
  • MySQL InnoDB Compression
  • MySQL MYISAM/Aria Metrics
  • MySQL Overview
  • MySQL Performance Schema
  • MySQL Query Response Time
  • MySQL Replication
  • MySQL Table Statistics
  • MySQL TokuDB Graphs
  • MySQL User Statistics
  • MySQL Command Handler Counters Compare
  • PXC/Galera Cluster Overview
  • PXC/Galera Graphs
  • Prometheus
  • ProxySQL Overview
  • Summary Dashboard
  • System Overview
  • Trends Dashboard

Going Further with MySQL and Grafana

If you want to dig a little bit more into the subject, many videos can help you have an in-depth understanding of how companies are building dashboards, especially MySQL dashboards with Grafana.

Very recently, in February 2019, Peter Zaitsev (CEO at Percona) gave a great talk on the subject. The entire business model of Percona is built on monitoring MySQL, MongoDB, and MariaDB at scale.

A second use case comes from Verizon, which demonstrated how Grafana and MySQL can be used to streamline automation and self-service in practice.

I hope that you learned something new today. If you did, make sure to leave us a comment on how you plan to apply this knowledge to your own needs.

Until then, have fun, as always.