InfluxDays London Recap

InfluxDays London 2019 Recap | Key Facts of InfluxDays London 2019

InfluxDays London is an annual convention held by InfluxData, which took place on the 13th and 14th of June 2019, where you get general insights into the recent developments of the company and of the industry.

Beyond development insights, it also highlights various guest speakers whose feedback covers real-world use cases and reality checks. In 2019, the convention was hosted at The Brewery, a venue specialized in hosting corporate events in the Finsbury district of London.

InfluxDays London 2019 Recap

Given the very fast pace of changes in the industry with new languages, new APIs, new partnerships, and a complete shift from 1.x to 2.x, expectations were high.

I wanted to have an appropriate section for the recent industry collaborations and a community chapter to end this recap.

So, in this recap, you will get a complete look at all the latest and most innovative features announced for the future of InfluxDB. You can also keep an eye on the key facts that were announced and on what the future holds for developers.

Without wasting another single minute, let’s check out some key facts of the InfluxDays London 2019 recap below:

I – From InfluxDB 1.x to InfluxDB 2.x

If you have been working with InfluxDB for quite some time, you already know about the TICK stack, composed of Telegraf, InfluxDB, Chronograf, and Kapacitor.

All those components cover a very specific use case, whether it is about visualization with Chronograf, stream processing and analysis with Kapacitor, or metrics gathering using Telegraf.

However, some of those components showed limitations, and InfluxQL, while a great on-ramp for the platform at the beginning, was unable to scale up.

That’s the reason why InfluxData invested in InfluxDB 2.0, unveiling the Flux language and revamping the whole platform thus inherently putting an end to the TICK stack as we used to know it.

The announcements made in InfluxDays London confirmed the transition.

a – InfluxDB as a single platform

Influx is much more than just InfluxDB. Even if the product is called InfluxDB 2.0, the single UI platform that InfluxData is building is doing way more than that.

Right now, Chronograf is integrated directly into the InfluxDB 2.0 platform. You have a real coexistence of the two products and you don’t have to rely on two different products to visualize data anymore.


As you can see, InfluxDB 2.0 is presented as a time series database, a visualization layer (the Chronograf aspect of it), and a query and task engine (tasks will be explained a little bit later).

Having a single platform for everything is interesting.

Developers are often hesitant about installing many different tools as it means that it will create maintenance costs for every single tool in the stack.

With Chronograf merged in and only Telegraf remaining an entity on its own, data visualization and manipulation are now done in a single place.


Opinion

Having a single platform for pretty much everything is definitely interesting.

It is definitely a great UI/UX challenge as it can be hard to design tools that are functionally that ambitious.

However, from the live demonstrations that I saw on Friday (the workshop day), the process seems smooth enough.

InfluxData definitely delivers on the “Time To Awesome” promise. Besides being a great marketing catchword, it is actually reinforced by the predefined templates (for your system, a Docker or a Kubernetes instance) that can create a complete dashboard in seconds.


Links

InfluxDB 2.0 documentation

InfluxDB 2.0 Announcements and Future Ahead – by Paul Dix

b – Emphasis on Flux

As stated before, Flux is a new data scripting and query language built by InfluxData.


Flux comes as a replacement for InfluxQL and provides an API built on the concept of piping, i.e. sequencing operations in order to manipulate data. You can, for example, send data to third-party services, pivot it, or perform join operations on it.
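
To give you an idea of this piping style, here is a minimal sketch of a Flux query run through the influx CLI; the bucket, measurement, and field names are only illustrative:

$ influx query '
// average system CPU usage over the last hour
from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
  |> mean()
'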

For those interested, I wrote an article on the difference between Flux and SQL.

But Flux isn’t only a language. It also has a query planner and a query optimizer, as well as a CLI (a REPL) that one can use in order to directly test Flux functions.

The major evolution here is that Flux is seen as a first-class citizen, and not a subpart of InfluxDB itself.

In a way, Flux should exist on its own.

InfluxData, through the voice of its CEO Paul Dix, clearly stated that they want Flux to integrate with other systems and that other languages might be transpiled to Flux in the future.

Maybe we could see an efficient transpilation from SQL to Flux in the future? Opinions diverge.


Opinion

For those following the blog, you know that I advocate for technologies that are designed to reach the widest possible audience. Languages should not be created for elite programmers, especially when they are aimed at data manipulation and analysis.

Flux is definitely designed that way.

With an expressive API that one can easily understand, it should not be too hard for non-technical people to use it and bring value easily.

Even if Flux is theoretically impressive, and well documented, I believe that its potential will be confirmed by real use cases and popularized by developers becoming way more productive than with other languages.

c – Tasks

In short, tasks are scheduled jobs that one can run in order to manipulate, transform or transfer data.

They can be seen as cron jobs following a very special syntax close to JSON.
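
To illustrate, here is a minimal sketch of what a task script could look like; the bucket names and schedule are hypothetical and only show the general shape:

$ cat > cpu-downsample.flux << 'EOF'
// run every hour and downsample CPU metrics to 5-minute averages
option task = {name: "cpu-downsample", every: 1h}

from(bucket: "telegraf")
  |> range(start: -task.every)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> aggregateWindow(every: 5m, fn: mean)
  |> to(bucket: "telegraf_downsampled")
EOF

Such a script can then be pasted into the Tasks section of the UI, or registered through the CLI or the tasks API.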


The goal here is to have small units that perform tasks on their own.

The task contains Flux scripts and should be composable to perform multiple operations.

Administrators are able to create task templates that can then be used by all the users. I believe this is a great point when you want to reuse some of the work you have done previously.

Tasks can be managed via the InfluxDB UI, the CLI, and with the new API that InfluxData is building very specifically for tasks.

This API will provide a set of endpoints (such as /create or /delete) for you to easily perform some administrative operations on your tasks.

Tasks are versioned, so you could theoretically go back to a specific version if your new task isn’t satisfying enough.

Finally, you have task runs and task logs directly in the interface to see when your task was run and whether it ran successfully.

What about alerts and notifications?

Alerts and notifications channels are still a very big part of InfluxDB 2.0. Special APIs were created for alerts and notifications and you should theoretically be able to create, modify or delete alert rules easily.

Opinion

In my opinion, the real challenge with tasks is about defining the best practices around them.

Having tasks that perform small operations is interesting for debugging purposes. Also, with runs and logs, you can have feedback on how your actions perform.

However, having small and independent units can lead to a very large pool of tasks to maintain, as well as code duplication probably leading to bug duplication.

There is a big challenge for developers to create an efficient data flow that is not too convoluted. Tasks are probably easy to create, but they should not be seen as the ultimate option to perform every single operation on your database.

For example, are tasks suited to data cleansing? Wouldn’t it be more appropriate to perform it before inserting your data into the platform?

Do’s and don’ts will definitely be needed.

d – Giraffe and Clockface


Giraffe and Clockface are two products that were announced at InfluxDays London 2019.

Right now, InfluxDB 2.0 provides a wide panel of visualizations for your metrics.

Graphs, scatter plots, single stats, histograms, and many more are provided by default in the interface.

However, InfluxData wants developers to be able to build their own visualizations.

Following the recent moves made by Grafana, moving from Angular panels to React panels, InfluxData created two libraries that allow developers to build their own panels and share them.

Giraffe

Giraffe is a library that is leveraging React components to provide a complete API for developers to use.

Pretty much like Grafana plugins, we can imagine that InfluxData is going to build a place for people to share their panels.

For now, the library is still in the pre-alpha stage, but I’ll share more info about it as soon as it becomes available to the public.

Clockface

At first, I had some trouble understanding the actual difference between Giraffe and Clockface.

Giraffe is designed for Chronograf-like panels and Clockface is designed to build visualization applications.

In short, it allows you to tweak the existing interface to add actions that are not existing natively in the platform.

Let’s say that you want to create a button on InfluxDB 2.0 that hides a certain panel that you find not that useful.

You would do it using Clockface, creating a button and performing some JavaScript operations to hide the panel.

e – User Packages

User packages are another big part of the upcoming features of InfluxDB 2.0. Pretty much like NPM packages, Influx packages are meant to allow developers to add any kind of logic to the platform without having to perform a pull request.

Proposals were made for interacting with Influx packages, such as a CLI with custom commands:

> influx package init
> influx package publish

You can define types and rules inside a configuration file that is very similar to a package.json file in Node ecosystems.

Warning: Influx user packages are different from Flux packages.

Flux packages are used to import Flux functions that were already coded and that you would want to use in a new Flux script.
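
As a purely illustrative sketch, importing one of the standard Flux packages inside a query looks roughly like this (bucket name assumed):

$ influx query '
import "strings"

// uppercase the host tag on recent points
from(bucket: "telegraf")
  |> range(start: -5m)
  |> map(fn: (r) => ({r with host: strings.toUpper(v: r.host)}))
'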


II – Recent Industry Collaborations

In this industry, I believe that collaboration is one of the ways to succeed.

Most of the existing products (at least Grafana, InfluxDB, and Prometheus) are working towards improving the interoperability of their respective products.

What do we mean by interoperability?

Even if those tools do not accept the same metrics format and do not deal with the same APIs, there is work in progress to accept various different formats in InfluxDB.

a – Grafana

Even if InfluxDB 2.0 definitely overlaps with some of the features of Grafana, Grafana still provides new ways to monitor your InfluxDB metrics.

Both InfluxDB and Grafana are working on data exploration features.

Grafana wants to become an observability platform. It means that Grafana can be used to monitor real-time data, but it also can be used to have a closer look at certain metrics, and even compare them with metrics from other buckets or databases.

b – Prometheus & Flux Transpilation

b – Prometheus & Flux Transpilation prom

We know that Prometheus and Flux work quite differently when it comes to data querying.

As a reminder, I explained how Prometheus querying via PromQL worked in my Prometheus guide.

One piece of news that was unveiled at InfluxDays London 2019 was the transpiling work done from Prometheus to Flux.

The goal of this project is to have InfluxDB as long-term storage for Prometheus. Prometheus queries would be run against this transpiler to produce Flux code.

Flux code would then be interpreted by the dedicated Flux interpreter and run on the Flux engine.

Julius Volz revealed his advancements on this project, explaining how he designed the data mapper (with metric names as fields, the measurement being ‘Prometheus’, label names being column names, and so on).

He also explained how he used abstract syntax trees to map PromQL functions to a Flux pipeline.

Remark: there are still some concerns regarding the performance of the transpiling process.

InfluxData and Prometheus are working on a way to make joint operations more efficient, thus inducing faster transpiling operations.


III – Real World Use Cases

During InfluxDays London, many of the talks were given by guest speakers. Guests came from very different industries and showcased how they used InfluxDB for their own needs.

This is in my opinion the most interesting part of the convention, as it featured many different use cases that highlighted how InfluxDB is solving problems in the real world.

a – Playtech


This talk was presented by Alex Tavgen, technical architect at Playtech, a popular gambling software development company.

Alex Tavgen unveiled his views on architecture monitoring solutions and provided very concrete examples of how they should be designed.

His talk was centered around the following points:

  • “Managing a zoo” is hard: when developers are free to choose their stacks, it often leads to teams with many different stacks, which creates obvious maintenance issues;
  • Abusive use of dashboards: it is tempting to record everything, just in case. More is not better in monitoring, and metrics should not be monitored for the sake of it. We need to bring actual sense to monitoring;
  • Are dashboards useless? No, but humans need to know what they have to look for in a dashboard. Dashboarding needs guidance and documentation;
  • Anomaly detection to preserve client trust: you should tell the client that he has an issue, and not the other way around. Many investments were made to improve actual anomaly detection, not just outlier detection. Alex Tavgen explained how he used machine learning to create a predictive engine that raises alerts when it truly matters (they also plan to open-source their Java engine).

b – Worldsensing


This talk was presented by Albert Zaragoza, CTO and Head of Engineering at Worldsensing. Worldsensing is a company that provides monitoring solutions and intelligent components, leveraging IoT and time series databases to build smart cities. They provide intelligent parking solutions, as well as traffic flow solutions, among other products.

Worldsensing produces client-facing applications, providing real-time feedback with extensive use of monitoring dashboards. As part of their infrastructure, Worldsensing uses Kapacitor and Flux in order to provide fast and reliable information to their end clients.

I will provide a link to the video as soon as it is available. However, if you use Postgres databases or if you are interested in providing spatio-temporal feedback to your clients, you should definitely check what they are doing at Worldsensing.

IV – Community

InfluxDays London 2019 was also a great moment for us to talk and share our experiences with InfluxDB and the other products of its ecosystem.

a – InfluxAces

InfluxAces share their knowledge in a variety of ways, online and offline, through blogging, podcasting, attending meetups, answering community questions, building courses... the list goes on!

And I am part of them now! Thank you!


As part of the InfluxAces program, I want to encourage creators and builders to submit their original articles to us, as long as they are related to the subjects detailed in this article and, of course, to InfluxDB. On my side, I will keep on writing about those technologies because I believe that they will shape the world of tomorrow.

b – Community Links

Meetups and conventions are great to connect with people. But they are not the only way to do it.

If you are interested in InfluxDB, you should:

  • Join the Slack community available here.
  • Ask your questions in the community forum (and have your questions answered by MarcV)
  • Tweet to or about InfluxDB on Twitter.

A Closing Word

On a technological level, it was truly inspiring to see how fast those technologies change and evolve in such a short amount of time.

The progress made so far is huge and it looks very promising for the future.

At InfluxDays London, I was lucky to be able to connect with passionate and humble people from all over the world. I was able to share my views on the future and to listen to what the others had to say about it.

Sure, I learned more about InfluxDB, but I learned even more about the people behind the scenes, whether they are working for InfluxData or not.

Engineers, developers, managers, or just curious individuals, all share the same ambition to create, learn and give back to the community.

The open-source community is definitely an amazing community.

See you next year!

How To Install and Configure Debian 10 Buster with GNOME

Do you need an ultimate guide to install and configure Debian 10 Buster with GNOME? This tutorial is the best option for you. Here, we have provided step-by-step instructions on how to install Debian 10 Buster with a GNOME desktop. Just have a look at the features of Debian 10 before moving on to how to install and configure it with GNOME.

What is Debian?

Debian is an operating system for a wide range of devices, including laptops, desktops, and servers. The developers of Debian provide security updates for all packages for almost all of their lifetime. The current stable distribution of Debian is version 10, codenamed Buster. Check out the features of the current Buster version in the modules below.

Features of Debian 10 Buster

Initially, it was released on the 6th of July 2019, and it comes with a lot of great features for system administrators. Have a look at them:

  • JDK update from the OpenJDK 8.0 to the new OpenJDK 11.0 version.
  • Debian 10 is now using version 3.30 of GNOME, featuring an increased desktop performance, screen sharing, and improved ways to remotely connect to Windows hosts.
  • Secure boot is now enabled by default, which means that you don’t have to disable it when trying to install Debian 10 on your machine.
  • Upgrade to Bash 5.0 essentially providing more variables for sysadmins to play with (EPOCHSECONDS or EPOCHREALTIME for example).
  • A lot of software updates: Apache 2.4.38, systemd 241, Vim 8.1, Python 3.7.2, and many more.
  • IPtables is being replaced by NFtables, providing an easier syntax and a more efficient way to handle your firewall rules.

After going through the points above, you know what’s available in the brand new Debian 10 Buster distribution. Now it’s time for the installation and configuration of Debian 10.

Do Check: How To Install InfluxDB on Windows

Suggested System Requirements for Debian 10

  • 2 GB RAM
  • 2 GHz Dual Core Processor
  • 10 GB Free Hard disk space
  • Bootable Installation Media (USB/ DVD)
  • Internet connectivity (Optional)

Now, dive into the installation & configuration steps of Debian 10 Buster

How to Install and Configure Debian 10 with GNOME?

The following are the detailed steps to install and configure the current version of Debian 10 using GNOME:

Steps to Create a Bootable USB stick on Linux

In order to install Debian 10 buster, you need to “flash” an ISO image to a USB stick, making it “bootable“.

The Debian 10 Buster image is about 2 GB in size (if you choose to have a desktop environment with it), so I would recommend that you choose a USB drive of at least 3 GB.

If you don’t have a USB drive that large, you can opt for minimal versions of Debian 10 Buster.

I – Create a Bootable USB stick on Linux

In my home setup, I have a Xubuntu 18.04 instance, so this is what I will use to create my bootable image.

Steps are pretty much the same for other distributions. For Windows, you would need to use Rufus to create a bootable image.

a – Plug your USB stick in the USB port

Within a couple of seconds, the operating system should automatically mount your USB drive in your filesystem (it should be mounted at the /media mount point by default).


b – Identify where your USB drive is mounted

To get the mount point of your USB drive, you can use the lsblk command.


As you can see, my USB drive is named “sdb”, it has one partition (part) named “sdb1” and it is mounted on “/media/antoine/7830-961F”.

Alternatively, you could use the df command to have some information about the remaining space on your USB drive.

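Since the screenshots are not reproduced here, this is roughly what those two commands look like; the device names, sizes, and mount point will obviously differ on your machine:

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 238.5G  0 disk
└─sda1   8:1    0 238.5G  0 part /
sdb      8:16   1   7.5G  0 disk
└─sdb1   8:17   1   7.5G  0 part /media/antoine/7830-961F

$ df -h /media/antoine/7830-961F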

c – Download Debian 10 Buster ISO file

Your USB is ready, now you can download the ISO file to flash your drive.

The distribution images are located here. For this tutorial, I am using the Debian 10 Buster GNOME edition for amd64 processors.

If you are more familiar with another environment like Cinnamon or KDE, they are all available on the downloads page.

Run a simple wget command from any folder that you want (my home folder in this case):

$ wget https://cdimage.debian.org/debian-cd/current-live/amd64/iso-hybrid/debian-live-10.0.0-amd64-gnome.iso

If you need a more minimal distribution, you can go for the netinst version, but desktop environments might not be included.

$ wget https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-10.0.0-amd64-netinst.iso


d – Copy the image to your USB drive

To copy the image, we are going to use the dd command.

$ sudo dd if=/home/antoine/debian-live-10.0.0-amd64-gnome.iso of=/dev/sdb && sync

e – Boot on the USB drive

Now that your USB drive contains the ISO file, it is time for you to boot from it.

On most configurations, you should be able to boot on the USB by pressing ESC, F1, F2, or F8 when starting your computer.

Follow the Debian 10 Graphical Installation Steps

This is the screen that you should see once you successfully booted on the Debian 10 installer.

Select the “Graphical Debian Installer” option.

First, you are asked to select a language.

I’ll go for English for this one.


On the next screen, you are asked to select a location.

I’ll pick the United States as an example.


Then, choose your keyboard layout. (don’t worry, you can change it later on if you want).

I’ll go for American English for this example.


From there, a couple of automatic checks are done within your installation.

Debian 10 will try to load additional components from the bootable device and it will perform some automatic network checks.


After the checks, you are asked to set a hostname for your computer.

As indicated, this is the name that will be used to identify your computer on a network.

I’ll go for “Debian-10” in this case.
You are asked to configure the domain name for your host. You can leave this option blank.

Be careful on the next step, there is a little bit of a gotcha when it comes to root passwords.

You want to leave this option blank.

As a consequence, Debian will use the password of the user you will create in the next step to perform sudo operations.

Moreover, the root account will be disabled which is interesting for security purposes.

Nonetheless, if you want to specify a specific password for root, you can do it here, but I wouldn’t recommend it.

Click continue, and now it is time for you to specify the real name for the user.

I’ll go for JunosNotes but feel free to mention your real name and first name.


Then, you have to choose a username for your host.

JunoNotes will do the trick for me.


Then, choose a very secure password for your host.


Choose a time zone for your host.

Be careful on this point as time zones are very important when it comes to logging for example.


From there, Debian 10 Buster will start detecting disks on your host.


After it is done, you will be asked for a way to partition your disks.

Go for the Guided (use entire disk) option unless you have special requirements that call for setting up LVM.


Select the disk you want to partition.

In my case, I have only one disk on the system, so I’ll pick it.


For the partitioning scheme, go for “All files in one partition“, which should suit your needs.


For the automatic partitioning, Debian 10 creates two partitions: a primary one and a swap one (used when you run out of memory!).


If you are happy with the partitioning, simply press the “Finish partitioning and write changes to disk” option.

On the next screen, you are asked for confirmation about the previous partitioning.

Simply check “Yes” on the two options prompted.


From there, the installation should begin on your system.


On the next step, you are prompted to choose whether to use a network mirror to supplement the software included on the USB drive.

You want to press “Yes”.


By pressing “Yes”, you are asked to choose a location that is close to your network. I’ll use the United States in this case.


Then, choose a Debian archive mirror for your distribution.

I’ll stick with the deb.debian.org one.


If you are using a proxy, this is where you want to configure it. I am not using one, so I’ll leave it blank.


Debian 10 Buster will start configuring apt and will try to install the GRUB boot loader on your instance.


On the next step, you are asked if you want to install the GRUB boot loader to the master boot record; you obviously want to press “Yes” to that.


On the next screen, select the hard drive where you want the GRUB boot loader to be installed and press Continue.


Done!

The installation should be completed at this point.


On the lock screen, type the password that you set up in the installation phase, and this is the screen that you should see.


Awesome! You now have Debian 10 on your instance.

But this tutorial is not over. Before continuing, there are a few minimal configuration steps that you want to perform on your Debian 10 Buster instance.

Steps to Configure your Debian 10 Buster

Before playing with your new Debian 10 buster machine, there are a few steps that you need to complete.

a – Enable unofficial Debian software download

By default, some Debian software sources (like the tools that you would find in the Software store) are disabled.

To enable them, head to “Activities”, and type “Software Updates”.

In the next window, the first and the last checkboxes should already be checked.

Check the “DFSG-compatible Software with Non-Free Dependencies (contrib)” option and the “Non-DFSG-compatible Software (non-free)” option.


Click on “Close“. From there, you will be asked to confirm your choice by reloading the information about available software.

Simply click on “Reload“.


Head to the Store by typing “Store” into the Activities search box.

If you are seeing third-party applications, it means that the previous step worked correctly.

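If you prefer doing this from the command line, a rough equivalent (assuming the entries in your /etc/apt/sources.list currently end with “main”) is to add the contrib and non-free components yourself and refresh apt:

$ sudo sed -i 's/ main$/ main contrib non-free/' /etc/apt/sources.list
$ sudo apt update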

b – Install wget to download files from the Internet

wget is not installed by default on your instance, so install it:

$ sudo apt install wget


c – Install your NVIDIA drivers

The NVIDIA driver installation process is pretty straightforward.

Simply run the “nvidia-detect” command in your terminal and this utility will tell you which driver you have to install depending on your graphics card.

First, install nvidia-detect

$ sudo apt install nvidia-detect


From there, run the nvidia-detect utility in your command line.

$ nvidia-detect
Detected NVIDIA GPUs:
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 430] [10de:0de1] (rev a1)
Your card is supported by the default drivers.
It is recommended to install the
    nvidia-driver
package.

As you can see, the nvidia-detect utility states that I need to install the nvidia-driver package for my instance, so this is what I am going to do.

$ sudo apt install nvidia-driver

d – Install Dash To Dock

As a Debian user, I hate going to the top left Applications menu just to find my web browser or to browse my filesystem.

As a consequence, similarly to macOS graphical interfaces, I would like a static application dock to be visible on my desktop at all times.


To install it, head to the “Store” by typing “Store” in the Applications search box. This is the window that you should see.


Click on “Add-ons”, and then select the “Shell Extensions tab“. You should see a list of shell extensions available for your Debian distribution.

In the top search bar, type “Dash To Dock“. Click on “Install” when you find the “Dash To Dock” store item.


Simply click “Install” on the next window that is prompted on the screen.


That’s it!

You now have Dash to Dock on your desktop.


Going Further

Your adventure with Debian 10 has only begun, but I would recommend that you start configuring your host if you plan on using it as a server.

Here’s a very good video from Linux Tex that explains all the things that you should do after completing your Debian 10 installation.

Some of the steps are already covered in this tutorial, but for the others, feel free to watch his video as it explains the procedures quite in detail.

 

6 Tips To A Successful Code Review | Best Practices for Effective Code Review

Code review is one of the most prominent tasks in every SDLC. It enhances code quality and makes your codebase more stable. The process of running a code review can be a nightmare for team leads, so we have compiled a few tips and tricks for a successful code review.

Moreover, software code review is a method to make sure that the code meets the functional requirements and helps developers comply with the best coding practices. Based on our knowledge and studies, we have shared 6 easy tips for a successful code review that can help code reviewers and software developers during the review.

6 Tips for Effective Code Review

Often, the code review is a ‘make it or break it‘ moment. It remains an important step in delivering a successful and qualitative product. Now, in this tutorial, we are going to list the 6 best tips for performing a successful and fruitful code review.

Tip 1 – Choose The Right Moment

Performing a successful code review begins by choosing the right moment to perform it.

What is the right moment to perform a code-review?

A good moment holds the following characteristics :

  • Low probability of being interrupted: you should tend to avoid the times of the day when you know, by experience, that people tend to interrupt your work. Days often have calmer periods, and this is when you should hold your code review.
  • Right state of mind: you weren’t stressed ten minutes before, nor dreading an important meeting one hour after. Both the reviewer and the reviewee have to be in a good state of mind to guarantee that no turbulence will ruin the code review.
  • The reviewee ensures that the code is review-ready: there is really no point in wasting time reviewing code that is not ready. By ready, it is implicitly meant that the code should be qualitative, commented, and unit-tested at least. If the reviewer allocates time to review some code, the least the reviewee can do is come prepared.


Tip 2– Set The Boundaries Of The Code Review

Before starting the code review, you have to set correct and attainable boundaries for it. That means that you have to know exactly what you are going to cover, ahead of time.

Are we going to review one or multiple features? Given the time we allocated in the step before, does it make sense to review too many features?

A code review can consist of integrating multiple small features or a few big ones.

In many cases, you want to be focused on a few features in order to grasp exactly what they are about, without jumping too frequently from one feature to another.

In order to set boundaries, you may want to :

  • Have well-written and correctly sized tasks in a ticket management system like Jira or even a classic whiteboard for example.
  • Correctly prioritize the code review: some tasks may be more urgent than others. If too many tasks need to be integrated, you cannot perform all of them. Choose wisely.


Tip 3 – Set The Right Environment For It

This point is a very important one.

Before starting any code review, you want to set the correct environment for your code review.

What are the characteristics of a proper environment?

  • Your environment has to be quiet: code reviews cannot be performed efficiently when there is a lot of noise around you. A good focus is essential to be efficient in the task you have to accomplish.
  • No interruption-zone: similar to the point we described in the first part, you may want to be in a room with restricted access, underlining the fact that you don’t want to be disturbed during the process.
  • A positive criticism zone: criticizing somebody’s work is a hard process. It is even harder when a lot of people are around you. In a restricted room, you are free to express your comments without fearing that somebody is going to overhear them.


Tip 4 – Communicate, communicate, communicate

I really wanted to put some emphasis on this point.

A code review is not a one-way process, quite the opposite.

A successful code review relies heavily on being able to respectfully and efficiently communicate with your peer.

A code review isn’t only about the reviewee expressing their intent. Neither is it only about the reviewer stating what’s right and what’s wrong.

It should really be a conversation about everything that composes it: the scope, the intent, the fixes, and even the disagreements.

Here are the keys points for successful communication :

  • Don’t just hear, listen: the code review has to be an understanding and insightful moment. You cannot just listen to the sound of your own voice. Opinions and views may differ, and every disagreement should lead to a constructive discussion about what the issue is.
  • A neutral tone: a code review isn’t a whiteboard exam, nor an inquisition. It should feel like the code is being judged, not the developer behind it.
  • When in doubt, ask: if you’re not sure about a particular detail in the code review, ask about it. Some intent isn’t made clear from the beginning and could lead to misinterpretations.


Tip 5 – Keep In Mind The Deliverables

A code review always leads to a deliverable: it can be a counter-review of the code presented or a code integration to the relevant branches that you chose for your gitflow.

This is where the responsibility of the reviewer is involved. The reviewee is responsible for presenting good code to the reviewer, but the final decision belongs to the reviewer.

How does one judge whether the code can be integrated or not?

Here are some criteria that can help you in your decision :

  • The code written and the development done are relevant to what was asked in the dedicated ticket.
  • The code is correctly unit-tested and is considered safe to merge into the master branches.
  • No obvious code smells or bad practices.

Tip 6 – Use Dedicated Tools

Tools to perform a proper code review exist and you shouldn’t hesitate to use them to ease your integration process.

Three brands are predominant when it comes to code-review software: SmartBear (with Collaborator), Perforce (with Helix Swarm), and Atlassian (with Crucible).

Such tools often provide a complete interface to organize code reviews, as well as metrics in order for your team to continuously improve.

They are way more than code comparison tools, and they integrate with pretty much all the source version control systems available.

Give it a try!

Your Turn To Share!

Did those tips help you organize your code reviews?

Does your role in a code review make more sense now?

If you have additional tips to share, or concrete examples that happened to you in your daily developer life, make sure to share them by leaving a comment below.

It really helps new developers.

Do not hesitate to share our other productivity articles and our latest article about completing a side project.

Until then, have fun, as always.

9 Best Practices for Code Review

The following are the nine best practices for a successful code review that help all software developers:

1. Know What to Look for in a Code Review
2. Build and Test — Before Review
3. Don’t Review Code for Longer Than 60 Minutes
4. Check No More Than 400 Lines at a Time
5. Give Feedback That Helps (Not Hurts)
6. Communicate Goals and Expectations
7. Include Everyone in the Code Review Process
8. Foster a Positive Culture
9. Automate to Save Time

Peer Code Review Best Practices

Best Practices for How to Run a Code Review

How To Install an Elasticsearch Cluster on Ubuntu 18.04

Elasticsearch is a platform for distributed search and data analysis in real-time. It offers a multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents with a simple installation.

With Elasticsearch, you can execute and combine various types of searches. Together with tools like Kibana, Logstash, and X-Pack, Elasticsearch can collect and monitor Big Data at a large scale. The cluster we will build includes three data nodes; with this setup, we avoid a split-brain and keep a quorum of master-eligible nodes.

In this free and ultimate tutorial, we are going to learn how to install and configure a 3-node Elasticsearch cluster on Ubuntu 18.04. Along the way, we will go through some API examples on creating indexes, ingesting documents, running searches, and more.

What is ElasticSearch?

Elasticsearch is a highly scalable, open-source analytics and RESTful search engine built on top of Apache Lucene and issued under an Apache license. It is one of the most famous search engines and is generally used for full-text search, log analytics, security intelligence, business analytics, and analyzing big volumes of data in near real-time. Also, Elasticsearch is Java-based and can search and index document files in different formats.

Features of ElasticSearch

Before we get to the main topic, let’s cover some basics about Elasticsearch from below: 

Basic Concepts of Elasticsearch

  • An Elasticsearch Cluster is made up of a number of nodes;
  • Each Node includes Indexes, where an Index is a Collection of Documents;
  • Master nodes are responsible for cluster-related tasks: creating/deleting indexes, tracking nodes, and allocating shards to nodes;
  • Data nodes are responsible for hosting the actual shards that hold the indexed data, and they handle data-related operations like CRUD, search, and aggregations;
  • Indexes are split into Multiple Shards;
  • Shards consist of Primary Shards and Replica Shards;
  • A Replica Shard is a Copy of a Primary Shard that is used for HA/Redundancy;
  • Shards get placed on random nodes throughout the cluster;
  • A Replica Shard will NEVER be on the same node as its associated Primary Shard.

Representation of Nodes, Index and Shards on 2 Nodes (as an example)

Note on Master Elections

The minimum number of master-eligible nodes that need to join a newly elected master in order for an election to complete is configured via the setting discovery.zen.minimum_master_nodes. This configuration is very powerful, as it makes each master-eligible node aware of the minimum number of master-eligible nodes that must be visible in order to form a cluster.

Without this setting or incorrect configuration, this might lead to a split-brain, where let’s say something went wrong and upon nodes rejoining the cluster, it may form 2 different clusters, which we want to avoid at all costs.

From consulting elasticsearch documentation, to avoid a split brain, this setting should be set to a quorum of master-eligible nodes via the following formula:

(master_eligible_nodes / 2) + 1
# in our case:
(3/2) + 1 = 2

It is advised to avoid having only two master-eligible nodes, since a quorum of two is two: losing either node makes it impossible to elect a master. To read more on the Elasticsearch cluster master election process, take a look at their documentation.

Prerequisites

We have to set the internal IP addresses of our nodes either in our hosts file or in a DNS server. To keep it easy and straightforward, I will add them to my hosts file. This needs to be applied on all three nodes:

$ sudo su - 
$ cat > /etc/hosts << EOF
127.0.0.1 localhost
172.31.0.77 es-node-1
172.31.0.45 es-node-2
172.31.0.48 es-node-3
EOF
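
Optionally, verify that the names resolve correctly before going further:

$ ping -c 1 es-node-2
$ ping -c 1 es-node-3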

Now that our host entries are set, we can start with the fun stuff.

Installing Elasticsearch on Ubuntu

The following instructions should be applied to all nodes.

Get the Elasticsearch repositories and update your system so that your servers are aware of the newly added Elasticsearch repository:

$ apt update && apt upgrade -y
$ apt install software-properties-common python-software-properties apt-transport-https -y
$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
$ apt update

Elasticsearch relies on Java, so install the java development kit:

$ apt install default-jdk -y

Verify that java is installed:

$ java -version
openjdk version "11.0.3" 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)

Install Elasticsearch:

$ apt install elasticsearch -y

Once Elasticsearch is installed, repeat these steps on the other nodes. Once that is done, move on to the configuration section.

Configuring Elasticsearch

For nodes to join the same cluster, they should all share the same cluster name.

We also need to specify the discovery hosts as the masters so that the nodes can be discovered. Since we are installing a 3-node cluster, all nodes will act as both master-eligible and data nodes.

Feel free to inspect the Elasticsearch cluster configuration, but I will be overwriting the default configuration with the config that I need.

Make sure to apply the configuration on all nodes:

$ cat > /etc/elasticsearch/elasticsearch.yml << EOF
cluster.name: es-cluster
node.name: \${HOSTNAME}
node.master: true
node.data: true
path.logs: /var/log/elasticsearch
path.data: /usr/share/elasticsearch/data
bootstrap.memory_lock: true
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["es-node-1", "es-node-2"]
EOF

Important settings for your Elasticsearch cluster are described in their docs:

  • Disable swapping
  • Increase file descriptors
  • Ensure sufficient virtual memory
  • Ensure sufficient threads
  • JVM DNS cache settings
  • Temporary directory not mounted with noexec

Increase the file descriptors on the nodes, as instructed by the documentation:

$ cat > /etc/default/elasticsearch << EOF
ES_STARTUP_SLEEP_TIME=5
MAX_OPEN_FILES=65536
MAX_LOCKED_MEMORY=unlimited
EOF

Ensure that pages are not swapped out to disk by requesting the JVM to lock the heap in memory by setting LimitMEMLOCK=infinity.

Set the maximum file descriptor number for this process: LimitNOFILE and increase the number of threads using LimitNPROC:

$ vim /usr/lib/systemd/system/elasticsearch.service
[Service]
LimitMEMLOCK=infinity
LimitNOFILE=65535
LimitNPROC=4096
...
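
Since we edited the systemd unit file directly, reload the systemd configuration so that the new limits are taken into account the next time the service starts:

$ systemctl daemon-reload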

Allow the elasticsearch user to lock memory by setting the memlock limits in /etc/security/limits.conf:

$ cat > /etc/security/limits.conf << EOF
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
EOF

Increase the value of the max map count, as Elasticsearch uses an mmapfs directory by default to store its indices:

$ sysctl -w vm.max_map_count=262144

For a permanent setting, update vm.max_map_count in /etc/sysctl.conf and run:

$ sysctl -p /etc/sysctl.conf
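
If the setting is not present in the file yet, appending it could look like this (check the file first to avoid duplicate entries):

$ echo "vm.max_map_count=262144" >> /etc/sysctl.conf
$ sysctl -p /etc/sysctl.conf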

Change the permissions of the Elasticsearch data path, so that the elasticsearch user and group have permission to read and write from the configured path:

$ chown -R elasticsearch:elasticsearch /usr/share/elasticsearch

Make sure that you have applied these steps to all the nodes before continuing.

Start Elasticsearch

Enable Elasticsearch on boot time and start the Elasticsearch service:

$ systemctl enable elasticsearch
$ systemctl start elasticsearch
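
If you want to follow what the nodes are doing while they start up and join the cluster, you can tail the service logs:

$ journalctl -u elasticsearch -f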

Verify that Elasticsearch is running:

$ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 :::9200                 :::*                    LISTEN      278/java
tcp6       0      0 :::9300                 :::*                    LISTEN      278/java

Using Elasticsearch Restful API

In this section we will get comfortable with using Elasticsearch API, by covering the following examples:

  • Cluster Overview;
  • How to view Node, Index, and Shard information;
  • How to Ingest Data into Elasticsearch;
  • How to Search data in Elasticsearch;
  • How to delete your Index

View Cluster Health

From any node, use an HTTP client such as curl to investigate the current health of the cluster by looking at the cluster API:

$ curl -XGET http://localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "es-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

As you can see the cluster status is Green, which means everything works as expected.

In Elasticsearch you get Green, Yellow and Red statuses. Yellow would essentially mean that one or more replica shards are in an unassigned state. Red status means that some or all primary shards are unassigned which is really bad.

From this output, we can also see the number of data nodes, primary shards, unassigned shards, etc.

This is a good place to get an overall view of your Elasticsearch cluster’s health.
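
If you only need that status line, the _cat/health API returns a compact one-line summary that includes the same status column:

$ curl -XGET http://localhost:9200/_cat/health?v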

View the Number of Nodes in your Cluster

By looking at the /_cat/nodes API, we can get information about the nodes that are part of our cluster:

$ curl -XGET http://localhost:9200/_cat/nodes?v
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.31.0.77              10          95   0    0.00    0.00     0.00 mdi       -      es-node-1
172.31.0.45              11          94   0    0.00    0.00     0.00 mdi       -      es-node-2
172.31.0.48              25          96   0    0.07    0.02     0.00 mdi       *      es-node-3

As you can see, we can see information about our nodes such as the JVM Heap, CPU, Load averages, node role of each node, and which node is master.

As we are not running dedicated masters, we can see that node es-node-3 got elected as master.

Create your first Elasticsearch Index

Note that when you create an index, the default primary shards are set to 5 and the default replica shard count is set to 1. You can change the replica shard count after an index has been created, but not the primary shard count as that you will need to set on index creation.
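
If you need different values, both counts can be set explicitly in the request body at creation time; here is a hypothetical example (the index name is only for illustration):

$ curl -XPUT -H 'Content-Type: application/json' \
http://localhost:9200/anotherindex -d '
{"settings": {"number_of_shards": 3, "number_of_replicas": 1}}'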

Let’s create an Elasticsearch index named myfirstindex:

$ curl -XPUT http://localhost:9200/myfirstindex
{"acknowledged":true,"shards_acknowledged":true,"index":"myfirstindex"}

Now that your index has been created, let’s have a look at the /_cat/indices API to get information about our indices:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   1          0            0      1.2kb           650b

From the output, you will find that we have 5 primary shards with a replica count of 1, with 0 documents in our index, and that our cluster is in a green state, meaning that our primary and replica shards have been assigned to the nodes in our cluster.

Note that a replica shard will NEVER reside on the same node as the primary shard for HA and Redundancy.

Let’s go a bit deeper and have a look at the shards, to see how our shards are distributed through our cluster, using the /_cat/shards API:

$ curl -XGET http://localhost:9200/_cat/shards?v
index        shard prirep state   docs store ip             node
myfirstindex 1     r      STARTED    0  230b 172.31.0.77    es-node-2
myfirstindex 1     p      STARTED    0  230b 172.31.0.48    es-node-3
myfirstindex 4     p      STARTED    0  230b 172.31.0.48    es-node-3
myfirstindex 4     r      STARTED    0  230b 172.31.0.77    es-node-1
myfirstindex 2     r      STARTED    0  230b 172.31.0.45    es-node-2
myfirstindex 2     p      STARTED    0  230b 172.31.0.77    es-node-1
myfirstindex 3     p      STARTED    0  230b 172.31.0.45    es-node-2
myfirstindex 3     r      STARTED    0  230b 172.31.0.48    es-node-3
myfirstindex 0     p      STARTED    0  230b 172.31.0.45    es-node-2
myfirstindex 0     r      STARTED    0  230b 172.31.0.77    es-node-1

As you can see, each replica shard is placed on a different node than its primary shard.

Replicating a Yellow Cluster Status

For a yellow cluster status, we know that it’s when one or more replica shards are in an unassigned state.

So let’s replicate that behavior by scaling our replica count to 3, which would mean that 5 replica shards will be in an unassigned state:

$ curl -XPUT -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_settings -d \
'{"number_of_replicas": 3}'

Now we have scaled the replica count to 3, but since we only have 3 nodes, we will have a yellow state cluster:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   3          0            0      3.3kb           1.1kb

The cluster health status should show the number of unassigned shards, and while they are unassigned we can verify that by looking at the shards API again:

$ curl -XGET http://localhost:9200/_cat/shards?v
index        shard prirep state      docs store ip             node
myfirstindex 1     r      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 1     p      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 1     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 1     r      UNASSIGNED
myfirstindex 4     r      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 4     p      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 4     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 4     r      UNASSIGNED
myfirstindex 2     r      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 2     r      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 2     p      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 2     r      UNASSIGNED
myfirstindex 3     p      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 3     r      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 3     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 3     r      UNASSIGNED
myfirstindex 0     p      STARTED       0  230b 172.31.0.45    es-node-2
myfirstindex 0     r      STARTED       0  230b 172.31.0.48    es-node-3
myfirstindex 0     r      STARTED       0  230b 172.31.0.77    es-node-1
myfirstindex 0     r      UNASSIGNED

At this point in time, we could either add another node to the cluster or scale the replication factor back to 1 to get the cluster health to green again.

I will scale it back down to a replication factor of 1:

$ curl -XPUT -H 'Content-Type: application/json' http://localhost:9200/myfirstindex/_settings -d '{"number_of_replicas": 1}'

Ingest Data into Elasticsearch

We will ingest 3 documents into our index. Each will be a simple document consisting of a name, a country, and a gender, for example:

{ 
  "name": "james", 
  "country": "south africa", 
  "gender": "male"
}

First, we will ingest the document using a PUT HTTP method. When using a PUT method, we need to specify the document ID.

PUT methods will be used to create or update a document. For creating:

$ curl -XPUT -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_doc/1 -d '
{"name":"james", "country":"south africa", "gender": "male"}'

Now you will find that our index holds one document:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   1          1            0     11.3kb          5.6kb

Since we know that the document ID is “1”, we can do a GET on the document ID to read the document from the index:

$ curl -XGET http://localhost:9200/myfirstindex/_doc/1?pretty
{
  "_index" : "myfirstindex",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "james",
    "country" : "south africa",
    "gender" : "male"
  }
}

If we ingest documents with a POST request, Elasticsearch generates the document ID for us automatically. Let’s create 2 documents:

$ curl -XPOST -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_doc/ -d '
{"name": "kevin", "country": "new zealand", "gender": "male"}'

$ curl -XPOST -H 'Content-Type: application/json' \
http://localhost:9200/myfirstindex/_doc/ -d '
{"name": "sarah", "country": "ireland", "gender": "female"}'

When we have a look again at our index, we can see that we now have 3 documents in our index:

$ curl -XGET http://localhost:9200/_cat/indices?v
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myfirstindex xSX9nOQJQ2qNIq4A6_0bTw   5   1          3            0       29kb         14.5kb

Search Queries

Now that we have 3 documents in our Elasticsearch index, let’s explore the search APIs to get data from our index. First, let’s search for the keyword “sarah” using the q query string parameter:

$ curl -XGET 'http://localhost:9200/myfirstindex/_search?q=sarah&pretty'
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "cvU96GsBP0-G8XdN24s4",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "sarah",
          "country" : "ireland",
          "gender" : "female"
        }
      }
    ]
  }
}

We can also narrow our search query down to a specific field, for example, all the documents where the name is kevin:

$ curl -XGET 'http://localhost:9200/myfirstindex/_search?q=name:kevin&pretty'
{
...
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "gPU96GsBP0-G8XdNHoru",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "kevin",
          "country" : "new zealand",
          "gender" : "male"
        }
      }
    ]
  }
}

With Elasticsearch, we can also put our query in the request body. A query similar to the one above would look like this:

$ curl -XPOST -H 'Content-Type: application/json' \
'http://localhost:9200/myfirstindex/_search?pretty' -d '
{
  "query": {
    "match": {
      "name": "kevin"
    }
  }
}'

{
...
        "_index" : "myfirstindex",
        "_source" : {
          "name" : "kevin",
          "country" : "new zealand",
          "gender" : "male"
        }
...
}

We can use wildcard queries:

$ curl -XPOST -H 'Content-Type: application/json' \
'http://172.31.0.77:9200/myfirstindex/_search?pretty' -d '
{
  "query": {
    "wildcard": {
      "country": "*land"
    }
  }
}'

{
...
    "hits" : [
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "cvU96GsBP0-G8XdN24s4",
        "_score" : 1.0,
        "_source" : {
          "name" : "sarah",
          "country" : "ireland",
          "gender" : "female"
        }
      },
      {
        "_index" : "myfirstindex",
        "_type" : "_doc",
        "_id" : "gPU96GsBP0-G8XdNHoru",
        "_score" : 1.0,
        "_source" : {
          "name" : "kevin",
          "country" : "new zealand",
          "gender" : "male"
        }
      }
    ]
...
}

Have a look at the Elasticsearch documentation for more information on the Search API.

Delete your Index

To wrap this up, we will go ahead and delete our index:

$ curl -XDELETE http://localhost:9200/myfirstindex

Going Further

If this got you curious, then definitely have a look at this Elasticsearch Cheatsheet that I’ve put together, and if you want to generate lots of data to ingest into your Elasticsearch cluster, have a look at this Python script.

Our other links related to ELK:

MongoDB Monitoring with Grafana & Prometheus

MongoDB Monitoring with Grafana & Prometheus | Mongodb Prometheus Grafana Dashboard

If you are a web application developer or a database administrator, your infrastructure likely relies on MongoDB in some way. Monitoring MongoDB is very important to ensure that you are not facing memory or performance issues with your database.

There are various ways to monitor a MongoDB instance, but in this guide we are going to focus on MongoDB database monitoring with Grafana and Prometheus.

Do Check: Complete MySQL dashboard with Grafana & Prometheus

Are you ready? Then check out the concepts available here which you will learn from the following tutorial:

  • What your Grafana – Prometheus – MongoDB exporter architecture will look like
  • How to install Prometheus, a modern time-series database, on your computer;
  • How to configure and import a MongoDB dashboard in seconds
  • How to set up the MongoDB exporter developed by Percona, and bind it to MongoDB;

Note: Percona’s MongoDB exporter incorporates MongoDB stats for sharding and replication, as it was developed from David Cuadrado’s MongoDB exporter.

MongoDB, Grafana, and Prometheus Architecture

Here’s an entire overview of what the final monitoring architecture looks like.

As a quick reminder, Prometheus scrapes targets. Targets may be instrumented applications (like instrumented Java apps for example), the Pushgateway, or exporters.

Exporters are a way to bind to an existing entity (a database, a reverse proxy server, an application server) to expose metrics to Prometheus.

The MongoDB exporter is one of them.

Prometheus will bind to the MongoDB exporters and store related metrics in its own internal storage system.

From there, Grafana will bind to Prometheus and display metrics on dashboard panels.

Easy, isn’t it?

At last, you have a great understanding of what we are trying to build, let’s install the different tools needed to monitor MongoDB.

Process of Installing & Configuring Prometheus, MongoDB Exporter

Here, we come to the main topic that how to install, configure, set up the tools, and monitor the Mongodb easily:

Installing Prometheus

If you are still a beginner with Prometheus, you will find the complete Prometheus installation in this tutorial.

If you ran the Prometheus installation entirely, you now have your Prometheus up and ready.

To verify it, head over to http://localhost:9090. If you have a web interface close to the one presented just below, it means that your installation went correctly.

No metrics are currently stored, except maybe Prometheus internal metrics.

Run a simple Prometheus query, such as prometheus_http_request_duration_seconds_sum, and make sure that you have some results.
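
If you prefer the command line over the web interface, the same query can also be run against the Prometheus HTTP API; a minimal sketch:

$ curl 'http://localhost:9090/api/v1/query?query=prometheus_http_request_duration_seconds_sum'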

prometheus-web-interface

Now that your Prometheus server is running, let’s install the MongoDB exporter to start monitoring our MongoDB database.

Installing the MongoDB exporter

As explained before, the MongoDB exporter is available on Percona’s GitHub here.

The MongoDB exporter comes as a binary file in an archive, but as always, we are going to configure it as a service.

We are also going to configure it to run on a specific authenticated user dedicated to monitoring.

First, download the MongoDB exporter release file from one of the versions available here.

$ mkdir mongodb-exporter
$ cd mongodb-exporter
$ wget https://github.com/percona/mongodb_exporter/releases/download/v0.7.1/mongodb_exporter-0.7.1.linux-amd64.tar.gz

Note: as of June 2019, the MongoDB exporter version is 0.7.1.

Next, extract the downloaded archive in your current folder.

$ tar xvzf mongodb_exporter-0.7.1.linux-amd64.tar.gz

You should now have 4 files: mongodb_exporter, LICENSE, README.md, CHANGELOG.md.

All files are pretty self-explanatory, but we are only interested in the mongodb_exporter binary file here.

As we are going to configure the exporter as a service, create a dedicated user for Prometheus if not already existing.

$ sudo useradd -rs /bin/false prometheus
$ sudo mv mongodb_exporter /usr/local/bin/

Enabling MongoDB authentication

Every access to your MongoDB instance should be authenticated and authorized.

To ensure it, we are going to set up a basic MongoDB authentication for the MongoDB exporter.

MongoDB authentication is enabled by starting mongod with the --auth flag.

By default, mongod does not set this flag, so you should be able to connect to it via localhost.

$ ps aux | grep mongod
mongodb  13468  1.1  6.9 1490632 281728 ? Ssl  Jan05 2838:27 /usr/bin/mongod --unixSocketPrefix=/run/mongodb --config /etc/mongodb.conf

Connect to your MongoDB instance with mongo.

$ mongo --port 27017

Create an administrator account for your exporter with the cluster monitor role.

use admin
db.createUser(
  {
    user: "mongodb_exporter",
    pwd: "password",
    roles: [
        { role: "clusterMonitor", db: "admin" },
        { role: "read", db: "local" }
    ]
  }
)

You should see the following message

Successfully added user: {                        
        "user" : "mongodb_exporter",              
        "roles" : [                               
                {                                 
                        "role" : "clusterMonitor",
                        "db" : "admin"            
                },                                
                {                                 
                        "role" : "read",          
                        "db" : "local"            
                }                                 
        ]                                         
}

Before exiting, shut down your MongoDB instance from the Mongo shell, and restart it with authentication enabled.

> db.adminCommand( { shutdown: 1 } )
> exit
$ sudo mongod --auth --port 27017 --config /etc/mongodb.conf &

Set your MongoDB URI environment variable, according to the changes that you made before.

$ export MONGODB_URI=mongodb://mongodb_exporter:password@localhost:27017

Creating a service for the MongoDB exporter

Similar to the MySQLd exporter, we are going to set up the MongoDB exporter as a service.

As usual, head over to /lib/systemd/system and create a new service file for the exporter.

$ cd /lib/systemd/system/
$ sudo touch mongodb_exporter.service

Paste the following configuration into your service file:

[Unit]
Description=MongoDB Exporter

[Service]
User=prometheus
Type=simple
Restart=always
ExecStart=/usr/local/bin/mongodb_exporter

[Install]
WantedBy=multi-user.target
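
Note that an environment variable exported in your shell (like the MONGODB_URI we set earlier) is not automatically visible to a systemd service. If the exporter cannot reach MongoDB once it runs as a service, one option is to declare the variable directly in the unit file with the Environment directive; a sketch reusing the credentials created above:

[Service]
User=prometheus
Type=simple
Restart=always
Environment=MONGODB_URI=mongodb://mongodb_exporter:password@localhost:27017
ExecStart=/usr/local/bin/mongodb_exporter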

From there, don’t forget to restart your system daemon and start your service.

$ sudo systemctl daemon-reload
$ sudo systemctl start mongodb_exporter.service

You should always verify that your service is working.

As a quick reminder, Percona’s MongoDB exporter runs on port 9216.

To ensure that everything is working correctly, run a simple curl command on port 9216.

$ sudo curl http://localhost:9216/metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0
...

Can you see some Prometheus metrics already being aggregated?

Great! Your MongoDB exporter is working!

We only need to bind it to Prometheus, and we should be all set.

Configure the MongoDB exporter as a Prometheus target

Almost there!

As described in the schema shown in the architecture section, we are going to bind Prometheus to the new MongoDB exporter.

Head over to the location of your Prometheus configuration file (mine is located at /etc/prometheus/prometheus.yml) and edit it to add the MongoDB exporter as a target.

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    static_configs:
            - targets: ['localhost:9090', 'localhost:9216']
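
Alternatively, you may prefer a dedicated scrape job for the exporter rather than appending it to the prometheus job; a sketch of what that could look like (the job name is my own choice):

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'mongodb_exporter'
    static_configs:
      - targets: ['localhost:9216']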

Restart Prometheus, and go to http://localhost:9090/targets to verify that Prometheus is now bound to your newly added exporter.

mongodb-exporter-running

Great! Everything is up and running now.

The last step will be to create a complete Grafana dashboard to have some insights on our metrics.

Looking for a tutorial to install Grafana? We got it covered in our last article.

Building Awesome MongoDB Dashboards

As explained before, we are going to use dashboards built by Percona to monitor our MongoDB example on Grafana.

Percona provides multiple existing dashboards such as:

  • MongoDB Overview;
  • MongoDB ReplSet;
  • MongoDB RocksDB;
  • MongoDB WiredTiger;
  • MongoDB MMAPv1
  • MongoDB InMemory

For this part, we are going to focus on importing the MongoDB Overview dashboards into our Grafana instance.

Set Prometheus as a Grafana datasource

If you followed our previous tutorials, you are probably a master at it right now.

If that’s not the case, here’s the configuration for Prometheus on Grafana.

prometheus-data-source-1

Downloading Percona dashboards for Grafana

In Grafana, dashboards come as JSON files. When you create a dashboard, you can export it in this format and share it with the world.

Percona provides dozens of dashboards on its Github repository.

In this case, we are going to download the MongoDB Overview dashboard file.

Run a simple wget command to get the JSON file.

You can create a dedicated dashboards folder in your /etc/grafana folder to store your downloaded dashboards.

$ cd /etc/grafana
$ sudo mkdir dashboards
$ cd dashboards
$ sudo wget https://raw.githubusercontent.com/percona/grafana-dashboards/master/dashboards/MongoDB_Overview.json

Note the raw.githubusercontent.com host: the regular github.com/.../blob/... URL would download an HTML page instead of the raw JSON file.

If you want all the dashboards available in the repository, simply clone the repository into your current folder.

$ sudo git clone https://github.com/percona/grafana-dashboards.git

Now that you have the JSON files available, it is time for us to import them into Grafana.

Importing the MongoDB dashboard in Grafana

For this example, we are going to import the MongoDB Overview dashboard for Grafana.

Head over to your Grafana instance, located at http://localhost:3000 (if you didn’t change any default ports in your configuration file)

On the left menu, hover the ‘Plus‘ icon with your mouse and click on Import.

import-dashboard-1

From there, you should be taken to the Import page. Click on the Upload .json file option.

import-json

Depending on the operating system you are working with, navigate to the /etc/grafana folder (where you stored your dashboard), and click on the MongoDB_Overview.json file.

Your dashboard should be imported automatically, with real-time updates of your MongoDB database!

final-dashboard-3

Common Errors

If you carefully followed this tutorial, chances are that you have a fully functional dashboard right now.

However, you might encounter some errors during the process.

Here are some clues on how to solve them:

  • Failed to get server status: not authorized on admin to execute the command

This error message is fairly simple to understand.

Your mongodb_exporter user does not have the necessary credentials to perform queries on the admin database.

IV – Common Errors errors

Clue 1

To resolve it, connect to your instance, use the admin database, and make sure that you configured the mongodb_exporter user correctly (it must have the clusterMonitor role on the admin database).

$ mongo --port 27017 (add -u and -p options if authentication is already enabled on your instance)
> use admin
> db.getUsers()
{
        "_id" : "admin.mongodb_exporter",         
        "user" : "mongodb_exporter",              
        "db" : "admin",                           
        "roles" : [                               
                {                                 
                        "role" : "clusterMonitor",
                        "db" : "admin"            
                },                                
                {                                 
                        "role" : "read",          
                        "db" : "local"            
                }                                 
        ]                                         
}

Clue 2

You didn’t properly set the MongoDB URI environment variable.

To verify it, launch the following command:

$ env | grep MONGODB
MONGODB_URI=mongodb://mongodb_exporter:password@localhost:27017

Clue 3

If this is still not working, set the MongoDB URI directly in the service file, then restart your service as well as your MongoDB instance.

[Service]
Type=simple
Restart=always
ExecStart=/usr/local/bin/mongodb_exporter --mongodb.uri=mongodb://mongodb_exporter:password@localhost:27017

$ sudo systemctl daemon-reload
$ sudo systemctl restart mongodb.service
$ sudo systemctl restart mongodb_exporter.service

  • Could not get MongoDB BuildInfo: no reachable servers!

Clue 1

Your MongoDB database is either not launched, or it is not running on port 27017.

For the first option, verify that your MongoDB service is running, or that your mongod background process is running.

$ sudo systemctl status mongodb.service
or
$ ps aux | grep mongod

For the second option, verify the port used by your MongoDB instance. To do so, run the following command:

$ sudo lsof -i -P -n | grep LISTEN
grafana-s   642         grafana    7u  IPv6  998256601      0t0  TCP *:3333 (LISTEN)
mysqld_ex  3136            root    3u  IPv6  998030854      0t0  TCP *:9104 (LISTEN)
mongod     3688            root   10u  IPv4 1092070442      0t0  TCP 127.0.0.1:27017 (LISTEN)

If your MongoDB instance is not running on the default 27017 port, change your mongodb_exporter service file so that it binds to your custom port.

[Service]
ExecStart=/usr/local/bin/mongodb_exporter --mongodb.uri=mongodb://mongodb_exporter:password@localhost:12345

IV – Common Errors error-2

Going Further

Now that you have a fully operational monitoring pipeline for MongoDB, it is time for you to dig a little deeper.

Here are the best resources if you want to know more about MongoDB monitoring.

First, here’s a great talk by Vadim Tkachenko from Percona about Monitoring MySQL and MongoDB instances. You will understand how Percona builds its own monitoring architecture and its own dashboards.

This is a more general talk about MongoDB monitoring using the built-in commands available in your MongoDB CLI such as the server status, the stats, or the total size of each of your collections.

A great talk if you are not into custom monitoring solutions, and if you want to focus on native and already implemented functions.

How To Install InfluxDB on Windows

How To Install InfluxDB on Windows in 2021 | Installation, Configuration & Running of InfluxDB on Windows

Technologies keep evolving, and the way to install InfluxDB on Windows can change over time. That is why we have put together a handy, up-to-date tutorial on How To Install InfluxDB on Windows.

Since these technologies change and grow all the time, educational resources should adapt accordingly. The main objective of this tutorial is to offer an insightful and up-to-date article on how to install InfluxDB on Windows.

The tutorial covers downloading, configuring, and running InfluxDB on Windows in a detailed way.

Be ready to follow all the required steps for a clean InfluxDB installation.

How do you install InfluxDB on Windows in 2021?

Check out this video tutorial on how to install InfluxDB on Windows; you can also learn the basic commands of InfluxDB and its integration with Grafana from there:

How to download InfluxDB on Windows?

By following the steps below, you can easily download InfluxDB on Windows:

a – Downloading the archive

Downloading InfluxDB is very straightforward.

Head over to the InfluxDB downloads page. There, you will see the following four boxes.


What are those four boxes for?

They are part of the TICK stack. (Telegraf, InfluxDB, Chronograf, and Kapacitor).

Each of these tools has a very specific role: gathering metrics, storing data, visualizing time series, or defining post-processing functions on your data.

In this tutorial, we are going to focus on InfluxDB (the time series database component of TICK)

So should you download the v1.7.6 or v2.0.0 version?

In my previous articles, I answered the main difference between the two versions, but here’s the main difference you need to remember.

features

As the 2.0 version is still experimental, we are going to go for the 1.7.6 version.

Click on the v1.7.6 button.

Another window will open with all operating systems. Scroll until you see Windows Binaries (64-bit).

windows-64-bits

Simply click on the URL in the white box, and the download will automatically start in your browser.

Store it wherever you want, in my case, it will be in the Program Files folder.

Unzip the archive using your favorite archive utility tool (7-Zip in my case) or run the following command in a Powershell command line.

Expand-Archive -Force C:\path\to\archive.zip C:\where\to\extract\to

Great! Let’s take a look at what you have here.

b – Inspecting the archive

Inside your folder, you now have 5 binaries and 1 configuration file:

  • influx.exe: a CLI used to execute InfluxQL commands and navigate into your databases.
  • influx_inspect.exe: get some information about InfluxDB shards (in a multinode environment)
  • influx_stress.exe: used to stress-test your InfluxDB database
  • influx_tsm.exe: InfluxDB time-structured merge tree utility (not relevant here)
  • influxd.exe: used to launch your InfluxDB server
  • influxdb.conf: used to configure your InfluxDB instance.

The most relevant files here are influxd.exe (to launch the server) and influxdb.conf (to configure it).

Also Check: How To Create a Grafana Dashboard? (UI + API methods)

How to configure InfluxDB Server on your Machine

Before continuing, you have to configure your InfluxDB instance for Windows.

We are essentially interested in four sections in the configuration file.

a – Meta section

This is where your raft database will be stored. It stores metadata about your InfluxDB instance.

Create a meta folder in your InfluxDB directory (remember in my case it was Program Files).

Modify the following section in the configuration file.

[meta]
  # Where the metadata/raft database is stored
  dir = "C:\\Program Files\\InfluxDB\\meta"

b – Data section

InfluxDB stores TSM and WAL files as part of its internal storage. This is where your data is going to be stored on your computer. Create a data and a wal folder in your folder. Again, modify the configuration file accordingly.

[data]
  # The directory where the TSM storage engine stores TSM files.
  dir = "C:\\Program Files\\InfluxDB\\data"

  # The directory where the TSM storage engine stores WAL files.
  wal-dir = "C:\\Program Files\\InfluxDB\\wal

Important: you need to escape the backslashes (use double backslashes) in the paths!

c – HTTP section

There are many ways to insert data into an InfluxDB database.

You can use client libraries to use in your Python, Java, or Javascript applications. Or you can use the HTTP endpoint directly.

InfluxDB exposes an endpoint that one can use to interact with the database. It is on port 8086. (here’s the full reference of the HTTP API)

Back to your configuration file. Configure the HTTP section as follows:

[http]
  # Determines whether HTTP endpoint is enabled.
  enabled = true

  # The bind address used by the HTTP service.
  bind-address = ":8086"

  # Determines whether HTTP request logging is enabled.
  log-enabled = true

Feel free to change the port as long as it is not interfering with ports already used on your Windows machine or server.

d – Logging section

The logging section is used to determine which levels of the log will be stored for your InfluxDB server. The parameter by default is “info”, but feel free to change it if you want to be notified only for “error” messages for example.

[logging]
  # Determines which log encoder to use for logs. Available options
  # are auto, logfmt, and json. auto will use a more user-friendly
  # output format if the output terminal is a TTY, but the format is not as
  # easily machine-readable. When the output is a non-TTY, auto will use
  # logfmt.
  # format = "auto"

  # Determines which level of logs will be emitted. The available levels
  # are error, warn, info, and debug. Logs that are equal to or above the
  # specified level will be emitted.
  level = "error"

e – Quick test

Before configuring InfluxDB as a service, let’s run a quick dry-run test to see if everything is okay.

In a command-line, execute the influxd executable. Accept the firewall permission if you are prompted to do it.

firewall

Now that your InfluxDB server has started, start a new command-line utility and run the following command.

C:\Users\Antoine>curl -sL -I http://localhost:8086/ping
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: 7dacef6d-8c2f-11e9-8018-d8cb8aa356bb
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.6
X-Request-Id: 7dacef6d-8c2f-11e9-8018-d8cb8aa356bb
Date: Tue, 11 Jun 2021 09:58:41 GMT

The /ping endpoint is used to check if your server is running or not.

Are you getting a 204 No Content HTTP response?

Congratulations!

You now simply have to run it as a service, and you will be all done.
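
Optionally, before setting up the service, you can also check that the HTTP API accepts writes and queries. Here is a minimal sketch, assuming a test database named mydb (the database name is my own choice):

C:\Users\Antoine>curl -XPOST "http://localhost:8086/query" --data-urlencode "q=CREATE DATABASE mydb"
C:\Users\Antoine>curl -XPOST "http://localhost:8086/write?db=mydb" --data-binary "cpu_load,host=server01 value=0.64"
C:\Users\Antoine>curl -G "http://localhost:8086/query?db=mydb" --data-urlencode "q=SELECT * FROM cpu_load"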

How to Run InfluxDB as a Windows service using NSSM Tool?

As you guessed, you are not going to run InfluxDB via the command line every time you want to run it. That’s not very practical.

You are going to run it as a service, using the very popular NSSM tool on Windows.

You could use the SC tool that is natively available on Windows, but I just find it more complicated than NSSM.

To download NSSM, head over to https://nssm.cc/download.

Extract it in the folder that you want, for me, it will be “C:\Program Files\NSSM”.

From there, in the current NSSM folder, run the following command (you need administrative rights to do it)

> nssm install

You will be prompted with the NSSM window.

Enter the following details in it (don’t forget the config section, otherwise our previous work is useless)

nssm-config
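
If you prefer scripting the installation rather than using the NSSM window, something along these lines should also work (the service name and paths are assumptions matching the folders used earlier in this tutorial, so adjust them to your setup):

> nssm install InfluxDB "C:\Program Files\InfluxDB\influxd.exe" -config "C:\Program Files\InfluxDB\influxdb.conf"
> nssm start InfluxDB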

That’s it!

Now your service is installed.

Head over to the services panel in Windows. Find your service under the name you gave it in NSSM and verify that its status is “Running” (if you specified an automatic startup type in NSSM, of course).

Is it running? Let’s verify one more time with curl.

C:\Users\Antoine>curl -sL -I http://localhost:8086/ping
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: ef473e13-8c38-11e9-8005-d8cb8aa356bb
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.6
X-Request-Id: ef473e13-8c38-11e9-8005-d8cb8aa356bb
Date: Tue, 11 Jun 2021 11:06:17 GMT

Congratulations! You did it!

You installed InfluxDB on Windows as a service, and it is running on port 8086.

Most Common Mistakes In The Process

The service did not respond in a timely fashion.

I encountered this error when I tried to set up InfluxDB as a service using SC. Although many solutions exist on Google and YouTube, I solved it by using NSSM instead.

I also tried tweaking the Windows registry, but it wasn’t very useful.

Only one usage of each socket address (protocol/network address/port) is normally permitted.

Simple: there is already a program or service listening on port 8086. Modify the default port in the configuration file and choose one that is free and allowed.

I don’t have the same curl response

A 204 response to the curl command is the only sign that your InfluxDB is running correctly. If you don’t get the same output, you should go back and double-check the steps before.

I have a parsing error in my configuration file!

Remember that on Windows systems backslashes have to be escaped: use double backslashes in the paths of your InfluxDB configuration file.

If your path contains some spaces, like “Program Files”, make sure to put your path into quotes.

Complete Node Exporter Mastery with Prometheus

Complete Node Exporter Mastery with Prometheus | Monitoring Linux Host Metrics WITH THE NODE EXPORTER

Are you worried about monitoring your whole Linux system performance? Do you want to monitor your filesystems, disks, and CPUs, as well as network statistics, just like you would do with netstat? And are you looking for a complete dashboard that displays every single metric of your system efficiently?

This tutorial is the right choice for you, as it takes a special look at the Node Exporter, a Prometheus exporter specialized in exposing Linux metrics right off the bat. In case you are a beginner with Prometheus, we suggest you go through this guide completely and grasp the fundamentals of this awesome time-series database.

Now that you know the basics of Prometheus, and before building our entire Linux monitoring system, here are the concepts you will be learning today in this Complete Node Exporter Mastery with Prometheus tutorial.

What You Will Learn

  • Existing ways to monitor your Linux system: you will discover the free and paid tools that you can use in order to quickly monitor your infrastructure
  • What is the Node Exporter and how to properly install it as a service
  • Bind your Node Exporter with Prometheus and start gathering system metrics
  • Play with prebuilt Grafana dashboards to build 100+ panels in one click

After coming to the edge of this tutorial, you will be able to develop your own monitoring infrastructure and add many more exporters to it.

Basics of Linux Monitoring

Before building our entire monitoring architecture, let’s have a look at what are the currently existing solutions and what problems we are trying to solve with the Node Exporter.

Also Check: Complete MySQL dashboard with Grafana & Prometheus

Existing Solutions

As a system administrator, there are multiple ways for you to monitor your Linux infrastructure.

  • Command-line tools

There are plenty of command-line tools for you to monitor your system.

They are very known to every system administrator and are often very useful to perform some simple troubleshooting on your instance.

Some examples of command-line tools may be top or htop for the CPU usage, df or du for the disks, or even tcpdump for an easy network traffic analysis.

a – Existing solutions top-command

Those solutions are great but they have major downsides.

While they are very easy to use, their output is often formatted in different ways, making it hard to export the results consistently.

Also, to run those commands, you sometimes need elevated privileges on the system, which you may not always have.

With a complete monitoring system, you can have the security rules handled directly in your dashboarding system (for example Grafana) and you don’t need to provide direct access to your instance to whoever wants to troubleshoot outages.

  • Desktop solutions

Desktop solutions provide a more consistent and probably more practical solution to system monitoring.

Some examples of those tools are the very established SolarWinds Linux Performance Monitoring Tool (that provides very complete dashboards for your Linux system) or Zabbix with the Metric Collection product.


The Linux Performance Monitoring tool by SolarWinds is a paid tool, but if you want a solution that is ready to use very quickly, it is a great option for your Linux monitoring.

Node Exporter & Prometheus

Now that we know what the existing tools for Linux monitoring are, let’s have a look at what we are going to use today: the node exporter and Prometheus.

As stated before, Prometheus scrapes targets and the node exporter is just one of them.

In your architecture, you will have the following components:

  • A time-series database, in this case, Prometheus made to store the different metrics retrieved by the node exporter
  • A Node Exporter, run as a systemd service, that will periodically (every second) gather all the metrics of your system.
  • A dashboard solution, in this case, Grafana, displaying the metrics gathered from Prometheus. We are not going to build every single panel ourselves. Instead, we are going to use a very powerful feature of Grafana: dashboard import. (As a reminder, Grafana hosts hundreds of dashboards that you can import within the UI.)

In our case, the whole stack will be run within the same instance, so there will be no need to configure any firewall rules.

The node exporter will run on port 9100 and Prometheus will run on port 9090.

Note: as part of the configuration, Prometheus can actually monitor itself.

Now that you have an idea of what a monitoring architecture looks like, let’s install the different tools needed.

Installation of Required Tools

As a reminder, for our architecture, we are going to need Prometheus, the Node Exporter, and Grafana.

This tutorial focuses on installing the node exporter completely, but there’s also a quick installation for other tools.

Installing the Node Exporter

Before installing Prometheus, we need to install the Node Exporter as a service.

Head over to the node exporter releases page on GitHub and download the latest binary for the node exporter (here 0.18.1).

$ wget https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz

The archive contains a single binary which is node_exporter.

This is the binary we are going to launch to start gathering metrics on our system.

Now that you have the archive, extract it and inspect its contents.

$ tar xvzf node_exporter-0.18.1.linux-amd64.tar.gz

Now to install it as a service, here are the instructions:

  • Create a node exporter user
$ sudo useradd -rs /bin/false node_exporter
  • Copy the binary to your /usr/local/bin folder.
$ sudo cp node_exporter-0.18.1.linux-amd64/node_exporter /usr/local/bin
  • Apply the correct permissions to your binary file.
$ sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter

Note: you will need to have a node_exporter user to run the binary.

  • Navigate to /etc/systemd/system and create a new service file
$ cd /etc/systemd/system
$ sudo vim node_exporter.service

Then, paste the following configuration for your service.

[Unit]
Description=Node Exporter
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
  • Exit vi, reload your daemon and start your service.
$ sudo systemctl daemon-reload
$ sudo systemctl start node_exporter
  • Check your service by running the following command
$ sudo systemctl status node_exporter.service

Is your service running correctly?

  • Enable your service for system startup
$ sudo systemctl enable node_exporter
  • Verify that your node exporter is correctly up and running with a simple curl command
$ curl http://localhost:9100/metrics

Can you see the key-value metrics of your system?
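
If everything works, you should see an output resembling the excerpt below (an illustrative sketch: the metric names are standard node exporter metrics, but the values will obviously differ on your machine):

# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 195713.59
node_memory_MemAvailable_bytes 1.4504448e+09
node_filesystem_avail_bytes{device="/dev/sda1",fstype="ext4",mountpoint="/"} 1.3439488e+10
...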

We are done with the Node Exporter installation!

Installing Prometheus

It is not the first time we install Prometheus for our projects.

First, head over to the Prometheus releases page on GitHub and run a simple wget command to get the latest binaries.

$ wget https://github.com/prometheus/prometheus/releases/download/v2.10.0/prometheus-2.10.0.linux-amd64.tar.gz

For this tutorial, we are running the 2.10.0 version from May 2019.

You should now have an archive, extract it, and navigate inside the folder.

# tar xvzf prometheus-2.10.0.linux-amd64.tar.gz
# cd prometheus-2.10.0.linux-amd64/

Inside this folder, you have multiple elements:

  • prometheus: the executable file that launches a Prometheus server;
  • prometheus.yml: the configuration file for your Prometheus server;
  • promtool: a tool that can be used to check your Prometheus configuration.

In our case, we are first going to modify the Prometheus configuration file.

Navigate in it with vi:

# vi prometheus.yml

Then perform the following modifications:

global:
  scrape_interval:     1s # Set the scrape interval to every 1 second.

As a reminder, Prometheus scrapes targets. In our case, we want it to scrape our system metrics every one second.

Then, in the “scrape_configs” section, under the “static_configs” section, add the following lines.

static_configs:
            - targets: ['localhost:9090', 'localhost:9100']

It means that Prometheus will scrape the node exporter metrics, as well as its own metrics.
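
For reference, here is what the resulting job definition in prometheus.yml could look like, keeping the single prometheus job used above:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090', 'localhost:9100']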

For now, simply launch Prometheus as a background process, and verify that it is correctly launched by pinging the Web UI interface.

> ./prometheus &

prometheus-web-console-final

Can you see the web console?

We are done with the Prometheus setup!

Installing Grafana

The last part of our initialization section is about installing Grafana.

As a reminder, Grafana is an open-source dashboard monitoring solution that binds to databases in order to display metrics in a variety of ways.

c – Installing Grafana visualization-1

To install Grafana, head over to https://grafana.com/grafana/download and download the latest binaries available for you.

$ wget https://dl.grafana.com/oss/release/grafana_6.2.4_amd64.deb

For this tutorial, we are using the 6.2.4 version of Grafana that includes the new bar gauges.

Install the .deb package and Grafana should automatically start as a service on your computer.

$ sudo dpkg -i grafana_6.2.4_amd64.deb

You can verify that Grafana is running with the following command:

$ sudo systemctl status grafana-server
● grafana-server.service - Grafana instance
   Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-06-22 10:43:12 UTC; 5 days ago
     Docs: http://docs.grafana.org

If the status is set to Active, and if no visible errors are shown in your console, then Grafana is now installed correctly on your machine.

By default, Grafana is running on port 3000, and the default credentials are

  • username: admin
  • password: admin

You will be asked to change them immediately on the first connection.

With a web browser, head over to http://localhost:3000, and follow the instructions until you see the welcome screen.
As described in the screenshot, click on ‘Add data source‘ and create a new Prometheus data source.

As described in our other tutorials, you can configure your data source with the following configuration:

prometheus-data-source

Note: If you configured your Prometheus instance to run on another port, you have to change it in the configuration.

Building a complete Grafana Dashboard for the Node Exporter

Now that all your tools are set, there is not much work to do in order to have our dashboards.

For this, we are not going to build our dashboards by ourselves. Instead, we are going to use the “Import Dashboard” feature of Grafana.

In the top left menu, hover your mouse over the “Plus” icon and click on the “Import” element in the dropdown list.

import-dashboard

You will be presented with the following window:

import-window

In this window, you have multiple choices. You can either:

  • Type a dashboard URL or ID and it will be automatically imported into your Grafana instance.
  • Upload a JSON file (as a quick reminder, Grafana dashboards are exported as JSON files and can be easily shared this way)
  • Paste directly the raw JSON

In our case, we are going to use the first option by typing the dashboard id directly in the text field.

Getting Inspiration for Dashboards

We don’t have to build the dashboards all by ourselves.

This is especially true when you have dozens of metrics to look for.

You would have to spend a lot of time understanding the different metrics and building panels out of them.

We are going to use Grafana Dashboards for this. Grafana Dashboards is a repository owned by Grafana that stores hundreds of dashboards for you to choose from.

In our case, we are going to focus on Node Exporter dashboards.

Type “Node Exporter” in the search box and scroll until you reach the “Node Exporter Full” dashboard.

b – Getting Inspiration for Dashboards node-exporter-search

As you probably noticed, the dashboard has the ID 1860 (the information is available in the URL of the website).

This is the ID that we are going to copy in order to monitor our system.

In the import, type “1860” in the Grafana.com dashboard text field, and hit “Load”.

You will be presented with a second screen to configure your dashboard.

b – Getting Inspiration for Dashboards import-2

Every field is filled automatically.

However, you will have to select the data source, in my case “Prometheus“. (you have to link it to the data source you created in section 2)

Hit “Import” when you are done.

In under a second, all the different panels will be built for you!

That’s 29 categories with over 192 panels created automatically. Awesome.

Here are some examples of how the dashboards look:

b – Getting Inspiration for Dashboards final-dash

You can also have a look at all the options available in this whole panel.

b – Getting Inspiration for Dashboards all-dashboards

Going Further

Mastering the Node Exporter is definitely a must-have skill for engineers willing to get started with Prometheus.

However, you can dig a little bit deeper using the Node Exporter.

a – Additional Modules

Not all modules are enabled by default, and if you run a simple node exporter installation, chances are that you are not running any additional plugins.

Here’s the list of the additional modules:

IV – Going Further additional-modules

In order to activate them, simply add a --collector.<name> flag when running the node exporter, for example:

ExecStart=/usr/local/bin/node_exporter --collector.processes --collector.ntp

This should activate the processes and the ntp collectors.

b – TextFile Collector

A Node Exporter guide would not be complete without talking about the TextFile collector, at least for a small section.

Similar to the Pushgateway, the textfile collector collects metrics from text files and exposes them to Prometheus.

It is designed for batch jobs or short-lived jobs that don’t expose metrics in a continuous way.

Some examples of the textfile collector are available here:

  • Using the text file collector from a shell script.
  • Monitoring directory sizes with the textfile collector.
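
To make this more concrete, here is a minimal sketch of how the textfile collector can be wired up (the directory path and the metric name are assumptions of mine):

# In your node_exporter service file, point the collector at a directory for .prom files:
ExecStart=/usr/local/bin/node_exporter --collector.textfile.directory=/var/lib/node_exporter/textfile_collector

# From a batch job or cron script, drop a metric into that directory:
$ echo "batch_job_last_success_timestamp_seconds $(date +%s)" > /var/lib/node_exporter/textfile_collector/batch_job.prom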

c – Video Resources

Finally, I like to link to external videos that are closely related to the subject or particularly interesting.

How To Create a Grafana Dashboard using UI and API

How To Create a Grafana Dashboard? (UI + API methods) | Best Practices for Creating Dashboard

A dashboard is a group of one or more panels structured and arranged into one or more rows. Grafana is one of the most awesome dashboards. Grafana Dashboard makes it easy to build the right queries, and personalize the display properties so that you can build a flawless dashboard for your requirement.

Whether you are looking to monitor your entire infrastructure or just your home, everybody benefits from having a complete Grafana dashboard. In today’s tutorial, we are discussing how you can easily create a Grafana dashboard, what the best practices are, what the different panels are, the Dashboard UI, and how they can be used efficiently.

Best practices for creating dashboards

This section will make you understand some best practices to follow when creating Grafana dashboards:

  • At the time of new dashboard creation, ensure that it has a meaningful name.
    • If you want to create a dashboard for the experiment then set the word TEST or TMP in the name.
    • Also, include your name or initials in the dashboard name or as a tag so that people know who owns the dashboard.
    • After performing all the testing tasks on temporary experiment dashboards, remove all of them.
  • If you build multiple related dashboards, consider how to cross-reference them for easy navigation. For more information on this take a look at the best practices for managing dashboards.
  • Grafana retrieves data from a data source. A basic understanding of data sources in general, and of your specific data source, is necessary.
  • Avoid unnecessary dashboard refreshing to reduce the load on the network or backend. For instance, if your data changes every hour, then you don’t need to set the dashboard refresh rate to 30 seconds.
  • Use the left and right Y-axes when displaying time series with multiple units or ranges.
  • Reuse your dashboards and drive consistency by utilizing templates and variables.
  • Add documentation to dashboards and panels.
    • To add documentation to a dashboard, add a Text panel visualization to the dashboard. Record things like the purpose of the dashboard, useful resource links, and any instructions users might need to interact with the dashboard. Check out this Wikimedia example.
    • To add documentation to a panel, edit the panel settings and add a description. Any text you add will appear if you hover your cursor over the small ‘i’ in the top left corner of the panel.
  • Beware of stacking graph data. The visualizations can be misleading, and hide related data. We advise turning it off in most cases.

Also Check: Best Open Source Dashboard Monitoring Tools

Dashboard UI

dashboard UI

  • Zoom out time range
  • Time picker dropdown: You can access relative time range options, auto-refresh options, and set custom absolute time ranges.
  • Manual refresh button: Will let all panels refresh (fetch new data).
  • Dashboard panel: Tap the panel title to edit panels.
  • Graph legend: You can change series colors, y-axis, and series visibility directly from the legend.

Steps to create a Grafana dashboard using UI

  • Hover the ‘Plus’ icon located on the left menu with your cursor (it should be the first icon)
  • At that point, a dropdown will open. Click on the ‘dashboard’ option.


  • Create a dashboard option in Grafana
  • A new dashboard will automatically be created with a first panel.

Grafana new panel – query visualization

In Grafana v6.0+, the query and the visualization panels are separated. It implies that you can easily write your query, and decide later which visualization you want to use for your dashboard.

This is particularly handy because you don’t have to reset your panel every time you want to change the visualization type.

  • First, click on ‘Add Query’ and make sure that your data source is correctly bound to Grafana.
  • Write your query and refactor it until you’re happy with the final result. By default, Grafana sets new panels to “Graph” types.

query-panel

  • Choose the visualization that fits your query the best. You have to choose between ten different visualizations (or more if you have plugins installed!)

visualization

  • Tweak your dashboard with display options until you’re satisfied with the visual of your panel.

display-options

  • Add more panels, and build a complete dashboard for your needs! Here is an example of what a dashboard could be with a little bit of work. Here is an example with a more futuristic theme on disk monitoring.

Best Practices for Creating Dashboard Final-dashboard

Steps to create a Grafana dashboard using API

Most of the API requests are authenticated within Grafana. To call the Grafana API to create a dashboard, you will have to get a token. If you don’t own the Grafana instance, you have to ask your administrator for a token.

  • Hover the ‘Configuration’ icon in the left menu and click on the “API Keys” option.


  • Click on “Add API Key”. Enter a key name and give at least an “Editor” role to the key.
  • Click on “Add”


  • A popup page will open and show you the token you will be using in the future. It is very important that you copy it immediately. You won’t be able to see it after closing the window.

grafana-api-key

  • Now that you have your API key, you need to make a call to the /api/dashboards/db endpoint using the token in the authorization header of your HTTP request.

For this example, I will be using Postman.

  • Create a new POST request in Postman, and type http://localhost:3000/api/dashboards/db as the target URL.
  • In the authorization panel, select the ‘Bearer token’ type and paste the token you got from the UI.

postman-grafana

  • In the body of your request, select “Raw” then “JSON (application/json)”. Paste the following JSON to create your dashboard.
{
  "dashboard": {
    "id": null,
    "uid": null,
    "title": "Production Overview",
    "tags": [ "templated" ],
    "timezone": "browser",
    "schemaVersion": 16,
    "version": 0
  },
  "folderId": 0,
  "overwrite": false
}
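
If you prefer the command line over Postman, the same call can be made with curl; a sketch, assuming the token is stored in a GRAFANA_TOKEN variable and the JSON above is saved as dashboard.json (both names are my own choices):

$ curl -X POST http://localhost:3000/api/dashboards/db \
  -H "Authorization: Bearer $GRAFANA_TOKEN" \
  -H "Content-Type: application/json" \
  -d @dashboard.json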

Here’s the description of every field in the request:

  • dashboard.id: the dashboard ID, should be set to null on dashboard creation.
  • dashboard.uid: the dashboard unique identifier, should be set to null on dashboard creation.
  • title: the title of your dashboard.
  • tags: dashboard can be assigned tags in order to retrieve them quicker in the future.
  • timezone: the timezone of your dashboard, should be set to “browser” on dashboard creation.
  • schemaVersion: a constant value that should be 16.
  • version: your dashboard version, should be set to zero as it is the first version of your dashboard.
  • folderId: you can choose to set a folder id to your dashboard if you already have existing folders
  • overwrite: you could update an existing dashboard, but it should be set to false in our case as we are creating it.
  • Click on “Send”. You should see the following success message.
{
  "id": 3,
  "slug": "production-overview",
  "status": "success",
  "uid": "uX5vE8nZk",
  "url": "/d/uX5vE8nZk/production-overview",
  "version": 1
}
  • Make sure that your dashboard was created in Grafana.

make-sure

That’s it! You now have a complete idea of the two ways to create a Grafana dashboard in 2021.

If you have any comments on this content, or if you find that this guide has become outdated, make sure to leave a comment below.

How To Install Logstash on Ubuntu 18.04 and Debian 9

How To Install Logstash on Ubuntu 18.04 and Debian 9 | Tutorial on Logstash Configuration

Are you searching various websites to learn How To Install Logstash on Ubuntu 18.04 and Debian 9? Then this tutorial is the best option for you, as it covers the detailed steps to install and configure Logstash on Ubuntu 18.04 and Debian 9. If you are browsing this tutorial, it is probably because you decided to bring Logstash into your infrastructure. Logstash is a powerful tool, but you have to install and configure it properly, so make the most of this tutorial.

What is Logstash?

Logstash is a lightweight, open-source, server-side data processing pipeline that lets you collect data from different sources, transform it on the fly, and send it to your desired destination. It is most often used as a data processing pipeline for Elasticsearch, an open-source analytics and search engine, and handles log ingestion, parsing, filtering, and redirecting.

Why do we use Logstash?

We use Logstash because it provides a set of plugins that can easily be bound to various sources in order to gather logs from them. Moreover, Logstash provides a very expressive template language that makes it very easy for developers to manipulate, truncate, or transform data streams.

Logstash is part of the ELK stack (Elasticsearch – Logstash – Kibana), but the tools can be used independently.

With the recent release of the ELK stack v7.x, installation guides need to be updated for recent distributions like Ubuntu 18.04 and Debian 9.


Prerequisites

  • Java version 8 or 11 (required for Logstash installation)
  • A Linux system running Ubuntu 18.04 or Debian 9
  • Access to a terminal window/command line (Search > Terminal)
  • A user account with sudo or root privileges

Steps to Install Logstash on Ubuntu and Debian

The following are the steps to install Logstash on Ubuntu and Debian: 

1 – Install the latest version of Java

Logstash, like every single tool of the ELK stack, needs Java to run properly.

In order to check whether you have Java or not, run the following command:

$ java -version
openjdk version "11.0.3" 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)

If you don’t have Java on your computer, you should have the following output.

java-not-found

You can install it by running this command.

$ sudo apt-get install default-jre

Make sure that you now have Java installed via the first command that we ran.

2 – Add the GPG key to install signed packages

In order to make sure that you are getting official versions of Logstash, you have to download the public signing key and you have to install it.

To do so, run the following commands.

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

On Debian, install the apt-transport-https package.

$ sudo apt-get install apt-transport-https

To conclude, add the Elastic package repository to your own repository list.

$ echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

3 – Install Logstash with apt

Now that Elastic repositories are added to your repository list, it is time to install the latest version of Logstash on our system.

$ sudo apt-get update
$ sudo apt-get install logstash

apt-get-update

This directive will :

  • create a logstash user
  • create a logstash group
  • create a dedicated service file for Logstash

From there, the Logstash installation should have created a service on your instance.

To check Logstash service health, run the following command.
On Ubuntu and Debian, equipped with systemd:

$ sudo systemctl status logstash

Enable your new service on boot up and start it.

$ sudo systemctl enable logstash
$ sudo systemctl start logstash

Having your service running is just fine, but you can double-check it by verifying that Logstash is actually listening for connections.

Run a simple lsof command; you should have a similar output.

$ sudo lsof -i -P -n | grep logstash
java      28872        logstash   56u  IPv6 1160098302      0t0  TCP 
127.0.0.1:47796 > 127.0.0.1:9200 (ESTABLISHED)
java      28872        logstash   61u  IPv4 1160098304      0t0  UDP 127.0.0.1:10514
java      28872        logstash   79u  IPv6 1160098941      0t0  TCP 127.0.0.1:9600 (LISTEN)

As you can tell, Logstash is actively listening for connections on port 10514 over UDP and port 9600 over TCP. This is important to note if you plan to forward your logs (from rsyslog to Logstash for example), either by UDP or by TCP.

On Debian and Ubuntu, here’s the content of the service file.

[Unit]
Description=logstash

[Service]
Type=simple
User=logstash
Group=logstash
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/etc/sysconfig/logstash
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/etc/logstash"
Restart=always
WorkingDirectory=/
Nice=19
LimitNOFILE=16384

[Install]
WantedBy=multi-user.target

The environment file (located at /etc/default/logstash) contains many of the variables necessary for Logstash to run.

If you wanted to tweak your Logstash installation, for example, to change your configuration path, this is the file that you would change.

4 – Personalize Logstash with configuration files

In this step, you need to perform two more steps like as follows:

a – Understanding Logstash configuration files

Before personalizing your configuration files, there is a concept that you need to understand about configuration files.

Pipelines configuration files

In Logstash, you define what we called pipelines. A pipeline is composed of :

  • An input: where you take your data from, it can be Syslog, Apache, or NGINX for example;
  • A filter: a transformation that you would apply to your data; sometimes you may want to mutate your data, or to remove some fields from the final output.
  • An output: where you are going to send your data; most of the time Elasticsearch, but it can be modified to send data to a wide variety of different destinations.

a – Understanding Logstash configuration files

Those pipelines are defined in configuration files.

In order to define those “pipeline configuration files“, you are going to create “pipeline files” in the /etc/logstash/conf.d directory.

Logstash general configuration file

But with Logstash, you also have standard configuration files, that configure Logstash itself.

This file is located at /etc/logstash/logstash.yml. The general configuration files define many variables, but most importantly you want to define your log path variable and data path variable.

b – Writing your own pipeline configuration file

For this part, we are going to keep it very simple.

We are going to build a very basic logging pipeline between rsyslog and stdout.

Every single log processed via rsyslog will be printed to the shell running Logstash.

As the Elastic documentation highlights, it can be quite useful to test pipeline configuration files and immediately see what output they produce.

If you are looking for a complete rsyslog to Logstash to Elasticsearch tutorial, here’s a link for it.

To do so, head over to the /etc/logstash/conf.d directory and create a new file named “syslog.conf”.

$ cd /etc/logstash/conf.d/
$ sudo vi syslog.conf

Paste the following content inside.

input {
  udp {
    host => "127.0.0.1"
    port => 10514
    codec => "json"
    type => "rsyslog"
  }
}

filter { }


output {
  stdout { }
}

As you probably guessed, Logstash is going to listen to incoming Syslog messages on port 10514 and print them directly in the terminal.

To forward rsyslog messages to port 10514, head over to your /etc/rsyslog.conf file, and add this line at the top of the file.

*.*         @127.0.0.1:10514

rsyslog-forwarding
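Note that a single @ forwards over UDP, which matches the udp input defined above. If you later switch your Logstash input to TCP, the rsyslog line would use a double @@ instead (same placeholder address and port):

*.*         @@127.0.0.1:10514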

Now in order to debug your configuration, you have to locate the logstash binary on your instance.

To do so, run a simple whereis command.

$ whereis -b logstash
/usr/share/logstash

Now that you have located your logstash binary, shut down your service and run logstash locally, with the configuration file that you are trying to verify.

$ sudo systemctl stop logstash
$ cd /usr/share/logstash/bin
$ ./logstash -f /etc/logstash/conf.d/syslog.conf

Within a couple of seconds, you should now see the following output on your terminal.

success-config-logstash

Note: if you have any syntax errors in your pipeline configuration files, you will be notified as well.

As a quick example, I removed one bracket from my configuration file. Here’s the output that I got.

error-config-logstash
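If you only want to validate the syntax of a pipeline file without actually running it, Logstash also exposes a --config.test_and_exit flag:

$ cd /usr/share/logstash/bin
$ ./logstash -f /etc/logstash/conf.d/syslog.conf --config.test_and_exit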

5 – Monitoring Logstash using the Monitoring API

There are multiple ways to monitor a Logstash instance:

  • Using the Monitoring API provided by Logstash itself
  • By configuring the X-Pack tool and sending retrieved data to an Elasticsearch cluster
  • By visualizing data into dedicated panels of Kibana (such as the pipeline viewer for example)

In this chapter, we are going to focus on the Monitoring API, as the other methods require the entire ELK stack installed on your computer to work properly.

a – Gathering general information about Logstash

First, we are going to run a very basic command to get general information about our Logstash instance.

Run the following command on your instance:

$ curl -XGET 'localhost:9600/?pretty'
{
  "host" : "devconnected-ubuntu",
  "version" : "7.2.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name" : "devconnected-ubuntu",
  "ephemeral_id" : "871ccf4a-5233-4265-807b-8a305d349745",
  "status" : "green",
  "snapshot" : false,
  "build_date" : "2019-06-20T17:29:17+00:00",
  "build_sha" : "a2b1dbb747289ac122b146f971193cfc9f7a2f97",
  "build_snapshot" : false
}

If you are not running Logstash on the conventional 9600 port, make sure to adjust the previous command.

From this command, you get the hostname, the version running, as well as the HTTP address currently used by Logstash.

You also get a status property (green, yellow, or red) that has already been explained in the tutorial about setting up an Elasticsearch cluster.

b – Retrieving Node Information

If you are managing several Logstash instances, there is a high chance that you may want to get detailed information about every single node.

For this API, you have three choices:

  • pipelines: in order to get detailed information about pipeline statistics.
  • jvm: to see current JVM statistics for this specific node
  • os: to get information about the OS running your current node.

To retrieve node information on your cluster, issue the following command:

$ curl -XGET 'localhost:9600/_node/pipelines?pretty'
{
  "host": "schkn-ubuntu",
  "version": "7.2.0",
  "http_address": "127.0.0.1:9600",
  "id": "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name": "schkn-ubuntu",
  "ephemeral_id": "871ccf4a-5233-4265-807b-8a305d349745",
  "status": "green",
  "snapshot": false,
  "pipelines": {
    "main": {
      "ephemeral_id": "808952db-5d23-4f63-82f8-9a24502e6103",
      "hash": "2f55ef476c3d425f4bd887011f38bbb241991f166c153b283d94483a06f7c550",
      "workers": 2,
      "batch_size": 125,
      "batch_delay": 50,
      "config_reload_automatic": false,
      "config_reload_interval": 3000000000,
      "dead_letter_queue_enabled": false,
      "cluster_uuids": []
    }
  }
}

Here is an example for the OS request:

$ curl -XGET 'localhost:9600/_node/os?pretty'
{
  "host": "schkn-ubuntu",
  "version": "7.2.0",
  "http_address": "127.0.0.1:9600",
  "id": "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name": "schkn-ubuntu",
  "ephemeral_id": "871ccf4a-5233-4265-807b-8a305d349745",
  "status": "green",
  "snapshot": false,
  "os": {
    "name": "Linux",
    "arch": "amd64",
    "version": "4.15.0-42-generic",
    "available_processors": 2
  }
}

c – Retrieving Logstash Hot Threads

Hot threads are threads that are using a large amount of CPU or whose execution time is longer than normal.

To retrieve hot threads, run the following command:

$ curl -XGET 'localhost:9600/_node/hot_threads?pretty'
{
  "host" : "schkn-ubuntu",
  "version" : "7.2.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name" : "schkn-ubuntu",
  "ephemeral_id" : "871ccf4a-5233-4265-807b-8a305d349745",
  "status" : "green",
  "snapshot" : false,
  "hot_threads" : {
    "time" : "2019-07-22T18:52:45+00:00",
    "busiest_threads" : 10,
    "threads" : [ {
      "name" : "[main]>worker1",
      "thread_id" : 22,
      "percent_of_cpu_time" : 0.13,
      "state" : "timed_waiting",
      "traces" : [ "java.base@11.0.3/jdk.internal.misc.Unsafe.park(Native Method)"...]
    } ]
  }
}

Installing Logstash on macOS with Homebrew

Elastic publishes Homebrew formulae, so you can install Logstash with the Homebrew package manager.

To install with Homebrew, first tap the Elastic Homebrew repository:

brew tap elastic/tap

Once you have tapped the Elastic Homebrew repo, you can use brew install to install the default distribution of Logstash:

brew install elastic/tap/logstash-full

The above command installs the latest released default distribution of Logstash. If you want to install the OSS distribution, specify elastic/tap/logstash-oss instead.

Starting Logstash with Homebrew

To have launchd start elastic/tap/logstash-full now and restart it at login, run:

brew services start elastic/tap/logstash-full

To run Logstash in the foreground instead, run:

logstash

Going Further

Now that you have all the basics about Logstash, it is time for you to build your own pipeline configuration files and start stashing logs.

I highly suggest that you check out Filebeat, which provides a lightweight shipper for logs and can easily be customized in order to build a centralized logging system for your infrastructure.

One of the key features of Filebeat is that it provides a back-pressure-sensitive protocol, which essentially means that you are able to regulate the amount of data that you receive.

This is a key point, as you take the risk of overloading your centralized server by pushing too much data to it.

For those who are interested in Filebeat, here’s a video about it.

Definitive Guide To InfluxDB

The Definitive Guide To InfluxDB In 2021 | InfluxDB Open Source Time Series Database

In this informative tutorial, we cover complete details about InfluxDB: what exactly it is, why you would use it, what value developers can create by bringing InfluxDB into their own environment, and much more.

Also, this guide can serve as a starting point for every developer, engineer, and IT professional who wants to understand InfluxDB concepts, use-cases, and real-world applications.

The main objective of this article is to make you proficient with InfluxDB in no time. To do so, we have organized the InfluxDB learning path into several modules, each one bringing a new level of knowledge about time-series databases.

In this Definitive Guide To InfluxDB In 2021, you will first get an overall presentation of time-series databases, followed by an in-depth explanation of the concepts that define InfluxDB, and finally an overview of InfluxDB use-cases and how it can be applied across a variety of industries, using real-world examples.

Now, let's step into the main topic and learn all about the InfluxDB open-source time series database, its key concepts, use cases, and more. Feel free to use the available links to jump directly to the section you need.

What is InfluxDB?

InfluxDB is definitely a fast-growing technology. The time-series database, developed by InfluxData, has seen its popularity grow more and more over the past few months. It has become one of the references for developers and engineers willing to bring live monitoring into their own infrastructure.

Do Check: InfluxDays London Recap

What are Time-Series Databases?

Time Series Databases are database systems specially created to handle time-related data.

All your life, you have dealt with relational databases like MySQL or SQL Server. You may also have dealt with NoSQL databases like MongoDB or DynamoDB.

Those systems are based on the fact that you have tables. Those tables contain columns and rows, each one of them defining an entry in your table. Often, those tables are designed for a specific purpose: one may be designed to store users, another one for photos, and another one for videos. Such systems are efficient, scalable, and used by plenty of giant companies having millions of requests on their servers.

Time series databases work differently. Data are still stored in ‘collections’ but those collections share a common denominator: they are aggregated over time.

Essentially, it means that for every point that you are able to store, you have a timestamp associated with it.

The great difference between relational databases and time-series databases

But couldn’t we use a relational database and simply have a column named ‘time’? Oracle, for example, includes a TIMESTAMP data type that we could use for that purpose.

You could, but that would be inefficient.

Why do we need time-series databases?

Three words: fast ingestion rate.

Time series database systems are built around the premise that they need to ingest data in a fast and efficient way.

Indeed, most relational databases have a fast ingestion rate, from 20k to 100k rows per second. However, the ingestion is not constant over time. Relational databases have one key aspect that makes them slow as data grows: indexes.

When you add new entries to your relational database, and if your table contains indexes, your database management system will repeatedly re-index the data so that it can be accessed quickly. As a consequence, the performance of your DBMS tends to decrease over time: the load keeps increasing, and reading your data becomes slower and slower.

Time series databases are optimized for a fast ingestion rate. Their index systems are optimized for data that are aggregated over time: as a consequence, the ingestion rate does not decrease over time and stays quite stable, around 50k to 100k lines per second on a single node.

difference-dbms-tsdb

Specific concepts about time-series databases

On top of the fast ingestion rate, time-series databases introduce concepts that are very specific to those technologies.

One of them is data retention. In a traditional relational database, data are stored permanently until you decide to drop them yourself. Given the use-cases of time series databases, you may not want to keep your data for too long: either because it is too expensive to do so, or because you are not that interested in old data.

Systems like InfluxDB can take care of dropping data after a certain time, with a concept called retention policy (explained in detail in part two). You can also decide to run continuous queries on live data in order to perform certain operations.

You could find equivalent operations in a relational database, for example, ‘jobs’ in SQL that can run on a given schedule.

A Whole Different Ecosystem

Time Series databases are very different when it comes to the ecosystem that orbits around them. In general, relational databases are surrounded by applications: web applications, software that connects to them to retrieve information or add new entries.

Often, a database is associated with one system. Clients connect to a website, which contacts a database in order to retrieve information. A TSDB, on the other hand, is built for client plurality: you do not have a single server accessing the database, but a bunch of different sensors (for example) inserting their data at the same time.

As a consequence, tools were designed in order to have efficient ways to produce data or to consume it.

Data consumption

Data consumption is often done via monitoring tools such as Grafana or Chronograf. Those tools have built-in features to visualize data and even create custom alerts from it.

consumption

Those tools are often used to create live dashboards that may be graphs, bar charts, gauges or live world maps.

Data Production

Data production is done by agents that are responsible for targeting specific elements of your infrastructure and extracting metrics from them. Such agents are called “monitoring agents”. You can easily configure them to query your tools at a given interval. Examples include Telegraf (the official monitoring agent), CollectD, or StatsD.

production
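As a quick illustration, here is a minimal Telegraf configuration sketch that collects CPU metrics every ten seconds and pushes them to a local InfluxDB 1.x instance. The URL and database name below are placeholders to adapt to your own setup.

[agent]
  # How often Telegraf collects metrics
  interval = "10s"

# Collect CPU usage metrics from the local machine
[[inputs.cpu]]

[[outputs.influxdb]]
  # Placeholder InfluxDB endpoint and database name
  urls = ["http://127.0.0.1:8086"]
  database = "monitoring"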

Now that you have a better understanding of what time series databases are and how they differ from relational databases, it is time to dive into the specific concepts of InfluxDB.

Illustrated InfluxDB Concepts

In this section, we are going to explain the key concepts behind InfluxDB and the key query associated with it. InfluxDB embeds its own query language and I think that this point deserves a small explanation.

InfluxDB Query Language

Before starting, it is important for you to know which version of InfluxDB you are currently using. As of April 2019, InfluxDB comes in two versions: v1.7+ and v2.0.

v2.0 is currently in alpha and makes the Flux language a central element of the platform. v1.7 is equipped with the InfluxQL language (and Flux if you activate it).


The main differences between v1.7 and v2.0

Right now, I recommend sticking with InfluxQL, as Flux is not yet completely established in the platform.

InfluxQL is a query language that is very similar to SQL and allows any user to query and filter data. Here's an example of an InfluxQL query:

influxql-example-1
See how similar it is to the SQL language?
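In case the screenshot above does not render for you, here is a comparable query; the measurement, field, and tag names are placeholders, not the ones from the screenshot:

SELECT "temperature" FROM "cpu_metrics" WHERE "location" = 'living_room' AND time > now() - 1h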

In the following sections, we are going to explore InfluxDB key concepts, provided with the associated IQL (short for InfluxQL) queries.

Explained InfluxDB Key Concepts

influxdb-terms

In this section, we will go through the list of essential terms to know to deal with InfluxDB in 2021.

Database

A database is a fairly simple concept to understand on its own because you are used to this term with relational databases. In a SQL environment, a database would host a collection of tables, and even schemas, and would represent one instance on its own.

In InfluxDB, a database hosts a collection of measurements. However, a single InfluxDB instance can host multiple databases. This is where it differs from traditional database systems. This logic is detailed in the graph below:

influx-internals

The most common ways to interact with databases are creating a database and navigating into a database in order to see collections (you have to be “in a database” in order to query collections, otherwise it won't work).

database-queries
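As a sketch of those operations in the influx CLI (the database name below is just an example):

CREATE DATABASE sensors
SHOW DATABASES
USE sensors
SHOW MEASUREMENTS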

Measurement

As shown in the graph above, the database stores multiple measurements. You could think of a measurement as a SQL table. It stores data, and even metadata, over time. Data that are meant to coexist together should be stored in the same measurement.

Measurement example

Measurement IFQL example
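To make this concrete, here is a hedged sketch of writing a point into a measurement and reading it back from the influx CLI; the measurement, tag, and field names are illustrative (tags and fields are explained right after):

INSERT cpu_metrics,location=living_room temperature=52.2
SELECT * FROM cpu_metrics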

In a SQL world, data are stored in columns, but in InfluxDB we have two other terms: tags & fields.

Tags & Fields

Warning! This is a very important chapter as it explains the subtle difference between tags & fields.

When I first started with InfluxDB, I had a hard time grasping exactly why tags & fields are different. To me, they represented ‘columns’ where you could store exactly the same data.

When defining a new ‘column’ in InfluxDB, you have the choice to declare it either as a tag or as a field, and it makes a very big difference.

In fact, the biggest difference between the two is that tags are indexed and fields are not. Tags can be seen as metadata defining our data in the measurement. They are hints giving additional information about the data, but not the data itself.

Fields, on the other hand, are literally the data. In our last example, the temperature ‘column’ would be a field.

Back to our cpu_metrics example, let's say that we wanted to add a column named ‘location’ which, as its name states, defines where the sensor is located.

Should we add it as a tag or a field?

tags-vs-fields

In our case, it would be added as a... tag! We definitely want the location ‘column’ to be indexed and taken into account when performing a query over the location.

In general, I would advise keeping your measurements relatively small when it comes to the number of fields. A higher number of fields often means lower performance. You could create other measurements to store additional fields and index them properly.

Now that we’ve added the location tag to our measurement, let’s go a bit deeper into the taxonomy.

A set of tags is called a “tag-set”. The ‘column name’ of a tag is called a “tag key”. Values of a tag are called “tag values”. The same taxonomy repeats for fields. Back to our drawings.

Measurement taxonomy
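Putting the taxonomy together, a single line of InfluxDB line protocol shows the measurement, the tag set, the field set, and the timestamp side by side (all names and values below are illustrative, not taken from the drawing):

cpu_metrics,location=living_room,host=server01 temperature=52.2,cpu_load=0.64 1577836800000000000

Here cpu_metrics is the measurement, location and host are tag keys whose key/value pairs form the tag set, temperature and cpu_load are field keys whose pairs form the field set, and the trailing number is the timestamp in nanoseconds.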

Timestamp

Probably the simplest keyword to define. A timestamp in InfluxDB is a date and a time defined in RFC3339 format. When using InfluxDB, it is very common to define your time column as a timestamp in Unix time expressed in nanoseconds.

Tip: you can keep a nanosecond format for the time column even if your source timestamps are less precise, by adding trailing zeros to your time value so that it fits the nanosecond format.
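For example, assuming a source timestamp expressed in seconds:

1577836800           <- Unix timestamp in seconds (2020-01-01 00:00:00 UTC)
1577836800000000000  <- same instant padded with nine trailing zeros to fit nanosecond precision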

Retention policy

This feature of InfluxDB is for me one of the best features there is.

A retention policy defines how long you are going to keep your data. Retention policies are defined per database and of course, you can have multiple of them. By default, the retention policy is ‘autogen‘ and will basically keep your data forever. In general, databases have multiple retention policies that are used for different purposes.

How retention policies work

What are the typical use-cases of retention policies?

Let’s pretend that you are using InfluxDB for live monitoring of an entire infrastructure.

You want to be able to detect when a server goes off for example. In this case, you are interested in data coming from that server in the present or short moments before. You are not interested in keeping the data for several months, as a result, you want to define a small retention policy: one or two hours for example.

Now if you are using InfluxDB for IoT, capturing data coming from a water tank for example. Later, you want to be able to share your data with the data science team for them to analyze it. In this case, you might want to keep data for a longer time: five years for example.
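As a sketch, those two scenarios could translate into InfluxQL retention policies like the ones below. The database and policy names are placeholders, and note that InfluxQL expresses durations in weeks at most, so five years becomes roughly 260w:

CREATE RETENTION POLICY "two_hours" ON "monitoring" DURATION 2h REPLICATION 1 DEFAULT
CREATE RETENTION POLICY "five_years" ON "water_tank" DURATION 260w REPLICATION 1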

Point

Finally, an easy one to end this chapter about InfluxDB terms. A point is simply a set of fields that has the same timestamp. In a SQL world, it would be seen as a row or as a unique entry in a table. Nothing special here.

Congratulations on making it so far! In the next chapter, we are going to see the different use-cases of InfluxDB and how it can be used to take your company to the next level.

InfluxDB Use-Cases

Here is a detailed explanation of InfluxDB Use-Cases:

DevOps Monitoring

DevOps Monitoring is a very big subject nowadays. More and more teams are investing in building fast and reliable architectures that revolve around monitoring. From services to clusters of servers, it is very common for engineers to build a monitoring stack that provides smart alerts.

If you are interested in learning more about DevOps Monitoring, I wrote a couple of guides on the subject, you might find them relevant to your needs.

From the tools defined in section 1, you could build your own monitoring infrastructure and bring direct value to your company or start-up.

IoT World

The IoT is probably the next big revolution that is coming in the next few years. By 2020, it is estimated that over 30 billion devices will be considered IoT devices. Whether you are monitoring a single device or a giant network of IoT devices, you want to have accurate and instant metrics for you to take the best decisions regarding the goal you are trying to achieve.

Real companies are already working with InfluxDB for IoT. One example is WorldSensing, a company that aims at expanding smart cities via individual concepts such as smart parking or traffic monitoring systems. Their website is available here:

Industrial & Smart Plants

Plants are becoming more and more connected. Tasks are more automated than ever: as a consequence, there is an obvious need to monitor every piece of the production chain to ensure maximal throughput. But even when machines are not doing all the work and humans are involved, time-series monitoring is a unique opportunity to bring relevant metrics to managers.

Besides reinforcing productivity, these metrics can contribute to building safer workplaces, as issues are detected quicker. That's value for managers as well as for workers.

Your Own Imagination!

The examples detailed above are just examples, and your imagination is the only limit to the applications that you can find for time series databases. I have shown it via some articles that I wrote, but time series can even be used in cybersecurity!

If you have cool applications of InfluxDB or time-series databases, post them as comments below; it is interesting to see what ideas people can come up with.

Going Further

In this article, you learned many different concepts: what time-series databases are and how they are used in the real world. We have gone through a complete list of all the technical terms behind InfluxDB, and I am now confident to say that you are ready to go on your own adventure.

My advice to you right now would be to build something on your own. Install it, play with it, and start bringing value to your company or start-up today. Create a dashboard, play with queries, set up some alerts: there are many things that you will need to do in order to complete your InfluxDB journey.

If you need some inspiration to go further, you can check the other articles that we wrote on the subject: they provide clear step-by-step guides on how to set everything up.