Understanding Processes on Linux

Are you looking for a complete guide to understanding processes on Linux? Then this is the right page for developers and administrators alike.

Everything from what processes are in Linux to how they are managed is explained here in detail, with simple examples for better understanding. If you work as a system administrator, you have most likely already dealt with Linux processes in many different ways.

Processes are at the center of the Linux operating system: created and managed by the kernel, they represent the operations currently running on your Linux host. You can do just about anything with processes: start them, interrupt them, resume them, or stop them.

In today’s tutorial, we are going to take a deep look at Linux processes: what they are, which commands are associated with them, how they are used on our operating system, what signals are, and how we can allocate more computational resources to our existing processes.

What You Will Learn

By reading this tutorial until the end, you will learn about the following concepts

  • What processes are and how they are created on a Linux system?
  • How processes can be identified on a Linux system?
  • What background and foreground processes are?
  • What signals are and how they can be used to interact with processes?
  • How to use the pgrep as well as the pkill command effectively
  • How to adjust process priority using nice and renice
  • How to see process activity in real-time on Linux

That’s quite a long program, so without further ado, let’s start with a brief description of what processes are.

Linux Processes Basics

In short, processes are running programs on your Linux host that perform operations such as writing to a disk, writing to a file, or running a web server for example.

Every process has an owner and is identified by a process ID (also called a PID).

On the other hand, programs are lines of code or machine instructions stored on persistent data storage.

They can just sit on your data storage, or they can be in execution, i.e. running as processes.

In order to perform the operations they are assigned to, processes need resources: CPU time, memory (such as RAM or disk space), but also virtual memory such as swap space in case your process gets too greedy.

Obviously, processes can be started, stopped, interrupted, and even killed.

Before issuing any commands, let’s see how processes are created and managed by the kernel itself.

Process Initialization on Linux

As we already stated, processes are managed by the Kernel on Linux.

However, there is a core concept that you need to understand in order to know how Linux creates processes.

By default, when you boot a Linux system, your Linux kernel is loaded into memory, it is given a virtual filesystem in the RAM (also called initramfs) and the initial commands are executed.

One of those commands starts the very first process on Linux.

Historically, this process was called the init process but it got replaced by the systemd initialization process on many recent Linux distributions.

To prove it, run the following command on your host

$ ps -aux | head -n 2

As you can see, the systemd process has a PID of 1.

If you were to print all processes on your system, using a tree display, you would find that all processes are children of the systemd one.

$ pstree

It is noteworthy to underline the fact that all those initialization steps (except for the launch of the initial process) are done in a reserved space called the kernel space.

The kernel space is a space reserved to the Kernel in order for it to run essential system tools properly and to make sure that your entire host is running in a consistent way.

On the other hand, user space is reserved for processes launched by the user and managed by the kernel itself.

As a consequence, the systemd process is the very first process launched in the userspace.

Creation of a Process in Linux

A new process is normally created when an existing process makes an exact copy of itself in memory. The child process will have the same environment as its parent, but only the process ID number is different.

In order to create a new process in Linux, we can use two conventional ways. They are as such:

  • With the system() function – this method is relatively simple, however, it is inefficient and carries certain security risks.
  • With the fork() and exec() functions – this technique is a little more advanced but offers greater flexibility, speed, and security.

Process Creation using Fork and Exec

When you are creating and running a program on Linux, it generally involves two main steps: fork and execute.

Fork operation

A fork is a clone operation, it takes the current process, also called the parent process, and it clones it in a new process with a brand new process ID.

When forking, everything is copied from the parent process: the stack, the heap, but also the file descriptors meaning the standard input, the standard output, and the standard error.

It means that if my parent process was writing to the current shell console, the child process will also write to the shell console.

[Diagram: the fork operation cloning the parent process, with three instructions left on its stack, into a new child process]

The execution of the cloned process will also start at the same instruction as the parent process.
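
If you want to see a fork happening from your own shell, here is a minimal sketch (assuming you are running bash, where the $BASHPID variable exposes the PID of the process currently executing, even inside a subshell):

$ echo $$            # PID of your current shell (the parent process)
$ ( echo $BASHPID )  # the parentheses create a subshell: a forked child with its own, different PID

The two numbers differ because the subshell is a brand new process cloned from your shell.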

Execute operation

The execute operation is used on Linux to replace the current process image with the image from another process.

On the previous diagram, we saw that the stack of the parent process contained three instructions left.

As a consequence, the instructions were copied to the new process but they are not relevant to what we want to execute.

The exec operation will replace the process image (i.e the set of instructions that need to be executed) with another one.

If you were for example to execute the exec command in your bash terminal, your shell would terminate as soon as the command is completed, as your current process image (your bash interpreter) would be replaced with the process image of the command you are trying to launch.

$ exec ls -l

If you were to trace the system calls made when a command is launched, you would see the execve system call being used to load the new program image.
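
As a quick illustration (assuming the strace utility is installed on your host), you can watch the clone and execve system calls yourself:

$ strace -f -e trace=clone,execve bash -c 'ls' 2>&1 | head   # -f follows child processes, -e filters the system calls shown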

Creating processes from a shell environment

When you are launching a shell console, the exact same principles apply when you are launching a command.

A shell console is a process that waits for input from the user.

When you hit Enter, it interprets the command line and provides an environment for your commands to run.

But the shell follows the steps we described earlier.

When you hit enter, the shell is forked to a child process that will be responsible for running your command. The shell will wait patiently until the execution of the child process finishes.

On the other hand, the child process is linked to the same file descriptors and it may share variables that were declared on a global scope.

The child process executes the exec system call in order to replace its current process image (the shell process image) with the process image of the command you are trying to run.

The child process will eventually finish and it will print its result to the standard output it inherited from the parent process, in this case, the shell console itself.

Now that you have some basics about how processes are created in your Linux environment, let’s see some details about processes and how they can be identified easily.

Identifying & Listing Running Processes on Linux

The easiest way to identify running processes on Linux is to run the ps command.

$ ps

By default, the ps command will show you the list of the currently running processes owned by the current user.

In this case, only two processes are running for my user: the bash interpreter and the ps command I have run into it.

The important part here is that processes have owners, most of the time the user who runs them in the first place.

To illustrate this, let’s have a listing of the first ten processes on your Linux operating system, with a different display format.

$ ps -ef | head -n 10

As you can see here, the top ten processes are owned by the user “root“.

This information will be particularly important when it comes to interacting with processes with signals.

To display the processes that are owned and executed by the current connected user, run the following command

$ ps u

There are plenty of different options for the ps command, and they can be seen by running the manual command.

$ man ps

From experience, the two most important commands in order to see running processes are

ps aux

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

That corresponds to a BSD-style process listing, whereas the following command

ps -ef

UID  PID  PPID C STIME TTY  TIME CMD

Corresponds to a POSIX-style process listing.

They are both representing current running processes on a system, but the first one has the “u” option for “user oriented” which makes it easier to read process metrics.
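
If you need more control over the columns displayed, ps also accepts a custom output format. Here is a small sketch (the exact column set and sort key are just one possible choice):

$ ps -eo pid,ppid,user,ni,stat,%cpu,%mem,cmd --sort=-%cpu | head -n 10   # every process, sorted by CPU usage, first ten lines only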

Now that you have seen what processes are and how they can be listed, let’s see what background and foreground processes are on your host.

Here is the description of the fields displayed by the ps -f command:

  • UID: user ID that this process belongs to (the person running it)
  • PID: process ID
  • PPID: parent process ID (the ID of the process that started it)
  • C: CPU utilization of the process
  • STIME: process start time
  • TTY: terminal type associated with the process
  • TIME: CPU time taken by the process
  • CMD: the command that started this process

Also, there are other options that can be used along with the ps command:

  • -a: shows information about all users
  • -x: shows information about processes without terminals
  • -u: shows additional information, like the -f option
  • -e: displays extended information

How to Control Processes in Linux?

In Linux, there are a few commands for controlling processes, such as kill, pkill, pgrep, and killall. Here are a few key examples of how to use them:

$ pgrep -u tecmint top       # find the PID of the "top" process owned by the user tecmint
$ kill 2308                  # send the default TERM signal to the PID returned above (2308 in this example)
$ pgrep -u tecmint top       # check that the process is gone
$ pgrep -u tecmint glances   # find the PID of the "glances" process owned by the user tecmint
$ pkill glances              # kill it by name instead of by PID
$ pgrep -u tecmint glances   # check that the process is gone

Types of Processes

Basically, there are two types of processes in Linux:

  • Foreground processes (also referred to as interactive processes) – these are initialized and controlled through a terminal session. In other words, there has to be a user connected to the system to start such processes; they are not started automatically as part of the system functions/services.
  • Background processes (also referred to as non-interactive/automatic processes) – these are processes not connected to a terminal; they don’t expect any user input.

Background and foreground processes

The definition of background and foreground processes is pretty self-explanatory.

Jobs and processes in the current shell

A background process on Linux is a process that runs in the background, meaning that it is not actively managed by a user through a shell for example.

On the opposite side, a foreground process is a process that can be interacted with via direct user input.

Let’s say for example that you have opened a shell terminal and that you typed the following command in your console.

$ sleep 10000

As you probably noticed, your terminal will hang until the termination of the sleep process. As a consequence, the process is not executed in the background, it is executed in the foreground.

I am able to interact with it. If I press Ctrl + Z, it will directly send a stop signal to the process for example.

However, there is a way to execute the process in the background.

To execute a process in the background, simply put a “&” sign at the end of your command.

$ sleep 10000 &

As you can see, the control was directly given back to the user and the process started executing in the background

To see your process running, in the context of the current shell, you can execute the jobs command

$ jobs

Jobs are a list of processes that were started in the context of the current shell and that may still be running in the background.

As you can see in the example above, I have two processes currently running in the background.

The different columns from left to right represent the job ID, the process state (that you will discover in the next section), and the command executed.

Using the bg and fg commands

In order to interact with jobs, you have two commands available: bg and fg.

The bg command is used on Linux in order to send a process to the background and the syntax is as follows

$ bg %<job_id>

Similarly, in order to send a process to the foreground, you can use the fg in the same fashion

$ fg %<job_id>

Going back to the list of jobs from our previous example, if I want to bring job 3 to the foreground, meaning to the current shell window, I would execute the following command

$ fg %3

By issuing a Ctrl + Z command, I am able to stop the process. I can link it with a bg command in order to send it to the background.
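
Here is what a typical session could look like, assuming a single sleep job (the exact job number and output format may differ slightly on your shell):

$ sleep 10000
^Z                                   # Ctrl + Z stops the foreground process
[1]+  Stopped                 sleep 10000
$ bg %1                              # resume job 1 in the background
$ jobs                               # the job is now listed as Running
$ fg %1                              # bring it back to the foreground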

Now that you have a better idea of what background and foreground processes are, let’s see how it is possible for you to interact with the process using signals.

Interacting with processes using signals

On Linux, signals are a form of interprocess communication (also called IPC) that creates and sends asynchronous notifications to running processes about the occurrence of a specific event.

Signals are often used in order to send a kill or a termination command to a process in order to shut it down (also called kill signal).

In order to send a signal to a process, you have to use the kill command.

$ kill -<signal number> <pid>|<process_name>

For example, in order to force an HTTPD process (PID = 123) to terminate (without a clean shutdown), you would run the following command

$ kill -9 123

Signals categories explained

As explained, there are many signals that one can send in order to notify a specific process.

Here is the list of the most commonly used ones:

  • SIGINT: short for signal interrupt, this signal is used in order to interrupt a running process. It is also the signal that is sent when a user presses Ctrl + C in a terminal;
  • SIGHUP: short for signal hangup, this is the signal sent by your terminal when it is closed. Similarly to a SIGINT, the process terminates;
  • SIGKILL: signal used in order to force a process to stop whether it can be gracefully stopped or not. This signal can not be ignored except by the init process (or the systemd one on recent distributions);
  • SIGQUIT: a specific signal sent when a user wants to quit or to exit the current process. It can be invoked by pressing Ctrl + \ and it is often used in terminal shells or in SSH sessions;
  • SIGUSR1, SIGUSR2: those signals are used purely for communication purposes and they can be used in programs in order to implement custom handlers (see the sketch right after this list);
  • SIGSTOP: instructs the process to stop its execution without terminating the process. The process is then waiting to be continued or to be killed completely;
  • SIGCONT: if the process is marked as stopped, it instructs the process to start its execution again.
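
To make the SIGUSR1/SIGUSR2 point concrete, here is a minimal sketch of a custom handler using the bash trap builtin (the echoed message is purely illustrative):

$ trap 'echo "SIGUSR1 received, reloading configuration"' SIGUSR1   # register a handler in the current shell
$ kill -SIGUSR1 $$                                                  # send the signal to the current shell: the handler prints the message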

In order to see the full list of all signals available, you can run the following command

$ kill -l

 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
 5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
 9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     16) SIGSTKFLT
17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU
25) SIGXFSZ     26) SIGVTALRM   27) SIGPROF     28) SIGWINCH
29) SIGIO       30) SIGPWR      31) SIGSYS      34) SIGRTMIN
35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3  38) SIGRTMIN+4
39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14
51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10
55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7  58) SIGRTMAX-6
59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX

Signals and Processes States

Now that you know that it is possible to interrupt, kill or stop processes, it is time for you to learn about processes states.

States of a Process in Linux

Processes have many different states, they can be :

  • Running: running processes are the ones using some computational power (such as CPU time) at the current time. A process can also be called “runnable” if all running conditions are met and it is waiting for some CPU time from the CPU scheduler.
  • Stopped: a process that is stopped received a SIGSTOP signal or a SIGTSTP signal (the one sent by the Ctrl + Z keyboard shortcut). Its execution is suspended and it is either waiting for a SIGCONT or for a SIGKILL.
  • Sleeping: a sleeping process is a process waiting for some event or for a resource (like a disk) to become available.

Here is a diagram that represents the different process states linked to the signals you may send to them.

[Diagram: process states (running, stopped, sleeping) and the signals that trigger transitions between them]

Now that you know a bit more about process states, let’s have a look at the pgrep and pkill commands.

Manipulating Process with pgrep and pkill

On Linux, there is already a lot that you can do by simply using the ps command.

You can narrow down your search to one particular process, and you can use the PID in order to kill it completely.

However, there are two commands that were designed in order for your commands to be even shorter: pgrep and pkill

Using the pgrep command

The pgrep command is a shortcut for using the ps command piped with the grep command.

The pgrep command will search for all the occurrences for a specific process using a name or a defined pattern.

The syntax of the pgrep command is the following one

$ pgrep <options> <pattern>

For example, if you were to search for all processes named “bash” on your host, you would run the following command

$ pgrep bash

The pgrep command is not restricted to the processes owned by the current user by default.

If another user was to run the bash command, it would appear in the output of the pgrep command.

Using the pgrep command pgrep

It is also possible to search for processes using a pattern, as the argument given to pgrep is an extended regular expression rather than a simple name.
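
For example, here are a couple of sketches using the -l flag, which prints the process name next to the PID:

$ pgrep -l 'ssh'      # every process whose name contains "ssh" (sshd or ssh-agent, if they are running)
$ pgrep -l '^ba'      # every process whose name starts with "ba" (bash, for example)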

Using the pkill command

On the other hand, the pkill command is also a shortcut for the ps command used with the kill command.

The pkill command is used in order to send signals to processes based on their IDs or their names.

The syntax of the pkill command is as follows

$ pkill <options> <pattern>

For example, if you want to kill all Firefox windows on your host, you would run the following command

$ pkill firefox

Similar to the pgrep command, you have the option to narrow down your results by specifying a user with the -u option.

To kill all processes whose name starts with “fire” and that are owned by a given user and root, you would run the following command (replace <user> with the actual user name)

$ pkill -u <user>,root "^fire"

If you don’t have the right to stop a process, you will get a permission denied error message to your standard output.

You also have the option to send specific signals by specifying the signal number in the pkill command

For example, in order to stop Firefox with a SIGSTOP signal, you would run the following command

$ pkill -19 firefox

Since signal numbers can vary between architectures, using the signal name instead (pkill -SIGSTOP firefox) is a safer habit.

Changing Linux Process Priority using nice and renice

On Linux, not all processes are given the same priority when it comes to CPU time.

Some processes, such as very important processes run by root, are given a higher priority in order for the operating system to work on tasks that truly matter to the system.

Process priority on Linux is called the nice level.

The nice level is a priority scale going from -20 to 19.

The lower you go on the niceness scale, the higher the priority will be.

Similarly, the higher you are on the niceness scale, the lower your priority will be.

In order to remember it, you can remember the fact that “the nicer you are, the more you are willing to share resources with others”.

In order to start a certain program or process with a given nice level, you will run the following command

$ nice -n <level> <command>

For example, in order to run the tar command with a custom nice level, you would run the following command

$ nice -n 19 tar -cvf test.tar file

Similarly, you can use the renice command in order to set the nice level of a running process to a given value.

$ renice -n <priority> <pid>

For example, if I have a running process with the PID 123, I can use the renice command in order to set its priority to a given value.

$ renice -n 18 123

Niceness and permissions

If you are not a member of the sudo group (or a member of the wheel group on Red Hat-based distributions), there are some restrictions when it comes to what you can do with the nice command.

To illustrate it, try to run the following command as a non-sudo user

$ nice -n -1 tar -cvf test.tar file

nice: cannot set niceness: Permission denied

When it comes to niceness, there is one rule that you need to know:

As a non-root (or sudo) user, you won’t be able to set a nice level lower than the default assigned one (which is zero), and you won’t be able to renice a running process to a lower level than the current one.

To illustrate the last point, launch a sleep command in the background with a nice value of 2.

$ nice -n 2 sleep 10000 &

Next, identify the process ID of the process you just created.

Now, try to set the nice level of your process to a value lower than the one you specified in the first place.

$ renice -n 1 8363

As you probably noticed, you won’t be able to set the niceness level to 1, but only to a value higher than the one you specified.

Now if you choose to execute the command as sudo, you will be able to set the nice level to a lower value.

Now that you have a clear idea of the nice and renice commands, let’s see how you can monitor your processes in real-time on Linux.

Monitoring processes on Linux using top and htop

In a previous article, we discussed how it is possible to build a complete monitoring pipeline in order to monitor Linux processes in real-time.

Using top on Linux

top is an interactive command that any user can run in order to get a complete and ordered listing of all processes running on a Linux host.

To run top, simply execute it without any arguments.

top will run in interactive mode.

$ top

If you want to run top for a custom number of iterations, run the following command

$ top -n <number>
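
If you want to capture the output in a script instead of using the interactive view, batch mode comes in handy. A minimal sketch:

$ top -b -n 1 | head -n 15   # one non-interactive iteration, keeping only the first fifteen lines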

The top command will first show recap statistics about your system at the top, for example, the number of tasks running, the percentage of CPU used, or the memory consumption.

Right below it, you have access to a live list of all processes running or sleeping on your host.

This view will refresh every three seconds, but you can obviously tweak this parameter.

To change the refresh rate of the top command, press the “d” key and enter a new delay, in seconds.

Similarly, you can change the nice value of a running process live by pressing the “r” key on your keyboard.

The same permission rules apply if you want to renice processes to a value lower than the one they are already assigned.

As a consequence, you may need to run the command as sudo.

Using htop on Linux

Alternatively, if you are looking for a nicer way to visualize processes on your Linux host, you can use the htop command.

By default, the htop command is not available on most distributions, so you will need to install it. On Debian or Ubuntu-based distributions, use the following instructions.

$ sudo apt-get update
$ sudo apt-get install htop

If you are running a Red Hat based distribution, run the following commands.

$ sudo yum -y install epel-release
$ sudo yum -y update
$ sudo yum -y install htop

Finally, to run the htop command, simply run it without any arguments.

$ htop

As you can see, the output is very similar except that it showcases information in a more human-friendly output.

Conclusion

In this tutorial, you learned many concepts about processes: how they are created, how they can be managed, and how they can be monitored effectively.

If you are looking for more tutorials related to Linux system administration, we have a complete section dedicated to it on the website, so make sure to check it out.

Until then, have fun, as always.

Access Control Lists on Linux Explained

Access control lists (ACLs) provide an additional, more flexible permission mechanism for file systems. They are designed to supplement the standard UNIX file permissions, and they allow you to grant permissions to any user or group on any disk resource.

If you work as a system administrator, you are probably already familiar with Linux ACLs, as they are used to define more fine-grained discretionary access rights for files and directories.

In today’s Access Control Lists on Linux Explained tutorial, we are going to take a deeper look at Linux access control lists: what they are used for and how they are managed in order to configure a Linux system properly.

Get Ready to Learn A New Topic?

What You Will Learn

If you follow this tutorial until the end, you are going to learn about the following topics:

  • What access control lists are and how to read them
  • How to create and modify ACL entries using the setfacl command
  • How to list ACL entries using the getfacl command
  • What the ACL mask is and how it restricts effective permissions
  • How to create and delete default ACLs on directories

That’s quite a long program, so without further ado, let’s start with a quick definition of what Linux file access control lists (ACLs) are.

Access Control Lists Basics on Linux

On Linux, there are two ways of setting permissions for users and groups: with regular file permissions or with access control lists.

What is ACL(Access Control Lists)?

Access control lists are used on Linux filesystems to set custom and more personalized permissions on files and folders. ACLs allow file owners or privileged users to grant rights to specific users or to specific groups.

In Linux, as you probably know, the permissions are divided into three categories: one for the owner of the file, one for the group, and one for the others.

However, in some cases, you may want to grant access to a directory (the execute permission for example) to a specific user without having to put this user into the group of the file.

This is exactly why access control lists were invented in the first place.
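
As a quick, hypothetical illustration (the user "bob" and the /srv/project directory are made up for this example), this is the kind of grant that ACLs make possible:

$ setfacl -m u:bob:rx /srv/project   # give bob read and execute on the directory, without touching its group
$ getfacl /srv/project               # a "user:bob:r-x" entry should now appear in the listing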

Listing Access Control List

On Linux, access control lists are not enabled when you create a new file or directory on your host (except if a parent directory has some ACLs predefined).

To see if access control lists are defined for a file or directory, run the ls command and look for a “+” character at the end of the permission line.

$ ls -l

To show the difference, here is what the listing looks like on a minimal instance where no ACLs are defined.

Now that you have some basics about access control lists, let’s see how you can start creating basic ACLs for your files and directories.

List of commands for setting up ACL

1) To add permission for user
setfacl -m "u:user:permissions" /path/to/file

2) To add permissions for a group
setfacl -m "g:group:permissions" /path/to/file

3) To allow all files or directories to inherit ACL entries from the directory it is within
setfacl -dm "entry" /path/to/dir

4) To remove a specific entry
setfacl -x "entry" /path/to/file

5) To remove all entries
setfacl -b path/to/file

Creating access control lists on Linux

Before starting with ACL commands, it is important to have the packages installed on your host.

Checking ACL packages installation

This might not be the case if you chose to run a minimal server.

Start by checking the help related to the setfacl by running the following command

$ setfacl --help

If your host cannot find the setfacl command, make sure to install the necessary packages for ACL management.

$ sudo apt-get install acl -y

Note that you will need sudo privileges on Debian 10 to run this command.

Run the setfacl command and make sure that you are able to see the help commands this time.

Now that your host is correctly configured, let’s see how the setfacl command works.

Setting access control lists using setfacl

With access control lists, there are two main commands that you need to remember: setfacl and getfacl.

In this chapter, we are going to take a look at the setfacl command as the getfacl one is pretty self-explanatory.

The setfacl command is used on Linux to create, modify and remove access control lists on a file or directory.

The setfacl has the following syntax

$ setfacl {-m, -x}  {u, g}:<name>:[r, w, x] <file, directory>

Where curly brackets mean one of the following options and regular brackets mean one or several items.

  • -m: means that you want to modify one or several ACL entries on the file or directory.
  • -x: means that you want to remove one or several ACL entries on a file or directory.
  • {u, g}: if you want to modify the ACL for a user or for a group.
  • name: this is an optional parameter; it can be omitted, in which case the entry applies to the owning user or the owning group of the file rather than to a named user or group.
  • [r, w, x]: in order to set read, write or execute permissions on the file or directory.

For example, in order to set specific write permissions for a user on a file, you would write the following command

$ setfacl -m u:user:w <file, directory>

In order to set execute permissions for the owning user of the file, you would write the following command

$ setfacl -m u::x <file, directory>

To set full permissions for a specific group on your host, you would write the setfacl this way

$ setfacl -m g:group:rwx <file, directory>

Now let’s say that you want to remove an ACL entry from a file.

In order to remove a user-specific entry from a file, you would specify the x option.

Note: you cannot remove specific rights from a single ACL entry with this option; the whole entry is removed, meaning that you can’t remove just the write permission while keeping the read permission active.

$ setfacl -x u:<user> <file, directory>

Similarly, to remove ACL related to groups on your host, you would write the following command

$ setfacl -x g:<group> <file, directory>

Now that you have seen how you can create access control lists easily on Linux, it is time to see how you can check existing access control lists on files and directories.

Listing access control lists using getfacl

The getfacl command is used on Linux to print a complete listing of all regular permissions and access control lists permissions on a file or directory.

The getfacl can be used with the following syntax

$ getfacl <file, directory>

The getfacl command is divided into multiple categories :

  • Filename, owner, and group: The information about the user and group ownership is shown at the top;
  • User permissions: First, you would find regular user permissions, also called the owning user, followed by any user-specific ACL entries (called named users)
  • Group permissions: Owning groups are presented followed by group-specific ACL entries, also called named groups
  • Mask: That restricts the permissions given to ACL entries, the mask is going to be detailed in the next section;
  • Other permissions: Those permissions are always active and this is the last category explored when no other permissions match with the current user or group.

Working with the access control lists mask

As you probably saw from the last screenshot, there is a mask entry between the named groups and the other permissions.

But what is this mask used for?

The ACL mask is different from the file creation mask (umask) and it is used to restrict the ACL entries that exist on a file or directory.

The ACL mask defines the maximum set of permissions that named users, named groups, and the owning group can effectively be granted: any permission that exceeds the mask is simply not effective.

As always, a diagram speaks a hundred words.

[Diagram: the ACL mask acting as an upper bound on the effective permissions of named users, named groups, and the owning group]

The ACL mask is updated every time you run a setfacl command unless you specify that you don’t want to update the mask with the -n flag.

To prevent the mask from being updated, run the setfacl with the following command

$ setfacl -n -m u:antoine:rwx <file, directory>

As you can see in this example, I have set the user “antoine” to have full permissions on the file.

The mask is set to restrict permissions to read and write permissions.

As a consequence, the “effective permissions” set on this file for this user are read and write ones, the execute permission is not granted.

Note: If your maximum set of permissions differs from the mask entry, you will be presented with an effective line computing the “real” set of ACL entries used.
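
To see the mask in action, here is a short sketch reusing the “antoine” entry from the example above (the file name is illustrative):

$ setfacl -m u:antoine:rwx file   # the named user entry asks for rwx
$ setfacl -m m::rx file           # explicitly restrict the mask to r-x
$ getfacl file                    # the user:antoine:rwx line is now followed by "#effective:r-x"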

Creating access control lists defaults on directories

As already mentioned in this article, it is possible to create ACL entries on directories and they work in the same way file access control lists work.

However, there is a small difference when it comes to directories: you have the option to create access control list defaults.

Access control lists defaults are used to create ACL entries on a directory that will be inherited by objects in this directory like files or subdirectories.

When creating default ACL entries :

  • Files created in this directory inherit the ACL entries specified in the parent directory
  • Subdirectories created in this directory inherit the ACL entries as well as the default ACL entries from the parent directory.

To create default ACL entries, specify the -d option when setting ACL using the setfacl command.

$ setfacl -d -m {u, g}:<name>:[r, w, x] <directory>

For example, to assign read permissions to all files created in a directory, you would run the following command

$ setfacl -d -m u::r directory

Now, when a file is created in this acl-directory, you can see that default ACL entries are applied to the file.

Similarly, when a directory is created in the acl-directory, it will inherit default ACL entries specified in the parent directory.

Note that it is recommended to specify default permissions for all three categories (user, group, and other).

In fact, specifying one of the three entries will create the remaining two with permissions related to the file creation mask.
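
Following that recommendation, here is a minimal sketch that sets all three default categories at once on the acl-directory used above:

$ setfacl -d -m u::rwx,g::rx,o::r acl-directory   # default entries for the owning user, the group, and others
$ getfacl acl-directory                           # the listing should now show "default:" lines for all three categories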

Deleting default access control lists on directories

In order to delete default existing access control lists on directories, use the -k flag with the setfacl command.

$ setfacl -k <directory>

Given the example we specified earlier, here is how to delete default entries

$ setfacl -k acl-directory

Note that deleting ACL entries from the parent directory does not delete ACL entries in files or directories contained in the parent directory.

To remove default ACL entries in a directory and all subdirectories, you would have to use a recursive option (-R)

$ setfacl -kR <directory>

Conclusion

In this tutorial, you learned about access control lists on Linux, the getfacl, and the setfacl command.

You also discovered more about the access control lists mask and how default ACL is used in order to create ACL entries on files and subdirectories contained in the parent directory.

If you are curious about Linux system administration, we have many more tutorials on the subject, make sure to read them!

Cron Jobs and Crontab on Linux Explained

This Cron Jobs and Crontab on Linux Explained Tutorial helps you understand cron on Linux along with the role of the crontab file. System administrators are likely to spend a lot of time performing recurring tasks on their systems.

The best way to automate such tasks on Linux systems is to use cron jobs. Cron was initially developed in 1975 at AT&T Bell Laboratories.

You are going to learn not only about cron jobs and the crontab command on Linux, but also about the Linux cron daemon. We will also focus on the difference between user-defined cron jobs and system-defined cron jobs.

Are you ready to learn?

What is ‘crontab’ in Linux?

The crontab is a list of commands that you want to run on a regular schedule, and also the name of the command used to manage that list. Crontab stands for “cron table,” because it uses the job scheduler cron to execute tasks.

cron itself is named after “Chronos”, the Greek word for time. cron is the system process that will automatically execute tasks for you according to a set schedule. The schedule is called the crontab, which is also the name of the program used to edit that schedule.

Linux Crontab Format

MIN HOUR DOM MON DOW CMD

Linux Crontab Syntax

Field    Description       Allowed Values
MIN      Minute field      0-59
HOUR     Hour field        0-23
DOM      Day of month      1-31
MON      Month field       1-12
DOW      Day of week       0-6
CMD      Command           Any command to be executed

Important Crontab Examples

The following are some common examples of crontab entries. Kindly have a look at them:

Cron command to do various scheduling jobs. The command below executes at 7 AM and 5 PM daily.
0 7,17 * * * /scripts/script.sh

Command to execute a cron job every 5 minutes.
*/5 * * * * /scripts/script.sh

Cron scheduler command that executes the task every Monday at 5 AM. This is helpful for weekly tasks like system clean-up.
0 5 * * mon /scripts/script.sh

Command to run your script at 3-minute intervals.
*/3 * * * * /scripts/monitor.sh

Linux Crontab Command

The crontab command permits you to install, view, or open a crontab file for editing:

  • crontab -e: Edit crontab file, or create one if it doesn’t already exist.
  • crontab -l: Display crontab file contents.
  • crontab -r: Remove your current crontab file.
  • crontab -i: Remove your current crontab file with a prompt before removal.
  • crontab -u <username>: Edit other user crontab files. This option needs system administrator privileges.

What is Cron and Cron Jobs in Linux?

Cron is a system daemon run on any Linux system that is responsible for detecting cron jobs and executing them at given intervals.

Cron runs every minute and it will inspect a set of pre-defined directories on your filesystem to see if jobs need to be run.

On the other hand, cron jobs are tasks defined to run at given intervals or periods, usually shell scripts or simple bash commands.

Cron jobs are usually used in order to log certain events to your Syslog utilities, or to schedule backup operations on your host (like database backups or filesystem backups).

For a Linux OS running systemd as a service manager, you can inspect the cron service by running the following command

$ sudo systemctl status cron.service

Note: You need sudo privileges to inspect system services with systemd

What is the Cron Job Syntax?

The most important thing to know about cron is probably the cron job syntax.

In order to define a cron job, you are going to define:

  • Periodicity: meaning when your job is going to be executed over time. You can define it to run every first day of the month, every 5 minutes, or on a very specific day of the year. Examples will be given later on in the article;
  • Command: literally the command to be executed by the cron utility, it can be a backup command or whatever command that you would normally run in a shell;
  • User: this is reserved for system-defined cron jobs, where you have to specify the user that should run the cron command. For user-defined cron jobs, you don’t have to specify a user, and your system will run them as the user who owns the crontab.

[Diagram: the cron job syntax: five periodicity fields, an optional user for system-defined jobs, and the command to run]

As you probably noticed, the periodicity column is made of 5 columns.

Every column can either be set to *, meaning that the command will be executed for every value of that field, or to a particular value, for example the 6th month of the year.

If you want to execute a command for every minute, of every hour, of every day of the month, of every month, you need to write the following command

* * * * *  logger "This is a command executed every minute"

If you want to execute a command every 30 minutes, you would write

*/30 * * * * logger "This is executed every 30 minutes"

On the other hand, if you want to execute a command on the first day of the month at midnight, you would write

0 0 1 * * logger "This is executed on the first day of the month, at midnight"

When defining those cron jobs, I didn’t have to specify the user executing them.

This is because there is a difference between user-defined cron jobs and system-defined cron jobs.

User-Defined Cron Jobs

User-defined cron jobs are cron jobs defined by a given user on the host. This doesn’t mean that they cannot execute commands affecting the entire system, but their definitions are isolated in per-user files on the host.

Every user can have their own set of cron jobs on a Linux host.

Listing user-defined cron jobs

When connected to a specific user, run this command to see the cron jobs owned by the user

$ crontab -l

If you own cron jobs, they will immediately be displayed to the standard output.

By default, user-defined cron jobs are stored in the /var/spool/cron/crontabs directory but you will need to be root to explore it.

Adding user-defined cron jobs

In order to edit the cron jobs related to the user you are connected to, run the following command

$ crontab -e

By default, your host will open your default editor and you will be able to add your cron jobs to the system.

Add a new line to the end of the file with the following line for example

* * * * * logger "This is a log command from junosnotes"

Logger is a command that allows users to write custom messages to logs. If you need a complete guide on logging and Syslog, we have a complete write-up on it.

You don’t have to specify a user as the system is already aware of the user defining this command.

Moreover, the command will be executed as the current user by default.

You don’t have to restart any services, your job will be taken into account on the next occurrence.

Given the example we specified earlier, inspect your logs to see your cron job being executed

$ sudo journalctl -xfn

As you can see, the cron service inspected user-specific directories on the host (in /var/spool/cron/crontabs), it opened a session as my current user, executed the command, and closed the session.

Awesome!

You learned how you can define user-defined cron jobs on your host.

Removing user defined cron jobs

In order to remove user-defined cron jobs, use the following command

$ crontab -r
(or)
$ crontab -ri

Crontab will be deleted for your current user (it won’t delete system-defined cron jobs).

Run a cron job listing to check that all cron jobs have been deleted
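
For example (the exact wording of the message may vary between distributions):

$ crontab -l
no crontab for <user>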

System Defined Cron Jobs

System-defined cron jobs are jobs defined in shared directories on the filesystem.

It means that, given that you have sudo privileges on the host, you will be able to define cron jobs that may be modified by other administrators on your system.

Directories related to system-defined cron jobs are located in the /etc directory and can be seen by running

$ ls -l /etc | grep cron

As you can see, this folder contains a lot of different folders and files :

  • anacrontab: a file used by the anacron service on Linux, which will be explained in one of the next sections.
  • cron.d: a directory containing a list of cron jobs to be read by the cron service. The files in cron.d are written given the cron syntax we saw before;
  • cron.daily: a directory containing a list of scripts to be executed by the system every day. Files are different from the files contained in the cron.d directory because they are actual bash scripts and not cron jobs written with cron syntax;
  • cron.hourly, cron.monthly, cron.weekly are self-explanatory, they contain scripts executed every hour, every month, and every week of the year;
  • crontab: a cron file written with cron syntax that instructs the cron service to run jobs located in the daily, hourly, monthly, and weekly folders. It can also define custom jobs similarly to user-defined cron jobs, except that you have to specify the user that should run the command.

Listing system defined cron jobs

As you probably understood, cron jobs defined in global configuration folders are spread over multiple folders.

Moreover, multiple cron files can be defined on those folders.

However, using the command line, there are efficient ways to concatenate all the files in a given directory.

To list all cron jobs defined in cron.d, run the following command

$ cat /etc/cron.d/*

Similarly, to list cron jobs defined in the crontab file, run the following command

$ cat /etc/crontab

Similarly, you can inspect all the scripts that are going to be executed on a daily basis

$ ls -l /etc/cron.daily/

Adding system defined cron jobs

As you probably understood, there are multiple ways to add system-defined cron jobs.

You can create a cron file in the cron.d, and the file will be inspected every minute for changes.

You can also add your cron job directly to the crontab file. If you want to execute a task every minute or every hour, it is recommended to add your script directly to the corresponding cron directories.

The only difference with user-defined cron jobs is that you will have to specify a user that will run the cron command.

For example, create a new file in the cron.d directory and add the following content to it (you will obviously need sudo privileges for the commands to run)

$ sudo nano /etc/cron.d/custom-cron

*/1 * * * *    root    logger 'This is a cron from cron.d'

Again, no need for you to restart any services, the cron service will inspect your file on the next iteration.

To see your cron job in action, run the following command

$ sudo journalctl -xfn 100 | grep logger

You should see the logger message appearing in your logs every minute.

Great!

As you can see your job is now executed every minute by the root user on your host.

Now that you have a complete idea of what user-defined cron jobs and system-defined cron jobs are, let’s see the complete cron lifecycle on a Linux host.

Cron Complete Cycle on Linux

Without further ado, here is a complete cron cycle on Linux.

[Diagram: the complete cron cycle on Linux: the files and directories inspected by the cron service every minute]

This is what your cron service does every minute, as well as all the directories inspected.

Cron will inspect the user-defined cron jobs and execute them if needed.

It will also inspect the crontab file where several default cron jobs are defined by default.

Those default cron jobs are scripts that instruct your host to verify every minute, every hour, every day, and every week specific folders and to execute the scripts that are inside them.

Finally, the cron.d directory is inspected. The cron.d may contain custom cron files and it also contains a very important file which is the anacron cron file.

Anacron cron file on Linux

The anacron cron file is a file executed regularly during the day (typically once an hour, at half past the hour, between 7:00 am and 11 pm).

The anacron cron file is responsible for calling the anacron service.

The anacron service is a service that is responsible for running cron jobs in case your computer was not able to run them in the first place.

Suppose that your computer is off but you had a cron job responsible for running update scripts every week.

As a consequence, when turning your computer on, instead of waiting an entire week to run those update scripts, the anacron service will detect that you were not able to launch the update cron before.

Anacron will then proceed to run the cron job for your system to be updated.

By default, the anacron cron file is instructed to verify that the cron.daily, cron.weekly, and cron.monthly directories were correctly run in the past.

If anacron detects that the cron jobs in the cron.monthly folders haven’t run, it will be in charge to run them.

Conclusion

Today, you learned how cron and crontab work on Linux. You also had a complete introduction on the cron syntax and how to define your own cron scripts as a user on your host.

Finally, you had a complete cron cycle overview of how things work on your host and of what anacron is.

If you are interested in Linux system administration, we have a complete section about it on our website, so make sure to check it out.

Until then, have fun, as always.

Input Output Redirection on Linux Explained

In Linux, input and output redirection is something you perform very often in your daily work. It is one of the core concepts of Unix-based systems, and it is a great way to improve your productivity.

In this tutorial, we will be discussing in detail regarding the standard input/output redirections on Linux. Mostly, Unix system commands take input from your terminal and send the resultant output back to your terminal.

Moreover, this guide will look at how the Linux kernel represents files and at the way processes work, so that you get a deep and complete understanding of what input and output redirection is.

If you follow this Input Output Redirection on Linux Explained tutorial until the end, you will have a good grip on the following concepts:

  • What file descriptors are and how they relate to standard input and output
  • How to check standard input and output for a given process on Linux
  • How to redirect standard input and output on Linux
  • How to use pipelines to chain inputs and outputs for long commands

So without further ado, let’s take a look at what file descriptors are and how files are conceptualized by the Linux kernel.

Get Ready?

What is Redirection?

In Linux, redirection is a feature that lets you change the standard input/output devices when executing a command. The basic workflow of any Linux command is that it takes an input and gives an output.

  • The standard input (stdin) device is the keyboard.
  • The standard output (stdout) device is the screen.

With redirection, the standard input/output can be changed.

What are Linux processes?

Before understanding input and output on a Linux system, it is very important to have some basics about what Linux processes are and how they interact with your hardware.

If you are only interested in input and output redirection command lines, you can jump to the next sections. This section is for system administrators willing to go deeper into the subject.

a – How are Linux processes created?

You probably already heard it before, as it is a pretty popular adage, but on Linux, everything is a file.

It means that processes, devices, keyboards, hard drives are represented as files living on the filesystem.

The Linux Kernel may differentiate those files by assigning them a file type (a file, a directory, a soft link, or a socket for example) but they are stored in the same data structure by the Kernel.

As you probably already know, Linux processes are created as forks of existing processes which may be the init process or the systemd process on more recent distributions.

When creating a new process, the Linux Kernel will fork a parent process and duplicate the task structure described in the next section.

b – How are files stored on Linux?

I believe that a diagram speaks a hundred words, so here is how files are conceptually stored on a Linux system.

[Diagram: for each process, a task_struct holding references to filesystem metadata (fs) and to the files structure containing the file descriptors]

As you can see, for every process created, a new task_struct is created on your Linux host.

This structure holds two references, one for filesystem metadata (called fs) where you can find information such as the filesystem mask for example.

The other one is a structure for files holding what we call file descriptors.

It also contains metadata about the files used by the process but we will focus on file descriptors for this chapter.

In computer science, file descriptors are references to files that are currently opened and tracked by the kernel on behalf of a process.

But what do those files even represent?

c – How are file descriptors used on Linux?

As you probably already know, the kernel acts as an interface between your programs and your hardware devices (a screen, a mouse, a CD-ROM, or a keyboard).

It means that your Kernel is able to understand that you want to transfer some files between disks, or that you may want to create a new video on your secondary drive for example.

As a consequence, the Linux Kernel is permanently moving data from input devices (a keyboard for example) to output devices (a hard drive for example).

Using this abstraction, processes are essentially a way to manipulate inputs (as Read operations) to render various outputs (as write operations)

But how do processes know where data should be sent to?

Processes know where data should be sent to using file descriptors.

On Linux, the file descriptor 0 (or fd[0]) is assigned to the standard input.

Similarly the file descriptor 1 (or fd[1]) is assigned to the standard output, and the file descriptor 2 (or fd[2]) is assigned to the standard error.

It is a constant on a Linux system, for every process, the first three file descriptors are reserved for standard inputs, outputs, and errors.

Those file descriptors are mapped to devices on your Linux system.

Devices are registered when the kernel is initialized; they can be seen in the /dev directory of your host.

If you were to take a look at the file descriptors of a given process, let’s say a bash process for example, you can see that file descriptors are essentially soft links to real hardware devices on your host.

As you can see, when isolating the file descriptors of my bash process (that has the 5151 PID on my host), I am able to see the devices interacting with my process (or the files opened by the kernel for my process).

In this case, /dev/pts/0 represents a terminal which is a virtual device (or tty) on my virtual filesystem. In simpler terms, it means that my bash instance (running in a Gnome terminal interface) waits for inputs from my keyboard, prints them to the screen, and executes them when asked to.
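
You can reproduce this observation for your own shell with a single command (nothing is assumed beyond a running bash session):

$ ls -l /proc/$$/fd   # file descriptors 0, 1 and 2 of the current shell, typically symlinks to a pseudo-terminal such as /dev/pts/0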

Now that you have a clearer understanding of file descriptors and how they are used by processes, we are ready to describe how to do input and output redirection on Linux.

What is Output redirection on Linux?

Input and output redirection is a technique used in order to redirect/change standard inputs and outputs, essentially changing where data is read from, or where data is written to.

For example, if I execute a command on my Linux shell, the output might be printed directly to my terminal (a cat command for example).

However, with output redirection, I could choose to store the output of my cat command in a file for long-term storage.

a – How does output redirection work?

Output redirection is the act of redirecting the output of a process to a chosen place like files, databases, terminals, or any devices (or virtual devices) that can be written to.

As an example, let’s have a look at the echo command.

By default, the echo function will take a string parameter and print it to the default output device.

As a consequence, if you run the echo function in a terminal, the output is going to be printed in the terminal itself.

Now let’s say that I want the string to be printed to a file instead, for long-term storage.

To redirect standard output on Linux, you have to use the “>” operator.

As an example, to redirect the standard output of the echo function to a file, you should run

$ echo junosnotes > file

If the file does not exist, it will be created.

Next, you can have a look at the content of the file and see that the “junosnotes” string was correctly printed to it.

redirecting-output-to-a-file

Alternatively, it is possible to redirect the output by using the “1>” syntax.

$ echo test 1> file


b – Output Redirection to files in a non-destructive way

When redirecting the standard output to a file, you probably noticed that it erases the existing content of the file.

Sometimes, it can be quite problematic as you would want to keep the existing content of the file, and just append some changes to the end of the file.

To append content to a file using output redirection, use the “>>” operator rather than the “>” operator.

Given the example we just used before, let’s add a second line to our existing file.

$ echo a second line >> file

appending-content-output-redirection

Great!

As you can see, the content was appended to the file, rather than overwriting it completely.
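To check the result yourself (the first line depends on which of the previous echo commands you ran last, so your output may differ slightly):

$ cat file
test
a second line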

c – Output redirection gotchas

When dealing with output redirection, you might be tempted to read from a file and redirect the output of a command back to the same file.

Redirecting to the same file

$ echo 'This is a cool butterfly' > file
$ sed 's/butterfly/parrot/g' file > file

What do you expect to see in the test file?

The result is that the file is completely empty.

c – Output redirection gotchas cat-file-empty

Why?

By default, when parsing your command, the shell sets up the redirections before running the command itself.

It means that it won't wait for the end of the sed command before opening your target file and preparing to write the result to it.

Instead, the shell opens your file, erases all the content inside it, and only then lets the sed operation run.

As the sed operation now sees an empty file (because all the content was erased by the output redirection), there is no content left to transform.

As a consequence, nothing is written to the file, and its content ends up completely empty.

In order to redirect the output to the same file, you may want to use pipes or more advanced commands such as

command … input_file > temp_file  &&  mv temp_file input_file
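As a minimal sketch, here is the same pattern applied to the earlier sed example (the temporary file name is arbitrary):

$ echo 'This is a cool butterfly' > file
$ sed 's/butterfly/parrot/g' file > file.tmp && mv file.tmp file
$ cat file
This is a cool parrot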

Protecting a file from being overwritten

In Linux, it is possible to protect files from being overwritten by the “>” operator.

You can protect your files by setting the “noclobber” parameter on the current shell environment.

$ set -o noclobber

Alternatively, the same protection can be enabled with the shorter syntax

$ set -C

Note: to re-enable output redirection, simply run set +C

noclobber

As you can see, the file cannot be overwritten when this parameter is set.

If I really want to force the overwrite, I can use the ">|" operator to force it.

override-1
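Here is a minimal sketch of the whole sequence (the exact error wording may vary with your bash version):

$ set -o noclobber
$ echo junosnotes > file
bash: file: cannot overwrite existing file
$ echo junosnotes >| file
$ set +o noclobber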

What is Input Redirection on Linux?

Input redirection is the act of redirecting the input of a process to a given device (or virtual device) so that it starts reading from this device and not from the default one assigned by the Kernel.

a – How does input redirection work?

For instance, when you open a terminal, you interact with it using your keyboard.

However, there are some cases where you might want to programmatically send the content of a file to your command instead.

input-redirection-diagram

To redirect the standard input on Linux, you have to use the “<” operator.

As an example, let's say that you want to use the content of a file and run a specific command on it.

In this case, I am going to use a file containing domains, and the command will be a simple sort command.

In this way, domains will be sorted alphabetically.

With input redirection, I can run the following command

cat-input-redirect

If I want to sort those domains, I can redirect the content of the domains file to the standard input of the sort function.

$ sort < domains

sort-example

With this syntax, the content of the domains file is redirected to the input of the sort function. It is quite different from the following syntax

$ sort domains

Even if the output may be the same, in this case, the sort function takes a file as a parameter.

In the input redirection example, the sort function is called with no parameter.

As a consequence, when no file parameter is provided, the function reads from the standard input by default.

In this case, it is reading the content of the file provided.

b – Redirecting standard input with a file containing multiple lines

If your file contains multiple lines, you can still redirect the standard input to your command and process every single line of the file.

b – Redirecting standard input with a file containing multiple lines multiline1

Let’s say for example that you want to have a ping request for every single entry in the domains file.

By default, the ping command expects a single IP or URL to be pinged.

You can, however, redirect the content of your domains file to a custom loop that will execute a ping command for every entry.

$ ( while read ip; do ping -c 2 $ip; done ) < domains


c – Combining input redirection with output redirection

Now that you know that standard input can be redirected to a command, it is useful to mention that input and output redirection can be done within the same command.

Now that you are performing ping commands, you are getting the ping statistics for every single website on the domains list.

The results are printed on the standard output, which is in this case the terminal.

But what if you wanted to save the results to a file?

This can be achieved by combining input and output redirections on the same command.

$ ( while read ip; do ping -c 2 $ip; done ) < domains > stats.txt

Great! The results were correctly saved to a file and can be analyzed later on by other teams in your company.

d – Discarding standard output completely

In some cases, it might be handy to discard the standard output completely.

It may be because you are not interested in the standard output of a process or because this process is printing too many lines on the standard output.

To discard standard output completely on Linux, redirect the standard output to /dev/null.

Data redirected to /dev/null is simply discarded.

$ cat file > /dev/null

Note: Redirecting to /dev/null does not erase the content of the file but it only discards the content of the standard output.

What is standard error redirection on Linux?

Finally, after input and output redirection, let’s see how standard error can be redirected.

a – How does standard error redirection work?

Very similarly to what we saw before, error redirection is redirecting errors returned by processes to a defined device on your host.

For example, if I am running a command with bad parameters, what I am seeing on my screen is an error message and it has been processed via the file descriptor responsible for error messages (fd[2]).

Note that there is no trivial way to visually differentiate an error message from a standard output message in the terminal; you have to rely on the programmer sending error messages to the correct file descriptor.

error-redirection-diagram

To redirect error output on Linux, use the “2>” operator

$ command 2> file

Let’s use the example of the ping command in order to generate an error message on the terminal.

Now let’s see a version where the error output is redirected to an error file.

As you can see, I used the “2>” operator to redirect errors to the “error-file” file.

If I were to redirect only the standard output to the file, nothing would be printed to it.

As you can see, the error message was printed to my terminal and nothing was added to my “normal-file” output.
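Here is a minimal sketch of those two runs (the unreachable host name is made up, and the exact error text depends on your ping implementation):

$ ping not-a-real-host 2> error-file
$ cat error-file
ping: not-a-real-host: Name or service not known
$ ping not-a-real-host > normal-file
ping: not-a-real-host: Name or service not known
$ cat normal-file
(no output: the file is empty, because only an error message was produced and it went to the terminal)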

b – Combining standard error with standard output

In some cases, you may want to combine the error messages with the standard output and redirect it to a file.

It can be particularly handy because some programs are not only returning standard messages or error messages, but a mix of the two.

Let’s take the example of the find command.

If I am running a find command on the root directory without sudo rights, I might not be authorized to access some directories, such as directories belonging to other users or to processes that I don't own.

permission-denied

As a consequence, there will be a mix of standard messages (the files owned by my user) and error messages (when trying to access a directory that I don’t own).

In this case, I want to have both outputs stored in a file.

To redirect the standard output as well as the error output to a file, use the "2>&1" syntax with a preceding ">".

$ find / -user junosnotes > file 2>&1

Alternatively, you can use the “&>” syntax as a shorter way to redirect both the output and the errors.

$ find / -user junosnotes &> file

So what happened here?

When bash sees multiple redirections, it processes them from left to right.

As a consequence, the output of the find function is first redirected to the file.

Next, the second redirection is processed and redirects the standard error to the standard output (which was previously assigned to the file).

multiple-redirections

multiple-redirections 2
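Because redirections are processed from left to right, the order in which you write them matters. Here is a minimal sketch reusing the find example above:

$ find / -user junosnotes > file 2>&1    # standard output and errors both end up in the file
$ find / -user junosnotes 2>&1 > file    # errors still go to the terminal

In the second command, "2>&1" is processed while file descriptor 1 still points to the terminal, so the standard error is duplicated onto the terminal; only the standard output is then redirected to the file.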

What are pipelines on Linux?

Pipelines are a bit different from redirections.

When doing standard input or output redirection, you were essentially overwriting the default input or output to a custom file.

With pipelines, you are not overwriting inputs or outputs, but you are connecting them together.

Pipelines are used on Linux systems to connect processes together, linking standard outputs from one program to the standard input of another.

Multiple processes can be linked together with pipelines (or pipes)

pipelines-linux-2

Pipes are heavily used by system administrators in order to create complex queries by combining simple queries together.

One of the most popular examples is probably counting the number of lines in a text file, after applying some custom filters on the content of the file.

Let's go back to the domains file we created in the previous sections and change some of the extensions so that the file also includes .net domains.

Now let's say that you want to count the number of .com domains in the file.

How would you perform that? By using pipes.

First, you want to filter the results to isolate only the .com domains in the file. Then, you want to pipe the result to the “wc” command in order to count them.

Here is how you would count .com domains in the file.

$ grep '\.com' domains | wc -l

Here is a diagram of what happened, in case the flow is still unclear.

counting-domains
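Pipes are not limited to two commands. As a small sketch (assuming one domain per line in the file), you could count how many domains use each extension by chaining three commands:

$ awk -F. '{print $NF}' domains | sort | uniq -c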

Awesome!

Conclusion

In today’s tutorial, you learned what input and output redirection is and how it can be effectively used to perform administrative operations on your Linux system.

You also learned about pipelines (or pipes) that are used to chain commands in order to execute longer and more complex commands on your host.

If you are curious about Linux administration, we have a whole category dedicated to it on JunosNotes, so make sure to check it out!

Understanding Hard and Soft Links on Linux

Understanding Hard and Soft Links on Linux | What are Hard & Soft Links in Linux?

In this tutorial, we are going to discuss what hard and soft links are, their syntax, and how you can easily work with them on Linux.

In case you are wondering how you can create a shortcut on a Linux system, this tutorial can be the perfect answer for you.

Are you ready to start learning about hard and soft links on Linux? Here you go.

What Will You Learn?

This section summarizes what this tutorial covers, so you know ahead of time what you are going to learn:

  • How storage works on a Linux system and what inodes are exactly?
  • What hard and soft links are, given what you just learned before
  • How copying differs from creating links to your files
  • How to create links on a Linux system
  • How to find hard and soft links: all the commands that you should know.
  • Some of the quick facts about hard and soft links

That's a long program, so without further ado, let's see how data is organized on your Linux system and what inodes are.

Do Refer: 

How does storage work on a Linux system?

In order to understand what hard and soft links are, you need to have some basics on how data is stored on your system.

In a computer, data are stored on a hard drive.

Your hard drive has a capacity, let’s say 1 TB, and it is divided into multiple blocks of a given capacity.

II - How does storage work on a Linux system storage-design

If I launch a simple fdisk command on my system, I am able to see that my device is separated into sectors of 512 bytes.

II - How does storage work on a Linux system fdisk-simple

That’s a lot of sectors, as my hard drive has a capacity of almost 32 GBs.

Now every time that I create a file, it will be stored on a block.

But what if the file size exceeds 512 bytes?

You guessed it, my file will be “fragmented” and stored into multiple different blocks.

how-files-are-stored

If the different pieces of your file are physically far away from each other on the disk, the time needed to build your file will obviously be longer.

That's why you had to defragment your disk on Windows systems, for example: defragmentation essentially moves related blocks closer together.

Luckily for us, those blocks have addresses.

Your Linux system does not have to read the entire disk to find your file, it will cherry-pick some addresses that correspond to your file data.

How?

By using inodes.

a – What are inodes?

Inodes are essentially identification cards for your file.

They contain the file metadata: the file permissions, the file type, the file size, and most importantly the physical addresses of the file's data blocks on the disk.

inode

Without going into too many details, your inode will keep references to the locations of your different file blocks.

So one inode equals one file.

That’s the first layer of your file system, and how references are kept to the underlying data.

storage-first-layer

However, as you probably noticed, filenames are not part of the inode, they are not part of the identification card of the file.

b – About filenames and inodes

On a Linux system, filenames are kept on a separate index.

This separate index keeps track of all the different filenames existing on your system (even directories) and maps each of them to the corresponding inode in the inode index.

What does it mean?

It essentially means that you can have multiple filenames (say “doc.txt” and “paper.txt”) pointing to the same exact file, sharing the same content.
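Here is a minimal sketch illustrating this, using the hypothetical "doc.txt" and "paper.txt" names from above and the ln command that is covered later in this guide (inode numbers are machine-specific):

$ echo "same content" > doc.txt
$ ln doc.txt paper.txt
$ ls -i doc.txt paper.txt
258545 doc.txt  258545 paper.txt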

Now that you learned about the filenames index, let’s add another layer to our previous layered architecture.

b – About filenames and inodes filesystems-design

With this schema, you have a basic understanding of how files are organized and stored on your filesystem, but more importantly, you will be able to understand what hard and soft links are.

What is Soft Link And Hard Link In Linux?

Let’s start with softs links as they are probably the easiest to understand.

a – Understanding soft links

Soft links, also called symbolic links, are files that point to other files on the filesystem.

Similar to shortcuts on Windows or macOS, soft links are often used as faster ways to access files located in another part of the filesystem (because the path may be hard to remember for example).

Symbolic links are identified with the filetype “l” when running a ls command.

They also have a special syntax composed of the link name and an arrow pointing to the file they are referencing.

a – Understanding soft links symbolic-link

In this case, the file “shortcut” is pointing to the file “file.txt” on my filesystem.

Have you paid attention to the permissions?

They are set to “rwx” by default for the user, the group, and the others.

However, you would still be constrained by the permissions of the target file if you were to manipulate it as another user.

b – Soft links and inodes

So why did we talk so much about inodes in the first section?

Let’s have a quick look at the inode of the file and the inode of the shortcut.

$ stat shortcut
  File: shortcut -> file.txt
  Size: 8               Blocks: 0          IO Block: 4096   symbolic link
Device: fc01h/64513d    Inode: 258539      Links: 1

$ stat file.txt
  File: file.txt
  Size: 59              Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 258545      Links: 2

The inodes are different.

However, the original file inode is pointing directly to the file content (i.e the blocks containing the actual content) while the symbolic link inode is pointing to a block containing the path to the original file.

sooft-links

The file and the shortcut share the same content.

It means that if I were to modify the content through the shortcut, the changes would be passed on to the content of the original file.

If I delete the shortcut, it will simply delete the reference to the first inode. As the file inode (the first one) still references the content of the file on the disk, the content won’t be lost.

However, if I were to delete the file, the symbolic link would lose its reference to the first inode. As a consequence, you would not be able to read the file anymore.

This is what we call a dangling symbolic link, a link that is not pointing to anything.

deleting-soft-link-original-file

See the red highlighting when I deleted the original file?

Your terminal is giving visual clues that a symbolic link is a dangling symbolic link.

So what’s the size of a symbolic link?

Remember, the symbolic link points to the path of the original file on the filesystem.

In this example, I created a file named “devconnected”, and I built a shortcut to it using the ln command.

Can you guess the size of the shortcut? 12, because “devconnected” actually contains 12 letters.

soft-link-size
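To reproduce this quickly on your own host (sample output; the date and owner will of course differ):

$ touch devconnected
$ ln -s devconnected shortcut
$ ls -l shortcut
lrwxrwxrwx 1 schkn schkn 12 Aug 14 20:12 shortcut -> devconnected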

Great!

Now you have a good understanding of what soft links are.

c – Understanding hard links

Hard links to a file are instances of the file under a different name on the filesystem.

Hard links are literally the file, meaning that they share all the attributes of the original file, even the inode number.

hard-link

Here’s a hard link created on my system.

It is sharing the same content as the original file and it is also sharing the same permissions.

Changing permissions of the original file would change the permissions of the hard link.

d – Hard links and inodes

Let’s have a quick look at the original file inode and the hard link inode.

$ stat hardlink
  File: hardlink
  Size: 59              Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 258545      Links: 2

$ stat file.txt
  File: file.txt
  Size: 59              Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 258545      Links: 2

As you probably noticed, the inodes are the same, but the filenames are different!

Here’s what happens in this case on the filesystem.

hard-soft-links

When you are creating a symbolic link, you are essentially creating a new link to the same content, via another inode, but you don't have direct access to the content from the link itself: the link only stores the path to the original file.

When creating a hard link, you are literally directly manipulating the file.

If you modify the content in the hard link file, the content will be changed in the original file.

Similarly, if you modify the content in the original file, it will be modified in the hard link file.

However, if you delete the hard link file, you will still be able to access the original file content.

Similarly, deleting the original file has no consequences on the content of the hard link.

Data is definitely deleted only when no filename points to its inode anymore.
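You can watch this happening through the link count (the number right after the permissions in ls -l, or the "Links" field of stat); a minimal sketch:

$ ln file.txt hardlink
$ ls -l file.txt
-rw-rw-r-- 2 schkn schkn 59 Aug 14 20:12 file.txt
$ rm hardlink
$ ls -l file.txt
-rw-rw-r-- 1 schkn schkn 59 Aug 14 20:12 file.txt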

Hard or Soft?

You won't find a universal answer to this question: the type of link that suits your particular situation is the best one. While these concepts can be tricky to memorize, the syntax is pretty straightforward, so that is a plus!

To keep the two links easily separated in your mind, I leave you with this:

  • A hard link always points a filename to data on a storage device.
  • A soft link always points a filename to another filename, which then points to information on a storage device.

What is the difference between copying and creating a hard link?

With the concepts that you just learned, you may wonder what’s the difference between copying a file and creating a hard link to the file.

Don’t we have two files in the end, with the same content?

The difference between copying and hard linking is that hard-linking does not duplicate the content of the file that it links to.

When you are copying a file, you are essentially assigning new blocks on the disk with the same content as the original file.

Even though you share the same content when hard-linking, the only extra disk space used is for the new directory entry (the new filename), not for a second copy of the file's content.

Diagrams may explain it better than words.

Here’s what copying a file means.

copying-a-file

See how it differs from hard-linking?
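A quick way to see the difference is to compare inode numbers with ls -li; a minimal sketch (inode numbers are sample values):

$ cp file.txt copy.txt
$ ln file.txt hardlink.txt
$ ls -li copy.txt file.txt hardlink.txt
258546 -rw-rw-r-- 1 schkn schkn 59 Aug 14 20:12 copy.txt
258545 -rw-rw-r-- 2 schkn schkn 59 Aug 14 20:12 file.txt
258545 -rw-rw-r-- 2 schkn schkn 59 Aug 14 20:12 hardlink.txt

The copy gets a brand new inode (and new data blocks), while the hard link reuses the original inode.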

Now that you know how copying files differ from hard linking to files, let’s see how you can create symbolic and hard links on a Linux system.

Manipulating links on a Linux system

a – How to create a symbolic link on Linux?

To create a symbolic link, you need to use the ln command, with a -s flag (for symbolic).

The first argument specifies the file you want to link to.

The second argument describes the name of the link you are about to create.

$ ln -s file shortcut

$ ls -l
-rw-rw-r-- 1 schkn schkn 0 Aug 14 20:12 file
lrwxrwxrwx 1 schkn schkn 4 Aug 14 20:12 shortcut -> file

You can also create symbolic links to directories.

$ mkdir folder
$ ln -s folder shortcut-folder

$ ls -l
drwxrwxr-x  2 schkn schkn   4096 Aug 14 20:13 folder
lrwxrwxrwx  1 schkn schkn      7 Aug 14 20:14 shortcut-folder -> folder/

b – How to delete symbolic links on Linux

To remove existing symbolic links, use the unlink command.

Following our previous example :

$ unlink shortcut

$ ls -l 
-rw-rw-r-- 1 schkn schkn 0 Aug 14 20:12 file

You can also simply remove the shortcut by using the rm command.

Using the unlink command might be a little bit safer than performing an rm command.

$ rm shortcut

$ ls -l 
-rw-rw-r-- 1 schkn schkn 0 Aug 14 20:12 file

c – How to create a hard link on Linux

A hard link is created using the ln command without specifying the -s flag.

$ ln file hardlink

$ ls -l
-rw-rw-r--  2 schkn schkn      0 Aug 14 20:12 file
-rw-rw-r--  2 schkn schkn      0 Aug 14 20:12 hardlink

d – How to remove a hard link on Linux

Again, you can use the unlink command to delete a hard link on a Linux system.

$ ln file hardlink
$ unlink hardlink
$ ls -l
-rw-rw-r--  2 schkn schkn      0 Aug 14 20:12 file

Now that you know how to create links, let’s see how you can find links on your filesystem.

How to find links on a filesystem?

There are multiple ways to find links on a Linux system, but here are the main ones.

Using the find command

The find command has a type flag that you can use in order to find links on your system.

$ find . -type l -ls
262558      0 lrwxrwxrwx   1 schkn    schkn           7 Aug 14 20:14 ./shortcut-folder2 -> folder2/
.
.
.
262558      0 lrwxrwxrwx   1 schkn    schkn           7 Aug 14 20:14 ./shortcut-folder -> folder/

However, if you want to limit searches to the current directory, you have to use the maxdepth parameter.

$ find . -maxdepth 1 -type l -ls 
262558      0 lrwxrwxrwx   1 schkn    schkn           7 Aug 14 20:14 ./shortcut-folder -> folder/
258539      0 lrwxrwxrwx   1 schkn    schkn           3 Jan 26  2019 ./soft-job -> job

Finding links that point to a specific file

With the lname option, you have the opportunity to target links pointing to a specific filename.

$ ls -l
drwxrwxr-x  2 schkn schkn   4096 Aug 14 20:13 folder
lrwxrwxrwx  1 schkn schkn      7 Aug 14 20:38 devconnected -> folder/

$ find . -lname "fold*"
./devconnected

Finding broken links

With the L flag, you can find broken (or dangling) symbolic links on your system.

$ find -L . -type l -ls
258539      0 lrwxrwxrwx   1 schkn    schkn           3 Jan 26  2019 ./broken-link -> file

Quick facts about links on Linux

Before finishing this tutorial, there are some quick facts that you need to know about soft and hard links.

  • Soft links can point to different filesystems, and to remote filesystems. If you were to use NFS, which stands for Network File System, you would be able to create a symbolic link from one filesystem to a file system accessed by the network. As Linux abstracts different filesystems by using a virtual filesystem, it makes no difference for the kernel to link to a file located on an ext2, ext3, or an ext4 filesystem.
  • Hard links cannot be created for directories and they are constrained to the limits of your current filesystem. Creating hard links for directories could create access loops, where you would try to access a directory that points to itself. If you need more explanations about why it can’t be done conceptually, here’s a very good post by Volker Siegel on the subject.

I hope that you learned something new today. If you are looking for more Linux system administration tutorials, make sure to check our dedicated section.

Here is the list of our recent tutorials:

How To Install Git On Debian 10 Buster

How To Install Git On Debian 10 Buster | Debian Git Repository | Debian Buster Git

Git is the world's most popular distributed version control system, allowing you to keep track of your software at the source level. It is used by many open-source and commercial projects. In this tutorial, we will discuss how to install and get started with Git on Debian 10 Buster, along with an introduction to Git: what Git is, common Git terms, Git commands, and the main features of Git.

What is Git?

Git is the most commonly used distributed version control system in the world, created by Linus Torvalds in 2005. It is the popular option among open-source and other collaborative software projects. Project files are kept in a Git repository, and hosting services like GitHub, GitLab, or Bitbucket help promote sharing and collaboration around software development projects.

Mainly, the Git tool is utilized by development teams to keep track of all the changes happening on a codebase, as well as organizing code in individual branches. In today’s tutorial, we are working on how to set up Git on a Debian 10 Buster machine.

What is Debian?

Debian is an operating system for a wide range of devices including laptops, desktops, and servers. The Debian developers provide security updates for all packages for most of their lifetime. The current stable distribution of Debian is version 10, codenamed Buster. Debian 10 is brand new, so if you require a complete setup tutorial for Debian 10, follow this tutorial.

Also Check:

Terms of Git

For a better understanding of Git, you should know a few common Git terms, detailed below (a short example session follows the list):

  • Repository: It is a directory on your local computer or a remote server where all your project files are kept and tracked by Git.
  • Modified: If you add a file in the staging area, and modify the file again before committing, then the file will have a modified status. You will have to add the file to the staging area again for you to be able to commit it.
  • Commit: It is keeping a snapshot of the files that are in the staging area. A commit has information such as a title, description, author name, email, hash, etc.
  • Staged: Before you commit your changes to the Git repository, you must add the files to the staging area. The files in the staging area are called staged files.
  • Tracked: a file that Git keeps track of. You have to tell Git explicitly to start tracking a file.
  • Untracked: If you create a new file on your Git repository, then it is called an untracked file in Git. Unless you tell git to track it, Git won’t track a file.
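To see these terms in action, here is a minimal, hypothetical session (the repository and file names are made up):

$ mkdir demo && cd demo
$ git init
$ echo "hello" > notes.txt      # notes.txt is untracked at this point
$ git status --short            # prints "?? notes.txt"
$ git add notes.txt             # notes.txt is now tracked and staged
$ git commit -m "Add notes"     # the staged snapshot is committed
$ echo "more" >> notes.txt      # notes.txt is now in the modified state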

Git Features

Before learning how to install Git, it is essential to know the tool well. So, here are the features of Git, summarized in an image format for quick reference and easy sharing. Take a look at the shareable image below and feel free to download it:

Git Features shareable image

How To Install Git On Linux 2021?

Prerequisites

Before starting, make sure that you have root privileges on your instance.

To make sure of it, run the following command.

$ sudo -l

I – Prerequisites sudo-rights

How to Install Git from official sources?

By following the below sub-modules, you can easily understand the installation of Git from official sources:

Update your apt repositories

First of all, make sure that you have the latest versions of the repositories on your apt cache.

To update them, run the following command:

$ sudo apt update

II – Install Git from official sources apt-update

Install Git from the official repository

To install the latest stable version of Git (2.20.1 in my case), run the following command.

$ sudo apt-get install git

b – Install Git from the official repository git-install

Great!

Now you can check the git version that is running on your computer.
$ git --version
git version 2.20.1

Steps for Installing Git From Source

As you probably noticed, you are not getting the latest version of Git with the apt repositories. As of August 2019, the latest Git version is 2.22.0. In order to install the latest Git version on your Debian 10 instance, follow those instructions.

Install required dependencies

In order to build Git, you will have to manually install some dependencies on your system. To do so, run the following command:

$ sudo apt-get install dh-autoreconf libcurl4-gnutls-dev libexpat1-dev \
  gettext libz-dev libssl-dev

a – Install required dependencies manual-dependencies

Install documentation dependencies

In order to add documentation to Git (in different formats), you will need the following dependencies

$ sudo apt-get install asciidoc xmlto docbook2x

b – Install documentation dependencies manual-2

Install the install-info dependencies

On Debian configurations, you will need to add the install-info dependency to your system.

$ sudo apt-get install install-info

c – Install the install-info dependencies manual-3

Download and build the latest Git version

Head to the Git repository on Github, and select the version you want to run on your Debian instance.
d – Download and build the latest Git version latest-git-version

Head to the directory where you stored the tar.gz file, and run the following commands.

$ tar -zxf git-2.22.0.tar.gz
$ cd git-2.22.0
$ make configure
$ ./configure --prefix=/usr
$ make all doc info
$ sudo make install install-doc install-html install-info

Again, run the following command to make sure that Git is correctly installed on your system

$ git --version

d – Download and build the latest Git version git-2.22.0

Configuring Git

Now that Git is correctly set on your instance, it is time for you to configure it.

This information is used when you are committing to repositories: you want to make sure that you appear under the correct name and email address.

To configure Git, run the following commands:

$ git config --global user.name "devconnected" 
$ git config --global user.email "devconnectedblog@gmail.com"

Now to make sure that your changes are made, run the following command.

$ git config --list

IV – Configuring Git git-config

You can also look at your modifications in the .gitconfig file available in your home directory.

To view it, run the following command.

$ cat ~/.gitconfig

IV – Configuring Git gitconfig-file

Now that your Git instance is up and running, it is time for you to make your first contributions to the open-source world!

Here’s a very good link by Digital Ocean on a first introduction to the Open Source world!

Uninstalling Git

If by any chance you are looking for removing Git from your Debian 10 Buster instance, run the following command:

$ sudo apt-get remove git

V – Uninstalling Git git-remove

Until then, have fun, as always.

Syslog The Complete System Administrator Guide

Syslog: The Complete System Administrator Guide

If you manage Linux systems or work as a system administrator, chances are high that you will have to work with Syslog at least once.

System logging on Linux is closely tied to the Syslog protocol, a specification that defines a standard for message logging on any system.

Developers or administrators who are not familiar with Syslog can acquire complete knowledge from this tutorial. Syslog was designed in the early '80s by Eric Allman (at the University of California, Berkeley), and it works on any operating system that implements the Syslog protocol.

If you want to learn more about Syslog and Linux logging in general, this guide and the other related articles on Junosnotes.com are the perfect place to start.

Here is everything that you need to know about Syslog:

What is the purpose of Syslog?

I – What is the purpose of Syslog

Syslog is used as a standard to produce, forward, and collect logs produced on a Linux instance. Syslog defines severity levels as well as facility levels helping users having a greater understanding of logs produced on their computers. Logs can, later on, be analyzed and visualized on servers referred to as Syslog servers.

Here are a few more reasons why the Syslog protocol was designed in the first place:

  • Defining an architecture: this will be explained in detail later on, but if Syslog is a protocol, it will probably be part of complete network architecture, with multiple clients and servers. As a consequence, we need to define roles, in short: are you going to receive, produce or relay data?
  • Message format: Syslog defines the way messages are formatted. This obviously needs to be standardized as logs are often parsed and stored into different storage engines. As a consequence, we need to define what a Syslog client would be able to produce, and what a Syslog server would be able to receive;
  • Specifying reliability: Syslog needs to define how it handles messages that can not be delivered. As part of the TCP/IP stack, Syslog will obviously be opinionated on the underlying network protocol (TCP or UDP) to choose from;
  • Dealing with authentication or message authenticity: Syslog needs a reliable way to ensure that clients and servers are talking in a secure way and that messages received are not altered.

Now that we know why Syslog is specified in the first place, let’s see how a Syslog architecture works.

Must Refer: How To Install and Configure Debian 10 Buster with GNOME

What is Syslog architecture?

When designing a logging architecture, as a centralized logging server, it is very likely that multiple instances will work together.

Some will generate log messages, and they will be called “devices” or “syslog clients“.

Some will simply forward the messages received, they will be called “relays“.

Finally, there are some instances where you are going to receive and store log data, those are called “collectors” or “syslog servers”.

syslog-component-arch

Knowing those concepts, we can already state that a standalone Linux machine acts as a "syslog client-server" on its own: it produces log data, which is collected by rsyslog and stored right into the filesystem.

Here’s a set of architecture examples around this principle.

In the first design, you have one device and one collector. This is the most simple form of logging architecture out there.

architecture-1

Add a few more clients to your infrastructure, and you have the basis of a centralized logging architecture.

architecture -2

Multiple clients are producing data and are sending it to a centralized syslog server, responsible for aggregating and storing client data.

If we were to complexify our architecture, we can add a “relay“.

Relays could be Logstash instances, for example, but they could also be rsyslog rules on the client side.

architecture - 3

Those relays act most of the time as “content-based routers” (if you are not familiar with content-based routers, here is a link to understand them).

It means that based on the log content, data will be redirected to different places. Data can also be completely discarded if you are not interested in it.

Now that we have detailed Syslog components, let’s see what a Syslog message looks like.

How Syslog Architecture Works?

There are three different layers within the Syslog standard. They are as follows:

  1. Syslog content (information contained in an event message)
  2. Syslog application (generates, interprets, routes, and stores messages)
  3. Syslog transport (transmits the messages)

syslog message layers destinations

Moreover, applications can be configured to send messages to different destinations. There are also alarms that give instant notifications for events such as the following:

  • Hardware errors
  • Application failures
  • Lost contact
  • Mis-configuration

Besides, alarms can be set up to send notifications through SMS, pop-up messages, email, HTTP, and more. As the process is automated, the IT team will receive instant notifications if there is an unexpected breakdown of any of the devices.

The Syslog Format

Syslog has a standard log message format defined by RFC 5424. A message is composed of a header, structured data (SD), and the message itself. Inside the header, you will find fields such as:

  • Priority
  • Version
  • Timestamp
  • Hostname
  • Application
  • Process ID
  • Message ID

After the header comes the structured data, which consists of data blocks in the "key=value" format enclosed in square brackets. After the SD, you will find the detailed log message, which is encoded in UTF-8.

For instance, look at the below message:

<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for lonvick on /dev/pts/8

It corresponds to the following format:

<priority>VERSION ISOTIMESTAMP HOSTNAME APPLICATION PID MESSAGEID STRUCTURED-DATA MSG

What is the Syslog message format?

The Syslog format is divided into three parts:

  • PRI part: that details the message priority levels (from a debug message to an emergency) as well as the facility levels (mail, auth, kernel);
  • HEADER part: composed of two fields which are the TIMESTAMP and the HOSTNAME, the hostname being the machine name that sends the log;
  • MSG part: this part contains the actual information about the event that happened. It is also divided into a TAG and a CONTENT field.

syslog-format

Before detailing the different parts of the syslog format, let’s have a quick look at syslog severity levels as well as syslog facility levels.

a – What are Syslog facility levels?

In short, a facility level is used to determine the program or part of the system that produced the logs.

By default, some parts of your system are given facility levels such as the kernel using the kern facility, or your mailing system using the mail facility.

If a third party wants to issue a log, it would probably be a reserved set of facility levels from 16 to 23 called “local use” facility levels.

Alternatively, they can use the “user-level” facility, meaning that they would issue logs related to the user that issued the commands.

In short, if my Apache server is run by the “apache” user, then the logs would be stored under a file called “apache.log” (<user>.log)

Here are the Syslog facility levels described in a table:

Numerical Code Keyword Facility name
0 kern Kernel messages
1 user User-level messages
2 mail Mail system
3 daemon System Daemons
4 auth Security messages
5 syslog Syslogd messages
6 lpr Line printer subsystem
7 news Network news subsystem
8 uucp UUCP subsystem
9 cron Clock daemon
10 authpriv Security messages
11 ftp FTP daemon
12 ntp NTP subsystem
13 security Security log audit
14 console Console log alerts
15 solaris-cron Scheduling logs
16-23 local0 to local7 Locally used facilities

Do those levels sound familiar to you?

Yes! On a Linux system, by default, files are separated by facility name, meaning that you would have a file for auth (auth.log), a file for the kernel (kern.log), and so on.

Here’s a screenshot example of my Debian 10 instance.

var-log-debian-10

Now that we have seen syslog facility levels, let’s describe what syslog severity levels are.

b – What are Syslog severity levels?

Syslog severity levels are used to indicate how severe a log event is, and they range from debugging and informational messages to emergency levels.

Similar to Syslog facility levels, severity levels are divided into numerical categories ranging from 0 to 7, 0 being the most critical emergency level.

Here are the syslog severity levels described in a table:

Value Severity Keyword
0 Emergency emerg
1 Alert alert
2 Critical crit
3 Error err
4 Warning warning
5 Notice notice
6 Informational info
7 Debug debug

Even if logs are stored by facility name by default, you could totally decide to have them stored by severity levels instead.

If you are using rsyslog as a default syslog server, you can check rsyslog properties to configure how logs are separated.

Now that you know a bit more about facilities and severities, let’s go back to our syslog message format.

c – What is the PRI part?

The PRI chunk is the first part that you will get to read on a syslog formatted message.

The PRI stores the “Priority Value” between angle brackets.

Remember the facilities and severities you just learned?

If you take the message facility number, multiply it by eight, and add the severity level, you get the “Priority Value” of your syslog message.

Remember this if you want to decode your syslog message in the future.

pri-calc-fixed
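For instance, the sample message shown earlier in this guide starts with <34>. Decoding it with this rule: 34 = 4 × 8 + 2, which corresponds to facility 4 (auth, security messages) and severity 2 (critical).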

d – What is the HEADER part?

As stated before, the HEADER part is made of two crucial information: the TIMESTAMP part and the HOSTNAME part (that can sometimes be resolved to an IP address)

This HEADER part directly follows the PRI part, right after the right angle bracket.

It is noteworthy to say that the TIMESTAMP part is formatted in the "Mmm dd hh:mm:ss" format, "Mmm" being the first three letters of a month of the year.

HEADER-example

When it comes to the HOSTNAME, it is often the one given when you type the hostname command. If not found, it will be assigned either the IPv4 or the IPv6 of the host.

How does Syslog message delivery work?

When issuing a syslog message, you want to make sure that you use reliable and secure ways to deliver log data.

Syslog is of course opinionated on the subject, and here are a few answers to those questions.

a – What is Syslog forwarding?

Syslog forwarding consists of sending clients’ logs to a remote server in order for them to be centralized, making log analysis and visualization easier.

Most of the time, system administrators are not monitoring one single machine, but they have to monitor dozens of machines, on-site and off-site.

As a consequence, it is a very common practice to send logs to a distant machine, called a centralized logging server, using different communication protocols such as UDP or TCP.

b – Is Syslog using TCP or UDP?

As specified in the RFC 3164 specification, syslog clients use UDP to deliver messages to syslog servers.

Moreover, Syslog uses port 514 for UDP communication.

However, on recent syslog implementations such as rsyslog or syslog-ng, you have the possibility to use TCP (Transmission Control Protocol) as a secure communication channel.

For example, rsyslog uses port 10514 for TCP communication, ensuring that no packets are lost along the way.

Furthermore, you can use the TLS/SSL protocol on top of TCP to encrypt your Syslog packets, making sure that no man-in-the-middle attacks can be performed to spy on your logs.

If you are curious about rsyslog, here’s a tutorial on how to setup a complete centralized logging server in a secure and reliable way.
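As a hedged sketch (using rsyslog's legacy forwarding syntax; the server name and ports are placeholders), forwarding every log to a remote collector from /etc/rsyslog.conf would look like this:

# forward everything over UDP (single @)
*.* @logs.example.com:514
# forward everything over TCP (double @@)
*.* @@logs.example.com:10514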

What are current Syslog implementations?

Syslog is a specification, but not the actual implementation in Linux systems.

Here is a list of current Syslog implementations on Linux:

  • Syslog daemon: published in 1980, the syslog daemon is probably the first implementation ever done and only supports a limited set of features (such as UDP transmission). It is most commonly known as the sysklogd daemon on Linux;
  • Syslog-ng: published in 1998, syslog-ng extends the set of capabilities of the original syslog daemon including TCP forwarding (thus enhancing reliability), TLS encryption, and content-based filters. You can also store logs to local databases for further analysis.

syslog-ng

  • Rsyslog: released in 2004 by Rainer Gerhards, rsyslog comes as a default syslog implementation on most of the actual Linux distributions (Ubuntu, RHEL, Debian, etc..). It provides the same set of features as syslog-ng for forwarding but it allows developers to pick data from more sources (Kafka, a file, or Docker for example)

rsyslog-card

Best Practices of the Syslog

When manipulating Syslog or when building a complete logging architecture, there are a few best practices that you need to know:

  • Use reliable communication protocols unless you are willing to lose data. Choosing between UDP (a non-reliable protocol) and TCP (a reliable protocol) really matters. Make this choice ahead of time;
  • Configure your hosts using the NTP protocol: when you want to work with real-time log debugging, it is best for you to have hosts that are synchronized, otherwise, you would have a hard time debugging events with good precision;
  • Secure your logs: using the TLS/SSL protocol surely has some performance impacts on your instance, but if you are to forward authentication or kernel logs, it is best to encrypt them to make sure that no one is having access to critical information;
  • You should avoid over-logging: defining a good log policy is crucial for your company. You have to decide whether you are interested in storing (and essentially consuming bandwidth and disk space for) informational or debug logs, or whether error logs alone are enough for your use case;
  • Backup log data regularly: if you are interested in keeping sensitive logs, or if you are audited on a regular basis, you may be interested in backing up your log on an external drive or on a properly configured database;
  • Set up log retention policies: if logs are too old, you may be interested in dropping them, also known as “rotating” them. This operation is done via the logrotate utility on Linux systems.

Conclusion

The Syslog protocol is definitely a classic for system administrators or Linux engineers willing to have a deeper understanding of how logging works on a server.

However, there is a time for theory, and there is a time for practice.

So where should you go from there? You have multiple options.

You can start by setting up a Syslog server on your instance, like a Kiwi Syslog server for example, and starting gathering data from it.

Or, if you have a bigger infrastructure, you should probably start by setting up a centralized logging architecture, and later on, monitor it using very modern tools such as Kibana for visualization.

I hope that you learned something today.

Until then, have fun, as always.

How To Install and Configure Debian 10 Buster with GNOME

How To Install and Configure Debian 10 Buster with GNOME

Do you need an ultimate guide to install and configure Debian 10 Buster with GNOME? This tutorial is the best option for you. Here, we have provided step-by-step instructions on how to install Debian 10 Buster with a GNOME desktop. Have a look at the features of Debian 10 before diving into how to install and configure it with GNOME.

What is Debian?

Debian is an operating system for a wide range of devices including laptops, desktops, and servers. The Debian developers provide security updates for all packages for most of their lifetime. The current stable distribution of Debian is version 10, codenamed Buster. Check out the features of the current Buster version in the modules below.

Features of Debian 10 Buster

Debian 10 was initially released on the 6th of July 2019, and it comes with a lot of great features for system administrators. Have a look at them:

  • JDK update from the OpenJDK 8.0 to the new OpenJDK 11.0 version.
  • Debian 10 is now using version 3.30 of GNOME, featuring an increased desktop performance, screen sharing, and improved ways to remotely connect to Windows hosts.
  • Secure boot is now enabled by default, which means that you don’t have to disable it when trying to install Debian 10 on your machine.
  • Upgrade to Bash 5.0 essentially providing more variables for sysadmins to play with (EPOCHSECONDS or EPOCHREALTIME for example).
  • A lot of software updates: Apache 2.4.38, systemd 241, Vim 8.1, Python 3 3.7.2, and many more.
  • iptables is being replaced by nftables, providing an easier syntax and a more efficient way to handle your firewall rules.

Now that you know what's available in the brand new Debian 10 Buster distribution, it's time to install and configure it.

Do Check: How To Install InfluxDB on Windows

Suggested System Requirements for Debian 10

  • 2 GB RAM
  • 2 GHz Dual Core Processor
  • 10 GB Free Hard disk space
  • Bootable Installation Media (USB/ DVD)
  • Internet connectivity (Optional)

Now, dive into the installation & configuration steps of Debian 10 Buster

How to Install and Configure Debian 10 with GNOME?

The following are the detailed steps to install and configure the current version of Debian 10 using GNOME:

Steps to Create a Bootable USB stick on Linux

In order to install Debian 10 buster, you need to “flash” an ISO image to a USB stick, making it “bootable“.

The Debian 10 buster image is about 2 GB in size (if you choose to have a desktop environment with it), so I would recommend that you choose a USB drive that is at least 3GB large or more.

If you don’t have a USB drive that large, you can opt for minimal versions of Debian 10 Buster.

I – Create a Bootable USB stick on Linux

In my home setup, I have a Xubuntu 18.04 instance, so this is what I will use to create my bootable image.

Steps are pretty much the same for other distributions. For Windows, you would need to use Rufus to create a bootable image.

a – Plug your USB stick in the USB port

Within a couple of seconds, the operating system should automatically mount your USB drive in your filesystem (it should be mounted at the /media mount point by default).

How To Install and Configure Debian 10 Buster with GNOME volume-mounted

b – Identify where your USB drive is mounted

To get the mount point of your USB drive, you can use the lsblk command.

How To Install and Configure Debian 10 Buster with GNOME lsblk-1

As you can see, my USB drive is named “sdb”, it has one partition (part) named “sdb1” and it is mounted on “/media/antoine/7830-961F”.

Alternatively, you could use the df command to have some information about the remaining space on your USB drive.

How To Install and Configure Debian 10 Buster with GNOME df-ht

c – Download Debian 10 Buster ISO file

Your USB is ready, now you can download the ISO file to flash your drive.

The distribution images are located here. For this tutorial, I am using the Debian 10 Buster GNOME edition for amd64 processors.

If you are more familiar with another environment like Cinnamon or KDE, they are all available in the downloads page.

Run a simple wget command on any folder that you want (my home folder in this case)

$ wget https://cdimage.debian.org/debian-cd/current-live/amd64/iso-hybrid/debian-live-10.0.0-amd64-gnome.iso

If you need a more minimal distribution, you can go for the netinst version, but desktop environments might not be included.

$ wget https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-10.0.0-amd64-netinst.iso

How To Install and Configure Debian 10 Buster with GNOME wget

d – Copy the image to your USB drive

To copy the image, we are going to use the dd command.

$ sudo dd if=/home/antoine/debian-live-10.0.0-amd64-gnome.iso of=/dev/sdb && sync
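If you would like progress feedback while the image is being written, GNU dd also accepts a block size and a progress flag; here is a variant of the same command (same paths, flags added):

$ sudo dd if=/home/antoine/debian-live-10.0.0-amd64-gnome.iso of=/dev/sdb bs=4M status=progress && sync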

e – Boot on the USB drive

Now that your USB drive contains the ISO file, it is time for you to boot from it.

On most configurations, you should be able to boot on the USB by pressing ESC, F1, F2, or F8 when starting your computer.

Follow the Debian 10 Graphical Installation Steps

This is the screen that you should see once you successfully booted on the Debian 10 installer.

Select the “Graphical Debian Installer” option.

How To Install and Configure Debian 10 Buster with GNOME step-1

First, you are asked to select a language.

I’ll go for English for this one.

How To Install and Configure Debian 10 Buster with GNOME step-2

On the next screen, you are asked to select a location.

I’ll pick the United States as an example.

How To Install and Configure Debian 10 Buster with GNOME step-3

Then, choose your keyboard layout. (don’t worry, you can change it later on if you want).

I’ll go for American English for this example.

How To Install and Configure Debian 10 Buster with GNOME step-3

From there, a couple of automatic checks are done within your installation.

Debian 10 will try to load additional components from the bootable device and it will perform some automatic network checks.

How To Install and Configure Debian 10 Buster with GNOME step-5

After the checks, you are asked to set a hostname for your computer.

As indicated, this is the name that will be used to identify your computer on a network.

I’ll go for “Debian-10” in this case.
How To Install and Configure Debian 10 Buster with GNOME step-7
You are asked to configure the domain name for your host. You can leave this option blank.
How To Install and Configure Debian 10 Buster with GNOME step-8

Be careful on the next step, there is a little bit of a gotcha when it comes to root passwords.

You want to leave this option blank.

As a consequence, Debian will use the password of the user you will create in the next step to perform sudo operations.

Moreover, the root account will be disabled which is interesting for security purposes.

Nonetheless, if you want to specify a specific password for root, you can do it here, but I wouldn’t recommend it.

How To Install and Configure Debian 10 Buster with GNOME step-9
Click continue, and now it is time for you to specify the real name for the user.

I’ll go for JunosNotes but feel free to mention your real name and first name.

How To Install and Configure Debian 10 Buster with GNOME step-10

Then, you have to choose a username for your host.

JunoNotes will do the trick for me.

How To Install and Configure Debian 10 Buster with GNOME step-11

Then, choose a very secure password for your host.

How To Install and Configure Debian 10 Buster with GNOME step-12

Choose a time zone for your host.

Be careful on this point as time zones are very important when it comes to logging for example.

How To Install and Configure Debian 10 Buster with GNOME step-13

From there, Debian 10 Buster will start detecting disks on your host.

How To Install and Configure Debian 10 Buster with GNOME step-14

After it is done, you will be asked for a way to partition your disks.

Go for the Guided (use entire disk) version unless you have special requirements that need to set up LVM.

How To Install and Configure Debian 10 Buster with GNOME step-15

Select the disk you want to partition.

In my case, I have only one disk on the system, so I’ll pick it.

How To Install and Configure Debian 10 Buster with GNOME step-16

For the partitioning scheme, go for “All files in one partition“, which should suit your needs.

How To Install and Configure Debian 10 Buster with GNOME step-17

For the automatic partitioning, Debian 10 creates two partitions, a primary and a swap one (when you run out of memory!)

How To Install and Configure Debian 10 Buster with GNOME step-19

If you are happy with the partitioning, simply press the “Finish partitioning and write changes to disk” option.

On the next screen, you are asked for confirmation about the previous partitioning.

Simply check “Yes” on the two options prompted.

How To Install and Configure Debian 10 Buster with GNOME step-20

From there, the installation should begin on your system.

How To Install and Configure Debian 10 Buster with GNOME step-21 How To Install and Configure Debian 10 Buster with GNOME step-22

In the next step, you are prompted to choose whether to use a network mirror to supplement the software included on the USB drive.

You want to press “Yes”.

How To Install and Configure Debian 10 Buster with GNOME step-23

By pressing “Yes”, you are asked to choose a location that is close to your network. I’ll use the United States in this case.

How To Install and Configure Debian 10 Buster with GNOME step-24

Then, choose a Debian archive mirror for your distribution.

I’ll stick with the deb.debian.org one.

How To Install and Configure Debian 10 Buster with GNOME step-25

If you are using a proxy, this is where you want to configure it. I am not using one, so I’ll leave it blank.

How To Install and Configure Debian 10 Buster with GNOME step-26

Debian 10 Buster will start configuring apt and will try to install the GRUB boot loader on your instance.

How To Install and Configure Debian 10 Buster with GNOME step-27 How To Install and Configure Debian 10 Buster with GNOME step-28

On the next step, you are asked whether you want to install the GRUB boot loader to the master boot record; you obviously want to press “Yes” to that.

How To Install and Configure Debian 10 Buster with GNOME step-29

On the next screen, select the hard drive where you want the GRUB boot loader to be installed and press Continue.

How To Install and Configure Debian 10 Buster with GNOME step-30

Done!

The installation should be completed at this point.

How To Install and Configure Debian 10 Buster with GNOME step-32

On the login screen, type the password that you set up during the installation phase, and this is the screen that you should see.

How To Install and Configure Debian 10 Buster with GNOME backgorund

Awesome! You now have Debian 10 on your instance.

But this tutorial is not over. Before continuing, there are a few minimal configuration steps that you should complete on your Debian 10 Buster instance for it to be correctly set up.

Steps to Configure your Debian 10 Buster

Before playing with your new Debian 10 Buster machine, there are a few steps that you need to complete.

a – Enable unofficial Debian software download

By default, downloading unofficial Debian software (like the tools that you would find in the Software store) is disabled.

To enable them, head to “Activities”, and type “Software Updates”.

How To Install and Configure Debian 10 Buster with GNOME step-34-bis

In the next window, the first and last checkboxes should already be checked.

Check the “DFSG-compatible Software with Non-Free Dependencies (contrib)” option and the “Non-DFSG-compatible Software (non-free)” option.

How To Install and Configure Debian 10 Buster with GNOME step-35

Click on “Close“. From there, you will be asked to confirm your choice by reloading the information about available software.

Simply click on “Reload“.

How To Install and Configure Debian 10 Buster with GNOME step-36

Head to the Store by typing “Store” into the Activities search box.

If you are seeing third-party applications, it means that the previous step worked correctly.

How To Install and Configure Debian 10 Buster with GNOME step-38
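
If you prefer doing this from the command line instead of the Software Updates window, here is a minimal sketch of the equivalent change, assuming the default /etc/apt/sources.list produced by the installer (back up the file and adjust the pattern to your own mirror entries):

# Show the currently enabled components
$ grep -E '^deb ' /etc/apt/sources.list

# Append contrib and non-free to the buster entries ending in "main" (illustrative)
$ sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
$ sudo sed -i '/^deb .*buster.*main$/ s/main$/main contrib non-free/' /etc/apt/sources.list

# Refresh the package lists
$ sudo apt update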

b – Install wget to download files from the Internet

wget is not installed by default on your instance. To install it, run the following command.

$ sudo apt install wget

How To Install and Configure Debian 10 Buster with GNOME step-39
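
Once the installation finishes, a quick sanity check (the URL is just an example):

# Confirm that wget is available
$ wget --version | head -n 1

# Download a test page to /tmp
$ wget -q -O /tmp/index.html https://www.debian.org/ && echo "wget is working"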

c – Install your NVIDIA drivers

The NVIDIA driver installation process is pretty straightforward.

Simply run the “nvidia-detect” command in your terminal and this utility will tell you which driver you have to install depending on your graphics card.

First, install nvidia-detect

$ sudo apt install nvidia-detect

How To Install and Configure Debian 10 Buster with GNOME step-42

From there, run the nvidia-detect utility in your command line.

$ nvidia-detect
Detected NVIDIA GPUs:
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 430] [10de:0de1] (rev a1)
Your card is supported by the default drivers.
It is recommended to install the
    nvidia-driver
package.

As you can see, the nvidia-detect utility states that I need to install the nvidia-driver package for my instance, so this is what I am going to do.

$ sudo apt install nvidia-driver
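
After the driver is installed and you have rebooted, you can verify that the proprietary module is actually in use; a minimal sketch:

# Reboot so the new kernel module gets loaded
$ sudo reboot

# After the reboot, check which kernel driver the GPU is using
# (you should see a line similar to "Kernel driver in use: nvidia")
$ lspci -nnk | grep -iA3 vga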

d – Install Dash To Dock

As a Debian user, I hate going to the top left Applications menu just to find my web browser or to browse my filesystem.

As a consequence, similarly to the macOS interface, I would like a static application dock to be visible on my desktop at all times.

How To Install and Configure Debian 10 Buster with GNOME step-43

To install it, head to the “Store” by typing “Store” in the Applications search box. This is the window that you should see.

How To Install and Configure Debian 10 Buster with GNOME step-44

Click on “Add-ons”, and then select the “Shell Extensions tab“. You should see a list of shell extensions available for your Debian distribution.

In the top search bar, type “Dash To Dock“. Click on “Install” when you find the “Dash To Dock” store item.

How To Install and Configure Debian 10 Buster with GNOME step-40-dash-dock

Simply click “Install” on the next window that is prompted on the screen.

How To Install and Configure Debian 10 Buster with GNOME step-41-dash-dock

That’s it!

You now have Dash To Dock on your desktop.

How To Install and Configure Debian 10 Buster with GNOME featured-1

Going Further

Your adventure with Debian 10 has only begun, but I would recommend that you start configuring your host if you plan on using it as a server.

Here’s a very good video by Linux Tex that explains all the things you should do after installing Debian 10.

Some of the steps are already covered in this tutorial, but for the others, feel free to watch his video as it explains the procedures in detail.

 

How To Install Logstash on Ubuntu 18.04 and Debian 9

How To Install Logstash on Ubuntu 18.04 and Debian 9 | Tutorial on Logstash Configuration

Are you searching various websites to learn How To Install Logstash on Ubuntu 18.04 and Debian 9? Then this tutorial is the best option for you, as it covers the detailed steps to install and configure Logstash on Ubuntu 18.04 and Debian 9. If you are browsing this tutorial, it is probably because you have decided to bring Logstash into your infrastructure. Logstash is a powerful tool, but you have to install and configure it properly, so make use of this tutorial efficiently.

What is Logstash?

Logstash is a lightweight, open-source, server-side data processing pipeline that lets you collect data from different sources, transform it on the fly, and send it to your desired destination. It is commonly used as a data processing pipeline for Elasticsearch, an open-source analytics and search engine, and handles log ingestion, parsing, filtering, and redirecting.

Why do we use Logstash?

We use Logstash because it provides a set of plugins that can easily be bound to various targets in order to gather logs from them. Moreover, Logstash provides a very expressive template language that makes it very easy for developers to manipulate, truncate, or transform data streams.

Logstash is part of the ELK stack (Elasticsearch – Logstash – Kibana), but each tool can also be used independently.

With the recent release of the ELK stack v7.x, installation guides need to be updated for recent distributions like Ubuntu 18.04 and Debian 9.

Do Check: 

Prerequisites

  • Java version 8 or 11 (required for Logstash installation)
  • A Linux system running Ubuntu 18.04 or Debian 9
  • Access to a terminal window/command line (Search > Terminal)
  • A user account with sudo or root privileges

Steps to Install Logstash on Ubuntu and Debian

The following are the steps to install Logstash on Ubuntu and Debian: 

1 – Install the latest version of Java

Logstash, like every tool in the ELK stack, needs Java to run properly.

In order to check whether you have Java or not, run the following command:

$ java -version
openjdk version "11.0.3" 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)

If you don’t have Java on your computer, you should have the following output.

java-not-found

You can install it by running this command.

$ sudo apt-get install default-jre

Make sure that you now have Java installed by running the first command again.

2 – Add the GPG key to install signed packages

In order to make sure that you are getting official versions of Logstash, you have to download and install the public signing key.

To do so, run the following commands.

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

On Debian, install the apt-transport-https package.

$ sudo apt-get install apt-transport-https

To conclude, add the Elastic package repository to your own repository list.

$ echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

3 – Install Logstash with apt

Now that Elastic repositories are added to your repository list, it is time to install the latest version of Logstash on our system.

$ sudo apt-get update
$ sudo apt-get install logstash

apt-get-update

This command will:

  • create a logstash user
  • create a logstash group
  • create a dedicated service file for Logstash

From there, running Logstash installation should have created a service on your instance.
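
You can quickly confirm that the package created the user, the group, and the systemd unit; a minimal sketch:

# The logstash user and group should exist
$ id logstash

# The unit file should be registered with systemd
$ systemctl list-unit-files | grep logstash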

To check Logstash service health, run the following command.
On Ubuntu and Debian, both of which ship with systemd:

$ sudo systemctl status logstash

Enable your new service on boot up and start it.

$ sudo systemctl enable logstash
$ sudo systemctl start logstash

Having your service running is fine, but you can double-check it by verifying which ports Logstash is actually listening on.

Run a simple lsof command; you should get a similar output.

$ sudo lsof -i -P -n | grep logstash
java      28872        logstash   56u  IPv6 1160098302      0t0  TCP 127.0.0.1:47796 > 127.0.0.1:9200 (ESTABLISHED)
java      28872        logstash   61u  IPv4 1160098304      0t0  UDP 127.0.0.1:10514
java      28872        logstash   79u  IPv6 1160098941      0t0  TCP 127.0.0.1:9600 (LISTEN)

As you can tell, Logstash is actively listening for connections on port 10514 (UDP) and port 9600 (TCP). This is important to note if you plan to forward your logs to Logstash (from rsyslog, for example, either over UDP or TCP).

On Debian and Ubuntu, here’s the content of the service file.

[Unit]
Description=logstash

[Service]
Type=simple
User=logstash
Group=logstash
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/etc/sysconfig/logstash
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/etc/logstash"
Restart=always
WorkingDirectory=/
Nice=19
LimitNOFILE=16384

[Install]
WantedBy=multi-user.target

The environment file (located at /etc/default/logstash) contains many of the variables necessary for Logstash to run.

If you wanted to tweak your Logstash installation, for example, to change your configuration path, this is the file that you would change.
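
If you would rather not touch the packaged files at all, another option is a systemd drop-in that overrides the ExecStart line shown above (a sketch; the /opt path is just an example):

# Open an override file for the logstash unit
$ sudo systemctl edit logstash

# In the editor, add the following (the empty ExecStart= line clears the packaged one):
# [Service]
# ExecStart=
# ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/opt/my-logstash-config"

# Reload systemd and restart the service
$ sudo systemctl daemon-reload
$ sudo systemctl restart logstash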

4 – Personalize Logstash with configuration files

This step is split into two parts, as follows:

a – Understanding Logstash configuration files

Before personalizing your configuration files, there is a concept that you need to understand about configuration files.

Pipeline configuration files

In Logstash, you define what we called pipelines. A pipeline is composed of :

  • An input: where you take your data from; it can be Syslog, Apache, or NGINX, for example.
  • A filter: a transformation that you apply to your data; sometimes you may want to mutate your data, or remove some fields from the final output.
  • An output: where you are going to send your data, most of the time Elasticsearch, but it can be modified to send data to a wide variety of destinations.

Those pipelines are defined in configuration files.

In order to define those “pipeline configuration files“, you are going to create “pipeline files” in the /etc/logstash/conf.d directory.

Logstash general configuration file

But with Logstash, you also have standard configuration files that configure Logstash itself.

This file is located at /etc/logstash/logstash.yml. The general configuration files define many variables, but most importantly you want to define your log path variable and data path variable.
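
For reference, here is a minimal sketch of what those variables look like in /etc/logstash/logstash.yml (the paths are the usual package defaults; the node name is made up):

# /etc/logstash/logstash.yml -- minimal excerpt
node.name: my-logstash-node    # hypothetical node name
path.data: /var/lib/logstash   # where Logstash stores its internal data
path.logs: /var/log/logstash   # where Logstash writes its own logs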

b – Writing your own pipeline configuration file

For this part, we are going to keep it very simple.

We are going to build a very basic logging pipeline between rsyslog and stdout.

Every single log processed via rsyslog will be printed to the shell running Logstash.

As the Elastic documentation highlights, it can be quite useful to test pipeline configuration files and immediately see what output they produce.

If you are looking for a complete rsyslog to Logstash to Elasticsearch tutorial, here’s a link for it.

To do so, head over to the /etc/logstash/conf.d directory and create a new file named “syslog.conf”.

$ cd /etc/logstash/conf.d/
$ sudo vi syslog.conf

Paste the following content inside.

input {
  udp {
    host => "127.0.0.1"
    port => 10514
    codec => "json"
    type => "rsyslog"
  }
}

filter { }


output {
  stdout { }
}

As you probably guessed, Logstash is going to listen for incoming Syslog messages on port 10514 and print them directly to the terminal.

To forward rsyslog messages to port 10514, head over to your /etc/rsyslog.conf file, and add this line at the top of the file.

*.*         @127.0.0.1:10514

rsyslog-forwarding

Now in order to debug your configuration, you have to locate the logstash binary on your instance.

To do so, run a simple whereis command.

$ whereis -b logstash
/usr/share/logstash

Now that you have located your logstash binary, shut down your service and run logstash locally, with the configuration file that you are trying to verify.

$ sudo systemctl stop logstash
$ cd /usr/share/logstash/bin
$ ./logstash -f /etc/logstash/conf.d/syslog.conf

Within a couple of seconds, you should now see the following output on your terminal.

success-config-logstash

Note: if you have any syntax errors in your pipeline configuration files, you will be notified as well.

As a quick example, I removed one bracket from my configuration file. Here’s the output that I got.

error-config-logstash
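
If you only want to validate the syntax without actually starting a pipeline, Logstash also ships a configuration check flag that you can run from the same directory:

# Parse the configuration, report any errors, and exit without starting the pipeline
$ ./logstash -f /etc/logstash/conf.d/syslog.conf --config.test_and_exit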

5 – Monitoring Logstash using the Monitoring API

There are multiple ways to monitor a Logstash instance:

  • Using the Monitoring API provided by Logstash itself
  • By configuring the X-Pack tool and sending retrieved data to an Elasticsearch cluster
  • By visualizing data into dedicated panels of Kibana (such as the pipeline viewer for example)

In this chapter, we are going to focus on the Monitoring API, as the other methods require the entire ELK stack installed on your computer to work properly.

a – Gathering general information about Logstash

First, we are going to run a very basic command to get general information about our Logstash instance.

Run the following command on your instance:

$ curl -XGET 'localhost:9600/?pretty'
{
  "host" : "devconnected-ubuntu",
  "version" : "7.2.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name" : "devconnected-ubuntu",
  "ephemeral_id" : "871ccf4a-5233-4265-807b-8a305d349745",
  "status" : "green",
  "snapshot" : false,
  "build_date" : "2019-06-20T17:29:17+00:00",
  "build_sha" : "a2b1dbb747289ac122b146f971193cfc9f7a2f97",
  "build_snapshot" : false
}

If you are not running Logstash on the conventional 9600 port, make sure to adjust the previous command.

From the command, you get the hostname, the current version running, as well as the current HTTP address currently used by Logstash.

You also get a status property (green, yellow, or red) that has already been explained in the tutorial about setting up an Elasticsearch cluster.

b – Retrieving Node Information

If you are managing a Logstash instance (or several of them), there is a high chance that you may want to get detailed information about every single node.

For this API, you have three choices:

  • pipelines: in order to get detailed information about pipeline statistics.
  • jvm: to see current JVM statistics for this specific node
  • os: to get information about the OS running your current node.

To retrieve node information on your cluster, issue the following command:

$ curl -XGET 'localhost:9600/_node/pipelines'
{
  "host": "schkn-ubuntu",
  "version": "7.2.0",
  "http_address": "127.0.0.1:9600",
  "id": "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name": "schkn-ubuntu",
  "ephemeral_id": "871ccf4a-5233-4265-807b-8a305d349745",
  "status": "green",
  "snapshot": false,
  "pipelines": {
    "main": {
      "ephemeral_id": "808952db-5d23-4f63-82f8-9a24502e6103",
      "hash": "2f55ef476c3d425f4bd887011f38bbb241991f166c153b283d94483a06f7c550",
      "workers": 2,
      "batch_size": 125,
      "batch_delay": 50,
      "config_reload_automatic": false,
      "config_reload_interval": 3000000000,
      "dead_letter_queue_enabled": false,
      "cluster_uuids": []
    }
  }
}

Here is an example for the OS request:

$ curl -XGET 'localhost:9600/_node/os'
{
  "host": "schkn-ubuntu",
  "version": "7.2.0",
  "http_address": "127.0.0.1:9600",
  "id": "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name": "schkn-ubuntu",
  "ephemeral_id": "871ccf4a-5233-4265-807b-8a305d349745",
  "status": "green",
  "snapshot": false,
  "os": {
    "name": "Linux",
    "arch": "amd64",
    "version": "4.15.0-42-generic",
    "available_processors": 2
  }
}

c – Retrieving Logstash Hot Threads

Hot Threads are threads that are using a large amount of CPU power or whose execution time is greater than normal.

To retrieve hot threads, run the following command:

$ curl -XGET 'localhost:9600/_node/hot_threads?pretty'
{
  "host" : "schkn-ubuntu",
  "version" : "7.2.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "05cfb06f-a652-402c-8da1-f7275fb06312",
  "name" : "schkn-ubuntu",
  "ephemeral_id" : "871ccf4a-5233-4265-807b-8a305d349745",
  "status" : "green",
  "snapshot" : false,
  "hot_threads" : {
    "time" : "2019-07-22T18:52:45+00:00",
    "busiest_threads" : 10,
    "threads" : [ {
      "name" : "[main]>worker1",
      "thread_id" : 22,
      "percent_of_cpu_time" : 0.13,
      "state" : "timed_waiting",
      "traces" : [ "java.base@11.0.3/jdk.internal.misc.Unsafe.park(Native Method)"...]
    } ]
  }
}

Installing Logstash on macOS with Homebrew

Elastic publishes Homebrew formulae, so you can install Logstash with the Homebrew package manager.

In order to install with Homebrew, firstly, you should tap the Elastic Homebrew repository:

brew tap elastic/tap

Once you have tapped the Elastic Homebrew repo, you can use brew install to install the default distribution of Logstash:

brew install elastic/tap/logstash-full

The above command installs the most recently released default distribution of Logstash. If you want to install the OSS distribution, specify elastic/tap/logstash-oss instead.

Starting Logstash with Homebrew

To start elastic/tap/logstash-full now and have it restart at login, run:

brew services start elastic/tap/logstash-full

To run Logstash in the foreground instead, run:

logstash

Going Further

Now that you have all the basics about Logstash, it is time for you to build your own pipeline configuration files and start stashing logs.

I highly suggest that you check out Filebeat, which provides a lightweight shipper for logs and can easily be customized to build a centralized logging system for your infrastructure.

One of the key features of Filebeat is that it provides a back-pressure-sensitive protocol, which essentially means that you are able to regulate the amount of data that you receive.

This is a key point, as you take the risk of overloading your centralized server by pushing too much data to it.

For those who are interested in Filebeat, here’s a video about it.

Tcpdump Command in Linux

tcpdump is a command-line utility that you can use to capture and examine network traffic going to and from your system. It is one of the most commonly used tools among network administrators for troubleshooting network issues and security testing.

Despite its name, with tcpdump you can also capture non-TCP traffic such as UDP, ARP, or ICMP. The captured packets can be written to a file or to standard output. One of the most powerful features of the tcpdump command is its ability to use filters and capture only the data you wish to analyze.

In this article, you will learn the basics of how to use the tcpdump command in Linux.

Installing tcpdump

tcpdump is installed by default on most Linux distributions and macOS. To check if the tcpdump command is available on your system type:

$ tcpdump --version

The output should look something like this:

Output:

tcpdump version 4.9.2

libpcap version 1.8.1

OpenSSL 1.1.1b 26 Feb 2019

If tcpdump is not present on your system, the command above will print “tcpdump: command not found.” You can easily install tcpdump using the package manager of your distro.

Installing tcpdump on Ubuntu and Debian

$ sudo apt update && sudo apt install tcpdump

Installing tcpdump on CentOS and Fedora

$ sudo yum install tcpdump

Installing tcpdump on Arch Linux

$ sudo pacman -S tcpdump

Capturing Packets with tcpdump

The general syntax for the tcpdump command is as follows:

tcpdump [options] [expression]

  • The command options allow you to control the behavior of the command.
  • The filter expression defines which packets will be captured.

Only root or user with sudo privileges can run tcpdump. If you try to run the command as an unprivileged user, you’ll get an error saying: “You don’t have permission to capture on that device.”

The most simple use case is to invoke tcpdump without any options and filters:

$ sudo tcpdump
Output:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes

15:47:24.248737 IP linuxize-host.ssh > desktop-machine.39196: Flags [P.], seq 201747193:201747301, ack 1226568763, win 402, options [nop,nop,TS val 1051794587 ecr 2679218230], length 108

15:47:24.248785 IP linuxize-host.ssh > desktop-machine.39196: Flags [P.], seq 108:144, ack 1, win 402, options [nop,nop,TS val 1051794587 ecr 2679218230], length 36

15:47:24.248828 IP linuxize-host.ssh > desktop-machine.39196: Flags [P.], seq 144:252, ack 1, win 402, options [nop,nop,TS val 1051794587 ecr 2679218230], length 108

... Long output suppressed

23116 packets captured

23300 packets received by filter

184 packets dropped by kernel

tcpdump will continue to capture packets and write to the standard output until it receives an interrupt signal. Use the Ctrl+C key combination to send an interrupt signal and stop the command.

For more verbose output, pass the -v option, or -vv for even more verbose output:

$ sudo tcpdump -vv

You can specify the number of packets to be captured using the -c option. For example, to capture only ten packets, you would type:

$ sudo tcpdump -c 10

After capturing the packets, tcpdump will stop.

When no interface is specified, tcpdump uses the first interface it finds and dumps all packets going through that interface.

Use the -D option to print a list of all available network interfaces that tcpdump can collect packets from:

$ sudo tcpdump -D

For each interface, the command prints the interface name, a short description, and an associated index (number):

 Output:

1.ens3 [Up, Running]

2.any (Pseudo-device that captures on all interfaces) [Up, Running]

3.lo [Up, Running, Loopback]

The output above shows that ens3 is the first interface found by tcpdump and used when no interface is provided to the command. The second interface any is a special device that allows you to capture all active interfaces.

To specify the interface you want to capture traffic on, invoke the command with the -i option followed by the interface name or the associated index. For example, to capture all packets from all interfaces, you would specify the any interface:

$ sudo tcpdump -i any

By default, tcpdump performs reverse DNS resolution on IP addresses and translates port numbers into names. Use the -n option to disable the translation:

$ sudo tcpdump -n

Skipping the DNS lookup avoids generating DNS traffic and makes the output more readable. It is recommended to use this option whenever you invoke tcpdump.
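
In practice, you will often combine these options. For example, a quick one-liner that captures a handful of packets from all interfaces without name resolution:

# Capture 5 packets on all interfaces, skipping DNS and port-name lookups
$ sudo tcpdump -n -i any -c 5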

Instead of displaying the output on the screen, you can redirect it to a file using the redirection operators > and >>:

 $ sudo tcpdump -n -i any > file.out

You can also watch the data while saving it to a file using the tee command:

$ sudo tcpdump -n -l | tee file.out

The -l option in the command above tells tcpdump to make the output line-buffered. When this option is not used, the output is not written to the screen as soon as a new line is generated.

Understanding the tcpdump Output

tcpdump outputs information for each captured packet on a new line. Each line includes a timestamp and information about that packet, depending on the protocol.

The typical format of a TCP protocol line is as follows:

[Timestamp] [Protocol] [Src IP].[Src Port] > [Dst IP].[Dst Port]: [Flags], [Seq], [Ack], [Win Size], [Options], [Data Length]

Let’s go field by field and explain the following line:

15:47:24.248737 IP 192.168.1.185.22 > 192.168.1.150.37445: Flags [P.], seq 201747193:201747301, ack 1226568763, win 402, options [nop,nop,TS val 1051794587 ecr 2679218230], length 108

  • 15:47:24.248737 – The timestamp of the captured packet is in local time and uses the format hours:minutes:seconds.frac, where frac is fractions of a second since midnight.
  • IP – The packet protocol. In this case, IP means the Internet protocol version 4 (IPv4).
  • 192.168.1.185.22 – The source IP address and port, separated by a dot (.).
  • 192.168.1.150.37445 – The destination IP address and port, separated by a dot (.).
  • Flags [P.] – TCP Flags field. In this example, [P.] means Push Acknowledgment packet, which acknowledges the previous packet and sends data. Other typical flag field values are as follows:
    • [.] – ACK (Acknowledgment)
    • [S] – SYN (Start Connection)
    • [P] – PSH (Push Data)
    • [F] – FIN (Finish Connection)
    • [R] – RST (Reset Connection)
    • [S.] – SYN-ACK (SynAcK Packet)
  • seq 201747193:201747301 – The sequence number in first:last notation. It shows the range of data bytes contained in the packet. Except for the first packet in the data stream, where these numbers are absolute, subsequent packets use relative byte positions. In this example, the packet contains bytes 201747193 to 201747301 of the data stream. Use the -S option to print absolute sequence numbers.
  • ack 1226568763 – The acknowledgment number is the sequence number of the next byte of data expected by the other end of this connection.
  • win 402 – The window size is the number of bytes available in the receiving buffer.
  • options [nop,nop,TS val 1051794587 ecr 2679218230] – TCP options. nop, or “no operation,” is padding used to make the TCP header a multiple of 4 bytes. TS val is a TCP timestamp, and ecr stands for echo reply. Visit the IANA documentation for more information about TCP options.
  • length 108 – The length of the payload data.

tcpdump Filters

When tcpdump is invoked with no filters, it captures all traffic and produces a tremendous output, making it very difficult to find and analyze the packets of interest.

Filters are one of the most powerful features of the tcpdump command since they allow you to capture only those packets matching the expression. For example, when troubleshooting issues related to a web server, you can use filters to obtain only the HTTP traffic.

tcpdump uses the Berkeley Packet Filter (BPF) syntax to filter the captured packets using various matching parameters such as protocols, source and destination IP addresses and ports, and so on.

In this article, we’ll take a look at some of the most common filters. For a list of all available filters, check the pcap-filter manpage.

Filtering by Protocol

To restrict the capture to a particular protocol, specify the protocol as a filter. For example, to capture only the UDP traffic, you would run:

$ sudo tcpdump -n udp

Another way to define the protocol is to use the proto qualifier, followed by the protocol number. The following command will filter the protocol number 17 and produce the same result as the one above:

$ sudo tcpdump -n proto 17

For more information about the numbers, check the IP protocol numbers list.

Filtering by Host

To capture only packets related to a specific host, use the host qualifier:

$ sudo tcpdump -n host 192.168.1.185

The host can be either an IP address or a name.

You can also filter the output to a given IP range using the net qualifier. For example, to dump only packets related to 10.10.0.0/16, you would use:

$ sudo tcpdump -n net 10.10

Filtering by Port

To limit the capture only to packets from or to a specific port, use the port qualifier. For example, the command below captures packets related to the SSH service (port 22):

$ sudo tcpdump -n port 22

The port range qualifier allows you to capture traffic in a range of ports:

$ sudo tcpdump -n port range 110-150

Filtering by Source and Destination

You can also filter packets based on the origin or target port or host using src, dst, src and dst, and src or dst qualifiers.

The following command captures packets coming from a host with IP 192.168.1.185:

$ sudo tcpdump -n src host 192.168.1.185

To find the traffic coming from any source to port 80, you would use:

$ sudo tcpdump -n dst port 80

Complex Filters

Filters can be mixed using the and (&&), or (||), and not (!) operators.

For example, to catch all HTTP traffic coming from a source IP address 192.168.1.185, you would use this command:

$ sudo tcpdump -n src 192.168.1.185 and tcp port 80

You can also use parentheses to group and create more complex filters:

$ sudo tcpdump -n 'host 192.168.1.185 and (tcp port 80 or tcp port 443)'

To avoid parsing errors when using special characters, enclose the filters inside single quotes.

Here is another example command to capture all traffic except SSH from a source IP address 192.168.1.185:

$ sudo tcpdump -n src 192.168.1.185 and not dst port 22

Packet Inspection

By default, tcpdump captures only the packet headers. However, sometimes you may need to examine the content of the packets.

tcpdump enables you to print the content of the packets in ASCII and HEX.

The -A option tells tcpdump to print each packet in ASCII and -x in HEX:

$ sudo tcpdump -n -A

To show the packet’s contents in both HEX and ASCII, use the -X option:

$ sudo tcpdump -n -X

Reading and Writing Captures to a File

Another useful feature of tcpdump is to write the packets to a file.

This is handy when you are capturing a large number of packets or capturing packets for later analysis.

To start writing to a file, use the -w option followed by the output capture file:

$ sudo tcpdump -n -w data.pcap

This command will save the capture to a file named data.pcap. You can name the file whatever you want, but it is a common convention to use the .pcap extension (packet capture).

When the -w option is used, the output is not represented on the screen. tcpdump writes raw packets and generates a binary file that cannot be read with a regular text editor.

To inspect the contents of the file, invoke tcpdump with the -r option:

$ sudo tcpdump -r data.pcap

If you need to run tcpdump in the background, add the ampersand symbol (&) at the command end.

The capture file can also be examined with other packet analyzer tools such as Wireshark.
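
Note that filters also work when reading from a capture file, which keeps the output manageable. For example, to display only the SSH traffic from the saved capture:

$ sudo tcpdump -n -r data.pcap 'port 22'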

When capturing packets over a long period, you can enable file rotation. tcpdump allows you to create new files and rotate the dump file on a specified time interval or at a fixed size. The following command will create up to ten 200 MB files, named file.pcap0, file.pcap1, and so on, before overwriting older files.

$ sudo tcpdump -n -W 10 -C 200 -w /tmp/file.pcap

Once ten files are created, the older files will be overwritten.

Keep in mind that you should only run tcpdump while troubleshooting issues.

If you need to start tcpdump at a specific time, you can use a cron job. tcpdump does not have an option to exit after a given amount of time, but you can use the timeout command to stop it. For example, to exit after 5 minutes, you would use:

$ sudo timeout 300 tcpdump -n -w data.pcap
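
As a sketch of the cron job idea, here is an illustrative /etc/cron.d entry (the schedule, interface, and output path are assumptions; adapt them to your setup):

# /etc/cron.d/nightly-capture -- capture for 5 minutes every day at 01:00
0 1 * * * root timeout 300 /usr/sbin/tcpdump -n -i any -w /var/tmp/capture-$(date +\%F).pcap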

Conclusion: 

The tcpdump command-line tool is used to analyze and troubleshoot network-related issues.

This article presented you with the basics of tcpdump usage and syntax. If you have any queries related to tcpdump, feel free to contact us.