Linux Tee Command with Examples

The tee command reads from standard input and writes to both standard output and one or more files simultaneously. Tee is frequently used in combination with other commands through piping.

In this article, we will cover the basics of working with the tee command.

tee Command Syntax

The syntax for the tee command is as below:

tee [OPTIONS] [FILE]

Where OPTIONS can be:

  • -a (--append) – Do not overwrite the files; instead, append to the given files.
  • -i (--ignore-interrupts) – Ignore interrupt signals.
  • Use tee --help to view all available options.
  • FILE – One or more files, each of which the output data is written to.

 How to Use the tee Command

The most basic use of the tee command is to display the standard output (stdout) of a program and write it to a file at the same time.

In the example below, we use the df command to get information about the available disk space on the file system. The output is piped to the tee command, which displays the result in the terminal and writes the same information to the file disk_usage.txt.

$ df -h | tee disk_usage.txt

Output:

Filesystem      Size  Used Avail Use% Mounted on

dev             7.8G     0  7.8G   0% /dev

run             7.9G  1.8M  7.9G   1% /run

/dev/nvme0n1p3  212G  159G   43G  79% /

tmpfs           7.9G  357M  7.5G   5% /dev/shm

tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup

tmpfs           7.9G   15M  7.9G   1% /tmp

/dev/nvme0n1p1  511M  107M  405M  21% /boot

/dev/sda1       459G  165G  271G  38% /data

tmpfs           1.6G   16K  1.6G   1% /run/user/120

Write to Multiple Files

The tee command can also write to multiple files at once. To do so, specify a list of files separated by spaces as arguments:

$ command | tee file1.out file2.out file3.out

Append to File

By default, the tee command will overwrite the specified file. Use the -a (--append) option to append the output to the file:

$ command | tee -a file.out

Ignore Interrupt

To ignore interrupts, use the -i (--ignore-interrupts) option. This is useful if you want tee to exit gracefully when the command is stopped during execution with CTRL+C.

$ command | tee -i file.out

Hide the Output

If you don’t want tee to write to the standard output, you can redirect it to /dev/null:

$ command | tee file.out >/dev/null
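Since tee passes its input through to standard output, it also fits naturally in the middle of a pipeline to capture an intermediate result. A minimal sketch (the file name processes.txt is just an example):

$ ps aux | tee processes.txt | grep sshd

Here the full process listing is saved to processes.txt while only the lines matching sshd are displayed in the terminal.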

Using tee in Conjunction with sudo

Let us say you need to write to a file owned by root as a sudo user. The following command will fail because the output redirection is not performed by sudo; the redirection is executed by the unprivileged user.

$ sudo echo "newline" > /etc/file.conf

The output will look something like this:

Output:

bash: /etc/file.conf: Permission denied

Prepend sudo before the tee command as shown below:

$ echo "newline" | sudo tee -a /etc/file.conf

tee will receive the output of the echo command, elevate to sudo permissions, and write to the file.

Using tee in combination with sudo enables you to write to files owned by other users.

Conclusion:

In short, the tee command reads from standard input and writes it to standard output and to one or more files at the same time.


Syslog: The Complete System Administrator Guide

If you manage Linux systems or work as a system administrator, you will almost certainly have to work with Syslog at some point.

When you work with system logging on a Linux system, you are pretty much bound to the Syslog protocol. It is a specification that defines a standard for message logging on any system.

Developers or administrators who are not familiar with Syslog can acquire complete knowledge from this tutorial. Syslog was designed in the early ’80s by Eric Allman (at the University of California, Berkeley), and it works on any operating system that implements the Syslog protocol.

This guide, Syslog: The Complete System Administrator Guide, together with the other related articles on Junosnotes.com, is the perfect place to learn more about Syslog and Linux logging in general.

Here is everything that you need to know about Syslog:

What is the purpose of Syslog?

Syslog is used as a standard to produce, forward, and collect logs produced on a Linux instance. Syslog defines severity levels as well as facility levels, helping users gain a greater understanding of the logs produced on their computers. Logs can later on be analyzed and visualized on servers referred to as Syslog servers.

Here are a few more reasons why the Syslog protocol was designed in the first place:

  • Defining an architecture: this will be explained in detail later on, but if Syslog is a protocol, it will probably be part of complete network architecture, with multiple clients and servers. As a consequence, we need to define roles, in short: are you going to receive, produce or relay data?
  • Message format: Syslog defines the way messages are formatted. This obviously needs to be standardized as logs are often parsed and stored into different storage engines. As a consequence, we need to define what a Syslog client would be able to produce, and what a Syslog server would be able to receive;
  • Specifying reliability: Syslog needs to define how it handles messages that cannot be delivered. Sitting on top of the TCP/IP stack, Syslog is obviously opinionated about which underlying network protocol (TCP or UDP) to choose;
  • Dealing with authentication or message authenticity: Syslog needs a reliable way to ensure that clients and servers are talking in a secure way and that messages received are not altered.

Now that we know why Syslog is specified in the first place, let’s see how a Syslog architecture works.


What is Syslog architecture?

When designing a logging architecture, such as a centralized logging server, it is very likely that multiple instances will work together.

Some will generate log messages, and they will be called “devices” or “syslog clients“.

Some will simply forward the messages received, they will be called “relays“.

Finally, some instances are going to receive and store log data; those are called “collectors” or “syslog servers”.

syslog-component-arch

Knowing those concepts, we can already state that a standalone Linux machine acts as a “syslog client-server” on its own: it produces log data, which is collected by rsyslog and stored right into the filesystem.

Here’s a set of architecture examples around this principle.

In the first design, you have one device and one collector. This is the most simple form of logging architecture out there.

architecture-1

Add a few more clients to your infrastructure, and you have the basis of a centralized logging architecture.

architecture -2

Multiple clients are producing data and are sending it to a centralized syslog server, responsible for aggregating and storing client data.

If we wanted to make our architecture more complex, we could add a “relay”.

Relays could be Logstash instances, for example, but they could also be rsyslog rules on the client side.

architecture - 3

Those relays act most of the time as “content-based routers”.

It means that based on the log content, data will be redirected to different places. Data can also be completely discarded if you are not interested in it.

Now that we have detailed Syslog components, let’s see what a Syslog message looks like.

How Does the Syslog Architecture Work?

There are three different layers within the Syslog standard. They are as follows:

  1. Syslog content (information contained in an event message)
  2. Syslog application (generates, interprets, routes, and stores messages)
  3. Syslog transport (transmits the messages)

syslog message layers destinations

Moreover, applications can be configured to send messages to different destinations. There are also alarms that give instant notifications for events such as:

  • Hardware errors
  • Application failures
  • Lost contact
  • Mis-configuration

Besides, alarms can be set up to send notifications through SMS, pop-up messages, email, HTTP, and more. As the process is automated, the IT team will receive instant notifications if there is an unexpected breakdown of any of the devices.

The Syslog Format

Syslog has a standard definition and format for log messages, defined by RFC 5424. A syslog message is composed of a header, structured data (SD), and a message. Inside the header, you will find fields such as:

  • Priority
  • Version
  • Timestamp
  • Hostname
  • Application
  • Process ID
  • Message ID

After the header comes the structured data (SD), which consists of data blocks in the “key=value” format enclosed in square brackets. After the SD comes the detailed log message, which is encoded in UTF-8.

For instance, look at the below message:

<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for lonvick on /dev/pts/8

It maps to the following format:

<priority>VERSION ISOTIMESTAMP HOSTNAME APPLICATION PID MESSAGEID STRUCTURED-DATA MSG
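Breaking down the sample message field by field gives the following mapping (the values come directly from the example above):

<34>                       priority
1                          version
2003-10-11T22:14:15.003Z   timestamp
mymachine.example.com      hostname
su                         application
-                          process ID (the nil value “-”)
ID47                       message ID
-                          structured data (the nil value “-”)
BOM'su root' failed for lonvick on /dev/pts/8   message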

What is the Syslog message format?

The Syslog format is divided into three parts:

  • PRI part: that details the message priority levels (from a debug message to an emergency) as well as the facility levels (mail, auth, kernel);
  • HEADER part: composed of two fields which are the TIMESTAMP and the HOSTNAME, the hostname being the machine name that sends the log;
  • MSG part: this part contains the actual information about the event that happened. It is also divided into a TAG and a CONTENT field.

syslog-format

Before detailing the different parts of the syslog format, let’s have a quick look at syslog severity levels as well as syslog facility levels.

a – What are Syslog facility levels?

In short, a facility level is used to determine the program or part of the system that produced the logs.

By default, some parts of your system are given facility levels such as the kernel using the kern facility, or your mailing system using the mail facility.

If a third party wants to issue a log, it would probably use one of the reserved facility levels from 16 to 23, called “local use” facility levels.

Alternatively, they can use the “user-level” facility, meaning that they would issue logs related to the user that issued the commands.

In short, if my Apache server is run by the “apache” user, then the logs would be stored under a file called “apache.log” (<user>.log)

Here are the Syslog facility levels described in a table:

Numerical Code Keyword Facility name
0 kern Kernel messages
1 user User-level messages
2 mail Mail system
3 daemon System Daemons
4 auth Security messages
5 syslog Syslogd messages
6 lpr Line printer subsystem
7 news Network news subsystem
8 uucp UUCP subsystem
9 cron Clock daemon
10 authpriv Security messages
11 ftp FTP daemon
12 ntp NTP subsystem
13 security Security log audit
14 console Console log alerts
15 solaris-cron Scheduling logs
16-23 local0 to local7 Locally used facilities

Do those levels sound familiar to you?

Yes! On a Linux system, by default, files are separated by facility name, meaning that you would have a file for auth (auth.log), a file for the kernel (kern.log), and so on.

Here’s a screenshot example of my Debian 10 instance.

var-log-debian-10

Now that we have seen syslog facility levels, let’s describe what syslog severity levels are.

b – What are Syslog severity levels?

Syslog severity levels are used to indicate how severe a log event is, and they range from debugging and informational messages to emergency levels.

Similar to Syslog facility levels, severity levels are divided into numerical categories ranging from 0 to 7, 0 being the most critical emergency level.

Here are the syslog severity levels described in a table:

Value Severity Keyword
0 Emergency emerg
1 Alert alert
2 Critical crit
3 Error err
4 Warning warning
5 Notice notice
6 Informational info
7 Debug debug

Even if logs are stored by facility name by default, you could totally decide to have them stored by severity levels instead.

If you are using rsyslog as a default syslog server, you can check rsyslog properties to configure how logs are separated.
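To experiment with facilities and severities yourself, you can use the logger command, which sends a message to the local syslog daemon with the facility and severity of your choice. A minimal sketch, assuming a Debian-like system where rsyslog routes auth messages to /var/log/auth.log:

$ logger -p auth.warning "Test message sent with the logger command"
$ tail -n 1 /var/log/auth.log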

Now that you know a bit more about facilities and severities, let’s go back to our syslog message format.

c – What is the PRI part?

The PRI chunk is the first part that you will get to read on a syslog formatted message.

The PRI stores the “Priority Value” between angle brackets.

Remember the facilities and severities you just learned?

If you take the message facility number, multiply it by eight, and add the severity level, you get the “Priority Value” of your syslog message.

Remember this if you want to decode your syslog message in the future.

pri-calc-fixed
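As a quick worked example, the <34> priority from the sample message earlier corresponds to the auth facility (4) and the critical severity (2): 4 × 8 + 2 = 34. Decoding works the other way around, with an integer division and a modulo, which you can sketch directly in the shell:

$ pri=34; echo "facility=$((pri / 8)) severity=$((pri % 8))"
facility=4 severity=2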

d – What is the HEADER part?

As stated before, the HEADER part is made of two crucial pieces of information: the TIMESTAMP part and the HOSTNAME part (which can sometimes be resolved to an IP address).

This HEADER part directly follows the PRI part, right after the right angle bracket.

It is noteworthy to say that the TIMESTAMP part is formatted in the “Mmm dd hh:mm:ss” format, “Mmm” being the first three letters of the month name.

HEADER-example

When it comes to the HOSTNAME, it is often the name given when you type the hostname command. If it cannot be found, either the IPv4 or the IPv6 address of the host will be used instead.

How does Syslog message delivery work?

When issuing a syslog message, you want to make sure that you use reliable and secure ways to deliver log data.

Syslog is of course opinionated on the subject, and here are a few answers to those questions.

a – What is Syslog forwarding?

Syslog forwarding consists of sending clients’ logs to a remote server in order for them to be centralized, making log analysis and visualization easier.

Most of the time, system administrators are not monitoring one single machine, but they have to monitor dozens of machines, on-site and off-site.

As a consequence, it is a very common practice to send logs to a distant machine, called a centralized logging server, using different communication protocols such as UDP or TCP.

b – Is Syslog using TCP or UDP?

As specified in the RFC 3164 specification, syslog clients use UDP to deliver messages to syslog servers.

Moreover, Syslog uses port 514 for UDP communication.

However, on recent syslog implementations such as rsyslog or syslog-ng, you have the possibility to use TCP (Transmission Control Protocol) as a reliable communication channel.

For example, rsyslog uses port 10514 for TCP communication, ensuring that no packets are lost along the way.

Furthermore, you can use the TLS/SSL protocol on top of TCP to encrypt your Syslog packets, making sure that no man-in-the-middle attacks can be performed to spy on your logs.

If you are curious about rsyslog, here’s a tutorial on how to set up a complete centralized logging server in a secure and reliable way.
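As an illustration, here is a minimal sketch of client-side forwarding using rsyslog’s legacy forwarding syntax; the file path and server name are examples, not values taken from this article:

# /etc/rsyslog.d/50-forward.conf (example path)
# a single "@" forwards over UDP, "@@" forwards over TCP
*.*  @@logs.example.com:10514

After saving the file, restart rsyslog (for example with systemctl restart rsyslog) for the rule to take effect.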

What are current Syslog implementations?

Syslog is a specification, but not the actual implementation in Linux systems.

Here is a list of current Syslog implementations on Linux:

  • Syslog daemon: published in 1980, the syslog daemon is probably the first implementation ever done and only supports a limited set of features (such as UDP transmission). It is most commonly known as the sysklogd daemon on Linux;
  • Syslog-ng: published in 1998, syslog-ng extends the set of capabilities of the original syslog daemon including TCP forwarding (thus enhancing reliability), TLS encryption, and content-based filters. You can also store logs to local databases for further analysis.

syslog-ng

  • Rsyslog: released in 2004 by Rainer Gerhards, rsyslog comes as a default syslog implementation on most of the actual Linux distributions (Ubuntu, RHEL, Debian, etc..). It provides the same set of features as syslog-ng for forwarding but it allows developers to pick data from more sources (Kafka, a file, or Docker for example)

rsyslog-card

Syslog Best Practices

When manipulating Syslog or when building a complete logging architecture, there are a few best practices that you need to know:

  • Use reliable communication protocols unless you are willing to lose data. Choosing between UDP (a non-reliable protocol) and TCP (a reliable protocol) really matters. Make this choice ahead of time;
  • Configure your hosts using the NTP protocol: when you want to work with real-time log debugging, it is best for you to have hosts that are synchronized, otherwise, you would have a hard time debugging events with good precision;
  • Secure your logs: using the TLS/SSL protocol surely has some performance impacts on your instance, but if you are to forward authentication or kernel logs, it is best to encrypt them to make sure that no one is having access to critical information;
  • Avoid over-logging: defining a good log policy is crucial for your company. You have to decide whether it is worth storing (and essentially consuming bandwidth and disk space for) informational or debug logs, or whether error logs alone are enough for your use case;
  • Backup log data regularly: if you are interested in keeping sensitive logs, or if you are audited on a regular basis, you may be interested in backing up your log on an external drive or on a properly configured database;
  • Set up log retention policies: if logs are too old, you may be interested in dropping them, also known as “rotating” them. This operation is done via the logrotate utility on Linux systems (see the sketch right after this list).
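As a minimal sketch of such a retention policy, here is what a logrotate rule could look like; the path /var/log/myapp/*.log is a hypothetical example:

/var/log/myapp/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}

This keeps four weekly compressed archives of the hypothetical application logs and skips missing or empty files.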

Conclusion

The Syslog protocol is definitely a classic for system administrators or Linux engineers willing to have a deeper understanding of how logging works on a server.

However, there is a time for theory, and there is a time for practice.

So where should you go from there? You have multiple options.

You can start by setting up a Syslog server on your instance, like a Kiwi Syslog server for example, and start gathering data from it.

Or, if you have a bigger infrastructure, you should probably start by setting up a centralized logging architecture, and later on, monitor it using very modern tools such as Kibana for visualization.

I hope that you learned something today.

Until then, have fun, as always.


Understanding Processes on Linux | Types of Process in Linux | Creating, Listing, Monitoring, Changing Linux Processes

Are you looking for an ultimate guide that provides complete knowledge about processes on Linux? This is the right page for developers and administrators alike.

Right from what processes are in Linux to how they are managed, everything is explained here in a detailed way with simple examples for better understanding. If you work as a system administrator, you have most likely already dealt with processes in Linux in many diverse ways.

Processes are at the center of the Linux operating system: created and managed by the Kernel itself, they represent the operations currently running on your Linux host. You can do everything with processes: start them, interrupt them, resume them, or stop them.

In today’s tutorial, we are going to take a deep look at Linux processes: what they are, what commands are associated with them, how they are used on our operating system, what signals are, and how we can assign more computational resources to our processes.

What You Will Learn

By reading this tutorial until the end, you will learn about the following concepts

  • What processes are and how they are created on a Linux system?
  • How processes can be identified on a Linux system?
  • What background and foreground processes are?
  • What signals are and how they can be used to interact with processes?
  • How to use the pgrep as well as the pkill command effectively
  • How to adjust process priority using nice and renice
  • How to see process activity in real-time on Linux

That’s quite a long program, so without further ado, let’s start with a brief description of what processes are.

Linux Processes Basics

In short, processes are running programs on your Linux host that perform operations such as writing to a disk, writing to a file, or running a web server for example.

Each process has an owner and is identified by a process ID (also called PID).

Linux Processes Basics process-identity

On the other hand, programs are lines of code or machine instructions stored on persistent data storage.

They can just sit on your data storage, or they can be in execution, i.e running as processes.

Linux Processes Basics program-process

In order to perform the operations they are assigned to, processes need resources: CPU time, memory (such as RAM or disk space), but also virtual memory such as swap space in case your process gets too greedy.

Obviously, processes can be started, stopped, interrupted, and even killed.

Before issuing any commands, let’s see how processes are created and managed by the kernel itself.

Process Initialization on Linux

As we already stated, processes are managed by the Kernel on Linux.

However, there is a core concept that you need to understand in order to know how Linux creates processes.

By default, when you boot a Linux system, your Linux kernel is loaded into memory, it is given a virtual filesystem in RAM (also called initramfs), and the initial commands are executed.

One of those commands starts the very first process on Linux.

Historically, this process was called the init process but it got replaced by the systemd initialization process on many recent Linux distributions.

To prove it, run the following command on your host

$ ps -aux | head -n 2

Process Initialization on Linux systemd

As you can see, the systemd process has a PID of 1.

If you were to print all processes on your system, using a tree display, you would find that all processes are children of the systemd one.

$ pstree

Process Initialization on Linux pstree

It is noteworthy to underline the fact that all those initialization steps (except for the launch of the initial process) are done in a reserved space called the kernel space.

The kernel space is a space reserved to the Kernel in order for it to run essential system tools properly and to make sure that your entire host is running in a consistent way.

On the other hand, user space is reserved for processes launched by the user and managed by the kernel itself.

user-kernel-space

As a consequence, the systemd process is the very first process launched in the userspace.

Creation of a Processes in Linux

A new process is normally created when an existing process makes an exact copy of itself in memory. The child process will have the same environment as its parent, but only the process ID number is different.

In order to create a new process in Linux, we can use two conventional ways. They are as such:

  • With the system() function – this method is relatively simple; however, it is inefficient and carries certain significant security risks.
  • With the fork() and exec() functions – this technique is a little more advanced but offers greater flexibility, speed, and security.

Process Creation using Fork and Exec

When you are creating and running a program on Linux, it generally involves two main steps: fork and execute.

Fork operation

A fork is a clone operation, it takes the current process, also called the parent process, and it clones it in a new process with a brand new process ID.

When forking, everything is copied from the parent process: the stack, the heap, but also the file descriptors meaning the standard input, the standard output, and the standard error.

It means that if my parent process was writing to the current shell console, the child process will also write to the shell console.

Process Creation using Fork and Exec fork

 

The execution of the cloned process will also start at the same instruction as the parent process.

Execute operation

The execute operation is used on Linux to replace the current process image with the image from another process.

On the previous diagram, we saw that the stack of the parent process contained three instructions left.

As a consequence, the instructions were copied to the new process but they are not relevant to what we want to execute.

The exec operation will replace the process image (i.e the set of instructions that need to be executed) with another one.

Process Creation using Fork and Exec

If you were, for example, to execute the exec command in your bash terminal, your shell would terminate as soon as the command completes, as your current process image (your bash interpreter) would be replaced with the context of the command you are trying to launch.

$ exec ls -l

If you were to trace the system calls made when creating a process, you would find that the first system call invoked is the exec one.

strace-linux
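To observe this yourself, here is a minimal sketch with strace, following child processes and filtering on the process-related system calls (clone, execve, and wait4); it assumes strace is installed on your host:

$ strace -f -e trace=clone,execve,wait4 bash -c 'ls | wc -l' 2>&1 | tail -n 20

The -f flag follows the forked children, so you can see the bash process clone itself and the children replace their image with ls and wc via execve.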

Creating processes from a shell environment

When you launch a command from a shell console, the exact same principles apply.

A shell console is a process that waits for input from the user.

It also launches a bash interpreter when you hit Enter and it provides an environment for your commands to run.

But the shell follows the steps we described earlier.

When you hit Enter, the shell forks itself into a child process that will be responsible for running your command. The shell will wait patiently until the execution of the child process finishes.

On the other hand, the child process is linked to the same file descriptors and it may share variables that were declared on a global scope.

The child process executes the “exec” call in order to replace the current process image (which is the shell process image) with the process image of the command you are trying to run.

The child process will eventually finish and it will print its result to the standard output it inherited from the parent process, in this case, the shell console itself.

shell-execution

Now that you have some basics about how processes are created in your Linux environment, let’s see some details about processes and how they can be identified easily.

Identifying & Listing Running Processes on Linux

The easiest way to identify running processes on Linux is to run the ps command.

$ ps

ps-command

By default, the ps command will show you the list of the currently running processes owned by the current user.

In this case, only two processes are running for my user: the bash interpreter and the ps command I ran inside it.

The important part here is that processes have owners, most of the time the user who runs them in the first place.

To illustrate this, let’s have a listing of the first ten processes on your Linux operating system, with a different display format.

$ ps -ef | head -n 10

Identifying running processes on Linux

As you can see here, the top ten processes are owned by the user “root“.

This information will be particularly important when it comes to interacting with processes with signals.

To display the processes that are owned and executed by the current connected user, run the following command

$ ps u

Identifying running processes on Linux

There are plenty of different options for the ps command, and they can be seen by running the manual command.

$ man ps

From experience, the two most important commands in order to see running processes are

ps aux

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

That corresponds to a BSD-style process listing, where the following command

ps -ef

UID  PID  PPID C STIME TTY  TIME CMD

Corresponds to a POSIX-style process listing.

They are both representing current running processes on a system, but the first one has the “u” option for “user oriented” which makes it easier to read process metrics.
ps-aux

Now that you have seen what processes are and how they can be listed, let’s see what background and foreground processes are on your host.

Here is the description of all the fields displayed by ps -f command:

Sr.No. Column & Description
1 UID: User ID that this process belongs to (the person running it)
2 PID: Process ID
3 PPID: Parent process ID (the ID of the process that started it)
4 C: CPU utilization of process
5 STIME: Process start time
6 TTY: Terminal type associated with the process
7 TIME: CPU time taken by the process
8 CMD: The command that started this process

Also, there are other options that can be used along with the ps command (a combined example follows the table below):

Sr.No. Option & Description
1 -a: Shows information about all users
2 -x: Shows information about processes without terminals
3 -u: Shows additional information like -f option
4 -e: Selects every process on the system
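These options can also be combined with a custom output format. As a small sketch using only standard ps options, the following command lists every process with a few chosen columns, sorted by memory usage:

$ ps -eo pid,ppid,user,ni,stat,%mem,cmd --sort=-%mem | head -n 5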

How to Control Processes in Linux?

In Linux, there are several commands for controlling processes, such as kill, pkill, pgrep, and killall. Here are a few key examples of how to use them, with an example user (tecmint) and an example PID:

$ pgrep -u tecmint top
$ kill 2308
$ pgrep -u tecmint top
$ pgrep -u tecmint glances
$ pkill glances
$ pgrep -u tecmint glances

Types of Processes

Basically, there are two types of processes in Linux:

  • Foreground processes (also referred to as interactive processes) – these are initialized and controlled through a terminal session. In other words, there has to be a user connected to the system to start such processes; they are not started automatically as part of the system functions/services.
  • Background processes (also referred to as non-interactive/automatic processes) – these are processes not connected to a terminal; they don’t expect any user input.

Background and foreground processes

The definition of background and foreground processes is pretty self-explanatory.

Jobs and processes in the current shell

A background process on Linux is a process that runs in the background, meaning that it is not actively managed by a user through a shell for example.

On the opposite side, a foreground process is a process that can be interacted with via direct user input.

Let’s say for example that you have opened a shell terminal and that you typed the following command in your console.

$ sleep 10000

As you probably noticed, your terminal will hang until the sleep process terminates. As a consequence, the process is not executed in the background; it is executed in the foreground.

I am able to interact with it. If I press Ctrl + Z, it will directly send a stop signal to the process for example.

Jobs and processes in the current shell foreground

However, there is a way to execute the process in the background.

To execute a process in the background, simply put a “&” sign at the end of your command.

$ sleep 10000 &

As you can see, the control was directly given back to the user and the process started executing in the background

Jobs and processes in the current shell background

To see your process running, in the context of the current shell, you can execute the jobs command

$ jobs

Jobs and processes in the current shell jobs

Jobs are a list of processes that were started in the context of the current shell and that may still be running in the background.

As you can see in the example above, I have two processes currently running in the background.

The different columns from left to right represent the job ID, the process state (which you will discover in the next section), and the command executed.
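If you also need the process ID of each job, the -l flag adds it to the listing; the output below is illustrative:

$ jobs -l
[1]-  4521 Running                 sleep 10000 &
[2]+  4522 Running                 sleep 20000 &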

Using the bg and fg commands

In order to interact with jobs, you have two commands available: bg and fg.

The bg command is used on Linux in order to send a process to the background and the syntax is as follows

$ bg %<job_id>

Similarly, in order to send a process to the foreground, you can use the fg in the same fashion

$ fg %<job_id>

If we go back to the list of jobs of our previous example, if I want to bring job 3 to the foreground, meaning to the current shell window, I would execute the following command

$ fg %3

Using the bg and fg commands

By issuing Ctrl + Z, I am able to stop the process. I can then follow it with a bg command in order to send it to the background.

Using the bg and fg commands bg-1

Now that you have a better idea of what background and foreground processes are, let’s see how it is possible for you to interact with the process using signals.

Interacting with processes using signals

On Linux, signals are a form of interprocess communication (also called IPC) that creates and sends asynchronous notifications to running processes about the occurrence of a specific event.

Signals are often used in order to send a kill or a termination command to a process in order to shut it down (also called kill signal).

In order to send a signal to a process, you have to use the kill command.

$ kill -<signal number> <pid>|<process_name>

For example, in order to force an HTTPD process (PID = 123) to terminate (without a clean shutdown), you would run the following command

$ kill -9 123

Signals categories explained

As explained, there are many signals that one can send in order to notify a specific process.

Here is the list of the most commonly used ones:

  • SIGINT: short for signal interrupt, this signal is used in order to interrupt a running process. It is also the signal that is sent when a user presses Ctrl + C in a terminal;
  • SIGHUP: short for signal hangup, this is the signal sent by your terminal when it is closed. Similarly to a SIGINT, the process terminates;
  • SIGKILL: signal used in order to force a process to stop whether it can be gracefully stopped or not. This signal cannot be ignored, except by the init process (or the systemd one on recent distributions);
  • SIGQUIT: a specific signal sent when a user wants to quit the current process with a core dump. It can be invoked by pressing Ctrl + \ in a terminal;
  • SIGUSR1, SIGUSR2: those signals are used purely for communication purposes and they can be used in programs in order to implement custom handlers (see the sketch right after this list);
  • SIGSTOP: instructs the process to stop its execution without terminating the process. The process is then waiting to be continued or to be killed completely;
  • SIGCONT: if the process is marked as stopped, it instructs the process to start its execution again.
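As a minimal sketch of such a custom handler, here is a small bash script (the file name handler.sh and the message are hypothetical) that reacts to SIGUSR1 instead of terminating:

#!/bin/bash
# handler.sh - minimal sketch of a custom SIGUSR1 handler
trap 'echo "SIGUSR1 received, reloading configuration..."' USR1

# keep the process alive so it can receive signals
while true; do
    sleep 1
done

Run it in the background with ./handler.sh &, then send it the signal with kill -USR1 <pid>: instead of dying, the script prints its message and keeps running.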

In order to see the full list of all signals available, you can run the following command

$ kill -l

 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
 5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
 9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     16) SIGSTKFLT
17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU
25) SIGXFSZ     26) SIGVTALRM   27) SIGPROF     28) SIGWINCH
29) SIGIO       30) SIGPWR      31) SIGSYS      34) SIGRTMIN
35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3  38) SIGRTMIN+4
39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14
51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10
55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7  58) SIGRTMAX-6
59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX

Signals and Processes States

Now that you know that it is possible to interrupt, kill, or stop processes, it is time for you to learn about process states.

States of a Process in Linux

Processes have many different states, they can be :

  • Running: running processes are the ones using some computational power (such as CPU time) at the current time. A process can also be called “runnable” if all running conditions are met and it is waiting for some CPU time from the CPU scheduler.
  • Stopped: a process that is stopped is linked to the SIGSTOP signal or to the Ctrl + Z keyboard shortcut. Its execution is suspended and it is waiting for either a SIGCONT or a SIGKILL.
  • Sleeping: a sleeping process is a process waiting for some event or for a resource (like a disk) to become available.

Here is a diagram that represents the different process states linked to the signals you may send to them.

Signals and Processes States process-states
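You can also observe these states from the command line: the STAT column of the ps command shows R for running (or runnable), T for stopped, and S for sleeping. A quick sketch, with illustrative output:

$ ps -o pid,stat,cmd
  PID STAT CMD
 2713 Ss   bash
 8363 T    sleep 10000
 9120 R+   ps -o pid,stat,cmd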

Now that you know a bit more about process states, let’s have a look at the pgrep and pkill commands.

Manipulating Process with pgrep and pkill

On Linux, there is already a lot that you can do by simply using the ps command.

You can narrow down your search to one particular process, and you can use the PID in order to kill it completely.

However, there are two commands that were designed to make your commands even shorter: pgrep and pkill.

Using the pgrep command

The pgrep command is a shortcut for using the ps command piped with the grep command.

The pgrep command will search for all the occurrences for a specific process using a name or a defined pattern.

The syntax of the pgrep command is the following one

$ pgrep <options> <pattern>

For example, if you were to search for all processes named “bash” on your host, you would run the following command

$ pgrep bash

The pgrep command is not restricted to the processes owned by the current user by default.

If another user was to run the bash command, it would appear in the output of the pgrep command.

Using the pgrep command pgrep

It is also possible to search for processes using globbing characters.

Using the pgrep command pgrep-globbing

Using the pkill command

On the other hand, the pkill command is also a shortcut for the ps command used with the kill command.

The pkill command is used in order to send signals to processes based on their IDs or their names.

The syntax of the pkill command is as follows

$ pkill <options> <pattern>

For example, if you want to kill all Firefox windows on your host, you would run the following command

$ pkill firefox

Similar to the pgrep command, you have the option to narrow down your results by specifying a user with the -u option.

To kill all processes whose name starts with “fire” and that are owned by a given user or by root, you would run the following command (user is a placeholder for the actual user name)

$ pkill -u user,root '^fire'

If you don’t have the right to stop a process, you will get a permission denied error message to your standard output.

Using the pkill command permission-denied-1

You also have the option to send specific signals by specifying the signal number in the pkill command

For example, in order to stop Firefox with a SIGSTOP signal, you would run the following command

$ pkill -19 firefox

Changing Linux Process Priority using nice and renice

On Linux, not all processes are given the same priority when it comes to CPU time.

Some processes, such as very important processes run by root, are given a higher priority in order for the operating system to work on tasks that truly matter to the system.

Process priority on Linux is called the nice level.

The nice level is a priority scale going from -20 to 19.

The lower you go on the niceness scale, the higher the priority will be.

Similarly, the higher you are on the niceness scale, the lower your priority will be.

In order to remember it, you can remember the fact that “the nicer you are, the more you are willing to share resources with others”.

Adjusting process priority using nice and renice

In order to start a certain program or process with a given nice level, you will run the following command

$ nice -n <level> <command>

For example, in order to run the tar command with a custom nice level, you would run the following command

$ nice -n 19 tar -cvf test.tar file

Similarly, you can use the renice command in order to set the nice level of a running process to a given value.

$ renice -n <priority> <pid>

For example, if I have a running process with the PID 123, I can use the renice command in order to set its priority to a given value.

$ renice -n 18 123

Niceness and permissions

If you are not a member of the sudo group (or a member of the wheel group on Red Hat-based distributions), there are some restrictions when it comes to what you can do with the nice command.

To illustrate it, try to run the following command as a non-sudo user

$ nice -n -1 tar -cvf test.tar file

nice: cannot set niceness: Permission denied

nice-permissions

When it comes to niceness, there is one rule that you need to know:

As a non-root (or sudo) user, you won’t be able to set a nice level lower than the default assigned one (which is zero), and you won’t be able to renice a running process to a lower level than the current one.

To illustrate the last point, launch a sleep command in the background with a nice value of 2.

$ nice -n 2 sleep 10000 &

Next, identify the process ID of the process you just created.

Niceness and permissions

Now, try to set the nice level of your process to a value lower than the one you specified in the first place.

$ renice -n 1 8363

renice
As you probably noticed, you won’t be able to set the niceness level to 1, but only to a value higher than the one you specified.

Now if you choose to execute the command as sudo, you will be able to set the nice level to a lower value.

sudo-rence

Now that you have a clear idea of the nice and renice commands, let’s see how you can monitor your processes in real-time on Linux.

Monitoring processes on Linux using top and htop

In a previous article, we discussed how it is possible to build a complete monitoring pipeline in order to monitor Linux processes in real-time.

Using top on Linux

top is an interactive command that any user can run in order to get a complete and ordered listing of all processes running on a Linux host.

To run top, simply execute it without any arguments.

The top will run in interactive mode.

$ top

If you want to run top for a custom number of iterations, run the following command

$ top -n <number>

top

The top command will first show recap statistics about your system at the top, for example, the number of tasks running, the percentage of CPU used, or the memory consumption.

Right below it, you have access to a live list of all processes running or sleeping on your host.

This view will refresh every three seconds, but you can obviously tweak this parameter.

To change the refresh rate in the top command, press the “d” key and choose a new refresh interval

refresh-rate

Similarly, you can change the nice value of a running process live by pressing the “r” key on your keyboard.

The same permission rules apply if you want to modify processes to a value lower than the one they are already assigned.

As a consequence, you may need to run the command as sudo.

renice-top

Using htop on Linux

Alternatively, if you are looking for a nicer way to visualize processes on your Linux host, you can use the htop command.

By default, the htop command is not available on most distributions, so you will need to install it with the following instructions.

$ sudo apt-get update
$ sudo apt-get install htop

If you are running a Red Hat based distribution, run the following commands.

$ sudo yum -y install epel-release
$ sudo yum -y update
$ sudo yum -y install htop

Finally, to run the htop command, simply run it without any arguments.

$ htop

htop

As you can see, the output is very similar except that it showcases information in a more human-friendly output.

Conclusion

In this tutorial, you learned many concepts about processes: how they are created, how they can be managed, and how they can be monitored effectively.

If you are looking for more tutorials related to Linux system administration, we have a complete section dedicated to it on the website, so make sure to check it out.

Until then, have fun, as always.


How To Install Git On Debian 10 Buster | Debian Git Repository | Debian Buster Git

Git is the world’s most famous distributed version control system, allowing you to keep track of your software at the source level. It is used by many open-source and commercial projects. In this tutorial, we will discuss how to install and get started with Git on Debian 10 Buster Linux, along with an introduction to Git: what Git is, common Git terms and commands, and the features of Git.

What is Git?

Git is the most commonly used distributed version control system in the world, created by Linus Torvalds in 2005. It is the popular option among open-source and other collaborative software projects. Many project files are kept in Git repositories, and platforms like GitHub, GitLab, or Bitbucket help promote software development project sharing and collaboration.

Mainly, the Git tool is used by development teams to keep track of all the changes happening on a codebase, as well as to organize code into individual branches. In today’s tutorial, we are going to see how to set up Git on a Debian 10 Buster machine.

What is Debian?

Debian is an operating system for a wide range of devices including laptops, desktops, and servers. The Debian developers provide security updates for all packages for almost all of their lifetime. The current stable distribution of Debian is version 10, codenamed Buster. Debian 10 is brand new, so if you require a complete setup tutorial for Debian 10, follow this tutorial.


Terms of Git

For a better understanding of Git, you should know a few common Git terms, which we have compiled here in detail:

  • Repository: It is a directory on your local computer or a remote server where all your project files are kept and tracked by Git.
  • Modified: If you add a file in the staging area, and modify the file again before committing, then the file will have a modified status. You will have to add the file to the staging area again for you to be able to commit it.
  • Commit: It is keeping a snapshot of the files that are in the staging area. A commit has information such as a title, description, author name, email, hash, etc.
  • Staged: Before you commit your changes to the Git repository, you must add the files to the staging area. The files in the staging area are called staged files.
  • Tracked: If you want Git to track a file, then you have to tell Git to track the file manually.
  • Untracked: If you create a new file on your Git repository, then it is called an untracked file in Git. Unless you tell git to track it, Git won’t track a file.

Git Features

Before learning how to install Git, it is essential to know the tool well. Here, we have provided the features of Git in image format for quick reference and easy sharing. Take a look at the shareable image below and save it on your device for later use:

Git Features shareable image

How To Install Git On Linux 2021?

Prerequisites

Before starting, make sure that you have root privileges on your instance.

To make sure of it, run the following command.

$ sudo -l

I – Prerequisites sudo-rights

How to Install Git from official sources?

By following the below sub-modules, you can easily understand the installation of Git from official sources:

Update your apt repositories

First of all, make sure that you have the latest versions of the repositories on your apt cache.

To update them, run the following command:

$ sudo apt update

II – Install Git from official sources apt-update

Install Git from the official repository

To install the latest stable version of Git (2.20.1 in my case), run the following command.

$ sudo apt-get install git

b – Install Git from the official repository git-install

Great!

Now you can check the git version that is running on your computer.
$ git --version
git version 2.20.1

Steps for Installing Git From Source

As you probably noticed, you are not getting the latest version of Git with the apt repositories. As of August 2019, the latest Git version is 2.22.0. In order to install the latest Git version on your Debian 10 instance, follow those instructions.

Install required dependencies

In order to build Git, you will have to install some dependencies manually on your system. To do so, run the following command:

$ sudo apt-get install dh-autoreconf libcurl4-gnutls-dev libexpat1-dev \
  gettext libz-dev libssl-dev

a – Install required dependencies manual-dependencies

Install documentation dependencies

In order to add documentation to Git (in different formats), you will need the following dependencies

$ sudo apt-get install asciidoc xmlto docbook2x

b – Install documentation dependencies manual-2

Install the install-info dependencies

On Debian configurations, you will need to add the install-info dependency to your system.

$ sudo apt-get install install-info

c – Install the install-info dependencies manual-3

Download and build the latest Git version

Head to the Git repository on Github, and select the version you want to run on your Debian instance.
d – Download and build the latest Git version latest-git-version

Head to the directory where you stored the tar.gz file, and run the following commands.

$ tar -zxf git-2.22.0.tar.gz
$ cd git-2.22.0
$ make configure
$ ./configure --prefix=/usr
$ make all doc info
$ sudo make install install-doc install-html install-info

Again, run the following command to make sure that Git is correctly installed on your system

$ git --version

d – Download and build the latest Git version git-2.22.0

Configuring Git

Now that Git is correctly set on your instance, it is time for you to configure it.

This information is used when you commit to repositories: you want to make sure that you appear under the correct name and email address.

To configure Git, run the following commands:

$ git config --global user.name "devconnected" 
$ git config --global user.email "devconnectedblog@gmail.com"

Now to make sure that your changes are made, run the following command.

$ git config --list

IV – Configuring Git git-config

You can also look at your modifications in the .gitconfig file available in your home directory.

To view it, run the following command.

$ cat ~/.gitconfig

IV – Configuring Git gitconfig-file

Now that your Git instance is up and running, it is time for you to make your first contributions to the open-source world!

Here’s a very good link by Digital Ocean on a first introduction to the Open Source world!
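If you want to practice locally first, here is a minimal sketch of creating a first repository and committing a file; the project and file names are just examples:

$ mkdir my-project && cd my-project
$ git init
$ echo "# My Project" > README.md
$ git add README.md
$ git commit -m "Initial commit"

Running git log afterwards should show this first commit under the name and email configured above.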

Uninstalling Git

If by any chance you are looking to remove Git from your Debian 10 Buster instance, run the following command:

$ sudo apt-get remove git

V – Uninstalling Git git-remove

Until then, have fun, as always.


Access Control Lists on Linux Explained | Linux ACL Cheat Sheet

An access control list (ACL) provides an additional, more flexible permission mechanism for file systems. It is designed to supplement standard UNIX file permissions. ACLs allow you to grant permissions for any user or group to any disk resource.

If you are working as a system administrator, you are probably already familiar with Linux ACLs, as they are used to define more fine-grained discretionary access rights for files and directories.

In today’s Access Control Lists on Linux Explained tutorial, we are going to dig deeper into Linux access control lists: what they are used for and how they are managed in order to configure a Linux system properly.

Get Ready to Learn A New Topic?

What You Will Learn

If you follow this tutorial until the end, you will learn how access control lists work, how to create and remove them with setfacl, and how to read them with getfacl.

That’s quite a long program, so without further ado, let’s start with a quick definition of what Linux file access control lists (ACLs) are.

Access Control Lists Basics on Linux

On Linux, there are two ways of setting permissions for users and groups: with regular file permissions or with access control lists.

What is ACL(Access Control Lists)?

Access control lists are used on Linux filesystems to set custom and more personalized permissions on files and folders. ACLs allow file owners or privileged users to grant rights to specific users or to specific groups.

getfacl-1

In Linux, as you probably know, the permissions are divided into three categories: one for the owner of the file, one for the group, and one for the others.

However, in some cases, you may want to grant access to a directory (the execute permission for example) to a specific user without having to put this user into the group of the file.

This is exactly why access control lists were invented in the first place.


Listing Access Control List

On Linux, access control lists are not enabled when you create a new file or directory on your host (except if a parent directory has some ACLs predefined).

To see if access control lists are defined for a file or directory, run the ls command and look for a “+” character at the end of the permission line.

$ ls -l

access-control-list
For comparison, here is what the listing looks like on a minimal instance without any ACLs defined.

Listing Access Control List acl-file

Now that you have some basics about access control lists, let’s see how you can start creating basic ACLs for your files and directories.

List of commands for setting up ACL

1) To add permission for user
setfacl -m "u:user:permissions" /path/to/file

2) To add permissions for a group
setfacl -m "g:group:permissions" /path/to/file

3) To allow all files or directories to inherit ACL entries from the directory it is within
setfacl -dm "entry" /path/to/dir

4) To remove a specific entry
setfacl -x "entry" /path/to/file

5) To remove all entries
setfacl -b path/to/file

Creating access control lists on Linux

Before starting with ACL commands, it is important to have the packages installed on your host.

Checking ACL packages installation

It might not be the case if you chose to have a minimal server running.

Start by checking the help related to the setfacl by running the following command

$ setfacl --help

If your host cannot find the setfacl command, make sure to install the necessary packages for ACL management.

$ sudo apt-get install acl -y

Note that you will need sudo privileges on Debian 10 to run this command.

Checking ACL packages installation

Run the setfacl command and make sure that you are able to see the help commands this time.

Now that your host is correctly configured, let’s see how the setfacl command works.

Setting access control lists using setfacl

With access control lists, there are two main commands that you need to remember: setfacl and getfacl.

In this chapter, we are going to take a look at the setfacl command as the getfacl one is pretty self-explanatory.

The setfacl command is used on Linux to create, modify and remove access control lists on a file or directory.

The setfacl command has the following syntax

$ setfacl {-m, -x}  {u, g}:<name>:[r, w, x] <file, directory>

Where curly brackets mean one of the following options and regular brackets mean one or several items.

  • -m: means that you want to modify one or several ACL entries on the file or directory.
  • -x: means that you want to remove one or several ACL entries on a file or directory.
  • {u, g}: if you want to modify the ACL for a user or for a group.
  • name: this is an optional parameter; if it is omitted, the entry applies to the owning user or the owning group of the file.
  • [r, w, x]: in order to set read, write or execute permissions on the file or directory.

For example, in order to set specific write permissions for a user on a file, you would write the following command

$ setfacl -m u:user:w <file, directory>

In order to set execute permissions for the owning user of the file (the user entry with no name), you would write the following command

$ setfacl -m u::x <file, directory>

To set full permissions for a specific group on your host, you would write the setfacl this way

$ setfacl -m g:group:rwx <file, directory>

Now let’s say that you want to remove an ACL entry from a file.

In order to remove a user-specific entry from a file, you would specify the -x option.

Note: you cannot remove specific rights from a single ACL entry with the -x option; the whole entry is removed, meaning that you can’t remove just the write permission while keeping the read permission active.

$ setfacl -x u:<user> <file, directory>

Similarly, to remove ACL related to groups on your host, you would write the following command

$ setfacl -x g:<group> <file, directory>

Now that you have seen how you can create access control lists easily on Linux, it is time to see how you can check existing access control lists on files and directories.

Listing access control lists using getfacl

The getfacl command is used on Linux to print a complete listing of all regular permissions and access control lists permissions on a file or directory.

The getfacl can be used with the following syntax

$ getfacl <file, directory>


The output of the getfacl command is divided into multiple sections:

  • Filename, owner, and group: The information about the user and group ownership is shown at the top;
  • User permissions: First, you would find regular user permissions, also called the owning user, followed by any user-specific ACL entries (called named users)
  • Group permissions: Owning groups are presented followed by group-specific ACL entries, also called named groups
  • Mask: the mask restricts the permissions granted by ACL entries; it is detailed in the next section;
  • Other permissions: Those permissions are always active and this is the last category explored when no other permissions match with the current user or group.
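To make these sections concrete, here is a minimal sketch of what a getfacl listing might look like (the file name, owner, and the “antoine” entry are purely illustrative):

$ getfacl file.txt
# file: file.txt
# owner: schkn
# group: schkn
user::rw-
user:antoine:rwx
group::r--
mask::rwx
other::r--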

Working with the access control lists mask

As you probably saw from the last screenshot, there is a mask entry between the named groups and the other permissions.

But what is this mask used for?

The ACL mask is different from the file creation mask (umask) and it is used in order to restrict the existing ACL entries on a file or directory.

The ACL mask acts as the maximum set of ACL permissions: permissions that exceed the mask are not effective.

As always, a diagram speaks a hundred words.


The ACL mask is updated every time you run a setfacl command unless you specify that you don’t want to update the mask with the -n flag.

To prevent the mask from being updated, run the setfacl with the following command

$ setfacl -n -m u:antoine:rwx <file, directory>

As you can see in this example, I have set the user “antoine” to have full permissions on the file.

The mask, however, is set to restrict permissions to read and write only.

As a consequence, the “effective permissions” set on this file for this user are read and write; the execute permission is not granted.


Note: If your maximum set of permissions differs from the mask entry, you will be presented with an effective line computing the “real” set of ACL entries used.
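As an illustration, here is a hedged sketch of such a listing for the hypothetical “antoine” entry above, with the mask limited to read and write:

$ getfacl file.txt
# file: file.txt
# owner: schkn
# group: schkn
user::rw-
user:antoine:rwx                #effective:rw-
group::r--
mask::rw-
other::r--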

Creating access control lists defaults on directories

As already mentioned in this article, it is possible to create ACL entries on directories and they work in the same way file access control lists work.

However, there is a small difference when it comes to directories: you have the option to create default access control lists.

Access control lists defaults are used to create ACL entries on a directory that will be inherited by objects in this directory like files or subdirectories.

When creating default ACL entries :

  • Files created in this directory inherit the ACL entries specified in the parent directory
  • Subdirectories created in this directory inherit the ACL entries as well as the default ACL entries from the parent directory.

To create default ACL entries, specify the -d option when setting ACL using the setfacl command.

$ setfacl -d -m {u, g}:<name>:[r, w, x] <directory>

For example, to give the owning user read permissions on all files created in a directory, you would run the following command

$ setfacl -d -m u::r directory


Now, when a file is created in this acl-directory, you can see that default ACL entries are applied to the file.


Similarly, when a directory is created in the acl-directory, it will inherit default ACL entries specified in the parent directory.


Note that it is recommended to specify default permissions for all three categories (user, group, and other).

In fact, specifying one of the three entries will create the remaining two with permissions related to the file creation mask.
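As an illustrative sketch (the directory name comes from the example below; the owner and exact permissions are assumptions), default entries appear in a getfacl listing prefixed with “default:”:

$ getfacl acl-directory
# file: acl-directory
# owner: schkn
# group: schkn
user::rwx
group::r-x
other::r-x
default:user::r--
default:group::r-x
default:other::r-x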

Deleting default access control lists on directories

In order to delete default existing access control lists on directories, use the -k flag with the setfacl command.

$ setfacl -k <directory>

Given the example we specified earlier, here is how to delete default entries

$ setfacl -k acl-directory


Note that deleting ACL entries from the parent directory does not delete ACL entries in files or directories contained in the parent directory.

To remove default ACL entries in a directory and all subdirectories, you would have to use a recursive option (-R)

$ setfacl -kR <directory>


Conclusion

In this tutorial, you learned about access control lists on Linux, the getfacl, and the setfacl command.

You also discovered more about the access control lists mask and how default ACL is used in order to create ACL entries on files and subdirectories contained in the parent directory.

If you are curious about Linux system administration, we have many more tutorials on the subject, make sure to read them!

Understanding Hard and Soft Links on Linux

Understanding Hard and Soft Links on Linux | What are Hard & Soft Links in Linux?

In this tutorial, we are going to discuss what hard and soft links are, their syntax, and how you can easily understand hard and soft links on Linux.

In case you are wondering how you can create a shortcut on a Linux system, this tutorial can be the perfect answer for you.

Are you ready to start learning about hard and soft links on Linux? Here you go.

What Will You Learn?

This section summarizes what this tutorial provides on the topic we are going to discuss. It helps you know ahead of time what you are going to learn:

  • How storage works on a Linux system and what inodes are exactly?
  • What hard and soft links are, given what you just learned before
  • How copying differs from creating links to your files
  • How to create links on a Linux system
  • How to find hard and soft links: all the commands that you should know.
  • Some of the quick facts about hard and soft links

That’s a long program, so without further ado, let’s see how data is organized on your Linux system and what inodes are.


How does storage work on a Linux system?

In order to understand what hard and soft links are, you need to have some basics on how data is stored on your system.

In a computer, data are stored on a hard drive.

Your hard drive has a capacity, let’s say 1 TB, and it is divided into multiple blocks of a given capacity.


If I launch a simple fdisk command on my system, I am able to see that my device is separated into sectors of 512 bytes.


That’s a lot of sectors, as my hard drive has a capacity of almost 32 GBs.

Now every time that I create a file, it will be stored on a block.

But what if the file size exceeds 512 bytes?

You guessed it, my file will be “fragmented” and stored into multiple different blocks.


If the different pieces of your file are physically far away from each other on the disk, the time needed to build your file will obviously be longer.

That’s why you had to defragment your disk on Windows systems, for example: you were essentially moving related blocks closer together.

Luckily for us, those blocks have addresses.

Your Linux system does not have to read the entire disk to find your file, it will cherry-pick some addresses that correspond to your file data.

How?

By using inodes.

a – What are inodes?

Inodes are essentially identification cards for your file.

They contain the file metadata, the file permissions, the file type, the file size but most importantly the physical address of this file on the disk.


Without going into too many details, your inode will keep references to the locations of your different file blocks.

So one inode equals one file.

That’s the first layer of your file system, and how references are kept to the underlying data.


However, as you probably noticed, filenames are not part of the inode, they are not part of the identification card of the file.

b – About filenames and inodes

On a Linux system, filenames are kept on a separate index.

This separate index keeps track of all the different filenames existing on your system (even directories), and it maps each filename to the corresponding inode in the inode index.

What does it mean?

It essentially means that you can have multiple filenames (say “doc.txt” and “paper.txt”) pointing to the same exact file, sharing the same content.
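As a quick, hypothetical illustration (the filenames “doc.txt” and “paper.txt” are just the example names used above, and the inode number is illustrative), you can check this with the -i flag of ls, which prints the inode number of each entry:

$ ls -i doc.txt paper.txt
258545 doc.txt
258545 paper.txt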

Now that you learned about the filenames index, let’s add another layer to our previous layered architecture.


With this schema, you have a basic understanding of how files are organized and stored on your filesystem, but more importantly, you will be able to understand what hard and soft links are.

What is Soft Link And Hard Link In Linux?

Let’s start with soft links as they are probably the easiest to understand.

a – Understanding soft links

Soft links, also called symbolic links, are files that point to other files on the filesystem.

Similar to shortcuts on Windows or macOS, soft links are often used as faster ways to access files located in another part of the filesystem (because the path may be hard to remember for example).

Symbolic links are identified with the filetype “l” when running an ls command.

They also have a special syntax composed of the link name and an arrow pointing to the file they are referencing.


In this case, the file “shortcut” is pointing to the file “file.txt” on my filesystem.

Have you paid attention to the permissions?

They are set to “rwx” by default for the user, the group, and the others.

However, you would still be constrained by the permissions of the original file if you were to manipulate this file as another user.

b – Soft links and inodes

So why did we talk so much about inodes in the first section?

Let’s have a quick look at the inode of the file and the inode of the shortcut.

$ stat shortcut
  File: shortcut -> file.txt
  Size: 3               Blocks: 0          IO Block: 4096   symbolic link
Device: fc01h/64513d    Inode: 258539      Links: 1

$ stat file.txt
  File: file.txt
  Size: 59              Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 258545      Links: 2

The inodes are different.

However, the original file inode is pointing directly to the file content (i.e the blocks containing the actual content) while the symbolic link inode is pointing to a block containing the path to the original file.


The file and the shortcut share the same content.

It means that if I were to modify the content of the shortcut, the changes would be passed on to the content of the file.

If I delete the shortcut, it will simply delete the reference to the first inode. As the file inode (the first one) still references the content of the file on the disk, the content won’t be lost.

However, if I were to delete the file, the symbolic link would lose its reference to the first inode. As a consequence, you would not be able to read the file anymore.

This is what we call a dangling symbolic link, a link that is not pointing to anything.


See the red highlighting when I deleted the original file?

Your terminal is giving visual clues that a symbolic link is a dangling symbolic link.

So what’s the size of a symbolic link?

Remember, the symbolic link points to the path of the original file on the filesystem.

In this example, I created a file named “devconnected”, and I built a shortcut to it using the ln command.

Can you guess the size of the shortcut? 12, because “devconnected” actually contains 12 letters.
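Here is a small sketch of what that looks like (the link name and the rest of the listing are illustrative; only the target “devconnected” comes from the example above):

$ ln -s devconnected shortcut-to-dev
$ ls -l shortcut-to-dev
lrwxrwxrwx 1 schkn schkn 12 Aug 14 20:12 shortcut-to-dev -> devconnected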


Great!

Now you have a good understanding of what soft links are.

c – Understanding hard links

Hard links to a file are instances of the file under a different name on the filesystem.

Hard links are literally the file, meaning that they share all the attributes of the original file, even the inode number.


Here’s a hard link created on my system.

It is sharing the same content as the original file and it is also sharing the same permissions.

Changing permissions of the original file would change the permissions of the hard link.

d – Hard links and inodes

Let’s have a quick look at the original file inode and the hard link inode.

$ stat hardlink
  File: hardlink
  Size: 59              Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 258545      Links: 2

$ stat file.txt
  File: file.txt
  Size: 59              Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 258545      Links: 2

As you probably noticed, the inodes are the same, but the filenames are different!

Here’s what happens in this case on the filesystem.


When you are creating a symbolic link, you are essentially creating a new link to the same content, via another inode, but you don’t have access to the content directly in the link.

When creating a hard link, you are literally directly manipulating the file.

If you modify the content in the hard link file, the content will be changed in the original file.

Similarly, if you modify the content in the original file, it will be modified in the hard link file.

However, if you delete the hard link file, you will still be able to access the original file content.

Similarly, deleting the original file has no consequences on the content of the hard link.

Data is only definitively deleted when no filenames point to its inode anymore, i.e., when the link count drops to zero.

Hard or Soft?

You won’t find a definitive answer to this question: the type of link that suits your particular situation is the best one to use. While these concepts can be tricky to memorize, the syntax is pretty straightforward, so that is a plus!

To keep the two links easily separated in your mind, I leave you with this:

  • A hard link always points a filename to data on a storage device.
  • A soft link always points a filename to another filename, which then points to information on a storage device.

What is the difference between copying and creating a hard link?

With the concepts that you just learned, you may wonder what’s the difference between copying a file and creating a hard link to the file.

Don’t we have two files in the end, with the same content?

The difference between copying and hard linking is that hard-linking does not duplicate the content of the file that it links to.

When you are copying a file, you are essentially assigning new blocks on the disk with the same content as the original file.

When you hard-link, you share the same content, and the only additional disk space used is for the new filename entry, not for the actual content of the file.

A diagram may explain it better than words, but the key point is this: copying a file allocates new blocks on the disk and duplicates the content into them, while hard-linking only adds a new filename entry pointing to the existing blocks.
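If you want to verify this on your own system, a quick check is to compare inode numbers: a copy gets a new inode, while a hard link reuses the original one. The filenames and inode numbers below are illustrative:

$ cp file.txt copy.txt
$ ln file.txt hardlink.txt
$ ls -i copy.txt file.txt hardlink.txt
258602 copy.txt
258545 file.txt
258545 hardlink.txt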

Now that you know how copying files differ from hard linking to files, let’s see how you can create symbolic and hard links on a Linux system.

Manipulating links on a Linux system

a – How to create a symbolic link on Linux?

To create a symbolic link, you need to use the ln command, with a -s flag (for symbolic).

The first argument specifies the file you want to link to.

The second argument describes the name of the link you are about to create.

$ ln -s file shortcut

$ ls -l
-rw-rw-r-- 1 schkn schkn 0 Aug 14 20:12 file
lrwxrwxrwx 1 schkn schkn 4 Aug 14 20:12 shortcut -> file

You can also create symbolic links to directories.

$ mkdir folder
$ ln -s folder shortcut-folder

$ ls -l
drwxrwxr-x  2 schkn schkn   4096 Aug 14 20:13 folder
lrwxrwxrwx  1 schkn schkn      7 Aug 14 20:14 shortcut-folder -> folder/

b – How to delete symbolic links on Linux

To remove existing symbolic links, use the unlink command.

Following our previous example :

$ unlink shortcut

$ ls -l 
-rw-rw-r-- 1 schkn schkn 0 Aug 14 20:12 file

You can also simply remove the shortcut by using the rm command.

Using the unlink command might be a little bit safer than performing an rm command.

$ rm shortcut

$ ls -l 
-rw-rw-r-- 1 schkn schkn 0 Aug 14 20:12 file

c – How to create a hard link on Linux

A hard link is created using the ln command without specifying the -s flag.

$ ln file hardlink

$ ls -l
-rw-rw-r--  2 schkn schkn      0 Aug 14 20:12 file
-rw-rw-r--  2 schkn schkn      0 Aug 14 20:12 hardlink

d – How to remove a hard link on Linux

Again, you can use the unlink command to delete a hard link on a Linux system.

$ ln file hardlink
$ unlink hardlink
$ ls -l
-rw-rw-r--  2 schkn schkn      0 Aug 14 20:12 file

Now that you know how to create links, let’s see how you can find links on your filesystem.

How to find links on a filesystem?

There are multiple ways to find links on a Linux system, but here are the main ones.

Using the find command

The find command has a -type flag that you can use in order to find links on your system.

$ find . -type l -ls
262558      0 lrwxrwxrwx   1 schkn    schkn           7 Aug 14 20:14 ./shortcut-folder2 -> folder2/
.
.
.
262558      0 lrwxrwxrwx   1 schkn    schkn           7 Aug 14 20:14 ./shortcut-folder -> folder/

However, if you want to limit searches to the current directory, you have to use the -maxdepth parameter.

$ find . -maxdepth 1 -type l -ls 
262558      0 lrwxrwxrwx   1 schkn    schkn           7 Aug 14 20:14 ./shortcut-folder -> folder/
258539      0 lrwxrwxrwx   1 schkn    schkn           3 Jan 26  2019 ./soft-job -> job

Finding links that point to a specific file

With the -lname option, you have the opportunity to target links pointing to a specific filename.

$ ls -l
drwxrwxr-x  2 schkn schkn   4096 Aug 14 20:13 folder
lrwxrwxrwx  1 schkn schkn      7 Aug 14 20:38 devconnected -> folder/

$ find . -lname "fold*"
./devconnected

Finding broken links

With the -L flag, you can find broken (or dangling) symbolic links on your system.

$ find -L . -type l -ls
258539      0 lrwxrwxrwx   1 schkn    schkn           3 Jan 26  2019 ./broken-link -> file

Quick facts about links on Linux

Before finishing this tutorial, there are some quick facts that you need to know about soft and hard links.

  • Soft links can point to different filesystems, and to remote filesystems. If you were to use NFS, which stands for Network File System, you would be able to create a symbolic link from one filesystem to a file system accessed by the network. As Linux abstracts different filesystems by using a virtual filesystem, it makes no difference for the kernel to link to a file located on an ext2, ext3, or an ext4 filesystem.
  • Hard links cannot be created for directories and they are constrained to the limits of your current filesystem. Creating hard links for directories could create access loops, where you would try to access a directory that points to itself. If you need more explanations about why it can’t be done conceptually, here’s a very good post by Volker Siegel on the subject.
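As a quick, illustrative check of the second point (the directory name is hypothetical), trying to hard link a directory fails with an error similar to the following:

$ mkdir folder
$ ln folder hardlink-to-folder
ln: folder: hard link not allowed for directory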

I hope that you learned something new today. If you are looking for more Linux system administration tutorials, make sure to check our dedicated section.


Cron Jobs and Crontab on Linux Explained

Cron Jobs and Crontab on Linux Explained | What is Cron Job & Crontab in Linux with Syntax?

This Cron Jobs and Crontab on Linux Explained Tutorial helps you understand cron on Linux along with the role of the crontab file. System administrators are likely to spend a lot of time performing recurring tasks on their systems.

But the best way to automate tasks on Linux systems is cron jobs. Cron was initially developed in 1975 at AT&T Bell Laboratories.

Beyond cron jobs and the crontab command on Linux, you are also going to learn about the Linux cron daemon. We will also keep focusing on the difference between user-defined cron jobs and system-defined cron jobs.

Are you ready to start learning?

What is ‘crontab’ in Linux?

The crontab is a list of commands that you want to run on a schedule, and also the name of the command used to manage that list. Crontab stands for “cron table,” because it uses the job scheduler cron to execute tasks.

cron itself is named after “Chronos,” the Greek word for time. cron is the system process that will automatically execute tasks for you according to the set schedule. The schedule is called the crontab, which is also the name of the program used to edit that schedule.

Linux Crontab Format

MIN HOUR DOM MON DOW CMD

Linux Crontab Syntax

Field    Description      Allowed Values
MIN      Minute field     0-59
HOUR     Hour field       0-23
DOM      Day of Month     1-31
MON      Month field      1-12
DOW      Day of Week      0-6
CMD      Command          Any command to be executed

Important Crontab Examples

The following are some useful crontab examples. Kindly have a look at them:

  • Execute a job at 7 AM and 5 PM daily:
    0 7,17 * * * /scripts/script.sh
  • Execute a cron job every 5 minutes:
    */5 * * * * /scripts/script.sh
  • Execute a task every Monday at 5 AM; this is helpful for weekly tasks like system clean-up:
    0 5 * * mon /scripts/script.sh
  • Run your script at 3-minute intervals:
    */3 * * * * /scripts/monitor.sh

Linux Crontab Command

The crontab command permits you to install, view, or open a crontab file for editing:

  • crontab -e: Edit crontab file, or create one if it doesn’t already exist.
  • crontab -l: Display crontab file contents.
  • crontab -r: Remove your current crontab file.
  • crontab -i: Remove your current crontab file with a prompt before removal.
  • crontab -u <username>: Edit another user’s crontab file. This option requires system administrator privileges.

What is Cron and Cron Jobs in Linux?

Cron is a system daemon that runs on any Linux system and is responsible for detecting cron jobs and executing them at given intervals.

Cron runs every minute and it will inspect a set of pre-defined directories on your filesystem to see if jobs need to be run.

On the other hand, cron jobs are tasks defined to run at given intervals or periods, usually shell scripts or simple bash commands.

Cron jobs are usually used in order to log certain events to your Syslog utilities, or to schedule backup operations on your host (like database backups or filesystem backups).

For a Linux OS running systemd as a service manager, you can inspect the cron service by running the following command

$ sudo systemctl status cron.service

Note: You need sudo privileges to inspect system services with systemd

What is the Cron Job Syntax?

The most important thing to know about cron is probably the cron job syntax.

In order to define a cron job, you are going to define:

  • Periodicity: meaning when your job is going to be executed over time. You can define it to run every first day of the month, every 5 minutes, or on a very specific day of the year. Examples will be given later on in the article;
  • Command: literally the command to be executed by the cron utility, it can be a backup command or whatever command that you would normally run in a shell;
  • User: this is reserved for system-defined cron jobs, where you want to specify the user that should run the command. For user-defined cron jobs, you don’t have to specify a user: they run as the user who owns the crontab.


As you probably noticed, the periodicity column is made of 5 columns.

Every single column can either be set to *, meaning that the command will be executed for every single value of the interval specified or to a particular value, for example, the 6th month of the year.

If you want to execute a command for every minute, of every hour, of every day of the month, of every month, you need to write the following command

* * * * *  logger "This is a command executed every minute"

If you want to execute a command every 30 minutes, you would write

*/30 * * * * logger "This is executed every 30 minutes"

On the other hand, if you want to execute a command on the first day of the month at midnight, you would write

0 0 1 * * logger "This is executed on the first day of the month, at midnight"

When defining those cron jobs, I didn’t have to specify the user executing them.

This is because there is a difference between user-defined cron jobs and system-defined cron jobs.

User-Defined Cron Jobs

User-defined cron jobs are cron jobs defined by a given user on the host. That doesn’t mean they cannot execute commands affecting the entire system, but their definitions are isolated in user-specific folders on the host.

Every user can have their own set of cron jobs on a Linux host.

Listing user-defined cron jobs

When connected to a specific user, run this command to see the cron jobs owned by the user

$ crontab -l

If you own cron jobs, they will immediately be displayed to the standard output.

By default, user-defined cron jobs are stored in the /var/spool/cron/crontabs directory but you will need to be root to explore it.


Adding user-defined cron jobs

In order to edit the cron jobs related to the user you are connected to, run the following command

$ crontab -e

By default, your host will open your default editor and you will be able to edit your cron jobs on the system.

Add a new line to the end of the file with the following line for example

* * * * * logger "This is a log command from junosnotes"

Logger is a command that allows users to write custom messages to logs. If you need a complete guide on logging and Syslog, we have a complete write-up on it.

You don’t have to specify a user as the system is already aware of the user defining this command.

Moreover, the command will be executed as the current user by default.

You don’t have to restart any services, your job will be taken into account on the next occurrence.

Given the example we specified earlier, inspect your logs to see your cron job executed

$ sudo journalctl -xfn


As you can see, the cron service inspected user-specific directories on the host (in /var/spool/cron/crontabs), it opened a session as my current user, executed the command, and closed the session.

Awesome!

You learned how you can define user-defined cron jobs on your host.

Removing user-defined cron jobs

In order to remove user-defined cron jobs, use the following command

$ crontab -r
(or)
$ crontab -ri

Crontab will be deleted for your current user (it won’t delete system-defined cron jobs).

Run a cron job listing to check that all cron jobs have been deleted
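As an illustrative check (the username below is just an example), an empty crontab typically produces a message like this:

$ crontab -l
no crontab for schkn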

System Defined Cron Jobs

System-defined cron jobs are jobs defined in shared directories on the filesystem.

It means that, given that you have sudo privileges on the host, you will be able to define cron jobs that may be modified by other administrators on your system.

Directories related to system-defined cron jobs are located in the /etc directory and can be seen by running

$ ls -l /etc | grep cron

As you can see, this directory contains a lot of different folders and files:

  • anacrontab: a file used by the anacron service on Linux, which will be explained in one of the next sections.
  • cron.d: a directory containing a list of cron jobs to be read by the cron service. The files in cron.d are written given the cron syntax we saw before;
  • cron.daily: a directory containing a list of scripts to be executed by the system every day. Files are different from the files contained in the cron.d directory because they are actual bash scripts and not cron jobs written with cron syntax;
  • cron.hourly, cron.monthly, cron.weekly are self-explanatory, they contain scripts executed every hour, every month, and every week of the year;
  • crontab: a cron file written with cron syntax that instructs the cron service to run jobs located in the daily, hourly, monthly, and weekly folders. It can also define custom jobs similarly to user-defined cron jobs, except that you have to specify the user that should run the command.

Listing system defined cron jobs

As you probably understood, cron jobs defined in global configuration folders are spread over multiple folders.

Moreover, multiple cron files can be defined on those folders.

However, using the command line, there are efficient ways to concatenate all the files in a given directory.

To list all cron jobs defined in cron.d, run the following command

$ cat /etc/cron.d/*

Similarly, to list cron jobs defined in the crontab file, run the following command

$ cat /etc/crontab

Similarly, you can inspect all the scripts that are going to be executed on a daily basis

$ ls -l /etc/cron.daily/

Adding system defined cron jobs

As you probably understood, there are multiple ways to add system-defined cron jobs.

You can create a cron file in the cron.d directory, and the file will be inspected every minute for changes.

You can also add your cron job directly to the crontab file. If you want to execute a task every hour, every day, every week, or every month, it is recommended to add your script directly to the corresponding cron directory.

The only difference with user-defined cron jobs is that you will have to specify a user that will run the cron command.

For example, create a new file in the cron.d directory and add the following content to it (you will obviously need sudo privileges for the commands to run)

$ sudo nano /etc/cron.d/custom-cron

*/1 * * * *    root    logger 'This is a cron from cron.d'

Again, no need for you to restart any services, the cron service will inspect your file on the next iteration.

To see your cron job in action, run the following command

$ sudo journalctl -xfn 100 | grep logger

Your custom log message should appear in the journal output.

Great!

As you can see your job is now executed every minute by the root user on your host.

Now that you have a complete idea of what user-defined cron jobs and system-defined cron jobs are, let’s see the complete cron lifecycle on a Linux host.

Cron Complete Cycle on Linux

Without further ado, here is a complete cron cycle on Linux.


This is what your cron service does every minute, as well as all the directories inspected.

Cron will inspect the user-defined cron jobs and execute them if needed.

It will also inspect the crontab file, where several default cron jobs are defined.

Those default cron jobs instruct your host to check the hourly, daily, weekly, and monthly folders at regular intervals and to execute the scripts located inside them.

Finally, the cron.d directory is inspected. The cron.d may contain custom cron files and it also contains a very important file which is the anacron cron file.

Anacron cron file on Linux

The anacron cron file is a file executed every half an hour between 7:00 am and 11 pm.

The anacron cron file is responsible for calling the anacron service.

The anacron service is a service that is responsible for running cron jobs in case your computer was not able to run them in the first place.

Suppose that your computer is off but you had a cron job responsible for running update scripts every week.

As a consequence, when turning your computer on, instead of waiting an entire week to run those update scripts, the anacron service will detect that you were not able to launch the update cron before.

Anacron will then proceed to run the cron job for your system to be updated.

By default, the anacron cron file is instructed to verify that the cron.daily, cron.weekly, cron.hourly, and cron.monthly directories were correctly called in the past.

If anacron detects that the cron jobs in the cron.monthly folder haven’t run, it will be in charge of running them.

Conclusion

Today, you learned how cron and crontab work on Linux. You also had a complete introduction on the cron syntax and how to define your own cron scripts as a user on your host.

Finally, you had a complete cron cycle overview of how things work on your host and of what anacron is.

If you are interested in Linux system administration, we have a complete section about it on our website, so make sure to check it out!

Until then, have fun, as always.

Input Output Redirection on Linux Explained

Input Output Redirection on Linux Explained | Error Redirection in Linux

In Linux, you will very often perform input or output redirection in your daily work. It is one of the core concepts of Unix-based systems, and it is used as a way to greatly improve programmer productivity.

In this tutorial, we will be discussing in detail the standard input/output redirections on Linux. Most Unix system commands take input from your terminal and send the resulting output back to your terminal.

Moreover, this guide will look at the way the Linux kernel handles files as well as the way processes work, so that you have a deep and complete understanding of what input and output redirection is.


If you follow this Input Output Redirection on Linux Explained Tutorial until the end, you will have a good grip on the following concepts:

  • What file descriptors are and how they relate to standard inputs and outputs;
  • How to check standard inputs and outputs for a given process on Linux;
  • How to redirect standard input and output on Linux;
  • How to use pipelines to chain inputs and outputs for long commands.

So without further ado, let’s take a look at what file descriptors are and how files are conceptualized by the Linux kernel.

Get Ready?

What is Redirection?

In Linux, redirection is a feature that lets you change the standard input/output devices when executing a command. The basic workflow of any Linux command is that it takes an input and gives an output.

  • The standard input (stdin) device is the keyboard.
  • The standard output (stdout) device is the screen.

With redirection, the standard input/output can be changed.

What are Linux processes?

Before understanding input and output on a Linux system, it is very important to have some basics about what Linux processes are and how they interact with your hardware.

If you are only interested in input and output redirection command lines, you can jump to the next sections. This section is for system administrators willing to go deeper into the subject.

a – How are Linux processes created?

You probably already heard it before, as it is a pretty popular adage, but on Linux, everything is a file.

It means that processes, devices, keyboards, hard drives are represented as files living on the filesystem.

The Linux Kernel may differentiate those files by assigning them a file type (a file, a directory, a soft link, or a socket for example) but they are stored in the same data structure by the Kernel.

As you probably already know, Linux processes are created as forks of existing processes which may be the init process or the systemd process on more recent distributions.

When creating a new process, the Linux Kernel will fork a parent process and duplicate its internal structure, which is described in the next section.

b – How are files stored on Linux?

I believe that a diagram speaks a hundred words, so here is how files are conceptually stored on a Linux system.


As you can see, for every process created, a new task_struct is created on your Linux host.

This structure holds two references, one for filesystem metadata (called fs) where you can find information such as the filesystem mask for example.

The other one is a structure for files holding what we call file descriptors.

It also contains metadata about the files used by the process but we will focus on file descriptors for this chapter.

In computer science, file descriptors are references to files that are currently opened by a process and tracked by the kernel.

But what do those files even represent?

c – How are file descriptors used on Linux?

As you probably already know, the kernel acts as an interface between your hardware devices (a screen, a mouse, a CD-ROM, or a keyboard).

It means that your Kernel is able to understand that you want to transfer some files between disks, or that you may want to create a new video on your secondary drive for example.

As a consequence, the Linux Kernel is permanently moving data from input devices (a keyboard for example) to output devices (a hard drive for example).

Using this abstraction, processes are essentially a way to manipulate inputs (as read operations) to render various outputs (as write operations).

But how do processes know where data should be sent to?

Processes know where data should be sent to using file descriptors.

On Linux, the file descriptor 0 (or fd[0]) is assigned to the standard input.

Similarly the file descriptor 1 (or fd[1]) is assigned to the standard output, and the file descriptor 2 (or fd[2]) is assigned to the standard error.


It is a constant on a Linux system, for every process, the first three file descriptors are reserved for standard inputs, outputs, and errors.

Those file descriptors are mapped to devices on your Linux system.

Devices are registered when the kernel is instantiated; they can be seen in the /dev directory of your host.


If you were to take a look at the file descriptors of a given process, let’s say a bash process for example, you can see that file descriptors are essentially soft links to real hardware devices on your host.
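As an illustrative sketch (the PID 5151 is the one used in this example, and the exact listing will differ on your host), it might look roughly like this:

$ ls -l /proc/5151/fd
lrwx------ 1 schkn schkn 64 Aug 14 20:15 0 -> /dev/pts/0
lrwx------ 1 schkn schkn 64 Aug 14 20:15 1 -> /dev/pts/0
lrwx------ 1 schkn schkn 64 Aug 14 20:15 2 -> /dev/pts/0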


As you can see, when isolating the file descriptors of my bash process (that has the 5151 PID on my host), I am able to see the devices interacting with my process (or the files opened by the kernel for my process).

In this case, /dev/pts/0 represents a terminal which is a virtual device (or tty) on my virtual filesystem. In simpler terms, it means that my bash instance (running in a Gnome terminal interface) waits for inputs from my keyboard, prints them to the screen, and executes them when asked to.

Now that you have a clearer understanding of file descriptors and how they are used by processes, we are ready to describe how to do input and output redirection on Linux.

What is Output redirection on Linux?

Input and output redirection is a technique used in order to redirect/change standard inputs and outputs, essentially changing where data is read from, or where data is written to.

For example, if I execute a command on my Linux shell, the output might be printed directly to my terminal (a cat command for example).

However, with output redirection, I could choose to store the output of my cat command in a file for long-term storage.

a – How does output redirection work?

Output redirection is the act of redirecting the output of a process to a chosen place like files, databases, terminals, or any devices (or virtual devices) that can be written to.


As an example, let’s have a look at the echo command.

By default, the echo function will take a string parameter and print it to the default output device.

As a consequence, if you run the echo function in a terminal, the output is going to be printed in the terminal itself.


Now let’s say that I want the string to be printed to a file instead, for long-term storage.

To redirect standard output on Linux, you have to use the “>” operator.

As an example, to redirect the standard output of the echo function to a file, you should run

$ echo junosnotes > file

If the file does not exist, it will be created.

Next, you can have a look at the content of the file and see that the “junosnotes” string was correctly printed to it.


Alternatively, it is possible to redirect the output by using the “1>” syntax.

$ echo test 1> file


b – Output Redirection to files in a non-destructive way

When redirecting the standard output to a file, you probably noticed that it erases the existing content of the file.

Sometimes, it can be quite problematic as you would want to keep the existing content of the file, and just append some changes to the end of the file.

To append content to a file using output redirection, use the “>>” operator rather than the “>” operator.

Given the example we just used before, let’s add a second line to our existing file.

$ echo a second line >> file


Great!

As you can see, the content was appended to the file, rather than overwriting it completely.

c – Output redirection gotchas

When dealing with output redirection, you might be tempted to run a command on a file and redirect the output to that very same file.

Redirecting to the same file

$ echo 'This is a cool butterfly' > file
$ sed 's/butterfly/parrot/g' file > file

What do you expect to see in the test file?

The result is that the file is completely empty.


Why?

By default, when parsing your command, the shell processes redirections before executing the command itself.

It means that it won’t wait for the end of the sed command before opening your output file and preparing to write to it.

Instead, the shell opens your file, erases all the content inside it, and only then runs your sed operation.

As the sed operation sees an empty file (because all the content was erased by the output redirection operation), its output is empty.

As a consequence, nothing is written to the file, and the content is completely empty.

In order to redirect the output to the same file, you may want to use pipes or more advanced commands such as

command … input_file > temp_file  &&  mv temp_file input_file

Protecting a file from being overwritten

In Linux, it is possible to protect files from being overwritten by the “>” operator.

You can protect your files by setting the “noclobber” parameter on the current shell environment.

$ set -o noclobber

It is also possible to restrict output redirection by running

$ set -C

Note: to re-enable output redirection, simply run set +C


As you can see, the file cannot be overwritten when setting this parameter.

If I really want to overwrite the file, I can use the “>|” operator to force it.
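Here is a minimal sketch of the difference, assuming noclobber is set and a file named “file” already exists:

$ echo "junosnotes" > file
bash: file: cannot overwrite existing file
$ echo "junosnotes" >| file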


What is Input Redirection on Linux?

Input redirection is the act of redirecting the input of a process to a given device (or virtual device) so that it starts reading from this device and not from the default one assigned by the Kernel.

a – How does input redirection work?

For instance, when you are opening a terminal, you are interacting with it through your keyboard.

However, there are some cases where you might want to work with the content of a file because you want to programmatically send the content of the file to your command.


To redirect the standard input on Linux, you have to use the “<” operator.

As an example, let’s say that you want to use the content of a file and run a special command on them.

In this case, I am going to use a file containing domains, and the command will be a simple sort command.

In this way, domains will be sorted alphabetically.

With input redirection, I can for example display the content of the file by running the following command

$ cat < domains

If I want to sort those domains, I can redirect the content of the domains file to the standard input of the sort function.

$ sort < domains


With this syntax, the content of the domains file is redirected to the input of the sort function. It is quite different from the following syntax

$ sort domains

Even if the output may be the same, in this case, the sort function takes a file as a parameter.

In the input redirection example, the sort function is called with no parameter.

As a consequence, when no file parameters are provided to the function, the function reads it from the standard input by default.

In this case, it is reading the content of the file provided.

b – Redirecting standard input with a file containing multiple lines

If your file contains multiple lines, you can still redirect the standard input to your command for every single line of your file.


Let’s say for example that you want to have a ping request for every single entry in the domains file.

By default, the ping command expects a single IP or URL to be pinged.

You can, however, redirect the content of your domain’s file to a custom function that will execute a ping function for every entry.

$ ( while read ip; do ping -c 2 $ip; done ) < domains


c – Combining input redirection with output redirection

Now that you know that standard input can be redirected to a command, it is useful to mention that input and output redirection can be done within the same command.

Now that you are performing ping commands, you are getting the ping statistics for every single website on the domains list.

The results are printed on the standard output, which is in this case the terminal.

But what if you wanted to save the results to a file?

This can be achieved by combining input and output redirections on the same command.

$ ( while read ip; do ping -c 2 $ip; done ) < domains > stats.txt

Great! The results were correctly saved to a file and can be analyzed later on by other teams in your company.

d – Discarding standard output completely

In some cases, it might be handy to discard the standard output completely.

It may be because you are not interested in the standard output of a process or because this process is printing too many lines on the standard output.

To discard standard output completely on Linux, redirect the standard output to /dev/null.

Redirecting to /dev/null causes data to be completely discarded and erased.

$ cat file > /dev/null

Note: Redirecting to /dev/null does not erase the content of the file but it only discards the content of the standard output.

What is standard error redirection on Linux?

Finally, after input and output redirection, let’s see how standard error can be redirected.

a – How does standard error redirection work?

Very similarly to what we saw before, error redirection is redirecting errors returned by processes to a defined device on your host.

For example, if I am running a command with bad parameters, what I am seeing on my screen is an error message and it has been processed via the file descriptor responsible for error messages (fd[2]).

Note that there are no trivial ways to differentiate an error message from a standard output message in the terminal, you will have to rely on the programmer sending error messages to the correct file descriptor.


To redirect error output on Linux, use the “2>” operator

$ command 2> file

Let’s use the example of the ping command in order to generate an error message on the terminal.

Now let’s see a version where the error output is redirected to an error file.

Here, I used the “2>” operator to redirect errors to the “error-file” file.

If I were to redirect only the standard output to the file, nothing would be printed to it.

In that case, the error message would still be printed to my terminal and nothing would be added to my “normal-file” output.
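To make this sequence concrete, here is a hedged sketch, assuming a hostname that does not resolve (the host name, the exact error message, and the file names are illustrative):

$ ping doesnotexist.local
ping: doesnotexist.local: Name or service not known

$ ping doesnotexist.local 2> error-file
$ cat error-file
ping: doesnotexist.local: Name or service not known

$ ping doesnotexist.local > normal-file
ping: doesnotexist.local: Name or service not known
$ cat normal-file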

b – Combining standard error with standard output

In some cases, you may want to combine the error messages with the standard output and redirect it to a file.

It can be particularly handy because some programs are not only returning standard messages or error messages but a mix of two.

Let’s take the example of the find command.

If I am running a find command on the root directory without sudo rights, I might be unauthorized to access some directories, like processes that I don’t own for example.


As a consequence, there will be a mix of standard messages (the files owned by my user) and error messages (when trying to access a directory that I don’t own).

In this case, I want to have both outputs stored in a file.

To redirect the standard output as well as the error output to a file, use the “2>&1” syntax with a preceding “>”.

$ find / -user junosnotes > file 2>&1

Alternatively, you can use the “&>” syntax as a shorter way to redirect both the output and the errors.

$ find / -user junosnotes &> file

So what happened here?

When bash sees multiple redirections, it processes them from left to right.

As a consequence, the output of the find function is first redirected to the file.

Next, the second redirection is processed and redirects the standard error to the standard output (which was previously assigned to the file).


What are pipelines on Linux?

Pipelines are a bit different from redirections.

When doing standard input or output redirection, you were essentially overwriting the default input or output to a custom file.

With pipelines, you are not overwriting inputs or outputs, but you are connecting them together.

Pipelines are used on Linux systems to connect processes together, linking standard outputs from one program to the standard input of another.

Multiple processes can be linked together with pipelines (or pipes)


Pipes are heavily used by system administrators in order to create complex queries by combining simple queries together.

One of the most popular examples is probably counting the number of lines in a text file, after applying some custom filters on the content of the file.

Let’s go back to the domains file we created in the previous sections and change the country extensions so that the list also includes .net domains.

Now let’s say that you want to count the number of .com domains in the file.

How would you perform that? By using pipes.

First, you want to filter the results to isolate only the .com domains in the file. Then, you want to pipe the result to the “wc” command in order to count them.

Here is how you would count .com domains in the file.

$ grep .com domains | wc -l

Here is what happened with a diagram in case you still can’t understand it.


Awesome!

Conclusion

In today’s tutorial, you learned what input and output redirection is and how it can be effectively used to perform administrative operations on your Linux system.

You also learned about pipelines (or pipes) that are used to chain commands in order to execute longer and more complex commands on your host.

If you are curious about Linux administration, we have a whole category dedicated to it on JunosNotes, so make sure to check it out!

How To Encrypt Root Filesystem on Linux

As a system administrator, you probably already know how important it is to encrypt your disks.

If your laptop were to be stolen, even a novice hacker would be able to extract the information contained on the disks.

All it takes is a simple USB stick with a LiveCD on it and everything would be stolen.

Luckily for you, there are ways for you to prevent this from happening : by encrypting data stored on your disks.

In this tutorial, we are going to see the steps needed in order to perform a full system encryption. You may find other tutorials online focused on encrypting just a file or home partitions for example.

In this case, we are encrypting the entire system, meaning the entire root partition and the boot folder. Part of the bootloader (the second stage of GRUB) will be encrypted as well.

Ready?

Prerequisites

In order to perform all the operations detailed in this guide, you obviously need to have system administrator rights.

In order to check that this is the case, make sure that you belong to the “sudo“ group (for Debian based distributions) or “wheel“ (on RedHat based ones).
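A quick, illustrative way to check this is the groups command (the username and group list below are only examples):

$ groups
schkn adm cdrom sudo dip plugdev lpadmin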

If you see “sudo” (or “wheel”) in the list of groups, you should be good to go.

Before continuing, it is important for you to know that encrypting disks doesn’t come without any risks.

The process involves formatting your entire disk meaning that you will lose data if you don’t back it up. As a consequence, it might be a good idea for you to backup your files, whether you choose to do it on an external drive or in an online cloud.

If you are not sure about the steps needed to backup your entire system, I recommend that you read the following tutorial that explains it in clear terms.

Now that everything is set, we can begin encrypting our entire system.

Identify your current situation

This tutorial is divided into three parts: one for each scenario that you may be facing.

After identifying your current situation, you can directly navigate to the chapter that you are interested in.

If you want to encrypt a system that already contains unencrypted data, you have two choices:

  • You can add an additional disk to your computer or server and configure it to become the bootable disk: go to part one.
  • You cannot add an additional disk to your computer (a laptop under warranty for example): you will find the information needed in part two.

If you are installing a brand new system, meaning that you install the distribution from scratch, you may encrypt your entire disk directly from the graphical installer. As a consequence, you can go to part three.

Design Hard Disk Layout

Whenever you are creating new partitions, encrypted or not, it is quite important to choose the hard disk design ahead of time.

In this case, we are going to design our disk using an MBR layout: the first 512 bytes of the bootable disk will be reserved for the first stage of GRUB (as well as metadata for our partitions).

The first partition will be an empty partition reserved for systems using EFI (or UEFI) as the booting firmware. If you choose to install Windows 10 in the future, you will have a partition already available for that.

The second partition of our disk will be formatted as a LUKS-LVM partition containing one physical volume (the disk partition itself) as well as one volume group containing two logical volumes: one for the root filesystem and another one for a small swap partition.

As you can see, the second stage of the GRUB will be encrypted too: this is because we chose to have the boot folder stored on the same partition.


Of course, you are not limited to the design provided here, you can add additional logical volumes for your logs for example.

This design will be our roadmap for this tutorial : we are going to start from a brand new disk and implement all the parts together.

Data-at-rest encryption

This tutorial focuses on data-at-rest encryption. As its name states, data-at-rest encryption means that your system is encrypted, i.e. nobody can read from it, when it is resting or powered off.


This encryption is quite useful if your computer were to be stolen, hackers would not be able to read data on the disk unless they know about the passphrase that you are going to choose in the next sections.

However, there would still be a risk that your data is erased forever: having no read access to a disk does not prevent an attacker from simply removing partitions on it.

As a consequence, make sure that you keep a backup of your important files somewhere safe.

Encrypting Root Filesystem on New Disk

As detailed during the introduction, we are going to encrypt the root filesystem from a new disk that does not contain any data at all. This is quite important because the encrypted disk will be formatted in the process.

Head over to the system that you want to encrypt and plug the new disk. First of all, identify your current disk, which is probably named “/dev/sda” and the disk that you just plugged in (probably named “/dev/sdb”).

If you have any doubts about the correspondence between names and disk serials, you can append vendors and serials with the “-o” option of lsblk.

$ lsblk -do +VENDOR,SERIAL


In this case, the disk with data is named “/dev/sda” and the new one is named “/dev/sdb”.

First of all, we need to create the layout we specified in the introduction, meaning one partition that is going to be an EFI one and one LUKS-LVM partition.

Creating Basic Disk Layout

The first step on our journey towards full disk encryption starts with two simple partitions: one EFI (even if we use MBR, in case you want to change in the future) and one for our LVM.

To create new partitions on your disk, use the “fdisk” command and specify the disk to be formatted.

$ sudo fdisk /dev/sdb

As explained in the introduction, the first partition will be a 512 MB one and the other one will take the remaining space on the disk.

Creating Basic Disk Layout create-w95-partition

In the “fdisk” utility, you can create a new partition with the “n” option and specify a size of 512 megabytes with “+512M“.

Make sure to change the partition type to W95 FAT32 using the “t” option and specifying “b” as the type.

Awesome, now that you have your first partition, we are going to create the one we are interested in.

Creating Basic Disk Layout create-second-partition

Creating the second partition is even simpler.

In the fdisk utility, use “n” in order to create a new partition and stick with the defaults, meaning that you can press “Enter” at every step.

When you are done, you can simply press “w” in order to write the changes to disk.
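
To recap, here is roughly what the whole “fdisk” session looks like ; the keystrokes are shown with comments, and the exact prompts may differ slightly depending on your fdisk version :

$ sudo fdisk /dev/sdb
n        # new partition, accept the defaults for number and first sector
+512M    # last sector : make the partition 512 MB large
t        # change the partition type
b        # “b” corresponds to W95 FAT32
n        # new partition, accept the defaults to use the remaining space
w        # write the changes to disk and exit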

Now, executing the “fdisk” command again will give you a good idea of the changes that you performed on the disk.

$ sudo fdisk -l /dev/sdb

Creating Basic Disk Layout fdisk-command-disk

Great!

Your second partition is ready to be formatted so let’s head to it.

Creating LUKS & LVM partitions on disk

In order to encrypt disks, we are going to use LUKS, short for the Linux Unified Key Setup project.

LUKS is a disk encryption specification implemented by several backends available in the Linux kernel.

In this case, we are going to use the “dm-crypt” submodule of the Linux storage stack.

As its name states, “dm-crypt” is part of the device mapper module that aims at creating a layer of abstraction between your physical disks and the way you choose to design your storage stack.

Creating LUKS & LVM partitions on disk dm-crypt

This information is quite important because it means that you can encrypt pretty much every device using the “dm-crypt” backend.

In this case, we are going to encrypt a disk, containing a set of LVM partitions, but you may choose to encrypt a USB memory stick or a floppy disk.

In order to interact with the “dm-crypt” module, we are going to use the “cryptsetup” command.

Obviously, you may need to install it on your server if you don’t have it already.

$ sudo apt-get install cryptsetup

$ which cryptsetup

Creating LUKS & LVM partitions on disk which-cryptsetup

Now that cryptsetup is available on your computer, you can create your first LUKS-formatted partition.

To create a LUKS partition, you are going to use the “cryptsetup” command followed by the “luksFormat” command that formats the specified partition (or disk).

 $ sudo cryptsetup luksFormat --type luks1 /dev/sdb2
Note : so why are we specifying the LUKS1 formatting type? As of January 2021, GRUB (our bootloader) does not support LUKS2 encryption. Make sure to leave a comment if you notice that GRUB now supports LUKS2.

Creating LUKS & LVM partitions on disk cryptsetup-luksformat

As you can see, you are notified that this operation will erase all data stored on the disk. Check the disk that you are formatting one last time, and type “YES” when you are ready.

Right after, you are prompted for a passphrase. LUKS supports two authentication methods : the first one is passphrase-based, essentially a password that you enter on decryption.

LUKS can also use key files. Using a key file, you can for example store the key somewhere on a disk and your system will be able to look for it automatically.

Choose a strong passphrase, enter it again and wait for the operation to complete.

Creating LUKS & LVM partitions on disk cryptsetup-luksformat-2

When you are done, you can check with the “lsblk” command that your partition is now encrypted as a LUKS one.

Awesome! You now have an encrypted partition.

$ lsblk -f

list-encrypted-drives

To check that your partition is correctly formatted, you can use the “cryptsetup” command followed by the “luksDump” option and specify the name of the encrypted device.

$ sudo cryptsetup luksDump /dev/sdb2

cryptsetup-luksdump

Your version should be set to “1” for the “LUKS1” format, and you should see that one of the keyslots below is populated with the key material derived from your passphrase.

Creating Encrypted LVM on disk

Now that your LUKS encrypted partition is ready, you can “open” it. “Opening” an encrypted partition simply means that you are going to access data on the disk.

To open your encrypted device, use the “cryptsetup” command followed by “luksOpen”, the name of the encrypted device and a name.

$ sudo cryptsetup luksOpen <encrypted_device> <name>

cryptsetup-luksOpen

In this case, we chose to name the device “cryptlvm“.

As a consequence, using the “lsblk” command again, you can see that a new device was added to the existing device list. The second partition now contains a device named “cryptlvm” which is your decrypted partition.
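
As an illustration, the relevant part of the “lsblk” output would now look something like this (the sizes shown here are placeholders) :

$ lsblk
sdb             8:16   0    20G  0 disk
├─sdb1          8:17   0   512M  0 part
└─sdb2          8:18   0  19.5G  0 part
  └─cryptlvm  253:0    0  19.5G  0 crypt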

Now that everything is ready, we can start creating our two LVM : one for our root partition and one for swap.

First of all, we are going to create a physical volume for our new disk using the “pvcreate” command.

# Optional, if you don't have LVM commands : sudo apt-get install lvm2

$ sudo pvcreate /dev/mapper/cryptlvm

create-physical-volume

Now that your physical volume is ready, you can use it to create a volume group named “cryptvg“.

$ sudo vgcreate cryptvg /dev/mapper/cryptlvm

vgcreate-command

Now that your volume group is ready, you can create your two logical volumes.

In this case, the root logical volume will be a 13 GB one and the swap volume will take the remaining space. Make sure to modify those numbers for your specific case.

In order to host our root filesystem, we are going to create an EXT4 filesystem on the logical volume.

$ sudo lvcreate -n lvroot -L 13G cryptvg

$ sudo mkfs.ext4 /dev/mapper/cryptvg-lvroot

create-root-logical-volume

Creating the swap partition can be achieved using the same steps, with the “lvcreate” and “mkswap” commands.

$ sudo lvcreate -n lvswap -l 100%FREE cryptvg

$ sudo mkswap /dev/mapper/cryptvg-lvswap

create-swap-logical-volume

Awesome! Now that your partitions are created, it is time for you to transfer your existing root filesystem to the newly created one.

Transfer Entire Filesystem to Encrypted Disk

Before transferring your entire filesystem, it might be a good idea to check that you have enough space on the destination drive.

$ df -h

In order to transfer your entire filesystem to your newly created partition, you are going to use the “rsync” command.

Mount your newly created logical volume and start copying your files and folders recursively to the destination drive.

$ sudo mount /dev/mapper/cryptvg-lvroot /mnt

$ sudo rsync -aAXv / --exclude="mnt" /mnt --progress

This process can take quite some time depending on the amount of data that you have to transfer.

After a while, your entire filesystem should be copied to your encrypted drive. Now that the “/boot” folder is encrypted, you will need to re-install the stage 1 of the GRUB accordingly.

Install and Configure GRUB Bootloader

So, why would you need to re-install and re-configure your GRUB accordingly?

To answer this question, you need to have a basic idea of the way your system boots up when using a BIOS/MBR conventional booting process.

Install and Configure GRUB Bootloader linux-bios-boot-process

As explained in the introduction, GRUB is split into two (sometimes three) parts : GRUB stage 1 and GRUB stage 2. The stage 1 will only look for the location of the stage 2, often located in the “/boot” folder of your filesystem.

The stage 2 is responsible for many tasks : loading the necessary modules, loading the kernel into memory and starting the initramfs process.

As you understood, the stage 2 is encrypted here, so we need to tell the stage 1 (located in the first 512 bytes of your disk) that it needs to be decrypted first.

Re-install GRUB Stage 1 & 2

In order to reinstall the first stage of the GRUB, you first need to enable “cryptomount”, which gives access to encrypted devices in the GRUB environment.

To achieve that, you need to edit the “/etc/default/grub” file and add the “GRUB_ENABLE_CRYPTODISK=y” option.

However, you are currently sitting on the system that you are trying to encrypt. As a consequence, you will need to chroot into your new drive in order to execute the commands properly.

Chroot in Encrypted Drive

To chroot into your encrypted drive, you will have to execute the following commands.

$ sudo mount --bind /dev /mnt/dev
$ sudo mount --bind /run /mnt/run

$ sudo chroot /mnt/

$ sudo mount --types=proc proc /proc
$ sudo mount --types=sysfs sys /sys

Chroot in Encrypted Drive lsblk-chroot

Now that you have executed those commands, you should be in the context of your encrypted drive.

$ vi /etc/default/grub

grub-enable-cryptodisk-1

GRUB_ENABLE_CRYPTODISK=y

As stated in the GRUB documentation, this option will configure the GRUB to look for encrypted devices and add additional commands in order to decrypt them.

Now that the stage 1 is configured, you can install it on your MBR using the grub-install command.

$ grub-install --boot-directory=/boot /dev/sdb
Note : be careful, you need to specify “/dev/sdb” and not “/dev/sdb1”.

grub-install-stage-1

As you probably noticed, when providing no options for the GRUB installation, you have by default an “i386-pc” installation (which is designed for a BIOS-based firmware).

Re-install GRUB Stage 2

Using the steps detailed above, the stage 1 has been updated but we also need to tell the stage 2 that it is dealing with an encrypted disk.

To achieve that, head over to the “/etc/default/grub” and add another line for your GRUB stage 2.

GRUB_CMDLINE_LINUX="cryptdevice=UUID=<encrypted_device_uuid> root=UUID=<root_fs_uuid>"

This is an important line because it tells the second stage of the GRUB where the encrypted drive is and where the root partition is located.

To identify the UUIDs needed, you can use the “lsblk” command with the “-f” option.

$ lsblk -f

lsblk-uuids

Using those UUIDs, we would add the following line to the GRUB configuration file.

GRUB_CMDLINE_LINUX="cryptdevice=UUID=1b9a0045-93d5-4560-a6f7-78c07e1e15c4 root=UUID=dd2bfc7f-3da2-4dc8-b4f0-405a758f548e"

To update your current GRUB installation, you can use the “update-grub2” command in your chrooted environment.

$ sudo update-grub2

update-grub2-command

Now that you have updated your GRUB installation, your GRUB menu (i.e. the stage 2) should be modified and you should see the following content when inspecting the “/boot/grub/grub.cfg” file.

grub-configuration-file

As you can see, the GRUB configuration file was modified and your system is now using “cryptomount” in order to locate the encrypted drive.

For your system to boot properly, you need to check that :

  • You are loading the correct modules such as cryptodisk, luks, lvm and others;
  • The “cryptomount” instruction is correctly set;
  • The kernel is loaded using the “cryptdevice” instruction we just set in the previous section.
  • The UUIDs specified are correct : the “cryptdevice” one points to the LUKS-encrypted partition and the “root” one to the ext4 root filesystem.
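
For illustration, the relevant part of a “menuentry” generated with “GRUB_ENABLE_CRYPTODISK=y” typically contains lines similar to the following ones ; the UUIDs below are the example values used earlier, the kernel path is hypothetical and the exact content depends on your distribution :

insmod cryptodisk
insmod luks
insmod lvm
cryptomount -u 1b9a004593d54560a6f778c07e1e15c4
...
linux /boot/vmlinuz-<version> root=UUID=dd2bfc7f-3da2-4dc8-b4f0-405a758f548e ro cryptdevice=UUID=1b9a0045-93d5-4560-a6f7-78c07e1e15c4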

Modify crypttab and fstab files

One of the first steps of initramfs will be to mount your volumes using the “/etc/crypttab” and “/etc/fstab” files on the filesystem.

As a consequence, and because you are creating new volumes, you may have to modify those files in order to put the correct UUIDs in them.

First of all, head over to the “/etc/crypttab” file (you can create it if it does not exist already) and add the following content :

$ nano /etc/crypttab

# <target name>   <source device>        <key file> <options>
  cryptlvm        UUID=<luks_uuid>       none       luks

Modify crypttab and fstab files crypttab-file

If you are not sure about the UUID of your encrypted device, you can use the “blkid” command to get the information.

$ blkid | grep -i LUKS
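
For example, with the LUKS UUID used earlier in this tutorial, the crypttab entry would look like this :

# <target name>   <source device>                               <key file> <options>
  cryptlvm        UUID=1b9a0045-93d5-4560-a6f7-78c07e1e15c4     none       luks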

Now that the crypttab file is modified, you only need to modify the fstab accordingly.

$ nano /etc/fstab

# <file system>       <mount point>   <type>  <options>             <dump>    <pass>
UUID=<ext4 uuid>      /               ext4    errors=remount-ro     0         1

Again, if you are not sure about the UUID of your ext4 filesystem, you can use the “blkid” command again.

$ blkid | grep -i ext4
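
With the ext4 UUID used earlier, the fstab entry would look like this :

UUID=dd2bfc7f-3da2-4dc8-b4f0-405a758f548e      /               ext4    errors=remount-ro     0         1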

Almost done!

Now that your GRUB and configuration files are correctly configured, we only need to configure the initramfs image.

Re-configure initramfs image

Among all the boot scripts, initramfs will look for the root filesystem you specified in the previous chapter.

However, in order to decrypt the root filesystem, it will need the correct cryptsetup hooks, which are provided by the “cryptsetup-initramfs” package. In your chrooted environment, you can execute the following command :

$ apt-get install cryptsetup-initramfs

In order to include the cryptsetup modules in your initramfs image, make sure to execute the “update-initramfs” command.

$ update-initramfs -u -k all

That’s it!

You have successfully assembled all the needed pieces in order to create a fully encrypted disk on your system. You can now reboot your computer and have a look at your new boot process.

Boot on Encrypted Device

When booting, the first screen that you will see is the first stage of the GRUB trying to decrypt the second stage of the GRUB.

Boot on Encrypted Device grub-stage-1-encrypted

If you see this password prompt, it means that you don’t have any errors in your stage 1 configuration.

Note : be aware that this screen may not use your usual keyboard layout. As a consequence, if your passphrase is rejected, try typing it as if you were using a US keyboard, or an AZERTY one for example.

When providing the correct password, you will be presented with the GRUB menu.

grub-stage-2

If you see this screen, it means that your stage 1 was able to open the stage 2. You can select the “Ubuntu” option and boot on your system.

boot-lock-screen

On the next screen, you are asked to provide the passphrase again.

This is quite normal because your “/boot” folder is encrypted. As a consequence, you need one passphrase in order to unlock the stage 2 and another one to unlock the entire root filesystem.

Luckily, there is a way to avoid that : by having a key file embedded in the initramfs image. For that, ArchLinux contributors wrote an excellent tutorial on the subject.

In this case, we are just going to provide the passphrase and press Enter.

After a while, when the init process is done, you should be presented with the lock screen of your user interface!

Congratulations, you successfully encrypted an entire system on Linux!

lock-screen

Encrypting Root Filesystem on Existing Disk

In some cases, you may have to encrypt an existing disk without the capability of removing one of the disks on your computer. This case may happen if you have a disk under warranty for example.

In this case, the process is quite simple :

  • Make a bootable USB (or removable device) containing an ISO of the distribution of your choice;
  • Use the device in order to boot and log into a LiveCD of your distribution;
  • From the LiveCD, identify the hard disk containing your root distribution and make a backup of it;
  • Mount the primary partition on the folder of your choice and follow the instructions of the previous chapter;

So why do you need to use a LiveCD if you want to encrypt a non-removable disk?

If you were to encrypt your main primary disk, you would have to unmount it first. However, as it hosts the root partition of your system, you cannot unmount it while the system is running ; as a consequence, you have to use a LiveCD.

Encrypting Root Filesystem From Installation Wizard

In some cases, some distributions embed the encryption process right into the installation wizard.

If you are not looking to transfer an existing filesystem from one system to another, you might be tempted to use this option.

Taking Ubuntu 20.04 as an example, the installation process suggests disk encryption in the disk configuration wizard.

If you select this option, you will have a similar setup to the one done in the previous sections. However, most distributions choose not to encrypt the “/boot” folder.

encrypted-system-from-wizard

If you want to encrypt the “/boot” folder, we recommend that you read the first section of this tutorial.

Troubleshooting

As open-source changes constantly, there is a chance that you are not able to boot your system, even if you followed the steps of this tutorial carefully.

As error sources are nearly infinite and specific to every setup, there would be no point in enumerating every single issue that you may encounter.

However, most of the time, it is quite important to know at which step of the boot process the failure happens.

If you see a screen with a “grub rescue” prompt, it probably means that you are stuck on the stage 1, thus that the bootloader was not able to locate the disk containing the second stage.

If you are dropped in an initramfs prompt, it probably means that something went wrong during the init process :

  • Are you sure that you specified the filesystems to mount in the crypttab and fstab files?
  • Are you sure that all the required modules were included in your initramfs image? Aren’t you missing the cryptsetup or lvm modules, for example?

initramfs-screen

Below are some resources that we found interesting during the writing of this tutorial ; they may have some answers to your problems :

  • Encrypting an entire system : a similar tutorial for ArchLinux;
  • Manual System Encryption on Ubuntu : steps used in order to chroot in a root filesystem.

Conclusion

In this tutorial, you learnt how you can encrypt an entire root filesystem, including the “/boot” folder, using the LUKS specification.

You also learnt about the Linux boot process and the different steps that your system goes through in order to launch your operating system.

Achieving full-system encryption is quite lengthy, but it is very rewarding for users who are willing to dig deeper into the Linux and open-source world.

If you are interested in Linux System Administration, make sure to read our other tutorials and to navigate to our dedicated section.

Linux Logging Complete Guide

As a Linux system administrator, inspecting log files is one of the most common tasks that you may have to perform.

Linux logs are crucial : they store important information about some errors that may happen on your system.

They might also store information about who’s trying to access your system, what a specific service is doing, or about a system crash that happened earlier.

As a consequence, knowing how to locate, manipulate and parse log files is definitely a skill that you have to master.

In this tutorial, we are going to unveil everything that there is to know about Linux logging.

You will be presented with the way logging is architectured on Linux systems and how different virtual devices and processes interact together to produce log entries.

We are going to dig deeper into the Syslog protocol and how it transitioned from syslogd (on old systems) to journalctl powered by systemd on recent systems.

Linux Logging Types

When dealing with Linux logging, there are a few basics that you need to understand before typing any commands in the terminal.

On Linux, you have two types of logging mechanisms :

  • Kernel logging: related to errors, warnings or information entries that your kernel may write;
  • User logging: linked to the user space, those log entries are related to processes or services that may run on the host machine.

By splitting logging into two categories, we are essentially unveiling that memory itself is divided into two categories on Linux : user space and kernel space.

Linux Logging Types linux-spaces

Kernel Logging

Let’s start with logging associated with the kernel space, also known as Kernel logging.

On the kernel space, logging is done via the Kernel Ring Buffer.

The kernel ring buffer is a circular buffer that is the first data structure storing log messages when the system boots up.

When you are starting your Linux machine, if log messages are displayed on the screen, those messages are stored in the kernel ring buffer.

Kernel Logging

Kernel logs during boot process

Kernel logging is started before user logging (managed by the syslog daemon or by rsyslog on recent distributions).

The kernel ring buffer, pretty much like any other log file on your system, can be inspected.

In order to open Kernel-related logs on your system, you have to use the “dmesg” command.

Note : you need to execute this command as root or have privileged rights in order to inspect the kernel ring buffer.
$ dmesg

Kernel Logging dmesg

As you can see, from the system boot until the time when you executed the command, the kernel keeps track of all the actions, warnings or errors that may happen in the kernel space.

If your system has trouble detecting or mounting a disk, this is probably where you want to inspect the errors.
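
For example, a quick way to look for disk-related problems is to filter the ring buffer with “grep” ; the “-T” flag, available on recent versions of dmesg, prints human-readable timestamps :

$ sudo dmesg -T | grep -iE "error|fail"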

As you can see, the dmesg command is a pretty nice interface in order to see kernel logs, but how is the dmesg command printing those results back to you?

In order to unveil the different mechanisms used, let’s see which processes and devices take care of Kernel logging.

Kernel Logging internals

As you have probably heard before, on Linux, everything is a file.

If everything is a file, it also means that devices are files.

On Linux, the kernel ring buffer is materialized by a character device file in the /dev directory and it is named kmsg.

$ ls -l /dev/ | grep kmsg

Kernel Logging internals kmsg

If we were to depict the relationship between the kmsg device and the kernel ring buffer, this is how we would represent it.

kernel-logging-internals

As you can see, the kmsg device is an abstraction used in order to read and write to the kernel ring buffer.

You can essentially see it as an entrypoint for user space processes in order to write to the kernel ring buffer.

However, the diagram shown above is incomplete as one special file is used by the kernel in order to dump the kernel log information to a file.

Kernel Logging internals

If we were to summarize it, we would essentially state that the kmsg virtual device acts as an entrypoint for the kernel ring buffer, while the output of this process (the log lines) is printed to the /proc/kmsg file.

This file can be parsed by only one single process, which is most of the time the logging utility used in user space. On some distributions it is syslogd, but on more recent distributions it is rsyslog.

The rsyslog utility has a set of embedded modules that will redirect kernel logs to dedicated files on the file system.

Historically, kernel logs were retrieved by the klogd daemon on older systems, but it has been replaced by rsyslog on most distributions.

Kernel Logging internals klogd

klogd utility running on Debian 4.0 Etch

On one hand, you have logging utilities reading from the ring buffer but you also have user space programs writing to the ring buffer : systemd (with the famous systemd-journal) on recent distributions for example.

Now that you know more about Kernel logging, let’s see how logging is done on the user space.

User side logging with Syslog

Logging on the userspace is quite different from logging on the kernel space.

On the user side, logging is based on the Syslog protocol.

Syslog is used as a standard to produce, forward and collect logs produced on a Linux instance.

Syslog defines severity levels as well as facility levels, helping users gain a greater understanding of the logs produced on their computers.

Logs can later on be analyzed and visualized on servers referred to as Syslog servers.

User side logging with Syslog-card

In short, the Syslog protocol is a protocol used to define how log messages are formatted, sent and received on Unix systems.

Syslog is known for defining the syslog message format, the format that applications need to use in order to send logs.

This format is well-known for defining two important terms : facilities and priorities.

Syslog Facilities Explained

In short, a facility level is used to determine the program or part of the system that produced the logs.

On your Linux system, many different utilities and programs are sending logs. In order to determine which process sent the log in the first place, Syslog defines numbers, facility numbers, that are used by programs to send Syslog logs.

There are 24 different Syslog facilities (numbered from 0 to 23) that are described in the table below.

Numerical Code   Keyword            Facility name
0                kern               Kernel messages
1                user               User-level messages
2                mail               Mail system
3                daemon             System Daemons
4                auth               Security messages
5                syslog             Syslogd messages
6                lpr                Line printer subsystem
7                news               Network news subsystem
8                uucp               UUCP subsystem
9                cron               Clock daemon
10               authpriv           Security messages
11               ftp                FTP daemon
12               ntp                NTP subsystem
13               security           Security log audit
14               console            Console log alerts
15               solaris-cron       Scheduling logs
16-23            local0 to local7   Locally used facilities

Most of those facilities are reserved to system processes (such as the mail server if you have one or the cron utility). Some of them (from the facility number 16 to 23) can be used by custom Syslog client or user programs to send logs.

Syslog Priorities Explained

Syslog severity levels are used to indicate how severe a log event is, and they range from debug and informational messages up to emergency levels.

Similarly to Syslog facility levels, severity levels are divided into numerical categories ranging from 0 to 7, 0 being the most critical emergency level.

Again, here is a table describing all the severity levels available with Syslog:

Value   Severity        Keyword
0       Emergency       emerg
1       Alert           alert
2       Critical        crit
3       Error           err
4       Warning         warning
5       Notice          notice
6       Informational   info
7       Debug           debug

Syslog Architecture

Syslog also defines a couple of technical terms that are used in order to build the architecture of logging systems :

  • Originator : also known as a “Syslog client”, an originator is responsible for sending the Syslog formatted message over the network or to the correct application;
  • Relay : a relay is used in order to forward messages over the network. A relay can also transform the messages in order to enrich them, for example (famous examples include Logstash or fluentd);
  • Collector : also known as “Syslog servers”, collectors are used in order to store, visualize and retrieve logs from multiple applications. The collector can write logs to a wide variety of different outputs : local files, databases or caches.

Syslog Architecture syslog

As you can see, the Syslog protocol follows the client-server architecture we have seen in previous tutorials.

A Syslog client creates messages and sends them to optional local or distant relays, which can forward them further to Syslog servers.

Now that you know how the Syslog protocol is architectured, what about our own Linux system?

Is it following this architecture?

Linux Local Logging Architecture

Logging on a local Linux system follows the exact principles we have described before.

Without further ado, here is the way logging is architectured on a Linux system (on recent distributions) :

Linux Local Logging Architecture linux-logging-2

Following the originator-relay-collector architecture described before, in the case of a local Linux system :

  • Originators are client applications that may embed syslog or journald libraries in order to send logs;
  • No relays are implemented by default locally;
  • Collectors are rsyslog and the journald daemon listening on predefined sockets for incoming logs.

So where are logs stored after being received by the collectors?

Linux Log File Location

On your Linux system, logs are stored in the /var/log directory.

Log files in the /var/log directory are named after the Syslog facilities that we saw earlier, followed by the .log suffix : auth.log, daemon.log, kern.log or dpkg.log.

If you inspected the auth.log file, you would be presented with logs related to authentication and authorization on your Linux system.

Linux Log File Location auth

Similarly, the cron.log file displays information related to the cron service on your system.

However, as you can see from the diagram above, there is a coexistence of two different logging systems on your Linux server : rsyslog and systemd-journal.

Rsyslog and systemd-journal coexistence

Historically, a daemon was responsible for gathering logs from your applications on Linux.

On many old distributions, this task was assigned to a daemon called syslogd but it was replaced in recent distributions by the rsyslog daemon.

When systemd replaced the existing init process on recent distributions, it came with its own way of retrieving and storing logs : systemd-journal.

Now, the two systems coexist, and their coexistence was designed to remain backwards compatible with the way logs used to be architectured in the past.

The main difference between rsyslog and systemd-journal is that rsyslog will persist logs into the log files available at /var/log, while journald will not persist data unless configured to do so.

Journal Log Files Location

As you understood it from the last section, the systemd-journal utility also keeps track of logging activities on your system.

Some applications that are configured as services (an Apache HTTP Server for example) may talk directly to the systemd journal.

The systemd journal stores logs in a centralized way, in the /run/log/journal directory.

The log files are stored as binary files by systemd, so you won’t be able to inspect the files using the usual cat or less commands.

Instead, you want to use the “journalctl” command in order to inspect log files created by systemd-journal.

$ journalctl

There are many different options that you can use with journalctl, but most of the time you want to stick with the “-r” and “-u” options.

In order to see the latest journal entries, use “journalctl” with the “-r” option.

$ journalctl -r

Journal Log Files Location journalctl-r

If you want to see logs related to a specific service, use the “-u” option and specify the name of the service.

$ journalctl -u <service>

For example, in order to see logs related to the SSH service, you would run the following command

$ journalctl -u ssh
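
Those two options can also be combined with a few other common journalctl flags ; for example :

$ journalctl -u ssh -r              # latest SSH entries first
$ journalctl -f                     # follow new entries as they are written
$ journalctl -b                     # entries from the current boot only
$ journalctl --since "1 hour ago"   # entries from the last hour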

Now that you have seen how you can read log files, let’s see how you can easily configure your logging utilities.

Linux Logging Configuration

As you probably understood from the previous sections, Linux logging is based on two important components : rsyslog and systemd-journal.

Each one of those utilities has its own configuration file and we are going to see in the following chapters how they can be configured.

Systemd journal configuration

The configuration files for the systemd journal are located in the /etc/systemd directory.

$ sudo vi /etc/systemd/journald.conf

The file named “journald.conf” is used in order to configure the journal daemon on recent distributions.

One of the most important options in the journal configuration is the “Storage” parameter.

As specified before, journal files are not persisted on your system by default and they will be lost on the next restart.

To make your journal logs persistent, make sure to modify this parameter to “persistent” and to restart your systemd journal daemon.
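
In practice, once uncommented, the relevant lines of the “/etc/systemd/journald.conf” file would look like this :

[Journal]
Storage=persistent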

Systemd journal configuration persistent

To restart the journal daemon, use the “systemctl” command with the “restart” option and specify the name of the service.

$ sudo systemctl restart systemd-journald

As a consequence, journal logs will be stored in the /var/log/journal directory, next to the rsyslog log files.

$ ls -l /var/log/journal

Systemd journal configuration var-log-journal

If you are curious about the systemd-journal configuration, make sure to read the documentation provided by FreeDesktop.

Rsyslog configuration

On the other hand, the rsyslog service can be configured via the /etc/rsyslog.conf configuration file.

$ sudo vi /etc/rsyslog.conf

As specified before, rsyslog is essentially a Syslog collector, but the main concept that you have to understand is that rsyslog works with modules.

Rsyslog configuration rsyslog-card

Its modular architecture provides plugins such as native ways to transfer logs to a file, a shell, a database or sockets.

Working with rsyslog, there are two main sections that are worth your attention : modules and rules.

Rsyslog Modules

By default, two modules are enabled on your system : imuxsock (listening on the syslog socket) and imjournal (essentially forwarding journal logs to rsyslog).

Note : the imklog (responsible for gathering Kernel logs) might be also activated.

Rsyslog configuration modules-rsyslog
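
In a typical “/etc/rsyslog.conf” file, loading those modules looks similar to the following lines, shown here in the newer RainerScript syntax (older configurations use “$ModLoad” directives instead) :

module(load="imuxsock")   # provides support for local system logging (e.g. via the logger command)
module(load="imjournal")  # provides access to the systemd journal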

Rsyslog Rules

The rules section of rsyslog is probably the most important one.

On rsyslog (you can find the same principles on older distributions running syslogd), the rules section defines which logs should be stored on your file system depending on their facility and priority.

As an example, let’s take the following rsyslog configuration file.

Rsyslog Rules rules-rsyslog

The first column describes the rules applied : on the left side of the dot, you define the facility and on the right side the severity.

Rsyslog Rules rsyslog-rules

A wildcard symbol “*” means that it is working for all severities.
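
As an illustration, rules similar to the following ones can typically be found in a Debian-style rsyslog configuration ; the dash before a file path asks rsyslog not to sync the file after every write :

auth,authpriv.*    /var/log/auth.log
kern.*             -/var/log/kern.log
mail.err           /var/log/mail.err
*.emerg            :omusrmsg:*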

As a consequence, if you want to tweak your logging configuration, for example because you are only interested in specific severities, this is the file you would modify.

Linux Logs Monitoring Utilities

In the previous section, we have seen how you can easily configure your logging utilities, but what utilities can you use in order to read your Linux logs easily?

The easiest way to read and monitor your Linux logs is to use the tail command with the “-f” option for follow.

$ tail -f <file>

For example, in order to read the logs written in the auth.log file, you would run the following command.

$ tail -f /var/log/auth.log

Another great way of reading Linux logs is to use graphical applications if you are running a Linux desktop environment.

The “Logs” application is a graphical application designed to list application and system logs that may be stored in various log files (either by rsyslog or journald).

Linux Logs Monitoring Utilities logs-application

Linux Logging Utilities

Now that you have seen how logging can be configured on a Linux system, let’s see a couple of utilities that you can use in case you want to log messages.

Using logger

The logger utility is probably one of the simplest log clients to use.

Logger is used in order to send log messages to the system log and it can be executed using the following syntax.

$ logger <options> <message>

Let’s say for example that you want to send an emergency message from the auth facility to your rsyslog utility, you would run the following command.

$ logger -p auth.emerg "Somebody tried to connect to the system"

Now if you were to inspect the /var/log/auth.log file, you would be able to find the message you just logged to the rsyslog server.

$ tail -n 10 /var/log/auth.log | grep --color connect

Linux Logging Utilities var-log-auth

The logger utility is very useful when used in Bash scripts, for example.
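
As an example, here is a small hypothetical backup script that reports its status to Syslog ; the “-t” option sets the tag, and “backup-script” is simply an arbitrary name chosen for this sketch :

#!/bin/bash
# Hypothetical backup script logging its result to Syslog
if tar -czf /tmp/home-backup.tar.gz /home 2>/dev/null; then
    logger -t backup-script -p local0.info "Backup completed successfully"
else
    logger -t backup-script -p local0.err "Backup failed"
fi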

But what if you wanted to log files using the systemd-journal?

Using systemd-cat

In order to send messages to the systemd journal, you have to use the “systemd-cat” command and specify the command that you want to run.

$ systemd-cat <command> <arguments>

If you want to send the output of the “ls -l” command to the journal, you would write the following command

$ systemd-cat ls -l

Using systemd-cat journalctl-2

It is also possible to send “plain text” logs to the journal by piping the echo command to the systemd-cat utility.

$ echo "This is a message to journald" | systemd-cat

Using wall

The wall command is not related directly to logging utilities but it can be quite useful for Linux system administration.

The wall command is used in order to send messages to all logged-in users.

$ wall -n <message>

If you were for example to write a message to all logged-in users to notify them about the next server reboot, you would run the following command.

$ wall -n "Server reboot in five minutes, close all important applications"

Using wall wall-message

Conclusion

In this tutorial, you learnt more about Linux logging : how it is architectured and how different logging components (namely rsyslog and journald) interact together.

You learnt more about the Syslog protocol and how collectors can be configured in order to log specific events on your system.

Linux logging is a wide topic and there are many more topics for you to explore on the subject.

Did you know that you can build centralized logging systems in order to monitor logs on multiple machines?

If you are interested in centralized logging, make sure to read our guide!

Also, if you are passionate about Linux system administration, we have a complete section dedicated to it on the website, so make sure to check it out!