What is Linux?

Introduction

This article will include:

  1. Introduction to GNU and Linux
  2. Illustrations of Linux in a single diagram (main components)

Linux || GNU/Linux

An operating system (OS) is the software interface between a user and the computer hardware. It is an abstraction layer that lets userspace programs (web browser, text editor, media player) run without worrying about how to interact with the hardware (keyboard, screen, CPU, RAM). The Linux kernel is most often used in combination with the GNU operating system. Thus, the GNU/Linux operating system consists of the Linux kernel (the core of the operating system) and GNU software packages.

  1. Computer resources such as physical memory, CPU execution cycles, devices/hardware and files are finite and shared. The Linux kernel is the core of the operating system. It manages and allocates these resources (CPU scheduling, memory management, process management, filesystem management, device management, networking, etc.). Perhaps its most important role is allowing multiple user programs to run on the same operating system without conflict, even though they share finite system resources.

  2. GNU provides the rest of the operating system:
    • Toolchain: the tools used for developing applications and the operating system itself. End users rarely use the toolchain directly; these are development tools, or they are used indirectly by programs on your system. It includes the GNU Compiler Collection (gcc), the GNU C library (glibc), and GNU Binutils (binary utilities - assembler for assembly programs, linker for object files, ...).
    • Base system: a set of software packages commonly used in the GNU/Linux operating system. These are utilities for managing your programs and files. They are command-line programs, but there are graphical applications built on top of them for users who do not wish to work on the command line. This includes bash (the command shell, which interprets the commands you type), GRUB (bootloader for booting the OS), coreutils (core utilities for using the OS - cat, ls, cp, rm, ...), and other utilities (tar, grep, sed, ...).
  3. And then there are other software packages, like web browsers and text editors. These are the user software programs. Perhaps surprising to some, this also includes your graphical user interface (GUI), which ultimately is just a set of packages. You can change your desktop environment/window manager easily in Linux!
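To make the idea that commands are just programs stored as files more concrete, here is a minimal sketch that asks where a few of the utilities named above live on disk. The helper name locate is my own invention for illustration; it only wraps Python's standard shutil.which, and the paths it prints will differ between distributions.

```python
import shutil
from typing import Optional


def locate(command: str, search_path: Optional[str] = None) -> Optional[str]:
    """Return the file that would run for `command`, or None if not found.

    `search_path` is a PATH-style string; None means "use the PATH
    environment variable", which is also what the shell consults.
    """
    return shutil.which(command, path=search_path)


if __name__ == "__main__":
    # The result depends on your distribution and on what is installed.
    for name in ("ls", "cat", "grep", "tar"):
        print(f"{name:5} -> {locate(name)}")
```

Running it on your own system should show that each command resolves to an ordinary executable file somewhere in the filesystem.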

Each distribution of GNU/Linux (Ubuntu, Debian, Fedora, Arch, etc.) is a bundle of packages that can be 'distributed' as an operating system, with slight differences in the kernel, GNU packages, and other software packages. To most users, the main difference will be the desktop environment and programs shipped with the distribution. This is why different distributions can look so different from one another.
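Many modern distributions identify themselves in a small text file, /etc/os-release, made of KEY=value lines. The sketch below is a simplified parser for that format (it ignores escape sequences inside quoted values). The function name parse_os_release and the SAMPLE text are made up for illustration; the sample is used only if the real file cannot be read.

```python
def parse_os_release(text: str) -> dict[str, str]:
    """Parse the KEY=value lines of an os-release style file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key] = value
    return fields


# Made-up sample content, for illustration only.
SAMPLE = '''NAME="Example Linux"
ID=example
PRETTY_NAME="Example Linux 1.0"
'''

if __name__ == "__main__":
    try:
        with open("/etc/os-release", encoding="utf-8") as f:
            info = parse_os_release(f.read())
    except OSError:
        info = parse_os_release(SAMPLE)
    print(info.get("PRETTY_NAME", "unknown"))
```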

"Everything in Linux is a file"

This is probably the most important thing to keep in mind for the rest of this article. *Almost* everything is a file: the bootloader executable, the compressed kernel image, the files in the filesystem hierarchy, the system manager and other software. This gives users and administrators a great degree of flexibility. Almost everything can be changed or added simply by installing the desired files (bundled into packages) to the correct locations: bootloader, kernel (yes, even the kernel!), desktop environment, network manager, command shell, browser, and more.
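As a small illustration of how far this idea goes, the sketch below asks the kernel what kind of file a given path is. Even a device such as /dev/null shows up as a path in the filesystem. The function name describe is my own; it relies only on os.stat and the stat module from Python's standard library, and it assumes a Unix-like system.

```python
import os
import stat


def describe(path: str) -> str:
    """Return a short label for the kind of file found at `path`."""
    mode = os.stat(path).st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISFIFO(mode):
        return "named pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


if __name__ == "__main__":
    for path in ("/", "/etc/hostname", "/dev/null"):
        try:
            print(f"{path:15} {describe(path)}")
        except OSError as err:
            # Not every system has every one of these paths.
            print(f"{path:15} unavailable ({err.strerror})")
```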

When your system is powered off, the operating system has to reside somewhere... yes, it resides on your disk storage (SSD/HDD). The Linux operating system is stored as a filesystem containing executable binary files, libraries, and configuration files. Once again, everything is a file! During the boot process, the kernel mounts this filesystem and executes the init process (AKA the first process, ancestor of all processes, system manager), which runs the init scripts and starts up other services and processes. The operating system is loaded piece by piece into RAM until 'user space' is reached and you can launch your user programs.
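You can check which program is acting as the init process on a running system, because the kernel exposes process information as files under /proc. This is a minimal sketch assuming a Linux system with /proc mounted; the function name init_process_name is hypothetical. Inside a container, process 1 is often an ordinary program rather than a full system manager.

```python
from pathlib import Path
from typing import Optional


def init_process_name(proc_root: str = "/proc") -> Optional[str]:
    """Return the command name of process 1, or None if it cannot be read."""
    comm = Path(proc_root) / "1" / "comm"
    try:
        return comm.read_text(encoding="utf-8").strip()
    except OSError:
        # No /proc here (not Linux), or no permission to read it.
        return None


if __name__ == "__main__":
    print("init process:", init_process_name())
```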


Linux in a single diagram - 3 main components

  1. Bootloader
  2. Kernel
  3. Filesystem hierarchy (directory structure and files)

There are two images below. The first is from my article on the UEFI boot process, where I explain the steps of booting, from power on to system initialization. It is fine if you do not understand the boot process; you may read that later; the article also explains some of the terms used below in greater detail. Just remember that everything is a file, and try to identify in the image below the (1) bootloader, (2) kernel, and (3) filesystem structure.

The boot process can be summarized as:

  1. Power on
  2. UEFI firmware (software stored on the motherboard) selects a boot menu entry
  3. Execute Bootloader
  4. Search /boot filesystem for kernel
  5. Bootloader loads Kernel
  6. Mount root filesystem, start initialization scripts

Try to identify each step in the diagram below!

This next image presents the components of Linux in a single diagram. It provides an idea of where the bootloader, kernel, and filesystem are located. The programs on your system are just files - executables (in /bin), libraries (in /lib), configuration files (in /etc) - "scattered" across the different directories in the filesystem.

Identify in the diagram each of the 3 main components of a Linux Operating System (OS):

  1. Bootloader
  2. Kernel
  3. Filesystem hierarchy

1. Bootloader

During the boot process, the bootloader, which is stored on a specific partition (the EFI system partition) of a storage disk, is executed. The bootloader then searches for the kernel and loads it into RAM. This is described in more detail in my article on the UEFI boot process.

Thus, the bootloader is the middleman that passes control from the UEFI firmware located on the motherboard to the kernel, the core of the operating system.
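If you are curious whether your own system was started by UEFI firmware, the kernel leaves a trace: on Linux, the directory /sys/firmware/efi exists only when the system was booted in UEFI mode. The sketch below checks for it. The function name booted_with_uefi is my own, and the check assumes /sys is mounted in the usual place.

```python
from pathlib import Path


def booted_with_uefi(sys_root: str = "/sys") -> bool:
    """Return True if the running Linux system appears to have booted via UEFI.

    The kernel exposes <sys_root>/firmware/efi only when it was started
    by UEFI firmware; on legacy BIOS boots the directory is absent.
    """
    return (Path(sys_root) / "firmware" / "efi").is_dir()


if __name__ == "__main__":
    mode = "UEFI" if booted_with_uefi() else "legacy BIOS (or not detectable)"
    print("Boot mode:", mode)
```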

2. Kernel

Your CPU executes instructions, allowing programs to run. CPUs may operate in different modes, commonly two separate modes: kernel mode and user mode. These are protection modes with differing restrictions on the types of allowed operations. These CPU modes are commonly enforced at the hardware level.

The kernel is the core of the operating system, and operates in the privileged kernel mode, where it can perform unrestricted device/hardware management, CPU instructions and memory access.

User processes, on the other hand, operate in user mode. They cannot interact directly with hardware, are restricted to a subset of CPU instructions, and may only access memory that is marked as being in user space.
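The split between the two modes is visible from an ordinary program: the kernel keeps separate counters for the CPU time a process spends in user mode and in kernel mode. The sketch below reads them with os.times from Python's standard library. The helper names cpu_time_split, compute and ask_kernel are made up for illustration, and the exact numbers will vary from machine to machine.

```python
import os


def cpu_time_split(work):
    """Run work() and return (user_seconds, system_seconds) it consumed."""
    before = os.times()
    work()
    after = os.times()
    return after.user - before.user, after.system - before.system


def compute():
    # Pure arithmetic: runs almost entirely in user mode.
    sum(i * i for i in range(200_000))


def ask_kernel():
    # Each os.stat call asks the kernel for file metadata,
    # so part of the time is spent in kernel mode.
    for _ in range(20_000):
        os.stat("/")


if __name__ == "__main__":
    for job in (compute, ask_kernel):
        user, system = cpu_time_split(job)
        print(f"{job.__name__:10} user={user:.3f}s kernel={system:.3f}s")
```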

As mentioned, kernel space sits between user space and the hardware. As such, user processes have to perform system calls (requests to the kernel to perform operations on their behalf) for certain types of operations, such as reading from and writing to files, or creating and communicating with other processes.
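Here is a minimal sketch of system calls in action. Python's os module offers thin wrappers around several of them, so each call below ends up as a request to the kernel. The function name pipe_round_trip is hypothetical, and the example assumes a short message that fits in the kernel's pipe buffer (a very large one would block on the write).

```python
import os


def pipe_round_trip(message: bytes) -> bytes:
    """Send `message` through a kernel pipe and read it back."""
    read_fd, write_fd = os.pipe()          # pipe system call: kernel creates the pipe
    try:
        os.write(write_fd, message)        # write system call: data goes into the kernel
        return os.read(read_fd, len(message))  # read system call: data comes back out
    finally:
        os.close(read_fd)                  # close system call: release both ends
        os.close(write_fd)


if __name__ == "__main__":
    print(pipe_round_trip(b"hello from user space"))
```

The program never touches any hardware or kernel memory itself; it only asks the kernel to create the pipe and move the bytes.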

The kernel plays the central role in the operating system: managing and allocating system resources. This includes deciding when to schedule processes for CPU execution; allocating physical memory amongst processes; managing access to files and directories; creating, executing and terminating processes; accessing hardware/devices; handling network communication for user processes; and providing an interface for user processes to perform system calls (system call API/interface). Perhaps I will write an article with illustrations on the kernel's role soon :)
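Process creation and termination can be observed directly. In this sketch, assuming a Unix-like system where os.fork is available, the program asks the kernel to create a child process, the child terminates with a chosen exit code, and the parent asks the kernel to report how the child ended. The function name run_child is my own.

```python
import os


def run_child(exit_code: int) -> int:
    """Create a child process that exits with `exit_code`; return what the parent observes."""
    pid = os.fork()                 # kernel duplicates the current process
    if pid == 0:
        # We are in the child: terminate immediately with the chosen code.
        os._exit(exit_code)
    # We are in the parent: wait for the kernel to report the child's end.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("child exited with code", run_child(7))
```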

3. Filesystem hierarchy

The image above is from Wikipedia - Unix Filesystem as of 21 July 2020. The Filesystem Hierarchy Standard (FHS) by the Linux Foundation provides guidelines for filesystem structure. Most Linux distributions follow it, with some deviations, allowing files to be placed in a predictable location following the standard. The image below is from the FHS page mentioned earlier describing some of the common directories. You can skim it for fun :)

Each Linux distribution has its own package manager that makes it very easy to install, update, and remove packages. The package manager also installs, updates, or removes the packages that those packages require (resolving dependencies).
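Dependency resolution can be pictured as ordering packages so that everything a package needs is installed before the package itself. The sketch below is a toy model of that idea, not how any particular package manager is implemented. The package names are invented, and it uses graphlib from Python's standard library (Python 3.9 or newer).

```python
from graphlib import TopologicalSorter


def install_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the packages ordered so that dependencies come first.

    `dependencies` maps each package to the set of packages it requires.
    A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical packages, for illustration only.
EXAMPLE = {
    "browser": {"gui-toolkit", "libc"},
    "gui-toolkit": {"libc"},
    "libc": set(),
}

if __name__ == "__main__":
    print(" -> ".join(install_order(EXAMPLE)))
```

Real package managers also have to handle versions, conflicts, and optional dependencies, which this sketch leaves out.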

One side effect of this filesystem structure is that when installing a software package, its files will be "scattered" across the filesystem hierarchy. It also means that if a user is searching for a configuration file (or other particular type of file), it will likely be located in a predictable directory. Programs also depend on this predictable hierarchy; they may depend on configuration files or libraries. The filesystem should be neat, and predictable. Software programs are created by developers from around the world. Your system will have hundreds of software programs, and there needs to be a logical hierarchy of files and directories.

Directory                      Purpose
/usr/bin                       binary executable files
/etc                           configuration files
/usr/lib                       libraries
/usr/share/application_name    data files
/usr/share/man                 manual pages

A package manager will download your desired software package and install its files to the correct directories. Software packages are not required to install to all of these directories (e.g. they may not have libraries or man pages). Different distributions may also differ in their filesystem structure. Here is a small subset of the files installed by the Firejail package on the Arch Linux distribution:

  • /usr/bin/firejail
  • /etc/firejail/chromium.profile
  • /usr/lib/firejail/seccomp
  • /usr/share/doc/firejail/README
  • /usr/share/man/man1/firejail.1.gz
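Because the hierarchy is predictable, a program can guess what a file is for just from its path. The sketch below encodes the directory table from earlier and applies it to the Firejail files listed above. The names FHS_PURPOSE and purpose_of are made up for illustration, and treating everything under /usr/share (other than man pages) as "data files" is my own simplification of the table.

```python
# Directory prefixes and their purposes, taken from the table above.
# Simplification: anything else under /usr/share counts as "data files".
FHS_PURPOSE = {
    "/usr/bin": "binary executable files",
    "/etc": "configuration files",
    "/usr/lib": "libraries",
    "/usr/share/man": "manual pages",
    "/usr/share": "data files",
}


def purpose_of(path: str) -> str:
    """Return the purpose of the most specific known directory containing `path`."""
    best = ""
    for prefix in FHS_PURPOSE:
        inside = path == prefix or path.startswith(prefix + "/")
        # Prefer the longest matching prefix, e.g. /usr/share/man over /usr/share.
        if inside and len(prefix) > len(best):
            best = prefix
    return FHS_PURPOSE.get(best, "unknown")


FIREJAIL_FILES = [
    "/usr/bin/firejail",
    "/etc/firejail/chromium.profile",
    "/usr/lib/firejail/seccomp",
    "/usr/share/doc/firejail/README",
    "/usr/share/man/man1/firejail.1.gz",
]

if __name__ == "__main__":
    for path in FIREJAIL_FILES:
        print(f"{path:40} {purpose_of(path)}")
```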

Conclusion

GNU software packages (toolchain and base system) and the Linux kernel come together to form the GNU/Linux operating system. A Linux distribution is a means of distributing an operating system, with a filesystem and software packages, to end users. The Linux filesystem hierarchy resides on your storage disk, including binary executables, libraries, and configuration files.

The 3 main components of a GNU/Linux operating system are: bootloader, kernel, filesystem hierarchy.

Each Linux distribution has its own package manager that makes it very easy to install, update, and remove packages. The package manager also installs, updates, or removes the packages that those packages require (resolving dependencies). Installed files will be "scattered" across the filesystem to predictable directories.


References

  1. GNU - What's in a Name
  2. GNU - Linux and the GNU System
  3. ArchWiki - GNU
  4. ArchWiki - Core utilities
  5. Red Hat - What is the Linux kernel?
  6. Wikipedia - CPU modes
  7. Wikipedia - Protection ring
  8. man-pages - Linux system calls
  9. Gentoo Wiki - Users and the Linux file system
  10. Wikipedia - Unix filesystem
  11. The Linux Foundation - Filesystem Hierarchy Standard

Final Notes

These guides are targeted mostly at newcomers to Linux. The extensive use of illustrations is something I find most other guides lack. I would appreciate any feedback, and corrections if I have made any mistakes. Apologies in advance if I have!

Email:
sky100aw@gmail.com