Chapter 1 – Introduction to Linux

1.1 Introduction

In this chapter we will explore the evolution of Linux® and popular operating systems. We will also discuss the considerations for choosing an operating system.

Linux Professional Institute offers a roadmap to IT success. Build confidence in your Linux skills and unlock opportunities with Linux Essentials. Get to the top of the resume stack, invest in your skills, and become a server professional with the LPIC-1 certification. Earn more respect and money, show you're ready for new projects, and become a network professional with the LPIC-2 certification. Achieve long-term career growth and become an Enterprise Professional with the LPIC-3 certification.

Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.

1.2 Linux Evolution and Popular Operating Systems

The definition of the word Linux depends on the context in which it is used. Strictly speaking, Linux means the kernel of the system, which is the central controller of everything that happens on the computer (more on this later). People who say their computer “runs Linux” usually refer to the kernel and the suite of tools that come with it (called the distribution). If you have “Linux experience”, you are most likely talking about the programs themselves, though depending on the context, you might be talking about knowing how to fine-tune the kernel. Each of these components will be investigated so that you understand exactly what role each plays.

Further complicating things is the term UNIX. UNIX was originally an operating system developed at AT&T Bell Labs in the 1970s. It was modified and forked (that is, people modified it and those modifications served as the basis for other systems), so that today there are many different variants of UNIX. However, UNIX is now both a trademark and a specification, owned by an industry consortium called the Open Group. Only software that has been certified by the Open Group may call itself UNIX. Despite adopting most, if not all, of the requirements of the UNIX specification, Linux has not been certified, so Linux really isn’t UNIX! It’s just… UNIX-like.

1.2.1 Role of the Kernel

The kernel of the operating system is like an air traffic controller at an airport. The kernel dictates which program gets which pieces of memory, it starts and kills programs, and it handles displaying text on a monitor. When an application needs to write to disk, it must ask the operating system to do it. If two applications ask for the same resource, the kernel decides who gets it, and in some cases, kills off one of the applications in order to save the rest of the system.

The kernel also handles switching between applications. A computer has a small number of CPUs and a finite amount of memory. The kernel takes care of unloading one task and loading a new task if there are more tasks than CPUs. When the current task has run for a sufficient amount of time, the kernel pauses it so that another may run. This is called pre-emptive multitasking. Multitasking means that the computer is doing several tasks at once, and pre-emptive means that the kernel, not the task, decides when to switch focus between tasks. With the tasks rapidly switching, it appears that the computer is doing many things at once.

Each application may think it has a large block of memory on the system, but it is the kernel that maintains this illusion, remapping smaller blocks of memory, sharing blocks of memory with other applications, or even swapping out blocks that haven’t been touched to disk.
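The kernel's bookkeeping of memory and swap space can be glimpsed from the command line. The following is a minimal sketch, assuming a Linux system where the /proc file system is mounted (the usual case, though not guaranteed in every environment); the variable names are chosen for illustration only.

```shell
# Print how much physical memory and swap space the kernel is managing.
# Assumes a Linux system with /proc mounted; values are reported in kB.
meminfo=/proc/meminfo
if [ -r "$meminfo" ]; then
    summary=$(grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree):' "$meminfo")
    echo "$summary"
else
    summary=""
    echo "No $meminfo here; this system exposes memory statistics another way."
fi
```

The swap figures correspond to the disk space the kernel can use for blocks of memory that have not been touched recently.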

When the computer starts up it loads a small piece of code called a boot loader. The boot loader’s job is to load the kernel and get it started. If you are more familiar with operating systems such as Microsoft Windows or Apple’s OS X, you probably never see the boot loader, but in the UNIX world it’s usually visible so that you can tweak the way your computer boots.

The boot loader loads the Linux kernel, and then transfers control. Linux then continues with running the programs necessary to make the computer useful, such as connecting to the network or starting a web server.
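Once the system is up, the running kernel can be asked to identify itself. This short sketch uses the standard uname command; the variable names are illustrative, and the release string will differ from one system to the next.

```shell
# Ask the running kernel to identify itself.
kernel_name=$(uname -s)      # kernel name, for example "Linux"
kernel_release=$(uname -r)   # kernel release string; varies per system
echo "Kernel: $kernel_name, release: $kernel_release"
```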

1.2.2 Applications

Like an air traffic controller, the kernel is not useful without something to control. If the kernel is the tower, the applications are the airplanes. Applications make requests to the kernel and receive resources, such as memory, CPU, and disk, in return. The kernel also abstracts the complicated details away from the application. The application doesn’t know if a block of disk is on a solid-state drive from manufacturer A, a spinning metal hard drive from manufacturer B, or even a network file share. Applications just follow the kernel’s Application Programming Interface (API) and in return don’t have to worry about the implementation details.
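The abstraction described above can be seen from the shell: the same commands store and retrieve data no matter what kind of storage sits underneath. This is a sketch only, assuming the common mktemp utility is installed; the file name it produces and the variable names are not significant.

```shell
# Write and read a file using the same commands regardless of what
# storage device or file system sits underneath.
workfile=$(mktemp)                 # temporary file; location depends on the system
echo "stored by the kernel" > "$workfile"
contents=$(cat "$workfile")
echo "$contents"

# Only when we ask explicitly do we learn which device backs the file.
df -P "$workfile"

rm -f "$workfile"
```

Neither echo nor cat needed to know whether the file lives on a solid-state drive, a hard disk, or a network share; the kernel handled that detail.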

When we, as users, think of applications, we tend to think of word processors, web browsers, and email clients. The kernel doesn’t care if it is running something that’s user facing, a network service that talks to a remote computer, or an internal task. So, from this we get an abstraction called a process. A process is just one task that is loaded and tracked by the kernel. An application may even need multiple processes to function, so the kernel takes care of running the processes, starting and stopping them as requested, and handing out system resources.
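To make the idea of a process concrete, the sketch below starts a task, checks that the kernel is tracking it, and then asks the kernel to stop it. It assumes a POSIX-style shell; the ps step is skipped if that utility is not installed, and the variable names are illustrative.

```shell
# Start a long-running task in the background; the kernel tracks it as a process.
sleep 30 &
pid=$!                       # process ID the kernel assigned to the task
echo "Started process $pid"

# kill -0 sends no signal; it only checks that the kernel still tracks the process.
if kill -0 "$pid" 2>/dev/null; then alive=yes; else alive=no; fi
echo "Tracked by the kernel: $alive"

# Show the kernel's record of the process, if the ps utility is installed.
if command -v ps >/dev/null 2>&1; then
    ps -p "$pid" -o pid= -o comm=
fi

# Ask the kernel to stop the process, then collect its exit status.
kill "$pid"
status=0
wait "$pid" 2>/dev/null || status=$?
echo "Process ended with status $status"
```

A non-zero status here indicates that the process was stopped by a signal rather than finishing on its own.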

1.2.3 Role of Open Source

Linux started out in 1991 as a hobby project by Linus Torvalds. He made the source freely available and others joined in to shape this fledgling operating system. His was not the first system to be developed by a group, but since it was a built-from-scratch project, early adopters had the ability to influence the project’s direction and to make sure mistakes from other UNIXes were not repeated.

Software projects take the form of source code, which is a human-readable set of computer instructions. The source code may be written in any of hundreds of different languages; Linux just happens to be written in C, which is a language that shares history with the original UNIX.

Source code is not understood directly by the computer, so it must be compiled into machine instructions by a compiler. The compiler gathers all of the source files and generates something that can be run on the computer, such as the Linux kernel.

Historically, most software has been issued under a closed-source license, meaning that you get the right to use the machine code, but cannot see the source code. Often the license specifically says that you will not attempt to reverse engineer the machine code back to source code to figure out what it does!

Open source takes a source-centric view of software. The open source philosophy is that you have a right to obtain the software, and to modify it for your own use. Linux adopted this philosophy to great success. People took the source, made changes, and shared them back with the rest of the group.

Alongside this was the GNU project (GNU’s Not UNIX). While GNU was building its own operating system, it was far more effective at building the tools that go along with a UNIX operating system, such as the compilers and user interfaces. The source was all freely available, so Linux was able to target those tools and provide a complete system. As such, most of the tools that are part of the Linux system come from GNU.

There are many different variants on open source, and those will be examined in a later chapter. All agree that you should have access to the source code, but they differ in how you can, or in some cases, must, redistribute changes.

1.2.4 Linux Distributions

Take Linux and the GNU tools, add some more user facing applications like an email client, and you have a full Linux system. People started bundling all this software into a distribution almost as soon as Linux became usable. The distribution takes care of setting up the storage, installing the kernel, and installing the rest of the software. The full featured distributions also include tools to manage the system and a package manager to help you add and remove software after the installation is complete.

Like UNIX, there are many different flavors of distributions. These days, there are distributions that focus on running servers, desktops, or even industry specific tools like electronics design or statistical computing. The major players in the market can be traced back to either Red Hat or Debian. The most visible difference is the package manager, though you will find other differences on everything from file locations to political philosophies.

Red Hat started out as a simple distribution that introduced the Red Hat Package Manager (RPM). The developer eventually formed a company around it, which tried to commercialize a Linux desktop for business. Over time, Red Hat started to focus more on server applications such as web and file serving, and released Red Hat Enterprise Linux, which was a paid service on a long release cycle. The release cycle dictates how often software is upgraded. A business may value stability and want long release cycles, whereas a hobbyist or a startup may want the latest software and opt for a shorter release cycle. To satisfy the latter group, Red Hat sponsors the Fedora Project, which makes a personal desktop comprising the latest software but still built on the same foundations as the enterprise version.

Because everything in Red Hat Enterprise Linux (RHEL) is open source, a project called CentOS came to be that recompiled all the RHEL packages and gave them away for free. CentOS and others like it (such as Scientific Linux) are largely compatible with RHEL and integrate some newer software, but do not offer the paid support that Red Hat does.

Scientific Linux is an example of a specific use distribution based on Red Hat. The project is a Fermilab sponsored distribution designed to enable scientific computing. Among its many applications, Scientific Linux is used with particle accelerators including the Large Hadron Collider at CERN.

openSUSE was originally derived from Slackware, yet incorporates many aspects of Red Hat. The original company was purchased by Novell in 2003, which was then purchased by the Attachmate Group in 2011. The Attachmate Group then merged with Micro Focus International. Through all of the mergers and acquisitions, SUSE has managed to continue and grow. While openSUSE is desktop based and available to the general public, SUSE Linux Enterprise contains proprietary code and is sold as a server product.

Debian is more of a community effort, and as such, also promotes the use of open source software and adherence to standards. Debian came up with its own package management system based on the .deb file format. While Red Hat leaves support for platforms other than Intel and AMD to derivative projects, Debian supports many of these platforms directly.
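The two package families described above can often be told apart from the command line. The following is a rough heuristic rather than a definitive test: it assumes the /etc/os-release file is present (as it is on most current distributions) and simply checks which low-level package tool is installed. A system can have both tools, so treat the result as a hint. The variable names are illustrative.

```shell
# Identify the distribution (not the kernel) from /etc/os-release.
# Assumes the file exists, as it does on most current distributions.
if [ -r /etc/os-release ]; then
    distro=$(. /etc/os-release && echo "${PRETTY_NAME:-${NAME:-unknown}}")
else
    distro="unknown"
fi
echo "Distribution: $distro"

# Guess the package family from which low-level package tool is installed.
if command -v rpm >/dev/null 2>&1; then
    family="Red Hat style (RPM packages)"
elif command -v dpkg >/dev/null 2>&1; then
    family="Debian style (.deb packages)"
else
    family="unknown"
fi
echo "Package family: $family"
```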

Ubuntu is the most popular Debian derived distribution. It is the creation of Canonical, a company that was made to further the growth of Ubuntu and make money by providing support.

Linux Mint was started as a fork of Ubuntu Linux, while still relying upon the Ubuntu repositories. There are various versions, all free of cost, but some include proprietary codecs, which cannot be distributed without license restrictions in certain countries. Linux Mint is quickly supplanting Ubuntu as the world’s most popular desktop Linux solution.

We have discussed the distributions specifically mentioned in the Linux Essentials objectives. You should be aware that there are hundreds, if not thousands, more that are available. It is important to understand that while there are many different distributions of Linux, many of the programs and commands remain the same or are very similar.

1.2.4.1 What is a Command?

The simplest answer to the question, “What is a command?”, is that a command is a software program that when executed on the command line, performs an action on the computer.

When you consider a command using this definition, you are really considering what happens when you execute a command. When you type in a command, a process is run by the operating system that can read input, manipulate data and produce output. From this perspective, a command runs a process on the operating system, which then causes the computer to perform a job.

However, there is another way of looking at what a command is: look at its source. The source is where the command “comes from” and there are several different sources of commands within the shell of your CLI:

  • Commands built in to the shell itself: A good example is the cd command, as it is part of the bash shell. When a user types the cd command, the bash shell is already executing and knows how to interpret that command, requiring no additional programs to be started.
  • Commands that are stored in files that are searched by the shell: If you type an ls command, then the shell searches through the directories that are listed in the PATH variable to try to find a file named ls that it can execute. These commands can also be executed by typing the complete path to the command.
  • Aliases: An alias can override a built-in command, function, or a command that is found in a file. Aliases can be useful for creating new commands built from existing functions and commands.
  • Functions: Functions can also be built using existing commands, either to create new commands or to override commands built in to the shell or commands stored in files. Aliases and functions are normally loaded from the initialization files when the shell first starts; these files are discussed later in this section.
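The sources listed above can be inspected with the shell's type command, which reports how the shell would resolve a given name. This is a minimal sketch: greet is a hypothetical function defined only for the demonstration, and the exact wording of the output varies between shells. In an interactive session where ls has been aliased, type would report the alias instead of the file.

```shell
# A small shell function built from an existing command.
greet() { echo "hello from a function"; }

# 'type' reports where the shell would find each command name.
builtin_info=$(type cd)      # built in to the shell
file_info=$(type ls)         # normally a file located through PATH
function_info=$(type greet)  # defined as a function above

echo "$builtin_info"
echo "$file_info"
echo "$function_info"
```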

Consider This

While aliases will be covered in detail in a later section, this brief example may be helpful in understanding the concept of commands.

An alias is essentially a nickname for another command or series of commands. For example, the cal 2014 command will display the calendar for the year 2014. Suppose you end up running this command often. Instead of executing the full command each time, you can create an alias called mycal and run the alias, as demonstrated in the following example:

sysadmin@localhost:~$ alias mycal="cal 2014"
sysadmin@localhost:~$ mycal
                              2014
      January               February               March
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
          1  2  3  4                     1                     1
 5  6  7  8  9 10 11   2  3  4  5  6  7  8   2  3  4  5  6  7  8
12 13 14 15 16 17 18   9 10 11 12 13 14 15   9 10 11 12 13 14 15
19 20 21 22 23 24 25  16 17 18 19 20 21 22  16 17 18 19 20 21 22
26 27 28 29 30 31     23 24 25 26 27 28     23 24 25 26 27 28 29
                                            30 31
       April                  May                   June
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
       1  2  3  4  5               1  2  3   1  2  3  4  5  6  7
 6  7  8  9 10 11 12   4  5  6  7  8  9 10   8  9 10 11 12 13 14
13 14 15 16 17 18 19  11 12 13 14 15 16 17  15 16 17 18 19 20 21
20 21 22 23 24 25 26  18 19 20 21 22 23 24  22 23 24 25 26 27 28
27 28 29 30           25 26 27 28 29 30 31  29 30
        July                 August              September
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
       1  2  3  4  5                  1  2      1  2  3  4  5  6

1.2.5 Hardware Platforms

Linux started out as something that would only run on a computer like Linus’: a 386 with a specific hard drive controller. The range of support grew as people built support for other hardware. Eventually, Linux started supporting other chips, including hardware that was made to run competing operating systems!

The types of hardware grew from the humble Intel chip up to supercomputers. Later, smaller chips with Linux support were developed to fit in consumer devices, called embedded devices. Support for Linux became so ubiquitous that it is often easier to build hardware to support Linux and then use Linux as a springboard for your custom software than it is to build the custom hardware and software from scratch.

Eventually, cellular phones and tablets started running Linux. A company, later bought by Google, came up with the Android platform which is a bundle of Linux and the software necessary to run a phone or tablet. This means that the effort to get a phone to market is significantly less, and companies can spend their time innovating on the user facing software rather than reinventing the wheel each time. Android is now one of the market leaders in the space.

Aside from phones and tablets, Linux can be found in many consumer devices. Wireless routers often run Linux because it has a rich set of network features. The TiVo is a consumer digital video recorder built on Linux. Even though these devices have Linux at the core, the end users don’t have to know. The custom software interacts with the user and Linux provides the stable platform.

1.3 Choosing an Operating System

You have learned that Linux is a UNIX-like operating system, which means that it has not undergone formal certification and therefore can’t use the official UNIX trademark. There are many other alternatives; some are UNIX-like and some are certified as UNIX. There are also non-UNIX operating systems such as Microsoft Windows.

The most important question to ask when determining the configuration of a machine is “what will this machine do?” If you need to run specialized software that only runs on Oracle Solaris, then that’s what you’ll need. If you need to be able to read and write Microsoft Office documents, then you’ll either need Windows or something capable of running LibreOffice or OpenOffice.

1.3.1 Decision Points

The first thing you need to decide is the machine’s role. Will you be sitting at the console running productivity applications or web browsing? If so, you have a desktop. Will the machine be used as a Web server or otherwise sitting in a server rack somewhere? You’re looking at a server.

Servers usually sit in a rack and share a keyboard and monitor with many other computers, since console access is only used to set up and troubleshoot the server. The server will run in non-graphical mode, which frees up resources for the real purpose of the computer. A desktop will primarily run a GUI.

Next, determine the functions of the machine. Is there specific software it needs to run, or specific functions it needs to do? Do you need to be able to manage hundreds or thousands of these machines at the same time? What is the skill set of the team managing the computer and software?

You must also determine the lifetime and risk tolerance of the server. Operating system and software upgrades come on a periodic basis, called the release cycle. Software vendors will only support older versions of software for a certain period of time before they stop offering updates; this period is called the maintenance cycle (or life cycle). For example, major Fedora Linux releases come out approximately every 6 months. A version is considered End of Life (EOL) one month after the release that is two major versions newer, so you have between 7 and 13 months after installing Fedora before you need to upgrade. Contrast this with the commercial server variant, Red Hat Enterprise Linux, where you can go up to 13 years before needing to upgrade.
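The 7 and 13 month figures follow from simple arithmetic, worked through in the sketch below. The numbers are the approximate ones given in the text, not an official schedule, and the variable names are illustrative.

```shell
# Figures taken from the text: a release about every 6 months, and end of
# life one month after the release that is two versions newer.
release_interval=6   # months between major releases
grace=1              # extra month after the second newer release

# Installed on release day: two full intervals remain, plus the grace month.
longest=$((2 * release_interval + grace))

# Installed just before the next release: only one interval remains.
shortest=$((release_interval + grace))

echo "Support window: between $shortest and $longest months"
```

Running this prints a window of between 7 and 13 months, matching the range quoted above.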

The maintenance and release cycles are important because in an enterprise server environment it is time consuming, and therefore rare, to do a major upgrade on a server. Instead, the server itself is replaced when there are major upgrades or replacements to the application that necessitate an operating system upgrade. Similarly, a slow release cycle is important because applications often target the current version of the operating system and you want to avoid the overhead of upgrading servers and operating systems constantly to keep up. There is a fair amount of work involved in upgrading a server, and the server role often has many customizations made that are difficult to port to a new server. This necessitates much more testing than if only the application were upgraded.

If you are doing software development or traditional desktop work, you often want the latest software. Newer software has improvements in both functionality and appearance, which contributes to more enjoyment from the use of the computer. A desktop often stores its work on a remote server, so the desktop can be wiped clean and the newer operating system put on with little interruption.

Individual software releases can be characterized as beta or stable. One of the great things about being an open source developer is that you can release your new software and quickly get feedback from users. If a software release is in a state that it has many new features that have not been rigorously tested, it is typically referred to as beta. After those features have been tested in the field, the software moves to a stable point. If you need the latest features, then you are looking for a distribution that has a quick release cycle and makes it easy to use beta software. On the server side, you want stable software unless those new features are necessary and you don’t mind running code that has not been thoroughly tested.

Another loosely related concept is backward compatibility. This refers to the ability for a later operating system to be compatible with software made for earlier versions. This is usually a concern if you need to upgrade your operating system, but aren’t in a position to upgrade your application software.

Of course, cost is always a factor. Linux itself might be free, but you may need to pay for support, depending on which options you choose. Microsoft has server license costs and may have additional support costs over the lifetime of the server. Your chosen operating system might only run on a particular selection of hardware, which further affects the cost.

1.3.2 Microsoft Windows

The Microsoft world splits the operating systems according to the machine’s purpose: desktop or server? The Windows desktop edition has undergone various naming schemes with the current version (as of this writing) being simply Windows 8. New versions of the desktop come out every 3-5 years and tend to be supported for many years. Backward compatibility is also a priority for Microsoft, even going so far as to bundle virtual machine technology so that users can run older software.

In the server realm, there is Windows Server, currently (at this writing) at version 2012 to denote the release date. The server runs a GUI but, largely as a competitive response to Linux, Microsoft has made amazing strides in command-line scripting abilities through PowerShell. You can also make the server look like a desktop with the optional Desktop Experience package.

1.3.3 Apple OS X

Apple makes the OS X operating system, which has undergone UNIX certification. OS X is partially based on software from the FreeBSD project.

At the moment, OS X is primarily a desktop operating system but there are optional packages that help with management of network services that allow many OS X desktops to collaborate, such as to share files or have a network login.

OS X on the desktop is usually a personal decision as many find the system easier to use. The growing popularity of OS X has ensured healthy support from software vendors. OS X is also quite popular in the creative industries such as video production. This is one area where the applications drive the operating system decision, and therefore the hardware choice since OS X runs on Apple hardware.

1.3.4 BSD

There are several open source BSD (Berkeley Software Distribution) projects, such as OpenBSD, FreeBSD, and NetBSD. These are alternatives to Linux in many respects as they use a large amount of common software. BSDs are typically implemented in the server role, though they can also fill desktop roles by running desktop environments such as GNOME and KDE.

1.3.5 Other Commercial UNIXes

Some of the more popular commercial UNIXes are:

  • Oracle Solaris
  • IBM AIX
  • HP-UX

Each of these runs on hardware from their respective creators. The hardware is usually large and powerful, offering such features as hot-swap CPU and memory, or integration with legacy mainframe systems also offered by the vendor.

Unless the software requires the specific hardware or the needs of the application require some of the redundancy built into the hardware, most people tend to choose these options because they are already users of the company’s products. For example, IBM AIX runs on a wide variety of IBM hardware and can share hardware with mainframes. Thus, you find AIX in companies that already have a large IBM footprint, or that make use of IBM software like WebSphere.

1.3.6 Linux

One aspect in which Linux is very different from the alternatives is that after an administrator has chosen Linux they still have to choose a distribution. Recall from Topic 1 that the distribution packages the Linux kernel, utilities, and management tools into an installable package and provides a way to install and update packages after the initial installation.

Some operating systems are available through only one vendor, such as OS X and Windows, with system support provided through the vendor. With Linux, there are multiple options, from commercial offerings for the server or desktop, to custom distributions made to turn an old computer into a network firewall.

Often application vendors will choose a subset of distributions to support. Different distributions have different versions of key libraries and it is difficult for a company to support all these different versions.

Governments and large enterprises may also limit their choices to distributions that offer commercial support. This is common in larger companies where paying for another tier of support is better than risking extensive outages.

Various distributions also have release cycles, sometimes as often as every six months. While upgrades are not required, each version can only be supported for a reasonable length of time. Therefore, some Linux releases are considered to have long term support (LTS) of 5 years or more while others will only be supported for two years or less.

Some distributions differentiate between stable, testing, and unstable releases. The difference is that unstable releases trade reliability for features. When features have been integrated into the system for a long time, and many of the bugs and issues have been addressed, the software moves through testing into the stable release. The Debian distribution warns users about the pitfalls of using the “sid” release with the following warning:

‘“sid” is subject to massive changes and in-place library updates. This can result in a very “unstable” system which contains packages that cannot be installed due to missing libraries, dependencies that cannot be fulfilled etc. Use it at your own risk!’

Other releases depend on beta distributions. For instance, the Fedora distribution releases beta or pre-release versions of its software ahead of the full release to minimize bugs. Fedora is often considered the community-oriented beta release of Red Hat. Features are added and changed in the Fedora release before finding their way into the enterprise-ready Red Hat distribution.

1.3.7 Android

Android, sponsored by Google, is the world’s most popular Linux distribution. It is fundamentally different from its counterparts. Linux is a kernel, and many of the commands that will be covered in this course are actually part of the GNU (GNU’s Not Unix) package. That is why some people insist on using the term GNU/Linux instead of Linux alone.

Android uses the Dalvik virtual machine with Linux, providing a robust platform for mobile devices such as phones and tablets. However, lacking the traditional packages that are often distributed with Linux (such as GNU and Xorg), Android is generally incompatible with desktop Linux distributions.

This incompatibility means that a Red Hat or Ubuntu user cannot download software from the Google Play store. Likewise, a terminal emulator in Android lacks many of the commands of its Linux counterparts. It is possible, however, to use BusyBox with Android to enable most commands to work.