In this first chapter, we will guide you through the steps for preparing your work environment on both your work computer and the server that you will use to host the websites. There are a number of things that you will have to understand in order to establish a fully functional Nginx setup, particularly if you are working with a computer running a Microsoft Windows operating system.
This chapter covers:
For all of us working under a Microsoft Windows operating system on a daily basis for the past fifteen years, the idea of going back to a good old command-line interface may seem somewhat primitive, but it is nevertheless a reality—even a necessity for most server administrators. The first step of your preparatory work will consist of downloading and installing an SSH client. Secure SHell (SSH) is a network protocol that allows two devices to communicate securely by encrypting exchanged data. It is notably used for connecting to a system shell remotely. In other words, you will be able to take control of your server without compromising its security.
PuTTY is by far the most widely used terminal emulator for SSH access under Windows. As such, you may find a large number of articles and other documents on the web explaining the various features offered by this program. We will only be covering the aspects that directly concern our subject: configuring PuTTY to connect to your server, entering text, and using the copy and paste commands. But you should know that there is much more that this free and open source tool can do, such as creating SSH tunnels or connecting over Telnet, rlogin, and even raw TCP.
PuTTY can be downloaded directly from its author's website:
It comes as a standalone .EXE program and does not require any external files. All its data is saved in the Windows registry, so it will not be filling up your system with configuration files.
Before reading on, make sure you are in possession of the following elements:
Let us take a quick peek at the main PuTTY window.
PuTTY saves your settings in sessions. So when you finish configuring the assortment of parameters, make sure to give a name to your session and click on the Save button, as highlighted in the preceding screenshot.
On the default PuTTY screen, you will need to enter a Host Name (or IP address) for the server you want to connect to. Then configure the port on which the SSH service is running on the remote server, 22 being the default port for SSHD. Here are a couple of additional settings that are optional but may be useful to you:
Once you have finished configuring your session, remember to save it, and then initiate the connection by clicking on the Open button on the main window. When you connect to a server for the first time, you are required to validate its authenticity by accepting the server fingerprint. If you connect to the same server in the future, you shouldn't be seeing the confirmation again, unless the server settings such as hostname or port have been changed or security has been compromised and you are connecting to an intermediate server (man-in-the-middle attack). Eventually, you should be prompted for a login (unless you enabled the auto-login option) and a password. Please note that when typing the password, it will not appear on the screen at all—not even as asterisks, so make sure to enter it carefully, then press the Return key.
If you have never worked with PuTTY or with a system shell before, there are a couple of details you may want to know regarding the behavior of the main terminal emulator window.
Connecting to your server and opening up a terminal window is one thing, being able to actually make use of it is another. If you have never worked with Linux before, you may find this section particularly helpful as it will help you get started by describing some of the most basic and useful commands. All the commands that we will be using in later sections are covered here, but you will soon realize that there is a lot more that you can do with the shell in general.
There are a lot of similarities between common shells such as BASH (Bourne-Again SHell, default shell for GNU/Linux distributions) and the Microsoft Windows command-line interface. The main resemblance is that we use the notion of working directory. The shell prompts you for a textual command; the said command will be executed in the current working directory.
When you first log in to your shell account, you should land in your home directory. This folder is generally used to contain your personal files; it is a private space that no other users on the system should be able to see (unless specific access rights are granted).
Here is a list of the most useful basic commands for file and directory management:
|pwd||Print working directory|
[alex@example.com ~]$ pwd
/home/alex
|cd||Change directory|
[alex@example.com ~]$ cd images
[alex@example.com images]$ pwd
/home/alex/images
[alex@example.com images]$ cd /tmp
[alex@example.com tmp]$ pwd
/tmp
Here are some useful shortcuts that can be used with cd as well as any other shell command:
|ls||List all files in the current working directory (or a specified directory)|
[alex@example.com ~]$ ls
images photo2.jpg photo.jpg shopping.txt
Try ls -l for a more detailed view. The -a switch reveals hidden and system files.
|mkdir||Create a new directory|
[alex@example.com ~]$ mkdir documents
[alex@example.com ~]$ cd documents
[alex@example.com documents]$ mkdir /tmp/alex
[alex@example.com documents]$ cd /tmp/alex
[alex@example.com alex]$ pwd
/tmp/alex
Command-line applications in general do not output any text in the case of a successful operation. They will only display a message if an error occurred.
|cp||Copy files.|
Command syntax: cp [options] source destination
[alex@example.com ~]$ cp photo2.jpg photo3.jpg
|mv||Move or rename files.|
Command syntax: mv [options] source destination
Renaming a file:
[alex@example.com ~]$ mv photo3.jpg photo4.jpg
Moving a file to another folder:
[alex@example.com ~]$ mv photo4.jpg images/
|rm||Delete a file or a directory. The -r switch enables recursion.|
[alex@example.com ~]$ rm photo.jpg
[alex@example.com ~]$ ls
images photo2.jpg shopping.txt
[alex@example.com ~]$ rm -r images/
[alex@example.com ~]$ ls
photo2.jpg shopping.txt
Proceed with extreme caution with this command, especially if you are logged in as the Superuser (system administrator). Files cannot be recovered and a simple call to rm -rf / suffices to initiate a complete wipe of your filesystem.
|locate||Locate the specified file on the entire filesystem. This command is directly related to the updatedb command below.|
[alex@example.com ~]$ locate photo2.jpg
/home/alex/photo2.jpg
/home/jesse/holiday_photo2.jpg
Note: The locate command completely relies on indexes. If you create a new file, you will not be able to find it until you perform a database update with the command below.
|updatedb||Updates the file database. Note that this command requires administrative permissions. For that reason, it is generally set to be executed on a daily basis via a "cron job" (the equivalent of tasks in Microsoft Windows operating systems) with administrative-level rights.|
[alex@example.com ~]$ mkdir "Holidays in France"
[alex@example.com ~]$ locate France
No file found: a database update is required.
Once logged in with an administrator account:
[root@example.com ~]# updatedb
[root@example.com ~]# locate France
/home/alex/Holidays in France
|man||Displays documentation on a specified command|
[alex@example.com ~]$ man ls
See the screenshot below.
Finally, you can use the clear command to erase all text on your screen and start afresh.
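The commands above can be chained into a short practice session. The following is only a sketch: it works inside a throwaway directory created with mktemp (an assumption made so that no real files are touched), and the file names are purely illustrative.

```shell
# Work inside a throwaway directory so nothing important is touched.
sandbox=$(mktemp -d)
cd "$sandbox"

mkdir images                     # create a directory
echo "milk" > shopping.txt       # create a small file
cp shopping.txt list.txt         # copy it
mv list.txt images/              # move the copy into images/
ls images                        # shows the moved file

rm shopping.txt                  # delete a single file
rm -r images                     # delete a directory and its contents
remaining=$(ls -A | wc -l)       # count the entries that are left
echo "entries left: $remaining"

cd /
rmdir "$sandbox"                 # rmdir only succeeds on an empty directory
```

Because rmdir refuses to delete a non-empty directory, the last line doubles as a check that the cleanup really removed everything.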
The first obsession an administrator should have is who has access to which resources on their system. To that end, Unix-based operating systems provide an elaborate user and group management mechanism.
Each and every operating system comes with a Superuser account, often required for performing administrative-level tasks. This account is usually called root, although on some systems it can be named otherwise ('admin' or even 'toor'). The Superuser has access to all files and directories of the system and has the right to read, edit, and execute all files as well as change file attributes and permissions.
Although an administrator should always have the possibility to access the root account, it is not recommended to constantly connect as the machine Superuser. In fact, some operating systems such as Ubuntu do not even allow you to do so by default. One of the great principles of computer security is least privilege: you should never be allowed to do more than what you need to do. In other words, why give a user the possibility to access your system configuration folder if they are only going to use your computer for surfing the web and writing documents with OpenOffice? Granting more privileges than one requires can only lead to situations where the system security and integrity get compromised. For that reason, it is highly recommended that you create user accounts, not only for physical users of your machine but also for applications to run in a secure environment with clearly defined boundaries.
One particular file in the system configuration directory holds the list of system users: /etc/passwd. Contrary to what the name suggests, it does not usually contain user passwords; they are, in most cases, stored using the shadow format in a separate file, /etc/shadow, for security reasons. It does, however, hold certain bits of information for each user. Each line of the passwd file represents one user and follows this syntax:
Name:password:ID:group ID:comment:home directory:login shell
In practice, the password bit is replaced by 'x' indicating that the actual password is stored in the /etc/shadow file.
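To make the field layout concrete, here is a minimal sketch that splits one passwd-style line into its seven fields. The sample entry is hypothetical and is not taken from a real system; only the colon-separated format comes from the text above.

```shell
# Hypothetical entry in /etc/passwd format (not from a real system).
entry='alex:x:1000:1000:Alex:/home/alex:/bin/bash'

# Split the line on ':' into the seven fields described above.
IFS=: read -r name password uid gid comment home shell <<EOF
$entry
EOF

echo "user=$name uid=$uid gid=$gid home=$home shell=$shell"
```

Setting IFS only on the read command keeps the change local to that one line, so the rest of the session still splits words on whitespace.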
Adding a new user account can be as simple as adding a line to the /etc/passwd file. However, you might find the manual process somewhat bothersome, and rest assured, you are not alone. To that end, you will be pleased to learn that a program automating the operation is available on most distributions: useradd.
The most basic syntax for this command is useradd username. This creates a new user account with the default settings (which can be customized)—a home directory for the user located in /home, no expiration date, the default group for users, and Bash as a login shell. If you add an account destined to be running a service such as Nginx, it is recommended that you do not grant the user account shell access; consequently, you should make sure that the login shell is set to nologin (usually found in /sbin/nologin). The command would then be:
useradd --shell /sbin/nologin nginx
You can also define the location of the home directory to the folder where you have installed Nginx:
useradd --shell /sbin/nologin --home-dir /usr/local/nginx nginx
The trailing nginx indicates the name of the user account to be created.
If you wish to edit some of these parameters after the account creation process is complete, you may use the usermod command. It allows you to rename the account, change the account password, move the home directory along with its contents to another location, and much more. Finally, you might want to delete a user account. This is done via the simple userdel command, as in userdel username. The -r switch allows you to delete the home directory along with the user account.
Remember that for each of these commands, you have the possibility to consult more detailed information using man, for example, man useradd.
In addition to user accounts, Unix-based systems provide an even more advanced resource management mechanism—user groups. The purpose of a group is to have its own access permissions on files and directories; all users belonging to the group will then inherit the group permissions. A user account has to belong to at least one group—the user's primary group—although it may also belong to secondary groups.
In practice, the list of groups on the system is stored in the /etc/group file. Each line of the file represents one group, respecting the following syntax:
Group name:password:group ID:user list
The group password is rarely used; instead it is replaced by 'x' to indicate that the group has no password. At the end of each line, you will find the list of users belonging to the group. Here is an example of a group file on a production server:
Again, if you wish to create a new group on your system, you have two options: either add a new line to the /etc/group file, or use the dedicated groupadd command. Its syntax is simple—groupadd groupname. There are some optional parameters to the command, which you can discover by running man groupadd.
Similar to the user management system, you will also find groupmod and groupdel commands for editing group settings and deleting a group, respectively. More importantly, how do you add a user to a group? It is done by either editing the /etc/group file to append the username at the end of the line corresponding to the group you wish to add the user to, or by using the following command:
usermod --append --groups groupname username
You may specify one or more groups. Omitting the --append option replaces the user's group list with the specified groups. Finally, the groups command shows the list of groups the current user belongs to.
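As an illustration of the group file format, the sketch below looks up the members of a group by reading the fourth colon-separated field. It runs against a temporary file with made-up contents (the group names, IDs, and members are all hypothetical) rather than the real /etc/group.

```shell
# Hypothetical group file contents, in the /etc/group format.
groupfile=$(mktemp)
cat > "$groupfile" <<'EOF'
www:x:1001:alex,jesse
staff:x:1002:jesse
EOF

# Print the comma-separated user list (4th field) of the "www" group.
members=$(awk -F: '$1 == "www" { print $4 }' "$groupfile")
echo "www members: $members"

rm -f "$groupfile"
```

Pointing the same awk command at /etc/group would answer the question for a real system.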
Running a program in the shell is not as simple as entering its filename. There are a couple of subtle details that you should understand about the way Bash handles the execution of binaries and scripts.
There are three different situations that you may face when you want to execute a program or a script from the shell:
[alex@example.com ~]$ cd programs
[alex@example.com programs]$ ./my-app
[alex@example.com ~]$ /home/alex/programs/my-app
[alex@example.com ~]$ nano
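The three invocations shown above (relative to the current directory, by absolute path, and by bare name) can be reproduced with a small script. This is a sketch under assumptions: my-app here is a hypothetical one-line script created in a temporary directory, and the temporary location must allow execution.

```shell
# Create a hypothetical program, my-app, in a temporary "programs" directory.
programs=$(mktemp -d)
printf '#!/bin/sh\necho "my-app ran"\n' > "$programs/my-app"
chmod +x "$programs/my-app"

# 1. From its own directory, with the explicit ./ prefix.
out_relative=$(cd "$programs" && ./my-app)

# 2. From anywhere, using the absolute path.
out_absolute=$("$programs/my-app")

# 3. By bare name, once its directory is listed in PATH.
out_path=$(PATH="$programs:$PATH"; my-app)

echo "$out_relative / $out_absolute / $out_path"
rm -r "$programs"
```

The third form is why nano can be started by name alone: its directory is already part of the PATH variable, whereas the current directory usually is not.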
Note that when running a shell command, the prompt will be unavailable until the execution is complete. This can be problematic in the case of a lengthy operation, so you may want to start a program and have it running in the background instead of blocking the shell completely. This is done by appending a simple & at the end of the line.
[alex@example.com tmp]$ cp home.avi ~/movies/ &
[1] 2629
[alex@example.com tmp]$
[1]+  Done                    cp home.avi ~/movies/
As soon as you send the command, the pid (Process Identifier—a number identifying a running process on your system) will show up and the prompt will return. Once the execution terminates, a message appears to indicate its completion, along with the original command used to start the process.
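In a script, the same mechanism is usually combined with the special variable $! and the wait builtin. The sketch below uses sleep as a stand-in for a lengthy operation; the variable names are illustrative.

```shell
# Start a short task in the background; the shell continues immediately.
sleep 1 &
bg_pid=$!                 # $! holds the pid of the last background job
echo "started background job with pid $bg_pid"

# Other commands could run here while the job is still working.

wait "$bg_pid"            # block until that particular job finishes
bg_status=$?              # exit status of the background job (0 = success)
echo "job $bg_pid finished with status $bg_status"
```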
Most of the applications running in the background (often referred to as services) are not started via a simple command followed by the & character. Their startup and shutdown are actually managed by complex scripts. Those scripts can be placed in several directories, the most common one being /etc/init.d.
Some Linux distributions such as Red Hat, Fedora, CentOS, or Mandriva come with a script called service that (among other things) allows you to control a service by using the service name command syntax, where name is the name of the service you want to control and command is one of the options from the table below. Distributions that do not have the service script installed may also control services using a similar syntax: /etc/init.d/name command. Note that init.d scripts do not always provide implementations for all of these common commands.
|start||Starts the specified service|
|stop||Stops the specified service in a clean way|
|restart||Stops the specified service and starts it again|
|reload||Reloads the configuration of the specified service|
|status||Displays the status of the specified service|
Try service --status-all for listing all system services along with their current status.
As mentioned before, the system allocates a number to each and every process running on the computer. This number is called the Process Identifier (pid). Knowing the pid is important in various situations, some of which you are about to discover.
Finding the pid
Firstly, how does one find the pid of a process? Although there are a number of ways you could do that, most of them rely on a single tool—ps. Its many options (combined with the piping mechanism) will allow you to retrieve various details about a process.
The ps aux | grep sshd command can be dissected into three components:
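The same pipeline can be tried on a process you start yourself: ps aux produces the full process list, the pipe character hands that list to the next command, and grep keeps only the matching lines. The sketch below uses a background sleep as a harmless target and assumes the procps version of ps, where the pid is the second column of the output.

```shell
# Start a harmless background process so there is something to look for.
sleep 30 &
target_pid=$!

# ps aux lists every process; the pipe (|) feeds that list to grep.
# Writing the pattern as [s]leep stops grep from matching its own
# command line, which would otherwise show up in the results.
matches=$(ps aux | grep "[s]leep 30")
echo "$matches"

# With procps, the pid is the second column of the ps aux output.
found_pids=$(echo "$matches" | awk '{ print $2 }')

# Clean up the background process.
kill "$target_pid"
wait "$target_pid" 2>/dev/null || true
```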
An administrator's best friend—top
Another tool that you will find particularly useful if you run a high traffic website is top. This program lists the processes currently running on the system along with their pid, sorted by CPU usage. On top of that, the list refreshes every few seconds until you interrupt the execution flow (with Ctrl+C, for example) or stop the application by pressing the Q key. This allows you to keep track of the most resource-hungry processes.
The upper part also provides loads of useful statistics on the current resource usage such as system uptime, active users, load average, memory and processor load, and more.
If a command ever turns out wrong and the prompt does not return, one of your possible solutions is to press Ctrl+C to interrupt the execution flow of the application. The equivalent operation can be applied to background processes by using the kill command. There is a subtle detail here—you cannot kill a process by specifying its name; you need to provide its pid. The reason, obviously, is that one program may be executed more than once; consequently, a program name does not always correspond to a unique process.
[alex@example.com ~]$ kill 12075
Again, if the command does not output any result, there is nothing to worry about. Actually, if there is one thing that kill may tell you, it would be something along the lines of no such process in case you entered an invalid pid. The kill command simply sends a signal to the specified process, which does not necessarily mean that the said process will have successfully stopped. If the program is locked, for example, it will not respond to the signal and thus will still be running. You will be reassured to know that there is a simple way to force a process to terminate—the -9 option specifies that the system should immediately stop the execution.
[alex@example.com ~]$ kill -9 12075
Finally, as you can imagine, you may, at some point, need to terminate multiple processes at a time. For instance, you could kill all the processes that Apache spawned. In that case, we would use a slightly different command: killall. It differs from kill in that it accepts a process name as argument instead of a pid.
[alex@example.com ~]$ killall httpd
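A common pattern is to ask a process to stop politely first and to fall back on -9 only if it is still alive. The following sketch illustrates that idea with a background sleep standing in for the misbehaving program; the variable names are illustrative, and kill -0 (which sends no signal and merely tests that the pid exists) is used for the check.

```shell
# Start a long-running background process to act as the "stuck" program.
sleep 300 &
victim=$!

kill "$victim"                       # polite request: sends SIGTERM
term_status=0
wait "$victim" 2>/dev/null || term_status=$?   # 128 + 15 (SIGTERM) = 143

# kill -0 sends no signal; it only tests whether the pid still exists.
if kill -0 "$victim" 2>/dev/null; then
    kill -9 "$victim"                # last resort: SIGKILL cannot be ignored
    still_running=yes
else
    still_running=no
fi
echo "exit status: $term_status, still running: $still_running"
```

Here sleep honors the first signal, so the -9 branch is never taken; a program that ignores or blocks SIGTERM would end up in it.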
Linux-based operating systems have their files organized in a very specific way that follows more or less closely the long-established Filesystem Hierarchy Standard (FHS). According to the official FHS documentation, this standard enables:
Although the original standard specification was published in 1994, modern distributions still follow it, in a revised version.
Unlike Microsoft Windows operating systems where all file paths begin with a drive letter (what happens if you have over twenty-six drives on your system?), FHS-based filesystems have a common parent. This parent is called the root directory, also known as / (the slash character). All files and directories (regardless of the device, drive, or partition they are located on) are children of the root directory. Consequently, all absolute paths that you will find in this book start with a slash.
Let us now run cd /, followed by ls in order to discover the many subdirectories defined by the FHS. Please note that this directory structure is purely conventional; nothing actually prevents you from placing your own files in any of these folders or creating more directories at the root.
|/||The root directory: Not to be confused with /root. No files are usually placed at the root, although nothing really prevents you from doing so.|
|/bin||Binaries: Common executable binaries and scripts available for all users of the system. This is where essential programs such as ls, cp, or mv are found.|
|/boot||Boot: Critical files used at system boot time.|
|/dev||Devices: Device and special files, more information in the next section.|
|/etc||Et cetera: System-wide configuration files for services and applications. You will often need to browse this directory, for example, when you need to edit the Nginx server settings and virtual hosts.|
|/home||Home directories: This directory contains home directories for all users on the system except the root user. In the examples we studied before we used /home/alex, the home directory for the alex user.|
|/lib||Libraries: System-wide shared libraries and kernel modules, required by binaries found in the /bin and /sbin folders.|
|/media||Removable media: A directory that allows you to easily access removable media using mount points for devices such as CD-ROMs, USB devices, and so on.|
|/mnt||Temporarily mounted filesystems: This directory is a suitable placeholder in case the administrator wishes to mount a filesystem on a temporary basis.|
|/opt||Optional software packages: In theory, this directory should host application files and add-on packages that do not come with the default operating system installation. In practice, it is hardly ever used.|
|/proc||Kernel and process information virtual filesystem: This directory provides access to a virtual filesystem containing a variety of statistics and details about all running processes.|
|/root||Root user home directory: The root user, also known as Superuser, does not have its home directory stored in the same folder as regular users (/home). Instead, its personal files are stored in the /root directory. The slash-root (/root) directory is not to be confused with the root directory (/).|
|/sbin||System binaries: Utilities dedicated to system administration, thus generally accessed by the root user only. Programs such as ifconfig, halt, service, and many others can be found here.|
|/srv||Service data: A placeholder for data coming from services hosted on the system. Like many others, this directory is rarely used.|
|/tmp||Temporary files: Files that do not need to be conserved beyond program execution should be stored here. Many operating systems actually clear the contents of this directory on reboot.|
|/usr||Read-only user data: This directory provides a secondary hierarchy for shareable read-only user data. The /usr directory should contain the following:
|/var||Variable files: Files that are expected to be modified by running applications or services, for example, logfiles, cache, spool, and more. It comes with a hierarchy of its own:
As you may have noticed in the directory structure, Linux operating systems have a reserved directory for "device files" (/dev). As a matter of fact, this folder contains elements referred to as nodes, each node representing a different device on the system. They can be actual hardware devices or pseudo devices; either way, the purpose of having them listed as part of the filesystem is to facilitate input and output interactions with programs and services—software developers can access devices as simply as they would read or write to a file. You will learn that device files are used in a number of situations and you should sooner or later have a use for them.
A large variety of devices may be available in the /dev directory. Unfortunately, most of them bear obscure names that make it difficult to understand their purpose. Device files are named according to conventions in use in Linux operating systems. Since there is a potentially unlimited number of devices, we will only identify the most common ones. A device filename is composed of a prefix, conventionally defined according to the driver type, and optionally a number (or letter) if there is more than one device of that type present on the system.
Device file conventional prefixes for the most common types:
Some of the devices listed in the /dev directory do not correspond to actual hardware devices. Instead, they are here for the sake of providing administrators and developers with simple input and output access to specific resources. For that reason, we call them "pseudo devices". Here is a brief description of the most commonly used pseudo devices:
|/dev/null||Null device|
This pseudo device is often nicknamed black hole as its purpose is to disregard all data that is being sent to it. When written to, it always reports the write operation as successful. When read from, the device returns no data.
This is particularly useful if you want to redirect the output of a program to nowhere; in other words, if you want to make sure a command executes but outputs no text on the screen.
[alex@example.com ~]$ cat shopping.txt > /dev/null
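In practice, error messages travel on a separate stream (stderr), so silencing a command completely requires redirecting both streams. The sketch below shows the difference; the directory name passed to ls is deliberately nonexistent and purely illustrative.

```shell
# Discard normal output (stdout) only; errors would still be shown.
quiet_output=$(echo "this text goes nowhere" > /dev/null)

# Discard both stdout and stderr; only the exit status remains.
ls_status=0
ls /nonexistent-directory-example > /dev/null 2>&1 || ls_status=$?

# Reading from /dev/null yields no data at all.
bytes_read=$(wc -c < /dev/null)

echo "captured: '$quiet_output', ls status: $ls_status, bytes read: $bytes_read"
```

Even with all output discarded, the exit status still tells you whether the command succeeded.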
|/dev/random, /dev/urandom||Random number generators|
Streams that generate flows of random numbers. /dev/random generates true random numbers, whereas /dev/urandom provides pseudorandom numbers. These streams can be written to in order to feed the pool.
Since they generate binary data, numbers coming from /dev/random and /dev/urandom cannot be displayed to the console terminal (they would look like a flow of garbage data). These devices are mostly used by developers wishing to collect reliable random numbers.
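To look at the data without flooding the terminal with unprintable bytes, one option is to read a fixed number of bytes and convert them to hexadecimal. This is a minimal sketch; the choice of 16 bytes is arbitrary.

```shell
# Read 16 random bytes and print them as hexadecimal so they are readable.
random_hex=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
echo "$random_hex"
```

Each byte becomes two hexadecimal digits, so the result is a 32-character string that differs on every run.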
|/dev/full||Full device|
This pseudo device is a stream that returns an error when written to as it is always considered full. When read from, it returns an infinite stream of null characters.
The purpose of /dev/full is to provide programmers and administrators with an operation that will always trigger an error:
[alex@example.com ~]$ echo Hello! > /dev/full
-bash: echo: write error: No space left on device
|/dev/zero||Zero data |
Much like /dev/null, the zero pseudo device always provides successful return codes when written to. However, when read from, it outputs an infinite stream of null characters.
There is a variety of cases where reading from /dev/zero can prove useful, such as providing data as input to a program that will generate a file of a given size or writing to a storage device in order to format it.
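For instance, the dd utility can copy a fixed number of blocks from the zero device to produce a file of an exact size. The sketch below creates a 10 KiB file in a temporary location; the block size and count are arbitrary values chosen for the example.

```shell
# Create a file of exactly 10 KiB (10 blocks of 1024 bytes) of null bytes.
zero_file=$(mktemp)
dd if=/dev/zero of="$zero_file" bs=1024 count=10 2>/dev/null

zero_size=$(wc -c < "$zero_file")
echo "size in bytes: $zero_size"
rm -f "$zero_file"
```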
As you may have noticed in the previous sections, some of the devices available in the /dev directory are storage devices, such as hard disk drives, solid-state drives (SSD), floppies, or CD-ROMs. However, accessing the content that they serve is not as simple as browsing them with the cd command. Storage devices need to be mounted to the filesystem. In other words, devices need to be attached to a fixed directory.
[alex@example.com ~]$ cd /dev/md1
-bash: cd: /dev/md1: Not a directory
[alex@example.com ~]$ mount /dev/md1 /mnt/alexdrive
[alex@example.com ~]$ cd /mnt/alexdrive
[alex@example.com alexdrive]$ ls
Documents Music Photos Videos boot.ini
The mount command allows you to attach a device (first argument, /dev/md1 in the previous example) to an existing directory on your system (second argument). Once the drive is mounted, you are able to access the drive like you would access any other directory of the filesystem.
In modern Linux distributions, CD-ROMs and other common devices are automatically mounted by the system.
If you want to obtain information about currently mounted devices, a simple call to mount does the job—it tells you where each device is mounted, as well as the filesystem in use:
If you wish to have a drive automatically mounted on system startup, or to simply set a directory to be used as the default mount point for a device, you will need to edit the /etc/fstab file while logged in with administrator privileges. It is a simple text file and thus can be opened with a text editor such as nano. The file, however, respects a specific syntax, and making changes without understanding them could cause a lot of damage to your system. More details on the fstab syntax can be found online on websites such as tuxfiles.org.
Finally, if you need to remove a device while the computer is in use (for instance, remove a USB storage drive) you should always unmount it first. Unmounting a device is done using the umount command:
[alex@example.com ~]$ umount /dev/usb1
Note that the first argument of the command may either be the device filename or the mount point, producing the same result.
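Mounting and unmounting normally require administrator rights, but inspecting what is mounted does not. As a sketch, the kernel exposes the current mount table in /proc/mounts on Linux, where each line starts with the device, the mount point, and the filesystem type; the example below extracts the entry for the root directory.

```shell
# /proc/mounts lists one mounted filesystem per line:
# device, mount point, filesystem type, options, and two numeric fields.
root_mount=$(awk '$2 == "/" { print $1 " on " $2 " type " $3; exit }' /proc/mounts)
echo "$root_mount"
```

The device and filesystem type shown depend entirely on the machine the command runs on.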
There is a common misconception of the notion of "filesystem" when it comes to Unix-based operating systems in general. Since those systems respect the FHS, they use a common directory hierarchy regrouping all files and devices. However, storage devices may have their independent disk filesystem. A disk filesystem is designed for the organization of files on a mass storage device (hard disk drives, CD-ROMs, and so on). Microsoft Windows operating systems favor the FAT, FAT32, and NTFS specifications; whereas the default and most recommended one for working under Linux is the EXT3 filesystem. EXT3 comes with a number of characteristics, and it is essential for administrators to master them in order to fully understand the operating system they work with.
Unlike Microsoft's antique FAT32 filesystem, which only allows files up to 4 gigabytes, EXT3 supports files of up to 2 terabytes (depending on the block size). Moreover, the maximum storage space that can be used by EXT3 on a device is 32 terabytes, so you should have no trouble using it for a number of years, unless storage drive capacities suddenly skyrocket. One of the interesting features of EXT3 is that it lays out the data on the storage device in a way that file fragmentation is kept to a minimum and does not affect system performance. As a result, there is no need to defragment your drives.
The EXT3 filesystem accepts filenames up to 255 characters. Filename extensions are not required, although they are usually present and correspond to the content offered by the file: a .txt file should contain text, a .mp3 file should contain music, and so on. An important fact, however, is that filenames are case-sensitive: you may find, in the same directory, files named "SHOPPING.TXT", "Shopping.txt", or "shopping.txt"; all three are different files.
With Linux disk filesystems such as EXT3, a large variety of information is stored for each and every file. This information is separated both logically and physically from the actual file data and is stored in a specific structure called inode (index node). Some of the data contained in the inode indicates to the OS how to retrieve the contents of the file on the device. But that is not all: the inode also includes file permissions, user and group ownership, file size, access and modification times, and much more. Note that it does not contain the actual filename.
Inodes each have an identifier that is unique to the device. This identifier is called inode number or i-number and can be used in various situations. It can be retrieved by using the ls -i command:
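A minimal sketch of the command in action follows. It also creates a hard link, which is not covered in this section and is used here only as an assumption-free way to show that a filename lives outside the inode: two names can report the same inode number.

```shell
inode_dir=$(mktemp -d)
echo "socks" > "$inode_dir/shopping.txt"

# A hard link is a second filename pointing at the same inode.
ln "$inode_dir/shopping.txt" "$inode_dir/same-data.txt"

# ls -i prints the inode number before each filename.
ls -i "$inode_dir"

inode_a=$(ls -i "$inode_dir/shopping.txt" | awk '{ print $1 }')
inode_b=$(ls -i "$inode_dir/same-data.txt" | awk '{ print $1 }')
echo "inode numbers: $inode_a and $inode_b"
rm -r "$inode_dir"
```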
Among the metadata contained in an inode, you will find three different timestamps concerning the file. They are referred to as atime, ctime, and mtime.
|atime||Access time|
The date and time the file was last accessed. Every time an application or service reads from the file using a system call, the file access time is updated.
|mtime||Modification time |
The date and time the file was last modified. When a change in the file content occurs, the file modification time is updated.
|ctime||Change time|
The date and time the file was last changed. This timestamp concerns changes on both the file attributes (in other words, alteration of the file's inode) and the file data.
Make sure to understand the difference between modification time and change time. The first one concerns the file data only, whereas the latter tracks modifications of both file attributes and data. Here are some common examples illustrating all three mechanisms:
File access time (atime):
[email@example.com ~]$ nano shopping.txt
The file is opened in a text editor; its content is accessed. The file access time is updated.
File change time (ctime):
[firstname.lastname@example.org ~]$ chmod 0755 script.sh
The file permissions are updated (chmod command detailed in a later section); consequently, the inode is altered and the file change time updated.
File modification time (mtime):
[email@example.com ~]$ echo "- a pair of socks" >> shopping.txt
The file data is modified; as a result, both file modification time and file change time are updated.
As you may have noticed, there is no creation time recorded in the EXT3 inode, so it is impossible to find out when a file was first created. It remains unclear as to why such an important element was left out. Either way, if you want to know all the timestamps associated with a file, you may use the stat command:
[firstname.lastname@example.org ~]$ stat shopping.txt
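To make the difference between modification time and change time more concrete, here is a small sketch. It assumes the GNU version of stat (the -c and --printf options are GNU-specific) and uses a scratch file created only for the demonstration:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
echo "- toothpaste" > shopping.txt

# %Y = mtime, %Z = ctime, both in seconds since the Epoch (GNU stat).
mtime_before=$(stat -c %Y shopping.txt)
ctime_before=$(stat -c %Z shopping.txt)

# Wait so that a later timestamp is visibly different.
sleep 2

# Changing permissions alters the inode only: ctime moves, mtime does not.
chmod 0755 shopping.txt
mtime_after=$(stat -c %Y shopping.txt)
ctime_after=$(stat -c %Z shopping.txt)

echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"

# Human-readable view of the three timestamps.
stat --printf='atime: %x\nmtime: %y\nctime: %z\n' shopping.txt
```

The modification time should be identical before and after the chmod call, whereas the change time should have moved forward.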
Important information for SSD (Solid-State Drive) users
Keeping the access time feature of the filesystem enabled can noticeably degrade performance on this kind of drive. Every time a file is read, its inode needs to be updated, so simple reads generate frequent write operations, which is a major drawback for this kind of storage device. Fortunately, a simple solution exists: you can completely disable file access time updates through the noatime option of the mount command. The option can be specified in the /etc/fstab file if you want it to apply permanently. More documentation can be found online with a simple noatime ssd search. Credit goes to Kevin Burton for this important finding.
Symbolic links in Linux are the equivalent of shortcuts in Microsoft Windows operating systems. There are a number of differences that need to be explained though, the most important one being that read or write accesses to the file performed by applications actually affect the target of the link and not the link itself. However, commands such as rm or mv affect the link, not its target.
Creating a link is done via the ln -s command. Here is an example that will help you understand the particularities of symbolic links:
[email@example.com ~]$ ln -s shoppinglist.txt link_to_list
[firstname.lastname@example.org ~]$ ls
link_to_list photo.jpg photo2.jpg shoppinglist.txt
[email@example.com ~]$ cat link_to_list
- toothpaste
- a pair of socks
[firstname.lastname@example.org ~]$ rm link_to_list
[email@example.com ~]$ ls
photo.jpg photo2.jpg shoppinglist.txt
As you can see, reading the file content can be done via the symbolic link. If you delete the link, the target file is not affected; the same can be said for moving or renaming the link. Note, however, that a plain cp reads through the link and copies the content of the target file; pass the -P option to cp if you want the link itself to be copied.
Another difference that makes symbolic links stand apart from Microsoft Windows shortcuts is that they can be connected to files using relative paths. This becomes particularly useful for embedding links within archives—deploying a shortcut using an absolute path would make no sense, as users may extract files to any location on the system.
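The following sketch illustrates the relative path behavior. The folder names (project, elsewhere) are hypothetical and exist only for this demonstration; readlink is used to print the path stored in the link without following it:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

mkdir project
echo "- toothpaste" > project/shoppinglist.txt

# The target is stored as a relative path, resolved from the link's own directory.
ln -s shoppinglist.txt project/link_to_list

# readlink prints the path stored in the link, without following it.
readlink project/link_to_list

# Move the whole folder elsewhere: the relative link still resolves.
mkdir elsewhere
mv project elsewhere/
cat elsewhere/project/link_to_list
```

Had the link been created with an absolute path to the original location, it would be left dangling after the move.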
Finally, Microsoft Windows shortcuts have the ability to include additional metadata. This allows the user to select an icon, assign a keyboard shortcut, and more. However, symbolic links are simple connections to the target file path, and as such, they do not offer the same possibilities.
Another type of link, one that Windows users rarely come across, is the hard link. Hard links function a little differently, in that they represent actual connections to file data. Two or more links may connect to the same data on the storage device; when one of those links is deleted, the data itself is unaffected and the other links still point to the data. Only when the last link gets deleted will the data be removed from the storage device.
To illustrate this, let's create a hard link to that shopping list of ours, using the same command but without the -s switch.
[firstname.lastname@example.org ~]$ ln shoppinglist.txt hard_link_to_list
If you decide to delete shoppinglist.txt, hard_link_to_list will remain and the data it points to will still be available. Additionally, the newly created link is considered an actual file by some commands such as ls. If you run ls to calculate the total size occupied by files in this directory, you will notice that link file sizes add up. If the shopping list file itself takes up 5 kilobytes of storage space, the total size reported by ls for the directory will be 10 kilobytes: five for the shopping list file itself, and five for its link. However, some tools such as du (for Disk Usage, discussed further below) are able to dig deeper and report the actual occupied storage.
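Here is a short sketch of the hard link behavior described above. It assumes GNU stat, whose %h format prints the number of hard links pointing to an inode; the file lives in a scratch directory created for the occasion:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

echo "- toothpaste" > shoppinglist.txt
ln shoppinglist.txt hard_link_to_list

# Both names show the same inode number.
ls -i shoppinglist.txt hard_link_to_list

# %h is the number of hard links pointing to the inode (GNU stat).
links_before=$(stat -c %h shoppinglist.txt)

# Deleting one name leaves the data reachable through the other.
rm shoppinglist.txt
links_after=$(stat -c %h hard_link_to_list)
content=$(cat hard_link_to_list)

echo "links: $links_before -> $links_after, content: $content"
```

The link count drops from 2 to 1 when the first name is removed, and the data only disappears once the count reaches zero.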
The next step towards your discovery of the Linux shell is to learn how to manipulate files with a command-line interface. There are many operations that you can perform with simple tools—editing text, compressing files and folders, modifying file attributes, and so on, but let's begin with a more elementary topic—displaying a file.
First of all, you should understand that we are working with a terminal here; in other words, there is no possibility to work with graphical data, and only text can be displayed on the screen. Consequently, this section deals with text files only; no binary files such as graphics, videos, or any other form of binary data may be displayed on the screen.
The most used and simplest way to display a text file on the terminal is to use the cat command, as you may have noticed in examples from previous sections.
Although the cat command can be used to perform more complex operations (such as concatenation from multiple input sources), its simplest form consists of using the syntax—cat filename. The content of filename will be displayed to the standard output—in other words, the terminal screen.
If you reuse the grep mechanism that we approached in the process management section, you can achieve interesting results for filtering the output:
[email@example.com ~]$ cat /etc/fstab | grep sys
/dev/sys /sys /sysfs defaults 0 0
As you can see, piping the output to grep allows you to specify a text string; all lines that do not contain the specified string will not be displayed.
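The filtering mechanism can be tried safely on sample data. In the sketch below, sample_fstab is a hypothetical file created for the demonstration, not your real /etc/fstab; it also shows that grep can read a file directly, and that its -v option inverts the filter:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data standing in for a real configuration file.
printf '%s\n' '/dev/sda1 / ext3 defaults 1 1' \
              '/dev/sys /sys sysfs defaults 0 0' \
              '/dev/proc /proc proc defaults 0 0' > sample_fstab

# Filter through a pipe, as in the text.
piped=$(cat sample_fstab | grep sys)

# grep can also read the file directly; the result is the same.
direct=$(grep sys sample_fstab)

# -v inverts the match: only lines that do NOT contain the string are kept.
inverted_count=$(grep -v sys sample_fstab | wc -l)

echo "$piped"
echo "lines without 'sys': $inverted_count"
```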
You can pipe the output to other programs as well, in order to have your text displayed in a different manner. For example, if your file happens to be a large text document, it will probably not fit in the terminal window. The solution to this problem is to pipe the output to more:
More allows you to control the document flow—it displays as many lines of text as your terminal can contain and waits until you push the Return key to display more. Pressing Q or Ctrl+C will let you return to the prompt.
Even better, the less command allows you to scroll up and down in the document flow. It is used as a standalone program, so there is no need to pipe the output of cat into it:
[firstname.lastname@example.org ~]$ less /etc/php.ini
If you are a long time Microsoft Windows or Mac OS user, you might be surprised to learn that there are actually advanced command-line text editors. Several of them come with most Linux distributions—vim, emacs, nano, and so on. The question here is—which one should you use? Since you are reading this, the best choice for you should be nano, which has already been mentioned in previous sections.
Nano is a user-friendly text editor that comes with a lot of interesting features such as syntax highlighting, text search and replace, and keyboard shortcuts. Unlike its competitors that usually require a lengthy learning process, nano's interface is intuitive.
Since there is no mouse cursor, the interface is controlled via keyboard shortcuts; available operations are listed in the command bar at the bottom of the screen, where the ^ character indicates a Control key combination (^G stands for Ctrl+G, ^O stands for Ctrl+O, and so on). Once you have finished editing your document, save (Ctrl+O) and exit (Ctrl+X).
There are other ways to write in a file though, using commands that do not require any form of interface at all. One of the possible ways is to use the mechanism of redirection. This allows you to specify a location for the input and output streams interacting with a shell command. In other words, by default, the text shows up on the screen; but you do have the option to specify other locations. The most common usage for redirections is writing the output of a command to a file. Here is an example demonstrating the syntax:
[email@example.com ~]$ ls /etc > files_in_etc.txt
The command executes normally but does not output any text to the screen; instead, the text is saved to the file you specified. The > character allows you to write the text to the file; if the specified file already exists on the system, its content is overwritten. In this example, we list the files located in the /etc directory and save the results in a text file. Using >>, you have the possibility to append the output to an existing file (if the file does not exist, it is created):
[firstname.lastname@example.org ~]$ ls /etc/init.d >> files_in_etc.txt
The list of files found in /etc/init.d is appended to the text file. There is much more you can do with redirections including replacing standard input, but covering it all would be unnecessary to your understanding of Nginx.
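The three redirection operators mentioned in this section can be compared side by side. This is only a sketch using a hypothetical out.txt file in a scratch directory:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# > creates the file, or overwrites it if it already exists.
echo "first" > out.txt
echo "second" > out.txt      # out.txt now holds only "second"

# >> appends to the file, creating it if needed.
echo "third" >> out.txt

# < feeds a file to the standard input of a command.
line_count=$(wc -l < out.txt)

cat out.txt
echo "number of lines: $line_count"
```

Because the second echo uses >, the word "first" is lost; only the >> operator preserves what was already in the file.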
Finally, the touch command allows you to update the access and modification date of a file without having to actually edit its content.
[email@example.com ~]$ touch shopping.txt
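The effect of touch can be checked with stat. The sketch below assumes GNU stat (%Y prints the modification time in seconds) and also relies on a well-known side effect that the text does not mention: touch creates an empty file when the given name does not exist yet.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

echo "- toothpaste" > shopping.txt
content_before=$(cat shopping.txt)
mtime_before=$(stat -c %Y shopping.txt)

# Wait so that the new timestamp is visibly different.
sleep 2

# Update the timestamps without editing the content.
touch shopping.txt
mtime_after=$(stat -c %Y shopping.txt)
content_after=$(cat shopping.txt)

echo "mtime: $mtime_before -> $mtime_after"

# touch also creates an empty file when the name does not exist yet.
touch new_file.txt
ls -l new_file.txt
```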
Although the ZIP and RAR formats are popular and widespread across the Internet, RAR is a proprietary technology and ZIP, although openly documented, does not originate from the Unix world. As a result, they are not mainstream choices in the Linux world; other formats such as Gzip and bzip2 are favored. Of course, solutions exist for both ZIP and RAR under Linux; the point being that most projects and downloadable archives that you will find will come as .tar.gz or .tar.bz2 files.
You read correctly, there are two extensions: tar, and gz or bz2. The first part indicates the method with which files have been gathered together and the second part shows the algorithm used to compress the result. Tar (for Tape archive) is a tool that concatenates multiple files into a single one called a tarball. It also gives you the option to compress the tarball once it is created, offering various compression alternatives. The tool is available under most distributions, though in some of the most minimal ones, you may have to install it manually with your system package manager (read the section further below).
The syntax for creating a tarball using Gzip and bzip2 compression respectively is as follows:
tar czvf archive.tar.gz [file1 file2...]
tar cjvf archive.tar.bz2 [file1 file2...]
Conventionally, Linux users do not archive multiple files together; instead they first gather files into a single folder and then archive the folder. As a result, when users extract the archive, only a single item is appended to their directory listing. Imagine extracting a ZIP file onto your Windows desktop. Would you rather have all files appearing individually on your desktop, or collected neatly in a single directory? Either way, the syntax remains the same whether you want to archive files or directories.
Tar can, of course, perform the opposite operation: extracting files. However, you need to enter a slightly different command depending on the compression algorithm in use:
tar xzvf archive.tar.gz
tar xjvf archive.tar.bz2
Note that .tar.gz files are also found as .tgz, and .tar.bz2 files as .tbz. Other compression formats handled by tar are LZMA (.tar.lzma) and compress (.tar.Z), but they are now obsolete and there is a good chance you will never have to use them.
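The complete round trip (gathering files into a folder, archiving the folder, then extracting it) can be sketched as follows. The photos folder and its files are hypothetical sample data, the v (verbose) flag is omitted to keep the output short, and the t and -C options, which are not covered above, are used to list the archive and to choose the extraction directory:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Gather the files in a single folder first, as the convention suggests.
mkdir photos
echo "holiday" > photos/photo.txt
echo "family" > photos/photo2.txt

# c = create, z = Gzip compression, f = archive file name.
tar czf photos.tar.gz photos

# t lists the archive content without extracting it.
tar tzf photos.tar.gz

# x = extract; -C selects the directory to extract into.
mkdir restore
tar xzf photos.tar.gz -C restore

ls restore/photos
```

Extracting the archive produces a single photos folder in the destination, which is exactly the benefit of archiving a folder instead of loose files.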
If you stumble upon RAR or ZIP files, you may still extract the files they contain by downloading and installing the unrar or unzip tools for Linux. The syntax that they offer is rather simple:
unrar x file.rar
unzip file.zip
Since you are going to be installing and configuring Nginx, we assume that you are the administrator of your server. Setting up such an important component on your system requires good understanding of the administration concepts and tools available with your Linux operating system.
As we discussed in the Superuser Account section, it is important to respect the principle of least privilege. To that end, you should log in to your system with the root account as rarely as possible. When you do so, you put your system at risk in many ways. Firstly, if your network communications were to be intercepted, the potential damage caused by a computer hacker would be greatly reduced if they intercepted a simple user account. Secondly, everyone makes typos. What if you accidentally type rm -rf / root/file.x, thus erasing your entire / directory, instead of rm -rf /root/file.x? What if you run an application that could cause damage to your filesystem? Being logged in as a regular user minimizes the risks in all situations.
This raises an obvious question—if you are always logged in as a simple user, how do you perform administrative level tasks or tasks that specifically require root privileges? There are two possible answers to this issue—su and sudo.
Su, short for substitute user, is a command that allows you to start a session with the specified user account. If no user account is specified, the root account is used. You need to specify the password of the account you want to use (unless you are already logged in as root and want to take over a user account).
[firstname.lastname@example.org ~]$ su - root
Password:
[email@example.com ~]# nano /etc/fstab
From that point on, you are logged in as root. You can run commands and administrative tasks. When you are finished, type exit to return to your previous session.
[firstname.lastname@example.org ~]# exit
exit
[email@example.com ~]$
You may have noticed the use of a hyphen between su and the username: it indicates that you are actually creating a shell session for the user, inheriting all of its personal settings and environment variables. If you omit the hyphen, you will remain in the current directory and will keep all settings of the user account you were originally logged in with.
Although its name is very similar to su, sudo works in a totally different manner. Instead of creating a complete session, it only executes a single command with the specified account (by default, the Superuser account). Example syntax:
sudo nano /etc/fstab
There is a major difference in the way su and sudo function; when executing a command with sudo, you are prompted for your own account password. I can already hear you scream—how come I can gain root privileges without the root password? The answer lies within the /etc/sudoers configuration file. This file specifies the list of users that are allowed to use sudo, and more importantly, the commands that are allowed to be executed. Moreover, all actions are recorded into a log including failed sudo login attempts.
By default, a user does not belong to the sudoers. Consequently, you first have to log in as root (or use sudo) and add the specified user to the /etc/sudoers file. Since this configuration file respects a strict syntax, a tool was specifically designed for it—visudo. Deriving from the well-known vi text editor, visudo checks the syntax of the file upon saving it, and makes sure that there are no simultaneous edits.
Visudo (and by extension, vi) works in two modes: command mode and insert mode. The insert mode lets you edit the document directly. Press the Esc key to switch to command mode, which allows you to enter a command to control the program itself. When you first start visudo, press I to switch to insert mode and then make the necessary changes, for instance, adding a new sudo user at the end of the file:
alex ALL=(ALL) ALL
This allows the alex user to run any command, as any user, through sudo. Once you have finished editing, press Esc to enter command mode. Enter the following commands: :w to save your changes and :q to exit. If you wish to exit without saving, type the :q! command. For more information about vi or visudo, use the man command (or if you are familiar with the jargon, RTFM!).
Now that you have all the prerequisites for administering your server, it's time for you to perform actual administrative tasks. The first set of tasks that we will approach is related to system resources. Before proceeding to system changes such as software package installs (covered in the next section), you should always check that your system is in a consistent state and that you have enough disk and memory space available.
The df utility allows you to check the available storage space on your mounted devices.
The -h option allows you to display sizes in a human-readable format. You should often check your available storage space: when you happen to run out of space, random behavior may occur in your applications (that is, unintelligible error messages).
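As an illustration, here is how df is typically invoked. The output depends entirely on your system, so none is reproduced here; the last command is an assumption-laden sketch that relies on the POSIX output format (-P), in which the fifth column of the report holds the percentage of space in use:

```shell
# Report every mounted filesystem with human-readable sizes.
df -h

# Restrict the report to the filesystem holding a given path, here the root.
df -h /

# -P forces one line per filesystem, which makes the output easier to parse:
# the fifth column of the second line is the percentage of space in use.
used_percent=$(df -P / | awk 'NR==2 {print $5}')
echo "Root filesystem usage: $used_percent"
```

A value approaching 100% is the signal to start looking for folders that take up too much space.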
If you notice that your disk is full and do not understand why, you might find du to be particularly useful. It allows you to display the space occupied by each folder in a given directory.
Again here, the -h switch specifies that the tool should display human-readable size statistics. If the --max-depth option is not used, du will browse your filesystem recursively from the current folder. You can now easily track the folders that take up too much storage space on your system.
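The sketch below shows du at work on hypothetical sample folders created in a scratch directory. It assumes GNU du (for the --max-depth option) and combines it with sort to bring the largest folder to the top, a common pattern that goes slightly beyond what is described above:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: one large file and one tiny file.
mkdir -p photos/holidays documents
head -c 2000000 /dev/urandom > photos/holidays/big.bin
echo "notes" > documents/notes.txt

# One line per first-level folder, plus a total for the current directory.
du -h --max-depth=1 .

# Same report in kilobytes (-k), sorted from largest to smallest;
# the "." total line is skipped to keep only actual subfolders.
largest=$(du -k --max-depth=1 . | sort -rn | awk '$2 != "." {print $2; exit}')
echo "Largest folder: $largest"
```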
The free utility displays the current system memory usage. It displays both physical and swap memory statistics as well as buffers used by the system. Use the -m switch for displaying numbers in megabytes or -k in kilobytes.
Basic command-line usage? Check. Users and groups management? Check. Enough memory left on your system and space on your storage device? Check! It looks like you are ready to install new software packages and components. There are basically three ways to proceed, and we will study them from the easiest to the most complex one.
A package manager is a tool that facilitates the management of software packages on your system by letting you download and install them, update them, uninstall them, and more. There are many different packaging systems in the Linux world, which are often associated with particular distributions: RPM for Red Hat-based distributions, APT for Debian-like distributions, simple TGZ packages for Slackware, and so on. We will only be covering the first two as they are the most commonly used ones.
For systems using RPM, yum is by far the most popular package manager. As for APT, the apt-get tool comes with most distributions. Although their syntax differs slightly, both programs basically have the same features—given a package name, they will download software online and install it automatically.
The following example shows you how to install PHP on your computer using yum:
[firstname.lastname@example.org ~]# yum install php
Here is the equivalent using apt-get:
[email@example.com ~]# apt-get install php
All required components such as libraries or other software are downloaded and installed first and then the requested software package is processed. There is nothing else that you have to do except to confirm the operation. You may also use the update or remove operations with either tool.
Be aware that there are only a limited number of software packages that you will find with these manager tools, as they are based on lists called repositories. The repositories that come with Linux distributions are often strictly regulated, and software developers cannot always use them to distribute their work. As a result, there are many applications that you will not find on the default repositories (you can use custom repositories though), which implies that you cannot use package managers to install them for you.
When you face such a situation, there are two options remaining—finding a package online or building from source, as covered next. This first solution generally consists of visiting the official website of the software you want to install, then finding the RPM release offered in the download section (or the DEB package for Debian systems).
Once you have finished downloading the RPM file, for example, using the wget download manager, use the rpm -ivh command to install the package:
[firstname.lastname@example.org ~]$ wget ftp://example2.com/mysqlclient.rpm
(Download successful)
[email@example.com ~]$ sudo rpm -ivh mysqlclient.rpm
Use the dpkg -i command for DEB packages:
[firstname.lastname@example.org ~]$ wget ftp://example2.com/mysqlclient.deb
(Download successful)
[email@example.com ~]$ sudo dpkg -i mysqlclient.deb
Note that this method does not process dependencies. The application might not install correctly because a required library was not found on the system, in which case, you would have to install it yourself.
The last method, which is valid regardless of the distribution you are using, is to download the application source code and compile it yourself. This method has its own advantages—you usually have the possibility to configure a great variety of options, and you may even make some edits to the code if you are a developer yourself. On the other hand, it requires many development packages to be installed (compilers, libraries, and so on) and compiling might fail for obscure reasons—missing components, invalid version for one of the required libraries, and so on.
The general process is to download a .tar.gz archive containing the source code, extract the files, enter the directory, and run three commands—configure, make, and make install. In the following example, we download the latest version of nano and install it:
[firstname.lastname@example.org ~]$ wget http://www.nano-editor.org/dist/v2.0/nano-2.0.9.tar.gz
(Download successful)
[email@example.com ~]$ tar zxvf nano-2.0.9.tar.gz
(Extraction successful)
[firstname.lastname@example.org ~]$ cd nano-2.0.9
[email@example.com nano-2.0.9]$ ./configure
(Configuration complete)
[firstname.lastname@example.org nano-2.0.9]$ make
(Build successful)
[email@example.com nano-2.0.9]$ sudo make install
(Install successful)
Depending on the software install process, the output binaries may be copied to the /usr/bin folder (or another folder found in the PATH environment variable), but you will sometimes have to do that by yourself.
Some applications require more specific compilation commands and procedures, which are described in the usually included readme file. Under no circumstance should you omit to consult the readme file before building an application.
Unix-based operating systems use a complex permission mechanism to regulate access to files and directories. You should also know that directories are actually regarded as special files; they work in the same way when it comes to permissions.
There are three types of access: reading from a file, writing to a file, and executing a file. Each of these accesses can be defined for the original file owner, the file group, and all other users. Permissions on files can be consulted with the ls -l command:
[firstname.lastname@example.org photos]$ ls -l
total 2
drwxrwxrwx 2 alex alex 4096 oct 31 11:35 Holidays in France
-rw-rw-r-- 1 alex alex 8 oct 31 09:21 photo2.jpg
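The first column provides a character representation of the file permissions. It is composed of ten characters:
The first character indicates the file type: - for a regular file, d for a directory, l for a symbolic link
Characters 2 to 4 indicate the permissions of the file owner (r for read, w for write, x for execute, - when the permission is not granted)
Characters 5 to 7 indicate the permissions of the file group
Characters 8 to 10 indicate the permissions of other users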
On top of that, directories have specific attributes—sticky bit and set group ID. The first one ensures that files placed in that directory can only be deleted by their owner (and the root user, naturally). The second one makes it so that new files created in that directory conserve the group ID of the directory.
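Permissions on a directory differ from regular file permissions:
The r attribute allows listing the contents of the directory
The w attribute allows creating, renaming, and deleting files in the directory (the x attribute is also required)
The x attribute allows entering the directory and accessing the files it contains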
Surely you have already read it somewhere: instructions telling you to change a folder's permission to 0755 or even 777. The given number is actually an octal representation of the file or directory permissions. This format is composed of three or four digits from 0 to 7, where 0 means no permissions and 7 all permissions.
The first digit is optional and indicates special attributes (such as sticky bit); often unspecified or set to 0
The second digit indicates permissions for the file owner
The third digit indicates permissions for the file group
The fourth digit indicates permissions for other users
Digit values from 0 to 7 are calculated using the following method: each attribute has a weight, and the weights of all granted attributes are added up to form the total value. The weights are: 4 for "r", 2 for "w", 1 for "x", and 0 for an attribute that is not granted. Consequently, each attribute variation has its own octal representation:
|Permissions||Weights||Octal value|
|- - -||0+0+0||0|
|r - -||4+0+0||4|
|- w -||0+2+0||2|
|r w -||4+2+0||6|
|- - x||0+0+1||1|
|r - x||4+0+1||5|
|- w x||0+2+1||3|
|r w x||4+2+1||7|
Full permissions for everyone (file owner, file group, and other users) thus translate to rwxrwxrwx, 777 with the octal representation.
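As a worked example of the calculation, the sketch below builds the octal value for rwxr-xr-- using the standard weights (4 for r, 2 for w, 1 for x), applies it to a hypothetical script.sh file, and reads the result back. It assumes GNU stat, whose %a and %A formats print the octal and character representations respectively:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
touch script.sh

# Owner: r+w+x = 4+2+1 = 7, group: r+x = 4+0+1 = 5, others: r = 4+0+0 = 4.
owner=$((4 + 2 + 1))
group=$((4 + 0 + 1))
others=$((4 + 0 + 0))
mode="$owner$group$others"
echo "octal mode: $mode"

chmod "$mode" script.sh

# %a prints the octal mode, %A the character representation (GNU stat).
stat -c '%a %A' script.sh
```

The last command should print 754 followed by -rwxr-xr--, confirming that both notations describe the same permissions.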
Users may change permissions on their own files only, except for the almighty Superuser. The process is done using a well-known tool—chmod. There are two main syntax variations—you may either specify an octal value for a complete reset of the permissions or request a change on a specific attribute.
Using an octal value:
[email@example.com ~]$ chmod 777 photo2.jpg
The first argument is the octal value, followed by the file or directory name.
The second syntax is more complex:
chmod who+/-what filename
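The first argument (who+/-what) is composed of three elements:
who: the users affected by the change, any combination of u (file owner), g (file group), o (other users), or a (all three)
+/-: + to grant a permission, - to revoke it
what: the permissions concerned, any combination of r, w, and x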
Here are a couple of possible examples for this syntax:
chmod +x script.sh: Renders a script executable.
chmod go-rwx photo.jpg: Nobody is allowed to access the photo other than the owner.
chmod a-w shopping.txt: Nobody can edit the text file, not even the owner.
Note that the -R switch applies permission changes recursively on a folder:
chmod -R g+rx photos: The "photos" folder can be accessed by all users in the group; all its photos can be viewed.
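The examples above can be verified on scratch files. In this sketch, the files are hypothetical, a known starting mode is set first so that the results are predictable, and u+x is used instead of a bare +x because the latter is filtered by the umask of the session. GNU stat is assumed for the %A and %n formats:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

touch script.sh photo.jpg
chmod 644 script.sh photo.jpg   # known starting point: rw-r--r--

chmod u+x script.sh             # the owner gains the execute permission
chmod go-rwx photo.jpg          # group and others lose all permissions

# %A prints the character representation, %n the file name (GNU stat).
stat -c '%A %n' script.sh photo.jpg

# Recursive change: applies to the folder and everything below it.
mkdir -p photos/holidays
chmod -R 700 photos
chmod -R g+rx photos
stat -c '%A %n' photos photos/holidays
```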
The chown and chgrp commands allow you to respectively change a file's owner and group. While the first one can only be used by the Superuser to assign a file to another user, for obvious security reasons, any user may change a file's group, provided they are the owner of the file and a member of the new group.
The chown tool can be used with the following syntax:
chown user filename
In this case, user is the new owner of the specified file. As for chgrp:
chgrp group filename
Again here, group is the name of the new group for the specified file. Alternatively, chown supports the following syntax:
chown user:group filename
Similar to chmod, these commands accept the -R switch allowing you to apply changes recursively. Here are some possible uses for either tool:
chown alex photo.jpg: Executed as root; the new owner of "photo.jpg" is the user "alex".
chown -R root photos: Executed as root; the "photos" directory and all the files it contains now belong to the root user.
chown alex:students shopping.txt: Changes both the file owner and group.
chgrp guests shopping.txt: The group for the "shopping.txt" file is changed to "guests".
chgrp -R applications /etc/apps: The "applications" group now owns the /etc/apps folder and all the files it contains.
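Since assigning a file to another user requires root privileges, the following sketch only uses what a regular user is allowed to do: it sets the owner and group of a scratch file to the current user and primary group. Numeric IDs are used instead of names (both commands accept either form), and GNU stat is assumed for the %u and %g formats:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
touch shopping.txt

# Numeric IDs of the current user and of their primary group.
uid=$(id -u)
gid=$(id -g)

# Both commands accept names or numeric IDs.
chgrp "$gid" shopping.txt
chown "$uid:$gid" shopping.txt

# %u and %g print the numeric owner and group of the file (GNU stat).
stat -c '%u:%g' shopping.txt
```

Replacing the IDs with another user's name, as in the examples above, would fail with a permission error unless the command is executed as root.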
This last section on file permissions marks the end of this introductory chapter, summarizing commands and tasks that a web server administrator executes on a regular basis. Using the shell is mostly about remembering command names and arguments. It becomes a lot more efficient as you get used to it. After a while, as you get back to Windows, you will even sometimes find yourself opening up a command-line terminal to perform simple tasks!
Anyway, you have all the ingredients you need to start with the next step: downloading and installing the Nginx web server application. By the end of the next chapter, you will have a working setup and should be able to load the default page of your server.