How to Build Your Own Linux Distribution

by: Frank Pohlmann

Go to the source to learn Linux basics and build the right Linux for you

Linux® From Scratch (LFS) and its descendants represent a new way to teach users how the Linux operating system works. LFS is based on the assumption that compiling a complete operating system piece by piece not only teaches how the operating system works but also allows an independent operator to build systems for speed, footprint, or security.

Many authors have written about UNIX® flavors, delving into the mysteries of scheduling, memory management, multiprocessing and threading, file systems, and the interaction between users and the kernel. The author writing about Linux has an advantage over UNIX authors: The Linux kernel is unlikely to split into competing forks -- corporate upheavals notwithstanding -- because the GNU General Public License (GPL), the existence of a centralized research lab -- the Open Source Development Labs (OSDL) -- and Linus Torvalds' unassailable position make Linux, luckily, a slow-moving target.

Why UNIX internals matter

Different UNIX kernels do not agree on much apart from what could be described as a certain family resemblance. The various UNIX flavors have an advantage, though, that Linux seems to lack: All UNIX flavors are supposed to be full operating systems. Linux, often described as "just a kernel" (an arbitrary definition if ever there were one), presents a core of common functionality and implementations that do not change fundamentally whether the kernel runs on an underpowered Pentium® II machine or on a Symmetric Multiprocessing (SMP) system. To simplify matters even more, one could say that the further you get from a Linux kernel, the more variety you're likely to find, while UNIX systems tend to be diverging implementations of various UNIX/POSIX standards. Things are never quite as simple as that.
Inspecting Linux kernel and system-level code is likely to be a time-intensive affair and of somewhat limited use in the real world. The LFS project aims to remedy the problem of limited system-level intelligibility on Linux. It has often been remarked that the kernel needs a large number of libraries and tools before a Linux system can perform even basic tasks. But what if a somewhat more sophisticated user wants a slim-line Linux distribution and does not like to download several gigabytes of binaries that rule out any chance to optimize the system or to throw out all those pesky, unnecessary tools? What if a very sophisticated user refuses to accept the diktat of various community distributions and wants to run a Linux/Apache/MySQL/PHP (LAMP)-type application stack from a CD? LFS comes to the rescue.

Linux From Scratch

The LFS project is, obviously, based on the source files that are sufficient -- but not necessary -- to make up a basic Linux system. It has moved beyond the Linux kernel and the device drivers, because to produce a working Linux system, you have to add a complete compiler tool chain, a number of Linux assembler utilities, the glibc system library, system configuration tools, and tools connected to userland shell access.

LFS is predicated on the assumption that a Linux or UNIX power user with some knowledge of scripting wants to get to know the workings of a complete usable system without having to delve into the kernel code itself. The creators of LFS decided that compiling the system by following the tree of module dependencies would be a natural way to get to know the mechanics of an operating system in general and Linux in particular. After users have mastered the compilation process, they can start eliminating those parts of the dependency tree connected to system components that are irrelevant to supporting the operating system's primary purpose.
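The idea of compiling along the dependency tree, and later cutting away the branches a given system does not need, can be illustrated with a short Python sketch. The package names and their dependencies below are a hypothetical, heavily simplified map chosen for illustration; they are not the dependency lists from the LFS book, which are longer and change from release to release.

```python
from graphlib import TopologicalSorter

# Hypothetical, heavily simplified dependency map: each key depends on the
# packages in its set. Real LFS dependency lists are longer and vary by release.
DEPENDS_ON = {
    "binutils": set(),
    "gcc": {"binutils"},
    "glibc": {"gcc"},
    "bash": {"glibc"},
    "coreutils": {"glibc"},
    "apache": {"glibc"},
}


def build_order(depends_on):
    """Return a list in which every package appears after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def required_for(targets, depends_on):
    """Return the targets plus everything they depend on, directly or not."""
    needed = set()
    stack = list(targets)
    while stack:
        package = stack.pop()
        if package not in needed:
            needed.add(package)
            stack.extend(depends_on[package])
    return needed


def pruned(targets, depends_on):
    """Drop every package that the targets do not need."""
    keep = required_for(targets, depends_on)
    return {pkg: deps for pkg, deps in depends_on.items() if pkg in keep}


if __name__ == "__main__":
    print(build_order(DEPENDS_ON))
    print(build_order(pruned({"apache"}, DEPENDS_ON)))
```

In this toy map, pruning for `apache` alone removes `bash` and `coreutils` while keeping the chain of packages that `apache` is built on. A real system needs a far more careful analysis, since build-time and run-time dependencies differ.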
It is feasible, for instance, to eliminate the compiler tool chain itself after compilation is complete. Embedded LAMP stacks can make do without a full set of command-line utilities. Configuration utilities might be dropped, as well, and most users can make do with one file system instead of the plethora Linux tends to support.

Linux parts

Now for the big caveat of LFS distributions: What a courageous distribution builder needs is a working Linux distribution, including a complete compiler tool chain and a suite of file system-creation utilities. Naturally, all source-based Linux distributions need to be bootstrapped using a particular compiler version, which is by no means identical from distribution release to distribution release.

LFS is not the only system in this field, but it is the only system that allows you to work directly with individual source files. Most other source-based Linux systems, such as Source Mage and MyGeOS, provide a complete download, which users are well advised to use. LFS makes no such assumption, and stripping down the LFS framework is encouraged. All of these systems presuppose a functioning Linux distribution installed on nonexotic hardware, even though LFS is probably less demanding as far as configuration tools and scripting are concerned.

To compile LFS, you need to prepare a partition and a file system, and you also need to compile a compiler and system library. It is a fairly nerve-racking procedure if done by hand, but it definitely increases your confidence in dealing with the rest of the installation. The compilation of the whole system tends to take from an hour to four days, depending on the age of the underlying hardware and your command-line dexterity.

If -- and this is a fairly big assumption -- you're willing to retain much of the book installation and keep changes to the installation proposed in the LFS book to a minimum, you could also use the automated installation routine to install an LFS-based distribution.
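To give a feel for what an automated build routine does, here is a minimal Python sketch that turns a package list into an ordered series of build commands and only prints them. It is not the ALFS tool or its format: the package names are placeholders, and the three-step `./configure`, `make`, `make install` sequence is the conventional source-build template, which individual packages often deviate from.

```python
import shlex

# Hypothetical package list; the names and versions are placeholders, not the
# versions prescribed by any LFS release.
PACKAGES = ["binutils-x.y", "gcc-x.y", "glibc-x.y"]

# The conventional three-step source build. Individual packages often need
# different options, so treat this as a template rather than a recipe.
STEPS = [
    ["./configure", "--prefix=/usr"],
    ["make"],
    ["make", "install"],
]


def plan(packages, steps=STEPS):
    """Return (source_directory, command) pairs in execution order."""
    return [(pkg, list(cmd)) for pkg in packages for cmd in steps]


def render(planned):
    """Format the plan as shell-like lines for review; nothing is executed."""
    return [f"(cd {shlex.quote(pkg)} && {shlex.join(cmd)})" for pkg, cmd in planned]


if __name__ == "__main__":
    for line in render(plan(PACKAGES)):
        print(line)
```

Keeping the plan as data and rendering it separately makes it easy to review or edit the sequence before anything touches the target partition, which is roughly the appeal of describing a build declaratively.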
The installation routine is not presented in the LFS book, but is available as an XML-based description under the name Automated Linux From Scratch (ALFS). The active installation is available as a C-based script that uses Automated Linux From Scratch

Beyond LFS

The creators of LFS recognized the need for other varieties of source-based Linux systems. For those who want to go beyond LFS and add the X Window System, GNOME, and networking support, another LFS derivative was created: Beyond Linux From Scratch (BLFS).

The trio of LFS books -- and let's not forget that we are talking about books, not distributions -- forms a triangle standing on one of its corners: The basic LFS build is the foundation for an automated compilation and, if required, for a full source-based Linux distribution. BLFS turns the basic Linux system into a full user-ready Linux system. ALFS simplifies installing and extending a source-based Linux installation.

The compilation of the complete source-based system is guided by a script you can leave to run on its own after you have tuned it to the hardware on which it's running. You can extend the installation sequence easily after you (or the installation engineer) have decided which packages are required to run, say, a particular office application suite. ALFS comes in handy here, too, as it lends itself to network-wide installations from source.

Hardened LFS

The final member of the LFS family addresses a particularly important aspect of source-based Linux: security. The common-sense approach to security for someone who does not intend to rely on patches delivered from their Linux distribution server of choice would be to track security advisories for selected core libraries and applications. For LFS implementers, the problem is somewhat different: It would be difficult, although not impossible, to audit Linux kernel code, and perhaps a number of libraries and utilities central to the internal functioning of a Linux-based operating system.
Code audits are extremely time-consuming, and adding a large number of patches is advisable only if patch servers are maintained centrally by dedicated staff. It is, however, possible to replace some libraries with versions that have been rewritten from the ground up to reflect new approaches to security problems. A good example is making process identifiers extremely difficult to guess by allocating them randomly from a reasonably large pool of numbers. The OpenBSD project has pioneered this method, which has found its way into various UNIX flavors and Linux distributions.

A fairly new project known as Hardened Linux From Scratch (HLFS) takes this approach to security under Linux. The project, which presupposes a fairly decent grasp of LFS and some parts of BLFS, uses several utilities and libraries that do not tend to be standard in most Linux systems. Possibly the most important addition to HLFS is the Stack-Smashing Protector (SSP), which you enable by using a

The growing LFS family

The LFS family of Linux builds is, in many ways, a method for giving back the power to construct Linux-based operating systems to the people who started it all: the hackers. But the most important result for the creators of LFS seems to be that through LFS, all Linux distributions have become intelligible to interested users. By allowing users to build a Linux distribution piece by piece and by helping users see a Linux-based operating system as a system of many parts, alternative approaches to building Linux distributions become possible.

More generally, users do not need to be programmers to change the way a Linux distribution is built: The bit of scripting users learn by building an LFS system is sufficient. An LFS specialist can change and extend the very composition of a Linux distribution without impairing its basic structure.
This functionality is particularly important for organizations that have the manpower and expertise to maintain Linux systems, but not the financial wherewithal to buy commercial support from consultancies and corporations. LFS-based Linux systems have been demonstrated for educational purposes and for large networks. It is likely they will be used in other areas, as well.

© 2008 NetVisits, Inc. All rights reserved.