MESSAGE
DATE | 2004-02-05 |
FROM | From: "Inker, Evan"
|
SUBJECT | Subject: [hangout] Linux v2.6 scales the enterprise
|
Linux v2.6 scales the enterprise
Bigger, stronger kernel sizzles in our performance tests
http://www.infoworld.com/article/04/01/30/05FElinux_1.html?s=feature
By Paul Venezia
January 30, 2004
If commercial Unix vendors weren't already worried about Linux, they should be now. Linux has seen wide deployment in datacenters, generally as a Web server or a file server, or to handle network tasks such as DNS and DHCP, but not as a platform for running mission-critical enterprise applications. Solaris, AIX, or HP/UX typically get the nod when an application demands the highest levels of performance and scalability. The recent release of a new Linux kernel, v2.6, promises to change that.
The v2.6 kernel ushers in a new era of support for big iron with big workloads, opening the door for Linux to handle the most demanding tasks that are currently handled by Solaris, AIX, or HP/UX. The new kernel not only supports greater amounts of RAM and a higher processor count, but the core of device management has changed. Earlier kernels had built-in limits that could constrain large systems, such as a 65,536 process limit before rollover, and 256 devices per chain. The v2.6 kernel moves well beyond these limitations, and it includes support for some of the largest server architectures around.
Will the new Linux really perform in the same league as the big boys? To find out, I put the v2.6.0 kernel through several real-world performance tests, comparing its file server, database server, and Web server performance with a recent v2.4 series kernel, v2.4.23.
Linux Meets Big Iron
A primary focus of the v2.6 kernel is large server architectures. Support for up to 64GB RAM in paged mode, the ability to address file systems larger than 2TB, and support for 64 CPUs in x86-based SMP systems bring this kernel and Linux into the more rarefied air of truly mission-critical systems. Also new is support for NUMA (Non-Uniform Memory Access) systems, a next-generation SMP architecture, and for PAE (Physical Address Extension), which provides support for up to 64GB of RAM on 32-bit systems.
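The 64GB ceiling follows from address width. The short calculation below is only a sketch; it assumes the standard PAE figure of 36 physical address bits, which the article itself does not state.

```python
GIB = 2 ** 30  # bytes in one gibibyte

def addressable_gib(address_bits: int) -> int:
    """Return how many GiB can be reached with the given physical address width."""
    return (2 ** address_bits) // GIB

# Plain 32-bit addressing versus the 36-bit width assumed for PAE.
plain_32bit = addressable_gib(32)
with_pae = addressable_gib(36)
print(plain_32bit, with_pae)
```

Four extra address bits multiply the reachable memory by 16, which is how a 32-bit system moves from a 4GB ceiling to a 64GB one.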
There is much more to v2.6 than just bigger numbers in processor and RAM counts, however. This kernel breaks apart some of the artificial limitations that have been present in Linux from the beginning, such as the number of addressable devices and total available PIDs (Process Identifiers). The v2.4 kernel supported 255 major devices with 255 minor numbers. (For example, a volume on a SCSI disk located at /dev/sda3 has a major number of 8, since it's a SCSI device, and a minor number of 3.) On servers with a large number of real or virtual devices, device allocation can become problematic. The v2.6 kernel addresses these issues in a big way, moving to 4,096 major devices with more than one million subdevices per major device. For most users, these numbers are well beyond practical limits, but for enterprise systems with a need to address many devices, it's a major step.
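The device counts above map naturally onto bit fields: 4,096 is 2^12 and "more than one million" is 2^20 (1,048,576), while the older scheme fits two 8-bit fields. The sketch below packs a major and minor number into one integer under that assumption. The exact field layout and the function names pack_dev and unpack_dev are illustrative, not taken from the kernel source.

```python
def pack_dev(major: int, minor: int, major_bits: int, minor_bits: int) -> int:
    """Pack major/minor into one integer, major in the high bits (assumed layout)."""
    if not 0 <= major < (1 << major_bits):
        raise ValueError("major number out of range")
    if not 0 <= minor < (1 << minor_bits):
        raise ValueError("minor number out of range")
    return (major << minor_bits) | minor

def unpack_dev(dev: int, minor_bits: int) -> tuple:
    """Split a packed device number back into (major, minor)."""
    return dev >> minor_bits, dev & ((1 << minor_bits) - 1)

# /dev/sda3 from the article: major 8, minor 3.
old_style = pack_dev(8, 3, major_bits=8, minor_bits=8)     # 8-bit fields
new_style = pack_dev(8, 3, major_bits=12, minor_bits=20)   # 12-bit and 20-bit fields

print(hex(old_style), unpack_dev(old_style, 8))
print(hex(new_style), unpack_dev(new_style, 20))
print("minors per major with 20 bits:", 1 << 20)
```

A minor number such as 300 is rejected by the 8-bit layout but fits easily in the 20-bit one, which is the practical difference for systems with many devices.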
Also new in v2.6 is NPTL (Native POSIX Thread Library) in lieu of v2.4's LinuxThreads. NPTL brings enterprise-class threading support to Linux, far surpassing the performance offered by LinuxThreads. As of October 2003, NPTL support was merged into the GNU C library, glibc, and Red Hat first implemented NPTL within Red Hat Linux 9 using a customized v2.4 kernel.
Other goodies in the v2.6 kernel include integrated IPSec support, with the inclusion of the Kame Project; enhanced support for network file systems, including support for mounting Novell NetWare shares; initial NFSv4 (Network File System Version 4) support; and performance and compatibility enhancements with SMB (Server Message Block) shares, including support for CIFS (Common Internet File System). The v2.6 kernel also sports a brand new security architecture that departs somewhat from the standard Unix root user concept; its modular security mechanism provides a greater level of granularity to privileged user management.
Also introduced in the v2.6 kernel is a new approach to devices. The v2.4 kernel's devfs-based device handler has a companion in the v2.6 kernel. The newcomer, udev, is an implementation of devfs in userspace. Using udev, the system is able to follow devices as they move around on connected buses, with the device identifier remaining static. For instance, the first-seen SCSI device will remain as device sda, using the serial number of the device as an identifier regardless of the order in which it's found during a later boot. The use of udev is a significant change at the core of the kernel and the cause of some consternation among Linux kernel developers, with solid arguments provided by both sides. It looks like udev/sysfs will be the standard in the future, deprecating devfs, but both are present in the v2.6 kernel and are likely to remain for some time.
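The idea of a name that stays attached to a serial number, rather than to discovery order, can be illustrated with a toy registry. This is not how udev is implemented; the class name NameRegistry and the serial strings are made up for the example, and the sketch only handles up to 26 devices.

```python
import string

class NameRegistry:
    """Toy registry that pins a device name to a serial number on first sight."""

    def __init__(self):
        self._by_serial = {}  # serial number -> assigned name

    def name_for(self, serial: str) -> str:
        # A serial seen before keeps its name; a new one gets the next free letter.
        if serial not in self._by_serial:
            index = len(self._by_serial)  # IndexError past 26 devices in this toy
            self._by_serial[serial] = "sd" + string.ascii_lowercase[index]
        return self._by_serial[serial]

registry = NameRegistry()

# First boot: disks are discovered in one order.
first_boot = [registry.name_for(s) for s in ["SERIAL-A", "SERIAL-B"]]

# Later boot: discovery order is reversed, but each serial keeps its name.
second_boot = [registry.name_for(s) for s in ["SERIAL-B", "SERIAL-A"]]

print(first_boot, second_boot)
```

With names keyed on serial number, the disk first seen as sda is still sda even when it is found second on a later boot.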
And yet another significant change to the v2.6 kernel is the merging of the uClinux project into the core kernel. The uClinux project has been focused on Linux kernel development for embedded devices. The main drive for this functionality is support of processors lacking MMUs (Memory Management Units), commonly found in microcontrollers for embedded systems such as fire alarm controllers or PDAs. The list of embedded controllers that v2.6 supports is quite long, including common processors manufactured by Hitachi, NEC, and Motorola. This definitely shows a separation from the roots of the Linux kernel, as all prior kernels were more or less subject to the limitations of the Intel x86 architecture.
Built for Speed
Prior to the release of the v2.6 kernel, Linux performed tasks on a first-come, first-served basis; interrupting the kernel midtask to handle another process or function was not in the cards. The v2.6 kernel, however, can be pre-empted when needed, and can allocate resources for a process that requires immediate attention, then resume processing on the interrupted task. These interruptions are measured in fractions of a second, and are not generally noticeable, but rather lend an overall feeling of smoothness to system performance. The v2.6 kernel does not bring Linux to the point of being a real-time operating system, but it goes a long way toward assuring that tasks are addressed and completed when required.
At the core of these enhancements is a new process scheduler. The process scheduler in the kernel divides CPU resources among system processes. The performance of the scheduler directly impacts system responsiveness and process latency. In the v2.6 kernel, the new O(1) scheduler incorporates new algorithms that can substantially increase system performance, especially for interactive tasks. The O(1) scheduler can penalize CPU-hogging processes, improves process prioritization, and provides consistent performance across all processes. Also new in v2.6 are two new I/O schedulers. The scheduler used in the v2.6 kernel by default, the anticipatory scheduler, brings much improved handling of I/O scheduling, ensuring that processes get I/O time when necessary, without unnecessary queuing. Also present is the deadline scheduler, which assigns an expiration to requests using three queues, while the anticipatory scheduler attempts to anticipate process I/O requests before they are actually requested.
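The deadline idea, an expiration per request spread over three queues, can be sketched in a few lines. This is a simplified model rather than the kernel's code: it assumes one queue sorted by sector plus separate read and write FIFOs, and the expiry values and the name ToyDeadlineQueue are illustrative.

```python
from collections import deque
from dataclasses import dataclass

READ_EXPIRE = 0.5   # seconds; illustrative value, assumed for this sketch
WRITE_EXPIRE = 5.0  # seconds; illustrative value, assumed for this sketch

@dataclass(frozen=True)
class Request:
    sector: int
    is_read: bool
    deadline: float

class ToyDeadlineQueue:
    def __init__(self):
        self.sorted_by_sector = []  # queue 1: all requests, ordered by sector
        self.read_fifo = deque()    # queue 2: reads in arrival order
        self.write_fifo = deque()   # queue 3: writes in arrival order

    def add(self, sector: int, is_read: bool, now: float) -> None:
        expire = READ_EXPIRE if is_read else WRITE_EXPIRE
        req = Request(sector, is_read, now + expire)
        self.sorted_by_sector.append(req)
        self.sorted_by_sector.sort(key=lambda r: r.sector)
        (self.read_fifo if is_read else self.write_fifo).append(req)

    def dispatch(self, now: float):
        """Serve in sector order unless the oldest read or write has expired."""
        if not self.sorted_by_sector:
            return None
        req = self.sorted_by_sector[0]
        for fifo in (self.read_fifo, self.write_fifo):
            if fifo and fifo[0].deadline <= now:
                req = fifo[0]  # an expired request jumps the sector order
                break
        self.sorted_by_sector.remove(req)
        (self.read_fifo if req.is_read else self.write_fifo).remove(req)
        return req

q = ToyDeadlineQueue()
q.add(900, is_read=True, now=0.0)
q.add(100, is_read=False, now=0.0)
q.add(200, is_read=False, now=0.0)

order = [q.dispatch(now=0.1).sector,  # nothing expired: lowest sector first
         q.dispatch(now=1.0).sector,  # the read at sector 900 is past its deadline
         q.dispatch(now=1.0).sector]
print(order)
```

Sector ordering keeps seeks short, and the FIFOs put an upper bound on how long any one request can be passed over.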
There has been much debate over the scheduler used in this kernel, and there is support for both schedulers, defined at boot time with options passed to the kernel. The importance of scheduler performance cannot be overstressed. My tests show that the anticipatory scheduler in v2.6 surpasses the v2.4 scheduler handily. Some of my tests show a tenfold performance increase. For instance, a simple read of a 500MB file during a streaming write with a 1MB block size on my Xeon-based test system took 37 seconds with v2.4.23, and 3.9 seconds with v2.6. The deadline scheduler also performs quite well, but may not be as fluid for certain workloads as the anticipatory scheduler. Either way, the new process and I/O schedulers blow v2.4's schedulers out of the water.
In addition to the new scheduler, v2.6 has plenty of other major architectural changes. The module handling code has been completely rewritten, requiring a new set of userspace module utilities and mkinitrd packages to function. These can be found as updates to most major Linux distributions or via download. The new modutils and module kernel code is much smoother than that found in v2.4, and permits a kernel to be compiled without support for module unloading to ensure the integrity of the production kernel.
Clocking the New Kernel
To test the new kernel, I opted for scenarios that would be most appropriate for real-world users. Testing individual portions of the kernel, such as disk I/O, memory management, and so on could be interesting, but what does it mean for the overall system performance? In order to get the big picture, I selected a few tests representative of expected server workloads and used them to compare the performance of the v2.6 and v2.4 kernels.
Tests were run on three separate hardware platforms: Intel Xeon (x86), Intel Itanium (IA-64), and AMD Opteron (x86_64). The x86 tests were conducted on an IBM eServer x335 1U rack-mount server with dual 3.06GHz P4 Xeon processors and 2GB of RAM. The Itanium tests were run on an IBM eServer x450 3U rack-mount server with dual 1.5GHz Itanium2 processors and 2GB of RAM. And the Opteron tests were run on a Newisys 4300 3U rack-mount server with dual 2.2GHz Opteron 848 processors and 2GB of RAM.
The base OS distribution used was Red Hat Linux Enterprise Server v3.0, but the kernel testing relied on custom kernels compiled on each server. The v2.4 tests utilized the official v2.4.23 kernel, and the v2.6 tests utilized the official v2.6.0 kernel. Only the required modules and options were compiled, and there were no other modifications made to the kernels, other than those necessary for compilation on the various platforms, such as the x86_64 patches for AMD64 from x86-64.org.
The file-sharing test was designed to mimic a standard Samba server workload, and is based on Samba v3.0.1 with local authentication. The test harness utilized the smbtorture tools included in the Samba package and was run over Gigabit Ethernet. The tests were conducted with a simulation of 12 SMB clients communicating with a central server. The results of these tests are almost too good to believe.
On the Xeon system, the v2.4 kernel pushed 38.85MBps on average, and the v2.6 kernel pushed 67.30MBps -- a 73 percent improvement. The Itanium tests show similar performance differences between the kernels, giving v2.6 a 52 percent gain, albeit with smaller overall figures. And on the Opteron system, which really showed its muscle in this test, the results were a respectable 49.37MBps on the v2.4 kernel and an impressive 72.92MBps under v2.6, an increase of roughly 48 percent.
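The percentages quoted for the Xeon and Opteron runs can be checked directly from the throughput figures. The helper name percent_gain is just for this example; the Itanium throughput numbers are not given in the text, so they are left out.

```python
def percent_gain(old: float, new: float) -> float:
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0

# Samba throughput in MBps, as reported above (v2.4 versus v2.6).
xeon_gain = percent_gain(38.85, 67.30)
opteron_gain = percent_gain(49.37, 72.92)

print(f"Xeon:    {xeon_gain:.1f} percent")
print(f"Opteron: {opteron_gain:.1f} percent")
```

Both results round to the figures in the article: 73 percent on Xeon and 48 percent on Opteron.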
The performance gains seen in the Samba tests are likely related to the vastly improved scheduler and I/O subsystem in the v2.6 kernel. Disk I/O and network I/O form the core of this test, and the performance improvements in the v2.6 kernel are very visible here.
The database tests were also enlightening. The test scenario was based on MySQL v3.23.58 and was run with the sql-bench test suite provided by MySQL. All tests were run from a remote server to remove the impact of the client suite running on the same server. In these tests the v2.6 kernel handily beat the v2.4 kernel. The numbers in the chart represent the total amount of time it took the systems to complete eight test procedures, but they do not show the individual numbers from each tested procedure. All eight tests in the sql-bench package were run on both kernels on all three hardware platforms.
Across the board, the v2.6 kernel outperformed the v2.4 kernel in the database tests, especially on the Itanium box, where it posted a speed increase of 23 percent (a 519-second lead) over the v2.4 kernel. On the Xeon platform, v2.6 showed almost a 13 percent gain (a 200-second lead) over v2.4. And on Opteron, it registered a 29 percent speed increase (a 415-second lead) over v2.4. The most impressive individual test was table inserts, showing the v2.6 kernel providing a 10 percent performance increase (with a 100-second lead) over v2.4 on Xeon, with even better results found on the Opteron and Itanium platforms.
The Web server tests also showed significant improvement. The static page test used a 21.5KB HTML page with two 25KB images served by Apache 2.0.48. The test was measured in requests per second using Apache's ab benchmarking tool. The Xeon tests show the v2.6 kernel outperforming v2.4 by just under 1,000 requests per second, a 40 percent increase. The Itanium tests showed v2.6 providing a 47 percent performance increase, while the Opteron tests showed a 7 percent increase. It should be noted that the Opteron system outperformed the other two servers by more than 1,000 requests per second with the v2.4 kernel, and the smaller increase may be due to network bandwidth constraints imposed on the server. In retrospect, I believe that if I upped the network connectivity of the Newisys box with bonded Gigabit Ethernet NICs, I could push it even faster.
My Web application tests were conducted using a custom CGI script written in Perl, referencing a MySQL database running on the same system. The script ran a single select on a column in the database, returning 97 rows of eight columns, including one image. Again, Apache's ab was used to measure performance. The overall numbers showed smaller performance increases than the static tests, with the exception of the Opteron tests, but the 14 percent to 22 percent performance increases across all platforms are stellar.
My tests were geared to show the performance differences between the two kernels on each hardware platform, not to compare the platforms. That said, the Opteron's performance was outstanding; both the v2.4 and v2.6 kernels posted impressive results across all tests but most dramatically in the MySQL tests, showcasing the 64-bit support in v2.6. Overall, the v2.6 kernel shows very impressive performance gains over v2.4, itself a well-performing kernel.
While I didn't run into many problems with the v2.6 kernel, there are a few notable issues with the initial release. For example, the drivers for LSI Logic's Fusion-MPT RAID controllers have some serious I/O problems in a RAID1 configuration. When drives are addressed individually, there are no issues, but this is a significant hindrance to v2.6 adopters running with Fusion-MPT RAID controllers. These RAID modules are also problematic in the v2.6 kernel for Opteron, causing a panic unless iommu=merge is passed to the kernel at boot.
Further, on the Xeon platform, the v2.6 kernel compiles straight from the official source without a hitch, but not so on Itanium and Opteron. Although support for these platforms is present in the kernel, patches from specific platform development efforts are required to compile v2.6. Once built, the kernel boots normally, but requires the updated mkinitrd and modutils packages to fully function. Other than the driver-related problems, the v2.6 kernel compiled, booted, and ran without problems on all three platforms, handling with aplomb every test I threw its way.
Where From Here?
Today, the vast majority of production Linux systems run a version of the v2.4 kernel. Those satisfied with the performance and functionality of this kernel are not likely to make any sudden changes. If it ain't broke, don't fix it. IT shops running big databases and other mission-critical applications on v2.4 shouldn't necessarily jump on the bandwagon immediately but should definitely begin testing v2.6. The v2.6 kernel is the new boss, and it behooves any IT department to become familiar with its capabilities and plan for adoption.
And what of the v2.4 kernel? Marcelo Tosatti, the Brazil-based maintainer of the v2.4 kernel, has announced on the LKML (Linux Kernel Mailing List) that once v2.6 is officially released, v2.4 will indeed enter maintenance mode, without further revision or major modification following the imminent release of v2.4.25. This stance has been met with some derision within the kernel development community and also amongst major corporate Linux sponsors. At the crux of the issue are the major changes in the v2.6 kernel and the fact that many manufacturers that continue to release binary-only hardware drivers have been extremely slow to produce drivers for current v2.4 branch kernels, to say nothing of the nascent v2.6 branch.
Also at issue are the fundamental changes in the core of the v2.6 kernel. Most applications that function on v2.4 kernels will continue to do so on v2.6. However, a few of the major changes could affect currently deployed applications. For this reason, Red Hat, the dominant Linux distribution in the United States, has decided to forgo official v2.6 kernel support in its recent Advanced Server and Enterprise Server products, opting to stay with its highly customized v2.4.21 derivative kernel. However, Red Hat has back-ported several key elements of the v2.6 kernel into its v2.4.21 Enterprise Linux kernel, such as support for up to 64GB of RAM, 16 CPUs, IPSec, and NPTL. In this fashion, Red Hat is able to maintain application compatibility while providing what it considers to be the most desired features of the v2.6 kernel.
When building server architectures that could make use of the enhancements of the v2.6 kernel, admins will need to configure and build custom kernels tuned to their specific workloads. The problem with distribution-specific kernels is that they tend to differ greatly from the official kernel releases, both in the default option selections and the patches they include.
On the upside, these kernels are generally very broad in their hardware support, as they are configured and built with nearly every module that could possibly be used to ensure hardware compatibility for target systems. They also tend to include patches that can either increase or decrease performance, depending on the server workload. Admins who run these servers are generally best served to patch, configure, and build a custom kernel for their servers, both to ensure hardware compatibility and to squeeze out performance increases when possible. The base distribution running the server may require some modifications to accept a v2.6 kernel, such as the addition of the new modutils and mkinitrd tools, but should otherwise function normally with a new kernel.
As with any major development effort, bugs remain in the v2.6 kernel, and are being actively pursued by the kernel developers. As of this writing, kernel v2.6.2rc1 is available for download from kernel.org, and it includes various bug fixes and enhancements over the v2.6 kernel released just a few weeks ago. The process continues; those considering a move to v2.6 would be well-advised to test the new kernel thoroughly before any production implementation.
The Linux kernel has come a long way since Linus Torvalds' announcement of v0.01 in 1991. The v2.6 kernel boasts many new features as well as major performance improvements over the v2.4 kernel and is poised to take Linux into the next stage of the game: true enterprise adoption. To continue making inroads into the datacenter, Linux must grow with the needs of the established user base, as well as navigate previously uncharted waters to appeal to those still looking in from outside. The v2.6 kernel appears to be up to the task.
|
|