http://www.kroah.com/linux/talks/ols_2001_hotplug_paper/hotplug.ps The history of hotplug

The history of hotplug.

What is hotplug, what problems does it solve, why do we have the current set of hotplug mechanisms, and what legacy mechanisms did the current hotplug implementation obsolete?

Before hotplug: static everything.

Originally, the kernel had no hotplug capability. A kernel without hotplug manages a fixed set of hardware, all of which is detected and initialized at boot time, and all of which remains present until the system is shut down. This is very simple, but also very limited.

This meant device drivers were statically linked into the kernel image, and the /dev directory was filled with device nodes for every potential device when the system was installed. A program that wanted to probe for available hardware sifted through /dev and opened the devices it found there, eliminating the ones which gave an -ENODEV error.
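The probing loop described above can be sketched in a few lines of C. This is a minimal illustration, not code from any particular program of the era: the function names probe_node() and count_present() are made up for this example, and treating ENXIO the same as ENODEV is an assumption (drivers differ in which of the two they return for absent hardware).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to open one device node. Returns 0 if the open succeeded (something
 * answered), otherwise the errno value that open() reported.
 * probe_node() is a hypothetical helper name. */
int probe_node(const char *path)
{
    assert(path != NULL);
    /* O_NONBLOCK keeps open() from waiting on devices such as serial
     * ports or drives with no media in them. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/* Probe every path in a list, print one line per node, and return how
 * many nodes opened successfully. */
int count_present(const char *const *paths, int n)
{
    int present = 0;
    for (int i = 0; i < n; i++) {
        int err = probe_node(paths[i]);
        if (err == 0) {
            present++;
            printf("%s: present\n", paths[i]);
        } else if (err == ENODEV || err == ENXIO) {
            /* The node exists in /dev, but no hardware is behind it. */
            printf("%s: node exists, no device\n", paths[i]);
        } else {
            /* Missing node, permission problem, and so on. */
            printf("%s: skipped (errno %d)\n", paths[i], err);
        }
    }
    return present;
}
```

Every node has to be opened to learn anything about it, so the cost of a scan grows with the size of /dev rather than with the amount of hardware actually present.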

One reason for this was simplicity, but equally important was that early PCs were not designed around hotpluggable hardware, and Linux started out on a PC. When the PC was introduced, users couldn't even switch keyboards while the machine was on without risking hardware damage.

[FOOTNOTE]Of course users widely ignored this constraint whenever they could. Users hotplugged keyboards, serial, and parallel devices all the time, no matter what the manufacturer said, and by the late 80's most hardware developers had adapted to reality and buffered their more vulnerable external I/O ports. But the number of keyboard, serial, and parallel ports on the machine remained fixed, and each port could handle only one device at a time, so drivers focused on handling ports and left figuring out what device was behind an I/O port to the userspace application trying to talk to that device.[/FOOTNOTE]

Removable media

The original PC did have one type of early hotplug: it had removable storage media in the form of floppy drives (and later, CD-ROM drives, zip disks, and DVD drives). The first stirrings of hotplug support came from Linux's need to cope with removable media.

With removable media, the contents of the corresponding block devices changed, including even the size of the media represented by those block devices. Since filesystems could depend on those block devices, and processes depended on those filesystems, in extreme cases ejecting a floppy could lead to a kernel panic.

The kernel's response to this was to ignore as much of the problem as possible, and work around the rest. Removable media was treated as a special case, and the kernel grew workarounds rather than any real systematic solution to a larger generic problem.

Since the drives themselves stayed around awaiting the insertion of new media, media were treated as a property of drives, and most drives could have exactly one instance of removable media in them at a time anyway. [FOOTNOTE]There were "jukebox" style multi-CD changers, but they were poorly supported and mostly treated like a single drive with multiple partitions.[/FOOTNOTE] So device drivers used device nodes representing the drive instead of the media, and when the drive contained no media the driver would respond to attempts to access the drive's device node with error codes.

Poor hardware support for hotplug continued to be a problem: most removable media provided no notification mechanism to inform the system when media was inserted or removed. The driver could probe the device to see what media it contained at any given moment, but no interrupt was generated to signal changes. Thus the kernel had no way to respond to a block device's removal except via extensive error handling after the fact.

Applications using removable media probed for them or received error codes on attempted access to an empty drive, and the kernel developers' advice about dealing with the problems of removing a mounted volume was "don't do that": inserting or removing media when the system didn't expect it was dismissed as user error.

A workaround: drive locking

To avoid being surprised by the unexpected removal of media containing a mounted filesystem, some drives grew the ability for software to "lock" a drive, preventing it from ejecting its media until it was unlocked. (Pressing the eject button still didn't generate an interrupt; it simply had no effect until the software unlocked the drive.) This let the operating system force users to eject removable media from software (via the "eject" command) rather than by pressing the button on the drive, allowing the OS to safely use the block device at the expense of annoying users.
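On a current Linux system, the lock-then-eject sequence looks roughly like the sketch below. It uses the CDROM_LOCKDOOR and CDROMEJECT ioctls from <linux/cdrom.h> as they exist in modern kernel headers, which is an assumption of convenience: it is not the Linux 1.0 interface mentioned later in this section. The helper names set_door_lock() and eject_media() are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/cdrom.h>

/* Lock (lock != 0) or unlock (lock == 0) the door of a CD-ROM drive.
 * Returns 0 on success, or the errno value of the call that failed.
 * set_door_lock() is a hypothetical helper name. */
int set_door_lock(const char *dev, int lock)
{
    assert(dev != NULL);
    /* O_NONBLOCK lets the open succeed even when the drive is empty. */
    int fd = open(dev, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return errno;
    int err = (ioctl(fd, CDROM_LOCKDOOR, lock ? 1 : 0) < 0) ? errno : 0;
    close(fd);
    return err;
}

/* Unlock the drive (best effort) and ask it to eject its media.
 * Returns 0 on success, or the errno value of the call that failed. */
int eject_media(const char *dev)
{
    assert(dev != NULL);
    int fd = open(dev, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return errno;
    /* A locked drive refuses to eject, so try to unlock it first.
     * The result is ignored: not every drive supports locking. */
    (void)ioctl(fd, CDROM_LOCKDOOR, 0);
    int err = (ioctl(fd, CDROMEJECT, 0) < 0) ? errno : 0;
    close(fd);
    return err;
}
```

While the door is locked the button on the drive does nothing, so a software request along the lines of eject_media() is the only way to get the media out.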

By locking the drive to prevent unauthorized eject, the hotplug-less kernel avoided having to unmount filesystems on short notice. This meant the kernel didn't have to promptly flush buffers when data was written to the device (to minimize unmount time), and that the kernel could veto attempts to unmount a filesystem that was still in use for any reason, such as a process having open files in that filesystem or using one of its directories as a current directory. (Yes, even though a process's current directory could be deleted, the filesystem containing it couldn't be unmounted. Not for any deep technical reason; support for it simply hadn't been implemented. Removable media was a poorly supported afterthought.)

Of course some drives (most notably PC floppy drives) had no provision for locking the drive; ejecting a floppy was a manual process controlled by a purely mechanical button. Users who didn't remember to manually unmount a floppy lost data, and were largely mocked as clueless by traditional Unix developers (or else PC hardware was mocked for not having drive locking support). The "mtools" package provided a popular workaround, reading and writing FAT files directly through a floppy disk's unmounted block device, probing for media before each command, and flushing all data after each command. (It even accepted DOS-style names for floppy drives.)
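The pattern of probing before each command and flushing after each command can be sketched as follows. This is not mtools' actual code, only an illustration of the idea under some assumptions: write_and_flush() is a hypothetical helper, the open() call doubles as the media probe, and fsync() stands in for whatever flushing a real tool performs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes from buf at byte offset off of an unmounted device,
 * then force the data out before returning. Returns 0 on success or an
 * errno value. write_and_flush() is a hypothetical helper name. */
int write_and_flush(const char *dev, off_t off, const void *buf, size_t len)
{
    assert(dev != NULL && buf != NULL);
    /* Opening the device acts as the probe: it fails if nothing usable
     * is there. */
    int fd = open(dev, O_WRONLY);
    if (fd < 0)
        return errno;

    int err = 0;
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = pwrite(fd, p, left, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, try again */
            err = errno;
            break;
        }
        if (n == 0) {               /* no progress, give up */
            err = EIO;
            break;
        }
        p += n;
        off += n;
        left -= (size_t)n;
    }

    /* Flush before returning, so that nothing is left in kernel buffers
     * if the user pulls the disk out right after the command finishes. */
    if (err == 0 && fsync(fd) < 0)
        err = errno;
    if (close(fd) < 0 && err == 0)
        err = errno;
    return err;
}
```

Because the device is opened, written, flushed, and closed within a single call, there is no window between commands in which removing the disk loses buffered data.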

As late as Linux 1.0, support for eject was still a special case for CD-ROM drives (/include/linux/cdrom.h had a "#define CDROMEJECT"), and door locking support was a special case for SCSI (/drivers/scsi/scsi_ioctl.h #defined SCSI_IOCTL_DOORLOCK). As late as Linux 2.4, the underlying problems with dynamically unplugging block devices were considered too hard to properly solve.

Modules

The first serious hotplug mechanism in Linux was modules, because modules allow device drivers to be loaded after the kernel boots and unloaded again before shutdown. Hotplug was a side effect, since most hardware used by Linux still wasn't hotpluggable. The primary motivation for modules was reducing the memory footprint of kernels, which was increasing due to the proliferation of device drivers. A generically configured kernel, such as those in the emerging Linux distributions, needed to be built with drivers for every piece of hardware it might encounter, but most systems it ran on would use only a small subset of those drivers.

Modules meant that device drivers could probe for hardware present when the module was inserted, after the kernel booted. A module that failed to find any devices (of the kind it contained a device driver for) could refuse to load, so attempting to load all modules was a simple way of probing for available hardware. Modules could even be removed and re-inserted to scan for and handle new hardware.

This was an improvement, but not a complete solution. Using modules as the primary hotplug mechanism quickly reveals numerous deficiencies: the granularity is wrong, there are insufficient notification mechanisms, and it doesn't handle configuration issues like device nodes.

The granularity is wrong because a module encapsulates a device driver, and one instance of a driver can manage multiple instances of a device. When inserting a second instance of a device into a system (a second ethernet card, for example), removing the module to reinsert it (and thus find the new instance of the device) takes the first instance of the device offline. This is an extremely unpleasant side effect. Avoiding it requires a way to tell an already-loaded module to rescan for devices, and the mechanisms for rescanning busses were ad hoc.

The notification mechanisms are insufficient because a separate mechanism is still needed to trigger module loading. Module loading either has to happen _after_ device insertion, or the module has to be told to rescan after it is loaded. Modules also don't handle unplug, where the notification problem is worse: ideally the system needs to know before the device goes away, so it can flush buffers, unmount filesystems, close file handles, and otherwise clean up before unloading the module.

Modules don't handle /dev entries either. Filling up /dev with every possible device could mean thousands of entries (every possible partition on every possible hard drive...). How does userspace know which ones are active? In a static /dev, the presence of an entry gives no information about whether or not the device is there. Userspace has to iterate through the lot and test each one, which is very slow, runs into timeouts, and generates spurious activity that can have unwanted side effects like spinning up drives.

PCMCIA

The arrival of PCMCIA (a hotpluggable 16-bit expansion card bus for early laptops) provided the first , USB, laptop docking stations,

static everything.

modules

The /proc directory

Rescan scsi bus via /proc/scsi http://bash.cyberciti.biz/diskadmin/rescan-scsi-bus.sh.php

laptops (pcmcia)

USB

Added in 2.2.7, written by Linus Torvalds throwing out earlier work.

devfs

Having the driver detect the presence of hardware is backwards. Driver loads in response to device being plugged in, but the driver detects the existence of the device... chicken and egg problem.

sysfs

Finally, the modern approach. /sbin/hotplug vs netlink

Also, devfs provides /dev entries, but that's the wrong layer. Some hardware provides multiple /dev entries (partitioned hard drives), some dev entries have no underlying hardware (/dev/zero, /dev/null, network block devices), and some devices have no /dev entry (ethernet, for historical reasons).

Static drivers, modules, and hotplug.

A kernel without any hotplug capability manages a fixed set of hardware, all of which is detected and initialized at boot time, and all of which remains present until the system is shut down. The Linux module loading mechanism allowed drivers to be loaded after the system boots

Early atte

Hotplug allows the kernel to dynamically respond to the addition or removal of hardware.

At boot time, /sbin/hotplug or netlink.

Support for hotplug in Linux evolved out of Linus's rewrite of USB

note: What's the probe for removable media? (ioctl()?)