Thumbnail oversimplification of processors:
The ARM processor is popular in the embedded space because it has the best power consumption to performance ratio, meaning it has the longest battery life and smallest amount of heat generated for a given computing task. It's the standard processor of smartphones. The 64 bit version (ARMv8) was announced in 2011 with a 2014 ship date for volume silicon.
Processor vs architecture
Although ARM hardware has many different processor designs with varying clock speeds, cache sizes, and integrated peripherals, from a software perspective what matters is ARM architectures, which are the different instruction sets a compiler can produce. The architecture names have a "v" in them and the processor designs don't, so "ARM922T" is a hardware processor design which implements the "ARMv4T" software instruction set.
The basic architectures are numbered: ARMv3, ARMv4, ARMv5, ARMv6, and ARMv7. An ARMv5 capable processor can run ARMv4 binaries, ARMv6 can run ARMv5, and so on. Each new architecture is a superset of the old ones, and the main reason to compile for newer platforms is efficiency: faster speed and better battery life. (I.E. they work about like i386, i486, i586, and i686 do in the x86 world. Recompiling ARMv4 code to ARMv5 code provides about a 25% speedup on the same hardware.)
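As a toy illustration (the architecture names are just labels here, not anything a real toolchain consumes), the compatibility rule amounts to a simple ordering check:

```python
# Sketch of the backwards-compatibility rule described above: a CPU
# implementing a newer ARM architecture runs binaries built for any
# older one, like i386/i486/i586/i686 in the x86 world.
ARM_ARCHES = ["ARMv3", "ARMv4", "ARMv5", "ARMv6", "ARMv7"]

def can_run(cpu_arch, binary_arch):
    """A CPU can run a binary targeting the same or an older architecture."""
    return ARM_ARCHES.index(cpu_arch) >= ARM_ARCHES.index(binary_arch)

print(can_run("ARMv5", "ARMv4"))  # True: newer hardware, older binary
print(can_run("ARMv4", "ARMv5"))  # False: older hardware can't run newer code
```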
The oldest architecture this compatibility goes back to is ARMv3 (which introduced 32-bit addressing), but that hardware is now obsolete. (Not just no longer being sold, but mostly cycled out of the installed base.) The oldest architecture still in use is "ARMv4", which should run on any ARM hardware still in use today (except ARMv7M, which is ARM in name only: it only implements the Thumb2 instruction set, not traditional ARM instructions).
ARM architectures can have several instruction set extensions, indicated by letters after the ARMv# part. Some (such as the letter "J" denoting the "Jazelle" extension, which provides hardware acceleration for running Java bytecode) can safely be ignored if you're not using them, and others are essentially always there in certain architectures (such as the DSP extension signified by the letter "E", which always seems to be present in ARMv5). But some are worth mentioning:
The "Thumb" extension (ARMv4T) adds a smaller instruction set capable of fitting more code in a given amount of memory. Unfortunately Thumb instructions often run more slowly, and the instruction set isn't complete enough to implement a kernel, so it supplements rather than replaces the conventional ARM instruction set. Note that all ARMv5 and later processors include Thumb support by default; only ARMv4T offers it as an extension. The newer "Thumb2" version fixes most of the deficiencies of the original Thumb instruction set (you _can_ do a kernel in that), and is part of the ARMv7 architecture. The ARMv7M variant (the M is for Microcontroller) supports nothing _but_ Thumb2, abandoning backwards compatibility with any other ARM binaries.
The VFP (Vector Floating Point) coprocessor provides hardware floating point acceleration. There are some older hardware floating point options, and some newer ones backwards compatible with VFP, but in general you can treat a chip as either "software floating point" or "VFP".
The other detail is "l" vs "b" for little-endian and big-endian. In theory ARM can do both (this is a compiler distinction, not a hardware distinction), but in practice little-endian is almost universally used on ARM, and most boards are wired up to support little-endian only even if the processor itself can theoretically handle both.
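A quick illustration of what the byte-order distinction actually means, with Python's struct module standing in for the compiler's choice of memory layout:

```python
import struct

# The same 32-bit value laid out in memory both ways. On ARM this is a
# compile-time choice (a little-endian vs big-endian toolchain), not a
# hardware jumper.
value = 0x12345678
little = struct.pack("<I", value)  # low byte first in memory
big = struct.pack(">I", value)     # high byte first in memory
print(little.hex())  # 78563412
print(big.hex())     # 12345678
```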
So for example, "armv4tl" is ARMv4 with the Thumb extension, little-endian. This is the minimum requirement to use EABI, the current binary interface standard for ARM executables. (The older OABI is considered obsolete.)
Application Binary Interface
Linux initially implemented a somewhat ad-hoc ABI for ARM, with poor performance and several limitations. When ARM got around to documenting a proper replacement, the main downside of the new ABI was that it was incompatible with the old binaries.
So ARM has two ABIs that can run on the same hardware: the old one is called OABI and the new one is EABI. (This is a bit like the way BSD binaries won't run under Linux even though the hardware's the same, or the long-ago switch from a.out to ELF executable formats.)
The oldest hardware that can run EABI is ARMv4T, so ARMv4 hardware without the Thumb extensions still has to use OABI, which is why you don't see much of that anymore. The kernel, C library, and compiler must all agree which ABI is in use or the binaries won't run. The transition to EABI happened somewhere around 2005, and these days everything is EABI.
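If you're curious which ABI a given binary uses, the EABI version lives in the top byte of the ELF header's e_flags word (0 means OABI, 5 means the current EABI). Here's a rough sketch of the check against a hand-built fake header, since the offsets are what matter; offsets shown are for a 32-bit little-endian ELF file, and EM_ARM is machine type 40:

```python
import struct

EF_ARM_EABIMASK = 0xFF000000  # top byte of e_flags holds the EABI version

def arm_abi(elf_bytes):
    # e_machine is a 16-bit field at offset 18 in a 32-bit ELF header.
    machine, = struct.unpack_from("<H", elf_bytes, 18)
    assert machine == 40, "not an ARM binary (EM_ARM == 40)"
    # e_flags is a 32-bit field at offset 36.
    flags, = struct.unpack_from("<I", elf_bytes, 36)
    version = (flags & EF_ARM_EABIMASK) >> 24
    return "OABI" if version == 0 else "EABI v%d" % version

# Fake minimal 52-byte ELF32 header for illustration only.
header = bytearray(52)
struct.pack_into("<H", header, 18, 40)          # e_machine = EM_ARM
struct.pack_into("<I", header, 36, 0x05000000)  # e_flags: EABI version 5
print(arm_abi(bytes(header)))  # EABI v5
```

In practice `readelf -h` on a real binary shows the same information in its Flags line.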
In theory the best reference for ARM processors is the website of ARM Limited, but unfortunately they make it somewhat hard to find information about their older products and many of their pages are more concerned with advertising their newest products than giving the requested information. Wikipedia may be less frustrating, and occasionally even accurate.
For perspective, here are the dates each processor's architecture directory was added to the Linux kernel:

|architecture|added to Linux|
|alpha|Nov 27 1994|
|sparc|Dec 4 1994|
|mips|Jan 12 1995|
|m68k|Mar 7 1996|
|arm|Jan 20 1998|
|sh|Jun 30 1999|
|ia64|Jan 4 2000|
|s390|Mar 10 2000|
|parisc|Oct 3 2000|
|cris|Jan 30 2001|
|um|Sep 11 2002|
|h8300|Apr 17 2003|
|m32r|Sep 16 2004|
|frv|Jan 4 2005|
|xtensa|Jun 23 2005|
|powerpc|Sep 19 2005|
|avr32|Sep 25 2006|
|blackfin|May 6 2007|
|x86|Oct 11 2007|
|mn10300|Feb 8 2008|
|microblaze|Mar 27 2009|
|score|Jun 12 2009|
|tile|May 28 2010|
|unicore32|Feb 26 2011|
|openrisc|Jun 4 2011|
|c6x|Oct 4 2011|
|hexagon|Oct 31 2011|
|arm64|Mar 5 2012|
|metag|Dec 5 2012|
|arc|Jan 18 2013|
This doesn't quite tell the full story: the x86 and powerpc directories were created by merging together 32 bit and 64 bit versions of the same architectures: the i386 and x86_64 directories for x86, and the ppc and ppc64 directories for powerpc. (A similar merge of arm and arm64 is expected when ARMv8 support stabilizes.)
The resulting dates are when the merge happened; the corresponding architectures were added much earlier.
In the 1950's and 60's, mainframe and minicomputer processors took up an entire circuit board. Unix started on these kinds of systems; few of them remain in use today.
In 1971 Intel engineers produced the first microprocessor, the Intel 4004, by being the first to squeeze all the functions of a CPU onto a single silicon "chip". As transistor budgets increased, Intel upgraded the 4004's design into the 8008 and then the 8080, the chip inside coin-operated Space Invaders machines and the MITS Altair. The Altair was widely cloned to form the first family of microcomputers, which contained (and were named after) the S-100 bus, were programmed in BASIC from a startup called Micro-soft, and ran an OS called CP/M from a startup called Digital Research.
One of the Intel engineers left to form his own company, Zilog, which made an 8080 clone called the Z80. But the main alternative to the 8080 came from some ex-Motorola engineers who left to join MOS Technology, the company that did the (much cheaper) 6502 processor, with its own instruction set. Motorola sued the escaped engineers for being better at it than they were, and in the end the suit was settled and Commodore bought the company to use its processors in the VIC-20 (the first computer to sell a million units) and its successor, the Commodore 64 (even more popular). The 6502 also wound up running the Apple I and Apple II, and the first big home game console (the Atari 2600).
The march of Moore's Law quickly drove the microcomputer world beyond 8-bit processors. Moore's Law says that memory size (for a given price) doubles every 18 months, which quickly became a self-fulfilling prophecy hardware manufacturers used to manage inventory and schedule product releases.
The first widely cloned microcomputer (the MITS Altair) was introduced in 1975, and by the middle of that year Altair-compatible systems had established a range of 1-4 kilobytes of installed memory. (The original Altair had only 256 bytes of memory, but that wasn't enough, and 1k expansion boards immediately became standard equipment. The first version of its de-facto standard programming language, Micro-soft BASIC, required 7k of memory to run, but that was too expensive so they trimmed it down to run in 4k.) From then on Moore's Law turned the old high end into the new low end every 3 years. (In 1978: 4-16k. In 1981: 16-64k.)
These early microcomputers were called 8-bit machines because they had 8-bit registers, storing values from 0 to 255. But they used pairs of registers to access memory, providing 16 bits (64k) of address space. That was enough for four 18-month Moore's Law doublings before the high end of the microcomputer address range hit the 64k address space limit in mid-1981.
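The register-pair trick is just shifting and OR-ing, which we can sketch in a few lines (the register names are generic, not tied to any particular chip):

```python
# An 8-bit CPU forming a 16-bit memory address from two 8-bit registers.
def pair_address(high, low):
    assert 0 <= high <= 0xFF and 0 <= low <= 0xFF
    return (high << 8) | low

print(hex(pair_address(0xFF, 0xFF)))  # 0xffff -- the 64k ceiling
print(hex(pair_address(0x12, 0x34)))  # 0x1234
```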
The result was a switch to 16-bit systems. IBM introduced its PC in August 1981, based on Intel's 8086 processor (actually a variant called the 8088 that ran the same software but fit in cheaper 8-bit motherboards and took twice as many clock cycles to do anything). True to form, it offered anywhere from 16k to 64k of memory preinstalled.
The main competitor to the 8086 was Motorola's 32-bit 68000 line of processors, used in just about everything except the PC (Macintosh, Amiga, Sun workstations...). Just as Intel's 8086 was a sequel to the 8080, Motorola's 68k was a sequel to its earlier 6800.
Motorola skipped 16-bit registers and jumped straight to 32 bits, but back when 64k was as much memory as most high-end users could afford (costing hundreds of dollars) this provided no immediate market advantage. The 68k powered Apple's Macintosh (and several incompatible designs such as Commodore's Amiga and Sun's original Unix workstations). But Apple successfully defended its closed hardware design in court, while IBM failed to stop Compaq's clean-room reimplementation of the PC's BIOS, spawning an army of clones that quickly marginalized the proprietary hardware designs and made the PC the standard computer.
The 8086 also used pairs of registers to access memory, but instead of combining them into 16+16=32 bits of address space it overlapped their ranges: each "segment" started just 16 bytes past the previous one, so the 64k reachable from each segment mostly redundantly addressed the same memory as neighboring segments. That yielded 16+4=20 bits of address space, for 1 megabyte. (The 640k limit of DOS was because the top third of that address range was reserved for I/O memory, starting with the video card's frame buffer.)
The continuing advance of Moore's Law meant high-end PCs would collide with the 1 megabyte limit in 1987. To prepare for this, Intel introduced its own 32-bit processor, the 80386, in 1985. Unfortunately IBM had bought the entire first year's production run of Intel's previous processor (the 80286) to keep it out of the hands of the PC cloners, and Moore's Law quickly left IBM with a giant unsold pile of slow, expensive processors. This delayed IBM's introduction of new 32-bit PCs until after the original cloner shipped the "Compaq Deskpro 386" and the rest of the clones followed suit, leaving IBM's PCs in obscurity.

32 bits and RISC
In the 1980's and 90's, a new technology called RISC led to a gold rush of processors hoping to take market share away from Intel. These RISC designs came from three sources: start-up companies producing new RISC processors, surviving mainframe and minicomputer vendors redesigning their big iron around microprocessors, and m68k users giving up on that design once Intel's move to 32 bits removed the only argument in its favor. (If m68k had lost out to 16-bit x86, clearly it was even less interesting after the 386 came out.)
RISC designs ("Reduced Instruction Set Computers") simplified processor instruction sets down to fixed-length instructions that only took one clock cycle. This led to more verbose machine code which took up more memory, but also meant that since you didn't have to execute the previous instruction to figure out where the next one started (because they're all the same size), processors could contain a second "execution core" looking over the shoulder of the first core to potentially execute the next instruction in the same clock cycle (if it didn't use the same registers, memory locations, or depend on processor flags set by the previous instruction). Once compilers advanced to produce instructions in non-interfering pairs, RISC processors added a third core to potentially execute a third instruction in the same clock cycle (if that one didn't interfere with the first two).
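A toy sketch of why fixed-length encoding helps (the byte values and length table here are invented, not real opcodes): with fixed-size instructions every boundary is known up front, so multiple instructions can be fetched and considered in parallel, while variable-length code must be decoded serially to find where each instruction ends.

```python
# Fixed-length (RISC-style): instruction starts need no decoding at all.
def risc_boundaries(code, size=4):
    return list(range(0, len(code), size))

# Variable-length (CISC-style): must decode each instruction's length
# before the next one's start is known.
def cisc_boundaries(code, length_of):
    starts, i = [], 0
    while i < len(code):
        starts.append(i)
        i += length_of(code[i])  # decode just enough to learn the length
    return starts

print(risc_boundaries(b"\x00" * 12))  # [0, 4, 8]
# Made-up opcode->length table: opcode 1 is 1 byte, 3 is 3 bytes, 2 is 2.
print(cisc_boundaries(b"\x01\x03\x00\x00\x02\x00", {1: 1, 3: 3, 2: 2}.get))  # [0, 1, 4]
```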
Several processor manufacturers were convinced that RISC designs were superior to the older designs (which they called "CISC", for Complex Instruction Set Computers). The commodity PC had taken over the market, running a large installed base of CISC software, but the cloners were sure that during the 16 to 32 bit transition they could capture the market when everyone had to throw out their old software anyway.
MIPS was one early RISC startup, created by university professors commercializing their early RISC research. The ARM design came from Acorn, a tiny start-up in the UK best known for the machine it sold as the "BBC Micro" under a contract with the British Broadcasting Corporation, in conjunction with BBC educational programming; the Acorn RISC Machine was designed as the successor to that machine's 6502.
At the other end of things, IBM engineers produced the Power minicomputer architecture, Hewlett Packard developed PA-RISC, Hitachi developed the SuperH (more commonly known by its sh4 variant), and Digital Equipment Corporation migrated its VAX minicomputers to its new Alpha processor (one of the first 64-bit processors).
Sun Microsystems developed the SPARC processor to replace m68k, but the official successor to m68k came from a partnership between Apple, Motorola, and IBM, which redesigned IBM's Power processor to produce PowerPC. (Apple's Macintosh redesign brought its m68k vendor Motorola together with IBM, which at the time had the world's leading microchip manufacturing facilities. The 90's were a great decade for IBM's microprocessor manufacturing arm: first to replace microchips' traditional aluminum wiring with copper, first to layer "silicon on insulator" to improve power efficiency, first gigahertz processor...)
When the PC market smoothly transitioned to the 80386, the RISC proponents remained sure that better technology would eventually prevail over market forces. The 386 was an extension of the 8086, which was an extension of the 8080, which was an extension of the 8008. And the 386 wasn't just similar to previous designs to ease porting, it actually implemented a compatibility mode to fully emulate the previous chip and run its software unmodified! Surely this long chain of backwards compatibility, which had accumulated customers for 20 years until it snowballed into market dominance, had to collapse from accumulated complexity at some point?
Next Intel came out with the 486, which introduced on-chip CPU cache, allowing "clock doubling" and "clock tripling" to run the processor faster than the rest of the motherboard: the processor could execute loops of code in cache while the slower main memory delivered the next part of the program and stored the results of earlier computations. But the RISC proponents saw this as merely buying time, sure that RISC would still win the day.
Then Intel introduced the Pentium line, which (from the Pentium Pro on) translated CISC instructions into RISC-like "micro-ops" internally. It could run as fast as RISC designs (executing multiple instructions per clock cycle) while remaining compatible with existing software.

Mobile processors
Intel did hit a bit of a wall at 1 gigahertz, and produced a horrible VLIW design (Itanium) as its planned 64-bit transition, but AMD filled the gap with other x86-compatible designs and a sane 64-bit extension of the x86 architecture (x86-64), and Intel's customers eventually forced it to adopt AMD's 64-bit design.
Intel and AMD competed on two metrics: absolute performance, and the price to performance ratio. They tried to produce the fastest chips, and the fastest chip for the money.
But in the late 90's laptops became a viable PC option, and in 2005 laptops outsold desktops. These systems are battery powered, and care about another metric: the power consumption to performance ratio, i.e. the best performance for a given battery life. In 2000 a startup called Transmeta (most widely known for employing Linus Torvalds in his first job out of college) proved that a processor consuming just 1 watt could provide reasonable performance. (Meanwhile Intel's Itanium and Pentium 4 processors consumed around 90 watts, enough power to fry an egg on the heat sink.)
The processor with the best power consumption to performance ratio was ARM, which came to dominate cell phones and handheld systems. When those converged into smartphones, ARM retained its dominant position.
More recently introduced processor designs have attempted to compete with ARM rather than Intel: Coldfire was a stripped-down m68k, and newer designs include Blackfin, Hexagon, Tile, and so on. So far these are doing about as well against ARM as the RISC gold rush did against x86.
|Copyright 2002, 2011 Rob Landley <firstname.lastname@example.org>|