Active Projects

You can read my development log (or subscribe to its RSS feed) to find out what I've been doing recently.

I maintain the following projects, each with its own Mercurial archive, web page, and mailing list:

Aboriginal Linux

This set of shell scripts builds a complete bootable Linux system from source code for several different target platforms (arm, mips, powerpc, x86-64, etc.). It's self-contained (builds its own cross-compilers from source code it downloads), modular, and the scripts are designed to be easy to read (so you can see/change what it's doing).

The root filesystem created by Aboriginal Linux is designed to be the smallest possible complete development environment capable of rebuilding itself from source code. Currently this involves seven packages (busybox, uClibc, the Linux kernel, gcc, binutils, make, and bash).

The downloads directory offers prebuilt cross compilers, root filesystems, and boot images (a tarball containing a kernel, ext2 image, and shell script to invoke qemu) for each supported hardware platform. The default root filesystems contain gcc, so you can avoid this whole cross-compiling thing and just compile software inside your emulator (which is slow but simple).

I've been working on this project on and off since 1999; it's what got me into BusyBox and uClibc and compilers and so on. It's where I put together everything else I'm doing (like toybox and tinycc) to see what actually works and give it a good stress-test.

toybox

After I stopped working on BusyBox I started over from scratch, and this is the result. The goal of toybox is a small, simple, complete implementation of the standard Linux command line utilities, with minimal external dependencies. You could think of it as "BusyBox lite", or perhaps as a fleshed out version of Red Hat's "nash", but really it's another project in the same category.

In 2011 I changed the project's license to 2-clause BSD and began explicitly targeting Android as a deployment platform. I want to use it to allow Android to become a self-hosting development environment, possibly in combination with a USB docking station.

Tinycc

After Fabrice Bellard stopped working on his Tiny C Compiler in 2005 (due to the success of QEMU taking up all his time), I stepped in and started collecting patches. Eventually this turned into a separate project, hosted here.

Tinycc is small, simple, and very fast. It has the ability to turn C into a scripting language by adding "#!/usr/bin/tcc -run" to the start of a C source file, and it's also used as a shared library to add dynamic code generation capabilities to other programs. But my interest is in turning it into a fully functional compiler capable of compiling any arbitrary Linux source package (including the kernel).

Ultimately, to highlight that Linux does NOT have a "GNU/" prefix on it (like GNU/Solaris "indiana"), I want to build a version that doesn't even use any GNU code in the build process, let alone in the final system. This system should be able to completely rebuild itself, under itself, without any GNU code on the hard drive. To do that, I need a compiler that can build a bootable Linux kernel, uClibc, toybox, and itself. Three compilers have successfully built the Linux kernel in the past: tcc, icc, and gcc. When tcc was abandoned, I started maintaining tinycc.

Tinycc can already rebuild itself (for x86 and arm targets), and has previously built a modified subset of an older (2.4) Linux kernel. I'm upgrading it to work on more hosts (such as my x86-64 laptop), support more targets (x86-64, mips, powerpc...), and to build more software (especially a current unmodified 2.6 Linux kernel).

This project is on hold. I need to replace its code generator with TCG from QEMU, and break it up into a swiss-army-knife binary that can be called as "cc", "ld", "as", "strip", and so on, as appropriate. I have some notes on qcc.
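As a rough sketch of the swiss-army-knife idea (not code from tinycc or qcc), a single binary can look at the name it was invoked under and pick its behavior from that. The enum values and the names pick_tool() and base_name() below are hypothetical, made up for illustration; only the tool names "cc", "ld", "as", and "strip" come from the text above.

```c
#include <string.h>

/* Hypothetical tool identifiers; the names mirror the examples in the text. */
enum tool { TOOL_UNKNOWN, TOOL_CC, TOOL_LD, TOOL_AS, TOOL_STRIP };

/* Return the last path component of argv[0], e.g. "/usr/bin/ld" -> "ld". */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}

/* Decide which tool to behave as, based on the name we were invoked under. */
enum tool pick_tool(const char *argv0)
{
    static const struct { const char *name; enum tool tool; } table[] = {
        { "cc", TOOL_CC }, { "ld", TOOL_LD },
        { "as", TOOL_AS }, { "strip", TOOL_STRIP },
    };
    const char *name = base_name(argv0);
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(*table); i++)
        if (!strcmp(name, table[i].name)) return table[i].tool;

    return TOOL_UNKNOWN;
}
```

A main() would call pick_tool(argv[0]) and jump to the matching code, with symlinks named cc, ld, as, and strip all pointing at the one binary. This is the same trick BusyBox and toybox use for their commands.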


Older Stuff

filchmail

This is a small shell script (around 550 bytes) that can download your email. It's a replacement for fetchmail, although instead of using a POP or IMAP server it uses ssh and mailx.

When you run the script on your laptop, it does a passwordless ssh to the server (you need to have that set up already) and sends a script (a "here" document) that runs the "mail" command to copy stuff from your mail spool file into the mbox file in your home directory (still on the server), then transfers it to your laptop via a gzip pipeline. The script running on the laptop then sshes to the server a second time to run the "delete" command if it got a nonzero length file.

It could use some more error checking, but it's small and simple and it works for me. It depends on ssh, /bin/sh, mail, sed, head, tail, ls, gzip, gunzip, and echo. It assumes that an ssh public/private key pair has already been set up, and that sendmail and the mail command on the remote machine know how to talk to each other already.

micro-bunzip

While playing around with the busybox project, I got interested in bunzip, and the result is a working decompression engine in less than 7k (when compiled with -Os, anyway).

Here's the link to version 3.0 of the code. It's about 10% slower than the original, but much smaller and more readable. The current code, version 4.1, is more optimized (10% faster than the original) but slightly less readable.

If you want to tailor it for some embedded purpose, it should be possible to strip it down even more. For example, it has both code for reading from and writing to files, and code to decompress to/from memory buffers. Any given use is likely to only need one of those...

The sample main() in the file decompresses stdin to stdout. Look at uncompressStream() to see how to use the engine. If you want to decompress from a memory buffer instead of a file, feed a pointer and length to start_bunzip and a file handle of -1. (Notice that this buffer must contain the contents of a complete bzip file: it won't ask for more data at the end; it returns an unexpected input EOF error instead.) If you want to write chunks of decompressed data into a memory buffer instead of a file, call write_bunzip_data with a pointer and length (the file handle will be ignored if len is nonzero). (Currently, requesting more than IOBUF_SIZE bytes of data from write_bunzip_data triggers a flush to the file handle. Memo: did I remember to fix that?)

Note: Newer versions of this code are used by BusyBox and by the Linux kernel's bzip mode. Toybox also uses this code, where it's under the toybox license and has a couple of important fixes as well as refactoring to make a multi-threaded implementation easier to do.

micro-bzip

Compression-side support. Email me if you care and I'll move it up on the to-do list.

thwim.so

An LD_PRELOAD library to prevent things like vim from waiting around for nonessential data (such as vim's log file) to hit disk. On a loaded system, this can easily take 30 seconds, and it does it every few hundred keystrokes (including cursoring around!).
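
To show the general shape of such a library, here is a minimal sketch. It is not the thwim.so source, and it assumes the stalls come from the program calling fsync() or fdatasync(); which calls thwim.so actually intercepts isn't stated here. The preloaded versions simply report success without waiting for the disk.

```c
#include <unistd.h>

/* Assumption: the waiting happens in fsync()/fdatasync(). When this file is
 * built as a shared library and preloaded, these definitions are found
 * before the C library's, so the program's sync calls return immediately. */
int fsync(int fd)
{
    (void)fd;   /* nothing is flushed; the kernel writes the data back later */
    return 0;   /* claim success */
}

int fdatasync(int fd)
{
    (void)fd;
    return 0;
}
```

Assuming the file is saved as nosync.c (a made-up name), it can be built with "cc -shared -fPIC -o nosync.so nosync.c" and used as "LD_PRELOAD=./nosync.so vim somefile". The trade-off is that data the program believes is safely on disk can be lost if the machine crashes before the kernel gets around to writing it.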