Rob's Blog

December 30, 2012

Ok, got losetup done-ish. (Needs more of a test suite, but various things worked from the command line.)

Next up is umount (which uses the losetup code, and is smaller/simpler than mount) but for a change of pace I think I should clean up cp first.

Which gets me back to where I left off in cp: figuring out what counts as an illegal recursive copy. The gnu/dammit version vetoes "cp .. ." but allows "cp ../.. ." and I'm not sure _why_. The case I'm trying to avoid is "mkdir -p temp/temp/temp/temp; cd temp/temp; cp -R ../../temp ." and it...

Duh, all you have to do is stat the source, remember that dev/inode pair, and if you see it again you've gone recursive. Ok, that's actually clever. (No way the FSF came up with it.) Here I was mucking about with creating absolute paths...
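
A minimal sketch of the dev/inode idea, in Python rather than the C that toybox uses. It is one way to apply the trick, not necessarily what any real cp does: stat the source, remember its (st_dev, st_ino) pair, then climb from the destination toward the root via ".." and see whether that pair turns up again. The function names are made up for illustration.

```python
import os

def file_id(path):
    """Return the (device, inode) pair that identifies a file regardless of its path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def is_recursive_copy(source, dest_dir):
    """True if dest_dir is the source directory or lives somewhere beneath it."""
    source_id = file_id(source)
    current = dest_dir
    while True:
        current_id = file_id(current)
        if current_id == source_id:
            return True                  # saw the source again: recursive
        parent = os.path.join(current, "..")
        if file_id(parent) == current_id:
            return False                 # ".." of the root is the root: done
        current = parent
```

No absolute paths get built here: every comparison is between stat results, so "cp .. ." and "cp -R ../../temp ." both get caught the same way.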

I note that I try not to look at gnu source code if I can help it (it's not so much a license contamination thing as a "life's too short to spend my hobby time trying not to vomit"), but I do run tests against it, up to and including strace.

December 27, 2012

The losetup implementation is _so_ much easier now that I've figured out how to properly encapsulate the code. And the losetup command definition is _so_ imprecisely defined: losetup -d can take multiple command line arguments (devices to deallocate). I should also test "losetup -da" and "losetup -dj filename".

Let's see: display current status when device given on command line (error if device isn't associated), display all devices with -a (silently skip non-associated devices), display all devices associated with file -j (is it an error if there aren't any? No, it is not.)

Associate one device. Associate device with offset. Associate with size limit. Change size of file and recheck size with losetup -c. Find first device with -f...

I haven't yet made -d actually _delete_ loop devices because I'm not sure how to tell how many precreated devices there are. (There's a /sys/module/loop/parameters/max_loop but it's 0 on my host and that's got 8 precreated loop devices.)

December 26, 2012

Figured out how I should probably handle losetup (run the losetup main() function from mount and have it leave the found device at the start of toybuf; avoids more than one file needing to mess with linux/loop.h directly _or_ parsing TT from the wrong context).

Didn't get to play with it today because people sent me bug reports. Lots, and lots of bug reports. Here's the first half of the one I'm currently debugging...

December 25, 2012

So "mount -o loop" needs to do losetup -f (which is best implemented with /dev/loop-control these days), which basically means it needs most of the guts of losetup.c. If I move said guts out into lib/lib.c they depend on the contents of linux/loop.h and I don't want lib.c to depend on linux/*. (Even if I make a compile time probe, there's either #ifdefs in the C code or conditionals in the build script, and so far I've avoided both. All that sort of thing belongs in individual commands that already have config symbols compile time probes can switch off. (These config symbols work on a simple generic rule: name of C file under toys/* matches name of config symbol.) The lib stuff is just lib/*.c which gets pared down by the linker discarding unused symbols.)

Even if I bit the bullet and did factor it out, a theoretical loopback_setup() function cares about the layout of the GLOBALS block. I can sort of abstract out the arguments into char **device (which could start NULL and be set to the allocated "/dev/loopX" string for -f) and char *file (which could be NULL to indicate we don't associate but instead display the results of an existing association). That needs a flag value to indicate whether an existing but unassociated loop device is an error, since it isn't for "losetup -a" but is for "losetup /dev/loopX", but I can just use (char *)1.

But the problem is the function needs more than that. It needs to know whether to open stuff read only or read/write (flag -r). It needs to know the device and inode of the comparison file for -j searches. So I can get it down to about 5 arguments... Although signalling we're NOT doing -j by setting dev and ino both zero is dubious, since lanana implies that NFS breaks everything, as usual. If I can't check toys.optflags the argument values get uncomfortably magic. (And I can't check losetup's optflags from mount, they're generated values and mount hasn't got losetup's macros defined in its namespace.)

Sequencing-wise, the -j stuff is in the middle between finding a device and binding or reporting the device, you filter out whether or not the device is one you want after you open it. To split that out I have to either provide a callback function (not an improvement) or split the one guts-of-losetup function in half with the first part returning a loop_info64 pointer out of linux/loop.h, meaning the linux header would have to be #included in multiple places because the users of the function need it.


It really looks like what I'm going to have to do is have mount xexec() losetup and parse the output through a pipe, which is almost as disgusting as the other alternatives.

(The reason this wasn't a problem for the busybox version is it shared less code and was less clean. I didn't have generic infrastructure doing option parsing, I didn't care about #including linux/*.h from anywhere I felt like it, I threw #ifdefs around with abandon, and there was no POSSIBLE way to make the makefiles I inherited any uglier than they already were. Making it work isn't a problem, making it _pretty_ is the problem. It takes an awful lot of work to get a simple result that looks obvious.)

December 24, 2012

Poking at losetup again. The tricky bit is I want to use the code as a standalone command, and reuse the code in mount and umount.

For mount I need essentially losetup -f, but it has to communicate back to the parent program which device it grabbed. This is fiddly because the option flags specifying -f and the global block are command-specific. If I write the functions the obvious way they're not portable, and if I write them to be portable (marshalling buckets of arguments into and out of two function arguments when they're already there in the global block) there are only two users.

One idiosyncrasy of /dev/loop-control is that the naive approach (check /dev/loop0 then loop1 and stop at the first one that isn't there) no longer works because the ability to create and delete devices means the set active at any given time can be sparse. So I need to list the contents of /dev and parse the loop[0-9]* entries, so dirtree and a callback. Which is why data needs to be in the global block, because the callback isn't set up to pass things like FLAG_f. (I've got existing data structures for per-node context and global context, and adding a third layer is just awkward.)
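
To make the sparse-device problem concrete, here's a rough Python sketch of the directory scan (the real thing would be C using dirtree and a callback). The function name and the regular expression are illustrative assumptions; the pattern is a bit stricter than the loop[0-9]* glob because it insists on nothing but digits after "loop", which also skips loop-control itself.

```python
import re

# "loop" followed by digits only, so "loop-control" and "loop1x" don't match.
LOOP_NAME = re.compile(r"^loop([0-9]+)$")

def loop_device_numbers(names):
    """Pick the loop device numbers out of a directory listing, sorted numerically."""
    numbers = []
    for name in names:
        match = LOOP_NAME.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)
```

Calling loop_device_numbers(os.listdir("/dev")) on a system where loop0, loop2 and loop10 exist would return all three, where probing loop0, loop1, ... and stopping at the first gap would only ever find loop0.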

I need to do a test suite, which requires root access to work. Unfortunately, while it's easy to come up with tests:

losetup -f -s
losetup -f file
losetup -fs file
losetup -af (fail!)
losetup -j file -s
losetup -j file file (fail!)

It's much harder to make the results reproducible. The output of querying a device includes dev and inode numbers that aren't reproducible, the paths of the associated device are absolute (and thus include the directory you ran the test in), and the order that losetup -a finds devices when it's doing a directory scan is kind of arbitrary (in my tests, it finds them in the reverse order devices were created).
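
One possible way around that is to normalize the output before comparing it. The Python sketch below assumes status lines shaped like "/dev/loop0: [0805]:130 (/path/to/file)", which is an assumption about the output format rather than anything defined in this post, and the function name is made up. It masks the dev/inode numbers, strips the test directory from the paths, and sorts the lines so scan order stops mattering.

```python
import re

def normalize_losetup_output(text, test_dir):
    """Make losetup status output comparable between test runs."""
    prefix = test_dir.rstrip("/") + "/"
    lines = []
    for line in text.splitlines():
        # Device and inode numbers differ from run to run, so mask them.
        line = re.sub(r"\[[0-9a-fA-F]+\]:[0-9]+", "[DEV]:INODE", line)
        # Turn the absolute path of the backing file into a relative one.
        line = line.replace("(" + prefix, "(")
        lines.append(line)
    # Directory scan order is arbitrary, so sort.
    return sorted(lines)
```

This doesn't help with which /dev/loopX number gets handed out, so a test suite would still need to start from a known set of devices.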

Also, losetup -f is inherently racy. It finds or creates a device, then tries to use it as a second step, and another instance could allocate the device in between those. I'm trying to figure out if this should report an error or if there should be retry logic in there...
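
If retry logic wins, it could look something like this Python sketch. Both find_free and bind are hypothetical callables standing in for "ask for a free device" and "try to bind the file to it" (returning None and False respectively on failure); neither is a real API, and the attempt count is arbitrary.

```python
def bind_first_free(find_free, bind, attempts=3):
    """Find a free device and bind it, retrying if somebody else grabs it first.

    find_free() returns a device name or None if nothing is available.
    bind(device) returns True on success, False if the device was taken.
    """
    for _ in range(attempts):
        device = find_free()
        if device is None:
            return None        # nothing free at all: not a race, just fail
        if bind(device):
            return device      # won the race
        # Lost the race for this device: go around and ask for another one.
    return None
```

The bounded loop means a pathological amount of contention still ends in an error instead of spinning forever.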

Maybe I should break down and have the losetup device scan sort the devices before trying to look at them, but this widens a similar race window with the ability to create/remove devices.

Sigh. If I wanted to do a half-assed job I'd be done by now. It's being stroppy at the design level.

December 23, 2012

How does one test a dog for vampirism? Apparently garlic is bad for dogs anyway, so it's the same as the "If you drive a stake through its heart, it dies" test not really being a good way to determine whether something is or isn't vampiric in nature.

Very time consuming dog. Very energy consuming dog. Very separation anxiety sleep deprivation dog.

December 21, 2012

Why does nobody in washington understand basic economics? This "fiscal cliff" nonsense _can't_ raise interest rates because that's not how interest works.

Interest rates are the return on investment you can get when you loan out money. When the economy sucks, interest rates go down because nobody is making any money, so nobody can afford to make the payments if they borrow more money at high rates. (The people willing to do so don't qualify for the loan. Yes, lenders can always get higher rates by accepting more risk, but beyond a certain point you're just gambling and lose more money to defaults.)

The current economy is stuck in a type of stall we haven't seen since the 1930's: nobody's making any money because nobody's spending any money, and nobody's spending money because nobody's making any money. Living off savings is terrifying if you can't replace them. This problem is easy enough to fix with something like FDR's "new deal" where the entity that can print money (and thus can never run out) buys a bunch of everything to push the economy back up to speed. But we've got a really really BIG economy and it takes a LOT of spending to get it unstuck, and the "stimulus package" of 2009 was maybe a third of what we needed. (Enough to turn "hoovervilles" and the "bonus army" into "occupy wall street".)

Ordinarily when demand goes down below where we need it to be to avoid layoffs, the federal reserve will offer to loan money at even lower interest rates until people are willing to borrow again (sometimes just to refinance their existing debts and lower their monthly minimum payments, thus freeing up new money to spend each month on goods and services). Unfortunately if you get a big enough shock the interest rate you'd need to offer to get monthly spending back up to a rate capable of keeping everybody employed is BELOW 0%, and the federal reserve can't offer that. And because this knob normally works so well at controlling the amount of monthly spending people do, the feds no longer have a backup plan for what to do when the knob hits its end stop. (Asking them to pull out the techniques FDR used in the 1930's is like asking people to pull out candles during a blackout: how quaint, how uncivilized, we don't DO that anymore...)

Unfortunately the old fogies stuck "fighting the last war" are treating this as a supply-side crisis a la the OPEC oil embargo of the 1970's, so they're busy giving water to a drowning man and blocking any attempts to drain the swamp because they cannot CONCEIVE of a problem where customers simply don't buy products producers are selling. The problem MUST be at the producer end. It's not like customers have any CHOICE in the matter, they're just sheep behaving mathematically without volition, right? Only business owners are actually _people_...

Rich people looking for places to park their savings DESPERATELY want rates to go up because right now they're losing to inflation. They don't understand why rates are so low, they're convinced it's a conspiracy on the part of the federal reserve to punish rich people and prevent compound interest from making them richer. They've invented a "bond vigilante" fantasy whereby any day now rates will MAGICALLY go up, and suddenly their vast fortune will be earning 3%, 4%, 5% above inflation instead of losing money to inflation every year. How will interest rates go up without any increase in people's ability to qualify for new loans or make additional monthly payments? Well, they just HAVE to. Because the alternative would be unthinkable!

Of course that's not how it works, but it turns out that people who made their money by modern white collar piracy ("leveraged buyouts") don't have to understand how the economy works any more than sports players who win via steroids (and retire at 30 with epic health problems) really understand anatomy, biochemistry, neurology... An olympic medal does not qualify you to perform surgery.

Speaking of inflation: it turns out the federal reserve could fake negative interest rates by raising the rate of inflation, because 5% inflation and 1% interest is essentially -4% interest. Go ahead and borrow money: by the time you have to pay it back it won't be worth as much anyway. In fact your existing debts get slowly eroded and less troublesome. But rich people HATE inflation eroding their existing fortune, and will fight to the death to stop this from happening.
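
For anyone who wants to check the arithmetic, here's a small Python sketch. Subtracting inflation from the nominal rate is the usual back-of-the-envelope version; the second function is the exact form of the same relationship. The function names are just labels for this example.

```python
def real_rate_approx(nominal, inflation):
    """Back-of-the-envelope real interest rate: nominal minus inflation."""
    return nominal - inflation

def real_rate_exact(nominal, inflation):
    """Exact real rate: what a dollar plus interest buys after prices rise."""
    return (1 + nominal) / (1 + inflation) - 1

print("approx: %.2f%%" % (100 * real_rate_approx(0.01, 0.05)))
print("exact:  %.2f%%" % (100 * real_rate_exact(0.01, 0.05)))
```

The shortcut gives -4% and the exact version comes out a little under -3.8%, so "essentially -4%" holds up.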

P.S. a leveraged buyout is where you borrow a bunch of money to buy a profitable company, often using that company's assets as collateral. (Just like when you get a mortgage on a house, the house you're buying is the collateral.) Once you're in charge not only can you drain the company's bank accounts and pocket the money, but you can transfer your loans into the company's name so the debts you ran up buying the company are no longer your problem. If you haven't maxed out the company's credit rating yet you can have the company borrow MORE money (and pocket it). Next chip off any large assets (buildings, profitable divisions) and sell them, pocketing that money. Laying off employees can reduce expenses and allow the company to qualify for more loans. Rewrite the employee contracts so any retirement benefits are no longer based on savings and will be paid from future revenues, so you can pocket any existing pension fund. When there's nothing left to loot, sell the desiccated husk of the company and pick a new target.

This is how people like Mitt Romney made their money. Yes, it starts with the ability to borrow millions of dollars, which is easier to do if your daddy used to be the governor of Michigan. If this sounds utterly evil, you obviously don't understand the realities of business where corporations are people and employees are "resources, comma, human".

That said, stealing the Mona Lisa from the Louvre still doesn't make you Leonardo Da Vinci. Obtaining is not making. This is why the correct response to calling rich people "job creators" is to point and laugh.

December 20, 2012

Downloaded the new PC BSD 9.1 release and fired it up under qemu 1.3.0. It hung endlessly on something like PCI bus scanning, with and without ACPI, so I told it to boot in "safe mode" (what is this, windows? That would run under qemu...) and it panicked saying it couldn't find a time source. So much for this year's interest in BSD.

New dog is very time consuming dog.

December 19, 2012

I uninstalled my irc client after someone on there insisted that the directory (which contains nothing but symlinks into aboriginal/downloads) was confusing them, and that I needed to remove it so their brain could cope. (This was the same person who said I should grab or despite both of those already being taken: the whois command is a thing that exists.) Really: I have other things to do with my time.

Part of my short temper is due to my normal tendency to switch to a night schedule and Fade's morning-person tendencies wanting me to be on the same schedule as her, so when I do shift to a night schedule these days she wakes me up every couple hours to see if I want to get up now, and then I'm groggy but not sleepy all night.

Plus still recovering from injuries, which the tetanus shot more or less qualifies as at this point. (My arm is swollen, and the irritation is somehow maintaining the outlines of the band-aid days later.)

Fade and I are getting a dog. The cats are just going to be _thrilled_. We've spent about 4 hours a day all week looking at various dogs, and have gone through a half dozen where we decided "ok, we want this dog" and then either it's adopted out from under us (we were told they couldn't be reserved before pickup, except that the one Fade wanted most was reserved when we came to pick it up), or "we forgot to mention that this dog was part of a bonded pair that can't be separated" or "now, about this dog's medical problems..." But Fade really wants a dog, so we keep at it. Amazingly time consuming, dog hunting.

I need to finish filling out paperwork for the job I start a little over a month from now up in Minnesota. It's a six month contract: as much fun as toybox is, I'm still paying a mortgage on a place three times bigger than I used to live in, and putting my wife through college. (I'd pondered doing a kickstarter or something to see if anybody wanted to sponsor some full-time toybox work, but Fade wasn't enthused about the idea.)

I need to repost the perl removal patches, and even though it's the merge window, and I've posted them to the list a half-dozen times over the past three years, and they don't actually change the generated files, I should probably try to feed them through the linux-next tree because the kernel development clique is ossifying a bit in its old age, developing ever-more layers of procedure and ritual. I downloaded the linux-next tree and read a bit of the wiki, but so far there's nothing about actually submitting patches to it. Possibly it's in Documentation somewhere...

Friend visiting from out of town this weekend. (She used to run Mensa games night before retiring to Maryland).

December 18, 2012

I have a needle-phobia. Today, I got a tetanus shot. That was pretty much my day.

I put it off as long as possible but after the weekend's incident with the wire hoop and the picture of bleeding I posted to twitter with the "do not click on this link" warning... it's a piece of metal lying on a neighbor's lawn, out there long enough to corrode a bit despite being some variant of stainless steel. And the last time I _might_ have had a tetanus shot was 9 years ago.

Then Fade took me to look at dogs, I got home, played skyrim, and fell asleep on the couch until almost midnight. I have a bunch of things I _should_ do, but really wasn't up to any of them.

December 17, 2012

If anybody cares about the patches removing perl from the linux kernel build, I just posted 3.7 versions to the mailing list: 0/3, 1/3, 2/3, and 3/3.

My direct mail sending script mangled them slightly (the archive sort of has a long name for me, but not quite?), and I'm still waiting for the list to send copies back to me so I can see how they came through, but I did the canonical patch format with diffstat, sent them to the get_maintainers cc: list, and it's at least not whitespace damaged (unlike Balsa). With several days of merge window left.

Just like the last half-dozen times...

December 16, 2012

I got a toybox release out, and an Aboriginal Linux release out.

And I tripped over a wire hoop in the neighbor's yard and re-injured my darn foot. Much bleeding. Really annoyed. Probably need a tetanus shot.

December 14, 2012

My giant build finally completed sometime after midnight (takes more than a full day to build all targets on this netbook, and that doesn't include the native compiles). And the 3.7 kernel broke arm, mips, and i686. That's above average collateral damage for a single package upgrade.

I dealt with i686 day before yesterday. Bisecting arm gives me commit 387798b37c8d which added multiplatform support and changed the Arm platform type default from ARCH_VERSATILE to ARCH_MULTIPLATFORM. Ok, add the explicit config symbol to LINUX_CONFIG in the arm target files... and it builds and boots. Right, rebuild the kernel on all those targets...

I have GOT to get a faster laptop. Or at least a server I can connect to and knock out some fast compiles just to show there aren't any OTHER problems...

Let's look at mips:

arch/mips/lib/delay.c:24:5: warning: "__SIZEOF_LONG__" is not defined

Lovely. This is another toolchain version thing, isn't it? Do a quick "gcc -dM -E - < /dev/null | grep SIZEOF" on both toolchains and... yes. Yes it is.
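
For comparing more than a couple of toolchains, the same check can be scripted. This is a rough Python sketch: predefined_macros() runs the gcc command from above for whatever compiler name gets passed in, and missing_macros() reports macros the newer compiler defines that the older one doesn't. The function names are made up for the example.

```python
import subprocess

def predefined_macros(compiler):
    """Run `compiler -dM -E -` on empty input and return the #define lines as text."""
    proc = subprocess.Popen([compiler, "-dM", "-E", "-"],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(b"")
    return out.decode()

def parse_defines(text):
    """Turn '#define NAME VALUE' lines into a {NAME: VALUE} dictionary."""
    macros = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
    return macros

def missing_macros(old_text, new_text, substring="SIZEOF"):
    """Names containing substring that only the newer compiler defines."""
    old = parse_defines(old_text)
    new = parse_defines(new_text)
    return sorted(name for name in new if substring in name and name not in old)
```

Usage would be along the lines of missing_macros(predefined_macros("old-gcc"), predefined_macros("new-gcc")), with the real compiler names substituted in.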

Ok, looks like it's time to update the kernel's README and Documentation/Changes because gcc 3.2 and ld 2.12 ain't gonna work no more. I'm having to patch gcc 4.2.1 and binutils 2.17 to get them to build this sucker, and this is no longer "the sh4 maintainer is an asshole", this is two different architectures breaking in the same release, even with the toolchain versions I test (much newer than the documented requirements).

On the bright side, fix that and it seems to be working again.

Gotta test the native builds. Gotta cut a toybox release. Gotta send the perl removal patches upstream (possibly into linux-next). I should check the armv4teb target to see if I can finish it in a reasonable amount of time. I should see if powerpc-440 can actually work with qemu's new -M bamboo board emulation. I should dig up the qemu-m68k branch and make puppy eyes at Laurent...

But first thing after cutting this release: get back to the ccwrap rewrite so I can switch to musl.

December 13, 2012

I need to send the perl removal patches upstream again, and deal with a backlog of documentation patches I've tagged, but Balsa is crap. There is NO WAY to get it to avoid whitespace damaging patches. Even forwarding a message AS AN ATTACHMENT did whitespace damage.

I checked the kernel's Documentation/email-clients.txt and it doesn't mention Balsa (unsurprising) and specifically says that the gmail web front end does not and cannot be made to work. (Longish list of reasons, including it converts tabs to spaces, period.)

Meanwhile, I've got the list of marked patches in balsa that I need to extract _from_ balsa. Right click on the message and... there's no "save as" option. Great. None of the icons when I've got a message selected does it. The file, edit, view, mailbox pulldown menus have nothing. If I "view source" on a message and cut and paste from that window, it turns tabs into "\t". (What were they smoking?) I tried creating a new mbox file to copy the messages to (although undoing the mime encoding is a pain but it's something), but right click has no copy! It has move, but there's no option to leave the original message in place instead of marking it deleted. (What is this, some kind of DRM enforcement? There can be only one copy?)

I eventually found something usable under the Message pulldown menu, called "Save current part", which can deal with my flagged messages. But in terms of sending _out_ new messages containing non-whitespace-damaged patches, Balsa simply can't do it.

So for a third time, I'm writing python code to fling mail around, using the builtin packages in the python standard library that do this stuff with only a couple lines of code from me, the meat of which is:

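import smtplib, sys

# sender, password, subject, and the comma separated recipient string are set earlier.
recipient = [r.strip() for r in recipient.split(",")]

# The first recipient goes in To:, any others in Cc:.
headers = ["From: " + sender, "Subject: " + subject, "To: " + recipient[0]]
if len(recipient)>1: headers.append("Cc: " + ", ".join(recipient[1:]))
headers.extend(["MIME-Version: 1.0", "Content-Type: text/html", "", ""])
# The message text arrives on stdin and goes after the blank line ending the headers.
body = "\r\n".join(headers) + sys.stdin.read()

# Port 587 is mail submission, so upgrade to TLS before logging in.
session = smtplib.SMTP("", 587)
print(session.ehlo())
print(session.starttls())
print(session.ehlo())
print(session.login(sender, password))
print(session.sendmail(sender, recipient, body))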

So I hardwire sender to my email address, pass a comma separated list of recipients and a subject string on the command line, and redirect a file to stdin containing the body of the message.

This means I have now written a POP receiver, an SMTP sender, and an mbox filter in python, because in each case it was EASIER THAN FIXING BALSA. If I could decide whether to pursue the gtk or qt bindings (or some other gui library), I'd just write a front end and be done with it. (I can compose messages in "mousepad".)

But I don't _want_ to write an email client. I just want to _use_ an email client. One that isn't crazy. I have other things to do...

December 12, 2012

So 3.7 added "static __initconst u64 p6_hw_cache_event_ids" to arch/x86/kernel/cpu/perf_event_p6.c and it's breaking my i686 toolchain. What the heck is __initconst? It's defined in include/linux/init.h as "#define __initconst __constsection(.init.rodata)" and right below that is an #ifdef CONFIG_BROKEN_RODATA for toolchains that don't handle this. Which is only set for parisc right now, but apparently applies to anything still using gcc 4.2.

One way to fix this is to default BROKEN_RODATA to y (which works), but I don't want to maintain yet another patch against the kernel that has no chance of going upstream. Instead I should probably figure out how to patch gcc. I've been meaning to do a similar upgrade like I did with binutils, where I move to the gcc repository commit right before they went GPLv3 and then fix whatever's broken in that random snapshot, on the theory this might provide ARMv7 support. That would be a good target to support. (The new 64-bit ARMv8 will definitely require a non-gcc toolchain.)

Unfortunately, the gcc repository is crap. As far as I can tell the project is _still_ maintained in subversion and they just mirror it in git, and there are no tags, or even obvious commit comments announcing releases. I have yet to figure out where the release I'm _using_ is. A git log on the COPYING3 file found the commit that introduced that, and fairly extensive grepping of the commit before that (c473c074663de) didn't find any references to GPLv3 (or "[Vv]ersion[ \t]*3" that's actually about the license instead of libstdc++, or several other variants...) However this commit requires MPFR and GMP to build meaning it's off into 4.3 territory, and according to my notes 4.2.2 was GPLv3, so it looks like tags weren't the only history that got lost in the repository conversion. Sigh.

And then when I installed mpfr and gmp on the host just to see what would happen, the build broke in a hilarious way:

In file included from /usr/include/stdio.h:28,
from ../../.././libgcc/../gcc/tsystem.h:90,
from ../../.././libgcc/../gcc/libgcc2.c:33:
/usr/include/features.h:324:26: error: bits/predefs.h: No such file or directory
/usr/include/features.h:357:25: error: sys/cdefs.h: No such file or directory
/usr/include/features.h:389:23: error: gnu/stubs.h: No such file or directory

Translation: they did the -nostdinc thing with a lot of -I and prevented the standard headers from finding themselves, because gcc is _special_. It can't cope with building like a normal program, no, it has to micromanage the host compiler. (Even though all it DOES is parse input files and produce output files which is NOT HARD. Except for the FSF.)

Broke down and just did the kernel workaround for the moment, which got i686 building. Set a build going to see what other targets break...

December 11, 2012

The 3.7 kernel dropped last night, so today is patch update day. My hacks to the arm "versatile" board for better qemu support don't apply cleanly anymore. (The reversion of the IRQ changes that break qemu still applies cleanly, but all the menuconfig symbol stuff to stick different processor types in the same board had context change around them.) I don't need to apply the ext4 stability fix since that's upstream.

Tryn's old "make BOOT_RAW a selectable menuconfig option" had another context change but I yanked it rather than rediff it because I'm not actually using it. (Possibly I should give that one more submission. The real value there is the help text...)

And then there's the perl removal patches. That's not just version skew, upstream had several changes: different #ifdef guards in the generated headers, and the UAPI changes finally went upstream so the kernel headers that get exported to userspace are now split out and kept in a different directory. Instead of chopping out "ifdef KERNEL" blocks while exporting them, the kernel's private headers #include the uapi versions. (They still need a pass to remove __user annotations and put underscores around __asm__ and so on. The "don't use u8 as a type in anything userspace sees" appears to be a coding convention rather than something the scripts clean out, another reason to separate the files rather than have different coding conventions inside and outside special #ifdefs.)

With stuff this fiddly, the best way to see what's changed and make sure you don't miss anything is to "git log" the old perl files and "git show" each commit that touched them so you see the patch and explanation, then make the corresponding changes to the shell script version. When I wrote the shell script I sat down and worked through everything it was doing and diffed the resulting generated files and it took _days_. But now that I've got equivalent behavior, I just want to see what new things showed up.

Which brings us to the new requirements, such as removing the _UAPI prefixes from the #ifdef guards preventing multiple inclusion in these files. Why do they need to do that? Git commit 56c176c9cac9 explains:

Strip the _UAPI prefix from header guards during header installation so that any userspace dependencies aren't affected. glibc, for example, checks for linux/types.h, linux/kernel.h, linux/compiler.h and linux/list.h by their guards - though the last two aren't actually exported.

I.E. "FSF code is buggy crap full of brittle assumptions about internal implementation details of the headers linux exports to userspace, and if we change those magic details glibc breaks, so work around this bug." The description talks about glibc, but the example breakage they cut and pasted was libtool. (I note that _one_ of the things the header export script has done for years is chop out all references to linux/compiler.h. But glibc and libtool are explicitly looking for it.)
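
For illustration, the guard rewrite itself is tiny. This is a Python sketch of the idea, not the kernel's script or my shell version: it drops a _UAPI prefix only when it directly follows #ifndef, #define, or the opening of an "#endif /*" comment. The function name and the exact pattern are assumptions made for the example, and it would also rewrite a non-guard "#define _UAPI..." line if one existed.

```python
import re

# _UAPI right after "#ifndef", "#define", or "#endif /*", with optional whitespace.
GUARD = re.compile(r"^(\s*#\s*(?:ifndef|define|endif\s*/\*)\s*)_UAPI")

def strip_uapi_guard(line):
    """Remove the _UAPI prefix from a header guard line, leaving other lines alone."""
    return GUARD.sub(r"\1", line)

for line in ["#ifndef _UAPI_LINUX_TYPES_H",
             "#define _UAPI_LINUX_TYPES_H",
             "#endif /* _UAPI_LINUX_TYPES_H */"]:
    print(strip_uapi_guard(line))
```

Anything that merely mentions _UAPI elsewhere on a line (say, as the value of some other macro) is left untouched.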

The FSF wants _desperately_ to be the microsoft of the open source world, and they seem to think the way to get there is to produce code as bad as Redmond excretes. Hence the second paragraph of Documentation/CodingStyle in the Linux kernel, which Linus wrote in the very first version of that file back in 1996:

First off, I'd suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it's a great symbolic gesture.

And you wonder why I'm following the #musl channel on freenode (after years of trying to use uClibc)?

Anyway, with the perl removal patches updated, now it's time to try a test build, and generally i686 is safest:

CC arch/x86/kernel/cpu/perf_event_p6.o
arch/x86/kernel/cpu/perf_event_p6.c:22: error: p6_hw_cache_event_ids causes a section type conflict
make[3]: *** [arch/x86/kernel/cpu/perf_event_p6.o] Error 1

But not in this instance. Right, what's going on here... it's complaining either about building with gcc 4.2 or 32-bit, because building the host version with the same .config (64-bit, gcc 4.6) happily proceeds. And 3.6 happily built this file. Ok, git bisect time...

Wow. A clean bisect (when's the last time that happened?) to commit e09df47885d7 which I'll have to look at in the morning because it's 5am.

December 10, 2012

Various people are surprised that when 1/1000th of the population skims off about half the wealth, everybody else is poorer. Sigh.

Did nobody pay attention to how capitalism _works_? That all profit is inefficiency in the market (it means there wasn't a competitor selling at closer to cost), and the way rich people get rich is generally by some variant of "cornering the market", I.E. fencing out competitors and selling to a captive audience. Whether it's a natural winner-take-all niche due to economies of scale or an artificial one due to patenting algebra, sustained profits require what Warren Buffett called "a moat around the business".

This is magnified by compound interest, the fact that earning 10% interest on a billion dollars gives you 100 million dollars for sitting at home doing nothing (and the historical rate of return of the US stock market over the past century averages out to a bit over 10% annually, including both world wars and the great depression). This is why the rich get richer, and if the system doesn't balance itself out you wind up with the French Aristocracy saying that if the starving peasants outside have no bread "let them eat cake" instead.

For a long time what the US did to counter this was tax the hell out of the rich, both to keep their share of the pie from crowding out everyone else and to pay for things like interstate highways, public schools, and tracking down Typhoid Mary.

Fifty years ago the top tax rate was 91% on individuals and 52% on corporations, and we used that money to put people on the moon and invent the transistor. This is why we didn't have to worry about "Citizens United" because the rich didn't have more money than the rest of us combined. The rest of us got together and voted to tax them into submission. Not just to raise revenue but to keep society balanced.

In 1964 President Johnson lowered the top tax rate from 91% to 70%, but it was really Ronald Reagan who screwed everything up: In 1981 President Reagan lowered the top tax rate from 70% to 50%, and then in 1986 lowered it again from 50% to 28% (and also raised the _bottom_ tax rate from 11% to 15%: yes he took from the poor to give to the rich). A quarter century of compound interest later, concentrating more and more wealth into fewer and fewer hands, and "the 1%" own the GOP outright.

Of course the math of Reagan's tax plan didn't _work_, and our enormous national debt is the result, as is lingering economic weakness. The whole "oh noes, Japan is eating our lunch, now India is eating our lunch, now China is eating our lunch" mess is because Ronald Reagan and two Bushes screwed up a good thing. These days most of the money the US economy churns out is skimmed off by well-connected parasites. It no longer goes into fixing crumbling bridges, upgrading our ancient and decrepit national electrical grid, or putting a fiber optic cable to every home (like South Korea did a decade ago), or any of the other important things we "can't afford" to do.

(The other way the US dealt with monopoly profits is by breaking up monopolies using the Sherman Antitrust Act. They broke up "Standard Oil" and as a result 4 of the 5 largest companies today are oil companies. They _didn't_ break up Ma Bell (their 1957 action resulted in a consent decree allowing them to continue but not expand outside the phone business) and the resulting company stagnated so badly over the next quarter century it changed its mind and allowed itself to be broken up in 1984, the breakup giving rise to cell phones and turning modems from a shameful semi-illegal abuse of their phone network (hooking up unofficial equipment to the phone lines, for shame!) into ubiquitous home internet access. Unfortunately, the Party of Reagan also gutted Sherman antitrust enforcement, so for example the 1995 and 1998 actions against Microsoft came to nothing.)

Add it all up and the weakness of the US economy, where parasites have sucked out half the blood and wonder why the beast's health is failing, is starting to hurt the rich. They keep treating near-zero interest rates as a plot to prevent them from continuing to compound their wealth. But loaning out money through credit cards doesn't work when the cards are maxed out because the cardholder is unemployed. Loaning out money in home mortgages doesn't work when a wave of repossessions has trashed everybody's equity and they can't afford the down payment on a new one.

Low interest rates won't make poor people living from paycheck to paycheck borrow more if they have mountains of existing debt, all they'll do is refinance. Even if they want to borrow more, the first thing they'll do with the money is pay off their existing high interest loans, lowering their overall interest rate without increasing their overall level of debt. To avoid that, the rich can't let them "qualify" for new loans; even tightly controlled store credit cards that can ONLY be used for new spending mean they might put the groceries on that and use the grocery money to pay down the other high rate credit cards.

The problem rich people have trying to park their money in a depressed economy is that compound interest isn't a mathematical abstraction, it's loaning or investing money in people who create new value by doing work. If the people aren't working, value doesn't get created: you can't take what doesn't exist. If the people's work earns less and less money, they can't buy stuff, and your big business has a shortage of buyers to sell stuff to.

Sometimes I ponder the distance between Occupy Wall Street and the French Revolution and try to work out how much pain this country would actually need before declaring billionaires a game species. But mostly I console myself with the knowledge that the people who screwed this stuff up, and the people who profited from it, are in their 80's now. They and the baby boomers behind them will all die soon, and then a new generation gets a crack at it. Preferably with at least a 50% inheritance tax.

December 9, 2012

Got in a long walk today for the first time since I hurt my foot: walked to Dragon's Lair and back. Got to see Randy and several other people (several of whom I was not prepared to meet but said hi anyway).

Stopped at The Donald's along the way, and got some programming in!

Ok, todo items, otherwise known as "procrastinating about the linux 3.7 update". Test current toybox in aboriginal, fix aboriginal's so it actually finds netcat after the busybox->toybox switch, check baseconfig-busybox for more commands toybox provides that I can switch off in busybox (looks like cut, rm, touch, hostname, and switch_root).

Try the lfs-bootstrap build: it breaks in m4 hanging on an rm prompt about deleting a ro file without -f, which is odd because the previous rm didn't and all the rm instances in configure look like they have -f. Try a chroot splice version so I can more easily track down where that's called from, and chroot-splice is saying that the read only bind mount is writeable. And an strace on mount shows it's passing through both the bind and ro flags and the result is still writeable. Is that a regression in Ubuntu 12.04's kernel... no, apparently the read only attribute can only be applied on a remount, not on the initial bind mount, which is CRAZY. (Easily fixed, but still crazy of the kernel to require.)
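The "bind first, then remount read only" dance looks roughly like this in C. It's a minimal sketch, not the aboriginal code: bind_mount_ro() is a made-up helper name, it assumes the caller is root, and it assumes a kernel that behaves as described above (ignoring the read only flag on the initial bind).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical helper: bind mount src on dst, then remount it read only.
 * Returns 0 on success, -1 on failure with errno from the failing mount(). */
int bind_mount_ro(const char *src, const char *dst)
{
  int saved;

  /* Step 1: plain bind mount. Passing MS_RDONLY here wouldn't stick. */
  if (mount(src, dst, NULL, MS_BIND, NULL)) return -1;

  /* Step 2: remount the bind mount, which is where read only takes effect. */
  if (mount(NULL, dst, NULL, MS_BIND|MS_REMOUNT|MS_RDONLY, NULL)) {
    saved = errno;
    umount(dst);      /* don't leave a writeable mount behind */
    errno = saved;
    return -1;
  }

  return 0;
}
```

From the shell the equivalent is a "mount --bind" followed by a "mount -o remount,ro,bind" on the same target.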

Ok, the delete is actually from in the control image bootstrap infrastructure, which is deleting config.guess and substituting a tiny "gcc -dumpmachine" instead (why config.guess doesn't do that itself...)

This call to rm doesn't have a -f on it, which is fine because it's a symlink (to a read-only bind mount, but the symlink itself is in a writeable directory). The problem seems to be that faccessat() is not honoring AT_SYMLINK_NOFOLLOW. In fact, strace says:

faccessat(AT_FDCWD, "build-aux/config.guess", W_OK) = -1 EROFS (Read-only file system)

And that's not even showing the fourth argument to the syscall. Is there a kernel version limitation? Let's see, cd to linux source, 'find . -name "*.c" | xargs grep sys_faccessat' and it's in fs/open.c and... I don't even have to do a git annotate, it only has 3 arguments. So either the man page is wrong or libc implements wrapper glue that uClibc is getting wrong.

Sigh. I can check the link status in the stat info dirtree is already giving me (symlinks are always chmod 777 in linux), the problem is what if the directory it's in is read only? faccessat() should tell me if the path to here doesn't let me fiddle with it. Then again, if that is the case I can't delete it anyway so prompting is kinda moot...
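Something like the following is what I have in mind for sidestepping the flags argument: look at the entry itself first, and only ask about write access when it isn't a symlink. This is a sketch under assumptions, not the toybox rm code: wants_prompt() is a hypothetical name, and it uses fstatat() directly instead of the stat info dirtree already has.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: should rm ask before deleting this entry?
 * Returns 1 for yes, 0 for no, -1 if the entry can't be examined at all. */
int wants_prompt(int dirfd, const char *name)
{
  struct stat st;

  /* Look at the entry itself, not whatever it points to. */
  if (fstatat(dirfd, name, &st, AT_SYMLINK_NOFOLLOW)) return -1;

  /* A symlink's own permission bits mean nothing, so never prompt. */
  if (S_ISLNK(st.st_mode)) return 0;

  /* Anything else: flags of 0 stays within what the syscall supports. */
  return faccessat(dirfd, name, W_OK, 0) ? 1 : 0;
}
```

This still says nothing about whether the containing directory is writeable, but as noted above, if it isn't then the delete fails anyway and the prompt is moot.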

December 8, 2012

And ubuntu crashed, for the first time in a while. (It was X11: chrome tabs kept freezing, then _any_ chrome tab would freeze after a few seconds, then the whole of X froze so badly the mouse pointer wouldn't move and ctrl-alt-F1 wouldn't give me the text console. Remember, the system is SO MUCH MORE STABLE if your only way of interacting with it is through a userspace process.)

This means my 8 desktops full of open windows all went bye-bye. Probably a sign I was swap-thrashing, but it's back to my todo lists to try to figure out what I was working on...

Stopped by Dragon's Lair's "webcomics rampage" thing shortly after lunchtime, but apparently they don't start until 6pm Saturday this year. Ok...

December 7, 2012

Saying "Since the server breakin we've deployed SELINUX" roughly translates to "Since that boat sank we've sprayed WD-40 on everything". Not helping. Really, not helping.

(See also: thinking WD-40 will turn a boat into a submarine, arguing about the merits of WD-40 vs scotchguard for stopping a dripping faucet, allowing people who sell undercoating and extended warranties to design IT security "solutions"...)

December 6, 2012

Sometimes, posix is so nuts _nobody_ implements it properly.

The posix rm spec, section 2, says how to handle interactive prompts for recursive deletion of directories. Section 2(b) says to prompt before descending into a directory, and 2(d) says to prompt before deleting the now empty directory.

This is not what the gnu/dammit implementation of rm does:

$ mkdir -p temp/sub
$ rm -ri temp
rm: descend into directory `temp'? y
rm: remove directory `temp/sub'? y
rm: remove directory `temp'? y

It only prompts before descending into a non-empty directory. But the spec doesn't say anything about the directory being empty, it says you prompt for 2(b) and you prompt again for 2(d).

Also, section 4 is just awkward. The bits dealing with directories should be 2(e) (because you can't get there unless you made it through 2(a)), and the bits dealing with files should be 3(b).

Oh, and posix requires infinite directory traversal depth (even though filesystems have a finite number of inodes), and explicitly says you can't keep one filehandle open per directory level. This means that A) you have to traverse ".." to get out of the directory you're in, pretty much guaranteeing that two parallel rm -rf instances on the same tree cannot both complete, B) you have to consume an unbounded amount of memory caching the directory contents because you can't keep the filehandle open and restarting directory traversal with -i would re-query about files you already said "n" to at the prompt.

Somebody really didn't think this one through, but even _trying_ to make it compliant means I have to start over and write a lot of bespoke code that only applies to rm. Not sure it's worth it. (I'll probably break down and do it, but I'm going to sulk a bit first and call the standards guys names.)

December 5, 2012

Updated the dirtree infrastructure to feed parent to dirtree_add_node() so it can print the full path to errors. The _other_ thing I need to work out how to do is notify a parent node that one of the child nodes had an error.

I'm going to have to have multiple negative values in parent->data for a COMEAGAIN callback, aren't I?

Oh well, could be worse. The two uses for that field are dirfd for directories and length for symlinks. (Should be zero for normal files.) Symlink length should never be negative and the only negative fd is AT_FDCWD (which is -100, and that's hardwired into the linux ABI at this point).

No, that doesn't work because I can't just reach out and set parent->data at failure time because it's using that filehandle to iterate through a directory and there may be more valid entries after the failing one. So I'd have to defer setting it, which means I need another place to store it which means it should just _stay_ there. I'm going to have to allocate another variable in struct dirtree, aren't I? (I keep being tempted to overload fields in struct stat, but they're just not well defined enough.)

Actually getting the functionality right took a couple hours. Getting error handling/reporting right is coming up on day 3.

December 4, 2012

I wonder why "rm" cares about the files it's deleting being read-only? The "unlink" command doesn't. Oh well.

Yeah, finally working on The Most Dangerous Command, which I've held off on not because it's hard but because it's most likely to screw up my system if I get it wrong.

The yesno() stuff is wrong, it's checking stdin, stdout, and stderr (in order) to find a tty and using the first one it finds, meaning it's trying to write a prompt to stdin. I can special case my way past this, but in general working out when "yes y | thingy" should feed answers to yesno and when "zcat file | tar" should bypass stdin and bother the tty... I think the caller has to specify the behavior it wants. Gonna have to rewrite that function, but I probably need more than 2 users to work out the right semantics.

The ubuntu man page for rm has a --no-preserve-root option, which is a technical violation of posix but probably a good idea anyway. Except that longopts with no corresponding short opts are kinda tricky. (I _sort_ of have support for them, you can put parentheticals before all the other options and it'll parse them and set a flag. But there's no FLAG_ macro for those, and teaching the shell script to do that sounds painful. It would wind up being something like FLAG_no_preserve_root anyway.)

I could trivially just ignore "/" (and have the error message be "rm /. if you mean it"), but posix doesn't require it and it conflicts with simplicity of implementation. If you log in as root and "cd /; rm -rf *" or "rm -rf /*" you're equally screwed without hitting the special case. Doing a realpath check for "/" might catch a couple more ".." than you expected, but how is taking out "/home" by accident much better?

Another fun little piece of special case behavior:

mkdir -p sub/other
touch sub/other/file
chmod 400 sub/other
rm -rf sub

This complains about being unable to delete "sub/other/file". It does _not_ complain about being unable to delete sub/other, or sub, so error reporting is suppressed as it works its way back up the tree. (But only for parent directories, not siblings and aunts and such.)

My first pass at the code complained about being unable to open ".", "..", and "file", with no paths. Even though . and .. are discarded by the callback what it was complaining about is inability to stat and thus generate a dirtree node, because the directory has r but not x, meaning you can list the contents but not stat the files in it. (How is that useful, you say? No clue.) The dirtree infrastructure notes the stat failure and doesn't construct a node, thus can't call the callback. I can trivially filter out . and .. from error reporting, but giving a path to file means calling dirtree_path() which means having a node to call it on.

I think what I need to do is move the error handling to the callback, which means have it make a node with zeroed stat info. I can probably detect that by looking at st_nlink which should never be zero, but occasionally is. Sigh, inband signalling vs wasting space in every node. And either way I need to go fix the existing users in ls and chgrp and such... Not liking this. What's the alternative? Making generic infrastructure too clever is unpleasant, but having it display the full path to a file it couldn't access is probably the right thing...

Also, really hard to make a test suite that captures the error output in a way the test passes on multiple implementations producing slightly different error messages. Maybe I can just count the number of lines or wash stderr through grep or something...

December 3, 2012

The losetup command is more complicated now than when I wrote one for busybox, mostly because the kernel keeps growing new features. You can set the _length_ of an association now, not just the starting offset. It's got a "capacity check" thing that updates the loopback device to match a changed file size (which is _not_ the same plumbing as setting the length of the association because the people writing this aren't big into code reuse). You can iterate through all loop devices associated with a given file...

The fun one is /dev/loop-control meaning loop devices come and go now, so the -a, -f, -j, -d, and possibly even the basic association options are more complicated, in a way I'm not sure how to avoid race conditions for, but only after July of last year. (xubuntu 12.04 has this, but 10.04 didn't.) And looking at the log of drivers/block/loop.c in the kernel there's a LOT of activity going on in what you'd think would be an ancient stable system. Partition support (tweaked August 2011) with LO_FLAGS_PARTSCAN (no support in the userspace losetup xubuntu's using). And of course the kernel parameter to set the number of pregenerated loop devices (potentially to zero, so you _must_ use loop-control to request new ones).

It's a bit like moving from the old static PTY devices to /dev/pts, except not quite cleanly separated.
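For reference, asking loop-control for a free device is a single ioctl. A minimal sketch, assuming kernel headers new enough to define LOOP_CTL_GET_FREE in linux/loop.h; loop_get_free() is a hypothetical name, and taking the control device path as an argument is my choice for illustration (normally it'd be "/dev/loop-control").

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>

/* Hypothetical helper: ask the loop-control device for an unused loop device.
 * Returns the device number N (as in /dev/loopN), or -1 on failure. */
int loop_get_free(const char *control)
{
  int fd = open(control, O_RDWR), num;

  if (fd == -1) return -1;
  num = ioctl(fd, LOOP_CTL_GET_FREE);
  close(fd);

  return num;
}
```

Note this doesn't reserve anything: another process can grab the same device before you bind a file to it, which is the race mentioned above. The caller has to be ready for the association to fail and ask again.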

December 2, 2012

One of the musl guys expressed interest in a big endian arm target for aboriginal, so I'm taking a stab at that. It's... tetchy.

Ok, the old armv4eb target built something but qemu couldn't boot it. The main problem here is it's oabi, which is obsolete. Nothing uses OABI anymore, and EABI requires thumb extensions so we need to bump up a notch in processor version to support this.

New target config, armv4teb, based on an unholy union of armv4tl and armv4eb. Diff the two and see what changes armv4tl needs. The gcc/binutils tuple is derived from the filename, so that should be ok. In the uClibc config there's an "ARCH_WANTS_BIG_ENDIAN", set that.

Next problem: the kernel doesn't want the versatile board to be big endian. There's some big endian plumbing support but the key is declaring an ARCH_SUPPORTS_BIG_ENDIAN symbol that currently only the ancient mach-ixp4xx declares, more or less what the armv4eb oabi stuff from before was aimed at. (Apparently you can declare the _existence_ of symbols inside conditional blocks testing on symbols which may be modified at runtime. Wheee.) This big endian plumbing has further derived symbols for armv6 and armv7 processors, but nothing sets it. Leaked infrastructure to support out of tree boards, looks like. The plumbing's there but no board definition uses it.

So, patch the kernel kconfig so the versatile board is even MORE versatile, and then set CONFIG_CPU_BIG_ENDIAN. Now try to build it...

And the sanity test at the end of simple-cross-compiler fails because the C library is big endian but the compiler is trying to build a little endian hello world. And it's doing that because uClibc feeds a CFLAG to force big endian but the default in the compiler is still little endian. Why is the default wrong? Dig, dig... gcc's libbfd is testing the host compiler to set the target endianness. That's just SAD. Ok, in the target config "export ac_cv_c_bigendian=yes" and try again... and that test is coming out right but it made no difference to the smoke test.

Right, the armv4eb test bit-rotted in current releases but it worked in 1.1.1 so check that out and build a working big endian arm oabi toolchain to compare against... Right, got something that can run hello world under qemu-armeb. Now to look at its build logs.

This is gonna take a while.

November 29, 2012

I installed a 32-bit x86 Ubuntu 8.04 image in kvm to check how stuff built against old systems, and I found a really weird bug: fstatat() returns sizes of 0x10000000xxxx, I.E. the 1<<44 bit is set. So toybox ls -l says the files are all 17 terabytes.

I traced this down to "what is coming back from fstatat()". It doesn't do this on other systems, zeroing the stat structure before making the system call doesn't change the behavior.

The stat structure seems to be defined in /usr/include/bits/stat.h which has outright #ifdef salad. Hmmm...

Ok, found it. My "ancient glibc workaround" stuff in portability.h (to deal with libc header versions that predate posix-2008) was prototyping, and thus calling, fstatat(). But the header magic substituted fstatat64() for that behind the scenes, so rename the prototype and throw a #define in there, and now it works.

November 28, 2012

So much of economics makes more sense if you realize that supply and demand are different things.

The great depression was a demand-side crisis, triggered by the stock market crash of 1929: nobody spent money after that because they were drowning in debt and out of work. The stagflation of the 1970's was a supply-side crisis, triggered by the OPEC oil embargo: not enough oil, which is an input to almost everything (transportation, fertilizer, medicine, energy, plastics, etc) so the economy can't produce enough stuff (stagnation) and everyone bids up the prices (inflation).

The current slump is another demand-side crisis: it's like the 1930's and NOT like the 1970's. This is _really_obvious_ if you compare the two.

When there's a shortage of demand, unsold inventories pile up, factories idle, and unemployment skyrockets because there aren't enough buyers for the goods and services already out there for sale, let alone more of them. The solution is to increase demand, if all else fails by having the government buy lots of stuff and hire lots of people until we work through the backlog. (They literally _can't_ run out of money, they can print the stuff. If doing so starts triggering inflation, that will actually help erode the mountains of debt preventing everybody else from spending enough money to run the economy. But it turns out if you're not exhausting the supply of goods and services, all the money in the world won't cause inflation because you're not bidding against anybody _else_ when you buy stuff. You're just employing the unemployed and buying stuff they've got too much of, "utilizing idle capacity" it's called. FDR used it to wire up the whole country with electricity and plumbing and telephones and highways and he STILL couldn't soak up all the out of work people and unsold raw materials.)

When there's a shortage of supply, people bid up the prices for what you've got (causing inflation), and that's your FIRST sign of trouble. You don't wait for it on top of anything else, years into some other problem, it's how you know there's a problem in the first place. You wind up with long lines of customers waiting to buy backordered stuff, and people trying to jump the queue by offering a higher price. It hurts everyone else because their salary may not increase fast enough to keep up with a rising cost of living, but they probably still have a job (or can find one) because we really need them to produce more stuff. If we can't do so efficiently without the right raw materials, we do so inefficiently (using more people and charging a higher price).

The symptoms of these problems are VERY DIFFERENT. You don't get inflation or long lines due to a shortage of demand, if anything you get _deflation_ as prices creep down to find buyers for the excess capacity. You don't get piles of unsold inventory from a shortage of supply (almost by definition: we've got plenty of stuff to sell which nobody's buying is not a supply shortage).

In the 1930's, Keynesian economics was invented to figure out how to deal with the largest depression we'd ever seen, and it did a marvelous job of dealing with demand shortfalls, leading to a series of stimulus measures: FDR's New Deal, lend/lease, and finally spending on World War II itself, which we could afford to do because the economy was already recovering. Note that the other big country that did massive government stimulus during the great depression was Nazi Germany. Which economic powerhouses emerged from the great depression to slug it out? How could two utterly economically devastated countries suddenly turn into production behemoths? Because when all the idle capacity got mobilized, the stall fixed itself.

In the 1970's, economists initially responded to the oil embargo with economic stimulus, exacerbating the shortages and driving inflation through the roof. They made the problem worse by misdiagnosing it, and had to invent a whole new economic school of thought ("voodoo supply-side economics") to cope, which was actually mostly smoke and mirrors (and an excuse for truly massive embezzlement of public funds). Mostly the problem had resolved itself by this point, once OPEC had driven up oil prices enough they felt like selling more at the new higher price rather than encourage us to find alternatives. The actual possibly helpful part of 1980's Reaganomics was a carrot and stick approach to the middle east to keep the oil flowing and thus prevent another supply shortfall; our close ties with Saudi Arabia, the Iranian hostage crisis, the Iran/contra affair, two wars in Iraq, etc. The central importance of the middle east in US foreign policy is fallout from the 1970's. (Of course Carter's "reduce usage and find alternatives" was far better long-term policy, but instead Reagan made a tiny jihad-prone misogynist theocratic part of the world immensely rich by cowering before their economic might so they wouldn't hurt us again. He just sold it really well, and convinced us the real threat was a country that couldn't reliably feed itself, although that had helped contribute to the overall supply shortage.)

In 2008 we had another demand shock. This time instead of a poorly regulated stock market bubbling and then crashing leading millions of retirement savers to lose all their equity and still have mountains of leftover debt (selling the stock didn't pay off the 90% margin loan), a poorly regulated home mortgage market bubbled and crashed leading homeowners to lose all their equity and still have mountains of leftover debt (selling the house wouldn't pay off the underwater mortgage). TOTALLY DIFFERENT.

The problem with old fogey conservatives is that they're always fighting the last war, not the current one. They replied to the 1970's stagflation as if it was the great depression, and then they replied to 2008's great recession as if it were the 1970's stagflation. They treated the last big supply problem as if it were a demand problem, and this time they made the OPPOSITE mistake and treated a big demand problem as if it were a supply problem.

(Yes, I'm glossing over the 1991 savings and loan crisis, which A) wasn't as big, B) was correctly recognized as a demand side problem requiring stimulus. This was before the GOP's War on Reality led to epistemic closure in a hermetically sealed echo chamber. It was also before "Intelligent Design", climate change denialism, and no tax increases ever under any circumstances. Treating "Reagan Did This, So Must We!" as gospel and responding to a flood with water rationing because he had a drought and you're having torrential rains and both are weather problems... morons.)

I'm frustrated that so little of the discussion has been pointing out the difference between these two _types_ of problems. Shortage of demand and shortage of supply are not the same thing. They're as different as heatstroke from hypothermia. Expecting them to have the same symptoms and respond to the same treatment is _stupid_. (And yes, I blogged on this a year ago, it's still true.)

November 27, 2012

"I have no idea what I'm doing, but I have a lot of emprical tests that either succeed or fail" seems to be science in a nutshell, really.

And thus computer science _is_ science because I dunno what I'm doing but the computer _does_, and demonstrates dozens of errors every time I try to compile or run a program. Even when I wrote the darn program I have to change it extensively to get it to WORK, by trial and error, and often find I've taken a wrong approach entirely and have to scrap what I've done and start over.

November 26, 2012

The contribution making internationalization optional sprinkled #ifdefs through the code, and fixing that turns out to be nontrivial.

Backstory: #ifdefs don't belong in C code, instead I did CFG macros that resolve to constant 0 or 1 that I can if(CFG) on and let the dead code eliminator chop 'em out. I also have USE() macros that resolve to their contents when the feature is enabled and nothing when the feature is disabled. (That works like an #ifdef but all on one line.)

First problem: if I stick a USE() macro around a config option in the option string, it doesn't get a FLAG macro created for it when disabled, so code that uses that errors on an undefined symbol.

Second problem: if the headers are conditionally included the prototypes for the multibyte functions get yanked and it spits out warnings about functions with no prototypes.

The first problem I can fix by teaching scripts/ to produce "#define FLAG_x 0" where appropriate. In theory I could just use that as my guard since it resolves to constant 0 via c99 required constant propagation, but testing the CFG symbol acts as documentation.

The second problem is more "what environment do we build in". Currently musl's locale support is "the headers are there, but we only support a single hardwired locale", which means I can include the headers for that. In theory there are uClibc configs it won't build against, but that's pretty much always true.
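Here's a toy version of how the three kinds of macro fit together when a feature is switched off. Every name in it (CFG_I18N, USE_I18N, FLAG_a, FLAG_w, optstring, wants_wide) is made up for illustration, and the #defines are written by hand where the real ones would come out of the build scripts.

```c
/* Hypothetical generated config: the feature is switched off. */
#define CFG_I18N 0
#define USE_I18N(...)   /* an enabled build would expand to __VA_ARGS__ */

/* Hypothetical generated flag macros. The disabled option still gets a
 * symbol, it's just constant 0 (enabled, it would be a real bit). */
#define FLAG_a 1
#define FLAG_w 0

/* Option string: the "w" only shows up when the feature is enabled. */
const char optstring[] = "a" USE_I18N("w");

/* Returns 1 only when wide output was requested AND compiled in. */
int wants_wide(unsigned flags)
{
  /* Constant false test, so the compiler can drop the whole branch.
   * No #ifdef in the function body, and FLAG_w is always defined. */
  if (CFG_I18N && (flags & FLAG_w)) return 1;

  return 0;
}
```

With the feature off, "flags & FLAG_w" is already constant 0 on its own. Testing CFG_I18N as well is redundant for the compiler, it's there for whoever reads the code next.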

November 23, 2012

Wandering back to the touch command: it doesn't support all 6 posix command line options (-acmrtd), it doesn't support multiple targets (the ... after "file..."), the code that's there needs more cleanup...

The touch command can't use futimes() because touch needs to be able to change the date of a file after "chmod 000 filename", and you can't open a filehandle to that (without first changing the permissions back, anyway).

This means it can't use loopfiles_rw() even with flags 0 (because for historical reasons that's O_RDONLY; there didn't used to be anything useful you could do on a filehandle other than read or write it, so open for reading is always implied and there's no way to switch it off).
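The difference between the two routes can be sketched like so. Both helper names are hypothetical, and I'm using the nanosecond calls (futimens() and utimensat()) as stand-ins for whatever touch ends up calling. The filehandle version dies at open() on a "chmod 000" file unless you're root, while the path version never opens anything.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Fill both timestamps (index 0 is atime, index 1 is mtime). */
static void fill_times(struct timespec *ts, time_t when)
{
  ts[0].tv_sec = ts[1].tv_sec = when;
  ts[0].tv_nsec = ts[1].tv_nsec = 0;
}

/* Hypothetical: the route touch can't rely on, since it needs a filehandle. */
int touch_fd(const char *path, time_t when)
{
  struct timespec ts[2];
  int fd = open(path, O_RDONLY), rc;

  if (fd == -1) return -1;   /* EACCES after chmod 000 (unless root) */
  fill_times(ts, when);
  rc = futimens(fd, ts);
  close(fd);

  return rc;
}

/* Hypothetical: set the times by path, without opening the file. */
int touch_path(const char *path, time_t when)
{
  struct timespec ts[2];

  fill_times(ts, when);

  return utimensat(AT_FDCWD, path, ts, 0);
}
```

Setting explicit times by path needs ownership of the file (or root), not read or write permission on it, which is why the second version copes with mode 000.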

I want to use loopfiles because it's the existing infrastructure for "apply this function to the remaining arguments", and posix says touch can take multiple arguments (the contributed implementation just does one).

The current code is creating files with the wrong permissions, it should be 0666 which the default umask clips down to 644. If you umask 000 you get rw-rw-rw-...
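In code terms, the creating side should ask for 0666 and let the umask take bits away. A minimal sketch with a hypothetical helper name; using O_EXCL and treating EEXIST as success is my assumption about how to avoid tripping over an existing unreadable file, not what the current code does.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create path if nothing is there yet. Asks for 0666
 * and lets the process umask clip it. Returns 0 on success, -1 on error. */
int create_if_missing(const char *path)
{
  int fd = open(path, O_WRONLY|O_CREAT|O_EXCL, 0666);

  /* Already existing is fine, even if we couldn't have opened it. */
  if (fd == -1) return errno == EEXIST ? 0 : -1;
  close(fd);

  return 0;
}
```

With umask 022 a new file comes out rw-r--r--, and with umask 000 it comes out rw-rw-rw-.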

This is going to need a large test suite.

November 21, 2012

Discarded readlink -m because it's horribly designed gnu crapware: "readlink -m /dev/null/and/subdirectories" is not actually answering a useful question. When you go readlink -e you get something that exists, when you go readlink -f you get where it would try to create something if you wrote to that path, when you go readlink -m you get... something. Something that happily traverses through nonexistent links and responds to ".." entries afterward by backing up a directory (so root and non-root may get a different answer for "readlink -m /root/symlink/.." if symlink exists but root isn't readable to other users...)

Wrote my own realpath implementation, although it's xabspath in the toybox library at the moment. Might have been overkill, but I've been fighting with that on and off for years and I just got tired of arguing with libc over this. Still debugging it of course, but the code's accessible now. (Ok, depends how you define accessible...)

And now it is time to drive over the river and through the woods, to grandmother's house.

November 20, 2012

The "readlink" command is being stroppy. Fixing "cp -r" to detect infinite recursion properly involves getting canonical paths for each source and target, which basically means doing "readlink -f". Except the toybox readlink -f doesn't quite work right.

So yesterday I rewrote the xabspath plumbing to strip off entries at the end of the path until realpath() returned something, and then glue the appropriate amount back on (or fail) as necessary. I extended readlink to implement not just -f but -e and -m as well. (Sadly, there's no standard for this command.)

Today I tackled upgrading the test suite, and found some of the existing tests no longer work. Specifically, if the path ends with a broken link, readlink -f should show where the link points to, not where the broken link lives. (The point of readlink -f is "if I write here, where would it attempt to create a file".) The problem is, realpath() returns NULL for a path ending with a broken link, and I can't beat different behavior out of code locked away in libc.

What I can do is call readlink() on each path realpath() doesn't like, and if it returns something I know I have a broken link. Except what if it's a broken link to "../subdir_that_exists/./broken", I need to clean up the path some more, duplicating realpath's job.

Which basically says I need to write my own realpath. I was trying to avoid doing this, but the one in glibc at least does not do what I need.
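For the curious, here's roughly what "write my own realpath" means, as a Python sketch. This is not the toybox xabspath code (that's C); the name abspath_f() and the 40 link limit are assumptions for illustration. It walks the path one component at a time, splices symlink contents back into the list of work still to do, and only tolerates a missing entry as the very last component, which is the readlink -f behavior described above.

```python
import os

def abspath_f(path, max_links=40):
    """Resolve path like 'readlink -f'; return None if a middle piece is missing."""
    full = os.path.join(os.getcwd(), path)   # absolute paths pass through unchanged
    todo = [p for p in full.split("/") if p and p != "."]
    done = "/"
    links = 0
    while todo:
        part = todo.pop(0)
        if part == "..":
            # done holds no symlinks by now, so backing up one level is safe
            done = os.path.dirname(done)
            continue
        candidate = os.path.join(done, part)
        if os.path.islink(candidate):
            links += 1
            if links > max_links:
                return None               # probably a symlink loop
            target = os.readlink(candidate)
            if target.startswith("/"):
                done = "/"                # absolute link: start over from root
            # Splice the link's contents in front of whatever is left to walk
            todo = [p for p in target.split("/") if p and p != "."] + todo
        elif os.path.lexists(candidate):
            if todo and not os.path.isdir(candidate):
                return None               # e.g. /dev/null/and/subdirectories
            done = candidate
        else:
            # Nonexistent: fine as the last component, an error anywhere else
            return candidate if not todo else None
    return done
```

A broken link pointing to "../subdir_that_exists/./broken" comes back as the cleaned-up path inside subdir_that_exists, because the link text gets walked through the same loop as everything else.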

November 19, 2012

Sigh. Lateral progress.

The "mame" package plays old coin-operated game roms. Five years ago, it worked great. I had to tell it "-scale 2" or even 3 for various games (displays are much higher resolution than in the 80's), but life was good.

Recently they redid the display logic, so it fullscreens itself by default. I don't want that, I want it in a window so I can pause it and do other things. So "-w" to windowize it. It grabs the mouse by default, so "-nomouse" to stop that. It detects the current screen height by default, but does _not_ adjust for pixels used by the title bar of the window it's in, so the display goes off the bottom of the screen. I don't know how to fix that.

This worked fine for years. Then they "improved" it, and now it doesn't work. Welcome to Linux on the desktop, where we move constantly sideways breaking things that used to work and calling it progress.

Did I mention the sound control in the desktop no longer works, and I have to launch "pavucontrol" and navigate to tab 3 and scroll down to find the control that actually controls speaker volume? And that if I plug in headphones the speakers go dead but I get no sound from the headphones either?

November 18, 2012

Ooh, judgement call time. (Toybox design issue.)

If you call error_exit() before setting toys.which, currently it segfaults trying to print this-> in the error message. This doesn't happen much, but it can happen if CONFIG_TOYBOX_SUID is enabled and you don't have the appropriate permissions. The permission dropping is done right at the start of toy_init(), before it does the setup that would allow it to print good error messages.

Two ways of fixing this: move the permission dropping down further, or teach the error function to skip the command name when the pointer's null. One results in doing more work as root, the other results in bad error messages.

I'd go "obviously, better error messages" if I hadn't seen a recent counterexample. (Ok, the bug there was "don't feed user data to the first argument of printf", but still.)

Ah, proper fix is of course the third option: temporarily init the global context to point to the toybox multiplexer early on. That way it attributes early complaints to "toybox", which isn't necessarily ideal but eh.

The whole thing adds another wrinkle to building standalone commands, but that's what todo lists are for.

November 17, 2012

The powerpc target isn't working for me in the latest aboriginal linux release: segfaulting right after hard drive initialization, before launching userspace. (Yes, the _emulator_ is segfaulting.) Cue (or queue) the wild flailing as I try to figure out what broke.

Upgrade my random qemu snapshot du jour to latest git: nope. Checkout 3.5 kernel from linux repo and build that: nope. Extract last "known good" 1.2.0 release tarball and try to run that: nope. Ok, that last one says it's _got_ to be qemu, and it's still broken in current -git.

Time to bisect, with the magic invocation "./configure --target-list=ppc-softmmu && nice -n 20 make -j 2" building just the powerpc target so it doesn't take an hour each iteration (as I said, slow netbook)... And it's:

commit b90600eed3c0efe5f3260853c873caf51c0677b1
Author: Avi Kivity 
Date:   Wed Oct 3 16:42:37 2012 +0200

    dma: make dma access its own address space
    Instead of accessing the cpu address space, use an address space
    configured by the caller.
    Eventually all dma functionality will be folded into AddressSpace,
    but we have to start from something.

Right. No idea what's up with that, but it's definitely what broke it. Reverting it fixes it, but it doesn't revert cleanly against current git. Right, report it to the qemu list and install 1.2.0 for now.

The powerpc snapshot requires 20 seconds instead of 15 for the initial delay before writing the first command. (If powerpc gets input when it isn't expecting it, the sucker locks up. I think the behavior for the serial buffer filling up is wrong. Probably another qemu glitch, although heck it could be kernel for all I know...)

November 16, 2012

Slipped in the parking lot of the grocery store and took all the skin off the front of my ankle. (Those semi loading pads are darn slippery, one of them was left on the asphalt and slid a couple feet with my left foot on it. My right foot was still on the asphalt and wound up under me as I fell over backwards. Sprained my right wrist a bit too, but the grocery store was very nice about supplying ice.)


November 15, 2012

Is it just me, or is Atlas Shrugged the ultimate Mary Sue fanfic for the idea that if you take your ball and go home society will collapse without you? Lots of people say "I'll go away and live in a cave and THEN you'll be sorry" when they're five years old, but very few make it to adulthood and write a novel where that's the plot.

I suppose the book could only have been written by somebody who never accomplished anything else in their life. Real pioneers like Newton are sure that they stand on the shoulders of giants, they're very aware how much they build on the work of others. Even in the less scientific parts: Lewis and Clark explored the territory Columbus opened up, the US constitution was modeled on ancient greek ideas of democracy as filtered through generations of later philosophers, and so on.

Isn't it generally the "I coulda been a contender" types who are sure someone else is holding them back? Following an apparent line of reasoning 1) they've accomplished nothing, 2) it's not their fault, 3) profit!

I dunno, the confederacy just overwhelmingly voted for a devotee of Ayn Rand to be vice president. They're an epically racist lot, still mad that black people are no longer doing all their work for them. I suspect these are connected.

November 14, 2012

Aboriginal Linux 1.2.1 is out. Building lots of binaries.

Sigh. Toybox threw a use after free error in tail -f. Gotta go read through Timothy's code more closely and see what's up. (Cosmetic problem with the build, not sure it saved all the logs.)

Going through the static-tools build control image to make sure everything is the most recent version. Strace 4.7 is still current, Dropbear 2012.55 is still current.

Adding a native busybox build to target means I only get it for the targets that can actually build stuff, which is different than the set that can run the busybox binary. For example, sparc dies if the kernel command line is too long (openfirmware throws an error), so can't set up the three drives. And sh4 only has 64 megs of ram and no third hard drive, so it's excluded twice due to a crappy board emulation. (Fixing that involves learning a lot about device trees and/or modifying both qemu and the kernel. Probably all three.)

Last month I got pointed at the ph7 embedded php engine which sounds like fun, but the source is a zip file and the current setupfor infrastructure doesn't understand zipfiles. (And baseconfig-busybox doesn't include unzip.) So it's easy enough to build on target by hand (build defconfig busybox or info-zip as a prerequisite, unzip it,

Hmmm, the busybox build's scripts/ does 'mkdir -p -- "$d" 2>/dev/null' and that makes a "--" directory. I have code to filter that out, why isn't it triggering...

November 13, 2012

Toybox 0.4.1 is out. Working on Aboriginal...

November 10, 2012

I have a giant re-indent pending for toybox, which I want to be the first commit in the NEXT development cycle. Doing the re-indent involves the sed regexes mentioned earlier plus various by-hand fixups (removing the vi: indent comment at the top, reflowing lines that now fit in 80 chars, adding/removing blank lines that are currently wrong, and turning two spaces to one space after periods when I notice them, now that people are doing the "we have always been at war with bananastan" thing about one space because HTML treated all runs of whitespace as a single space and people got used to reading it that way so it became the new standard...)

The most time consuming bit though is that some of the code is indented with four spaces, some is indented with tabs, some with a mix, and the occasional file is indented with things like three spaces. So I have to pick the right set of regexes to normalize each file, then do the manual fixups.

I'm about 2/3 of the way through the codebase doing the reindent, and this screws up the repo history enough I want it to be at the start of a release cycle, and I'm testing the code for a release and finding little problems.

So I keep backing out the reindent in a file, making a fix, committing it, then applying the fix to the reindented version.

Cutting releases usually works like this, the things I want to do (and am partway through doing) get blocked on noninterference with stable/testing, and having two diverging branches doesn't help because I just wind up porting fixes between them (potentially introducing regressions), so I wind up juggling and impedance matching between versions until I can get the known-good snapshot archived.

Lots of projects deal with this by never HAVING known good snapshots, but that's just developers being lazy. Suck it up and deal.

November 09, 2012

The uClibc issue turned out to be yet another subtle bug in uClibc's makefile dependencies.

Building a full run of all architectures plus native static tools and LFS on my netbook is slow. (As in all day to build all targets, then who knows how long to natively build stuff for all targets. The AMD processor in this sucker is not as nice as the Intel one was, but Atom can't handle 8 gigs ram. I'll take overall slowness vs locking up swap thrashing for a minute and a half so I can't read email while it's doing other stuff.)

And then of course I have to restart it with each new bug fixed. (The release binaries should be built from a specific repository version. That's how you do a development stabilization so they can bisect new bugs.)

November 5, 2012

Darn it. I wanted to get a quick pair of releases out to flush the current changes in both aboriginal and toybox before doing my next big set of changes to both (musl and the re-indent, to start with), and my regression test found... something.

When I build i686 by itself, it works fine. Goes all the way through Linux From Scratch 6.8. (Got pending 7.2 updates from a contributor, I should follow up on that too...)

When I do the "FORK=1 CPUS=1 more/" for the release, uClibc's header install doesn't install bits/sysnum.h. The uClibc build is behaving differently depending on what _else_ the system is doing while it builds.

I... how do you break that?

I need to confirm this _isn't_ something to do with the build now using toybox by default (or just having upgraded toybox to would-be 0.4.1). Unfortunately, the test case that reproduces the problem takes FOREVER to run. (My netbook has 8 gigs of ram so it runs comfortably, still has a gig of memory free during it. But the processor is overwhelmed and takes hours to churn through all that compiling.)


November 4, 2012

I keep finding myself having to explain why microkernels were a horrible idea. It requires a bit of context. I just wrote 140 lines of text and was about 1/3 of the way through the full story. (I need to write a book on computer history.)

The brief version is they were designed for abstract hardware that didn't exist, and while the microkernel guys were complaining about the hardware ("our form of government works, we just need a better populace to govern"), the hardware changed in ways that made microkernels deeply, truly suck. (On processors with both an MMU and CPU cache, TLB invalidations flush cache, meaning copying data between process contexts is about an order of magnitude slower than you thought it would be. You can get around this by having a single privileged context that is always mapped, I.E. "kernel space", mapped with a couple large pages with permanent TLB entries, but that's exactly the idea microkernels were trying to _avoid_.)

That's just the tip of the iceberg, though. Microkernels are actually much worse than that, but explaining the difference between theory and reality involves rather a lot of description of both theory and reality. And I do not appear to be able to do so concisely...

November 3, 2012

Since the demise of my old server, I've been building stuff on my netbook, but building an aboriginal linux release (all targets) takes about a full day, which is inconvenient.

The dreamhost machine is running on is an 8-way server with 16 gigs of memory (they're about $800 these days, the current price/performance sweet spot last time I checked), and although it's probably shared with who knows how many other users (containers) I thought maybe a nice -n 20 build would be ok. (After all, I'm not using PHP or perl web scripting, my domain COULD be eating buckets of memory and maxing the CPU, but it's all static pages.) And hey, it's got gcc and make and mercurial installed already, I doubt they'd have done that if they objected to this sort of use of the thing.

So I fire it up and try to build host-tools, and it dies in the toybox build:

lib/dirtree.c:64: warning: incompatible implicit declaration of built-in function 'stpcpy'
lib/dirtree.c:131: warning: assignment makes pointer from integer without a cast
lib/dirtree.c:21: error: 'AT_SYMLINK_NOFOLLOW' undeclared (first use in this function)
lib/dirtree.c:84: error: 'AT_FDCWD' undeclared (first use in this function)
make: *** [toybox] Error 1

Those first two are from either glibc 2.7 or gcc 4.3: even though stpcpy is posix-2008 it's treating it as a funky nonstandard extension you only get if you define feature test macros. So I should probably add a probe for that and a workaround in portability.h.

The second two are because the headers are too old, from before those #defines were introduced.

Translation: "Your build environment is too old". Specifically, the system is circa 2008:

$ lsb_release -d
Description:	Debian GNU/Linux 5.0.9 (lenny)
$ cat /proc/version
Linux version (root@womb) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #2 SMP Sat Mar 13 00:42:43 PST 2010
$ gcc -v 2>&1 | tail -n 1
gcc version 4.3.2 (Debian 4.3.2-1.1)
$ ld -v
GNU ld (GNU Binutils for Debian)
$ ls -l /lib/
lrwxrwxrwx 1 root root 11 2011-01-12 01:29 /lib/ ->

For toybox I've made the design decision to use the newest available linux kernel features, on the theory that Moore's Law is still happening (if starting to show signs of the S-curve flattening out), and we're not going backwards. It's a simplifying assumption that lets me avoid having duplicate code paths to handle old kernels... but it means I don't handle old kernels. And it means that 4 years ago isn't recent enough to build this. Hmmm.

My rule-of-thumb has been that 7 years is about how long you want to care about this sort of thing. (That's a human/sociological thing, which crops up everywhere from the "7 years of bad luck" for a mirror to this comic.) So on the one hand all I have to do here is wait another 3 years and this fixes itself, and on the other Aboriginal is still stuck on gcc 4.2 and binutils 2.17 for license reasons (which is why I need to get going on qcc). So I can't say I've strongly followed this rule of thumb myself. (Yeah, toolchain's a bit of a special case since c99 is still current especially with c11 being a bit of a joke. But in terms of architecture support: armv8, microblaze, hexagon... Getting problematic.)

Because of said rule of thumb, I should probably add a compiler probe and some portability.h glue to #define my way past this... except it's pretty easy to just extract a root-filesystem-i686 tarball and build under that in a chroot... except I don't have root access on the server. (I could probably jump through hoops to set it up, but haven't at the moment. I've got a web panel I config stuff through.)

The problem is, having the #defines for openat() variants in your toolchain doesn't help if the kernel's system calls don't support them. The system may be able to build it, but will it run the resulting executable? I can test by checking out v2.6.32 in the linux kernel git repo and grepping the source, and both of those are in include/linux/fcntl.h. So the kernel supports it just fine, and glibc hadn't caught up with reality (and didn't bother to use the kernel's exported constants directly, instead had their own erroneous and laggy copy of them).

So it looks like this old kernel has the functionality I need, and the old toolchain doesn't. So if I can get it to compile, it should work. Hmmm...

November 2, 2012

Ha. I've been saying for years that whoever plugged a smartphone into USB to add keyboard mouse and screen first would win. Looks like Samsung did it.

Meanwhile in Toybox, I broke down and re-indented all the code to two spaces. Before there was a policy of tabs to be interpreted as four spaces (inherited from busybox), but in reality the code had a mixture of tabs, four spaces (what tcc uses), three spaces, and two spaces all in the same codebase, often in the same file.

The code I do for myself has been two spaces for a couple decades now (wow, I've been doing this a while), and as long as I'm designing toybox to be code I'm happiest with, might as well give in and do my preferred whitespace even if it's not what other projects do.

The sed invocation to convert leading tabs to two spaces each is:

sed -i ':loop;s/^\( *\)\t/\1  /;t loop' filename

And the sed invocation to convert groups of four leading spaces to tabs is:

sed -i ':loop;s/^\( *\)    /\1\t/;t loop' filename

In theory, anybody annoyed by this can use the regex to convert the indent back to tabs, and then convert it back to diff a patch.
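If sed's loop syntax makes your eyes cross, here's the same pair of conversions as a Python sketch. The function names are invented for illustration, and it mirrors the regexes above one for one (including the limitation that the second one only finds groups of four spaces sitting ahead of any tabs).

```python
import re

def _repeat_sub(pattern, repl, line):
    """Apply one substitution over and over until the line stops changing."""
    while True:
        new = re.sub(pattern, repl, line, count=1)
        if new == line:
            return line
        line = new

def tabs_to_two_spaces(line):
    # Same as: sed ':loop;s/^\( *\)\t/\1  /;t loop'
    return _repeat_sub(r'^( *)\t', r'\1  ', line)

def four_spaces_to_tabs(line):
    # Same as: sed ':loop;s/^\( *\)    /\1\t/;t loop'
    return _repeat_sub(r'^( *)    ', r'\1\t', line)
```

Only leading whitespace is touched either way: a tab in the middle of a line (say, in help text) is never at the end of a run of leading spaces, so the pattern can't reach it.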

I'd have done it a while back if "hg annotate" had something like the "hg diff -w" option to ignore whitespace, but alas no.

While I'm at it, I'm cleaning out the vi set ts=4 lines that stopped working in Ubuntu years ago (Ubuntu's continuing war against vi remains annoying), trying to make sure that command options in the help text have a tab after them (keeps the binary size down), removing trailing space at the end of lines, and so on.

Once again, grooming the codebase rather than actually coding. As you do.

November 1, 2012

A while back Anthony Liguori asked me for a way to build Aboriginal Linux from git repositories. You could do so by hijacking the package cache, but the result was ugly and hard to explain.

It was one of those things I never had the time/energy for while The Cubicle was draining the life out of me, but it's come up on the mailing list again.

So I just cleaned it up, documented it in the FAQ, and provided an example script (more/ to set it up for the more common ones. The new design is that repository checkouts go in packages, and if they're there it'll use them by default unless IGNORE_REPOS tells it not to.

I ripped out the old alt- logic entirely, because this is a better way to do it.

Everybody seems to want to do some sort of git subordinate thing to splice together various repos the way Android reinvented with their "repo" command, and I think this new design allows them to do that without me having to care. :)

October 28, 2012

Ok, digging up my old half-finished toybox mount code led me to starting a umount command (since that's simpler and related), and _that_ led me to doing losetup (since losetup -d is part of umount, and the way I did busybox mount it autodetects the need for loopback devices anyway so you never need to say -o loop).

Looking at the current linux/loop.h header file to see what the ioctls are... it's changed a bit. There's an enum with flags... autoclean? Partition scanning? What does partition scanning _mean_? (Possibly you select which partition you want, somewhere? I doubt it would dynamically create /dev/loop1.1 both because there's no reserved minor range and the naming is non-obvious with loop devices already being numbered instead of lettered like most partitionable devices...)

Off to read the kernel code to see what the heck this means. I feel that I should document it, but am not sure where...

October 26, 2012

Bit-rot's an interesting concept.

The df command was the first one I added to toybox. I was actually halfway done with a fresh from-scratch rewrite of df for busybox when Bruce happened all over everything, so I finished my df implementation for toybox instead. It worked fine at the time, and I've always been proud it doesn't special-case rootfs or anything silly like that (as busybox did at the time, prompting my rewrite).

Today I tried toybox df and it gave the wrong answer, because the getmountlist() code in lib is now reversing the /proc/mounts list, so having df also do it means it thinks rootfs is the most recent mount on /, not the /dev/sda1 that's overmounted it. I.E. library code changed out from under the df implementation.
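The double reversal is easy to see in a small Python sketch. This is not the toybox getmountlist() code, just an illustration: mounted_device() is a made-up helper, and the sample mount table in the usage note is invented.

```python
def mounted_device(mounts_text, mount_point):
    """Return the device of the most recent mount on mount_point, or None."""
    entries = [line.split() for line in mounts_text.splitlines() if line.strip()]
    # /proc/mounts lists oldest first, so walk backwards to find the newest
    for fields in reversed(entries):
        if len(fields) >= 2 and fields[1] == mount_point:
            return fields[0]
    return None

def reverse_lines(mounts_text):
    """What a library does if it hands back the list newest first."""
    return "\n".join(reversed(mounts_text.splitlines()))
```

Feed mounted_device() a table where rootfs is mounted on / and /dev/sda1 is mounted over it, and it reports /dev/sda1. Run the same table through reverse_lines() first and the caller's own reversal undoes the library's, so rootfs wins, which is the wrong answer df was giving.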

Ubuntu's mounts now bounce off /dev/disk/by-uuid which has a symlink to the real /dev/sda1, and the path in /proc/mounts is to the non-human-readable symlink. So it needs to resolve symlinks to give reasonable names. I.E. the OS changed out from under df.

And since then, I implemented xprintf() to detect when writing to stdout fails, and checking the other "df > /dev/full" it does print an error to stderr. So the requirements changed out from under df.
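The idea behind that check, as a Python sketch: toybox's xprintf() is C and may differ in detail, and xwrite() here is a hypothetical stand-in. The point is that buffered output can appear to succeed and only fail when flushed, so you have to flush and check before claiming the write worked.

```python
import sys

def xwrite(text, out=None, err=None):
    """Write text to out and flush; on failure complain on err and exit 1."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    try:
        out.write(text)
        out.flush()          # buffered data may only fail at flush time
    except OSError as e:
        err.write("write failed: %s\n" % e)
        raise SystemExit(1)
```

On Linux, pointing out at a file opened on /dev/full should trip the error path, since every write there fails with "No space left on device".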

This is what bit rot looks like. I just fixed all three.

(Sigh. I fix my email, and dreamhost changes the server's key so rsync stops working, so I can't update the web page until I figure out if somebody's doing a man in the middle attack. Great. Oh well, I guess this blog entry and the new hg commit should go up someday...)

(Update: Apparently dreamhost changed its keys on purpose. Ok, easy to fix on my end. They send emails all the time, but this I had to check their website for.)

There was a _line_ at UT for early voting today. Good sign. (Bunch of local city issues on the ballot so badly worded I couldn't figure out what they were trying to accomplish. And a bunch of judgeships where the only candidates running were a republican and a libertarian, and I'm not sure voting for a libertarian counts as voting against a republican since they're both "Make John Galt Koch richer by selling poor people as soylent green".)

October 25, 2012

Email downloaded. Chipping away at the backlog.

Finding _lots_ of things wrong with Balsa, and reporting them to the list. I'll have to go back to installing it from source and start banging on it, but right now that's not a priority.

Played a lot of Sims 3 today until I biked to Fry's to get away from it. Picked up a dozen donuts on the way back. Not sure this really counts as a net health improvement.

I need to set up a git repo I can start feeding linux kernel documentation through, since is too bureaucratic for me to cope with.

October 24, 2012

The touchpad in xubuntu is kind of annoying. If I don't check the "disable keyboard while typing" button my thumb hits it and does random stuff all the time. If I check that checkbox, the touchpad is dead for something like two seconds after typing so I'm always making little circles with it until the mouse cursor deigns to start moving. Surely there's a happy medium?

I'm sure this used to suck less in previous Linux versions. After all these years, lateral progress remains the dominating factor with desktop Linux. They keep breaking stuff that worked and then treat fixing it a year later like progress. Sigh.

October 20, 2012

Yay, my email is downloading again. It's currently up to September... of last year. It's currently taking about 8 hours to download a month of email, which should be fine on an ongoing basis (15 minutes/day to download email, and it backgrounds nicely) but may take a while to catch up to the present. (Pop3 was designed for dialup, and is thus kinda slow, but imap and gmail just don't combine well.)

I really look forward to having email again. The saga of getting this working was just _silly_. (Earlier today I had to ping Tryn so he could fish my google domain administrator password reset out of an old email address he no longer uses.)

Sometimes, I can't come up with a better solution than pulling on the loose thread until I've unraveled the whole thing, then knit a new sweater out of the yarn. This was one of those times. (Technically, this means I spent a week frogging software.)

October 19, 2012

I remember hearing years ago about some rock star who insisted on a bowl of green M&M's in their dressing room, and threw a fit if it wasn't there. Years later I learned _why_, I think from an Amanda Palmer interview.

Some types of touring musicians drag enormous quantities of elaborate and expensive equipment with them from city to city: maybe a million dollars worth of instruments, lights, amplifiers, mixing boards, pyrotechnics... After each show this stuff gets rapidly torn down, shipped ahead overnight to the next venue, and set up by local technicians who are not familiar with it before the band and their own roadies get there.

Not only do the locals have to avoid damaging the equipment doing this, but if they don't set it up _exactly_right_ it won't sound right, may fail halfway through the show, or even kill someone. Pyrotechnics are dangerous, stages are dangerous, just the lights can start fires or explode sending glass shards flying, or simply fall on someone from 40 feet up, or give you heatstroke bad enough to trigger heart attacks. Even a miswired microphone can kill you.

This stuff is so complicated it's been compared to flying a jumbo jet that gets dismantled and shipped ahead at each stop, and then you have to go up in it and fly in circles for a couple hours. If it was put together by people who didn't follow the instructions, or did a few steps how they think it should go instead of how you said it should go, bad things happen in front of (and sometimes to) very large crowds.

So experienced touring musicians provide the locals with detailed instructions to be followed _exactly_ about how to set up their show in each new venue. And the really experienced ones bury a canary item a couple dozen pages into the checklist, so that when they walk into their dressing room and see the canary they know the instructions were followed exactly and that it's safe to perform. If they DON'T see the canary item, or the canary item is prepared wrong, the rest of the instructions weren't followed correctly either and their equipment isn't going to work.

Of course people who don't understand what's going on attribute it to the musician being crazy. Refusing to perform over a missing bowl of green M&M's? What a prima donna... And there you see the Dunning-Kruger effect in action. One's own lack of knowledge making the experts seem stupid.

October 18, 2012

The way I break everything is really annoying sometimes.

The pop3 downloader built into balsa refuses to give a good enough error message to identify the failing message through the web UI. It's message #331 in the transaction, but that tells me nothing from a web perspective. It pops up xfce transient dialog boxes (two of them, one saying that message 331 had a "data copy error" and the other giving an innocuous snippet of text from the message that the gmail web search command can't find).

I downloaded balsa's git repo and tried to follow what claim to be "detailed build instructions" which start by telling me to run an ./autogen script that isn't in the repo. Ok, try again installing the most recent balsa source release (one dot release newer than the one xubuntu installed), but Gnome Dependency Hell ensued. To satisfy its ./configure I installed intltool, libgmime-2.6-dev, gmime-bin, libgtk2.0-dev, libgtkhtml3.14-dev, libesmtp-dev, libnotify-dev, and then gave up when it still wanted more. (Most of those caused apt-get to produce an explosion of dozens of packages, those are the separate instances of me telling apt-get to trawl for another mound of unnecessary crap.)

So I installed pop3browser in hopes that something else looking through the pop3 interface can identify (and delete) "message 331". It turns out to be a perl script that requires my plaintext password in a .rcfile. (Sigh.) And then it got confused that gmail wants "" to be my _username_ (that google domain hosting stuff Tryn set up when I moved my mail from Eric's server to gmail), so pop3browser tried to connect to "" and had a DNS lookup failure.

I spent a while debugging a perl script (*shudder*) until I figured out it wanted https and pop3browser doesn't know how to do that. Telling aptitude to search for "pop3" pulled up plugins for various "mail systems" (I have an operating system, anything _else_ describing itself as a system is probably overengineered), and the rest were servers. So I started reading about python's pop3 library to write my own darn browser.

This is not what I _want_ to be doing today. I want the darn thing to just WORK. But I use linux on the desktop (smell the usability) and you have to engineer a Rube Goldberg solution to some horrible regression every few months if you're doing that.

(An aside: RFC 1081 defined Post Office Protocol version 3. It was obsoleted by 1225, which was obsoleted by 1460, which was obsoleted by 1725, which was obsoleted by 1939, which was then updated by 1957, 2449, and 6186. And yet it's STILL CALLED VERSION 3. That's the part I don't get.)

Ok, so python poplib.POP3_SSL has apop login... ah, gmail doesn't support apop. So poplib.POP3_SSL(""), pop3.user(myuser), pop3.pass_(mypass), blah=pop3.list(), for i in blah[1]: open(i,"aw").write("\n".join(pop3.retr(int(i.split()[0]))[1]))

(Yes, that's horrible, but apparently that's all you need to download a batch of email; gmail is presenting the mailbox as only containing a few hundred messages, but that's fine. And... it worked fine, none of them complained.)
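Spelled out a bit less horribly, the same idea looks something like this Python 3 sketch. The function names and the .eml file naming are my own invention here; the poplib calls (POP3_SSL, user, pass_, list, retr, quit) are the standard library ones, and the host, username and password are whatever your provider wants.

```python
import os
import poplib

def dump_mailbox(pop, outdir):
    """Save every message the server lists as outdir/<number>.eml."""
    response, listing, octets = pop.list()
    saved = []
    for entry in listing:
        number = int(entry.split()[0])        # each entry is b"<msgnum> <size>"
        response, lines, octets = pop.retr(number)
        path = os.path.join(outdir, "%d.eml" % number)
        with open(path, "wb") as f:
            # retr() hands back the message as a list of lines without newlines
            f.write(b"\n".join(lines) + b"\n")
        saved.append(path)
    return saved

def fetch_all(host, user, password, outdir):
    """Log in over SSL (port 995 by default) and dump the whole mailbox."""
    pop = poplib.POP3_SSL(host)
    try:
        pop.user(user)
        pop.pass_(password)
        return dump_mailbox(pop, outdir)
    finally:
        pop.quit()
```

Splitting the download loop from the login means the loop can be exercised against a fake object with list() and retr() methods, no server or password required.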

Ok, the truncated bit was a "From " line about peter anvin. Grep says that 316 is the only one anywhere near there that's From: him. (Note: the "From " lines are an mbox affectation not present in pop transactions, and I just dumped the messages pop sent me. So balsa hallucinated one of these, but... why?)

Message 331 looks ok? Some of its header continuation lines start with two spaces instead of a tab, which is ugly, but Joe Perches' mailer does that and message 280 had that.

Speaking of "From " lines, grep says that 4 of the messages have "From " at the start of a line, so let's see how balsa is escaping that... As "=46rom ", mime encoding FTW! But that message already had a mime type, what does it do on messages that claim to be plain text? Does the same escape anyway. Ok...

Alright, I typed a draft message into balsa and saved it, then looked in the drafts folder at the encoding. The message was:

Suppose I do crazy things, ala:

From me
 From me

how does it encode that?

And the result was:

Suppose I do crazy things, ala:

 From me
  From me

how does it encode that?=

Which is WEIRD, because it's sticking a space before the "From" and when I _have_ a space in front it sticks two. This isn't what their POP encoder is doing, meaning they have two codepaths with inconsistent behavior.

I don't have time to write my own gui mail reader right now...

October 17, 2012

At the car dealer getting two recalls, a long overdue inspection, and various small things done to the car. I has netbook! (Ok, technically I'm at the junk-in-a-box down the street having the pretzel burger they advertise on The Daily Show. Eh, it's edible.)

So far, I've mostly filled in missing entries in my blog over the past week. ("Caught up on twitter, played sims, went out to have pumpkin bagels at The Mug of Unity..." (Ein Stein -> One Mug -> Unitary Mug.)) Lots of decompressing after a year of Cubicle.

I'm fiddling with the toybox build infrastructure, trying to add better help text, and it's HARD. The kconfig stuff doesn't let you attach help text to menus, and if you put comments at the top of a menu it forces you to cursor through them, so the actually _selectable_ stuff doesn't stand out.

Tempted to modify the kconfig infrastructure, but I also need to throw it out and start over. But NOT RIGHT NOW, darn it.

My friend Adrienne visited over the weekend talking about politics in her programming domain ("eecms" apparently, some web content management system framework package program code thing), one which actually cost her a job recently (the bottleneck project around which an open source plugin ecosystem has formed suddenly terminated a partner program). I suggested she float a kickstarter to reimplement the bottleneck bits BSD-licensed. She's a domain expert, it'd take her what, 6 months?

I've been pondering this myself recently. Once I get my breath back and my email fixed, I've got a bunch of toybox backlog to work on. Various companies have expressed interest but always done so privately asking me please not to mention their names. If they scraped up like $50k between them I could spend six months on Toybox. (Yeah, still paying for a real house, putting my wife through college, reluctant to live without health insurance... I got old. Ten years ago I could live on $25k/year, but that was before I got married, and before gas hit $3/gallon and fast food combos crept over the $5 mark.)

If not, I'll get out another release or two and then have to go look for work again. (Hopefully less life-draining than the last one.) *shrug* I'm doing it anyway, it would just be nice to get it done while it could still affect the outcome of iOS vs Android as the successor to the PC.

I'm also wondering if I should approach Google (or the AOSP guys anyway) and go "Hey, toybox, and the larger Android Should Be Self Hosting stuff: do you want to actually try using any of it?" Not quite sure how to go about it, though. Grepping for "^Author:" in a git log of the Android Core (thing containing toolbox) produces 3998 hits, 1715 of which contain and 1482 That's "Mozilla under AOL" and "OpenOffice under Sun" levels of community participation, that is.

Oh well, first things first: make the code work. All else is details.

October 16, 2012

Finished the toolbox triage instead of working on email parsing. (Ok, I may have spent most of the day playing Sims 3 but I ALSO got the toolbox triage finished.)

May have rsynced half of yesterday's blog post in the process. (Sometimes I leave a day's entry unfinished and finish it the next day. An advantage of doing a blog by hand in vi that wordpress and such don't offer.)

There are too many layers of sound infrastructure in Linux. This should not be this complicated. The aumix tool no longer works, the cursor up/down volume control in xfce no longer works, alsamixer _sometimes_ works but not really, but pavucontrol (interfacing to pulse audio) FINALLY managed to turn up the audio from a whisper. It also told me when the headphones were plugged in (although the volume control froze at 100% and refused to budge while this was there, and everything including the headphones was muted). I _did_ get sound from the headphones as long as I only plugged them in halfway, meaning it hadn't muted the netbook's speakers yet.

Linux on the desktop: smell the usability.

(I got the sound working to watch Romney bounce his resonance in the second debate. It's an In Nomine reference, look up "Balseraph" and "Level 6 role". Apparently Clint Eastwood isn't the _only_ one who does better when his opponent doesn't show up.)

On the bright side, I got this netbook upgraded to the full 8 gigs it supports last week. It's been a while, I forgot how NICE that makes things. (I'm sure I'll overload it again.)

October 15, 2012

Triaging the android toolbox source, which I finally dug up the git repository for. (That and bionic, but that's blocked on ccwrap.)

I'm also working on fixing my email, which is... complicated. I used to use kmail, and I suffered through kmail's inability to scale. (Kmail used to blow chunks if a folder ever had more than 32767 messages in it, and its index would get corrupted and display nonsense halfway through messages. They fixed that.) But KDE 4.0 was a career limiting move, and Kmail is doubly bundled (both the KDE desktop and the koffice suite), and if I don't want any of that crap running under XFCE it just got too painful to deal with. (Stop giving me notifications about an rss reader I never use. Stop running a dozen daemons in the background. It has to install 40 packages of prerequisites...)

So I switched to Thunderbird, which is a stupidly written pile of crap that doesn't scale. It constantly parses and re-parses folders that aren't even open, and once you've got a folder with 100,000 messages in it (things like linux-kernel and qemu-devel don't take long to get there) it can take 2 hours per pass to deal with that stuff, and soon it takes 12 hours. (When N^2 algorithms go bad they go bad FAST.) And you CAN'T TELL IT NOT TO DO THIS. Plus its mail filters are hardwired to run on the "inbox" imap folder, which in gmail contains only a subset of email (in a way that's unfriendly to mailing lists; if you get cc'd on a thread so one copy has "List-ID: archivethiscopy" and the other doesn't, with kmail you could send the list-id copy to the list so thread view doesn't get chopped up and have the other show up in your personal email folder so you saw it promptly. Thunderbird's hardwired assumptions don't allow this.)

So thunderbird is crap, but I still want a graphical email client that gives me the OPTION of seeing html email with the graphics, or cut and paste without my terminal program converting tabs to spaces, and so on. I _want_ to be able to click "reply" and have it pop up a new window so I can read more email without having finished this one yet (yes, accumulating a half-dozen unfinished replies). A text client doesn't match my working style (going back to Netscape on OS/2).

After fighting through lots of deeply crappy email clients, I found Balsa which has a tolerable amount of suck so far: no obvious N^2 algorithms, puts stuff in mbox files. Yes, it takes 10 minutes to open an mbox with 100,000 messages in it, but it DOESN'T try to fiddle with that folder when I'm not looking at it. And it doesn't get corrupted dealing with more than 16 bits worth of messages. (It may not be happy with more than 2 gigs of inbox file, I haven't tried yet. Using int for file position is an easy mistake to make.) But since that would be somewhere around 300,000 messages in a folder, that's more than I actually need right now.

Balsa is _simple_. The name means they were trying to be lightweight, and although the development team got Gnome all over them, the result doesn't actually pull in noticeable amounts of gnome crap. Plus it doesn't save its index files but reparses the mbox files, which isn't the world's fastest approach but means I can just kill the sucker and relaunch it if it misbehaves, and I'm not corrupting magic hidden state. I like this. (I plan for program misbehavior as "when", not "if".)

Downloading imap from gmail utterly refused to delete email from the server because gmail is horrible (it only allows mail to actually be _deleted_ after being moved to the trash folder because "special"; Balsa's imap move is a two stage copy/delete and again the delete fails because gmail is "special").

But POP can delete messages off the darn server. So I can start to deal with the 2 year backlog and not having to iterate through hundreds of thousands of messages figuring out what's new each time I try to check email. (I hope POP is pulling from "All Mail" instead of the "Inbox" subset, but haven't figured out how to control that yet. I'll see if my mail threads are broken or not once I filter the messages into folders.)

I note at this point that I am aware of Fetchmail. (I was friends with Eric Raymond for many years until the Very Merry Unbirthday crowd got him and he disappeared up the rabbit hole of climate change denialism. No, I didn't specify which hole in the rabbit he used.) The problem with fetchmail (which Eric handed off to other maintainers something like a decade ago anyway) is that it requires a mail server such as postfix installed on the loopback address of your laptop. It seems to have LOST THE ABILITY to just append the mail to an mbox file you specify, and requires you to run a daemon to append the results to a file for you.

I'm not doing that.

So Balsa's got a built-in POP implementation, which is a bit limited but more or less works. Each POP transaction only grabs about 400 messages, so I told it to check mail every 1 minute and left it running for a day or so, which downloaded about a gigabyte of email and got up to August of last year. (I'm aware that "ssh server 'gzip < inbox' | gzip > inbox" would have finished in 5 minutes, but we're using an ancient protocol to work around gmail's stupid baked-in assumptions, and the protocol has a round trip latency acknowledging each message to Google's moderately whelmed servers.)

Unfortunately, POP hit a message it can't download for some reason (possibly the message size limits), which means my little "tail -f inbox | grep '^From '" status check complained about the file inbox being truncated. (Ok, understandable so far; when in doubt abort and leave the data where it is.) But the failing message was towards the end of one of these 400 message blocks, and Balsa's failure mode here got the granularity wrong: it didn't delete the first 300+ messages in the block it had already successfully downloaded off the server. Nor did it truncate the inbox back to where it was when it started. And it retried once a minute for an hour or so before I caught it, piling up duplicates.

Meaning I now have a mbox file containing 140,000 messages, most of which are the only copy of a bunch of messages it already deleted off the server, plus several thousand duplicates. Wheee. And due to some weird idiosyncrasy of gmail (which gives me 400 messages sorted by _thread_ or something, it's not even close to purely chronological order; each download spans around a month's range of dates but the delta between the start of new POP transactions is about a day; still mostly not dupes as checked by "grep '^From ' inbox | sort | uniq -d | wc", and keep in mind we expect a _few_ dups when a message comes in with and without List-ID: tags)... where was I? Oh right: the messages aren't all nicely clustered at the end of the file where I could truncate 'em off, because sometimes the batch of messages gmail's hash tables chose to send me didn't contain the one it couldn't download.

So I have a horked inbox file, and even though it's mostly old data I don't care hugely about (although I'd like to keep the personal mail), it's good to set up the recovery tools now because this could happen again later. (I should also find the one it can't download and deal with it, possibly manually deleting an email through the web interface containing a giant ISO image attachment or something. I _could_ just up the message size it accepts to something ridiculous, but I'm not sure I want to, and I haven't confirmed yet that this is the problem.)

So I pulled out Python, looked up Wikipedia[Citation Needed]'s opinion on mbox file format (it has several), and wrote a quick and dirty mbox parser, identifying duplicates by _all_ the headers matching an existing message. (So presence/absence of List-ID still makes for different messages.)
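Roughly this shape, as a sketch rather than the actual script: split_mbox, header_block and dedupe are names invented here. It assumes the file uses bare "\n" line endings and that any body line starting with "From " has already been escaped (the "=46rom" trick or similar), since otherwise the splitter chops messages in half.

```python
def split_mbox(data):
    """Split raw mbox bytes into messages, each beginning with its "From " line."""
    messages, current = [], []
    for line in data.splitlines(keepends=True):
        # A "From " line starts a new message (assumes escaped bodies).
        if line.startswith(b"From ") and current:
            messages.append(b"".join(current))
            current = []
        current.append(line)
    if current:
        messages.append(b"".join(current))
    return messages


def header_block(message):
    """Return the headers: everything after the "From " line up to the first blank line."""
    headers = []
    for line in message.splitlines()[1:]:
        if not line.strip():
            break
        headers.append(line)
    return b"\n".join(headers)


def dedupe(messages):
    """Keep the first copy of each message, comparing on the whole header block."""
    seen = set()
    unique = []
    for msg in messages:
        key = header_block(msg)
        if key not in seen:
            seen.add(key)
            unique.append(msg)
    return unique
```

Because the key is the entire header block, two copies of the same mail that differ only by a List-ID header count as different messages and both survive.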

The OTHER fun little thing is I set up Balsa's filters for List-ID to pull out the linux-kernel messages, and the filter never triggered. I tweaked it a dozen ways and it has yet to move a single message into the linux-kernel mbox file. The copy of balsa in xubuntu 12.04 says that regex filters aren't implemented (it lets me select them, and then says this), and thus only exact string matches will work except for the part about "will work".

So while I'm there, I'm teaching the darn mbox parser to sort my mail into folders...
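The sorting part could be sketched something like this. The function names, the (substring, folder) rule format, and the "inbox" default are all assumptions for illustration; it expects one raw message at a time, bare "\n" line endings, and does plain substring matching on the List-ID value since that's all I need.

```python
def list_id(message):
    """Return the lowercased List-ID header value of a raw message, or None."""
    header_text = message.split(b"\n\n", 1)[0]
    value = None
    for line in header_text.splitlines():
        if value is not None:
            # Continuation lines of a folded header start with whitespace.
            if line[:1] in (b" ", b"\t"):
                value += b" " + line.strip()
                continue
            break
        name, sep, rest = line.partition(b":")
        if sep and name.strip().lower() == b"list-id":
            value = rest.strip()
    if value is None:
        return None
    return value.decode("ascii", "replace").strip().lower()


def pick_folder(message, rules, default="inbox"):
    """Return the folder of the first rule whose substring occurs in List-ID."""
    lid = list_id(message)
    if lid:
        for needle, folder in rules:
            if needle in lid:
                return folder
    return default
```

With rules like [("linux-kernel", "linux-kernel"), ("qemu-devel", "qemu-devel")], anything without a matching List-ID falls through to the default folder, which is where the personally cc'd copies should end up.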

To make a long story only slightly longer: that's why my email isn't set back up quite yet. Working on it.

October 14, 2012

Adrienne was visiting this weekend. Mostly stayed off my laptop.

Started to rewrite ccwrap.c, which needs to do a bunch of things. It needs to work with bionic, musl, uClibc, and glibc. It needs to work as the argument parsing front-end of qcc. It needs to have distcc functionality built-in. And it needs to do the relocation stuff the current ccwrap does.

That's a lot.

October 12, 2012

Last day at work. I am so fried. Headache and chest pains. Wheee. I wanna go home.

October 5, 2012

Posix doesn't specify mknod, and LSB remains a crappy standard that sort of suggests something should exist but gives no behavior for it. For example, the gnu/dammit mknod supports -m to specify permissions. By itself, mknod creates nodes 644. With mknod -m +w the nodes are 666. With mknod -m +x the nodes are 777. Where did the extra write bit come from in the +x case? Who knows...

Meanwhile, mdev is creating nodes 660 instead of 644 as its default. I think I'll just have the -m options apply against the 660 and leave it at that...
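"Apply against the 660" could be sketched as below. This is a deliberately simplified illustration, not the gnu parser and not the eventual toybox code: apply_mode is a made-up name, it only understands [ugoa]*[+-=][rwx]* clauses separated by commas, and it ignores umask and the setuid/setgid/sticky bits entirely.

```python
def apply_mode(base, spec):
    """Apply a simplified chmod-style symbolic mode string to a base mode."""
    who_bits = {"u": 0o700, "g": 0o070, "o": 0o007, "a": 0o777}
    perm_bits = {"r": 0o444, "w": 0o222, "x": 0o111}
    mode = base
    for clause in spec.split(","):
        i = 0
        who = 0
        while i < len(clause) and clause[i] in who_bits:
            who |= who_bits[clause[i]]
            i += 1
        if i >= len(clause) or clause[i] not in "+-=":
            raise ValueError("bad mode clause: %r" % clause)
        op = clause[i]
        perms = 0
        for ch in clause[i + 1:]:
            if ch not in perm_bits:
                raise ValueError("bad permission: %r" % ch)
            perms |= perm_bits[ch]
        if who == 0:
            who = 0o777  # simplification: no "who" means everybody, umask ignored
        bits = perms & who
        if op == "+":
            mode |= bits
        elif op == "-":
            mode &= ~bits
        else:
            mode = (mode & ~who) | bits
    return mode
```

Under these assumptions "+x" against 0660 yields 0771 and "o+r" yields 0664, so at least every bit in the result can be traced back to something on the command line.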

October 4, 2012

A couple people tried to engage me about religion recently.

I'm quite capable of arguing about a creation story that depends on a snake's ability to talk starring an omniscient, omnipotent, omnipresent being (who spends most of history hiding and needs to operate through third parties for some reason) defining sin so he can declare people guilty of it and punish them, performing entrapment, punishing disobedience far more harshly than murder (cain and abel anyone?), himself committing mass murder on multiple occasions (great flood, sodom and gomorrah, taking sides in wars all the time ala pharaoh's soldiers swallowed by the red sea), allowing the Pharaoh's wizards' staves to turn into snakes (biblical magicians had real power, it seems, even when on the other side) and proving that Moses was better because his snake was _stronger_ and ate theirs, and then millennia later deciding that forgiving the sin of eating fruit involves "immaculate conception" (slipping Mary a roofie), incarnating as a human, gathering followers to perform several remarkably minor miracles (walking on water, duplicating/transforming food, faith healing, smiting a fig tree), instructing said followers to perform ritual cannibalism, and then having yourself tortured to death. Because just saying "I forgive you" wouldn't have worked somehow?

But there's exactly as much point arguing minutiae of Federation vs Klingon engagements with a fanatical Trekkie who believes it's all real. Pointing out plot holes in the series isn't the point: the burden is on you to show that the bible is any different from the Iliad and the Odyssey (full of intervention by the Greek gods), or the Eddas with the Norse gods, or the stories of the Egyptian pantheon, or Anansi or Coyote. It's exactly the same as arguing with a child about Santa Claus: you don't prove that Rudolph can't _really_ fly by freeze-framing Rankin-Bass specials.

Any _specific_ religion has to deal with the famous objection of Spider-Man's Uncle Ben: "With great power comes great responsibility". Whatever god you're talking about is omnipotent, omniscient, omnipresent, omnibenevolent... and apparently intentionally hiding, and not just allowing evil but allowing all sorts of horrors done in his name.

But Pascal's Wager, people insist! Even an infinitesimally slight chance that there really _is_ a Santa Claus who will reward the Nice and punish the Naughty at the North Pole after you die, there's no _downside_ to believing, is there? (Except, you know, wasting the life you've got here. And Homer Simpson's quote "Suppose we've chosen the wrong God? Every time we go to church we're just making him madder and madder.")

The argument seems to run that if this life is all you get, then you've lost nothing by believing that Valkyries will take you to Valhalla, and thus you have to pick one of the religions that each claims their God created the world and somehow the majority of people on the planet weren't informed of this (because no religion has a majority of the population, and the one closest to it is only about 1400 years old and even they admit the world's older than that). So go ahead and believe something that's about as likely as the Tooth Fairy, because who knows, maybe there is one?

The Atheist oblivionists (you're born you die and that's it) say that wasting your one and only life _is_ quite a downside. But their Theory of Inevitable Doom isn't very appealing.

The theory of reincarnation also says that Pascal's Wager has a possibly even bigger downside, which is that you waste your life OVER AND OVER grasping at a nonexistent heaven and ignoring the time you spend here. Reincarnation actually has some evidence for it: you were born once, why not again? And there _are_ other people on this planet, rather a lot, so why are you _you_ rather than any of them? Well... maybe you are? How can you tell?

From a scientific perspective, the question is unanswerable. Testing whether "continuity of experience" happens between lives is just like testing continuity of experience day to day: science can't do it. Subjectively we can tell we're here, _objectively_ we can't tell consciousness exists at all. The Turing Test for artificial intelligence is just "speak to it, and if it seems like there's somebody there that's as close as we can get to knowing whether or not there actually is".

This is an issue that science cannot currently address because we have no objective way to measure the relevant variables, which is not the same as saying "the answer is definitely no". Atheists tend to take the "definitely no" answer out of self-defense against theists, which agnostics refuse to do. Agnostics go "algebra says 0/0 is indeterminate, you have to develop calculus to actually answer these sort of questions, in the meantime indeterminate is the correct answer, and is not the same as zero". Some agnostics await better technology, and in the meantime refuse to say that it's impossible to go to the moon or impossible to go faster than light, we just say "we don't know how to do this". (Others just wait to see what happens when they die.)

The best rebuttal of Pascal's Wager I've seen is the theory of "retroactive reincarnation" which says there's no reason reincarnation has to take place in the _future_, so the main objection to reincarnation (too many people simultaneously insist they were Abraham Lincoln in a past life) isn't actually a big deal. Taken to its extreme, there's no need for more than _one_ consciousness, it could be the same one born over and over again like needlepoint embroidery through spacetime. So "what happens after you die" would have the same answer as "what happened before you were born", which doesn't get asked as much (because we're not _afraid_ of it) but is about as interesting and probably related.

And no, I am not the first person to come up with this, to quote Robert Heinlein's letter to Theodore Sturgeon:

I have had a dirty suspicion since I was about six that all consciousness is one and that all the actors I see around me (including my enemies) are myself, at different points in the record's grooves. I once partly explored this in a story called BY HIS BOOTSTRAPS. I say "partly" because I touched on one point only—and the story was mistaken by the readers (most of them) for a time-travel paradox story...

See also Andy Weir's The Egg, the resolution of Terry Pratchett's "The Truth" where the bad guy is retroactively reincarnated as a potato months before dying, and so on. (Personally, I came up with it while pondering the question "What sort of religion would time lords have".)

If the retroactive reincarnation hypothesis did match reality, then Pascal's Wager would be just _sad_. If you get to live every life the world has to offer (the good and the bad, doing all the evil in the world... to yourself, so there's a strange sort of justice to it, and "do unto others as you would have others do unto you" would be obvious)... but then waste the majority of those lives worried about keeping on a nonexistent god's good side so you could get into a nonexistent heaven and avoid a nonexistent hell... The "you only live once" morons would be wrong, but making the most of what you've got right here and right now (where you're _not_ a starving illiterate peasant like so much of human history is full of) becomes vitally important because the future is built on the work of people in the present, and we still haven't got colonies on mars.

Is that actually what happens? Beats me. I'm an agnostic, and that's just a hypothesis. But it simultaneously shouts down the "you die and that's it, go be depressed now" atheists, and the "let me list the names of all the angels in starfleet, now give me money" theists.

The point is that Pascal's Wager says there's no downside to belief in god. If Atheists have nothing to offer but oblivion, and if you get that _anyway_ if you believe in god, then you've lost nothing. Except: you have a life now, and wasting it on a nonexistent god is a downside. Reincarnation would mean you could waste _multiple_ lives on a nonexistent god, and retroactive reincarnation means you could waste an uncountable number of lives endlessly waiting for a heaven at the expense of the world you actually live in. That's a HUGE downside.

This hypothesis, true or not, provides an alternative to the theist "a magic man behind the curtain is responsible for everything, including turnips", and to the Atheist's zero-sum pointlessness. It accounts for evil in the world by acknowledging that people do things to other people, and provides the possibility of justice by having everybody eventually be on the receiving end of everything they dish out. It's not a sweetness and light philosophy because it doesn't say you get to live every life, it implies you _have_ to. (Some of them were really unpleasant.) It has as much merit as any other I've heard, doesn't _conflict_ with anything we've observed, and requires a minimum of unproved postulates (the big one of which is that there's something inside a person other than chemical reactions in the first place, which seems obvious when you're on the inside looking out but is impossible to otherwise prove.) Its postulate is that everybody has basically the same one.

The main point of this theory is to let you stop worrying about it. If there's a message it's "get on with your life, what you do here actually matters, living well is important, but doing so at the expense of other people is stupid".

And it's just one hypothesis. There may be a better one that works out for people about as well, without requiring us to have guessed correctly about the existence of the easter bunny (and being punished if we got the species wrong). Einstein said the only important question was "Is the universe friendly"? We didn't have to know how the sun worked to be warmed by it.

Probably a lot more people have come up with something like retroactive reincarnation over the years, and not bothered to tell other people because it doesn't require _belief_ to be true or false. It either is true or it isn't; Tinkerbell doesn't die if you don't clap your hands, there's no need for a little girl to chant "I believe, I believe, it's silly but I believe" in order to be rewarded with a new house (as in the original Miracle on 34th street). In fact if it _is_ true, knowing it sort of spoils the trick a little. (Then again, merely _suspecting_ doesn't really. It's the perfect hypothesis for agnostics.)

This is why me being an agnostic is not the same as believing in other people's religions. I don't need certainty in an absolute universal truth to knock down obviously self-inconsistent falsehoods other people cling to out of fear, self-interest, or inertia.

And that doesn't mean I believe the alternative is definitely oblivion. If we get that I can't stop it, but I'm here now, along with billions of other people in a universe with billions of stars expected to last billions more years. I can't explain why I'm here now, and I do not know what comes after. I am admitting that truthfully, I don't know.

Belief in zero is just another way of hiding from the unknown.

October 1, 2012

Gave my 2 weeks notice at work today.

Turning down a steady paycheck (plus health insurance) to return to the uncertain world of consulting is hard. In theory I could have milked that job for months more, and if I eventually got fired could have collected unemployment... But that's just not what I want to do with my life.

Today was the 1 year anniversary of going full-time at Polycom (still working on the same piece of hardware that hasn't shipped yet), but the real catalyst for my departure was that the 3.6 kernel came out last night. I had a long todo list I wanted to have ready for this merge window... and I've done none of it. I've barely touched upstream kernel stuff since 3.5 came out, instead working on Polycom's fork of TI's fork of Android's fork of the 2.6.37 kernel. Work drains all the life out of me until I get very little done on anything else, even on weekends.

Staying at that job, I've slowly drifted further and further behind in the stuff I actually care about, instead working on a device that's languished in development for my entire tenure at Polycom (contractor and full-time combined add up to about a Moore's law doubling period).

The Mars board is basically a high-end dedicated hardware version of Google Hangout, which produces the same HDMI output format as a $35 Raspberry PI. (You can plug a very nice camera into it, but the camera isn't new.) A year and a half ago, when I started as a contractor at Polycom, the Mars hardware was "finished" and off to the application developers, and we were working on its successor Saturn (which did the same thing in triplicate) and were looking forward to Jupiter (which could handle even more cameras and screens). In the month I took off between ending my contract and starting a full-time position, Jupiter was cancelled, Saturn was shelved, and everybody was back fiddling with Mars, in a new building where nobody had offices anymore and everybody sat in cubicles.

That month was also when Polycom announced a change of direction to a "Software Strategy" where 80% of revenue would come from software, even though the department I worked for did hardware bringup (hardware debugging, bootloaders, power-on self test, device drivers, performance analysis and optimization...)

It's hard to get enthused about building dedicated hardware when "there's an app for that", especially if your own company already has that app. As conflicting new management strategies unfolded and made clear that the upper echelons of the company had no clear roadmap and were actually working at cross-purposes, Angela and Xianghua left, but I wanted to see the project through and actually ship the device...

It's been a year since then. I'm ready for something else now.

September 30, 2012

Huh. Opened up my laptop and... there's nothing I want to do with it. I have to go back to prison the cubicle tomorrow, which means I don't even want to _think_ about programming or anything online.

Work has destroyed my interest in the main hobby I've had since grade school. How odd. I'm... completely dissociated from it.

My email is broken. It's been varying degrees of broken for a month. I need to do some reasonably straightforward work to fix it. Haven't.

The kernel documentation work: I know exactly what I want to do, and exactly how to do it, and I've been stopped by some modestly unpleasant bureaucracy that I don't have the energy to overcome.

I haven't made a commit to toybox in two weeks. I have three in-progress commands and a half-dozen submissions I should review. I haven't.

Adding musl to aboriginal, the next step's about a day's worth of work. Not done yet.

Need to submit the perl removal patches upstream again. Need to update aboriginal to use the 3.6 kernel. Need to finish setting up securitybreach. Need to resync the LFS control-image with linux from scratch 7.2.

And I hate my day job so much, I can't work up the enthusiasm to do any of it. It's like a chronic debilitating illness, getting gradually worse every month. I don't like this trend.

September 29, 2012

Reading the MUSL install file from current git repo, about installing musl as the primary C library. I appreciate the sentiment, but changing one spec file entry isn't going to stop it from finding the wrong headers, from everything from zlib to being linked against the old libc...

Hmmm, there's only and the rest (libdl, libcrypt, libm, etc) are only available as .a files. Ah, it put all the logic in one library and the rest are empty stubs in case anybody did "-lm" on the gcc command line, so it finds something rather than barfing. Ok, makes sense.

September 28, 2012

Weekend! Tiny sliver of time I attempt to cram my entire life into, overcoming the prison sentence of cubicledom. Got home in total zombie state, as usual. Fade made meat pies, which helped.

Mirell came over for dinner (enjoyed the meat pies enough he asked for the recipe, they are quite good), and brought securitybreach (the old server from impact linux) which is no longer quite firebreathing but is still pleasantly warm and higher end than the server of mine that died. (Downside, it costs $6/month in electricity to sit idle, and it's loud enough we'd have to put it in the guest room and just power it down when guests arrive.)

Alas, it requires a USB keyboard to install something on it. (He left an opensolaris variant on it? What?) I do not currently have a USB keyboard. Trip to Fry's in the future?

Trip to Best Buy to buy Fade a new netbook, because replacing her macbook air is more than we're really comfortable spending right now. The version I had for so long (before it got stolen) is down to $199 for 10 inch screen, 1 gig ram, and the shareware version of windows 7. She sprang for the "more or less what I have now" version for around $270, bigger screen and keyboard, twice the ram, still windows but a variant the vendor may actually pay money for. Over the next two hours the machine bluescreened twice and hung once, but that's windows for you. She installed microsoft's free homeopathic antivirus software, which probably can't stop a clock, but I didn't have any better suggestions. Still, she's just using the sucker to chat, play games, and do homework. No financial anything should ever go through this machine, period.

Finally watched to the end of the 11th doctor's first season (pandorica opens, big bang: they did a Bill And Ted's Excellent Adventure storyline, and it was good) on the Roku player we bought to replace the wii (which we'd previously streamed netflix video on, if for no other reason than to be able to use the "it's streaming through my wii" joke far too often). I'm glad Fade was here to set it up because that sucker is HUGELY locked down to crazy "give us all your login info to everything you'll ever watch and we'll record it on our servers, oh you thought this did youtube? You sad deluded clown, youtube didn't give roku any money, why should we put their content on our screens?" I wanted to smack 'em 30 seconds in for being stupid. Still, the box works reasonably well once you get past that.

Got a little bit of programming in once I felt better. The binutils tarball I whipped up last weekend was subtly broken in two ways, both due to me tarring up the package cache directory once I'd tested that it built properly for all the targets. This meant A) my checksum file showing what tarballs and patches went into assembling this directory got checked into the tarball, which confused the setupfor logic. (Yeah, in-band signaling is bad, but decoupling this info from the area it's used is bad too.) B) the file was already patched so re-applying the patches failed with the "possibly reversed" message. (I ship clean tarballs and keep my local patches in another directory. This is "specific git commit with a thing run on it to approximate the release process", and then my local fixes remain separate.)

Oddly, this actually worked for most builds because the sha1-for-source.txt file in the tarball matched what it was trying to patch it to, but the actual extract stage would fail the first time because it would extract then patch and the patch would fail, which would abort leaving a correct directory state.

And then I ran FORK=1 more/ which deleted the build directory and re-extracted everything, and caught the failure. I went "huh, patch file is already applied, weird", deleted the patch and re-ran, which succeeded (no patches to fail) but attempted to re-extract every time (sha1-for-source.txt not matching), and the parallel builds each attempting to extract binutils in parallel stomped over each other... And I had some head-scratching to do to figure out what was going on.

Anyway, that's why I redid the tarball and updated the sha1sum in to match the new tarball.

September 27, 2012

Balsa hit a message it couldn't download. Not sure why, it pops up one of the xfce notification boxes that fades 5 seconds later and I just caught the tail end of it.

What I do know is that it's not deleting messages as it downloads each one, it downloads them all and then does a delete pass. And if the download pass is interrupted, next time it'll download the same ones again (causing duplicates), and if you leave it running with the "check mail every 1 minutes" setting to try to clear some of the year and a half backlog in gmail's all mail folder, it will happily download 100,000 duplicate messages at the point it got stuck before you notice.

Bravo, balsa. Bravo.

In other news, I have yet to get a "simple" balsa email filter rule to trigger on a list-id tag, and the regex one is there in the menu but says not implemented if I try to select it.

Ok, I need to read up on mbox format, then write a python message parser to kill the duplicates and distribute them into the various folders. (Yes, I've heard of procmail. Trying to apt-get install it tries to install postfix, which is stupid and I'm not having it.)
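
A minimal sketch of what such a parser could look like, using the mailbox module from Python's standard library. It assumes duplicates can be recognized by an identical Message-ID header and that the destination folder can be derived from the List-Id header; the function names, the output file naming, and the "inbox" fallback are made up for illustration.

```python
import mailbox
import os
import re

def folder_for(list_id, default):
    """Turn a List-Id header into a safe file name (hypothetical naming scheme)."""
    if not list_id:
        return default
    text = str(list_id)
    match = re.search(r"<([^>]+)>", text)      # List-Id usually looks like: Name <id>
    name = match.group(1) if match else text.strip()
    return re.sub(r"[^A-Za-z0-9._-]", "_", name) or default

def split_mbox(src_path, out_dir, default="inbox"):
    """Copy each unique message from src_path into out_dir/<folder>.mbox.

    Returns (kept, dropped). Messages without a Message-ID are always kept.
    """
    os.makedirs(out_dir, exist_ok=True)
    seen = set()
    boxes = {}
    kept = dropped = 0
    src = mailbox.mbox(src_path, create=False)
    try:
        for msg in src:
            msg_id = msg.get("Message-ID")
            if msg_id is not None:
                msg_id = str(msg_id).strip()
                if msg_id in seen:
                    dropped += 1               # already copied this one
                    continue
                seen.add(msg_id)
            folder = folder_for(msg.get("List-Id"), default)
            if folder not in boxes:
                path = os.path.join(out_dir, folder + ".mbox")
                boxes[folder] = mailbox.mbox(path)
            boxes[folder].add(msg)
            kept += 1
    finally:
        for box in boxes.values():
            box.close()                        # close() flushes pending writes
        src.close()
    return kept, dropped
```

Calling split_mbox("all-mail.mbox", "sorted") would leave one mbox file per list under sorted/ plus an inbox.mbox for everything without a List-Id. The source mbox is only read, so a bad run can simply be deleted and repeated.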

Once I've got my email back I'll see if the guys are interested in doing an rsync from _their_ end to copy my local version of up to If so, I don't need to log in, they can login to _my_ box via cron job.

As for git tree, I can set one of those up anyway and then just ask them to pull it. If they need signed commit tags, I'll work it out (and the key having no signatures on it is not my problem). Or I can learn quilt, now that patch understands "git mv". (Memo: need to update my patch implementation to handle that. It's been on my todo list for 3 years now...)

Sigh. I should quit my day job and launch a kickstarter to see if anybody wants to sponsor me working through my todo list. I'd happily live on instant mashed potatoes but Fade requires a higher standard of living, is getting a doctorate, and we have a real house now. Oh well.

I _am_ amused that the Linux Foundation was willing to sponsor me to do documentation work back in 2007 when I _wasn't_ documentation maintainer (and became confused that I couldn't just reach out and _take_ the maintainership position). Now that the previous guy's handed it off, they don't care. (No point, just amused.)

September 26, 2012

For the past few months I've been collecting signatures for my PGP key so I could get my account back and maybe eventually update and put up a Documentation git tree linux-next could pull from.

Today, Fade came home to find people in our house, who ran away before the police arrived. (In future, the cat door gets locked when nobody's home.) Not much was taken: all the game consoles and the games that go with them (but not the TV), her airbook (but not her imac), and the netbook I just switched off of. Which still had all my files on it as of a week ago, including the new pgp key I'd been collecting signatures for.

The netbook cost around $250 when it was new (over a year ago), and will almost certainly be wiped and windows put on it sight unseen, but... it spells the end of my attempt to get a account.

Dear guys: I give up trying to jump through your security hoops. Somebody broke into my house and stole a machine I kept backups on, that key is compromised (in a way that has essentially 0% chance of ever meaning anything). I'm going through and cycling my ssh keys on general principles but I'm not going through the motions of satisfying your paranoia again.

I'm happy to do the _work_ of maintaining the directory and the kernel Documentation directory. But I'm not filling out forms in triplicate for the privilege of volunteering my time to do this work.

September 25, 2012

Still fighting with Balsa. Telling it to download 420,000 messages in one go doesn't work, it times out after about 80k messages. Which would be fine if I could restart it, but the problem is when it tries to delete the messages it's already downloaded, gmail throws an error.

This is because gmail is really, really stupid "special".

A sane email server will collect the mail for a given address and put it in a folder, possibly after filtering out spam. The email client will then download the email off that server (using POP or IMAP), deleting the messages out of that folder.

The gmail "All Mail" folder is the folder that contains all the non-spam messages. (The "Inbox" folder is just a subset of them, where Google went "these messages aren't spam but we don't think you want to see them anyway so let's apply our preferences to your inbox whether you asked us to or not", and that "folder" is really just a search view showing messages tagged with the Inbox label. Deleting messages out of that folder just removes the label, it doesn't free up space on the server.)

Google decided that the "All Mail" folder is special, and refuses to allow IMAP to delete messages out of it. Instead, you need to _move_ messages out of it to the Trash folder, which is the special folder that actually responds to the delete command. Except balsa's "move" command is "copy" followed by "delete", which still doesn't work.

See the radiohead song "creep" for more information on software thinking it's "special". You can generally read it as "here's how this is broken so it's not general-purpose".

September 23, 2012

Still sick.

The binutils release was breaking mips because not all the lex/yacc files were being regenerated during my host build, and the mystic invocation to do this remains undocumented. (Remember: maintainers are _special_ people. Ordinary developers should never be shown how to do this. But leave the documentation saying "make dist" and then have that do nothing but print a message saying "oh no, that's not how you do it, go read the directory containing the documentation that says that _is_ how you do it".)

Anyway, after lots of carpet-bombing and napalm, I wound up with:

git clean -fdx && git checkout -f && patch -p1 -i ~/aboriginal/sources/patches/binutils-screwinfo.patch && ./configure --disable-werror && make configure-host && find . -name Makefile | xargs sed -i 's/^all:/& $(DIST_COMMON) $(EXTRA_DIST) $(DIST_SOURCES)/' && make && patch -p1 -Ri ~/aboriginal/aboriginal/sources/patches/binutils-screwinfo.patch && make distclean

And THEN I have a binutils tarball I can tar up and use in the build, which works on mips.

September 22, 2012

I has a cold. Sore throat, runny nose, headache, standing up is exhausting, napped for 3 hours and woke up tired... Of course this would happen during Camine's visit. This is how we entertain guests: by inflicting disease upon them.

Not a whole lot of coherence to do programming with at the moment.

Balsa's still stroppy. Haven't got a better idea though. Killed it, restarted, made sure not to suspend my laptop even briefly while it was working, gave it _all_day_ to grind away at trying to apply this one filter to at least make the mbox nonzero length, wandered into its pid directory under /proc and looked at the "fd" subdirectory to see what it had open, noticed that stderr is going to ~/.xsession-errors, grepped that for "balsa", and it says:

(balsa:4341): GLib-WARNING **: (/build/buildd/glib2.0-2.32.3/./glib/gerror.c:390):g_error_new_valist: runtime check failed: (domain != 0)
** (balsa:4341): CRITICAL **: imap_cmd_step: assertion `handle->state != IMHS_DISCONNECTED' failed
** (balsa:4341): CRITICAL **: imap_cmd_step: assertion `handle->state != IMHS_DISCONNECTED' failed
** (balsa:11400): CRITICAL **: imap_cmd_step: assertion `handle->state != IMHS_DISCONNECTED' failed

Which means what, exactly?
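
For reference, the /proc spelunking described above (look in the process's "fd" directory to see where stderr points) can be sketched in a few lines of Python. This is Linux-specific, and the function name is made up for illustration.

```python
import os

def open_files(pid):
    """Map each open descriptor number of `pid` to whatever it points at."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to the open file, pipe, or socket.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue   # descriptor went away between listdir() and readlink()
    return result
```

Descriptor 2 is stderr, so open_files(pid).get(2) shows where a program's error messages are going. Reading another user's process needs the appropriate permissions.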

Am I going to have to re-learn GUI programming (which I did under OS/2 and Java AWT 1.1 but never had the stomach for the whole "gtk, qt, wxwidgets..." mess in Linux) and write my own darn email client? Seems kinda excessive, and I DO NOT HAVE TIME FOR THIS.

If you're wondering why so many Linux people have Macs these days...

September 21, 2012

Coming up on two weeks without email. This is embarrassing.

Last weekend I updated my new netbook yet again (the one with a slightly crappier processor but 4 gigs of ram upgradeable to 8, instead of the massively overcommitted 2 the old one has), and this time I deleted a couple important directories off the old machine to _force_ myself to use the new one.

The main thing the old machine had that the new one didn't is a configured email client, but I didn't want to reinstall thunderbird on the new machine because it's crap, and in the last week before the move I didn't manage to successfully check mail once. (Left it running overnight several times: the result ran for 12 hours and then found no new mail, not even in linux-kernel.) So as soon as the _old_ machine no longer had email, nothing stopping the switch.

Of course finding a new graphic email client is a pain. Mutt's text and I want a graphical client. Thunderbird is in about the shape Mozilla was 10 years ago: a giant overcomplicated mess that's not actually very good at its supposed central function but covered by so much overgrowth it's hard to tell _what_ it's doing. And of course kmail is nailed to a giant succession of boat anchors (the KDE desktop, Koffice), so what was a very nice standalone email client just has too much baggage now to bother with.

I'm trying to set up Balsa, which seems the simplest and most straightforward graphical email client I've found so far. I got it to imap to gmail, and it shows the "all mail" folder that Thunderbird could never figure out how to stick filters on. It takes it ten minutes or so to open that folder, but given it's got over 200,000 messages in it, I can't really blame it.

So I added a filter, to move List-ID: tags containing "" into a new local linux-kernel mbox file, which should be a good first pass to clean some of those 200k+ messages out. Took a bit to figure out how to _use_ filters in balsa, but it's a two step process that actually makes sense: create a filter, then go to the folder you want to apply the filter to and add it to the list of filters that apply to that folder. There's even a convenient little "apply now" button, which I clicked.

This morning.

It's now coming up on 9pm and it's still going. Or at least the button hasn't popped back up and none of the other buttons in the pop-up dialog respond yet. The linux-kernel mbox file is still zero bytes. There is no progress indicator. Neither top nor iotop show it doing anything. Possibly a network variant thereof would do so, but ntop is some sort of crazy web server and sntop is a tool to ping random websites (why?). I vaguely recall having found a useful network top variant back in like early 2010, but don't remember what it was.

September 20, 2012

Work is eating all my time.

The musl guys pointed out that I _don't_ need to install it as root to play with it, I can put the absolute path down under my home directory instead. (Dynamic linking gets a bit strange, but static linking should still work fine.) There's even a getting started page describing it.

That said, I've got the new binutils in and _mostly_ working, capable of building musl. (In theory the musl guys could move the global variables to a separate .o file from the global functions, and do the -bsymbolic thing to create a combined function .o file and then link the rest without it. But no point in asking them to change their build system for an admittedly old toolchain when I can fix it at my end.)

Got a little git bisecting done on the question "why does the binutils last gplv2 git snapshot build break on mips", and bisecting forward in the repo to 2.18 (which doesn't) it turns out the fix is a commit that "regenerates" a couple dozen files for the release. What that means, I have no idea. I've been doing a "./configure --disable-werror && make && make distclean" to make all the files generated by lex and stuff show up. Possibly I need to delete some file they checked into the tree, then the build will rebuild it given the right unusual host tool? Dunno. Ran out of time to look at it because I had to go to work.

September 18, 2012

I'm amused at "Atheism+" (now with electrolytes). I've always considered atheism a religion the same way zero is a number, and now they've managed to have a schism. I'm so proud. (Arguing about how not to believe something takes _talent_.)

Personally I consider myself an agnostic. Your parents tell you about the tooth fairy and the easter bunny and so on, and then as you grow up you spin the big wheel to see which one they weren't lying about, and it lands on an omniscient miracle worker who enforces a moral code, rewarding those his big beard approves. There's movies about him, pictures of what we think he looked like, and at certain times people everywhere dress up in special costumes to act on his behalf.

He sees you when you're sleeping, knows when you're awake, knows if you've been bad or good so be good for goodness sake. Visiting all the children of the world in a single night to dispense a mountain of toys and coal, with a pantheon of elves and living snowmen and flying reindeer which everybody is taught the names of as they sing the hymns...

As I said: spin the wheel. Ranting at length about how offended you are by belief in Santa Claus seems a bit weird. Having an asantaist _movement_ is just... odd. Mostly because nobody persecutes unbelievers in Santa. (Because an omniscient, omnipotent entity clearly needs _you_ to act on his behalf.)

That said, atheist lectures on youtube are pretty much it for modern philosophers. It's a pity Dawkins turned out to be a clueless dick about white male privilege, but it's hard for old white men to see out of that particular black hole (and anybody who is married to the actress behind Romana's second regeneration because Douglas Adams introduced them at a party has a certain baseline level of awesome). I can only take Penn Jillette in very small doses, and found Christopher Hitchens annoying even when I agreed with him. Stephen Fry remains epic. Douglas Adams had some pretty impressive talks. I have yet to find actual recordings of Bertrand Russell, but he wrote some fun essays.

Trying to get morality from religion puts "thou shalt not kill" and "remember the sabbath day and keep it holy" on the same list, as equally arbitrary pronouncements, and then goes on to declare wearing wool and leather in the same outfit sinful before it gets _really_ weird. This seems awkward somehow as a baseline for figuring out how to live one's life.

I'll believe we live in a secular society when Jesus can join The Avengers (alongside Thor) without Marvel getting its offices burned down. Wonder Woman can be daughter of Zeus in her current reboot, but you can't mix saints with superheroes because that's blasphemous disrespectful. Right.

September 17, 2012

Ubuntu makes some really horrible UI decisions, and imposes them even on their non-Unity stuff. In 12.04 they screwed up the default terminal colors of Xubuntu so badly that the kernel's "make menuconfig" is in conflicting pastel shades. (If you can't decide between a white background and a black background, a grey background is the worst of both worlds. It reduces the maximum possible contrast in either direction. Changing the other 16 colors to random pastels does not improve matters.)

How to fix this?

rm .config/Terminal/terminalrc
sudo rm /etc/xdg/xdg-xubuntu/Terminal/terminalrc

Close all terminals and re-open. Now you've got the defaults the way the IBM PC did it in 1981, which (like the qwerty keyboard) are a horror that everything expects and can cope with.

To change the border width (and thus the ability to actually grip and resize stuff at the corners rather than the insane 1-pixel borders xubuntu defaults to), settings->settings manager->window manager and then select theme "kokodi". (It autodetects the width from the graphics files it uses to draw the border, kokodi has a reasonable width. Grey goose does not.)

Finally made the jump to the new machine, slowly making it usable. Started deleting files off the old one to _force_ myself to switch, and do all the little cleanup work (up next: fix module parameters for the networking and sound cards in a way that persists across suspend and reboot) that using linux on the desktop requires to get a minimally usable system.

Still no email client set up. I'm occasionally checking the gmail web interface, but without any filters, so there's 1000 or so messages of linux-kernel and qemu-devel and such to wade through and try to spot the occasional email to me.

September 16, 2012

Bread and Circuses 2012.

Various people have been talking online about the idea of "basic income", in which each US citizen would get a small monthly check from the government, amounting to about the official poverty level (just enough you wouldn't starve or be homeless), and then they'd collectively tax everybody enough to prevent this from causing inflation.

The idea is that society already provides some minimal standard services to everybody: today even homeless people can call 911 to get police and firefighters, homeless kids can attend public school, we give starving people food stamps, emergency rooms can't turn away sick people for inability to pay, and so on. Given that there's a basic level of services we provide to everybody, should that be extended to cover minimal food and shelter so there _are_ no homeless people, and nobody starves?

The first question is scale. Could the government afford to feed everybody if they really wanted to? Well, the market cap of McDonald's is currently $93 billion. That's how much money it would cost to buy the entire company outright. Their annual revenue is $27 billion, which is how much money everybody paid them last year, of which $5 billion was left over as profit but let's ignore that for now. (Also, only about half of McDonalds' 33,000 locations are in the US, the rest are feeding people in other countries, but again let's ignore that right now.)

The current opinion of Wikipedia[citation needed] is that the annual federal budget is a little over $3.5 trillion: enough to buy McDonalds 35 times, or to run the one we've got for well over a century. Just the $700 billion we spent on the defense department (more than china, russia, and the whole of europe _combined_) would buy 7 McDonalds' and run this one for 25 years.

This isn't even getting into the massive farm subsidies, incredibly stupid biofuel subsidies (ethanol from corn is just a bad idea), existing food stamp program, school lunches, or the fact that a dollar menu McDouble is meat vs the cheaper starch-heavy diets of the poor (ramen, rice and beans, potatoes), or the way we throw out half our food...

Of course any services the government actually _runs_ immediately turn into a political football, where biased idiots sabotage projects they disagree with to "prove" they don't work (don't adjust your beliefs to match reality, adjust reality to match your beliefs). Even if medicaid and/or the VA are more cost effective (because they prevent middlemen from inserting themselves into the process and draining off profits), they'll be chronically underfunded and sabotaged to make using them as painful as possible (until they can be "privatized" to middlemen who insert themselves into the process and drain off profits).

And of course people bring up the idea that if ramen noodles were free nobody would ever want to go to Red Lobster, and that private industry is so incredibly fragile that it couldn't survive the existence of any alternatives. Which is, again, stupid.

However, the "give everybody small regular amounts of cash" idea addresses both concerns. Then the Republican bastards can fight for that cash with payday loans and pawn shops and by being slumlords and such, but nobody should actually starve or be homeless (unless they're one of the millions of mental patients Ronald Reagan turned out on the streets, but that's a separate issue). And people have to buy their food and housing on the open market, so ever-so-fragile capitalism doesn't instantly collapse as everyone immediately abandons it given the slightest alternative.

There's a view of the world that nobody would ever work without the constant threat of starvation and homelessness. Oddly, the people who say this tend to come from rich parents and have trust funds, and are implying that nobody like themselves ever amounted to anything.

September 15, 2012

Weekend! Surfaced from the fog. Spent most of the day recovering at home instead of doing useful work or biking anywhere.

Thunderbird has melted down. I haven't successfully checked email for a couple days (didn't turn my notebook on all of Thursday, and the couple tries before that it exhausted memory to the point the intel binary-only wireless module saw a memory allocation failure and lost its mind). This morning I killed all the chrome renderer tabs and let it try to check email, with over a gigabyte of free memory, for 12 hours.

At the end of that, I called it. Thunderbird is toast. I can't check email until I install a new mail client, which means tomorrow I finish switching to the new netbook.

Elsewhere I poked at md5sum in toybox. Long ago (2007?) I did sha1sum and unraveled the algorithm to do a much smaller simpler implementation. (This is one of the things I pushed into BusyBox in the intervening years.) Now I'm tackling md5sum and the standard (RFC 1321) is... confusing. It looks like some of the infrastructure is the same as sha1sum except different endianness, until I get to the sample code and the endianness looks the same as sha1sum? The algorithm explanation switches from "i" to "t" as variable names, without ever explaining what "t" is.

Oh well, if all else fails it does _have_ example code, which is a horrible loop-unrolled thing that's probably actually slower on modern (post-1991) hardware because fitting your code in L1 cache with predictable branches is _faster_ than constantly faulting the instruction cache.

Simplifying code is like diet and exercise: the result is healthy and runs faster. Loop unrolling is plastic surgery: the result does not age well and winds up making things _worse_ after a while.

Don't optimize. Simplify.
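
To illustrate the non-unrolled shape, here is a rough sketch in Python (chosen only because it's short; this is not the toybox code). The 64 steps collapse into one loop driven by two tables: the per-step rotate amounts and the sine-derived constants, which is what the RFC's "T" table is (T[i] is the integer part of 4294967296 * abs(sin(i)), i in radians). The names S, K, and md5_hex are made up for this sketch.

```python
import math
import struct

# Rotate amounts: four values per round, each repeated four times.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# The RFC's T[1..64] table, stored here as K[0..63].
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
MASK = 0xFFFFFFFF

def md5_hex(data):
    """Return the MD5 digest of `data` (bytes) as a hex string."""
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Pad: a 0x80 byte, zeros up to 56 mod 64, then the bit length (little endian).
    msg = data + b"\x80"
    msg += b"\x00" * ((56 - len(msg)) % 64)
    msg += struct.pack("<Q", (len(data) * 8) & 0xFFFFFFFFFFFFFFFF)
    for off in range(0, len(msg), 64):
        m = struct.unpack("<16I", msg[off:off + 64])   # 16 little-endian words
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:
                f, g = (b & c) | (~b & d), i
            elif i < 32:
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, g = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + K[i] + m[g]) & MASK
            a, d, c = d, c, b
            # Rotate f left by S[i] bits and add it to the old b.
            b = (b + ((f << S[i]) | (f >> (32 - S[i])))) & MASK
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK
    return struct.pack("<4I", a0, b0, c0, d0).hex()
```

The "<" in the struct formats is the little-endian part: MD5 reads its input words and writes its output words least significant byte first, where SHA-1 uses big-endian for both.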

September 12, 2012

And work paralysis returns with a vengeance. I set foot in The Cubicle and I'm trapped, lethargic, full of eyestrain and headache, utterly incapable of focusing on anything, and somehow don't manage to even go home at 5 but keep sticking there until around 7. And then get home exhausted and go to bed.

Not really liking my work environment. And it's getting worse. I felt at least somewhat recovered by the end of my week off, but if anything it just highlighted the contrast between work and real life.

September 9, 2012

Finally started tackling musl support, and the first thing I hit was the lack of -Bsymbolic-functions. It's not _that_ hard to patch that in to binutils-2.17 because it's just taking -Bsymbolic and making it apply to functions but not data.

However, the #musl channel on freenode suggested I look at the last binutils git version before the license change, which was a bit tricky to track down. (Tricky enough that the FreeBSD guys have it wrong, 2.17.50 is gplv3 in the bfd subdirectory. Keep in mind the FSF is so slimy it retroactively replaced the source tarball of the last GPLv2 release of binutils with a version containing GPLv3 files, declared that GPLv2 code didn't count as Free Software anymore, and so on.)

But they didn't write git (Linus did), meaning it's possible to git bisect your way to an answer in the binutils repo, and the last version without any mention of GPLv3 is commit 397a64b3 (after which the "gas" subdirectory went GPLv3. Even though the top level directory still said GPLv2 in the 2.18 release, by the logic of GPL contamination it means resulting binaries cannot be distributed under the terms of GPL version 2.)

So I stuck that into aboriginal, and of course a version straight from git wants lex and yacc and presumably autoconf. And of course "make dist" spits out a message about how this package doesn't support making distribution tarballs the way the gnu project's packaging guidelines say to do it, and instead you should look at the etc directory. Look at _what_ in the etc directory, they don't say, and a full hour of grepping texi files failed to produce any clues, so I did this:

./configure --disable-werror
make distclean

The "up yours" approach.

Some other things I copied from my aborted attempt to get binutils 2.18 to work years ago, before discovering the licensing issues. (Although the makeinfo fix really should return 0 instead of falling through to touch, otherwise it fills the build dir with random files.)

The result builds aboriginal (and linux from scratch) on i686! It builds on arm, and actually seems to have armv7 support! (I should see if I can get a similar gplv2 toolchain with basic armv7 support out of git.) And the build breaks on x86_64 and mips.

So I need to regression test the build with this before doing much more with musl. I've started reading through ccwrap anyway, to see what needs to change to add musl support. (I note I still haven't built anything with musl yet because to do so I need to install it into my host, as root, which isn't happening. It'll work out of a subdirectory or it'll wait until I can _make_ it do so. Yes, this implies static linking of test programs.)

September 7, 2012

Left Thunderbird downloading mail overnight. It was still going at 100% CPU at noon today (that's a dozen hours on a 64 bit processor going at 1.5 ghz to CHECK EMAIL). When it finally stopped, it hadn't downloaded anything in three folders I know I've got mail in. (Dude, I can see the web archive...)

That's it for Thunderbird. This piece of software is officially useless.

What I want is an imap based wget that copies a mailbox to a local mbox file. Unfortunately, fetchmail doesn't seem to do this, it insists on delivering to a local MTA, and configuring a loopback instance of postfix is way more infrastructure than "wget with a different protocol" has any business requiring.
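
A rough sketch of that "wget for imap" idea, using imaplib and mailbox from Python's standard library. It assumes an already logged-in connection object, fetches with BODY.PEEK[] so nothing gets marked as read, and never deletes anything from the server. The function name and the example host are hypothetical, and it has not been run against gmail.

```python
import imaplib
import mailbox

def imap_to_mbox(conn, folder, mbox_path):
    """Append every message in `folder` to the mbox at mbox_path; return the count."""
    typ, _ = conn.select(folder, readonly=True)     # read-only: no flags change
    if typ != "OK":
        raise RuntimeError(f"cannot select {folder!r}")
    typ, data = conn.uid("SEARCH", "ALL")
    if typ != "OK":
        raise RuntimeError("search failed")
    uids = data[0].split() if data and data[0] else []
    box = mailbox.mbox(mbox_path)
    copied = 0
    try:
        for uid in uids:
            typ, parts = conn.uid("FETCH", uid.decode(), "(BODY.PEEK[])")
            if typ != "OK":
                continue
            for part in parts:
                # The message body arrives as the second item of a tuple.
                if isinstance(part, tuple):
                    box.add(part[1].replace(b"\r\n", b"\n"))
                    copied += 1
    finally:
        box.close()                                 # close() flushes to disk
    return copied

# Usage sketch (host, account, and folder are placeholders):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user@example.com", "password")
#   imap_to_mbox(conn, "INBOX", "inbox.mbox")
#   conn.logout()
```

Fetching one message per round trip is slow for a six-figure folder, but it means an interrupted run only loses the message in flight. Resuming would need the last copied UID stored somewhere, which this sketch doesn't do.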

Oh well, fight with balsa some more and see what that's like...

On the aboriginal list Alessio asked how to use source control repositories with the aboriginal build, which I gave a long elaborate answer to, and now I'm thinking about how to simplify that.

The answer boils down to "the package cache can use a repository directly", but then I have to explain what the package cache is. I already have a big long web page explaining this but it's one of those "you can totally ignore this" things.

The reason it's there is that extracting and patching tarballs over and over again, only to rm -rf the result afterwards, is really slow. Plus it doesn't scale to multiple parallel builds (eating buckets of disk space and doing horrible things to the disk cache). So I extract _one_ copy of the source code, snapshot it with a tree of either symlinks or hardlinks as needed, and figure out when it's out of date using sha1sum fingerprints of the tarball and patches that went into it.

The clever bit is you can ignore that I'm doing it, and just pretend each build stage just extracts and patches a fresh copy of the source code each time.
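
The out-of-date check can be sketched like this in Python (the real build is shell scripts, so this only illustrates the idea). It assumes the stamp file simply holds one sha1 hex digest per line, tarball first and then the patches in sorted order; the actual layout of sha1-for-source.txt may differ, and the function names are made up.

```python
import hashlib
import os

def fingerprint(paths):
    """Return one sha1 hex digest per line for the given files, in order."""
    lines = []
    for path in paths:
        digest = hashlib.sha1()
        with open(path, "rb") as f:
            # Read in chunks so large tarballs don't have to fit in memory.
            for chunk in iter(lambda: f.read(1 << 16), b""):
                digest.update(chunk)
        lines.append(digest.hexdigest())
    return "\n".join(lines) + "\n"

def cache_is_current(cache_dir, tarball, patches, stamp="sha1-for-source.txt"):
    """True if the stamp in cache_dir matches the tarball and patches on disk."""
    try:
        with open(os.path.join(cache_dir, stamp)) as f:
            recorded = f.read()
    except FileNotFoundError:
        return False            # never extracted, or the extract was interrupted
    return recorded == fingerprint([tarball, *sorted(patches)])
```

If cache_is_current() returns False the cached copy gets deleted and re-extracted, and the stamp is written last so a half-finished extract never looks valid.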

Now I have to explain how you can use a source code repository from git or mercurial or subversion or something in place of a package cache copy of the source. Doing so is actually pretty easy, explaining what's going on when you do it is hard.

Possibly the package cache should be in "packages/cache" instead of under build. I had it under build because all the temporary files go there, and rm -rf build is "make clean". But packages is downloaded files that aren't checked into the repo, a "make distclean" would be rm -rf build and rm -rf packages. Hmmm...

One of the reasons I put the package cache into build is so it could be the only writeable space (everything else could come from a read-only mount), and you could hardlink between the package cache and the working directory (which saves thousands of inodes and is much faster than symlinking). But _most_ of the time that's the case anyway, and I can autodetect when a hardlink won't work (hardlink sha1-for-source.txt and if that fails, cp -s instead of cp -l).
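
The autodetect step could look something like the following Python sketch: try to hardlink the stamp file as a probe, and pick hardlinks or symlinks for the rest of the tree based on whether that worked. It assumes the working directory starts out empty and lives outside the cache directory; the function name is made up, and the real thing would just be cp -l or cp -s.

```python
import os

def snapshot_tree(cache_dir, work_dir, stamp="sha1-for-source.txt"):
    """Mirror cache_dir into work_dir; return "hardlink" or "symlink"."""
    os.makedirs(work_dir, exist_ok=True)
    try:
        # The probe: hardlinking fails across filesystems or on read-only media.
        os.link(os.path.join(cache_dir, stamp), os.path.join(work_dir, stamp))
        place, mode = os.link, "hardlink"        # equivalent of cp -l
    except OSError:
        place, mode = os.symlink, "symlink"      # equivalent of cp -s
    for root, dirs, files in os.walk(cache_dir):
        rel = os.path.relpath(root, cache_dir)
        target = os.path.normpath(os.path.join(work_dir, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            dst = os.path.join(target, name)
            if not os.path.lexists(dst):         # skips the already-probed stamp
                place(os.path.abspath(os.path.join(root, name)), dst)
    return mode
```

Directories are always created for real and only files are linked, so either mode leaves a tree a build can write new files into without touching the cache.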

The other bit of fuzziness is alt-packages, which the email I linked to above explains. When to use the repo version and when to use the tarball version is tricky. I think the alt- stuff predates the package cache, possibly I can just remove the alt- infrastructure and instead document how to directly fiddle with the package cache...


September 6, 2012

Less stressed today. (I realize I've been venting a bit in this blog, but it's not like anybody reads it. :)

Dealing with the giant backlog of stuff accumulated so far this year. I've got a message in the aboriginal mailing list and a thread in toybox to reply to, but since thunderbird now takes 4 hours to check my email (and loses track if it's disturbed during this), I haven't downloaded them yet to reply. It's ironic that the main thing stopping me from using the new netbook is the need to set up a new email client before I can stop using the old one, and the old one is BARELY limping along.

Fiddling with configuring balsa on my old netbook, in hopes of finding a non-stupid graphical email client. The setup wizard uses a fixed window size that's bigger than the netbook screen (1024x600), with no ability to resize it. The "next" button is off the bottom of the screen. So "non-stupid" is already disqualified here.

I miss kmail. If it was still available as a standalone program I'd happily go back, but I won't reinstall the KDE katamari to get it, nor the Microsoft Office wannabe suite it got sucked into. (Extrude? Excrete? Expunge? Eczema? I think it began with an E. Not looking it up.)

Biked to Fry's, hanging out in the coffee shop today. Different baristas, the usual pair are apparently off thursdays. Yay exercise, I need lots more. It's odd, when I'm at home I sometimes get to 5pm without having eaten anything because I just didn't think of it. When I'm in the cubicle I'm constantly going "I feel terrible, maybe I need food". Plus visiting the break room is an excuse not to be in the cubicle.

Fade figured out that her calculations weren't taking into account that money's getting deposited in _two_ accounts. I've been taking a little under 1/3 of the money I make and depositing it into my old credit union, and using it for all _my_ expenses, plus buying internet access for my sister's kids, paying my friend Heather's medical bills, mailing money to various webcomic artists, and so on. (She knew this, she just forgot. That makes a lot more sense.)

Catching up on my back email (part of ditching thunderbird: going through all the messages that are still marked unread), I caught a linkedin notice from the 26th that Xianghua (the more recently departed co-worker) already found a consulting gig. Good for him.

Playing with initramfs in aboriginal, working out why ls is complaining about "exe" a bunch of times when I "ls -l /". (The only hits on "find / -name exe" are the /proc/$PID/exe files, and ls should _not_ be recursing to hit those without -R.) Ah, I seem to have _completely_ screwed up the ls logic in commit 580, when I reversed the meaning of the DIRTREE_NOSAVE and DIRTREE_NOFOLLOW flags, the ls filter() function didn't get updated right. It's a small forest of nested if (blah) return VALUE; with some other stuff in there, and both the return values and the sequencing are wrong in places now. How did that ever work?

Fill out the regression tests. That's a todo item I just haven't had TIME for while commuting to a cubicle. (Oddly the broken ls code still gives the right answer, when it doesn't hit a permissions conflict on the gratuitous recursing that spams stderr, if you don't mind it leaking memory, and if you never use -aAf.)

Vaguely pondering installing toybox into /usr/toybox and busybox into /usr/busybox as a transitional measure while I'm weaning aboriginal off busybox. Right now if busybox installs a file toybox won't overwrite it, which keeps biting me but also reminds me to trim the busybox config...

September 5, 2012

On an 8am conference call for work this morning. (My boss was kind enough to let me use some of my vacation time, but that doesn't mean I can stop working.) Afterwards he emailed me "a wake-up call" to tell me how disappointed he is in my current performance, at which point I went back to bed for several hours.

Heh. The initramfs in the most recent aboriginal release _does_ work, what broke it in my working directory wasn't the kernel, or busybox, it was using toybox in build/host. Specifically toybox wc had a trailing space on the output, so LEN=$(echo $THING | wc -c) followed by cut "${LEN}-" was becoming "cut 63 -" which is different from "cut 63-", so the initramfs generation created a bunch of files named "/" which didn't extract.
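To make the failure concrete, here's a small Python model of what the shell does with that output. It is not the toybox or aboriginal code, just an illustration under two assumptions: command substitution strips trailing newlines but leaves a trailing space alone, and the unquoted result then gets word split (approximated here with shlex.split). The function names are made up.

```python
import shlex

def command_substitution(output):
    """Model $(...): trailing newlines are stripped, other whitespace survives."""
    return output.rstrip("\n")

def build_cut_args(wc_output):
    """Model building the cut command line from wc's output, with word splitting."""
    length = command_substitution(wc_output)
    return shlex.split("cut " + length + "-")

# wc printing "63\n" gives one range argument; "63 \n" splits it into two words.
good = build_cut_args("63\n")    # ['cut', '63-']
bad = build_cut_args("63 \n")    # ['cut', '63', '-']
```

One stray space turns a single "63-" argument into "63" followed by a bare "-", which is a different command.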

Redid a chunk of the wc code and now initramfs is working again, meaning I can use it to test switch_root. Basically I'm trying to get a lot of backlogged stuff cleaned up, broken up, tested, and checked in this week. Might cut a toybox release this weekend on general principles, I've done enough to it...

Todo: gotta get access fixed and go through the Documentation backlog. Gotta get musl into aboriginal, which involves teaching ccwrap.c to handle it. Gotta get toybox back into aboriginal (mostly done, just cleanup left). Gotta review the new toolbox, and sasl, and the pile of pending submissions for toybox. Oh and the 3.6 kernel will require a redo of the perl removal patches for the new kernel header refactor. And I need to finish the writeup of my "why I stopped using gpl" page. And finish the saga of git bisect writeup I'm halfway through and maybe see if the H-online guy wants it or something.

And of course I'm having all sorts of ideas on things I should do with aboriginal, such as making an initramfs image that mounts a virtfs, so instead of hda, hdb, and hdc I could have it be directories exported from the host by the wrapper script. (With NFS this would be horrible, but the p9 protocol isn't an actively stupid design...) Plus I need to update LFS to current, and do a BLFS image like I did for the Hexagon way back when. And maybe I should try to integrate the bullet point list from last week into aboriginal's about page...

Dug up the bigger netbook I bought back in June and re-rsyncing my files onto it in hopes of actually switching to it this week. The problem with the atheros 9485 wireless card turns out to be that hardware encryption doesn't work, so if I insmod with nohwcrypt=1 it works fine. Unfortunately, unplugging the headphones made the sound go away and the bug report about it is _also_ from June, with no response. Found that aumix no longer works, and found that alsamixer has quite possibly the worst user interface of all time, but after fighting through that... it's not a volume or mute problem. Sound just stopped working.

Digging into it, apparently I have to feed the snd-hda-intel module the right model= parameter from Documentation/sound/alsa/HD-Audio-Models.txt (google says somebody got it to work with 6stack-dig but mine only has 2 jacks: input and output...) Oh, and of course given the horror that is audio modules where the reference count on this sucker is _7_ (alsa breaks audio modules up into so many tiny pieces I've never managed an rmmod/insmod cycle that brought the hardware back) I have to reboot to test each guess.

Oh, and Google also said that in 10.04 (still using the opensound stuff instead of alsa), this just worked. It's a regression. (And the gui controls for manually selecting headphone vs internal speaker are only present in gnome, not in xfce.)

Linux on the desktop: smell the usability!

September 4, 2012

Headed into campus with Fade. Hanging out at the Stein of Ein on the drag for the first time in months. It's relaxing. I miss going to UT.

Initramfs isn't working in Aboriginal 1.2.0. Bisecting why. It worked in the 1.1.1 release, but that was 3 kernel versions ago.

Mildly regretting not buying that server upgrade yesterday, but I couldn't use it from here anyway, so. *shrug* In any case, I should send the money to my sister. (I promised to buy her kids internet access for a year, and she installed it but so far only paid for the first 4 months. And that was... a while back.)

September 3, 2012

Fixed toybox mktemp so the kernel build works, and confirmed aboriginal builds through LFS with current toybox once that's fixed. This took a while on my netbook, and the new one (twice the memory but slightly slower processor) doesn't help much. Hmmm.

Priced an upgrade to the dead server: a 3.6 ghz i5 processor (4-way hyperthreaded to 8-way) with 16 gigs ram, in a new motherboard, comes to $570 with tax. That's keeping the case, power supply, and hard drive.

And then I didn't buy it. On the one hand, I'd love to take advantage of this during my week off. On the other hand, I'm not sure I'll actually get the whole week off and if I don't recover from this burnout I'll be living off savings (and coughing up lots of money for CORBA) for a bit. I keep having to bail out desperate friends and relatives the economy is pulling under. Reluctant to spend the money on a server I don't strictly _need_.

Sigh. Much higher expenses at the new place than the old place, makes me nervous. For many years I arranged my life so I didn't actually need to stay employed in any given year, I could take a year off and go back to college or something if I wanted to, because my expenses were really low. That's no longer really the case. I used to be tied down by cats, now I'm tied down by a mortgage.

Fade loves the new house, but paying for it means I can't afford to ever stop working full-time. My calculations when I bought the place said we could survive on about half of what I'm making now, meaning if I worked for a year I could earn a year off, and make great progress towards retirement. But Fade says we'll burn through our savings at anything less than 2/3 of what I'm making now, and that's _if_ we scrimp and save. So if I lose this job, I have to get another full-time permanent position immediately. No more consulting with big gaps where being "too busy right now to do X" has a light at the end of the tunnel. No more "I could always switch to something that pays less but is more fun." Now I have to work for money instead of fun, sitting in a cubicle for the rest of my life, until I die.

But hey, at least I get a couple weeks parole every year. And my boss likes me enough to do me the favor of letting me _use_ my vacation time, at least on the third attempt, so far. (Always look on the bright side of life. Whistle whistle whistle.)

Yeah, still suffering a little bit of burnout. Yay time off.

(Maybe my problem is that I'm working on a product I personally would never, ever use? Not only am I not in the intended audience, neither is anybody I know, which makes it hard to get properly enthused about the actual product I'm working on. And since we're working on a company fork of a vendor fork of the android fork of an obsolete kernel version, none of the source I'm working on will see the light of day. (I'm sure a technically compliant historyless source tarball will eventually be thrown up on some website, but nobody will ever care. They'd only care if it _didn't_ get released to comply with license obligations.))

I think I need another nap.

September 2, 2012

Slept most of yesterday. I needed that.

Much to do. Aboriginal Linux serves so many purposes it's hard to document. It's both a generic tool and something that serves my specific needs.

As a generic tool, Aboriginal Linux needs to:

  1. Build the simplest possible self-hosting Linux system.

    • Document how to do it (these packages, configured this way).

      • linux, uClibc, busybox, gcc, binutils, make, bash.

        • shows minimal command list.

      • Build system needs at least 256 megs ram, serial console, network, 3 block devices, current clock time.

    • Provide a convenient, automated way to _get_ this environment.

      • Provide prebuilt binaries: download our tarball and use it without caring where it came from unless you want to.

      • Reproducible from source, reliably on an arbitrary host.

        • The output of our build system is a system image, but the stages along the way are also tarred up in case you want the cross compiler or the target root filesystem in tarball form.

  2. Demonstrate that it works

    • Show that it can rebuild itself under itself.

    • Show that it's sufficient to build everything else starting from this environment.

      • Build Linux From Scratch under the result.

    • Regression test that it still works after each package upgrade.

  3. Make cross compiling go away ("We cross compile so you don't have to.").

    • The point is to eliminate the need for cross compiling, not the ability to do so. We do all the cross compiling that's necessary to bootstrap a new target for you, and provide a sufficiently capable native environment that you don't have to fall back to cross compiling.

    • Replace cross compiling with native compiling under emulation.

      • The advantage of emulation is that anyone can download QEMU and run it on cheap commodity "cloud" hardware. Not everyone has a dozen different embedded boards with appropriate I/O devices.

      • If you have real target hardware powerful enough to build on, go for it. But we shouldn't require you to.

      • The build should work the same on real hardware as on the emulator. (Identical root filesystem, different kernel .config and invocation script.)

    • Create a native development environment for each target.

      • Provide a preconfigured build environment for each interesting target. This includes:

        • Cross and native compiler configurations.

        • C library configuration.

        • Matching kernel and emulator configurations.

        • Wrapper script to launch emulator to a shell prompt.

      • Anything QEMU emulates well enough to boot Linux is probably "interesting".

      • Make adding support for new targets as simple as possible.

        • All target-specific configuration is in a single file under sources/targets, with most target-generic information factored out into sources/baseconfig-*.

        • Existing targets act as examples for new targets.

        • All targets are as similar as possible, and provide the same minimal base set of functionality so there's not much to configure.

  4. Provide a native development environment simple enough for casual interest.

    • Install QEMU and download a system image, run "./" and you've got a shell prompt on arm, mips, ppc, sparc, sh4... wget your source, compile and run it right there. Type exit when done.

    • Send a package maintainer a link to a system image and a set of commands to build and run their package in there so they can reproduce a bug that only happens on hardware they haven't got. Bonus: they can test their fixes in the same system image.

  5. Provide a native development environment robust enough to deploy as a project's build system.

    • Accelerate that native development environment where possible to reduce cross compiling's speed advantage, let Moore's Law handle the rest.

    • Distcc calling out to the cross compiler (in a way transparent to the native build) has no impact on configure/install but up to 7x speedup of make. This also allows the build to take advantage of SMP.

    • Static linking of target tools to reduce qemu page retranslations caused by dynamic linking (20% speedup of ./configure).

    • Optimize qemu's network and block devices (emulate gigabit ethernet, virtio, etc).

  6. Automate the native builds. (control images).

    • QEMU configured for easy automation.

      • Serial console is stdin/stdout, log output via "tee" or shell redirection

      • Emulator process exits when emulated linux shuts down.

      • Driveable from "expect" or a cron job.

    • Control images bundle control script and data, emulated system runs script (instead of shell prompt) and exits when it finishes.

      • build-static - build dropbear and strace.

      • lfs-bootstrap - build Linux From Scratch.

      • Act as examples to build your own.
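The automation points in item 6 (console on stdin/stdout, output logged, process exits when the emulated system shuts down) can be sketched as a tiny Python driver. This is only an illustration: run_and_log and its parameters are hypothetical names, and the command you pass would be whatever launches the emulator, which isn't spelled out here.

```python
import subprocess
import sys

def run_and_log(command, script_text, log_path):
    """Feed script_text to command's stdin, tee its output to log_path.

    Returns (exit_code, output). The command is expected to exit on its own
    once it has consumed its input, the way the emulator exits at shutdown.
    """
    result = subprocess.run(
        command,
        input=script_text,            # what you would have typed at the console
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,     # one combined stream, like a serial console
        text=True,
    )
    with open(log_path, "w") as log:
        log.write(result.stdout)      # the "tee" half: keep a copy on disk
    sys.stdout.write(result.stdout)   # and still show it to whoever is watching
    return result.returncode, result.stdout
```

Because the child's exit status comes back to the caller, a cron job can tell a finished build from a failed one without parsing the log.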

The above set of goals is pretty much met. I started working on them around 2003, and shipped a 1.0 release that more or less "did that" in 2010. It's still not perfect, but oh well.

Along the way to making that work I became busybox maintainer, because getting a simple self-hosting system down to that few packages involved a lot of work on busybox (and several pokes to the uClibc guys, although uClibc's 0.9.26 release in January 2004 should have been 1.0 because that's the one that "mostly worked". In comparison busybox had a 1.0 release that was dozens of commands short of working in a self-hosting build environment.)

These days I'm using Aboriginal as a test harness for toybox (gradually replacing busybox and bash), and soon the Musl libc (replacing uClibc). These are licensed in a way that Android might be able to deploy them. (Android's "no GPL in userspace" policy prevents busybox or uClibc from fitting that niche.) And besides, I prefer the toybox code now that busybox has turned into a forest of #ifdefs and weird magic macros with the entry point buried down in a subdirectory.

Eventually, Aboriginal should act as a test harness for qcc (replacing gcc/binutils/make). That gets the package count down to 4 (linux, toybox, musl, qcc), which is about as simple as it gets.

My other todo item is making more bootstrap control images for various linux distros, but that's a bit down the todo list.

September 1, 2012

Burned out. Completely burned out. My day job has been hugely stressful, and I've been repeatedly turned down on attempts to schedule vacation days, but I _might_ have this coming week off. I think. Maybe?

Last october we moved into a new building, where instead of offices or desks in the hallway, we had cubicles. Since then, the department I was hired into has slowly disintegrated. I blame the cubicles.

In a larger context the company has been in crisis du jour mode all year. Around new years senior management announced a "software strategy" for the company where we'd be doing 80% software and 20% hardware, whatever that means. Considering that I work in hardware bringup, that's made my coworkers and me a touch nervous. (Oh, and their "software strategy" involved partnering with Microsoft. This is #3 on Vezzini's list of classic blunders after getting involved in a land war in asia, and going up against a sicilian when death is on the line.)

Senior management has done something flashy to appease the stock market every quarter. First quarter it was random layoffs (with a pile of donuts to apologize to those of us left). Second quarter they idled the facility for a week (and charged us vacation time for it) while they brought in chinese engineers to study our code and start a test project to show that they can reproduce what we did (deeply, deeply reassuring), and then third quarter it was layoffs again (this time the apology was a hamburger lunch, and the bulk of the layoffs was apparently in canada).

Along the way, the guy in charge of the Austin site got deposed and replaced by somebody who lives in California and only occasionally visits Austin. The consensus is that executives pull resources close to where they live, and the Austin site has indeed had a hiring freeze since this switch, although they assure us they're not closing it before they can figure out how to get out of the long-term lease they signed in October.

There are also various all-hands meetings, during one of which they spent an hour telling us how they were going to change the logo (and how big a deal this was) all the while refusing to _show_ us the new logo for the first 40 minutes of this. Shortly before they finally revealed it I told a co-worker "as long as it's not The Brown Ring of Quality, I don't care", at which point they revealed a new logo consisting of three rings, two grey and one brownish red. Sigh. There was an hour long meeting justifying this. They spent eight minutes explaining their font choice to engineers. I counted. There was a video from the CEO, although we get those weekly in our inbox anyway. But this was what they considered important to the future of the company. This was what was holding them back, apparently.

In one of the all hands meetings I asked the executive du jour (too many levels of management for me to have any clue who is who) whether their videoconferencing strategy had any response to Skype, Google hangout, or the builtin video chat stuff on the iPhone. He said that's not our competition, Cisco and Microsoft are. (I refrained from attempting to explain Moore's Law or Disruptive Technologies to him.)

When I figured out my cubicle was giving me Seasonal Affective Disorder, my manager found me a table in an open lab, next to some windows. This was great, for a while. In this case "open" means it was a server farm that the servers were mostly moved out of, and they added some sofas and a whiteboard. It's not in the conference room scheduling system, which means it's like the tiny sliver of unregulated WIFI bandwidth routers use: everybody uses it for everything. These people include my boss's peers, my boss's boss, and so on. When the guy in charge of the Austin site (the one who lives in California) spent his week in town scheduling one-on-one interviews with the people who report to him (this is like 4 levels of management up from me), I fell out of the habit of using the room, and have been in the Gloom Cubicle ever since.

On the actual engineering side of things, my department's been working on the same pair of board designs since my first stint as a contractor over a year ago. When I started the first board was "done" from a bringup standpoint, and all the work was on the second board. A year later neither board has shipped, and the first board is now the majority of what I spend my time working on. The proposed successor to both boards has been cancelled, but work installed a Doom Clock with a big red timer counting down the days, hours, minutes, and seconds to when the first board presumably ships, so that's something. (Apparently it ships at midnight.) The question of what we do when it's done is... political.

This doesn't mean I've spent the _entire_ year working on the same pair of boards with no obvious progress: for a while Angela and I were transferred to work on a third board, and then transferred back as that other project was slowly strangled due to lack of resources. At which point Angela quit, and went to work for TI.

The two co-workers I learned the most from at this job were Angela and Xianghua. Along with Hai (who I don't talk to much), they were the lynchpins of competence of the department. (This doesn't mean my other coworkers are bad people, just that I've learned nothing from them.)

On Angela's last day she finally had time to explain to me what she'd been working on, and then based on this "knowledge transfer" I tried to come up to speed on the audio hardware she'd worked on. (Apparently it can go through the ALSA layer, but that's not how we're using it.) And then I had to bring her replacement up to speed, a contractor named Victor who is a hardware guy and really doesn't know linux. (Cursors up 30 times rather than retype "ls -l". I bought him one of those laminated Linux cheat-sheets at Fry's and it was appreciated. I spent a couple weeks trying to teach him everything from our build system to how git works. It was exhausting.)

Around then Xianghua burned out. During the week the chinese people were learning to do our jobs, he took his kids to disneyworld, and returned really wanting to spend time with his family instead of in a cubicle. I convinced him to stay a while longer to see if he recovered, towards the end of which he told me he was working at "about 15%" of capacity. He then took a 6 month leave of absence he didn't expect to come back from, and is thinking of trying contracting.

So another knowledge transfer resulting in me taking over responsibility for yet another set of hardware I don't really understand and eventually have to teach somebody else about, although when the new guy starts and what I'm supposed to be doing with it in the meantime are unclear to me. (It's somewhere between "if you could magically fix this it would be appreciated" and "it's broken and you have been designated to take the blame".) I've also taken over the server administration Xianghua was doing, and I'm not clear if I should be putting together system images for hardware testing, or if Hai should be doing it. (He does board #1, Xianghua did board #2, and Angela did Board #3 back when we were working on it. Angela left a system set up to burn board #3 system images when she left, and within a day one of the hardware guys had burned an image to /dev/sda, wiping the drive of that system. So people come ask me to do it now, but they never ask me for an updated kernel or anything, just another copy of the old stuff. If that's what they want...)

So far this year, other than the week where we had to go away while chinese engineers learned to do our jobs (for which we were charged vacation time), I've only managed to schedule one day of vacation time. I've _tried_ to schedule more than that, and was repeatedly denied. I asked about taking some time off back during Spring Break (back before Angela quit), but was told that I needed to do so a month in advance. And since then, it's been one crisis after another.

So a month ago, I asked with a full month's warning if I could bridge the labor day weekend to a week. (This is back before the Doom Clock went up.) And I was tentatively told that as long as I'd done everything they asked by then, I could get the time off. (This was hard because at this point I envy Xianghua's 15% efficiency. I can't concentrate in that cubicle. I sit down and get a headache.)

Yesterday, my manager forwarded a request from senior management that everybody work the weekend. Yes, the three day weekend. This is the point at which I let my manager know HOW burned out I am, which apparently came as a surprise to him. I stayed until midnight trying to get stuff done, including the new todo items they sprung on me Friday.

And now I am at home. They have my cell phone number. They might call me in at any time, I don't know. I'm being charged for vacation time this week, maybe I'll actually get to stay away from work.

Here's hoping.

August 28, 2012

Yay, slashbeast tracked down his mdadm problem (two missing uClibc config symbols) and emailed me the fix.

I actually have a giant tangle of aboriginal work that didn't belong in the recent release, such as introduction of toybox and musl support. Now I have to untangle that, check it in as a coherent patch series, regression test it all...

August 25, 2012

On the apple vs samsung patent thing:

Bravo Apple, on a Pyrrhic victory. You've just guaranteed that the entire rest of the world will scrupulously avoid interoperability with your devices, that whatever cross-vendor standards emerge will not have any of you in it, and that henceforth your company will be steered by lawyers. RIP Steve Jobs.

August 24, 2012

Thinking about the random gun violence du jour, not one but two this time: 9 people shot at the Empire State building, and 19 overnight in Chicago. The only things we have in society that kill as many people at once are automobiles.

Driving a car in this country requires a license. Getting a driver's license is an elaborate process involving many hours of supervised practice, a written test, a performance test from a driving instructor, and then you need to renew the license periodically. Your license can be taken away if you're ever found to be driving recklessly even before you hit anyone with it: driving drunk, consistently speeding, running too many stop signs...

The car itself requires another license prominently displayed on front and rear plates, plus an annually renewed registration (sticker displayed on the windshield). The car must be inspected annually, at the driver's expense (if it fails you can't legally drive it until you get it fixed). You're also not allowed to operate your vehicle without large amounts of insurance covering the potential damage you can do with it.

A significant portion of all police manpower is devoted to automobiles, from "traffic cops" to "meter maids". Get involved in a high speed chase and they have everything from roadblocks to spike strips to helicopters standing by to deal with the problem.

Yet there's no car lobby screaming "they're going to take our cars away". Cars can have requirements for seat belts and catalytic converters, but the NRA freaks out at the suggestion magazines that hold a hundred rounds of ammunition should not be available mail-order.

Cars have a purpose other than killing people, which guns do not. The "don't blame the gun, blame the person" argument ignores the force magnification aspect: a crazy person with a knife can kill two or three people, that same crazy person with a gun slaughters an entire norwegian camp full of children, or a high school, or a movie theatre, or Virginia Tech five years ago or the University of Texas tower back in the 60's.

The justification for regulating ammunition less stringently than sudafed always boils down to a need for "keeping the government honest", which is stupid. Examples of armed resistance to federal authority have all ended the same way for over a hundred years, from the civil war through the branch davidian compound.

Modern warfare involves close air support based on realtime satellite imagery. The idea that civilian gun ownership has any military value completely ignores the existence of tanks, airplanes, artillery, cruise missiles, unmanned drones, or the fact that the military has been practicing urban warfare against small arms fire for decades in Vietnam, Somalia, Grenada, Iraq, Afghanistan... The hard part of modern warfare is _finding_ the enemy, not killing them once they've been identified.

In modern urban warfare "Improvised Explosive Devices" (mines, tripwires, car bombs, etc) have consistently had far more impact than actual guns since somewhere around the invention of the armored personnel carrier. The successful guerilla attacks on america were also based on improvised explosives: the Unabomber maintained his campaign for decades, the oklahoma city bomber filled a truck with fertilizer and fuel oil, and the world trade center was destroyed when jet fuel melted the support girders the 1970's asbestos scare had left uninsulated.

In all of those cases, civilians were hurt, the government was not. Whether or not the government was the intended target was irrelevant. The same goes for the argument "Oh if only some of those people at the Batman showing brought guns so that after the teargas canister went off, they could have fired blindly into a dark crowded room in the vague direction of someone wearing full body armor." "Friendly fire" isn't. What guns are really good at is killing innocent bystanders.

The supremely hypocritical part is that most of the people insisting the only thing keeping our army from turning on us is civilian gun ownership are rooting for that same army to "shoot an idea" in Afghanistan (which is hard when they don't all wear the same color shirt). The people our army is killing in Afghanistan often have guns, and yet we're in year 11 of finding more of them to shoot. Anybody know the rates of civilian gun ownership in afghanistan before the war? Does anybody _care_? Did it matter?

And of course while you're going on about guns the feds have warrantless wiretaps, intercept and record all internet traffic at the backbones, have cameras on all the traffic lights, complete logs of all credit card and banking transactions, extrajudicial rendition to guantanamo bay, persecuting the wikileaks guys for the modern version of The Pentagon Papers, an unmanned drone flying over your city as we speak, and everybody carries around a wireless microphone and camera combo with GPS reporting 24/7 called a "cell phone". But it's ok because you have a handgun, that'll show 'em.

If you think that stopping the government from "taking your guns" has _anything_ to do with restraining federal power, you're utterly delusional.

The NRA is a delusional entity stuck in the past sometime before "pickett's charge", which is terrified that we might someday become a country like britain, japan, or canada. Yet Wikipedia[citation needed] is of the opinion that the NRA has 4.3 million members and a budget somewhere over $200 million, while their opponents have 28 thousand members and an annual budget of just under $4 million.

Then again, the smaller organization is more likely to appreciate a smaller check.

August 23, 2012

Alas, reinstalling my server quadrolith with ubuntu didn't fix the hideously slow disk access problem. (Like 200k/second slow.) Although dmesg isn't being obvious I _think_ the problem is the iommu chip returning "all ones", which implies the processor is fried somehow.

I admit to a certain amount of envy of a department server at work which I happen to know is a cheap 8-way i7 with the memory upgraded to 16 gigs. Still, "cheap" in this context comes out to $1000 with all that memory, not quite an impulse buy.

August 22, 2012

Aboriginal Linux 1.2.0 is out!

I am _less_behind_. Woo!

August 19, 2012

Fade's recent wisdom teeth surgery makes it difficult to celebrate her birthday. (I'd take her to fogo some cow, especially since it's a weekend and we can do it at lunch rates, but she's still on a diet of what Cohen the Barbarian would call "Shoup".)

Yay! I got the Arm versatile stuff working again! (Patch to the kernel to revert the most recent breakage.) The Fedora stuff is fixed, and I tracked down (and fixed) the longstanding ".config is a directory" bug in uClibc++.

One todo item left, and that's Piotr Karbowski's bug report about the raid stuff not working.

Sigh, what else am I doing. (Not remotely a complete list, just "off the top of my head"...)

August 18, 2012

Austin's having another drought crisis to the point we've got watering restrictions, but once again I biked to Fry's and got trapped there by a thunderstorm. Not sure if this is ironic or some sort of personal rain dance. (It's not that I mind getting wet, it's that my netbook, phone, and headphones probably would. Plus being hit by lightning sounds unpleasant, and it can get a bit intense at times.)

Shared a bus home with a nice woman named Tina who works in the Fry's cafe. Introduced her to the weapons-grade cute that is the Chi anime series. Met another guy who works at Dell at the bus transit station, and discussed computer stuff with him for an hour before deciding my bus wasn't coming, but by that point the rain had finally let up enough that I could walk home.

Tomorrow, I need to return to Fry's and pick up my bike. Wondering if the "rain prevents return" thing requires a round trip to kick in. One way to find out...

August 17, 2012

Fade got her top wisdom teeth out today. Our first date was shortly after she got her bottom wisdom teeth out, we went to the newly opened Kirby Lane on The Drag and tried to find soft foods, and she wound up eating gingerbread. Kirby no longer sells gingerbread (except as pancakes), but the Jason's Deli down the street has little gingerbread muffins in the salad bar, which they sell by the dozen if you ask.

Yesterday's unexpected half day off got me unblocked on Aboriginal Linux and my backlog of email (both of which were blocking toybox). I am CATCHING UP. Yay!

I also converted my blog here to have more obvious links to each individual entry. (It's had the #anchors for years, but the only links were in the rss feed. Now the HTML has them too.) This involved hitting all the previous years with sed to regularize them, so I could make big rss feeds for each year in case I want to import them somewhere else someday.

Actually using the "span" tags I've been throwing in forever is a problem for another day. :)

August 16, 2012

I got a half day off from work! Woo!

Time to catch up.

My Fedora server ate itself, by which I mean it runs fine for a couple minutes after you reboot it and then becomes INSANELY SLOW with either one or two of the processors pegged in I/O wait when the system's completely idle, and the others blocked waiting for the filesystem for insanely long times.

I thought "maybe this is the leap second bug", so I did a yum upgrade from the console (which took an hour), then rebooted the machine... and it worked for a couple minutes after reboot and then became insanely slow again.

Alas, this is Fedora which means I have a script I run each boot to switch some of the insanity OFF at least until I can ssh into the thing. (It's behind a router, only the ssh port is forwarded, but installing an ssh server via yum didn't edit the firewall rules to let the ssh port THROUGH. It returns a no route to host packet until I flush the firewall stuff. Because Fedora is nothing but Red Hat Enterprise Rawhide and they only care if I pay them to care, that's why.)

I installed Fedora on the server because A) the Gentoo install ate itself and I didn't feel like devoting an entire weekend to redoing it, B) I had a Fedora bug to track down. But today I finally fixed that, so I no longer need to keep such a horrible distro around. If it hadn't eaten itself I'd have kept it just for testing diversity, but if I have to reinstall it won't be Fedora.

I'd prefer something _other_ than Ubuntu though, since my netbook is xubuntu. Debian is Ubuntu Rawhide, it doesn't really increase my testing base even if I was willing to put up with the eternal flamefest of FSF politics without the potholder of Canonical preventing me from coming in direct contact with it.

Reinstalling Gentoo isn't worth the effort. I'd almost certainly break down and install Funtoo on the server at this point, except I'm still not welcome on the IRC channel.

(Speaking of one of the friends that guy insulted, her last dialysis treatment is tomorrow but that's because she cut it short. The doctors wanted 12 weeks but she could only afford 2 weeks, and although her job at HEB was willing to accommodate her schedule by making her part-time (which cut her pay enough that she'd almost exactly make rent with no money left over), that made her health insurance less effective which made the copay on each dialysis session jump to $130 each, which she'd have to pay twice a week and just couldn't afford even with my help. Her job can make her full time again next week, and the two weeks of dialysis at least gave her damaged kidney _some_ time to recover, so her doctor's either hopeful or making the best of a bad situation, and she'd rather lose one kidney than be unable to make rent. Did I mention that the reason she moved across country in the first place was to escape a husband who beat her?)

Yeah, not paying more attention to Funtoo.

So what does that leave? Lots of things like Knoppix and Crunchbang are Debian reskins. SuSE is chasing Fedora's tail lights and might be interesting if they hadn't gotten eaten by "private equity"; the people I know who used to be big SuSE fans say it's gone downhill. Is slackware still a thing? Does Arch do servers? Given that the market share figures I dug up show Ubuntu with 34 times the market share of Fedora and everything else a rounding error, does it _matter_?

I'm almost tempted to put FreeBSD on there, but I actually use this server.

I guess I'll just throw ubuntu on it and then run other distros under qemu for testing. It'd be really nice if I had a machine that actually had the darn VT extensions _working_ so kvm was actually accelerated. (They never default to on because a virus that uses them could completely hide from the OS by installing itself as a hypervisor. This is apparently why the bios can disable them.)

August 10, 2012

Terry Pratchett's "Making Money" _almost_ gets the nature of money right, without ever quite explicitly nailing the core point, that money is a promise.

In the previous book, "Going Postal", Vetinari took a convicted con man (Moist von Lipwig) and put him in charge of Ankh-Morpork's near-abandoned post office. Moist revitalized the post office by selling stamps, I.E. a promise to deliver a letter you hadn't even written yet.

Stamps could be exchanged freely once purchased, and the fact this promise could eventually be redeemed for a specific service got buried under the fact it could also be traded for other goods and services. People started using the stamps as money, a valuable "fiat currency", I.E. a portable reliable promise of something valuable, made by an entity trusted to keep that promise.

Ankh-Morpork is moving from a middle ages setting through the renaissance, which means it's retracing the development of modern money, where barter was replaced by commodity money which was replaced by fiat money.

Barter doesn't scale well beyond people you personally know, and leads to commodity money when everybody agrees on a single type of valuable stuff they'll all accept as payment and list prices in, so they can stop haggling and re-enacting the MASH episode about getting tomato juice to Colonel Potter (through a chain of a dozen different trades). One universal commodity everybody wants (if only because so many other people want it).

Commodity money is vulnerable to annoying levels of inflation if it's too easy to get more, and catastrophic deflation that crashes the economy into a depression if it's too _hard_ to get more. Deflation is sudden and deadly, like an engine running out of oil and seizing up. A shortage of money leads to a lack of buyers which drives prices down, discouraging people from spending the money they've already got if it'll be worth more tomorrow, and besides that money would be hard to replace if nobody else is spending any. Even a brief bout of deflation causes a feedback loop of hoarding money, which spirals downwards fast unless somebody pumps more money into the system to get everything moving again.

Commodity money can't cope with growth because nobody can control the money supply. Even simple population growth requires an ever-increasing amount of money to lubricate the exchange of goods and services, and without a slowly but steadily increasing money supply deflationary crashes repeatedly choke back growth. If it's population growth, the result is famine.

Fiat money fixes this by using _promises_ as money, allowing control of the money supply by controlling the rate at which promises are issued and redeemed. But who can issue rock-solid promises "as good as gold"? This gets us back to Postmaster Moist von Lipwig: a con man skilled at making and selling promises.

The post office's stamp was one such promise. Moist took the old practice of stamping a letter when people paid to have it delivered, and sold the stamps separately: the promise to deliver a letter that hadn't even been written yet. The institution of the post office backed these promises: they would deliver a stamped letter. And regulating the supply took care of itself as well: the post office would provide enough stamps to meet the demand, but not so many they couldn't deliver the resulting letters (at least until they could hire more workers as needed).

People started using these promises as money, a valuable universal commodity everybody would accept as payment. The post office could control the rate of issuing these promises to keep them available but valuable. And thus Ankh-Morpork got its first fiat money.

Vetinari saw the usefulness of fiat money, and moved to centralize it within the Ankh-Morpork government. The question was "how", and Vetinari's answer was "let Moist sort it out". Before stamps, Ankh-Morpork had some other weak "promises as money", and Vetinari put Moist in control of the pair of entities that issued them: a coin mint attached to a bank. Then Vetinari sat back to see what this promise-making con man would do with those resources.

Coins are a sort of promise, arising naturally in markets where the commodity money is a metal. A trusted authority measures a certain amount of metal, and stamps their mark on it to certify that it's the right amount and that it's sufficiently pure. This allows people in the market to trade it without having to re-measure it. There can be coins of different sizes and different metals, generally with fixed exchange rates when making a change from one type to another.

As the market grows (and people get rich thus hoarding money), the money supply must increase to avoid deflation, but it has to be hard to get more gold (or silver, or copper) because otherwise it would be bad commodity money subject to too much inflation. The only way the trusted authority can stretch out the supply of commodity metal is by making the coins smaller or less pure, and switching the promise to be "we promise to treat this coin AS IF it contained the original amount of metal".

This is a big controversial change, and not all marketplaces manage to do it. Of course the ones that can't bring themselves to do what is necessary to survive succumb to deflation and revert to barter, so you don't hear much about them. In the rest, old men bemoan the "watering down" of the currency, but the alternative is worse.

Coins are a halfway state, stretching the commodity money with promises, but not getting away from the commodity entirely, it's still there as an unspoken ideal that current coins aspire to. It also doesn't solve the full scalability problem: enough money to buy a house is too heavy to carry, performing large transactions with coins is ridiculous. (In "The Truth", framing Vetinari for stealing more coins than a horse could carry pointed this out.)

Banks solve the house-buying problem by making another sort of promise: You don't have to carry a house worth of coins, instead deposit your commodity money with the bank (including the promise-coins), then write a check giving someone else permission to collect that pile of coins, or just have the bank continue holding the coins for the new guy. Thus you can carry around the contents of a vault in your pocket. The bank can even print up special bearer bonds called "bank notes" that anyone can exchange for the appropriate number of coins in the vault, and thus can be used in place of them. (In our world, this is how paper money originated.)

Banks stretch out the money supply by taking the sting out of hoarding. If savers keep their money in a bank instead of a basement or mattress, the bank loans out those deposits (charging interest which they share with the depositor). Money that goes into the bank goes right back into circulation, even though it's still in the bank.

For example, buying a house with a mortgage means the builder gets paid now (and can spend it immediately), the buyer gets to live in it while paying back the loan (instead of saving up for 20 years before they can get a house, living who knows where in the meantime), the depositor still has the same number of dollars "in the bank", and the bank and depositors are earning interest on the buyer's promised future payments on the loan (the buyer's promise is an asset to the bank). As far as the economy is concerned, the money now exists in two places at once (stretched out by the pair of promises making up the loan), _and_ it made a purchase along the way.

Once people are comfortable with the idea of keeping money in banks and paying with "bank notes" or by check (moving money from one account to another without going through any "cold hard" form of cash the bank has to get from outside), the banks can start to loan out more money than they actually have. If how much money you have in your account is just a number written down somewhere, the bank can just write down new numbers. When you give that money to someone else who also has an account with the bank, the number just moves from one account to another.

There are downsides, of course: our old friends inflation and deflation. If a bank can print more "bank notes" than they have deposits, why bother to sell goods and services? Allowing a for-profit business to print money is the old "commodity too easy to get" problem all over again, plus "the tragedy of the commons" if there are multiple competing banks.

But a bank that _can't_ print money is vulnerable to a new type of deflationary crash called "bank runs" (see "It's a Wonderful Life" and "Mary Poppins" for classic examples). If a day ever comes without enough cash on hand to satisfy the withdrawals depositors actually make, the bank breaks its promise to give the money back when people ask, the broken promises become worthless, feedback loop, suddenly the bank's collapsed and nobody has any money. Knowing how much "capital reserves" to keep sitting idle requires predicting the future, which is hard.

To control inflation banks need some external force auditing them to impose capital reserve requirements. To control deflation that external force has to be able to loan the bank unlimited amounts of money to cover panicked withdrawals without foreclosing on everybody's mortgage (the classic "something must be done, this is something, therefore it must be done" idiot move of burning the village to save it).
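
To put rough numbers on why a reserve requirement caps how far lending can stretch the money supply, here's a minimal sketch of the usual textbook simplification (not anything from the book). It assumes every loan gets spent and redeposited in full, and the function name and parameters are made up for illustration.

```python
def total_deposits(initial_deposit, reserve_ratio, rounds):
    """Sum the deposits created as each loan is redeposited and re-lent."""
    total = 0.0
    deposit = initial_deposit
    for _ in range(rounds):
        total += deposit
        # The bank keeps reserve_ratio of each deposit sitting idle and
        # lends out the rest, which (by assumption) gets deposited again.
        deposit *= (1.0 - reserve_ratio)
    return total


if __name__ == "__main__":
    # With a 10% reserve the running total creeps toward 10x the original.
    for rounds in (1, 10, 100):
        print(rounds, round(total_deposits(100, 0.10, rounds), 2))
```

Under those assumptions the total never exceeds the initial deposit divided by the reserve ratio, so a higher required reserve means less stretching. With no reserve requirement at all the sum just keeps growing, which is the "commodity too easy to get" problem again.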

This gets us back to Moist von Lipwig working for Vetinari. Moist's job at the mint, attached to the Bank of Ankh-Morpork, is to introduce paper money backed by _government_ promises. In theory these "promissory notes" are an IOU from the mint good for dollar coins they haven't manufactured yet. In practice, the purpose of the notes is to be spent as money without ever exchanging them for anything. It's money that's a pure promise written down, but a promise of _what_ exactly?

In theory you can exchange a one dollar promissory note for a dollar coin, which at this point has "approximately the gold content of seawater". You can exchange it for a dollar's worth of gold, but as the plot of the book goes on it turns out all the bank's gold was stolen years ago by the rich family running it, and nobody noticed. (At least until Igor replaces it by magic, which is something of a nuisance at that point.)

Moist is a con man, used to selling "sizzle without steak", I.E. promises backed by nothing. He has great instincts but little explicit understanding of what he's doing. Here's what he's doing:

Moist is making promises backed by Vetinari's government, which Vetinari allows to continue because he finds them useful. He's printing those promises on pieces of paper, and Vetinari's government promises to accept those pieces of paper in payment when you buy services _from_the_government_.

When do you buy services from the government? There are some small fees for licenses and such, but the big one is when you pay taxes. Moist is saying "these pieces of paper are worth a dollar", and Vetinari is backing that up by agreeing to accept them in payment for taxes. Moist's dollar notes are backed by the full faith and credit of the Ankh-Morpork guild of tax collectors.

It's not coincidence that at the end of the book, Vetinari reassigns Moist to be in charge of tax collection, showing the city first the carrot, then the stick.

August 9, 2012

Starting a very basic mount command for toybox, and pondering infrastructure issues again. This time, command line argument ordering. Specifically, "-o ro" is a synonym for "-r" and "-o rw" is a synonym for "-w", and the last one listed on the command line wins. So if you have "-r -o ro -w" then the -w wins, and "-r -w -o ro" the -o ro wins, and "-o ro -w -r" the -r wins. Unfortunately, the automated parsing infrastructure doesn't quite work that way.
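
A minimal sketch of that "last one wins" rule, in Python rather than toybox's C. The name resolve_read_only is made up for illustration, and this is not how toybox's option parsing works. It only shows the behavior being described, by walking the arguments strictly left to right.

```python
def resolve_read_only(argv):
    """Return True for read-only, False for read-write, None if unspecified.

    Walks the arguments left to right so a later -r, -w, "-o ro" or
    "-o rw" overrides any earlier one.
    """
    read_only = None
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "-r":
            read_only = True
        elif arg == "-w":
            read_only = False
        elif arg.startswith("-o"):
            # Accept both "-o ro" and the attached form "-oro".
            if arg == "-o":
                i += 1
                value = argv[i] if i < len(argv) else ""
            else:
                value = arg[2:]
            for opt in value.split(","):
                if opt == "ro":
                    read_only = True
                elif opt == "rw":
                    read_only = False
        i += 1
    return read_only


if __name__ == "__main__":
    print(resolve_read_only(["-r", "-o", "ro", "-w"]))   # -w is last
    print(resolve_read_only(["-r", "-w", "-o", "ro"]))   # -o ro is last
    print(resolve_read_only(["-o", "ro", "-w", "-r"]))   # -r is last
```

Because the argument after a bare -o is always consumed as its value, "-o -r" treats -r as an option string for the filesystem rather than as the read-only flag.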

I can make -r switch off -w and vice versa, but working in the --longopts stuff along with that is... funky. I'm telling it to pass through unknown options (so I can turn --loop into -o loop, although I plan to autodetect when SOURCE is a file and do that automatically, --bind is a better example), but I can't say that -o isn't a recognized option and defer the parsing because A) kinda hideous code duplication making "-oloop" and "-o loop" both work, B) -o -r is a theoretical possibility where -r is the option to -o. (Who knows _what_ funky options filesystems like virtfs may want.) In which case -r would be consumed by the option parsing before I could take deferred action, and the order is once again horked.

I'd say "don't do that then" (and just have -rw always stomp "-o rw,ro"... actually I _think_ that's what I did in busybox 5 years ago), but the tricky part comes in when you're mounting a filesystem in /etc/fstab or remounting a file in /etc/mtab (with -o remount) in which case the base options out of the file get preserved and then the command line arguments get applied on top of them.

Then again, keeping order straight between command line options and something _else_ is pretty straightforward. The automated parsing handles the options, but splicing together "-o rw -o loop" into a single option blob is secondary processing anyway, and _that_ I can control the order of.
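
Here's a rough sketch of that secondary processing step: start from the base options out of the file, then apply each -o blob from the command line on top in order. It's Python for brevity, merge_mount_options is a made-up name, and only the ro/rw pair is treated as conflicting, which is a simplifying assumption (real mount options have more such pairs).

```python
def merge_mount_options(base, command_line):
    """Combine option strings in order, later ro/rw replacing earlier ones.

    base is the option string from the file, command_line is a list of
    the -o values in the order they appeared.
    """
    merged = []
    for blob in [base] + list(command_line):
        for opt in blob.split(","):
            if not opt:
                continue
            if opt in ("ro", "rw"):
                # Drop any earlier ro/rw so the newest one wins.
                merged = [o for o in merged if o not in ("ro", "rw")]
            elif opt in merged:
                # Skip exact duplicates of other options.
                continue
            merged.append(opt)
    return ",".join(merged)


if __name__ == "__main__":
    print(merge_mount_options("rw,noatime", ["ro", "loop"]))
```

Keeping the result as an ordered list means whoever consumes the final blob sees the options in the order they were applied.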

(A preposition is an excellent word to end a sentence with. English is not latin; deal with it.)

August 8, 2012

Wouldn't it be nice if I could run a kickstarter to afford to work on open source stuff full time for a year? Other than the part about not having any health insurance, being unemployed in a recession at the end of it, and of course nobody contributing to such a kickstarter.

I like my day job, but my most interesting (immediate) coworkers keep leaving, and it's hard to take pride in working on a dedicated box to perform a high-end proprietary version of a function that's built into my phone. (Moore's Law has to work into it somewhere, but upper management insists their competitors are upmarket, not down. Oh well, ours is not to reason why, except within a very narrow band of defined job responsibility...)

Plus my commute eats about as much of my schedule each week as working on linux-kernel Documentation, and I'm the maintainer of that. Once upon a time I did get sponsored to improve kernel documentation, back when I wasn't really in a position to do much about it. Now that I can (because the previous guy has even less time to work on it than I do, and handed off the authority to go at it), I've got evenings and weekends, and early mornings if I'm not too tired to snooze my alarm.

Toybox had a lot of anonymous interest but nobody willing to go on record for fear of aggroing the fanatic hordes of the FSF (who would blog mightily against such things from the comfort of their armchairs), especially after they made an example of Tim Bird (who was barely even involved).

Oh well. I continue to work at my own pace, but I'm making much more progress on the stuff I can dink at in 15 minute intervals than the stuff I need to find an uninterrupted 4 hour block for. (If it takes 15 minutes just to review all the context and get the details of the problem straight in your head, by around the third time you've done this without making any progress because you get interrupted right after that, even simple things start to seem hard, because if it's so simple why am I still grappling with this?)

August 5, 2012

Biked to the little coffee shop in Fry's (which they tell me is actually a Cafe, which I assume means they serve sandwiches). I'm not sure if it was unusually exhausting and nausea inducing because I'm that out of shape, or because it's 105 today and I set out around noon.

Poking around at toybox, taking the #define GNU_DAMMIT out of taskset. (The sched_getaffinity() and sched_setaffinity() system calls are Linux-specific things, I.E. things Linus Torvalds added to Linux which do not exist in other operating systems, so the crazy Hurd guys running glibc decreed that #include won't give them to you by default unless you #define ALL_HAIL_RICHARD_STALLMAN which ain't happening.)

The taskset command itself is crazy. For example, the -p option is not followed by the pid, instead the pid is always the second argument, meaning it expects "taskset -p mask pid" instead of "taskset -p pid mask". Oh, and -p has to come before mask, it won't accept "taskset mask -p pid". Did I mention the mask is optional, so "taskset -p pid" queries the current mask without trying to set a new one, but NOT FOR THE REASONS YOU THINK? And back before the -a option went in, it was the default behavior for -p, but when the new option showed up the default behavior changed. Ubuntu 10.04 has the old one, Fedora has the new one. Wheee...
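
To pin down the argument order being complained about, here's a small sketch of parsing just the -p form as described above. It's in Python, parse_taskset_p is a made-up name, and treating the mask as hexadecimal (with or without a leading 0x) is my assumption about the util-linux behavior rather than something stated here. It ignores every other option.

```python
def parse_taskset_p(args):
    """Parse the arguments following "taskset" when -p is used.

    Returns (mask, pid); mask is None when only a pid was given,
    meaning "query the current mask".
    """
    if not args or args[0] != "-p":
        raise ValueError("-p must come before the mask and pid")
    rest = args[1:]
    if len(rest) == 1:
        # "taskset -p pid": no mask, just a query.
        return None, int(rest[0])
    if len(rest) == 2:
        # "taskset -p mask pid": the pid is last, not right after -p.
        # Assumption: the mask is hexadecimal, with or without "0x".
        return int(rest[0], 16), int(rest[1])
    raise ValueError("expected: -p [mask] pid")


if __name__ == "__main__":
    print(parse_taskset_p(["-p", "3", "1234"]))
    print(parse_taskset_p(["-p", "1234"]))
```

The awkward part shows up in the code: what the first value after -p means depends on how many values follow it.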

And of course there's no standard, just a magic implementation, all idiosyncrasies of which are the de-facto standard, and it changes from version to version. Like Perl, or Microsoft Word/Excel, or web browsers back in the '90s...

(I'm strongly tempted to make PID a proper option to -p anyway. Grrr.)

Only got a couple hours at Fry's, then it was time to bike home to meet everybody for the Drafthouse zoolander singalong. (Never seen zoolander.) Except when I biked up to Fry's I thought it was at the one on Anderson and I'd meet them there (halfway back from Fry's), and it turns out it's Lakeline and I had to bike home so everybody could drive there. Didn't make it in time (Austin only slopes down towards the river somewhere south of 183: heading there from Fry's it's got significant uphill bits), and may have given myself some variant of heatstroke trying. (I am REALLY out of shape.) So I spent a quiet evening at home, which was lovely.

If I seem a bit incoherent right now... eh, it's been that kind of week.

Luckily I have tomorrow off (the trick is to ask to use your vacation time a month in advance). Maybe I can finally nail down this Fedora issue and get an Aboriginal release out...

August 4, 2012

Went to Texas Linuxfest, near-delirious and kept upright by three energy drinks. I think I met Anthony Liguori, but couldn't find him again and am honestly not sure it was him. (My attention span is... not good right now.)

I did find someone with a account who could sign my pgp key, and a lead on a second, so that's good. I really need to get access to that back so I can catch up on the todo list there.

Got a ride home earlier than I'd planned, went to bed at like 5pm. TIRED.

August 3, 2012

I'm so tired that everything hurts.

Last weekend was armadillocon. Tuesday two of Fade's friends flew in from out of town, for a week. This weekend is Texas LinuxFest in San Antonio, which I really should go to if I want to get keys signed (to get access back) because I haven't made it to a linux conference since 2010.

I haven't gotten an aboriginal release out: worked around the sh4 bug and figured out why arm broke (tldr version: the old kernel code only ever worked on qemu and who uses _that_... posted by a qemu developer to the qemu mailing list). Still need to fix it and fix the fedora-specific breakage. (Oh, powerpc and x86-64 fail on fedora for the same reason: host/target confusion. Yes, building on Fedora tries to run powerpc code on x86-64.)

At the day job I've spent all week walking the new co-worker (who started on Monday to replace Angela) through all the setup he needs to do. I've repeated every piece of it at least 3 times. I should get impatient with him calling me over to fix things that I've never encountered before either but I can figure out in 15 minutes by reading the wiki page he's got open in front of him or opening the Makefile in vi and finding the wrong path the build just died on. But by Thursday I was already a bit tired of watching him cursoring through the command history (for 30 seconds) to avoid having to retype "ls -l". I love teaching, but this instance of it is not remotely rewarding so far. (Not entirely his fault: he's a hardware guy moving into software the same way I'm a software guy who wanders through hardware. We _mostly_ do software here but require knowledge of both, and that's hard to come by. Everybody who does it reasonably well is busy.)

In theory the reason I'm the one doing this is knowledge transfer out, to teach him what Angela told me when she left a month ago. (Domain knowledge about the audio subsystem which I didn't otherwise have. Not that I ever got her test case to work so I could fix the bug she was working on either.)

Meanwhile Xianghua's last day is today, and I've been doing knowledge transfer from him. I have to learn about a DSP we can't build code for (just run potted snippets), and server administration for a magic build machine you set up by rsyncing directories from one of the other magic build machines (and don't upgrade the previous ubuntu LTS to the current ubuntu LTS or everything will break), and don't mind that there are up to four versions of each package under /opt because they're all needed.

There is now a Doom Clock down the hall from my cubicle, a big red digital display counting down the days, hours, minutes and seconds until the board we've been working on (since before I started here) theoretically ships. When I started a year ago, I thought it was a couple months away from being ready. The Doom Clock agrees that it's a couple months away from being ready, but has advanced 2 days since they installed it, so I guess they really mean it this time. (According to the clock the release happens at midnight, which implies we'll be working late that day.)

Did I mention I can't use any vacation time without scheduling it at least a month in advance? I got monday off to hang out more with Fade's friends, but asked for it before Angela left. Every morning I want to call in sick, but usually don't. Yesterday I finished two energy drinks before I even got in the car and _still_ felt exhausted. I've gained 15 pounds since I started this job. Chest pains and headaches are a daily occurrence now. Sometimes I get a little exercise on the weekends, at the expense of programming time. (Weekends are the only time lately where I get to do real work.)

I didn't get the Documentation patches sent to the kernel before the 3.6 merge window closed. Not that it's vitally important they go in during the merge window (I've been feeding them through the -trivial tree anyway), but I thought I had one more day. I want to do major _improvements_ to the kernel Documentation directory, but still haven't got a account set up to submit them through a git tree (everything else has to be cryptographically signed commits which is more or less an equivalent problem of getting a key signed by enough of the _right_ people), and I just haven't had the time anyway. Not even to dig up the stuff I already tried way back when. And I need to get access back to update also. Which is why I need to go to texas linuxfest this weekend instead of hanging out with Fade's visiting friends, or doing open source development, or exercising, or sleeping.

I didn't get the perl removal patches submitted either, but there's a reason for that. I need to rewrite them to remove perl from the _new_ way of doing things, apparently. I haven't had time to _look_ at anything but the article.

And my babysitting duties have kept me in my cube all week (when I wasn't in Angela's old cube with the new guy) rather than the lab I can see sunlight from, which isn't helping matters.

I feel a bit bad about not updating this blog more often, but considering this entry is entirely whining and excuses, I think I'm better off waiting until I have something constructive to report.

July 27, 2012

Rushing to get stuff done before work. Powerpc only breaks on Fedora, builds fine on ubuntu. (I really would like to know _how_, the whole point of host-tools was to avoid that sort of thing. I'm guessing #!/bin/sh which is still an absolute path.)

The sh4 bug is actually a qemu problem triggered by an apparently pointless kernel code change (let's check a register that can never be used in this configuration, just 'cuz we can). Writing it up for the qemu-devel list, and I can do a kernel patch _anyway_ to get something that works with existing qemu releases.

Still haven't fixed the arm thing, or tackled the horrors of Fedora-specific breakage. And this weekend is Armadillocon.

July 24, 2012

Arm versatile in v3.5 is deeply horked. (Yes, again, after the saga of bisecting 3.3->3.4 they broke it again _before_ reverting that breakage. So that's two consecutive releases without a single commit that would boot to a shell prompt on serial console on the versatile board. Bravo.)

I bisected it to:

commit 1bc39ac5dab265b76ce6e20d6c85f900539fd190
Author: Russell King <>
Date:   Sat Mar 10 11:32:34 2012 +0000

    ARM: PCI: versatile: fix PCI interrupt setup

    This is at odds with the documentation in the file; it says pin 1 on
    slots 24,25,26,27 map to IRQs 27,28,29,30, but the function will always
    be entered with slot=0 due to the lack of swizzle function.  Fix this
    function to behave as the comments say, and use the standard PCI

    Signed-off-by: Russell King <>

So the arm maintainer noticed a place where the code didn't match the documentation, changed the code to match the docs, and the result didn't work. But of course.

Unfortunately we can't just reverse this patch in 3.5, because later patches stomped on the context. So let's read:

The patch does two things: change the IRQ number, and change "swizzle" (something to do with PCI bridges). The line to change the PCI number back is trivial, but trying it doesn't fix the problem. Apparently the "swizzle" part is the problem. I have no idea what swizzle is, it used to be NULL and now the field isn't even initialized so we get some default swizzle that's breaking things. To fix this, I need to disable broken default behavior. That's usually only a problem with FSF code...

Right, no time to bang on this now...

July 23, 2012

Toybox 0.4.0 is out. I need to do binaries for it, which means I need to fix aboriginal linux, speaking of which I really really really need to get an aboriginal linux release OUT because the 3.5 kernel dropped yesterday and DUDE am I behind.

How many platforms are broken... four. Sigh.

July 16, 2012

Economics is a strange hobby of mine (spun off of the stock market investment columns I used to write), and I'm finally reading some of Paul Krugman's really _old_ papers. As you'd expect from a Nobel prize winner: they're really good.

Currently reading an autobiographical piece he wrote back in the 90's called How I Work, which has lots of nice quotes. As far as I can tell, economics boils down to crashing toy cars. And that the really fun stuff isn't necessarily some hideously complicated thing you have to build a mathematical gantry crane around in order to reach, but can instead be too simple and obvious for other people to have paid attention to, at least once you look at it right.

Here's the meat of the paper with most of the actual economics stripped out, and just the "what it takes to be a brilliant scientist" parts left:

"My first love was history; I studied little math, picking up what I needed as I went along."

"I found my intellectual feet quite suddenly, in January 1978. Feeling somewhat lost, I paid a visit to my old advisor Rudi Dornbusch. I described several ideas to him, including a vague notion... I had studied in a short course offered by Bob Solow... might have something to do with international trade. Rudi flagged that idea as potentially very interesting indeed; I went home to work on it seriously; and within a few days I realized that I had hold of something that would form the core of my professional life."

"I was, of course, only saying something that critics of conventional theory had been saying for decades. Yet my point was not part of the mainstream of international economics. Why? Because it had never been expressed in nice models... I suddenly realized the remarkable extent to which the methodology of economics creates blind spots. We just don't see what we can't formalize... So there, right at hand, was my mission: to look at things from a slightly different angle, and in so doing to reveal the obvious, things that had been right under our noses all the time."

"The point of my trade models was not particularly startling once one thought about it: economies of scale could be an independent cause of international trade, even in the absence of comparative advantage. This was a new insight to me, but had (as I soon discovered) been pointed out many times before by critics of conventional trade theory... To make the models tractable I had to make obviously unrealistic assumptions. And once I had made those assumptions, the models were trivially simple; writing them up left me no opportunity to display any high-powered technique. So one might have concluded that I was doing nothing very interesting (and that was what some of my colleagues were to tell me over the next few years). Yet what I saw -- and for some reason saw almost immediately -- was that all of these features were virtues, not vices..."

"To get this system or aggregate level description required, of course, accepting the basically silly assumptions of symmetry that underlay the Dixit-Stiglitz and related models. Yet these silly assumptions seemed to let me tell stories that were persuasive, and that could not be told using the hallowed assumptions of the standard competitive model. What I began to realize was that in economics we are always making silly assumptions; it's just that some of them have been made so often that they come to seem natural. And so one should not reject a model as silly until one sees where its assumptions lead."

"Finally, the simplicity of the models may have frustrated my lingering urge to show off the technical skills I had so laboriously acquired in graduate school, but was, I soon realized, central to the enterprise. Trade theorists had failed to address the role of increasing returns, not out of empirical conviction, but because they thought it was too hard to model. How much more effective, then, to show that it could be almost childishly simple?"

"I had some trouble getting that paper published -- receiving the dismissive rejection by a flagship journal (the QJE) that seems to be the fate of every innovation in economics -- but pressed on... What had been a personal quest turned into a movement, as others followed the same path... Our magnum opus, Market Structure and Foreign Trade, served the purpose of making our ideas not only respectable but almost standard: iconoclasm to orthodoxy in seven years."

"Here, perhaps even more than in trade, was a field full of empirical insights, good stories, and obvious practical importance, lying neglected right under our noses because nobody had seen a good way to formalize it."

"Doing geography is hard work; it requires a lot of hard thinking to make the models look trivial, and I am increasingly finding that I need the computer as an aid not just to data analysis but even to theorizing. Yet it is immensely rewarding. For me, the biggest thrill in theory is the moment when your model tells you something that should have been obvious all along, something that you can immediately relate to what you know about the world, and yet which you didn't really appreciate. Geography still has that thrill."

"My work on geography seems, at the time of writing, to be leading me even further afield... So I expect that my basic research project will continue to widen in scope."

"My four basic rules for research:

  1. Listen to the Gentiles (Pay attention to what intelligent people are saying, even if they do not have your customs or speak your analytical language.)
  2. Question the question
  3. Dare to be silly
  4. Simplify, simplify"

"...there was already a sizeable literature criticizing conventional trade theory... Yet all of this intelligent commentary was ignored by mainstream trade theorists -- after all, their critics often seemed to have an imperfect understanding of comparative advantage, and had no coherent models of their own to offer; so why pay attention to them? The result was that the profession overlooked evidence and stories that were right under its nose."

"The same story is repeated in geography. Geographers and regional scientists have amassed a great deal of evidence on the nature and importance of localized external economies, and organized that evidence intelligently if not rigorously. Yet economists have ignored what they had to say, because it comes from people speaking the wrong language."

"I do not mean to say that formal economic analysis is worthless, and that anybody's opinion on economic matters is as good as anyone else's. On the contrary! I am a strong believer in the importance of models, which are to our minds what spear-throwers were to stone age arms: they greatly extend the power and range of our insight. In particular, I have no sympathy for those people who criticize the unrealistic simplifications of model-builders, and imagine that they achieve greater sophistication by avoiding stating their assumptions clearly. The point is to realize that economic models are metaphors, not truth. By all means express your thoughts in models, as pretty as possible... But always remember that you may have gotten the metaphor wrong, and that someone else with a different metaphor may be seeing something that you are missing."

"The same is true in a number of areas in which I have worked. In general, if people in a field have bogged down on questions that seem very hard, it is a good idea to ask whether they are really working on the right questions. Often some other question is not only easier to answer but actually more interesting! (One drawback of this trick is that it often gets people angry. An academic who has spent years on a hard problem is rarely grateful when you suggest that his field can be revived by bypassing it)."

"If you want to publish a paper in economic theory, there is a safe approach: make a conceptually minor but mathematically difficult extension to some familiar model. Because the basic assumptions of the model are already familiar, people will not regard them as strange; because you have done something technically difficult, you will be respected for your demonstration of firepower. Unfortunately, you will not have added much to human knowledge."

"What I found myself doing in the new trade theory was pretty much the opposite. I found myself using assumptions that were unfamiliar, and doing very simple things with them. Doing this requires a lot of self-confidence, because initially people (especially referees) are almost certain not simply to criticize your work but to ridicule it. After all, your assumptions will surely look peculiar... Why, people will ask, should they be interested in a model with such silly assumptions -- especially when there are evidently much smarter young people who demonstrate their quality by solving hard problems?"

"What seems terribly hard for many economists to accept is that all our models involve silly assumptions... The reason for making these assumptions is not that they are reasonable but that they seem to help us produce models that are helpful metaphors for things that we think happen in the real world."

And from about that point on (starting a couple paragraphs before the explanation of "Simplify, simplify") I'd just be quoting his paper verbatim, so go read it.

July 15, 2012

Sigh, so much stuff I wanna do, so little time to do it.

Got od more or less sorted out and checked in. I've got a pending passwd submission to review, and a _desperately_ overdue aboriginal release to bang on. Plus a pile of linux kernel documentation stuff to review and more I need to do myself, and I need to get a git tree up for them because their ongoing efforts to make sure the average age of kernel developers increases one year per year has now moved to "if you don't do git, we don't want you". (Really: a recent patch submission to Documentation/SubmittingPatches removed the instructions for generating a patch using diff. It's git or nothing.)

So of course I spent most of yesterday rearranging the living room and watching Fade play Final Fantasy 13, and feeling a bit unwell because when I'm not at work the fact that I sit in a cubicle all day every day and get ZERO exercise really starts to add up. (In the "I now weigh over 200 pounds for the first time in my life, I wonder if these chest pains are psychosomatic or not" sort of way.)

Today I biked to Fry's (exercise!), and am now working on a script to automatically update toybox's roadmap.html, which is hugely labor intensive to triage manually but if I throw <span id=walrus> tags in there I can chop the command lists out with sed and have the script assemble the global todo and lists. (It can harvest the done list by running the toybox binary, and I can have a blacklist of "it's in defconfig but probably shouldn't be"... which I need to work through.)
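Roughly what I mean by harvesting the spans, sketched in Python instead of sed. The span ids, the sample markup, and the function names are all made up for illustration; this is not the real roadmap.html layout or the real script, just the shape of the idea.

```python
import re

# Hypothetical markup: each command list lives in <span id=NAME>...</span>.
SPAN = re.compile(r'<span id=(\w+)>(.*?)</span>', re.S)

def harvest(html):
    """Return {span id: set of command names} for every tagged span."""
    return {name: set(body.split()) for name, body in SPAN.findall(html)}

def todo(html, done, blacklist=()):
    """Commands named in any span that don't count as done yet, sorted.

    done:      what the binary reports as implemented.
    blacklist: reported as done, but shouldn't count (needs another look).
    """
    wanted = set().union(*harvest(html).values())
    return sorted(wanted - (set(done) - set(blacklist)))
```

Feed it the page text plus the done list from the binary, and the global todo falls out as a set difference. The blacklist just subtracts entries from the done list before comparing.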

July 12, 2012

I'm no longer the only one saying it, but I just realized a new wrinkle.

I've written up my whole "mainframe -> minicomputer -> microcomputer -> smartphone" spiel in lots of places, how the PC is going the way of the minicomputer because nobody needs the computer on their desk if they have a computer in their pocket. Successful tablets are big smartphones, not small PCs. This "java everywhere" bit is just ROM basic all over again; the new platform will probably outgrow it.

I'm on the old Neuros mailing list and those guys are making an audio docking station; I keep meaning to poke them and go "no, you need to make a full USB docking station with a USB hub connected to USB video, USB audio, USB keyboard, USB mouse, and maybe extra hard drives and ethernet, plus an installable app that provides a development chroot". The rest is just software, most importantly a shell prompt, full posix command line (ala toybox), and native toolchain.

The _old_ technology gets "kicked up into the server space". Back when mainframes were invented you stood in front of them waiting for your printout. When everybody got a minicomputer terminal and could type interactively the mainframe was relegated to "batch jobs". (Don't laugh, web transactions via http are batch processing at its finest.) Then when the PC showed up the minicomputer wound up in the "department server" role.

The new thing I just noticed is: there's only ONE server space. When the mainframe and minicomputer both got kicked up into it, they merged together and the minicomputer essentially disappeared. Digital Equipment Corporation ground to a halt and was sold for parts, the Alpha processor is no more. The mainframe model won, and IBM chugged along another 20 years milking it.

Now the PC is getting kicked up into the server space and it's being called "the cloud", I.E. rackmounted servers with the same kind of virtualization support OS/360 had back in the 1970's. The computer you actually interact with is your phone, and these other computers you only ever talk to through your own personal computer. (Sound familiar? Yeah, we've been here before.)

But this time around, IBM is dying. They're dying _fast_. Because there's only one server space, and this time "the cloud" is eating the mainframe.

Me, I'm working on pushing forward the transition of Android to a self-hosting development environment, because I want it to win over the iPhone. I do NOT want to spend the rest of my life dealing with yet another proprietary lock-in platform held in place via the network effects of its dominant market share making it the de-facto standard, and that's what we've got if iPhone wins: Windows put out by somebody competent.

And yes, it has to be Android, it won't be vanilla Linux any time soon, for three reasons. 1) Open Source can't do user interfaces for about the same reason wikipedia can't write a novel, 2) it's too late to the party (5 year headstart is forever in computers), 3) preinstalls matter (GPLv3 spooked all the hardware vendors, Android has a "no GPL in userspace" policy which is rigidly enforced).

And no, they're not equivalent: switching the installed base from Android to Linux is like switching the iPhone installed base to FreeBSD. The whole of userspace is different. Linux's mainstream end-user success finally happened when "No GPL in userspace" excluded every line of code the FSF ever touched, to the surprise of nobody outside the FSF. We _might_ convince google to eventually rebase on Linus's tree if enough of their stuff gets upstream, but userspace has moved on.

July 10, 2012

Pondering making the date of each blog entry a link to the blog entry. The rss feed already has links to each entry (this one would be like so, just day-month-year on the end).

The problem is that the python script that makes the rss feed is kinda crotchety. Originally a friend from Timesys wrote it, except he used classes in python (why?) and I've played with it a bit over the years but never quite finished the non-OO rewrite (an 86 line program does not need inheritance and encapsulation) and I should finish said rewrite instead of patching the old one and getting them further out of sync...

Someday I'd like the script to take this giant html file I edit in vi and populate a directory for each year with files that have next/back buttons and an index. I could even put _titles_ on entries (oooooh).

But then I go "way too much effort for a hand-hacked thing with no comments, I should do a real blog program" except those exist (I'm not administering a server with PHP on it, thank you), and they aren't updated by a file I can edit in vi so I wouldn't use them. (I have a livejournal.)

And thus the problem languishes, because moving away from where I am now opens up massive vistas of scope creep...

July 6, 2012

People keep asking me "what's so bad about GPLv3" and I still don't have a centralized place to point them at. Instead I have a bunch of "death by a thousand cuts" arguments. Here's a few more.

In a comment I wrote a long explanation of why I responded to GPLv3 by switching from GPLv2 to BSD for new code I write. This was sort of a follow-up to an earlier one where some GPLv3 advocate said projects switching from GPLv2 to BSD made GPLv3 stronger because they were "GPLv3 compatible". (You may boggle now.)

I also did a much shorter version of the "C++ is not as good as C" explanation I've previously done here as a three part series.

July 5, 2012

Ten years ago, I proposed a "patch penguin", I.E. a secretarial position under Linus Torvalds to take the integration burden off the guy and let him focus on design stuff (being the architect).

Linus pretty much immediately shot down this idea, implying that "code review" was not separable into integration and architecture portions, because the architect still has to read and understand the code.

Instead, after an epic flamewar Linus started using distributed source control (bitkeeper at the time), which _did_ handle much of the integration burden by allowing patch sets to be marshalled automatically between trees (using nested 3-way merges aware of more history than simply comparing two versions as diff/patch do). He also added a reporting layer of "subsystem maintainers" (I.E. "lieutenants") who fed code to him, which the hundreds of people listed in the MAINTAINERS file would feed code to, so normal developers were now a fourth layer: developer -> maintainer -> lieutenant -> linus (architect).

In general, I've treated that as an instance of "don't ask questions, post errors". People come out of the woodwork to shoot down a _bad_ solution to a problem, ranging from the unstoppable "someone is _wrong_ on the internet" urge to nitpick to putting in the work to come up with a good solution now rather than see a bad one enacted and become entrenched.

Today, the lieutenants are burning out, except linux-kernel is now dealing with it via the invitation-only kernel summit so it's none of my business. (I didn't even find out until the discussion was weeks old.)

The theme seems to be "outside voices are not welcome, we must build the ivory tower higher".

The purest expression of this is probably Peter Anvin's proposed topic, which is "which users can we exclude". The guy who added perl to the kernel build is suddenly worried about complexity, and wants to drop support for old toolchains (like the ones I use in aboriginal linux, the last GPLv2 releases of gcc and binutils).

If that happens, I don't know if I'll dive into llvm/clang or just dig out a bsd kernel and see if I can port the android userspace on top of it. (According to this month's Vanity Fair, iPhone sales are more than everything Microsoft sells combined, and iPhone is BSD based.)

July 2, 2012

Anybody wondering about inequality in this country: in 1963 the top tax rate was 91%. Now 1% of the population makes half the money each year. All the national debt built up since then: money that half didn't pay in taxes. The stagnant wages of everybody _else_ are why so many people lived on credit cards and home equity loans until it all went "tilt" recently.

I really don't know why the "we are the 99%" guys didn't just say "put the top tax rate back to 90%". Want to solve the massive inequality problem? Your first million dollars each year is taxed normally, anything you make above that is 90% tax, and capital gains is income. Problem solved.

(Oh no, some billionaires will move overseas and renounce their US citizenship, like that Facebook creep. How is this a problem? And anybody screaming "I can't live on a million dollars a year!" has some serious budgeting problems.)

Sure, reality is more complicated, but why isn't this one side's _starting_point_? Returning to the status quo of 1963 is too radical to even consider?

July 1, 2012

So this Romney idiot: leaving aside all the other objections to the man (from the way even he doesn't believe anything he's saying to the possibility that he _actually_ got rich laundering money for foreign criminals through offshore bank accounts)... his whole platform is that being rich makes him good at running the economy.


The logic here seems to be that an athlete is the same as a surgeon, so a gold medal means you're qualified to cut somebody open. "Good with bodies." He got rich, therefore he's a superior economist. "Good with money."

Except if Romney got rich by strip mining and clear cutting the corporate landscape, leaving devastation in his wake, burned every bridge and salted the earth behind him, so that nobody else will ever be able to repeat the looting and pillaging that made him wealthy... And he says he can do the same for this country...

Um, sustainability? It's an issue? You become a billionaire by cornering the market, by excluding competitors who want to compete with you and drive down the price. Even the _good_ ones: Steve Jobs' first act upon returning to Apple was to kill off Power Computing and start suing everybody over patents. You become a billionaire by reducing consumer choice and in extreme cases forcing people to buy things they don't want. (My last 3 computers came with Windows preinstalled. I wiped it with Linux before ever booting into any of them, but Microsoft still got paid.)

Is that really what you want from a president? "Dear Rich people: your taxes are now zero. Dear poor people, we're eliminating bankruptcy and replacing it with victorian workhouses. Market: cornered." The "win at all costs" guy who was ejected from the game for taking steroids wants to become the referee now?

I'm not seeing the appeal of this message.

June 28, 2012

Posted a long rant to my old livejournal, because it didn't really fit here.

Possibly I could dig out some of my old "article" style posts and put them somewhere other than The Blog Nobody Ever Reads, but I did an explanation of why Oracle is doomed back in 2010 and then forgot to ever link it anywhere until yesterday...

June 27, 2012

Of course, Seasonal Affective Disorder. Duh.

That's one of the reasons I moved back to Austin from Pittsburgh in 2007. Put me in perpetual gloom long enough and I get a permanent case of jetlag. (Yeah, I saw a doctor about it on Timesys' health insurance plan. He wanted to put me on prozac. No thanks.)

So of course a month or two after moving into the new building at Polycom, my productivity went into the toilet. I'd sit down in my cubicle and 15 minutes later I was falling asleep and totally unable to concentrate despite upwards of 4 energy drinks a day and all the free diet mountain dew I could exhaust the break room's supply of (but there's another break room down the hall, and when _that_ one's empty there's another off towards the elevators).

You can't see a window from my cubicle (I'm the 4th row from the window and you can't even see the next aisle), and all the cubicles are under indirect lighting, where fluorescent lights point at the ceiling so everything we get is diffuse reflections of a greyish surface.

I feel like an idiot for taking this long to figure it out. I dug the clutter off the corner of a desk that has a clicky lamp and moved my netbook under that, and I feel so much better already...

Sheesh, that's probably why I'm only getting open source programming done on weekends these days. After 8 hours in a gloom cube, I can't think straight even when I get home, and by the time I'm starting to recover in the morning it's back to work...

(Yeah I've done gloom cubes before at various contracting jobs, but not for this long. It takes a while for my internal clock to lose tracking and start seriously flashing 12:00.)

June 25, 2012

The remaining busybox commands used by the aboriginal build with toybox 0.3.1 (which is out, by the way):

ash awk bunzip2 busybox bzip2 comm cp cpio cut dd diff dnsdomainname egrep expr fgrep find ftpd ftpget ftpput grep gunzip gzip hostname ifconfig init install less losetup lspci man mount mv od pgrep ping pkill readlink rm route sed split stat switch_root tar test time touch tr umount vi wget zcat

There's definitely some low-hanging fruit in there. Especially since I already implemented cp and just need to fix it for the dirtree changes, and did the busybox mount/umount/losetup implementations once upon a time...

But first, off to work to _not_ do anything about this for 8 hours, plus an hour's commute...

June 24, 2012

5 hour walking/shopping chunk out of yesterday with a friend. Didn't get nearly as much programming done as I wanted, but oh well. New laptop's closer to set up, anyway.

Finally got chgrp/chown rewritten with -hHLP support. Checked in. I need to fix toys/cp.c for the new dirtree, which isn't a huge amount of work, but I'm also trying to do more of a "release early release often" sort of thing. Holding up the release of the next low hanging fruit thing is how I wound up with such a huge gap between releases _last_ time...

Still, regression testing against at least the aboriginal linux i686 build is kinda important. Kicking that off now...

June 23, 2012

I need to spend a weekend on toybox infrastructure. Here's _just_ the stuff that affects chgrp/chown:

The options can of worms is I want to have a block of options appended to a base usage line and description. The -fvhRHLP options are the same between chgrp/chown, so repeating them is silly but each should have its own help text. The "this is an option to X" stuff that calculates that should actually be the "depends" line (depends CHGRP | CHOWN) rather than a magic CHGRP_OPTION name which isn't as flexible.

Sigh. I could so easily make a full-time job out of this. Except for the "getting paid" part.

Start with extending dirtree_read(), I guess...

June 22, 2012

Friday night, which means I get to do real work again.

I'm trying to merge chgrp and chown, since chown has a user:group syntax and thus chgrp is a subset of chown. And of course this means I got sidetracked to adding -HLP support to chgrp (the standard wants it), and I'm hitting an old problem again: The dirtree stuff doesn't have a place to store global data about the traversal.

The callback has a return code, which performs a similar function. The return code indicates whether or not to recurse into subdirectories, whether or not to call the callback again after dealing with all children (thus allowing depth-first operations), and whether or not to follow symlinks when statting children.
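To make the "return code as instructions to the walker" idea concrete, here's a toy version in Python walking nested dicts instead of a real filesystem. The flag names and the walk() function are invented for this sketch (they are not the dirtree API), and since a dict has no symlinks the follow flag is only declared, never acted on.

```python
from enum import IntFlag

class Walk(IntFlag):  # hypothetical names, not the real dirtree constants
    RECURSE = 1       # descend into this node's children
    COMEAGAIN = 2     # call back a second time after the children are done
    SYMFOLLOW = 4     # would mean stat() not lstat() on children; unused here

def walk(name, tree, callback):
    """tree is a nested dict for a directory, None for a plain file."""
    flags = callback(name, False)          # first visit, before any children
    children = tree if isinstance(tree, dict) else {}
    if flags & Walk.RECURSE:
        for child, sub in children.items():
            walk(name + "/" + child, sub, callback)
    if flags & Walk.COMEAGAIN:
        callback(name, True)               # second visit, depth-first style
```

Note what's missing: the callback only ever sees the current node, so there's nowhere to park a decision made at the top of the tree. Which is exactly the problem.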

That last one's a bit problematic: the stat() vs lstat() question is already resolved for the _current_ node before the callback gets called, and for chgrp() that's what -H -L and -P deal with. (And -h.) I added an extra argument to dirtree_add_node() indicating which stat it should use, but that's called at the top of the tree from dirtree_read() (which always feeds it 0), and then the callback doesn't get passed _in_ which way it was done. This isn't information that naturally propagates down the tree, and adding more arguments makes the recursion eat more stack. The callback() doesn't call dirtree_recurse() precisely _because_ doing so would eat more stack...

If there was a natural place to stick it, the dirtree code could check that, but it only looks at the current node. There's no "global state for this whole traversal". Even if I wanted to hijack the root node's ->extra (instead of leaving it for users of the infrastructure), following ->parent up to the root node is a non-constant-time operation.

Meanwhile, what's the behavior of chgrp? The standard is weird, so running some experiments: chgrp by itself: follows symlink to dir or to file given directly on the command line. With -h, modifies symlink. -R modifies each symlink instead of following it, and does not follow symlinks for recursion. (So -R implies -h, and adding -h does not change the behavior.)

-HLP are ignored without -R. -RH modifies the target of symlinks (both directory and file, on the command line and encountered recursively) but does not recurse into them. -RL modifies the target of symlinks _and_ traverses through them. And -RP is the same as straight R, I.E. -P acts like -h: modify symlink instead of target, do not recurse through them.

Ok: chgrp default, change target of symlink. -h change symlink. -R change symlink, don't follow symlink. -RH change target but don't follow when recursing. -RL change target and follow. -RP is the same as -R (presumably undoing a previous -H or -L to restore default value).
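The same table as a little Python function, mostly so I can stare at it. The function name and the tuple it returns are made up for this sketch. Two assumptions that go beyond the experiments above: when more than one of -H/-L/-P is given the last one wins, and -h combined with -RH or -RL (which I didn't test) is simply ignored.

```python
def symlink_policy(flags):
    """Map a chgrp flag string like "RH" to
    (change_the_link_itself, recurse_through_symlinks)."""
    if "R" not in flags:
        # -H, -L and -P are ignored without -R; only -h matters.
        return ("h" in flags, False)
    mode = "P"                 # plain -R behaves like -RP
    for f in flags:
        if f in "HLP":
            mode = f           # assumption: last of -H/-L/-P wins
    return {
        "H": (False, False),   # change target, don't recurse through links
        "L": (False, True),    # change target and recurse through links
        "P": (True, False),    # change the link, don't recurse through it
    }[mode]
```

So symlink_policy("RP") and symlink_policy("R") come out identical, and "L" without "R" is the same as no flags at all.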

June 19, 2012

Not enough hours in the day.

Up late last night getting the new netbook working, and then didn't plug in the old one (cable was loose) as I left files copying overnight, so it only did the first few gigabytes before the battery ran out. Overslept so I'll be at work late giving 'em 8 hours, and then I drive to a friend's house after work.

On the bright side, the new netbook might actually have the darn processor extensions kvm needs. Dunno yet, but if so I could debug the darn Fedora glitch without needing access to my server. That'd be nice. (Tried to do it under qemu, it's just waaaay too slow. And the old netbook is a bit memory constrained to run virtualized Fedora without closing every other program I'm running...)

June 18, 2012

The saga of attempting to install Ubuntu 12.04 "Rancid Roadkill" on an Acer Aspire One 722 continues. My work machine has 12.04 installed on it, so I stayed late to burn two USB boot sticks there, one with the desktop install and one with the advanced install. The advanced stick isn't recognized as bootable, but the desktop one booted, and actually installed the system! (Newer syslinux, apparently.)

The result is a brick. Syslinux would boot from the USB stick, but apparently Grub isn't recognized as bootable.

I thought about keeping a windows 7 partition on there (in case of brick) but the xubuntu install widget that gave me two draggable partitions so I could say how much size to give each one DID NOT LABEL EITHER ONE, so I didn't know how much of the space would be for windows and how much for Linux. And it let me go to 10 megs on either one so obviously it wasn't actually referring to the existing contents to determine how much space it needed.

So I told it to wipe windows, and it did, and now it won't boot from the hard drive at all. Strangely, now when I boot the install USB stick it boots to xubuntu with the root filesystem running on the hard drive. (I get no choice in this, it _can't_ boot from the usb root filesystem anymore, it falls straight through to booting the ubuntu that's on there without asking. Great install image, guys!) So I sort of have a workaround, but a really sucky one.

I'm now upgrading the software packages in the vain hope it might help. As far as I can tell, it hung trying to upgrade the flash package. (It's been sitting at the "downloading flash tarball" thing for twenty minutes with no progress indicator. I had to open the "details" view to see that much.)

If you wonder why Linux on the Desktop has utterly failed to take over the world, I get tempted to buy a mac every time I buy new hardware. (And don't tell me it's Acer's fault that the usb boot stick jumped straight to the installed root filesystem, and that ubuntu's package upgrade procedure hung.)

Oh, and the xfce terminal in Rancid Roadkill now has a grey background instead of a black one, and no obvious way to switch it to black except by copying the RGB values for all 16 colors out of a 10.04 system. Note that upgrader's built in terminal window still uses a black background. This is an aesthetic change forced down my throat, and I hate it, but that's not ubuntu's fault. That's showing that our pervasive usability failures have depth and variety.

Ah. So it installed grub to the USB stick, not to the hard drive. Of course it did. And now the upgrade program is trying to upgrade grub ON THE USB STICK. (Which I removed after booting, and it's complaining grub was installed to a disk that is no longer present, which is how I found out about this. Shoulda guessed.)

And of course current grub/ubuntu no longer provides a human readable config file, instead /boot/grub is a directory with 207 files in it. Sigh. Read read read... ok, maybe (as root) "grub-mkconfig > /boot/grub/grub.cfg; grub-install /dev/sda" does it? Possibly?


Ok, update the packages. Again. I told it to update while installing (and it claimed to do so), then it wanted to update another hundred and change, and now it wants to update 6 more _after_ the reboot. And now apt-get crashed and it wants me to send a system report.

All this is self-inflicted. Honestly. Other than the usual half-baked driver support for the hardware (needs a kernel newer than any distro ships, hardware's already on clearance), the rest of this is just plain _bugs_ in the distro.

Oh, and the way to SHUT UP the obnoxiously loud power beep when you plug or unplug the AC adapter is "amixer -c 1 sset Beep 0", which I think I'll have to stick in whatever /etc/rc.local is called this week. (And yes that's sset not set, because it's a "simple" control. And yes Beep is probably case sensitive. I tried stracing this to bypass the tool but it's ioctl() calls on a /dev node.)

I'd set it copying files overnight, but the reason for the update crash turns out to be that the wireless driver ate itself (dmesg has a bunch of "could not kill baseband RX" and similar messages, no idea what they mean. At least it's not the iwlagn driver the last one had, so I can stop using my "modprobe -r iwlagn && sleep 5 && modprobe iwlagn" script every fifteen minutes). And yes, I did the "move the netboot to the front of the boot device order so the bios initializes the hardware properly before linux loads" trick, doesn't seem to have quite worked.

Eh, at least this time around the driver for the _wired_ internet works. Plug it in and copy the files that way...

So if you're wondering why I didn't get any toybox or aboriginal work done this evening, now you know.

Oh, in day-job-land another co-worker in my department gave two weeks notice today.

June 17, 2012

Took Fade to finally see Avengers, plus a lot of shopping.

Meant to do toybox/aboriginal release work this evening, but instead I bought a new Acer Aspire One 722. It was on clearance at Staples, and comes with 4 gigs of ram upgradeable to 8, and memory is a serious bottleneck in my older netbook: 2 gigs is all an atom can do. This one has some AMD processor in it that's slower but apparently doesn't drop the battery life, so it's probably a wash. The form factor's a bit bigger but still reasonably petite. I'm a little worried the battery life might be down a bit because it's a floor model, but then it was never running off battery and modern charging circuitry shouldn't overcharge, so...?

I then spent the rest of the day arguing with the thing trying (and failing) to get Linux installed on it. The "desktop install" ISO of the new Xubuntu LTS (I think we're up to "Rabid Hagfish"?) gave me the splash screen and then never switched out of it (I gave it half an hour). The "alternate install" is even worse: unfortunately SYSLINUX jumps out of text mode into some bitmap mode that's not supported by the graphics hardware, so I get a black screen with a weird little grey rectangle in the lower left corner. Why SYSLINUX feels the need to jump to graphics mode, I couldn't tell you. (It's a bootloader. It has a job that used to be done by a single 512 byte sector, but it's gotten so fancy it can't do that job.)

Googling says I have to do some weird bios setting to avoid the wireless driver locking up, but that shouldn't prevent me from getting a desktop. Alas there are a bunch of dash numbers below the model number (my old one is an acer aspire one d255e-1082 and you have to _hunt_ for the 1802: it's on the serial number sticker on the bottom of the box.) This one's 722-0825, which could easily be completely different hardware from the ones all these web pages are about.

Linux on the desktop: smell the usability. And still impossible to find preinstalled unless you mail order something sight unseen from guam. (They made me wait while they wiped their demo software off the floor model and put Windows 7 back on. Store policy.)

June 16, 2012

I'm frustrated by the new IPPROTO_ICMP sockets. They look like the perfect thing to implement "ping" with, except for:

In order not to increase the kernel's attack surface, the new functionality is disabled by default, but is enabled at bootup by supporting Linux distributions, optionally with restriction to a group or a group range (see below).

Why the HELL would anybody do that? You must _as_root_ run an enabling step before anybody can use this new API. But having an suid root ping binary has been SOP for 30 years. This API _DOESN'T_EVEN_WORK_FOR_ROOT_ by default, which is epic brain damage. Why was this merged?

Because of this, I have to implement ping using the old "requires the suid bit" functionality. This renders the new API _utterly_useless_. (Grab a static binary and throw it on my system, oh why can't I ping? Lemme try setting the suid bit... no, that didn't work.)
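For what it's worth, checking whether a given box has the new API switched on is short. This is a minimal sketch in Python, assuming a Linux kernel with ping sockets compiled in; the function name is made up for illustration. The gate is the net.ipv4.ping_group_range sysctl, which defaults to an empty range ("1 0"), so on an unconfigured system the socket() call is refused even for root.

```python
import socket


def unprivileged_ping_socket_available():
    """Return True if this process may open a datagram ICMP ("ping") socket."""
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                             socket.IPPROTO_ICMP)
    except OSError:
        # Typically EACCES when our group is outside ping_group_range,
        # or EPROTONOSUPPORT when the kernel lacks ping sockets entirely.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    if unprivileged_ping_socket_available():
        print("ping sockets enabled for this user")
    else:
        print("ping sockets unavailable, falling back to raw sockets (needs root)")
```

A ping implementation that wants to use the new API still has to carry the old raw socket code as a fallback, which is the complaint above.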

Dear people afraid of root: you are not increasing the security of systems, you are decreasing it. You're just moving the problems to areas fewer people understand, so they don't SEE them as easily.

Adding weird per-syscall sandboxing also undermines the containers work. Bravo, guys.

June 15, 2012

Friday evening, finally some time to work.

Dear posix standards committee: with regard to the -h, -H, -L, and -P options to chgrp, what were you smoking?

I'm trying to write up help text for what these options do, and you just can't explain it in one line. Because it's CRAZY. My first attempt was:

-h    change symlink instead of target
-H    follow symlinks listed on command line
-L    follow all symlinks
-P    do not follow symlinks

This is not what they do. I'm tempted to IMPLEMENT the above anyway just because it's sane, but testing the gnu/dammit version it does indeed follow the weird craziness of the standard. Bits of it, anyway.

In the standard, -H, -L, and -P are ignored unless you specify -R. That means "mkdir walrus; ln -s walrus walroid; chgrp root -P walroid" the -P is a NOP and the directory gets changed instead of the symlink, but "chgrp root -PR walroid" changes the symlink but not the target, and then doesn't recurse into the directory. In theory -h does what -PR does, but it says it only affects "file" arguments. Does that mean any argument other than the one that specifies the group name to change to, or does that mean non-directory?

I thought maybe -P existed because -h wouldn't affect recursion, so if I have walroid->walrus and then walrus/poing it would change walroid and then recurse into walrus to change poing. But no: with -h the -R doesn't recurse, because a symlink isn't a directory. So -P has NO REASON TO EXIST.
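The "a symlink isn't a directory" part is easy to see with stat() versus lstat(). A minimal sketch in Python using the walrus/walroid names from above; would_recurse() and demo() are hypothetical helpers, not anything out of chgrp itself, and this only models the "is it a directory?" test that decides recursion.

```python
import os
import stat
import tempfile


def would_recurse(path, follow_symlinks):
    """Return True if a recursive walk would descend into path."""
    # stat() looks at the symlink's target, lstat() at the symlink itself.
    info = os.stat(path) if follow_symlinks else os.lstat(path)
    return stat.S_ISDIR(info.st_mode)


def demo():
    """Build walroid -> walrus and ask both questions about walroid."""
    with tempfile.TemporaryDirectory() as top:
        walrus = os.path.join(top, "walrus")
        walroid = os.path.join(top, "walroid")
        os.mkdir(walrus)
        os.symlink("walrus", walroid)
        return would_recurse(walroid, True), would_recurse(walroid, False)


if __name__ == "__main__":
    print(demo())
```

Following the link, walroid looks like a directory and recursion happens; not following it, walroid is just a symlink, so there is nothing to recurse into.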

Implementing a given behavior is usually fairly straightforward. Figuring out what the standard-specified behavior actually is (and why anybody would ever want to do that) is the hard part.

Oh well, it still beats logger. That one they just punted ("The logger utility saves a message, in an unspecified manner and format") and then spent most of the rationale section defensively not-quite-apologizing for it. (No really! Go read it. I'd hate to have sat through those standards committee meetings without protective gear and possibly air support.)

June 14, 2012

Broke down and did a repository for qcc (and even a mailing list, but no posts to it yet. Not even notes-to-self.) No web page yet, but I suppose it's only a matter of time.

Last month when I poked Fabrice Bellard about hiring him to glue tcg to my old tcc fork, he said he wasn't interested but _did_ say he'd be ok with relicensing the code as BSD. (I didn't quote that part of his email.) I started doing triage on my old repository to identify commits by people other than him, and came up with a reasonable list (the numbers are which hg commit out of my tree):

223, 227: Peter Lund
250, 258, 259: Dave Long
268, 270: Fabrice relicenses to LGPL
281, 282: mauro persano
285: romain francoise
295, 306, 307: Daniel Glockner
305, 308, 309: TK
348: Jon Griffiths
352: Bernhard Fischer
354, 364, 366, 368, 370, 372, 377: Grischka
384, 385, 386, 387, 388, 389: Filip Navara
394: Tao Wu

Those first three lines are either Fabrice switching to the current license or before that (so he already had permission to relicense that code). The commits after that are all the ones until my tree forked off. I need to either get permission from their authors for 2-clause BSD or else remove them from my tree.

Unfortunately, about this point I noticed that the tinycc mailing list archive the website points to was down, so I threw the whole mess on my todo list and decided to come back later.

Today I noticed that I'd missed the 3 year anniversary of the last release of tcc, and checked the mailing list archive. It's still down.

I don't expect to put much if any time into this before toybox 1.0, but eh. Chryasora on twitter pointed out that putting a little regular time into things is necessary to get anything done, and she's got a point.

I think I'm going to ignore the C11!!one! standard, though. Adding asserts to the base language and making u8 a keyword (_that's_ not going to break any existing code) so you can explicitly specify utf8 strings (which work just fine when you don't do this... what the? All you have to do is be 8-bit clean and it "just works") aren't really striking me as _compelling_ features, so far.

June 13, 2012

Up waaaay too late diagnosing the ls bug, which works fine on x86-64, gives (null) on x86-32, and crashes on arm. Turns out it's printf treating various stat fields as integers, when they're size_t and similar. This coincidentally works because x86-64 aligns at least some stack entries to 64 bits and is little endian.

June 12, 2012

Sigh. My work week is just a big blur; day job converts hours of my life into money and insurance, but I honestly can't get excited about what I'm doing.

In part I haven't been able to concentrate _at_ work since they moved us to cubicles. It's this pathological combination of isolation without privacy: I can't see a window but I can hear every conversation in fifty feet and somebody walks by the open front of my cubicle every couple minutes.

Video chat is built into kde now, and it's implementing a standard that communicates with google talk (and thus presumably google hangout). Of course Google Talk is built into Android, supporting video chat in 2.3.6 and newer (Nexus S and up). Facetime is built into the current iPhone and ipad 2 (that's why they added forward-facing cameras to the newer models), plus third-party apps like CanFrog and Fuze. Facebook's got video chat. Skype just refreshed its Linux client (despite being owned by Microsoft and Linux being a rounding error in that space).

It's not that we have competition, it's that this company's management doesn't seem to be _aware_ of what's going on in this niche. They keep mentioning what Cisco and Microsoft are doing, never what Google and Apple are doing.

It's really hard to get enthused in that context. Apparently I'm not the only one, either: the co-worker whose office I shared before the move just gave his two weeks' notice.

June 9, 2012

Biked to chick-fil-a. (It's been a while. Yay exercise.)

Ok, posted my toybox todo list for the release yesterday, let's see...

First up is ls, which needs -L. That's what's blocking Aboriginal Linux from building with defconfig toybox, and that's really my "good enough to cut a release" regression test. (The rest of the todo items can be in future releases. Since I'd like to have the suckers more often _anyway_, holding up a release to do more work is counterproductive.)

While I'm fiddling with ls: implemented -qutskrno. (check that in). Implemented DIRTREE_SYMFOLLOW and -cSHL, check _that_ in.

I note that specifying -C and -l together can segfault the ls command at the moment, because the optargs groups stuff isn't in yet (so [-Cl] can switch off C when you say -l and vice versa).

Ok, now I need to plug that into the Aboriginal build and see what breaks...

June 8, 2012

Finally got a character to the end of Act III in Diablo III, where they sacrificed your plucky sidekick to a demon in a cutscene the protagonist was irrelevant to, and then Act IV wasn't about getting her back, or going after the woman who betrayed her, but was instead railroading the plot to go save a bunch of asshole angels who'd previously voted to destroy the world _anyway_. Who the hell would consider _that_ a priority?

Turned it off as soon as the cutscene ended, and am never giving Blizzard any money ever again, and yes that includes WoW and similar.

On the bright side, less distraction from getting a toybox release out. (Got nothing done this week, work made me sit in a cubicle. The rest of tonight's a My Little Pony marathon finally getting to the end of season 1.)

June 6, 2012

Fade needed the car to take the cat to the oncologist yesterday morning. (Aubrey's had lumps surgically removed twice, third time they came back everywhere. Wasn't expecting it to end well, the vet says 3-4 months.)

So I drove to work with Fade and Aubrey in the car, and then used this as an excuse to get some exercise and walked home from work yesterday afternoon. Turns out it's about 15 miles. Got home at 3am. Sort of in pain this morning. (I'm out of shape.)

May 31, 2012

I'm really starting to hate the FSF twits.

Because responding to a situation by writing new code is against the core ethos of the FSF. Obviously the only reason anyone would ever write an alternative to code that exists is because they had it in for the original. So gcc support for Solaris in the 1980's was intended to do nothing more than undercut Sun Microsystems' attempt to sell its compiler, and for no other reason.

Honestly, I'm starting to believe that copyright and patent law are both so broken we need to throw them out and start over. Last year the RIAA demanded more money than actually exists in the whole world as damages in a single lawsuit. _ALL_ the legal rulings I root for these days are ones where some IP thing is declared _not_ enforceable, such as the recent dual defeat of Oratroll on both patents and copyrights.

Think about the xbox mod chippers. They didn't need anybody's IP to reverse engineer the machine, but they were _stopped_ by lawsuits saying that such reverse engineering was illegal. Which side are we on here?

I strongly suspect that lawsuits over linux will make it the norm for phones to load their OS from the network when they boot. If that seems unlikely, note that Diablo III is built on the World of Warcraft engine, so you log into a server to play the game. In this case I suspect Blizzard just had more WoW developers lying around than standard game developers and repurposed the engine they were most familiar with, but the end result is that the game lives on their servers, not on the player's home machine. A phone OS could be just a bootloader that loads the rest from the network when you power it on, in which case having the source code to the open source bits does not let us _install_ anything.

It's a bit like the way 80% of all antibiotics sold in the US are given to livestock. Back when we'd just discovered penicillin but hadn't yet discovered antibiotic resistance that may have seemed like a good idea, but the reality was we blunted our tools by clobbering everybody with it until we'd taught the ecosystem how to fight back. It was really, really stupid. What the SFC is doing is also really, really stupid.

I am so sorry I ever opened that can of worms. I did it to get code, and stopped when I confirmed there _wasn't_ any. These guys are doing it out of pride and arrogance (and because they made a business model out of it that needs to be fed fresh victims), and I really hope they don't drive Android to rebase on a BSD kernel like Apple did. (The FSF has already doomed gcc, it just hasn't stopped twitching yet. This time EGCS is called LLVM, and it's not going to allow itself to be recaptured.)

Luckily, it's pretty clear that in the long run copyright and patent are toast. The internet is doing to them what the printing press did to the Catholic church's ability to burn people at the stake for translating the bible into english (or other languages people could actually read). Copyrights arose because regulating printing presses was different from regulating readers. We don't have laws about what you can read because they're unenforceable, but we do have laws about what you can print because they were enforceable. Now, reading and printing are pretty much the same thing, and the law will take a while to catch up.

But if I can see where it's going today, there's no reason not to help it along. Copyright on digital information is _silly_. Up until 1983 you couldn't copyright a number, and it turns out changing that was a _mistake_.

We should focus on attribution. Claiming credit for the work of others is plagiarism. That's NOT the same thing as copyright.

Which gets back to the other reason the FSF twit above is wrong: Tim didn't write toybox, I did. They prefer to blame sony so they can feel properly victimized, because _me_ doing it leaves them with no argument.

May 30, 2012

Spent a couple days doing more clicky game (diablo 3) than programming. (It's been a while since I had time off from work, I kinda needed it.)

So the build break using toybox defconfig in aboriginal linux is due to ls claiming to support -L and ignoring it. Obvious solution: fill in the giant option set of ls that the standard requires.

I redid the help text to group the options into (so far) "what to show", "output formats", and "sorting". The fiddliest options in terms of new infrastructure required are actually -C and -x. Figuring out how many columns to use in a way that scales well to large numbers of files is slightly nonobvious, and then navigating said columns in both directions in ways that won't cause integer overflows at large sizes is again a bit tricky, but I think I've got 'em both sorted.

The current problem is that for vertical sort (-C) I want to make the _right_ edge ragged, not the bottom edge. If you're sorting into vertical columns, then the short column logically should be along the right edge, not along the bottom. This turns out to be hard to do given the scanning I use for -x (figure out the max possible columns and then loop through seeing if they fit, and drop down one each time if they don't).

The problem is that at a given alignment, the number of "leftover" entries might eat a whole column. For example, with 36 entries trying 11 columns, the height at the left edge is 4 (36/11 rounding up). 4*11 is 44, and 44-36 is 8, meaning an even 2 columns at the right edge vanish. So you _can't_ vertically sort into 11 columns, you have to vertically sort into 9 columns, and then only if it fits.
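The arithmetic in that example can be written down directly. This is a rough sketch in Python, not the toybox code: vertical_columns() and fit_vertical() are hypothetical names, and the two-space gap between columns is an assumption. It tries column counts from widest to narrowest (the same scan described for -x) and skips any requested count that a column-major fill can't actually produce.

```python
def vertical_columns(entries, requested_columns):
    """Return (rows, columns actually used) when filling column-major."""
    rows = -(-entries // requested_columns)  # ceiling division
    used = -(-entries // rows)               # trailing empty columns vanish
    return rows, used


def fit_vertical(widths, screen_width, gap=2):
    """Return (rows, columns) for the widest column-major layout that fits."""
    count = len(widths)
    if count == 0:
        return 0, 0
    for requested in range(count, 0, -1):
        rows, used = vertical_columns(count, requested)
        if used != requested:
            continue  # this column count is unreachable, e.g. 11 for 36
        # Column c holds entries c*rows through c*rows + rows - 1.
        total = sum(max(widths[c * rows:(c + 1) * rows]) + gap
                    for c in range(used)) - gap
        if total <= screen_width:
            return rows, used
    return count, 1  # nothing fits: fall back to a single column


if __name__ == "__main__":
    print(vertical_columns(36, 11))  # the example above: 4 rows, 9 columns
```

The short column ends up on the right edge because every column except the last is filled to the full height before the next one starts.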

Really what the logic wants to do is iterate over rows instead of columns for vertical sort, but that doesn't mesh well with the way the rest of ls works.

None of this is _hard_, it's just fiddly...

May 27, 2012

Yanking ccache out of the $PATH on Fedora didn't fix the build break. Hmmm...

Got chgrp checked in, and now I'm trying to test it in Aboriginal. It's slow going: something broke and it's hard to tell what. For one thing, the hosttools build breaks (in distcc's ./configure stage), and the wrapper is _fiddly_ to insert into host-tools because the thing spends its entire time building a new context, and doesn't exit and re-enter which would set up fresh paths with the existing detection logic.

Also, the toybox build wants to use "strip" (to make toybox from toybox_unstripped), and I specifically excluded that from the wrapped $PATH to make sure nothing in the native build tries to use the host strip instead of the target strip. (So it's ok building it before the $PATH relocates, but not after.)

The way I'm testing toybox, it removes the appropriate busybox functionality as it goes, by searching for a "toybox" comment in baseconfig-busybox and truncating the file at that point. Turns out some stuff got added to the end of the file that has nothing to do with toybox (CONFIG_LONGOPTS and a couple of flags), so that should move up anyway, and then for testing I want to move the comment to the end of the file so I can select busybox or toybox implementations at runtime before doing the Linux From Scratch build, like so:

CHROOT=blah-i686 more/ i686 ../control-images/build/lfs-bootstrap

And then in there, do:

cd /bin; for i in $(busybox --list); do ln -sf busybox $i; done; X=74; for i in $(toybox); do ln -sf toybox $i; echo $i; X=$(($X-1)); [ $X -eq 0 ] && break; done; cd; /mnt/init

Adjusting the "X=74" to a smaller number, bisecting down to the first command that breaks the build. An advantage of this approach is when you screw up your build environment, just "exit", rm -rf blah-i686, and then rerun the chroot-splice command (via cursor up command line history).
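The hand bisection over X is an ordinary binary search. A minimal sketch in Python, assuming a build that passes with zero toybox commands installed and keeps failing once the first bad command is included; first_bad_count() and build_ok() are hypothetical names, with build_ok(n) standing in for "rerun the chroot build with X=n and see if it completes".

```python
def first_bad_count(total, build_ok):
    """Return the smallest N in 1..total where build_ok(N) fails, else None."""
    if build_ok(total):
        return None  # every command works, nothing to bisect
    low, high = 0, total  # invariant: low passes (0 = all busybox), high fails
    while high - low > 1:
        mid = (low + high) // 2
        if build_ok(mid):
            low = mid
        else:
            high = mid
    return high


if __name__ == "__main__":
    # Pretend the 40th toybox command is the first one that breaks the build.
    print(first_bad_count(74, lambda n: n < 40))
```

With 74 commands that's at most seven builds. If more than one command is broken this only finds the first, so the usual trick is to swap that one back to busybox and run it again.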

The first bad command turns out to be "ls", which is also what was breaking host-tools. The autoconf boilerplate does an "ls -Lt" test, falling back to "ls -t" if the command doesn't support -L. (So why does it bother with the ls -Lt one when it doesn't actually _need_ it... Sigh.)

Except toybox happily accepts ls -L, and ignores it, because it's unimplemented but in the option string (which contains some todo items). The symptom is configure saying the build environment isn't sane because newly created files are newer than distributed files, which has NOTHING to do with the actual failure and what's breaking is the test, not the build. That's autoconf for you.

Once I chopped out ls (ala "ln -sf busybox /bin/ls"), the rest of the LFS build went much farther until unshare overwrote the toybox binary. Sigh. Bad installers follow symlinks to overwrite the file they point to, busybox has been on the receiving end of this before, now it's toybox's turn. You can tell because A) toybox is bigger than 9k, B) when you type "cat" you get the usage message for the unshare command. And it's not the only one, if I delete /bin/unshare before running the build, I get another command that refuses to identify itself overwriting toybox.

To diagnose _this_, replace the "ln -sf toybox $i" above with "cp toybox BLAH$X; ln -sf BLAH$X $i", so each toybox command has an independent binary to overwrite. If I do this, the i686 Linux From Scratch build _completes_ using 73 of the 74 toybox commands (everything but ls, which for some reason is giving a NULL date built against uClibc anyway). Yay! I can then list the resulting /bin directory to find out that the 5 overwritten commands are "unshare dmesg setsid clear cal", 4 of which are in util-linux and the other (clear) is in ncurses.

Both packages use a giant "install-sh" script from the X consortium, copyright 1994, over 500 lines long. It jumps through ENORMOUS hoops to try to perform a portable install, and it fails to do what "mkdir -p" and "mv" can. It is _elaborately_ wrong, which is what defensive programming always snowballs into.
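The failure mode is small enough to reproduce in a temp directory. This is a sketch in Python rather than a reading of install-sh: the two installer functions are hypothetical stand-ins for "open the destination and write to it" versus "write a new file and rename it into place" (which is what mv does within one filesystem).

```python
import os
import tempfile


def install_through_symlink(dest, payload):
    """Bad installer: open() follows the symlink and overwrites its target."""
    with open(dest, "wb") as out:
        out.write(payload)


def install_by_rename(dest, payload):
    """Write a temp file beside dest, then rename it over dest."""
    tmp = dest + ".tmp"
    with open(tmp, "wb") as out:
        out.write(payload)
    os.replace(tmp, dest)  # replaces the symlink itself, not its target


def demo(installer):
    """Install over bin/unshare -> toybox; report what happened to toybox."""
    with tempfile.TemporaryDirectory() as bindir:
        multiplexer = os.path.join(bindir, "toybox")
        command = os.path.join(bindir, "unshare")
        with open(multiplexer, "wb") as out:
            out.write(b"toybox")
        os.symlink("toybox", command)
        installer(command, b"util-linux unshare")
        with open(multiplexer, "rb") as src:
            return src.read(), os.path.islink(command)


if __name__ == "__main__":
    print(demo(install_through_symlink))
    print(demo(install_by_rename))
```

The first installer leaves the symlink in place and clobbers the shared binary behind it, which is why every other command suddenly prints the unshare usage message. The second replaces only the one name being installed.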

In other news, the 3.4 kernel is in, and it didn't even require patch changes, but arm is still broken. Tracking that down now...

May 25, 2012

Yay! Nine days off!

Ok, in toybox the chgrp code's -R option uses dirtree and needs to change the directory _after_ the contents, which is what the DIRTREE_COMEAGAIN code is for. Since this is the first actual user of it, I probably have to debug not just the implementation, but the design.

For one thing, DIRTREE_COMEAGAIN is meaningless without DIRTREE_RECURSE so I might as well have the handle_callback() code treat DIRTREE_COMEAGAIN as implying DIRTREE_RECURSE, so the callbacks don't all have to return both flags.
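The shape of that, as a toy model: a minimal sketch in Python, not the toybox dirtree code. RECURSE, COMEAGAIN, walk() and the callback signature are all made up for illustration; the only things taken from the text above are "call back again after the contents" and "COMEAGAIN implies RECURSE".

```python
import os
import tempfile

RECURSE = 1    # hypothetical flag: descend into this directory
COMEAGAIN = 2  # hypothetical flag: call back again after the contents


def is_real_dir(path):
    """True for directories, False for symlinks to directories."""
    return os.path.isdir(path) and not os.path.islink(path)


def walk(path, callback):
    """Call callback(path, again); revisit directories after their contents."""
    flags = callback(path, False)
    if flags & COMEAGAIN:
        flags |= RECURSE  # COMEAGAIN is meaningless without recursion
    if flags & RECURSE and is_real_dir(path):
        for name in sorted(os.listdir(path)):
            walk(os.path.join(path, name), callback)
        if flags & COMEAGAIN:
            callback(path, True)


def chgrp_order(top):
    """Record the order a chgrp -R style callback would touch things in."""
    order = []

    def callback(path, again):
        # Files are handled on the first visit, directories on the second.
        if again or not is_real_dir(path):
            order.append(os.path.relpath(path, top))
        return COMEAGAIN if is_real_dir(path) else 0

    walk(top, callback)
    return order


def demo():
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "walrus"))
        for name in ("a", os.path.join("walrus", "poing")):
            with open(os.path.join(top, name), "w"):
                pass
        return chgrp_order(top)


if __name__ == "__main__":
    print(demo())
```

Each directory shows up after everything inside it, which is the ordering -R wants. This toy version sidesteps the actual hard part discussed here, namely which file descriptor is still open and valid at the time of the second callback.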

One issue is that the dirtree->data field contains the file descriptor of the directory, so using that for signalling the comeagain second pass means we can't easily perform an operation _on_ the directory at that point. (And currently the code's leaking a filehandle.)

I'm trying to avoid opening the same file twice if I can (both for simplicity's sake and to avoid who knows what security race conditions), so I'll switch to using the symlink field for signalling, set it equal to 1 which is never a valid pointer value (it's in the zero page, and unaligned)...

Ah, the code _isn't_ leaking a filehandle: dirclose() disposes of it, which means the filehandle's no longer valid when the comeagain callback happens so the -1 signalling using that variable made sense. Can I move the dirclose() into handle_callback()? Not easily (no obvious signalling path unless I add another argument, the diropen() happens in the function so dirclose there makes sense), and not without breaking the existing code in ls.

Easiest thing to do looks like have the callback dup() the file handle into the dirtree->extra field.

Ah, the filehandle leak happens when we _don't_ call dirtree_recurse()... no, it's only opened for the recursion case. So the filehandle is never valid in the first callback, so the dup() strategy above is wrong.

Next question: should individual files have an open filehandle? The directory open is required for recursion, but in the non-recursion case we don't want to do it. But I don't want to open the directory twice and have a race between the two opens...

Can I put the COMEAGAIN callback into dirtree_recurse()? Not easily, it doesn't have a flags argument. And _most_ users shouldn't need an open filehandle to a directory when they've already got the stat block.

Sigh. What I really need is a dirtree_root structure to hold the flags. And the callback pointer. And I'm really not sure this should even be recursive since it can climb back up the tree via dirtree->parent so blowing stack space on deep trees might not buy us anything. Possibly I can abuse dirtree->extra on the root node, although that's really for users not the infrastructure...

Ok, the least intrusive change right now is moving the new->data=openat() for directories before the callback. There's a bit of ugliness because for "." and ".." entries we perform 3 operations (stat, open, close), but the problem is the callback is what decides not to perform any other operation on those entries, and those are common setup/teardown around the callback for all entries. If I don't do that, either the callback _doesn't_ have the option to work on those (which ls -a needs), or I special case them in a way I haven't currently got infrastructure for.

Sigh. Designing common infrastructure with lots of different users is a hard problem...

May 20, 2012

I've gotten to the point where I enjoy working on toybox noticeably more than on Aboriginal Linux, but I still need Aboriginal as a test suite for toybox.

Aboriginal (previously Firmware Linux) was my main hobby project for the longest time, but it already does what it set out to do. I had the 1.0 release, and now each new release is primarily fixing regressions in other people's code. I'm no longer setting the agenda, or the schedule, instead I'm responding to other people's breakage. My toy didn't change: the packages it's made of changed, or the environments it's built in changed. I'm not really _interested_ in figuring out why the threaded hello world built on sparc now says "./a.out: symbol '__gcc_personality_v0': can't handle reloc type 0x17" (uClibc changed), or why ccache is breaking the gcc build (Fedora changed).

And of course ccache is installed by default on Fedora, a distro so horribly overcomplicated that installing dropbear doesn't let ssh through the iptables rules, that puts /usr/local/sbin in the $PATH but not /usr/local/bin, where systemd is proud of not having runlevels anymore but it still has /etc/rc?.d directories that I had to laboriously figure out that it was parsing the rc5.d one...

Really, Fedora is so brittle and schizophrenic that I consider installing it "pilot error" at this point, but alas people use it and send me bugs and the bugs only happen there, so I have to figure out what the crazy people broke so I can work around it.

Sigh. I need to figure out why the 3.4 kernel broke armv5l, and bisect really isn't helping. (Too much auxiliary breakage to easily track down the breakage I'm interested in, and the "peeling back the layers" approach (bisect for fix for bug, apply fix for bug, bisect in the previously hidden area) has been a touch frustrating because I bisected to a merge commit that brought together the "serial doesn't work" and "pci doesn't compile" branches into "shows the current bug".) Wheee...

May 19, 2012

I actually got some programming done today. I am happy.

May 18, 2012

Seriously looking forward to a weekend by myself to catch up on the backlog of Toybox and Aboriginal todo items, so of course I'm panicking at work for a demo on monday to somebody who might be the company CEO (about three levels of management up, I lose track). Lost a lot of time waiting for the local representative of the India development site to get his code built on the current kernel, and eventually had to show him A) how to pull and merge the current kernel since that's not what he was building his modules against, B) debug his _userspace_ application build for him.

End result: I may have to come in over the weekend to do more demo prep. Wheee.

Oh well, I'd probably have spent it mucking out Xeno's Condominium anyway. (Each time I go there, I pack half of what's left.)

May 17, 2012

Dear "Mittens", the reason J.P. Morgan losing $2 billion is bad, even if other people profited off of it, is that banks should not allow themselves to be robbed.

If every Automatic Teller Machine a bank owned simultaneously spit out all its cash into the street, that would be a sign that bank was not good at its job, even if passing strangers who walked off with the cash got to keep it. And the loss from something like that would be FAR less than $2 billion.

JP Morgan has been the main advocate for reducing banking regulation, based on the fact that it avoided getting the 2007 mortgage foreclosure crisis all over itself. "See, we're good at this, WE don't need to be regulated." Now it turns out, they take large speculative gambles that can blow up in their faces too, they just don't always realize they're doing it. And they do so while benefiting from FDIC insurance from the feds.

If you take the government insurance, the insurance company gets to regulate what you can do while covered by their policies. The Mythbusters have their insurance company tell them "No, Adam can't go up in the balloon chair" all the time, this is not a novel concept.

Why do Republicans keep nominating inbred twits incapable of understanding obvious things?

May 16, 2012

Why do I think the Tizen project is doomed? Let me count the ways...

First, Tizen's focus on HTML5 reminds me a lot of OS/2's focus on being a good java deployment platform. (Kiss of death, right there.) Java hurt OS/2 for two reasons: you get lost in the crowd, and "write once debug everywhere" means developers perceive you as defective.

Blending in with the crowd means that because your programming API isn't specific to your platform, this gives app vendors carte blanche to ignore you. It translates to "Don't bother specifically targeting this platform, it's safe to completely ignore because there is nothing unique about it."

Worse, when (if) they get around to testing their already-written code on it and find out that "write once run anywhere" is never perfect, you've violated an implicit promise. Porting to your target is one thing, it's an up-front decision to invest resources into a target. But when your target shows inevitable quirks in a supposed standard, it means your platform is _defective_. They're not debugging _their_ code (which already works elsewhere), they're finding bugs in your target and coming up with workarounds. Psychologically, this erodes their respect for you unless you were the reference platform they develop against in the first place (which Tizen isn't even seriously trying to be for HTML5).

Second: releasing source is not the same as open source development. This one screams out of the recent coverage of a Tizen meeting where the project 'worked hard to establish its "developer story"', everybody was interested in Tizen's "compliance program", and "developer outreach" was an unsolved problem:

Tizen's community manager Dawn Foster dealt with the outreach question in her state-of-the-community talk on Tuesday. In brief, the Tizen community at the moment is small; considerably smaller than the MeeGo community was, with fewer volunteer contributors joining the paid developers from Intel and Samsung.

The fact that Tizen couldn't retain MeeGo's developers is damning, because the project's history is a series of "dinosaurs mating": Nokia's Maemo merged with Intel's Moblin to form MeeGo, and then that merged with LiMo (a random consortium) to form Tizen. If each resulting project is _smaller_ than its predecessors...

But let's get back to the fundamental structural problem with the development community (or lack thereof): if most of your developers are employees of the same corporation, the ones that aren't become second class citizens. This breaks the feedback loop between users and developers that drives the project to serve the needs of the userbase.

Open source developers tend to do a lot of coding that employers won't pay for, or at least consider a luxury, most of which is variants of "throwing out code that already works". Employers view code as an asset they paid for; open source developers view existing code the way a sculptor views a block of marble. Ken Thompson (co-creator of Unix) said "One of my most productive days was throwing away 1000 lines of code."

Good open source developers do not consider refactoring and simplifying a luxury. The open source peer review process regularly rejects code that works but could be improved (a common result of review is calls to clean up the code), and entire patch series are devoted to reworking an area of code that already works but could be slightly improved via an extensive rewrite. This is how peer review makes code better: we take what you did and we rewrite it, over and over, into cleaner and simpler and more robust and smaller and faster and easier to read forms.

We've seen a LOT of "employer owned" open source projects, two of the most _successful_ being Mozilla back when most of its developers were Netscape/AOL employees, and Open Office back when most of its developers were Sun employees. Both survived by filling a gaping hole in the ecosystem when they launched (before Mozilla we didn't have an open source browser, before Open Office we didn't have anything that could write Word files). Both were also slow, bloated, unstable piles of overengineered crap. The projects didn't start to recover until AOL transferred Mozilla to a foundation and pulled all but 3 engineers off the project, and until Sun went under (allowing the user base to unify behind a fork, LibreOffice). But the result is still a mess relative to most open source projects, due to years of backlog. The real successes were things like Firefox that forked off of the old codebase, ditched a lot of the old code, and started over.

Finally, there's the Linux Foundation, which is a pointy-haired bureaucracy pretending to other bureaucracies that it speaks for hobbyists. They are an ever-growing bureaucracy that's utterly tone deaf to the non-commercial aspects of the open source community. And even if they were less clueless: Tizen is to Yocto what ChromeOS is to Android. (Not that I'm hugely interested in Yocto, either.)

As far as I can tell Tizen is a convenient excuse for Intel's ongoing attempts to convince Samsung (the largest Android phone vendor) to use Atom processors instead of Arm. Why anybody else should care is an open question.

May 15, 2012

The gnu/dammit implementation of the chmod command is _weird_. It rejects "x=g" but thinks that "g+" and "g=t" are perfectly valid file modes.

And the spec actually seems to require some of this: ugo+ and ugo- are NOPs, and "=" by itself clears _all_ the bits (even the sticky bit)...

How crazy is this? Let's see "touch blah; chmod = blah; chmod u+s blah; chmod g=u blah" and it does _not_ copy the suid bit to the sgid bit. Ok, that simplifies the implementation _slightly_.

Ok, lemme try to come up with a coherent usage message. Something like:

usage: chmod [-R] PERMISSIONS FILE...

Set read, write, execute, suid, sgid, and sticky bits for user, group, and other.

PERMISSIONS are an octal bit pattern or one or more (comma-separated) [ugoa][+-=][rwxstXugo] stanzas.

For each category (u = user, g = group, o = other, a = all), set (+), clear (-), or copy (=) the permissions (r = read, w = write, x = execute).

Special permissions:

Octal top+bottom bit patterns (7777 is all bits set):

ug uuugggooo

Hmmm... Not _entirely_ coherent, but it fits on one screen...
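For what it's worth, the behaviors poked at above can be modeled in a few lines. This is a minimal Python sketch, not toybox code and not the gnu implementation: the name apply_symbolic and the mask tables are made up for illustration, it ignores umask (a missing [ugoa] is simply treated as "a"), and it doesn't special-case suid/sgid on directories. The idea is that each category owns a mask of bits it's allowed to touch, which would explain why "g=u" can't copy suid into sgid, and why "g+" and "g=t" parse fine while "x=g" doesn't.

```python
import re

ALL_BITS = 0o7777

# Bits each category may touch (assumption: suid belongs to u, sgid to g,
# sticky to o, which is consistent with the experiments above).
AFFECTED = {"u": 0o4700, "g": 0o2070, "o": 0o1007, "a": ALL_BITS}

# Each permission letter, replicated across every category it could apply to.
PERM = {"r": 0o444, "w": 0o222, "x": 0o111, "s": 0o6000, "t": 0o1000}

SHIFT = {"u": 6, "g": 3, "o": 0}

# One stanza: optional categories, one operator, then either permission
# letters or a single category to copy from.
STANZA = re.compile(r"^([ugoa]*)([-+=])([rwxstX]*|[ugo])$")


def apply_symbolic(spec, mode, is_dir=False):
    """Apply a comma-separated symbolic mode string to an integer mode."""
    for stanza in spec.split(","):
        match = STANZA.match(stanza)
        if match is None:
            raise ValueError("bad mode stanza: %r" % stanza)
        who, op, perm = match.groups()

        # Simplification: no categories means "a" (real chmod consults umask).
        affected = 0
        for category in who or "a":
            affected |= AFFECTED[category]

        if perm in ("u", "g", "o"):
            # Copy only the rwx bits of another category, never suid/sgid.
            rwx = (mode >> SHIFT[perm]) & 7
            value = rwx * 0o111
        else:
            value = 0
            for letter in perm:
                if letter == "X":
                    # Execute only for directories or already-executable files.
                    if is_dir or mode & 0o111:
                        value |= 0o111
                else:
                    value |= PERM[letter]

        value &= affected
        if op == "+":
            mode |= value
        elif op == "-":
            mode &= ~value
        else:
            # "=" clears everything the categories own, then sets the new bits.
            mode = (mode & ~affected) | value
    return mode & ALL_BITS
```

Feeding it "=,u+s,g=u" starting from 0644 leaves just the suid bit set, which matches the touch/chmod experiment above. Whether the mask-per-category approach is how anybody else does it internally is a guess on my part.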

May 14, 2012

Last week I emailed Fabrice Bellard:

Hello Mr. Bellard, I'd like to run a kickstarter to hire you to:

1) Adapt qemu's Tiny Code Generator to work as the back-end for your old Tiny C Compiler, to create a new qcc (QEMU C Compiler) that can produce output for the various targets qemu supports.

2) Resurrect tccboot with the result, and get it to boot a current (3.x) kernel to a shell prompt. (Another "modified subset" is fine, as long as it boots to a shell prompt.)

3) Release the result under a BSD license.

Does this sound doable? If so, how much would you charge (so I know how much to ask the kickstarter for), how long do you think it might take (ballpark), and when might you be available to start (if we can get you the money by then)?

(I.E. "it would take me a dozen fortnights, cost my weight in canadian 'toonie' coins, and the next open slot in my schedule is 37 years from now.")

--- Optional details:

My notes on this project, from when I tried to do it myself, are at:

I can maintain this after it works, I just don't know enough to make it work in the first place, and have been trying to find time to learn for years now but keep growing _other_ projects instead (toybox, aboriginal linux, I accidentally became linux-kernel Documentation maintainer...)

I have no particular interest in the current "no releases in 3 years" tcc mob branch, and am just as happy for you to start with your old code if you prefer. If you want anything out of my old tcc fork, I hereby grant it to you under the same BSD license as tcc/tcg.

It doesn't need multilib, being able to build "arm-tcc" and similar would be fine, and probably the common case given the need for libc, libtcc, crtbegin, and so on. (Being able to specify code generation with the same granularity as qemu's -cpu option would be nice, but not a huge deal in the absence of any real optimization.)

Eventually I'd like to "busyboxify" tcc/qcc, I.E. make it so the front-end recognizes whether it's called as cc/cpp/ld/as/strip and reacts accordingly. But I can handle that part later, and make its command line parsing understand more gcc-isms if necessary. I wrote some notes about that years ago here:

I don't care about C++. The missing C99 bits from your old tccboot notes would be really nice, though.

Simple dead code elimination would be really nice. (Busybox depends on it to avoid linker calls to undefined functions.) Just detecting if (0) constructs after constant propagation and suppressing output (or diverting output to a ram buffer that gets discarded) would be plenty. But if that sounds out of scope, I could probably tackle that after the fact too...

Thanks for your time,


Today, I heard back:


I had the same idea when I was working on TCC and QEMU. The code generator of QEMU is not generic enough to do it, but at that time I began to modify it to handle the missing bits. Unfortunately it is a large project and I lost interest in it. Maybe someday I'll be interested again in compilers (perhaps to do a mix between C & Javascript), but now I have other projects which have a higher priority, so I cannot help you now.

Best regards,


Hmmm. I wonder if Paul Brook or Anthony Liguori or one of the other codeweavers guys would be up for this?

May 13, 2012

An article wandered by on twitter about friday's rant, about how clean energy is really popular in all the polls, it's just one political party is adamantly opposed to it and the other has no spine, so what the vast majority of people actually _want_ is going unserved. Oh, and republicans are trying to stop the US military from using biofuels.

(Sigh. "And to everyone else out there, the secret is to bang the rocks together guys.")

Elsewhere, Paul Krugman points out that inequality is one of the main causes of the current bad economy, I.E. if .01% of the populace has half the wealth, everybody else is significantly poorer than if the resources were distributed more equally.

The current "demand-limited liquidity crisis" is due to people who have money not spending it, I.E. a small number of billionaires who got rich by "cornering the market" on everything including our political system have set themselves up as choke points, and this means a few dozen scared old men hoarding cash can throttle the entire country's economic activity.

Meanwhile, here's the best summary I've seen of the situation in Greece, which roughly jibes with Krugman's guesstimate that Greece's exit from the euro could happen within the next month or so.

May 12, 2012

Meant to do programming today. Instead slept in, then mucked out the old condo some more. (The upstairs is clear, the downstairs, including the kitchen, is not.)

I need to get the server set back up so I can debug the weird Fedora build break, which still isn't happening anywhere else. I need to get an Aboriginal Linux release out before the _next_ kernel release ships. (I refuse to miss two in a row.)

And dalias (the musl maintainer) poked me about xargs not following the insane quoting semantics that predated the invention of "find -print0 | xargs -0", and seems to think it's important. Sigh... I suppose a config option since it _is_ in the spec...

May 11, 2012

And now the Koch Brothers are mounting an Astroturf campaign against wind power.

Dear Barack Obama: ARE YOU BLIND???

I've come to the conclusion that if the democrats WEREN'T spineless and clueless, they'd tax the hell out of the fossil fuel industry and use the money to subsidize the alternatives. It is the OBVIOUS political move.

There's no real political downside because the fossil fuel companies already exclusively back the Republicans, AND SET THE REPUBLICAN AGENDA. Follow the money: this is why they keep electing "oil men" president. It's why the US invaded Iraq the first time (because Kuwait had oil), and the second time (because Iraq did, especially after a decade of sanctions put it way behind the rest of the middle east's peak oil depletion curve). It's why they support the Keystone oil pipeline and love fracking. It's why they _apologized_ to BP during the gulf oil spill, congress was biting the hand that fed half of it.

All the global warming denialism is funded by oil (and coal) companies desperate to avoid paying for the consequences of their actions, using the Tobacco Industry's model to distract, delay, and deny. If they have to work to undermine the country's belief in science itself in order to deny what the science is clearly and unambiguously saying, that's what they'll do.

Peak Oil happened in 2005 and has been really good to the fossil fuel industry, because over the past decade prices have quadrupled. Opec doesn't have to restrict supply any more, production can't keep up with demand even with massive offshore drilling looking at the only areas we haven't sucked dry yet: the deep ocean.

That's why this year's Fortune 500 is topped by Exxon, Wal-Mart, Chevron, and Conoco: three of the four largest companies in the US are oil companies, and like the Tobacco industry did they side with the party that can be bought, the one that credits the rich for their success and blames the poor for their poverty, even when both are inherited for generations.

Obviously, continuing to depend on oil _after_ world production has peaked, with China and India rapidly growing their imports (competing for the remaining supply), is financial suicide for the country. But hugely lucrative for the companies that supply an ever-more-precious commodity to a captive market, for as long as they can keep that market captive.

The Democrats try to help people, and the poor need the most help. Social Security, Medicare, affordable housing, affordable education. When they see a problem, they're likely to try to do something about it, and they believe government _can_ do good things. The Republicans believe in self-interest, and getting the government out of the way of people's self-interest. I.E. the Democrats naturally want to stop pollution and dependence on fossil fuels, and the Republicans can naturally be bought.

So massive amounts of oil money are funneling into the Republican party, working to undermine alternatives to oil, roll back environmental regulation, open the arctic wildlife refuge to drilling, and control oil-producing countries with our military. (Notice how we effect "regime change" in countries like Iraq and Libya that have a lot of oil but don't give us preferential pricing. Sanctions work, but they mean we can't just buy that oil cheaply right now. Why restrict the oil supply when we have laser-guided bombs?) And, of course, stop the Democrats from doing anything about it.

The Koch brothers are oil billionaires (investing oil profits in spin-offs the way Philip Morris owned the Kraft food company, but Oil's the heart of it). That's also how the Bushes made their money. Dick Cheney was CEO of Halliburton (an oil company), and since leaving office he's already partnered with Rupert Murdoch in another oil company. And of course Rupert Murdoch's justification for invading Iraq back before it happened was (and I quote): "The greatest thing to come out of this for the world economy would be $20 a barrel for oil. That's bigger than any tax-cut in any country." These are not coincidences.

So of COURSE the Republicans attacked and mocked Solyndra. Obama subsidized solar electricity generation, they were terrified it might succeed. And you can tell Obama's an amateur because when they howled in pain he BACKED OFF.

Every time this guy hits a nerve he stops and apologizes instead of following up. Obama is NOT strongly pushing a huge carbon tax and pouring the money into massive R&D and subsidies for solar and wind and batteries and hydrogen and treating the whole thing as a giant jobs program and preventing the country's money from draining away to the middle east. You want to fix our trade deficit, it's obvious where to start. He's not highlighting the pattern of "Big Oil says jump republicans say how high" and calling them out. Instead he's treading lightly so as not to offend the oil companies, because he's completely spineless. He's trying preemptive unilateral compromise with the successor to the Tobacco institute funding the Party of No.

So yeah, it's nice that Biden cornered Obama into declaring gay marriage to be a state's rights issue like abortion was _before_ Roe vs Wade and like interracial marriage was back when he was born (his parents' marriage was illegal in 16 states). Somehow, this is counted as progress from mister half-measures. He (and I quote) "hesitated on gay marriage in part because I thought civil unions would be sufficient". Yes, the first black president endorsing separate but equal. If marriage is really a religious ceremony why can christians marry hindus, or an atheist marry anybody? It's a tax status with insurance benefits, and nobody who's been divorced more than once can talk about the "sanctity of marriage" (basically all the republicans).

But I wouldn't exactly consider myself "energized" by this. If we had primaries I'd happily give Hillary a chance. Oh, what do you know, early voting opens monday...

May 10, 2012

Wow, Linux Documentation maintainership means being cc'd on MASSIVE quantities of irrelevant crap. Most patch series touch something in Documentation (or _should_), and so do things like device tree format changes, and scripts/ will now add my email address to the cc: list of any patch series that touches anything in there. Meaning I get cc'd on the entire patch series, and all ensuing discussion.

Oh well, I'd been meaning to get back into reading linux-kernel anyway, but _dude_.

Speaking of which, I pulled up last week's linux-kernel web archive threaded view (on the theory that it's stopped updating so I won't have to go _back_ to check for additions I missed), and skimmed the first half of it. (Interesting topics, interesting users, basically what caught my eye.) This took 2 hours. It looks like just _skimming_ linux-kernel is now about 4 hours a week. Not as bad as I feared, but a kernel-traffic or kernel podcast style summary would still save me a lot of time. Alas, the people who did both stopped because it was too time consuming.

(In the past few years I've tried to hire two different people to do it for me, since it's a great learning experience for a computer-interested high school or college student and a heck of a portfolio piece when you're ready to get a real job. But both fell through for different reasons and it turns out I don't really KNOW a whole lot of high school or college students at the moment. They all graduated.)

May 9, 2012

Finally got a couple hours to bang on open source again. Went through one more of Georgi's patch backlog, removing strndupa from mdev, which was kinda moot since dirtree had changed out from under it and the command was never finished in the first place (no hotplug support). Did the basic cleanup so it compiles again, anyway.

I really need to get an aboriginal linux release out but the server isn't set back up so I can't track down the Fedora build break. (I tried the build on the new Ubuntu LTS and everything worked there, so it's not some newer package version, it's Red Hat being screwy.)

The amusing part is that last night I spent half an hour looking for a Generic Power Cable. The kind that's so bog standard that my commodore 64's disk drive used it in 1982? The TV uses it. The printer uses it. The server uses it. The replacement power adapter I got for my netbook uses it. I have a spare bundle of _5_ of the suckers back at the condo. And I apparently haven't got ONE of them in the new place.

Ironically, the XBox360 uses a nonstandard version. It looks like the normal one but only has 2 of the 3 wires: no ground pin. Yes, it's the most generic piece of hardware in the past 30 years and The Law Offices of Small and Limp Esquire did a nonstandard version I couldn't repurpose.

May 5, 2012

Moving to the new house. Everything's in boxes.

April 29, 2012

So one of the problems Fedora has is it installs ccache by default, and ccache support turns out to be broken in record-commands. I added infrastructure to support this almost three years ago, but for some reason the fallback directories are being created but not included in the $PATH that more/ is using to call $CC.

April 28, 2012

I felt mostly recovered from the cold except for interminable coughing (which gives me a headache and a raspy throat, but those are effects of the coughing). Then I tried packing some boxes, which involved a couple hours of breathing dust...

That didn't end well. Lungs not efficiently clearing themselves at present.

Programming! I'm trying to debug the m68k target in Aboriginal Linux. The m68k support in qemu is unfinished, but current progress is going on out of tree:

git clone git:// && git checkout remotes/origin/q800 -b q800 && cd qemu-m68k && ./configure --target-list=m68k-softmmu && make -j 3

(You have to re-checkout fairly regularly because this branch rebases against upstream every time anything new shows up in it. No point in pulling.)

So using that, I get as far as:

Linux version 3.3.0 (landley@brillig) (gcc version 4.2.1) #1 Sat Apr 28 20:26:59 CDT 2012
bootconsole [early0] enabled
Detected Macintosh model: 35
VIA1 at 50f00000 is a 6522 or clone
VIA2 at 50f02000 is a 6522 or clone
Apple Macintosh Quadra 800
qemu: fatal: Illegal instruction: 7f45 @ 00000000

And then it dumps registers. So the kernel is booting, and then dereferencing a null pointer long before it's time to run userspace.

My rinse/repeat command line here is:

./ m68k && ./ m68k && KERNEL_EXTRA=ignore_loglevel PATH=/home/landley/qemu/qemu-m68k/m68k-softmmu:$PATH more/ m68k

The first two stanzas rebuild the kernel and repackage the system image, then the "ignore_loglevel" thing is a kernel command line argument that tells the kernel to show _every_ printk on the command line.

Beyond that, it's a question of descending into build/packages/linux and groveling around for strings from that output (they're mostly in arch/m68k/mac/config.c so far) to find recognizable points _before_ it crashed, and then follow it forward to isolate the crash.

So mac_identify() is where the last identifiable chunk of output comes from, and that's called from config_mac() (which makes it all the way to the end as determined by a printk() there producing output).

That gets called from setup_arch() which turns out to be in arch/m68k/kernel/setup_mm.c, and... it goes into paging_init() and never comes out again. Ok, this is an mmu emulation bug.

Oddly, if I enable DEBUG in that file (which just switches on a bunch of printk() calls) it turns into a hang. And the hang refuses to narrow down to a specific line with additional printks, it seems the kernel does something and some amount of time _later_ QEMU goes into a tight loop that kill won't take down with -9.


April 26, 2012

Back at work for a half day. Still coughing, but at least not sneezing twice a minute, probably not contagious. Mostly stayed in my cube anyway, and did the "self-assessment" part of my annual review (which is due friday, and that was the extended deadline).

Hanging out at Fry's coffee shop after work, which means no internet, which means I can't watch The Daily Show (no RSS feed to download it), so I'm catching up on the Rachel Maddow episodes I downloaded but haven't watched yet. April 2: The Abortion Show. April 4: The Abortion Show. But both had enough of a "war on women" slant to at least make sense in a larger context.

April 6: she found an excuse to do a Bad Science About Nuclear Power segment through a really strained segue about an abandoned ship floating around the pacific for a year after the tsunami (totally tipped her hand since the segment was titled "half life of the ghost ship" so I was going "how is this going to turn into Bad Science About Nuclear Power" for several minutes before it actually did). The ship actually had nothing to do with fukushima, it was just an excuse to look back at the tsunami and go "I still don't understand nuclear power but am terrified of it!" I mean, at least she's not going on about homeopathy or anti-vaccination stuff, but _dude_... And the interview is about that too. (Control-right arrow skips 60 seconds forward in VLC. Shift-right arrow is a five second skip.)

Sigh. She has some great segments towards the _end_ of each show. She just has to vent her fixations at length first. The need for 2/3 votes for "immediate effect" in Michigan sounds really important, but you have to sit through the rest of the show to get there...

(I say this as someone who tweeted earlier today the observation that if you look at the numbers, Planned Parenthood is the most effective anti-abortion organization in the USA because all the contraception it distributes reduces unwanted pregnancy, which is a prerequisite for abortion. Obviously I care about the issue, but the sheer broken record hammering on it gets old, then gets annoying, then gets to "please stop talking to me about your religion" levels...)

April 24, 2012

Still sick. I am so ready for this cold to be over.

I need to get back to poking at thread/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S to PIC-ify the __fork_generation symbol. Basically declare a global int, do an |= assignment to it in main(), and then compile that with -fPIC and disassemble it to see what it does.

Alas, I haven't got the concentration right now...

April 23, 2012

Home sick from work. Thought I'd get some work done, but mostly wound up napping or sitting on the couch catching up with twitter or playing skyrim. No brain at all today.

April 22, 2012

Bleah. Sick today. Scratchy throat, voice sounds weird, headache. Thought some exercise might help fight it off, but a half hour into biking I got nauseous.

Fade says she had a mild version of this and fought it off with those massive 1000 milligram vitamin C packets. Took a couple of flintstones' chewable "with extra C", for the placebo effect if nothing else.

Trying to concentrate either on fixing toybox ls (directory handling is all wrong) or aboriginal x86_64 (an assembly hunk is missing an #ifdef __PIC__ stanza and dies with an impossible relocation error: apparently the uClibc guys don't really test x86-64 much). Unfortunately, my brain is going "a 4 hour nap might be a good idea", and I'm an hour bike ride from home at present.

An energy drink counts as hydrating, right?

April 21, 2012

Yay, it's the weekend: I can do real work again!

Poking at aboriginal to get armv4l building again. It looks like they didn't bother to implement nptl for arm-oabi. Can't really say I blame 'em. I also need to fix x86-64, and figure out why fedora host breaks the build. I'm reinstalling the upstairs server with Fedora, since gentoo ate itself. (If you only tell your gentoo server to --sync and upgrade itself every 3 months or so, it gets really unhappy. These guys do not have releases, and their idea of legacy support seems to be "a whole month".)

Poking at toybox to make ls -l with no arguments work: all those fiddly decisions about when to recurse into directories, and command line arguments being special. Although I think "ls symlink" vs "ls symlink/" is handled for me by the OS because the second one is implicitly "ls symlink/.".

The Documentation/ discussion continues apace. Converging on a reasonable solution, sounds like. The thing to remember is "this wastes a megabyte of permanent storage for every upload" is a touch eroded by Moore's Law these days. And if I rsync to my server and have _it_ do the git checkin (with its 1.5 terabyte drive), if it gets big enough to bother them upstream it's still not my problem: they're the ones who felt git push was preferable to rsync for generated files.

But mostly, packing boxes to move stuff into the storage space we rented several days ago and have yet to actually put anything in. The "dust makes you cough" thing turns out _not_ to be a myth.

April 20, 2012

Poked at the users at list to see about getting my account reinstantiated so I can update again. I need to create a gpg key and get it signed by people I don't know, because getting it signed by people I do know (who are or have been in the kernel MAINTAINERS file) doesn't count, because they don't have active accounts and their current process is remarkably insular.

Getting an account means I get access to a "kup" tool which is designed around the assumption that you're doing git. Each file you upload must be individually gpg signed, although it has built-in support for telling a git tree on the server to create a patch file or a tarball, and store it. I have a python build script that takes half an hour to generate a directory full of files, which I then rsync to a server. Using kup to do this would be hilariously awkward, and involve writing a script to drive it and leaving it to run overnight. Every time.

I asked the guys if they could maybe redirect to a server I maintain, and bypass all this. Instead, they're trying to figure out a git-based rsync so I can update again. Apparently, if all you have is git, everything looks like a repository. (Imagine if every .o file during a kernel build had to be checked into git to produce a vmlinux. Every time. Oh well, it's their disk space.)

I'm grateful they're trying to adapt to meet my use case, but amazed at the blinders their mindset has. Especially since predates git.

April 18, 2012

Catching up on Aboriginal Linux. I've got the kernel updated, busybox updated, uClibc all the way up to, and everything using NPTL. Now grinding my way through the Linux From Scratch (still 6.8) build...

Also poking the users list, to see about getting updating again. (I think the kup program is what you'd get if the TSA wrote software. The security vs usability tradeoff is skewed pretty far to "maybe if we make the system useless, nobody will try to crack it". I really don't want to try to reinvent rsync on top of somebody's perl script.)

April 16, 2012

A police state is the legislative equivalent of training wheels. Needing secret police to govern means you suck at governing.

Back when El Shrubbo perpetrated the Department of Homeland Security, TSA, and Guantanamo Bay, I visualized him with a life preserver and flippers crouched in a row of olympic swimmers at the edge of a pool. The idiot kept calling a timeout and demanding water wings, a shower cap, and noseplugs on top of that because he was completely unprepared to do his job. Every demand for additional power was one more sign he was profoundly ineffective with what he already had.

That's part of the reason I was so disappointed when Obama signed off on FISA and kept Guantanamo open. Until then I thought he was good at his job.

April 14, 2012

Checked in the dirtree and ls changes, and three or four other pending little fiddlibits in my toybox tree. For the first time in ages, "hg diff" shows no changes. Feels good.

I fall over now. Tomorrow I need to do linux/Documentation and Aboriginal stuff.

April 13, 2012

Friday! Went home exhausted around noonish. Dunno if I had a stomach bug or if I was just so utterly drained by sitting in a cubicle that I was sweating and my vision was greying out. (My subconscious is _not_ happy with my recent life choices in regard to what I do 9-5. Wearing a suit and tie would be _less_ demanding, at least that's blues brothers/tenth doctor cosplay. Cubicles are just dumb.)

Slept for several hours before feeling human again. I've only managed to do my "get up at 5 am and be productive before work" thing once this week. It's sad, I miss it, but I'm just so _tired_ all the time. Of course now that the sun's gone down I'm perking up. (It's also been 8 hours since I've had to sit in a cubicle, and I don't have to do it again for 2 full days. That really helps too.)

So, trying to finish up ls this weekend, and get Aboriginal Linux and the kernel Documentation directory under control. I've fallen _way_ behind on all the work I actually care about.

I'm most of the way through ls, but hitting strange behavioral corner cases. Currently, when to print the directory name header. You _don't_ do this when you're listing just files on the command line, including directories with -d. You also don't do this when you're listing just one directory (either "ls dirname" or the implicit "." you get with no arguments), except that if you say -R then the first directory gets a header. If you say "ls dir1 dir2" then both get the header.

I read the ls spec, which said:

If more than one directory, or a combination of non-directory files and directories are written, either as a result of specifying multiple operands, or the -R option, each list of files within a directory shall be preceded by:

"\n%s:\n", <directory name>

If this string is the first thing to be written, the first <newline> shall not be written. This output shall precede the number of units in the directory.

Except that if I go:

mkdir sub
cd sub
ls -R

I get ".:" instead of no output, even though the bit in the spec about -R was a conditional about _why_ you might list more than one directory, and this is just one directory without even any files in it. (I.E. the behavior of the gnu/dammit version of ls isn't quite what the spec requires, but what else is new?)

And of course "ls -lR file1 file2 dir" still lists those two files without a "dirname:" prefix, then gives a dirname prefix on the dir. In fact those files don't have a "total:" either.

In a way the pathological case for doing this the way the spec implies is "mkdir -p sub/sub; ls -R sub" because you can't tell you're listing more than one directory until after you descend into it, so you either do an arbitrary amount of readahead (the directory could have 5000 files and no subdirectories, in which case it should retroactively have no prefix) or you defer the decision until you descend into it (so the test is in the recursive instance of the display function).

Some more fun ls behavior: "ls -l todo.txt todo.txt todo.txt" treats them as three separate files (understandable), and "ls -l doesnotexist sub" gives a "sub:" label on the directory even though the other argument doesn't exist. So what it's _testing_ is that the number of command line arguments <= 1.

Sigh. I think I need to implement -R forcing the dir label, if only because it's what everybody expects, regardless of what the standard says...

April 11, 2012

Added All the Flags Ever to ls, haven't implemented most yet.

I think the right way to handle -d is actually display-time filtering, which implies the command line parsing shouldn't create separate files and dirs lists but should just make one big one, and yank/free the non-directory entries as it displays them. (Actually I can filter them out as it assembles the table for sorting, although figuring out what to free when gets a bit tricky, but not too bad.)

April 10, 2012

The Daily Show's commercials have gotten weirdly misogynistic. There's a Junk-in-a-box commercial where somebody marries a hamburger, and the wedding ceremony ends with "you may eat the bride". Then there's a commercial where some guy is followed by a cloud of nanites coming out of his hair which disintegrates three women (leaving behind piles of clothes), and then the women are reconstituted in his bathroom at home by The Product.

They spent money filming these things. I'm not sure why. Possibly I'm just "not watching television for the past decade", so these seem unusually weird to me. Then again I've yet to figure out what the woman in the pink and white striped dress has to do with t-mobile, so...

April 9, 2012

Swamped. I've been dinking away at an ls rewrite to test the new dirtree stuff, and it's about 80% done, but this is one of those "breaks stuff that used to work" patches that it's hard to check in just part of. (I must admit, loading my old Red Hat 9 image and comparing the ls -l output with what Ubuntu 10.04 is producing, it's changed a _lot_ over the years. And of course SUSv4 is remarkably vague.)

Fade's family is visiting. Went to see Miyazaki at the Drafthouse. Fade's off with them at their hotel tonight, I helped Tryn pack boxes then came home to bang on software.

Signed paperwork to buy a new house on Thursday, although we won't get to move into it for a while. Currently two different places want to be declared our legal residence for mortgage purposes, and the "I'll claim one and Fade claim the other" idea turns out not to work if you're married (for some strange reason) so we'll probably just wind up selling the old one sooner than we intended. So that's more paperwork to look forward to.

Work remains crushing: I sit in a cubicle, under fluorescent lighting. It's a _cubicle_, combining claustrophobia with a complete lack of privacy in a way that doesn't seem possible, and yet. The old building didn't have cubicles.

Might poke my boss to see if I can get time to work on the kernel's Documentation/ directory as part of my day job duties. I'm not doing toybox or aboriginal work on company time because those are my projects and the company may decide "we own him, therefore we own them" (which isn't true but is an easy enough mistake to make that I'm keeping some clear lines here to avoid unpleasantness). But curating a library of kernel documentation is a small part of a larger project that predates me, and something they're unlikely to get grabby about...

April 5, 2012

User @vixy on twitter linked to "An excellent theory as to why men keep trying to make laws about vaginas", an article which titles itself Birth Control -- and why we'll still be fighting about it 100 years from now.

Its thesis is that when a privileged class has arranged a concentration of wealth and power, not only will they die before they give it up but their descendants have to die too. Previous revolutions often killed off the old guard, but the printing press didn't, and thus took centuries to not just establish new norms but make them stick. (The fact we're still arguing over teaching evolution in schools implies we still haven't gotten through this one.)

Another tweet from @mattyglesias is The entire future of the American economy in three paragraphs, with important Deep Space 9 allusions. This one points out that our economy is eliminating the need for manual labor, which would result in everyone at leisure if our society was set up that way, but is instead winding up with everyone unemployed because nobody needs them to do anything. The world _is_ changing fundamentally, but we aren't changing to take advantage of it.

The trend I'm noticing here is that concentration of wealth and power requires there to be poor people to extract tribute from. What good is money if you can't pay people to work for you? The existence of rich people, who can command hordes of peons at will, requires hordes of peons, each willing to work full-time for a tiny fraction of the money the rich person has at their disposal. You can't have an english manor without a servant class to staff it.

Part of it is that being a slumlord still pays, and with pawn shops, prepaid credit cards, and "payday loans" it pays more than ever. But also if we're all rich then we've merely raised the standard of living. To be rich is to stand out from the crowd and be BETTER than everyone else. You can't be better than everyone else without someone to be better _than_. Lots of someones.

The most fundamental servant class in history was "women", and the conservatives attempting to re-establish their gloriously hallucinated past are trying to put women back in the servant class by taking away birth control and equal pay and domestic violence protections and so on.

The words "conservative" and "liberal" are actually kind of interesting, from an etymology perspective. When something is liberally applied it means you use a lot of it, free-flowing. Liberals are the experimenters who try everything to see what works, the Johnny Appleseed types who spread their influence far and wide and let successes snowball and failures recede on their own merits. To be a liberal means trying new things. Conservatives are conservators and conservationists: they retain the past. Their job is to prevent change, and when they do strive for something different than today it's by going back to the past: researching a perceived golden age and attempting to recreate it, albeit often via mythology so they're often trying to "rebuild" a camelot that never really existed and doesn't work in practice.

All this gets us back to the baby boomers, still the dominant force in my lifetime. The rise of conservatism is because the baby boomers are shriveled up old fogies unable to cope with change. As teenagers they were willing to try anything once, and thus were liberals: Free love, hippies, woodstock, protesting vietnam, the space age, the works. All before I was born. Now they've stopped trying new things, and are thus ultra-conservative. The _baby_boomers_ are the ones who made "liberal" a dirty word. What they know is all there is. They don't want anything new, because they couldn't adapt to it. They just want to relive their glory days and return to an imagined past, not the _actual_ past when they were hippies or yuppies, but when they were the fine upstanding straight-laced citizens they never actually were. And in pursuing this fantasy they're providing an army of votes and funding for the concentrators of wealth and power to elevate the 1% of the 1% so far above everyone else they're as untouchable as Marie Antoinette.

Then there's the libertarians, insisting that Alexander the Great couldn't be great if he wasn't free to conquer the world, and thus allowing Genghis Khan to rampage across asia makes us all more free somehow. The greatest defenders of the 1% are those who insist we can't restrain them: you can't have a government because that would stop us from having kings.

Kind of annoying, really.

April 4, 2012

So Randy Dunlap is retiring as maintainer of the linux-kernel Documentation/ directory and he asked me if I wanted to do it, so this happened. I need to reinstantiate my kernel login, but probably not before signing giant stacks of house-related paperwork on thursday, and then Fade's relatives visit over the weekend, and I'm trying to get the toybox and aboriginal stuff caught up...

And I think I may need a caffeine detox. The energy drinks have stopped working again...

Busy month.

March 29, 2012

Why do I think busybox has lost the plot?

I needed a tftp daemon for work, and turned to busybox as the trivial solution to the problem. According to the --help "./busybox tftpd /path", right?

Wrong. It kept spitting the help back out at me, without ever saying what was actually wrong. Eventually I dug into the source code and found this (in tftp.c, not tftpd.c, and the source remains an unreadable forest of #ifdefs as usual):

our_lsa = get_sock_lsa(STDIN_FILENO);
if (!our_lsa) {
     /* This is confusing:
      *bb_error_msg_and_die("stdin is not a socket");
      * Better: */
     /* Help text says that tftpd must be used as inetd service,
      * which is by far the most usual cause of get_sock_lsa
      * failure */

No, the help text says it "should" be used as an inetd service, not that daemon mode has been completely removed. The old message was better.

Not having daemon mode is annoying: Ubuntu is upstart based and I haven't looked into adding services to that without rebooting, nor do I want to try to install inetd alongside it and hope they don't fight.

But fundamentally: the point of inetd was to have a single running binary that used a small amount of memory, which spawned short-lived instances of the other daemons to service each request, then they exited while inetd waited for new connections. (This was back in the 1970's, before the widespread use of swap space and dynamic linking with lazy binding, or even faulting in pages as-needed for static binaries.) Having a daemon running pinned memory, so inetd made sense. These days daemons like Samba and Apache are much larger than their historical counterparts, but we've got way more memory and we have swap space to flush them into when they haven't done anything recently.

Now let's look at inetd in the context of busybox: if inetd is in busybox, and tftpd is in busybox, you have busybox launching another instance of itself, just like daemons do. Having it be two applets accomplishes what, exactly? It's just an excuse not to factor out the accept() code into libbb and make daemon mode cheap.

Even worse: the busybox help text gives two examples, one for inetd and one for udpsvd, both of which are in busybox. I.E. it's got redundant implementations of the same darn functionality, which has no business being a separate app. (Note: netcat server mode can do this too. Netcat server mode at least has a _reason_ to be able to do this, being a general-purpose tool and all.)

As for my "simple, quick and dirty solution", I just downloaded tftpd-hpa which has daemon mode and doesn't rely on any external packages in the ubuntu repository, and made a note to write a tftpd in toybox someday.

March 28, 2012

Huh. I think Ulrich "death to static linking" Drepper is no longer Glibc maintainer. Not only did he move to Goldman Sachs, but according to Roland McGrath the glibc steering committee just dissolved.

March 27, 2012

The Fedora bug appears to involve gcc building with "--target=i686-unknown-linux", "--build=i686-walrus-linux", and "--host=i686-unknown-linux". Since the machine it runs on and the machine it produces binaries for are the same, it's linking stuff against the target library and trying to run it on the host, even though I told it the _build_ machine is different.

Note: this is a bug. We're cross compiling a program that runs on machine X (the conventional "target", you supply an existing cross compiler that produces these binaries, and the _only_ way we should have to specify this is by supplying the cross compiler). The program happens to be a compiler, which produces output for target Y. You should never have to tell the build what kind of machine the build host is: you can check compile-time macros if you really care, but mostly you shouldn't.

The fact that gcc doesn't think this way is because the designers of its build system were insane, and created a giant pile of overcomplicated crap that serves no purpose.

So when building gcc, --host actually means target, --build means host, and --target means output type for binaries generated by the new compiler. I told it --build is not the same as --host, I.E. host != target, so the fact it's trying to link host binaries against target libraries is a bug.

Now to dig into why it's doing that...

March 26, 2012

Catching up on The Rachel Maddow Show: the Friday March 16 show was "All Abortion All The Time", at least until I gave up and skipped to monday, more than halfway through the episode.

Monday the 19th was a special report on "Bad Science About Nuclear Power". At the 6 and a half minute mark: she showed a picture of Osama Bin Laden and Some Other Guy with A Big Beard, while explaining how Eisenhower's "atoms for peace" thing from the 1950's was somehow responsible for them wanting nuclear weapons. (I thought the US dropping bombs on hiroshima and nagasaki sort of clued the rest of the planet in that nukes were a thing.)

I think Rachel's point is that if Eisenhower had kept nuclear power a secret, nobody else would ever have discovered it, or something? Because obviously after we used nukes to end World War II, nobody else ever would have followed up on that. "Gee, all this radiation, just like Marie Curie was researching last century, I wonder if that's involved somehow."

It's sad that whenever nuclear power comes up Rachel's brain seems to shut down and go into the same kind of "the evil must be destroyed without question, don't look at it or you'll turn into a pillar of salt" mode the right wing pundits spend all their time in. I really used to like this show...

Ok, let's skip ahead to the most recent show... and her first piece is on the Gabby Giffords shooting. Well, at least it's a new topic. Not sure it's news at this point, but ok... And it looks like it's working up to another "and there's nothing you can do about it" ending.

I miss Keith Olbermann. He was the lightning rod for all the "mad as hell" impotent rage, and Rachel could spend her time being informative. Rachel trying to rile up her audience is no fun to watch. (Too bad the other new spinoff shows haven't got feeds.)

Nope, didn't make it through the gun segment. After she'd gone on for 400 years about the inevitability of our lovecraftian demise with "stand your ground" legislation spreading through our precious bodily fluids, I closed the window and deleted the files I'd downloaded. She's just gotten too depressing to watch.

At least there's still Jon Stewart and Colbert, but I can't download those and watch 'em offline. (Well, not as easily.)

March 25, 2012

Somebody left a newspaper on a table at McDonald's, which mentioned that Cheney had a heart transplant. This implies Bush got a brain and Obama may have gotten some courage. We can only hope.

Biked to Chick-fil-a yesterday, today I biked to the coffee shop in Fry's. I need waaaay more exercise. (According to the time and temperature signs I passed, it was 55 Fahrenheit. In direct texas sun, it was more like eighty, but that's still better than summer. Stopped twice to apply sunblock.)

I got Aboriginal to the point where I can build an x86-64 image on my laptop, and then build all the targets under that on quadrolith, and they pretty much work. (Or at least gives the expected failures, which are generally qemu things. Half of 'em are that qemu's serial initialization on powerpc and sparc and such eat an unlimited amount of info, so feeding in a script as a here document, even with significant whitespace padding at the start, doesn't work. Once upon a time I was writing an expect implementation in shell, I should get back to that at some point...)

(The fiddliness with expect is that shells don't have the idea of a circular pipe: you can't have the input and output of process X go to the output and input of process Y. The closest I've come is to mkfifo and connect the start and end of the pipeline together via the FIFO, but that leaves trash in the filesystem.)

Meanwhile, I set quadrolith building Linux From Scratch under all those targets (the server is called quadrolith because the first one was monolith and its successor was duolith... the fact it's 4xSMP is actually coincidental). And since I haven't got net access at Fry's I can't bang on the Fedora bug from here either (which _might_ have been fixed by fixing the brown-paper-bag bug slashbeast found, or at least I kicked off a rebuild of the native-compiler stuff to see if that's the case), I'm back banging on toybox for the moment.

So toybox: I've got like three major logjams trying to digest Georgi's patch pile, and the most tractable of them is dirtree. Having seen the fts functions and the scandir() stuff, I now have a reasonable idea of what dirtree needs to look like. I've got to wean it off of toybuf and PATH_MAX in general, and switch to something that uses openat() and fdopendir(). I worked out some quick sample code to confirm I'm using the suckers right:

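#include <stdio.h>
#include <fcntl.h>
#include <dirent.h>

int main(int argc, char *argv[])
{
  // List the directory named on the command line, defaulting to "."
  int fd = open(argc > 1 ? argv[1] : ".", O_RDONLY);
  DIR *dir;
  struct dirent *d;

  dir = fdopendir(fd);
  printf("dir=%p\n", (void *)dir);
  if (!dir) return 1;
  while ((d = readdir(dir))) printf("name=%s\n", d->d_name);
  closedir(dir);

  return 0;
}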
I added a close(fd) at the end there and it returned -1 so the man page was right when it implied that fdopendir() took custody of the filehandle. Specifically, closedir() closes it so I'm not leaking them if I don't. (Recursing into big directory trees, I need to care about this sort of thing.)

The other thing I'm adding is dirtree_path() which takes a struct dirtree *node argument and traverses up to the root of the tree to assemble a full path. The fact we only assemble this path on demand (and then if we need it to be absolute we feed it to realpath()) is one of the big advantages this has over the fts stuff: if you tell that to assemble a tree for a big directory it's gonna eat buckets of memory with all the paths.

Hmmm... there's no lstatat(). There's an fstat(), but I need to open the thing first which means symlink resolution's already done. Either I can call open twice (once with NOFOLLOW), or I can use readlink first... Ah, there's a readlinkat(). And readlink() returns EINVAL if it's not a symbolic link, so I can use that to test whether or not it is. Ok, let's do that then.

It's slightly racy since the readlink() and open() could have something happen between, but... Hmmm. Ah! Do openat() with the NOFOLLOW flag, and then only readlinkat() if we couldn't open. This means the stat buffer isn't filled out when we read link info, but the only thing we're missing there is the date stamp on the symlink, which we can't _get_ in the absence of lstatat() anyway...

Hang on, posix 2008 has included openat() and readlinkat() so there's no way they'd have left this hole. (If _I_ can spot it... there's a difference between Microsoft and IBM paying off the committee to leave holes large enough to drive NT and OS/360 through, and missing this sort of thing.) So let's look at posix's lstat() page...

There is an fstatat(), with an AT_SYMLINK_NOFOLLOW flag. Beautiful.

Grrr. "ls -a" shows "." and ".." entries, and I'm filtering them out because recursing becomes _insane_ otherwise, and 99.9% of the time you don't care. (Modern filesystems don't actually store them anymore, they're handled at the VFS level. I need to think about how to deal with that...)

March 24, 2012

Bunch of weird stability issues in Aboriginal Linux, which I'm cleaning up. The sanitize_environment stuff (to unset every environment variable that isn't recognized, because they can break the build) works by assembling a big long comma-separated list of allowed variable names in the environment variable "TEMP", and then iterates through the environment variables and unsets every one it doesn't recognize.

Somebody on freenode ("slashbeast", I'd credit him in the changelog if I knew his actual name) had the build break for him because TEMP wasn't in the whitelist. Oops. Brown paper bag bug, that. (It worked if TEMP wasn't an existing variable, and was thus at the end of the list. If TEMP _was_ an existing variable that got replaced, when sanitize_environment saw it, it would blank it... and every variable after it since the whitelist was now empty. If TEMP was before PATH in the environment variable list, bad things would happen, which is what happened to slashbeast.)

Another really weird one is that when building the kernel, "make allnoconfig KCONFIG_ALLCONFIG=filename" works, but "make allnoconfig KCONFIG_ALLCONFIG=<(shellfunction)" only works SOMETIMES. It might have to do with /dev/pts being mounted, or a bash 2.x vs 4.x thing... Sigh. I think I'll just write the tempfile and not worry about trying to make the guts of bash reliable. (I can revisit this when I write my own shell in toybox and make the build use it: fixing bugs in an obsolete version of bash isn't interesting.)

Oh, and then there's EXTRACT_PACKAGE. It runs tar in a subshell, so it can pipe the output to dotprogress (a shell function that reads in filenames and outputs one period for every 5 filenames. This turns a "scrolls by so fast you can't see it, for pages and pages" tar output into a reasonable if not ideal progress indicator. Making an actual progress bar out of that would involve knowing ahead of time how many entries to expect so you know where 100% is, which is a maintenance nightmare in the presence of package upgrades).

The problem with running tar in a subshell is if it _fails_ (due to a truncated archive or some such; yes we're checking sha1sum but that doesn't handle ALT packages or a packages directory populated by something other than, and what if the disk fills up during extract, or the user kills tar with killall or ctrl-C?), the subshell needs to propagate the failure up to the parent. Except the subshell is in a pipeline. A subshell in a pipeline that needs to pass out exit status is kinda funky, and -o PIPEFAIL didn't quite cover it for some reason.

My solution involved caching $$ and then calling kill on it from the dienow function. Guess what I forgot to include in baseconfig-busybox because it's not used in the normal flow of operations and thus didn't show up in output? That's right, the kill command! So it blithely continued on and wrote the sha1sum file, which was then accepted as gospel later on even though the archive had only extracted halfway, and it didn't correct itself on future runs until I did an "rm -rf build/packages" to zap the corrupted package cache.

Obviously I want to put kill back in, but I'd also like to make the failure path more naturally detected. So I only want to create the sha1sum file for the extracted package within the subshell. If tar exits successfully, touch the file in the subshell, and then once we exit the subshell the parent can test that it exists. (Yes, it's IPC through the filesystem, but it works. And before you ask: using NFS ever for anything is pilot error.)

Problem: extract supports parallel operation (FORK=1 EXTRACT_ALL=1 ./, and is version-independent (so it can't depend on any _specific_ filename tarballs extract to). So each tarball extract happens in its own temporary directory, and then the directory created under that gets renamed to the expected destination name in the package cache. We don't know what the name of that new directory created by extracting the tarball actually is (it contains version information in formats that vary all over the place, every 2.4 linux kernel was just "linux", squashfs doesn't have a dash between name and version number, gcc-core extracts into "gcc", etc). So we wildcard it away: as long as there's just ONE subdirectory in the temporary directory, we don't care what it's called. If extracting the tarball creates more than one thing in the temp directory (and thus the wildcard expands to multiple things), then the fact the destination we try to rename the wildcard to isn't a directory (because it doesn't exist yet) breaks the build in an early and obvious way, and we know "hey, you forgot to say --prefix=linux/ when you did your git archive versionname | bzip2" or some other way you got a bad archive.

Anyway, so I want to touch a new file "$TEMPDIR/*/sha1sumsfile" from the subshell, which would actually have to be "$TEMPDIR"/*/sha1sumsfile because wildcards aren't expanded in quotes but we need the quotes in case the absolute path in $TEMPDIR contains spaces.

The reason this doesn't work is that even though $TEMPDIR/* expands to a unique directory, $TEMPDIR/*/doesnotexistyet does not exist. (Because we're trying to create it.) So the wildcard isn't expanded, and touch complains it's trying to create a file in a nonexistent directory named "*".


The fix seems easy enough:

touch "$FILE"/doesnotexistyet

Except wildcards aren't expanded in variable assignments either, we need FILE="$(readlink -f "$TEMPDIR"/*)" which is well into the realm of black magic at this point. (I'd happily use echo instead of readlink -f, but I don't trust it. What does "echo *" do if one of the files in the current directory is named "-n"? Did you notice that echo doesn't support -- to end argument parsing? Oh, and $() trims trailing newlines even when you put quotes around it, presumably because $(echo hello) actually has a newline on the end. But I'm just going to assume no package tarball is crazy enough to create a directory name that ends with newline characters.)

WHY all this has to be that way is sadly non-obvious, other than "it's what I need to do to make it work". Half this blog entry was written so I could figure out how to phrase the darn comment, because oh boy does that need one.

If you were wondering about the difference between "mature" software and "it worked for me"... it's all this sort of stuff. Never seems to run out. The sanitize_environment stuff was added to guard against variables that broke the build for people who weren't me. The immune system code broke the build in other ways, even though it also worked for me. Trimming down busybox instead of building defconfig protected against build breaks in the ever-growing katamari of busybox, but we're still adding back stuff seldom-used error paths need. And the KCONFIG_ALLCONFIG works as a file but not a pipe is something I hit when rebuilding the code under itself (because my gentoo server has emerged its way into a corner and the compiler internal-compiler-errors all over the place now, and rather than spend the weekend reinstalling it I decided to just grab a root-filesystem-x86_64 and build under that: which turned out not to work. Been a few months since I tried it. Linux From Scratch still builds, but the kernel didn't...)

Imagine the regressions if I was doing active development on this thing. Speaking of which, I wonder if m68k works in qemu yet...

March 22, 2012

I am informed that Ulrich Drepper no longer works at Red Hat. I'm not sure his new employer is a step up.

And for future reference, running qemu with display on a remote server means doing this on the server:

qemu-system-386 -m 2048 -hda fc16.img -vnc"

And then locally going:

ssh user@server -CX vncviewer

The client side can be killed and restarted as many times as you like, it's just a view into the server's framebuffer (with attached keyboard and mouse).

March 21, 2012

Installing Fedora 16 under qemu to reproduce Denys's bug turns out to be a bit of a pain. It didn't like 768 megs of ram, so I tried on the server where I can give it 2 gigs. Then it managed to have three different bugs until somebody on the #fedora IRC channel pointed me to an A) updated B) xfce ISO image. That installed much more easily (Gnome 3 sucks mightily).

Then I had to install glibc-devel, glibc-static, gcc, and patch to get a reasonable development environment. (Remember, Ulrich Drepper works for Red Hat, therefore static linking got chopped out into an optional package because he really hates the concept.)

In theory, once that finishes the build gets a little easier. In practice, it died with a "No space left on device" error because I only gave Fedora EIGHT GIGABYTES of disk space and let it use its defaults for the install. (What a pig.)

And attempting to figure out how the disk space is USED is hard, because fdisk says there's one partition, which is a device mapper entry.

Ok, let's play this game: df /dev/dm-0 says it's a 1 gig devtmpfs, same with /dev/dm-1. (Why are there two devtmpfs instances? How do you have a PARTITION be devtmpfs? That's a synthetic filesystem derived from tmpfs which is derived from ramfs! That's like a partition coming back procfs or sysfs.) Let's try "ls -l /dev/disk/by-label", which returns one entry: "ext4" which is a symlink to "../../sda2" (which fdisk didn't list), and "df /dev/sda2" says it's /boot and half a gig (bit wasteful: I let it auto-partition). This implies there might be a /dev/sda1 (even though fdisk says there wasn't), and df says it's another (the same?) 1 gig devtmpfs. Same with /dev/sda3.

Results of investigation: Fedora is insane.

Looks like I have to try again, giving qemu a 16 gig image. (The host has a terabyte drive, presumably Fedora can only waste so much space before becoming usable or nobody would be _able_ to use it.)

March 20, 2012

Spent last night at the coffee shop in the middle of Fry's reviewing the mode parser Daniel Walter contributed to toybox (getting it about halfway cleaned up). I am also hugely behind on Georgi's patch stack, but working on it. (As I posted to the list: the work Georgi's doing keeps raising design issues, and I need to resolve those. For example, we have three different directory traversal function sets: cp uses my readdir() based lib/dirtree.c, ls uses scandir(), and Georgi's patch stack has commands using the fts_open() family.)

Meanwhile the 3.3 kernel just dropped, so it's time to push out Aboriginal 1.1.2, meaning I need to test that on all targets, and install Fedora Core 16 to debug Denys's weird build issue (more host/target confusion leaking through because the host distro changed), and debug the mdadm build issue (meaning I need to download the mdadm build script that's having the issue and reproduce it from source). Plus I have pending todo items about wrapping cpp and distcc not being reliable, which I've sadly neglected far too long.

Oh, and the more/ stuff (which I'm using again for testing toybox) turns out to always have been subtly screwed up trying to record the build (which itself edits the $PATH, sort of the point of that stage), and when a toybox change broke host-tools I went down the rathole of cleaning _that_ up. (The problem there is really that $OLDPATH is used to mean two different things, which used to coincide but it was a coincidence...)

I haven't even _looked_ at the new uClibc release yet, but that should go in too.

Wound up being up until midnight last night, meaning I slept through my alarm (well, went back to sleep) and didn't get any morning programming time in. Spent an hour dealing with financial paperwork, now back to daylight jobbery...

March 17, 2012

Fried enough that I took Saturday to recover and didn't do any programming at all.

March 16, 2012

"GNU/Linux" is an oxymoron: Linux is GPLv2 only, GNU is GPLv3 or later, the two projects cannot share code. Either it's "mere aggregation", or it's a license violation. Pick one.

March 15, 2012

I've got an extremely active contributor to toybox (Georgi) who I basically can't keep up with. They've tested toybox in various build environments I can't currently reproduce (macosx, android bionic, musl) and sent various patches.

This discussion resulted in me yanking the "#define _GNU_DAMMIT" from lib/portability.h (because the FSF has no claim on anything I'm doing and if I can _break_ compatibility with the Hurd while remaining standards compliant I will do so, the same way I sprinkle c++ keywords into my C99 code as variable names and such. I refuse to #define _ALL_HAIL_RECHARD_STALLMAN in code that not just ISN'T GNU, but is actively ANTI-GNU.)

I only had the macro in there as a temporary hack to get dprintf() (which I can live without) back before I mothballed the code, and I replaced it with the feature test macros that SUSv4 actually recommends. (Note that SUSv4 doesn't _require_ feature test macros, this incredibly stupid idea only contaminated the standard for "strictly conforming" applications, whatever that means. It's like the "-pedantic" compiler option, I think.) Of course, this breaks stuff.

Let's back up: feature test macros. The glibc headers nominally require you to #define various things in order to get all the function prototypes and constants the headers actually define. Except they don't _actually_ require it: if you NEVER MENTION ANY FEATURE TEST MACROS in your program, then the headers detect this (with a couple of big #ifndef blocks in features.h) and it gives you _BSD_SOURCE, _SVID_SOURCE, _POSIX_SOURCE, and sets _POSIX_C_SOURCE to 200809L, all automatically.

I.E. most people will NEVER HAVE TO CARE ABOUT THIS. But if you _do_ define any feature test macros yourself, it switches all that stuff _off_ unless you do it manually. I.E. touch the knobs and you disable the autopilot and crash horribly. So in general, people don't do this.
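
The autopilot behavior is easier to see as a toy model. This is a Python sketch of the rule as described above (mention nothing, get the defaults; mention anything, get only what you asked for), not a transcription of glibc's features.h. The names DEFAULTS and effective_macros are made up for illustration.

```python
# Toy model of the "autopilot" rule described above. This is NOT glibc's
# real logic, just the behavior as the post describes it.

# What you get for free when the program defines no feature test macros.
DEFAULTS = {
    "_BSD_SOURCE": 1,
    "_SVID_SOURCE": 1,
    "_POSIX_SOURCE": 1,
    "_POSIX_C_SOURCE": 200809,
}


def effective_macros(user_defined):
    """Return the feature test macros in effect for one source file.

    user_defined maps each macro the program #defined to its value.
    """
    if not user_defined:
        # Never touched the knobs: autopilot stays on.
        return dict(DEFAULTS)
    # Touched any knob: autopilot off, you get only what you asked for.
    return dict(user_defined)


print(effective_macros({}))
print(effective_macros({"_POSIX_C_SOURCE": 200112}))
```

In this model, a program that defines only _POSIX_C_SOURCE loses _BSD_SOURCE and _SVID_SOURCE, which is exactly the "touch the knobs and crash" failure mode.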

There's one special feature test macro, _GNU_SOURCE, which is the Great Big Button That Switches On Everything. In features.h switching that on automatically #defines ten other feature test macros. Everybody who ever bothers with feature test macros just presses the big Go Away button. In fact things like the unshare(2) man page say "#define _GNU_SOURCE" before including sched.h, even though on an Ubuntu 10.04 LTS development system the actual header guard (in bits/sched.h) is __USE_MISC which gets #defined by _either_ _USE_BSD or _USE_SVID (both of which you get for free if you never mention any feature test macros in your source). I.E. the man page tells you to #define something you don't need to define, which would be the wrong symbol to #define _anyway_, just because it's the big Go Away button for feature test macros.

The unshare() function is part of Linux container support, implementing things like network namespaces. The kernel developers invented unshare() in response to rejecting the OpenVZ patches: Linus and company designed a different interface for the kernel to do container support, and the namespaces bit is via a brand new system call.

The FSF has nothing to do with this, and is not even aware of it. The Hurd does not support anything remotely like this.

Needing to say "_GNU_SOURCE" to get unshare() is _really_ _stupid_. The man page there was simply wrong, if you used unshare() without #defining anything, it worked fine in Ubuntu 10.04.

In current ubuntu, they "fixed" that, so you DO need to define _GNU_SOURCE in order to get unshare(). They took the bug in the man page, and changed glibc to match. Right, I can just #include directly and include the unshare() system call prototype directly in my code to work around the glibc regression.

So now we come back to MacOS X (which hasn't got _GNU_SOURCE), and toybox, which is explicitly opposed to the FSF (the project's license is BSD because GPLv3 was that bad).

And then there's musl, which ONLY IMPLEMENTED _GNU_SOURCE (not any of the other feature test macros like _POSIX_SOURCE which are actually _mentioned_ by SUSv4) because nobody ever tries to _use_ granular feature test macros: they just hit the big Go Away Button.

Both musl and uClibc require you to #define _GNU_SOURCE to get stpcpy. That link is to stpcpy in posix 2008: it ain't GNU and requiring _any_ #define to get it is a bug.

tl;dr: Feature test macros are a bad idea.

March 14, 2012

Well of course Ubuntu is displacing Red Hat. Red Hat abandoned the developer workstation in favor of focusing on the enterprise market, and developers are going to deploy on what they developed on.

Red Hat abandoned its original workstation distro after Red Hat 9 because it figured out how Sun made money. Government and Fortune 500 procurement contracts often cap a bidder's profit at a fraction of the cost of materials, so these bidders spec the most expensive materials they can. They'd rather make a $500 profit off a $5000/seat Solaris license than a $3 profit off a $29.95 retail boxed copy of Red Hat. The engineers would rather use Linux, but the sales force and management insisted on the more expensive component to pad the bottom line. (And of course convinced the clueless _customers_ that obviously Solaris must be superior, since it was so much more expensive. For more on large institutions "protecting" themselves in the foot with both barrels, see this excellent article by Joel Spolsky.)

When Red Hat figured this out (You mean if we raise the price they'll buy _more_ copies?) they came out with the ridiculously expensive "Red Hat Enterprise", everybody switched to the better technology and suddenly their "enterprise" division made ten times as much money as the rest of the company combined. This "tail wagging the dog" situation sucked all Red Hat's attention away from the developer workstation market, which they decided to unload on the open source community in the form of Fedora.

When Red Hat did this, they had a little over 50% of the Linux installed base: desktops, servers, everything. It's what Linus Torvalds himself ran. The default Linux was Red Hat. And then they abandoned the developer workstation.

Unfortunately for Red Hat, Fedora didn't work. They were sort of aiming at something like Debian (without the endless flamewars), and clearly hoped that the community would pitch in and do lots of work for free as unpaid Red Hat employees, without asking for any significant voice in controlling the project. But this is not how open source development works: developers push projects in the direction those developers want the project to go. People are most likely to get out and push when they're trying to steer.

The first year of Fedora development was well-summarized in a parody IRC log: the tension between Red Hat's employee engineers and the would-be volunteers eroded much of the initial momentum. In year five Red Hat vetoed the demand for an independent Fedora Foundation (and in doing so crushed most remaining third party interest). In explaining why no Fedora Foundation would be forthcoming, the main point was "Red Hat has veto power over decisions" and would not be giving it up:

Red Hat *must* maintain a certain amount of control over Fedora decisions, because Red Hat's business model *depends* upon Fedora. Red Hat contributes millions of dollars in staff and resources to the success of Fedora, and Red Hat also accepts all of the legal risk for Fedora. Therefore, Red Hat will sometimes need to make tough decisions about Fedora.

This is a bit like Mozilla back when AOL/Netscape couldn't let go of it, which Jamie Zawinski post-mortemed in his resignation letter. The whole thing's an excellent read, but let me quote a little bit:

The truth is that, by virtue of the fact that the contributors to the Mozilla project included about a hundred full-time Netscape developers, and about thirty part-time outsiders, the project still belonged wholly to Netscape -- because only those who write the code truly control the project.

A similar problem occurred with every open source project Sun ever tried (Java, OpenSolaris, OpenOffice... often exacerbated by license issues and copyright assignment). The company wanted to retain control, but attract volunteer contributors. They wanted obedient galley slaves, free to row in unison alongside their employees but not allowed to nudge the project off course. Open source doesn't work that way.

Red Hat's utter failure to allow outsiders any control over Fedora reduced Fedora to "Red Hat Enterprise Rawhide", I.E. a beta release of the next Red Hat Enterprise, nothing more. In fact, Fedora became _so_ uninteresting that the Centos project was launched to do an open tracking fork of the more interesting Red Hat Enterprise.

(How uninteresting is Fedora? Wikipedia considers 70 Ubuntu-based distributions "notable", compared to 19 Fedora based and 11 Red Hat Enterprise based. Those 19 Fedora distros include Fuduntu (combining Ubuntu and Fedora), Yellow Dog Linux (a decade old PowerPC distribution from the days before Red Hat enterprise), sponsored projects (K12LTSP is based on Fedora because Red Hat gave them money), corporate-based distributions such as Intel's defunct Moblin project (merged with Maemo to form Meego, and then superseded by either Yocto or Tizen depending on which order you rank the Linux Foundation's quest for relevance), and of course Red Hat Enterprise itself.)

Another failure mode illuminating Fedora's loss of market share was the Solaris x86 debacle. Sun Microsystems' management alienated their development community by repeatedly killing the x86 version of Solaris, because why would anyone deploy that on a server? But Sun workstations cost tens of thousands of dollars, so developers didn't write software on them. Instead they bought cheap x86 workstations, installed Solaris x86, and then ported the results to Sparc hardware when it was done.

The loss of the workstation market shrank the Sun developer pool. The repeated attacks on Solaris x86 strangled it. But Red Hat voluntarily _abandoned_ the Linux developer workstation market.

Red Hat's enterprise distro is not appealing to developers, it's a deployment environment for large conservative institutions. When Ubuntu came along and sucked the developer workstation market away from Fedora (not even bothering to field a server version for its first few releases), Red Hat gradually became a less appealing deployment environment, because it's not what Linux developers write software for.

Red Hat is under fairly standard disruptive attack. Its brand equity among developers has eroded, it's "pointy hair linux" now. It's the distro management uses to run Cobol programs. Small projects deploy something else, and when they grow to be big projects they stay with that something else. Red Hat is still milking brand equity from a dozen years ago.

March 9, 2012

Downloaded yesterday's Rachel Maddow Show to see if it had stopped being "All Abortion All The Time" (with interludes of Bad Science About Nuclear Power). Alas, no. The first 20 minutes or so of the show: solid abortion coverage, then a guest she could interview about abortion. Sigh.

I'm pro choice, but somewhere between the 3-hour special on Dr. Tiller and the entire episode devoted to the history of the Kansas anti-abortion movement, I got kind of tired of listening to Rachel go on about it, and stopped watching. Keith Olbermann's job was to be mad as hell and not take it anymore, and he was a lightning rod that let Rachel be calm and intellectual. Now she's trying to rile up her base, and it's exhausting to watch.

Luckily, vlc has a "playback->faster" option that makes the show chipmunk its way through topics, and her last interview of the show was on an interesting topic, which they danced around but didn't quite directly address:

Rich people get rich by cornering the market: not just selling stuff but preventing _other_ people from selling the same thing, thus driving down the price and taking away their volume. Even the relatively good guys like Warren Buffett talk about how great it is to have a "moat around a business".

The current republican "party of the rich" is all about cornering the market. All the New Jim Crow laws are about cornering the market on voting. Taking away rights from poor/gay/female/young people is about cornering the market on rights, so elderly white men are disproportionately represented. The republicans are trying to corner the market on power.

If you don't have the metaphor of "cornering the market", you can't understand the current Republican strategy. They think they're playing a zero-sum game, where "I can't succeed unless I fence you out". I.E. "for me to win, you must lose."

Their attacks on contraception or freakout about gay marriage make no sense unless you understand this fundamental tenet of their worldview: anybody else doing well in any way is an existential threat to the mindset of this group of elderly white men. They must be surrounded by suffering to feel good about themselves. Only then are they better than everyone else.

This leads to horribly stupid policies. If a meteor was hurtling towards the planet republicans wouldn't try to _stop_ it, they'd try to build domes so the chosen few could survive. The Tories are going out of their way to dismantle an existing successful health care system because tearing other people down is part of their definition of progress. If the world is a zero sum game, all you have to do is attack your enemies and rewards for you must follow; simple math. The last man on earth must be a trillionaire.

March 8, 2012

Somebody asked me if replicant is relevant to my interests. It is not. I see it as about on par with Ututo or maybe gNewSense.

Let's start with pragmatism: ReactOS is doing an open clone of windows (essentially porting Wine to the bare metal). Ever heard of it? Know anybody who's used it?

I'm an ex-OS/2 developer: preinstalls matter. Google licensees are shipping something like 100 million android devices per quarter, and providing those guys with a native development environment is a huge opportunity. Replicant _might_ manage 10,000 installs over the project's entire lifetime, all aftermarket.

Cyanogenmod is upgrading Android to be more useful, and even they've got a tiny fraction of stock Android's market share. Replicant is offering _less_ than stock Android, removing working code they disagree with the license terms of. Nobody who just wants to USE their phone has any reason to do this.

Again, BusyBox predates Android. Toybox is not competing with BusyBox on Android: Google has had over five years to start shipping BusyBox in Android and hasn't done so. The shortest path to getting android users more command line functionality than Android's toolbox provides is to write new code compatible with Android's existing license policies, as issued by Google and adhered to by Android device manufacturers. This is not based on "how an ideal world should work". This is starting with existing reality and plotting a course through it.

(Yes, the toolbox/toybox name thing is confusing, but I'd like to point out I named my project before the first Android phone shipped. :)

March 7, 2012

The Linux Foundation is a really strange bureaucracy. They hold some random invitation-only conference, and then they send out spam emails inviting everyone to "request an invitation". No, really!

One of us is unclear on the concept of "invitation-only". Actually, I think one of us is unclear on the concept of "open source". (And I see having bronze silver and gold sponsors weren't enough, they needed platinum too, because that's the _important_ part.)


March 6, 2012

I got interviewed by h-online about Toybox.

March 5, 2012

My old /usr/bin vs /bin rant has been cleaned up slightly and published by Hacker Monthly.

They integrated my corrections about the actual drive sizes: / was a fast but tiny 0.5 meg disk, and RK05 disk packs (on /usr and eventually /home) were very slow 2.5 meg external beasties (not 1.5 megs). So the "3 whole megabytes" line was right, but I had the mix wrong in my 2010 post. (Dennis Ritchie's website has primary source material on this if you know where to look, I pointed 'em at a couple citations.)

I need to get back to working on my computer history book. I need to tackle my aboriginal linux todo list. I need to get toybox finished. I need to start qcc. I need to get hexagon linux booted on my old nexus one. I need to fix chroot in the kernel. I need to do the "hello world" kernel refactoring. I need to start another penguicon/linucon-style convention. I need to start a podcast to get rid of unused ideas. I need more hours in the day...

March 3, 2012

toybox 0.2.1 is out.

I screwed up the tarball slightly (didn't pregenerate generated/help.h) so it requires python on the host to build. I'll probably do a 0.2.2 next weekend to catch up with the patch backlog and fix that.

March 1, 2012

Still haven't _quite_ got a toybox release out, too much other stuff eating my time. Thinking this weekend...

Got the roadmap updated a bit though. That should replace most of the random todo list stuff, _and_ be the place listing currently supported commands each release. That's also the place to stick this sort of thing...

I feel bad about the contributors whose patches have gone unmerged this week, though. Sorry, I'm catching up as fast as I can! (The roadmap has a list of "probably done" commands, and of course I'm re-auditing them all to make sure they _are_. Well a lot of 'em I haven't looked at in 3-4 years...)

February 27, 2012

Heh. Forbes asks "Is the cloud catching up with mighty Oratroll?" and never once mentions in-memory databases or the NoSQL movement.

My reaction to the article is "Couldn't happen to a nicer patent troll", but a quick check of my blog didn't find an explanation of what in-memory databases _are_, and wikipedia is useless here. Since that's what's really killing Oracle, here's a quick primer:

An in-memory database is what you get when all your tables fit in RAM at once. IBM's Structured Query Language (SQL) came from its mainframe R-series databases of the 1970's, which were designed around the assumption you only have enough memory to load one record from each table at a time. So it would load one record from disk out of each table, compare them together, and immediately write the results back to disk. All that "stream" and "join" stuff is based on the idea that you can't possibly have more than one record from each table in memory at once, and seeks are bad for performance!

This stopped being relevant about 20 years ago. Moore's Law kept doubling memory sizes, and by about 2000 it was starting to become feasible for small databases to keep the entire thing in memory at once and index everything with hash tables, providing literally a 1000 times speedup over disk access.

In-memory databases are not just three orders of magnitude _faster_, they're also correspondingly _simpler_. Suddenly, your "tables" were just python dictionaries, and the entire database program became a thousand lines of python (800 of which implemented the SQL parser; I saw such a program on sourceforge in 2001).

The obvious implementation goes like this: start with a snapshot of your database (a gzipped Python "pickle" file of your hash tables, for example). Every "write" transaction (a query which updates any records anywhere), gets appended to a log file. (You can fsync() the log, it's linear streaming writes so should be pretty fast.) If the power fails, reload the snapshot and replay the log.

When the log gets uncomfortably long: fork the database. The parent process closes the old log file, opens a new one, and continues about its business. The child starts writing a new snapshot, freeing its memory as it does so.

If the power fails before the child finishes writing the new snapshot, just read the old snapshot and replay both logs in sequence. When the child finishes writing, it can archive/delete the old snapshot and old log because the next shutdown replay doesn't need them.

The interesting bit of the above is that due to the way "fork" works, this doesn't take twice as much physical memory. The parent and child share copy-on-write mappings of all the underlying physical memory, and as long as the child is freeing its copies as fast as the parent is dirtying pages, the system shouldn't run out of memory. Making this take advantage of SMP is bog standard threading/locking.
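
Here's roughly what the snapshot/log/fork scheme above looks like as code. This is a minimal single-threaded sketch, not any real database: the TinyDB class, the snap.N and log.N file naming, and the JSON log format are all invented for illustration. It assumes a Unix host (os.fork), and the child doesn't bother freeing memory as it writes.

```python
import gzip
import json
import os
import pickle
import tempfile


class TinyDB:
    """Dict in RAM, backed by a gzipped pickle snapshot plus append-only logs."""

    def __init__(self, directory):
        self.dir = directory
        os.makedirs(directory, exist_ok=True)
        self.data = {}
        self.gen = 0  # snapshot N holds the state as of the start of log N
        self._recover()
        self.log = open(self._path("log", self.gen), "a")

    def _path(self, kind, gen):
        return os.path.join(self.dir, "%s.%06d" % (kind, gen))

    def _gens(self, kind):
        prefix = kind + "."
        return sorted(int(name[len(prefix):]) for name in os.listdir(self.dir)
                      if name.startswith(prefix))

    def _recover(self):
        # Reload the newest complete snapshot, then replay every log from
        # that generation onward, oldest first.
        snaps = self._gens("snap")
        if snaps:
            self.gen = snaps[-1]
            with gzip.open(self._path("snap", self.gen), "rb") as f:
                self.data = pickle.load(f)
        for gen in [g for g in self._gens("log") if g >= self.gen]:
            with open(self._path("log", gen)) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value
            self.gen = gen

    def set(self, key, value):
        # Log and fsync before applying, so an acknowledged write survives.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def snapshot(self):
        """Fork a child to write a new snapshot; return the child's pid."""
        new_gen = self.gen + 1
        pid = os.fork()
        if pid == 0:
            status = 1
            try:
                # Child: self.data is a copy-on-write image as of fork().
                tmp = os.path.join(self.dir, "tmp-snap")
                with gzip.open(tmp, "wb") as f:
                    pickle.dump(self.data, f)
                os.rename(tmp, self._path("snap", new_gen))
                # The new snapshot makes older snapshots and logs redundant.
                for kind in ("snap", "log"):
                    for gen in self._gens(kind):
                        if gen < new_gen:
                            os.remove(self._path(kind, gen))
                status = 0
            finally:
                os._exit(status)
        # Parent: switch to a fresh log and keep serving requests.
        self.log.close()
        self.gen = new_gen
        self.log = open(self._path("log", self.gen), "a")
        return pid


def demo():
    directory = tempfile.mkdtemp()
    db = TinyDB(directory)
    db.set("a", 1)
    pid = db.snapshot()
    db.set("b", 2)      # lands in the new log while the child writes
    os.waitpid(pid, 0)
    db.log.close()
    return TinyDB(directory).data  # simulate a restart: snapshot + replay
```

The snapshot is written to a temporary name and renamed into place, so a crash mid-write leaves the old snapshot and both logs intact, which is the "replay both logs in sequence" case. One caveat for this particular sketch: CPython's reference counting writes to object headers, so the child dirties pages just by walking the data, and the copy-on-write sharing is less complete than it would be in C.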

This is why the department of defense spent millions of dollars buying a 2.4 terabyte ramdisk back in 2004. It was also a big driver for early adoption of 64-bit systems a year later: people wanted more than 4 gigs of memory to fit their databases in; they could afford the chips but needed more than 32 bits of address space to use it all. Now there's plenty.

The entire "nosql" movement boils down to "Hey, if 80% of our remaining database code is implementing SQL, why _bother_ with that? Why not just use the darn hash tables directly, via shared library or something?" SQL itself is an artifact of the R-Series stream-and-join design assumption, it doesn't fit well with randomly seekable hash tables that return results almost instantly.

If you're familiar with disruptive technologies, this is a textbook one. It started with databases too small to interest the existing players; if your database fit entirely in memory in 2000 it simply wasn't worth Oracle's time. Then medium-sized businesses could do it. Now all Oracle's got left are the largest fish like credit card processing companies and stock exchanges (plus the out-of-touch old fogies who Never Got Fired For Buying Oracle), and when those switch over (or in the latter case die off), what's left?

Oracle's problem is that the decades of "stream and join" design optimization is built on top of an obsolete design assumption, that memory is a transitional state your data passes through because memory is tiny and precious; records must be evicted back to disk as soon as possible. This is no longer the case, and the obsolete read/process/write loop at the heart of stream-and-join means Oracle has the best buggy whips in the world, decades of optimization into hitting the horses _just_right_... and nobody cares anymore.

(The stream-and-join metaphor also doesn't translate well to clusters, so very high end data warehousing really isn't their thing either. Like DEC, they're sandwiched between a relentlessly rising foe and a hard ceiling they can't chip through very fast.)

I've developed a theory that dying business models explode into a cloud of IP litigation, like drowning victims climbing on top of anyone who can swim, pushing them under to buy a few more seconds. SCO did it, now Oracle's doing it, the RIAA/MPAA are infamous for it... This is not a sign of health.

February 26, 2012

Finally got killall.c and kill.c cleaned up and merged.

Fun little corner case: kill lets you do "-s signal" or just "-signal" and lots of signal names begin with s! Thus "kill -stop 12345" gets interpreted as "kill -s top 12345" and it complains "top" is an unknown signal.

To fix that, I added a new option to lib/args.c where you can stick a space after a command letter and it requires its argument to be a separate command line argument, not the remains of the current one.
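
The ambiguity is easy to reproduce with any getopt-style parser. The sketch below uses Python's stdlib getopt to show the "-stop" misparse, then a toy parse_kill() function to show the effect of requiring a separate argument. parse_kill and its s_needs_separate_arg flag are hypothetical stand-ins for illustration, not the actual lib/args.c code or its option string syntax.

```python
import getopt

# A classic getopt parser with "s:" treats the rest of the word as the
# argument to -s, so "-stop" becomes -s with argument "top".
print(getopt.getopt(["-stop", "12345"], "s:"))


def parse_kill(args, s_needs_separate_arg):
    """Return (signal_name, pids) for a kill-style command line.

    Hypothetical toy parser: when s_needs_separate_arg is true, -s only
    counts as an option when its argument is a separate word.
    """
    signal = "TERM"
    rest = list(args)
    if rest and rest[0].startswith("-") and len(rest[0]) > 1:
        opt = rest.pop(0)
        if opt == "-s":
            # "-s NAME": the argument is the next word.
            if not rest:
                raise ValueError("-s needs a signal name")
            signal = rest.pop(0)
        elif opt.startswith("-s") and not s_needs_separate_arg:
            # getopt-style "-sNAME": the rest of the word is the argument.
            signal = opt[2:]
        else:
            # Plain "-NAME" (or "-9").
            signal = opt[1:]
    return signal.upper(), rest


print(parse_kill(["-stop", "12345"], False))
print(parse_kill(["-stop", "12345"], True))
```

With the flag off, "-stop" comes back as signal "TOP", which is the bug. With it on, the same command line yields "STOP", and "-s stop 12345" still works.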

So now I've stuck a toybox snapshot into aboriginal and am building "TOYBOX=toybox CROSS_COMPILER_ARCH=i686 ./ i686". I had to disable ls because it wants "-di" and "-tc" and who knows what else. And it's doing the xargs segfault thing again. Sigh.

Probably not going to get a release out tonight...

February 24, 2012

People keep asking me about Qualcomm's Hexagon in email, and I keep writing up long explanations and then never using them again. So here's the most recent email I wrote on the topic, for posterity. It's been a year and a half, the basics of what we worked on shipped already, the story can come out now.

Keep in mind my contract to work on this stuff expired in October 2010, so these are my vague recollections from over a year ago:

On 02/24/2012 02:28 AM, [REDACTED] wrote:

> Thanks for info. I have already seen this page, but i have tried all
> branches on the 
> and have
> not found any simulator.


They had an in-house simulator that was some horrible proprietary thing they contracted out to a third party to produce, and if I recall right qualcomm's lawyers went out of their way to make sure they _didn't_ get source code because there was a nest of "proprietary is obviously better, duh". I expect that bit them in the ass, because the sucker was useless to the Linux port.

Mostly we just used real hardware. We had these things called "comet" boards that had a snapdragon SOC and ~256M of memory which you could boot and run code on. (No local storage to speak of, I wound up implementing nbd-client in busybox to get some.)

There was an aborted attempt to add hexagon support to qemu (Scott somebody did it, google for "quic qemu" and I think his post to the qemu mailing list is the first hit). Unfortunately, the guy who tried it couldn't wrap his head around TCG (I think he was mostly a manager), and it never went anywhere. :(

Hexagon is a six stage pipeline running at 600mhz, and the clever thing they did was create six different register profiles and round-robin them down the pipeline, so each pipeline stage is totally independent of the others and they don't need any pipeline interlocks. So it _looks_ like a 6-way SMP chip, and they send NOPs down the pipeline when there's nothing to do (which power the circuitry down completely for that clock cycle). It doesn't do any branch prediction or speculative execution or anything because it's designed _not_to_waste_power_, instead each register profile is a separate thread (a bit like hyper-threading, only much simpler). It has ridiculous amounts of parallelism, in addition to having up to six thread profiles in flight at once, instructions are bundled into VLIW "packets", of up to 4 instructions dispatched to 4 execution units. Each execution unit has slightly different capabilities (the last two have the big vector and floating point ops doing the SIMD thing, the first one handles branching, I forget what the second does. They've all got a largeish general-purpose set of instructions they all do too. I had a giant booklet on this explaining the architecture and such.)

As far as Linux is concerned, it's a 6-way SMP chip running at 100mhz, but each cycle it can dispatch up to 4 instructions so you get back closer to 300mhz performance, and then there's the SIMD stuff which makes multimedia things fly. Also, most of the prefetch delays and such happen during the 6 "idle" stages between each batch of 4 instructions getting executed, so it's _really_ good at fairly hard realtime. I'm told that the next generation is going to have a 4-stage pipeline (and thus they're clocking it down to 500 mhz, so it looks like a 4-way SMP running 125 mhz). The lower the clock speed the better the power consumption to performance ratio is, and what they _already_ had was beating the then-current ARM stuff in battery life at a given performance level.
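
The clock arithmetic above, as a quick back-of-the-envelope sketch. The function names are made up for illustration, and the numbers are just the ones quoted in this email, not anything from a datasheet.

```python
def per_thread_mhz(core_mhz, hw_threads):
    """Clock rate each hardware thread sees when the pipeline stages are
    round-robined between threads (one thread per stage)."""
    return core_mhz / hw_threads


def peak_minstr_per_sec(core_mhz, hw_threads, packet_width):
    """Upper bound on millions of instructions per second for one thread,
    assuming every VLIW packet is completely full."""
    return per_thread_mhz(core_mhz, hw_threads) * packet_width


print(per_thread_mhz(600, 6))          # 6-stage part: what Linux sees per CPU
print(per_thread_mhz(500, 4))          # rumored 4-stage next generation
print(peak_minstr_per_sec(600, 6, 4))  # ceiling with all 4 slots filled
```

The ceiling for a full packet every cycle is 4 times the per-thread clock. Real code doesn't fill every slot, which is why the estimate above is "closer to 300mhz" rather than 400.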

> When i use gdb from windows binaries i see this:
>     /(hexagon-gdb) file DSP2.mbn/
>     /Reading symbols from
>     D:\SHARED\qdsp6\Hexagon_Tools__4.0_windows\gnu\bin/DSP2.mbn...(no
>     debugging symbols found)...done./
>     /(hexagon-gdb) run/
>     /Starting program:
>     D:\SHARED\qdsp6\Hexagon_Tools__4.0_windows\gnu\bin/DSP2.mbn/
>     /hexagonsim_exec_simulation: *Unable to execute hexagon-sim*!/
>     /(hexagon-gdb)/

Sounds like they didn't hook it up all the way.

The snapdragon system-on-chip (which is what you find in the Nexus One and such) actually contains four processors:

1) An ARMv7 "Scorpion" processor qualcomm licensed from ARM and then optimized (at the Raleigh campus, they're protective of their turf, internal politics at qualcomm).

2) A QDSP6 "Hexagon" processor qualcomm developed internally (in Austin). In android this is used as a "multimedia coprocessor", but is actually a powerful 4-issue VLIW general purpose processor with a lot of vector instructions.

3) A QDSP4 (old DSP, does signal processing for the radio side of things). This was an ancestor of QDSP6 the way the 8080 was an ancestor of the Core 2 Duo: it ain't gonna run Linux.

4) An ancient ARMv5 that's the "boot processor". It runs code from flash at power-on, does the DRAM controller init, and then hands off control to either the Scorpion or the Hexagon. Having a different chip take control is a question of running different boot code on this ARMv5. (Afterwards in Android it goes off and does signal processing for the radio side of things just like the QDSP4.)

Yes, "snapdragon" and "scorpion" are confusingly similar. (To me, anyway.) Snapdragon = SOC with 4 processors one of which is a Hexagon, Scorpion = yet another ARM implementation.

For the Linux port we powered the other 3 processors down after the Hexagon came up. I really really really wanted a bootloader I could run as an Android app on my Nexus One to boot Linux on the hexagon in that (copy the uboot and kernel+initramfs blobs into memory, kick the ARMv5 boot processor to run that uboot, halt the ARM), but I could never get Richard Kuo or anybody to write one. (And it wouldn't have been useful unless I got at least a usb-to-serial adapter working to get me a serial console, which turned out to be nontrivial. All the Snapdragon peripheral drivers were in the Android tree, but under arch/arm. One of the big things the Linutronix guys were looking at was fishing them out and moving them up to the generic architecture stuff so Hexagon could use 'em. No idea if they ever actually did this.)

Last I checked nobody had written any code to let the Scorpion and Hexagon share main memory: who is accessing what without stomping each other was a thing, so for the Linux port they just powered the Scorpion down completely. (The Android hexagon binary blob can share memory to act as a "multimedia coprocessor", but it does its best to _look_ like a hardware coprocessor, and essentially has dedicated memory buffers handed off the way you hand off texture memory to a 3D chip or sound data to a sound card. All static mappings, I expect. As I said: I didn't really look at that. You could try to objdump -d it if you like. :)

There was occasional talk among the engineers of doing a Snapdragon revision without a Scorpion in it, which would save power and licensing money and die size and so on. But we were careful not to say this where anybody outside the team could hear it because we didn't want the Raleigh guys to start quoting the "From Hell's Heart I Stab At Thee" speech from Star Trek II. (Note: my management mostly shielded me from the politics, so I got this stuff second hand. My exposure to the politics was basically a long list of things I wasn't supposed to talk about outside the department, which meant close your office door because you never know who might be listening...)

We had enough trouble with the lawyers, which were TERRIFIED that this open source stuff was going to undermine their precious patent revenue (the most lucrative of which expire this year anyway, I think). From an engineering perspective "not doing android" simply wasn't an option, and we convinced senior management of this, but the Lawyers insisted that the Qualcomm guys who worked on open source had to be moved to a separate corporate shell ("Qualcomm Innovation Center", I.E. new email addresses but everything else was the same, they didn't even annotate the names on our office doors with the distinction), and then a SECOND corporate shell was set up (the "code aurora foundation") which was a partnership between Qualcomm and Qualcomm with some random lip service from Intel or somebody, so we had a second layer of indirection to wash everything through protecting Qualcomm's patents from Linux. It was _ridiculous_. Oh, and the lawyers Doomed to Exile in Quic were the junior guys who drew the short straw and couldn't _avoid_ getting the GPL all over them. (They tried not to panic in front of us. Again, this was over a year ago and I have no idea if this recollection is accurate or just me projecting onto them based on incomplete understanding of things that were happily Not My Problem.)

> I also have tried to compile these sources with Cygwin. This was very
> hard, because makefile for gdb has errors with looking of some
> libraries.

I did everything under Linux. We were trying to port Gentoo to it, but hit some unnecessary complexity adding a new architecture to their profiles (undocumented stage 1 black magic, and the need to annotate every single package in the tree with every single architecture it supports, which is _crazy_), and wound up just doing Linux From Scratch plus a bunch of Beyond Linux From Scratch packages.

I'm unaware of anybody ever trying this stuff under Cygwin. I was using Ubuntu, the server was running Debian, and I think a couple developers were using Fedora.

> I have manually downloaded and compiled these libraries, set
> direct path for libraries in make file, successfully make all
> *gnutools*, but have another problem again:
>     Administrator@PC6 ~/Hexagon/bin/gdb
>     $ ls DSP2*
>     DSP2.MBN
>     (hexagon-gdb) target hexagon-sim
>     (hexagon-gdb) run
>     Starting program: /home/Administrator/Hexagon/bin/gdb/DSP2.MBN
>     bailing from child.
>     execvp: No such file or directory
>     Switching to remote protocol
>     :15097: Connection timed out.

I know nothing about cygwin. I don't do windows. :)

When the basic Hexagon support got into Linux 3.1, did they set up a linux-hexagon mailing list? (I told them they really _should_, but I honestly haven't been keeping track since I left. Busy with other things...)

> Also i have seen page, but
> there not exist even a gdb files.


Qualcomm outsourced the Linux upstreaming part to Thomas Gleixner's linutronix. (I may have strongly recommended them: pay one of Linus's lieutenants to do the first code review pass in private so you can fix up the code to kernel standards without embarrassment or bikeshedding). Thomas is tied up in various NDAs, but now that most of this has gone upstream he might be able to help you more than I can. I'm a year out of date on all this, and left my qualcomm proprietary stuff at work when I left. (I had Aboriginal Linux building a hexagon native development environment, and then building Linux From Scratch and a chunk of Beyond Linux From Scratch natively on a comet board under that. It booted X11 (remotely, the X server was on another machine because comet had a network card but no graphics hardware) and ran a half-dozen X apps: a terminal, xchess, xeyes, and so on. Left it all behind when my contract expired, I no longer have that code. Oh well.)

Hexagon is a _very_ interesting chip and I'd _hugely_ love to see it succeed, but as long as Qualcomm's lawyers are steering the company the engineering team has a giant ball and chain dragging it down. They do great stuff and you never hear about it, because "secret is good". Nobody talks about Qualcomm's secret shame, that they actually have a bunch of really smart guys working for them who do cool stuff. (I'm often frustrated at TI being closed and tangled, but it's a lot less trouble finding out their stuff _exists_...)

> I have found /*80-NB419-1_A_hexagon_v2_programmers_ref.pdf*/ document
> with hexagon command representation and looks like my ELF file
> (DSP2.mbn) has Hexagon-based architecture.

I do remember that "qdsp6-objdump -d" works in the toolchain they shipped. (Even the really old ones.)

Note: the qdsp6 name is because Qualcomm's 6th generation digital signal processor grew legs and became a general purpose CPU, _while_ still being good at all the DSP multimedia stuff. The hexagon rebranding is fairly recent (the name's due to the 6-stage pipeline and thus 6-way SMP, even though the newer ones are gonna be 4-way. :)

The big upgrades they did to the toolchain to support Linux were:

1) Add dynamic linking support,

2) Port uClibc and glibc to it,

3) Forward port from gcc 3.4 to something reasonably current.

If all you want is a binary blob to run on the chip, the old toolchain should be fine.

A note on the MMU: the chip hasn't actually got one. Instead it has a set of Translation Lookaside Buffer slots loaded by software. They made a binary blob that acts as an MMU, and their snapdragon port of u-boot (running in the armv5 boot processor, I think) loads this blob and hooks it up to the page fault interrupt so it acts as a software mmu (which their Linux port then depends on).

The lawyers have ADAMANTLY refused to release the code for that because they've got multiple patents in it that will _never_ stand up to scrutiny because software MMUs like this were commonplace back in the 1980's, but as long as you can't examine it you can't collapse the quantum state of the patents and thus they MIGHT be enforceable. (The 3D accelerator guys do something similar: everybody has patents that cover everybody else's chips, but as long as their drivers are closed enough you can't examine them and _show_ that they haven't got some clever way to program the chips that _doesn't_ violate your patents. The open source guys reverse engineering your chips and writing open drivers for them that violate a dozen patents is no problem because that doesn't show that the _closed_ drivers didn't have a workaround for that patent, and the patent-violating open source drivers are obviously programming the chip wrong. Plausible deniability! Also an insane waste of time so trolls can leech off the work of others to perpetuate concepts left over from Gutenberg's original hand-cranked printing press, as greedy rich people come up with yet another way to corner a market.)

Also note that the GPL doesn't bite unless you distribute binaries, meaning this code being GPLed translates to an awful lot of stuff that would be trivially easy to reverse engineer if they just shipped a binary never gets let out of the company because it requires an Enormous Legal Review Process to do so, and the political capital to push it through just ain't there. So "we got this to work, but you'll never see it, because it's GPL". (Yes my employment contract can say I can't release this binary outside the company. GPL is about copyright law, not contract law. The _license_ can't require it, but I can personally agree to an individual restriction before they give me the code. You want to argue that with Qualcomm's lawyers, be my guest. I'll be over here.)

The main problem with the Hexagon variant in the existing Snapdragon chips (um, QDSP6v2 I think) is that they don't have enough TLB slots. If you run the full 6-way SMP doing gcc compiles and such, it thrashes the hell out of the cache and slows itself way down. The performance "sweet spot" for that turned out to be around -j3 or -j4 in my testing. I don't think we found this out soon enough to fix QDSP6v3 (although since that's the 4-stage 500mhz variant, it puts less pressure on the TLB anyway so is more or less in the sweet spot of what they _do_ have). But QDSP6v4 (in development when I left) adds lots more TLB slots which should greatly improve performance under Linux. Assuming anybody ever uses it under Linux.
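To make the TLB slot problem concrete, here's a toy model of a software-loaded TLB: a fixed number of slots, and a miss handler that walks a page table and refills one. Everything in it is an assumption for illustration (the slot count, the pages touched per job, the least-recently-used replacement policy, the names); none of it comes from the real chip or from Qualcomm's blob. It just shows how the miss count falls off a cliff once the jobs sharing the TLB touch more pages than there are slots.

```python
from collections import OrderedDict


class SoftwareTLB:
    """Toy model: a fixed set of TLB slots refilled by a software miss handler."""

    def __init__(self, slots):
        self.slots = slots              # hypothetical slot count, not the real chip's
        self.entries = OrderedDict()    # virtual page -> physical frame, oldest first
        self.misses = 0

    def translate(self, vpage, page_table):
        if vpage in self.entries:
            # Hit: mark the entry as most recently used.
            self.entries.move_to_end(vpage)
            return self.entries[vpage]
        # Miss: this is the work the "fault handler" does in software.
        self.misses += 1
        if len(self.entries) >= self.slots:
            self.entries.popitem(last=False)    # evict the least recently used slot
        self.entries[vpage] = page_table[vpage]
        return self.entries[vpage]


def count_misses(slots, jobs, pages_per_job, rounds):
    """Interleave `jobs` parallel jobs that each loop over their own pages."""
    page_table = {page: page + 1000 for page in range(jobs * pages_per_job)}
    tlb = SoftwareTLB(slots)
    for _ in range(rounds):
        for offset in range(pages_per_job):
            for job in range(jobs):
                tlb.translate(job * pages_per_job + offset, page_table)
    return tlb.misses


if __name__ == "__main__":
    for jobs in (3, 6):
        misses = count_misses(slots=64, jobs=jobs, pages_per_job=16, rounds=10)
        print(f"{jobs} jobs: {misses} TLB misses")
```

With these made-up numbers, three jobs fit in the slots and only miss while warming up, while six jobs overflow them and miss on every single access. Real workloads aren't this pathological, but it's the same shape as the -j3 versus -j6 difference.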

The existing multimedia codec binary blob that Android runs uses a lot of static buffers with hugepage mappings, so it doesn't hit the TLB slot limitation. It's all hand-crafted assembly, really. (I assume so, I never looked at it.)

> I have succesfully decompiled
> a little peace of code first with a paper and pencil :), then have
> decompiled elf sections with objdump, but it's very hard to reverse
> entire algo without simulator.

"objdump -d" - disassemble.

> Please, help me, if you can. Maybe, you have any another tools,
> or useful advice about these tools.

I'd love to see Hexagon succeed. I spent half a year working to get Linux on it. The engineers in Austin also greatly want to see it succeed, but they're at the mercy of A) Qualcomm's legal department, B) politics with the Raleigh guys behind the Scorpion yet-another-ARM-chip, (those guys are very proud of their yet-another-ARM-chip. I forget why...), C) quarterly budget renewals. (My contract ended because they couldn't get the funding to renew it without a 2-month gap, so I went off and did other things.)

The reason the lawyers are in charge at Qualcomm is due to an accounting trick: all the patent licensing revenue is credited to the legal department, but all the R&D costs of coming up with that technology in the first place are billed to engineering. So even though engineering makes way more gross revenue than licensing, it looks on paper like licensing brings in 3x the net revenue that engineering does. And thus senior management listens to the lawyers three times as much as it listens to the engineers. Sad really.


February 23, 2012

Over the years I've wasted a lot of time talking to people whose minds are already made up, and are merely trying to figure out how to let you have their way. People who don't seem to believe there _is_ a legitimate alternate viewpoint, and are just trying to find a way to explain to us how we're wrong that we'll accept. People who are not there to _be_ convinced of anything.

I'm always trying to figure out "how am I wrong this time". Sometimes it's because I haven't explained myself properly, but often it's because I'm taking the wrong approach, have the wrong goals... I am wrong a lot.

This is why I expect I was wasting my time once more commenting on the recent lwn article where a lawyer reports on his meeting with a conference organizer, and thus the busybox gpl stuff is all resolved now. (I guess if we could get a pizza delivery guy to talk to a random janitor somewhere in rural Oklahoma they could bring about arab-israeli peace? People who ship BusyBox still get sued because of it, Android's still excluding GPL code from their userspace... What's changed, exactly?)

I admit commenting on "the story is over" with "why was this ever news" is sort of counterproductive anyway, but this whole thread was irritating because the original blog post that set this whole mess off was based on more than one false premise:

  • Tim isn't behind toybox, I am. Implying Tim (or Sony) is some sort of puppetmaster is both untrue and insulting.

  • Busybox is way older than Android, so if Android's 4th release still isn't shipping BusyBox, it's unlikely to start. If some combination of GPLv3, the busybox lawsuits, and the FSF's insanity has rendered GPL in userspace unpalatable to android (which is Google's _official_policy_ here), and waiting around for 5 years didn't make them change their minds, and they're shipping a BILLION devices: there just _might_ be a viable niche which a new project can legitimately address.

  • I'm writing new code to obsolete my old code, and people who didn't contribute any code to _either_ project are freaking out about it. So what? Why is this news? It's pure FUD trying to bury the new project and hold out a little longer just in case 5 years isn't enough to confirm the Android guys are serious about not shipping busybox. (Even though there are already 3-4 other projects trying to fill this market vacuum.)

  • How would it be bad if Sony (or whomever) _was_ involved? "If you don't like it, write your own code" is exactly what toybox is doing, and it's exactly what's pissing these people off. That's not "standing for freedom", that's insisting on control over the actions of others. These guys are hypocrites.

  • Patent trolls and copyright trolls are roughly equivalent. Everybody just assumes the lawsuits were a good thing, nobody's questioning whether or not they might have been a net negative.

  • How is "let's switch away from the obvious lawsuit factory" any different whether it's BusyBox or SCO or Oracle you're switching away from? Deciding to _stop_ doing business with somebody is not synonymous with violating their IP, even if "we have patents on other stuff, we'll still sue!" Great. If you want to sue people over the kernel, then do so. If the kernel guys would rather _not_ make "Linux" synonymous with "Lawsuit", it could be because they're _very_smart_.

That last bit I find really annoying. I was the guy who initiated the BusyBox lawsuits. Me, personally. I set the process in motion, I recruited the other people into the action, without me they would not have happened. (Erik was busybox maintainer for years, notice how the lawsuits started _after_ I took over?)

I no longer believe those lawsuits were a good idea, for a number of reasons. (One of which was that the SFLC assured me they had distanced themselves from the FSF... then got back in bed with them to sue Cisco in 2008.) There are excellent conversations to be had around that, but it's apparently not a conversation anybody (other than me) wanted to have.

Instead you get a bunch of hysterical armchair admirals with no copyrights worth enforcing (because they haven't written any interesting code anybody feels the need to use), screaming about how will the poor defenseless kernel survive without busybox to protect it. The kernel literally has a THOUSAND times more contributors than busybox (which hasn't even got one full-time employee working on it: nope, not even Denys). Not only are they far more _capable_ of suing people if they chose to, but Linus still hasn't put his foot down on binary only modules.

The big success in busybox releasing code was the 2003 negotiations with Linksys that spawned OpenWRT, but that wasn't a lawsuit. GPLv2 gave them leverage, but it was all "walk softly and carry a big stick". Hitting lots of people with the stick broke the stick, and made people distance themselves from your stick. It was a _bad_thing_. (GPLv3 is a board with nails in: I'm not carrying that anywhere, nor going near it.)

February 22, 2012

Sigh. Ok, I know it's a public wiki, but dude.

So the entire first page of what was the toybox roadmap is now an advertisement for some BSD project, and you have to scroll down to see Toybox is even mentioned. They did no new requirements analysis other than strongly implying they're perfect for the job, despite BSD's ffs and berkeley packet filter being completely irrelevant to Android. They're "proven capable to replace busybox, in general", with 4 fewer commands than Toybox's last release. And now they're the only thing you see when you load that page unless you scroll down a lot, apparently attempting to leverage the work I did while simultaneously obscuring it.

I wonder if they know that gluing together a bunch of existing command implementations into a single binary is not where you _stop_? That the whole _point_ of the exercise is the extensive cleanup and refactoring to simplify and maximize code sharing?

Sigh. It's an open wiki, obviously they have the right to promote their own project there, and it's not my place to remove stuff other people wrote. But it's also obviously unsuitable to use that page as the toybox roadmap. Let's see...

Needs more cleanup.

February 21, 2012

Working on release notes for _another_ toybox release, because the amount of progress since the last release is kind of impressive. (Very little of it done by me...)

February 20, 2012

I am behind on everything...

February 12, 2012

The "inbox" on gmail accumulates a lot of bounce notifications for spam sent to Japan in my name for some reason, and every couple weeks I remember to go in and clear them out. (Email that's actually to my address gets moved to another folder, this is the bcc: stuff that's not to a recognized list, I.E. almost entirely spam.)

Thunderbird's stupidity continues to amaze me. This is an imap folder (not a local folder filters copy stuff to), so when I go in and delete messages it A) freezes for seconds at a time, B) manages to do about two deletes at once even when I hit five (about as many as I can see on the screen at once with their amazingly horrible space-wasting layout and the font size I use).

But those aren't the weird bit. The weird bit is: C) download 2767 message headers, over and over and over. The number isn't even going down, it just sits there, downloading the same message headers, over and over. It takes about 10 seconds each time, and then it does it again. I assume that since I queued up about 300 message deletions, it's going to do this pointless activity 300 times. At 10 seconds each, this is most of an hour of UTTERLY useless network thrashing.

That's a pretty good summary of thunderbird, really. Some weekend, I need to install a real mail program. (It's moving my email filters over I don't look forward to...)

Oh, it also loses track of folder locking. Even after it finishes with the network (such as me switching the network card _off_, perhaps by having suspended the laptop before its hour of pointless thrashing was through), it insists something is still using that folder and therefore messages I read don't switch from "unread" to "read" and I can't delete any of them.

The fix for that is to kill thunderbird and restart it. Really, the "and restart it" is negotiable, I need to install a real mail program. This one has SO many things wrong with it, you'd think nobody'd ever actually _used_ it before.

(Perhaps it's meant to only be used with fetchmail or something, and the imap functionality is vestigial? It certainly isn't _tested_...)

(Alas, the kill thunderbird and restart it also kills any half-finished replies you may have been composing, which I tend to have several of up at any given time. I keep forgetting this. Kmail would save them and pop them back open when it restarts. Thunderbird doesn't consider email worth reliable delivery or archiving; it treats it the way Twitter treats tweets. "A mere sideline to what we actually do, whatever that is.")

February 11, 2012

It's always hard to resist correcting people being wrong on the internet, especially when it's personal. (I point out that pressure put on Cisco in 2003 _without_ a lawsuit was far more effective than the lawsuit in 2008, and the reply is "You're wrong, they did sue, in 2008". No really. Apparently the 2008 lawsuit reached back in time several years to retroactively provide the basis for OpenWRT and such. Good to know.)

But I'm not feeding the trolls over there anymore. I've decided to shut up and show them the code. Have a toybox release.

February 10, 2012

If you wonder why I think the FSF zealots do more harm than good, the threads on toybox are exhibit A. The loonier members there are literally taking the same position as SCO, that a program which replaces another program is automatically "infringing".

Whether it's just FUD or they actually believe it, the public statements are that you _cannot_ legally compete with any existing piece of software using a fresh implementation. "We like the old thing enough that the new thing must somehow be illegal." Under this theory, FireFox on Windows infringes Internet Explorer, and OpenOffice infringes Word, Cyanogenmod must be a "circumvention device" for phone company lockdowns, and so on.

These guys are "on the side" of Linux developers the way BMI and ASCAP are "on the side" of musicians. In the more mature content distribution industries like music and books and video, we have a wealth of evidence (link link link link link link (and link link) link link) that attempting to "protect" content on the internet tends to be hugely counterproductive.

Luckily: SCO lost. Not to the FSF: to _IBM_.

The point of copyleft was to turn copyright against itself. Getting comfortable with copyright enforcement suits to the point where you miss them when they're gone means YOU HAVE TURNED INTO WHAT YOU FIGHT. You're defining yourself by what you hate. That's _sad_.

Proprietary software didn't _exist_ prior to 1983, when the Apple vs Franklin decision extended copyright to apply to the binary ROM images Franklin's Apple II clone had copied verbatim. (Before that, copyright didn't apply to binaries, here's a 1980 audio interview with Bill Gates (transcript) about his efforts to lobby congress to change the law.) Once the law _did_ change, for-profit companies everywhere jumped on the new status quo, from IBM announcing from now on it would distribute Object Code Only (which was not well received), through AT&T closing up Unix, to the Xerox printer driver that drove Stallman to start the FSF.

The FSF was a conservative reactionary attempt to defend the status quo against changes in the industry, and wasn't even the only _unix_ based attempt to do so (BSD, Minix, Linux...) The GPL was an attempt to take the bull by the horns and steer. But that was 30 years ago: the disadvantages of proprietary software (abandonware, unfixable bugs, version skew, winner-take-all markets) became apparent to most players in the industry even _before_ the rise of the internet. In a lot of ways, proprietary software was an experiment that ran its course. Javascript isn't open because of the FSF, it's freely viewable because THAT'S WHAT WORKS. Do we really still need to be riding a bull?

When the DMCA passed, everybody thought it was a bad thing, now FSF advocates are trying to not just wield it but advocate aggressive interpretations of it. I'm sure if SOPA passed, a decade later these guys would do the same. That's going in the wrong direction.

The internet has fundamentally eroded the concept of copyright, as I wrote about a dozen years ago. Copyright arose with the printing press, which rendered irrelevant the church scribes' hand-written illuminated manuscripts. This broke their monopoly on literacy, and during the transition many people were put to _death_ for translating the bible into local dialects so people could read it themselves.

Now we've got the internet, which is the printing press all over again. If the printing press was as big a deal as electricity, the internet is room temperature superconductors powered by cold fusion. It changes everything _again_, and copyright _no_longer_makes_sense_ in the new context. The entrenched corporate interests are staging a new inquisition to stop the world from spinning, but all they can do is delay the inevitable.

Most people under the age of 50 take for granted that the RIAA and MPAA are ludicrously misguided dinosaurs struggling to slow their descent into the tarpits. But the people who want stronger GPL and more lawsuits, and think they NEED this in order to prosper? They are on THE WRONG SIDE of that exact argument.

In moving from GPL to BSD, I'm asking: do we really _need_ to live in gated communities? Is the outside world really so terrifying we must fence ourselves off from it? GPLv3 is a concrete bunker full of canned goods because the previous gated community wasn't _secure_ enough, and I don't want to live there. Same way I put a creative commons tag at the top of this page (copy what you like, it's polite to attribute it).

Because dude: link link link link link link, link link... Why are we still arguing about this?

February 9, 2012

Debugging is frustrating. I'm trying to track down the xargs segfault that happens when I build aboriginal with toybox in host-tools (all the defconfig commands overriding the busybox commands), ala:

cd ~/aboriginal/aboriginal
hg clone
rm -rf build
mkdir -p build/packages
ln -sf ../../toybox build/packages/toybox
TOYBOX=toybox ./
  ./ i586 2>&1 | tee out.txt

When I do this, I get:

[8426193.532368] xargs[4542]: segfault at 0 ip 00007f4e3f21487c sp 00007fff4cb7a238 error 4 in[7f4e3f107000+15d000]

So in another terminal I ran:

while true
do
  sleep .2
  [ ! -z "$(dmesg | tail | egrep '(segfault|protection)')" ] &&
    killall make configure
done

And then looked at build/logs. This helped me track down the sort bug, but the xargs one _seems_ to be the xargs pipeline at the start of sources/sections/, and when I extracted that and ran it I _think_ I got a segfault once... but not on subsequent runs.

Heisenbugs! Always a pain...

February 8, 2012

Finally found where archived messages live in the Thunderbird UI. It's not in any of the menus (pull-down or pop-up); that's a red herring.

There are left pointing triangle and right pointing triangle buttons between "All Folders" and "Quick Filter" in the UI. (Under the "Tabs? Why would it have tabs?" level, which is under the "green arrow pointing into a drawer", "inverted caret", "paper with a pencil", "some kind of book maybe", and "dogtag with another caret" icons, which are under the pulldown text menus. Cluttered UI much?)

If you hit the _left_ triangle you get a different set of folders, one of which is "archive". Which contains, in a flat view, every message ever downloaded by the system, including the spam. Plus _extra_ copies of messages that have been accidentally thrown in there. You can distinguish the extra copies because they actually have bodies when you select them, as opposed to an "oh no, there wouldn't be hot water _today_" message on all the others.

Thunderbird: they put an amazing amount of EFFORT into sucking this badly.

I miss kmail. Too bad it was glued to a desktop that became unusable, and got sucked up into a katamari with calendaring and rss feed reading software I didn't want to use....

February 7, 2012

I really should thank the guy who blogged about how upset toybox made him. I have contributors now! I'm actually having a hard time keeping up with all the code review I need to do...

If I had a book, I'd encourage him to publicly burn a copy. Closest I can think of is that I'm responsible for maybe half the material in "The Art of Unix Programming", if that's of interest...

February 6, 2012

I really need to find a better mail program than thunderbird, because its developers are crazy.

The random 4 minute hangs while it decides to go contact the network and refuse to even repaint are annoying enough (even though I told it to NEVER periodically fetch mail, it does so at random intervals anyway. Usually giving it focus reminds it to do so, and thus it

And of course if it's interrupted (such as switching the network _off_ so whatever it's doing HAS TO FAIL so I eventually get control back), it drops empty messages in all my folders with no title and no body.

Contributing to this delay is the fact that A) google's gmail imap server is slow, B) it's got some insane O(N^2) algorithm on folder size, probably doing a quicksort on already sorted content or some other equally "we don't understand how this actually works and are using it in a naive manner" nonsense. Yes, I have 144,500 unread messages in linux-kernel. I hardly ever GO into that folder because I haven't got the time for this mail program to grind away trying to open it. But YOU SHOULD NOT RANDOMLY RESORT IT WHEN I'M NOT IN IT AND YOU'RE NOT FETCHING MAIL! (I can see the CPU spike and stay at 100% for minutes at a time. I've got a little bar graph that shows it to me. Stop it.)
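For anybody who hasn't watched a quicksort go quadratic: here's a minimal sketch of the failure mode I'm guessing at. It's a deliberately naive quicksort that always picks the first element as its pivot and counts comparisons. This is my own illustration (the function name and the test size are made up), and I have no idea whether thunderbird's code looks anything like it.

```python
import random


def naive_quicksort(items):
    """Quicksort with a first-element pivot; returns (sorted list, comparisons)."""
    data = list(items)
    comparisons = 0
    stack = [(0, len(data) - 1)]        # explicit stack, so deep splits can't overflow
    while stack:
        low, high = stack.pop()
        if low >= high:
            continue
        pivot = data[low]
        boundary = low                  # last index holding a value smaller than pivot
        for index in range(low + 1, high + 1):
            comparisons += 1
            if data[index] < pivot:
                boundary += 1
                data[boundary], data[index] = data[index], data[boundary]
        data[low], data[boundary] = data[boundary], data[low]
        stack.append((low, boundary - 1))
        stack.append((boundary + 1, high))
    return data, comparisons


if __name__ == "__main__":
    size = 2000
    already_sorted = list(range(size))
    shuffled = list(already_sorted)
    random.Random(0).shuffle(shuffled)
    print("sorted input:  ", naive_quicksort(already_sorted)[1], "comparisons")
    print("shuffled input:", naive_quicksort(shuffled)[1], "comparisons")
```

On already sorted input every partition peels off exactly one element, so N items cost N*(N-1)/2 comparisons. Shuffle the same items first and the count drops by well over an order of magnitude. Scale that up to a 144,500 message folder and you can see where minutes of 100% CPU could come from.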

There's a workaround for this: "file->offline->work offline". Then it only OCCASIONALLY randomly decides to talk to the network. Without that, thunderbird would be completely unusable.

But the thing that's _dangerous_ is the "archive message" option in the right click menu. If I right click on something, it will pop up a menu, usually after about a 3 second delay for thunderbird to be slow and bloated. UNLESS it's too far to the right on the screen, so that the cursor is over the menu when it finally gets around to processing the "release" part of the right click. Then it'll randomly select whatever menu item the cursor was over and execute it.

When it starts randomly popping up a reply window or something when I wanted "mark this as unread", that's easy enough to undo. But "archive" means "hide this message from me so I can never find it again". Where do archived messages _go_? I've googled, but nobody seems to know the answer. The only way I've ever found them again was to remember some snippet of text from the message and do a find | xargs | grep on the thunderbird data directory, where I can find the raw text of the message and copy it out by hand.

I rant about this because it just ate another message, and I've given up trying to find it again..

February 5, 2012

Alas, did not get a toybox release out this weekend. Prepping for a release found too many fiddly little bugs I want to fix first.

February 3, 2012

Corporations are not people, corporations are machines.

A corporation is a machine the same way an aircraft carrier is a machine: many people must show up to work to operate the machine every day, and the machine can't do anything without them. An airplane and an airline are just different kinds of machines.

Treating a corporation as a person is no different than treating a car or a building as a person. "Don't blame me, it was the car that ran over that pedestrian. Don't blame me, it's the building's doors that were locked when it burned down with all those workers inside."

People drive the machine, and part of the machine is a uniform that people can put on to act on behalf of the machine. Some people who wear the uniform are "just following orders", and will do anything as part of their job.

But the people driving are the ones who rose to the top, often because they want to win at all costs and are willing to say or do anything to get what they want. "Of course I'm not HIV positive, baby, come to bed." Some people will say anything to close the deal, to get someone's money, foreclose on their homes, strip-mine national parks, all while eating whale sushi with a side of bald eagle... because they can.

We wind up with crap like "Citizens United" not just because the people steering these machines find them more useful if they're treated as people, shielding the driver from responsibility for who the car hits. It's that the people arguing for this honestly see no difference between people and machines.

You don't have to be a sociopath to become rich, but it does eliminate the temptation to give your money away to others less comfortable than you are, so there's a bit of selection pressure involved at the billionaire level. And being a sociopath means you can hire lawyers, who are just following orders, to argue that machines and people are exactly the same.

February 2, 2012

Crazy busy with work and the second half of 40th birthday celebrations tonight at the Drafthouse and friends having emergencies du jour... Probably won't get a toybox release out before the weekend.

(And of course uClibc 0.9.33 ships the day after I release Aboriginal 1.1.1. Yay, but now I have to go test it...)

February 1, 2012

Happy birthday to me, just turned 40. I am old.

Somebody blogged about how my Toybox project is obviously a plot by Sony, even though I've been doing it on and off since 2006, been publicly disgusted with GPLv3 just as long, and repeatedly blogged about how either Android or iPhone is going to replace the PC and I'd much rather it be Android. (For the record Sony hasn't paid me a dime for Toybox, although it would be nice if they did.)

I attempted to explain to him that he was simply _wrong_ in lots of comments on his blog, and on the story on it (well, the first one, anyway), but by the time the h-online story came out about it, I decided to just shut up and show them the code. Working on cleanup for a release now.

I am sorry I lost my temper (again) when forcibly reminded that The Failure of Open Source still exists. I get annoyed when people who don't write open source software try to tell those of us who do how to go about it, and when it's all somebody does for a decade and change? Gets old.

Like me, apparently. Birthdays...

January 29, 2012

SUSv4 continues to be full of subtle assumptions and missing pieces. For example, with xargs the -L option works on "non-empty" lines but doesn't specify whether a line containing whitespace is "empty", or only zero length lines are. (In this case it mentions trailing whitespace indicates continuation, so I guess a line with only whitespace has to be empty due to the continuation rule. I presume trailing whitespace on the last line is not an error.)
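Here's a minimal sketch of the reading I settled on, to pin down what the words mean: whitespace-only lines count as empty, a trailing blank joins a line to the next non-empty one, and a dangling continuation on the last line is quietly accepted. The function name is made up for illustration, it ignores quoting entirely, and this is one interpretation of the spec rather than what any particular xargs does.

```python
def group_for_dash_L(text, count):
    """Split input into argument batches of `count` logical lines each."""
    logical_lines = []
    current = []
    for raw in text.split("\n"):
        if not raw.strip():
            # Empty or whitespace-only: skipped, and it doesn't end a continuation.
            continue
        current.extend(raw.split())
        if raw[-1] in " \t":
            # Trailing blank: this logical line continues on the next non-empty line.
            continue
        logical_lines.append(current)
        current = []
    if current:
        # Trailing blank on the last line: treated as a normal end, not an error.
        logical_lines.append(current)
    batches = []
    for start in range(0, len(logical_lines), count):
        batch = []
        for line in logical_lines[start:start + count]:
            batch.extend(line)
        batches.append(batch)
    return batches


if __name__ == "__main__":
    sample = "a b\n\nc \n   \nd\ne\n"
    print(group_for_dash_L(sample, 1))
    print(group_for_dash_L(sample, 2))
```

Each inner list is the argument set for one command invocation. The "c " line followed by a whitespace-only line and then "d" is the interesting case: under this reading the blank-looking line vanishes and "c d" becomes one logical line.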

January 28, 2012

Finally got xargs checked in to toybox. It's only got the basic -ns0 options (yes, -0 is a basic option, it had to be designed into the tokenizing), but at this point filling out the test suite is the hard part.

What bit me in the test suite? I wanted to use "ls -w" as part of one of the tests, which turns out to be a gnu/dammit extension, and thus doesn't hold weight. Like so much the Free Software Foundation does, ls -w turns out to be utterly useless because they didn't think it through.

When ls's output is a tty it detects the screen width and wraps lines, as required by SUSv4. The -w option is described in the man page as "assume screen width instead of current value", so I should be able to test ls -w against xargs -s and check that the wrapping decisions match, right?

Here's the stupid part: -w only works if you have a tty. If you don't have a tty, the -1 option (list one file per line) is implied. So if you go "ls -w 80 > filename" it does one entry per line just as it would if you hadn't specified -w in the first place. I.E. the most obvious use for -w is exactly where it DOESN'T WORK.

The FSF still sucks at engineering, because it's not what they do. The FSF is somewhere between a religious organization without an invisible friend to venerate, and a lobbying group that never buys appointments with politicians. The common element of those is fundraising, and the Linux Foundation's got them beat there. (Both organizations sponsor a bit of engineering development, and presumably budget it as a marketing expense.)

January 27, 2012

I've been collapsing /bin and /usr/bin together since forever, and now it's a thing, so I linked to one of my old off-topic busybox posts about it in the LWN discussion comments, from where it got scooped up by Lennart Pottering, and from there became the top "story" on y-combinator, where I do not have an account so can't comment on the fascinating discussion of my offhanded historical blathering.

I usually have to go to great lengths to track down and mirror computer history stuff, always nice when it comes to me. Although if somebody's going to say I "got many other details wrong" it'd be nice to list them. Good to know that Sun specifically was to blame for /opt, I need to research Sun's "project Lulu" which is going to be hard given Robert Young's Lulu occluding the search term. I remember the bit in "Under the Radar" about it, and the Larry... (Wall? Page? the bitkeeper guy) treatise it links to. I should re-read them.

The backstory is that I started symlinking /bin /sbin and /lib to their /usr counterparts in the yellowbox days back at WebOffice (2001 or thereabouts) because it made read-only vs read-write tracking easier. (All the read-only root filesystem stuff was under /usr, all the writeable stuff was under /var, and everything else at the top level was a symlink or mount point for a non-block backed filesystem.) Keep in mind I'm the guy who wrote up the first initramfs documentation: the first clear /bin vs /usr/bin explanation I got was in the tutorial workbook from Atlanta Linux Showcase 1999 (which I've still got), but clearly initrd obsoleted that split even before initramfs existed. (Linux 0.0.1 had kernel/blk_drv/ramdisk.c already.)

But I also did it because computer history is a hobby of mine and I learned backstory of _why_ /usr/bin and sbin and lib happened: Their first hard drive was only half a megabyte, they added a bigger but slower 2.5 megabyte RK05 disk pack on /usr for the home directories, but the root disk was so small it leaked into /usr, and when they got a third disk (another RK05 disk pack I think, I need to track down the references) they mounted it on /home and gave /usr over to the system. Keeping commonly used binaries in /bin and /sbin was because the first disk was _faster_.

All of this was an implementation detail of their original PDP-11 system circa 1972. It made perfect sense for Ken and Dennis to do, it _never_ applied to Linux running on PC hardware in any way. When I got taught it at ALS in 1999 I went "huh", because I am weird. (This is also why I'm so slow learning new tools: I question assumptions down to the bedrock on a regular basis, and am not COMFORTABLE unless I understand WHY we're doing stuff. Sometimes it's good, sometimes it's incredibly inconvenient. Oh well.)

Mirell's now informed me that I need to actually _write_ the computer history book I've been meaning to do forever, rather than just blogging about it. I have boxes of unscanned magazines, books to read, so many links to collate... I need to dig up the old proposed topic index I wrote up over 10 years ago. (Great thing about computer history: your old todo lists don't go stale. It's still history, now even more so.)

January 25, 2012

Why Linux on the Desktop will never happen, part eight thousand, four hundred, and seventy two:

Finally rebooted my netbook today (instead of just suspend/resume) because too many things had stopped working. The network fails in low memory situations with a panic in dmesg saying it can't allocate memory. (Why the network card is trying to _allocate_ memory for each packet instead of having some static buffers is one of those unanswerable questions.) The fix is "sudo insmod -r iwlagn && sleep 1 && sudo insmod iwlagn" repeated several times, because for the first couple of tries dmesg says a watchdog timer dies after 4 seconds of trying to bring the card up.

The sound died too, but that's something like 15 separate modules with a dependency hierarchy I've never managed to work out, so I can't just rmmod and insmod that. So I did without sound for a while... until the network rmmod/insmod trick stopped working last night. Now it was giving me the timeout message _twice_ on each cycle, something was horked, so... reboot.

On reboot, it prompts me to log in, and won't let me. I note that I selected "shutdown" from the menu rather than just holding the button down; I was _nice_ to the system, which was probably my mistake.

So I ctrl-alt-F1 over to a text console, log in there and dig up .xsession-errors, and it says:

/etc/gdm/Xsession: Beginning session setup...
Setting IM through im-switch for locale=en_US.
Start IM through /etc/X11/xinit/xinput.d/all_ALL linked to /etc/X11/xinit/xinput.d/default.
/usr/bin/startxfce4: X server already running on display :0
<stdin>:1:3: error: invalid preprocessing directive #Those
<stdin>:2:3: error: invalid preprocessing directive #or
<stdin>:3:3: error: invalid preprocessing directive #Xft
<stdin>:4:3: error: invalid preprocessing directive #Xft
xrdb:  "Xft.hinting" on line 13 overrides entry on line 6
xrdb:  "Xft.hintstyle" on line 14 overrides entry on line 7
xfce4-session: Unable to access file /home/landley/.ICEauthority: Permission denied
XIO:  fatal IO error 104 (Connection reset by peer) on X server ":0.0"
      after 38 requests (37 known processed) with 0 events remaining.

Obviously the solution is "sudo mv .ICEauthority .ICEauthority.bak", we can totally expect any random end-user to know how to do that. Just as they'd know about ctrl-alt-F1 (which the kernel developers keep threatening to remove).

I note that even I have no idea what .ICEauthority is _for_. It's bound to be more of that unnecessary selinux/dbus/hal style crap Linux has been growing. Lennart Pottering recently published part 12 of his "systemd for administrators" series on LWN. Anything that requires a 12 part series to explain should not be on my system.

If you wonder why I'm not particularly worried about Android obsoleting vanilla Linux: from a usability perspective they honestly can't make it much worse.

January 24, 2012

The Obama administration wasn't content to take away Habeas Corpus but recently removed the fifth amendment as well, which combined with the "oh yeah torture was fine" stuff means this is no longer funny.

This is why leaving war crimes unprosecuted was a bad idea: he didn't cauterize the wound, and so the infection continues to spread...

January 20, 2012

Banging on the long-delayed Aboriginal Linux release again, which is in part blocked by the fact that mips' network connection went away, so the emulated mips system thinks it can use distcc but fails when it tries (and then overloads because it hasn't got enough memory for a fully local -j 3 build without the OOM killer going off). The problem actually isn't the kernel upgrade, the problem turns out to be the QEMU upgrade. It worked in qemu 0.15.1, and didn't work in 1.0, and git bisect tracked that down to commit 5632ae46: "mips_malta: move i8259 initialization after piix4 initialization".

Not quite sure how to deal with that: "use qemu 1.0 for x86 because the emulator command name changed, but use 0.15.1 for mips because there's a blocking bug". Hmmm...

January 18, 2012

Darn it, musl is lgpl, and thus pointless. I was looking at it as a potential bionic replacement to complement toybox as a toolbox replacement, but it won't. Android allows no GPL in userspace.

I'm excited by the opportunity for toybox to replace toolbox on a billion machines and become the de-facto standard when the smartphone repeats the mini->micro transition and becomes the new default computing platform.

But I mothballed Toybox as pointless back when it was just fighting for a fraction of BusyBox's market share, which was fighting for a fraction of glibc's market share, which was fighting for a fraction of Windows' market share.

Musl is fighting for a fraction of uClibc's market share, which is fighting for a fraction of glibc's market share, which is fighting for a fraction of Windows' market share. Good luck with that, I have other things to do with my time.

Toolbox and Bionic are both weak-ass stubs Google did just enough of to run Java, and no more. They are crying OUT for replacement, and since I was one of the big reasons for BusyBox's success (not the only one, but I turned an embedded-only toy into a general purpose program), I'm in an excellent position to do a Toolbox replacement.

But that replacement won't be GPL. The Free Software Foundation has poisoned the GPL (I commented about that on recently), which is why GPL use is declining faster than ever. Android's "no GPL in userspace" policy explicitly includes LGPL.

And that isn't just a Google thing, it's everybody building systems around Android too. (This includes my day job: previous versions of their product included stuff like BusyBox, but that stopped last year and none of the new stuff we're working on is allowed to. They're pretty typical.)

Back when Eric Raymond and I wrote the 64 bit transition paper, we focused on the wrong transition, a mistake I acknowledged last year while explaining some of the things I'd gotten wrong in that paper. Yes, the switch from 32 to 64 bit PCs was our chance to break into the PC desktop, and we blew it. (If incremental change between transitions was possible either OS/2 or the first 30 years of Linux would have at _least_ clawed their way above 2% market share.) But the PC desktop is going away in favor of the smartphone desktop, which is scaling up via tablets and USB docking stations to displace the PC the way the PC displaced minicomputer terminals in the 70's and 80's. That transition is a race between iPhone and Android, with vanilla Linux and the Gnu/dammit stuff so far back you can't even see it.

Comparing the PC to phone transition to the earlier minicomputer to PC transition, the smartphone has just emerged from the Commodore 64 vs Amiga vs Atari 800 scramble. The macintosh and PC of this generation have now emerged. Google is the IBM of this era, somewhat locked down and surrounded by a cloud of followers that innovate a bit but not so far they lose compatibility. If I can convince this generation's equivalents of Compaq and HP and Gateway to all do the same thing, then Google might take it up the same way Linus Torvalds finally merged squashfs when it became ubiquitous. (Basically, because his reluctance to do so had ceased to matter to anybody but him: it was universally used anyway.)

Sigh. It wasn't the Musl developers' fault I got excited without checking the details. It's a bit painful to see an enormous missed opportunity right next to a corresponding waste of time, effort, and talent, like watching someone dying of thirst crawl right past an oasis. But as they say, "you can lead a horse to water"...

January 17, 2012

Didn't get an Aboriginal release out this weekend, instead I wrote up documentation on toybox's argument parsing logic and finally did a first pass at collating the zillion toybox todo snippets into the main todo file.

Tried to play with musl (which seems to be trying to replace uClibc the way toybox is trying to replace busybox), but the git repository has been down all evening. (The web page is up, but the git repo isn't?) Oh well, I tried.

Honestly, Github exists, mirroring git is _trivial_ and he didn't _bother_. The website isn't down, just the git repo; I take that to mean the project's author doesn't think the repository is worth anybody else's time to pay attention to (or he would have made it possible for us to do so). And thus the project goes back down my todo list to the "cat flossing" levels. Oh well.

January 13, 2012

Resisted updating my notes.html symlink to point to the notes-2012.html file for a week and change because I'm writing an updated version of the python rss generator that'll also split the big file into individual files with prev/next links and an index. (To make linking to individual entries a lot easier.)

Alas, approaching 2 weeks into the new year and not having finished it... time to set the symlink anyway.

I also need to get an aboriginal linux release out. And update the toybox todo. Maybe this weekend.

Day jobs. They are time consuming.

January 11, 2012

The FSF deleted the last GPLv2 release of binutils (2.17) off their website, and replaced it with a binutils 2.17a. I downloaded that and diffed it against the real 2.17, and the first difference was:

--- binutils-2.17/cgen/cpu/fr30.cpu     1969-12-31 18:00:00.000000000 -0600
+++ 2011-08-24 06:40:39.000000000 -0500
@@ -0,0 +1,1863 @@
+; -*- Scheme -*-
+; Copyright 2011 Free Software Foundation, Inc.
+; Contributed by Red Hat Inc;
+; This file is part of the GNU Binutils.
+; This program is free software; you can redistribute it and/or modify
+; it under the terms of the GNU General Public License as published by
+; the Free Software Foundation; either version 3 of the License, or
+; (at your option) any later version.

The new file is GPLv3. Those bastards at the FSF deleted the last GPLv2 release of binutils off their website, replaced it with a GPLv3 version, and REDIRECTED THE OLD FILENAME TO POINT TO THE NEW FILE WHICH IS UNDER A DIFFERENT LICENSE.

Wow that's evil. (They keep using the word "freedom", but they seem to think it means we should do only what they want us to, be happy with what they give us without question, and act in obedience to their whims. I do not think it means what they think it means. We are apparently NOT free to consider GPLv3 a bad idea and want no part of it; they'll take away our old GPLv2 stuff and inflict the new license on us by stealth if necessary. GPLv3 hadn't been released when binutils 2.17 shipped, but apparently that's the license it's under now. At least if you get it from their website.)

Luckily Aboriginal Linux's sha1sum check caught it, and automatically fell back to my mirror location, which still has the real 2.17. I noticed all this while trying to figure out why that had happened.
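The check-then-fall-back logic is simple enough to sketch. This is not the actual Aboriginal Linux download script (which isn't shown here), just a minimal illustration of the idea; `fetch_verified` and its `fetch` callback are hypothetical names, and the caller supplies whatever actually downloads the bytes.

```python
import hashlib

def fetch_verified(locations, expected_sha1, fetch):
    """Try each location in order; return (location, data) for the first
    download whose SHA-1 matches expected_sha1, or None if none match."""
    for location in locations:
        try:
            data = fetch(location)
        except OSError:
            continue            # unreachable server: try the next location
        if hashlib.sha1(data).hexdigest() == expected_sha1:
            return location, data
        # Hash mismatch: the file at this name is not the one we recorded,
        # so ignore it and fall through to the next (mirror) location.
    return None
```

The point of the design is that the recorded hash, not the filename, identifies the release: a server that swaps the contents behind an old name just looks like a failed download.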

January 10, 2012

Sigh. Me and my big mouth.

The tl;dr version: somebody was an asshole on IRC and I responded in kind for about 15 seconds before /ignoring them, randomly rolled a critical hit on the parting shot, and thus got kicked out of a position in Funtoo development I'd never asked for in the first place. Oh well.

So hanging out on the #funtoo channel, one of the devs suddenly started going off on a random unprovoked rant about how "If welfare worked, I'd lose weight when thin people exercise", and so on.

This pissed me off, since my sister and her four kids have been on food stamps and living in heavily subsidized housing ever since her husband left her for the wife of some military guy deployed to Iraq. I've sent her tens of thousands of dollars over the years, in the past year alone I bought each niecephew a netbook, and flew all five of them to florida to see their great grandparents with us over christmas. My father and grandfather have also each sent her five figure amounts, but I haven't got the money to actually _support_ her, and if she moves she'd lose custody of her kids.

Another friend was recently homeless (she moved to Austin for a job right as the recession hit and the job wasn't here anymore when she got here because the _company_ went away, then she was tied to an apartment lease in a strange city but couldn't find a new job during her three months of saved living expenses, had her car repossessed shortly before she was evicted from her apartment, had her purse stolen her first week on the street with all her photo ID in it, and spent about nine months camped in a park before I found her). She's currently unemployed again (I helped her put her life back together enough to get a minimum wage clerical job but it only lasted a couple months), and now she's trying to survive a bad infection of both kidneys without health insurance (this is apparently why she hasn't been sleeping well since before christmas; as with all poor people she only went to the doctor when the problem wouldn't go away on its own for a long time, so it's pretty advanced by the time it's diagnosed). She's on 1500 milligrams of antibiotics a day, which I gave her most of the money to pay for (she borrowed the rest from a friend, another dancer she met dancing at a strip club; which was the only job she could get without photo ID and turns out to pay horribly when four different managers want to be "tipped out" to the tune of $50 each every night; she wound up _losing_ money half the time, especially since her skin's a bit sun-damaged from holding a sign on street corners, which might make maybe half minimum wage on a good day and literally nothing on a bad day).

The clinic she went to (the emergency room just tried to give her Vicodin and send her away) says she really _needs_ to be on intravenous antibiotics, but she can't afford them, and the next appointment to see if she can qualify for medicaid isn't until friday. Last I heard one of her kidneys had already shut down. I'm seriously worried she's going to die, of something entirely treatable, because she hasn't got health insurance.

This is not "some anecdote". Her name is Heather, her 26th birthday is a week from Friday. I was hoping to get her dental work for her birthday (living on the streets is really hard on the teeth, first place she went to said a dozen teeth can't be saved and need to come out), but after the other medical bills I can't afford it.

Another friend is dead broke and moved back in with her father because she can't find a job (her degree is in mortuary science, kinda specific), and can't really search for one from his suburban house without a car. (She previously lived with her mother in NYC, and thus never got a driver's license. She's terrified of having to move back in with her mother "because of the roaches".) Her father recently had his _second_ sudden massive abdominal surgery (the details are horrific, but he survived and is doing well enough to have the colostomy bag removed soon), so now she's taking care of him. He's been underemployed since the dot-com crash caused his business to fold, and back on the market in his 50's he hit the age discrimination our industry's full of, so he had to start social security early (hence is getting less of it), then he tried to take up the post-mortgage collapse "refinancing" plans that basically meant the bank told him to stop paying his mortgage until the process completed, and then tried to repossess his house when he complied. (Ongoing legal battle there is in something like its third year, involving every form of malfeasance on the part of the bank you can imagine, including up through the "robo-signers".)

Another friend is currently broke and unemployed after high school, due to a motorcycle accident that hospitalized her (her knee is still screwed up due to severed nerves) that prevented her from starting college on time, and thus cost her a scholarship. She moved from Arkansas to Louisiana recently (more or less couch surfing), and tried to check into a mental hospital (feeling suicidal) which wouldn't take her.

Another friend's been hospitalized a couple times for stress (related to his father winding up in jail basically for life) and a back injury (he's in his 20's but fell down the stairs in his apartment), but luckily he had health insurance. Except now he's unemployed and paying something like $1000/month for Cobra, which has eaten through his savings and is currently going on credit cards (along with his rent and other living expenses) while he job hunts. He's a fairly recent college graduate with only ~3 years experience on his resume, so even though he's probably smarter than me it's a lot harder for him to find a job without moving to a strange city where he has no friends.

This is not an exhaustive list. My ex-roommate Reese was homeless for a while, she got back on her feet after I let her "rent" a room from me for a year and only pay for one month of that. Back when I dabbled in real estate in the late 90's I rented a place to a guy named Tim who had been homeless for a while before I met him (dug himself out of it waiting tables, eventually saved up and bought a trailer in a trailer park). My brother tried his own dabbling in real estate and the mortgage crisis happened while he was trying to fix up and sell four properties (one of which he was living in), they all wound up getting repossessed, and he lost his job, so he moved back in with his father for a while.

So when this guy on the Funtoo list started randomly mouthing off about how safety net programs didn't help _him_ and were thus worthless, I told him to "die in a fire" before slash-ignoring him. This was my mistake.

A couple hours later Daniel Robbins informed me that the guy's father _did_ die in a fire (about five years ago), and the police suspect him of arson and raided his house last week, so it was the most hurtful possible thing I could have said to him (which I honestly didn't know), and I "couldn't represent the Funtoo project" anymore, which actually comes as something of a relief.

I never actually asked to be on the Funtoo core team, and warned him I probably wouldn't have time to do much with it, I was just working on a technology he found useful. I've wanted to get bootstrap-gentoo working for _years_ and still do, and it's not actually that _hard_ since he and I did about the first third of it in a single evening when he visited Austin last month. But the bits I don't know are in _gentoo_, and learning the guts of gentoo just isn't all that interesting to me, so it's been on my todo list for years and hasn't quite made it to the top yet.

The reason not being on this core team anymore comes as a relief is I no longer feel _guilty_ about not spending enough time on it. Funtoo wanted me to maintain wiki pages and regression test their "metro" image builder tool (which is another highly integrated thing like catalyst that's almost useless for the bootstrapping I want to do: it's simply not compatible with anything other than itself). These are all interesting todo items, but I have no time! (Heck, I never did set up a standalone funtoo system, and deleted my funtoo chroot while I was in Florida because my fiddling around had glitched it and my netbook was out of space anyway. Now I don't have to feel guilty about not finding time to set it back up.)

That said, yeah I shouldn't have said that to whoever that guy was, wouldn't have if I'd known the context, and would happily apologize to him. Turnabout is fair play and all, but I also had the option to be _better_ than him, and probably should have taken it. To do otherwise makes Wil Wheaton sad.

(I think the real lesson I'm taking away from all this is I shouldn't let it get to be afternoon without having eaten anything all day if there's any chance I'll have to interact with people.)

January 8, 2012

Other bugs in aboriginal linux:

1) Powerpc threading segfaults immediately on application launch. (Non-threaded stuff seems to work fine.)

2) Sparc threading hangs, and the g++ "hello world" build fails with an unknown link type.

3) If the network card config fails the CPUS count is still set to 3, but without distcc the builds don't get distributed, and boards with only 256 megs of ram can't do -j 3 builds locally without dying.

All of these were there in the 1.1.0 release (and in fact sparc dynamic linking didn't work at all) so they're not _regressions_ and thus don't hold up the release, but I should fix them next time around. (And the proper fix to #3 is really to fix the network cards.)

Still debugging the 3.2 kernel. Apparently everybody's network drivers went bye-bye and need a vendor symbol now (not just the intel stuff), and one of my long term todo items has been to switch more board emulations over to gigabit ethernet (which has less overhead and thus works faster than 100baseT or 10baseT emulations).

Unfortunately, enabling the intel E1000 driver on armv5l panics the kernel at some point after it loads. (Sometimes. Other times, it boots up but the interface still doesn't work, I don't think the device probe actually finds one even when I tell qemu to provide one using the same arguments that work on i686. Possibly the qemu code for that is target-specific, even though it's a PCI device.)

January 7, 2012

I got an ssh account on Daniel Robbins' fire-breathing 16-way server with 48 gigs of ram (actually an openvz container), and managed to wget the 3.2 kernel over there yesterday. Today I can't download it to my laptop from, so I copied the one from that server.

Yes, the 3.2 kernel is out and even today is melting under the strain of serving it. I'm told the local mirrors ( and such) aren't back up yet. I never got a release out with 3.1 (upgraded the repository to that right after cutting 1.1.0 in October), so I waited for 3.2 to come out and now I'm trying to get 3.2 out promptly to make staying in sync with kernel releases a bit easier.

One of the perl removal patches needed to be rediffed, but that was just to remove fuzz; otherwise still works just fine. The sparc relocation fix for 3.1 is already upstream, so yank that.

In 3.2, the network card config symbols have been deeply screwed up for some reason: now you need to specify which vendor the cards belong to. If you want the E1000 driver, you have to add CONFIG_ETHERNET and CONFIG_NET_VENDOR_INTEL, for no apparent reason. I don't know why they did this. Ask commit dee1ad47f2ee7.
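A quick way to catch this sort of thing is to scan the .config for the symbols a driver now depends on. The following is a rough sketch, not part of any existing build script: `missing_symbols` and `E1000_NEEDS` are made-up names, CONFIG_E1000 is the driver's own symbol, and the other two are the ones named above.

```python
def missing_symbols(config_text, required):
    """Return the required CONFIG_ symbols not set to y or m in a .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue            # skips "# CONFIG_FOO is not set" and blanks
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [sym for sym in required if sym not in enabled]

# The driver symbol plus the two extra symbols 3.2 wants before it shows up.
E1000_NEEDS = ["CONFIG_ETHERNET", "CONFIG_NET_VENDOR_INTEL", "CONFIG_E1000"]
```

Feeding it a pre-3.2 miniconfig that only sets CONFIG_E1000=y reports the two vendor-menu symbols as missing, which is the symptom described here.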

January 4, 2012

I'm mucking about with the early boot code of a Texas Instruments board at work, which means I'm reading through the guts of U-boot.

Bootable kernels tend to make an ELF image, and then run it through "objcopy -O binary elffile binfile" to create a fully linked runnable binary blob of raw machine language, with a fixed load address and start address, that can actually run unmodified on the hardware.

In Linux, this ELF file is called "vmlinux", and in U-boot it's "u-boot", and both are still around after the build if you want to play with them. (QEMU can cheat slightly: it contains an ELF loader that does what objcopy does, and in the process it can determine what load address and start address to use from the ELF metadata. So "qemu -kernel vmlinux" (or u-boot) works on some targets. One of my todo items is to make it work on all of them, the fiddliness is how to feed in the kernel command line.)

The start address generally corresponds to the symbol "_start" in the ELF file (it's actually a field in the ELF header, but the location of _start is the default that field gets set to by the ELF tools), which is some assembly function that does the really low level setup necessary before it can jump to C code. You have to set the processor into the right mode, make sure problematic hardware we haven't configured yet is switched off (disable interrupts and Memory Management Unit), and set the stack pointer to a chunk of memory so C has local variables and so the first C function can call other functions and return from them.
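The start address field is easy to pull out by hand. Here's a minimal sketch that reads e_entry straight from the ELF header, using the standard header layout (16 bytes of e_ident, then e_type, e_machine, e_version, then e_entry at offset 24). The function name `elf_entry` is just for illustration.

```python
import struct

def elf_entry(header):
    """Return the e_entry (start address) field from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                  # EI_CLASS: 1 = 32 bit, 2 = 64 bit
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big endian
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4),
    # and is 4 bytes wide in 32 bit files, 8 bytes wide in 64 bit files.
    fmt = endian + ("Q" if is_64bit else "I")
    return struct.unpack_from(fmt, header, 24)[0]
```

Reading the first 64 bytes of a vmlinux or u-boot file and passing them in should give the same number "readelf -h" prints as the entry point address.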

The first thing the C code generally does is set up the DRAM controller to start refreshing memory at the correct rate, so data written to DRAM actually stays there. Until the DRAM controller is set up (which is nontrivial, since the same hardware has to work with different brands/speeds/sizes of memory that get refreshed in different ways), all reads from DRAM return random garbage.

This raises a couple of interesting questions. The first is "How does u-boot run before the DRAM init happens?" By running out of something other than DRAM. Your board has to have some non-volatile memory such as ROM or Flash to provide a boot program when you switch it on, which has the disadvantage of being insanely slow compared to DRAM, but once the DRAM is initialized you can copy the rest of your code into DRAM and run from there.

The other question is: where do you get memory for the C stack? In the olden days, C's need for a stack led BIOS vendors to write the DRAM controller init routines in assembly language using no storage other than the CPU registers, which was horrible and is why BIOS development used to be black magic nobody understood. Then the "coreboot" project came up with the trick of repurposing the processor's data cache via a static TLB entry, so a chunk of address ranges were stored in cache instead of DRAM, providing enough stack space to run a DRAM controller setup function in C. (And there was much rejoicing.) You only need to do it the old way on processors with no L1 or L2 cache (which are obsolete these days because the latency of DRAM fetch from chips a couple inches away does not mix well with modern clock speeds: just sending a signal down that much wire is several clock cycles' round trip, let alone having the DRAM circuitry actually look stuff up).

In general, having more assembly code than necessary raises maintenance problems: it's extremely awkward to write and debug assembly compared to C, and every processor has its own slightly different assembly language which few people know all that well. You generally want to jump to C code as fast as possible, where the potential number of people who can review/maintain your code increases by an order of magnitude and you have many more tools at your disposal.

Alas, somebody needs to tell Texas Instruments this.

In this case, Texas Instruments went with an overcomplicated 3 layer approach, plus extra hardware. It includes a ROM in its system-on-chip and builds U-boot twice. The ROM loads the first instance of U-boot (called "MLO" for some reason) from an SD card into 256k of dedicated on-chip memory (which is not the same as the L2 cache, and is thus mostly wasted after boot time). The ROM has to have a device driver to talk to the SD card through an MMC bus and parse the FAT filesystem to find and read the MLO file. The MLO instance of U-boot has to have its own MMC+SD+FAT driver to load a bigger U-boot into DRAM after it's initialized the DRAM refresh circuitry. And then that U-boot has to have a third instance of the same controller in order to load Linux from the SD card.

Yes, really.

In the u-boot board I'm working on right now, _start is defined in the file "arch/arm/cpu/arm_cortexa8/start.S", and it does _not_ do the minimal work necessary to jump to C. Instead, Texas Instruments wrote extensive setup code in assembly language.

The _start code begins with a branch to the label "reset:", jumping past some memory used to store variables. The reset routine sets the cpu to SVC32 mode, and then has a large blob to "copy vectors to mask ROM indirect addr" inside an #ifdef CONFIG_OMAP34XX that I really hope doesn't apply to the board I'm using. (I doubt Wolfgang Denk would allow such a horror upstream into his code; most likely it's only present in the vendor fork I'm using.)

Next it optionally calls cpu_init_crit. This disables the processor cache and MMU (in case something left them on), and then calls lowlevel_init, which is board-specific init (in this case in arch/arm/cpu/arm_cortexa8/ti81xx/lowlevel_init.S). What does "optionally" mean here? It's in an #ifndef CONFIG_SKIP_LOWLEVEL_INIT, but that doesn't seem to be set in our board's include/config.h or the files sourced from it.

Lots of plumbing in lowlevel_init.S involves NOR flash, which our board config doesn't seem to have enabled either. The lowlevel_init: label is at line 383 in the source I'm looking at, and minus the NOR bits it does two things: 1) Set the stack pointer to the end of the first block of physical memory (SRAM0_START+SRAM0_SIZE-4). 2) Try to figure out if we're running from DRAM already or not.

January 3, 2012

Bootable kernels tend to make an ELF image, and then run it through "objcopy -O binary elffile binfile" to create a fully linked runnable binary blob.

The rest of this blog entry just explains that statement.

The ELF file format is a container format (like zip/tar/cpio, or the one the "ar" tool uses for *.a static libraries). Executable files, shared libraries, and the *.o files created by compiling source code are all different kinds of ELF files.

Instead of a bunch of separate files, this format is optimized to contain chunks of program code. The archive's metadata describes what kind of code it contains (armv7 little endian 32 bit using the Thumb2 extensions...), and lists a bunch of named "symbols" describing the chunks of data in the archive. This includes "sections" of memory (with attributes like read-only or zero filled), plus the variables and functions that live in those sections. Each variable or function includes a size in bytes, the section name it lives in, the offset within that section it starts at, and so on. A bunch of tools can show this data, but "objdump -x" and "readelf -a" are the big ones.
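To make the "metadata describes what kind of code it contains" part concrete, here's a minimal sketch in Python that decodes the first 20 bytes of an ELF header. The field offsets and the handful of numeric codes come from the ELF specification; the describe_elf name and the hand-built sample header are made up for illustration, and this only scratches the surface of what objdump and readelf report.

```python
import struct

# Values from the ELF specification (e_ident, e_type and e_machine fields).
ELF_MAGIC = b"\x7fELF"
CLASSES = {1: "32 bit", 2: "64 bit"}
ENDIAN = {1: "little endian", 2: "big endian"}
TYPES = {1: "relocatable (*.o)", 2: "executable", 3: "shared object"}
MACHINES = {3: "x86", 40: "arm", 62: "x86-64", 183: "aarch64"}

def describe_elf(header):
    """Summarize the first 20 bytes of an ELF file (hypothetical helper)."""
    if header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    # e_type and e_machine are 16 bit fields right after the 16 byte e_ident,
    # stored in the byte order that e_ident announces.
    fmt = "<HH" if ei_data == 1 else ">HH"
    e_type, e_machine = struct.unpack_from(fmt, header, 16)
    return {
        "class": CLASSES.get(ei_class, "unknown"),
        "endian": ENDIAN.get(ei_data, "unknown"),
        "type": TYPES.get(e_type, "other"),
        "machine": MACHINES.get(e_machine, "unknown (%d)" % e_machine),
    }

# A hand-built header for a 32 bit little endian ARM executable.
sample = ELF_MAGIC + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 40)
info = describe_elf(sample)
print(info)
```

To try it on a real file, pass it the first 20 bytes, e.g. describe_elf(open("vmlinux", "rb").read(20)).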

The Linux kernel contains an ELF loader that can parse ELF files and load them into memory, sometimes delegating to a dynamic linker (another program listed in the ELF headers as responsible for dealing with this program).

But running hardware needs raw machine language, so at boot time you need a fully linked binary blob, a "load address" to copy it into memory at, and a "start address" to jump to. (Sometimes the start address is the same as the load address. Sometimes a jump to the real start address is inserted at the very start of the file.)

Bootable kernels tend to make an ELF image, and then run it through "objcopy -O binary elffile binfile" to create a fully linked runnable binary blob with a known load address and start address. In the case of Linux, the ELF file is called "vmlinux". In the case of u-boot, the ELF file is called "u-boot". They're still around at the end of the build, and you can play with 'em if you like. (If you ever hook a debugger up to a JTAG interface to debug a running kernel, loading the ELF file into the debugger gives it all the symbol information to provide symbol and function names instead of raw memory addresses for everything.)
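Conceptually, turning an ELF image into a flat blob means laying each chunk out at its address relative to the lowest one and padding the gaps. Here's a rough Python sketch of that idea. It is not objcopy's actual implementation: the flatten function, the addresses, and the byte values are all hypothetical, and zero filling the gaps is an assumption of this sketch.

```python
def flatten(sections):
    """Lay (address, bytes) chunks out in one blob starting at the lowest address.

    Returns (load_address, blob). Gaps between chunks are zero filled.
    """
    load_address = min(addr for addr, _ in sections)
    end = max(addr + len(data) for addr, data in sections)
    blob = bytearray(end - load_address)
    for addr, data in sections:
        offset = addr - load_address
        blob[offset:offset + len(data)] = data
    return load_address, bytes(blob)

# Hypothetical image: code at 0x80008000, initialized data 16 bytes later.
text = (0x80008000, b"\x01\x02\x03\x04")
data = (0x80008010, b"\xaa\xbb")
load_address, blob = flatten([text, data])
entry = 0x80008000  # hypothetical start address; here it equals the load address
print(hex(load_address), len(blob), hex(entry - load_address))
```

The blob itself carries no addresses, which is why whatever loads it has to be told the load address and start address some other way.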

A "relocatable" program can be loaded and run from any memory location, but either it has to be REALLY carefully programmed (only using relative addresses that are a given distance forward or backward from the current location), or some sort of wrapper has to run on it to adjust all its addresses according to where it's been loaded this time around. The relocation code has to be one big function using entirely local variables: it can't access any globals or call any other functions because those would have to be relocated, which is a chicken and egg problem.
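The "adjust all its addresses" step boils down to adding the distance the image moved to every word that holds an absolute address. Here's a minimal Python sketch of that, assuming a made-up fixup table that is just a list of byte offsets of 32 bit little endian words. Real relocation formats (such as ELF relocation entries) carry more information than this, and the relocate name and addresses are hypothetical.

```python
import struct

def relocate(image, fixups, linked_at, loaded_at):
    """Return a copy of image with each 32 bit little endian word listed in
    fixups (byte offsets) shifted by the distance the image moved."""
    delta = loaded_at - linked_at
    out = bytearray(image)
    for offset in fixups:
        (value,) = struct.unpack_from("<I", out, offset)
        struct.pack_into("<I", out, offset, (value + delta) & 0xFFFFFFFF)
    return bytes(out)

# Hypothetical image linked to run at 0x1000: word 0 holds a pointer to
# address 0x1008, word 1 is a plain constant that must not change.
image = struct.pack("<II", 0x1008, 42)
moved = relocate(image, fixups=[0], linked_at=0x1000, loaded_at=0x80000000)
print([hex(w) for w in struct.unpack("<II", moved)])
```

Note that the function only touches its arguments and locals, which mirrors the constraint on real relocation code: it can't lean on anything that itself still needs relocating.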

Another common kind of wrapper decompresses the program, so the binary you actually load can be compressed and thus much smaller. Both kinds of wrapper run first, and then jump to the real starting location once they've done their work on the runnable image. Decompression is a form of relocation, but the wrapper is just copying the output to a known location so it doesn't actually have to patch all its jump and read/write instructions that use an address: those can still be fixed at build time.
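A decompressing wrapper is simpler than a relocating one because nothing gets patched: the payload lands at the address it was built for. Here's a toy Python sketch, with a bytearray standing in for RAM and zlib standing in for whatever compression a real bootloader uses. The unpack_and_enter name, the 64 byte "RAM", and the load address are all hypothetical.

```python
import zlib

def unpack_and_enter(memory, compressed, load_address):
    """Decompress the payload into memory at load_address.

    Returns the address to jump to, assuming (for this sketch) that the
    start address is the same as the load address.
    """
    payload = zlib.decompress(compressed)
    memory[load_address:load_address + len(payload)] = payload
    return load_address

# Hypothetical 64 byte "RAM" and a payload built to run at offset 16.
ram = bytearray(64)
original = b"\x01\x02\x03\x04" * 4
stored = zlib.compress(original)
entry = unpack_and_enter(ram, stored, load_address=16)
print(entry, ram[16:32] == original)
```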

January 2, 2012

Sprint is too officially stupid to live. I think breaking the contract is probably worth it. I actually want them to go out of business now, because they've EARNED it.

This morning Sprint's horrible network got astonishingly bad. From my house, with three bars signal, I managed to load _one_ web page (no graphics) after half an hour of trying. Switching to airplane mode and back (force-reinitializing the radio) didn't help, so I rebooted, and when that didn't help I finally broke down and allowed my phone to do the "upgrade" it's been pestering me about since I got it.

This "upgrade" did exactly one thing: it disabled tethering in my phone. That's it. Tethering worked fine out of the box. Now the instant I switch on USB or Wireless tethering, the 3g icon goes away and the bars turn from green to gray. As soon as I switch it off, it comes back. 100% reliable. (I'm not saying the green 3g icon actually passes PACKETS, I'm just saying whether or not it's got a data connection associated with the tower.)

I called sprint's tech support, and they said:

  • Their network has been having 3g outages since the 25th, still ongoing. This week of bad service is apparently not noteworthy.
  • They want to charge me an extra $30 a month to restore my ability to connect my laptop to my phone, which WORKED FINE OUT OF THE BOX until they "upgraded" it away. It's fine for me to repeatedly download DVD images onto my phone and delete them (it's got something like 12 gigabytes of storage, I can queue up a 4 gig image each day entirely in the phone with about 30 seconds effort on my part), so clearly this isn't a _bandwidth_ issue. This is them being greedy.

They already charge me $100/month for the phone. T-mobile is advertising literally half that ($50/month) and their network is reasonably reliable, which sprint's isn't.

This means T-mobile is $70/month cheaper for better service, and breaking the contract with sprint (one month into it) would only cost $350, so I'd come out ahead after 5 months.

I'm sorry, Sprint, but given the above you DESERVE to go out of business.

January 1, 2012

And a new year. Introspection time.

Still employed. I've been doing the "work half the year, do open source half the year" thing for ages, and now that I dunno when/if the Polycom job will end (and hope it won't, but the future's hard to predict), I am either totally falling behind on my open source stuff with a 9-5 job, or falling behind AT THE JOB if I do the "get up at 5am to have some open source programming time before work" thing. It works out ok if I get to bed by 9pm, but I wind up hanging out with Fade instead and going to work on less than 6 hours of sleep, which doesn't end well. My _ideal_ job would be half time, I probably wouldn't even mind sitting in a cubicle for that, but most employers just aren't set up that way.

Still in the tiny condo across the street from a fraternity. My trip to Florida cleared up my sinuses so tremendously that I'm reluctant to buy a bigger place in Austin: Kelly warned years ago that if I didn't have asthma or sinus problems, Austin would provide. And it has. If it's due to being a block downwind of Pease Park's Pollen, moving would help. If the condo is full of black mold or something (which I haven't seen any sign of but who knows), moving would help. But if it's because Austin is at the intersection of four different ecosystems (hills, desert, plains, ocean) and gets allergens from all of 'em, moving within Austin wouldn't help.

Four cats is still too many cats. I love the kiggies but I don't want any more kiggies after these: the smell is incredible, they threw up on the couch like five times while we were out, I suspect half the reason my sinuses are so screwed up is incessant cat dander in a confined space, it's hard to travel even when we have time off because of cat care arrangements, and I can't work at home because they constantly pester me for attention (even when we HAVEN'T just been away). Fade and I started accumulating the current batch in 2003 so they're coming up on 8 years old, and going by how long my previous cats have lived this means we've got another decade or so of overwhelming cattitude.

Fade and I still want to have kids, but it hasn't happened and doesn't seem likely to. We looked into adopting, but it's a morass of regulation and hoops to jump through with a multi-year wait that seems designed to guarantee you get the kid just in time to send them off to college, or some such. Oh well. I suppose four cats pretty much ate the household's caring for small ones bandwidth anyway.

I have various unhappy friends and relatives, most of whom need jobs. I try to help out where I can but even with as much money as I'm making it's not enough to address half their needs. (It's nice to be able to help, but buying Heather an extra bottle of ibuprofen is no substitute for the couple thousand dollars worth of dental work she needs. Yes, real example.) I turn 40 later this year. I should probably be focusing on saving for retirement, since the republicrats will have destroyed social security and medicare by the time I get there. (The republicans are evil, the democrats are dishrags hoping that Zeno's paradox will protect everything they hold dear as they endlessly meet the unmoving opposition halfway. Not a good combination.)

I need to redo my blog's rss feed generator to create individual pages for each blog entry, with forward/back links, so actually linking to specific blog entries (or chaining together a few of them on a given topic) is a bit more feasible.
