changeset 595:6695c1cbdfdd
And so it begins.
author | Rob Landley <rob@landley.net> |
---|---|
date | Wed, 13 Jun 2012 09:33:28 -0500 |
parents | 2365d90138f5 |
children | 3cffd74ad346 |
files | TODO VERSION todo/TODO.old todo/commands.txt todo/todo.txt |
diffstat | 5 files changed, 345 insertions(+), 82 deletions(-) |
--- a/TODO Thu Apr 24 16:05:02 2008 -0500
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,81 +0,0 @@
-TODO list:
-
-- bug with defines:
-    #define spin_lock(lock) do { } while (0)
-    #define wq_spin_lock spin_lock
-    #define TEST() wq_spin_lock(a)
-- typedefs can be structure fields
-- see bugfixes.diff + improvement.diff from Daniel Glockner
-- constructors
-- cast bug (Peter Wang)
-- define incomplete type if defined several times (Peter Wang).
-- long long constant evaluation
-- configure --cc=tcc (still one bug in libtcc1.c)
-- disable-asm and disable-bcheck options
-- test binutils/gcc compile
-- add alloca(), __builtin_expect()
-- gcc '-E' option.
-- optimize VT_LOCAL + const
-- tci patch + argument.
-- '-b' bug.
-- atexit (Nigel Horne)
-- see -lxxx bug (Michael Charity).
-- see transparent union pb in /urs/include/sys/socket.h
-- precise behaviour of typeof with arrays ? (__put_user macro)
-- #include_next support for /usr/include/limits ?
-  but should suffice for most cases)
-- handle '? x, y : z' in unsized variable initialization (',' is
-  considered incorrectly as separator in preparser)
-- function pointers/lvalues in ? : (linux kernel net/core/dev.c)
-- transform functions to function pointers in function parameters (net/ipv4/ip_output.c)
-- fix function pointer type display
-- fix bound exit on RedHat 7.3
-- check lcc test suite -> fix bitfield binary operations
-- check section alignment in C
-- fix invalid cast in comparison 'if (v == (int8_t)v)'
-- packed attribute
-- finish varargs.h support (gcc 3.2 testsuite issue)
-- fix static functions declared inside block
-- C99: add variable size arrays (gcc 3.2 testsuite issue)
-- C99: add complex types (gcc 3.2 testsuite issue)
-- postfix compound literals (see 20010124-1.c)
-- fix multiple unions init
-- setjmp is not supported properly in bound checking.
-- better local variables handling (needed for other targets)
-- fix bound check code with '&' on local variables (currently done
-  only for local arrays).
-- sizeof, alignof, typeof can still generate code in some cases.
-- bound checking and float/long long/struct copy code. bound
-  checking and symbol + offset optimization
-- Fix the remaining libtcc memory leaks.
-- make libtcc fully reentrant (except for the compilation stage itself).
-- '-MD' option
-
-Optimizations:
-
-- suppress specific anonymous symbol handling
-- more parse optimizations (=even faster compilation)
-- memory alloc optimizations (=even faster compilation)
-
-Not critical:
-
-- C99: fix multiple compound literals inits in blocks (ISOC99
-  normative example - only relevant when using gotos! -> must add
-  boolean variable to tell if compound literal was already
-  initialized).
-- add PowerPC or ARM code generator and improve codegen for RISC (need
-  to suppress VT_LOCAL and use a base register instead).
-- interactive mode / integrated debugger
-- fix preprocessor symbol redefinition
-- better constant opt (&&, ||, ?:)
-- add portable byte code generator and interpreter for other
-  unsupported architectures.
-- C++: variable declaration in for, minimal 'class' support.
-- win32: add __stdcall, __intxx. use resolve for bchecked malloc et
-  al. check GetModuleHandle for dlls. check exception code (exception
-  filter func).
-- handle void (__attribute__() *ptr)()
-
-
-
-
--- a/VERSION Thu Apr 24 16:05:02 2008 -0500
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,1 +0,0 @@
-0.9.23
\ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/todo/TODO.old Wed Jun 13 09:33:28 2012 -0500
@@ -0,0 +1,81 @@
+TODO list:
+
+- bug with defines:
+    #define spin_lock(lock) do { } while (0)
+    #define wq_spin_lock spin_lock
+    #define TEST() wq_spin_lock(a)
+- typedefs can be structure fields
+- see bugfixes.diff + improvement.diff from Daniel Glockner
+- constructors
+- cast bug (Peter Wang)
+- define incomplete type if defined several times (Peter Wang).
+- long long constant evaluation
+- configure --cc=tcc (still one bug in libtcc1.c)
+- disable-asm and disable-bcheck options
+- test binutils/gcc compile
+- add alloca(), __builtin_expect()
+- gcc '-E' option.
+- optimize VT_LOCAL + const
+- tci patch + argument.
+- '-b' bug.
+- atexit (Nigel Horne)
+- see -lxxx bug (Michael Charity).
+- see transparent union pb in /urs/include/sys/socket.h
+- precise behaviour of typeof with arrays ? (__put_user macro)
+- #include_next support for /usr/include/limits ?
+  but should suffice for most cases)
+- handle '? x, y : z' in unsized variable initialization (',' is
+  considered incorrectly as separator in preparser)
+- function pointers/lvalues in ? : (linux kernel net/core/dev.c)
+- transform functions to function pointers in function parameters (net/ipv4/ip_output.c)
+- fix function pointer type display
+- fix bound exit on RedHat 7.3
+- check lcc test suite -> fix bitfield binary operations
+- check section alignment in C
+- fix invalid cast in comparison 'if (v == (int8_t)v)'
+- packed attribute
+- finish varargs.h support (gcc 3.2 testsuite issue)
+- fix static functions declared inside block
+- C99: add variable size arrays (gcc 3.2 testsuite issue)
+- C99: add complex types (gcc 3.2 testsuite issue)
+- postfix compound literals (see 20010124-1.c)
+- fix multiple unions init
+- setjmp is not supported properly in bound checking.
+- better local variables handling (needed for other targets)
+- fix bound check code with '&' on local variables (currently done
+  only for local arrays).
+- sizeof, alignof, typeof can still generate code in some cases.
+- bound checking and float/long long/struct copy code. bound
+  checking and symbol + offset optimization
+- Fix the remaining libtcc memory leaks.
+- make libtcc fully reentrant (except for the compilation stage itself).
+- '-MD' option
+
+Optimizations:
+
+- suppress specific anonymous symbol handling
+- more parse optimizations (=even faster compilation)
+- memory alloc optimizations (=even faster compilation)
+
+Not critical:
+
+- C99: fix multiple compound literals inits in blocks (ISOC99
+  normative example - only relevant when using gotos! -> must add
+  boolean variable to tell if compound literal was already
+  initialized).
+- add PowerPC or ARM code generator and improve codegen for RISC (need
+  to suppress VT_LOCAL and use a base register instead).
+- interactive mode / integrated debugger
+- fix preprocessor symbol redefinition
+- better constant opt (&&, ||, ?:)
+- add portable byte code generator and interpreter for other
+  unsupported architectures.
+- C++: variable declaration in for, minimal 'class' support.
+- win32: add __stdcall, __intxx. use resolve for bchecked malloc et
+  al. check GetModuleHandle for dlls. check exception code (exception
+  filter func).
+- handle void (__attribute__() *ptr)()
+
+
+
+
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/todo/commands.txt Wed Jun 13 09:33:28 2012 -0500
@@ -0,0 +1,54 @@
+Seven packages. This is to replace binutils and gcc.
+
+FWL needs: ar as nm cc gcc make ld
+  - Why gcc (shouldn't cc cover it? What builds?)
+  - Need a make. Separate issue, busybox probably.
+
+Loot tinycc fork to provide:
+
+  cc - front-end option parsing
+    multiplexer (swiss-army-executable ala busybox)
+      cross-prefix, so check last few chars: cc,ld,ar,as,nm
+
+    Calls several automatically (assembler, compiler, linker) as necessary.
+      Pass on linker options via -Wl,
+
+    Merge in FWL wrapper stuff (ccwrap.c)
+      call out again? distcc support?
+
+    Path logic:
+      compiler includes: ../qcc/include
+      system includes: ../include
+      compiler libraries: ../qcc/lib
+      system libraries: ../lib
+      tools: built-in (or shell out with same prefix via $PATH)
+      command line stuff: current directory
+
+  ld - linker
+    #include <elf.h> which qemu already has.
+    Support for .o, .a, .so -> exe, .so
+    Support for linker scripts
+
+  ar - library archiver
+    Busybox has partial support (still read-only?)
+    ranlib?
+
+  cc1 - compiler
+    preprocessor (-E) support
+    output (.c->.o) support
+
+  as - assembler
+
+  nm - needed to build something?
+
+binutils provides:
+  ar as nm ld - already covered
+  strip, ranlib, addr2line, size, objdump, objcopy - low hanging fruit
+  readelf - uClibc has one
+  strings - busybox provides one
+
+  Probably not worth it:
+    gprof - profiling support (optional)
+    c++filt - C++ and Java, not C.
+    windmc, dlltool - Windows only (why is it installed on Linux?)
+    nlmconv - Novell Netware only (why is this installd on Linux?)
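Note on the "Path logic" entries in todo/commands.txt: todo/todo.txt in this changeset says the defaults such as ../qcc/include are relative to the directory where the qcc executable lives, not the current directory, and may instead be configured as absolute paths. A minimal sketch of that resolution in C might look like the following. The function name `resolve_default` and its signature are hypothetical and do not come from this changeset, and finding the executable's own path (for example through /proc/self/exe on Linux) is deliberately left out as platform-specific.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of this changeset): build a search path
 * from the location of the compiler executable and a configured default.
 *
 *   exe_path  path of the running executable, e.g. "/home/u/bin/qcc"
 *   dflt      default such as "../qcc/include", or an absolute path
 *   out       buffer receiving the result
 *
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int resolve_default(const char *exe_path, const char *dflt,
                    char *out, size_t outlen)
{
    int n;
    const char *slash;

    assert(exe_path != NULL && dflt != NULL && out != NULL);

    if (dflt[0] == '/') {
        /* Absolute defaults are used unchanged. */
        n = snprintf(out, outlen, "%s", dflt);
    } else {
        slash = strrchr(exe_path, '/');
        if (slash != NULL) {
            /* Keep the directory part of exe_path and append the default. */
            n = snprintf(out, outlen, "%.*s/%s",
                         (int)(slash - exe_path), exe_path, dflt);
        } else {
            /* No directory part: fall back to the current directory. */
            n = snprintf(out, outlen, "./%s", dflt);
        }
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The result is left unnormalized (the ".." component stays in the string), which is enough for passing to -I or -L style options; a real implementation would also need to handle the case where the compiler was found through $PATH and argv[0] carries no directory.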
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/todo/todo.txt Wed Jun 13 09:33:28 2012 -0500
@@ -0,0 +1,210 @@
+QCC - QEMU C Compiler.
+
+  Use QEMU's Tiny Code Generator as a backend for a compiler based on my old
+  fork of Fabrice Bellard's tinycc project.
+
+Why?
+
+  QEMU's TCG provides support for many different targets (x86, x86-64, arm,
+  mips, ppc, sh4, sparc, alpha, m68k, cris). It has an active development
+  community upgrading and optimizing it.
+
+  QEMU application emulation also provides existing support for various ELF
+  executable and library formats, so linking logic can presumably be merged.
+  (See elf.h at the top of qemu.) QEMU is also likely to grow coff and pxe
+  support in future.
+
+Building a self-bootstrapping system:
+
+  My Firmware Linux project builds the smallest self-bootstrapping system
+  I could come up with using the following existing packages:
+
+    gcc, binutils, make, bash, busybox, uClibc, linux
+
+  This new compiler should replace both binutils and gcc above. (As a smoke
+  test, the new system should still be able to build all seven packages.)
+
+  To build those packages, FWL needs the following commands from the host
+  toolchain. (It can build everything else from source, but building these
+  without already having them is a chicken and egg problem.)
+
+    ar as nm cc gcc make ld /bin/bash
+
+  The reason it needs "gcc" is that the linux and uClibc packages assume
+  their host compiler is named "gcc", and call that name instead of cc even
+  when it's not there. (You can mostly override this by specifying HOSTCC=$CC
+  on the make command line, although a few places need actual source patches.)
+
+  Ignoring gcc, make, and bash, this leaves "ar, as, nm, cc, and ld" as
+  commands qcc needs to provide for a minimal self-bootstrapping system.
+
+  Note that the above set of tools is specifically enough to build a fresh
+  compiler. When building a linux kernel, creating a bzImage requires objcopy,
+  building qemu requires strip, etc.
+
+What commands does the current gcc/binutils combo provide?
+
+  gcc 4.1 provides the commands:
+    cc/gcc - C compiler
+    cpp - C preprocessor (equivalent to cc -E)
+    gcov - coverage tester (optional debugging tool)
+
+  Of these, cc is required, cpp is low hanging fruit, and gcov is probably
+  unnecessary.
+
+  Binutils provides:
+    ar - archiver, creates .a files.
+    ranlib - generate index to .a archive (equivalent to ar -s)
+    as - assembler
+    ld - linker
+    strip - discard symbols from object files (equilvalent to ld -S)
+    nm - list symbols from ELF files.
+    size - show ELF section sizes
+    objdump - show contents of ELF files
+    objcopy - copy/translate ELF files
+    readelf - show contents of ELF files
+    addr2line - convert addresses to filename/line number (optional debug tool)
+    strings - show printable characters from binary file
+    gprof - profiling support (optional)
+    c++filt - C++ and Java, not C.
+    windmc, dlltool - Windows only (why is it installed on Linux?)
+    nlmconv - Novell Netware only (why is this installd on Linux?)
+
+  Of these, ar, as, ld, and nm are needed, ranlib, strip, addr2line, and
+  size are low hanging fruit, size, objdump, obcopy, and readelf are
+  variants of the same logic as nm, and gprof, c++filt, windmc, dlltool,
+  and nlmconv are probably unnecessary.
+
+Standards:
+
+  The following utilities have SUSv4 pages describing their operation, at
+  http://www.opengroup.org/onlinepubs/9699919799/utilities
+
+    ar, c99, nm, strings
+
+  This means the following don't:
+
+    ld, cpp, as, ranlib, strip, size, readelf, objdump, objcopy, addr2line
+
+  (There isn't a "cc" standard, but you can probably use "c99" for that.)
+
+Existing code:
+
+  multiplexer:
+
+    The compiler must be provide several different names, yet the same
+    functionality must be callable from a single compiler executable,
+    assembling when it encounters embedded assembler, passing on linker
+    options via "-Wl," to the linking stage, and so on.
+
+    The easy way to do this is for the qcc executable to be a swiss-army-knife
+    executable, like busybox. It needs a command multiplexer which can figure
+    out which name it was called under and change behavior appropriately, to
+    act as a compiler, assembler, linker, and so on.
+
+    This multiplexer should accept arbitrary prefixes, so cross compiler names
+    such as "i686-cc" work. This means instead of matching entire known names,
+    the multiplexer should checks that commands _end_ with recognized strings.
+    (This would not only allow it to be called as both "qcc" and "cc", but
+    would have the added bonus of making "gcc" work like "cc" as well.)
+
+    Both busybox and tinycc already handle this. Pretty straightforward.
+
+  cc/c99 - front-end option parsing
+
+    Both tinycc's options.c and ccwrap.c (in FWL) handle command line option
+    parsing, in different ways. Both take as input the same command line
+    syntax as gcc, which is more or less the c99 command line syntax from
+    SUSv4:
+
+      http://www.opengroup.org/onlinepubs/9699919799/utilities/c99.html
+
+    What ccwrap.c does is rewrite a gcc command line to turn "cc hello.c"
+    into a big long command line with -L and -I entries, explicitly specifying
+    header and library paths, the need to link against standard libraries
+    such as libc, and to link against crt1.o and such as appropriate.
+
+    Such a front end option parser could perform such command line rewriting
+    and then call a "cc1" that contains no built-in knowledge about standard
+    paths or libraries. This would neatly centralize such behavior, and
+    if the rewritten command line could actually be extracted it could be
+    tested against other compilers (such as gcc) to help debugging.
+
+    Note that adding distcc or ccache support to such a wrapper is a fairly
+    straightforward item for future expansion.
+
+    The option parser needs to distinguish "compiling" from "linking".
+
+    When compiling, the option parser needs to specify two include paths;
+    one for the compiler (varargs.h, defaulting to ../qcc/include) and
+    one for the system (stdio.h, defaulting to ../include).
+
+    When linking, the option parser needs to specify the compiler library
+    path (where libqcc.a lives, defaulting to ../qcc/lib), the system
+    library path (where libc.a lives, defaulting to ../lib), and add
+    explicit calls to link in the standard libraries and the startup/exit
+    code. Currently, ccwrap.c does all this.
+
+    Note that these default paths aren't relative to the current directory
+    (which can't change or files listed on the command line wouldn't be found),
+    but relative to the directory where the qcc executable lives. This allows
+    the compiler to be relocatable, and thus extracted into a user's home
+    directory and called from there. (The user's home directory name cannot
+    be known at compile time.) The defaults can also be specified as absolute
+    paths when the compiler is configured.
+
+    The current ccwrap.c also modifies the $PATH (so gcc's front-end can
+    shell out to tools such as its own "cc1" and "ld"), and supports C++.
+    Although qcc doesn't need either of these, both are useful for shelling
+    out to another compiler (such as gcc).
+
+    The wrapper can split "compiling and linking" lines into two commands,
+    either saving intermediate results in the /tmp directory or forking and
+    using pipes. (That way cc1 doesn't need to know anything about linking.)
+    Optionally, the compiler can initialize the same structures used by the
+    linker, but is the speed/complexity tradeoff here worth it?
+
+    Note that "-run" support is actually a property of the linker.
+
+  cpp - preprocessor
+
+    This performs macro substitution, like "qcc -E".
+
+  cc1 - compiler
+
+    This compiles C source code. Specifically, it converts one or more .c
+    files into to a single .o file, for a specific target.
+
+    Generating assembly output is best done by running the binary tcg output
+    through a disassembler. Keep it orthogonal.
+
+  ld - linker
+    This needs to be able to read .o, .a, and .so files, and produce ELF
+    executables and .so files. It should also support linker scripts.
+
+    This needs to "#include <elf.h>", which non-linux hosts won't always have
+    but which qemu has it's own copy of already.
+
+  ar - library archiver
+    This is a wimpy archiver. It creates .a files from .o files
+    (and extracts .o files from .a files). It's a flat archive, with no
+    subdirectories.
+
+    Busybox has partial support for this (still read-only, last I checked).
+
+    The ranlib command indexes these archives.
+
+    SUSv4 has a standards document for this command:
+
+      http://www.opengroup.org/onlinepubs/9699919799/utilities/ar.html
+
+  as - assembler
+    Tinycc has an x86 assembler. It should be genericized.
+
+  nm - name list
+
+    For some reason, gcc won't build without this.
+
+    SUSv4 has a standards document for this command:
+
+      http://www.opengroup.org/onlinepubs/9699919799/utilities/nm.html
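Note on the "multiplexer" section of todo/todo.txt: it asks for dispatch on the end of the invoked name, so that prefixed names such as "i686-cc" work and "gcc" behaves like "cc". One way this could be sketched in C is shown below. The table and the function name `tool_from_argv0` are hypothetical and are not taken from busybox, tinycc, or this changeset; the tool list is the "ar, as, nm, cc, and ld" set named in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tool table: the five commands todo.txt says qcc must provide. */
static const char *const tool_names[] = { "cc", "ld", "ar", "as", "nm" };

/*
 * Return the tool whose name the basename of argv0 ends with,
 * or NULL when no known tool name matches.
 */
const char *tool_from_argv0(const char *argv0)
{
    const char *base;
    size_t len, tlen, i;

    assert(argv0 != NULL);

    /* Drop any leading directory components. */
    base = strrchr(argv0, '/');
    base = base ? base + 1 : argv0;
    len = strlen(base);

    /* Compare only the tail of the name, so any prefix is accepted. */
    for (i = 0; i < sizeof(tool_names) / sizeof(tool_names[0]); i++) {
        tlen = strlen(tool_names[i]);
        if (len >= tlen && strcmp(base + len - tlen, tool_names[i]) == 0)
            return tool_names[i];
    }
    return NULL;
}
```

A main() would call this with argv[0] and branch to the compiler, linker, archiver, assembler, or name-list code. Pure suffix matching is deliberately loose: any name ending in "as" or "ar" would also match, so a stricter variant might require the character before the suffix to be '-' or the start of the name, at the cost of losing the "gcc works like cc" bonus the text mentions.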