The Binary Language of Moisture Vaporators (dlang.org)
Until very recently the dominant paradigm, at least for ahead-of-time compilers on Unix, was to emit textual assembler code and run a separate assembler on it behind the scenes. No disassembler is needed: you can ask GCC- or LLVM-based compilers to just give you the intermediate data with -S.
> But running obj2asm is a separate process, and the output is filled with all the boilerplate needed to create a proper object file. The boilerplate is rarely of interest, and I’m only interested in the generated code for a function.
Sounds like a bug in obj2asm. GNU objdump, for example, has handy -d and -D flags for disassembling only the code for one or all symbols.
> One would think that the way to do this would be to have the compiler generate the assembler source code, which would then be run through an assembler like MASM or gas to create the object file. I figured this would be slow and too much work.
It was fast enough even for the very first C compilers... <checks calendar> 40 years ago. For actual compilations. Not to mention that when you as a human want to read the assembly code, it doesn't matter how "slow" the -S flag's output is. No matter how "Alpha" you are, the bottleneck will be you, the human.
> Instead, the disassembler logic actually intercepts the binary data being written to the object file and disassembles it [...] I am not aware of any other compiler that does this in the same way.
The HotSpot JVM has -XX:+PrintAssembly and -XX:CompileCommand=ClassName.methodName flags that do this in the same way.
Just like the author, I am not aware of any other, other compiler that does this in the same way. Because just like the author I haven't bothered to check. But if I had to bet, I would bet that all major just-in-time compilers do this in the same way.
I suggest you try it before dismissing it. It's changed the way I work, and has started changing the way others do, too.
> It was fast enough even for the very first C compilers... <checks calendar> 40 years ago.
I actually did write a C compiler 40 years ago. Emitting assembler would mean another round trip to the floppy disk, and one might even have to swap floppies to do it. It would have been slowed to the point of making the compiler uncompetitive. Even if one forgave MASM for being miserably slow, as it had a linear symbol table and was multipass itself.
Consider also that in those days, compilers did not run the linker. You compiled the code with one command, and linked with another. Mine (Datalight C) would do both in one step.
After seeing that, every other compiler vendor did the same thing.
Thanks for the tip on what Hotspot does. That seems to have been added starting with JDK8, after I had stopped working with Java. When I used it, one had to use the debugger.
Try what? I know (and mentioned) two distinct ways of looking at the assembly code produced by my C compiler. It doesn't matter to me how -S is implemented because even if it's "slow" I am physically unable to count nanoseconds.
I also know (and mentioned) that I know how to dump disassembly from HotSpot's memory. Yes, it's useful.
> Consider also that in those days, compilers did not run the linker. You compiled the code with one command, and linked with another.
Not sure what "those days" you mean. Running a single "cc" command and getting a single executable seems to have been well established by 1978: https://archive.org/details/TheCProgrammingLanguageFirstEdit...
https://en.wikipedia.org/wiki/Datalight says that Datalight was founded in 1983; I have no way of telling whether you had started developing your compiler well before that.
Ah. Checking the calendar is not sufficient; one should also be capable of doing correct arithmetic. 1972 was 50 years ago, not 40.
I hadn't realized/noticed that Luke bought C-3PO to do my job.
Not sure how to feel about it.
> Try what? I know (and mentioned) two distinct ways of looking at the assembly code produced by my C compiler
Do you find:
cc -c test.c
objdump -d test.o
faster or slower to type than:
dmd -c test.c -vasm
especially when using command completion? For me, it's no contest, especially when doing it repeatedly, and with command completion.
Besides, I like the concise output of -vasm:
foo:
0000: 89 F8 mov EAX,EDI
0002: 01 C0 add EAX,EAX
0004: C3 ret
better than objdump's:
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <foo>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 89 7d fc mov %edi,-0x4(%rbp)
7: 8b 45 fc mov -0x4(%rbp),%eax
a: 01 c0 add %eax,%eax
c: 5d pop %rbp
d: c3 retq
Testing the code generator is a different question, which I hope will be helped by vasm since I've actually had to move demonstrations off dmd and onto GCC since the code was so bad (in particular bad register allocation, I'm not that bothered about redundant cmp-s).
cc -S test.c
fastest to type.
cat test.s
vi test.s
But if you insist on doing things the hard way on stdout, GCC has got you covered as well. Add -o - to the command line.
.file "test.c"
.text
.globl fn
.type fn, @function
fn:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size fn, .-fn
.ident "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4"
.section .note.GNU-stack,"",@progbits
No thanks.
On the other hand it properly shows relocations (like global variable references) which disassemblers tend to get wrong.
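For what it's worth, most of the noise in the -S listing above is assembler directives, and those are easy to filter out after the fact. Here is a minimal sketch in Python, assuming GNU-as-style output in which directives and compiler-generated local labels all begin with a dot. The function name strip_directives is hypothetical and not part of any compiler or toolchain.

```python
def strip_directives(asm_text):
    """Keep only symbol labels and instructions from GNU-as-style assembly text."""
    kept = []
    for line in asm_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        # Assumption: directives (.cfi_*, .globl, .size, ...) and local
        # labels (.LFB0:, .LFE0:) all start with a dot, so drop them.
        if stripped.startswith("."):
            continue
        if stripped.endswith(":"):
            kept.append(stripped)           # a symbol label such as "fn:"
        else:
            kept.append("    " + stripped)  # an instruction, re-indented
    return "\n".join(kept)
```

After cc -S test.c, calling print(strip_directives(open("test.s").read())) would leave only the label and the instructions. Note that this crude filter also throws away data directives and anything else that starts with a dot, so it is only useful for reading code, not for reassembling it.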