Thursday, March 3, 2011


It's time to update my blog and LLVM is the only thing I can think of writing about.

The main project of LLVM is a compiler backend. It reads code in a special assembly language (LLVM assembly) and can turn it into bitcode (often loosely called bytecode) or native code.
There's also a library that helps you generate that assembly code, so you don't have to write out the instructions yourself.

The assembly is rather high-level, which lets LLVM perform optimizations on it. If you let LLVM JIT-compile the code, it can also do run-time optimizations.

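LLVM is cross-platform and written in mostly standard C++, making it quite portable. In principle this also means that applications shipped as LLVM bytecode can run on multiple platforms without recompiling, like Java and C#, although code generated from C or C++ usually bakes in target-specific details and so isn't automatically portable.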

Apparently many of the developers work at Apple, or are at least sponsored by them.

Personally I don't like Apple, with the restrictions they place on their hardware and APIs, but LLVM is a darn fine piece of work. Then again, it started as a college project, not related to Apple.

Anyway, this also means that the early builds are only available for OS X, but clang in C++03 mode (gcc's C++0x mode is far more advanced than clang's) and LLVM Core work fine on a wide variety of Unix-like systems.

I've begun working a bit with LLVM, and this is what I have so far.
My work (the input first, then the tokens it gets split into):

Int a = 42
Int aA/*comment*/==4/*heh*/2 //another comment
Int Nothin
Keyword: Int
Identifier: a
Operator: =
Literal: 42

Keyword: Int
Identifier: aA
Operator: ==
Literal: 4
Literal: 2

Keyword: Int
Identifier: Nothin
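
The code that produced the token list above isn't shown here, so the following is only a minimal sketch of how such a tokenizer could work, written in Python for brevity. The names `TOKEN_RE`, `KEYWORDS` and `tokenize` are made up for this illustration, and the keyword set is an assumption based on the two samples in this post. Comments and whitespace are thrown away but still separate tokens, which is why `4/*heh*/2` comes out as two literals.

```python
import re

# Assumed keyword set, taken from the sample inputs in this post.
KEYWORDS = {"Int", "Float"}

TOKEN_RE = re.compile(r"""
    (?P<comment>/\*.*?\*/|//[^\n]*)   # block or line comment, discarded
  | (?P<space>\s+)                    # whitespace, discarded
  | (?P<literal>\d+(?:\.\d+)?)        # integer or decimal literal
  | (?P<word>[A-Za-z_]\w*)            # keyword or identifier
  | (?P<operator>==|=)                # longer operator is tried first
""", re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Return a list of (kind, text) pairs for one piece of source text."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                "unexpected character %r at offset %d" % (source[pos], pos))
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue  # separators only, never emitted as tokens
        if kind == "word":
            kind = "Keyword" if text in KEYWORDS else "Identifier"
        else:
            kind = kind.capitalize()  # "literal" -> "Literal", etc.
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for kind, text in tokenize("Int aA/*comment*/==4/*heh*/2 //another comment"):
        print("%s: %s" % (kind, text))
```

Listing `==` before `=` in the operator pattern matters: regex alternation picks the first alternative that matches, so the other order would split `==` into two `=` tokens.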

And the part where LLVM jumps in:
Int a = 4
Float f = 5.678
Int b = a
Compilation went fine.
; ModuleID = 'stdin'

define i32 @main() {
%b = alloca i32
%f = alloca float
%a = alloca i32
store i32 4, i32* %a
store float 0x4016B645A0000000, float* %f
store i32* %a, i32* %b
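
The odd-looking constant in the store to %f is just 5.678 in disguise. When a float value can't be written exactly as a short decimal, LLVM assembly prints it as the hexadecimal bit pattern of a 64-bit double that holds the value rounded to single precision. Here is a small Python sketch that reproduces the constant; `llvm_float_literal` is a made-up helper for this illustration, not part of LLVM.

```python
import struct


def llvm_float_literal(value):
    """Hex form of a 'float' constant as it appears in LLVM assembly."""
    # Round the value to IEEE single precision, as a float variable would store it.
    single = struct.unpack("<f", struct.pack("<f", value))[0]
    # Reinterpret that rounded value as the bits of a 64-bit double.
    bits = struct.unpack("<Q", struct.pack("<d", single))[0]
    return "0x%016X" % bits


print(llvm_float_literal(5.678))  # prints 0x4016B645A0000000
```

The trailing zeros come from the rounding step: a float keeps only 23 mantissa bits, so the low 29 bits of the double's mantissa are always zero.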

The language seen in the input is one I intend to invent. It's going to draw its main influences from C++ and I'll see where it goes from there.
I'll just improve my current code a bit and then move on to writing documentation for the as-yet nameless language.
