Wednesday, November 09, 2005

GRACE C to VHDL flow

For the MONARCH project, Georgia Tech is tasked with developing the compiler and some productivity utilities. Yesterday, I covered the compiler and the GUI. Today, I'll try to tackle the "rapid prototyping front-end."



For the prototyping environment, I've developed an internal representation, the "Human-Readable Internal Representation" (HumIR), with very Python-like syntax. Because it needs to target the hardware directly, I chose to make it statically typed (for now), but I try not to force the user to declare things unnecessarily (so if X and Y are signed 32-bit integers, then Z = X+Y will create Z as a signed 32-bit integer). An example function which adds two streams of numbers is below:

() = function add2streams():
    strm_id0 = 0
    strm_id1 = 1
    strm_id2 = 2
    while 1:
        x = mon_streamPopF(strm_id0)
        y = mon_streamPopF(strm_id1)
        mon_streamPush(strm_id2, x + y, 0)
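
To make the "don't declare things unnecessarily" rule a bit more concrete, here is a minimal sketch in plain Python of how a result type could be inferred for statements like Z = X+Y. This is not HumIR's actual implementation: the names IntType and infer_types are made up for illustration, and rejecting mixed operand types is my assumption for the sketch (real promotion rules could differ).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    """Hypothetical description of a fixed-width integer type."""
    signed: bool
    bits: int


def infer_types(statements, declared):
    """Infer types for statements of the form (target, lhs, rhs), i.e. target = lhs + rhs.

    `declared` maps already-typed variable names to IntType.
    Assumption for this sketch: both operands must have the same type,
    and the target simply takes that type.
    """
    env = dict(declared)
    for target, lhs, rhs in statements:
        left_type, right_type = env[lhs], env[rhs]
        if left_type != right_type:
            raise TypeError(f"operand types differ: {left_type} vs {right_type}")
        if target in env and env[target] != left_type:
            raise TypeError(f"{target} already has type {env[target]}")
        env[target] = left_type  # target is created with the operands' type
    return env


s32 = IntType(signed=True, bits=32)
env = infer_types([("Z", "X", "Y")], {"X": s32, "Y": s32})
print(env["Z"])
```

With X and Y both declared as signed 32-bit integers, Z comes out as a signed 32-bit integer without ever being declared.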

Having developed the IR, I went on to create a flow that "lowers" the IR from the input syntax to the same format the Trimaran front-end uses to hand off to the back-end (the so-called "MONARCH Dataflow Graph", or MDFG for short). This process involves expression lowering (so "z=(a+b) * c" becomes "t=a+b; z=t*c", for instance), if-conversion (removing all if/else blocks in the code by adding a "predicate" which can mask the execution of a statement), and static single assignment (SSA) formation (which takes a predicated function, inserts select operations, and makes all dataflow explicit). Finally, the resulting program is converted to a graph form and sent to the back-end.
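
As an illustration of the expression-lowering step only, here is a small sketch that uses Python's standard ast module to break a nested arithmetic assignment into one-operation statements. It is not the code from the actual flow: the function name lower_assignment and the t0, t1, ... temporary naming are assumptions for the example, and it handles only names, constants, and the four basic binary operators.

```python
import ast

# Binary operators this sketch knows how to print back out.
_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def lower_assignment(source):
    """Lower a single 'target = expr' statement into one-operator statements."""
    assign = ast.parse(source).body[0]
    if (not isinstance(assign, ast.Assign) or len(assign.targets) != 1
            or not isinstance(assign.targets[0], ast.Name)):
        raise ValueError("expected a simple assignment like 'z = (a + b) * c'")

    out = []        # lowered statements, in execution order
    counter = 0     # number of temporaries created so far

    def new_temp():
        nonlocal counter
        name = f"t{counter}"
        counter += 1
        return name

    def lower(node, dest=None):
        """Return the name (or literal) holding the value of `node`."""
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = lower(node.left)
            right = lower(node.right)
            name = dest if dest is not None else new_temp()
            out.append(f"{name} = {left} {_OPS[type(node.op)]} {right}")
            return name
        raise ValueError("unsupported expression in this sketch")

    target = assign.targets[0].id
    if isinstance(assign.value, ast.BinOp):
        lower(assign.value, dest=target)   # outermost op writes the real target
    else:
        out.append(f"{target} = {lower(assign.value)}")
    return out


print(lower_assignment("z = (a + b) * c"))
```

For the example from the paragraph above, this yields the two statements "t0 = a + b" and "z = t0 * c", so every statement ends up with at most one operator, which is the shape the later passes (if-conversion, SSA formation) want to work on.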

The nice thing about this IR and flow (other than the fact that indentation as block structure really rocks) is that it was designed with hardware synthesis in mind (a dataflow machine is remarkably like custom-designed hardware). So I extended the flow with an instruction selection module, a modulo scheduler, and an automated pipelining module to generate nicely pipelined VHDL.

One other thing that's kind of nice about the flow is that HumIR is a high-level language, and as such, it's pretty easy to convert from (a currently very restricted subset of) C to HumIR. Given that I happen to have access to a C parser (EDG's parser is included in Trimaran), I went ahead and wrote a module (very alpha-level) that converts from so-called pcode (which has Lisp-like syntax, nice and easy to parse) to HumIR. So now I have a two-headed (C and HumIR), two-tailed (MONARCH and VHDL) beast. With any luck, I'll have a tech report out soon which I can distribute freely, and maybe even some conference or journal papers.

Tuesday, November 08, 2005

Python in Compilers

OK, so this is my first post on my second blog. This one will be devoted to all the stuff I'm doing on Python. By way of introduction, I'll bring you up to speed on all the projects I'm working on right now. (Sorry, none are FOSS yet.) Today I'll focus on my full-time job.

My "real job" is as a research engineer at Georgia Tech, working on the MONARCH project (part of the PCA project and the Morphware Forum). Our part of the project (at Georgia Tech) is to build a compiler ("Trimaran for XMONARCH", or trix for short) which targets the MONARCH embedded computer platform, containing a number of embedded RISC or RISC+SIMD (so-called "WideWord") CPUs and a large dataflow fabric. (MONARCH was designed jointly by Raytheon and the Information Sciences Institute at the University of Southern California.) Right now, we have a compiler that targets the dataflow portion of the chip, though we will be extending this to the more mundane RISC and SIMD parts. I manage four grad students and do a lot of the coding myself. The dynamics of the project (defense applications and such) require that non-US citizens have limited access to the details of the chip, so I'm mainly focused on building the backend of the compiler.



Since I came on the project to build (from scratch) the backend of the compiler, and there was no need for binary or other compatibility with the Trimaran codebase (which we were using for the rest of the compiler), I decided to implement it in Python, which I had been dabbling with for several months. Having built a couple of code generators in the past in C++, I was rather pleased at the fact that Python increased my productivity 3-5x (by reducing both development time and lines of code produced). It also allows me to keep well ahead of the front-end development team (so I can manage them with all that extra time I saved! ;-) ).

Also as a part of the MONARCH project, we are developing a couple of productivity utilities. First, we have an Eclipse plugin which allows us to visualize the simulation of a program in the Raytheon-provided architecture simulator or any other dataflow-type simulator backend. The plugin communicates over sockets, so we have kind of a plugin-within-a-plugin, where our GUI plugs into Eclipse and the simulator plugs into our GUI. Most of the work here is Java (since we're using Eclipse), but the plugin to Raytheon's simulator was written in Python, since its compatibility requirements were minor (command-line on the simulator side, XML over sockets on the GUI side). Again, a quantum leap in productivity over statically, manifestly typed languages.

The second utility has really been a jumping-off point for my research. It's a "rapid prototyping language" for MONARCH dataflow programming, which I've extended to a full C to pipelined VHDL compiler. I'll give you more on this tomorrow.