Wednesday, April 04, 2007

Optimal Bytecode Interpretation

I always feel like I should blog more about engineering-related stuff, since that's what I do all day.

For now, my earlier plans for bytecode-to-C++ compilation are on hold while we look into more immediate gains from aggressively optimizing the interpreter. One downside of the hybrid C++ compilation/bytecode interpretation approach I have in mind is that it requires a workflow change: the generated C++ eventually has to be compiled and then linked into the executable. That's a pipeline/workflow bubble, and it's not appealing. Still, I think it would be fantastic to get that built for critical core script libraries (utilities) that don't change much.

Optimizing Interpretation

Our bytecode belongs to a dynamic language that is terribly slow, largely because of how the instruction set is structured. Unfortunately, we use an off-the-shelf compiler, so anything that was in the original source but doesn't show up in the bytecode output is lost to us, and we can't infer much of it back.

Did some memoization, in this case caching of variable and function lookups, with excellent (awesome!) results.
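A minimal sketch of what caching a variable lookup can look like, to make the idea concrete. It is not the actual interpreter code: `Globals`, `LoadGlobal`, and the slot layout are hypothetical names and assumptions made up for illustration. The name is resolved through a hash table the first time an instruction runs, and every later execution reuses the cached slot index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical global-variable table: names map to slots, and values live
// in a flat array indexed by slot. Slots are never removed in this sketch,
// so a cached slot index stays valid forever.
struct Globals {
    std::unordered_map<std::string, std::size_t> slot_of;  // slow path
    std::vector<double> values;
    std::size_t slow_lookups = 0;  // counts hash lookups, for illustration

    // Create the variable, or overwrite its value if it already exists.
    std::size_t define(const std::string& name, double v) {
        auto it = slot_of.find(name);
        if (it != slot_of.end()) {
            values[it->second] = v;
            return it->second;
        }
        slot_of.emplace(name, values.size());
        values.push_back(v);
        return values.size() - 1;
    }

    // Slow path: hash the name and find its slot (throws if undefined).
    std::size_t resolve(const std::string& name) {
        ++slow_lookups;
        return slot_of.at(name);
    }
};

// One "load global" instruction. The cache lives with the instruction, so
// the name is hashed only the first time the instruction executes.
struct LoadGlobal {
    static constexpr std::size_t kUnresolved = static_cast<std::size_t>(-1);

    std::string name;
    std::size_t cached_slot = kUnresolved;

    double execute(Globals& g) {
        if (cached_slot == kUnresolved) {
            cached_slot = g.resolve(name);  // memoize the lookup result
        }
        return g.values[cached_slot];       // fast path: plain array index
    }
};
```

The cache stores the slot rather than the value, so later writes to the variable are still visible. An interpreter that can delete or rebind names would also need some way to invalidate cached slots, which this sketch leaves out.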

A clever universal hashing scheme for both special strings and user strings opened up a lot of possibilities.
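The post doesn't describe the scheme itself, so the following is only one plausible shape for it, under loud assumptions: a single hash function (FNV-1a here, chosen purely for illustration) is applied both to built-in "special" strings at compile time and to user strings at run time, so matching a user string against the special set becomes an integer switch. The special names `length` and `prototype` and the `classify` function are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 32-bit FNV-1a, written recursively so it can be evaluated at compile time.
// Hashing stops at the first NUL byte.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u) {
    return *s == '\0'
        ? h
        : fnv1a(s + 1, (h ^ static_cast<std::uint8_t>(*s)) * 16777619u);
}

// Hypothetical special strings, hashed once at compile time.
constexpr std::uint32_t kHashLength    = fnv1a("length");
constexpr std::uint32_t kHashPrototype = fnv1a("prototype");

enum class Special { None, Length, Prototype };

// User strings go through the same hash at run time. If two special strings
// ever hashed to the same value, the duplicate case labels would fail to
// compile, which catches collisions inside the special set early.
Special classify(const std::string& user_string) {
    switch (fnv1a(user_string.c_str())) {
        case kHashLength:
            // Confirm with a real comparison: an unrelated user string
            // could still collide with a special hash.
            return user_string == "length" ? Special::Length : Special::None;
        case kHashPrototype:
            return user_string == "prototype" ? Special::Prototype
                                              : Special::None;
        default:
            return Special::None;
    }
}
```

Because both kinds of string share one hash space, a table keyed by hash can hold special and user names side by side without a separate code path for each.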

Close attention paid to string handling yielded good results.

Thread synchronization primitives were killing us. We usually try to keep our assembly language to a bare minimum for portability's sake. In practice, that means a handful of primitives to improve PS2 performance. But to rid ourselves of our most time-consuming locks, we applied asm to a couple of critical spots where we need to read in, modify, then write out whole cache lines as atomic operations on the PowerPC machines (PS3, 360). This avoids the expense of a more heavyweight lock such as a mutex in those places and gives us back a lot of performance for multithreaded operation.
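For readers who haven't seen the pattern, here is a portable sketch of an atomic read-modify-write that needs no mutex. It is an analog, not the hand-written asm described above: it uses `std::atomic` on a single 32-bit word rather than a whole cache line, and `BoundedCounter` is a hypothetical example type. On PowerPC, compilers typically lower a compare-and-swap loop like this to a load-reserve/store-conditional (`lwarx`/`stwcx.`) pair.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical shared counter that must never exceed a limit.
struct BoundedCounter {
    std::atomic<std::uint32_t> count{0};

    // Atomically increment the counter unless it has reached `limit`.
    // Returns true if this call performed the increment.
    bool try_acquire(std::uint32_t limit) {
        // Read in the current value.
        std::uint32_t seen = count.load(std::memory_order_relaxed);
        while (seen < limit) {
            // Modify and write out, but only if nobody else changed the
            // value since we read it. On failure, `seen` is refreshed with
            // the current value and the loop retries.
            if (count.compare_exchange_weak(seen, seen + 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
                return true;
            }
        }
        return false;  // the limit was reached
    }
};
```

A thread that loses the race retries immediately instead of blocking, which is where the savings over a mutex come from when contention is short-lived.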

Anyhow, we've significantly improved bytecode execution performance, which the game teams have been asking (begging!) us to do for ages.

Of course, it's never fast enough.. ;)

3 comments:

Anonymous said...

"The C++ will eventually need to be compiled and then linked to the executable. There's a pipeline/workflow bubble there that's not appealing"

Sure, it needs to be compiled, but couldn't you compile it into a DLL (all platforms have some form of DLL)? Then you *shouldn't* need to change the workflow, right?

Paul Senzee said...

Definitely. I'd like to in this case, but it will take considerable effort to do that in our current build and runtime environment. In the longer term, I'd certainly like to go that way.

Harley said...

Good reading this post