Multicore Machines and Games

I've abused Jeff Atwood's comments section enough, but I want to continue ranting about multicore machines.

One of the themes in his comments section is that games don't speed up on multicore machines. This is largely true. Jeff points out, correctly, that most games are video-processor bound, not CPU-bound, and that if you want higher framerates, you should invest in a newer video card, not a dual-core system. Very true.

Additionally, he doesn't mention that the large majority of games are not only not multithreaded, they are aggressively serial. Virtually all games spend the vast majority of their time in a loop that looks very similar to:

  while (running) {
      UpdateWorld();
      RenderWorld(); // CPU: CalculateVisible(), GPU: DrawVisible()
  }

That's why framerates vary from machine to machine and why framerates matter to gameplay: your framerate is also your input rate, your physics rate, your AI rate, etc.
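To make that concrete, here's a minimal, hypothetical sketch of what UpdateWorld() usually expands into (the subsystem names are mine, not from any particular engine): everything gets ticked exactly once per frame, so every rate in the game is the frame rate.

  // Hypothetical expansion of UpdateWorld(): every subsystem is ticked once per frame.
  void PollInput()       { /* sample pads and keyboard   */ }
  void UpdatePhysics()   { /* integrate the simulation   */ }
  void UpdateAI()        { /* run behaviors, pathfinding */ }
  void UpdateAnimation() { /* advance animations         */ }

  void UpdateWorld()
  {
      PollInput();        // so your framerate is your input rate...
      UpdatePhysics();    // ...and your physics rate...
      UpdateAI();         // ...and your AI rate...
      UpdateAnimation();  // ...and your animation rate.
  }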

That game loop made absolute sense up until the multicore era: threading on a single processor can only provide the illusion of increased performance (aside from interrupt-related situations), while introducing problems of synchronization.

Movement towards a concurrent model, where (conceptually) the game runs like:

  CPU1: UpdateWorld() -> CPU2: UpdatePhysics(); CPU3: UpdateAI();
  CPU4: CalculateVisible() -> GPU: DrawVisible()

has been slow, because, other than AI and things like economic models, pretty much everything in a game is tightly coupled to the rendering pipeline. So to do a concurrent game, you have to dance with all of the famously hard issues of multithreading (both logical problems like races and deadlocks, and hardware problems like resource contention).
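As a rough illustration only (no shipping engine is this naive, and std::thread is C++11, which postdates this post), here is the fork-and-join version of that picture. Everything hard is hidden in the fact that UpdatePhysics, UpdateAI, and CalculateVisible all want to touch the same world data:

  #include <thread>

  void UpdatePhysics()    { /* integrate rigid bodies        */ }
  void UpdateAI()         { /* run behaviors, pathfinding    */ }
  void CalculateVisible() { /* CPU: cull to the visible set  */ }
  void DrawVisible()      { /* GPU: submit the visible set   */ }

  void RunFrame()
  {
      // Fork: physics and AI each get their own core...
      std::thread physics(UpdatePhysics);
      std::thread ai(UpdateAI);

      // ...while this core works out what is visible.
      CalculateVisible();

      // Join: everyone must agree on the world state before the GPU sees it.
      // This is exactly where the races, deadlocks, and contention live.
      physics.join();
      ai.join();

      DrawVisible();
  }

The join at the end is also why the speedup is capped by the slowest of the three tasks: if physics dominates the frame, two extra cores buy you very little.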

Even on the Xbox 360, which has multiple cores, I believe that none of the launch games utilized more than one core: I've heard that Geometry Wars (an arcade game) was the first multicore game for the Xbox 360. I believe that the first wave of concurrent games will be coming this holiday season, with Call of Duty 3 (which I know is multicore) and maybe Gears of War (?). The interesting thing is that commercial game design is, sadly, a very conservative area: it's basically "make a sequel to the previous hit," so the first concurrent games will apply all the new horsepower to evolving things like AI and physics engines that today are built around minuscule CPU budgets (say, 10%). So I think it will take several generations before those engines "grow up" to effectively consume all of the available horsepower, even on (just) multicore consoles, much less take advantage of the manycore era as it hits.

There's an irony, though: in a very real sense, gaming is already well into the concurrent era. First, performant games expect a system with two chips: a CPU and a GPU. GPUs are every bit (heh) as complex as CPUs and today have 10^8 transistors. Second, and even more interesting, GPUs today are already beyond the "multi-" level of concurrency (2-4 concurrent operations) and well into the "many-" level. For instance, my graphics card has 16 "pipelines" for calculating pixels. For several hundred dollars, I could add a second card, or buy a higher-end board with 24 more-capable pipelines. Meanwhile, my laptops have far fewer pipelines.
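To see why that kind of parallelism is so much more tractable than parallelizing a game loop, here's a hedged, CPU-side analogy to what those pipelines are doing (ShadePixel and the 640x480 framebuffer are made up for the example): each pixel is computed independently of every other pixel, so splitting the screen across 16, 24, or 1 "pipelines" needs no synchronization beyond waiting for all of them to finish.

  #include <cstdint>
  #include <thread>
  #include <vector>

  const int kWidth  = 640;
  const int kHeight = 480;

  // Stand-in for whatever per-pixel math a real pipeline runs.
  std::uint32_t ShadePixel(int x, int y)
  {
      return static_cast<std::uint32_t>(((x * 255 / kWidth) << 16) |
                                        ((y * 255 / kHeight) << 8));
  }

  // framebuffer must hold kWidth * kHeight pixels.
  void RenderFrame(std::vector<std::uint32_t>& framebuffer, int pipelines)
  {
      std::vector<std::thread> workers;
      // Each "pipeline" shades its own set of scanlines; no pixel depends on any
      // other pixel, so the only synchronization is the final join.
      for (int p = 0; p < pipelines; ++p) {
          workers.emplace_back([&framebuffer, p, pipelines] {
              for (int y = p; y < kHeight; y += pipelines)
                  for (int x = 0; x < kWidth; ++x)
                      framebuffer[y * kWidth + x] = ShadePixel(x, y);
          });
      }
      for (std::thread& w : workers)
          w.join();
  }

Run it with pipelines = 16 or pipelines = 1 and the result is identical; only the wall-clock time changes, which is exactly the property the game loop above doesn't have.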

GPUs today anticipate the chaos coming to desktops tomorrow: the market has everything from 1 low-capacity shader pipeline to (say) 24 high-capacity pipelines. This is exactly the situation desktops will be in after 4-5 generations of Moore's Law. And if you look at how radically different a modern GPU is from a VGA graphics card, that's how radically different the whole dang box is going to be.

A final point: the only way the game industry can deal with this complexity is to rely on a software abstraction layer (DirectX or OpenGL); many rely on multiple abstraction layers (licensed game engines). The implication is that there will be a similar reliance on abstraction in the manycore CPU era. And, fascinatingly, many in the game industry are switching to specialized shader languages. These are typically C-like at the gross level (curly brackets and semicolons) but up close have a bit of a scripting-language feel: implicit typing, restricted access semantics, etc. This is yet another piece of evidence that the tokens a language designer chooses are not a trivial factor in the ultimate acceptance of the language.