February 22nd, 2007


In which Ben explains (again) why America is so unbelievably stupid.

So, GQ has drawn up articles of impeachment for Cheney. Very nicely done. But do you really think it's going to accomplish anything? Watching us "little people" attempt to influence politics in DC is a total laugh riot for me.

Do you understand the way this system works yet, my friends? The Congresscritters in DC do not care about us. They do not care about our opinions. They do not give a damn about our values. They are NOT our "representatives" under any but the most academic definitions of the word. They are not in this for truth, or justice. They do not care about "doing the right thing." They are in it purely to do whatever will get them the most votes next election. And their actions (which speak far louder than any carefully crafted, bland, deliberately inoffensive words written by their staff speech writers) show that they believe their best chance of getting elected... is to sit on their thumbs and do nothing.

Don't believe me? Remember that the great white liberal hope, Nancy Pelosi, elected from San Francisco of all places, said: "Impeachment is off the table." Let me translate: "I'm going to sit here and quite purposefully and knowingly let Dubya keep sending our troops to be blown to bloody chunks in Iraq, even though he cannot now, and never has been able to, give us any credible reason why they should be dying there." How much more pathetically complicit do your politicians need to be, people?? Even Molly Ivins (RIP) said: "If Democrats in Washington haven't got enough sense to OWN the issue of political reform, I give up on them entirely."


Getting massively parallel wrong: the NVidia "nvcc" GPU compiler.

This part is pretty cool:

* Vectorised intrinsics. If application code is, for example, computing sin(x[i]) for all i in a vector, the compiler can replace this with a single call to a highly optimised sin specialised for vectors.

* Cache blocking. Replace a single loop over a large vector with smaller loops that operate on cache-sized chunks of the vector.

* Loop nest optimisation. For a set of nested loops, this can change the order in which the inner and outer loops are executed, to improve the pattern of access to memory.
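To make the second and third bullets concrete, here is a sketch in plain C of what cache blocking looks like, using a matrix transpose as the example (the transpose's strided writes are exactly the access pattern these transformations fix). The 64-element tile size is my own assumption about a cache-friendly block, not anything the compiler documents.

```c
#include <assert.h>
#include <stddef.h>

#define N 512
#define BLOCK 64  /* assumed cache-friendly tile size, not a documented constant */

/* Naive transpose: the writes stride through memory column-by-column,
 * missing cache on nearly every store once N is large. */
static void transpose_naive(const double *in, double *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            out[j * n + i] = in[i * n + j];
}

/* Blocked transpose: the outer loops walk BLOCK x BLOCK tiles, so the
 * read and write working sets of the inner loops both fit in cache at
 * once. The inner loop reordering is the "loop nest" part. */
static void transpose_blocked(const double *in, double *out, size_t n) {
    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t jj = 0; jj < n; jj += BLOCK)
            for (size_t i = ii; i < ii + BLOCK && i < n; i++)
                for (size_t j = jj; j < jj + BLOCK && j < n; j++)
                    out[j * n + i] = in[i * n + j];
}
```

The two functions compute the same result; only the traversal order differs. That is the whole point of having the compiler do it: the transformation is mechanical, behaviour-preserving, and easy to get subtly wrong by hand.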

But this stuff is just awful.


We have plenty of previous examples of hardware that failed to live up to its early marketing promise, from the i860 to the PS3. CUDA looks set to follow in their footsteps: I expect that it will take vast amounts of work for programmers to get halfway decent performance out of a CUDA application, and that few will achieve more than 10% of theoretical peak performance.

People with the expertise, persistence, and bloody-mindedness to keep slogging away will undoubtedly see phenomenal speedups for some application kernels. I’m sure that the DOE and NSA, in particular, are drooling over this stuff, as are the quants on Wall Street. But those groups have an almost unique tolerance for pain. This technology is a long way from anything like true accessibility, even to those already versed in parallel programming using environments like MPI or OpenMP.


I understand that cache bounce (cores repeatedly invalidating each other's copies of a shared cache line) can cause a 1000% slowdown on multi-core systems, which is exactly why cache optimisation needs to be automated - so it can be done correctly everywhere. Leaving it up to the programmer is a bad decision 99%+ of the time, for the exact same reasons that having a programmer type raw opcodes into memory is a bad decision 99%+ of the time.
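The classic instance of cache bounce is false sharing: two threads each hammering their own counter, but both counters sitting in the same cache line. The fix is a mechanical layout change that a tool could apply automatically. A minimal C11 sketch, assuming a 64-byte cache line (typical on x86, but an assumption here):

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache-line size; 64 bytes on most x86 parts */

/* Both counters typically share one cache line: every increment by one
 * core invalidates the line in the other core's cache - the "bounce". */
struct shared_counters {
    long a;  /* updated by thread 1 */
    long b;  /* updated by thread 2 */
};

/* Aligning each counter to its own cache line removes the bounce.
 * This is exactly the kind of mechanical rewrite that should be
 * automated rather than left to the programmer. */
struct padded_counters {
    alignas(CACHE_LINE) long a;
    alignas(CACHE_LINE) long b;
};
```

No logic changes, only layout: the padded version trades a little memory for not having two cores fight over one line.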

See previously.