G4s don't suck
"As for SSE2 vs. AltiVec, SSE2 is a toy by comparison. Its architecture does not offer the range of generalized high-precision capability that the AltiVec instruction set does. It is filled with bandwidth limitations, particularly its tiny number of harder-to-use registers, which make it nearly impossible to keep the pipeline full, and it is capable of basically no parallelism whatsoever with the regular FP unit on the processor (which means it must start and stop each unit to switch back and forth, and the lack of generalization makes this an excruciating performance penalty). The small number of registers in particular makes the P3 a better scientific computing processor than the P4 for real-world applications, because the P4's pipe is too deep to keep it filled. This can be graphically demonstrated with fully optimized applications that force significant branching on real-world data."
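
To make the branching point a bit more concrete, here is a rough sketch in C. It is not from the quoted post, and the function names are made up for illustration. The first loop takes a branch whose outcome depends on the data, which is the sort of thing a deep pipeline pays for whenever it mispredicts. The second computes the same sum as a select with no data-dependent jump in the source, a form that a compiler may (but is not guaranteed to) turn into a conditional move or a vector compare-and-select.

```c
#include <stddef.h>

/* Hypothetical kernel: sum only the samples above a threshold.
   The branch outcome depends on the data, so it is hard to
   predict when the input is irregular. */
float sum_above_branchy(const float *x, size_t n, float threshold)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if (x[i] > threshold) {
            sum += x[i];
        }
    }
    return sum;
}

/* Same result written as a select: every element contributes
   either its own value or 0.0f, so the loop body does the same
   work on every iteration regardless of the data. */
float sum_above_select(const float *x, size_t n, float threshold)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float keep = (x[i] > threshold) ? x[i] : 0.0f;
        sum += keep;
    }
    return sum;
}
```

Both functions return the same value for the same input. Whether the second is actually faster depends on the compiler, the processor, and how unpredictable the data is, so treat this as something to time on your own hardware rather than as a result.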