...to predict the best path to take when they reach branches in the instruction sequence, and they fetch data the main thread will likely need from main memory so that it is already stored in relatively fast cache memory.
"The scout is the guy who does all the dirty work — all the snow-plowing in front of the main thread," Tremblay said.
Sun was happy enough with scout-thread performance that it chose to pair one scout thread with each regular thread in Rock, Tremblay said. The two threads tend to run at different times, because the regular thread launches a scout thread only when it stalls waiting for data from memory. As a result, Rock avoids some of the heating problems caused by multiple threads running simultaneously, he said.
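For readers who think in code, the scout-thread idea can be sketched as a toy simulation. This is only an illustration, not Sun's design: the cycle counts, the lookahead distance and the function name run are all assumptions, and the model pretends that everything the scout requests arrives before the main thread resumes.

```python
# Toy model of a scout thread. All numbers and names are illustrative
# assumptions, not figures from Sun.
MISS_PENALTY = 100   # assumed cycles the main thread stalls on a cache miss
HIT_COST = 1         # assumed cycles for a cache hit
SCOUT_LOOKAHEAD = 4  # assumed number of future addresses the scout fetches


def run(addresses, use_scout):
    """Return the total cycles needed to touch each address in order."""
    cache = set()
    cycles = 0
    for i, addr in enumerate(addresses):
        if addr in cache:
            cycles += HIT_COST
            continue
        # Cache miss: the main thread stalls while data comes from memory.
        cycles += MISS_PENALTY
        cache.add(addr)
        if use_scout:
            # During the stall, the scout runs ahead and pulls in the data
            # the main thread will ask for next (simplified: it always
            # finishes before the main thread resumes).
            for future in addresses[i + 1 : i + 1 + SCOUT_LOOKAHEAD]:
                cache.add(future)
    return cycles


workload = list(range(20))  # 20 distinct addresses, so nothing is reused
print(run(workload, use_scout=False))  # 2000
print(run(workload, use_scout=True))   # 416
```

In this simplified model the scout turns 16 of the 20 misses into hits. Real hardware would see smaller gains, since prefetches take time and the scout's guesses about future addresses can be wrong.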
One consequence of the fast-thread priority is that the chip's clock speed matters more than in Niagara, which runs at a comparatively slow 1.2GHz, Tremblay added. The x86 chips from Intel and AMD have stayed in the 3GHz neighborhood as the companies moved to multicore designs.
Out of order
To speed execution, most modern chips don't methodically execute instruction sequences in a plodding, linear fashion. Instead, they employ various techniques such as out-of-order execution and speculative execution to get a jump on instructions a few steps ahead of the regular sequence.
Niagara employs none of these techniques, each of which requires more circuitry and therefore increases the chip size and power consumption. But Rock takes the opposite approach — and then some.
Rock goes a step beyond with something called out-of-order retirement, Tremblay said. When an instruction is retired, it means the chip has completed that step of processing and has committed its results to internal memory slots called registers.
With speculative execution, the chip makes its best guess about whether or not to take particular branches — conditional decision points that depend on the results of existing calculations. Current chips are able to speculate about the best choices to take, storing results in temporary locations called intermediate registers, Brookwood said. But they don't commit those results to the real registers until the chip is sure the choices were correct.
With out-of-order retirement, the chip commits its speculative results to memory and moves on without having to wait for validation. "What Rock will let you do is actually finish the instruction and maybe finish more instructions beyond it," Brookwood said.
If the choices proved to be the wrong ones, the chip can quickly back up to the earlier state, and software moves backward along with it so that incorrect results aren't produced, Brookwood said. "It's an undo button... for the stuff that's been committed," he said.
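The "undo button" Brookwood describes resembles a checkpoint-and-rollback scheme. The sketch below is a software analogy under stated assumptions, not a description of Rock's circuitry: the class ToyCore and its methods speculate, write and resolve are hypothetical names invented for this example.

```python
class ToyCore:
    """Hypothetical model of committing speculative results with an undo."""

    def __init__(self):
        self.registers = {}
        self._checkpoint = None

    def speculate(self):
        # Save a snapshot of the registers before guessing at a branch.
        self._checkpoint = dict(self.registers)

    def write(self, name, value):
        # Results go straight into the registers, without waiting to
        # learn whether the branch guess was correct.
        self.registers[name] = value

    def resolve(self, guess_was_correct):
        if self._checkpoint is None:
            raise RuntimeError("resolve() called without speculate()")
        if not guess_was_correct:
            # Wrong guess: restore the earlier state (the "undo button").
            self.registers = self._checkpoint
        self._checkpoint = None


core = ToyCore()
core.write("r1", 10)
core.speculate()          # guess which way a branch will go
core.write("r1", 99)      # committed immediately, ahead of validation
core.write("r2", 5)
core.resolve(guess_was_correct=False)
print(core.registers)     # {'r1': 10}
```

When the guess turns out to be right, resolve simply discards the snapshot and the committed values stand, which is where the time savings would come from.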
Software doesn't need to be rewritten to support out-of-order retirement, Tremblay said. Preserving compatibility is one of Sun's top priorities for its chips.
Sun had an awkward phase with its processor plans, ripping up its road map, cancelling the UltraSparc V chip and relying instead on a partnership with fellow Sparc chip designer Fujitsu. But the company now has a simpler, more attainable strategy, Quick said, and Sun is eager to boast about its progress.
"We are very excited right now about how Sparc is going," Fowler said.





