Intel Itanium

The next platform from Intel. Features a VLIW (VeryLongInstructionWord) instruction set design. Can process up to 8 instructions at once (if I remember correctly). The programmer (in practice, the compiler), rather than the chip, decides which instructions run in parallel.

For someone who gets excited about assembly optimization, this chip will be super awesome. For everyone else, the questions are cost, how it performs now, and finally how well first-generation Itanium programs will execute on later generations of Itaniums (apparently that is the major historical flaw in VLIW designs).

See: for Itanium Software Developer information.

And for an OpenSource IA64 Assembler.

SourceForge has some Itanium boxes on the compile farm for OpenSource programmers to use to port Linux software to Itanium.

On a related note, you may know that while SGI is not going to abandon their IRIX customers anytime soon, they have announced that Linux is going to be their future operating system and that IRIX will be slowly phased out over the coming years. They have also announced that MIPS will likewise be phased out in favor of Intel over the coming years.

So, in taking steps in that direction, SGI has booted Linux/MIPS (a longtime port of Linux to SGIs, which until recently was limited to Indys, Indigo2s, and a few Challenges) on a 32-processor ccNUMA Origin 2000. They've stated that while Linux/MIPS will never be supported, they plan to work on the performance problems now with Linux/MIPS so that they will be able to ship Linux/Itanium in massive ccNUMA configurations as soon as the Itaniums are ready. I hear that they already have a few in-house versions of the Origin 3k line running on Itaniums instead of MIPS chips.

Some facts about IntelItanium:

Note that Linux itself apparently already runs on the Itanium. (I haven't tested it myself, but I will if any of you friendly folks sends me an Itanium box to do the tests on.) Porting software should be easy as long as it can handle the fact that pointers are now 64-bit. That's the theory, at least.
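To make the 64-bit pointer remark concrete, here is a minimal sketch in C of the habit that tends to break during such a port: keeping an address in a 32-bit integer. The function names are made up for illustration, and the sample addresses in the usage note are arbitrary numbers, not real Itanium addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the old 32-bit habit of keeping an address in a
   32-bit integer. Anything above the low 32 bits is silently dropped. */
uint32_t address_as_u32(uint64_t address)
{
    return (uint32_t)address;
}

/* Returns 1 if the address survives a round trip through 32 bits. */
int fits_in_32_bits(uint64_t address)
{
    return (uint64_t)address_as_u32(address) == address;
}

/* Portable alternative: uintptr_t is wide enough to hold a data pointer,
   whatever the pointer size of the target happens to be. */
const void *round_trip_through_uintptr(const void *p)
{
    uintptr_t bits = (uintptr_t)p;
    const void *back = (const void *)bits;
    assert(back == p);   /* round trip through uintptr_t preserves the pointer */
    return back;
}
```

With these definitions, fits_in_32_bits(0x7fff0000) is 1, while fits_in_32_bits(0x100000010) is 0 because only the low 32 bits (0x10) survive. Code that casts pointers to int or unsigned int has the same problem once pointers are 64 bits wide.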

By the way, larger pages might cause trouble for GarbageCollection techniques that implement WriteBarriers using the MMU. I am also wondering whether ByteCode VirtualMachines can take full advantage of the VLIW design; there's a NEXT jump after every bytecode's implementation, and simple bytecodes might not fill the full instruction word.
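For readers who haven't seen the NEXT pattern, the following is a rough sketch in C of a tiny stack-based bytecode interpreter. The opcodes and the run function are invented for illustration and do not correspond to any particular VirtualMachine. The point is how little work a simple bytecode such as OP_ADD does between two dispatch jumps, which is what might leave instruction slots unfilled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 64

/* Runs the program and returns the value on top of the stack at OP_HALT. */
long run(const unsigned char *code)
{
    long stack[STACK_MAX];
    size_t sp = 0;   /* number of items currently on the stack */
    size_t pc = 0;   /* index of the next bytecode to fetch */

    for (;;) {
        /* NEXT: fetch one bytecode, then jump to its implementation. */
        switch (code[pc++]) {
        case OP_PUSH:                    /* operand is the following byte */
            assert(sp < STACK_MAX);
            stack[sp++] = code[pc++];
            break;                       /* back to NEXT */
        case OP_ADD:                     /* one add, then back to NEXT */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            sp--;
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}
```

Running the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT computes (2 + 3) * 4. Each bytecode body is only a handful of operations followed by a jump whose target depends on the next bytecode, so there is not much independent work to schedule side by side.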

I don't fully understand the remark "Top 96 registers in rotating stack". Apparently we now have 96 registers (hurray! take that, SPARC)... but whaddabout the "rotating stack"? Is this some kind of hardware stack a la what dedicated ForthLanguage machines have? (But why rotating? Stacks don't rotate, do they?)

-- StephanHouben

This probably means that the top-of-stack pointer rotates through the 96 registers. When you push an item, the item 96 deep in the stack (if any) goes into the BitBucket. This is how ChuckMoore's Forth chips work.

Close but not quite. It automatically dumps the overflow into memory. So it amounts to yet another stack caching scheme, somewhat similar to ChuckMoore's, but also somewhat similar to the SPARC's and to others (and also somewhat different in some details)... there have been many stack caching schemes through history.
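As a rough illustration of the general idea (not a description of the actual Itanium hardware), here is a sketch in C of a stack cache: a top-of-stack index rotates through a fixed set of "registers", and when they are full the oldest item is written to memory instead of being thrown away. The struct, the function names, and the SPILL_MAX size are all made up for this example; only the figure of 96 comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS  96     /* from the text: 96 stacked registers */
#define SPILL_MAX 1024   /* hypothetical size of the backing memory */

struct stack_cache {
    long   regs[NUM_REGS];    /* circular buffer standing in for registers */
    size_t top;               /* slot the next push will write */
    size_t in_regs;           /* items currently held in registers */
    long   spill[SPILL_MAX];  /* memory that receives the overflow */
    size_t spilled;           /* items currently held in memory */
};

void sc_init(struct stack_cache *s)
{
    s->top = 0;
    s->in_regs = 0;
    s->spilled = 0;
}

void sc_push(struct stack_cache *s, long value)
{
    if (s->in_regs == NUM_REGS) {
        /* Registers are full, so the slot about to be overwritten holds
           the oldest item. Save it to memory rather than dropping it. */
        assert(s->spilled < SPILL_MAX);
        s->spill[s->spilled++] = s->regs[s->top];
        s->in_regs--;
    }
    s->regs[s->top] = value;
    s->top = (s->top + 1) % NUM_REGS;   /* the "rotation" */
    s->in_regs++;
}

long sc_pop(struct stack_cache *s)
{
    if (s->in_regs == 0) {
        /* Registers are empty: fetch the most recently spilled item. */
        assert(s->spilled > 0);
        return s->spill[--s->spilled];
    }
    s->top = (s->top + NUM_REGS - 1) % NUM_REGS;
    s->in_regs--;
    return s->regs[s->top];
}
```

Pushing 100 items leaves the 4 oldest in memory and the newest 96 in registers, and popping all 100 still returns them in last-in, first-out order. The BitBucket variant described earlier would simply skip the store into spill.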

See for instance: -- DougMerritt

Last edited December 14, 2014