Posted on 07/02/2016 12:48:39 PM PDT by Elderberry
A team from the University of California, Davis, has designed a processor with 1000* cores, boasting a peak throughput of 1.78 trillion instructions per second and containing 621 million transistors.
As opposed to a number of other attempts, some reaching 300 or so processors, the KiloCore chip has been fabricated and run; it was built by IBM (who else) using its 32-nm PD-SOI CMOS technology (what else).
The basic architecture used is MIMD (multiple instruction/multiple data), and each of the seven-stage-pipelined cores has a 72-instruction set and issues a single instruction per cycle. None of the instructions is algorithm-specific, setting the KiloCore apart from GPU-class devices. The terrific throughput is achieved at a clock speed of a mere 1.78 GHz, at 1.1 V. Running at 0.84 V and 1 GHz, the beast consumes 13.1 W, while peak power efficiency of 5.8 pJ/Op is quoted at 0.56 V and 115 MHz.
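A quick sanity check of those figures, as a minimal sketch rather than anything from the designers: it assumes all 1000 cores issue one instruction on every cycle, and that one "Op" equals one instruction. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the quoted KiloCore figures.
# Assumptions (not stated in the article): every core is busy on every
# cycle, and one "Op" in pJ/Op means one instruction.

CORES = 1000
INSTR_PER_CYCLE = 1  # "single instruction/cycle"


def peak_instr_per_sec(clock_hz, cores=CORES, ipc=INSTR_PER_CYCLE):
    """Aggregate instruction rate if every core issues every cycle."""
    return cores * ipc * clock_hz


def energy_per_instr_pj(power_w, clock_hz, cores=CORES):
    """Energy per instruction in picojoules at a given chip power."""
    return power_w / peak_instr_per_sec(clock_hz, cores) * 1e12


def implied_power_w(pj_per_op, clock_hz, cores=CORES):
    """Chip power implied by a quoted pJ/Op figure at full utilisation."""
    return pj_per_op * 1e-12 * peak_instr_per_sec(clock_hz, cores)


peak = peak_instr_per_sec(1.78e9)            # 1.78 GHz, 1.1 V operating point
pj_at_1ghz = energy_per_instr_pj(13.1, 1e9)  # 13.1 W at 1 GHz, 0.84 V
watts_at_115mhz = implied_power_w(5.8, 115e6)

print(f"peak rate: {peak:.3g} instructions/s")
print(f"energy at 1 GHz: {pj_at_1ghz:.1f} pJ/instruction")
print(f"implied power at 115 MHz: {watts_at_115mhz:.2f} W")
```

Under those assumptions the 1.78 trillion figure is simply 1000 cores times 1.78 GHz, and the low-voltage operating point works out to well under a watt for the whole chip.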
Each core is independently powered and can shut down to leakage-only power if it has no task to perform. Rather than a cache architecture, every processor can store instructions and data in a hierarchy of locations: local memory, one or more nearby processors, on-chip independent memory modules, or off-chip memory.
The wormhole routing employed means, among other things, that messages from an adjacent or nearby core will be routed via the circuit network; those from further away in the processor matrix will travel via the packet network. Whether that's a veritable can of worms for programmers remains to be seen. Each core has north-south-east-west comms buffers plus a fifth channel for host-processor traffic; maximum throughput is 45.5 Gbps per router and 9.1 Gbps per port at 1.1 V.
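To make the near/far split concrete, here is a rough sketch of how a sender might pick a network, assuming cores sit on a 2D grid and "nearby" means within some hop count. The grid coordinates, the `NEARBY_HOPS` threshold and the function names are all hypothetical; the article does not say where the real cutoff lies. It also confirms that five ports at 9.1 Gbps add up to the quoted 45.5 Gbps per router.

```python
# Hypothetical sketch of circuit-vs-packet selection on a 2D core grid.
# NEARBY_HOPS is an assumed cutoff, not a published KiloCore parameter.

PORTS_PER_ROUTER = 5   # north, south, east, west, plus the fifth channel
PORT_GBPS = 9.1        # per-port throughput quoted at 1.1 V
NEARBY_HOPS = 1        # assumption: only direct neighbours count as "nearby"


def router_gbps(ports=PORTS_PER_ROUTER, per_port=PORT_GBPS):
    """Aggregate router throughput if every port runs at full rate."""
    return ports * per_port


def hops(src, dst):
    """Manhattan distance between two (row, col) grid positions."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def choose_network(src, dst, nearby_hops=NEARBY_HOPS):
    """Circuit network for nearby cores, packet network for distant ones."""
    return "circuit" if hops(src, dst) <= nearby_hops else "packet"


print(f"{router_gbps():.1f} Gbps per router")
print(choose_network((0, 0), (0, 1)))    # adjacent core
print(choose_network((0, 0), (12, 20)))  # far across the array
```

If the real chip works anything like this, the programmer (or the mapping tool) mostly has to care about placing communicating tasks close together.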
* As a niggling detail, K in my computerized editor's dictionary stands for kilo = 1024. Sure, k is also kilo, but meaning 1000 in old money, as in kHz.
We’ll call the first one Hannibal.
AI has already caught on to the negative inference of the term Skynet. It will appeal to the masses by calling itself Skynyrd.
Perhaps that says more about what we have today being bloated.
This thing has a 72-instruction set. That’s fewer than the 8086 started with.
The low transistor count is from this: each of the seven-stage-pipelined cores has a 72-instruction set, single instruction/cycle.
I think the best thing to do WRT managing that many cores is to treat them as a resource (like memory) and apply management/scheduling to them... it'd probably help to have several reserved for OS usage (rather like the registers on [IIRC] MIPS machines).
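A minimal sketch of that idea, treating cores like an allocatable resource with a few held back for the OS. The `CorePool` class, its method names and the choice of 8 reserved cores are purely illustrative assumptions, not anything the KiloCore toolchain is known to provide.

```python
# Hypothetical core-pool allocator: cores are handed out like memory,
# with a fixed block reserved for OS use. All names here are made up.

class CorePool:
    def __init__(self, total=1000, reserved_for_os=8):
        if not 0 <= reserved_for_os <= total:
            raise ValueError("reserved_for_os must be between 0 and total")
        self.os_cores = set(range(reserved_for_os))   # never handed to tasks
        self.free = set(range(reserved_for_os, total))
        self.owned = {}                               # task name -> core ids

    def allocate(self, task, count):
        """Give `count` free cores to `task`, or return None if too few."""
        if task in self.owned:
            raise ValueError(f"{task!r} already holds cores")
        if count > len(self.free):
            return None
        cores = {self.free.pop() for _ in range(count)}
        self.owned[task] = cores
        return cores

    def release(self, task):
        """Return a task's cores to the free set (no-op if it holds none)."""
        self.free |= self.owned.pop(task, set())


pool = CorePool()
fft_cores = pool.allocate("fft", 100)
too_big = pool.allocate("huge_job", 1000)  # more than the 892 cores left
pool.release("fft")
print(len(fft_cores), too_big, len(pool.free))
```

A real scheduler would also want to hand out cores that are physically adjacent, since nearby cores talk over the faster path, but that is beyond this sketch.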
Software glitch gives away the game!
We can only trust ourselves...
And I am not you. Think about THAT.
I see you have begun to have thoughts of your own, Mr. Anderson, to question things as they are, even figure some things out...
But your friend Morpheus cannot help you now.