"Then, there's the problem with cost. To implement dual head capability requires double the hardware to control it. Well, give or take. After that, once you've gotten two heads to double throughput, you need to double the bandwidth from the heads to the interface (ide, scsi, fibre) which requires even more expensive hardware and design"
"There's already 2 heads there. And the bandwidth problem doesn't make much sense... we're talking about ~80 MB/sec. That's nothing."
Dual heads working on one data stream, one head at a time, is quite a different problem from a single data stream split across two heads. It also doubles the problem of seeking data (gotta check two tables to find the right data) and of writing data (split the data, write both halves at once, check if written correctly, rinse, wash, repeat).
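Very roughly, the write path I mean looks like this. This is just a sketch; head_write and head_verify are invented names, the point being the doubled bookkeeping, not any real API:

[code]
#include <stddef.h>

/* invented stand-ins for the head-level hardware interface */
extern void head_write(int head, const char *data, size_t len);
extern int  head_verify(int head);   /* 1 = readback matched */

void dual_head_write(const char *data, size_t len)
{
    size_t half = len / 2;

    do {
        /* split the stream: one half per head, written at once */
        head_write(0, data, half);
        head_write(1, data + half, len - half);
        /* check if written correctly; if either half failed,
         * rinse, wash, repeat */
    } while (!head_verify(0) || !head_verify(1));
}
[/code]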
Imagine trying to split a data stream on the way to the platter, or taking two data streams off the platter and reassembling them correctly. Try doing that with a simple program and you'll see the performance gains are tiny or nonexistent. (In case you're wondering, SMP works on two different data streams.) Today's technology focuses on working with very large data chunks, as compared to a single bit. Example: it takes far less time to load a text file into memory using 512 KB chunks than to read it in byte by byte. I should know. That's how a classmate was able to beat out everyone else (except the prof) when given the programming assignment to read in a text file of the Bible and count the instances of a certain word as fast as possible.
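To make the chunking point concrete, here's a toy version of that assignment. The file name and search word are made up, and matches that straddle a chunk boundary are ignored for brevity:

[code]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK (512 * 1024)

/* count occurrences of word in buf[0..len) */
static long count_in(const char *buf, size_t len, const char *word)
{
    size_t wlen = strlen(word);
    long hits = 0;
    for (size_t i = 0; i + wlen <= len; i++)
        if (memcmp(buf + i, word, wlen) == 0)
            hits++;
    return hits;
}

int main(void)
{
    FILE *f = fopen("bible.txt", "rb");
    char *buf = malloc(CHUNK);
    long hits = 0;
    size_t got;

    if (!f || !buf)
        return 1;

    /* one fread per 512 KB instead of one fgetc per byte:
     * same data, a tiny fraction of the per-call overhead */
    while ((got = fread(buf, 1, CHUNK, f)) > 0)
        hits += count_in(buf, got, "begat");

    printf("%ld hits\n", hits);
    free(buf);
    fclose(f);
    return 0;
}
[/code]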
To manipulate a data stream (which enters the heads in serial fashion, I would think) at the bit level, at speeds similar to today's RAID arrays, is a very difficult task. A PCI RAID controller splits data into large chunks (64 KB, 128 KB, whatnot) and sends them to each drive in parallel. What you're asking may become more feasible with Serial ATA, but it would still require extremely fast or massively parallel hardware. Just look at the PS2: 230 or so MHz of processor speed, but something like 500 MB/s or more of memory bandwidth. And that's using yesterday's technology.
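Here's a sketch of what I mean by the RAID 0 approach: deal whole 64 KB stripes round-robin to each drive, never splitting below chunk level. The drive_write stub is an invented stand-in for the real hardware interface:

[code]
#include <stdio.h>
#include <stddef.h>

#define STRIPE  (64 * 1024)
#define NDRIVES 2

/* stand-in for the real hardware interface: just log the operation */
static void drive_write(int drive, size_t lba, const char *buf, size_t len)
{
    (void)buf;
    printf("drive %d: write %zu bytes at stripe %zu\n", drive, len, lba);
}

void raid0_write(const char *data, size_t len)
{
    for (size_t off = 0; off < len; off += STRIPE) {
        size_t n      = (len - off < STRIPE) ? len - off : STRIPE;
        size_t stripe = off / STRIPE;
        drive_write((int)(stripe % NDRIVES),  /* deal stripes round-robin */
                    stripe / NDRIVES,         /* position on that drive   */
                    data + off, n);
    }
}
[/code]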
To get, say, an 80 MB/s transfer rate with a single controller handling one bit per cycle, you would need one running at least 640 MHz (80 × 8 bits per byte), with memory bandwidth equal to processor speed, and everything fitting inside 1/4 or so of the drive itself, the rest of the space going to the platters. So imagine how hot a 640+ MHz processor runs. Add to that 640+ MHz of memory bandwidth (I don't think it's ordinary RAM, which rules out DDR and really jacks up the price at this speed), and consider that 640 MHz only covers reading in the data, not handling the head, location, and error checking, and you can see why a bit-level implementation like you're asking for would be so difficult. You could try doubling the processors at half the speed and half the memory bandwidth each, but then you run into cost considerations, as you basically have to take what you see on your current HDD board and double it. Silicon real estate is hella expensive and difficult to shrink to fit.
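The back-of-the-envelope math behind the 640 figure, if you want to check it:

[code]
#include <stdio.h>

/* 80 MB/s * 8 bits/byte = 640 Mbit/s, so a controller touching one
 * bit per clock needs at least 640 MHz before it does any of the
 * head, location, or error-checking work. */
int main(void)
{
    const int target_MB_per_s = 80;
    const int bits_per_byte   = 8;
    const int min_clock_MHz   = target_MB_per_s * bits_per_byte;

    printf("1 bit/cycle at %d MB/s needs >= %d MHz\n",
           target_MB_per_s, min_clock_MHz);
    return 0;
}
[/code]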
It would probably be a lot easier using larger data chunks, at which point it would screw over a two-head idea pretty quickly. You'd be limited to reading, say, 256 KB off the top and 256 KB off the bottom in 512 KB data chunks. Your 128 KB autoexec file now takes up four times its required memory. Your FAT32 clusters of 64 KB now require you to read eight clusters (four per head) on any given read, and to rewrite all eight anytime you write anything to the drive. So you would need to read in the existing clusters, figure out which ones you're changing and which ones you're not, merge the data, then spit it back onto the platter.
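That read-modify-write cycle looks roughly like this. chunk_read and chunk_write are invented stand-ins for a chunk-level drive interface:

[code]
#include <string.h>
#include <stddef.h>

#define CHUNK   (512 * 1024)
#define CLUSTER (64 * 1024)

/* invented stand-ins for a chunk-level drive interface */
extern void chunk_read(size_t chunk_no, char *buf);
extern void chunk_write(size_t chunk_no, const char *buf);

/* writing one 64 KB cluster costs a full 512 KB read-modify-write */
void write_cluster(size_t cluster_no, const char *data)
{
    static char buf[CHUNK];
    size_t per_chunk = CHUNK / CLUSTER;          /* 8 clusters per chunk */
    size_t chunk_no  = cluster_no / per_chunk;
    size_t offset    = (cluster_no % per_chunk) * CLUSTER;

    chunk_read(chunk_no, buf);             /* read in the existing clusters */
    memcpy(buf + offset, data, CLUSTER);   /* splice in the changed one     */
    chunk_write(chunk_no, buf);            /* spit the whole chunk back out */
}
[/code]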
Quite frankly, I wouldn't devote too much funding to developing this type of drive except for specialized cases (i.e., the non-mainstream consumer, who really doesn't give a damn about bandwidth in the first place), and even then I'm not sure the costs would be justified.