Hi all,
This may seem like a somewhat subtle (or trivial, as the case may be) point, but it is something I have not yet seen addressed directly. It is worth noting that my status as a programmer came out of physics research, specifically writing simulations that would potentially be run in cluster-level parallel environments. I mention this because if I had not studied TTL circuits, PC hardware, assembly language, basic OS architecture and so on before learning C++ for the first time, I would not be asking this question (and it is perfectly possible that I do not see the question is irrelevant simply because I still don't know enough, as is usually the case with anything involving computers...). I do not consider any of that an authoritative set of qualifications. Quite the opposite: aside from building home-project circuits and putting PCs together from components, C++ is really the only thing I have studied to the point of completing large multi-module projects, and those still made use of professional code wherever possible (I have only recently started on intermediate-level material).
Just to make sure everyone (myself included) is clear on the most important implications of SSDs so far, I will state them (feel free to scan/skip the rest of this paragraph). While it is true that many SSDs have already left traditional magnetic HDDs in the dust in sequential speed by using multiple channels, what really makes them revolutionary is their random access time. Not to beat the point into the ground (though it really is worth noting again and again): in read time (and in write time, with controllers that do good hardware abstraction) they have taken the random access time of a typical spinning magnetic mass storage drive, which in 20 years has not improved by even a single order of magnitude past ~2 milliseconds, all the way down to ~0.1 milliseconds (depending on the drive), and the technology is still only making its way into the mainstream. Low power requirements, reliability (SLC particularly so) and other benefits are big, to be sure, but as far as the user experience goes, whether directly from a desktop SSD or indirectly from an internet that will soon run on servers using SSDs, the random access time is the transforming factor.
Now, this is what led me to the question in the title. It came initially from having previously been walked through how, at a low level, requests to the read/write heads of a spinning HDD are scheduled and reordered. Mathematically the problem is "non-linear" in the extreme: you have tons of requests that vary greatly in their dependence on one another, their priority, and even their physical proximity to where the read/write heads currently are, all while the drive has only a limited ability to take this into account and plan accordingly in real time. The result is an optimized but still absurd variance inherent to spinning drives (the fact that they spin would cause this to some extent by itself), which makes the "average seek time" truly the AVERAGE seek time, representing at best the center of a bell curve. Put in perspective, this is one of those things we tend to let slip by, which is understandable considering we have been stuck with it for the past 20 years: every OS, application and piece of hardware has had to design around not simply a high random access time, but a highly VARIABLE random access time. In case anyone wonders how this could be relevant, consider how important scheduling and prediction are to processing efficiency (look at the on-die real estate of a single Nehalem core and see how small the execution units are compared to everything handling prediction, prefetch, microcode, interrupts, etc.).
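To make that reordering problem concrete, here is a minimal C++ sketch of an elevator/SCAN-style pass, the classic textbook approach to servicing head requests. The Request struct, the LBA-only model and the scanOrder function are purely illustrative assumptions on my part, nothing like real drive firmware, which also has to weigh rotational position, priorities and deadlines (which is exactly where the variance comes from):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical pending request: reduced to a logical block address for illustration.
struct Request {
    std::uint64_t lba;
};

// One sweep of an elevator/SCAN-style pass: service everything at or ahead of
// the head in the current direction, then sweep back for the rest.
std::vector<Request> scanOrder(std::vector<Request> pending, std::uint64_t headLba) {
    std::sort(pending.begin(), pending.end(),
              [](const Request& a, const Request& b) { return a.lba < b.lba; });

    std::vector<Request> ordered;
    // First the requests ahead of the head, in increasing LBA order...
    for (const Request& r : pending)
        if (r.lba >= headLba) ordered.push_back(r);
    // ...then the ones behind it, picked up on the return sweep.
    for (auto it = pending.rbegin(); it != pending.rend(); ++it)
        if (it->lba < headLba) ordered.push_back(*it);
    return ordered;
}
```

Even in this toy version you can see why the time to satisfy any one request depends entirely on where it sits relative to the head and on everything queued around it.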
Now, I have learned to take more seriously the importance of the things closest to the processor, by which I mean the importance of cache over main memory, and the uselessness of parallel execution capability without extensive on-die scheduling and preparation (i.e. why the P4 pipeline was full of bubbles for lack of it). But here I am thinking about the big-picture implications of hard drive read/write times that are not simply fast but consistent, as I will explain below.
What this all makes me wonder is whether a ridiculous durability requirement (having to tolerate unpredictable waits) has been lifted from very high level scheduling tasks with the advent of SSDs, which, from what I have read here on AnandTech and elsewhere, now give a relatively predetermined random access time for information coming off the system drive. The concern about variance in write time is legitimate, since with present flash memory the slow write itself is abstracted away by the controller. But I believe one of the articles here inferred from the "quantization" of write times that (with the Intel controllers at least) the drive uses some kind of queue that can be stuffed to the point where it takes an extra "cycle" or so of a firmware operation to represent the write request.
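Just to pin down what I mean by "stuffed queue", here is a purely speculative C++ sketch of that quantized-latency idea: a bounded controller-side queue that absorbs writes cheaply until it fills, at which point a request appears to cost one extra firmware step. The WriteQueue class and its numbers are my own invention for illustration and say nothing about how the actual Intel firmware works, which is not public:

```cpp
#include <cstddef>
#include <queue>

// Toy model of a write buffer whose apparent latency jumps when it is full.
class WriteQueue {
public:
    explicit WriteQueue(std::size_t capacity) : capacity_(capacity) {}

    // Returns the number of firmware "cycles" this write appears to cost.
    int submit(int lba) {
        int cycles = 1;                 // normal case: queued immediately
        if (queue_.size() == capacity_) {
            queue_.pop();               // room must be drained first...
            cycles += 1;                // ...which shows up as an extra step
        }
        queue_.push(lba);
        return cycles;
    }

private:
    std::size_t capacity_;
    std::queue<int> queue_;
};
```

If something like this is roughly right, write latency would still be far more predictable than a seek, just stepping between a small number of discrete values rather than spreading across a bell curve.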
I freely admit I have never dropped into assembly in the middle of C++ proper, and I have obviously never written a task scheduler, but I was wondering whether anyone here is familiar enough with the issues involved to answer this: could not simply fast, but uniformly fast, random access to mass storage mean a paradigm shift, not necessarily in actual code, but in the implementation details of virtual machines, compilers and schedulers, perhaps most especially when it comes to parallelization? Just thinking aloud, it would seem ("seem" being the operative word) that every scheduler optimization algorithm that previously had to tell a process to stand by and come back when its unknown wait for a read/write operation was over could now, without missing a beat, choose another process that could actually be completed just in time to clear out of the global descriptor table (that is just a deliberately basic example of what I mean).
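To sketch the kind of thing I am imagining (and this is only a thought experiment in C++, not how any real OS scheduler is written): if the latency of an outstanding read were known and uniform, the scheduler could pick a ready task whose estimated CPU burst fits inside that window instead of just parking the blocked process and hoping. The Task struct, pickFiller function and the ~100 microsecond figure are all assumptions of mine for illustration:

```cpp
#include <chrono>
#include <deque>
#include <optional>

using namespace std::chrono;

// Hypothetical ready task with an estimated remaining CPU burst.
struct Task {
    int id;
    microseconds estimatedBurst;
};

// With a spinning disk the wait is effectively unknown, so the scheduler just
// runs whatever is next. With a uniform, known mass-storage latency it could
// deliberately choose the task that best fills the gap before the I/O completes.
std::optional<Task> pickFiller(const std::deque<Task>& ready,
                               microseconds ioLatency) {
    std::optional<Task> best;
    for (const Task& t : ready) {
        if (t.estimatedBurst <= ioLatency &&
            (!best || t.estimatedBurst > best->estimatedBurst)) {
            best = t;  // longest burst that still fits inside the I/O window
        }
    }
    return best;
}
```

Something like pickFiller(readyQueue, 100us) is the "without missing a beat" case I mean: the core stays busy right up until the read comes back.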
This is my first post, so if I am missing something big here, please just politely say so, and if you can think of one off-hand, recommend an article or book that will get me started.