
So what happened to the 2.6.x pre-emptiveness hype?

pitupepito2000

Golden Member
Hi,

As I was skimming the post-halloween document located at http://www.codemonkey.org.uk/d...post-halloween-2.6.txt
I noticed it mentioned kernel preemption, and I remember reading about and feeling all the hype it caused before kernel 2.6.0 was officially released. But looking back, I don't find that the feature made any difference in responsiveness, or made anything faster compared to the 2.4.x kernels.

So what do you think about it? Did kernel 2.6.x satisfy your expectations? Did kernel preemption make your box any faster?

Here's a quote from the post-halloween document:
Kernel preemption.
~~~~~~~~~~~~~~~~~~
- The much talked about preemption patches made it into 2.6.
With this included you should notice much lower latencies especially
in demanding multimedia applications.
- Note, there are still cases where preemption must be temporarily disabled
where we do not. These areas occur in places where per-CPU data is used.
- If you get "xxx exited with preempt count=n" messages in syslog,
don't panic, these are non fatal, but are somewhat unclean.
(Something is taking a lock, and exiting without unlocking)
- If you DO notice high latency with kernel preemption enabled in
a specific code path, please report that to Andrew Morton <akpm@osdl.org>
and Robert Love <rml@tech9.net>.
The report should be something like "the latency in my xyz application
hits xxx ms when I do foo but is normally yyy" where foo is an action
like "unlink a huge directory tree".
 
Oh, that's right, I remember a while back you, drag, and a few other people were arguing about that in kernel 2.6.x, and whether it is good practice. Personally I don't think you're missing a whole lot by skipping it, except for ALSA being integrated into the kernel.
 
It made a big difference, and they've improved the scheduling more since.

Scheduling is one of the more important factors in how well an OS works. Different types of scheduling determine the behavior of the computer.

For instance, with pre-emptive features activated, the user interface becomes more responsive. It doesn't actually improve performance, but the time from button press to something happening, and the transitions in between, will SEEM faster and more responsive.

Of course there is a downside to this: if you're running, say, a busy forum server and you open up X, that is going to place a load on your server; it will pre-empt the background processes (i.e. your SERVICES) and go attend to the user at the X terminal.

This is a BAD thing for a server. Especially if you're dealing with databases and whatnot, this can lead to bad things happening, like a hiccup in the stream while you're streaming an image to a hard drive. So on a server, pre-emptive behavior should be turned off.

This isn't something that couldn't have been added a lot earlier, but with 2.6.x the Linux guys are aiming for the desktop. The 2.4.x series pretty much wrapped up Linux as a successful, enterprise-capable, scalable server, so it was time to focus elsewhere. The server side has taken on a life of its own, and large third parties (such as IBM) are pouring code, time, and money into making it work even better than it did in the past.

Now they've improved the scheduling even more. With advances like Bossa, an individual or group of developers with otherwise little knowledge of the kernel can develop custom scheduling schemes to fit it more accurately to specific roles.

For instance, you could develop a very low-latency scheduling scheme for a software-based digital recorder box, so that you don't have to worry about tracks going out of sync while recording live music. Or a person building a 128+ CPU traditional Unix server using Linux can scale the scheduling to accurately match the box.

Previously, only a person with very good knowledge of the kernel's internals and code could change stuff like that. Now they've cleaned it up so that a Bossa scheduling policy is translated into C code, which can be compiled into the kernel during the normal make build process. The idea is that the guys who know the most about getting the best performance out of process management for specific tasks are not the same guys who program kernels. At least that's how I understand it.

The reason you probably didn't notice a big change is:

1. Your box was already very fast.
2. The distro you're using backported pre-emptive behavior into its 2.4-series kernels (all major distros apply lots of customizations to their official kernels, especially Red Hat).
3. You just don't notice that sort of thing.

It's not going to make things "faster" per se. You're probably not going to get an FPS increase in games or anything, but responsiveness when moving windows around should be better.
 
If you're going to build your own kernel, be sure to turn pre-emptive scheduling on, or at least check that it's on. I know that on early 2.6 builds it was disabled by default...


The option in menuconfig is under "Processor type and features"; it's labeled "Preemptible kernel".
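If you'd rather check a config file than go through menuconfig, the relevant switch (this is my reading of the 2.6-era options; double-check against your own kernel version's Kconfig) looks like this in .config:

# 2.6-era kernel .config entry for the preemptible kernel
# (assumption: name taken from the menuconfig entry above;
# verify against your own tree)
CONFIG_PREEMPT=y

On most distros you can also see what the running kernel was built with via the shipped config, e.g. grep PREEMPT /boot/config-`uname -r`, if your distro installs one.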


Here is an interesting link to a Bossa overview.
There is even a "Bossa-fied" Knoppix CD-ROM to play around with. Probably useless for us end users, but it would be interesting to use in a class about operating systems or something...
 
Originally posted by: drag
<snip>

The reason you probably didn't notice a big change is:

1. Your box was already very fast.
2. The distro you're using backported pre-emptive behavior into its 2.4-series kernels.
3. You just don't notice that sort of thing.

I don't think my computer is that fast. I always get my kernels from kernel.org rather than from Debian.

But, oh well, I guess it could be that I just don't notice that sort of thing.

For example, I don't feel that kernel 2.6.x boots any faster than 2.4.x. Is there any way I can run tests to measure the performance difference between the two kernels on my box?

thanks,
pitupepito
 
Originally posted by: pitupepito2000
Oh, that's right, I remember a while back you, drag, and a few other people were arguing about that in kernel 2.6.x, and whether it is good practice. Personally I don't think you're missing a whole lot by skipping it, except for ALSA being integrated into the kernel.

I don't use Linux for a desktop anyhow, only server-based stuff. 😉
 
This isn't something that couldn't have been added a lot earlier, but with 2.6.x the Linux guys are aiming for the desktop. The 2.4.x series pretty much wrapped up Linux as a successful, enterprise-capable, scalable server, so it was time to focus elsewhere. The server side has taken on a life of its own, and large third parties (such as IBM) are pouring code, time, and money into making it work even better than it did in the past.

Actually, one of the biggest changes to the 2.6 series kernel (IIRC) is the O(1) scheduler. This is directly aimed at servers. Linux 2.4.x and earlier had an O(n) scheduler, which scaled poorly with the number of processes/threads on a system. Basically, the 2.4 approach scanned the entire ready list to see if anything should be boosted in priority or re-scheduled at the end of each time slice. 2.6 instead sticks to boosting only those processes that meet certain criteria, eliminating the necessity to scan through the entire list at each scheduler run.

Large enterprise-class servers with hundreds of execution units running were noticing very high system scheduling overhead, and this was the culprit.

It's also a direct aim to catch up with Solaris and NT, which have had O(1) schedulers since 1992.
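To make the O(n)-vs-O(1) distinction concrete, here is a rough sketch in Python (illustrative only: these are not the kernel's real structures or names, just the shape of the idea; the 140 priority levels match what the 2.6 scheduler uses):

#!/usr/bin/env python
# Sketch only -- not actual kernel code.

# O(n), 2.4-style: one flat list, and every scheduling decision scans it.
def pick_next_o_n(runqueue):
    best = None
    for task in runqueue:                  # work grows with len(runqueue)
        if best is None or task["goodness"] > best["goodness"]:
            best = task
    return best

# O(1), 2.6-style: tasks indexed by priority, plus a bitmap of which
# buckets are non-empty, so picking the next task is constant work.
NUM_PRIOS = 140

class O1Runqueue:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_PRIOS)]
        self.bitmap = 0                    # bit i set => bucket i non-empty

    def enqueue(self, task, prio):         # prio 0 = most urgent here
        self.buckets[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        if not self.bitmap:
            return None                    # nothing runnable
        prio = (self.bitmap & -self.bitmap).bit_length() - 1  # first set bit
        task = self.buckets[prio].pop(0)
        if not self.buckets[prio]:
            self.bitmap &= ~(1 << prio)
        return task

The point is just that pick_next does the same handful of operations no matter how many tasks are queued, while pick_next_o_n touches every runnable task every time.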
 
I don't think my computer is that fast. I always get my kernels from kernel.org rather than from Debian.

For example, I don't feel that kernel 2.6.x boots any faster than 2.4.x. Is there any way I can run tests to measure the performance difference between the two kernels on my box?

It's not going to make a difference in boot performance. It's just responsiveness.

It's just FEELING.

Here is what you do to really tell the difference:

Boot up with a vanilla 2.4 kernel.
Put the system under 100% CPU load...
Then open up a window in X and move it around. See how it responds.

Then boot up with a 2.6 kernel with preemption enabled and do the same.

You'll see that the 2.4-kernel-driven Linux will bog down; the window will lag and smear and generally show the ugly side of X. With the 2.6 kernel, you won't be able to "feel" much difference between a 100% CPU load and a 3% load.

Here is a Python script you can use if you can't think of anything else to put the CPU under 100% load:

#! /usr/bin/env python
# Busy-loop to pin one CPU at 100%. The modulus keeps c small, so the
# loop burns CPU time instead of memory; the "+ 2" keeps c from reaching 0.
c = 40
while 1:
    c = (c ** 7 - c // c % c * 13) % 1000003 + 2

(Python needs that whitespace, so don't forget to indent the last line. On a multi-CPU box, run one copy per CPU.)


At least it made a difference on my machine. The visual artifacts and the movement of windows seemed much smoother/better with preemption enabled.
 
Originally posted by: kylef
Actually, one of the biggest changes to the 2.6 series kernel (IIRC) is the O(1) scheduler. This is directly aimed at servers. <snip> It's also a direct aim to catch up with Solaris and NT, which have had O(1) schedulers since 1992.

I know that Windows has implemented a pre-emptive scheduler since forever and a day, but I don't know anything about whether it's O(1) or not.

Pre-emptiveness, when it comes to dealing with end users, only concerns what priority (or goodness) is given to a user's actions on the system; it has nothing to do with whether the scheduler is O(1) or not.

(edit: At least when it comes to user-initiated actions. Linux has always been multitasking, and processes have a niceness (priority) range of -20 to 19 that you can assign. The kernel will allocate time slices (quanta?) round-robin style, but still give priority to the processes that are less nice. I don't know if the 2.4 kernel did ANY preempting; I assume it did. Probably wrong. I don't know. But with 2.6, I believe the "preemptible" option specifically refers to stuff initiated by user input.)


Now, when it comes to enterprise stuff, there was a different issue that Linux faced.

What scaled poorly in the Linux 2.4 setup wasn't so much the number of processes as the number of processors.

The 2.4 kernel implemented one runqueue, one tasklist, regardless of the machine it was being used on.

This worked fine up to a certain point. It scaled just fine with one CPU. 2 CPUs? It was still in the game. 4? OK. 8? That's when you start to see issues.

If you look at benchmark graphs (back when they were still doing big benchmarks of W2K vs Linux), they'd both scale fine up to 4 processors; Linux would start to drop off at 8, and after that it wouldn't scale at all.

Linux's problem was that it only had one tasklist, and that was the same as its global runqueue.

You see, with the single runqueue/tasklist, the scheduler just ran up and down the list looking for the process with the best goodness and then assigned it to a CPU. That was it. It would read the runqueue, find the next process, and give that to a CPU. Then it would do it again for the next one, and the next, and so forth.

So with 4 CPUs it wasn't a problem, and it mostly worked with 8 CPUs. After that it caused issues: there was no way to make sure that each CPU got the same thread, so the large caches in the CPUs were kind of wasted. Plus, since the scheduler itself was single-threaded, it had a single spinlock and a single runqueue. It couldn't read all the potential threads and feed them to the CPUs quickly enough to utilize all the CPUs efficiently.

So basically the overhead piled up, and anything above 8 CPUs was getting wasted. That's what made it O(n) on computers with more than 4. It wasn't so much O(n) as a brick wall above 8 CPUs.

It's not as bad as it seems at first for the 2.4 kernel. When the 2.4 setup was designed, robustness, simplicity, and speed on 1-2 CPUs were the priority, because that's where Linux was mostly being used. It really wasn't being used for large databases and the like as it is today.

What the 2.6 O(1) scheduler brought to Linux was an independent runqueue for each CPU. Each CPU's runqueue has 140 possible priority levels, and the scheduler now allocates processes to each CPU more efficiently. It doesn't have to rush between the CPUs; it can balance processes/threads at its leisure. To make sure no CPU remains idle, it checks every 200ms whether any CPUs are idle; an idle CPU that has no threads to run searches every 1ms for suitable threads to migrate from another CPU.

So that's where the O(1)-ness comes from. No matter how many CPUs or threads there are, it takes the scheduler the same amount of time/overhead to pick the next thread.
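As a toy illustration of the per-CPU layout and the idle balancing described above (a sketch only; the real 2.6 balancer weighs load averages and cache affinity, and the names here are invented):

# Sketch: per-CPU runqueues plus a simple idle balancer.
class CPU:
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.runqueue = []                # this CPU's private runqueue

def idle_rebalance(cpus):
    # If a CPU is idle, pull one task from the busiest CPU.
    # (Per the description above, this would run every 1ms for idle CPUs,
    # with a lazier 200ms check when everyone is busy.)
    for cpu in cpus:
        if cpu.runqueue:
            continue                      # not idle, leave it alone
        busiest = max(cpus, key=lambda c: len(c.runqueue))
        if len(busiest.runqueue) > 1:     # don't steal a CPU's only task
            cpu.runqueue.append(busiest.runqueue.pop())

cpus = [CPU(i) for i in range(4)]
cpus[0].runqueue = ["task%d" % i for i in range(8)]   # pile all load on CPU 0

for _ in range(8):                        # a few balancer ticks
    idle_rebalance(cpus)
print([len(c.runqueue) for c in cpus])    # -> [5, 1, 1, 1]: no CPU left idle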

To an end user on a single- to quad-CPU machine this sort of thing is fairly useless compared to the old single-tasklist way of 2.4, but the nice part is that the 2.6 scheduler is still very quick, so there isn't any real penalty for using the slightly more complicated model on a computer that won't really benefit from it much.

Of course, some of these things were backported to 2.4 by companies like Red Hat as far back as 2002, to make sure their OS stayed competitive in enterprise work. Now that it's all official, though, things are much nicer.

There is still a lot to be done, though, for things like NUMA-based machines and the other specialized platforms used by high-end Unix servers and supercomputers. That's where Sun's Solaris is still better than Linux. The scheduler still doesn't take into account the position of memory, or the size of a thread/process's memory footprint, when deciding whether or not to migrate threads from one CPU to another.

Of course, this is where people like IBM come in: they have the hardware to test on and the know-how to deal with these sorts of challenges.


Now how all that compares to Windows' setup, I have not the foggiest clue.
 
Pre-emptiveness, when it comes to dealing with end users, only concerns what priority (or goodness) is given to a user's actions on the system; it has nothing to do with whether the scheduler is O(1) or not.

Actually, the 2.6 "Preemptible kernel" has nothing to do with pre-empting user-mode processes and threads, which has ALWAYS been possible in the Linux kernel since day 1. What they're talking about is pre-empting threads running in privileged mode (in the kernel). This is brand new, and is MUCH, MUCH harder to implement. The kernel is full of data structures that track all sorts of stuff on the system: processes, threads, scheduling, open file handles, etc. Prior to 2.6, all of these data structures were locked with a huge monolithic mutex. This effectively prohibited any thread running in kernel mode from being preempted. The new kernel has much more fine-grained locking all over the place, protecting each critical section from contention issues. This means that the minute a kernel thread has passed through such a critical section, it can be preempted by any other thread on the system.

This is a big step forward for Linux scalability, and it was sorely lacking. Solaris has had this since the early 90s (or maybe even before), and NT has had this since 1992. You probably wouldn't ever notice this being a problem until you start scaling to lots of processes executing system calls simultaneously. On systems with lots of processors and lots of running processes, this is a huge perf win.
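A toy contrast of the two locking styles, using Python threads as a stand-in (sketch only; the lock and table names are invented for illustration, not real kernel symbols):

import threading

file_table = []                       # stand-ins for kernel structures
proc_table = []

# 2.4-style: one monolithic lock guards everything, so a thread holding
# it cannot safely be preempted, and unrelated work contends for it.
big_lock = threading.Lock()

def open_file_coarse(entry):
    with big_lock:                    # also blocks process-table users
        file_table.append(entry)

def fork_process_coarse(entry):
    with big_lock:                    # unrelated work, same bottleneck
        proc_table.append(entry)

# 2.6-style: fine-grained locks, one per structure. Critical sections
# shrink, and a kernel thread can be preempted as soon as it exits one.
file_table_lock = threading.Lock()
proc_table_lock = threading.Lock()

def open_file_fine(entry):
    with file_table_lock:             # contends only with file-table users
        file_table.append(entry)

def fork_process_fine(entry):
    with proc_table_lock:             # runs in parallel with open_file_fine
        proc_table.append(entry)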

You see, with the single runqueue/tasklist, the scheduler just ran up and down the list looking for the process with the best goodness and then assigned it to a CPU. That was it. It would read the runqueue, find the next process, and give that to a CPU. Then it would do it again for the next one, and the next, and so forth.

Right... the key phrase being "read the runqueue". If you have to read all N processes in the runqueue to make a scheduling decision, you've just created an O(N) scheduler. This is true whether you have k runqueues for k processors or 1 runqueue for k processors.

After that it caused issues: there was no way to make sure that each CPU got the same thread, so the large caches in the CPUs were kind of wasted.

In nearly all x86 implementations, both the L1 cache and TLB are invalidated on context switch because the page tables are swapped. So really the only "wasted cache" when assigning a thread to a new processor are those L2 cache entries which happen to coincide with that thread's working set. L2 cache hit rates are fairly low anyway, and pale in comparison with the TLB misses. So assigning a thread to another processor incurs a fairly low performance hit. Contrast this with a NUMA system (such as a 4-way Opteron), where thread processor affinity is very important to keep memory accesses local. Here the scheduler must be very aware of processor affinity.

Plus, since the scheduler itself was single-threaded, it had a single spinlock and a single runqueue. It couldn't read all the potential threads and feed them to the CPUs quickly enough to utilize all the CPUs efficiently.

Ah, this is indeed another problem, which is orthogonal to the O(n) problem. If your scheduler data structure is global for all processors, you are absolutely correct in that it must be protected by a single spinlock. Then multiple scheduling decisions cannot be made in parallel. This would be considered a "scheduling contention" issue. So this *is* a problem, but it is a separate problem from the O(n) problem. Just for sake of example, NT has had an O(1) scheduler since 1992, but not until Server 2003 did it implement a separate runqueue data structure for each processor. That just shows that the two problems are discrete and require separate fixes.

So that's where the O(1)-ness comes from. No matter how many CPUs or threads there are, it takes the scheduler the same amount of time/overhead to pick the next thread.

Well, the only reason I disagree is that the only true 'n-factor' in the old linux scheduler was going through the entire runqueue every single time a scheduling operation occurred. Having multiple scheduling events trying to access the same runqueue spinlock does not really compare to going through the entire runqueue each time for two reasons. First of all, running through the entire list of runnable processes happens at EVERY scheduler event, so this magnifies the O(n) operation of going through a scaling list (you can't avoid it, and it happens every time). By contrast, scheduler contention happens less frequently, and only in the most pathological case possible (where EVERY cpu is running a thread that finishes a quantum simultaneously, triggering a mad rush on the runqueue spinlock) does it represent an O(k) algorithm. And remember, k << n, further differentiating the two.

I think part of the reason for the confusion is that the Linux kernel people have dubbed the new 2.6 scheduler, "The O(1) Scheduler." It turns out that there are several separate enhancements in the scheduler, not all of which directly address the O(n) problem (which, btw, is a classic kernel scheduling problem and is well-studied).

Now how all that compares to Windows' setup, I have not the foggiest clue.

The NUMA-aware stuff was only added as of Windows Server 2003, and I doubt it will be downported to XP.

The 2.6 kernel has other improvements which are quite impressive. Device-driver level async I/O, for instance, is a huge improvement and something that Linux has trailed in.
 
All very interesting, and what you're saying makes sense. It wasn't until recently that I began to understand the importance of all this scheduling stuff. Even after 2.6 my attitude was "cool, it's more fancy stuff for the HPC types", until I understood that on larger architectures this is something administrators in traditional Unix shops have actively managed for years and years to get the best performance and remove bottlenecks.

It's weird, because in my schooling you'd never hear anybody even mention stuff like this, but as PCs grow in capabilities and complexity it is getting critical to know this sort of stuff.

So how many administrators know that on the NT-based platform (and probably Linux and others) the scheduler tends to favor multi-threaded apps over single-threaded ones? All threads are treated the same, so when one job has more threads than another, it automatically gets more CPU, simply because having 3 threads means it potentially gets 3 time slices while the other app gets only 1.

So if you have a machine running multiple services and one service is bogging down, it may be because it's single-threaded while the others are multi-threaded; see whether giving the single-threaded app a higher priority or a larger time slice solves the performance problem, and maybe you can avoid a costly hardware upgrade.
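A throwaway simulation of that effect (names invented; a real scheduler also weighs priorities, boosts, and sleep/wake behavior):

# Round-robin toy: every runnable thread gets one slice per round, so a
# 3-thread app accumulates 3x the CPU of a 1-thread app at equal priority.
apps = {"multi_threaded_app": 3, "single_threaded_app": 1}   # thread counts
threads = [name for name, n in apps.items() for _ in range(n)]

cpu_time = dict.fromkeys(apps, 0)
for _ in range(100):                  # 100 scheduling rounds
    for t in threads:                 # one time slice per thread
        cpu_time[t] += 1

print(cpu_time)   # multi_threaded_app: 300 slices, single_threaded_app: 100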

I wouldn't normally have had a clue about this sort of thing. I'd figured before that all performance tuning involved was getting good drivers and reducing OS overhead as much as possible.


It would be interesting to see benchmarks on busy systems showing what effect scheduling priorities/time slices have on overall performance.


For anybody else who is interested, I found some links that I think are particularly good:

From the README at ftp.suse.com in the /pub/people/ak/numa/ directory (damn http:// auto-parser, can't I turn that off?!)

Linux kernels have supported a NUMA API since 2.6.7-rc3 (don't worry about the development kernels; they've been getting patches, and people were working on and testing it since the 2.5 days).

It was aimed, I believe, mostly at x86-64, but it is applicable to other architectures, probably IA64 too. I'm pretty unsure of the details. There is a Linux NUMA website, but it's a bit dated; the newest info on its front page was for the 2.5 series. I think SUSE working with AMD has done a lot of good for getting great performance out of the Linux + Opteron setup.

A good couple of articles about the Windows NT scheduler setup.

The same guy talks about the improvements of W2K over NT 4.0 in terms of scalability.

Part 2 of his article goes into more depth, specifically on W2K's scheduler improvements.

(the MS dev. guys call the scheduler the "dispatcher")

Of course the "Pro" in WinNTPro magazine doesn't seems so much pro-fessional as pro-microsoft. But it's a magazine, so it's normal (I don't go reading Linux Journal for objective commentary on Windows). 😛

How the improved Linux 2.6 scheduler basically works (a fairly good magazine for advanced Linux-specific tips and tricks). There are a couple of other articles about it from around the same time period.

If you're curious about Linux developments, check out IBM's developerWorks articles.

I know it seems like I have a boner for IBM, but I don't. I don't trust IBM any farther than my lawyer can sue them (which, since I don't have a lawyer, isn't very far), but damn, when IBM does something they don't do it half-assed.

(Check out the "Charming Python" series for a good Python introduction, if you're interested. Excellent stuff.)

Now I am all excited again about getting my hands on a dual-core, dual-CPU Opteron setup... 4 CPUs.

Each dual-core pair of CPUs shares its own memory bank and memory controller (on nicer motherboards; cheaper current Socket 939 boards have only one bank of memory, controlled by one CPU and shared over HyperTransport). With a setup like that I can see how good NUMA support can be critical for performance: shared memory and fast connections between the 2 cores, while moving threads to the other CPU is comparatively very expensive performance-wise. Tricky to balance all that out, I'd bet.

I'd suppose you'd see great performance gains from tweaked Win2003/Linux 2.6.8+ setups vs Win2000/Linux 2.4.x on a platform like that.
 
Originally posted by: n0cmonkey
Originally posted by: pitupepito2000
<snip>
I don't use Linux for a desktop anyhow, only server-based stuff. 😉

What do you use as a desktop then?
 
So how many administrators know that on the NT-based platform (and probably Linux and others) the scheduler tends to favor multi-threaded apps over single-threaded ones...

Absolutely true, and you're right: not many people know the details of the scheduler, which is amazing considering how performance-minded so many enthusiasts are.

It would be interesting to see benchmarks on busy systems showing what effect scheduling priorities/time slices have on overall performance.

There are actually some cool little experiments you can do with an app called "cpustres.exe" from Sysinternals.com and the *NT4* version of perfmon.exe (available in the Win2k resource kit). The reason you need the NT4 version of Perfmon is that it can sample faster than once per second, allowing you to actually see dynamic thread priorities changing as you play with the threads in the CPU Stress app. You can devise little experiments to actually witness the scheduler in action. 🙂

A good couple of articles about the Windows NT scheduler setup.

The same guy talks about the improvements of W2K over NT 4.0 in terms of scalability.

You've cited some good resources.

That author (Mark Russinovich) co-wrote a book called "Inside Windows 2000" with Dave Solomon. An updated version is about to come out called "Inside Windows Server 2003" (or something to that effect) that documents many of the improvements in XP and Server 2003. He has also written the vast majority of the freeware tools available on Sysinternals.com: they're amazing. I highly recommend that you check out Process Explorer, Filemon, and Regmon. The fact that they're free and not part of some kind of expensive utilities package is frankly inexplicable.

I actually took a week-long seminar course from those two when they came here to MS, and I learned things I never knew about Windows. It's amazing how much Mark was able to learn about Windows by experimenting, disassembling, debugging, and monitoring: all without taking a peek at the source code!

 
Originally posted by: pitupepito2000
Originally posted by: n0cmonkey
<snip>
What do you use as a desktop then?

It depends on where I am. At work, where I have no choice, I use Win2k. On my desk at home, I use OpenBSD. In my bedroom I use Mac OS X. Out of the three, OpenBSD is my favorite really.
 
Originally posted by: kylef
<snip> It's amazing how much Mark was able to learn about Windows by experimenting, disassembling, debugging, and monitoring: all without taking a peek at the source code!

http://www.amazon.com/exec/obi...21789-1697418?v=glance
And because the videos were developed with full access to the Windows source code and Windows Development Team, you know you're getting the real story.
(emphasis mine)

Yet, in one of the user-review comments:
No offense to David, but Mark's influence is obvious. If you are a fan of his Internals column, you will like this book even better. The fact that he does it without source code is even more amazing.
(emphasis mine)

So I'm curious - which is it - did they have access to the source code or not?

I have to concur, though, regardless: their freebie utils are a godsend for NT admins and developers alike.
 
Originally posted by: drag
<snip>
CAN YOU BELIEVE THAT A GUY THAT MAKES POSTS LIKE THIS ISN'T elite?
 
Originally posted by: pitupepito2000
Originally posted by: drag
<snip>

CAN YOU BELIEVE THAT A GUY THAT MAKES POSTS LIKE THIS ISN'T elite?

We've been trying to get that remedied for a while now. The mods ignore us. It doesn't make any sense at all, unless they were waiting for the .NET upgrade or for someone to send morphine and whores.
 
Originally posted by: n0cmonkey
Originally posted by: pitupepito2000
Originally posted by: drag
<snip>

CAN YOU BELIEVE THAT A GUY THAT MAKES POSTS LIKE THIS ISN'T elite?

We've been trying to get that remedied for a while now. The mods ignore us. It doesn't make any sense at all, unless they were waiting for the .NET upgrade or for someone to send morphine and whores.

&amp;lt;----points to his sig
 
Originally posted by: MCrusty
Originally posted by: n0cmonkey
Originally posted by: pitupepito2000
Originally posted by: drag
<snip>

CAN YOU BELIEVE THAT A GUY THAT MAKES POSTS LIKE THIS ISN'T elite?

We've been trying to get that remedied for a while now. The mods ignore us. It doesn't make any sense at all, unless they were waiting for the .NET upgrade or for someone to send morphine and whores.

<----points to his sig

Total agreement, as of now. 🙂
 
I don't know about all that elite stuff or anything. I am just a Linux fanboy who can consume and regurgitate things I find on the internet. It's more Google than me most of the time.


Anyways, here is an executive-style summary from IBM.

They regularly run tests on Linux OSes to weed out bugs and problems; if they find any, they report them to the kernel bug list. They do these tests any time there are major changes, or at random times for certification purposes. The page describes the tests and then compares the same machine in the same test environment; the only difference was that one run used a 2.4-series kernel and the other used 2.6.0-rc2. It also gives a basic summary of the improvements of 2.6 over 2.4.

A typical run begins with a 24-hour stress test; if that's successful, a 96-hour one follows, and eventually a 14-day test, all using a variety of tools and testing methodologies.

The particulars used on this server:

# Machine: IBM xSeries Netfinity 8500R 8681-7RY
# CPU: (8) Pentium III-700MHz
# Memory: 9 GB
# Swap space: 2 GB
# Linux distribution: Red Hat 7.3
# Web server: Apache Http Server 2.0.47
# Web test tool: WPT 1.9.4

It seems like the WPT test used static content, but I am not familiar with it. The test simulated 30 clients with 2 threads each.

The results?

By switching to the 2.6-series kernel they gained a 600% increase in web-serving performance, which I think is a pretty damn good improvement. (I really doubt it would have shown this large an increase on a 4-way or smaller machine, though...)
 
Originally posted by: drag
I don't know about all that elite stuff or anything. I am just a Linux fanboy who can consume and regurgitate things I find on the internet. It's more Google than me most of the time.

It's not what you know, it's what you can find out. Shut up, we're trying to get you elite and you'll be glad, like it or not. 😛

Sorry for helping to drag (maybe a bit of a pun) this off topic, but I am thoroughly enjoying the rest of the thread. 🙂
 
So I'm curious - which is it - did they have access to the source code or not?

The answer is that Mark Russinovich has no access to the source code, and is very adamant about KEEPING it that way for two reasons: he doesn't want to be seen as a Microsoft "insider", and he's somewhat suspicious about whether viewing the Windows source code would suddenly call into question the ownership of his company's future products (Winternals sells emergency recovery software).

Mark uses two tools to do the majority of his Windows internals investigation: the IDA Pro disassembler and NuMega's brilliant SoftICE kernel debugger. (Mark actually worked at NuMega for a time after his PhD at Carnegie Mellon and research work at Open Systems Research.) He does not have access to the internal Windows symbol server either, but as of XP he *does* have access to the full debugging symbols of the kernel and system DLLs, which are available to all MSDN subscribers.

In any case, watching him blow through an NT Native API call and describing its operation by looking only at the disassembly is quite impressive.

Dave Solomon, on the other hand, has a vendor/contractor badge at Microsoft and has full access to the source code. Dave also has direct access to NT's architects, such as Dave Cutler (chief NT architect) and Landy Wang (Memory Manager owner). Dave is an ex-Digital VMS guy.

So they're both correct. One guy does have source access, and the other doesn't.
 