
How much electricity would be saved worldwide if Windows was written in Assembly?

Status
Not open for further replies.
Do you realize that Windows Vista was three times as many lines of code as the operating system we developed for our last major program, which was orders of magnitude more complicated and critical than Windows Vista?

Orders of magnitude more complicated than an OS which supports thousands of devices through dozens of chipsets and capable of running a million applications on a hundred different processors?

An OS for one, single program?
 
Again I should emphasize I know exactly what I am talking about. For example the Windows string concatenation API takes an order of magnitude more clock cycles than the one I have written in asm. It is just one example. Windows has grown so big that it needs to be optimized, otherwise it has no future.

Which Windows string concatenation API are we talking about? Do you mean lstrcat?
 
you "know" for "a fact" that aero is bloatware? really?
Did your knowledge of facts inform you that disabling aero LOWERS performance, because aero offloads to the GPU a lot of things that were previously rendered in the CPU?
Disabling Aero will save you some ram, but increase your CPU utilization in windows, and is simply not worth it (unless you are extremely hurting for ram)

I just don't need it. An OS needs to be fast, offer the functions I need, be bug-free, use little power, and work seamlessly. But in no way be showy. Leave that to third-party vendors for those who want it.
 
Orders of magnitude more complicated than an OS which supports thousands of devices through dozens of chipsets and capable of running a million applications on a hundred different processors?

An OS for one, single program?
My thoughts, especially considering that he obviously does not work for MS, i.e. does not know how large the codebase is (they don't publish numbers as far as I know), and yet is able to estimate the complexity? Oh, and he doesn't even mention the name of this ominous program 😉


@Any_Name_Does: The OS usually needs MORE power if you disable aero, so your point is? Aero looks nice and is something you really get used to, so even if it would use a bit more power, I'd still use it.. if you want architectural purity, why not use Arch? You don't even get those horrible inefficient guis there, great.

And no, the OS functions don't have to be especially fast. First of all they have to be correct, not break backwards compatibility, and be well documented, and then maybe we can think about optimizing them for speed, if they are important enough to warrant the work.

And I'd still love to see your code for the string concatenation, I can't imagine how you would get that much faster than the trivial C function.
 
I think it doesn't matter how Windows is written. What matters is how long and how often it is being used; otherwise no energy will be saved.
 
Orders of magnitude more complicated than an OS which supports thousands of devices through dozens of chipsets and capable of running a million applications on a hundred different processors?

An OS for one, single program?

You don't understand what I mean by program. I do not mean software program.

It is not prudent for me to give any additional information, I just meant to give an example that showed how much room for improvement Vista had in terms of efficiency. It was somewhat of a brainfart on my part to give the example, as it is not something I wish to talk about in a public forum.
 
I just don't need it. An OS needs to be fast, offer the functions I need, be bug-free, use little power, and work seamlessly. But in no way be showy. Leave that to third-party vendors for those who want it.

Aero makes the OS faster, makes the machine consume less power, works seamlessly, and I have never seen any bugs in it...
just because it looks good doesn't mean it is BAD...
I used to disable the eye candy theme in windows XP for extra performance (giving it a win2k look)... that was because winXP eye candy came at the cost of CPU cycles and ram (at the time when both were very limited)... I eventually started using it when it became trivial. With Aero though, it always had better performance than previous methods, thanks to offloading to GPU.

Now, it was possible for MS to design a hypothetical system that offloads to the GPU with less eye candy, for an infinitesimal improvement in power consumption (due to reduced GPU utilization) and no effect on speed (because it's all GPU anyway). But they didn't, they made Aero, which, while not the ultimate in power savings, is flat out superior in terms of speed and power to the older and less pretty methods of rendering.
 
If MS took that approach, I guarantee you that it would result in a slower version of Windows that would consume more power. Have you ever tried to optimize assembly code better than the compiler? It's really hard. The compiler is usually better at it than any human. The compiler produces this crazy assembly code that doesn't look so good, but when you actually run it, it's better than hand-written assembly.

The only way that an OS written in assembly would be better than an OS written in C++/C would be due to a reduction in architectural complexity of the program and a reduction in features. It would not be due to better fine-tuning and squeezing more cycles out.

Some time ago someone I know needed a program which calculates the Pythagoras formula a² + b² = c², with a and b being numbers from 1 to 10000. I wrote that program and it is fast. It could become even faster if I was a good asm programmer. You seem to be a programmer yourself. Write a program which does this. I bet you that my code will be at least twice as fast as yours. Your code writes the results of the calculation to a text file and displays it at the end. But do not cheat. What I mean by cheat is having an asm programmer optimize it for you.
 
How large is Windows compared to the Nike-X/Spartan anti-ballistic missile system control program, which was perhaps the largest computer program written in the 1960s or early 1970s and was written in assembly language?
Much, much, much bigger. There is no comparison.
 
Some time ago someone I know needed a program which calculates the Pythagoras formula a² + b² = c², with a and b being numbers from 1 to 10000. I wrote that program and it is fast. It could become even faster if I was a good asm programmer. You seem to be a programmer yourself. Write a program which does this. I bet you that my code will be at least twice as fast as yours. Your code writes the results of the calculation to a text file and displays it at the end. But do not cheat. What I mean by cheat is having an asm programmer optimize it for you.
Okay, we want the integer value of sqrt(a^2 + b^2) where a and b are integers? Basically that depends solely on the implementation of the integer sqrt function.. ah I know one or two tricks for that.

Fine, let's see how much faster your version is 😉
 
Okay, we want the integer value of sqrt(a^2 + b^2) where a and b are integers? Basically that depends solely on the implementation of the integer sqrt function.. ah I know one or two tricks for that.

Fine, let's see how much faster your version is 😉

Not just a^2 + b^2
a^2 + b^2 = c^2

so something like this:

3² + 4² = 5² = 25
 
Not just a^2 + b^2
a^2 + b^2 = c^2

so something like this:

3² + 4² = 5² = 25
Ahm so you basically want a*a + b*b?!

Or are you interested in the 5? I.e. you give the function two numbers and it returns the a^2 + b^2 AND sqrt(a^2 + b^2) ?

Still the only interesting part is the integer sqrt function.. just give me the assembly code for it (I hope you followed the standard c function call conventions) and I'll just benchmark that part. Just compute the sqrt for all numbers from 0 till UINT_MAX and look how long it takes.
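
A minimal sketch of what such an integer sqrt benchmark could look like, in Python rather than C or asm. None of this is code from the thread: `newton_isqrt` and `benchmark` are made-up names, Newton's iteration is just one common way to do it (not necessarily the "tricks" meant above), and the loop stops far short of UINT_MAX to keep the run short. The standard library's `math.isqrt` (Python 3.8+) serves as the reference.

```python
import time
from math import isqrt


def newton_isqrt(n):
    """Floor of the square root of a non-negative integer, by Newton's iteration."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Start from a power of two that is >= sqrt(n), so the iterates only decrease.
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2
        if y >= x:  # no further decrease: x is the floor of sqrt(n)
            return x
        x = y


def benchmark(func, upper):
    """Time func over 0..upper-1 and return the elapsed seconds."""
    start = time.perf_counter()
    for n in range(upper):
        func(n)
    return time.perf_counter() - start


if __name__ == "__main__":
    upper = 200_000  # assumption: a small range, not the full 0..UINT_MAX
    print("newton_isqrt:", benchmark(newton_isqrt, upper), "s")
    print("math.isqrt:  ", benchmark(isqrt, upper), "s")
```

Swapping in another implementation only means passing a different function to `benchmark`, as long as it takes one integer and returns one.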
 
No, I'm getting more and more confused - that seems completely useless and not much to optimize (power of two of a number and adding two numbers?)

I think it's easier if you write the function in pseudo code for me..
 
Some time ago someone I know needed a program which calculates the Pythagoras formula a² + b² = c², with a and b being numbers from 1 to 10000. I wrote that program and it is fast. It could become even faster if I was a good asm programmer. You seem to be a programmer yourself. Write a program which does this. I bet you that my code will be at least twice as fast as yours. Your code writes the results of the calculation to a text file and displays it at the end. But do not cheat. What I mean by cheat is having an asm programmer optimize it for you.

you think you will be able to beat a compiler by 2x and you say you are not good at writing assembly.🙄

i can tell by that example you don't know what you are doing. that would take microseconds to finish and no one would waste their time optimizing it. in fact most of the time would be spent displaying the file.

inline assembly is not cheating. it's an advantage of the language which is what this is all about.
 
No, I'm getting more and more confused - that seems completely useless and not much to optimize (power of two of a number and adding two numbers?)

I think it's easier if you write the function in pseudo code for me..

it is not useless. it is a famous mathematical formula.
power of 2 of a number + power of 2 of another number resulting in the power of 2 of another number. you must find from 1 to 10000 every possible number which when multiplied by itself and added ........... come on wiki has it. Take a look
 
you think you will be able to beat a compiler by 2x and you say you are not good at writing assembly.🙄

i can tell by that example you don't know what you are doing. that would take microseconds to finish and no one would waste their time optimizing it. in fact most of the time would be spent displaying the file.

inline assembly is not cheating. it's an advantage of the language which is what this is all about.

do it then. I don't think you have understood the task.
 
it is not useless. it is a famous mathematical formula.
power of 2 of a number + power of 2 of another number resulting in the power of 2 of another number. you must find from 1 to 10000 every possible number which when multiplied by itself and added ........... come on wiki has it. Take a look
Ahm thanks I know what a Pythagorean triple is (my math minor wasn't that useless), would've been easier if you had just told me that you wanted all triples in the range of 1-10000.
But I'm not aware of any application area where you'd want to use it, sounds more like a Project Euler kind of fun function. Just because it's famous doesn't mean it's useful.. the proof of Fermat's Last Theorem is unbelievably famous, horrendously complicated (not that many people can follow it in every detail), long (100 pages) but still, practically useless 😉


But the real problem here is that even a trivial brute-force implementation takes far less than 1ms to complete and there are many easy algorithms to generate those triples, so that's a really bad, bad example to test anything.
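
For reference, a trivial brute-force search for the triples could look roughly like the sketch below. It is written in Python rather than C or asm, so it says nothing about the timings argued over here. The function name `pythagorean_triples` and its `limit` parameter are made up for illustration, and it assumes c must also stay within the limit and that a <= b (so each triple is listed once), neither of which the challenge spells out. `math.isqrt` needs Python 3.8 or later.

```python
from math import isqrt


def pythagorean_triples(limit):
    """Return all (a, b, c) with 1 <= a <= b, c <= limit and a*a + b*b == c*c."""
    triples = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            s = a * a + b * b
            c = isqrt(s)  # floor of the square root
            if c > limit:
                break  # c only grows with b, so no more hits for this a
            if c * c == s:  # s is a perfect square
                triples.append((a, b, c))
    return triples


if __name__ == "__main__":
    # Small limit for a quick look; the challenge itself uses 10000.
    for a, b, c in pythagorean_triples(20):
        print(f"{a}^2 + {b}^2 = {c}^2 = {c * c}")
```

Writing the results to a text file, as the challenge asks, would just mean replacing the `print` with a write to an open file.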
 
Ahm thanks I know what a Pythagorean triple is (my math minor wasn't that useless), would've been easier if you had just told me that you wanted all triples in the range of 1-10000.
But I'm not aware of any application area where you'd want to use it, sounds more like a Project Euler kind of fun function. Just because it's famous doesn't mean it's useful.. the proof of Fermat's Last Theorem is unbelievably famous, horrendously complicated (not that many people can follow it in every detail), long (100 pages) but still, practically useless 😉


But the real problem here is that even a trivial brute-force implementation takes far less than 1ms to complete and there are many easy algorithms to generate those triples, so that's a really bad, bad example to test anything.

The person who asked for this program was a math teacher and I don't know what for. For me it was a challenge and I did it, and I still have it.
I think it is a good example of how vastly superior asm code can be to other languages, as it runs within a couple of small loops with a size of less than 100 bytes.
 
I am afraid I have to disagree. No compiler can come anywhere close to assembly code. I think the reason high level languages took over was the lack of energy awareness and the relatively small number of computers worldwide. Now that this awareness is there and computers consume huge amounts of energy, a return to assembly COULD be worthwhile.
Note that I am not saying it IS worthwhile. I just think it is worth a calculation.

How long would it take MS to recode Win7 to assembly?

Diseconomies of scale

^ The above link reckons that a coder will only manage between 300 - 5000 lines per year on a 10 million line of code project. Vista is reputed to be 50 million lines.

Assuming that Win7 is 60 million lines, it would take your 100 coders 60,000,000/(300*100) = 2000 years to produce 60 million lines of code. Given that you have to provide office space, heating, lighting, computers etc. for the people, you are going to use a lot of power (this is what sandorski meant when he said you "front load the power")... then there are the engineers' salaries, which over the period will amount to $12 billion.

Of course you could increase staffing levels - but this too will have a dis-economy of scale so increasing the staff to 100,000 will not get the project done in 2 years (and it will therefore cost >$12 billion).

^Granted this is a bit of a laugh, since you should be able to do things faster if you already have the code 🙂

The coding horror site is probably worth reading, as are the books it mentions.
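
The back-of-the-envelope numbers above can be redone in a few lines of Python. This is only a sketch of the same arithmetic: the function names are made up, the inputs are the figures from this post, and the salary per coder is not stated anywhere, it is just what the $12 billion total implies.

```python
LINES_OF_CODE = 60_000_000      # assumed size of Win7, from the post
CODERS = 100                    # team size used in the post
LOW_RATE = 300                  # lines per coder per year, low end of the quoted range
HIGH_RATE = 5000                # lines per coder per year, high end of the quoted range
TOTAL_SALARY = 12_000_000_000   # the $12 billion from the post


def years_needed(lines, coders, lines_per_coder_year):
    """Years for a team to produce the given number of lines at a fixed rate."""
    return lines / (coders * lines_per_coder_year)


def implied_salary(total, coders, years):
    """Yearly salary per coder implied by a total salary bill."""
    return total / (coders * years)


if __name__ == "__main__":
    worst = years_needed(LINES_OF_CODE, CODERS, LOW_RATE)
    best = years_needed(LINES_OF_CODE, CODERS, HIGH_RATE)
    print("years at 300 lines/year: ", worst)
    print("years at 5000 lines/year:", best)
    print("implied salary per coder-year:", implied_salary(TOTAL_SALARY, CODERS, worst))
```

The 2000-year figure uses the low end of the 300 - 5000 range; at the high end the same formula gives a much shorter but still very long project.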
 
How long would it take MS to recode Win7 to assembly?

Diseconomies of scale

^ The above link reckons that a coder will only manage between 300 - 5000 lines per year on a 10 million line of code project. Vista is reputed to be 50 million lines.

Assuming that Win7 is 60 million lines, it would take your 100 coders 60,000,000/(300*100) = 2000 years to produce 60 million lines of code. Given that you have to provide office space, heating, lighting, computers etc. for the people, you are going to use a lot of power (this is what sandorski meant when he said you "front load the power")... then there are the engineers' salaries, which over the period will amount to $12 billion.

Of course you could increase staffing levels - but this too will have a dis-economy of scale so increasing the staff to 100,000 will not get the project done in 2 years (and it will therefore cost >$12 billion).

^Granted this is a bit of a laugh, since you should be able to do things faster if you already have the code 🙂

The coding horror site is probably worth reading, as are the books it mentions.

I have to repeat myself.
You don't recode the whole OS. You start off with the kernel DLL and go on to other major files, slowly and steadily. I would say 100 programmers could do most of the important work in a couple of years. If you want to spend as little as possible you could even employ programmers in India who would happily work for 1000 dollars a month. What happens after that is that you have a better OS forever. They don't write the code. They just optimize it.
 
Searching for a file in Windows XP used to take more than a minute on my computer. I wrote assembly code which did it in 14 seconds. I didn't even optimize it.

Your mistake is thinking that everybody would just run the program they need to and then immediately turn the computer off... that would lead to a significant saving. Most of the time the computer is idling anyway...

I'm browsing, watching a movie and other stuff and the CPU usage is 1%, having Win7 an order of magnitude faster via assembly code won't save me any energy...
 
functionality is more important than performance. the whole point of an algorithm is to do something. speed is secondary especially when your example only takes milliseconds.

i'll play your little game. i'm not a programmer by profession, i would consider myself a neophyte but i think that will actually help prove my point shortly.

can you post the ASM?
 
I have to repeat myself.
You don't recode the whole OS. You start off with the kernel DLL and go on to other major files, slowly and steadily. I would say 100 programmers could do most of the important work in a couple of years. If you want to spend as little as possible you could even employ programmers in India who would happily work for 1000 dollars a month. What happens after that is that you have a better OS forever. They don't write the code. They just optimize it.

The kernel is still going to be millions of lines isn't it?

If you change or optimise something there is always the possibility that it will break something else, which means it will take ages to implement.

That bit of redundant code you thought was redundant didn't turn out to be so redundant after all...

Edit: Yeah, even Linux kernels have surpassed 10 million lines of code... good luck with Win7!
 
functionality is more important than performance. the whole point of an algorithm is to do something. speed is secondary especially when your example only takes milliseconds.

i'll play your little game. i'm not a programmer by profession, i would consider myself a neophyte but i think that will actually help prove my point shortly.

can you post the ASM?

write the code and once you get it right we'll look for someone as referee.
 