How much electricity would be saved worldwide if Windows was writen in Assembly?

Cogman · Jul 20, 2010

iCyborg said:
I was actually aware of that, realized it that when I increased the number of runs, the number of solutions also increased, and I normalized everything to the original 10 runs. Btw, all times reported are also per 10 runs, not per 1 run

As for the MT version, I actually planned to do that, perhaps you can try with Scali's suggestion, I won't be able to until later today.
I was going to divide work in a bit better way though. You have it by n, e.g. for 2 cores, it'd be 1-25 and 26-50. The thing is that those inner m and k loops will be executed a lot more for smaller n's, so it would be better to have something like 1-15/16-50 (and say 7, 10, 16, 17 for 4 cores) as the first thread will have lots more to do than the last. It would need some hardcoding though, and perhaps experimenting, but the overhead does make it quite unlikely that much can be gained, unless Interlocked add helps.

Meh, the critical section is only used once per thread, and critical sections are really only slow if there is a collision (really not all that likely). I did an interlocked add just to make sure and got similar results (Ok, it was with wonky gcc asm as mingw doesn't support the InterlockedAdd function).

Where the speeds are ~2x that of the single threaded version, I'm going to say that thread overhead is the big killer here. Threading will probably only benefit this if we are finding more solutions, or using a crappier algorithm. Perhaps if I'm bored, I'll thread my original solution for the heck of it.

Scali · Jul 20, 2010

Cogman said:
Meh, the critical section is only used once per thread, and critical sections are really only slow if there is a collision (really not all that likely). I did an interlocked add just to make sure and got similar results (Ok, it was with wonky gcc asm as mingw doesn't support the InterlockedAdd function).

gcc has an equivalent, __sync_add_and_fetch();

Markbnj · Jul 20, 2010

Any_Name_Does said:
Thanks Markbnj,

Well, I actually didn't mind his remarks. I saw it more as provocation than insult and It takes much more than that to provoke me, but you are right. Rules are rules.

If you really want to thank me, then don't ever actually create a thread this dumb in the Programming forum. I have a family. Think of the children. As long as it stays on some sort of optimized assembler discussion, which is where it seems to have landed, I'll leave it. But my finger has been hovering over the lock button since Virge copied it in.

Any_Name_Does · Jul 20, 2010

Markbnj said:
If you really want to thank me, then don't ever actually create a thread this dumb in the Programming forum. I have a family. Think of the children. As long as it stays on some sort of optimized assembler discussion, which is where it seems to have landed, I'll leave it. But my finger has been hovering over the lock button since Virge copied it in.

Do you mean troll invasion?

Scali · Jul 20, 2010

Any_Name_Does said:
Do you mean troll invasion?

Come on, stop pushing it.
Apparently your opinion was an impopular one, and met with a lot of adversity.
You could have known beforehand.

And even if you didn't... by now you should have realized that these 'trolls' were telling the truth, as crude as they may have been in delivering the message.
After all, plenty of code was produced to demonstrate that pretty much everything that was claimed before, is actually true (eg, that choosing the right algorithm is more important than the language, or that a modern compiler will easily outperform the average programmer trying to write some assembly).

Markbnj · Jul 20, 2010

Any_Name_Does said:
Do you mean troll invasion?

There has no doubt been trolling on both sides. In this forum we talk about programming, and topics closely related to programming, and in general try to do so as adults would if they were in each other's physical presence. This thread violates pretty much all of those standards in one aspect or another.

Any_Name_Does · Jul 20, 2010

Scali said:
Come on, stop pushing it.
Apparently your opinion was an impopular one, and met with a lot of adversity.
You could have known beforehand.

And even if you didn't... by now you should have realized that these 'trolls' were telling the truth, as crude as they may have been in delivering the message.
After all, plenty of code was produced to demonstrate that pretty much everything that was claimed before, is actually true (eg, that choosing the right algorithm is more important than the language, or that a modern compiler will easily outperform the average programmer trying to write some assembly).

You don't need to lock truth tellers. Schmidi told a truth and it was fine. some other guys optimized on his truth and it got better.
I already explained why I opened this thread. Here is the summary.
Ladies and gentlemen and dear trolls,
I just got fed up with assembly and wanted some proof that high level languages hold their own. so I would motivated to start learning one. And the proof was delivered. Go thank the moderator for locking you out, otherwise I would be doing things with you, you wouldn't want.

Why not lock the thread?

Scali · Jul 20, 2010

Why don't we leave the thread open, as various people are apparently still playing with the different algorithms.

Any_Name_Does · Jul 20, 2010

Scali said:
Why don't we leave the thread open, as various people are apparently still playing with the different algorithms.

Mark isn't happy with it.

zebano · Jul 20, 2010

nevermind

BoberFett · Jul 20, 2010

Any_Name_Does said:
You don't need to lock truth tellers. Schmidi told a truth and it was fine. some other guys optimized on his truth and it got better.
I already explained why I opened this thread. Here is the summary.
Ladies and gentlemen and dear trolls,
I just got fed up with assembly and wanted some proof that high level languages hold their own. so I would motivated to start learning one. And the proof was delivered. Go thank the moderator for locking you out, otherwise I would be doing things with you, you wouldn't want.

Why not lock the thread?

Things we wouldn't want? Like what, more crappy code from a terrible programmer?

I can understand your moderation Mark, but I'd say my insults for this troll were fairly well deserved.

Any_Name_Does · Jul 20, 2010

BoberFett said:
I can understand your moderation Mark, but I'd say my insults for this troll were fairly well deserved.

You really want it?

Cogman · Jul 20, 2010

Speaking of which....

Code:

struct cogmanTData
{
    int start;
    int end;
    int* mulTable;
    int* sol;
    CRITICAL_SECTION* CS;
};

DWORD WINAPI CogmanSSThread(void* data)
{
    cogmanTData* tData = (cogmanTData*)data;
    int* mulTable = tData->mulTable;
    int solutions = 0;
    for(int i=tData->start; i<tData->end; ++i)
    {
        for(int j=i+1; j<5001; ++j)
        {
            int c;
            int d;
            c = mulTable[i] + mulTable[j];
            d=sqrt(c);
            if (mulTable[d] == c)
            {
                ++solutions; // without this it would deadcode the solution.
#ifdef _OUTPUTENABLE
                cout<<i;
                cout<<' ';
                cout<<j;
                cout<<' ';
                cout<<iD;
                cout<<endl;
#endif
            }
        }
    }
    EnterCriticalSection(tData->CS);
    *tData->sol += solutions;
    LeaveCriticalSection(tData->CS);
    return 0;
}

bool CogmanSimpleSolutionThreaded()
{
    const int NUM_RUNS = 100;
    LARGE_INTEGER start, end, freq;
    int solutions = 0;
    QueryPerformanceCounter(&start);

    int mulTable[7074];
    mulTable[2] = 7;

    for(int iRuncount=0; iRuncount<NUM_RUNS; iRuncount++)
    {
        // Generate the multiplication table
        int start = 1;
        if (mulTable[2] != 4)
        {
            for (int i = 1; i < 5001; ++i)
            {
                mulTable[i] = i * i;
                int c;
                int d;
                c = 1 + mulTable[i];
                d = sqrt(c);
                if (d * d == c)
                {
                    ++solutions;
                }
            }
            start = 2;
            for (int i = 5001; i < 7074; ++i)
            {
                mulTable[i] = i * i;
            }
        }
        #define CNUMTHREADS 32
        CRITICAL_SECTION CS;
        InitializeCriticalSection(&CS);
        HANDLE threads[CNUMTHREADS];
        cogmanTData tData[CNUMTHREADS];
        int stopPos = 1;
        for (int j = 0; j < CNUMTHREADS; ++j)
        {
            tData[j].start = stopPos;
            stopPos += 5001 / CNUMTHREADS;
            if (stopPos >= 4999) // Make sure we catch the end
                stopPos = 5001;
            tData[j].end = stopPos;
            tData[j].sol = &solutions;
            tData[j].mulTable = mulTable;
            tData[j].CS = &CS;
            threads[j] = CreateThread(NULL, 0, CogmanSSThread, (void*)&tData[j], 0, NULL);
        }
        WaitForMultipleObjects(CNUMTHREADS, threads, true, INFINITE);
    }
    QueryPerformanceCounter(&end);
    QueryPerformanceFrequency(&freq);
    double dEnd=(double) end.QuadPart / (double)freq.QuadPart;
    double dStart=(double) start.QuadPart / (double)freq.QuadPart;
    double dTotalTick=((dEnd-dStart)/(double) NUM_RUNS);
    cout<<"CogmansSimpleSolT ";
    cout<<dTotalTick;
    cout<<" seconds, solutions found: "<< solutions/NUM_RUNS << endl;
    return solutions;
}

This shows a CLEAR speed up from the result of using more then one thread, about 4x vs the regular code (I took the 10x monkier out. So, multiply by ten to see the improvement...)

Search

How much electricity would be saved worldwide if Windows was writen in Assembly?

Cogman

Lifer

Scali

Banned

Markbnj

Elite Member <br>Moderator Emeritus

Any_Name_Does

Member

Scali

Banned

Markbnj

Elite Member <br>Moderator Emeritus

Any_Name_Does

Member

Scali

Banned

Any_Name_Does

Member

zebano

Diamond Member

BoberFett

Lifer

Any_Name_Does

Member

Cogman

Lifer

TRENDING THREADS