Haswell to support transactional memory in hardware

janas19

Platinum Member
Nov 10, 2011
2,313
1
0
Transactional synchronization? Hmmm.... Kind of reminds me of *gasp* Bulldozer! (A little bit). Lol.
 

Abwx

Lifer
Apr 2, 2011
11,854
4,829
136
Doesn't read that way to me.

Perhaps you should read some of the research Oak Ridge National Laboratory, DARPA, Cray, IBM and others have done before you start with "Not invented by AMD".

You have been in the IT sector long enough to know that documented research often ended in failure once implemented...

The Pentium 4's double-pumped ALUs had considerable documentation, yet they struggled to be on par with the ALUs of older uarchs such as the Athlon and Pentium III.....
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Oh, kinda like how CMT worked out for AMD. Now I get what you are saying :whistle:

Anyway, if you want to bring up the P4 please start your own thread. The purpose of the thread is to discuss Haswell and TSX.
Thanks.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
You have been in the IT sector long enough to know that documented research often ended in failure once implemented...
And you're clearly not long enough in the "IT sector" to understand the massive importance of transactional memory.

Try writing a lock-free task scheduler using only CAS. Then try it again using transactional memory.
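
To make that concrete, here is roughly what I mean. This is only a sketch, assuming Intel's published RTM intrinsics (_xbegin/_xend/_xabort from <immintrin.h>); the task list, the counter and the spinlock fallback are made up for illustration, not a real scheduler:

Code:
// Not a real scheduler -- just the shape of the problem.
#include <atomic>
#include <immintrin.h>

struct Task { Task* next = nullptr; };

// CAS only: each atomic step covers exactly one word, so every
// multi-word state change has to be funneled through a single
// pointer and retried until it sticks.
std::atomic<Task*> head{nullptr};

void push_cas(Task* t) {
    Task* old = head.load(std::memory_order_relaxed);
    do {
        t->next = old;                        // re-link on every retry
    } while (!head.compare_exchange_weak(old, t,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
}

// RTM: a multi-word update (list head plus a counter) commits or
// aborts as one unit, and plain loads/stores suffice inside it.
Task* tm_head = nullptr;
long  tm_pending = 0;
std::atomic<int> fallback_lock{0};            // real lock for the abort path

void push_rtm(Task* t) {
    if (_xbegin() == _XBEGIN_STARTED) {
        if (fallback_lock.load(std::memory_order_relaxed))
            _xabort(0xff);                    // lock is held: don't race the slow path
        t->next = tm_head;
        tm_head = t;
        ++tm_pending;
        _xend();                              // both updates become visible together
        return;
    }
    // Transaction aborted: take the real lock instead (simple spinlock).
    while (fallback_lock.exchange(1, std::memory_order_acquire))
        _mm_pause();
    t->next = tm_head;
    tm_head = t;
    ++tm_pending;
    fallback_lock.store(0, std::memory_order_release);
}

The CAS loop already gets hairy once the scheduler state is more than a single pointer; the transactional version keeps the whole update in one place, at the cost of needing a genuine fallback path for aborts.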
 

Abwx

Lifer
Apr 2, 2011
11,854
4,829
136
And you're clearly not long enough in the "IT sector" to understand the massive importance of transactional memory.

Try writing a lock-free task scheduler using only CAS. Then try it again using transactional memory.

Long enough to realize that what was managed with software optimisations will be partly replaced by hardware management of memory sharing and task locks....

You should read the article I linked....
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
Long enough to realize that what was managed with software optimisations will be partly replaced by hardware management of memory sharing and task locks....
Your point being?
You should read the article I linked....
I know everything there is to know about it. I read the spec. Now what's your point?
 

Abwx

Lifer
Apr 2, 2011
11,854
4,829
136
Your point being?

I know everything there is to know about it. I read the spec. Now what's your point?

My point is that a software implementation will, in the long term, be more versatile.

What Intel is doing is just pushing the OS in a direction where it will be optimised first for its own CPUs, but this is old history....
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
My point is that a software implementation will, in the long term, be more versatile.

What Intel is doing is just pushing the OS in a direction where it will be optimised first for its own CPUs, but this is old history....

Really. I wonder what IBM is doing at Argonne then? What OS are they pushing to be optimized for their CPUs?

As humorous as your uninformed posts are, you are continuing to embarrass yourself. You really should consider spending a couple of hours educating yourself. Or don't, I really don't care either way.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
I read the spec.

Thanks for the link. I read through chapter 8.

You know what I'm thinking....This is a first step towards speculative execution. This is the first of the control mechanisms that will be required. I wonder now how much of this initially started with the Mitosis research project?
 

Abwx

Lifer
Apr 2, 2011
11,854
4,829
136
Speculative execution already exists; it's just that it can't be pushed further, as this would imply useless and energy-voracious speculative execution.....

Have you ever heard of pipeline stalls???....
 

CPUarchitect

Senior member
Jun 7, 2011
223
0
0
Speculative execution already exists; it's just that it can't be pushed further, as this would imply useless and energy-voracious speculative execution.....
He's talking about speculative threading assisted by hardware, not the existing branch prediction you're thinking of.

And speculation is not necessarily a bad thing for power consumption. For instance a processor without branch prediction and prefetching would run much slower than a modern one which does use these techniques. To make it match in performance, you'd have to do some extreme overclocking, greatly increasing the power consumption beyond what you saved by not speculating.

So from a performance/Watt perspective speculation can be a very interesting deal. It just needs a high prediction rate and a low penalty for mispredictions.
 

nyker96

Diamond Member
Apr 19, 2005
5,630
2
81
It all looks good on paper, but I wonder what the actual performance improvements from this type of thing are. When it comes to threading and multitasking, at least from a software point of view, it's hard to tell the benefits without some actual testing.
 

blckgrffn

Diamond Member
May 1, 2003
9,686
4,339
136
www.teamjuchems.com
It all looks good on paper, but I wonder what the actual performance improvements from this type of thing are. When it comes to threading and multitasking, at least from a software point of view, it's hard to tell the benefits without some actual testing.

Having seen a "variation" of this sort of thing introduced in file systems, which have relied on traditional locks like this in the past, I can say the upsides and practical throughput gains can be pretty enormous. I would think the more cores you have, the more important this technology is for maintaining throughput.

I say that because this is important when you have multiple servers accessing a clustered file system, which is how I view cores competing for memory access. It probably isn't perfect or even 90% accurate, but it makes sense of this concept for me.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
It all looks good on paper, but I wonder what the actual performance improvements from this type of thing are. When it comes to threading and multitasking, at least from a software point of view, it's hard to tell the benefits without some actual testing.
The greatest benefit is that it makes a lot more sense, when sharing data across threads. Let each thread read and write as it will, with checks to verify correctness, and then allow a globally-visible commit, or fall back. It would be wrong to say it is simple, but it would be right to say that it fits most humans' thought patterns far better than locking. Using locks in a traditional way is very much a choice of lesser evil over greater evil (lockless operation with shared memory--run away in fear!).

Software transactional memory has a fair chance of imposing more overhead than it offers in added performance, due to the added overhead of managing transactions. Each transaction must be isolated until commit, must have a way to verify that it should commit, that it did or did not commit, and then something to do on commit failure. When the transactions themselves are for very small amounts of work and/or memory, keeping up with that can take longer than waiting to grab a lock. Meanwhile, optimistic locking gets you halfway into the problems of trying to go lock-free.

Looking at the spec, this won't remove STM from the picture, but reduce overhead for small contentious sections of code (tell the hardware how to handle a race condition, and only do something about it if a race actually happens). I love chiding Intel for craziness in their extensions, but this one looks just about right. All I don't see are ways to handle falsely-conflicting writes.
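
To put the "only do something about it if a race actually happens" part in concrete terms, the cheapest form of that is the elided lock. A minimal sketch, assuming GCC's HLE memory-order bits (__ATOMIC_HLE_ACQUIRE/__ATOMIC_HLE_RELEASE, which need a recent GCC and -mhle); the lock word and the critical section are just placeholders:

Code:
// Hardware lock elision on a plain spinlock.  On non-TSX hardware the
// HLE hints are ignored and this is an ordinary spinlock; on Haswell the
// write to lock_word is elided, the critical section runs as a
// transaction, and the lock is only really taken if a conflict occurs.
#include <immintrin.h>   // _mm_pause

static int  lock_word = 0;
static long shared_counter = 0;

static void elided_lock(void) {
    // XACQUIRE-prefixed exchange: the store to lock_word may be elided.
    while (__atomic_exchange_n(&lock_word, 1,
                               __ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE))
        _mm_pause();     // lock genuinely held (or we aborted): spin
}

static void elided_unlock(void) {
    // XRELEASE-prefixed store: commits the elided critical section.
    __atomic_store_n(&lock_word, 0,
                     __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
}

void bump(void) {
    elided_lock();
    ++shared_counter;    // small, contentious section
    elided_unlock();
}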

Thanks for the link. I read through chapter 8.

You know what I'm thinking....This is a first step towards speculative [threading]. This is the first of the control mechanisms that will be required. I wonder now how much of this initially started with the Mitosis research project?
First step? Isn't it the overwhelming majority of the steps? There is no need for the hardware to know that it is executing the same task in two threads; there just needs to be (1) a way for one thread to fail without screwing up program/memory state, and (2) a way to guarantee that only correct paths keep executing. If my understanding is correct, a little special handling of overlapping write sets (IE, false sharing -> convoy -> live-lock) should be all the CPU would need, in addition to a minimal HTM implementation (in addition, a mechanism to handle sub-line sharing/contention would also improve basic HTM performance--win-win), to be able to properly implement SpMT.
 

Abwx

Lifer
Apr 2, 2011
11,854
4,829
136
Because it's not made by AMD?

No, because I'm not gullible and I take the time to read articles, such as the one from Hardware.fr

Ultimately, this early information about the implementation of transactional memory in Haswell leaves us with mixed feelings. The HLE mode, thanks to its backward compatibility, is particularly interesting and should simplify the execution of code that was not written optimally while providing performance gains. Programmers will of course have to recompile their code and add the necessary intrinsics to benefit from it, but the payoff can be worthwhile. The RTM mode, for its part, also shows the limits of standard x86: by dint of being extended in all directions, it runs into the problem of interoperability.

Unlike math instructions, where a compiler can decide on its own to dispatch SSE or AVX instructions for certain calculations depending on the processor model the program will run on, transactional memory requires intervention by the programmer (via intrinsics, which can also be the case for math instructions) or a new memory model for the programming language (such as the transactional memory model proposed for C++11). If we add to that the uncertainty concerning a possible AMD implementation, the RTM mode may have an even harder time gaining adoption than other extensions to the x86 instruction set.
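
For what it's worth, the language-level route the article mentions already exists in experimental form. A tiny sketch, assuming GCC's -fgnu-tm transaction blocks (the counter is just a placeholder); here the compiler and runtime decide how the transaction actually runs, which is exactly the kind of flexibility I mean:

Code:
// Language-level transactional memory (GCC's experimental -fgnu-tm).
// The runtime, not the programmer, decides how the transaction is
// executed -- in software today, potentially on HTM hardware later.
static long shared_total = 0;

void add_sample(long value) {
    __transaction_atomic {
        shared_total += value;   // atomic with respect to other transactions
    }
}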
Edit :

As a side note, I will add that the topic title is misleading; according to the quote above, it seems that Phynaz is taking Intel's claims at face value, as usual....
 