Parallel Processing

Apathetic

Platinum Member
Dec 23, 2002
Does anyone do any work with or just mess around with parallel processing? It's always been an interest of mine and it looks like the field has a lot of potential for growth in the near future. I took a grad school course in it during the dark ages (about 18 years ago) and things have obviously changed a LOT since then.

What languages, books, web sites are out there now? Any Linux (Ubuntu) packages that you would recommend?

Dave
 

degibson

Golden Member
Mar 21, 2008
It's still a dark art, for the most part. There are several flavors to think about -- shared memory vs. message passing, control parallel vs. data parallel, etc. You're quite right in your observation that the field has a lot of growth potential (and will NEED to expand a lot to leverage multicore systems).

High-performance code is still in C/C++, and basically the only reason to parallelize most jobs is for performance... there are some parallel commercial apps in Java for various reasons, but they're often slower than parallel native codes.

As far as packages, you'd be surprised how little support is out there. On Linux, your choices are:
1) Pthreads, all in the same process (see the sketch after this list). Try 'man pthread_create'. Included in most Linux distros.
2) fork(), with any flavor of IPC you choose. Included in all Linux distros.
3) MPI. I don't know if there's an Ubuntu package for this or not. You may have to compile this one yourself.
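
To give you a feel for option 1, here's a bare-bones pthreads sketch (mine, not from any course materials): a few workers each sum a slice of an array, and the main thread joins them and combines the partial results. The thread count and array size are just placeholders; it assumes g++ with -lpthread on Linux.

// Minimal pthreads sketch: NUM_THREADS workers each sum a slice of a shared array.
// Build: g++ -O2 sum.cpp -o sum -lpthread
#include <pthread.h>
#include <cstdio>

const int NUM_THREADS = 4;
const int N = 1000000;

static double data[N];
static double partial[NUM_THREADS];

struct Args { int id; };

void* worker(void* p) {
    int id = static_cast<Args*>(p)->id;
    int chunk = N / NUM_THREADS;
    int lo = id * chunk;
    int hi = (id == NUM_THREADS - 1) ? N : lo + chunk;
    double sum = 0.0;
    for (int i = lo; i < hi; ++i)
        sum += data[i];
    partial[id] = sum;   // each thread writes only its own slot, so no lock is needed
    return 0;
}

int main() {
    for (int i = 0; i < N; ++i) data[i] = 1.0;

    pthread_t tid[NUM_THREADS];
    Args args[NUM_THREADS];
    for (int t = 0; t < NUM_THREADS; ++t) {
        args[t].id = t;
        pthread_create(&tid[t], 0, worker, &args[t]);   // explicit thread creation
    }
    double total = 0.0;
    for (int t = 0; t < NUM_THREADS; ++t) {
        pthread_join(tid[t], 0);                        // explicit thread destruction
        total += partial[t];
    }
    printf("total = %f\n", total);
    return 0;
}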

Have a look at this site, for a class that I was recently involved in teaching: http://pages.cs.wisc.edu/~davi...58/Fall2007/index.html
In particular, check out the homework assignments... it's meant to be intro-level parallel programming for folks who are already adept C/C++ programmers.

And, if you have any specific questions, post 'em!
 

Apathetic

Platinum Member
Dec 23, 2002
Thanks for the info! I've played with Pthreads before. I'm sure I can easily find a package for it. What are the advantages/disadvantages of Pthreads over OpenMP?

Dave
 

kamper

Diamond Member
Mar 18, 2003
I've read about some interesting parallel processing packages for Python recently. They mostly revolve around distributing work to multiple processes (possibly on multiple machines), because CPython can't actually execute multiple Python threads concurrently. A lot of people see that as a good thing because it really minimizes icky concurrency issues, although it does force you to break up your workload in a fairly blunt way. Guido van Rossum and Bruce Eckel had an interesting blogalogue on artima.com about it. I've never used the stuff myself, though.
 

degibson

Golden Member
Mar 21, 2008
Here's the skinny on pthreads vs OpenMP:

Pthreads use explicit thread creation and destruction, and you explicitly manage your stack frames in Pthreads (that is, every thread starts in its own function). It's pretty raw.
OpenMP is an interface, usually a wrapper around Pthreads, that hides the thread creation, destruction, and management. Mostly, OpenMP allows you to decorate loops with #pragmas to automagically make them parallel. Not necessarily correct, but parallel. Both interfaces have the 'thread' abstraction, and both use shared memory for communication.
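
To make that concrete, here's roughly what the loop-decoration style looks like -- a toy of my own, not from the slides. It assumes a compiler with OpenMP support (e.g. g++ -fopenmp, or Sun Studio with -xopenmp):

// Toy OpenMP example: parallelize a loop with a single pragma.
// Build: g++ -O2 -fopenmp saxpy.cpp -o saxpy
#include <cstdio>
#include <vector>

int main() {
    const int n = 1000000;
    std::vector<double> x(n, 1.0), y(n, 2.0);
    const double a = 3.0;

    // OpenMP creates a thread team and splits the iterations across it.
    // Each iteration here is independent, so the parallel version also
    // happens to be correct -- the pragma itself doesn't guarantee that.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] += a * x[i];

    printf("y[0] = %f\n", y[0]);
    return 0;
}

The equivalent pthreads version needs explicit create/join calls and manual loop splitting, which is the 'raw' part above.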

I gave a seminar on the OpenMP interface in 2006. You can find my slides here: http://pages.cs.wisc.edu/~gibson/talks/openmp.ppt

One thing I forgot to mention is Intel's TBB (threadingbuildingblocks.org/), which is everything including the kitchen sink as far as x86 parallel programming is concerned. They also have some handy tutorials. So do I, though my tutorial somewhat assumes that the reader is at UW-Madison: http://pages.cs.wisc.edu/~gibson/tbbTutorial.html
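
If you end up poking at TBB, the core idiom is parallel_for over a blocked_range with a functor body. Here's a rough sketch from memory (not taken from my tutorial); it assumes the TBB headers are installed and you link with -ltbb:

// Toy TBB example: functor-style parallel_for over a blocked_range.
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"
#include "tbb/task_scheduler_init.h"
#include <cstdio>
#include <vector>

struct Scale {
    double* data;
    double factor;
    void operator()(const tbb::blocked_range<size_t>& r) const {
        // TBB hands each worker thread a sub-range; we process it serially.
        for (size_t i = r.begin(); i != r.end(); ++i)
            data[i] *= factor;
    }
};

int main() {
    tbb::task_scheduler_init init;   // required by the TBB releases of this era
    const size_t n = 1000000;
    std::vector<double> v(n, 1.0);

    Scale body;
    body.data = &v[0];
    body.factor = 2.0;
    tbb::parallel_for(tbb::blocked_range<size_t>(0, n), body);

    printf("v[0] = %f\n", v[0]);
    return 0;
}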

So long as you can build, you can probably already use pthreads without any new packages. OpenMP requires compiler support; I think gcc has supported it for a while (via -fopenmp), but I'm not sure. I have always used Sun Studio when writing OpenMP codes.

Regarding Python options, there are also experimental Python interpreters that do allow multiple threads (PyPy? I don't recall which, but I've seen them in at least two projects). The problem with Python is that it's interpreted, and hence slower than native, and hence you're already giving up performance when performance is usually the only reason to parallelize in the first place (here, I assume you're not using P2C).
 

kamper

Diamond Member
Mar 18, 2003
Yeah, python would be slower, but you could have a situation where you're doing all the heavy lifting in a small bit of C and using python to write everything else. That is, use python for the stuff that's hard for the programmer but easy for the machine and C for the stuff that's hard for the machine.
 

degibson

Golden Member
Mar 21, 2008
I agree with Kamper -- if possible, use python for the hard parts and C for the high-performance parts. I use python a lot for productivity, when performance doesn't matter. I find it works great for launching jobs and analyzing results and such. Presumably, however, the parallel part is the high-performance part -- though you raise a good observation in that if you can segregate your task into multiple parallel /processes/, python is a great way to manage them.
 

Apathetic

Platinum Member
Dec 23, 2002
Thanks for the info (especially the link to TBB). I thought I saw Intel had released something interesting a while ago but, of course, I couldn't remember what they called it, and if you do a Google search for Intel and Threading you get back about eleventy billion results...

I've never used Python before. Maybe I should take this opportunity to learn about it.

Edit: I just had a chance to review the OpenMP slides. They were definitely what I was looking for. Thanks again!

Dave
 

lousydood

Member
Aug 1, 2005
Writing parallel programs in C or C++ is not fun. You can use things like OpenMP but they have serious limitations. You're basically restricted to un-nested, fork-join parallelism. You don't want to use pthreads to do parallel programming -- that is pure pain and little benefit. The state of the art is much better than that. Imperative programming languages force a sequential model and breaking out of that is a serious difficulty for those languages.

The cool new work is being done in functional programming languages. Nested data parallelism (as in NESL) is coming back into vogue. These are operators designed to work on vectors of data, distributing the work across multiple cores or using vector instructions, and the code is transformed so that parallel operations can be composed cleanly. My favorite language is Haskell (haskell.org), which has easy lightweight threads that can be scheduled transparently across multiple cores. It has parallel programming combinators which you can use to add parallelism with minimal fuss. And it has nice concurrency abstractions for explicit communication and even state management, like Software Transactional Memory, which lets you avoid the pain of multi-threaded programming with locks.

The Erlang folks have really nailed down a neat model for distributed message-passing computation, but that's another matter. It was designed in industry and is used there, though, which might be a plus for you.
 

degibson

Golden Member
Mar 21, 2008
Functional languages give me the heebie-jeebies. I admit that parallel imperative languages are really hard. The problem with functional languages is that non-imperative is unnatural to imperative programmers. Yes, it's true that imperative programmers think sequentially, and yes, explicit threading like pthreads and OpenMP has serious limitations. My take is that all languages are currently insufficient to express parallelism, and if any language ever becomes a clear winner, it will be something that feels imperative and acts functional.

As an aside, Software Transactional Memory is in no way tied to functional languages. Intel, for instance, has a C++ compiler available that does STM.
 

lousydood

Member
Aug 1, 2005
That is true; however, the original work on composable STM was published in this paper: http://research.microsoft.com/...npj/papers/stm/stm.pdf and implemented in the Glasgow Haskell Compiler. I think if you glance at the code snippets in there you'll find they have an "imperative feel" but act functional. Don't be intimidated by the fact that it was a conference paper -- Peyton Jones writes in a very accessible style.

Our starting point is this: a purely-declarative language is a perfect setting for transactional memory, for two reasons. First, the type system explicitly separates computations which may have side-effects from effect-free ones. As we shall see, it is easy to refine it so that transactions can perform memory effects but not irrevocable input/output effects. Second, reads from and writes to mutable cells are explicit, and relatively rare: most computation takes place in the purely functional world. These functional computations perform many, many memory operations -- allocation, update of thunks, stack operations, and so on -- but none of these need to be tracked by the STM, because they are pure, and never need to be rolled back. Only the relatively-rare explicit operations need be logged, so a software implementation is entirely appropriate.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Microsoft has made some good STM contributions (and I seem to recall earlier STM work, so we could debate the meaning of original and composable), but there are problems with STMs beyond just language type -- the biggest one is that they are weakly atomic. Sadly, a paper I read recently that addresses this issue doesn't appear to be available to the public yet -- look for it at ISCA 2008 -- but the problem can be solved with modest user-level hardware support. Anyway, thankfully STMs have been made for imperative languages as well.

You should also check out Sun's Rock and its compilers, which sport Hybrid HW/SW transactional memory (very modest HW support). I believe there is an ISSCC paper on it.
 

Apathetic

Platinum Member
Dec 23, 2002
Are there any other functional languages besides Erlang and Haskell I should take a look at? Not that I'm looking forward to experimenting with them (functional languages tend to make my little head hurt).

I write multi-threaded services at work and used to work on soft real-time systems (but that was a long time ago), so Pthreads isn't all that different from what I'm used to.

Dave
 

degibson

Golden Member
Mar 21, 2008
I don't know, Apathetic, I generally shy away from such things. I'm replying to give this thread a bump to try to get lousydood's attention again. =)
 

kamper

Diamond Member
Mar 18, 2003
Originally posted by: Apathetic
Are there any other functional languages besides Erlang and Haskell I should take a look at? Not that I'm looking forward to experimenting with them (functional languages tend to make my little head hurt).

I write multi-threaded services at work and used to work on soft real-time systems (but that was a long time ago), so Pthreads isn't all that different from what I'm used to.

Dave
Scheme would be fun. I read about half of The Little Schemer and never really wrote much code in Scheme, but it had a pretty noticeable effect on the way I think about more procedural languages. I know nothing about real-life concurrent programming with Scheme, but I found it fun to think about how you could implement it (an interpreter/compiler or whatever) to automatically parallelize stuff, so I think it's good for generally thinking about what it means for a function to have no side effects.

I don't know much about Erlang or Haskell, though, so for all I know they are much more appropriate tools for learning.
 

degibson

Golden Member
Mar 21, 2008
There's Lisp, of course... I have never used Lisp, and don't know anything about it except that it's functional.
 
Sep 29, 2004
Your problem needs to have a multithreaded solution. You can multithread sorting, for example, but not everything can be done in a multithreaded world. Just learn how to handle critical sections and how threads work and you are good to go.

EDIT: In short: learn the concepts. Try them in any language (even Java). There, you learned multi-threading. Just try doing a sort algorithm with a lot of sets. Sort 10 million numbers across 10 threads. Be sure to handle critical sections properly and start/stop the threads properly. It's not hard.
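
Something like this, as a rough sketch using pthreads and std::sort (the thread count and array size are just placeholders): each thread sorts a private chunk, so there's nothing shared to protect until the serial merge at the end.

// Sketch of the exercise: sort a big array across several threads.
// Each thread sorts its own chunk, then the main thread merges the
// sorted chunks pairwise.
// Build: g++ -O2 psort.cpp -o psort -lpthread
#include <pthread.h>
#include <algorithm>
#include <vector>
#include <cstdio>
#include <cstdlib>

const int NUM_THREADS = 10;

static std::vector<int> data;
static size_t bounds[NUM_THREADS + 1];   // chunk boundaries into 'data'

void* sort_chunk(void* arg) {
    size_t t = (size_t)arg;
    std::sort(data.begin() + bounds[t], data.begin() + bounds[t + 1]);
    return 0;
}

int main() {
    const size_t n = 10000000;            // "10 million numbers"
    data.resize(n);
    for (size_t i = 0; i < n; ++i) data[i] = rand();

    for (int t = 0; t <= NUM_THREADS; ++t)
        bounds[t] = n * t / NUM_THREADS;

    pthread_t tid[NUM_THREADS];
    for (size_t t = 0; t < NUM_THREADS; ++t)
        pthread_create(&tid[t], 0, sort_chunk, (void*)t);   // start the threads
    for (int t = 0; t < NUM_THREADS; ++t)
        pthread_join(tid[t], 0);                             // wait for every chunk

    // Serial pairwise merge of the sorted chunks.
    for (int t = 1; t < NUM_THREADS; ++t)
        std::inplace_merge(data.begin(), data.begin() + bounds[t],
                           data.begin() + bounds[t + 1]);

    bool ok = true;
    for (size_t i = 1; i < n; ++i)
        if (data[i - 1] > data[i]) { ok = false; break; }
    printf("sorted: %s\n", ok ? "yes" : "no");
    return 0;
}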
 

lousydood

Member
Aug 1, 2005
Originally posted by: degibson
Microsoft has made some good STM contributions (and I seem to recall earlier STM work, so we could debate the meaning of original and composable), but there are problems with STMs beyond just language type -- the biggest one is that they are weakly atomic. Sadly, a paper I read recently that addresses this issue doesn't appear to be available to the public yet -- look for it at ISCA 2008 -- but the problem can be solved with modest user-level hardware support. Anyway, thankfully STMs have been made for imperative languages as well.

You should also check out Sun's Rock and its compilers, which sport Hybrid HW/SW transactional memory (very modest HW support). I believe there is an ISSCC paper on it.

Hi. I haven't been able to find any proceedings for ISCA 2008; I suppose they haven't been published yet. Do you recall the author and title of the paper?

Here is a relevant paper I found from POPL'08: http://research.microsoft.com/...s/papers/2008-popl.pdf

STM definitely has taken off in non-functional languages, but there are some problems. Imperative languages rely heavily on side-effects, which can get hairy in the presence of replayable transactions. The previous paper I linked (Harris, et al 05) discusses this and why that makes Haskell a nice vehicle for STM. (BTW, I wasn't implying that paper was original in discussing STM, but rather, in discussing composable transactions).

Here are some choice quotes from the conclusion of Harris et al. '08:

We have demonstrated that imposing a strong language restriction, static separation of mutable state, lets us give the programmer the attractive behavior of the strong semantics even with a very permissive implementation.

Although separation is appealing in a functional setting, it is probably less palatable in an imperative language where most data is considered mutable, and would therefore require marshaling across the separation boundary.

 

degibson

Golden Member
Mar 21, 2008
Originally posted by: lousydood
Hi. I haven't been able to find any proceedings for ISCA 2008; I suppose they haven't been published yet. Do you recall the author and title of the paper?
I see that one of the authors (Craig Zilles) has now posted it on his page. You can find it at http://www-sal.cs.uiuc.edu/~zilles/:

Title: Using Hardware Memory Protection to Build a High-Performance, Strongly Atomic Hybrid Transactional Memory

Here is a brief summary of the STM side:
STMs find it quite hard to implement strong atomicity. If they are augmented with hardware fine-grained access control (i.e., read-only/no-access permissions at word rather than page granularity), however, strong atomicity becomes much easier to implement at low cost.


It's fairly obvious that lousydood is more on the 'language' side of STM, and I'm looking at it from a hardware perspective, admittedly (my research has nothing to do with TM, but I work closely with others in the field).

Basically, I'm hoping that the contributions of functional languages can eventually be migrated to some flavor of "well-behaved" imperative languages (this comes from a guy who nearly failed the only language class he ever took, mind you). I write enough parallel code that I would gladly 'restrict' my C++ wizardry if only it would give me an STM that doesn't choke.