Threading Problem

Cogman

Lifer
Sep 19, 2000
10,286
147
106
So, I'm trying to make a custom built thread pool object. My problem is, it crashes, unpredictably. I think I've narrowed down where the code is faulting and can't figure out why it it incorrect.

Here's how it works, the thread pool spawns 4 threads and passes into those threads a pointer to the threadpool class. each of those spawned threads will request jobs from the thread pool, and report back to the thread pool when their jobs are finished. The problem is, every so often the data that comes back from the thread pool will flip around (really weird).

This is the code I believe is where the problem is located.

typedef void(*Job)(void*);

struct Task
{
Job job;
void* input;
int jobNumb;
};

DWORD WINAPI thread(ThreadPool* input )
{
printf("Created Thread\n");
void* thisThread = GetCurrentThread();
while(!input->isDead())
{
Task task;
input->getJob(task);
if (task.job != NULL)
{
task.job(task.input);
input->finishJob(task.jobNumb);
}
else
{
SuspendThread(thisThread);
}
}
return 0;
}

void ThreadPool::getJob(Task& input)
{
EnterCriticalSection(&cs);
{
if(jobs.size() == 0)
{
input.job = NULL;
input.input = NULL;
}
else
{
input.job = jobs.front();
input.input = data.front();
input.jobNumb = tasks.front();
jobs.pop();
data.pop();
tasks.pop();
}
}
LeaveCriticalSection(&cs);
}

So what gives? Often it is jobNumb that will swap places with job, which of course causes a segfault when job is called as a function.

The threadpool class can be found
header
Source
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Don't make threads in your constructor for the threadpool. 'this' isn't valid until post-constructor... so passing 'this' is causing your first 'use' of 'this' to fail (aka, your call to getJob()). There's nothing wrong with getJob() itself, but the pointer you're implicitly passing into it (this) isn't valid.

Make a post-constructor ::Init() function, and make your threads there instead.
 

Venix

Golden Member
Aug 22, 2002
1,084
3
81
Using 'this' in the constructor is fine; after the initializer list completes, 'this' is guaranteed to point to a valid object. The only thing that won't work is virtual calls or casts to derived classes if the class is a base.

C++ FAQ Lite has additional information.

A few comments/questions:

- SuspendThread/ResumeThread are intended for debuggers, not for thread synchronization. Have you looked into events or (preferably) completion ports?
- Why does it sleep 10 ms after getting a job?
- Is there a reason for not initializing taskNum? I imagine debugging might be a little clearer if job numbers started at some sensible default, like 1.
- How did you determine that the data is getting flipped around? If it's by the printf statements, are you sure that you're using a thread-safe C runtime?
- You're using standard library containers, but the code isn't exception safe. Is that intended?

Your thread pool doesn't seem to offer much over QueueUserWorkItem. Have you looked into using that or a thin wrapper around it instead?
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Venix
Using 'this' in the constructor is fine; after the initializer list completes, 'this' is guaranteed to point to a valid object. The only thing that won't work is virtual calls or casts to derived classes if the class is a base.
Venix is right. I had a complication with a virtualized threadpool once... I incorrectly assumed this was the same issue. Of course 'this' has to be OK before you can do things to it. My bad.

 

Cogman

Lifer
Sep 19, 2000
10,286
147
106
Originally posted by: Venix
Using 'this' in the constructor is fine; after the initializer list completes, 'this' is guaranteed to point to a valid object. The only thing that won't work is virtual calls or casts to derived classes if the class is a base.

C++ FAQ Lite has additional information.

A few comments/questions:

- SuspendThread/ResumeThread are intended for debuggers, not for thread synchronization. Have you looked into events or (preferably) completion ports?
- Why does it sleep 10 ms after getting a job?
- Is there a reason for not initializing taskNum? I imagine debugging might be a little clearer if job numbers started at some sensible default, like 1.
- How did you determine that the data is getting flipped around? If it's by the printf statements, are you sure that you're using a thread-safe C runtime?
- You're using standard library containers, but the code isn't exception safe. Is that intended?

Your thread pool doesn't seem to offer much over QueueUserWorkItem. Have you looked into using that or a thin wrapper around it instead?

- SuspendThread/ResumeThread were used to keep the threads from keeping a CPU occupied. I was hoping for a better solution, Event based threading would be ideal, do you know where there is any reading on this?
- Umm, Yeah, the 10ms sleep was just to try and see if there where possibly race conditions going on, I forgot to remove it, that's all :)
- No reason for not initializing, no reason to initialize it. It is an arbitrary number to allow for callers to check on the progress of their specific task.
- It was through printf, and I'm fairly certain because I did with only one thread created for the thread pool, that thread was the only one printing out. I know it isn't thread safe, however, if the single thread is the only one printing out then there can't really be any conflict.
- I know I should worry about exceptions being thrown, but in my simple examples, there shouldn't be any exception that could be thrown.

Thanks for the input, I'll look up event based coding to see if that will do the trick. As for the QueueUserWorkItem, I've not heard about it, It might be more of what I want. Really, this is more of an exercise in threading for me. I figure if I can get a working thread pool, I should be able to handle most concurrency type programming :).
 

JasonCoder

Golden Member
Feb 23, 2005
1,893
1
81
Not directly on point but in the neighborhood...

Jeffrey Richter is the man on this stuff (especially on windows). You might get some good ideas from his Power Threading library. It's .Net but it's all Windows so the concepts are the same.
 

Venix

Golden Member
Aug 22, 2002
1,084
3
81
Looks like Larry Osterman has pretty much exactly what you need. He even explains the problems with the "signal all threads when work is ready" implementation.

I'm still not seeing anything obvious in your code that could cause a crash, and I built it and queued about 50 things on a quad-core machine and didn't have a problem. Do you have any test code that consistently causes the problem to occur?
 

Cogman

Lifer
Sep 19, 2000
10,286
147
106
Originally posted by: Venix
Looks like Larry Osterman has pretty much exactly what you need. He even explains the problems with the "signal all threads when work is ready" implementation.

I'm still not seeing anything obvious in your code that could cause a crash, and I built it and queued about 50 things on a quad-core machine and didn't have a problem. Do you have any test code that consistently causes the problem to occur?

Yep, its here

http://cboard.cprogramming.com...tiple-threads-main-cpp

Strangest thing, The problem doesn't seem readily apparent to me. When I get some time I'll try an even based system and see if that helps any (Even if it doesn't, I like the idea better. Thanks for the links :))
 

Venix

Golden Member
Aug 22, 2002
1,084
3
81
Job is defined as a cdecl function pointer returning void, but crossProduct_thread is stdcall (WINAPI) returning DWORD. Casting stdcall to cdecl and calling it will trash the stack, and I imagine that casting away the return type may also cause problems.
 

Cogman

Lifer
Sep 19, 2000
10,286
147
106
Originally posted by: Venix
Job is defined as a cdecl function pointer returning void, but crossProduct_thread is stdcall (WINAPI) returning DWORD. Casting stdcall to cdecl and calling it will trash the stack, and I imagine that casting away the return type may also cause problems.

Ahhhh, duh, I should have seen that. :) Thanks for the help, it was driving me nuts as I couldn't get it to fail other than in this case. Yep, changing calling standards will make a big difference :).

Thanks again, At least I know now that my thread pool isn't broken, my threaded function is.