
Thread scaling

beginner99

Diamond Member
Not sure if this is the right forum (the OS forum, maybe?), but I have a question about homework for a CS class.
It's about multithreading.

A server takes 15 ms to process a request for a file if it is in the cache; in 1/3 of the cases it takes an additional 75 ms to get the file from the disk. During that time the thread sleeps.

a) Requests per second with 1 server thread?
-> 15 ms + 1/3 * 75 ms = 40 ms per request -> 25 requests per second

b) with multiple threads?

Is there a formula for this? The only idea I came up with:
A thread sleeps 1/3 * 75 ms = 25 ms on average per request. So while the first thread sleeps, a second thread gets on average 25 ms of execution time. So we can execute 25 ms / 40 ms + 1 = 1.625 requests in the same time (40.625/s). Correct?

How do I generalize this for n threads?
 
I'm not sure; this seems more like statistics than CS.

What I can determine easily is requests per second with six or more threads. Why?
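For what it's worth, here is one way to see the six-thread claim (my own reasoning, not spelled out in the thread): every request needs 15 ms of CPU regardless of cache hits, so a single core can never exceed 1000/15 ≈ 66.7 requests per second; and once a sleeping thread's 75 ms nap can be fully covered by the other threads' CPU work ((n - 1) * 15 >= 75, i.e. n >= 6), the CPU never idles and that ceiling is actually reached. A quick check in Python:

```python
CPU_MS = 15   # CPU time every request needs
DISK_MS = 75  # extra sleep on a cache miss (1/3 of requests)

# A single core serves at most one 15 ms CPU slice at a time,
# so throughput can never exceed this ceiling:
ceiling = 1000 / CPU_MS  # ~66.67 requests per second

# A sleeping thread is fully "covered" when the other n-1 threads
# can fill its 75 ms nap with CPU work: (n - 1) * 15 >= 75.
n_saturating = DISK_MS // CPU_MS + 1  # = 6 threads

print(f"ceiling: {ceiling:.2f} req/s, reached at {n_saturating} threads")
```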
 
Hmm. Here is the issue that I see. If two threads request data that is not in the cache (i.e., it is on the disk), then they can't each reasonably expect the data to come back in 75 ms (this may not hold if the data is in the same locality on disk). Rather, the total time for both threads to get their data would be 150 ms in the worst case, because the disk has to seek from thread 1's data to thread 2's data.

On the other hand, if data is in the cache for one thread and on the disk for the other thread, it IS reasonable to expect the total time to be 75 ms because the cache thread will be able to get its data while the disk thread is waiting.

The next problem arises from the fact that most computers are multi-core. If you have 2 threads and 2 cores, it is reasonable to expect each thread to be able to get cached data in 15 ms. However, on a single core, the time it takes for each thread to get data from the cache is going to be highly dependent on how often the OS switches between threads. If we assume that the OS switches from one thread to the other instantly once data is retrieved, then we can assume each thread gets its data from the cache in ~15 ms. However, for an OS that doesn't switch instantly (i.e., every multithreaded OS ever made), the time becomes 15 ms plus OS switching time before each thread has seen its data.

Assuming instantaneous switching, your formula becomes something like
(2/3) * 15 + 1/3 * 75 * n

For n number of cores, this equation would morph into
(2/3)^(number of cores) * 15 + 1/3 * 75 * n

*Warning* I'm not great at statistics, so I could be WAY off. However, it seems to scale the way it should: more threads should mean longer request times. Still, this doesn't agree with what you came up with for n = 1.

Are you sure it is 15 + 1/3 * 75 and not 15 * 2/3 + 1/3 * 75? Generally, your probabilities should add up to 1. Think of the situation where disk speed equals cache speed. Your equation would give 15 + 1/3 * 15 = 20 ms average request time. Does that really make sense? If disk and cache speeds are the same, shouldn't the average request time be 15 ms for 1 thread?
 

Assume instant switching and a single core.
a) I should be right, because we solved it in class and the teacher confirmed.
A request always takes at least 15 ms, and in 1/3 of the cases 75 ms are added. Makes sense.

b)
It should be a curve that quickly declines with the number of threads: a second thread is a huge benefit, a third a small one, and after that it should get pretty marginal. That's what my common sense tells me, but I haven't found a reasonable formula (logarithmic?).
I also assume that if 2 threads access the disk, it has no influence on the time.

Things like this can drive me crazy, and in the end it's probably a stupid trick question or a very simple answer. 🙂
 
Are you sure it is 15 + 1/3 * 75 and not 15 * 2/3 + 1/3 * 75? Generally, your probabilities should add up to 1. Think of the situation where disk speed equals cache speed. Your equation would give 15 + 1/3 * 15 = 20 ms average request time. Does that really make sense? If disk and cache speeds are the same, shouldn't the average request time be 15 ms for 1 thread?
No, I thought so at first too, but he's right, because it is 75 ms of additional time. If you want probabilities that sum to 1, then it is:
15 * 2/3 + 1/3 * 90 = 15 * 2/3 + 1/3 * (15 + 75) = combine the 1/3 * 15 with the 2/3 * 15 and you get what he said.


I'll be using all the assumptions above (single core, ignoring disk seek and context-switching times); that's quite OK for homework anyway.

While your reasoning seems OK for the 2-thread case and can easily be generalized (*), I tried writing a simulation, and I get about 44.2 requests per second compared to 40.625. The run-to-run variation is tiny (44.22 to 44.23), so the difference is definitely statistically meaningful. My program could be wrong, though, so perhaps someone else could try writing one; it's not hard.

(*) A thread sleeps for 25/40 = 5/8 of the time. The probability that with 2 threads neither is running is p = (5/8)^2. So for 1-p of the time, the server is actively running at least one request. Over 1 s = 1000 ms, if it runs for (1-p) * 1000 ms it will complete (1-p) * 1000/15 requests, which is 40.625, the same as you got. The generalization is easy: the probability that none of n threads is running is (5/8)^n.
It seems correct to me, but I also can't see where my simulation is wrong, and the results definitely don't match...
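The closed-form model in the (*) paragraph, written out as a small function (this just encodes that independence assumption, so it reproduces the 25 and 40.625 figures; as the simulation mismatch shows, it is not necessarily the exact answer):

```python
def throughput(n, cpu_ms=15, avg_sleep_ms=25, avg_total_ms=40):
    """Requests/s if each of n threads independently sleeps
    25/40 = 5/8 of the time; the CPU then idles with
    probability (5/8)**n and works the remaining fraction."""
    p_idle = (avg_sleep_ms / avg_total_ms) ** n
    return (1 - p_idle) * 1000 / cpu_ms

print(throughput(1))  # 25.0, matching part (a)
print(throughput(2))  # 40.625, matching the 2-thread estimate
```

Note that throughput(n) approaches the 1000/15 ≈ 66.7 ceiling as n grows, which at least matches the "six or more threads" intuition qualitatively.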
 

Hey, thanks. Seems very reasonable, and as expected I was so lost I didn't see the not-so-complex solution.

About the simulation: how can you ensure that the threads you create all run on a single core? I strongly assume your PC has 2 or more cores, and hence the result of the simulation is only close to the theoretical result by accident.
 
My simulation is statistical: the threads aren't real threads, just variables tracking their states. I'd rather not give too many details, as that may sway people toward my own (possibly incorrect) solution, and it's also slightly convoluted without additional explanation.

Basically, think of it as having a timer, advancing it by 15 every tick and checking whether there's a thread that's not sleeping. Keep the thread states in some variables (whether it can run, and if it's waiting, how much longer, etc.) and keep a count of completed requests.

E.g. one possible sequence is:
time=0 : both threads ready, take 1st and with prob. 2/3 service the request. say it doesn't have to wait for disk
time=15 : 1st thread finished, 1 request done, say that now it has to wait for disk, select 2nd thread and say it can run
time=30 : 2nd thread finished, 2 requests done, 1st thread unavailable for 60 more ms, 2nd ready to run, say it also has to wait for disk this time
time=45 : 1st thread still has 45ms to wait, 2nd thread 60ms
time=60 : 1st thread still has 30ms to wait, 2nd thread 45ms
...
time=90 : 1st thread has finished waiting, server proceeds to complete the request, 2nd thread 15ms waiting
time=105: 1st thread again ready to run, 3 requests done so far, 2nd thread waiting for the 1st to be stalled to complete its request as the data from disk came back

I run this until timer > some giant number and count the number of completed requests.
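For comparison, here is a minimal event-driven sketch of the same kind of statistical simulation (my own version, not the poster's program; the names and structure are mine). It assumes a single core, no disk contention, and models a request as 15 ms of CPU followed, with probability 1/3, by a 75 ms sleep; the CPU always serves the thread that became ready earliest:

```python
import random

def simulate(n_threads, sim_time_ms=2_000_000, cpu_ms=15,
             disk_ms=75, p_miss=1/3, seed=1):
    """Return requests/s for n threads sharing one CPU core.
    ready_at[i] is the earliest time thread i can use the CPU;
    disk sleeps overlap freely (no disk contention)."""
    random.seed(seed)
    ready_at = [0.0] * n_threads
    cpu_free = 0.0  # time at which the CPU becomes available
    completed = 0
    while True:
        # Serve the thread that has been ready the longest.
        i = min(range(n_threads), key=ready_at.__getitem__)
        start = max(ready_at[i], cpu_free)  # wait for thread and CPU
        finish = start + cpu_ms             # the CPU part of the request
        if finish > sim_time_ms:
            break
        cpu_free = finish
        completed += 1
        # On a cache miss the thread sleeps while the disk fetch runs.
        ready_at[i] = finish + (disk_ms if random.random() < p_miss else 0.0)
    return completed / (sim_time_ms / 1000.0)

for n in (1, 2, 3, 6):
    print(n, round(simulate(n), 2))
```

Under these assumptions, one thread comes out at 25 requests/s and six threads hit the 1000/15 ≈ 66.7 ceiling; the intermediate values are sensitive to modelling details (for example, whether the disk wait comes before or after the CPU slice), which may be exactly where the 44.2-vs-40.625 disagreement comes from.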
 