Computer science people - question for you.

mjia · Dec 22, 2004

Is it? I haven't touch this "select" methord, so I won't know...perhaps it would be interesting to learn about...anyone have any links?

Kev · Dec 22, 2004

nevermind i stink

oboeguy · Dec 22, 2004

Originally posted by: mjia
Is it? I haven't touch this "select" methord, so I won't know...perhaps it would be interesting to learn about...anyone have any links?

Here ya go

A selection algorithm chooses the kth smallest of a list of numbers; this is an easier problem in general than sorting. One simple but effective selection algorithm works nearly in the same manner as quicksort, except that instead of making recursive calls on both sublists, it only makes a single tail-recursive call on the sublist which contains the desired element.

Thank you, thank you. I'm here 'til Thursday!

How's that for remembering CS class from 1995?

TheLonelyPhoenix · Dec 22, 2004

Bubble sort would probably be fastest for this.

The algorithm: Compare last two elements in the array. If the first is lower than the second, leave as-is, otherwise, flip them. Do the same with second-to-last and third-to-last elements, then the third-to-last and fourth-to-last elements, etc through the entire array. In this way, the lowest value will "bubble" to the top of the array with every pass. Once you complete an entire pass through the array without switching any values, you know the whole thing is sorted.

Now, to make this work for your application, you'll want a reverse bubble sort, or a "sinking sort" if you need a cute name for it. Start at the top of the array instead and move downward for a pass, always switching values to keep the lower number on top. This will send the greatest number in the array to the bottom. After 10 passes, you know the 10 highest numbers in the array will be at the bottom of it. This is an O(n) algorithim as opposed to O(n^2) (since the time taken is only dependent on the size of the array and not the number of passes needed, which is constant).

Edit: And yes, this beats out n*log n as well. Everyone who said "insert favorite O(n log n) algorithm here" just got pwned by a sophomore electrical engineer with a CS minor. Boo-ya.

Train · Dec 22, 2004

this thread is proof that there are a ton of smart programmers here at AT, we need a programming forum damnit!

sao123 · Dec 22, 2004

Bubble sort would probably be fastest for this.

The algorithm: Compare last two elements in the array. If the first is lower than the second, leave as-is, otherwise, flip them. Do the same with second-to-last and third-to-last elements, then the third-to-last and fourth-to-last elements, etc through the entire array. In this way, the lowest value will "bubble" to the top of the array with every pass. Once you complete an entire pass through the array without switching any values, you know the whole thing is sorted.

Now, to make this work for your application, you'll want a reverse bubble sort, or a "sinking sort" if you need a cute name for it. Start at the top of the array instead and move downward for a pass, always switching values to keep the lower number on top. This will send the greatest number in the array to the bottom. After 10 passes, you know the 10 highest numbers in the array will be at the bottom of it. This is an O(n) algorithim as opposed to O(n^2) (since the time taken is only dependent on the size of the array and not the number of passes needed, which is constant).

Edit: And yes, this beats out n*log n as well. Everyone who said "insert favorite O(n log n) algorithm here" just got pwned by a sophomore electrical engineer with a CS minor. Boo-ya.

I think i said this 7 posts before you. But thanks for the confirmation... and I'm a CS major.

TheLonelyPhoenix · Dec 22, 2004

Originally posted by: sao123

Bubble sort would probably be fastest for this.

The algorithm: Compare last two elements in the array. If the first is lower than the second, leave as-is, otherwise, flip them. Do the same with second-to-last and third-to-last elements, then the third-to-last and fourth-to-last elements, etc through the entire array. In this way, the lowest value will "bubble" to the top of the array with every pass. Once you complete an entire pass through the array without switching any values, you know the whole thing is sorted.

Now, to make this work for your application, you'll want a reverse bubble sort, or a "sinking sort" if you need a cute name for it. Start at the top of the array instead and move downward for a pass, always switching values to keep the lower number on top. This will send the greatest number in the array to the bottom. After 10 passes, you know the 10 highest numbers in the array will be at the bottom of it. This is an O(n) algorithim as opposed to O(n^2) (since the time taken is only dependent on the size of the array and not the number of passes needed, which is constant).

Edit: And yes, this beats out n*log n as well. Everyone who said "insert favorite O(n log n) algorithm here" just got pwned by a sophomore electrical engineer with a CS minor. Boo-ya.

Click to expand...

I think i said this 7 posts before you. But thanks for the confirmation... and I'm a CS major.

Yeah, I went back and read the thread more thoroughly after I posted and saw that you had said essentially the same thing. There's been a lot of armchair philosophy going on in this thread about sorting algorithms, so its hard to seperate out all the drivel.

Edit: chuckywang appears to have had it as well.

gwlam12 · Dec 22, 2004

Wow. Some of you aren't very modest.

RaynorWolfcastle · Dec 22, 2004

I kind of skimmed through the begining of the thread with everyone arguing over the order of the algorithm, but I had another thought.
Out of curiosity, is anything known about the random numbers? Is there a known upper and lower bound maybe? Is the distribution, for example (uniform, or gaussian about the middle maybe?). If the array really is large and something is known about the random numbers, you may be able to go through a "rough pass" using a probabilistic method, and then run a sort on a small dataset (say 200 elements). If nothing is known about the distribution you might still be able to pull something like this off, but it would require a little more thinking; maybe you could take a 100 sample points then calculate a mean and standard deviation before walking through the array.

I think this is the best way to go about it, if your data lends itself to this sort of algorithm.

TheLonelyPhoenix · Dec 22, 2004

Originally posted by: gwlam12
Wow. Some of you aren't very modest.

Modesty is for liberal arts majors.

mjia · Dec 22, 2004

Is there a practical use for searching for the single most efficient algorithm? Are these concerns mostly academic?

Don't get me wrong, I'm all for finding the best solution (I'm something of a perfectionist, and can fully appreciate the beauty of a well designed program). I'm just curious. I think you mentioned that the array is only in the order of hundreds in terms of size. If the values of these elements can be access easily (eg. they are primitives like ints), then there is no practical different in time if the algorithm is only used once (as you've said).

oboeguy · Dec 22, 2004

Originally posted by: TheLonelyPhoenix

Originally posted by: gwlam12
Wow. Some of you aren't very modest.

Click to expand...

Modesty is for liberal arts majors.

Word. What's the point of modesty here anyway? 🙂

Ameesh · Dec 22, 2004

just a curious side note i gave this question in an interview today and we worked on optimizing the linear solution: At this point O(n) is fine but big O represents the worst case not the avg case which is theta and as far as the number 10 goes if we generalize it to an input variable the best solution we got was n * log m which is clearly better then n * m

EngenZerO · Dec 22, 2004

i would use mergesort cause of its avg running time of nlgn for a true random array then just axs your top ten elements...

how fast is the requirement?

chuckywang · Dec 22, 2004

Originally posted by: Ameesh
just a curious side note i gave this question in an interview today and we worked on optimizing the linear solution: At this point O(n) is fine but big O represents the worst case not the avg case which is theta and as far as the number 10 goes if we generalize it to an input variable the best solution we got was n * log m which is clearly better then n * m

EDIT: You could be right. Do you have an outline of this n*log m algorithm?

Originally posted by: TheLonelyPhoenix

Edit: And yes, this beats out n*log n as well. Everyone who said "insert favorite O(n log n) algorithm here" just got pwned by a sophomore electrical engineer with a CS minor. Boo-ya.

I'm a senior electrical engineer, thanks.

SinNisTeR · Dec 22, 2004

Originally posted by: chuckywang
Picking the ith element out of a set of n number is an O(n) operation. Just pick the 10 biggest elements. Your complexity is 10*O(n) which is the same as O(n).

Another easy algorithm is just to make an 10 element array that keeps track of the 10 biggest elements, and just iterate through the 100 elements, updating the 10 element array when necessary.

ditto

blahblah99 · Dec 22, 2004

Winner: a 10-iteration bubble sort. THANKS EVERYONE!

her209 · Dec 22, 2004

Originally posted by: blahblah99
Winner: a 10-iteration bubble sort. THANKS EVERYONE!

That give you mO(n).

RaynorWolfcastle · Dec 23, 2004

I still think a probabilistic solution is better for a really large dataset if anything at all is known about the nature of the variables (and unless you're really using completely random numbers there's a good chance you know something about the variables). If you know the distribution you can probably work out an algorithm that will be n+10*O(k) where k would be a fairly small number (say 5% of n). I guess I don't see going through a very large array 10 times as being terribly efficient, but if it does the trick then so be it. 🙂

mjia · Dec 23, 2004

Ya, I agree that looking at the statistics is a good approach. I would use some technique to pick the pivots wisely.

hans007 · Dec 23, 2004

Originally posted by: blahblah99
Say I have an array of 100 elements with random numbers, and only want to keep the 10 highest numbers. What's the best way?

I don't want to sort through the whole array then pick the 10 highest, since the array can contain a million elements. Is there a variation of the sort algorithm?

TIA.

wouldnt you just sort it like normal , and then take the 10 on one side. they'd alreayd be sorted.

TheLonelyPhoenix · Dec 23, 2004

Originally posted by: chuckywang

Originally posted by: TheLonelyPhoenix

Edit: And yes, this beats out n*log n as well. Everyone who said "insert favorite O(n log n) algorithm here" just got pwned by a sophomore electrical engineer with a CS minor. Boo-ya.

Click to expand...

I'm a senior electrical engineer, thanks.

Hey, I pointed out that you got it too. 🙂 Go up a few posts.

IHateMyJob2004 · Dec 23, 2004

No comments on my procedure? Sorting 10 elements 10 times is faster than sorting 100 elements once. Then selection is pretty much linear. The solution to the problem is not using a known sort routine. It has to be something outside of the box or a modiifed version/mix of other sorting algorithm(s).

This is essentially a modified bubble sort followed by some simple loops afterwards.

Originally posted by: IHateMyJob2004
(1)Oh, break set of 100 elements into 10 seperate arrays(memcpy is fastest method) and sort each (bubblesort?).

(2)
Then take arrays 1,2 and find the ten lowest numbers and place them in a third array. Repeat this for 3 through 10.

(3)
now you have 5 arrays

(4)
Use simlar process as in (2) to end up with 3 arrays of size 10.

(5)
Use simlar process as in (4) to end up with 2 arrays of size 10.

(6)
Use simlar process as in (5) to end up with 1 arrays of size 10.

Step (6) will give you the lowest 10 numbers.

IHateMyJob2004 · Dec 23, 2004

Originally posted by: blahblah99
Winner: a 10-iteration bubble sort. THANKS EVERYONE!

Oh, I guess someone did like my idea!

Yayyyy, me!!!!!!!!

PS: I am currently seeking a new software gig in CT.

4.5 years relelvent experience.

PM me with openings 😉

IHateMyJob2004 · Dec 23, 2004

Originally posted by: IHateMyJob2004
(1)Oh, break set of 100 elements into 10 seperate arrays(memcpy is fastest method) and sort each (bubblesort?).

(2)
Then take arrays 1,2 and find the ten lowest numbers and place them in a third array. Repeat this for 3 through 10.

(3)
now you have 5 arrays

(4)
Use simlar process as in (2) to end up with 3 arrays of size 10.

(5)
Use simlar process as in (4) to end up with 2 arrays of size 10.

(6)
Use simlar process as in (5) to end up with 1 arrays of size 10.

Step (6) will give you the lowest 10 numbers.

*smacks head*
(1)
In regards to step one.
Not bubblesort. Use quick sort or merge sort! quick sort is theoretically faster but for 10 elements, I'd just use merge sort. Sorry....been a long time since I had to remember sorting algorithm names.

Computer science people - question for you.

Member

Lifer

Diamond Member

Diamond Member

Lifer

Lifer

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Member

Diamond Member

Lifer

Diamond Member

Lifer

Diamond Member

Platinum Member

No Lifer

Diamond Member

Member

Lifer

Diamond Member

Lifer

Lifer

Lifer