
Question for you Comp Science folks! (Round 2)

I think it's slightly less than 10 million comparisons.

For one pass, you are doing 1 million − 1 comparisons,

because it's 500K + 250K + 125K + 62.5K + ..., which sums to just under 1 million.

The next pass you are doing 1 million − 2 comparisons, and so on.

So your end result should be 10 million − 55 comparisons, i.e. 9,999,945.
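Checking that arithmetic with a quick snippet (the pass-counting convention is my reading of the above):

```python
# Pass k of the 10 passes skips the k-1 already-placed elements,
# so it makes 10**6 - k comparisons.
total = sum(10**6 - k for k in range(1, 11))
print(total)  # 9999945, i.e. 10 million minus 55
```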

 
I think the original thread had all of our ideas in there.

I'm curious now, OP: what is this great algorithm of yours that beats the 10-pass bubble sort?

 
If you know the range of the numbers (and it isn't too sparse), a counting sort can do it with a single pass through the 1-million-item array (without any comparisons between elements!!!), plus a handful of comparisons to pick out only the large elements that actually exist.

Of course, if you're talking about the full range of all valid integers, then the linked-list system here is probably best. But if you're talking about 1 million entries with numbers in a 0–5 million range, I bet counting sort would use fewer comparisons.
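A minimal sketch of that counting idea, assuming the 0–5 million range mentioned above (the function name is mine, not from the thread): one histogram pass, then a walk down from the top of the range.

```python
def top10_counting(values, max_value):
    """Top 10 via counting: one histogram pass over the data (no
    element-vs-element comparisons), then a walk down from the top
    of the value range collecting the first 10 occupied slots."""
    counts = [0] * (max_value + 1)
    for v in values:                     # single pass through the array
        counts[v] += 1
    result = []
    for v in range(max_value, -1, -1):   # highest possible value first
        if counts[v]:
            result.extend([v] * min(counts[v], 10 - len(result)))
            if len(result) == 10:
                break
    return result
```

The downward walk stops as soon as 10 occupied slots near the top of the range are found, so on dense data it touches only a sliver of the counts array.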
 
The exact comparison complexity of the selection problem is still open,
but Blum, Floyd, Pratt, Rivest, Tarjan, and company came up with an algorithm that runs in Θ(n) time.
It's a divide-and-conquer algorithm based on a modified quicksort; their ingenuity lies in the picking of the pivot values, which is what guarantees Θ(n) time.
The number of comparisons in the original algorithm is ~16n; however, I believe it has since been brought down to ~3n.

Anyway, the lower bound on the problem is 2.0000000000000000001n comparisons, and Θ(n) is the best you can do.
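For the curious, the partition-based selection being described can be sketched like this (a random pivot for brevity, which gives expected rather than worst-case linear time; the median-of-medians pivot rule is what makes the worst case linear — the function name is mine):

```python
import random

def kth_largest(a, k):
    """Value of the k-th largest element via partition-based selection.
    A random pivot gives expected Theta(n); swapping in the BFPRT
    median-of-medians pivot rule would make the worst case linear too."""
    while True:
        pivot = random.choice(a)
        gt = [x for x in a if x > pivot]          # strictly above the pivot
        eq_count = sum(1 for x in a if x == pivot)
        if k <= len(gt):
            a = gt                                # answer lies above the pivot
        elif k <= len(gt) + eq_count:
            return pivot                          # the pivot is the answer
        else:
            k -= len(gt) + eq_count               # discard gt and the pivots
            a = [x for x in a if x < pivot]
```

With the 10th-largest value in hand, one more pass over the array picks out the top 10 (minding ties at the threshold).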
 
I still say Radix Sort will be satisfactory for this problem. Just... slightly modified. I doubt it would be faster than whatever the guys JayHu mentioned have come up with, but it's what came to mind and I want to tell y'all about it. Allow me to explain.

Radix Sort sorts based on each character/digit of a sequence. So in an array of five 3-digit numbers it examines the hundreds place first, then the tens, then the ones, lumping the numbers together into sub-groups each time. For a task like this, just using a raw Radix Sort would be fast, but there's a certain logic that allows some optimizations to be made to the algorithm. All you're looking for here is the 10 highest numbers. Let me try to explain this....

Say you have an array of 7 digit numbers of size n.

Okay, first digit. The Radix sort gets them all organized by the 10^6 magnitude. (Or whatever the hell the term is. i.e. numbers starting with 9 grouped together, then 8, so on and so forth)

Second digit. The Radix sort starts sorting, but once it completes sorting in each grouping made by the previous digit -- e.g. finishes sorting everything starting with a 9 -- it checks to see if it's sorted at least 10 entries. If it has, it stops and moves on to the next digit, discarding all other numbers. If it has sorted EXACTLY 10 entries it stops completely and gives you that grouping of numbers. (And maybe sorts it real quick, if you really want)

Third digit. Exact same thing as above.

And so on and so forth.


Does that make any sense?

Like I said, I don't think it'll be better than what the geniuses came up with, but I had to mention it.
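Here's roughly how that could be coded up (a hypothetical sketch, names mine; the recursion makes sure digit comparisons only ever happen between numbers that already agree on all higher digits):

```python
def top_k_msd(values, k, pos):
    """Top k of `values`, all of which agree on every digit above
    position `pos`. Bucket by the digit at `pos`, then take from the
    highest buckets downward, recursing on the next digit as needed."""
    if k <= 0 or not values:
        return []
    if pos < 0 or len(values) <= k:
        return sorted(values, reverse=True)[:k]
    scale = 10 ** pos
    buckets = [[] for _ in range(10)]
    for v in values:
        buckets[(v // scale) % 10].append(v)
    out = []
    for d in range(9, -1, -1):          # highest digit group first
        if len(out) == k:
            break                       # every lower bucket is pruned unread
        out.extend(top_k_msd(buckets[d], k - len(out), pos - 1))
    return out

def top10_msd_radix(values, digits=7):
    """Top 10 of numbers with at most `digits` decimal digits."""
    return top_k_msd(list(values), 10, digits - 1)
```

Each level discards every bucket below the cutoff without ever looking at it again, which is the pruning described above.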
 
Originally posted by: yukichigai
I still say Radix Sort will be satisfactory for this problem. Just... slightly modified. [...]

Right, but radix sort relies on knowing what the top value is, which takes Θ(n) time, right?
Since if I wanted to radix sort, I'd have to know how high to start, and finding that maximum actually requires Θ(n).
So after that, for large n, I do the radix sort, and each pass reduces my set to about 1/10 of its original size (on average — but again, the worst case sucks!).

 
Originally posted by: JayHu
Right, but radix sort relies on knowing what the top value is, which takes Θ(n) time, right?
Since if I wanted to radix sort, I'd have to know how high to start, and finding that maximum actually requires Θ(n).
So after that, for large n, I do the radix sort, and each pass reduces my set to about 1/10 of its original size (on average — but again, the worst case sucks!).
I was assuming the maximum possible value of the numbers in the list was known.

Without that you could attempt to do a Radix Sort starting from the last digit, instead of the first. I'd have to think about that a little harder to come up with something, but my brain's tickling like there's something there.
 
Originally posted by: her209
Are we ever going to get an answer or did we get Syringed!?

I don't know.
I doubt he can beat guaranteed linear time anyway. If he does, he'll be very, very, very famous.
 
I am currently working on an "Intelligent Bubblesort Algorithm".

I believe that, in theory, there are void areas of the array that do not need searching on the pass after a given bubblesort pass. Based on this, I believe I can make 9 of the 10 bubblesort passes partial passes, after completing the first full pass, thereby eliminating a significant percentage of comparisons during each loop.


I will post more as I complete the theory and a working algorithm.

Edit: I don't expect a <1-second search, but I do expect 20–40% faster results than 10 passes of the plain bubblesort.
 
Here is the theory behind the intelligent bubblesort.

Theory: after a given pass, there may be a region of the array that does not need to be searched, because during the last pass the largest value in that region was moved to the smallest index in that region. The algorithm tracks the original position of the last moved item in a local variable.


There are 3 cases.


Case 1:
Given: let X be the index of the largest number in the list.
P is the pass currently being performed, used to prevent reprocessing already-sorted numbers.
The first pass of the sort must use case 1 with P = 0.

Theory: P < X < array_size.
Next pass is case 2.



Case 2:
Given: X was the index of the largest number in the list from the last pass.
Theory: the next largest number (NLN) in this pass has index I with P < I <= X+1. (The region R > X+1 is a void area for searching.)

Result A (worst case): NLN index = X+1. (No further void area is present.)
Next pass is case 1.

Result B: NLN index satisfies P < NLN < X+1.
Let Z be the index of the largest number in the list, with Z < X.
Next pass is case 3.



Case 3:
Given: X and Z have values where Z < X.
Theory: NLN has index NLN > X or P < NLN <= Z+1. The region Z < R < X is void for searching.

Result:
Let X be the index of the largest number in the list.
If X = Z, next pass is case 1; otherwise next pass is case 2.


Currently testing the coded algorithm. Anyone interested, PM me.
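The case analysis above is in the spirit of the classic "last swap" bubble-sort optimization: record where the final swap of a pass happened, and skip the region beyond it on the next pass. A sketch of ten max-to-front passes using that trick (this is my reconstruction, not the poster's exact cases):

```python
def top10_bubble(a):
    """Ten bubble passes, each carrying the current maximum of the
    unsorted region to the front. hi_swap records the highest index
    where a swap occurred; the region beyond it is already in order
    relative to its neighbors, so the next pass starts just past it."""
    a = list(a)
    start = len(a) - 1                 # highest index the sweep must visit
    for p in range(10):
        hi_swap = p                    # sentinel: no swap seen yet
        for i in range(start, p, -1):  # sweep right-to-left toward slot p
            if a[i] > a[i - 1]:
                a[i], a[i - 1] = a[i - 1], a[i]
                hi_swap = max(hi_swap, i)
        if hi_swap == p:
            break                      # no swaps: a[p:] is already sorted
        start = min(hi_swap + 1, len(a) - 1)
    return a[:10]
```

Each pass only sweeps up to just past the previous pass's highest swap, which is the "void area" saving described above.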
 
I'll assume the array is pre-sorted in descending order and then, using intuition, pick the first 10 numbers. I win!

techfuzz
 