MergeSort in place on Linked Lists

ScythedBlade

Member
Sep 3, 2006
56
0
0
So, I was looking over the Java API regarding Collections.sort() on linked lists, and it says it dumps linked lists into an array so that it can call merge sort. Because, otherwise, sorting a linked list in place would lead to a complexity of O(n^2 log n) ... can someone explain how this happens (The n^2 log n complexity)?
 

imported_Dhaval00

Senior member
Jul 23, 2004
573
0
0
I thought Merge Sort was n.log(n). I could be wrong. One explanation could be that the actual Merge Sort takes n.log(n) while the linked list data structure adds to the complexity and makes the entire operation n^2log(n).
 

Argo

Lifer
Apr 8, 2000
10,045
0
0
You sure it says linked list and not just list? In place merge sort on an ArrayList should be n * (n * log(n)) due to constant reallocations required during merge phase.

However, on doubly LinkedList it should be only n * log(n) since insertion can be done in constant time. Looks like it could be a typo in the javadoc, perhaps they meant general lists.
 
Sep 29, 2004
18,656
67
91
Have you DLed the Java source yet? That could hold your answer.

Here is the API comment:
The sorting algorithm is a modified mergesort (in which the merge is omitted if the highest element in the low sublist is less than the lowest element in the high sublist). This algorithm offers guaranteed n log(n) performance. This implementation dumps the specified list into an array, sorts the array, and iterates over the list resetting each element from the corresponding position in the array. This avoids the n2 log(n) performance that would result from attempting to sort a linked list in place.
 

lousydood

Member
Aug 1, 2005
158
0
0
Sounds wrong. It should be easy to merge-sort even singly-linked lists in worst-case O(n*log(n)) time and constant space, if you can modify the references, which is the case in Java.
 
Sep 29, 2004
18,656
67
91
I tried looking over the Java source code today. I pulled up the source to Collections and I could not find the sort() method. I didn't have time to spend on the issue though so I stopped. Anyone else look at the source yet?

Collections:
public static void sort(List list) {
Object a[] = list.toArray();
Arrays.sort(a);
ListIterator i = list.listIterator();
for (int j=0; j<a.length; j++) {
i.next();
i.set(a[j]);
}
}

which calls Arrays:
public static void sort(Object[] a) {
Object aux[] = (Object[])a.clone();
mergeSort(aux, a, 0, a.length, 0);
}

which calls:
private static void mergeSort(Object src[], Object dest[],
int low, int high, int off) {
int length = high - low;

// Insertion sort on smallest arrays
if (length < INSERTIONSORT_THRESHOLD) {
for (int i=low; i<high; i++)
for (int j=i; j>low &&
((Comparable)dest[j-1]).compareTo((Comparable)dest[j])>0; j--)
swap(dest, j, j-1);
return;
}

// Recursively sort halves of dest into src
int destLow = low;
int destHigh = high;
low += off;
high += off;
int mid = (low + high) >> 1;
mergeSort(dest, src, low, mid, -off);
mergeSort(dest, src, mid, high, -off);

// If list is already sorted, just copy from src to dest. This is an
// optimization that results in faster sorts for nearly ordered lists.
if (((Comparable)src[mid-1]).compareTo((Comparable)src[mid]) <= 0) {
System.arraycopy(src, low, dest, destLow, length);
return;
}

// Merge sorted halves (now in src) into dest
for(int i = destLow, p = low, q = mid; i < destHigh; i++) {
if (q >= high || p < mid && ((Comparable)src[p]).compareTo(src[QQ])<=0)
dest = src[p++];
else
dest = src[q++];
}
}

---------------
At this point, just use an IDE to look this over. There are more methods that you might want to look at. I think some of the code got reformatted thanks to brackets so you might want to just DL the source and look it over.

The source code:
http://www.sun.com/software/co...2se/java2/download.xml
You need to login to DL it.
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
The last time I went over sorting in Java was back with 1.4.2. Merge sort would split the list into chunks < 7 elements long, perform an insertion sort on each one, and then merge the results back together. So instead of recursively splitting your list until you have 1 element in each one, it stops once the lists are sufficient enough to run a quick insertion sort on it. I'm not sure if they treat things any differently if it's a LL or not, but if the List gets dumped into an array that's your extra n(according to the OP) then the first thing you have to do in the algorithm is traverse the entire List taking each element and putting it into the array.

 

lousydood

Member
Aug 1, 2005
158
0
0
Now I see why Sun was reluctant to open source their compiler. Anyway...

Performing insertion-sort at 7 elements is ok. That's a constant-time operation, asymptotically, and doesn't change the overall recurrence for merge-sort significantly.

Copying the list into an array causes O(n) space usage. This is clearly unnecessary, since it is well known that linked lists can be merge-sorted (destructively) in constant space. However, on small lists, locality of reference may overcome this. On the other hand, allocation has its overhead problems too.

Crusty, putting the list into an array first won't result in the extra factor of n. It will be O(n + n*log(n)) and of course that can be simplified back to O(n*log(n)). The API comment is simply incorrect.

 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
Originally posted by: lousydood
Now I see why Sun was reluctant to open source their compiler. Anyway...

Performing insertion-sort at 7 elements is ok. That's a constant-time operation, asymptotically, and doesn't change the overall recurrence for merge-sort significantly.

Copying the list into an array causes O(n) space usage. This is clearly unnecessary, since it is well known that linked lists can be merge-sorted (destructively) in constant space. However, on small lists, locality of reference may overcome this. On the other hand, allocation has its overhead problems too.

Crusty, putting the list into an array first won't result in the extra factor of n. It will be O(n + n*log(n)) and of course that can be simplified back to O(n*log(n)). The API comment is simply incorrect.

You're right, I haven't looked at that stuff in a few years :p

 

lousydood

Member
Aug 1, 2005
158
0
0
Some testing reveals that GCC, at least, handles in-place merge-sort on arrays better than in-place merge-sort on lists. There appears to be a constant factor of 1.5x slow-down with the list, even though I arrange it so that the list must be copied first into the newly allocated array before being sorted array-style. Both algorithms still run in asymptotic O(n*log(n)) time and constant space (besides the array copy).

However, the array merge-sort became obscenely slower once I increased the size of the input to a point where the array became very unwieldy in memory (about 10 million elements for my old machine), whilst the list merge-sort was not adversely affected.