Sorting
Fachhochschule Braunschweig / Wolfenbüttel - University of Applied Sciences
Data Structures and Algorithm Design - CSCI 340
Friedhelm Seutter
Fachhochschule Braunschweig / Wolfenbüttel, Institut für Angewandte Informatik
© Friedhelm Seutter, 2008

Contents
1. Analyzing Algorithms and Problems
2. Data Abstraction
3. Recursion and Induction
4. Sorting
6. Dynamic Sets and Searching
7. Graphs and Graph Traversals
8. Optimization and Greedy Algorithms
10. Dynamic Programming
13. NP-Complete Problems

4. Sorting
• Insertion Sort
• Quicksort
• Mergesort
• Heapsort

Sorting Problem
Let $A = (a_1, a_2, \ldots, a_n)$ be an array of (nonnegative) integers, called keys. The problem is to find a permutation $\pi$ such that the integers are sorted in nondecreasing order.
Solution: $A = (a_{\pi(1)}, a_{\pi(2)}, \ldots, a_{\pi(n)})$ with $a_{\pi(1)} \le a_{\pi(2)} \le \ldots \le a_{\pi(n)}$

Analysis of Complexity
General strategy: Sorting by comparison of keys
• Time: Number of key comparisons (basic operations)
• Space: Amount of extra space (in addition to the input)

Insertion Sort
Let some elements at the left side of the array be sorted. Take the first of the unexamined elements and insert it at its correct position among the sorted elements. To make room for it, the greater elements must be shifted one position to the right.
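The insertion step described above can be sketched in Python as follows. This is a minimal illustration, not the pseudocode from the slides; the function name `insertion_sort`, the 0-based indexing, and the comparison counter are assumptions added for the example.

```python
def insertion_sort(a):
    """Sort the list a in place (nondecreasing); return the number of key comparisons."""
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]                # first element of the unexamined part
        i = j - 1
        while i >= 0:
            comparisons += 1      # one key comparison: a[i] against key
            if a[i] <= key:
                break             # correct position found
            a[i + 1] = a[i]       # shift the greater element one position right
            i -= 1
        a[i + 1] = key            # insert the key into the vacant position
    return comparisons


keys = [2, 4, 12, 14, 15, 7, 19, 11]
count = insertion_sort(keys)
print(keys, count)
```

Counting comparisons inside the loop makes it easy to check the sketch against the complexity analysis: an already sorted input needs one comparison per inserted key, a reversed input compares each key with every element of the sorted part.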

Insertion Sort - Example
  < sorted >               < not examined >
  2   4   12  14  15       7   19  11
  2   4   12  14  15       _   19  11       (key 7 taken out)
  2   4   7   12  14  15   19  11
  < sorted >               < not exam. >

Insertion Sort (pseudocode listing; not preserved in the transcript)

Worst-case Complexity
Basic operation: Key comparison in line 4
$W(n) = \sum_{j=2}^{n} (j - 1) = \sum_{j=1}^{n-1} j = \tfrac{1}{2}\, n (n - 1) \in \Theta(n^2)$

Average-case Complexity
Basic operation: Key comparison in line 4
Assumptions: All permutations are equally likely as input and the keys are distinct.
$A(n) \approx \tfrac{1}{4}\, n^2 \in \Theta(n^2)$

Best-case Complexity
Basic operation: Key comparison in line 4
$B(n) = n - 1 \in \Theta(n)$

Space Complexity
Insertion Sort sorts in-place. The amount of additional space is independent of the number of elements to sort.

Divide and Conquer

Quicksort
• Divide: Choose one element to be the pivot. Divide the array into two subarrays relative to the pivot: elements less than or equal to the pivot go to the left, greater elements to the right, and the pivot sits in between.
• Conquer: Sort the two subarrays recursively. An array of length 1 is already sorted.
• Combine: Concatenate the two sorted subarrays with the pivot in between.
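The three steps above can be sketched in Python as follows. This is a hedged illustration rather than the slides' own pseudocode: choosing the first element as the pivot, the names `partition` and `quicksort`, and the 0-based indices `first` and `last` are assumptions made for the example.

```python
def partition(a, first, last):
    """Partition a[first..last] around the pivot a[first]; return the pivot's final index."""
    pivot = a[first]
    i = first                          # a[first+1..i] holds the keys <= pivot
    for j in range(first + 1, last + 1):
        if a[j] <= pivot:              # key comparison against the pivot
            i += 1
            a[i], a[j] = a[j], a[i]    # grow the "<= pivot" region by one key
    a[first], a[i] = a[i], a[first]    # put the pivot between the two regions
    return i


def quicksort(a, first=0, last=None):
    """Sort a[first..last] in place."""
    if last is None:
        last = len(a) - 1
    if first < last:                   # length 0 or 1 is already sorted
        q = partition(a, first, last)  # divide
        quicksort(a, first, q - 1)     # conquer the left subarray
        quicksort(a, q + 1, last)      # conquer the right subarray
        # combine: nothing to do, the pivot already separates the sorted parts


keys = [4, 1, 12, 2, 16, 11, 13, 14, 8, 7]
quicksort(keys)
print(keys)
```

Because the keys are exchanged inside the array, the combine step needs no work. With a first-element pivot, an already sorted input leads to the most unbalanced partitions.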

Quicksort (pseudocode listing; not preserved in the transcript)

Quicksort-Partition (pseudocode listing; not preserved in the transcript)

Quicksort-Partition (figure: regions of the array between the first index f and the last index l during partitioning: the pivot, keys ≤ pivot up to index i, keys > pivot up to index j, and unexamined keys; after partitioning the pivot is at index q, with keys ≤ pivot to its left and keys > pivot to its right)

Divide (figure)

Combine (figure)

Complexity
Basic operation: Key comparison in line 4 of Partition
$W(n) \in \Theta(n^2)$
$A(n) \in \Theta(n \log n)$
$B(n) \in \Theta(n \log n)$

Space Complexity
Amount of space needed: $\Theta(n)$
The exchange of keys is in-place, but in the worst case there are n nested recursive procedure calls, each of which needs space for its local variables. A tricky implementation (for example, recursing on the smaller subarray and handling the larger one iteratively) may reduce the space complexity to $\Theta(\log n)$.

Mergesort
• Divide: Divide the array into two halves.
• Conquer: Sort each half recursively. An array of length 1 is already sorted.
• Combine: Merge the two sorted subarrays into one sorted array.
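A minimal Python sketch of these three steps follows. It is an illustration under assumptions, not the slides' pseudocode: the names `merge` and `mergesort` are chosen for the example, and the function returns a new list instead of overwriting the input.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:        # key comparison
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])            # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def mergesort(a):
    """Return a new sorted list with the keys of a."""
    if len(a) <= 1:                    # length 0 or 1 is already sorted
        return list(a)
    mid = len(a) // 2                  # divide into two halves
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))


print(mergesort([4, 1, 12, 2, 16, 11, 13, 14, 8, 7]))
```

Returning a new list keeps the sketch short and makes the extra space visible: every call to `merge` builds a separate output list.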

Mergesort (pseudocode listing; not preserved in the transcript)

Mergesort-Merge (pseudocode listing; not preserved in the transcript)

Divide (figure)

Combine (figure)

Complexity
Basic operation: Key comparison in line 6 of Merge
$W(n) \in \Theta(n \log n)$
$A(n) \in \Theta(n \log n)$
$B(n) \in \Theta(n \log n)$

Space Complexity
Amount of space needed: $\Theta(n)$
There is no exchange of keys, but all n keys are copied and merged into an extra array. A tricky implementation may reduce the space needed to n/2, but this is still in $\Theta(n)$.

Lower Bounds for Sorting by Comparison of Keys
What is the minimum number of key comparisons for sorting algorithms based on comparisons of keys?
Given is an array of n distinct keys. The solution of sorting the keys is a permutation of the keys. Thus there are n! possible solutions.

Decision Tree for Sorting
All possible sorting solutions may be represented in a decision tree.
The inner nodes represent a comparison of two keys. The possible outcomes are true or false. If false, the keys have to be exchanged. Inner nodes have two successors.
The leaves are the possible sorting solutions.
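A binary tree of height h has at most $2^h$ leaves, so a decision tree with n! leaves must have height at least $\lceil \lg n! \rceil$. The following sketch computes this value for small n and prints it next to $n \lg n$ for comparison. The function name `min_worst_case_comparisons` is an assumption made for the example.

```python
import math


def min_worst_case_comparisons(n):
    """Smallest possible height of a binary decision tree with n! leaves."""
    leaves = math.factorial(n)
    # For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)).
    return (leaves - 1).bit_length()


for n in (3, 4, 5, 10):
    print(n, min_worst_case_comparisons(n), round(n * math.log2(n), 1))
```

The bit-length trick avoids floating-point rounding, which matters because n! grows quickly.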

Decision Tree for Sorting (figure)

Decision Tree for Sorting
Sorting by comparison corresponds to a path in the decision tree from the root to a leaf.
The length of the longest path corresponds to the number of comparisons in the worst case.
Therefore a lower bound on the height of any decision tree is a worst-case lower bound on the number of key comparisons.

Lower Bounds for Sorting (two slides; their formulas are not preserved in the transcript)

Heapsort
The algorithm uses a data structure called a heap, which is a binary tree with some special properties.
Heap structure: Complete binary tree with some of the rightmost leaves removed.
Partial order tree property: The key at any node is greater (less) than or equal to the keys at each of its children.

Heap (two figure slides)

Heap Implementation
• As a linked structure with each node containing pointers (references) to the roots of its subtrees.
• As an array:
  – The root is in A[1].
  – Let i be the index of a node other than the root; then the index of its parent is $\lfloor i/2 \rfloor$.
  – Let i be the index of a node other than a leaf; then 2i is the index of its left child and 2i+1 is the index of its right child.
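The index arithmetic of the array implementation can be sketched in Python as follows. To keep the 1-based indices of the slides, the sketch assumes that position 0 of the list is left unused; the helper names `parent`, `left`, `right`, and `is_max_heap` are chosen for the example.

```python
def parent(i):
    return i // 2          # floor(i / 2)


def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


def is_max_heap(a, heapsize):
    """Check the partial order tree property for a[1..heapsize] (a[0] is unused)."""
    return all(a[parent(i)] >= a[i] for i in range(2, heapsize + 1))


heap = [None, 16, 14, 13, 8, 7, 11, 12, 2, 4, 1]
print(is_max_heap(heap, 10))
print(parent(5), left(5), right(5))
```

Checking every node except the root against its parent is enough, since each parent-child pair is visited exactly once that way.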

Heap: Array-Implementation
  Index:  1   2   3   4   5   6   7   8   9   10  11  12  13  14
  Key:    16  14  13  8   7   11  12  2   4   1   -   -   -   -
  heapsize[A] = 10, length[A] = 14

Heapsort Strategy
• The root contains the largest key in the heap.
• Build a sorted sequence in reverse order by repeatedly removing the root element from the heap.
• After each removal the heap properties have to be reestablished by bringing the next largest key to the root.

Fixing a Heap
• A node violates the partial order tree property, i.e. its key is less than at least one of the keys of its children.
• This node must be exchanged with the child that has the larger key. The exchange is repeated recursively at the child's position until the property holds.

Fixing a Heap (four figure slides)

Complexity of FixHeap
Basic operation: Key comparisons in lines 3 and 6
$W(n) = 2h = 2 \lfloor \lg n \rfloor \in \Theta(\log n)$
(h height of the heap, n number of nodes)

Constructing a Heap
• Given is an unordered array of keys. The corresponding binary tree has heap structure, but the partial order tree property is in general violated.
• The leaves $A[\lfloor n/2 \rfloor + 1], \ldots, A[n]$ are heaps.
• The subtrees with roots from $A[\lfloor n/2 \rfloor]$ down to $A[1]$ must have their partial order tree property established, in this order.

Constructing a Heap (figure)
A = (4, 1, 12, 2, 16, 11, 13, 14, 8, 7)

Constructing a Heap (figure)
A = (16, 14, 13, 8, 7, 11, 12, 2, 4, 1)

Constructing a Heap (pseudocode listing; not preserved in the transcript)

Complexity of ConstructHeap
Basic operation: Call of FixHeap in line 3
$W(n) \le n \lfloor \lg n \rfloor \in \Theta(n \log n)$
But this upper bound is poor!

Heights of subtrees for FixHeap (figure)

Complexity of ConstructHeap
Basic operation: Call of FixHeap in line 3
$W(n) \le \sum_{k=0}^{h} 2^k (h - k) = 2^{h+1} - h - 2 \approx 2n - \lg n - 2 \in \Theta(n)$

Heapsort (four slides with pseudocode and figures; not preserved in the transcript)

Heapsort
Given: A = (4, 1, 12, 2, 16, 11, 13, 14, 8, 7)
ConstructHeap: A = (16, 14, 13, 8, 7, 11, 12, 2, 4, 1)
HeapSort: A = (1, 2, 4, 7, 8, 11, 12, 13, 14, 16)
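The three procedures named on the slides (FixHeap, ConstructHeap, Heapsort) can be sketched in Python as follows. This is a hedged reconstruction, not the slides' pseudocode: the snake-case names, the iterative form of `fix_heap`, and the unused position 0 (to keep 1-based indices) are assumptions made for the example.

```python
def fix_heap(a, i, heapsize):
    """Move a[i] down within a[1..heapsize] until the partial order tree property holds."""
    while 2 * i <= heapsize:                 # while node i has at least one child
        child = 2 * i                        # left child
        if child < heapsize and a[child + 1] > a[child]:
            child += 1                       # right child has the larger key
        if a[i] >= a[child]:
            break                            # property holds at node i
        a[i], a[child] = a[child], a[i]      # exchange with the larger child
        i = child


def construct_heap(a, n):
    """Turn a[1..n] into a heap; the leaves a[n//2+1..n] are heaps already."""
    for i in range(n // 2, 0, -1):
        fix_heap(a, i, n)


def heapsort(keys):
    """Return the keys in nondecreasing order."""
    a = [None] + list(keys)                  # position 0 is unused
    n = len(keys)
    construct_heap(a, n)
    for heapsize in range(n, 1, -1):
        a[1], a[heapsize] = a[heapsize], a[1]    # move the largest key to its final place
        fix_heap(a, 1, heapsize - 1)             # reestablish the heap on the rest
    return a[1:]


print(heapsort([4, 1, 12, 2, 16, 11, 13, 14, 8, 7]))
```

The sketch copies the keys once only to shift them to 1-based positions; the sorting itself exchanges keys inside that single array. Writing `fix_heap` as a loop avoids the recursion stack altogether.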

Complexity of Heapsort
Add up the complexities of ConstructHeap and FixHeap in the loop:
$W(n) = \Theta(n) + (n - 1)\, \Theta(\lg n) \in \Theta(n \lg n)$

Space Complexity
Heapsort sorts in-place. The space needed for recursion is limited to a depth of about lg n, and the recursive procedures can be recoded as iterative procedures, which need only constant extra space.

Comparison of Sorting Algorithms
  Algorithm        Worst case   Average        Extra space
  Insertion Sort   n²/2         Θ(n²)          Θ(1)
  Quicksort        n²/2         Θ(n log n)     Θ(log n)
  Mergesort        n lg n       Θ(n log n)     Θ(n)
  Heapsort         2n lg n      Θ(n log n)     Θ(1)