티스토리 뷰

Divide and conquer, 분할 정복 알고리즘

 단어 그대로 "분할", "정복", "합치기" 세개로 나누어 진행하는 알고리즘이다.

divide and conquer알고리즘의 대표적인 예로는 merge sort가 있으며 quick sort, ip 주소 할당, memorization알고리즘 등이 있다.

merge sort

1. 분할(divide)

 - 가장 작은 단위가 될때 까지 쪼갠다.

 - 주어진 문제의 최소단위로 나누어질때 까지 같은 작업을 반복하기 때문에 recursion으로 구현할 수 있다.

2. 정복(conquer)

 - 쪼개진 단위들을 정렬하며 합쳐나간다.

merge sort의 time complexity는 O(logn)이 된다.

 

프로그래밍언어마다 내장되어 있는 sort method가 있다.

//javascript
const arr = [2, 5, 3, 1, 8, 4];
arr.sort(); 

//C++(algorithm 라이브러리 include)
vector<int> arr = {2, 5, 3, 1, 8, 4};
sort(arr.begin(), arr.end());

//rust
let mut arr = vec![2, 5, 3, 1, 8, 4];
arr.sort();

각 언어별 sort가 사용하는 알고리즘이 다르지만, rust의 sort는 merge sort로 이루어져있다.

아래의 코드는 rust slice sort method의 함수 내용이다

#[stable(feature = "rust1", since = "1.0.0")]
#[inline]
pub fn sort(&mut self)
where
T: Ord,
{
   merge_sort(self, |a, b| a.lt(b));
}


fn merge_sort<T, F>(v: &mut [T], mut is_less: F)
where
    F: FnMut(&T, &T) -> bool,
{
    // Slices of up to this length get sorted using insertion sort.
    const MAX_INSERTION: usize = 20;
    // Very short runs are extended using insertion sort to span at least this many elements.
    const MIN_RUN: usize = 10;

    // Sorting has no meaningful behavior on zero-sized types.
    if size_of::<T>() == 0 {
        return;
    }

    let len = v.len();

    // Short arrays get sorted in-place via insertion sort to avoid allocations.
    if len <= MAX_INSERTION {
        if len >= 2 {
            for i in (0..len - 1).rev() {
                insert_head(&mut v[i..], &mut is_less);
            }
        }
        return;
    }

    // Allocate a buffer to use as scratch memory. We keep the length 0 so we can keep in it
    // shallow copies of the contents of `v` without risking the dtors running on copies if
    // `is_less` panics. When merging two sorted runs, this buffer holds a copy of the shorter run,
    // which will always have length at most `len / 2`.
    let mut buf = Vec::with_capacity(len / 2);

    // In order to identify natural runs in `v`, we traverse it backwards. That might seem like a
    // strange decision, but consider the fact that merges more often go in the opposite direction
    // (forwards). According to benchmarks, merging forwards is slightly faster than merging
    // backwards. To conclude, identifying runs by traversing backwards improves performance.
    let mut runs = vec![];
    let mut end = len;
    while end > 0 {
        // Find the next natural run, and reverse it if it's strictly descending.
        let mut start = end - 1;
        if start > 0 {
            start -= 1;
            unsafe {
                if is_less(v.get_unchecked(start + 1), v.get_unchecked(start)) {
                    while start > 0 && is_less(v.get_unchecked(start), v.get_unchecked(start - 1)) {
                        start -= 1;
                    }
                    v[start..end].reverse();
                } else {
                    while start > 0 && !is_less(v.get_unchecked(start), v.get_unchecked(start - 1))
                    {
                        start -= 1;
                    }
                }
            }
        }

        // Insert some more elements into the run if it's too short. Insertion sort is faster than
        // merge sort on short sequences, so this significantly improves performance.
        while start > 0 && end - start < MIN_RUN {
            start -= 1;
            insert_head(&mut v[start..end], &mut is_less);
        }

        // Push this run onto the stack.
        runs.push(Run { start, len: end - start });
        end = start;

        // Merge some pairs of adjacent runs to satisfy the invariants.
        while let Some(r) = collapse(&runs) {
            let left = runs[r + 1];
            let right = runs[r];
            unsafe {
                merge(
                    &mut v[left.start..right.start + right.len],
                    left.len,
                    buf.as_mut_ptr(),
                    &mut is_less,
                );
            }
            runs[r] = Run { start: left.start, len: left.len + right.len };
            runs.remove(r + 1);
        }
    }

    // Finally, exactly one run must remain in the stack.
    debug_assert!(runs.len() == 1 && runs[0].start == 0 && runs[0].len == len);

    // Examines the stack of runs and identifies the next pair of runs to merge. More specifically,
    // if `Some(r)` is returned, that means `runs[r]` and `runs[r + 1]` must be merged next. If the
    // algorithm should continue building a new run instead, `None` is returned.
    //
    // TimSort is infamous for its buggy implementations, as described here:
    // http://envisage-project.eu/timsort-specification-and-verification/
    //
    // The gist of the story is: we must enforce the invariants on the top four runs on the stack.
    // Enforcing them on just top three is not sufficient to ensure that the invariants will still
    // hold for *all* runs in the stack.
    //
    // This function correctly checks invariants for the top four runs. Additionally, if the top
    // run starts at index 0, it will always demand a merge operation until the stack is fully
    // collapsed, in order to complete the sort.
    #[inline]
    fn collapse(runs: &[Run]) -> Option<usize> {
        let n = runs.len();
        if n >= 2
            && (runs[n - 1].start == 0
                || runs[n - 2].len <= runs[n - 1].len
                || (n >= 3 && runs[n - 3].len <= runs[n - 2].len + runs[n - 1].len)
                || (n >= 4 && runs[n - 4].len <= runs[n - 3].len + runs[n - 2].len))
        {
            if n >= 3 && runs[n - 3].len < runs[n - 1].len { Some(n - 3) } else { Some(n - 2) }
        } else {
            None
        }
    }

    #[derive(Clone, Copy)]
    struct Run {
        start: usize,
        len: usize,
    }
}

 

C++에서의 sort method를 뜯어보면 상황별로 다른 알고리즘을 적용함을 알 수 있다.

last - first보다 32 보다 작다면 insertion_sort를 이용하고,

divisions가 너무 많이 발생한다면 heap sort

그게아니라면 quick sort를 이용하여 divide and conquer알고리즘을 사용하여 정렬한다.

 

C++의 주석에 작성된 내용을 보면 quicksort를 이용한 divide and conquer 알고리즘을 사용한다고 명시되어 있다.

template <class _RanIt, class _Pr>
_CONSTEXPR20 void _Sort_unchecked(_RanIt _First, _RanIt _Last, _Iter_diff_t<_RanIt> _Ideal, _Pr _Pred) {
    // order [_First, _Last)
    for (;;) {
        if (_Last - _First <= _ISORT_MAX) { // small
            _Insertion_sort_unchecked(_First, _Last, _Pred);
            return;
        }

        if (_Ideal <= 0) { // heap sort if too many divisions
            _Make_heap_unchecked(_First, _Last, _Pred);
            _Sort_heap_unchecked(_First, _Last, _Pred);
            return;
        }

        // divide and conquer by quicksort
        auto _Mid = _Partition_by_median_guess_unchecked(_First, _Last, _Pred);

        _Ideal = (_Ideal >> 1) + (_Ideal >> 2); // allow 1.5 log2(N) divisions

        if (_Mid.first - _First < _Last - _Mid.second) { // loop on second half
            _Sort_unchecked(_First, _Mid.first, _Ideal, _Pred);
            _First = _Mid.second;
        } else { // loop on first half
            _Sort_unchecked(_Mid.second, _Last, _Ideal, _Pred);
            _Last = _Mid.first;
        }
    }
}

 

 

javascript에서는 merge sort에서 응용한 정렬방법의 일종인 Tim sort를 이용한다고 나와있다.

v8 엔진 공식문서 : https://v8.dev/blog/array-sort

 

Getting things sorted in V8 · V8

Array.prototype.sort was among the last builtins implemented in self-hosted JavaScript in V8. Porting it offered us the opportunity to experiment with different algorithms and implementation strategies and finally make it stable in V8 v7.0 / Chrome 70. Bac

v8.dev

 

 Tim sort 알고리즘을 고안한 Tim Peters가 설명한 github 문서

https://github.com/python/cpython/blob/main/Objects/listsort.txt

 

python/cpython

The Python programming language. Contribute to python/cpython development by creating an account on GitHub.

github.com

 

공지사항
최근에 올라온 글
최근에 달린 댓글
Total
Today
Yesterday
링크
«   2024/07   »
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
글 보관함