K-way Merge

Like #202203102127, it splits the file contents into parts with size of \(M\) which will be sorted using one of the 202203102004.

Unlike 202203102127, it could use more than two, \(k\), drives or tapes as input or output. \(k\) parts will then be grouped into a run which will be stored into \(k\) drives or tapes. It is ill-advised to use comparison to obtain the smallest element among the parts since it could result in \(O(N^2)\) which is bad. We could take advantage of some #data-structure peculiarly 202202062259 or 202112102110. We will then obtain a sorted run by repeating the process recursively. The same steps will be performed for the next run.

Analysis

  • It will require \(\lceil \log_k(\frac{N}{M} \rceil\) passes.
  • It is more efficient than 202203102127 since the increase of tapes and drives reduce the required passes to process all the runs.
  • This could be further optimised by 202203111148#
Links to this page
  • Polyphase Merge

    This distribution strategy could be used to extend the 202203102233 where the \(k\)th order of Fibonacci number is defined as \(F^{(k)}(N) = F^{(k)}(N - 1) + F^{(k)}(N - 2) + \cdots + F^{(k)}(N - k)\) with initial conditions of \(F^{(k)}(N) = 0, 0 \le N, \le k - 2, F^{(k)}(k - 1) = 1\).

  • External Sorting
#sorting #algorithm #data-structure