Two-way Merge

Two-way Merge splits the file contents into parts where their size, \(M\), could be held by the main memory. Those parts will be sorted using one of the 202203102004 and we will pair those parts into runs which will be stored into two separate drives or tapes. We will merge two parts by comparing them sequentially until we have hit the end of the run. Follow the recursive rule, we will repeat the steps stated before until we have obtained a sorted result.

Analysis

  • It is optimal to have 4 drives or tapes in using Two-way Merge. Two of them used as input, the other two used as output. Their role is interchangeable depends on the run.
  • It will require \(\lceil \log(\frac{N}{M}) \rceil\) passes.
  • This could be further optimised by 202203111148#
Links to this page
  • K-way Merge
    It is more efficient than 202203102127 since the increase of tapes and drives reduce the required passes to process all the runs.

    Like #202203102127, it splits the file contents into parts with size of \(M\) which will be sorted using one of the 202203102004.

    Unlike 202203102127, it could use more than two, \(k\), drives or tapes as input or output. \(k\) parts will then be grouped into a run which will be stored into \(k\) drives or tapes. It is ill-advised to use comparison to obtain the smallest element among the parts since it could result in \(O(N^2)\) which is bad. We could take advantage of some #data-structure peculiarly 202202062259 or 202112102110. We will then obtain a sorted run by repeating the process recursively. The same steps will be performed for the next run.

  • External Sorting
#sorting #algorithm