Comparing hierarchical data in external memory.
Sudarshan S. Chawathe.
In Proceedings of the Twenty-fifth International Conference on
Very Large Data Bases, pages 90-101, Edinburgh, Scotland, U.K.,
September 1999.
[ paper |
citation ]
We present an external-memory algorithm for computing a minimum-cost edit script between two rooted, ordered, labeled trees. The I/O, RAM, and CPU costs of our algorithm are, respectively, 4mn+7m+5n, 6S, and O(MN + (M+N)S^1.5 ), where M and N are the input tree sizes, S is the block size, m = M/S, and n = N/S. This algorithm can make effective use of surplus RAM capacity to quadratically reduce I/O cost. We extend to trees the commonly used mapping from sequence comparison problems to shortest-path problems in edit graphs.