I know some of the range searching data structure, for example kd-tree, Range tree and quad-tree. But all the implementation is in memory, how can I implementation them on secondary memory with high performance I/O efficiency?
Here is the condition:
1): a static set of points on the two dimension.
2): just for query, no inset or delete.
3): adapt for secondary memory.
Thanks.
If you can fit the tree into memory during construction:
Build the kd-tree.
Bottom, up, collect as many points as possible that fit into a block of your hardware size.
Write the data to this block.
Repeat 2.-3. recursively, until you've written all data to disk.
When querying, load a page from disk, process this part of the tree until you reach a reference to another page. Then load this page and continue there.
Alternatively, you can do the same top-down, but then you will likely need more disk space. In above approach, only the root page can be near-empty.