Abstract |
Traditionally, searching for text is accomplished by using a data structure fitted to the task, and database access methods have tended to standardise on B-tree multiway trees, especially where proximity searching may be desired.
B-trees were discovered in 1970. At that time, virtual memory was not usually available, disk space cost 370 US dollars per megabyte, and Intel released the 4004 in 1971, capable of accessing 1kilobyte of program memory, and 4 kilobytes of data. Times have changed - virtual memory is now ubiquitous, and main memory size of 4 gigabytes and more is becoming standard. Recent advances in processor speeds and memory speeds have resulted in a very high speed for memory to memory copying, and for a much larger set of information to be held in memory at any one time.
This paper describes a new method of data organisation for searching, the A-tree, and explains the background and reasons for its design and implementation. Its design is contrasted with that of the B-tree, and, in lesser detail, with hash tables. Some performance figures have been gained from instrumenting performance, and these are compared to standard data structures such as B-trees.
|