|
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
|
| Volume 69 - Issue 21 |
| Published: May 2013 |
| Authors: Ruben Buaba, Abdollah Homaifar, Eric Kihn |
10.5120/12096-8258
|
Ruben Buaba, Abdollah Homaifar, Eric Kihn . Optimal Load Factor for Approximate Nearest Neighbor Search under Exact Euclidean Locality Sensitive Hashing. International Journal of Computer Applications. 69, 21 (May 2013), 22-31. DOI=10.5120/12096-8258
@article{ 10.5120/12096-8258,
author = { Ruben Buaba,Abdollah Homaifar,Eric Kihn },
title = { Optimal Load Factor for Approximate Nearest Neighbor Search under Exact Euclidean Locality Sensitive Hashing },
journal = { International Journal of Computer Applications },
year = { 2013 },
volume = { 69 },
number = { 21 },
pages = { 22-31 },
doi = { 10.5120/12096-8258 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2013
%A Ruben Buaba
%A Abdollah Homaifar
%A Eric Kihn
%T Optimal Load Factor for Approximate Nearest Neighbor Search under Exact Euclidean Locality Sensitive Hashing%T
%J International Journal of Computer Applications
%V 69
%N 21
%P 22-31
%R 10.5120/12096-8258
%I Foundation of Computer Science (FCS), NY, USA
Locality Sensitive Hashing (LSH) is an index-based data structure that allows spatial item retrieval over a large dataset. The performance measure, ?, has significant effect on the computational complexity and memory space requirement to create and store items in this data structure respectively. The minimization of ? at a specific approximation factor c, is dependent on the load factor, ?. Over the years,?=4has been used by researchers. In this paper, we demonstratethat the choice of?=4does not guarantee low computational complexity and low memory space of the data structure under the LSH scheme. To guarantee low computational complexity and low memory space, we propose?=5. Experiments on the Defense Meteorological Satellite Program imagery datasethave shown that?=5saves more than 75%on memory space; cuts the computational complexity by more than 70%andanswers query two times faster on the average compared to that of?=4.