java - Find all records close to a particular record using Map Reduce
I am trying to implement a MapReduce program to find the records in a 2 GB dataset that are close to each other (that is, for each record, its neighbors should be output). By "close" I mean Euclidean distance. Each record in the dataset has an x and a y coordinate. Can anyone suggest an intuition for doing this? I know the map can emit each record and the reduce can run a double loop over every entry in its input list to find neighbors, but is there a better solution? This implementation is horribly slow. Thanks in advance.
map(rid, r):
    emit(key, r)

reduce(key, lst = [r1, r2, ...]):
    for elm1 in lst:
        for elm2 in lst:
            if elm2 in range of elm1:
                process(elm1, elm2)
The process function stores elm2 as a neighbor of elm1 in a MongoDB database. Each record in the MongoDB database is structured as follows:
record 'r' | list of neighbors of record 'r'
You may be able to speed up your implementation by indexing the records in buckets. Let's say all of the records lie in the grid [0, 100] x [0, 100]. Create 100 x-buckets [0, 1), [1, 2), ..., [99, 100] and 100 y-buckets. Given a record [x1, y1] and a distance d, take the intersection of the x-buckets from [x1 - d - 1] to [x1 + d + 1] and the y-buckets from [y1 - d - 1] to [y1 + d + 1], and test the Euclidean distance of [x1, y1] only against the points in the resulting set.
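The bucket idea can be sketched in plain Java, outside of any MapReduce framework. This is only an illustration under stated assumptions: the class name GridNeighbors, its methods, and the choice of a HashMap keyed by packed cell indices are all hypothetical and not from the original post. It uses unit-width buckets and scans the cells from floor(x - d) to floor(x + d), which is a slightly tighter range than the extra -1/+1 margin described above but still covers every point within distance d.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the original post): indexes points into
// unit-width grid cells and finds the points within distance d of a given point.
class GridNeighbors {
    private final double[][] points; // points[i] = {x, y}
    private final Map<Long, List<Integer>> cells = new HashMap<>();

    GridNeighbors(double[][] points) {
        this.points = points;
        // Put each point index into the bucket of the cell that contains it.
        for (int i = 0; i < points.length; i++) {
            long key = cellKey(cellOf(points[i][0]), cellOf(points[i][1]));
            cells.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
    }

    // Bucket index of a coordinate, with buckets [k, k + 1).
    private static long cellOf(double v) {
        return (long) Math.floor(v);
    }

    // Pack two cell indices into one map key (assumes both fit in 32 bits).
    private static long cellKey(long cx, long cy) {
        return (cx << 32) ^ (cy & 0xffffffffL);
    }

    // Indices of all points other than i whose Euclidean distance to point i is <= d.
    List<Integer> neighborsOf(int i, double d) {
        double x = points[i][0];
        double y = points[i][1];
        List<Integer> result = new ArrayList<>();
        // Only cells overlapping the square [x - d, x + d] x [y - d, y + d] can hold neighbors.
        for (long cx = cellOf(x - d); cx <= cellOf(x + d); cx++) {
            for (long cy = cellOf(y - d); cy <= cellOf(y + d); cy++) {
                List<Integer> bucket = cells.get(cellKey(cx, cy));
                if (bucket == null) {
                    continue;
                }
                for (int j : bucket) {
                    if (j == i) {
                        continue;
                    }
                    double dx = points[j][0] - x;
                    double dy = points[j][1] - y;
                    // Compare squared distances to avoid a square root.
                    if (dx * dx + dy * dy <= d * d) {
                        result.add(j);
                    }
                }
            }
        }
        return result;
    }
}
```

With this layout, each lookup compares a point only against the contents of nearby cells rather than against the whole list, so the benefit depends on d being small relative to the extent of the data.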