An efficient similarity join approach on large‐scale high‐dimensional data using random projection. (7th May 2019)
- Record Type:
- Journal Article
- Title:
- An efficient similarity join approach on large‐scale high‐dimensional data using random projection. (7th May 2019)
- Main Title:
- An efficient similarity join approach on large‐scale high‐dimensional data using random projection
- Authors:
- Ma, Youzhong
Zhang, Ruiling
Jia, Shijie
Zhang, Yongxin
Meng, Xiaofeng
- Abstract:
- Summary: Similarity join on large‐scale high‐dimensional data faces major challenges because of the data scale and the curse of dimensionality. Random projection with a p‐stable distribution can reduce high‐dimensional data from d dimensions to k dimensions (k ≪ d); distances in the k‐dimensional space can then be used to filter out as many data pairs as possible at relatively low cost. Based on this idea, we propose two novel approaches to similarity join on large‐scale high‐dimensional data: the projection‐based similarity join (PromSimJ) algorithm and the projection space partitioning–based similarity join (ProSPSimJ) algorithm. Comprehensive experiments were performed to test the performance of these methods and to compare them with the naive block nested loop join. The experimental results show that our approaches achieve much better performance and good scalability.
- Is Part Of:
- Concurrency and computation. Volume 31:Number 20(2019)
- Journal:
- Concurrency and computation
- Issue:
- Volume 31:Number 20(2019)
- Issue Display:
- Volume 31, Issue 20 (2019)
- Year:
- 2019
- Volume:
- 31
- Issue:
- 20
- Issue Sort Value:
- 2019-0031-0020-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2019-05-07
- Subjects:
- high‐dimensional data -- p‐stable distribution -- random projection -- similarity join
Parallel processing (Electronic computers) -- Periodicals
Parallel computers -- Periodicals
004.35
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpe.5303
- Languages:
- English
- ISSNs:
- 1532-0626
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3405.622000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 11974.xml
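
The filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the PromSimJ or ProSPSimJ algorithm from the article, whose details are not given in this record. It assumes Euclidean distance, a Gaussian (2‐stable) projection, and a single-machine nested loop. The names `make_projection`, `project`, `similarity_join`, and the parameters `k`, `slack`, and `seed` are hypothetical and chosen only for illustration.

```python
import math
import random


def make_projection(d, k, seed=0):
    """Return k random vectors of length d with N(0, 1) entries (2-stable)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]


def project(vec, proj):
    """Map a d-dimensional vector to k dimensions, scaled by 1/sqrt(k)."""
    scale = 1.0 / math.sqrt(len(proj))
    return [scale * sum(a * x for a, x in zip(row, vec)) for row in proj]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def similarity_join(data, eps, k=16, slack=2.0, seed=0):
    """Return index pairs (i, j), i < j, whose Euclidean distance is <= eps.

    Assumption: pairs whose projected distance exceeds slack * eps are
    skipped without computing the full d-dimensional distance. The filter
    is probabilistic, so a true pair can occasionally be dropped; every
    pair that is returned has been verified in the original space.
    """
    d = len(data[0])
    proj = make_projection(d, k, seed)
    low = [project(v, proj) for v in data]
    result = []
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            if euclidean(low[i], low[j]) > slack * eps:
                continue  # cheap k-dimensional filter
            if euclidean(data[i], data[j]) <= eps:
                result.append((i, j))  # verified in d dimensions
    return result
```

In this sketch the projection only prunes candidates; the final check in the original space means the output never contains a false positive. Raising `slack` or `k` lowers the chance of missing a true pair, at the cost of weaker pruning or more expensive projections.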