Document clustering for efficient and secure information retrieval from cloud. (10th January 2019)
- Record Type:
- Journal Article
- Title:
- Document clustering for efficient and secure information retrieval from cloud. (10th January 2019)
- Main Title:
- Document clustering for efficient and secure information retrieval from cloud
- Authors:
- Handa, Rohit
Krishna, C. Rama
Aggarwal, Naveen
- Abstract:
- Summary: Organizations prefer using cloud for storing their data due to availability of cost‐effective storage. The outsourced data include sensitive information, so data is encrypted as maintaining confidentiality and privacy of the documents is of paramount importance. Retrieving the desired information from the cloud requires efficient searching, which involves submission of search query to the cloud server by the end‐user. As the search terms may include sensitive information of an organization, it is desired that the search query should not reveal any confidential information. The existing works are not suitable for big‐data scenario due to high search time required for large document collections, thereby leading to increased cloud usage cost. Thus, an efficient approach to perform search on encrypted data using clustering is proposed in this paper. As the proposed technique clusters the documents based on the relationship between the keywords, the search method involves searching documents within the relevant cluster in contrast to searching the entire dataset. An efficient ranking method is incorporated to rank the documents according to the relevance to search query using Term Frequency‐Inverse Document Frequency (tf‐idf) value of the keywords in the documents, which leads to reduced communication overheads due to reduction in unnecessary documents being downloaded. Moreover, an efficient query randomization approach is proposed so that two or more queries involving the same search terms appear distinct. Experimental results using real datasets demonstrate that our proposed multi‐keyword ranked search scheme on encrypted cloud data significantly reduce the number of comparisons and search time in comparison to the existing techniques while maintaining recall of 100% and precision of 82%.
- Is Part Of:
- Concurrency and computation. Volume 31:Number 15(2019)
- Journal:
- Concurrency and computation
- Issue:
- Volume 31:Number 15(2019)
- Issue Display:
- Volume 31, Issue 15 (2019)
- Year:
- 2019
- Volume:
- 31
- Issue:
- 15
- Issue Sort Value:
- 2019-0031-0015-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2019-01-10
- Subjects:
- cipher‐text search -- cloud computing -- cloud security -- cloud storage -- document clustering -- multi‐keyword search -- privacy preserving -- query randomization -- ranked search -- secure information retrieval
Parallel processing (Electronic computers) -- Periodicals
Parallel computers -- Periodicals
004.35
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpe.5127
- Languages:
- English
- ISSNs:
- 1532-0626
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3405.622000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 11018.xml