A machine learning based approach to identify protected health information in Chinese clinical text. (August 2018)

Record Type:: Journal Article
Title:: A machine learning based approach to identify protected health information in Chinese clinical text. (August 2018)
Main Title:: A machine learning based approach to identify protected health information in Chinese clinical text
Authors:: Du, Liting
Xia, Chenxi
Deng, Zhaohua
Lu, Gary
Xia, Shuxu
Ma, Jingdong
Abstract:: Highlights: Chinese clinical text contains a high volume of diverse PHI, which is unevenly distributed across various medical institutions. We trained and tested a machine learning based approach to identify PHI in a Chinese clinical corpus with double-annotated PHI tags. The CRF-based approach achieved an F-score as high as 98.78% when applied to Chinese clinical text. Abstract: Background: With the increasing application of electronic health records (EHRs) in the world, protecting private information in clinical text has drawn extensive attention from healthcare providers to researchers. De-identification, the process of identifying and removing protected health information (PHI) from clinical text, has been central to the discourse on medical privacy since 2006. While de-identification is becoming the global norm for handling medical records, there is a paucity of studies on its application to Chinese clinical text. Without efficient and effective privacy protection algorithms in place, the use of indispensable clinical information would be confined. Objectives: We aimed to (i) describe the current process for PHI in China, (ii) propose a machine learning based approach to identify PHI in Chinese clinical text, and (iii) validate the effectiveness of the machine learning algorithm for de-identification in Chinese clinical text. Methods: Based on 14,719 discharge summaries from regional health centers in Ya'an City, Sichuan province, China, we built …
Is Part Of:: International journal of medical informatics. Volume 116(2018)
Journal:: International journal of medical informatics
Issue:: Volume 116(2018)
Issue Display:: Volume 116, Issue 2018 (2018)
Year:: 2018
Volume:: 116
Issue:: 2018
Issue Sort Value:: 2018-0116-2018-0000
Page Start:: 24
Page End:: 32
Publication Date:: 2018-08
Subjects:: Protected health information -- De-identification -- Electronic health records -- Conditional random fields
Medical informatics -- Periodicals
Information science -- Periodicals
Computers -- Periodicals
Medical technology -- Periodicals
Medical Informatics -- Periodicals
Technology, Medical -- Periodicals
Computers
Information science
Medical informatics
Medical technology
Electronic journals
Periodicals
610.285
Journal URLs:: http://www.sciencedirect.com/science/journal/13865056
http://www.clinicalkey.com/dura/browse/journalIssue/13865056
http://www.clinicalkey.com.au/dura/browse/journalIssue/13865056
http://www.elsevier.com/journals
DOI:: 10.1016/j.ijmedinf.2018.05.010
Languages:: English
ISSNs:: 1386-5056
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 4542.345250
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 9438.xml