Compressing CNN-DBLSTM models for OCR with teacher-student learning and Tucker decomposition. (December 2019)
- Record Type:
- Journal Article
- Title:
- Compressing CNN-DBLSTM models for OCR with teacher-student learning and Tucker decomposition. (December 2019)
- Main Title:
- Compressing CNN-DBLSTM models for OCR with teacher-student learning and Tucker decomposition
- Authors:
- Ding, Haisong
Chen, Kai
Huo, Qiang
- Abstract:
- Highlights: We investigate teacher-student learning and Tucker decomposition to compress and accelerate the convolutional layers within CTC-trained CNN-DBLSTM models for OCR. To the best of our knowledge, we are the first to address this problem. Based on the architecture of the CNN-DBLSTM model, we propose an objective function for teacher-student learning that directly matches the feature sequences extracted by the CNNs of the teacher and student models under the guidance of the succeeding LSTM layers. Experimental results on large-scale handwritten and printed OCR tasks show that a student model trained with the proposed criterion outperforms one trained with a standard KL-divergence criterion. We explore the effectiveness of combining teacher-student learning and Tucker decomposition: teacher-student learning transfers the knowledge of a large-size teacher model to a small-size compact student model, and Tucker decomposition then compresses and accelerates it further. Our results show that this method yields a very compact CNN-DBLSTM model, significantly reducing both the footprint and the computation cost with no or only a small degradation in recognition accuracy.
Abstract: Integrated convolutional neural network (CNN) and deep bidirectional long short-term memory (DBLSTM) based character models have achieved excellent recognition accuracies on optical character recognition (OCR) tasks, along with a large number of model parameters and massive computation cost. To deploy a CNN-DBLSTM model in products with a CPU server, there is an urgent need to compress and accelerate it as much as possible, especially the CNN part, which dominates both parameters and computation. In this paper, we study teacher-student learning and Tucker decomposition methods to reduce model size and runtime latency for CNN-DBLSTM based character models for OCR. We use teacher-student learning to transfer the knowledge of a large-size teacher model to a small-size compact student model, followed by Tucker decomposition to further compress the student model. For teacher-student learning, we design a novel learning criterion that brings in the guidance of the succeeding LSTM layer when matching the CNN-extracted feature sequences of the large teacher and small student models.
Experimental results on large-scale handwritten and printed OCR tasks show that using teacher-student learning alone achieves a 9.90× footprint reduction and a 15.23× inference speedup without degrading recognition accuracy. Combined with the Tucker decomposition method, the model can be compressed and accelerated further: the decomposed model achieves an 11.89× footprint reduction and a 22.16× inference speedup while suffering no or only a small recognition accuracy degradation against the large-size baseline model. … (more)
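The teacher-student criterion described in the abstract matches the CNN-extracted feature sequences of the teacher and student models. The record does not give the exact form of the LSTM-guided weighting, so the sketch below shows only the plain feature-sequence matching idea as a simplified stand-in; the function name and shapes are hypothetical, not from the paper.

```python
import numpy as np

def feature_match_loss(student_feats, teacher_feats):
    """Mean-squared-error match between the CNN-extracted feature
    sequences (arrays of shape (T, D)) of a student and a teacher model.
    The paper's criterion additionally weights this match under the
    guidance of the succeeding LSTM layers; that weighting is omitted
    here because its exact form is not given in this record."""
    assert student_feats.shape == teacher_feats.shape
    # Average the squared per-frame, per-dimension differences.
    return float(np.mean((student_feats - teacher_feats) ** 2))
```

In training, this loss would be minimized with respect to the student CNN's parameters while the teacher's features are held fixed.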
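The Tucker decomposition step can likewise be sketched. The snippet below (an illustration, not code from the paper) performs a truncated-HOSVD Tucker-2 decomposition of a 4-D convolution kernel along its output- and input-channel modes; this is the factorization commonly used to replace one k×k convolution with a 1×1 → k×k → 1×1 chain of smaller convolutions. The ranks and shapes are hypothetical examples.

```python
import numpy as np

def unfold(tensor, mode):
    # Mode-n unfolding: move axis `mode` to the front, flatten the rest.
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker2(W, r_out, r_in):
    """Tucker-2 decomposition of a conv kernel W of shape
    (C_out, C_in, kH, kW) along the two channel modes, via truncated
    higher-order SVD. Returns a small core plus two factor matrices."""
    U_out, _, _ = np.linalg.svd(unfold(W, 0), full_matrices=False)
    U_in, _, _ = np.linalg.svd(unfold(W, 1), full_matrices=False)
    U_out, U_in = U_out[:, :r_out], U_in[:, :r_in]
    # core = W  x1 U_out^T  x2 U_in^T  (contract the channel modes)
    core = np.tensordot(W, U_out, axes=(0, 0))    # (C_in, kH, kW, r_out)
    core = np.tensordot(core, U_in, axes=(0, 0))  # (kH, kW, r_out, r_in)
    core = np.moveaxis(core, (2, 3), (0, 1))      # (r_out, r_in, kH, kW)
    return core, U_out, U_in

def reconstruct(core, U_out, U_in):
    """Rebuild the (approximate) full kernel from the Tucker-2 factors."""
    W = np.tensordot(U_out, core, axes=(1, 0))    # (C_out, r_in, kH, kW)
    W = np.tensordot(U_in, W, axes=(1, 1))        # (C_in, C_out, kH, kW)
    return np.moveaxis(W, 0, 1)                   # (C_out, C_in, kH, kW)
```

With ranks equal to the channel counts the reconstruction is exact; choosing smaller ranks trades a little approximation error for fewer parameters and less computation, which is the compression knob the paper exploits.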
- Is Part Of:
- Pattern recognition. Volume 96 (2019: Dec.)
- Journal:
- Pattern recognition
- Issue:
- Volume 96 (2019: Dec.)
- Issue Display:
- Volume 96 (2019)
- Year:
- 2019
- Volume:
- 96
- Issue Sort Value:
- 2019-0096-0000-0000
- Page Start:
- Page End:
- Publication Date:
- 2019-12
- Subjects:
- Optical character recognition -- CNN-DBLSTM Character model -- Model compression -- Teacher-student learning -- Tucker decomposition
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2019.07.002
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11627.xml