Paper
22 April 2022 Text local error detection and correction based on N-gram for Chinese homophone replacement
Proceedings Volume 12174, International Conference on Internet of Things and Machine Learning (IoTML 2021); 121741A (2022) https://doi.org/10.1117/12.2629117
Event: International Conference on Internet of Things and Machine Learning (IoTML 2021), 2021, Shanghai, China
Abstract
With the continuous development of the Internet, more and more information enters people's lives. However, this information is of mixed quality, and the correctness of text is difficult to guarantee. For errors caused by homophone replacement, an automatic local error detection and correction solution for Chinese text based on the n-gram model is proposed. Local error detection uses a combined 2-gram and 3-gram model, and local error correction uses a 3-gram model. Experiments show an error detection recall rate of 83.1%, an error detection accuracy rate of 41.5%, and an F-score of 55.4%; the error correction rate is 78.1%. Compared with the 2-gram model and the 3-gram model used alone, the accuracy of error detection increases by 7.2% and 8.2% respectively, and the F-score increases by 6.3% and 8.2% respectively.
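The abstract does not give the scoring rule, thresholds, or smoothing, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes character-level n-grams, raw counts from a toy corpus, a flagging rule of "every 2-gram and every 3-gram covering the character is unseen", and a hand-written homophone table. All function names, the corpus, and the `homophones` mapping are hypothetical.

```python
from collections import Counter

def ngram_counts(corpus, n):
    """Count character-level n-grams over a list of sentences."""
    counts = Counter()
    for sent in corpus:
        for i in range(len(sent) - n + 1):
            counts[sent[i:i + n]] += 1
    return counts

def covering(sentence, i, n):
    """All n-grams of `sentence` that contain position i."""
    lo, hi = max(0, i - n + 1), min(i, len(sentence) - n)
    return [sentence[j:j + n] for j in range(lo, hi + 1)]

def detect(sentence, bi, tri):
    """Flag positions whose covering 2-grams AND 3-grams are all unseen.
    (Assumed combination rule; the paper's actual rule may differ.)"""
    flagged = []
    for i in range(len(sentence)):
        bigrams = covering(sentence, i, 2)
        trigrams = covering(sentence, i, 3)
        if (bigrams and trigrams
                and all(bi[g] == 0 for g in bigrams)
                and all(tri[g] == 0 for g in trigrams)):
            flagged.append(i)
    return flagged

def correct(sentence, flagged, homophones, tri):
    """Replace each flagged character with the homophone candidate that
    maximizes the summed counts of the 3-grams covering that position."""
    chars = list(sentence)
    for i in flagged:
        # Original character first, so ties keep the text unchanged.
        candidates = [chars[i]] + homophones.get(chars[i], [])
        def score(c):
            trial = "".join(chars[:i] + [c] + chars[i + 1:])
            return sum(tri[g] for g in covering(trial, i, 3))
        chars[i] = max(candidates, key=score)
    return "".join(chars)

# Toy training corpus and homophone table (both hypothetical).
corpus = ["今天我在家里吃饭", "他在学校上课", "再见了朋友"]
bi, tri = ngram_counts(corpus, 2), ngram_counts(corpus, 3)
homophones = {"再": ["在"], "在": ["再"]}  # both pronounced "zai"

noisy = "今天我再家里吃饭"  # "在" wrongly replaced by its homophone "再"
flagged = detect(noisy, bi, tri)
fixed = correct(noisy, flagged, homophones, tri)
```

On this toy input, `flagged` contains only the position of the substituted character and `fixed` restores the original sentence. A practical system would need a large corpus, smoothed probabilities rather than raw zero counts, and a full pronunciation dictionary to generate candidates.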
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Bo Zhang, Xiaoxiao Wang, Shaopeng Yuan, Jingyang Wang, and Pengfei Liu "Text local error detection and correction based on N-gram for Chinese homophone replacement", Proc. SPIE 12174, International Conference on Internet of Things and Machine Learning (IoTML 2021), 121741A (22 April 2022); https://doi.org/10.1117/12.2629117
KEYWORDS
Error control coding, Error analysis, Analytical research, Information science, Software development