Paper
10 November 2021 Data cleaning system and method based on IC card data characteristics
Lingfei Zeng, Wei Huang, Jingpeng Wen, Haolong Zhan, Yichao Lin
Proceedings Volume 12050, International Conference on Smart Transportation and City Engineering 2021; 120500R (2021) https://doi.org/10.1117/12.2614328
Event: 2021 International Conference on Smart Transportation and City Engineering, 2021, Chongqing, China
Abstract
Bus IC card data systems often suffer from quality problems, such as irregular or missing time points, caused by slight variations in card patterns and usage across the country or by equipment or transport failures; the average error rate is 1.5%. As the volume of data grows, the time required for a single cleaning pass is becoming prohibitive. This paper therefore proposes a data cleaning system that standardizes the cleaning process and guarantees that cleaning completes within a reasonable time frame. The first data cleaning result is obtained by standardizing and classifying the initial data format; the second is obtained by applying format correction to the first result; and the third is obtained by applying logic correction to the second result. Compared with typical similar-duplicate data cleaning techniques, the method ensures high efficiency and accuracy when bus IC card data are cleaned and maintained on a regular basis, and it allows the source and destination of each dirty record to be located precisely, which has significant practical implications for big data processing.
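The three-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (card ID, line ID, swipe time), the accepted timestamp formats, the duplicate-swipe rule used as the logic check, and all function names are assumptions made for illustration only. Each stage passes its result to the next and records the original row index of every rejected record, so the source of each dirty record can be traced.

```python
from datetime import datetime

# Hypothetical timestamp layouts; the abstract does not list the real ones.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M:%S", "%Y%m%d%H%M%S")


def standardize(raw_rows):
    """Stage 1: split rows into uniform fields and classify incomplete ones."""
    records, dirty = [], []
    for idx, row in enumerate(raw_rows):
        fields = [f.strip() for f in row.split(",")]
        if len(fields) != 3 or not all(fields):
            dirty.append((idx, "stage1: wrong field count or empty field"))
            continue
        card_id, line_id, ts = fields
        records.append({"src": idx, "card_id": card_id,
                        "line_id": line_id, "time": ts})
    return records, dirty


def correct_format(records):
    """Stage 2: normalize every timestamp to a datetime object."""
    clean, dirty = [], []
    for rec in records:
        parsed = None
        for fmt in TIME_FORMATS:
            try:
                parsed = datetime.strptime(rec["time"], fmt)
                break
            except ValueError:
                continue
        if parsed is None:
            dirty.append((rec["src"], "stage2: unparseable time"))
        else:
            clean.append({**rec, "time": parsed})
    return clean, dirty


def correct_logic(records):
    """Stage 3: assumed logic rule, drop repeated (card, line, time) swipes."""
    seen = set()
    clean, dirty = [], []
    for rec in records:
        key = (rec["card_id"], rec["line_id"], rec["time"])
        if key in seen:
            dirty.append((rec["src"], "stage3: duplicate swipe"))
        else:
            seen.add(key)
            clean.append(rec)
    return clean, dirty


def clean_pipeline(raw_rows):
    """Run the three stages in order and merge the dirty-record logs."""
    first, d1 = standardize(raw_rows)
    second, d2 = correct_format(first)
    third, d3 = correct_logic(second)
    return third, d1 + d2 + d3
```

Calling `clean_pipeline` on a list of comma-separated strings returns the cleaned records together with a log of `(row index, reason)` pairs, one per rejected row.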
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lingfei Zeng, Wei Huang, Jingpeng Wen, Haolong Zhan, and Yichao Lin "Data cleaning system and method based on IC card data characteristics", Proc. SPIE 12050, International Conference on Smart Transportation and City Engineering 2021, 120500R (10 November 2021); https://doi.org/10.1117/12.2614328
KEYWORDS
Data analysis

Data conversion

Global Positioning System

Data processing

Data modeling

Logic

Computer security
