Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344201 (2025) https://doi.org/10.1117/12.3059059
This PDF file contains the front matter associated with SPIE Proceedings Volume 13442, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Intelligent Signal Processing and Information Classification
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344202 (2025) https://doi.org/10.1117/12.3053130
In recent years, societal changes have made pets increasingly prominent in people's lives. However, uncontrolled pet reproduction in urban areas has created a significant stray animal problem that poses serious threats to human health and the environment. Conventional manual methods for counting stray animals are inefficient and carry a risk of disease transmission. With technological advances, image recognition, sound identification, and related techniques have emerged as crucial tools for addressing this issue. Image recognition supports intuitive counting based on external features; combined with the low power consumption of sound identification and the health assessment capabilities of thermal imaging, it provides comprehensive technological support for stray animal population statistics. Among image algorithms, traditional target detection algorithms and deep learning methods such as RCNN and Faster RCNN, which employ convolutional neural networks, are used to accurately identify and locate stray animals. Among sound algorithms, traditional Gaussian mixture models and hidden Markov models, as well as deep learning techniques involving convolutional neural networks, have effectively improved the accuracy of stray animal sound recognition. Integrating image and audio in a hybrid method significantly enhances stray animal monitoring. By employing advanced techniques in video tracking and sound recognition, this approach offers an efficient and practical solution that is crucial for wildlife ecosystem surveillance and conservation. Research indicates that deep learning methods in the image and sound domains have advanced significantly compared to traditional approaches. For image processing, we used the YOLO algorithm, with its grid division, feature extraction, and loss computation steps, to detect stray animals, and it demonstrated outstanding performance. Using the GMM algorithm, we identified the vocal characteristics of stray animals and assessed recognition effectiveness by means of likelihood functions. Our objective is to combine image and audio recognition with deep learning techniques to estimate the population of stray animals within specific regions.
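The likelihood-based decision mentioned in this abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance Gaussian mixture model per animal class and picks the class whose model gives the highest total log-likelihood. The function names, the class labels, and the diagonal-covariance choice are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of feature frames under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means and variances: (K, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]                       # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, K)
    log_weighted = np.log(weights) + log_comp
    # log-sum-exp over mixture components for numerical stability
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(per_frame.sum())

def classify(frames, models):
    """Return the label whose GMM assigns the frames the highest likelihood.

    models maps a (hypothetical) label to a (weights, means, variances) tuple.
    """
    return max(models, key=lambda name: gmm_log_likelihood(frames, *models[name]))
```

In practice the frames would be acoustic features such as MFCCs and the model parameters would be fitted with expectation-maximization; both steps are outside this sketch.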
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344203 (2025) https://doi.org/10.1117/12.3053026
Automatic garbage classification offers a great opportunity for urban development, given increasing environmental concerns and the need for effective waste management. Efficient and intelligent garbage sorting technology is appealing because it can readily classify and detect different types of garbage in images or videos to facilitate recycling and proper disposal. This paper proposes a novel multi-layer network called FPAO, based on feature preference and attention optimization, to achieve automatic garbage classification from both images and surveillance videos. It consists mainly of a convolutional layer for feature extraction, a pooling layer for downsampling, a bidirectional long short-term memory (BiLSTM) layer to capture temporal correspondences and preferences, and an attention optimization layer that adaptively assigns weights to important features to improve classification and detection accuracy. Experimental evaluation indicates that the proposed FPAO exhibits superior detection stability and robustness. Qualitative and quantitative comparisons demonstrate that FPAO yields competitive detection results and outperforms existing similar methods.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344204 (2025) https://doi.org/10.1117/12.3052986
By analyzing the acoustic response generated by applying a constant excitation to a closed container, this paper proposes a contactless sensing technique for accurately determining the number of objects inside a non-transparent closed container. Acoustic data with different spectra are collected to construct a dataset by changing the excitation mechanism of the closed cavity. The data are preprocessed through band-pass filtering and the Fast Fourier Transform (FFT). A high-performance classification model is constructed using a Support Vector Machine (SVM) with a linear kernel, resulting in an accuracy of 97% or higher. The method is highly adaptable and widely applicable. It demonstrates stable and precise recognition when combining containers and fillers of different materials and sizes, using diversified excitation modes, and dealing with complex and variable object stacking densities. Additionally, it does not require sophisticated data acquisition hardware, providing a robust technical solution for practical industrial inspection, logistics monitoring, and related fields.
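The preprocessing described in this abstract (band-pass filtering followed by an FFT) can be sketched as below. This is a simplified illustration that applies an ideal band-pass directly in the frequency domain; the function name, the band limits, and the sample rate are assumptions, and the paper's actual filter design is not given in the abstract.

```python
import numpy as np

def band_spectrum(signal, sample_rate, low_hz, high_hz):
    """Return the frequencies and FFT magnitudes that fall inside a pass band.

    This keeps only the bins between low_hz and high_hz, which acts as an
    ideal (brick-wall) band-pass filter applied in the frequency domain.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    return freqs[in_band], np.abs(spectrum[in_band])

# Example: a 440 Hz tone plus 50 Hz hum, one second sampled at 8 kHz.
t = np.arange(8000) / 8000.0
recording = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 50 * t)
freqs, magnitudes = band_spectrum(recording, 8000, 300, 600)
```

The resulting magnitude vector is the kind of feature that could then be passed to a linear-kernel SVM, for example scikit-learn's `SVC(kernel="linear")`.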
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344205 (2025) https://doi.org/10.1117/12.3054421
Using a combination of the TOPSIS model, time series analysis, Lasso regression, and multivariate linear regression, this study aims to quantify and predict momentum disparities among players and to analyze the momentum fluctuation characteristics of players in racket games. Focusing on the 2023 Wimbledon Men's Singles and other events, this study first establishes a momentum assessment model based on match progression, then evaluates the influence of various indicators on the momentum difference, and subsequently uncovers players' momentum fluctuation characteristics during matches. The approach has also been validated in other competitions, where it exhibits good generalizability, and can be applied to matches under different environmental conditions. In summary, this work not only enhances understanding of momentum but also offers a method for quantifying and predicting performance. It offers insights that help players and coaches analyze matches and understand individual strengths and weaknesses.
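For readers unfamiliar with TOPSIS, the standard procedure can be sketched as follows: normalize each criterion, weight it, and rank alternatives by their relative closeness to the ideal solution. The indicator values and weights used for the momentum model are not given in the abstract, so the inputs here are purely hypothetical.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Relative closeness of each alternative (row) to the ideal solution.

    matrix: (alternatives, criteria); weights: (criteria,);
    benefit: boolean per criterion, True if larger values are better.
    Assumes no criterion column is all zeros and alternatives are not identical.
    """
    matrix = np.asarray(matrix, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Vector-normalize each criterion column, then apply normalized weights.
    normalized = matrix / np.linalg.norm(matrix, axis=0)
    weighted = normalized * (weights / weights.sum())
    ideal = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
    anti_ideal = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))
    dist_best = np.linalg.norm(weighted - ideal, axis=1)
    dist_worst = np.linalg.norm(weighted - anti_ideal, axis=1)
    return dist_worst / (dist_best + dist_worst)
```

A score of 1 means the alternative coincides with the ideal solution and 0 means it coincides with the anti-ideal one.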
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344206 (2025) https://doi.org/10.1117/12.3054107
Various everyday applications, including image categorization, natural language comprehension, and voice identification, rely heavily on fully connected and convolutional neural networks. When tackling classification problems, conventional deep learning architectures typically employ softmax activation functions to normalize outputs and minimize the model's cross-entropy loss. In this study, we present a novel malware classification model that combines support vector machines with neural networks. Notably, this model replaces the conventional softmax layer in neural networks with a support vector machine, thereby shifting the learning process towards minimizing margin losses instead of cross-entropy losses. This modification improves malware classification precision relative to both standard machine learning algorithms and several prevalent deep learning models. To validate the proposed model, we employed the Malimg dataset, which comprises malware images derived from binary malware samples. We then trained a DL-SVM model to assign scores to each distinct malware family. Experimental results show that the malware classification model achieved an accuracy of 96.31%, a precision of 97.00%, a recall of 97.35%, and an F1 score of 97.16%.
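The difference between the two training objectives named in this abstract can be made concrete with a small sketch. It compares softmax cross-entropy with a multi-class hinge (margin) loss computed on the same class scores. The Weston-Watkins form of the hinge loss is an assumption made for illustration; the abstract does not say which margin loss the DL-SVM model uses.

```python
import numpy as np

def cross_entropy_loss(scores, labels):
    """Mean softmax cross-entropy for raw class scores of shape (N, C)."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def multiclass_hinge_loss(scores, labels, margin=1.0):
    """Mean multi-class hinge loss (Weston-Watkins form, an assumption here).

    Each wrong class is penalized when its score comes within `margin`
    of the score of the correct class.
    """
    rows = np.arange(len(labels))
    correct = scores[rows, labels][:, None]
    violations = np.maximum(0.0, scores - correct + margin)
    violations[rows, labels] = 0.0  # the correct class never violates itself
    return float(violations.sum(axis=1).mean())
```

Unlike cross-entropy, the hinge loss becomes exactly zero once every wrong class is at least `margin` below the correct one, so confidently classified samples stop contributing to training.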
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344207 (2025) https://doi.org/10.1117/12.3053163
To address the issue of insufficient military equipment sample data, which cannot meet the training requirements of deep neural networks and tends to cause overfitting, this paper introduces transfer learning technology to solve the small-sample classification problem for military equipment. By constructing a multi-type sample training set and fine-tuning the convolutional layers of pre-trained models, specific target classifiers are trained. Practice has proven that the application of transfer learning in small-sample classification tasks saves model training time, resolves issues related to model overfitting and strong dependence on data labels, and effectively improves the accuracy of image classification based on deep learning for small samples of military equipment.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344208 (2025) https://doi.org/10.1117/12.3052988
Biomedical segmentation networks easily suffer from unexpected misclassification between foreground and background objects when learning from limited and imperfect medical datasets. Inspired by the strong power of Out-of-Distribution (OoD) data in other visual tasks, we propose a data-centric framework, Med-OoD, to address this issue by introducing OoD data supervision into fully supervised biomedical segmentation without needing any of the following: (i) external data sources, (ii) feature regularization objectives, (iii) additional annotations. Our method can be seamlessly integrated into segmentation networks without any modification to their architectures. Extensive experiments show that Med-OoD largely prevents various segmentation networks from misclassifying pixels in medical images and achieves considerable performance improvements on the Lizard dataset. We also present an emerging learning paradigm in which a medical segmentation network is trained entirely on OoD data devoid of foreground class labels, which surprisingly yields 76.1% mIoU as the test result. We hope this learning paradigm will prompt people to rethink the roles of OoD data. Code is available at https://github.com/StudioYG/Med-OoD.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344209 (2025) https://doi.org/10.1117/12.3054570
Bird nesting on high-voltage transmission facilities may cause transmission accidents, so power grid companies regularly inspect transmission lines to remove bird nests. Because the environment of transmission facilities is complex, it is difficult to determine from color information alone whether a bird's nest is located on a transmission tower or pole. This paper proposes a method for extracting suspected bird's nest regions on transmission lines by combining depth information and color information. First, the disparity map is computed with the SGBM stereo matching algorithm, the foreground mask is obtained by disparity threshold segmentation, and the mask is processed morphologically. Next, thresholds for each component are set based on the histogram of the average distribution of each HSV color space component over the bird's nest dataset. Suspected regions are then extracted from the foreground based on color features using the minimum bounding rectangle and filtered to obtain the final suspected bird's nest regions. Experiments show that extracting suspected bird's nest regions from the foreground by combining depth and color information is effective against complex backgrounds.
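The combination of depth and color cues described in this abstract can be sketched in a few lines. The sketch assumes the disparity map and the HSV image have already been computed (for instance with OpenCV's `StereoSGBM` matcher and `cvtColor`), skips the morphological processing and final filtering steps, and uses an axis-aligned bounding box as a simplification. All threshold values are hypothetical.

```python
import numpy as np

def suspected_region(disparity, hsv, min_disparity, hsv_low, hsv_high):
    """Combine a disparity threshold with an HSV range to find candidate pixels.

    disparity: (H, W) array; hsv: (H, W, 3) array.
    Returns the combined mask and an axis-aligned bounding box
    (x_min, y_min, x_max, y_max), or None when no pixel qualifies.
    """
    # Foreground: pixels close to the camera have large disparity.
    foreground = disparity > min_disparity
    # Color: every HSV component must lie inside its threshold range.
    low = np.asarray(hsv_low)
    high = np.asarray(hsv_high)
    color = np.all((hsv >= low) & (hsv <= high), axis=2)
    mask = foreground & color
    if not mask.any():
        return mask, None
    rows, cols = np.nonzero(mask)
    box = (int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max()))
    return mask, box
```

Pixels that match the nest color but lie in the background are rejected by the disparity test, which is the point of combining the two cues.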
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420A (2025) https://doi.org/10.1117/12.3054133
As people's awareness of ecological protection increases, bird sound monitoring has received more and more attention. In particular, bird sound monitoring as a branch of audio recognition has become a hot research topic. Since bird sounds are usually collected in natural environments, they contain a lot of noise, which affects the monitoring results. To solve this problem, this paper designs a Convolutional Recurrent Network (CRN) that enhances feature representation along the frequency axis. The method is based on the short-time Fourier transform (STFT) features of sound signals, focuses on complex-valued operation features in the time-frequency domain, and designs a Decode-Encode architecture combined with a time-frequency domain enhancement network to reduce the impact of interfering information. We call this network DFCRN. Experimental results on the public datasets Birdsdata and xeno-canto-ca-nv show that, compared with other denoising models, noisy signals enhanced by DFCRN achieve the best SegSNR and SI-SNR results, and classification accuracy on xeno-canto-ca-nv improves by 5%, verifying the effectiveness and robustness of the method.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420B (2025) https://doi.org/10.1117/12.3053037
This study explores an advanced method for emotion classification using electroencephalogram (EEG) data, leveraging the DEAP dataset. The proposed approach combines the wavelet transform for feature extraction with long short-term memory (LSTM) neural networks for classification. First, the EEG signals were decomposed using the fast discrete wavelet transform (DWT) to extract wavelet coefficients from both low-frequency and high-frequency sub-bands. Key statistical features of these coefficients, including the maximum, minimum, mean, standard deviation, energy, and relative energy, were computed to form comprehensive feature vectors. These feature vectors were then fed into a 7-layer LSTM neural network for training and testing. The LSTM network's ability to handle long-term dependencies in sequential data proved highly effective in this context. The study compared single-channel and multi-channel classification performance experimentally and explored the impact of different feature component combinations on classification outcomes. The experimental results showed that multi-channel combinations significantly improved classification accuracy, with the best accuracy of 95.15% observed in the 8-channel combination scheme. Further analysis revealed that classification performance improves significantly when the feature data contain wavelet components with frequencies between 0 and 8 Hz.
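The feature extraction step described in this abstract can be sketched as follows: decompose a signal into wavelet sub-bands and compute the six statistics per sub-band. The Haar wavelet and the three-level decomposition are assumptions chosen to keep the example dependency-free; the abstract does not state which wavelet or how many levels were used.

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multi-level Haar DWT; returns [approximation, coarsest detail, ..., finest detail]."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        if approx.size % 2:  # pad odd lengths by repeating the last sample
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-frequency half
        approx = (even + odd) / np.sqrt(2.0)         # low-frequency half
    return [approx] + details[::-1]

def subband_features(signal, levels=3):
    """Max, min, mean, std, energy, and relative energy for every sub-band.

    Assumes the signal is not identically zero (total energy must be nonzero).
    """
    bands = haar_dwt(signal, levels)
    energies = np.array([np.sum(band ** 2) for band in bands])
    total_energy = energies.sum()
    features = []
    for band, energy in zip(bands, energies):
        features.extend([band.max(), band.min(), band.mean(), band.std(),
                         energy, energy / total_energy])
    return np.array(features)
```

With three levels there are four sub-bands, so each channel yields a 24-element vector; vectors from several channels could be concatenated before being fed to the classifier.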
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420C (2025) https://doi.org/10.1117/12.3053059
Facial expression recognition (FER) plays a crucial role in computer vision. In the last several years, machine learning-based FER methods have matured. Compared with conventional feature-based FER algorithms, FER algorithms based on machine learning have gained advantages. In this paper, we review the progress of FER algorithms driven by deep learning and compare different network architectures side by side on the CK+ dataset, using VGG16 and other pre-trained deep learning models. We also apply different data augmentation methods and adaptive learning-rate adjustment methods, and introduce attention mechanisms. Extensive experimental data and model evaluation metrics show that the optimized algorithm improves the accuracy of facial expression recognition. Our method and optimizations are effective, and good results are obtained on the CK+ dataset.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420D (2025) https://doi.org/10.1117/12.3054285
Irregular flight recovery has long been a challenging problem in operations research. Here we propose an aircraft recovery method based on the ALNS (adaptive large neighborhood search) algorithm. ALNS makes it possible to search multiple neighborhoods of the current solution, greatly increasing the range of the solution space that the algorithm explores. Multiple operators are used, and the operators for the next iteration are selected based on their historical performance and frequency of use. The idea of Tabu search is also incorporated to avoid repeatedly searching the same neighborhood. The proposed algorithm has achieved good results in practical application.
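The adaptive operator selection described in this abstract can be sketched with the roulette-wheel scheme commonly used in ALNS: operators are drawn with probability proportional to a weight, and each weight is periodically blended with the operator's average score per use. The update formula and the reaction factor are assumptions based on the usual ALNS formulation, not details from the paper.

```python
import random

def select_operator(weights, rng):
    """Roulette-wheel selection: pick an operator index in proportion to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def update_weights(weights, scores, uses, reaction=0.5):
    """Blend each weight with the operator's average score per use.

    scores: total score collected by each operator in the last segment.
    uses: how many times each operator was applied in that segment.
    Operators that were not used keep their previous weight.
    """
    updated = []
    for weight, score, count in zip(weights, scores, uses):
        if count == 0:
            updated.append(weight)
        else:
            updated.append((1.0 - reaction) * weight + reaction * score / count)
    return updated
```

Dividing the score by the number of uses accounts for frequency of use, so an operator is rewarded for how well it performs per application rather than for how often it was chosen.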
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420E (2025) https://doi.org/10.1117/12.3053331
In this paper, I propose a vehicle classification method based on deep learning, which uses an attention-enhanced convolutional neural network (CNN) to achieve high-precision vehicle recognition. The model automatically focuses on key vehicle features, significantly improves classification accuracy, and reduces computing costs through optimization of its structure and training strategy, making it suitable for mobile devices and edge computing. The experimental results show that the model performs well in identifying various types of vehicles, reaching 100% accuracy in one specific category, and demonstrate its application potential in intelligent transportation and vehicle management services.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420F (2025) https://doi.org/10.1117/12.3052931
Considering the ephemeral nature of data resources, this paper explores the impact of data sharing on the product quality and service strategies of manufacturers and retailers under different channel power structures by constructing a differential game model and conducting numerical simulation analysis. It adopts a cost-sharing contract to coordinate the performance of the e-commerce supply chain in order to achieve high-quality data sharing and promote product quality optimization and service innovation. The results show that the overall data sharing level of the supply chain is higher when the retailer has the right to open online channels, which contributes to improved product quality and service level, while the decay of data value has a negative impact. Supply chain performance improves under the cost-sharing contract, which helps to alleviate the double marginalization effect of decentralized decision-making.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420G (2025) https://doi.org/10.1117/12.3053085
Since their introduction, Denoising Diffusion Probabilistic Models (DDPM) have received widespread attention for their exceptional performance in image generation. They generate new samples by simulating the denoising process of data, a method that is not only simple and efficient but also capable of producing highly realistic samples. This paper explores the application of Conditional Denoising Diffusion Probabilistic Models (Conditional DDPM) to the MNIST dataset. MNIST is a classic dataset of handwritten digit images, widely used in computer vision and machine learning. The paper first introduces the basic principles and model structure of Conditional DDPM, then explains in detail how to train and apply it on the MNIST dataset, and analyzes the experimental results. The results show that Conditional DDPM can generate high-quality handwritten digit images that meet specified conditions on the MNIST dataset.
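The forward (noising) half of a DDPM, which produces the training inputs that the denoising network learns to invert, can be sketched as below. The linear noise schedule and its endpoints follow the original DDPM formulation and are assumptions with respect to this paper; the conditional part (feeding the digit label to the denoising network) is not shown.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly over the diffusion steps."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).

    Returns the noised sample and the noise that was added; the noise is the
    regression target when training the denoising network.
    """
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

betas = linear_beta_schedule(1000)
# Cumulative product of (1 - beta): the fraction of signal remaining at each step.
alpha_bar = np.cumprod(1.0 - betas)
```

For MNIST, `x0` would be a 28 x 28 image scaled to a fixed range; by the final step almost no signal remains, so sampling can start from pure Gaussian noise.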
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420H (2025) https://doi.org/10.1117/12.3052944
The trend toward larger ships in the maritime industry has resulted in significant challenges for robots' self-localization capabilities due to occlusion and metal interference in the extensive bottom areas. These limitations hinder the development of higher-level functions such as work path planning and painting. The robustness of SLAM against signal interference makes it well suited for application in ship robots. However, the unclear textures and uneven lighting conditions at the bottoms of large ships cause ORB-SLAM2 to encounter mismatches and localization failures in these low-texture environments. To address these issues, this paper proposes an improved sparse semi-direct method (SemiFeature-Direct SLAM, SFD-SLAM). The feature point extraction method in the front-end perception thread is improved to a direct method, in which the pose is calculated by optimizing the photometric error of image feature points instead of by feature point matching. Additionally, feature matching computations are performed only in the mapping thread. Three TUM datasets, which share similar characteristics with the ship bottom environment, were selected for algorithm validation. The results demonstrate that the proposed algorithm offers higher accuracy than ORB-SLAM2 and achieves a 25%-30% improvement in processing speed.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420I (2025) https://doi.org/10.1117/12.3052928
In order to enhance the interpretability of deep learning models and improve the credibility of model predictions, we propose an interpretability analysis method for flight delay prediction based on KernelSHAP. Delay prediction uses flight and weather data, and the ATMAP algorithm is used to generate weather condition scores that are strongly correlated with flight delay conditions to enrich features. The deep learning model NR-DenseNet is selected for delay prediction. KernelSHAP is combined to analyze the input features and the decision-making process of the model from two perspectives: feature analysis of overall samples and feature analysis of single sample. The results show that the addition of KernelSHAP enhances the interpretability of the model, effectively breaks the black box characteristics of the model, and can provide professionals with more reliable decision-making guidance.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420J (2025) https://doi.org/10.1117/12.3053070
The MVDR (Minimum Variance Distortionless Response) algorithm is a classic Wiener filtering method used for beamforming in array signal processing. In a one-dimensional linear array high-frequency ground wave radar system, it can be employed to suppress various types of ionospheric clutter. A key step in suppression is the use of maximum likelihood estimation (MLE) to estimate the ionospheric clutter covariance matrix. However, MLE typically assumes that samples are independent and identically distributed (i.i.d.). Traditional MVDR algorithms estimate the clutter covariance matrix using all samples, which may not satisfy the i.i.d. condition. Therefore, this paper proposes two new sample selection strategies for choosing i.i.d. samples. One strategy uses the KL (Kullback-Leibler) divergence method from information geometry, while the other employs the weighted correlation coefficient method. Simulation results demonstrate that both new algorithms effectively suppress ionospheric clutter.
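The baseline that this abstract builds on can be sketched as follows: estimate the covariance matrix from all snapshots (the maximum likelihood estimate for zero-mean Gaussian data) and form the MVDR weights from it. The uniform linear array with half-wavelength spacing and the small diagonal loading term are assumptions added for illustration, and the paper's two sample selection strategies are not implemented here.

```python
import numpy as np

def steering_vector(num_sensors, angle_deg, spacing_wavelengths=0.5):
    """Steering vector of a uniform linear array for a plane wave from angle_deg."""
    n = np.arange(num_sensors)
    phase = 2.0 * np.pi * spacing_wavelengths * n * np.sin(np.deg2rad(angle_deg))
    return np.exp(1j * phase)

def sample_covariance(snapshots):
    """Sample covariance of zero-mean snapshots with shape (sensors, snapshots)."""
    return snapshots @ snapshots.conj().T / snapshots.shape[1]

def mvdr_weights(covariance, steering, loading=1e-6):
    """MVDR weights w = R^{-1} a / (a^H R^{-1} a).

    A small diagonal loading (an assumption, relative to the mean eigenvalue)
    keeps the linear solve well conditioned.
    """
    n = covariance.shape[0]
    loaded = covariance + loading * np.trace(covariance).real / n * np.eye(n)
    r_inv_a = np.linalg.solve(loaded, steering)
    return r_inv_a / (steering.conj() @ r_inv_a)
```

The weights pass the look direction with unit gain (the distortionless constraint) while minimizing output power, which places nulls on strong interference that is present in the covariance estimate.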
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420K (2025) https://doi.org/10.1117/12.3052982
Because stitching quality and stitching time constrain each other, video stitching algorithms cannot achieve fast and high-quality stitching at the same time. To solve this problem, this paper proposes a new fast video stitching algorithm based on adaptive key frame extraction, which makes full use of the information redundancy in video sequences. First, two methods, one based on a fixed frame interval and one based on global information, are combined to extract video key frames and divide frames by similarity. The average optical flow of all past frames is used as the similarity threshold to achieve adaptive inter-frame similarity division. Next, sparse optical flow estimation based on brightness compensation is used to track the feature point pairs of the inter-frame sequence, and a look-up table of feature point pairs is established. Finally, the matching point pairs of the past frame are propagated directly to the current frame using feature optical flow inheritance, and the frames are stitched together. Experiments show that the proposed algorithm reduces time consumption by 31.8%, 24.4%, and 24.1% compared with OpenCV in three different scenarios, and improves PSNR by up to 13.27 compared with PTGUI in terms of stitching performance. The algorithm therefore achieves fast and high-quality video stitching and is robust and stable under environmental and illumination changes.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420L (2025) https://doi.org/10.1117/12.3052938
Although deep learning algorithms have achieved good results in the restoration of ancient murals, they do not consider the interaction between image texture information and structural information. As a result, global information is restored ineffectively, and the output is prone to problems such as discontinuous structures and blurry textures. To address these issues, this paper proposes an ancient mural restoration algorithm based on structure and texture-guided dual-stream generative adversarial networks. Between encoding and decoding, an improved aggregated contextual-transformation (IAOT) module is proposed to enhance the capture of distant features and rich structural details. Experimental results show that the proposed method outperforms comparative algorithms in restoring image texture details and structures, and performs better on evaluation metrics.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420M (2025) https://doi.org/10.1117/12.3053112
This paper presents a novel approach for dense scene text detection called DSSNet (Dense Script Spotter Network). The network leverages ResNet and FPN for feature extraction, employing multi-scale feature fusion and Transformer-based feature processing to enhance text recognition across varying sizes. The method generates text instance shapes using Bézier central curves and performs text recognition by integrating positional query information. Experimental results on the DSTD1500 and ICDAR2015 datasets demonstrate that DSSNet outperforms existing methods in terms of text localization accuracy, recognition accuracy, and annotation flexibility.
Yi Wei, Zhigang Lv, Liangliang Li, Peng Wang, Yuntao Xu
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420N (2025) https://doi.org/10.1117/12.3054422
With the wide application of pressure vessels, their safety has attracted more and more attention. X-ray imaging is now often used to inspect the weld area of pressure vessels. However, the images scanned with existing technology contain redundant content, and the weld area cannot be obtained directly. Therefore, this paper proposes a clipping-and-mapping weld extraction method based on the image characteristics of the data sets and a Gaussian function algorithm. The method has good adaptability and stability, and lays a solid foundation for the subsequent detection and processing of weld defects.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420O (2025) https://doi.org/10.1117/12.3053319
The micro-Doppler effect captures the fine motion characteristics of maritime targets, serving as a crucial feature for distinguishing between sea clutter and targets, thereby enhancing radar target detection and recognition capabilities. In this study, a long-term radar echo model for micro-motion targets in sea clutter is established using maritime surveillance radar. By analyzing the morphological differences between constant-velocity targets and micro-motion clutter in Short-Time Fourier Transform (STFT) spectrograms, a sliding window cancellation method based on STFT spectrograms is proposed to remove micro-motion clutter. This method effectively separates target echoes from micro-motion clutter, facilitating the extraction of micro-motion target features. Both simulation and experimental results validate the effectiveness of the proposed method, laying a theoretical foundation for advancing maritime vessel detection and recognition.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420P (2025) https://doi.org/10.1117/12.3052950
In the field of computer science, most of the research on multi-dimensional time series focuses on anomaly detection and alarm prevention, and the related analysis and research are insufficient for further fault diagnosis. Therefore, this paper proposes an anomaly detection model LSTM-CNN-AE combining LSTM and CNN and a fault feature matching method based on feature similarity measurement. Firstly, the temporal and spatial features of MTS were learned by LSTM and 1D-CNN respectively, and the reconstructed samples were obtained by using the decoder and the reconstruction error was calculated, and the error was used as the anomaly score for anomaly detection. The similarity between the spatial-temporal feature of abnormal data and the feature data in the fault knowledge base was measured to realize the fault diagnosis of abnormal data. The experimental results on the monitoring data set of an operating system show that the algorithm can effectively diagnose the fault of multi-dimensional time series.
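The two stages described above, reconstruction-error scoring followed by similarity matching against a fault knowledge base, can be sketched as follows. This is only an illustration under stated assumptions: the autoencoder itself is omitted, mean squared error stands in for the reconstruction error, cosine similarity stands in for the unspecified similarity measure, and the names `anomaly_scores` and `match_fault` are hypothetical.

```python
import numpy as np

def anomaly_scores(windows, reconstructions):
    """Mean squared reconstruction error per window, used as the anomaly score."""
    windows = np.asarray(windows, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    err = (windows - reconstructions) ** 2
    # Flatten every window (time x variables) and average its squared error.
    return err.reshape(err.shape[0], -1).mean(axis=1)

def match_fault(feature, knowledge_base):
    """Return (label, similarity) of the knowledge-base entry closest to `feature`.

    Cosine similarity is an assumption; the abstract does not name the measure.
    """
    feature = np.asarray(feature, dtype=float)
    best_label, best_sim = None, -np.inf
    for label, ref in knowledge_base.items():
        ref = np.asarray(ref, dtype=float)
        denom = np.linalg.norm(feature) * np.linalg.norm(ref)
        sim = float(feature @ ref / denom) if denom > 0 else 0.0
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim
```

In use, windows whose score exceeds a chosen threshold would be flagged as anomalous, and only their learned features would be passed to `match_fault`.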
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420Q (2025) https://doi.org/10.1117/12.3053052
The technique of embedding the owner’s valid copyright information in an audio signal is known as digital audio watermarking. Research in this field has primarily focused on the trade-off between imperceptibility, payload, and robustness. Traditionally, audio watermarking algorithms have been implemented using signal processing techniques, such as Least Significant Bit (LSB) Coding. In this study, we propose a deep learning-based approach for embedding image watermarking information into audio clips of finite length. We utilize Short-Time Fourier Transform (STFT) and Inverse Short-Time Fourier Transform (ISTFT) as differentiable layers within the network. Additionally, we design a multi-band masking model that integrates a psychoacoustic model-based masking threshold to achieve covert watermarking. To enhance the algorithm’s resilience against various attacks, we incorporate multiple differentiable distortions during the training process to simulate realistic attack scenarios. We evaluate the model’s performance through experimental analysis under both attack-free conditions and various channel distortions. The experimental results demonstrate that our method achieves a Signal-to-Distortion Ratio (SDR) of up to 44 dB and an extremely low Bit Error Rate (BER). Comprehensive comparative experiments and ablation studies further validate the effectiveness of our proposed method.
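The two figures of merit quoted in this abstract can be computed as in the sketch below. It assumes the common definitions: SDR as the ratio of host-signal energy to the energy of the embedding distortion, and BER as the fraction of watermark bits decoded incorrectly. The function names are hypothetical, and the paper may use a variant of either definition.

```python
import numpy as np

def sdr_db(original, watermarked):
    """Signal-to-Distortion Ratio in dB between host audio and watermarked audio."""
    original = np.asarray(original, dtype=float)
    watermarked = np.asarray(watermarked, dtype=float)
    distortion = watermarked - original
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(distortion ** 2))

def bit_error_rate(sent_bits, decoded_bits):
    """Fraction of watermark bits that differ after extraction."""
    sent_bits = np.asarray(sent_bits).astype(bool)
    decoded_bits = np.asarray(decoded_bits).astype(bool)
    return float(np.mean(sent_bits != decoded_bits))
```

A higher SDR indicates a less perceptible watermark, while a lower BER indicates more reliable extraction, which is the imperceptibility versus robustness trade-off the abstract refers to.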
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420R (2025) https://doi.org/10.1117/12.3054457
An efficient foggy weather vehicle recognition framework is proposed to address the challenges of vehicle identification under adverse weather conditions in this study. Enhanced high-resolution data is used as input, and an image restoration model is employed to effectively capture multi-scale feature information. Additionally, an image dehazing model is incorporated to restore clear vehicle features from foggy and blurred images, thereby improving the accuracy of vehicle detection in foggy conditions. In the experimental analysis, visualized results are first presented, and the model's performance is analyzed by comparing it with the detection results of other algorithms. The data is further analyzed, with several dehazing algorithms integrated into the vehicle detection model for ablation experiments to validate the model's effectiveness.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420S (2025) https://doi.org/10.1117/12.3053023
Perception based on the bird's-eye view has become the mainstream approach in autonomous driving. It achieves comprehensive perception of the vehicle's surroundings by fusing multiple sensors at the feature level. However, existing multi-modal fusion perception methods based on the bird's-eye view usually require extremely high computing resources, especially when converting multi-camera view images. In addition, the key to multi-modal bird's-eye view perception lies in how to fuse point cloud features and image features efficiently. To address these shortcomings, this paper proposes a novel multi-modal bird's-eye view perception algorithm. First, it proposes an index lookup calculation method for converting multi-view image features to the bird's-eye view perspective. This method greatly reduces the consumption of computing resources with almost no loss of information. Second, it proposes a feature fusion method that uses a cross-modal attention mechanism to enhance the interaction between features of different modalities and to realize dynamic spatiotemporal alignment and fusion. Experimental results show that the proposed method can effectively perceive the environment and can be deployed on a real vehicle platform for real-time detection.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420T (2025) https://doi.org/10.1117/12.3054441
With the cross development of computational neuroscience and artificial intelligence, research on human brain neural network simulation and signal processing technology has become a hot topic of common concern in both academia and industry. This study used a multi-level and multi-scale computational model and high-precision algorithm to simulate the dynamic behavior of human brain neural networks, exploring the complex interactions between neurons and their impact on information processing capabilities. Real time dynamic simulation of billions of neurons was achieved on a simulation platform using ultra large-scale integrated circuit chips, simulating signal processing patterns similar to those of the human brain. By using an improved backpropagation algorithm to decode and reconstruct neural signals, the efficiency of the algorithm and the accuracy of signal processing have been improved. On this basis, combined with experimental data obtained from functional magnetic resonance imaging, the similarities and differences between neural network simulation results and actual human brain activity were compared and analyzed, revealing the potential connection between cognitive function and brain network activity patterns. This study not only achieved new breakthroughs in simulation technology and signal processing algorithms, but also provided a new quantitative tool and theoretical support for related neuroscience research, which is of great significance for the development of brain computer interfaces and intelligent information processing systems. In addition, the study also delved into the balance between ensuring model complexity and processing efficiency, as well as the challenges and opportunities brought by interdisciplinary collaboration in the field of neuroscience. Through cross validation and error analysis of simulation experiments, the effectiveness of the model and the accuracy of prediction results were ensured. 
Based on this, feasible suggestions for optimizing neural network structures and signal decoding strategies have been extracted, providing possible new avenues for future research directions.
Junzhuang Zhang, Zhidan Yan, Tingting Song, Zuodan Wang
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420U (2025) https://doi.org/10.1117/12.3054410
The continuous wave mud channel is an extremely complex channel, and signals transmitted through it suffer from effects such as intersymbol interference, which lowers the accuracy of information transmission. Channel equalizers can effectively mitigate intersymbol interference. Compared with traditional channel equalizers, neural network equalizers achieve a better equalization effect, but the NARX neural network converges slowly and tends to fall into local optima. To solve these problems, a method that optimizes the NARX neural network with Particle Swarm Optimization (PSO) and a Genetic Algorithm (GA) is proposed. First, the initial weights of the NARX neural network are optimized using PSO to improve its convergence speed. An improved GA is then used to determine the topology of the NARX network when each signal frame arrives. The equalization results are compared with those of BP, PSO-BP and PSO-NARX equalizers. The final results show that the GAPSO-NARX equalizer has a lower BER, which effectively improves the accuracy of information transmission in continuous wave mud channels.
Mobile Communication Technology and Power Grid Modeling
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420V (2025) https://doi.org/10.1117/12.3053119
Against the backdrop of rapid progress in big data and artificial intelligence, advances in mobile communication have accelerated the development of 5G technology. 5G has comprehensively affected people's production and daily life, and has brought opportunities and challenges to intelligent network systems. From the perspective of 5G mobile communication network architecture and key technologies, this paper presents the core features and advantages of 5G through its upgraded network architecture and its updated key technologies, and looks ahead to the future development trends of 5G.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420W (2025) https://doi.org/10.1117/12.3054153
An anti-jamming algorithm is proposed in this paper based on frequency-polarization-time domain to address the challenge of detecting target signal amidst multi-source high-power suppression jamming signals for monopulse radar systems. Initially, a received signal model for monopulse radar under multi-source noise frequency modulation jamming is established, incorporating the target signal, jamming and noise components. Subsequently, by evaluating the peak polarization degree of received signals in the frequency domain within the clutter gate, the operational bands of jamming signals are identified. The polarization state of each jamming source then can be determined. Furthermore, jamming suppression of the corresponding polarization state within the target band is achieved through virtual polarization techniques. Finally, interference beyond the target band is mitigated using a combination of a wideband filter, a limiter and a narrowband filter. Simulation results demonstrate that the proposed algorithm can achieve an approximate Signal-to-Interference-plus-Noise Ratio (SINR) of 6dB through coherent accumulation over 16 coherent processing intervals, even in the presence of 50dB Jamming-to-Signal Ratio (JSR) for each of the five jammers.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420X (2025) https://doi.org/10.1117/12.3052947
A frequency diverse array (FDA) adds a tiny frequency offset, much smaller than the carrier frequency, between the signals transmitted by adjacent antenna elements. This introduces an additional degree of freedom in the range dimension, resulting in a range-angle dependent beam pattern. As a result, the array's beam can be directed to different ranges at the same pointing angle. Multiple-input multiple-output (MIMO) radar transmits multiple orthogonal signals from the transmitter; at the receiver, the received signals are separated by matched filters, forming a virtual aperture that allows high-precision detection with fewer, sparsely placed array elements. Frequency diverse MIMO (FDA-MIMO) radar applies FDA to MIMO radar and combines the advantages of both types of radar. Based on the characteristics of FDA and MIMO radar, this paper designs orthogonal signals with good orthogonality and pulse compression performance for FDA-MIMO radar. Numerical simulations are conducted to demonstrate the target detection and anti-interference capabilities of FDA-MIMO radar. A detailed experimental comparison is left for further research.
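The range-angle dependence described in this abstract can be made concrete with a small numerical sketch. It uses the standard narrowband approximation for a uniform linear FDA at time t = 0, in which element n contributes the phase 2πn(f0·d·sinθ/c − Δf·r/c). The function name and all default values (8 elements, 10 GHz carrier, 3 kHz offset, half-wavelength spacing) are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def fda_array_factor(theta_rad, range_m, n_elements=8, f0=10e9,
                     delta_f=3e3, spacing=None):
    """Magnitude of the FDA array factor at angle theta and range r (t = 0)."""
    if spacing is None:
        spacing = C / f0 / 2.0  # half-wavelength element spacing
    n = np.arange(n_elements)
    # The angle term is the usual phased-array phase; the range term exists
    # only because each element is offset in frequency by n * delta_f.
    phase = 2.0 * np.pi * n * (f0 * spacing * np.sin(theta_rad) / C
                               - delta_f * range_m / C)
    return float(np.abs(np.sum(np.exp(1j * phase))))
```

With `delta_f=0` the range term vanishes and the function reduces to a conventional phased-array pattern. With a non-zero offset, the response at broadside is maximal at r = 0 and falls to a null at r = c/(2Δf), which illustrates how the beam varies with range at a fixed pointing angle.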
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420Y (2025) https://doi.org/10.1117/12.3053091
To segment and extract the main parts of substation equipment, we use Fast Point Feature Histograms (FPFH) and Locally Convex Connected Patches (LCCP) to obtain integrated geometric features for each voxel. We then aggregate these features with those of each voxel's K nearest neighbors to build multi-level voxel features in a bottom-up hierarchy, and achieve pre-segmentation of shapes with the flow-constrained super-voxel clustering algorithm. After pre-segmentation, we conduct shape analysis to extract semantically meaningful instances of equipment components, achieving part-level instance extraction from point cloud data based on the geometric features of man-made equipment. The method requires no training data or manual annotations and is simple to implement. It can merge patches across surface singularities. It needs only a few parameters and can achieve automatic 3D instance extraction from point clouds of different scenes with the same or similar parameters.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420Z (2025) https://doi.org/10.1117/12.3053044
This study addresses the technical challenge of maintaining coordinate consistency within the belt benchmark station networks of high-speed railways, focusing on a specific high-speed rail line. The research involves an assessment of the current network status and the collection of data from selected stations in Shandong and Jiangsu provinces. To enhance the stability and reliability of the coordinate system, three optimization schemes have been proposed: calculations based on IGS stations, linear starting points, and peripheral national and provincial stations. Strict data processing techniques have been implemented, including synchronous loop Nrms statistics and baseline accuracy assessments, to ensure high standards and quality of baseline calculations. Simulation results indicate that all schemes meet technical specifications, ensuring the precision and reliability of coordinate calculations. The incorporation of peripheral stations significantly enhances the overall control of the network structure, effectively reducing systemic biases. The optimization schemes proposed in this study not only improve the consistency of the railway coordinate benchmarks and achieve higher positioning accuracy, but also serve as a reference for the design and data management of similar high-speed railway projects.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344210 (2025) https://doi.org/10.1117/12.3052921
Traditional intelligent algorithms are prone to premature convergence during optimization. To avoid this problem, and to let particles search the entire population while searching for their individual optimum, tabu search is added to the traditional genetic optimization algorithm. The improved hybrid algorithm is applied to an IEEE 33-node ship distribution network with distributed power sources. Because the ship distribution network has a radial structure, the forward-backward sweep method is used to calculate the power flow. Two faults are set to verify that the algorithm reconstructs the network correctly and that the distribution network containing distributed power sources can be divided into islanded systems that can be powered separately. The improved algorithm avoids problems of the traditional genetic algorithm such as falling into local optima and poor population diversity. The reconstruction scheme given by the algorithm is verified to be feasible, and power quality is maintained while the power supply is restored, which demonstrates the feasibility of the algorithm.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344211 (2025) https://doi.org/10.1117/12.3054420
Electrical fire monitoring is mostly configured in a one-way mode. This can accomplish the expected monitoring task, but it lacks stability and reliability, makes coordinated and location-aware monitoring difficult across different communication environments, and offers low controllability. Therefore, an electrical fire monitoring method based on LoRa wireless communication technology is designed and verified. A fire monitoring model based on LoRa wireless communication is constructed. The analysis finds that LoRa wireless communication offers long transmission distance, good wall penetration and low power consumption, so it can be widely used in fire supervision systems for building electrification.
Aidong Xu, Jinran Du, Tao Dai, Peiming Xu, Zhuowei Wang, Chong Chen, Dong Mao
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344212 (2025) https://doi.org/10.1117/12.3053138
In this study, a method based on Differential Privacy Federated Learning (DPFL) is adopted, which is suitable for safe and efficient processing of power grid data. Combined with a new algorithm called DP-FedSAM, which combines the sharpness-aware minimization (SAM) optimizer and differential privacy technology, it aims to solve the problems of privacy protection and model performance degradation in power grid data analysis. DP-FedSAM avoids the need for centralized data storage and processing by performing model training locally on each power grid operating node, thereby reducing the risk of data leakage. Each node uses a SAM optimizer to locally generate a smoother, more generalizable model and adds noise via a differential privacy mechanism to ensure secure sharing of updates. This method pays special attention to processing the non-independent and identically distributed (Non-IID) characteristics of power grid data, by testing the robustness and effectiveness of the algorithm in different regions and conditions. Preliminary experimental results show that DP-FedSAM performs well on a variety of power grid data sets, effectively improving the model's generalization ability and prediction accuracy, while strictly complying with privacy protection standards.
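The two ingredients named above, a sharpness-aware local step and a differentially private update, can be sketched generically. This is not the DP-FedSAM algorithm as specified by its authors: it shows a textbook SAM step and the usual clip-then-add-Gaussian-noise mechanism, and the function names, the `grad_fn` callback, and all default hyperparameters are hypothetical.

```python
import numpy as np

def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware minimization step.

    The gradient is evaluated at a point perturbed by `rho` along the
    normalized gradient direction, then applied to the original weights.
    """
    grad = grad_fn(weights)
    norm = np.linalg.norm(grad)
    perturbed = weights + rho * grad / norm if norm > 0 else weights
    return weights - lr * grad_fn(perturbed)

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip a local update to `clip_norm` and add Gaussian noise before sharing."""
    rng = np.random.default_rng() if rng is None else rng
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / norm) if norm > 0 else update
    # Noise scale is proportional to the clipping bound (the update's sensitivity).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise
```

In a federated round, each node would run several `sam_step` iterations on its local data, compute the difference from the global weights, and send only the output of `privatize_update` to the server, so raw grid data never leaves the node.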
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344213 (2025) https://doi.org/10.1117/12.3052973
Focused on the issues of blurring effect and spectral distortion in current pansharpening approaches, we propose a multiscale pansharpening method based on frequency feature guidance. Firstly, we extract frequency features using learnable Discrete Wavelet Transform Layers (DWTL), select key frequency features for fusion, and then use Inverse Discrete Wavelet Transform Layers (IDWTL) to transform the fused features back to the spatial domain to guide image reconstruction. Secondly, we employ a multi-scale progressive strategy to reconstruct the fused image, effectively leveraging the spectral and spatial features of the source images at different scales. Additionally, we implement a multiscale reconstruction loss constraint during network training to further enhance fusion accuracy. The superiority of our method is shown by testing results on two datasets, both at reduced and full resolution.
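The decompose, fuse and invert pattern described above can be illustrated with a fixed one-level 2D Haar transform. This is only a stand-in: the paper's DWTL and IDWTL layers are learnable, whereas the filters here are fixed, and the function names and sub-band names are hypothetical. The input is assumed to have even height and width.

```python
import numpy as np

def haar_dwt2(image):
    """One-level 2D Haar transform: returns (approx, col_detail, row_detail, diag_detail)."""
    x = np.asarray(image, dtype=float)
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0        # low-frequency content
    col_detail = (a - b + c - d) / 2.0    # differences across columns
    row_detail = (a + b - c - d) / 2.0    # differences across rows
    diag_detail = (a - b - c + d) / 2.0   # diagonal differences
    return approx, col_detail, row_detail, diag_detail

def haar_idwt2(approx, col_detail, row_detail, diag_detail):
    """Inverse of haar_dwt2: rebuilds the full-resolution image."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (approx + col_detail + row_detail + diag_detail) / 2.0
    x[0::2, 1::2] = (approx - col_detail + row_detail - diag_detail) / 2.0
    x[1::2, 0::2] = (approx + col_detail - row_detail - diag_detail) / 2.0
    x[1::2, 1::2] = (approx - col_detail - row_detail + diag_detail) / 2.0
    return x
```

A frequency-guided fusion in this spirit might, for example, take the approximation band from the multispectral image and the detail bands from the panchromatic image before applying the inverse transform. That particular selection rule is an illustration rather than the method of the paper.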
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344214 (2025) https://doi.org/10.1117/12.3053607
In order to solve the problem of active sonar pulse interference existing in ship radiated noise signal, this paper uses Variational Mode Decomposition to decompose the ship radiated noise signal containing pulse interference into multiple Intrinsic Mode Functions. The interference modes are screened out by the relative maximum method, and the effect of the algorithm is evaluated by mean square error and correlation coefficient. The simulation results show that the proposed method can effectively suppress the active sonar pulse interference in ship radiated noise and has a certain robustness. The results of this paper can be used for data preprocessing before underwater acoustic target recognition, and have a certain role in improving the effectiveness of underwater acoustic target recognition.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344215 (2025) https://doi.org/10.1117/12.3052934
The objective of this research is to enhance sound field rendering efficiency using a heterogeneous computing framework. By integrating GPU and CPU resources, we address performance challenges in complex environments and large-scale ray tasks. Evaluations of the NVIDIA RTX A6000 GPU and AMD 9754 CPU led to a ray task allocation mechanism that optimizes resource use and maximizes computational efficiency. Experimental results show that this strategy significantly accelerates sound field rendering, running nearly 400 times faster than traditional single-core CPU computation.
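The abstract does not describe the allocation mechanism itself. One simple possibility, shown here purely as an assumed illustration, is to split the rays in proportion to each device's measured throughput so that all devices finish at about the same time. The function `split_rays` and the throughput figures in the test are hypothetical.

```python
def split_rays(total_rays, throughputs):
    """Split a ray budget across devices in proportion to throughput.

    `throughputs` holds one measured rays-per-second figure per device
    (an assumed input; the paper's actual mechanism is not given).
    """
    total_rate = sum(throughputs)
    shares = [int(total_rays * rate / total_rate) for rate in throughputs]
    # Give any rounding remainder to the fastest device.
    fastest = throughputs.index(max(throughputs))
    shares[fastest] += total_rays - sum(shares)
    return shares
```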
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344216 (2025) https://doi.org/10.1117/12.3054391
An underdetermined direction-of-arrival (DOA) estimation method is proposed for wideband non-stationary signals based on atomic norm minimization. In scenarios where signals exhibit completely overlapped time-frequency distributions (TFDs), the spatial time-frequency distribution (STFD) matrix becomes rank deficient, making it impossible to separate signals based on their time-frequency characteristics. The proposed method uses multiple-snapshot atomic norm minimization to obtain an ideal average STFD matrix, enabling underdetermined DOA estimation of wideband non-stationary signals with completely overlapped TFDs, without aperture loss or the need for arrays to possess spatially translation-invariant structures. Moreover, the proposed method interpolates the sparse array into a virtual uniform linear array to increase the degrees of freedom available for array processing. Simulation results illustrate the efficacy of the proposed method.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344217 (2025) https://doi.org/10.1117/12.3054387
An underdetermined direction-of-arrival (DOA) estimation method is proposed for wideband non-stationary signals with completely overlapped time-frequency distributions (TFDs). In this case, signals cannot be separated based solely on their time-frequency characteristics because the spatial time-frequency distribution (STFD) matrix is rank deficient. The proposed method uses a moving array to recover the rank of the STFD matrix. Meanwhile, it vectorizes the matrix to construct a difference virtual co-array and employs matrix rank minimization to fill the holes of the virtual array, thereby increasing the degrees of freedom available for array processing. This method enables underdetermined DOA estimation of wideband non-stationary signals with completely overlapped TFDs, without suffering from aperture loss or requiring arrays with spatially translation-invariant structures. Simulation results demonstrate the effectiveness of the proposed method.
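The notion of "holes" in a virtual co-array can be made concrete with a small sketch. It assumes integer sensor positions in units of the base spacing and computes the difference co-array of a physical array; the moving-array and rank-minimization steps of the paper are not reproduced, and the function names are hypothetical.

```python
def difference_coarray(positions):
    """All pairwise position differences (virtual sensor lags), sorted."""
    return sorted({p - q for p in positions for q in positions})

def coarray_holes(positions):
    """Integer lags inside the co-array aperture that no sensor pair produces."""
    lags = set(difference_coarray(positions))
    max_lag = max(lags)
    return [lag for lag in range(-max_lag, max_lag + 1) if lag not in lags]

# Example: a coprime pair (3, 5) gives sub-arrays {0, 3, 6, 9, 12} and {0, 5, 10}.
coprime_positions = [0, 3, 5, 6, 9, 10, 12]
print(coarray_holes(coprime_positions))  # lags +/-8 and +/-11 are missing
```

Filling such missing lags is what lets the virtual array be treated as a longer uniform linear array.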
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344218 (2025) https://doi.org/10.1117/12.3054298
A sonar signal processing simulator simulates the ship-radiated noise received by hydrophones in different marine environments. Using such a simulator greatly reduces the number of experiments needed on lakes or at sea and shortens the development cycle of sonar devices. Considering the falling trend of the continuous spectrum of ship-radiated noise and ocean ambient noise, this article proposes a towed linear array sonar signal processing simulator that accurately computes the continuous spectrum level at different frequencies and simulates the variation of amplitude with frequency. The accuracy of the simulated hydrophone data is verified through numerical simulation. Because its parameters are configurable, the simulator is flexible and can give analog output of underwater sensor data for towed linear array sonars with different design frequencies or numbers of hydrophones. The simulator can be applied to examining the performance of sonar systems, which is valuable in engineering applications.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344219 (2025) https://doi.org/10.1117/12.3054287
The taming system is widely used in the time and frequency industry. As a high-efficiency frequency synchronization method, it can achieve precise control of the frequency source and effectively suppress phase noise without deteriorating the stability of the frequency source itself. This paper focuses on the adaptive taming technology of the frequency source, especially the adaptive parameter optimization based on the phase negative feedback mechanism of the phase-locked loop (PLL). Through the natural configuration of the phase-locked parameters, the taming system can avoid the generation of additional phase noise and the deterioration of short-term frequency stability while maintaining the inherent stability of the frequency source. The research results show that the taming system designed using digital phase-locked technology shows a significant improvement in long-term frequency stability, and the jitter of the PPS (Pulse Per Second) output signal is also greatly suppressed, thereby significantly improving the overall performance and reliability of the system. These research results provide theoretical and practical support for high-precision applications in the field of time and frequency.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421A (2025) https://doi.org/10.1117/12.3054308
In this paper, we propose a multi-dimensional feature aggregation channel estimation network (FACENet) based on self-attention to improve pilot-based channel estimation in orthogonal frequency division multiplexing (OFDM) systems. This network aggregates spatial and channel features of the input data by alternately employing spatial self-attention and channel self-attention, then processes these features through a deep residual network. Given the strong time and frequency correlation of the channel, the spatial self-attention extracts and integrates spatial features from both time and frequency direction. Additionally, to support flexible pilot patterns and maximize the utilization of pilot signals, we propose an interpolation scheme based on data-pilot aided (DPA) estimation, with the interpolation results serving as input to FACENet. Simulation results show that FACENet outperforms other comparative methods across different modulation schemes and pilot numbers. Furthermore, FACENet has been shown to exhibit good robustness, making it applicable to receivers with varying speeds and other channel scenarios.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421B (2025) https://doi.org/10.1117/12.3052930
Coprime arrays have attracted interest in the field of MIMO radar because of their enlarged array aperture and reduced mutual coupling, but the holes in their coarrays significantly reduce the number of degrees of freedom (DOFs) and degrade angle estimation performance. In this paper, a coarray tensor completion method is presented for bistatic coprime MIMO radar to improve the angle estimation performance. Specifically, we first apply coarray interpolation to the holes in the sum-difference coarray (SDCA) generated by the bistatic coprime MIMO radar. Then, we introduce transmission and reception spatial smoothing to construct an interpolated coarray tensor, and use nuclear norm minimization to complete the tensor and increase the number of uDOFs. Subsequently, the joint direction of departure (DOD) and direction of arrival (DOA) estimates are obtained by three-way tensor decomposition. Finally, numerical simulations verify the superiority of the proposed method in estimation accuracy.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421C (2025) https://doi.org/10.1117/12.3052925
Accurate and robust localization is crucial for indoor mobile robots, where traditional vision-based systems often struggle with precision and reliability. This paper presents a SLAM/UWB fusion localization algorithm designed to overcome these challenges. The method builds on the ORB-SLAM3 stereo-inertial framework, integrates Ultra-Wideband (UWB) positioning data using an Extended Kalman Filter (EKF), and introduces time and distance thresholds to filter outliers and improve data accuracy. Experimental results show that this approach outperforms ORB-SLAM3 stereo visual-inertial localization, achieving superior positioning accuracy and enhanced robustness on both the EuRoC dataset and in real-world indoor environments.
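The role of the time and distance thresholds can be sketched as a gate placed in front of a Kalman measurement update. This is a simplified assumption, not the paper's filter: the state is a 2-D position only, the UWB fix is treated as a direct position measurement (so the update is linear), and the threshold values `max_dt` and `max_dist` are hypothetical.

```python
import numpy as np

def fuse_uwb(x, P, z, R, dt, max_dt=0.05, max_dist=0.5):
    """Gate a UWB position fix, then apply a Kalman update if it passes.

    x, P : SLAM position estimate and its covariance
    z, R : UWB position fix and its covariance
    dt   : timestamp gap between the two sources, in seconds
    Returns (x_new, P_new, accepted).
    """
    too_old = abs(dt) > max_dt                    # time threshold
    too_far = np.linalg.norm(z - x) > max_dist    # distance threshold
    if too_old or too_far:
        return x, P, False                        # treat as an outlier
    K = P @ np.linalg.inv(P + R)                  # Kalman gain with H = I
    x_new = x + K @ (z - x)
    P_new = (np.eye(len(x)) - K) @ P
    return x_new, P_new, True
```

Rejecting stale or distant fixes before the update keeps a single bad UWB reading from pulling the fused trajectory away from the visual-inertial estimate.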
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421D (2025) https://doi.org/10.1117/12.3052990
With the rapid development of computer technology and internet technology, the issue of information security has become increasingly prominent. Physical Unclonable Technology (PUT), based on circuit implementation, is a high-security technology that differs from traditional cryptography in that it employs logic gates with variable quantities and distinct gate circuits. PUT can be applied in various fields such as authentication, secure encryption, and cryptographic computation. Compared to traditional cryptography, PUT offers advantages including high security, strong resistance to attacks, and flexible operation, which has led to its widespread application in the field of information security. Information security refers to the protection and maintenance of information resources from unauthorized or inappropriate access, use, modification, and other potential hazards. It is a product of the information age, encompassing any actions or events that may cause or potentially cause damage to information resources. Traditional cryptography has some drawbacks in ensuring data security, such as the ease of device loss, inconvenient key storage, and difficulties in decryption. Therefore, the development of new cryptographic technologies is necessary to address these issues.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421E (2025) https://doi.org/10.1117/12.3053089
With the rapid development of the production economy, electrified EMU trains have become a main means of passenger transportation. The resulting power quality problem has become one of the main concerns in the railway field today. When an electric locomotive is running, the harmonic currents in the traction current may interfere with the railway signaling equipment (track circuits) along the line. Therefore, in this paper, we first simulate and analyze the traction current of the locomotive. Then, taking a useful signal at 2600 Hz to 2630 Hz and harmonic signals at 2550 Hz and 2650 Hz as an example, we design an FIR band-pass filter based on a genetic annealing algorithm. Compared with the traditional digital filter, the optimized filter can be seen to filter out harmonic interference efficiently.
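As a baseline for the filtering task described above, the following sketch designs a band-pass FIR filter with the conventional windowed-sinc method, which is not the genetic annealing optimization proposed in the paper. The sampling rate, tap count, and cutoff frequencies are assumptions chosen only so that 2600 Hz to 2630 Hz falls in the passband while 2550 Hz and 2650 Hz fall in the stopband.

```python
import numpy as np

def bandpass_fir(f_low, f_high, fs, num_taps):
    """Windowed-sinc band-pass: difference of two ideal low-pass responses."""
    n = np.arange(num_taps) - (num_taps - 1) / 2   # centre the taps on zero

    def lowpass(fc):
        return 2 * fc / fs * np.sinc(2 * fc * n / fs)

    return (lowpass(f_high) - lowpass(f_low)) * np.hamming(num_taps)

def gain(h, f, fs):
    """Magnitude of the filter's frequency response at frequency f (Hz)."""
    n = np.arange(len(h))
    return abs(np.sum(h * np.exp(-2j * np.pi * f * n / fs)))

FS = 8000.0                                        # assumed sampling rate
taps = bandpass_fir(2575.0, 2640.0, FS, 2001)      # assumed cutoffs and length
```

The narrow 20 Hz gap between the passband edge and the 2650 Hz harmonic is what forces such a long filter here, which hints at why an optimized design is attractive.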
Wuzheng Ji, Ze Zhang, Huiyuan Tan, Wenhui Yang, Hui Wang, Xin Liu, Qiuliang Wang
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421F (2025) https://doi.org/10.1117/12.3052951
Non-Cartesian reconstruction is a crucial technique for accelerating MRI. However, traditional non-Cartesian reconstruction algorithms often result in suboptimal image quality. Recently, deep neural networks have emerged as powerful tools for MRI reconstruction, yet their application to non-Cartesian acquisitions remains underexplored. Transformer-based approaches have shown impressive performance in image super-resolution, prompting us to explore their potential in this domain. To tackle these challenges, this paper introduces a novel framework that combines non-Cartesian image reconstruction techniques with a Transformer-based network. The proposed framework comprises non-uniform Fourier transform, image feature extraction, and image reconstruction modules. To assess the effectiveness of our approach, we performed experiments using the single-coil knee dataset from fastMRI. Compared to other methods, our proposed approach demonstrated a 2.024 dB improvement in PSNR and a 0.117 increase in SSIM under a 4x accelerated radial undersampling condition.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421G (2025) https://doi.org/10.1117/12.3054329
This paper analyzes the precise point positioning (PPP) performance based on the BeiDou PPP-B2b service by selecting static and dynamic scenarios to evaluate the positioning performance of the BeiDou PPP-B2b service in different environments. The results show that: 1) The static positioning accuracy of PPP-B2b service in open environments can reach centimeter-level, with the best results over a 7-day experimental period being 2.18 cm east, 1.26 cm north, and 2.74 cm vertical; 2) The dynamic positioning accuracy of PPP-B2b service in open environments can reach decimeter-level, with the dynamic positioning accuracy in the experiment being 0.14 m east, 0.32 m north, and 1.28 m vertical; 3) The loss of packets during the transmission process of PPP-B2b service can lead to the expiration of corrections, making them unusable. In static positioning scenarios under open environments, the longest time to restore precise point positioning after an interruption is 16 seconds. In complex environments, the interruption of PPP-B2b service is more severe, leading to long periods where the orbit cannot be restored.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421H (2025) https://doi.org/10.1117/12.3054284
The microwave landing system plays a key role in aircraft landing, providing high-precision ranging and other information to ensure a safe landing. This article presents a detailed design of a metrology calibration system for the output frequency of a Microwave Landing Simulator (MLS). Because of the complexity of the MLS's functions and its weak time-division multiplexed modulated output signals, its calibration poses many difficulties. To meet technical requirements and provide reliable support, this article establishes a dedicated metrology calibration scheme and designs calibration software for the output frequency of the MLS. The front panel and rear panel of the calibration software are designed with LabVIEW. Finally, experimental results illustrate the effectiveness of the proposed calibration software for the MLS.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421I (2025) https://doi.org/10.1117/12.3053054
In underwater direction estimation and target localization systems, far-field high-precision direction estimation is a key technology that is of great significance for target positioning and tracking. This study aims to explore a high-precision far-field direction estimation method based on the beamforming MUSIC algorithm, and to enhance the accuracy and stability of direction estimation by incorporating error calibration techniques. We first described the signal model and the beamforming MUSIC algorithm tailored to this model. Next, we analyzed the primary errors of the signal model in terms of phase offset. Subsequently, we proposed error correction techniques for phase error correction based on the subspace principle and the steepest descent method, effectively rectifying errors in both simulated and measured data. This research provides new insights and methods for the development of high-precision far-field direction estimation techniques, which are crucial for enhancing the performance and accuracy of underwater direction estimation and target localization systems.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421J (2025) https://doi.org/10.1117/12.3053061
This paper presents a practical approach for processing AIS signals after RF direct acquisition in the ship automatic identification system. The method involves shifting the signals of the two AIS frequencies to baseband using two stages of digital downconversion (DDC). A multi-stage decimation filtering design is then employed to isolate the low-rate AIS signals of both channels. The implementation is carried out on the FPGA-JFM7K325T from Fudan Micro Company, followed by extensive repeated testing. The results demonstrate that the multi-stage filter extraction method effectively reduces hardware resource consumption and meets the design requirements, confirming the viability of this approach for enhancing AIS signal processing efficiency.
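The processing chain described above (mix a channel to baseband, then reduce the sample rate in several filtered stages) can be sketched in NumPy. This is a floating-point illustration under stated assumptions, not the FPGA design: each decimation stage uses a simple moving-average filter, and the sample rate, centre frequency, and decimation factors in the test are hypothetical.

```python
import numpy as np

def mix_down(x, f_center, fs):
    """Shift the content at f_center to 0 Hz by complex mixing."""
    n = np.arange(len(x))
    return x * np.exp(-2j * np.pi * f_center * n / fs)

def decimate_stage(x, factor):
    """Moving-average low-pass, keeping one output per `factor` inputs."""
    usable = len(x) - len(x) % factor
    return x[:usable].reshape(-1, factor).mean(axis=1)

def ddc(x, f_center, fs, factors=(4, 4)):
    """Digital downconversion followed by multi-stage decimation.

    Returns the baseband samples and the output sample rate.
    """
    y = mix_down(x, f_center, fs)
    for factor in factors:
        y = decimate_stage(y, factor)
        fs = fs / factor
    return y, fs
```

Splitting the rate reduction into stages lets each filter stay short, which is the usual motivation for multi-stage decimation when hardware resources are limited.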
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421K (2025) https://doi.org/10.1117/12.3054120
Accurate wind speed prediction is central to numerous engineering and environmental disciplines, and its reliability is essential for the safety of building structures and the stability of energy systems. However, traditional wind speed prediction methods are limited in their ability to handle nonlinear and non-stationary wind speed data. This study therefore proposes a dynamic adaptive segmented wind speed interval prediction method based on fuzzy information granulation, with the objective of enhancing the precision of wind speed prediction and the quality of interval estimation. To handle the nonlinear and non-stationary nature of wind speed data, a FIG-MSS model that integrates fuzzy logic and dynamic time window adjustment is proposed. The model adaptively captures the fluctuating characteristics of wind speed data, and the constructed intervals are modeled using LSSVM. The FIG-MSS-LSSVM demonstrates superior performance on the PICP, PINAW, and CWC evaluation metrics at confidence levels of 0.9 and 0.95, effectively handling extreme variations in wind speed data and providing satisfactory prediction intervals (PIs) at different confidence levels. The experimental results indicate that the proposed method outperforms the traditional method at varying confidence levels and effectively reduces the volatility and uncertainty of the PIs while maintaining high prediction stability and accuracy.
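Two of the interval metrics named above have widely used definitions that are easy to state in code: PICP is the fraction of observations that fall inside their prediction interval, and PINAW is the mean interval width normalized by the range of the observations. The sketch below assumes these common definitions; the paper may use variants, and CWC is omitted because its form differs between authors.

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction interval coverage probability."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))

def pinaw(y, lower, upper):
    """Prediction interval normalized average width."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean(upper - lower) / (y.max() - y.min()))
```

A good set of intervals has a PICP at or above the nominal confidence level together with a small PINAW, since wide intervals achieve coverage trivially.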
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421L (2025) https://doi.org/10.1117/12.3053028
Addressing the issue of measuring the expansion and contraction of hydraulic support movable columns in coal mine working faces, this paper designs a low-power wireless movable column shrinkage sensor for mines. The sensor employs a pull rope displacement sensing probe, achieving an accuracy of ±0.2%FS and a repeatability of ±0.05%FS. This wireless movable column shrinkage sensor utilizes LoRa wireless transmission technology and WaveMesh network to facilitate long-distance ad hoc networking, offering ease of installation and use. Experimental results demonstrate that the sensor exhibits advantages such as high-precision measurement, low energy consumption, and wireless ad hoc network communication capabilities, making it suitable for widespread use in monitoring the expansion and contraction of movable columns in mine hydraulic supports.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421M (2025) https://doi.org/10.1117/12.3053346
In the trend of deepening and broadening large language models, the limitations posed by key-value (KV) caches on LLM inference have become increasingly prominent. This study elucidates how different input samples induce variations in the correlation between attention heads during the computation of transformer attention mechanisms. By quantifying this context-dependent correlation and dynamically assigning weights to each attention head accordingly, we propose SHA. SHA identifies and prunes heads that contribute minimally to overall performance, accelerating the inference process with minimal loss of performance. Experimental results demonstrate that on the Llama-7B, we successfully remove 30% of the attention heads, reducing KV cache memory requirements by 24.2%, and achieve a throughput improvement of up to 2.04 times.
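To see why the number of cached heads matters, the sketch below computes KV cache size from model dimensions. It assumes the commonly cited Llama-7B shape (32 layers, 32 heads per layer, head dimension 128) and 16-bit values; it does not model the SHA pruning criterion, and the reported 24.2% saving depends on details that the abstract does not give. The function name is hypothetical.

```python
def kv_cache_bytes(heads_per_layer, head_dim, seq_len, batch=1, bytes_per_value=2):
    """Bytes needed to cache keys and values for every retained head.

    heads_per_layer: list with the number of cached heads in each layer.
    The leading factor of 2 accounts for storing both K and V.
    """
    total_heads = sum(heads_per_layer)
    return 2 * total_heads * head_dim * seq_len * batch * bytes_per_value

full = kv_cache_bytes([32] * 32, head_dim=128, seq_len=2048)
print(full / 2**30)  # size in GiB for one 2048-token sequence
```

Because the cache grows linearly with the number of cached heads, sequence length, and batch size, removing heads translates directly into memory that can be used for larger batches.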
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421N (2025) https://doi.org/10.1117/12.3054293
Due to the diverse nature of interference signals in mobile communication frequency bands, feature extraction is challenging, and the rapid, automated identification of behaviors such as illegal frequency usage and malicious interference is difficult. Moreover, the precise impact of interference signals on communication networks is hard to evaluate. In response to these problems, this paper proposes an air interface interference recognition technology for communication links based on base station measurement reporting data. It constructs an interference recognition model to achieve automatic interference identification and evaluates the accuracy of interference recognition. On the base station side, the method uses channel measurement data exported by the base station, such as signal strength, signal quality, and signal-to-noise ratio, and applies a Long Short-Term Memory (LSTM) autoencoder model to learn normal signal patterns for interference discrimination. This approach achieves precise identification of fixed-frequency interference on the base station side and evaluates the impact of interference on communication networks. Experimental results show that the base station-side approach achieves an F1 score of 0.99 in identifying fixed-frequency interference, outperforming One-Class Support Vector Machine (OCSVM), Principal Component Analysis (PCA), and Isolation Forest (IForest) methods under similar conditions.
Artificial Intelligence and Computer System Design
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421O (2025) https://doi.org/10.1117/12.3053123
Artificial intelligence is now used in almost every aspect of people's daily lives and handles many kinds of work. Among the various kinds of AI, generative AI, ranging from ChatGPT to AI painting, has been discussed most frequently. In recent years, some generative AI models have been applied to audio tasks such as denoising and audio generation. Although each kind of generative model has its own advantages, the diffusion model stands out among them, opening up many possibilities through its high fidelity compared to other models such as GAN.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421P (2025) https://doi.org/10.1117/12.3053018
In this era of rapid Internet development, online disinformation has emerged as a new public opinion phenomenon that is generated and disseminated through the Internet. The feature that most distinguishes it from disinformation in general is that its means of dissemination have changed from word of mouth to textual or multimedia information on the Internet. Its untruthful and deceptive content has a great impact on society and people's lives. The wide range and high speed of dissemination make it difficult to determine the channels and scope of its spread, so false information cannot be detected effectively and in a timely manner. In this paper, through simulation of the BSS virtual community propagation model and the SIR model, we examine how disinformation propagates in social networks. With a rough picture of the dissemination pattern in hand, we then approach detection from the perspective of artificial intelligence, using a feature extraction model detection method and a Web content filtering model method to detect false information effectively and reduce the negative impact of its dissemination on society.
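The SIR model mentioned above can be sketched as a discrete-time simulation in which "infected" users are those currently spreading a false story and "recovered" users have stopped spreading it. This is a generic textbook form under assumed parameters; the rates `beta` and `gamma` and the population size are hypothetical, and the paper's virtual community model is not reproduced.

```python
def simulate_sir(population, initial_spreaders, beta, gamma, steps):
    """Discrete-time SIR dynamics.

    beta  : per-step contact rate at which susceptible users start spreading
    gamma : per-step rate at which spreaders stop spreading
    Returns a list of (susceptible, spreading, recovered) tuples.
    """
    s = float(population - initial_spreaders)
    i = float(initial_spreaders)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        new_spreaders = beta * s * i / population
        new_recovered = gamma * i
        s -= new_spreaders
        i += new_spreaders - new_recovered
        r += new_recovered
        history.append((s, i, r))
    return history
```

Plotting the middle component of each tuple over time shows the characteristic rise and decay of a rumor outbreak whenever beta exceeds gamma.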
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421Q (2025) https://doi.org/10.1117/12.3054288
To achieve efficient transmission during SoC on-chip bus bridging, this paper designs an AXI-to-AHB bridge controller, proposes a read and write control method for the bridging conversion, and provides a method for splitting and aggregating read and write address commands when an access crosses a 1KB address boundary, as well as for splitting and aggregating read and write data and responses. This method effectively solves the problem of low protocol-conversion throughput when AXI hosts continuously access AHB slave devices with incrementing-address bursts. The proposed method supports a maximum single AXI burst of 32KB when converting AXI to AHB, greatly improving AXI-to-AHB transmission efficiency and chip data transmission performance.
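The address-splitting idea can be sketched in software terms as follows. This is only an illustration of cutting one incrementing access at 1KB boundaries, not the controller described in the paper; the function name `split_at_1kb` and the tuple layout are hypothetical.

```python
BOUNDARY_BYTES = 1024  # AHB bursts must not cross a 1KB address boundary


def split_at_1kb(start_addr, length_bytes, boundary=BOUNDARY_BYTES):
    """Split one incrementing access into (address, size) chunks
    so that no chunk crosses a `boundary`-aligned address."""
    if length_bytes <= 0:
        raise ValueError("length_bytes must be positive")
    chunks = []
    addr, remaining = start_addr, length_bytes
    while remaining > 0:
        room = boundary - (addr % boundary)  # bytes left before the next boundary
        size = min(room, remaining)
        chunks.append((addr, size))
        addr += size
        remaining -= size
    return chunks


if __name__ == "__main__":
    for addr, size in split_at_1kb(0x3F0, 64):
        print(hex(addr), size)
```

Under this sketch, a 32KB burst that starts on an aligned address becomes 32 chunks of 1KB each, and the per-chunk data and responses would then be aggregated back into a single AXI-side response.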
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421R (2025) https://doi.org/10.1117/12.3054431
Research on the application of VR technology in museums is a field full of potential and innovation. With the continuous development of science and technology, virtual reality (VR) technology has brought revolutionary changes to the way museums display their collections. First, VR technology enables museums to break the limitations of time and space and bring the audience into ancient historical scenes. Through high-precision modeling and rendering, the audience can immersively feel the elegance of ancient civilizations, as if traveling through time and space, and personally experience the customs of a given era. Second, VR technology provides museums with richer display forms. Traditional museum display is often limited by physical exhibits and display space, while VR technology can present exhibits to the audience in a more vivid, three-dimensional way. Museums can digitize precious cultural relics and works of art so that the audience can enjoy these treasures up close in a virtual environment and, through interactive operations, gain an in-depth understanding of the details and historical background of the exhibits. In short, VR technology can not only deepen the museum's display and interactive experience and make exhibition forms more varied, but also attract more visitors to science museums. In this way, VR technology can further enhance the social science popularization value of science museums and help them better serve the public.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421S (2025) https://doi.org/10.1117/12.3052937
In flight testing, precise time synchronization significantly influences the accuracy of test results. For critical applications such as data acquisition in airborne test equipment, accurate timing is fundamental to the reliable operation of flight test systems and the precision of test data. The IRIG (Inter-Range Instrumentation Group) standards for IRIG_B codes, established by the International Radio Consultative Committee, enhance the efficiency and reliability of time synchronization among different systems by defining methods and accuracy requirements for unified time synchronization. Despite existing research making strides in improving IRIG_B code demodulation performance, traditional techniques still encounter timing inaccuracies under extreme conditions. To address these challenges, this paper proposes a design solution for second pulse delay timing, presenting a detailed process for AC IRIG_B code demodulation. Experimental results validate the effectiveness and accuracy of this approach in generating and demodulating time information.
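The abstract does not spell out the demodulation steps, so the following is only a minimal sketch of the last stage of IRIG-B decoding. It assumes the AC carrier has already been envelope-detected into one pulse width per 10 ms element, and it uses the standard element coding (2 ms for binary 0, 5 ms for binary 1, 8 ms for a position marker) with BCD seconds, minutes, and hours. All function names and the tolerance value are hypothetical.

```python
# Nominal pulse widths (ms) inside each 10 ms IRIG-B element.
PULSE_SYMBOLS = ((2.0, "0"), (5.0, "1"), (8.0, "P"))

# (element offset from the reference marker, BCD weight)
SECONDS = ((1, 1), (2, 2), (3, 4), (4, 8), (6, 10), (7, 20), (8, 40))
MINUTES = ((10, 1), (11, 2), (12, 4), (13, 8), (15, 10), (16, 20), (17, 40))
HOURS = ((20, 1), (21, 2), (22, 4), (23, 8), (25, 10), (26, 20))


def classify_pulse(width_ms, tolerance_ms=1.0):
    """Map a measured pulse width to '0', '1', or 'P' (position marker)."""
    for nominal, symbol in PULSE_SYMBOLS:
        if abs(width_ms - nominal) <= tolerance_ms:
            return symbol
    raise ValueError(f"unrecognized pulse width: {width_ms} ms")


def find_reference(symbols):
    """Return the index of the frame reference marker, i.e. the second
    of two consecutive position markers."""
    for i in range(1, len(symbols)):
        if symbols[i - 1] == "P" and symbols[i] == "P":
            return i
    raise ValueError("no frame reference found")


def decode_bcd(symbols, start, field):
    """Sum the BCD weights of the elements that carry a binary 1."""
    return sum(weight for offset, weight in field if symbols[start + offset] == "1")


def decode_time(widths_ms):
    """Decode (hours, minutes, seconds) from a list of pulse widths in ms."""
    symbols = [classify_pulse(w) for w in widths_ms]
    start = find_reference(symbols)
    if len(symbols) - start < 30:
        raise ValueError("not enough elements after the reference marker")
    return (
        decode_bcd(symbols, start, HOURS),
        decode_bcd(symbols, start, MINUTES),
        decode_bcd(symbols, start, SECONDS),
    )
```

In a real receiver the on-time point is the leading edge of the reference marker, so any fixed delay introduced by demodulation has to be compensated before the decoded second is used for time stamping.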
Qi Li, Zhonghua Guo, Jialong Li, Xiaojun Li, Bo Ban
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421T (2025) https://doi.org/10.1117/12.3054418
Continuous monitoring and evaluation of water quality provides a scientific foundation for environmental protection and water resource management, and fosters sustainable regional development and the establishment of ecological civilization. Using Landsat-8 satellite data and 2021-2023 water quality monitoring data from the Yellow River Basin in Ningxia to build a database, this paper proposes a custom residual convolutional neural network (ResCNN) model enhanced by a hybrid attention mechanism (PCWA). The water quality parameters turbidity (TUB), permanganate index (CODMn), ammonia nitrogen (NH3-N) and dissolved oxygen (DO) in the basin were inverted and compared with the results of a convolutional neural network (CNN) model. The findings indicate that the PCWA-ResCNN model more effectively captures the intricate nonlinear relationship between the observed values and surface reflectance, demonstrating significant robustness. It achieves the highest accuracy in the inversion of TUB, CODMn, NH3-N and DO, with R² reaching 97.92%, 95.29%, 95.01% and 96.12%, respectively. Compared with the CNN model, its predictions improve by 4.16%, 8%, 8.58% and 7.09%, respectively. The PCWA-ResCNN model framework developed in this research demonstrates outstanding performance in monitoring water quality and serves as an effective approach for assessing the water quality of complex inland rivers and lakes. The inversion maps of the four water quality parameters show that Class I and Class II water quality dominate the Yellow River Basin in Ningxia, and that the water quality is in good condition.
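For readers unfamiliar with the R² figure quoted above, the sketch below computes the coefficient of determination from observed and predicted values using its usual definition. The function name is hypothetical, and the abstract does not state which exact variant of R² the authors used.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions match the observations exactly, and a value of 0 means they do no better than predicting the observed mean.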
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421U (2025) https://doi.org/10.1117/12.3052926
The nonlinear time fractional reaction-diffusion (TFRD) equation is an important class of fractional parabolic equations, and the study of its numerical methods has important scientific significance and application value. This paper proposes a new fast alternating segmented Crank-Nicolson (FASC-N) parallel difference scheme on graded meshes for solving the singular problem of the nonlinear TFRD equation. To construct the FASC-N scheme, the time term is approximated by the fast L1 approximation of the time fractional derivative on graded meshes. Based on the alternating technique, the space term is discretized by the Crank-Nicolson (C-N) difference scheme and four kinds of Saulyev asymmetric difference schemes. The FASC-N scheme is proved to be unconditionally stable and convergent. Numerical experiments show that the accuracy of the FASC-N scheme improves as the grid ratio decreases. The scheme parallelizes well, and its computational efficiency is clearly better than that of the classical serial scheme, making it an efficient method for solving the nonlinear TFRD equation.
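To make the time discretization concrete, here is a minimal sketch of the standard (not the fast) L1 approximation of a Caputo derivative of order alpha in (0, 1) on a graded mesh t_j = T (j/N)^r. The fast variant used in the paper is not reproduced here, and the function names and the grading exponent `r` are illustrative assumptions.

```python
import math


def graded_mesh(T, N, r):
    """Graded time mesh t_j = T * (j / N) ** r, clustered near t = 0 for r > 1."""
    return [T * (j / N) ** r for j in range(N + 1)]


def l1_caputo(u, t, alpha):
    """Standard L1 approximation of the Caputo derivative of order alpha
    at the last mesh point t[-1], given samples u[k] = u(t[k]).

    On each interval u is replaced by its linear interpolant, so the
    formula is exact when u is linear in t.
    """
    n = len(t) - 1
    total = 0.0
    for k in range(1, n + 1):
        slope = (u[k] - u[k - 1]) / (t[k] - t[k - 1])
        weight = (t[n] - t[k - 1]) ** (1 - alpha) - (t[n] - t[k]) ** (1 - alpha)
        total += slope * weight
    return total / math.gamma(2 - alpha)


if __name__ == "__main__":
    mesh = graded_mesh(T=1.0, N=8, r=2.0)
    samples = [tj for tj in mesh]  # u(t) = t
    print(l1_caputo(samples, mesh, alpha=0.5), 1.0 / math.gamma(1.5))
```

The cost of this direct sum grows with the number of past time levels, which is the overhead that fast L1 approximations are designed to reduce.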
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421V (2025) https://doi.org/10.1117/12.3052972
Sea fog is a common weather phenomenon at sea that reduces visibility and poses a great threat to marine traffic and operations. Traditional satellite remote sensing algorithms for sea fog monitoring need to be improved in terms of accuracy, portability and automation. In this paper, a sea fog prediction model based on Bidirectional Temporal Convolutional and Long Short-Term Memory networks (BiTCN-LSTM) is proposed. BiTCN-LSTM introduces a network of causal convolutions and bidirectional gated recurrent cells, which improves the network's attention to sea fog features in important channels by learning forward and backward convolutional features of the input sea fog sequence. In addition, residual multi-scale feature fusion is employed to obtain multi-scale information about sea fog, allowing the model to extract and fuse features progressively at different levels for better prediction of the relative humidity and visibility time series of sea fog. Experiments show that the proposed BiTCN-LSTM performs well in long-term prediction of sea fog humidity and visibility.
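The building block behind "forward and backward convolutional features" can be sketched as a dilated causal convolution applied to the sequence and to its reverse. This is a plain-Python illustration with made-up kernel weights, not the BiTCN-LSTM architecture itself; the function names are hypothetical.

```python
def causal_conv1d(x, weights, dilation=1):
    """Dilated causal convolution: y[t] = sum_k weights[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zero,
    so y[t] never depends on x[t + 1], x[t + 2], ...
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def bidirectional_causal_conv(x, w_forward, w_backward, dilation=1):
    """Pair each time step's forward feature with its backward feature.

    The backward branch runs the same causal convolution on the reversed
    sequence and flips the result back into the original time order.
    """
    forward = causal_conv1d(x, w_forward, dilation)
    backward = causal_conv1d(x[::-1], w_backward, dilation)[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    humidity = [1.0, 2.0, 3.0, 4.0]  # toy sequence, not real data
    print(bidirectional_causal_conv(humidity, [1.0, 1.0], [1.0, 1.0]))
```

Stacking such layers with growing dilation widens the receptive field without increasing kernel size, which is the usual motivation for temporal convolutional networks.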
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421W (2025) https://doi.org/10.1117/12.3052927
Prompt detection of and timely response to forest fires are crucial for effectively protecting the ecological environment. Despite the progress of deep learning models for fire image recognition, traditional models applied to UAV aerial images face challenges such as small fire points, smoke interference, and complex backgrounds, resulting in unsatisfactory detection accuracy. To cope with these problems, this paper proposes a forest fire detection method for UAV aerial photography based on a pre-trained DenseNet201 model, which is trained on the FLAME dataset and combined with a channel and spatial attention module (CBAM). The model reaches a fire detection accuracy of 88.63% and an F-score of 90.02%. Its performance was further validated by visual analysis of the confusion matrix, classification report, ROC curve, and precision-recall curve. The method therefore has broad application prospects in real forest fire monitoring.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421X (2025) https://doi.org/10.1117/12.3052917
Shadow puppetry is a traditional Chinese folk art with significant cultural heritage value. However, due to the influence of modern new media, this cultural treasure is gradually disappearing from public view. To address this issue, we have developed "ShadowStudio," a comprehensive digital shadow puppetry creation and sharing system that integrates creation, sharing, and educational functions. This system, built on the Unity 3D engine, allows users to freely create shadow puppetry performances and upload them online for others to download and play. Additionally, users can access an embedded database of shadow puppetry artifacts to learn about the cultural heritage of shadow puppetry. We also conducted field activities to collect feedback from thirty participants. The results indicate that our system significantly enhances users' interest in concepts related to shadow puppetry.
Sikang Jiang, Hongyu Chen, Zhipeng Jia, Zhichao Wang, Ruijie Cai
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421Y (2025) https://doi.org/10.1117/12.3054167
Amplification-based distributed denial of service attacks (ADDoS) are a common and severe threat to the Internet. Recent reports of ADDoS attacks demonstrate that such attacks not only generate massive traffic but also exploit a variety of protocol types. Traditionally, attackers needed to control a substantial amount of resources to launch an attack. With the use of reflectors and amplifiers in DDoS attacks, an increasing number of protocols are being utilized for ADDoS. In this paper, we explore and review the basic concepts and process of ADDoS attacks, provide a detailed analysis of previous ADDoS incidents and related CVE vulnerabilities, and classify amplification vulnerabilities from the perspective of their causes. Additionally, we summarize common reflection patterns and methods for discovering amplification vulnerabilities, and propose methods to mitigate ADDoS attacks.
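A common way to quantify how much an abused protocol amplifies traffic is the bandwidth amplification factor, the ratio of response payload to request payload. The sketch below uses that definition; the function names are hypothetical and the byte counts in the usage lines are made-up illustrations, not measurements from the paper.

```python
def bandwidth_amplification_factor(request_payload_bytes, response_payload_bytes):
    """Ratio of bytes a reflector sends to the victim per byte the attacker sends."""
    if request_payload_bytes <= 0:
        raise ValueError("request payload must be positive")
    return response_payload_bytes / request_payload_bytes


def reflected_rate_bps(spoofed_request_bps, amplification_factor):
    """Traffic rate arriving at the victim for a given spoofed request rate."""
    return spoofed_request_bps * amplification_factor


if __name__ == "__main__":
    baf = bandwidth_amplification_factor(64, 3200)  # illustrative sizes only
    print(baf, reflected_rate_bps(10_000_000, baf))
```

A factor well above 1 is what lets an attacker with modest resources generate massive traffic, which is why mitigation often focuses on limiting response size or requiring a handshake before large replies.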
Tao Yu, Jinsong Luo, Li Chen, Yu Xie, Xiaozhen Zhu, Yanqi Zhao
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134421Z (2025) https://doi.org/10.1117/12.3052940
Inland waterway underwater revetments are easily damaged by water currents, ship-generated waves and other factors. To track the development of such damage, an underwater revetment damage detection method based on multivariate coordinated bathymetry technology is proposed. The method first uses side-scan sonar to survey the damaged area of the underwater revetment and construct a two-dimensional image, then locates the damaged area through interpretation of the side-scan sonar image, and finally combines the multibeam point cloud to build a refined model of the underwater revetment area, extract the bed elevation along transverse sections, and determine the depth of local erosion at each defect. The method was applied to the underwater revetment of a navigation channel in Hangzhou, where two defective areas were detected in the target area. The results show that the coordinated use of multibeam precision bathymetry and side-scan sonar makes it possible to observe the morphological characteristics of underwater structures clearly and to obtain rich information on underwater topography, providing a strong reference for the investigation and repair of structural defects in underwater revetments.
Li Yao, Yuwen Wan, Srdjan Damjanovic, Zhengfei Xin
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344220 (2025) https://doi.org/10.1117/12.3052979
As the manufacturing industry develops towards high precision and intelligence, CNC machine tools play an important role in production. Failures not only reduce production efficiency but also increase costs. To improve the accuracy and efficiency of fault prediction, this paper establishes an ensemble learning model for CNC machine tool fault prediction that uses stacking to combine decision trees, support vector machines (SVM), random forests and other algorithms. Fault-related features are optimized through data preprocessing and feature engineering. The experimental results show that the ensemble model outperforms single models on the evaluation metrics, especially recall and F1 score, which reach 0.6393 and 0.7027, respectively. This verifies that the proposed ensemble learning model improves CNC machine tool fault prediction and can better address prediction-related problems in CNC machine tool faults.
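The recall and F1 figures above follow the usual definitions from confusion-matrix counts. The sketch below shows those definitions; the counts in the usage lines are invented for illustration and are not the paper's data, and the function name is hypothetical.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) for the positive (fault) class."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only: 8 faults caught, 2 false alarms, 2 faults missed.
    print(precision_recall_f1(8, 2, 2))
```

Recall matters most when a missed fault is costly, which is a plausible reason the abstract highlights it alongside F1.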
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344221 (2025) https://doi.org/10.1117/12.3052932
Faced with the highly centralized structure of pesticide residue detection platforms for agricultural products, the difficulty of ensuring the authenticity of detection results, and low confidence in the traceability information chain, this paper constructs a pesticide residue detection and traceability system based on Internet of Things technology and blockchain. The system consists of two main parts: embedded pesticide residue detection equipment for agricultural products and a blockchain-based traceability system. Based on the Internet of Things and blockchain technology, a one-time unique detection code is generated as the task authorization for each single detection. Detection results use a dual storage design in which traceability hash values are stored on chain and the traceability information is stored off chain, and a microservice platform is built using a PaaS architecture and HTML5 technology. In addition, the Apriori algorithm is used to construct a pesticide residue risk warning model. The system achieves reliable collection, transmission, and traceability of pesticide residue detection data for agricultural products. Consumers can obtain agricultural product detection information through a traceability code, which effectively addresses the low confidence in and doubtful authenticity of pesticide residue detection data. The system has high application value in the construction of agricultural product quality and safety.
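The dual storage idea can be sketched with two dictionaries standing in for the chain and the off-chain database: the full record lives off chain, and only its hash is anchored on chain so that later tampering is detectable. This is a toy model under stated assumptions; the class and function names, the record fields, and the use of SHA-256 are illustrative choices, not details taken from the paper.

```python
import hashlib
import json
import secrets


def new_detection_code():
    """Generate a one-time, hard-to-guess detection code (hypothetical format)."""
    return secrets.token_hex(16)


def record_hash(record):
    """SHA-256 over a canonical JSON encoding of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TraceStore:
    """Toy stand-in: `on_chain` holds hashes, `off_chain` holds full records."""

    def __init__(self):
        self.on_chain = {}
        self.off_chain = {}

    def submit(self, code, record):
        if code in self.on_chain:
            raise ValueError("detection code already used")
        self.off_chain[code] = dict(record)
        self.on_chain[code] = record_hash(record)

    def verify(self, code):
        """True if the off-chain record still matches its on-chain hash."""
        return record_hash(self.off_chain[code]) == self.on_chain[code]


if __name__ == "__main__":
    store = TraceStore()
    code = new_detection_code()
    store.submit(code, {"sample": "A-001", "residue_mg_per_kg": 0.02})
    print(store.verify(code))
```

Keeping only the hash on chain keeps the ledger small while still letting a consumer-facing service check that the detailed record has not been altered.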
Xiaoyang Zhou, Baohua Ying, Yuebo Yue, Linjun Xie, Peiyuan Li, Junjun Wei, Ying Liu, Kai Sun, Jingjie Yan
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344222 (2025) https://doi.org/10.1117/12.3054179
This paper proposes a face authentication and facial expression recognition system based on blockchain and trusted computing technology. With the continuous development of digital technology, people's demand for information security and convenience is increasing. Traditional password logins suffer from insufficient security and are easily stolen, so the adoption of biometrics for login verification has become a trend. The system combines the immutable and decentralized characteristics of the blockchain with trusted computing technology, which ensures the security, trustworthiness and integrity of the computing equipment and the computing process through a Trusted Execution Environment (TEE), and uses the efficiency and accuracy of face recognition to achieve a safe, reliable, simple and convenient login system. At the same time, combining blockchain and deep learning for distributed training can protect privacy, provide a secure and trustworthy training environment, and address the challenges of distributed training. By integrating the trained model from each blockchain into the final facial expression recognition model, the efficiency of training and the accuracy of expression recognition are greatly improved.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344223 (2025) https://doi.org/10.1117/12.3052965
In this paper, a new network called CSGNet is introduced for automatic modulation recognition, designed to improve recognition accuracy in complex electromagnetic environments. The network leverages the strengths of both Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU) while mitigating their inherent weaknesses. Additionally, it incorporates a Self-Attention mechanism to further enhance its perceptual abilities. The primary advantage of this method is its superior recognition accuracy under low SNR conditions compared to traditional methods. The process starts by extracting basic I/Q data from the RadioML 2018.01a dataset as input. Subsequently, the CSGNet model performs an end-to-end modulation recognition task. The results demonstrate that this method excels at multiple SNR levels, highlighting the significant potential of deep learning in solving automatic modulation recognition challenges. This study underscores the effectiveness of integrating CNN, GRU, and Self-Attention mechanisms in enhancing the robustness and accuracy of modulation recognition systems. Furthermore, it demonstrates how deep learning techniques can be effectively applied to complex signal processing tasks. The findings suggest that CSGNet could be a valuable tool for improving communication systems, especially in challenging environments with low SNR.
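As an illustration of the Self-Attention component, the sketch below implements scaled dot-product self-attention over a short sequence of feature vectors in plain Python. For brevity it assumes the queries, keys, and values are the inputs themselves, with no learned projections; that simplification and the function names are assumptions, and this is not the CSGNet layer.

```python
import math


def softmax(row):
    """Numerically stable softmax of a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of T feature vectors of equal length d, for example
    (I, Q) pairs or higher-level features. Returns (outputs, weights).
    """
    d = len(x[0])
    outputs, weights = [], []
    for query in x:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d) for key in x]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return outputs, weights


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # toy vectors, not real I/Q samples
    out, w = self_attention(toy)
    print(w[0])
```

Each output vector is a weighted average of all time steps, so the layer can emphasize the parts of a noisy sequence that are most similar to the current step.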
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344224 (2025) https://doi.org/10.1117/12.3053122
The CTCS-3 train control system is the key system for ensuring the safe operation of trains. Its combination with automatic train operation (ATO) technology plays an important role in train operation control and is the mainstream direction for the development of China's high-speed railway system. This paper provides a method, based on timed automata, for modeling and formally simulating and verifying the real-time behavior of the system, in order to ensure the safe operation of the high-speed railway C3+ATO system. Taking the movement authority generation function as an example, and following the functional requirements of the train operation system, a timed automata network model is established for the information exchanged between communication equipment, and message sequence charts between the equipment are generated through formal simulation for movement authority generation and for the sending of unconditional emergency stop messages and temporary speed reduction messages. The safety, existence and reachability properties of the system are verified using statements written in BNF (Backus-Naur Form). The results show that the model meets the requirements of the system's functional attributes and the technical specification of the Radio Block Center, laying a foundation for subsequent research on the C3+ATO system.
Wei Wang, Shenglong E., Zhongao Wang, Zhangquan Rao, Lei Wang, Hao Wang, Jianming Liu
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344225 (2025) https://doi.org/10.1117/12.3054340
This paper aims to evaluate the observational accuracy of the SWOT satellite's water vapor radiometer and analyze the impact of various factors on its precision. The study compares SWOT wet delay data with wet delay data from the Jason-3 water vapor radiometer and with wet delays calculated from the ERA5 model, and examines the effects of cross-point distance, offshore distance, water depth, and time difference at crossover points on discrepancies in wet delay measurements. The results indicate that compared to Jason-3 data, the mean discrepancy of the SWOT water vapor radiometer is 0.0018 m, with a standard deviation of 0.02 m, showing no significant systematic error. Discrepancies decrease with increasing cycles, demonstrating good observational consistency and high stability. Compared to ERA5 data, discrepancies are larger in both high- and low-latitude areas, with a mean of less than 2 mm and a standard deviation of less than 1.5 cm, and without significant systematic bias. The study also finds that the accuracy of the wet delay discrepancies is inversely proportional to the distance and time difference at the crossover points and directly proportional to the offshore distance and water depth, indicating continuous optimization of the hardware and software performance of the water vapor radiometer during the SWOT mission.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344226 (2025) https://doi.org/10.1117/12.3054274
Extracting features from the large volume of fault text data generated during the operation of train control vehicle equipment is of great significance for subsequent fault diagnosis. In this paper, a fault diagnosis model based on dual-channel feature fusion of BiGRU and CNN is constructed. First, dynamic word vectors are generated by the BERT pre-trained model from the cleaned fault vocabulary database. Then, to fully extract the global and local semantic information of the data, the BiGRU-CNN dual-channel feature fusion model is built, and an improved cross-entropy loss function is introduced to focus on difficult-to-classify samples. Finally, the fused high-dimensional feature matrix is reduced by PCA to remove redundant information, and an SVM classifier completes the fault data classification and realizes fault diagnosis. Experimental results show that the model obtains word vectors with strong representation ability, that the features extracted by the dual-channel model from these word vectors carry both global and local semantic information, and that the model can effectively solve the problem of inaccurate classification of small-sample data. The model significantly improves accuracy, recall, and F1 value, and can provide support for the daily diagnosis of train control vehicle equipment.
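The abstract does not say which modification of cross-entropy is used. One well-known loss that down-weights easy samples so that training focuses on hard ones is the focal loss, sketched below purely as an example of the idea; it should not be read as the paper's loss function, and `gamma` is an assumed hyperparameter.

```python
import math


def cross_entropy(p_true):
    """Cross-entropy for one sample, given the predicted probability of its true class."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Focal loss: cross-entropy scaled by (1 - p_true) ** gamma.

    Confidently correct samples (p_true near 1) are strongly down-weighted;
    gamma = 0 recovers plain cross-entropy.
    """
    return (1.0 - p_true) ** gamma * cross_entropy(p_true)


if __name__ == "__main__":
    for p in (0.9, 0.6, 0.3):
        print(p, round(cross_entropy(p), 4), round(focal_loss(p), 4))
```

With this weighting, a hard sample contributes far more to the total loss relative to an easy one than it would under plain cross-entropy, which is the behavior the abstract describes.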
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344227 (2025) https://doi.org/10.1117/12.3054291
Currently, the line segment extraction algorithms used in point-line fusion SLAM tend to detect a long line segment as a number of short segments, which reduces the efficiency and accuracy of line segment matching. Additionally, the use of the computationally expensive LBD descriptor for line segment matching leads to a lower system efficiency. To address these issues, this paper proposes a point-line feature fusion visual-inertial SLAM system based on the ELSED line segment detection algorithm. The proposed system suppresses the extraction of shorter line segments and solves the problem of long line segments being divided into multiple short line segments, thereby improving the accuracy of the SLAM system. The line matching part uses a matching strategy based on the improved LK optical flow, replacing the computationally expensive LBD descriptor matching. Finally, experiments on the EuRoC dataset verify that the SLAM system designed in this paper reduces the average absolute trajectory error by 34.45% compared to the VINS-Mono system, and by 17.38% compared to the PL-VINS algorithm in some complex indoor scenarios. The results show that the proposed algorithm effectively improves the positioning accuracy and robustness of the point-line SLAM system.
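The trajectory-error comparison quoted above can be illustrated with a minimal absolute trajectory error (ATE) computation. This sketch assumes the estimated and ground-truth positions are already time-associated and expressed in the same frame, so it skips the alignment step that a full evaluation would perform; the function names are hypothetical and the numbers in the usage lines are not from the paper.

```python
import math


def ate_rmse(estimated, ground_truth):
    """Root-mean-square position error between two equally long
    lists of (x, y, z) positions that are already aligned."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction_percent(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error * 100.0


if __name__ == "__main__":
    truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    estimate = [(0.1, 0.0, 0.0), (1.0, 0.1, 0.0)]
    print(ate_rmse(estimate, truth))
```

A reduction such as the 34.45% reported against VINS-Mono would correspond to `relative_reduction_percent` applied to the two systems' average ATE values.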
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344228 (2025) https://doi.org/10.1117/12.3053064
To improve the accuracy of Remaining Useful Life (RUL) predictions for bearings with deep learning models, we propose a novel method based on an improved Gated Recurrent Unit (GRU). First, the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) method is used to extract essential degradation features from the signals. These features are then integrated using a GRU enhanced with a temporal attention mechanism. Finally, the RUL is predicted using a particle filter algorithm. Tests on the PHM2012 dataset validate that this method significantly improves the accuracy of RUL predictions.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 1344229 (2025) https://doi.org/10.1117/12.3053045
To address the problem of designing the operational viewpoint for kill web operations, the Inf-ProA framework is employed as a research tool. Starting with the construction of a meta-model for the operational viewpoint of the kill web, the modeling language for the architecture of the kill web's operational viewpoint is defined. By analyzing operational scenarios and extending view models according to operational activity requirements, the operational activities, targets, rules, information flow, equipment, and organization of the kill web are described from an operational standpoint. This approach defines the abstract syntax, static semantics, dynamic semantics, model design, and concrete semantics of the modeling language for the operational viewpoint of the kill web. The method serves as a basis for exploring, exchanging, and optimizing the use of the kill web, providing guidance for subsequent optimization of operational systems, system design, and equipment development.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422A (2025) https://doi.org/10.1117/12.3053034
Building on an in-depth study of electronic engineering technology, this paper designs and implements a displacement signal acquisition system based on peak detection. The system addresses the real-time capture of displacement signals and converts displacement information efficiently into a readable digital signal. The circuit design exploits peak detection: the displacement signal is tracked and captured in real time to identify its peak points accurately, and the processor then filters the captured peaks and performs the calculation, yielding a digital output for the LVDT sensor. The circuit has a wide range of applications; in practical use, it responds quickly to changes in the displacement signal and outputs displacement information accurately, which can provide reliable technical support for applications in related fields.
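The peak-detection idea in this abstract can be illustrated with a minimal software sketch. It is not the paper's circuit: it assumes the sampled LVDT secondary signal is a sine wave whose peak amplitude is proportional to displacement, and the excitation frequency, sampling rate, and `volts_per_mm` sensitivity are hypothetical values chosen for illustration. Averaging the peaks stands in for the filtering step.

```python
import math

def detect_peaks(samples):
    """Return indices where a sample is a local maximum of its neighbours."""
    return [i for i in range(1, len(samples) - 1)
            if samples[i - 1] < samples[i] >= samples[i + 1]]

def displacement_from_peaks(samples, volts_per_mm):
    """Average the positive peaks (a crude filter) and scale to millimetres.

    volts_per_mm is a hypothetical sensor sensitivity, not a value from the paper.
    """
    peaks = [samples[i] for i in detect_peaks(samples) if samples[i] > 0]
    if not peaks:
        raise ValueError("no positive peak found in the sample window")
    mean_peak = sum(peaks) / len(peaks)
    return mean_peak / volts_per_mm

# Simulated secondary signal: 5 kHz excitation sampled at 1 MHz (assumed values).
EXCITATION_HZ = 5000.0
SAMPLE_HZ = 1_000_000.0
AMPLITUDE_V = 1.5
samples = [AMPLITUDE_V * math.sin(2 * math.pi * EXCITATION_HZ * n / SAMPLE_HZ)
           for n in range(1000)]

displacement_mm = displacement_from_peaks(samples, volts_per_mm=0.5)
```

With these assumed values, one peak is found per excitation cycle and the averaged amplitude is converted to a displacement reading.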
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422B (2025) https://doi.org/10.1117/12.3053321
To address these issues, this study proposes a lightweight and highly deployable vehicle detection model. The model builds upon the single-stage YOLOv7 architecture, replacing the original convolutional layers with the inverted residual structure and depth-wise separable convolutions from MobileNetV3. This streamlines the network width and computational parameters. Furthermore, an ECA attention mechanism is incorporated into both the backbone network and multi-scale feature branches to reduce computational overhead while enhancing the model's cross-channel feature extraction capabilities. Additionally, data augmentation techniques were applied to the vehicle dataset, and the Focal Loss strategy was employed for model training and evaluation. Experimental results demonstrate that the proposed model reduces the parameter count from 37.62M to 10.217M while achieving a competitive mAP of 77.59%. The proposed model achieves a real-time inference speed of 40.04 FPS when deployed on a Jetson Xavier NX edge platform, representing a 77.8% performance improvement over the original YOLOv7 model. The lightweight and high-performance characteristics of the proposed detection model enable its seamless integration into resource-constrained edge computing infrastructure for ITS applications. This work provides a valuable technical foundation for realizing the full potential of intelligent.
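To show why depth-wise separable convolutions shrink a model, the sketch below compares the weight count of a standard convolution with its depth-wise separable counterpart and computes the relative parameter reduction implied by the figures quoted in the abstract. The layer shape (3x3 kernel, 128 input and 256 output channels) is a hypothetical example, bias terms are ignored, and the function names are illustrative rather than taken from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Depth-wise step (one kernel x kernel filter per input channel) plus 1x1 point-wise step."""
    return kernel * kernel * c_in + c_in * c_out

def relative_reduction(before, after):
    """Fraction of parameters removed when going from `before` to `after`."""
    return (before - after) / before

# Hypothetical layer shape, chosen only for illustration.
std = standard_conv_params(3, 128, 256)
sep = depthwise_separable_params(3, 128, 256)
layer_saving = relative_reduction(std, sep)

# Parameter counts quoted in the abstract, in millions.
model_saving = relative_reduction(37.62, 10.217)
```

For the hypothetical layer, the separable form needs roughly a ninth of the weights; for the whole model, the quoted counts correspond to a reduction of about 72.8%.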
Mengyao Chen, Pengyan Yan, Lin Han, Haoran Li, Cuixia Wang
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422C (2025) https://doi.org/10.1117/12.3054338
The automatic OpenMP implementation in the current GCC compiler adopts the fork-join model, where frequent creation and convergence of thread groups result in significant management control overhead. This paper studies methods to reduce thread group creation and convergence to improve the efficiency of automatic OpenMP programs. A universal optimization method for merging parallel regions is proposed in this paper to address the fork-join model. Through variable attribute modifications, handling of serial statements, and synchronization operation optimization, adjacent continuous parallel regions are merged into a larger parallel region to reduce parameter passing within the parallel region and lower the overhead caused by the creation and destruction of parallel regions. This work is implemented based on the GCC 10.3.0 compiler and experimentally validated using the NPB 3.4-OMP test suite, achieving an average overall performance improvement of 20%. The experimental results demonstrate the effectiveness and generality of the method proposed in this paper. This method can effectively enhance the runtime efficiency of OpenMP programs, serve as a reference for optimizing the implementation of OpenMP programs, and provide support for thread dynamic merging techniques in AI compilation.
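The effect of merging parallel regions can be sketched with a toy model. This is not the GCC implementation described in the abstract: it represents a program as a flat list of `("parallel", name)` and `("serial", name)` items, assumes every serial statement sandwiched between two parallel regions can safely be executed by a single thread inside the merged region (marked `"single"` here), and simply counts how many fork-join pairs remain.

```python
def merge_parallel_regions(program):
    """Merge parallel regions separated only by serial statements.

    Returns a list of ("serial", [name]) and ("parallel", body) items, where
    body is a list of ("work", name) or ("single", name) entries.
    """
    merged, i = [], 0
    while i < len(program):
        kind, name = program[i]
        if kind == "serial":
            merged.append(("serial", [name]))
            i += 1
            continue
        body = [("work", name)]
        i += 1
        while True:
            # Skip over serial statements to look for the next parallel region.
            j = i
            while j < len(program) and program[j][0] == "serial":
                j += 1
            if j == len(program):
                break  # only trailing serial statements remain; keep them outside
            # Absorb the sandwiched serial statements and the next region.
            body.extend(("single", s) for _, s in program[i:j])
            body.append(("work", program[j][1]))
            i = j + 1
        merged.append(("parallel", body))
    return merged

def count_fork_joins(items):
    """Each parallel region costs one thread-team creation and one join."""
    return sum(1 for kind, _ in items if kind == "parallel")

program = [("serial", "init"), ("parallel", "loop1"), ("parallel", "loop2"),
           ("serial", "update"), ("parallel", "loop3"), ("serial", "report")]
merged = merge_parallel_regions(program)
```

In this example the three original parallel regions collapse into one, so the modelled fork-join count drops from three to one; a real compiler pass would also have to adjust variable attributes and synchronization, as the abstract notes.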
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422D (2025) https://doi.org/10.1117/12.3054276
This study introduces a novel framework for video-based 3D human pose and shape estimation, termed Selective sampling and Temporal Positional Encoding (STPE). Our method leverages selective sampling and advanced positional encoding to tackle the temporal complexities of video data and the high cost and scarcity of annotated datasets. Inspired by the Masked Autoencoder (MAE), our approach adopts a selective sampling strategy that efficiently captures the essential dynamics of human motion from partial views, significantly reducing reliance on continuous frames. The framework incorporates Rotary Position Embedding (RoPE), using rotational angles to simplify positional encoding. This decreases model complexity and boosts learning effectiveness. We also randomize index positions during training, which adds variability and enhances generalization across various datasets and motion patterns. Our model, validated on standard datasets such as 3DPW, MPI-INF-3DHP, and Human3.6M, shows enhanced performance in accurate and robust 3D pose and shape capture compared to existing methods. Our results demonstrate that strategic frame sampling and sophisticated positional encoding can significantly improve the accuracy and robustness of video-based pose estimation systems.
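As a rough illustration of the Rotary Position Embedding mentioned in this abstract, the sketch below rotates consecutive pairs of vector components by an angle that grows with the position index. It follows the commonly used RoPE formulation with base 10000 and is not the STPE implementation; the vectors and positions are made-up values.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle position * base**(-i/d)."""
    d = len(vec)
    if d % 2:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend((x * c - y * s, x * s + y * c))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors.
q = [0.5, -1.0, 2.0, 0.25]
k = [1.0, 0.5, -0.5, 1.5]

# The score depends only on the relative offset (here 2), not on absolute positions.
score_near_start = dot(rope(q, 3), rope(k, 1))
score_later = dot(rope(q, 10), rope(k, 8))
```

Because the attention score after rotation depends only on the difference between the two positions, frames sampled at irregular indices can still be compared by their temporal offset.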
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422E (2025) https://doi.org/10.1117/12.3054438
An adaptive improved dynamic window approach based on fuzzy logic is proposed to address the challenges of avoiding dense obstacles and the poor applicability of the traditional dynamic window approach. First, a target point traction evaluation sub-function is added to the original evaluation function. Then, fuzzy logic is integrated into the improved dynamic window approach, allowing for real-time adjustment of the weight coefficients of various evaluation subfunctions based on environmental information. This enables local dynamic path planning for unmanned surface vehicles in various scenarios. Finally, the effectiveness of the algorithm is validated through simulation experiments.
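A minimal sketch of the adaptive evaluation function described here, under stated assumptions: each candidate velocity pair already has normalized sub-scores in [0, 1], the added target point traction term appears as `traction`, and a two-rule, Sugeno-style blend driven by the distance to the nearest obstacle stands in for the paper's fuzzy inference. All weights, thresholds, and names are hypothetical.

```python
def membership_near(distance, near=2.0, far=10.0):
    """Degree (0..1) to which the nearest obstacle counts as 'near' (linear ramp)."""
    if distance <= near:
        return 1.0
    if distance >= far:
        return 0.0
    return (far - distance) / (far - near)

# Hypothetical rule consequents: weight sets for open water and dense obstacles.
OPEN_WEIGHTS = {"heading": 0.4, "clearance": 0.1, "velocity": 0.3, "traction": 0.2}
DENSE_WEIGHTS = {"heading": 0.2, "clearance": 0.5, "velocity": 0.1, "traction": 0.2}

def adapt_weights(obstacle_distance):
    """Blend the two weight sets according to how 'near' the obstacle is."""
    mu = membership_near(obstacle_distance)
    return {key: (1 - mu) * OPEN_WEIGHTS[key] + mu * DENSE_WEIGHTS[key]
            for key in OPEN_WEIGHTS}

def best_candidate(candidates, weights):
    """Pick the candidate with the highest weighted sum of sub-scores."""
    return max(candidates, key=lambda c: sum(weights[k] * c[k] for k in weights))

# Two made-up candidates: A is fast and well aligned, B keeps more clearance.
candidates = [
    {"name": "A", "heading": 0.9, "clearance": 0.2, "velocity": 0.9, "traction": 0.8},
    {"name": "B", "heading": 0.6, "clearance": 0.9, "velocity": 0.4, "traction": 0.6},
]

choice_open = best_candidate(candidates, adapt_weights(20.0))["name"]
choice_dense = best_candidate(candidates, adapt_weights(1.0))["name"]
```

With these assumed numbers, the planner prefers the faster candidate in open water and switches to the safer one when an obstacle is close, which is the behaviour that real-time weight adjustment is meant to produce.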
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422F (2025) https://doi.org/10.1117/12.3054346
In our paper, we developed a Long Short-Term Memory (LSTM) model to explore the role of momentum in tennis performance, particularly within the context of the 2023 Wimbledon Championships. The model describes changes in on-court situations through the difference in performance between the two sides, focusing on predicting changes in tennis matches. It begins with a correlation analysis to identify effective indicators, then introduces a new dimension called "state" and employs a state-corrected LSTM network. This enhances the understanding of match dynamics and aids in more accurate player performance predictions. Based on this, the model further predicts the scoring rate for both sides and offers player suggestions. Our model achieves an accuracy of up to 69.654% and predicts the winning rate of the game more accurately than the previous model (a Support Vector Machine model). Through the model, our paper presents a multifaceted view of momentum's role in tennis, offering strategies for improving player performance and decision-making. The evaluation of these models demonstrates their practical applicability and effectiveness in a competitive sports environment.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422G (2025) https://doi.org/10.1117/12.3053022
The new cloud-network operation system is based on cloud, distributed technology, and service-oriented architecture, aiming to improve the operation efficiency of the cloud-network through all-digital transformation. Unified service modelling is a prerequisite for the construction of the new cloud-network operation system. This paper proposes a PSR (Product, Service, Resource) layered decoupling model and encapsulates standardized network capabilities across the network based on the PSR model in order to realize the intelligent opening and operation of the cloud-network. The paper introduces China Telecom's PSR model, proposes the basic principles and design methods for encapsulating IP network capabilities under the PSR model architecture, and shares practical cases and exploratory experience in IP network capability design and capability opening. It thereby lays a solid technical foundation for the enterprise's digital transformation and the construction of an intelligent, automated, and visualized operation and service system.
Proceedings Volume Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134422H (2025) https://doi.org/10.1117/12.3053113
Based on the design concept of “human-computer collaboration”, this research aims to design an intelligent college English teaching assistant with the core goal of promoting the deep integration of technical wisdom and English teacher wisdom, so as to help English teachers carry out intelligent and collaborative English teaching. The intelligent teaching assistant adopts a modular design approach, and a browser/server (B/S) structure is chosen as the network architecture. The functional modules include a login verification module, a teaching preparation module, a teaching module, and a management module. The logical structure of the database is designed with the E-R data model, and each entity is converted into a base table in the database. The core base tables include the user information table, the sentence library table, and the courseware information table. A MySQL database is chosen to store and process user information, passwords, the corpus, assignment feedback, and other data.