Computer-aided thyroid CT image segmentation aims to provide imaging physicians and clinicians with auxiliary diagnostic suggestions and to improve the efficiency of diagnosing the thyroid region. However, distinguishing the thyroid from surrounding tissues remains challenging because thyroid disease causes adhesions in thyroid CT images. To achieve accurate segmentation of thyroid CT images under the intervention of different types of thyroid nodules, we propose a thyroid segmentation network named ResUnet, which introduces the residual learning idea into UNet. Our network controls gradient dispersion by incorporating batch normalization and intermediate-layer regularization operations, and alleviates the degradation problem by introducing residual connections into the convolution operations. Moreover, ResUnet converges faster at the same depth, which supports a deeper network design. Extensive experiments validated the high accuracy (94.10%), specificity (98.94%), and sensitivity (96.34%) of the proposed ResUnet for thyroid nodule segmentation, which can assist CT physicians in diagnosing the thyroid gland.
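The abstract above does not give the exact layer configuration, but the core idea of a residual block with batch normalization can be sketched schematically. In the minimal sketch below, a dense transform stands in for the 3x3 convolutions of the actual ResUnet encoder; the weights, shapes, and activation choices are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature to zero mean and unit variance over the batch."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Schematic residual block: y = ReLU(BN(f(x)) + x).

    A dense transform replaces the convolutions of the real network
    (a simplification for illustration only). The skip connection
    lets gradients bypass the transform, easing the degradation
    problem in deeper networks.
    """
    out = relu(batch_norm(x @ w1))
    out = batch_norm(out @ w2)
    return relu(out + x)  # identity shortcut added before the final activation

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))           # batch of 8 feature vectors
w1 = rng.standard_normal((16, 16)) * 0.1   # assumed toy weights
w2 = rng.standard_normal((16, 16)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # the output keeps the input shape, as the skip connection requires
```

The shape-preserving property shown here is what allows residual blocks to be stacked freely, supporting the deeper design the abstract mentions.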
Human-computer interaction has long been a hot research direction in computing, and augmented reality technology, which combines the virtual world with the real world, has broad application prospects. In recent years, augmented reality based on gesture-recognition interaction has developed considerably. This paper proposes three modules: a gesture recognition module based on the MediaPipe framework, a special-effect generation module based on a pose estimation model that combines feature points and straight lines, and an intelligent media interactive application development module. With these modules, an intelligent media interactive application that uses gesture recognition to generate 2D/3D interactive effects in real time has been designed and developed.
Motion recognition is widely used in somatosensory games, rehabilitation training, and robot motion learning. In tennis training, captured actions can be identified and classified to improve computer-aided tennis teaching in a timely and accurate way. Traditional image- or video-based human posture capture and recognition are easily affected by complex backgrounds, varying lighting conditions, and other factors in practical applications. In this paper, Microsoft Kinect is used as the sensor device to capture data from the tennis player. First, Kinect's depth-sensing technology is used to obtain human skeleton data. Second, to improve classification efficiency, the dimensionality of the data is reduced by extracting feature values from the human skeleton. Third, a KNN algorithm that defines per-dimension weights is proposed to classify the movements; its accuracy is 94%, compared with 92.4% for the standard KNN algorithm, 92.81% for the decision tree algorithm, and 89.97% for the CNN algorithm. An evaluation method for tennis actions is defined to provide guidance for users. By comparing the differences between the user's postures and the standard postures at the joint positions and angles that are prone to error, this method can correct the user's postures and provide movement guidance. Judging from the teaching results for both tennis aficionados and general players, this method is more practical and targeted than traditional graphics- and video-based tennis teaching.
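The abstract describes a KNN classifier with per-dimension weights but does not specify the weighting scheme. The sketch below assumes a fixed weight vector inside a weighted Euclidean distance; the feature values, weights, and class labels are toy stand-ins for the skeleton features used in the paper.

```python
import math
from collections import Counter

def weighted_knn(train, labels, query, weights, k=3):
    """Classify `query` by majority vote among the k nearest training
    samples under a weighted Euclidean distance. `weights` emphasizes
    the feature dimensions assumed to matter most (the actual weights
    in the paper are not specified here)."""
    dists = []
    for x, y in zip(train, labels):
        d = math.sqrt(sum(w * (a - b) ** 2
                          for w, a, b in zip(weights, x, query)))
        dists.append((d, y))
    dists.sort(key=lambda t: t[0])
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# toy example: two skeleton-feature dimensions, the second weighted higher
train = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
labels = ["serve", "serve", "forehand", "forehand"]
pred = weighted_knn(train, labels, (5.1, 5.1), weights=(1.0, 2.0))
print(pred)  # "forehand": both forehand samples are among the 3 nearest
```

Setting all weights to 1.0 recovers standard KNN, which is the baseline the abstract compares against.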
Infrared images from nighttime cattle farms have low contrast, blurred visual effects, and unclear details. We propose a method based on the dark channel prior and piecewise linear stretching to enhance infrared images. The improved image quality helps annotators label images more accurately when preparing the dataset. The image enhancement results are compared with other methods to evaluate performance. Furthermore, we verify its effect on nighttime cattle detection based on YOLOv4. Appropriate prior anchor boxes for this task are obtained by K-means clustering on the cattle image dataset. YOLOv4 cattle detection models are trained on datasets of original images and of processed images. A total of 1400 cattle images from different scenes were collected from surveillance videos as the experimental dataset. The average precision (AP) of cattle detection exceeds 95%, and the APs obtained from enhanced images are 0.64% and 0.70% higher than those of the control groups. Experimental results show that image enhancement can improve the accuracy of nighttime cattle detection based on YOLOv4.
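The abstract does not give the breakpoints of its piecewise linear stretch, so the sketch below uses assumed values: a three-segment mapping that compresses the dark and bright tails and expands the mid-range grey levels where low-contrast infrared pixels concentrate. The breakpoints `r1`, `s1`, `r2`, `s2` are illustrative, not those from the paper.

```python
import numpy as np

def piecewise_linear_stretch(img, r1, s1, r2, s2):
    """Three-segment linear stretch of 8-bit grey levels.

    Input levels below r1 are mapped linearly onto [0, s1], the
    mid-range [r1, r2] is expanded onto [s1, s2], and levels above
    r2 are mapped onto [s2, 255]. Choosing s2 - s1 > r2 - r1
    increases mid-range contrast.
    """
    img = img.astype(np.float64)
    out = np.empty_like(img)
    low = img < r1
    mid = (img >= r1) & (img <= r2)
    high = img > r2
    out[low] = img[low] * (s1 / r1)
    out[mid] = s1 + (img[mid] - r1) * (s2 - s1) / (r2 - r1)
    out[high] = s2 + (img[high] - r2) * (255 - s2) / (255 - r2)
    return np.clip(out, 0, 255).astype(np.uint8)

# a low-contrast infrared patch with grey levels concentrated in [80, 150]
img = np.array([[80, 100], [120, 150]], dtype=np.uint8)
stretched = piecewise_linear_stretch(img, r1=80, s1=30, r2=150, s2=220)
print(stretched)  # the mid-range [80, 150] is expanded to [30, 220]
```

After stretching, the dynamic range of the patch grows from 70 grey levels to 190, which is the kind of contrast gain that makes manual annotation of nighttime frames easier.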