We present a real-time system for vehicle detection and classification at road intersections based on image processing techniques. The system estimates traffic flow at a specific point: it recognizes the trajectories of individual vehicles at an intersection and infers whether they are leaving or entering the city. It is designed to be integrated into a high-fidelity digital twin that aids in estimating traffic-related environmental pollutants. Whereas Computational Fluid Dynamics (CFD) simulations typically rely on coarse estimators such as averages or aggregate measurements, our system supplies more accurate, per-vehicle inputs for pollution estimation. The implications of this study are significant for urban planning and traffic management: it supports immediate decisions and informs long-term infrastructure planning by providing a deep understanding of intersection dynamics. Our research offers a comprehensive perspective on traffic analysis, introducing data-driven traffic management strategies for efficient urban mobility. The code developed for this purpose is available at https://github.com/capo-urjc/TrackingSORT
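The association step at the heart of SORT-style tracking can be sketched as follows; this is an illustrative simplification (greedy IoU matching in pure Python), not the repository's code, which combines a Kalman filter with Hungarian assignment:

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_detections(tracks, detections, iou_threshold=0.3):
    """Greedily associate current detections with existing tracks by IoU."""
    pairs = sorted(
        ((iou(t, d), ti, di) for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True)
    matched_t, matched_d, matches = set(), set(), []
    for score, ti, di in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same object
        if ti not in matched_t and di not in matched_d:
            matches.append((ti, di))
            matched_t.add(ti); matched_d.add(di)
    return matches

tracks = [(0, 0, 10, 10), (50, 50, 60, 60)]
detections = [(49, 51, 59, 61), (1, 1, 11, 11)]
print(match_detections(tracks, detections))  # [(1, 0), (0, 1)]
```

Unmatched detections would spawn new tracks and unmatched tracks age out, which is how entry/exit trajectories accumulate over frames.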
Helical flutes are the most important elements of axial cutting tools. The technological process for manufacturing helical flutes is multifactorial, complex and difficult to compute quickly online. Flutes are produced on a multi-axis grinding machine with a conical or cylindrical grinding wheel. The task is further complicated by the fact that in more than one section the shape of the helical flute does not match that of the grinding wheel. The development of the technology and its preparation therefore requires considerable time. This work proposes a digital twin for the production of helical flutes, taking as input the results of 40 variants of the technological process and comparing them with the results of processing images of helical surfaces in real time. The work establishes, for the first time, an analytical connection between the shape of the grinding wheel, the geometry of the front helical surface of the cutter, and the trajectory and location of the wheel, structured into a system of sensor signals with a controlled monitoring effect per unit of time. This made it possible to define real-time control indicators for assessing accuracy in the processing area and, based on these indicators, to form a control action that ensures the production of conforming parts. This work lays the foundation for a new direction in the industrial production of cutting tools. Using real-time data in the current time window to evaluate future spatio-temporal information about the process state makes it possible to continuously adjust the movement of the grinding wheel within a given error and deviation, ensured by a continuous analytical solution that takes into account the basic geometric parameters of the cutter and the grinding wheel.
Recent algorithmic developments, specifically in deep learning, have propelled computer vision forward for practical applications. However, the high computational complexity and the resulting power consumption are often overlooked. This is a problem not only when systems must be installed in the wild, where often only a limited electricity supply is available, but also in view of their overall energy consumption. To address both aspects, we explore the intersection of green artificial intelligence and real-time computer vision, focusing on the use of single-board computers. To this end, we take into account the limitations of single-board computers, including limited processing power and storage capacity, and demonstrate how algorithm and data optimization can maintain high-quality results at a drastically reduced computational effort. Energy efficiency is thereby increased, aligning with the goals of Green AI and making such systems less dependent on a permanent electrical power supply.
Real-time image processing is a key area of focus, but computationally intensive. Neural networks effectively address classification tasks, but they are not always a viable option, particularly in environments where high power consumption or computational requirements are limiting factors. Hardware devices such as Field-Programmable Gate Arrays (FPGAs) offer significant parallelization capabilities that can be fully exploited when the implemented circuit is composed solely of logic gates. In addition, FPGAs are also interesting alternatives to traditional GPU-based implementations in terms of power consumption and reconfiguration capabilities. They can be used as a demonstration platform to validate a hardware design that can be later manufactured, creating the final Application-Specific Integrated Circuit (ASIC). This paper introduces a practical demonstration platform based on an FPGA that highlights the great capabilities of logic neural networks, a type of neural network constructed exclusively with logic gates.
By harnessing FPGA parallelization and logic gates, we have achieved a balance between computational power and real-time performance. This approach ensures that image classification occurs at speeds on the order of nanoseconds. This ultra-fast processing is well-suited for real-time image analysis applications across various domains. Industries that rely on quality control, such as manufacturing, will benefit from rapid and precise assessments. In the field of medical image processing, where quick diagnoses are crucial, this technology promises transformative advancements. The demonstration platform developed serves as a proof of concept for logic neural networks, offering a solution to the challenge of real-time image processing and representing the first step towards the implementation of future architectures of logic networks in hardware.
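As a software illustration of the underlying idea (the actual design is a hardware circuit, not shown here), the toy network below — with invented gate choices and wiring — shows how inference in a logic neural network reduces to evaluating layers of two-input Boolean gates, with no multiplications at all:

```python
# Every "neuron" is a two-input Boolean gate; a layer is a list of
# (gate_name, input_index_a, input_index_b) triples over the previous layer.
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}

def forward(layers, bits):
    """Evaluate the gate network on a list of 0/1 inputs."""
    for layer in layers:
        bits = [GATES[g](bits[i], bits[j]) for g, i, j in layer]
    return bits

# A toy 2-layer network computing a 1-bit full-adder sum from 3 inputs.
net = [
    [("XOR", 0, 1), ("AND", 0, 1), ("OR", 2, 2)],  # partial sum, carry, pass-through c_in
    [("XOR", 0, 2)],                               # sum = (a ^ b) ^ c_in
]
print(forward(net, [1, 0, 1]))  # [0]  (1+0+1 -> sum bit 0, carry 1)
```

In hardware each triple becomes one LUT-sized gate, which is why a whole classification pass can settle in nanoseconds.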
For many practical applications, we face the problem that computer vision systems must be installed in the wild, with limited or no permanent power supply. Computationally and energy-efficient solutions are therefore needed. In this work, we show that the judicious use of single-board computers (SBCs) can help achieve these goals, in line with the aims of Green AI. In particular, we show that computer vision algorithms adapted to SBCs yield competitive results compared to high-performance computing devices. To this end, in addition to quantitative performance evaluations, we also measured and compared the power consumption of the algorithmic and technical setups used for various practical problems. These examples demonstrate the practical sustainability of SBCs: they maintain real-time performance while reducing power consumption and environmental impact.
The Multiply-Accumulate (MAC) operation is widely used in real-time image processing tasks, ranging from Convolutional Neural Networks to digital filtering, and significantly impacts overall system performance. In this work, the Self-Adapting Reconfigurable Multiply-Accumulate (SR-MAC) unit is proposed as a new instrument for finding the optimal trade-off between operation throughput, power consumption and physical resource utilization in real-time image processing applications. The proposed system relies on the dynamic reconfiguration of hardware resources on the basis of the current computational requirements. This is achieved by monitoring overflow and over-representation occurrences at each accumulation cycle, and by retaining only the relevant portion of the accumulation result. A custom architecture of the proposed algorithm has been designed in Verilog, implemented on an AMD Xilinx Artix-7 FPGA, and compared with the AMD Xilinx fixed-point macro (floating-point fused multiply-accumulate). The SR-MAC achieves reductions of 83% (82%), 79% (93%) and 87.2% (94%) in the number of LUTs, the number of FFs, and the dynamic power dissipation PdynN, respectively. The SR-MAC has also been used to replace arithmetic units in typical real-time image processing applications; in these cases, it reduced FFs and PdynN by up to 6% and 14%, respectively, while increasing fMax by up to 14%. These results highlight the significant performance enhancement achieved for both single operators and entire systems, making the SR-MAC an excellent design choice for real-time image processing applications.
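The reconfiguration idea can be sketched behaviorally as follows; this is a simplified software model, not the SR-MAC Verilog architecture — an accumulator that checks the signed range after every accumulation cycle and widens itself on overflow:

```python
# Illustrative sketch (not the SR-MAC netlist): an accumulator that starts
# narrow and widens itself whenever an accumulation cycle overflows,
# mimicking the idea of reconfiguring hardware resources on demand.
class SelfWideningMAC:
    def __init__(self, width=8):
        self.width = width       # current two's-complement accumulator width
        self.acc = 0

    def mac(self, a, b):
        self.acc += a * b
        # overflow check for signed range [-2^(w-1), 2^(w-1) - 1]
        while not (-(1 << (self.width - 1)) <= self.acc < (1 << (self.width - 1))):
            self.width += 8      # widen by one byte, as a hardware block might
        return self.acc

m = SelfWideningMAC(width=8)
for a, b in [(100, 1), (100, 1), (100, 1)]:
    m.mac(a, b)
print(m.acc, m.width)  # 300 16  (8 bits overflowed at 200, widened to 16)
```

In hardware, the payoff is the converse direction: keeping the datapath narrow whenever the running sum does not need the extra bits, which is where the LUT/FF and power savings come from.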
Super resolution (SR) is a technique for increasing the spatial resolution of an image from a low-resolution (LR) to a high-resolution (HR) size. SR technology is in considerable demand for recovering HR images in a wide variety of applications, such as medicine, engineering, computer vision, pattern recognition and video production. In contrast to interpolation-based algorithms that often introduce distortions or irregular borders, this study proposes an implementation that preserves the edges and fine details of the original image by computing the wavelet decomposition. Different Discrete Wavelet Transform (DWT) families, such as Daubechies, Symlet, and Coiflet, were evaluated. The proposed system was implemented on a Raspberry Pi 4 Model B, an embedded device, to get around the mobility limitations of a PC, making it possible to create an inexpensive and energy-efficient SR system with reduced complexity for real-time applications. To investigate the visual performance, SR images were analysed subjectively via human perception, confirming good perceptual quality for images of different nature from three datasets: Full-HD (DIV2K), medical (Raabin WBC), and remote sensing (Sentinel-1). The experimental results of the designed implementation demonstrate good performance on commonly used objective criteria: execution time, SSIM, and PSNR (0.742 s, 0.9164, and 38.72 dB, respectively) for images with a super-resolution size of 1356 x 2040 pixels.
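The wavelet machinery can be illustrated with the Haar family, the simplest relative of the evaluated Daubechies/Symlet/Coiflet families. The sketch below implements one analysis/synthesis level in NumPy and shows a common DWT upscaling trick — treating the (suitably scaled) LR image as the approximation band with zeroed detail bands — as a simplified stand-in for the proposed method:

```python
import numpy as np

def haar2d(x):
    """One level of the orthonormal 2-D Haar analysis transform."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    LL = (a + b + c + d) / 2
    LH = (a - b + c - d) / 2
    HL = (a + b - c - d) / 2
    HH = (a - b - c + d) / 2
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Inverse (synthesis) transform; perfect reconstruction of haar2d."""
    h, w = LL.shape
    y = np.empty((2 * h, 2 * w))
    y[0::2, 0::2] = (LL + LH + HL + HH) / 2
    y[0::2, 1::2] = (LL - LH + HL - HH) / 2
    y[1::2, 0::2] = (LL + LH - HL - HH) / 2
    y[1::2, 1::2] = (LL - LH - HL + HH) / 2
    return y

# DWT-based 2x upscaling sketch: the LR image becomes the approximation
# band (scaled to preserve intensity), detail bands are set to zero.
lr = np.array([[10., 20.], [30., 40.]])
sr = ihaar2d(2 * lr, *(np.zeros_like(lr),) * 3)
print(sr.shape)  # (4, 4)
```

With zero detail bands this degenerates to pixel replication; the interesting part of wavelet SR is precisely estimating those detail subbands instead of zeroing them, which is what lets edges stay sharp.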
The intersection of deep learning and programmable logic controllers (PLCs) can lead to innovative applications in automation. One exciting application area is gesture-based control of Automated Guided Vehicles (AGVs). AGVs are used in various industries for material handling, logistics, warehouse automation, etc. Traditionally, these vehicles are controlled using predefined routes or remote controls, but with gesture-based control, operators can communicate more naturally and efficiently. The incorporation of YOLO-Pose in YOLO versions 7 and 8 has elevated the YOLO algorithm to a leading tool for creating gesture recognition models. The YOLO algorithm employs convolutional neural networks (CNNs) to detect objects in real time. These latest YOLO models offer significantly improved accuracy and speed with reduced training times. This paper presents comparative results for 2D gesture recognition transfer-learning models created using the YOLO v5, v7, and v8 models, along with the steps taken to implement the model in a PLC-controlled AGV.
This paper introduces an integrated approach to challenges in traffic monitoring, control and simulation by leveraging Visible Light Communication (VLC) technology. The proposed method optimizes traffic light signals and vehicle and pedestrian trajectories at urban intersections, incorporating Vehicle-to-Vehicle (V2V), Vehicle-to-Infrastructure (V2I), Infrastructure-to-Vehicle (I2V), and Pedestrian-to-Infrastructure (P2I) VLC communication. Experimental results demonstrate the feasibility of implementing these VLC modes in adaptive traffic control systems. Through modulated light, information is exchanged between connected vehicles (CVs) and infrastructure elements such as streetlamps and traffic lights. Cooperative CVs share position and speed data via V2V communication within control zones, enabling adaptation to various traffic movements during signal phases. Optimal traffic light control policies are determined using Reinforcement Learning and the Simulation of Urban MObility (SUMO) agent-based simulator. Unlike conventional methods focused solely on maximizing traffic capacity, this approach integrates traffic efficiency and safety considerations, including pedestrian concerns at intersections. Simulation scenarios adapted from real-world environments, such as Lisbon, feature interconnected intersections with mutual traffic flow impact. A deep reinforcement learning algorithm dynamically manages traffic flows during peak hours via V2V and V/P2I communications while prioritizing pedestrian and vehicle waiting times; VLC mechanisms facilitate the queue/request/response interactions. A comparative analysis highlights the proposed approach's benefits in throughput, delay reduction, and minimizing vehicle stops, revealing improved patterns for signal and trajectory optimization. Evaluation on separate training and test sets ensures model reliability and effectiveness.
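The reinforcement learning loop can be illustrated in miniature; the tabular Q-learning agent and toy intersection below are invented for illustration (the paper uses deep RL with the SUMO simulator and VLC-supplied state):

```python
import random

# State: discretised queue lengths on the NS and EW axes.
# Action 0/1: give green to NS or EW.  Reward penalises total waiting.
random.seed(0)
Q = {}
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

def choose(state):
    if random.random() < EPS:                       # epsilon-greedy exploration
        return random.choice((0, 1))
    return max((0, 1), key=lambda a: Q.get((state, a), 0.0))

def update(state, action, reward, next_state):
    best_next = max(Q.get((next_state, a), 0.0) for a in (0, 1))
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

def step(queues, action):
    ns, ew = queues
    if action == 0:
        ns, ew = max(0, ns - 2), ew + 1             # NS discharges, EW grows
    else:
        ns, ew = ns + 1, max(0, ew - 2)
    return (ns, ew), -(ns + ew)                     # reward = negative waiting

state = (3, 3)
for _ in range(500):
    a = choose(state)
    nxt, r = step(state, a)
    update(state, a, r, nxt)
    state = nxt
print(state)
```

The deep variant replaces the Q table with a network over a much richer state (per-vehicle positions and speeds from V2V/V2I messages, pedestrian requests), but the update rule has the same shape.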
Satellite imagery-based ship detection is indispensable for maritime surveillance and for monitoring naval activities. Machine learning is an effective approach that makes the process automatic and more accurate than many alternatives. Optical and synthetic aperture radar (SAR) satellite images are often employed for detecting and locating various marine activities using different methods. However, models trained on one set of images often yield large uncertainties when tested on other sets of images due to complex scene characteristics. This study proposes a novel, lightweight, computationally efficient deep learning-based general ship detection model, the Multi-Attentive General Ship Detector (MAGSD), for detecting ships in both optical and SAR satellite images. The model is trained on the SAR Ship Dataset (SDD), which contains ship instances from Gaofen-3 and Sentinel-1 SAR satellite images, and the MASATI dataset, which contains ship instances from Microsoft Bing Maps. The proposed model relies on an attention-guided convolutional neural network for extracting feature maps for detection, bridging the gap between SAR and optical image characteristics by focusing on different levels of convolutional features in the network. The model is built on a novel feature extractor with fourteen convolutional layers, six max-pool layers and six attention layers, connecting several convolutional points to focus on local features in different depth maps; this extractor serves as the backbone of the model. A comparative analysis showed the robustness of the proposed model over the state-of-the-art baseline YOLOv5s, with improvements of 8.2% in precision and 9.63% in recall. These results indicate that the proposed model can serve as an efficient tool for ship detection in either type of satellite image, contributing to enhanced coastal surveillance and global naval security.
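A generic attention gate of the kind such attention-guided networks use can be sketched in a few lines; the projection weights and tensor shapes below are illustrative, not the actual MAGSD layers:

```python
import numpy as np

# A 1x1-conv-style projection produces a sigmoid mask that re-weights
# every spatial location of a (channels, height, width) feature tensor,
# letting the network emphasise ship-like local features.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(feat, w, b):
    """feat: (C, H, W); w: (C,) projection weights; b: scalar bias."""
    score = np.tensordot(w, feat, axes=(0, 0)) + b   # (H, W) attention logits
    mask = sigmoid(score)                            # values in (0, 1)
    return feat * mask[None, :, :]                   # broadcast over channels

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))
out = spatial_attention(feat, w=np.ones(4), b=0.0)
print(out.shape)  # (4, 8, 8)
```

Stacking such gates at several depths of the backbone is what lets one detector attend to both speckle-dominated SAR textures and optical appearance cues.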
With the increase in hybrid meetings, innovative strategies are needed to enable both local and remote participants to interact and communicate as they would in in-person meetings. For a standing-table scenario, we present an end-to-end real-time 3D video processing workflow and demonstrator setup. The system offers, on a single display, two different perspectives of a remote participant to two local participants. It consists of a 3D capture setup and processing workflow at the remote site and a novel dual-view display at the local site. The standing-table scenario requires highly realistic, high-resolution rendering quality on the receiving side. Hence, low-resolution depth cameras and neural network-based depth enhancement are used in conjunction with 8K cameras to achieve high-quality, perspectively corrected views, allowing direct eye contact for individual users. Distinct real-time view rendering is achieved without holes, providing high-quality novel views. At the local site, a novel dual-view display presents two different perspectives of the remote participant at two different positions in space. The display is based on a lenticular lens specifically designed for our use case. The prototype setup is optimized for low-latency processing and transmission to satisfy the constraints of our video communication scenario.
Optical image processing, which capitalizes on the distinctive characteristics of light, facilitates the manipulation of visual data in real time and at high speed. This technology is instrumental in tasks such as edge enhancement, pattern recognition, and feature extraction, all of which are crucial in fields like medical imaging, surveillance, and industrial automation. In this study, we demonstrate a photonic integrated circuit (PIC) made of lithium niobate on insulator that performs matrix-vector multiplications for image classification. Surpassing an electrical bandwidth of 15 GHz, our experiment showcases the PIC's ability to perform live edge detection and video streaming. Remarkably, its energy efficiency surpasses the limit imposed by electronic systems, consuming < 10 fJ/bit per operation.
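The operation the photonic core accelerates — a matrix-vector multiplication — subsumes edge detection once the filter is written as a matrix; the 1-D Laplacian example below (stencil and signal invented for illustration) shows the principle:

```python
import numpy as np

# Each output sample is one row of a sparse matrix applied to the input:
# the row carries a [-1, 2, -1] second-difference stencil, so the product
# M @ signal is exactly a convolution, i.e. an edge-detection filter.
n = 6
M = np.zeros((n, n))
for i in range(1, n - 1):
    M[i, i - 1], M[i, i], M[i, i + 1] = -1.0, 2.0, -1.0

signal = np.array([0., 0., 0., 1., 1., 1.])   # a step edge
edges = M @ signal
print(edges)  # the +/- pair marks the edge between samples 2 and 3
```

On the PIC the same product is evaluated in the analog optical domain, one pass of light through the mesh per output vector, which is where the fJ-per-bit efficiency comes from.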
Correct segmentation of ultrasound (US) images and videos is essential, since it directly influences subsequent diagnostic applications for breast cancer. We therefore propose a lightweight U-Net with a 128 x 128 image input and 1,941,105 trainable parameters. Our architecture works with a multi-GPU strategy. Parallelizing image/video processing on GPU hardware optimizes the runtime of the procedures, reducing execution time through multithreaded processing with OpenMP and CUDA. The designed architectures were implemented in a parallel programming model and executed on NVIDIA GeForce RTX 3090 hardware (10,496 CUDA cores) in a multi-GPU configuration. The proposed parallel implementation is tested on a workstation with a CUDA-enabled GPU and compared with the non-parallel variant.
This study presents an ablation study of the designed segmentation approach on the video US database (VBUS) of breast cancer lesions (113 malignant and 75 benign), in which the images/videos are segmented in real time.
The designed system was first applied to the BUSI database, since it contains ground-truth (GT) references, achieving a segmentation accuracy of 97.43% and a mean Intersection over Union (IoU) of 95.31%. For the VBUS (video) database containing breast lesions, the segmentation process generates a video in MPEG format in which all lesions are marked. The videos from the VBUS database were segmented to evaluate real-time segmentation, and the inference time of the segmentation was computed.
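The reported accuracy and IoU figures can be reproduced from binary masks as follows; `pred` and `gt` below are synthetic stand-ins for a predicted lesion mask and its ground-truth reference:

```python
import numpy as np

def seg_metrics(pred, gt):
    """Pixel accuracy and Intersection over Union for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / union if union else 1.0
    acc = (pred == gt).mean()
    return acc, iou

gt   = np.zeros((8, 8), int); gt[2:6, 2:6] = 1     # 16-pixel "lesion"
pred = np.zeros((8, 8), int); pred[2:6, 3:7] = 1   # prediction shifted by one column
acc, iou = seg_metrics(pred, gt)
print(round(acc, 4), round(iou, 4))  # 0.875 0.6
```

Note how IoU punishes the one-column shift far harder than pixel accuracy does, which is why both numbers are reported together.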
Childhood leukaemia demands meticulous blood cell analysis for diagnosis, focusing on morphological irregularities such as asymmetry and abnormal cell counts. Traditional manual diagnosis from microscopic blood smear images suffers from reduced reliability, time intensiveness, and observer variability. Computer-aided diagnostic (CAD) systems address these challenges. Integrating real-time image pre-processing and segmentation ensures swift operation, reducing CAD system processing time; this enhances overall effectiveness, enabling timely medical intervention and better patient outcomes. This study aims to reduce the algorithmic complexity of the pre-processing steps, including bilateral filtering and Contrast-Limited Adaptive Histogram Equalization (CLAHE), and of the segmentation stage, which involves morphological operations and the watershed algorithm. This work proposes a parallel implementation using OpenMP and CUDA and evaluates its performance using accuracy and Intersection over Union (IoU) metrics, along with computing time and algorithmic complexity. It highlights the benefits of parallel processing in enhancing efficiency and accuracy in blood cell analysis.
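The contrast-limited idea behind CLAHE can be sketched in its simplest, global form (the real algorithm operates on tiles with interpolation between tile histograms): clip the histogram at a limit, redistribute the excess, then equalize:

```python
import numpy as np

def clhe(img, clip_limit=4):
    """Global contrast-limited histogram equalization (CLAHE without tiles)."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    excess = np.maximum(hist - clip_limit, 0).sum()
    hist = np.minimum(hist, clip_limit) + excess // 256  # clip + redistribute
    cdf = np.cumsum(hist)
    lut = np.round(255 * cdf / cdf[-1]).astype(np.uint8)  # equalization LUT
    return lut[img]

# A mid-grey ramp: plain equalization would stretch it to the full range.
img = np.tile(np.arange(64, 192, 2, dtype=np.uint8), (8, 1))
out = clhe(img, clip_limit=4)
print(img.min(), img.max(), "->", out.min(), out.max())  # 64 190 -> 34 223
```

With a loose limit the same call reduces to ordinary histogram equalization (full 4-255 stretch here); the clip is what keeps noise in near-flat smear regions from being amplified.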
To address these issues, this paper introduces a real-time on-board satellite cloud cover detection system based on a lightweight neural network. By discarding excessively cloudy images, the proposed approach can improve the efficiency and accuracy of satellite image-based systems. At the same time, it minimizes the data to be transmitted to the ground, mitigating bandwidth problems and reducing transmission power. The proposed CNN has a compact architecture, requiring fewer than 9 thousand parameters, while maintaining a detection accuracy of 89% when evaluated on the Landsat 8 dataset. An optimized hardware accelerator is designed to meet on-board nanosatellite constraints. Post-implementation simulations on a Xilinx Artix-7 FPGA demonstrate state-of-the-art results, with about 12 thousand mapped LUTs and 7 thousand FFs and a power consumption of 116 mW.
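The parameter budget of such a compact CNN is easy to audit; the layer shapes below are invented for illustration (not the paper's actual network) but land in the same sub-9k regime:

```python
# Back-of-the-envelope parameter count for a compact cloud-detection CNN.
def conv_params(c_in, c_out, k):
    return c_out * (c_in * k * k + 1)   # weights + one bias per filter

layers = [
    conv_params(3, 8, 3),     # 3 -> 8 channels, 3x3 kernels
    conv_params(8, 16, 3),    # 8 -> 16 channels, 3x3 kernels
    conv_params(16, 16, 3),   # 16 -> 16 channels, 3x3 kernels
    16 * 2 + 2,               # global-pool + 2-way dense head (cloudy / clear)
]
total = sum(layers)
print(total, total < 9000)  # 3746 True
```

Keeping the channel counts this low is also what makes the FPGA accelerator fit in roughly 12k LUTs at 116 mW.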
The present work aims to improve on the existing solutions for inverting the discrete Radon transform (DRT) by using less data, reducing computational cost, and ensuring well-conditioned and stable algorithms for the inversion.
An analytical framework and a heuristic for finding possible inverse algorithms are proposed. The study suggests an approach for finding a fast algorithm with a complexity of O(N² log₂ N) by analyzing operation trees for consecutive input sizes.
The study also discusses the impact of noise on the proposed solutions, showing that the proposed algorithms lead to a better approximation than one iteration of Press’ inversion for added random error up to 40% of the signal’s magnitude. However, restricting the number of quadrants used in the algorithm leads to increased error.
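For a concrete picture of the projection data such an inversion consumes, the snippet below computes row, column and diagonal sums of a tiny image — a minimal stand-in for the four DRT quadrants — and checks the mass-consistency property that any stable inversion can exploit:

```python
import numpy as np

img = np.array([[1, 2], [3, 4]])
rows = img.sum(axis=1)                            # [3, 7]
cols = img.sum(axis=0)                            # [4, 6]
diag = [np.trace(img, k) for k in range(-1, 2)]   # [3, 5, 2]
anti = [np.trace(img[:, ::-1], k) for k in range(-1, 2)]
# Every projection direction sums to the same total image mass -- a basic
# consistency constraint linking the quadrants of the transform.
print(rows.sum(), cols.sum(), sum(diag), sum(anti))  # 10 10 10 10
```

Restricting the inversion to fewer quadrants discards some of these constraints, which is consistent with the increased error reported above.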
Stroke is a devastating and life-threatening medical condition that demands immediate intervention. Timely diagnosis and treatment are paramount in reducing mortality and mitigating the long-term disabilities associated with stroke. This research addresses these critical needs by proposing a real-time stroke detection system based on Deep Learning (DL) with the incorporation of Federated Learning (FL), which offers improved accuracy and privacy preservation. The purpose of this research is to develop an efficient and accurate model capable of distinguishing between stroke and non-stroke cases in real time, assisting healthcare professionals in making rapid and informed decisions. Stroke detection has traditionally relied on manual interpretation of medical images, which is time-consuming and prone to human error. DL techniques have shown significant promise in automating this process, but the need for large and diverse datasets, as well as privacy concerns, remain challenges. To achieve this goal, our methodology involves training the DL model on extensive datasets containing both stroke and non-stroke medical images, enabling the model to learn the complex patterns and features associated with stroke and thereby improving its diagnostic accuracy. Furthermore, we employ Federated Learning, a decentralized training approach, to enhance privacy while maintaining model performance: the model learns from data distributed across multiple healthcare institutions without sharing sensitive patient information. The proposed approach has been executed on NVIDIA platforms, taking advantage of their advanced GPU capabilities to enable real-time processing and analysis. This optimized model has the potential to transform stroke diagnosis and patient care, ultimately saving lives and improving the quality of healthcare services in neurology.
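The aggregation step of such a federated setup can be sketched with the canonical FedAvg rule (one possible instantiation; the site names and weight values below are illustrative): each institution trains locally, and only model weights — never patient images — are averaged:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Average flattened model weights, weighted by local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)               # (n_clients, n_params)
    return (stacked * (sizes / sizes.sum())[:, None]).sum(axis=0)

w_a = np.array([0.0, 2.0])   # hospital A's weights after a local round
w_b = np.array([4.0, 6.0])   # hospital B's weights after a local round
print(fed_avg([w_a, w_b], client_sizes=[100, 300]))  # [3. 5.]
```

The server repeats this round after every local training phase; weighting by sample count keeps a small site from dominating the global stroke model.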
Background: The evolution of AI applications in dental imaging, covering caries detection, anatomical structure segmentation, and pathology identification, highlights the importance of high-quality datasets for effective detection models. This paper focuses on optimizing dataset quality for real-time AI-based dental bitewing radiograph detection.
Methods: We systematically analyze preprocessing methods suitable for dental bitewing radiographs, covering image enhancement, noise reduction, and contrast adjustment. These techniques are strategically chosen to address common challenges in dental radiograph images, including variations in lighting, contrast disparities, and noise fluctuations. We employ optimized algorithms to meet real-time constraints, ensuring efficient model training and inference.
Results: Our study assesses the impact of each preprocessing step on dataset quality and its influence on AI model performance. Practical recommendations are provided to empower researchers and practitioners in creating datasets optimized for dental bitewing radiograph detection tasks, aiming to improve AI model accuracy while adhering to real-time requirements. In addition, a comparative analysis is conducted, evaluating datasets enhanced with conventional methods on a ResNet18 model for the segmentation of bitewing dental images.
Conclusion: This paper serves as a valuable guide for the dental imaging community, offering insights into preprocessing steps that elevate dataset quality for AI-driven dental bitewing radiograph detection. By emphasizing the relevance of real-time performance and providing a comparison with conventional enhancements on the ResNet18 model, we contribute to advancing early diagnosis and enhancing oral healthcare outcomes.
The article proposes an approach to developing computationally simple and fast algorithms for data preprocessing and the selection of stable features. The following algorithms are used: 1. A modified method of multicriteria processing in local windows, based on minimizing an objective function, which both reduces the noise component in locally stationary areas and preserves and strengthens transition boundaries. 2. A cluster-scope reduction method, which changes the number of color histograms by absorbing nearby areas while preserving objects. 3. A non-local color-balance adjustment method, which selects areas on a dark/light background when the color balance is shifted. 4. An edge detector based on the analysis of local areas across different data layers.
Effectiveness was tested on a set of images captured by a flip-chip machine, images from a microcircuit analyzer, and data from a production line. The analyzed frames had low resolution and poor lighting; all images were captured in the RGB color space.
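For flavor, a generic edge detector of the gradient-magnitude family can be sketched as below; this is a plain Sobel stand-in, not the authors' local-area multi-layer method, and the threshold value is an assumption.

```python
import numpy as np

def sobel_edges(img, thresh=80):
    """Generic gradient-magnitude edge detector (Sobel kernels);
    an illustrative stand-in, not the paper's local-area method."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    h, w = img.shape
    pad = np.pad(img.astype(np.float64), 1, mode="edge")
    gx = np.zeros((h, w)); gy = np.zeros((h, w))
    for dy in range(3):           # accumulate the 3x3 convolution
        for dx in range(3):
            win = pad[dy:dy + h, dx:dx + w]
            gx += kx[dy, dx] * win
            gy += ky[dy, dx] * win
    mag = np.hypot(gx, gy)        # gradient magnitude
    return (mag > thresh).astype(np.uint8)

# Synthetic test frame: a bright square on a dark background
img = np.zeros((32, 32), dtype=np.uint8)
img[8:24, 8:24] = 200

edges = sobel_edges(img)
print(int(edges.sum()))  # nonzero pixels trace the square's boundary
```

On low-resolution, poorly lit frames like those described, such a detector would typically be preceded by the denoising and color-balance steps listed above.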
Augmented reality is a visualization technology that displays information by adding virtual images to the real world. Effective implementation of augmented reality requires recognition of the current scene, and identifying objects in real-time video on computationally limited hardware requires significant effort. One way to solve this problem is to create a hybrid system that, based on machine learning and computer vision technology, processes and analyzes visual data to identify and classify real-world objects. The proposed architecture is built on the Vuforia augmented reality system, which provides good performance by balancing prediction accuracy and efficiency. First, the Vuforia neural network architecture allows convenient interaction with AR in Unity and provides initial conditions for detecting 3D objects. The augmented reality construction algorithm is based on the ARCore framework and the OpenGL ES interface for embedded systems. The system integrates recognition data with an AR platform to display corresponding 3D models, allowing users to interact with them through the functionality of the AR application. The method also involves the development of an enhanced user interface for AR, making the augmented environment easier to navigate and control. Experimental research has shown that the proposed method significantly improves the accuracy of object recognition and the ease of working with 3D models in AR.
Approximately 15% of the world's population faces some form of disability, with 2-4% experiencing significant challenges in using their hands and legs to meet their daily needs. This global estimate for disabilities is rising, primarily due to an aging population and the increasing prevalence of chronic diseases. Nonetheless, individuals with disabilities can still contribute as self-reliant members of society. In this paper, we present a system designed to empower people with disabilities by enabling them to independently perform daily tasks by precisely controlling their home devices using only their eye movements. The system comprises an infrared (IR) camera and a Raspberry Pi, which processes live video captured by the IR camera and performs eye-tracking tasks using the OpenCV library for Python. A microcontroller (Arduino) is linked to the home devices, enabling them to be controlled based on commands received from the Raspberry Pi.
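The Pi-to-Arduino control loop described above can be sketched as a mapping from a detected pupil position to a device command; the 3x3 gaze-region layout and the command bytes here are illustrative assumptions, and the actual OpenCV pupil detection and serial write are stubbed out in comments.

```python
def gaze_to_command(cx, cy, width, height):
    """Map a detected pupil centre (cx, cy) in the camera frame to a
    short device command; the 3x3 region layout and command codes are
    illustrative assumptions, not the paper's exact mapping."""
    col = min(2, cx * 3 // width)    # left / centre / right third
    row = min(2, cy * 3 // height)   # top / middle / bottom third
    commands = {
        (0, 0): b"L1",  # e.g. lamp on
        (0, 2): b"L0",  # lamp off
        (2, 0): b"F1",  # fan on
        (2, 2): b"F0",  # fan off
        (1, 1): b"--",  # neutral gaze: no action
    }
    return commands.get((row, col), b"--")

# In the real system the Raspberry Pi would detect the pupil with
# OpenCV and write the command to the Arduino over a serial port, e.g.:
#   ser = serial.Serial("/dev/ttyACM0", 9600); ser.write(cmd)
print(gaze_to_command(10, 10, 640, 480))    # top-left gaze -> b'L1'
print(gaze_to_command(320, 240, 640, 480))  # centred gaze  -> b'--'
```

Keeping the gaze-to-command mapping separate from the eye tracker makes it easy to retune regions or commands per user without touching the vision code.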