Inspired by the “What Matters in Unsupervised Optical Flow” study, this work evaluates the performance of the ARFlow architecture for unsupervised optical flow in the context of tracking keypoints in laparoscopic videos. This assessment could provide insight into the applicability of ARFlow and similar architectures for this particular application, as well as their strengths and limitations. To do so, we use the SurgT challenge’s dataset and metrics to evaluate the tracker’s accuracy and robustness and their relationship with distinct network components. Our results corroborate some of the findings reported by Jonschkowski et al. However, certain components behave differently, possibly indicating underlying issues intrinsic to the application that impact overall performance and may have to be addressed in the context of soft-tissue trackers. These results point to potential bottlenecks and areas that future work may target.
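As a rough illustration of how a dense optical flow field can drive keypoint tracking, the sketch below moves a keypoint by the flow vector at its nearest pixel and chains this over consecutive frames. It is not the ARFlow or SurgT pipeline: the function names, the `(H, W, 2)` flow layout with `(dx, dy)` per pixel, and the nearest-pixel lookup are all assumptions made for the example.

```python
import numpy as np


def propagate_keypoint(point_xy, flow):
    """Move one keypoint by the flow vector at its nearest pixel.

    Assumed layout: flow has shape (H, W, 2) with per-pixel (dx, dy).
    """
    h, w = flow.shape[:2]
    x, y = point_xy
    # Clamp the lookup index so it stays valid near the image borders.
    col = int(np.clip(round(x), 0, w - 1))
    row = int(np.clip(round(y), 0, h - 1))
    dx, dy = flow[row, col]
    return (x + float(dx), y + float(dy))


def track(point_xy, flows):
    """Chain propagate_keypoint over a sequence of frame-to-frame flows."""
    trajectory = [point_xy]
    for flow in flows:
        trajectory.append(propagate_keypoint(trajectory[-1], flow))
    return trajectory
```

Because each step starts from the previous estimate, errors in the flow accumulate along the trajectory, which is one reason tracker robustness is assessed alongside accuracy.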
Surgical instrument segmentation in laparoscopy is essential for computer-assisted surgical systems. Despite recent progress in deep learning, the dynamic setting of laparoscopic surgery still presents challenges for precise segmentation. The nnU-Net framework has excelled in semantic segmentation by analyzing single frames without temporal information. The framework’s ease of use, including its ability to be automatically configured and its low expertise requirements, has made it a popular base framework for comparisons. Optical flow (OF) is a tool commonly used in video tasks to estimate motion and represent it in a single frame that contains temporal information. This work seeks to employ OF maps as an additional input to the nnU-Net architecture to improve its performance in the surgical instrument segmentation task, taking advantage of the fact that instruments are the main moving objects in the surgical field. With this new input, the temporal component is added indirectly, without modifying the architecture. Using the CholecSeg8k dataset, three different representations of movement were estimated and used as new inputs, and the resulting models were compared with a baseline model. Results showed that the use of OF maps improves the detection of classes with high movement, even when these are scarce in the dataset. To further improve performance, future work may focus on implementing other OF-preserving augmentations.
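The idea of adding motion as extra input channels can be sketched as follows. The abstract does not say which three representations of movement were used, so the magnitude/angle encoding, the per-frame normalisation, and the function name below are assumptions chosen only to make the example concrete; how the resulting channels are passed to nnU-Net is outside this sketch.

```python
import numpy as np


def add_flow_channels(frame_rgb, flow):
    """Concatenate an RGB frame (H, W, 3) with two flow-derived channels.

    Returns an (H, W, 5) float32 array: R, G, B, flow magnitude, flow angle.
    The magnitude/angle encoding is an illustrative assumption.
    """
    if frame_rgb.shape[:2] != flow.shape[:2]:
        raise ValueError("frame and flow must share height and width")
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx)  # radians in [-pi, pi]
    rgb = frame_rgb.astype(np.float32) / 255.0
    # Scale magnitude to [0, 1] per frame; guard against an all-zero flow.
    magnitude = magnitude / max(float(magnitude.max()), 1e-8)
    # Map the angle from [-pi, pi] to [0, 1].
    angle = (angle + np.pi) / (2.0 * np.pi)
    extra = np.stack([magnitude, angle], axis=-1).astype(np.float32)
    return np.concatenate([rgb, extra], axis=-1)
```

Keeping the motion information in additional channels leaves the network architecture unchanged; only the number of input channels differs from the single-frame baseline.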
Examination of head shape during the fetal period is an important task to evaluate head growth and to diagnose fetal abnormalities. Traditional clinical practice frequently relies on the estimation of head circumference (HC) from 2D ultrasound (US) images by manually fitting an ellipse to the fetal skull. However, this process is prone to observer variability, and therefore automatic approaches for HC delineation can bring added value for clinical practice. In this paper, an automatic method to accurately delineate the fetal head in US images is proposed. The proposed method is divided into two stages: (i) head delineation through a regression convolutional neural network (CNN) that estimates a Gaussian-like map of the head contour; and (ii) robust ellipse fitting using a registration-based approach that combines the random sample consensus (RANSAC) and iterative closest point (ICP) algorithms. The proposed method was applied to the HC18 Challenge dataset, which contains 999 training and 335 testing images. Experiments showed that the proposed strategy achieved a mean average difference of -0.11 ± 2.67 mm and a Dice coefficient of 97.95 ± 1.12% against manual annotation, outperforming other approaches in the literature. The obtained results showed the effectiveness of the proposed method for HC delineation, suggesting its potential to be used in clinical practice for head shape assessment.
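Once an ellipse has been fitted, the two reported quantities can be computed along the following lines. This is a minimal sketch, not the authors' evaluation code: the abstract does not state how HC is derived from the ellipse, so the use of Ramanujan's perimeter approximation is an assumption, and the masks are represented as sets of pixel coordinates purely for brevity.

```python
import math


def ellipse_circumference(semi_major, semi_minor):
    """Ramanujan's first approximation to the perimeter of an ellipse."""
    a, b = semi_major, semi_minor
    return math.pi * (3.0 * (a + b) - math.sqrt((3.0 * a + b) * (a + 3.0 * b)))


def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as sets of pixel coords."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks are treated as a perfect match
    return 2.0 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))
```

For a circle (equal semi-axes) the approximation reduces exactly to 2πr, which makes a convenient sanity check. The semi-axes must be expressed in millimetres, for example by multiplying pixel lengths by the image's pixel spacing, for the result to be comparable with HC values in mm.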
Deformational plagiocephaly (DP) refers to an asymmetrical distortion of an infant’s skull resulting from external forces applied over time. The diagnosis of this condition is performed using asymmetry indexes that are estimated from specific anatomical landmarks, which are manually defined on head models acquired using laser scans. However, this manual identification is time-consuming and susceptible to intra-/inter-observer variability. Therefore, automatic strategies for identifying the landmarks and, consequently, extracting the asymmetry indexes are needed. A novel pipeline to automatically identify these landmarks on 3D head models and to estimate the relevant cranial asymmetry indexes is proposed. A template database is created and then aligned with the unlabelled patient model through an iterative closest point (ICP) strategy. Here, an initial rigid alignment followed by an affine one is applied to remove global misalignments between each template and the patient. Next, a non-rigid alignment is used to deform the template information to the patient-specific shape. The final position of each landmark is computed as a locally weighted average of all candidate results. From the identified landmarks, a head coordinate system is automatically estimated and later used to estimate cranial asymmetry indexes. The proposed framework was evaluated on 15 synthetic infant head models. Overall, the results demonstrated the accuracy of the identification strategy, with a mean average distance of 2.8 ± 0.6 mm between the identified landmarks and the ground truth. Moreover, for the estimation of cranial asymmetry indexes, a performance comparable to the inter-observer variability was achieved.
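The final fusion step, in which each aligned template proposes a candidate position for a landmark, can be sketched as a weighted mean. The abstract does not specify how the local weights are derived, so in this hypothetical helper they are simply passed in by the caller (for instance, a per-template measure of local alignment quality would be one plausible choice).

```python
def fuse_landmark(candidates, weights):
    """Weighted mean of candidate 3D positions for one landmark.

    candidates: list of (x, y, z) tuples, one per aligned template.
    weights: one non-negative weight per candidate (how they are
    computed is an assumption left to the caller).
    """
    if len(candidates) != len(weights) or not candidates:
        raise ValueError("need at least one candidate and one weight per candidate")
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    return tuple(
        sum(w * p[axis] for p, w in zip(candidates, weights)) / total
        for axis in range(3)
    )
```

With equal weights this reduces to the plain centroid of the candidates; unequal weights pull the fused landmark toward the templates judged to fit the patient best near that landmark.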
Deformational plagiocephaly (DP) is a cranial deformity characterized by an asymmetrical distortion of an infant’s skull. The diagnosis and evaluation of DP are performed using cranial asymmetry indexes obtained from cranial measurements, which can be estimated using anthropometric landmarks of the infant’s head. However, manual labeling of these landmarks is a time-consuming and tedious task that is also prone to observer variability. In this paper, a novel framework to automatically detect anthropometric landmarks on 3D infant head models is described. The proposed method is divided into two stages: (i) unfolding of the 3D head model surface; and (ii) landmark detection through a deep learning strategy. In the first stage, an unfolding strategy is used to transform the 3D mesh of the head model into a flattened 2D version. From the flattened mesh, three 2D informational maps are generated using specific head characteristics. In the second stage, a deep learning strategy is used to detect the anthropometric landmarks in a 3-channel image constructed by combining the informational maps. The proposed framework was validated on fifteen synthetic 3D infant head models, achieving, on average over all landmarks, a mean distance error of 3.5 mm between the automatic detection and a manually constructed ground truth. Moreover, the estimated cranial measurements were comparable to those obtained manually, with no statistically significant differences between them for most of the indexes. The obtained results demonstrated the good performance of the proposed method, showing the potential of this framework in clinical practice.