m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks

August 23, 2020 · Entered Twilight · 🏛 arXiv.org

"Last commit was 5.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: README.md, datasets, evaluate.py, evaluate.sh, finetuneMiccaiSeg.sh, mainMiccaiRecon.py, mainMiccaiSeg.py, mainMiccaiSegPlusClass.py, model, trainMiccaiRecon.sh, trainMiccaiSeg.sh, trainMiccaiSegPlusClass.sh, utils.py

Authors: Salman Maqbool, Aqsa Riaz, Hasan Sajid, Osman Hasan
arXiv ID: 2008.10134
Category: cs.CV (Computer Vision)
Cross-listed: cs.AI, cs.LG
Citations: 34
Venue: arXiv.org
Repository: https://github.com/salmanmaq/segmentationNetworks (⭐ 8)
Last Checked: 1 month ago

Abstract

Autonomous surgical procedures, in particular minimally invasive surgeries, are the next frontier for Artificial Intelligence research. However, the existing challenges include precise identification of the human anatomy and the surgical settings, and modeling the environment for training an autonomous agent. To address the identification of human anatomy and the surgical settings, we propose a deep learning-based semantic segmentation algorithm to identify and label the tissues and organs in the endoscopic video feed of the human torso region. We present an annotated dataset, m2caiSeg, created from endoscopic video feeds of real-world surgical procedures. Overall, the data consists of 307 images, each of which is annotated for the organs and different surgical instruments present in the scene. We propose and train a deep convolutional neural network for the semantic segmentation task. To compensate for the small quantity of annotated data, we use unsupervised pre-training and data augmentation. The trained model is evaluated on an independent test set of the proposed dataset. We obtained an F1 score of 0.33 when using all the labeled categories for the semantic segmentation task. Secondly, we grouped all instruments into an 'Instruments' superclass to evaluate the model's performance on discerning the various organs and obtained an F1 score of 0.57. We propose a new dataset and a deep learning method for pixel-level identification of various organs and instruments in an endoscopic surgical scene. Surgical scene understanding is one of the first steps towards automating surgical procedures.
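The two evaluation settings in the abstract (all labeled categories versus instruments merged into one superclass) can be illustrated with a minimal Python sketch. It is not taken from the repository: the class ids, the toy pixel labels, and the helper names `per_class_f1`, `macro_f1`, and `merge_classes` are hypothetical, and the abstract does not say how F1 was averaged across classes, so an unweighted (macro) average is assumed here.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, num_classes):
    """Return {class_id: F1} computed over flattened pixel labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p where it was not present
            fn[t] += 1  # missed the true class t
    scores = {}
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        if denom:  # skip classes absent from both ground truth and prediction
            scores[c] = 2 * tp[c] / denom
    return scores


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, num_classes)
    return sum(scores.values()) / len(scores)


def merge_classes(labels, mapping):
    """Relabel pixels, e.g. to fold several instruments into one superclass."""
    return [mapping.get(label, label) for label in labels]


# Hypothetical ids: 0 = liver, 1 = gallbladder, 2 = grasper, 3 = hook.
# Folding the hook (3) into id 2 makes id 2 act as an 'Instruments' superclass.
TO_SUPERCLASS = {3: 2}

# Toy flattened masks; real masks would be 2-D label images.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 0, 3, 2, 2, 3]

fine = macro_f1(y_true, y_pred, num_classes=4)
coarse = macro_f1(
    merge_classes(y_true, TO_SUPERCLASS),
    merge_classes(y_pred, TO_SUPERCLASS),
    num_classes=4,
)
print(f"all categories: {fine:.3f}, instruments merged: {coarse:.3f}")
```

In this toy data the merged score is higher because confusions between the two instrument classes no longer count as errors, which is the same direction as the gap between the reported 0.33 and 0.57.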