TY - GEN
T1 - Identification of Cyanobacteria for Harmful Algal Blooms Research Using the YOLO Framework
AU - Li, Benjamin
AU - Serrano, Karen
AU - Mazzaro, Melissa
AU - Wu, Meiyin
AU - Wang, Weitian
AU - Zhu, Michelle
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Cyanobacteria, an ancient type of photosynthetic microbe, inhabit most fresh and marine water on Earth. The rapid growth of cyanobacteria can lead to Harmful Algal Blooms (HABs), posing major threats to water quality and aquatic ecosystems. Rapid and accurate identification of cyanobacteria is essential for population monitoring and mitigation efforts, especially when cyanobacteria produce toxins, threatening the health of wildlife and humans. However, the diverse shapes and appearances of cyanobacteria render manual identification time-consuming and error-prone. In this study, we make multiple novel contributions to the field of microscopic cyanobacterial identification using computer vision algorithms. To begin, we utilize the YOLOv5 algorithm, known for its speed and accuracy, which has never been evaluated for its efficacy in this field. Additionally, we propose numerous methods of addressing limited dataset size and image heterogeneity. We use various image pre-processing techniques, including color-preserving CLAHE. We also construct a comprehensive dataset containing several genera of cyanobacteria by supplementing laboratory images with open-source database images for training and evaluation. To combat overfitting and avoid unrealistic model performance values, we evaluate detection performance on common microscope artifacts (detritus and water bubbles), incorporate 'background images', which contain unrelated microorganisms, into the dataset, and utilize image augmentation conservatively. Finally, hyperparameters were tuned with a genetic algorithm to optimize a specified fitness function. The final model outperformed the Faster R-CNN model used in previous literature, achieving average precision values ranging from 70% to 90% for five commonly found, toxin-producing cyanobacteria taxa in the USA, representing state-of-the-art performance and great potential for use by biologists investigating HABs.
KW - CNN
KW - HABs
KW - YOLO framework
KW - cyanobacteria
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85179754176&partnerID=8YFLogxK
U2 - 10.1109/UEMCON59035.2023.10316078
DO - 10.1109/UEMCON59035.2023.10316078
M3 - Conference contribution
AN - SCOPUS:85179754176
T3 - 2023 IEEE 14th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference, UEMCON 2023
SP - 407
EP - 415
BT - 2023 IEEE 14th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference, UEMCON 2023
A2 - Chakrabarti, Satyajit
A2 - Paul, Rajashree
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 14th IEEE Annual Ubiquitous Computing, Electronics and Mobile Communication Conference, UEMCON 2023
Y2 - 12 October 2023 through 14 October 2023
ER -