TY - GEN
T1 - Enhancing Accessibility
T2 - 3rd IEEE Conference on Artificial Intelligence, CAI 2025
AU - Singh, Siddharth Kumar
AU - Siddiqui, Mohammed Bakhtiyar
AU - Zhu, Michelle
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - This paper presents an innovative application that enables control of multimedia content through voice commands and static hand gestures, offering a transformative solution for individuals with disabilities. The application provides an intuitive, natural method for interacting with slides, images, videos, and audio, bypassing traditional input devices such as keyboards and mice. It enhances accessibility and user experience by accommodating users with mobility impairments or low vision. Another key feature is virtual writing, which allows users to draw or erase in mid-air and projects the result onto the screen, providing greater flexibility. The software leverages advanced technologies such as MediaPipe for gesture recognition, SpeechRecognition for voice command processing, and PyAutoGUI for automation, making it suitable for various environments, including personal entertainment, classroom, and conference settings. Additionally, users can customize hand gestures for specific commands, with newly captured gestures used to retrain the model. This paper outlines the system architecture, implementation, and performance evaluation, and discusses future developments.
AB - This paper presents an innovative application that enables control of multimedia content through voice commands and static hand gestures, offering a transformative solution for individuals with disabilities. The application provides an intuitive, natural method for interacting with slides, images, videos, and audio, bypassing traditional input devices such as keyboards and mice. It enhances accessibility and user experience by accommodating users with mobility impairments or low vision. Another key feature is virtual writing, which allows users to draw or erase in mid-air and projects the result onto the screen, providing greater flexibility. The software leverages advanced technologies such as MediaPipe for gesture recognition, SpeechRecognition for voice command processing, and PyAutoGUI for automation, making it suitable for various environments, including personal entertainment, classroom, and conference settings. Additionally, users can customize hand gestures for specific commands, with newly captured gestures used to retrain the model. This paper outlines the system architecture, implementation, and performance evaluation, and discusses future developments.
KW - Accessible Computing
KW - Disabled Population
KW - Hand Gestures
KW - Natural Multimedia Interaction
KW - Voice Commands
UR - https://www.scopus.com/pages/publications/105011263350
U2 - 10.1109/CAI64502.2025.00048
DO - 10.1109/CAI64502.2025.00048
M3 - Conference contribution
AN - SCOPUS:105011263350
T3 - Proceedings - 2025 IEEE Conference on Artificial Intelligence, CAI 2025
SP - 265
EP - 270
BT - Proceedings - 2025 IEEE Conference on Artificial Intelligence, CAI 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 5 May 2025 through 7 May 2025
ER -