Vassilis Athitsos
VLM Lab
ASL Dictionary Search Project
This project has been funded by NSF grants IIS-0705749, IIS-1055062, and CNS-1059235, and was conducted in collaboration with Professors Stan Sclaroff and Carol Neidle. The goal is to develop a system that lets users search dictionaries of American Sign Language (ASL) in order to look up the meaning of unknown signs. In the proposed system, the user submits a video of the sign of interest as the query, or simply performs the sign in front of a camera. The system then searches a large database of sign videos to find the best matches for the query video, and presents the top results to the user. The user can then visually inspect the top results to verify which (if any) of them best matches the query sign.
A key part of this project is the extensive collection of video examples of signs. We have created the ASL Lexicon Video Dataset, described in the CVPR4HB 2008 publication listed in the references below.
Figure 1: A sample result from an early prototype system that allows users to search for the meaning of a sign. The user provides a video sequence of a sign as input. The computer then searches through the videos of signs contained in the system database and presents the most similar matches to the user, along with English translations for those matches. The top matches are determined automatically, using computer vision and machine learning techniques, and the user can visually inspect each of them to determine which (if any) is the correct match. The process is considered successful if the top matches include the sign the user was searching for, since in that case the user can see the translation for that sign and can also click on it to obtain more information.
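To make the matching step concrete, below is a minimal sketch of how such query-by-video retrieval could work. It assumes each video has already been reduced to a sequence of per-frame feature vectors; the feature dimensionality, gloss labels, and random data are hypothetical placeholders. It uses dynamic time warping (DTW), one of the similarity measures explored in the publications below, to rank database signs against the query; this is an illustration of the general approach, not the exact system implementation.

```python
# A minimal sketch of query-by-video sign lookup. Each video is assumed
# to be represented as a (frames x dims) array of per-frame features
# (e.g., hand position and shape descriptors); all data here is fake.
import numpy as np

def dtw_distance(query: np.ndarray, candidate: np.ndarray) -> float:
    """Dynamic time warping distance between two feature sequences."""
    m, n = len(query), len(candidate)
    # cost[i, j] = best alignment cost of query[:i] against candidate[:j]
    cost = np.full((m + 1, n + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = np.linalg.norm(query[i - 1] - candidate[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # stay on candidate frame
                                 cost[i, j - 1],      # stay on query frame
                                 cost[i - 1, j - 1])  # advance both frames
    return float(cost[m, n])

def top_matches(query, database, k=5):
    """Rank database signs by DTW distance to the query; return the k best."""
    scored = [(dtw_distance(query, seq), gloss) for gloss, seq in database]
    return sorted(scored)[:k]

# Hypothetical usage: random features stand in for real sign videos.
rng = np.random.default_rng(0)
db = [(gloss, rng.normal(size=(30, 8))) for gloss in ["APPLE", "BOOK", "CAT"]]
query_features = rng.normal(size=(25, 8))
for dist, gloss in top_matches(query_features, db, k=3):
    print(f"{gloss}: {dist:.2f}")
```

The user would then be shown the videos and English translations for the top-ranked glosses, mirroring the interaction in Figure 1.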
Designing techniques that produce accurate results as often as possible is a key challenge in this research project. Another key challenge is designing indexing methods that allow for efficient database search despite the vast amount of video data stored in the database. Over the years we have proposed several novel methods for these problems, as described in the publications below. We are still actively working on these problems, since current methods still leave significant room for improvement.
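To illustrate why indexing matters, note that comparing a query against every database video with full DTW becomes too slow at large scale. The sketch below shows one generic filter-and-refine strategy, not any specific published method: each database sign is embedded offline as the vector of its DTW distances to a few reference sequences, a cheap vector comparison in that embedding shortlists candidates, and exact DTW is computed only on the shortlist. It reuses `dtw_distance`, `db`, and `query_features` from the sketch above; the reference set and shortlist size are arbitrary choices for the toy example.

```python
import numpy as np

def embed(seq, references):
    """Map a sequence to the vector of its DTW distances to the references."""
    return np.array([dtw_distance(seq, r) for r in references])

# Offline preprocessing: choose reference sequences and embed every
# database sign once, so query time needs only cheap vector comparisons.
references = [seq for _, seq in db[:2]]   # arbitrary hypothetical choice
db_embedded = [(gloss, seq, embed(seq, references)) for gloss, seq in db]

def filter_and_refine(query, shortlist=2, k=1):
    q_emb = embed(query, references)
    # Filter: rank all entries by Euclidean distance in embedding space.
    coarse = sorted(db_embedded,
                    key=lambda entry: np.linalg.norm(q_emb - entry[2]))
    # Refine: run exact (expensive) DTW only on the shortlisted candidates.
    scored = [(dtw_distance(query, seq), gloss)
              for gloss, seq, _ in coarse[:shortlist]]
    return sorted(scored)[:k]

print(filter_and_refine(query_features))
```

The filter step trades a small risk of missing the true best match for a large reduction in the number of expensive DTW computations, which is the core tension that indexing research in this project addresses.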
References
- Radu Tudor Ionescu, Marius Popescu, Christopher Conly, and Vassilis Athitsos. Local Frame Match Distance: A Novel Approach for Exemplar Gesture Recognition. European Signal Processing Conference (EUSIPCO), August 2017. [PDF 245KB]
- Alex Dillhoff, Himanshu Pahwa, Christopher Conly, and Vassilis Athitsos. Providing Meaningful Alignments for Periodic Signs. Pervasive Technologies Related to Assistive Environments (PETRA), June 2017. [PDF 1.3MB]
- Sakher Ghanem, Christopher Conly, and Vassilis Athitsos. A Survey on Sign Language Recognition Using Smartphones. Pervasive Technologies Related to Assistive Environments (PETRA), June 2017. [PDF 250KB]
- Christopher Conly, Alex Dillhoff, and Vassilis Athitsos. Leveraging Intra-Class Variations to Improve Large Vocabulary Gesture Recognition. International Conference on Pattern Recognition (ICPR), December 2016. [PDF 683KB]
- Srujana Gattupalli, Amir Ghaderi, and Vassilis Athitsos. Evaluation of Deep Learning based Pose Estimation for Sign Language. Pervasive Technologies Related to Assistive Environments (PETRA), June 2016. [PDF 1.3MB]
- Christopher Conly, Zhong Zhang, and Vassilis Athitsos. An Integrated RGB-D System for Looking Up the Meaning of Signs. Pervasive Technologies Related to Assistive Environments (PETRA), July 2015. [PDF 920KB]
- Jun Wan, Vassilis Athitsos, Pat Jangyodsuk, Hugo Jair Escalante, Qiuqi Ruan, and Isabelle Guyon. CSMMI: Class-Specific Maximization of Mutual Information for Action and Gesture Recognition. IEEE Transactions on Image Processing, 23(7), pages 3152-3165, July 2014. [PDF 5.3MB]
- Pat Jangyodsuk, Christopher Conly, and Vassilis Athitsos. Sign Language Recognition using Dynamic Time Warping and Hand Shape Distance Based on Histogram of Oriented Gradient Features. Pervasive Technologies Related to Assistive Environments (PETRA), May 2014. [PDF 267KB]
- Christopher Conly, Zhong Zhang, and Vassilis Athitsos. An Evaluation of RGB-D Skeleton Tracking for Use in Large Vocabulary Complex Gesture Recognition. Pervasive Technologies Related to Assistive Environments (PETRA), May 2014. [PDF 3.7MB]
- Paul Doliotis, Vassilis Athitsos, Dimitrios Kosmopoulos, and Stavros Perantonis. Hand Shape and 3D Pose Estimation using Depth Data from a Single Cluttered Frame. International Symposium on Visual Computing (ISVC), July 2012. [PDF 238KB]
- Haijing Wang, Alexandra Stefan, Sajjad Moradi, Vassilis Athitsos, Carol Neidle, and Farhad Kamangar. A System for Large Vocabulary Sign Search. Workshop on Sign, Gesture and Activity (SGA), September 2010. [Postscript 5.0MB] [PDF 253KB]
- Vassilis Athitsos, Haijing Wang, and Alexandra Stefan. A Database-Based Framework for Gesture Recognition. Journal of Personal and Ubiquitous Computing, 14(6), pages 511-526, September 2010. [Postscript 33.0MB] [PDF 450KB]
- Jonathan Alon, Vassilis Athitsos, Quan Yuan, and Stan Sclaroff. A Unified Framework for Gesture Recognition and Spatiotemporal Gesture Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 31(9), pages 1685-1699, September 2009. [Postscript 10.3MB] [PDF 405KB]
- Haijing Wang, Alexandra Stefan, and Vassilis Athitsos. A Similarity Measure for Vision-Based Sign Recognition. Invited submission to the International Conference on Universal Access in Human-Computer Interaction (UAHCI), pages 607-616, July 2009. [Postscript 6.6MB] [PDF 204KB]
- Alexandra Stefan, Vassilis Athitsos, Quan Yuan, and Stan Sclaroff. Reducing JointBoost-Based Multiclass Classification to Proximity Search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2009. [Postscript 229KB] [PDF 114KB]
- Zheng Wu, Margrit Betke, Jingbin Wang, Vassilis Athitsos, and Stan Sclaroff. Tracking with Dynamic Hidden State Shape Models. European Conference on Computer Vision (ECCV), pages 643-656, October 2008. [Postscript 18.2MB] [PDF 493KB]
- Vassilis Athitsos, Carol Neidle, Stan Sclaroff, Joan Nash, Alexandra Stefan, Quan Yuan, and Ashwin Thangali. The American Sign Language Lexicon Video Dataset. IEEE Workshop on Computer Vision and Pattern Recognition for Human Communicative Behavior Analysis (CVPR4HB), June 2008. [Best paper award] [Postscript 30.8MB] [PDF 277KB]