A Kazakhstani Research Project Aims To Create the First Kazakh Sign Language Corpus
How a Nazarbayev University research team is working to advance research on sign languages
The world is changing so quickly that people have come to expect innovative solutions for everyday tasks, either already available or soon to arrive. Thanks to high-tech efforts around the world, including in Kazakhstan, we are witnessing the world around us become a “better place” to live. Whether in business, education, medicine, the military, or social services, numerous individuals are challenging themselves to solve the problems our society faces. Today’s story is about one of them, an Assistant Professor at the Nazarbayev University School of Engineering and Digital Sciences.
Anara Sandygulova, who holds a PhD in Computer Science, is an Assistant Professor in the Department of Robotics and Mechatronics at the School of Engineering and Digital Sciences. She currently leads a research team working on the Kazakh Sign Language Automatic Recognition System, or K-SLARS for short. The project is already under way and is planned to run for 36 months, funded by the NU Faculty Development Program.
Anara Sandygulova explained that deaf communities around the world use sign language as their first language, and that it is independent of the spoken language used in their country. For example, American Sign Language and British Sign Language are quite different even though English is the spoken language of both countries. Similarly, each country or region has its own sign language with its own grammar and rules, so that a few hundred sign languages exist today. There are more than 18,000 deaf and hard-of-hearing individuals in Kazakhstan. According to Anara, Kazakhstan shares the same sign language with Russia, Moldova, and other countries of the CIS region, which might be due to the highly centralized system that once operated across the Soviet Union.
“Research on sign language recognition, generation, and translation has a high potential for impact. While automatic speech recognition has progressed to being commercially available, automatic sign language recognition is still in its infancy. Many innovative solutions exist to support spoken languages (in both oral and written forms), but many deaf individuals are not fluent in the spoken language of the country they live in. Thus, they are often isolated from society and face social and communication barriers in every aspect of their lives. If speech processing solutions such as YouTube’s captioning existed for sign languages, with automatic conversion of the text into sign videos, deaf individuals would be able to take advantage of online content in order to gain new competencies. However, one of the main constraints is the availability of large, generalizable, realistic sign language datasets.”
Anara explains that “signs in sign languages are composed of phonological components put together under certain rules. Linguists identify the following main components present in signs: handshapes, location on the body, movement, orientation, facial expressions, and lip patterns. This project aims to create the first corpus of Kazakh Sign Language that would be appropriate for machine learning and linguistics research. As with any video dataset, manual annotation of sign languages (manual and non-manual components) is extremely time- and resource-consuming. We aim to create a semi-automatic annotation tool, which will automatically annotate manual and non-manual components, thus contributing to a faster creation of annotated datasets. At the same time, the algorithms will be further applied to automatic sign language recognition for various human-computer/robot interaction applications,” she said.
The main motivating factor of the project is the need for thoroughly and systematically organized data for processing Kazakh Sign Language (KSL). “Similar datasets exist for other sign languages around the globe, but they are often quite limited in vocabulary size and signer variability, and contain unrealistic signing, as it is often slower and has simpler interpretation. That’s why K-SLARS aims to collect a signer-independent, realistic dataset using crowdsourcing techniques,” she added.
The research project takes place in the research laboratories of Nazarbayev University’s School of Engineering and Digital Sciences. The research has attracted several local specialists, including MS and PhD students. The team collaborates closely with a sign language linguist, Associate Professor Dr. Vadim Kimmelman of the University of Bergen in Norway, whose expertise in Russian Sign Language is crucial for the success of the project. The project is expected to comply with the principles of scientific ethics and ethical management procedures, maintaining high standards of intellectual honesty and avoiding fabrication or falsification of scientific data, plagiarism, and false co-authorship.
“Our team has already secured two publications at prestigious international conferences. The project will end up having a combination of software, datasets, know-how and results which will be considered as Intellectual Property,” she said. The expected results are a new sign language dataset and a dedicated web-based semi-automatic annotation tool for sign languages.