A new open source dataset links human motion and language



Researchers have created a large, open source database to support the development of robot activities based on natural language input. The new KIT Motion-Language Dataset will help to unify and standardize research linking human motion and natural language, as presented in an article in Big Data, a peer-reviewed journal from Mary Ann Liebert, Inc., publishers. The article is available free on the Big Data website until March 9, 2017.

In the article "The KIT Motion-Language Dataset," Matthias Plappert, Christian Mandery, and Tamim Asfour, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology (KIT), Germany, describe a novel crowd-sourcing approach and a purpose-built web-based tool that they used to annotate motions for their publicly available dataset. Their approach relies on a unified representation that is independent of the capture system and marker set, which makes it possible to merge data from different existing motion capture databases into the KIT Motion-Language Dataset. The dataset currently includes about 4,000 motions and more than 6,200 natural language annotations that together contain nearly 53,000 words.

The article is part of a special issue of Big Data on "Big Data in Robotics" led by Guest Editors Jeannette Bohg, PhD, Matei Ciocarlie, PhD, Javier Civera, PhD, and Lydia Kavraki, PhD.

"Human motion is complex and nuanced in terms of how it can be described, and it is surprisingly difficult to even retrieve motions from databases corresponding to natural language descriptions. There is a great need to describe robotic systems in natural language that captures the richness associated with motion, but doing this accurately is an extremely challenging problem," says Big Data Editor-in-Chief Vasant Dhar, Professor at the Stern School of Business and the Center for Data Science at New York University. "Plappert and his colleagues do a wonderful job using a novel crowd-sourcing approach and a tool to document the annotation process itself along with methods for obtaining high quality inputs and selecting motions that require further annotation automatically. They have constructed an impressive database of motions and annotations that can serve as a test-bed for research in this area. It is a great service to the research community."


About the Journal

Big Data, published quarterly online with open access options and in print, facilitates and supports the efforts of researchers, analysts, statisticians, business leaders, and policymakers to improve operations, profitability, and communications within their organizations. Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address the challenges and discover new breakthroughs and trends living within this information. Complete tables of contents and a sample issue may be viewed on the Big Data website.

About the Publisher

Mary Ann Liebert, Inc., publishers is a privately held, fully integrated media company known for establishing authoritative medical and biomedical peer-reviewed journals, including OMICS: A Journal of Integrative Biology, Journal of Computational Biology, New Space, and 3D Printing and Additive Manufacturing. Its biotechnology trade magazine, GEN (Genetic Engineering & Biotechnology News), was the first in its field and is today the industry's most widely read publication worldwide. A complete list of the firm's more than 80 journals, newsmagazines, and books is available on the Mary Ann Liebert, Inc., publishers website.

Media Contact

Kathryn Ryan

