School of Science and Technology

Learning embedding space for Latin dance


Understanding the messages conveyed by humans is one of the fundamental capabilities necessary for successful human-machine interaction. Past research has mainly focused on developing methods to understand unambiguous messages expressed through written text, spoken language, facial expressions and hand gestures. However, humans communicate not only through spoken language, written text and gestures, but also through the motion of their bodies as a whole. Therefore, to enable robots to understand humans better, it is necessary to enable them to understand the whole spectrum of human body motions.


The goal of the proposed project is to conduct fundamental research on the categorisation of human body motion. In this project, you will focus in particular on motions executed during a dance. The key objective is to compare different embedding models and evaluate their performance.


  • Preparation and analysis of recorded mocap data.
  • Learning an embedding space for mocap data with deep learning.
  • Comparison of different embedding models (e.g. variational auto-encoder, triple encoding, Siamese networks).
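
To give a flavour of what comparing embeddings can involve, the sketch below computes a triplet margin loss on toy vectors. It assumes that "triple encoding" refers to triplet-style training, in which an anchor sample is pulled towards a similar sample and pushed away from a dissimilar one; this reading, the function names and the 2-D example vectors are illustrative assumptions and not part of the project specification.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings of three motion clips (made-up values).
anchor = [0.0, 0.0]    # e.g. one clip of a given dance figure
positive = [0.0, 1.0]  # a clip assumed to show the same figure
negative = [3.0, 4.0]  # a clip assumed to show a different figure

# Well-separated triplet: distances 1 and 5, so the loss is clipped to 0.
loss_separated = triplet_margin_loss(anchor, positive, negative)

# Swapping the roles violates the margin: 5 - 1 + 1 = 5.
loss_violated = triplet_margin_loss(anchor, negative, positive)
```

In an actual project the embeddings would come from a trained network (for example in PyTorch) rather than hand-written lists; the sketch only illustrates the quantity such a model would minimise.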

Necessary skills

  • Programming in Python
  • Basic knowledge of machine learning
  • Experience with deep learning and PyTorch is a plus



Johannes Stork and Tomasz Kucner