
Introduction

This is an ongoing project in collaboration with Jules Françoise, in which we are teaching a deep learning model to generate beat-synchronous dance movements for a given song and to match movement patterns to musical patterns. We have created a database of synchronized groove movements and songs as training data.

Rather than taking a supervised approach, we treat this as an unsupervised learning problem. For each song, we extract audio descriptors and train a multi-modal neural network on both the audio descriptors and the joint rotations of the dancer.
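To make the descriptor-extraction step concrete, here is a minimal NumPy sketch that computes two of the descriptors listed below (RMS level and spectral centroid) on a frame-by-frame basis. This is a hypothetical illustration, not our actual extraction pipeline; the hop size is chosen so the frame rate matches the model's 60 fps.

```python
import numpy as np

def frame_descriptors(signal, sr=44100, frame_len=1024, hop=735):
    """Per-frame RMS level and spectral centroid (a toy two-descriptor
    version of the 84-D feature vector; hypothetical, not the project's
    real extractor). hop=735 at 44.1 kHz gives ~60 frames/s."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    window = np.hanning(frame_len)
    feats = np.empty((n_frames, 2))
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len] * window
        mag = np.abs(np.fft.rfft(frame))
        rms = np.sqrt(np.mean(frame ** 2))                    # RMS level
        centroid = (freqs * mag).sum() / (mag.sum() + 1e-12)  # spectral centroid (Hz)
        feats[i] = (rms, centroid)
    return feats

# Sanity check: for 1 s of a 440 Hz sine, the centroid sits near 440 Hz
t = np.arange(44100) / 44100.0
f = frame_descriptors(np.sin(2 * np.pi * 440 * t))
```

The full feature set (Bark bands, MFCCs, pitch salience, etc.) would be stacked the same way, one row per frame, so that each motion frame has a matching audio-descriptor vector.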

I will be updating this page as we make progress...

The Approach

Preliminary Results - April 2017

As submitted to the Workshop on Machine Learning for Creativity: PDF.

Learning and Generating Movement Patterns

Dancing with Training Songs

  • FCRBM - Cooked Features
    Hidden Units: 500 | Factors: 500 | Order: 30 | Frame Rate: 60
    Audio Features (84 dimensions):
    low-level features (RMS level, Bark bands),
    spectral features (energy in low/middle/high frequencies, spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral rolloff, spectral crest, spectral flux, spectral complexity),
    timbral features (Mel-Frequency Cepstral Coefficients, Tristimulus),
    melodic features (pitch, pitch salience and confidence, inharmonicity, dissonance).

    Based on audio track 1: Output 1 · Output 2 · Output 3
    Based on audio track 2: Output 4 · Output 5 · Output 6
    Based on audio track 3: Output 7 · Output 8 · Output 9
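At generation time, the trained FCRBM predicts each pose from the previous Order = 30 poses, with the current audio frame gating (multiplicatively modulating) the contribution of the motion history through the factored three-way interactions. The toy rollout below sketches only that data flow: random stand-in weights, a hypothetical 63-D pose vector, and a deterministic mean-field step in place of the RBM's actual sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
ORDER, POSE_DIM, AUDIO_DIM, N_FACTORS = 30, 63, 84, 500
# Stand-in parameters; a trained FCRBM learns these (shapes are illustrative)
W_past = rng.normal(0, 0.01, (ORDER * POSE_DIM, N_FACTORS))
W_audio = rng.normal(0, 0.01, (AUDIO_DIM, N_FACTORS))
W_vis = rng.normal(0, 0.01, (N_FACTORS, POSE_DIM))

def predict_next_pose(history, audio_frame):
    """One music-conditioned generation step: audio features gate the
    motion-history contribution at the factors (the FCRBM's core idea).
    Deterministic sketch, not the full sampled model."""
    factors = (history.reshape(-1) @ W_past) * (audio_frame @ W_audio)
    return np.tanh(factors) @ W_vis

# Autoregressive rollout over a random stand-in audio track (2 s at 60 fps)
history = rng.normal(0, 0.1, (ORDER, POSE_DIM))   # seed motion window
audio = rng.normal(size=(120, AUDIO_DIM))
poses = []
for frame in audio:
    pose = predict_next_pose(history, frame)
    poses.append(pose)
    history = np.vstack([history[1:], pose])       # slide the order-30 window
poses = np.array(poses)
```

In the real system the same loop runs in real time, one pose per incoming audio frame, which is what makes the music-driven generation interactive.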

Dancing with Unheard Songs

  • FCRBM - Cooked Features
    Hidden Units: 500 | Factors: 500 | Order: 30 | Frame Rate: 60
    Audio Features (84 dimensions):
    low-level features (RMS level, Bark bands),
    spectral features (energy in low/middle/high frequencies, spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral rolloff, spectral crest, spectral flux, spectral complexity),
    timbral features (Mel-Frequency Cepstral Coefficients, Tristimulus),
    melodic features (pitch, pitch salience and confidence, inharmonicity, dissonance).

    Output 1 · Output 2 · Output 3 · Output 4 · Output 5 · Output 6

Fun Outputs

Fun 1 · Fun 2 · Fun 3 · Fun 4 · Fun 5

Publications

  • Omid Alemi, Jules Françoise, and Philippe Pasquier. "GrooveNet: Real-Time Music-Driven Dance Movement Generation using Artificial Neural Networks". Accepted to the Workshop on Machine Learning for Creativity, 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Halifax, Nova Scotia, Canada, 2017. PDF.