Setup

pods

In Sheffield we created a suite of software tools for ‘Open Data Science.’ Open data science is an approach to sharing code, models and data that should make it easier for companies, health professionals and scientists to gain access to data science techniques.

You can also check this blog post on Open Data Science.

The software can be installed using

%pip install --upgrade git+https://github.com/lawrennd/ods

from the command prompt where you can access your python installation.

The code is also available on github: https://github.com/lawrennd/ods

Once pods is installed, it can be imported in the usual manner.

import pods

mlai

[edit]

The mlai software is a suite of helper functions for teaching and demonstrating machine learning algorithms. It was first used in the Machine Learning and Adaptive Intelligence course in Sheffield in 2013.

The software can be installed using

%pip install --upgrade git+https://github.com/lawrennd/mlai.git

from the command prompt where you can access your python installation.

The code is also available on github: https://github.com/lawrennd/mlai

Once mlai is installed, it can be imported in the usual manner.

import mlai

%pip install gpy

GPy: A Gaussian Process Framework in Python

[edit]

Gaussian processes are a flexible tool for non-parametric analysis with uncertainty. The GPy software was started in Sheffield to provide a easy to use interface to GPs. One which allowed the user to focus on the modelling rather than the mathematics.

Figure: GPy is a BSD licensed software code base for implementing Gaussian process models in Python. It is designed for teaching and modelling. We welcome contributions which can be made through the Github repository https://github.com/SheffieldML/GPy

GPy is a BSD licensed software code base for implementing Gaussian process models in python. This allows GPs to be combined with a wide variety of software libraries.

The software itself is available on GitHub and the team welcomes contributions.

The aim for GPy is to be a probabilistic-style programming language, i.e. you specify the model rather than the algorithm. As well as a large range of covariance functions the software allows for non-Gaussian likelihoods, multivariate outputs, dimensionality reduction and approximations for larger data sets.

The documentation for GPy can be found here.

CMU Mocap Database

[edit]

Motion capture data from the CMU motion capture data base (CMU Motion Capture Lab, 2003).

import pods

You can download any subject and motion from the data set. Here we will download motion 01 from subject 35.

subject='35' 
motion=['01']

data = pods.datasets.cmu_mocap(subject, motion)

The data dictionary contains the keys ‘Y’ and ‘skel,’ which represent the data and the skeleton..

data['Y'].shape

The data was used in the hierarchical GP-LVM paper (Lawrence and Moore, 2007) in an experiment that was also recreated in the Deep Gaussian process paper (Damianou and Lawrence, 2013).

print(data['citation'])

And extra information about the data is included, as standard, under the keys info and details.

print(data['info'])
print()
print(data['details'])

CMU Motion Capture GP-LVM

[edit]

The original data has the figure moving across the floor during the motion capture sequence. We can make the figure walk ‘in place,’ by setting the x, y, z positions of the root node to zero. This makes it easier to visualize the result.

# Make figure move in place.
data['Y'][:, 0:3] = 0.0

We can also remove the mean of the data.

Y = data['Y']

Now we create the GP-LVM model.

import GPy

model = GPy.models.GPLVM(Y, 2, normalizer=True)

Now we optimize the model.

model.optimize(messages=True, max_f_eval=10000)

Figure: Gaussian process latent variable model visualisation of CMU motion capture data set.

References

CMU Motion Capture Lab, 2003. The CMU mocap database.

Damianou, A., Lawrence, N.D., 2013. Deep Gaussian processes. pp. 207–215.

Lawrence, N.D., Moore, A.J., 2007. Hierarchical Gaussian process latent variable models. pp. 481–488.