The 3 phase oil data is a data set that was used to demonstrate models like the GTM. The page where it was hosted seems to have disappeared, so I’m putting links to the parts I have here. This information comes to you courtesy of the Wayback Machine (

Here is the original text from the page (the links won’t work, apart from those to the data set):

3 Phase Data


This is synthetic data modelling non-intrusive measurements on a pipeline transporting a mixture of oil, water and gas. The flow in the pipe takes one out of three possible configurations: horizontally stratified, nested annular or homogeneous mixture flow. The data lives in a 12-dimensional measurement space, but for each configuration, there are only two degrees of freedom: the fraction of water and the fraction of oil. (The fraction of gas is redundant, since the three fractions must sum to one.) Hence, the data lives on a number of ‘sheets’ which locally are approximately 2-dimensional.
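
The redundancy of the gas fraction can be made concrete with a small sketch. It assumes the water and oil fractions are held in an n-by-2 NumPy array; the function name `gas_fraction` and the example values are made up for illustration and are not part of the data set.

```python
import numpy as np

def gas_fraction(water_oil):
    """Return the gas fraction implied by an (n, 2) array of water and oil fractions.

    The three fractions sum to one, so gas = 1 - water - oil.
    """
    water_oil = np.asarray(water_oil, dtype=float)
    if water_oil.ndim != 2 or water_oil.shape[1] != 2:
        raise ValueError("expected an array of shape (n, 2)")
    return 1.0 - water_oil.sum(axis=1)

# Made-up fractions for illustration only (not taken from the data set).
example = np.array([[0.2, 0.3],
                    [0.5, 0.5],
                    [0.0, 0.0]])
gas = gas_fraction(example)
print(gas)
```

Because the gas fraction is fully determined by the other two, it adds no third degree of freedom, which is why each configuration traces out a roughly 2-dimensional sheet.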

At the moment, there is no further information online. Even the old Aston paper repository has disappeared, so I can no longer link to the papers hosted there that used the data.


The data is available either as a tar-file containing compressed ASCII files, or as a compressed MATLAB (R) workspace. The files/variables contain:

  • Data*, 1000 measurements, 1000-by-12
  • Data*Frctns, the corresponding fractions of water and oil (in that order), 1000-by-2
  • Data*Lbls, the corresponding configuration labels, given in a 1-of-3 coding scheme, where
    • [1 0 0] == Homogeneous configuration
    • [0 1 0] == Annular configuration
    • [0 0 1] == Stratified configuration


‘*’ above is replaced by ‘Trn’, ‘Vdn’, and ‘Tst’, which are meant to correspond to training, validation and test data; the three file sets all contain 1000 samples. The fractions and configurations are picked at random from corresponding uniform distributions.

  • gzipped tar-file
  • gzipped MATLAB workspace
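
As a rough sketch of how the naming scheme and the 1-of-3 coding described above fit together, the Python below builds the variable names for each split and turns label rows back into configuration names. The helper names (`variable_names`, `decode_labels`) are hypothetical, and the file names inside the archives are not stated on the page, so loading is left out and a few hand-written label rows stand in for the real data.

```python
import numpy as np

# Order follows the 1-of-3 coding given in the list above.
CONFIGURATIONS = ("Homogeneous", "Annular", "Stratified")
SPLITS = ("Trn", "Vdn", "Tst")  # training, validation, test

def variable_names(split):
    """Return the three variable names for one split, e.g. 'DataTrn'."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return (f"Data{split}", f"Data{split}Frctns", f"Data{split}Lbls")

def decode_labels(one_hot):
    """Map rows of a 1-of-3 coded array to configuration names."""
    one_hot = np.asarray(one_hot)
    if one_hot.ndim != 2 or one_hot.shape[1] != 3:
        raise ValueError("expected an array of shape (n, 3)")
    # Each row must contain exactly one 1 and two 0s.
    valid = np.isin(one_hot, (0, 1)).all() and (one_hot.sum(axis=1) == 1).all()
    if not valid:
        raise ValueError("rows are not valid 1-of-3 codes")
    return [CONFIGURATIONS[i] for i in one_hot.argmax(axis=1)]

# Hand-written stand-in labels, not taken from the data set.
stand_in_labels = np.array([[1, 0, 0],
                            [0, 0, 1],
                            [0, 1, 0]])
names = decode_labels(stand_in_labels)

for split in SPLITS:
    print(variable_names(split))
print(names)
```

With the real files, the arrays for a split would be expected to have shapes 1000-by-12, 1000-by-2 and 1000-by-3 respectively, which makes for an easy sanity check after loading.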

