This is the old group guide on how to release MATLAB software associated with papers. We have since moved to Python and Jupyter.

Reproducibility

To ensure papers are reproducible, you should write your LaTeX files using MATweave for creating experiments and plots.

General Code Structure

We write code in an object-oriented style. It doesn’t use the MATLAB object-oriented interface because, when that interface was first released, it was very slow. Each toolbox contains files whose names typically start with the model “type”, e.g. gpCreate.m, which creates a structure for a Gaussian process. Each structure contains a field type which holds this string, 'gp' in this case. The second part of the name gives the role of the function. You can think of this as being like a method in object-oriented programming, i.e. gpCreate.m implements the method gp.create(), which creates the model. Other than gpCreate, functions of this type typically take a structure of type gp as their first argument, for example gpOptimise or gpDisplay.

There is also an object hierarchy. For example, gp is a member of the model hierarchy: there are functions modelOptimise and modelDisplay that can operate on gp or on other model types such as mlp or rbf. This allows generic demo files that can run different algorithms on the same data set. Much of this code is in the mltools toolbox.
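The naming convention can be illustrated outside MATLAB. The following is a minimal sketch in Python, not the toolbox code: the field names inputDim, outputDim and hiddenDim, the returned strings, and the FUNCTIONS lookup table are assumptions made for illustration. It shows how a generic function such as modelDisplay can build the name of the type-specific function from the type field and call it.

```python
# Sketch of the TYPE + Role naming convention; this is not the toolbox code.
# Field names other than 'type' are hypothetical.

def gpCreate(inputDim, outputDim):
    """Plays the role of gp.create(): returns a structure tagged with its type."""
    return {"type": "gp", "inputDim": inputDim, "outputDim": outputDim}


def gpDisplay(model):
    """Type-specific 'method': takes the structure as its first argument."""
    return "Gaussian process: %d inputs, %d outputs" % (
        model["inputDim"], model["outputDim"])


def mlpCreate(inputDim, outputDim, hiddenDim):
    return {"type": "mlp", "inputDim": inputDim,
            "outputDim": outputDim, "hiddenDim": hiddenDim}


def mlpDisplay(model):
    return "Multi-layer perceptron: %d hidden units" % model["hiddenDim"]


# Lookup table from function name to function, standing in for MATLAB's
# ability to call a function given its name as a string.
FUNCTIONS = {f.__name__: f for f in (gpCreate, gpDisplay, mlpCreate, mlpDisplay)}


def modelDisplay(model):
    """Generic entry point: dispatch on the 'type' field of the structure."""
    return FUNCTIONS[model["type"] + "Display"](model)
```

A generic demo can then call modelDisplay(gpCreate(2, 1)) or modelDisplay(mlpCreate(2, 1, 5)) without knowing which model type it was handed.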

dem Files

For experiments we follow netlab in using “dem” files to implement each experiment (netlab has influenced the design of the code in more ways than this; it is also the main source of optimization scripts for the models). A typical example is demStickLle1.m in the dimred toolbox. The file name starts with dem, indicating that it is a demo or experiment file. The idea is that other users will adapt these demo files to run on their own data, so the file should be as clear as possible. The next part of the name is the dataset being run, in this case stick, and the part after that is the model applied to the data set, here lle, the locally linear embedding. Finally, the number gives the experiment number, 1 in this case.

Inside the file the data is loaded (here using lvmLoadData), the experiment number is set, and the model is created (lleCreate) using default options from lleOptions and optimized with lleOptimize. Finally the model is saved with modelWriteResult. This form should be used for saving all models, as it allows only the relevant parts of the model to be saved by writing an lleWriteResult function. In the base case it simply writes the model structure as a mat file.
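The file naming scheme can be made concrete with a small parser. This is a sketch in Python under a stated assumption: the model name is taken to be the last capitalised word before the experiment number, so a model whose name contains an internal capital letter would be split incorrectly. The function name parse_dem_name and the pattern are hypothetical and are not part of any toolbox.

```python
import re

# Assumed layout: 'dem' + Dataset + Model + experiment number (+ optional '.m').
# The model is assumed to be a single capitalised word directly before the digits.
DEM_PATTERN = re.compile(r"^dem([A-Z][A-Za-z0-9]*?)([A-Z][a-z]*)(\d+)(?:\.m)?$")


def parse_dem_name(filename):
    """Split a dem file name into (dataset, model, experiment number)."""
    match = DEM_PATTERN.match(filename)
    if match is None:
        raise ValueError("not a dem file name: " + filename)
    dataset, model, number = match.groups()
    # Lower-case the first letter to recover the names as used in the text.
    dataset = dataset[0].lower() + dataset[1:]
    model = model[0].lower() + model[1:]
    return dataset, model, int(number)
```

For the example in the text, parse_dem_name("demStickLle1.m") gives the dataset stick, the model lle and experiment number 1.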

General Directory Structure

The code is released automatically by a really messy script that was put together over several years. You can find it in mlprojects/reproducible/python/contentsMaker.py.

The bulk of our code released is in MATLAB. The repository is structured to contain code in the following way:

mlprojects/TOOLBOXNAME/matlab/mfiles.m

Each toolbox name is a lowercase version of the name under which the toolbox is released. In my own directory structure, under the matlab directory, I also keep expanded versions of the toolbox releases (e.g. TOOLBOXNAME0p13 for release 0.13 of toolboxname). I then have some MATLAB scripts that quickly and easily load the desired toolbox into the path. These scripts are located in

mlprojects/matlab/general/

and are called: importLatest.m, importTool.m, closeLatest.m and closeTool.m. You can use importLatest('kern') to add to the path the latest version of the kernel toolbox, if it is stored under your kern/matlab/ subdirectory. importTool will, without arguments, add to the path the current active version (i.e. that in the kern/matlab directory) of the toolbox, or if given a version number, it imports a particular version. closeLatest and closeTool remove toolboxes from the path.
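The release directory names encode the version by replacing the decimal point with the letter p. The sketch below, in Python, converts between the two forms. It is an illustration only: the function names are hypothetical, it is not how importLatest or importTool are implemented, and it assumes version strings of the form digits, point, digits.

```python
import re

# Assumed form of a release directory name: toolbox name, then the version
# with '.' replaced by 'p', e.g. TOOLBOXNAME0p13 for release 0.13.
RELEASE_DIR = re.compile(r"^(.+?)(\d+)p(\d+)$")


def release_dir_name(toolbox, version):
    """Build the directory name for a release, e.g. ('KERN', '0.13')."""
    return toolbox + version.replace(".", "p")


def parse_release_dir(dirname):
    """Recover (toolbox name, version string) from a release directory name."""
    match = RELEASE_DIR.match(dirname)
    if match is None:
        raise ValueError("not a release directory name: " + dirname)
    name, major, minor = match.groups()
    return name, major + "." + minor
```

Whether the toolbox name in the directory is upper or lower case is left to the caller here, since the text does not settle it.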

FAQ

  1. How do I declare dependencies?
  2. What about usage information?

    For usage information, create a sub-directory called html under the main toolbox directory, i.e. at toolboxname/html/. In this directory create a file called index.txt. This file should follow the format given in index.txt_template. It will be used to auto-generate the index.html file.

  3. What if there is a file I don’t want to release? Place the name of the file inside ignorefiles.txt.
  4. What if there is a file I need to release but it isn’t in the directory?

    Place the name of the file inside additionalfiles.txt in a line underneath a tag giving the directory. Every release directory should have one of these files starting:

    dir: html
    ~/mlprojects/toolboxname/html/index.html
    

    which ensures that the index.html file stored in the html subdirectory is copied across. Typically there will be other entries for image files containing the results so you might have

    dir: html
    ~/mlprojects/toolboxname/html/index.html
    ~/mlprojects/toolboxname/html/resultFile1.png 
    ~/mlprojects/toolboxname/html/resultFile2.png 
    

    To add further files, just start with

    dir: newDirectoryName
    ~/mlprojects/pathToFileName
    
  5. How should I comment my MATLAB code?

    Comments should be in the form listed below. They are processed by the script and placed in the release file. After the comments there needs to be a one line tag of the form:

    % TOOLBOXNAME
    

    which “tags” the function for release (i.e. if it isn’t there, the function is assumed not to be in the toolbox and is placed in a separate “clutterDir” by the releasing script).

    % FUNCTIONNAME One line description of function (this will appear in the Contents.m file)
    % FORMAT
    % DESC description that will start with the format of the function to be used (e.g. [x, y] = foo(a, b)). This is auto 
    % generated by the material below, you just need to write something like computes the foo of a and b and returns 
    % them to the user.
    % ARG  arg1 : description of arg1.
    % ARG  arg2 : description of arg2.
    % RETURN return1 : description of first returned variable.
    % RETURN return2 : description of second returned variable.
    %
    % COPYRIGHT : Firstauthorname Lastname, year1, year2, year3
    %
    % COPYRIGHT : Secondauthorname Lastname, year4, year5, year6
    %
    % MODIFICATIONS : minorModAuthorname Lastname, year4, year5, year6
    %
    % SEEALSO : functionOne, functionTwo, functionThree
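
The two file conventions described in the FAQ, the dir: blocks of additionalfiles.txt and the one-line toolbox tag in each MATLAB file, can be sketched in Python as follows. This is an illustration under assumptions and not the code of contentsMaker.py: the function names are hypothetical, and the tag is assumed to be a comment line holding only the upper-case toolbox name.

```python
def parse_additional_files(text):
    """Group the listed file paths under their 'dir:' tags.

    Returns a dict mapping each directory name to its list of paths.
    """
    entries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("dir:"):
            current = line[len("dir:"):].strip()
            entries.setdefault(current, [])
        elif current is None:
            raise ValueError("path listed before any 'dir:' tag: " + line)
        else:
            entries[current].append(line)
    return entries


def has_release_tag(mfile_text, toolbox):
    """Check for a comment line of the form '% TOOLBOXNAME' (assumed upper case)."""
    tag = "% " + toolbox.upper()
    return any(line.strip() == tag for line in mfile_text.splitlines())
```

For example, parsing the two-line html block shown in question 4 yields a single entry, html, holding the path to index.html.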
    
This site last compiled Sat, 25 Jan 2025 06:17:20 +0000
Copyright © Neil D. Lawrence 2025. All rights reserved.