A standard Bayesian approach to transfer learning is to construct hierarchical probabilistic models. Learning tasks are typically related in the model through conditional independencies of the variables and parameters, many of which are unobserved. Marginalization of the unobserved variables and Bayesian treatment of the parameters induce structure and correlations between the tasks. Gaussian processes are prior distributions over functions; kernel functions are the covariances associated with these priors. A Gaussian process can be set up to have multiple outputs. However, for these outputs to be correlated, a covariance function that models correlations between outputs is required. Equivalently, we need to develop multiple output kernel functions (also known as multitask kernel functions, or structured output kernels). In this talk we will briefly review work on creating multiple output kernels before focusing on models represented by convolution processes. We will arrive at convolution processes through physical interpretations of our models. We will illustrate these models with a range of real-world examples of both transfer learning and other applications.
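
As a rough illustration of what a multiple output kernel can look like, the sketch below builds one simple, widely used construction, the intrinsic coregionalization model, in which the joint covariance is the Kronecker product of a between-output matrix B and an ordinary input kernel. This is not necessarily one of the constructions covered in the talk, and it is not a convolution process; the function names, the squared-exponential input kernel, and the values of W and B are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def icm_kernel(x1, x2, B, lengthscale=1.0):
    """Multiple output covariance kron(B, k(x1, x2)).

    B is a positive semi-definite matrix of between-output covariances
    (hypothetical values here). Block (i, j) of the result is
    B[i, j] * k(x1, x2), so outputs i and j are correlated whenever
    B[i, j] is nonzero.
    """
    return np.kron(B, rbf_kernel(x1, x2, lengthscale))


# Two outputs observed at the same five inputs.
X = np.linspace(0.0, 1.0, 5)

# Assumed low-rank-plus-diagonal B, which is positive definite by construction.
W = np.array([[1.0], [0.8]])
B = W @ W.T + 0.1 * np.eye(2)

# Joint covariance over both outputs: a 10 x 10 matrix.
K = icm_kernel(X, X, B)

# Keeping only the diagonal of B gives independent outputs:
# the cross-covariance blocks vanish.
K_independent = icm_kernel(X, X, np.diag(np.diag(B)))

# Cross-covariance block between output 0 and output 1.
cross_block = K[:5, 5:]
```

With a diagonal B the joint prior factorizes into independent Gaussian processes, one per output, so observations of one task say nothing about the other. The off-diagonal entries of B are what allow information to transfer between tasks.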