# Malaria surveillance with multiple data sources using Gaussian process models

Martin Mubangizi, United Nations Global Pulse
Ricardo Andrade-Pacheco, UCSF Global Health Science
Michael Thomas Smith, University of Sheffield
John Quinn, United Nations Global Pulse and Makerere University
Neil D. Lawrence, University of Sheffield

in 1st International Conference on the Use of Mobile ICT in Africa

#### Abstract

A statistical framework for monitoring the health of a population should ideally be able to combine data from a wide variety of sources, such as remote sensing, telecoms, and official health records, in a principled manner. Gaussian process regression is commonly used to visualise disease incidence by interpolating values across a map; in this article, we show how it can be extended to deal with many different types of information by introducing a flexible covariance structure across data sources. Combining many data sources in a single model provides several practical advantages, such as the ability to automatically determine the importance of each data source through likelihood optimisation, and to deal with missing values. We illustrate the basic idea with an application of malaria density modelling across Uganda using administrative records and remote sensing vegetation index data, and then describe further extensions such as the incorporation of human mobility data extracted from mobile phone call detail records (CDRs).
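To make the idea of a covariance structure across data sources concrete, the following is a minimal sketch using an intrinsic coregionalisation model, in which the covariance between two observations is the product of a between-source matrix `B` and a spatial kernel. This is one common construction and is an assumption here, not necessarily the model used in the paper. The data are synthetic, and the names `rbf`, `icm_kernel`, `gp_posterior_mean` and `latent` are hypothetical helpers introduced for illustration. The entries of `B` are fixed by hand; in practice they would be fitted by maximising the marginal likelihood, which is how the relevance of each source would be learned.

```python
import numpy as np

def rbf(x1, x2, lengthscale):
    """Squared-exponential kernel between two sets of 2-D locations."""
    diff = x1[:, None, :] - x2[None, :, :]
    return np.exp(-0.5 * np.sum(diff**2, axis=-1) / lengthscale**2)

def icm_kernel(x1, s1, x2, s2, B, lengthscale):
    """Covariance between (location, source) pairs: B[s, s'] * k(x, x')."""
    return B[np.ix_(s1, s2)] * rbf(x1, x2, lengthscale)

def gp_posterior_mean(x, s, y, x_new, s_new, B, lengthscale=0.5, noise=1e-2):
    """Posterior mean of a zero-mean GP with Gaussian observation noise."""
    K = icm_kernel(x, s, x, s, B, lengthscale) + noise * np.eye(len(y))
    K_new = icm_kernel(x_new, s_new, x, s, B, lengthscale)
    chol = np.linalg.cholesky(K)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    return K_new @ alpha

# Synthetic stand-ins (assumed, not real data): source 0 plays the role of
# sparse incidence records, source 1 a densely observed vegetation index.
rng = np.random.default_rng(0)

def latent(x):
    return np.sin(3 * x[:, 0]) + np.cos(2 * x[:, 1])

x_inc = rng.uniform(0, 1, size=(5, 2))
x_veg = rng.uniform(0, 1, size=(30, 2))
y_inc = latent(x_inc) + 0.05 * rng.standard_normal(5)
y_veg = 0.8 * latent(x_veg) + 0.05 * rng.standard_normal(30)

# Stack both sources; each row carries an integer source label.
x = np.vstack([x_inc, x_veg])
s = np.concatenate([np.zeros(5, dtype=int), np.ones(30, dtype=int)])
y = np.concatenate([y_inc, y_veg])

# Between-source covariance B = W W^T + diag(kappa), positive definite.
W = np.array([[1.0], [0.8]])
B = W @ W.T + np.diag([0.05, 0.05])

# Predict source 0 at new locations, borrowing strength from source 1.
x_new = rng.uniform(0, 1, size=(3, 2))
s_new = np.zeros(3, dtype=int)
mean = gp_posterior_mean(x, s, y, x_new, s_new, B)
print(mean)
```

Because each observation is a (location, source) row, a location observed in only one source simply contributes one row; no imputation step is needed, which is how missing values are accommodated in this kind of construction.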

    @InProceedings{mubangizi-malaria14,
      title = {Malaria surveillance with multiple data sources using {G}aussian process models},
      author = {Martin Mubangizi and Ricardo Andrade-Pacheco and Michael Thomas Smith and John Quinn and Neil D. Lawrence},
      booktitle = {1st International Conference on the Use of Mobile ICT in Africa},
      year = {2014},
      month = {12},
      url = {http://inverseprobability.com/publications/mubangizi-malaria14.html},
      key = {Mubangizi:malaria14},
      linkpdf = {http://air.ug/papers/MubangiziUMICTA2014.pdf}
    }
    TY  - CPAPER
    TI  - Malaria surveillance with multiple data sources using Gaussian process models
    AU  - Martin Mubangizi
    AU  - Ricardo Andrade-Pacheco
    AU  - Michael Thomas Smith
    AU  - John Quinn
    AU  - Neil D. Lawrence
    BT  - 1st International Conference on the Use of Mobile ICT in Africa
    PY  - 2014/12/09
    DA  - 2014/12/09
    ID  - mubangizi-malaria14
    L1  - http://air.ug/papers/MubangiziUMICTA2014.pdf
    UR  - http://inverseprobability.com/publications/mubangizi-malaria14.html
    ER  -
Mubangizi, M., Andrade-Pacheco, R., Smith, M.T., Quinn, J. & Lawrence, N.D. (2014). Malaria surveillance with multiple data sources using Gaussian process models. 1st International Conference on the Use of Mobile ICT in Africa.