Malaria surveillance with multiple data sources using Gaussian process models

Martin Mubangizi, Ricardo Andrade-Pacheco, Michael Thomas Smith, John Quinn, Neil D. Lawrence, 2014.

Abstract

A statistical framework for monitoring the health of a population should ideally be able to combine data from a wide variety of sources, such as remote sensing, telecoms, and official health records, in a principled manner. Gaussian process regression is commonly used to visualise disease incidence by interpolating values across a map; in this article, we show how it can be extended to deal with many different types of information by introducing a flexible covariance structure across data sources. Combining many data sources in a single model provides a number of practical advantages, such as the ability to automatically determine the importance of each data source through likelihood optimisation, and to deal with missing values. We show the basic idea with an application of malaria density modeling across Uganda using administrative records and remote sensing vegetation index data, and then go on to describe further extensions such as the incorporation of human mobility data extracted from mobile phone call detail records (CDRs).
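As a rough illustration of what a "covariance structure across data sources" can look like, the sketch below uses an intrinsic coregionalisation kernel, in which the covariance between two observations is the product of a spatial kernel and an entry of a source-by-source matrix B. This is an assumed, minimal construction and not necessarily the model used in the paper; the function names, the toy data, and the hand-picked values of B are all hypothetical.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel over 1-D locations."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def icm_kernel(x1, s1, x2, s2, B, lengthscale=1.0):
    """Covariance between (location, source) pairs: B[s, s'] * k(x, x')."""
    return B[np.ix_(s1, s2)] * rbf(x1, x2, lengthscale)

def gp_mean(x_tr, s_tr, y_tr, x_te, s_te, B, noise=1e-2):
    """Posterior mean of a zero-mean GP with the coregionalised kernel."""
    K = icm_kernel(x_tr, s_tr, x_tr, s_tr, B) + noise * np.eye(len(x_tr))
    K_star = icm_kernel(x_te, s_te, x_tr, s_tr, B)
    return K_star @ np.linalg.solve(K, y_tr)

# Toy data (invented for illustration): source 0 stands in for sparse case
# records, source 1 for a densely sampled vegetation index. Both follow sin(x).
x0 = np.array([0.0, 0.5, 1.0])
x1 = np.linspace(0.0, 6.0, 25)
x_tr = np.concatenate([x0, x1])
s_tr = np.concatenate([np.zeros(3, dtype=int), np.ones(25, dtype=int)])
y_tr = np.sin(x_tr)

# Predict source 0 at a location where only source 1 was observed.
x_te = np.array([4.0])
s_te = np.array([0])
truth = np.sin(x_te)

B_coupled = np.array([[1.01, 1.00], [1.00, 1.01]])  # sources strongly correlated
B_indep = np.eye(2)                                  # sources treated as unrelated

pred_coupled = gp_mean(x_tr, s_tr, y_tr, x_te, s_te, B_coupled)
pred_indep = gp_mean(x_tr, s_tr, y_tr, x_te, s_te, B_indep)
print(pred_coupled, pred_indep, truth)
```

Because every observation carries its own source index, the sources do not need to be measured at the same locations, which is one way a joint model can cope with missing values. Here B is fixed by hand; learning its entries by maximising the marginal likelihood would correspond to the abstract's point about determining the importance of each source automatically, and is not shown.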

Cite this Paper


BibTeX
@InProceedings{pmlr-v-mubangizi-malaria14,
  title = {Malaria surveillance with multiple data sources using {G}aussian process models},
  author = {Martin Mubangizi and Ricardo Andrade-Pacheco and Michael Thomas Smith and John Quinn and Neil D. Lawrence},
  year = {2014},
  url = {http://inverseprobability.com/publications/mubangizi-malaria14.html}
}
Endnote
%0 Conference Paper
%T Malaria surveillance with multiple data sources using Gaussian process models
%A Martin Mubangizi
%A Ricardo Andrade-Pacheco
%A Michael Thomas Smith
%A John Quinn
%A Neil D. Lawrence
%C Proceedings of Machine Learning Research
%D 2014
%F pmlr-v-mubangizi-malaria14
%I PMLR
%J Proceedings of Machine Learning Research
%U http://inverseprobability.com
%W PMLR
RIS
TY  - CPAPER
TI  - Malaria surveillance with multiple data sources using Gaussian process models
AU  - Martin Mubangizi
AU  - Ricardo Andrade-Pacheco
AU  - Michael Thomas Smith
AU  - John Quinn
AU  - Neil D. Lawrence
PY  - 2014
ID  - pmlr-v-mubangizi-malaria14
PB  - PMLR
DP  - PMLR
UR  - http://inverseprobability.com/publications/mubangizi-malaria14.html
ER  -
APA
Mubangizi, M., Andrade-Pacheco, R., Smith, M.T., Quinn, J. & Lawrence, N.D. (2014). Malaria surveillance with multiple data sources using Gaussian process models. In PMLR.

Related Material