Our mission

The PANGEA HIV project adapts modern molecular epidemiology and phylodynamics of HIV sequence data to generate new insights into HIV transmission dynamics in generalized epidemics in Africa, and aims to provide a new approach for the evaluation of transmission interventions. In order to achieve these goals a large network has been formed between partners across Africa, the United States of America and Europe. The expertise of various partners is used to collect samples and demographic data, generate up to 20,000 HIV-1 genomes, and analyse the data within a phylogenetics framework.


Despite the availability of an increasing number of proven methods of HIV prevention, there is little consensus as to how to deploy them effectively, or how best to evaluate their impact. It is unknown whether treatment as prevention can be effectively targeted at either high-risk individuals or subpopulations, or what the net contribution of different risk groups to the epidemic is. The role of acute and early infection is still disputed, particularly outside of epidemics among men who have sex with men (MSM), and there is an increasing realisation that movement and migration may be an important epidemic driver. If new and innovative biomedical prevention tools are to be transformative, additional epidemiological evidence will be needed to guide their deployment, particularly in a budget-constrained environment. Phylogenetics is the tool with the greatest promise to disentangle the dynamics of individual and population sources, sinks and flows within the wider epidemic. Furthermore, viral sequence data encode vital information on the spread of drug resistance and on the incidence of new infections, and new tools are required to interpret this information properly.

Scientific Aims

Overarching goal

The overarching goal of the consortium is to analyse HIV-1 phylogenetic data to identify individual and population level factors that drive the epidemic, analyse the dynamics of the epidemic and translate these findings into information that can be used to more effectively target interventions. An epidemic can be described by a model of connected sources, sinks and hubs. Sources are groups in a population that disproportionately pass on infections, sinks are groups that disproportionately receive infections and hubs are both sources and sinks. Population groups can be defined by age, gender, risk group, geographic location or other characteristics and combinations thereof. Identifying these groups allows prevention to be tailored.
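The source, sink and hub definitions above can be illustrated with a short Python sketch. It is a minimal illustration, not the consortium's method: it assumes that "disproportionately" means relative to a group's share of the population, and the function name, the threshold of 1.5 and the group labels are all hypothetical.

```python
from collections import Counter


def classify_groups(transmissions, population_share, threshold=1.5):
    """Label each group as 'source', 'sink', 'hub' or 'neither'.

    transmissions: list of (transmitting_group, receiving_group) pairs.
    population_share: dict mapping group -> fraction of the population.
    threshold: assumed cut-off; a group counts as disproportionate when its
        share of transmissions is at least `threshold` times its population share.
    """
    total = len(transmissions)
    if total == 0:
        return {group: "neither" for group in population_share}

    passed_on = Counter(src for src, _ in transmissions)
    received = Counter(dst for _, dst in transmissions)

    roles = {}
    for group, share in population_share.items():
        is_source = (passed_on[group] / total) / share >= threshold
        is_sink = (received[group] / total) / share >= threshold
        if is_source and is_sink:
            roles[group] = "hub"
        elif is_source:
            roles[group] = "source"
        elif is_sink:
            roles[group] = "sink"
        else:
            roles[group] = "neither"
    return roles


# Made-up illustration with three hypothetical groups.
example_pairs = (
    [("A", "B")] * 5 + [("A", "C")] + [("C", "B")] * 2
    + [("B", "C")] + [("C", "A")]
)
example_roles = classify_groups(example_pairs, {"A": 0.2, "B": 0.2, "C": 0.6})
print(example_roles)
```

In this made-up example, group A passes on far more infections than its population share would suggest and group B receives far more, so they are labelled as a source and a sink respectively. Groups could equally be defined by combinations of age, gender and location.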


Targeted HIV prevention requires knowledge of how the virus spreads through populations and which groups (e.g. defined by age, gender, geography or phase of infection) are most likely to be infected and to pass on the virus, i.e. it requires the identification of sources, sinks and hubs. The most powerful tool available to study HIV transmission is phylogenetics. This is especially true since deep sequencing is able to capture the viral diversity within one patient and allows the construction of trees that show how the viruses found in one patient are related to each other. Traditionally, phylogenetics uses the most common HIV sequence found in each patient and builds a tree based on how sequences from different patients are related. The resulting tree indicates which sequences are similar to each other. Often additional data, like confirmed dates of seroconversion, are needed to infer who infected whom in a cluster of similar sequences. We developed new software called phyloscanner which, using deep sequencing data, is able to identify different sequences from one patient and build a tree from the sequences of a single patient, thus capturing HIV within-host diversity. Inferring who infected whom in a transmission cluster is now much easier and can often be achieved with reasonable confidence without the use of additional data: if the HIV sequences in person B are found to be entirely inside the tree of sequences found in person A, it is very likely that B inherited A’s viruses, either directly or via an unsampled intermediary. Phylogenetics thus allows the construction of a transmission network which can be used to identify sources, sinks and hubs and to study many questions that are crucial to making informed public health decisions.
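The nesting argument above can be sketched in Python. This is a simplified topological check written for illustration only. It is not phyloscanner's actual algorithm or interface, and it ignores branch lengths, read counts and uncertainty in the tree. The tree is assumed to be rooted and is represented as nested tuples whose tips are labelled with the person each sequence came from; the function names are hypothetical.

```python
def tip_hosts(node):
    """Return the host labels of all tips below `node`, in order."""
    if isinstance(node, str):
        return [node]
    return [host for child in node for host in tip_hosts(child)]


def mrca_subtree(node, host, total):
    """Return the smallest subtree containing all `total` tips from `host`."""
    if not isinstance(node, str):
        for child in node:
            if tip_hosts(child).count(host) == total:
                return mrca_subtree(child, host, total)
    return node


def nested_within(tree, inner, outer):
    """True if every sequence from `inner` sits inside the clade spanned by
    `outer`'s sequences, which is the pattern consistent with `outer`
    having infected `inner` (directly or via an unsampled intermediary)."""
    all_tips = tip_hosts(tree)
    n_inner, n_outer = all_tips.count(inner), all_tips.count(outer)
    if n_inner == 0 or n_outer == 0:
        return False
    inner_clade = tip_hosts(mrca_subtree(tree, inner, n_inner))
    outer_clade = tip_hosts(mrca_subtree(tree, outer, n_outer))
    # inner's sequences form a single clade of their own...
    inner_is_monophyletic = set(inner_clade) == {inner}
    # ...and outer's clade holds only sequences from these two people.
    outer_contains_inner = set(outer_clade) == {inner, outer}
    return inner_is_monophyletic and outer_contains_inner


# Made-up tree: B's two sequences sit deep inside A's diversity.
nested_tree = (("A", ("A", ("B", "B"))), "A")
# Made-up tree: sequences from A and B are intermingled.
mixed_tree = (("A", "B"), ("A", "B"))

print(nested_within(nested_tree, inner="B", outer="A"))
print(nested_within(mixed_tree, inner="B", outer="A"))
```

The check is deliberately conservative: if sequences from a third person fall inside the clade, or the two people's sequences are intermingled, it makes no call on direction.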


The project covers four interlinked themes: 1. Molecular Epidemiology and Mathematical Modelling, 2. Phylodynamics, 3. Mobility and Migration, and 4. Clinical Science, Drug Resistance and Ethics. The four themes correspond to four working groups that will each be led by one member of the Executive Committee.

Molecular Epidemiology and Mathematical Modelling (Christophe Fraser)

This theme is building a transmission network from the PANGEA sequences to answer several epidemiological questions, e.g. which demographic, clinical and virological correlates identify transmitters, whether stratifying the analysis by age and gender reveals transmission cycles across different age groups, and how drug-sensitive and drug-resistant viruses are transmitted. Mathematical modelling is a key tool for designing public health interventions, but several uncertainties have been difficult to resolve in HIV models, e.g. infectiousness of acute infection, transmission patterns between age and sex groups, heterogeneity of transmission rates and the contribution of hidden risk populations. As new technologies result in more costly interventions, it becomes ever more important to estimate key parameters using phylogenetics to prioritise target groups, to assess the impact of interventions and to use models to predict their likely effect.

Phylodynamics (Andrew Rambaut)

This theme takes the analysis of phylogenetic trees a step further, looking at spatial movement patterns of the virus across cohorts and countries and at the identification of ‘outbreaks’: unusually dense transmission clusters in geographically or demographically defined groups, which might indicate concentrated epidemics within the epidemic. The theme will also dig deeper into the trees, linking branch lengths to time, thereby identifying core groups that disproportionately spread the epidemic and assessing the impact that the stage of infection has on onward transmission. The Phylodynamics team also works on visualising the results and on further developing the analysis software used in this project. They will adapt existing phylodynamics tools to account for genetic variation and for use with large datasets.

Mobility and Migration (Kate Grabowski)

This theme investigates how mobility and migration influence the spread of the HIV epidemic in different African countries. There is growing evidence that mobile populations are at higher risk of HIV infection and can sustain epidemics if they are not eligible for treatment. Analysing mobility can also show to what extent high-prevalence areas fuel epidemics in surrounding areas of low prevalence. When this happens in areas of population trials (which is the case for three of the PANGEA sites), it might affect the interpretation of the results. Phylogenetics offers a unique way to track the spread of the virus along with mobile populations and to define local sources, sinks and hubs.

Clinical Science, Drug Resistance and Ethics (Deenan Pillay)

This theme investigates why incidence in many parts of Africa is not falling as fast as expected despite a dramatic increase in antiretroviral coverage, and how new prevention approaches will affect the epidemic. It also monitors to what extent the rapidly expanding use of antiretroviral therapy is associated with the emergence of HIV drug resistance. The phylogenetic approach will make it possible to find the groups that are still at high risk of HIV infection and target them in prevention approaches. Finally, this theme will look at the ethical issues around phylogenetic studies in a highly stigmatised disease like HIV and at how the risks and benefits can best be explained to study participants.


The second phase of the project is led by Prof Christophe Fraser, with Prof Deenan Pillay, Dr Kate Grabowski and Prof Andrew Rambaut as members of the Executive Committee and Dr Lucie Abeler-Dörner as Project Manager.

The Steering Committee consists of the Executive Committee, the Project Manager and representatives from the Africa Health Research Institute (South Africa), the Medical Research Council/Uganda Virus Research Institute (Uganda), Johns Hopkins University, Harvard University and Washington University (United States of America), Imperial College London, the London School of Hygiene and Tropical Medicine, University College London and the University of Edinburgh (United Kingdom).

An Advisory Board has yet to be appointed.

Project History

PANGEA has been funded in two phases by the Bill & Melinda Gates Foundation. The first phase (2013-2017) focussed on setting up the network, acquiring sequences and clinical data and setting up a suitable database. Over 13,000 HIV genomes were sequenced during this first phase of the project. The second phase (2017-2021) will see the generation of more sequences, but mainly focus on analysis (see Scientific Aims).

The first phase was led by Prof Deenan Pillay (AHRI and UCL), with Prof Andrew Leigh-Brown, Prof Christophe Fraser, Prof Paul Kellam and Prof Tulio de Oliveira as members of the Executive Committee and Dr Anne Hoppe as Project Manager. Members of the Executive Committee chaired the working groups that were responsible for delivering the different work packages.


A discussion forum for this project can be found here.

PANGEA HIV is funded by the Bill & Melinda Gates Foundation.
The content on this webpage does not necessarily reflect the positions or policies of the funder.