Natural Environment Research Council

Details of Award

NERC Reference : NE/X009637/1

Generative adversarial networks for demographic inferences of nonmodel species from genomic data

Grant Award

Principal Investigator:
Dr M Fumagalli, Queen Mary University of London, Sch of Biological & Behavioural Sciences
Science Area:
Atmospheric
Earth
Freshwater
Marine
Terrestrial
Overall Classification:
Unknown
ENRIs:
Biodiversity
Environmental Risks and Hazards
Global Change
Natural Resource Management
Pollution and Waste
Science Topics:
Gene flow
Genetic diversity
Population Genetics/Evolution
Artificial Intelligence
Machine Learning (AI)
Genomics
Algorithm Development
Bioinformatics
Abstract:
Understanding the temporal and geographic movement of populations is vital to address key questions in evolutionary and conservation biology. Whilst the generation of high-throughput genomic data has enabled the inference of population genomic parameters at an unprecedented rate, large-scale datasets have also prompted the development of novel computational techniques. In recent years, the predictive power provided by machine learning algorithms, in particular deep learning, has led to breakthrough discoveries in many disciplines. Nevertheless, the application of deep learning in evolutionary genomics is still in its infancy. Deep learning algorithms exhibit several advantages over commonly used inferential approaches in population genomics, as they can handle large data sets with minimal compression and are theoretically universal approximators of arbitrarily complex models. The intrinsic statistical uncertainty associated with genomic sequencing data, the lack of natural training data sets, and the computational resources needed have hampered the exploitation of these powerful techniques to generate novel findings in evolutionary biology. These challenges are particularly prominent in the study of nonmodel species, where prior knowledge of key parameters is typically missing. A promising strategy to partly overcome such barriers is given by the recent application of Generative Adversarial Networks (GANs), a branch of deep learning methods, which have been successfully applied to generate artificial genomes and estimate cryptic evolutionary parameters. GANs consist of two deep neural networks that are trained together; at the end of training, the algorithm generates simulations that are indistinguishable from real examples (as in the case of "Deepfake" methods in Artificial Intelligence). Thus, the final simulator provides estimates of model parameters.
In this project, we aim to pilot the design, implementation, and deployment of a novel GAN architecture for population genomic data. As an illustration, we will focus on the inference of demographic parameters, including temporal changes in population size and migration rate, describing the recent evolution of Anopheles mosquito populations among three villages in Burkina Faso. As the first objective, we will adapt a recently proposed GAN architecture for population genomic data to incorporate multiple populations with unequal sizes. As the second objective, we will train the algorithm by integrating simulations with extensive genomic data from Anopheles mosquito populations. We will include a significant technological advance by integrating a model selection step to discriminate among competing evolutionary scenarios. By estimating the migration rate of mosquito populations among villages, we will be able to assist predictions on the spread of resistance mutations and support molecular surveillance and intervention strategies at the local scale. In fact, it is still unclear to what extent resistance mutations can spread across the entire continent, as different studies have led to contrasting findings on the extent of migration between Anopheles populations. Upon completion of this pilot study, we will be able to scale the deep learning algorithm to all available mosquito populations from sub-Saharan Africa and infer gene flow at the continental scale. Additionally, the novel deep learning framework will be applicable to all mutations potentially associated with resistance or other notable phenotypes. It can be further extended to model complex modes of adaptation (e.g. via introgression or polygenic adaptation) and to other species of importance.
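The adversarial idea outlined in the abstract, where a simulator is tuned until a discriminator can no longer tell its output from real data, can be illustrated with a deliberately small Python sketch. It is not the architecture proposed in this award. Everything in it is an assumption made for illustration: the one-parameter Gaussian toy simulator standing in for a population-genetic simulator, the logistic-regression discriminator standing in for a deep network, the proposal-and-accept update rule, and all function names (`simulate`, `train_discriminator`, `generator_loss`, `fit_parameter`).

```python
import math
import random


def simulate(theta, n, rng):
    """Toy stand-in for a population-genetic simulator (assumption):
    returns n one-dimensional 'summary statistics' drawn from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_discriminator(real, fake, steps=30, lr=0.1):
    """Fit D(x) = sigmoid(w*x + b) by gradient descent on cross-entropy,
    with label 1 for real data and label 0 for simulated data."""
    data = [(x, 1.0) for x in real] + [(x, 0.0) for x in fake]
    w = b = 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    return w, b


def generator_loss(fake, w, b):
    """Mean of -log D(x) over simulated data: small when D is fooled."""
    return -sum(math.log(sigmoid(w * x + b) + 1e-12) for x in fake) / len(fake)


def fit_parameter(real, lo=0.0, hi=6.0, iters=40, n=100, seed=0):
    """Alternate discriminator training with random proposals for theta,
    keeping a proposal only if its simulations fool the discriminator more."""
    rng = random.Random(seed)
    theta = lo  # deliberately poor starting guess
    for it in range(iters):
        fake = simulate(theta, n, rng)
        w, b = train_discriminator(real, fake)
        step = max(0.05, 1.0 - it / iters)  # shrink proposals over time
        proposal = min(hi, max(lo, theta + rng.gauss(0.0, step)))
        fake_new = simulate(proposal, n, rng)
        if generator_loss(fake_new, w, b) < generator_loss(fake, w, b):
            theta = proposal
    return theta


if __name__ == "__main__":
    data_rng = random.Random(1)
    observed = simulate(3.0, 100, data_rng)  # pretend these are real data
    print("estimated theta:", fit_parameter(observed))
```

In this toy setting the parameter value left at the end of the loop plays the role of the estimate, which mirrors the abstract's statement that the final simulator provides estimates of model parameters. A realistic application would replace the Gaussian simulator with a demographic simulator and the scalar statistic with genotype data.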
Period of Award:
13 Feb 2023 - 12 Oct 2023
Value:
£76,699
Authorised funds only
NERC Reference:
NE/X009637/1
Grant Stage:
Completed
Scheme:
Standard Grant FEC
Grant Status:
Closed

This grant award has a total value of £76,699  

FDAB - Financial Details (Award breakdown by headings)

Indirect - Indirect Costs: £37,421
DA - Investigators: £5,938
DI - Staff: £23,353
DA - Estate Costs: £9,139
DA - Other Directly Allocated: £847