Modelling fertility in rural South Africa, Rob Eyre

Individuals living within the study area of the Agincourt HDSS. Image by A Khosa, courtesy of

One of our data scientists, Rob Eyre, recently published a paper in Emerging Themes in Epidemiology on modelling fertility in a poor rural region of South Africa using an innovative non-linear approach (the full paper can be found here).

A common issue throughout much of quantitative public health research is the application of standardised statistical methods even when they are not appropriate. Such methods often assume that the relationships being modelled are linear, an assumption that is frequently unjustified. One area where this is the case is the modelling of how fertility varies with demographic and socio-economic characteristics such as age, education, and social status.

A core aspect of the work we do here at Spectra Analytics involves using more modern, sophisticated, and well-thought-out methods that provide better results to our clients. In line with this, Rob’s research combined a non-linear parametric model of fertility over age with Gaussian process regression, a highly flexible semi-parametric machine learning method. The Gaussian process brings in further variables, such as socio-economic status, for which no established fertility pattern model exists.
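To give a flavour of how a parametric age curve and a Gaussian process can be combined, here is a minimal sketch in Python. It is not the model from the paper. The gamma-shaped `age_curve`, its parameter values, the socio-economic index `ses`, the multiplicative link between the two parts, and the data (which are entirely synthetic) are all assumptions made for illustration. The Gaussian process is written out by hand with a squared-exponential kernel and fixed hyperparameters so that the example needs only NumPy.

```python
import numpy as np

def age_curve(age, peak_rate=0.15, peak_age=24.0, shape=6.0):
    """Hypothetical gamma-shaped fertility rate over age (illustrative only)."""
    min_age = 12.0                              # assumed lower bound on childbearing age
    x = np.clip(age - min_age, 1e-9, None)      # years since min_age, kept positive
    mode = peak_age - min_age
    # Scaled so the curve reaches its maximum, peak_rate, exactly at peak_age.
    return peak_rate * (x / mode) ** shape * np.exp(shape * (1.0 - x / mode))

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior_mean(x_train, y_train, x_test, noise=0.1, **kernel_args):
    """Posterior mean of a zero-mean GP with Gaussian observation noise."""
    K = rbf_kernel(x_train, x_train, **kernel_args) + noise**2 * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train, **kernel_args)
    return K_star @ np.linalg.solve(K, y_train)

# Synthetic data: every number below is made up for the demonstration.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(15, 49, n)
ses = rng.uniform(0, 5, n)                      # hypothetical socio-economic index
true_effect = -0.3 * np.sin(ses)                # invented non-linear effect on the log rate
observed_rate = age_curve(age) * np.exp(true_effect + rng.normal(0, 0.05, n))

# Step 1: remove the parametric age pattern (here the curve is taken as known).
log_residual = np.log(observed_rate) - np.log(age_curve(age))

# Step 2: let the GP learn how the remaining variation depends on the index.
ses_grid = np.linspace(0, 5, 11)
ses_effect = gp_posterior_mean(ses, log_residual, ses_grid, noise=0.05, length_scale=1.0)
```

In this sketch the age curve's parameters are fixed rather than estimated, and the two parts are handled one after the other. A real analysis would estimate the curve parameters and the Gaussian process hyperparameters from the data, quite possibly jointly.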

Rob and his research colleagues – Thomas House of the University of Manchester, F. Xavier Gómez-Olivé of the Agincourt research unit in South Africa, and Frances Griffiths of the University of Warwick – successfully applied this method to data from the Agincourt Health and Socio-Demographic Surveillance System (HDSS), run by the Medical Research Council/University of the Witwatersrand Rural Public Health and Health Transitions (Agincourt) Research Unit. The HDSS is an annual census of a poor rural region of South Africa that collects information on births, deaths, migration, and many different aspects of health. The analysis produced more robust and reliable estimates of the fertility patterns within the Agincourt study area, free from unjustified assumptions of linearity.

The researchers hope this work will encourage others working in fertility modelling to look beyond standard methodology and be more thoughtful about what methods they use and the assumptions they make when using these methods.