A genetic study of proteins: a big step for open data sharing in drug development


By Isabel Rowbotham, MSc Translational Cardiovascular Medicine

Bristol University’s epidemiology unit has launched a new database containing a network of 989 plasma proteins on 225 human diseases in an initiative that could help reduce failures in drug development.

Failures are reduced by using genetic data to identify proteins involved in complex diseases which will help to prioritise drug targets.

The database was created by Dr Jie Zheng, Prof Tom Gaunt and many other researchers from the University's MRC Integrative Epidemiology Unit (MRC-IEU).

Over the years, clinical trials in drug development have resulted in a ~50 per cent failure rate. Failures in clinical trials occur when the tested drug doesn't show a desired result or any benefit, and adverse side effects are potentially dangerous and can cause a trial to pause.

This leads to costly delays in the drug development process, but what if we could use data to lower clinical trial failures? To achieve this, a protein target needs to be identified during the early stages to successfully progress through the clinical trial. To solve this problem Dr Jie Zheng explains how his study, published in Nature, uses big data and statistical analysis to identify the best candidates for drug development.

The research focuses on a method called mendelian-randomisation (MR). ‘We randomised people using their genes,’ explained Dr Zheng. This is based on the genetic differences between people. ‘During evolution a group of people acquired a genetic change or variation, while another group did not. These genetic variations will cause changes in the human body such as weight or blood pressure. MR uses people's genetic variation to group people into two, a drug group and placebo group and test if using the drug will reduce the disease risk.’

Comparison between randomised controlled trial (RCT) and the genetic approach “Mendelian randomization” (MR). MR has been described as nature’s RCT due to the random allocation of genetic variants | University of Bristol

In the past, obtaining data sets for such a project would take writing hundreds of emails. Now, openness to data sharing will allow researchers to streamline the study of proteins and test hundreds of diseases in a single study.

‘I’m really glad that the concept of open sciences is more widely being accepted in the field.’

The group is already in talks with pharmaceutical companies about new drug development.  Many companies are building their own analytical pipeline, where they have already been translated into real life drug target identification.

The current COVID-19 pandemic has revealed that the scientific community needs more powerful and faster tools in drug development to treat patients with the disease. At the moment, population health scientists and data scientists are making efforts to collect COVID-19 data through a nation-wide projects such as the UK Biobank.

‘The idea is to put all the data together to make a great genetic effort,’ explained Dr Zheng. Publicly available data of COVID-19 can allow scientists in Dr Zheng’s group to apply it to their drug development analytical pipeline.

A recent paper published in the New England Journal of Medicine, is already being considered by the group to further verify which protein is the best drug target for COVID-19 severity treatment.

emPOWER- The fusion of mankind and technology
Bristol research reveals how cancer puts breaks on the immune system

‘We hope that drug development can be more quick and more efficient. Data science in genetics will be able to prioritise targets which will be more likely to be successful for clinical trials. This will make the process much quicker and cheaper and save more people in the near future’.

This is a great opportunity for students in any background to become interested in data science. There is a lot of potential for young people to make changes in their career and learn about data science even when it is not part of your university curriculum.

‘The future is for big data and data science.’

Featured image: Epigram / Isabel Rowbotham

Would you want to learn more about data science?