Friday, 22 April 2016

Clinical data entry for the 100,000 Genomes Project

Next Wednesday I will sit down with a colleague from Genomics England (GEL) and undertake a data entry blitz on upwards of 50 families recruited to the 100,000 Genomes Project.  The data will be transcribed from the patients' paper records into an electronic database administered by GEL.  The pile of paper records is quite substantial (see illustration below).  

The size of the pile has focussed the mind somewhat, and I would like to argue in this article that data should be collected from patients in the clinic electronically, in a form compatible with other systems, including the database held by GEL for patients and families enrolled in the 100,000 Genomes Project.

The offending pile
Background on the 100,000 Genomes Project is available here and here.  The project is, in my view, an entirely worthwhile attempt to improve the availability of state-of-the-art genetic testing for patients and families affected by rare disorders.  The idea is to decrease the length of time it takes for these families to get a diagnosis. One such family is that of the individual I met last month, who has been trying, ever since the first of her three sons was born in 1967, to understand why all three are mentally handicapped.

The project has been funded by the UK government, but costs will be recouped, in effect by selling anonymized patient data, both clinical and genetic, to academic and industry partners.  The collection of good-quality clinical data, which are stored centrally by Genomics England, is therefore an integral part of the project.

When the people at GEL came up with the idea for the project, they obviously thought that the data entry issue was going to be more straightforward than it has actually turned out to be.  From their draft protocol written last year:

A secure, web-based, information system will be provided for the collection of this data, removing the need for any additional, bespoke development within participating NHS organisations. An electronic facility will be provided for transmission of the same data from existing information systems. Where organisations have developed their own capacity for capturing and managing the same data, to the same standards, there will be no requirement for additional data entry 
[From: Genomics England Protocol for 100,000 Genomes Project, v2, dated 16/01/2015, p. 14]

To try to be fair to Genomics England, when they wrote '...transmission of the same data from existing information systems' in their protocol, I don't think that 'existing information systems' referred to the stack of paper illustrated above, or that 'transmission' referred to manual transcription.  They thought that we were collecting data in electronic format and that the data could be sucked into their system with minimal pain for all concerned. Speaking for my specialty, Clinical Genetics, I am not aware that any centre is collecting clinical data in a form which can be straightforwardly captured for the purposes of the 100,000 Genomes Project. All data are being re-entered into web forms such as OpenClinica, either from other electronic but so far incompatible IT systems, or from paper records.

Clinicians are used to collecting data for NHS research projects. The problem in the 100,000 Genomes Project is that the data are being collected as if the project were research, whereas in fact the concept of the project as NHS transformation implies work which is carried out wholly within the arena of routine clinical practice.  The current data entry model is time-consuming and, in my view, a barrier to the success of the project, because clinicians are being asked to do data entry over and above the data collection which they routinely do as part of normal clinical care.

A couple of weeks ago I attended a meeting at Genomics England HQ at Queen Mary University of London. The meeting was attended by representatives from Genomics England, the UK Clinical Genetics community, and interested parties from industry.  The purpose of the meeting was, first, to draw the attention of Genomics England to the problem of duplicate data collection/entry, as outlined above; and, second, to try to identify a way forward.

The first goal, drawing GEL's attention to the problem, was fairly straightforward to accomplish.  The second will be more challenging.  As I have previously discussed (here and here), it will involve a transformation of the way in which Clinical Geneticists work.  IT leads for (most of) the 23 UK Clinical Genetics centres have been meeting for the last few years, with me as chair, to compare practice in this area and to try to help each other make progress towards an electronic patient record for Clinical Genetics. We have highlighted wide differences in the IT systems in use and in the rate of progress towards that goal.  The 100,000 Genomes Project has brought these issues into sharp relief, and this is one of several reasons why I think that it is a welcome initiative.

The potential value of having electronic rather than paper data collection for Clinical Genetics and the 100,000 Genomes Project cannot in my view be overstated.  The green files shown above are data 'silos'. They represent countless hours of careful collection of data which are then not available for other potentially good uses: service development, clinical audit, research and, yes, the 100,000 Genomes Project.  And because the data are in these silos, it doesn't really matter if they are collected in a non-standardized way using non-standard descriptive and diagnostic terminology, because no-one else is looking at them.

The best way of convincing sceptical colleagues, of whom there are some, is to come up with a prototype electronic patient record that looks convincing, and works.  This is a major task, but in my view a worthwhile one.  It might even get me out of dealing with the pile.

Charles Shaw-Smith is a Consultant Clinical Geneticist based at the Peninsula Regional Genetics Service in Exeter, Devon, UK, and Rare Disease Lead for the SouthWest Genomic Medicine Centre, 100,000 Genomes Project.