Wednesday 29 April 2015

Face2Gene and syndrome recognition by computers- photographs needed, try Facebook?

Face2Gene (F2G) is a new company which aims to use computer-based facial recognition to rapidly analyze large numbers of clinical photographs of children and adults with syndromes of genetic aetiology. The faces of individuals with such syndromes can be highly recognizable, and the ability to recognize them is a skill which can be learnt. This ability is not restricted to members of the medical profession, still less to Clinical Geneticists, and it appears that computers are good at it too.

The face of a child with Down syndrome is probably the most recognizable 'syndrome face'.  Here is another example, taken from a journal article: 


Reproduced from: Koolen D et al., Journal of Medical Genetics (2008) 45(11):710-720. The article is freely available.

The individuals shown have a condition called 17q21 microdeletion syndrome. Their resemblance to each other is (mostly) striking; indeed, they resemble each other far more than they resemble members of their own family, which is amazing when one considers how tiny is the genetic change resulting in the syndrome in question.

It is possible to study the pictures to gain an idea of which individual bears the closest resemblance to the 'most typical'.  I would go for numbers 7, 14 and 20, with numbers 1, 3 and 11 being perhaps the least typical.  This categorization is the kind of thing that Clinical Geneticists do and it has been immensely valuable in delineating new syndromes.  

Clinical Geneticists usually take photographs of undiagnosed patients attending their clinics, with consent, and use them as an aid to diagnosis in the way outlined above. Although Clinical Geneticists take and store photographs of these children, they do not ultimately own them: the parents do and, if they have capacity, so ultimately do the affected individuals themselves.

F2G have created an app for use by Clinical Geneticists, in which clinical photographs of patients with undiagnosed conditions are uploaded to their website and analyzed by computer to generate a list of possible 'syndrome matches'.  Their motivation in doing this is not purely an altruistic one of helping Clinical Geneticists diagnose their patients, but also one of making the data available to pharmaceutical companies who are developing therapies to treat children with rare disorders.

On first trying out the app, I felt both excitement and disappointment at once. Excitement because I think that this has the potential to be very useful for my specialty of Clinical Genetics; disappointment because I was immediately struck that something seemed to be missing: immediate access to a well-catalogued library of clinical photographs.

As the Clinical Geneticists on the advisory board of F2G will perhaps have advised, access to photos of patients with known, diagnosed and molecularly confirmed syndromes is a powerful training resource for members of our specialty. Given that Clinical Geneticists spend a lot of time looking at facial photographs, they need good resources for training in this difficult skill, a skill which will still be needed even when the patient's entire genome sequence is available (see last month's post). I believe that F2G could, and should, help to provide such a resource.

We are told that F2G has a database of over 150 000 photos, clearly a very extensive resource.  Why are these photos not made available to Clinical Geneticists through the F2G app as a training resource?  Presumably it is because they do not have consent to release these photographs in this way. A pity.


Of course, there are plenty of training resources available to Clinical Geneticists already, including many textbooks and electronic databases, such as the London Dysmorphology database. The latter is comprehensive and useful, but does suffer from some disadvantages, which the authors themselves would, I am sure, concede. First, it cannot make any photo available without consent, either directly from the patient or from the journals which own the copyright and are therefore authorized to make them available to LMD; the number of photographs is therefore limited. Second, they are not always of the best quality. Third, they are often a mix of different types of image: different parts of the body, radiographs, MRI images and so on.

In order to increase their resource still further, F2G have embarked on a quest to ask Clinical Geneticists to make clinical photographs of patients with molecularly confirmed syndrome diagnoses available to them.  Again, as I understand it, this is on the basis that the photographs will not be made available to the Clinical Genetics community, but will simply be 'read' by the face recognition software and thus used only as an aid to machine learning.  

More photos are surely better.  It must be a basic principle of computer-aided recognition that a machine will learn more from 1000 photos than from 10 or 100. Increasing the amount of data should improve recognition accuracy and might even enable detection of different 'subtypes' of face within a given syndrome; or allow for discovery of which part of the face is most characteristic.  It should also in theory allow for better correlation between genotype and facial phenotype, if such correlations exist.
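The "more photos are better" principle can be illustrated with a toy simulation. This is only a sketch under loudly stated assumptions: it has nothing to do with how F2G's software actually works. Each "photo" is a made-up list of 20 numbers standing in for facial measurements, two invented syndromes differ slightly on every measurement, and the "machine" is a simple nearest-centroid classifier. The names `DIM`, `SHIFT`, `make_photo` and `learning_curve` are all hypothetical.

```python
import random

DIM = 20     # assumed number of made-up "facial measurements" per photo
SHIFT = 0.5  # assumed average difference between the two syndromes per measurement


def make_photo(label, rng):
    """Return one synthetic feature vector for invented syndrome 0 or 1."""
    return [rng.gauss(label * SHIFT, 1.0) for _ in range(DIM)]


def centroid(vectors):
    """Average the training vectors to get a 'typical face' for one syndrome."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(DIM)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def accuracy(total_photos, rng, n_test=200):
    """Train on total_photos (half per syndrome), then test on fresh photos."""
    per_class = total_photos // 2
    centroids = [
        centroid([make_photo(label, rng) for _ in range(per_class)])
        for label in (0, 1)
    ]
    correct = 0
    for label in (0, 1):
        for _ in range(n_test):
            photo = make_photo(label, rng)
            # Guess whichever syndrome's typical face is closer.
            nearer_to_0 = sq_dist(photo, centroids[0]) <= sq_dist(photo, centroids[1])
            guess = 0 if nearer_to_0 else 1
            correct += guess == label
    return correct / (2 * n_test)


def learning_curve(sizes=(10, 100, 1000), trials=20, seed=0):
    """Mean test accuracy for each training-set size, averaged over trials."""
    rng = random.Random(seed)
    return {n: sum(accuracy(n, rng) for _ in range(trials)) / trials for n in sizes}


if __name__ == "__main__":
    for n, acc in learning_curve().items():
        print(f"{n:>5} training photos: mean accuracy {acc:.2f}")
```

In this toy setting the classifier trained on 1000 photos should do noticeably better than the one trained on 10, because the "typical face" for each syndrome is estimated with less noise. Real faces, real syndromes and real software are of course far more complicated.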

I don't think that F2G will get much engagement from Clinical Geneticists in terms of gaining access to clinical photographs unless there is some reciprocal benefit. It would make sense if photographs from archives of individual Geneticists could be made available as a public resource for training purposes as outlined above. Each Geneticist must have dozens if not hundreds of such photographs, which, if pooled on a national or international basis, would constitute an amazing resource for practising clinicians.

Sadly, making such a resource available is not in prospect because Clinical Geneticists are not authorized to release their photographs in this way: they don't have consent from the individuals and families concerned. They might possibly be able to allow their archived photographs to be read by computer to promote machine learning, but it's not clear what the trade-off for them in doing this would be.

To increase access to photos by Clinical Geneticists, here is an idea. We have all been noting the growth of the involvement of patients and families in social networking sites, especially Facebook. I have been a member of a couple of sites dedicated to specific syndromes for a while, and I have noted that they attract decent numbers of followers, worldwide. The love for and dedication to their children that the parents have is plain for all to see. They are a self-selected group of course but their willingness to share information to help each other is clear. Research projects into the conditions from which their children suffer are highlighted in detail. Photographs are shared: I was struck by one post in which a parent put up a photo of her affected child and asked other group members to add photos because she wasn't sure what individuals with that syndrome were supposed to look like. About 50 photos were added.

Why not cut out the middlemen (Clinical Geneticists) and go straight to the people who can decide about these photographs: the families themselves? Would they be willing to share photographs of their children with Face2Gene? If evidence of potential benefit to them could be provided, then I think that the answer would be ‘yes’. It would be incumbent on Face2Gene to explain what this benefit would be, but if what they are saying about their business plan is accurate, they should be able to do that.

These families are used to uploading photographs of their children to websites. They have done it enough times on Facebook, and, if they thought that it would help their child or other children with the same disorder in the future, then they could easily do it on Face2Gene as well. The Facebook group as a whole could be offered a financial inducement (why not?). In this way, a larger number of consented photos than Clinical Geneticists could ever provide would be made available, and patients and families would have been directly engaged, which is always a good thing.

If it became known that the support groups on Facebook were becoming actively engaged with clinical, research and commercial partners, then this would improve their standing and significance and lead to better engagement of affected families: the support groups would grow. There would presumably be a clear win for Face2Gene as they would get the photos they wanted, meaning that their computers would have improved opportunities for learning. For clinicians, there would be a gain in terms of improvement in educational/training resources, as the photographs would now have the level of consent required to make them publicly available.


With all this training and learning happening, I'd be interested to see whether computers could 'beat' Clinical Geneticists at syndrome recognition.  I wouldn't bet against them.





Tuesday 24 March 2015

100 000 Genomes Project: much transformation is in prospect, and not just of the NHS

The 100 000 Genomes Project is a ground-breaking study which will bring the power of new genetic technologies to NHS patients.  Eleven Genomic Medicine Centres (GMCs) from around England will be (or are already) recruiting patients with either a cancer or a rare genetic disorder.  These patients will go on to have their entire genomes sequenced.  The first patient was recruited to the study earlier this month.

A company called Genomics England, owned by the Department of Health, has been set up to run the project.  To run alongside it, a Masters programme in Genomic Medicine is being set up  in 6-8 sites in England.  One of the participating Genomic Medicine Centres (SouthWest Peninsula) is our very own Royal Devon and Exeter NHS Foundation Trust.   

The goal of the project is to transform the NHS, bringing genetics to mainstream medicine, allowing for the development of new therapies for cancer and rare disorders. The term for this is "precision medicine". Genetic information has the capability to predict which individuals will respond best to which cancer therapeutic agent, which drug side effects they may be at risk of developing, and the sub-type of rare genetic disorder which they have. 

Patients who are eligible to participate in this project are those with either a recent diagnosis of one of a defined set of cancers (for example, breast cancer or lung cancer; there are several others) or a rare disorder (examples: a congenital malformation, a rare syndrome with learning disability, or a rare neurological disorder). There are literally thousands of rare disorders, individually very rare, but collectively common - Rare Disease UK found that 1 in 17 individuals in the UK is affected by a rare disorder. Participating GMCs are able to 'nominate' specific rare disorders which Genomics England will then consider and, if appropriate, approve.

There are valid reasons for scepticism about the project.  Here are a few:
  1. The issue of consent is a complex one, made more so by the fact that patients will be asked whether or not they wish to receive 'secondary' (also called 'unlooked for') findings. These findings relate to their future health but are unrelated to the medical reason for which they were enrolled in the project, for example, a fault in a gene conferring susceptibility to breast and ovarian cancer identified in a child with a rare neurological disorder.
  2. The IT infrastructure requirement for the project is significant and will need considerable investment of time, energy and money. Previous efforts to reform NHS IT do not necessarily give grounds for optimism.
  3. There are issues regarding data security.  Patients who enrol in the project will be asked to consent to release of clinical information and genomic data to third parties, which will include commercial as well as academic groups, for the purposes of research and development.  This idea also has some tricky past history, still far from resolved.

Despite the problems and challenges, which are real, I have detected a keen sense of anticipation, even excitement, at the meetings which I have attended. For example, SouthWest Peninsula GMC leads met recently in Exeter with a group of paediatricians from our hospital, in order to inform them about the project and invite them to collaborate. They were enthusiastic! Why? Maybe because, despite the challenges, they can see the benefits. This recent article in a Clinical Genetics journal explains some of them.

At yet another recent meeting, this time of clinicians and laboratory scientists, we had a talk from a bioinformatician. These people have the job of converting the raw genome sequence into something that is accessible to and usable by the clinician. I asked him, I thought playfully, whether or not there was in his view a prospect that people could in the future download their genome sequence onto their mobile phones and analyse it on an 'app' ("MyGenome"?). He couldn't see an issue with the genome download, and no doubt the clever people at Google, Apple and elsewhere are in the process of developing the app right now. As technical challenges go, I wouldn't have thought it would be harder than some of the other stuff they do: driverless cars, for example.

In the 100 000 Genomes Project, participants can request to receive their own raw genome data in addition to any results.  As at 2015, the process for this has not yet been defined by Genomics England, but there is no reason to think that it shouldn't be possible, even though there will probably have to be a payment.

Outside the 100 000  Genomes Project, the more interesting question is to ask at which point people from the population at large will start to want to have access to their own genome data.  As always, a key element of this question (not the only one, of course) is cost.  Here is a graph of the cost of sequencing a genome over time, to the present, taken from this source:

[Figure: graph of human genome sequencing costs over time]

Once the cost of a whole genome sequence falls to a value which is accessible to the private citizen, and the technology to look at it is developed into a user-friendly interface, then the private citizen is likely to look carefully at the value of knowing this information.  Yes, of course there are millions of variants of uncertain or low significance.  But it is the handful of variants of very well known, and extremely adverse, clinical significance, that people could possibly want to know about.  

If people could find out very easily, and for an essentially trivial cost, that they were at, say, 80% risk of developing breast or bowel cancer in their lifetime; or that, as a couple, they were at high risk of having a child with a pre- or post-natally lethal disorder, would they ignore this opportunity?  I wonder.

Views expressed in this article are my own and not those of Peninsula Regional Genetics Service, Royal Devon and Exeter Hospital, SouthWest Peninsula Genomic Medicine Centre, Genomics England or any other public body.