Paula Dumbill, Partner at Browne Jacobson, specialises in intellectual property and technology law. Here she writes about how data should be used to harness the power of artificial intelligence (AI).

AI is the driving force behind many innovations and improvements in the health sector, making faster and more accurate diagnoses and predicting health issues and outcomes so that they can be addressed proactively and more effectively.

It’s not just about the amount of data

We also know that AI’s success depends on data, data and more data. But it is not just the amount of data that matters; the quality of the data is equally important. AI (or rather, machine learning) trains itself by analysing data sets and learning from its mistakes, and that data could be inadvertently flawed, biased or incomplete. Sometimes the way AI is applied can lead to unintended results, some of which can be quite amusing, such as the report of an AI “agent” teaching itself to play an Atari game which, in order to avoid losing a life in level two, kills itself at the end of level one. The logic is undeniable, but not particularly useful.

In another example, the potential for flawed data to affect real-life diagnoses is clear. An AI system was trained to identify skin cancer using a database of 129,000 images of benign and malignant tumours. The system identified a correlation between malignant tumours and images containing a ruler. It turned out that lesions identified by dermatologists as a cause for concern were often photographed with a ruler to measure their size. The algorithm had found a genuine statistical link, but had no way of knowing that it was clinically meaningless.
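The ruler problem can be illustrated with a toy sketch. The data below is entirely invented for illustration: a spurious feature (was a ruler in the photo?) that happens to track the diagnosis perfectly will look better to a naive correlation-based learner than a genuine clinical feature.

```python
# Hypothetical toy data -- each record is
# (has_ruler, irregular_border, label). The "has_ruler" feature tracks
# the label perfectly only because dermatologists photographed
# suspicious lesions next to a ruler.
data = [
    (1, 1, "malignant"), (1, 0, "malignant"), (1, 1, "malignant"),
    (0, 0, "benign"),    (0, 1, "benign"),    (0, 0, "benign"),
]

def accuracy_if_predicting_from(feature_index):
    """Accuracy of the naive rule: feature present -> predict malignant."""
    correct = sum(
        (rec[feature_index] == 1) == (rec[2] == "malignant") for rec in data
    )
    return correct / len(data)

print(accuracy_if_predicting_from(0))  # ruler feature: 1.0 -- looks "perfect"
print(accuracy_if_predicting_from(1))  # clinical feature: about 0.67
```

On this (artificial) data the ruler appears to be the best possible predictor, which is exactly the trap the real system fell into: the training set, not the algorithm, encoded the dermatologists' photographing habits.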

More and better data has to come from somewhere. We are becoming used to a world where our personal data is valuable; sometimes more valuable to hackers than credit card data. The demand for personal data is growing so fast that it can be difficult to judge the potential consequences, if any, of giving it away. How many of us hesitate when asked to give consent for our data to be shared when accessing a website? We may instinctively feel uncomfortable giving consent without understanding what we are agreeing to, but often our desire to access the website or use the services takes priority and we give our consent regardless.

Personalisation is key

Large amounts of personal data are the key to realising the benefits of personalised digital care and well-trained AI systems. Many of us already use fitness trackers and specific healthcare apps. The latest technology aims to combine information about the genome with information about a person’s environment and behaviour to improve health outcomes. For example, one app estimates a person’s age, height, weight and gender from a selfie. Information about current and previous addresses is used to calculate potential exposure to pollution and, in future, could include levels of disease, mosquito counts, etc. The app is granted access to all the person’s online health records, lab records, fitness tracker and any other smart devices and apps. This comprehensive set of data is used to create personalised health insights that could transform the approach to that individual’s healthcare.

A key question is how comfortable we are with the harvesting of that data and, equally, how much we trust the organisations harvesting it.

In 2017, a much-vaunted partnership between Google’s DeepMind and the London Royal Free Hospital ended up in trouble with the Information Commissioner’s Office (ICO) when patient identifiable data from 1.6 million patients was handed to DeepMind without the patients’ knowledge or consent. Assurances were given at the time that Google would not be able to access that data but, since then, DeepMind has been absorbed into Google Health. The London Royal Free, along with four other NHS trusts, has agreed to transfer its data processing agreements with DeepMind over to Google.

Had they been asked to give consent, would those 1.6 million patients have agreed to give their personal data to Google? Given the promise of breakthroughs in healthcare, and the widespread lack of understanding of how that data might be used … well, perhaps they might have done.