While the data about secondary and community health brought together by NHS Digital through the Secondary Uses Service can be very useful, the limitations of these datasets (which include the Hospital Episode Statistics database, Emergency Care Data Set and Community Services Data Set) have been acknowledged. They lack detailed clinical information, for example, and have problems with data completeness, largely because their primary purpose is payment and activity monitoring.
There are increasing efforts to link data at a population level, for example:
Successfully linking data also means addressing legal, ethical and regulatory challenges, which can be resource intensive. For example, bringing together data from primary care can require data-sharing agreements to be put in place with each GP practice. This effort is often repeated to use the data for different purposes. Such measures are necessary to protect privacy, and (along with technical solutions to protecting privacy) are important for ensuring (and demonstrating) the trustworthiness of data use, but could be made easier to navigate to enable faster progress.
Even if joined up, the data collected and held by the NHS give only a partial picture of a patient’s health and wellbeing. Reasons for this include:
- Routine data does not include those who do not access health services. Unequal access to health services means the experiences of some groups (for example, patients from some minority ethnic backgrounds) are less well represented in datasets relative to their level of need. Those missing may be those who have the greatest health need.
- Routine NHS data lacks information about a patient’s health in between interactions with the health service. This information could help better understand drivers of ill-health and enable more preventative care.
- Data quality is important, and is often an issue. Data quality can be affected by a number of factors, including choices made by clinicians or clinical coders when entering data, decisions about which data are mandatory to collect (coding of ethnicity data is known to be a problem, for example) and the systems used to record this data.
- The purpose for which data are collected shapes what gets recorded. For example, outpatient administrative data frequently lacks information about diagnosis, because the primary purpose of this data is operational (including ensuring hospitals get paid and managing resources), for which diagnosis is not always required. Even data collected from clinical encounters omits potentially important information about wider patient experience, priorities and symptoms.
Proposed approach: improve underlying infrastructure for data and technology
Strong foundations and good technical infrastructure are essential in order to provide high-quality, timely data for service improvement, research and innovation.
Actions that can help to achieve this, at the national level, include:
- developing technical infrastructure and data standards to address data fragmentation (eg through data linkage and federated analytics) and improve timeliness of data collection and access
- improving data quality and coverage, including addressing sources of bias that might impact health inequalities.
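As an illustration of what federated analytics means in practice, the following is a minimal sketch in Python, not a description of any NHS system. Each site computes an aggregate summary locally and releases only that summary, so patient-level rows never leave the site. All names here (SiteSummary, summarise_locally, pooled_mean) and the small-number suppression threshold of 5 are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate statistics a site releases instead of patient-level rows."""
    site: str
    n: int
    total: float


# Hypothetical disclosure-control threshold: sites with fewer records
# than this release nothing at all.
MIN_COUNT = 5


def summarise_locally(site: str, values: Iterable[float]) -> Optional[SiteSummary]:
    """Run inside a site: reduce local records to a count and a sum."""
    local = list(values)
    if len(local) < MIN_COUNT:
        return None  # suppress small numbers rather than risk disclosure
    return SiteSummary(site=site, n=len(local), total=float(sum(local)))


def pooled_mean(summaries: Iterable[Optional[SiteSummary]]) -> float:
    """Run centrally: combine released summaries into one pooled mean."""
    released = [s for s in summaries if s is not None]
    n = sum(s.n for s in released)
    if n == 0:
        raise ValueError("no site released a summary")
    return sum(s.total for s in released) / n


# Made-up lengths of stay (in days) held at three separate sites.
site_a = summarise_locally("A", [3, 5, 4, 6, 2, 4])
site_b = summarise_locally("B", [7, 8, 6, 9, 10])
site_c = summarise_locally("C", [1, 2])  # too few records, so suppressed

print(round(pooled_mean([site_a, site_b, site_c]), 2))
```

The central step only ever sees counts and sums. Real federated platforms add governance, authentication and stronger disclosure controls that this sketch leaves out.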
Data collected, and held, beyond the NHS can also be very useful. This ‘health-relevant’ data includes that held by the wider public sector (such as housing data held by local authorities, and data about employment), and citizen-generated data held by the private sector (eg data from fitness trackers, social media and retail). Exploring these sources could help create a more complete picture of individual and population health and wellbeing, and of the wider determinants of health. There are many challenges in ensuring that such data sharing and use is legal, ethical and publicly acceptable, but efforts such as the Open Life Data Framework and work by the Wellcome Trust seek to demonstrate how this can be done. While the most pressing need is to ensure the NHS can make best use of the data generated within the system, it is important that we continue to explore the potential of health-relevant data over the medium term.
Issues with data are only part of the problem. The NHS also has an under-developed approach to data science and data-driven innovation. Here we consider the range of factors that contribute to this, and explore possible solutions.
Barrier 2: An underutilised analytical workforce
The NHS employs around 10,000 data professionals and analysts, but has struggled to develop and deploy portable data-driven innovations across the NHS. Our previous report (Untapped potential) found that analysts lack opportunities for training and professional development, and need better access to the software tools required to derive insights from data and to develop prototypes of data-driven solutions. The NHS also lacks an established culture of open analytics in which solutions developed in one part of the NHS are routinely shared with other parts.
Proposed approach: develop the analytical workforce and wider analytical capability
Developing such capability requires action at multiple levels – and we have previously set out a detailed set of actions for addressing the shortfall in analytical capability. At the national and organisational level, actions include:
- providing the analytical workforce with the skills, recognition, professional status, leadership and support needed to make a difference
- investing in the software tools the analytical workforce needs.
Among the analytical community, it means committing to open analytics and collaboration – to share skills and knowledge, and support the development and scaling of tools across the NHS.
Barrier 3: A historic focus on the wrong problems
Data science and data-driven innovation aren’t necessarily focused on the right problems. Too much time and infrastructure is devoted to reactive and routine tasks, such as performance reporting, that could be streamlined and automated to free up capacity to innovate. Our work has highlighted that analytics teams can be siloed, while managers may not understand how data science can help address current or future challenges. In combination with underdeveloped analytical leadership, this means NHS analysts can often be focused on the wrong problems (this applies to innovations developed in industry as well).
Proposed approach: focus on data-driven innovation as a service
Retuning the focus of data science within the NHS will help shift attention to the most relevant problems. This should involve routine deployment of open-source innovations developed in collaboration with end-users – the Royal College of Paediatrics and Child Health’s growth charts project has shown how this can be done.
To make this happen, it’s important to support NHS managers (at a national and organisational level) to be good commissioners and consumers of data-driven insights. Collaboration could also be improved (at a national and local level) between data professionals and those leading, delivering and transforming health services. This will ensure that data-driven innovations:
- address the needs of end users – whether they be clinicians, senior managers, national policymakers, patients or carers (or a combination)
- address NHS priorities – whether the innovations are developed within the NHS or by industry partners.
Among the analytical community, analytical teams could also take a product-based approach to health data science. Product owners could work with analysts, service planners and clinicians to map problems, and develop and deploy solutions using agile methodologies.
Barrier 4: Challenges in scaling up quickly and well
The NHS faces multiple challenges in rolling out promising data-driven innovations at pace and at scale. When tools are successfully developed, too few are deployed into routine use across the NHS. At a local level, this is in part because analytical platforms aren’t connected, or otherwise set up to allow these tools to be deployed at scale. For example, integrating tools into electronic health records (EHRs) is difficult and expensive. For more advanced tools, especially those using artificial intelligence and with a direct clinical use, there are also challenges around regulation, monitoring and evaluation to support safe adoption at scale. Challenges common to deploying other kinds of innovation in the NHS are also relevant here.
Proposed approach: build better implementation infrastructure
To successfully scale up the best data-driven innovations in the NHS, better implementation infrastructure (both technological and organisational) is essential. This must be accompanied by effective regulation, monitoring and evaluation to ensure safety and equity, and build confidence among health care professionals and the public.
What’s needed at a national level is to:
- develop new approaches to actively monitor and evaluate data-driven innovations, to manage any uncertainty and minimise risk
- ensure the regulatory environment is easy to navigate for innovators but mitigates potential harms.
Barrier 5: Risks to safety, outcomes, health inequalities and public trust
There are risks if any drive to increase the use of data and data science in the NHS is not supported by a responsible approach to innovation. Beyond the risks to safety and outcomes that regulation, monitoring and evaluation could address, there are also potential risks to health inequalities, through bias in data-driven innovations, and to public trust if data is misused, or privacy breached.
Proposed approach: foster a responsible innovation approach
To ensure everyone’s health care will benefit, a responsible approach to innovation is crucial. To begin with, diversity must be improved across the data and technology workforce at all levels.
Furthermore, at a national and organisational level, we must:
- involve patients and the public in setting expectations and rules for data collection, sharing and use
- ensure transparency in how data are used, and develop approaches to privacy-preserving data science (including through federated analytics and trusted research environments)
- provide transparency about who can access data and for what purpose, what the benefits are, and what options patients have
- develop approaches to understand, measure and mitigate possible bias and impact on inequalities as part of data-driven innovation, at all steps in the process. The Health Foundation is working on this through partnerships with the NHS AI Lab and the Ada Lovelace Institute.
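One simple way to start measuring possible bias is to compare how well a tool performs for different groups. The sketch below is an illustrative assumption rather than a method described in this report: it computes sensitivity (the share of patients with a condition whom a tool correctly flags) for each group, and the largest gap between groups. The function names, the record layout and the data are all made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (group, has_condition, flagged_by_tool)
Record = Tuple[str, bool, bool]


def sensitivity_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """Share of true cases the tool flagged, calculated per group."""
    positives: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for group, has_condition, was_flagged in records:
        if has_condition:
            positives[group] += 1
            if was_flagged:
                flagged[group] += 1
    # Groups with no true cases are left out rather than dividing by zero.
    return {group: flagged[group] / positives[group] for group in positives}


def largest_gap(rates: Dict[str, float]) -> float:
    """Difference between the best-served and worst-served group."""
    if not rates:
        raise ValueError("no groups with true cases to compare")
    return max(rates.values()) - min(rates.values())


# Made-up evaluation data for two groups.
records = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", True, True), ("group_a", True, False),
    ("group_a", False, False),
    ("group_b", True, True), ("group_b", True, True),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", False, True),
]

rates = sensitivity_by_group(records)
print(rates, largest_gap(rates))
```

A gap like this is a prompt for investigation, not proof of bias on its own. Small groups give unstable estimates, and a fuller assessment would look at other measures and at every step in the process, as the action above suggests.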