Big medical data analytics is a new and unique opportunity for national health systems to reduce costs and improve population health management. The processing of vast amounts of medical histories from electronic patient records can provide researchers, clinicians, policy makers and private health companies with invaluable insights into all aspects of health and illness. New treatments, medication regimens and medical technologies can then be developed based on more accurate cost/benefit analyses. Importantly, it turns national health systems into engines of economic growth. The European Commission is actively promoting a ‘Digital Agenda for Europe’, where more ‘Open (Government) Data’ will support and accelerate the development of ‘A Thriving Data-Driven Economy’. However, the European Union Agency for Fundamental Rights is working to address the social, legal and ethical implications of surveillance activities and data protection mishaps, particularly for personal health information.
The aim of the BIMEDA project is to elaborate a theoretical framework for critically analysing the social, technical and ethical challenges of big medical data analytics. It does so by mapping the data protection controversy of the Care.data programme and by qualitatively studying organisations that perform big primary care data analytics, as well as GPs and citizens who decide to opt out of big medical databases. The project brings together a talented researcher with a background in qualitative research on the implementation and use of clinical information systems from a Science and Technology Studies perspective and an internationally renowned host institution (University of Nottingham) in big primary care data analytics, healthcare organisations and (Horizon) Digital Economy research. Together they aim to identify and clarify, for policy-makers and the public, the possibilities, limitations, assumptions and biases in research, knowledge production and ethical conduct.
Big medical data offers new opportunities and conditions of ‘learning’. Healthcare professionals can use the data they already collect about their patients and services to make better clinical and managerial decisions, without having to wait for the results of other studies before applying new knowledge. In Europe, several EC and EPSRC funded projects are already demonstrating the power of big, integrated data repositories in developing ICT platforms for reducing unhealthy lifestyles (daphne-fp7.eu), sharing electronic medical histories (linked2safety-project.eu), managing traumatic brain injuries (tbicare.eu), personalising medicine in paediatrics (md-paedigree.eu, health-e-child.org, sim-e-child.org) and improving the monitoring of patients after kidney transplantation.
However, big data in healthcare is a relatively new field of scientific research. While it holds great promise to transform healthcare and benefit patients, it has not yet demonstrated cost-benefit value. Actual big datasets are anything but comprehensive, and are even more prone to statistical errors. We cannot overcome sampling bias by simply sampling everything, particularly as statisticians are still scrambling to develop new methods for big data analytics.
Importantly, several technical, social, privacy and consent challenges remain. These must be addressed before big data analytics can fully influence health and social care, particularly via national and federal regulations around research on human subjects and the protection of health information.
Develop a conceptual framework for critically analysing the social, technical and ethical challenges in big medical data analytics
Map the controversy around open medical data and informed consent of the Care.data programme in England
Identify the technical complexities and challenges in maintaining big primary care datasets (e.g. data upload, data accuracy, data processing, computational infrastructure)
Identify the complexities and challenges in sustaining anonymity of electronic patient records for big data analytics (e.g. digital anonymisation)
Identify the complexities and challenges in performing big primary care data analytics (e.g. computational power and skills, statistical processing)
Identify the complexities and challenges in integrating big datasets with national databases (i.e. Care.data)
Identify the social and technical complexities and challenges for General Practitioners (GPs) in obtaining informed consent from individual patients to upload their medical records to big databases
Identify reasons for patients opting out from big medical datasets
Comparative analysis of the scientific, policy and popular media literature on the social, technical and ethical challenges in big medical data analytics in England, Europe and the US (e.g. data accessibility, data accuracy, data relationality, research automation, correlations, data context and interpretation, anonymity and informed consent).
Use of digital methods for information visualisation (risk cartography, interactive map, scientometric analysis)
Understand the nature and the challenges of processing primary care data (e.g. medical terminology, notes summarising, quality data, quality coding, analysis and interpretation)
Develop advanced ethnographic skills in participant observations, interviews and textual analysis
Explore the economic and social value of big medical data (e.g. new skills, products and services, quality and efficiency gains)
Understand the role and concerns of patient advisory groups regarding medical confidentiality in big datasets
Grasp the overall technical, social, ethical and cultural challenges and implications of big medical data analytics
Develop transferable skills in project, financial and organisational management and advance transferable skills in communication and dissemination of research results to scientific and lay audiences
While (predictive) big data analytics hold great promise for population health management, several technical, social, ethical and legal challenges must be addressed before they can fully influence health and social care.
The application of ethnographic methods from a Science & Technology Studies perspective, particularly Actor-Network Theory (ANT), is essential to investigate the aspirations and motivations, as well as the concerns, of organisations involved in the development and processing of big medical datasets.
Ethnography, as a method of studying and learning about groups of people in their own environment, rather than looking for small sets of variables across a large number of people, can lead the empirical researcher to results that are scientific and that uncover complexities and unintended consequences, particularly of technology implementation and use, as part of health IT projects’ evaluations.
ANT offers a researcher a set of methodological principles for more detached, objective, value-free but also rich descriptions of rationales behind technological innovations as they attempt to become part of everyday life. It specifically focuses on three issues: (a) the network of the users and the technology, (b) the content of the technology, and (c) the trajectory of the technology’s historical development. It is also informed by the concept of translation to emphasise the focus on tracing and describing the process by which a set of actors (research subjects) attempt to form a strong network of use around that particular technology. This process, during which actors, interactions, interests, controversies, negotiations, associations and behaviours are identified, resolved and agreed, is the fundamental focus of an ANT researcher who strives to understand how power, a theory or truth is performed and, then, sustained through these durable networks. For this, the ANT researcher also has to adopt three methodological principles. The first deals with the nature of the actor, which can be either human (users) or non-human (technologies), asserting that this is irrelevant to the inquiry (agnosticism).
Secondly, for ANT human and non-human actors should be treated equally and described using the same language (symmetry). Lastly, no prior distinctions between human and non-human actors should exist (free association).
The BIMEDA research project will take the controversy of the Care.data programme in England as a case study. Using participant observations and interviews, it will ethnographically study the work of two organisations (academic and commercial) that collaborate in big primary care data analytics, so as to develop a first-hand, contextualised and naturalistic description of the aforementioned challenges, limitations and hyped views. It will also examine the understanding and motivations of patients who decide to opt out of having their medical data included in these big datasets. For this, the fellow intends to work on two levels: conceptual framework and ethnography.
Search for, collect and comparatively analyse the scientific, policy and popular media literature in England, Europe and the US on big medical data analytics.
Investigation of the controversy around the Care.data programme in England, which has been delayed due to issues of privacy and informed consent.
Use of Bruno Latour’s cartography of controversies method, which provides a novel set of techniques for the practical exploration and user-friendly visualisation of contemporary socio-technical debates, using various digital toolkits for mapping, timelining, scientometric analysis and visualisation of information (risk cartography, interactive map, scientometric analysis).
Participant observations so as to develop an understanding of the daily activities and the complexities involved in big data analytics. The fellow will observe a small group of managerial, scientific, advisory and technical staff and record their activities in field notes as they unfold.
(Semi-structured) interviews with:
managerial, scientific, advisory and technical staff working with big medical data
GPs who use a specific GP system and upload patients’ data to big databases but have also come across patients who have decided to opt out of having their health data uploaded to these databases
Citizens who have decided to opt out from having their health data uploaded in these databases
This approach (observations and interviews) will provide the fellow with the means to triangulate the data and minimise the effects of the researcher’s inevitable influence on the participants’ behaviours in the field. Through member checking (participants’ feedback on observations), the fellow will ensure the accuracy of the depictions and conclusions with regard to participants' experiences. For the data analysis, an iterative process of coding and thematic analysis will be followed using CAQDAS software (QSR NVivo). Recurrent themes and patterns will be identified and descriptive categories will be developed for the emerging themes. After coding, the analysis will engage with the theoretical concepts of ANT.
The approach (ANT) and the methods (interviews) applied here have been used in the past as an innovative and interdisciplinary methodology for studying the implementation process of clinical information systems. Specifically, ANT has been providing researchers with a unique lens to study the deployment of new technologies. It does so by treating deployment as a translation: certain actors interpret or appropriate the interests of other actors according to their own interests. This process, mechanism or result is then inscribed in the non-human actors (how technologies operate). By tracing and describing these acts, researchers can then understand how particular technologies arrive and proliferate.
WORK PACKAGE 1: CONCEPTUAL FRAMEWORK (CF), MONTHS 1-5
Objective: To develop a conceptual framework for critically analysing social, technical and ethical challenges of big medical data analytics based on the scientific, policy and popular media literature.
Tasks: Search for scientific literature in established databases (e.g. PubMed), for relevant policy documents in England, Europe and the US, and for results from relevant EU funded projects. From there, the fellow will conduct the cartography of the controversy around the Care.data programme in England with the use of digital methods (risk cartography, interactive map, scientometric analysis) and a comparative investigation on scientific, policy and popular media literature in England, Europe and the US about the social, technical and ethical challenges of big medical data analytics.
Deliverable 1: A conceptual framework for critically analysing social, technical and ethical challenges from big medical data analytics (data accessibility, data accuracy, data relationality, research automation, correlations, data context and interpretation, anonymity and informed consent), also to be used as a guide for the ethnographic fieldwork, by month 5.
WORK PACKAGE 2: TRAINING (TR), MONTHS 1-6
Objective: To develop an understanding of primary care data (e.g. medical terminology, notes summarising, quality data, quality coding, analysis and interpretation) and gain additional skills in ethnographic data collection (i.e. participant observations, textual analysis).
Tasks: Complete training in primary care data for researchers and in advanced ethnography at the host institution.
Deliverable 2: Study design (clarification and specification of research questions and objectives, identification of key informants, specification of sample size, identification of potential organisational documents for textual analysis, development of interview and participant observation guides) by month 6.
WORK PACKAGE 3: DATA COLLECTION (DC), MONTHS 7-12
Objective: To collect data for analysis and interpretation of results to include in the final report.
Tasks: Participant observations of and interviews with protagonists in big data analytics in primary care as well as interviews with GPs who use a specific GP system and with citizens who have opted out from big databases.
WORK PACKAGE 4: DATA ANALYSIS (DA), MONTHS 13-18
Objective: To develop a theoretical framework for critically analysing big primary care data analytics’ social, technical and ethical challenges from the analysis of the collected qualitative data and the interpretation of results.
Deliverable 4: A theoretical framework for critically analysing big primary care data analytics’ social, technical and ethical challenges (e.g. data accessibility, data accuracy, data relationality, research automation, correlations, data context and interpretation, anonymity and informed consent) by month 18.
ETHICS COMMITTEE AND REGULATORY APPROVALS
The study will not be initiated before the protocol, consent forms and participant information sheets have received approval / favourable opinion from the Nottingham University Business School Research Ethics Committee (NUBS REC) and the respective National Health Service (NHS) Research & Development (R&D) department. Should a protocol amendment be made that requires REC approval, the changes in the protocol will not be instituted until the amendment and revised informed consent forms and participant information sheets have been reviewed and received approval / favourable opinion from the REC and R&D departments. A protocol amendment intended to eliminate an apparent immediate hazard to participants may be implemented immediately, provided that the REC is notified as soon as possible and approval is requested. Minor protocol amendments that involve only logistical or administrative changes may be implemented immediately, and the REC will be informed.
The study will be conducted in accordance with the ethical principles that have their origin in the Declaration of Helsinki, 1996; the principles of Good Clinical Practice; and the Department of Health Research Governance Framework for Health and Social Care, 2005.
INFORMED CONSENT AND PARTICIPANT INFORMATION
The process for obtaining participant informed consent will be in accordance with the REC guidance, Good Clinical Practice (GCP) and any other regulatory requirements that might be introduced. The investigator and the participant shall both sign and date the Consent Form before the person can participate in the study.
The participant will receive a copy of the signed and dated forms and the original will be retained in the Study records.
The decision regarding participation in the study is entirely voluntary. The investigator shall emphasise to participants that consent regarding study participation may be withdrawn at any time without penalty, without affecting their employment status and without loss of benefits to which the participant is otherwise entitled. No interviews will be conducted before informed consent has been obtained.
The investigator will inform the participant of any relevant information that becomes available during the course of the study, and will discuss with them whether they wish to continue with the study. If applicable, they will be asked to sign revised consent forms.
If the Consent Form is amended during the study, the investigator shall follow all applicable regulatory requirements pertaining to approval of the amended Consent Form by the REC and use of the amended form (including for ongoing participants).
Each participant will be assigned a study identity code number, for use on study documents. The documents will also use their initials (of first and last names separated by a hyphen or a middle name initial when available).
Interview transcripts will be treated as confidential documents and held securely in accordance with regulations. The investigator will make a separate confidential record of the participant’s name and Participant Study Number, to permit identification of all participants enrolled in the study, in case additional follow-up is required.
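The separation between study codes and participant names described above can be illustrated with a minimal sketch. It is not the project's actual procedure: the function names, the `P` prefix, the four-digit code format and the example names are all hypothetical, and a real linkage record would be stored apart from the transcripts under the access controls described in this section.

```python
import secrets


def assign_study_codes(names, prefix="P", digits=4):
    """Give each participant a unique random study code.

    The prefix and number of digits are illustrative assumptions,
    not a format taken from the study protocol.
    """
    codes = {}
    used = set()
    for name in names:
        code = f"{prefix}{secrets.randbelow(10 ** digits):0{digits}d}"
        while code in used:  # redraw on the rare collision
            code = f"{prefix}{secrets.randbelow(10 ** digits):0{digits}d}"
        used.add(code)
        codes[name] = code
    return codes


def pseudonymise(text, codes):
    """Replace every participant name in a transcript with its study code."""
    # Longest names first, so a name that contains another is handled whole.
    for name in sorted(codes, key=len, reverse=True):
        text = text.replace(name, f"[{codes[name]}]")
    return text


# The returned mapping plays the role of the separate confidential record.
linkage_record = assign_study_codes(["Alice Example", "Bob Sample"])
transcript = "Alice Example said that Bob Sample had opted out."
print(pseudonymise(transcript, linkage_record))
```

Keeping the name-to-code mapping in its own record means that transcripts can circulate among approved personnel without names, while follow-up remains possible for whoever holds the record.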
Interview transcripts shall be restricted to those personnel approved by the Chief Investigator and recorded as such in the study records.
All paper forms shall be filled in using black ballpoint pen. Errors shall be lined out, not obliterated with correction fluid, and the correction inserted, initialled and dated.
Source documents shall be filed at the investigator’s site and may include, but are not limited to, consent forms, study records, field notes, interview transcriptions and audio records. Apart from the regulatory access described below, only study staff shall have access to study documentation.
Direct access to source data / documents
All source documents shall be made available at all times for review by the Chief Investigator (CI) and the Sponsor’s designee, and for inspection by relevant regulatory authorities.
The CI will endeavour to protect the rights of the study’s participants to privacy and informed consent, and will adhere to the Data Protection Act, 1998. The CI will only collect the minimum required information for the purposes of the study. Interview transcripts and audio files will be held securely, in a locked room, or locked cupboard or cabinet. Access to the information will be limited to the CI and any relevant regulatory authorities (see above). Computer-held data will be held securely and password protected. All data will be stored on a secure dedicated web server. Access will be restricted by user identifiers and passwords (encrypted using a one-way encryption method).
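One common way to realise one-way protection of passwords is a salted key-derivation hash. The sketch below is an illustration only and makes several assumptions: the project does not name its method, so PBKDF2-HMAC-SHA256 from Python's standard library is swapped in as an example, and the iteration count, salt length and function names are hypothetical choices.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor, not taken from the project
SALT_BYTES = 16       # assumed salt length


def hash_password(password, salt=None):
    """Derive a one-way hash of the password; returns (salt, digest)."""
    if salt is None:
        salt = secrets.token_bytes(SALT_BYTES)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the hash from the candidate password and compare."""
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected_digest)


# Only the salt and digest are stored; the password itself is never kept.
stored_salt, stored_digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", stored_salt, stored_digest))
print(verify_password("wrong guess", stored_salt, stored_digest))
```

Because the derivation cannot be reversed, someone who obtains the stored values still has to guess passwords one at a time, and the per-user salt prevents two users with the same password from sharing a digest.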
Information about the study in the participant’s interview record will be treated confidentially in the same way as all other confidential information.
Electronic data will be backed up every 24 hours to both local and remote media in encrypted format.
This project has received funding from the European Commission's Horizon 2020 - Research and Innovation Framework Programme (H2020-EU.1.3.2.) under grant agreement n° .
The research results reflect the fellow’s views only and not those of the European Commission. © BIMEDA 2015