Big Medical Data in Primary Care (BiMeDa)



Aggregated data: Statistical data about several individuals that has been combined to show general trends or values without identifying individuals within the data.
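
As a minimal sketch of the idea, the Python below collapses individual-level rows into counts per group, so that the output no longer carries any per-person field. The records, the field names (patient_id, age_band, diagnosis) and the helper aggregate_counts are invented for illustration and are not taken from any source dataset.

```python
from collections import Counter

# Hypothetical individual-level records (invented for illustration).
records = [
    {"patient_id": "p1", "age_band": "40-49", "diagnosis": "asthma"},
    {"patient_id": "p2", "age_band": "40-49", "diagnosis": "asthma"},
    {"patient_id": "p3", "age_band": "50-59", "diagnosis": "diabetes"},
    {"patient_id": "p4", "age_band": "40-49", "diagnosis": "diabetes"},
]


def aggregate_counts(rows, keys):
    """Count rows per combination of the given keys, dropping all other fields."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


# One count per (age_band, diagnosis) group; patient_id is not carried over.
counts = aggregate_counts(records, ["age_band", "diagnosis"])
```

Note that groups with very small counts can still point to individuals, so aggregation alone does not guarantee that nobody can be identified.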

Algorithm: An effective method for calculating a function, expressed as a finite list of well-defined instructions. In the report, this term is used with particular reference to statistical data-mining.

Anonymisation: The process of rendering data into a form which does not identify individuals and where identification is not likely to take place, i.e. the removal of the names, addresses and other identifying particulars of data subjects. 

Ascertainment bias: (also sampling bias) A systematic distortion in measuring the true occurrence of a phenomenon resulting from the way in which data are collected, where all relevant instances were not equally likely to have been recorded. 

Audit: An audit is an official internal or external examination of an organisation. See ‘Clinical audit’ and ‘Independent audit’.

Big data: A term used to describe large and rapidly growing datasets in all areas of life and accompanying technologies and developments to analyse and re-use these, especially with the ambition to release latent or unanticipated value.

Biobank: A repository which collects, processes, stores and distributes tissue and data for biomedical, genomic or other research purposes. 

Caldicott Guardian: A senior person responsible for protecting the confidentiality of patient and service-user information and enabling appropriate information-sharing at NHS and social care organisations as well as voluntary and independent sector organisations.

Care pathway: A care pathway is anticipated care placed in an appropriate time frame, written and agreed by a multi-disciplinary team. It has locally agreed standards based on evidence where available to help a patient with a specific condition or diagnosis move progressively through the clinical treatment. ‘Whole care pathways’ refer to the end-to-end process of care for particular conditions from the point of entry to the point of departure from care.

Care records: Care records are personal records. They comprise documentary and other records concerning an individual (whether living or dead) who can be identified from them and relating:

  • to the individual’s physical or mental health;

  • to spiritual counselling or assistance given or to be given to the individual; or

  • to counselling or assistance given or to be given to the individual, for the purposes of their personal welfare, by any voluntary organisation or by any individual who:

    • by reason of the individual’s office or occupation has responsibilities for their personal welfare; or

    • by an order of a court has responsibilities for the individual’s supervision.

This record may be held electronically or in a paper file or a combination of both.

Care team: The health and/or social care professionals and staff that directly provide or support care to an individual.

Clinical audit: Clinical audit is a tool for improving practice, patient care or services provided. It is used to measure current practice and care against a set of explicit standards or criteria, identify areas for improvement, make changes to practice and re-audit to ensure that improvement has been achieved. The findings of the clinical audit provide evidence of the quality of practice and care.

Cloud computing: A term for shared computing used to refer to the outsourcing of computation to centralised shared resources, typically in remote data centres.

Coded data and samples: Biological samples and associated data labelled with at least one specific code, which allows traceability to a given individual, the ability to perform clinical monitoring, subject follow-up, or addition of new data.

Commissioning (and commissioners): Commissioning is essentially buying care in line with available resources to ensure that services meet the needs of the population. The process of commissioning includes assessing the needs of the population, selecting service providers and ensuring that these services are safe, effective, people-centred and of high quality. Commissioners are responsible for commissioning services.

Confidentiality: A societal norm or legal duty relating to the disclosure of information obtained in contexts (often, but not exclusively, professional ones) where expectations underpinning such norms are reasonable i.e. ‘in confidence’. In specific professional contexts, there may be established codes of practice to safeguard confidentiality. 

Consent: The voluntary, informed and competent agreement of an individual to any action that would otherwise constitute an infringement of fundamental personal interests or rights. Consent is an important ethical mechanism in medical treatment, research participation and processing of sensitive data.

Correlation: A broad class of statistical relationships involving formal dependence between two random variables or two sets of data. 

Data: Qualitative or quantitative statements or numbers that are (or are assumed to be) factual. Data may be raw or primary data (e.g. direct from measurement), or derivative of primary data, but are not yet the product of analysis or interpretation other than calculation.

Data abuse: A broad category of insecure, inadequate or unethical uses of data that have been empirically observed, including fabrication or falsification of data; data theft; unauthorised disclosure of or access to data; non-secure disposal of data; unauthorised retention of data; technical security failures and data loss. 

Data breach: Any failure to meet the requirements of the Data Protection Act, unlawful disclosure or misuse of personal confidential data and an inappropriate invasion of people’s privacy.

Data controller: A person (individual or organisation) who determines the purposes for which and the manner in which any personal confidential data are or will be processed. Data controllers must ensure that any processing of personal data for which they are responsible complies with the Data Protection Act.

  • Joint data controllers control how data is processed jointly i.e. they must agree and make such decisions together.

  • Data controllers in common agree to pool data and are both responsible for how it is used but each may process the data independently for its own purposes. All of the data controllers in common are still responsible for ensuring it is adequately protected.

Data initiatives: Purposive activities in which data collected for one purpose are used for a new purpose, often involving linking with other data sources. 

Data loss: A breach of principle 7 of the Data Protection Act (DPA) or an inappropriate breaking of confidentiality.

Data mining: The computational process of discovering patterns in big data sets.

Data processor: In relation to personal data, any person (other than an employee of the data controller) who processes the data on behalf of the data controller. Data processors are not directly subject to the Data Protection Act. However, the Information Commissioner recommends that organisations choose data processors carefully and have in place effective means of monitoring, reviewing and auditing their processing. A written contract (detailing the information governance requirements) must be in place to ensure compliance with principle 7 of the Data Protection Act.

Data sharing: Extending access to data to data users who were not intended users at the time of data collection, usually for the purpose of further research or analysis; the term is common in research and policy contexts and is suggestive of the social benefits of data re-use.

De-anonymisation: See ‘Re-identification’.

De-identified data: This refers to personal confidential data, which has been through anonymisation in a manner conforming to the ICO Anonymisation code of practice. There are two categories of de-identified data:

  • De-identified data for limited access: this is deemed to have a high risk of re-identification if published, but a low risk if held in an accredited safe haven and subject to contractual protection to prevent re-identification.

  • Anonymised data for publication: this is deemed to have a low risk of re-identification, enabling publication.

Demographic data: Information relating to the general characteristics of an individual or population e.g. ethnicity, gender, geographical location, socio-economic status.

Direct care: A clinical, social or public health activity concerned with the prevention, investigation and treatment of illness and the alleviation of suffering of individuals. It includes supporting individuals’ ability to function and improve their participation in life and society. It includes the assurance of safe and high quality care and treatment through local audit, the management of untoward or adverse incidents, person satisfaction including measurement of outcomes undertaken by one or more registered and regulated health or social care professionals and their team with whom the individual has a legitimate relationship for their care.

Dynamic consent: Interfaces for patients or research participants that aim to enable them to give or withdraw consent for their information to be used in specific research projects, thereby, it is argued, overcoming the limitations of all-or-nothing consent regimes.

Electronic health record: A computerised record of a patient’s medical history, such as medication, allergies, results of health tests, lifestyle and personal information that can be used in different health care settings.  

Evidence-based medicine (EBM): The conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients. This involves integrating individual clinical expertise with the best available external clinical evidence from systematic research. 

Free rider: One who partakes of the benefits of some cooperative enterprise without contributing to it.

Genetic information: Genetic information is information about the genotype, or heritable characteristics of individuals obtained by direct analysis of DNA, or by other biochemical testing. Genetic information in itself is not always identifiable; personal genetic information refers to information about the genetic make-up of an identifiable person.

Genome: The total genetic complement of an individual.

Health service body: Organisations (or individuals) with specific functions, obligations and powers defined in law. In England, the health service bodies from 1 April 2013 are: the Secretary of State for Health (includes the Department of Health, and its Executive agencies such as Public Health England and the MHRA), the NHS Commissioning Board, clinical commissioning groups, NHS Trusts (including Foundation Trusts), special health authorities such as the NHS Business Services Authority, CQC, NICE, and the Health and Social Care Information Centre. Local authorities are not health service bodies.

Identifiable information: See ‘Personal confidential data’.

Identifier: An item of data, which by itself or in combination with other identifiers enables an individual to be identified.

Independent audit: An audit conducted by an external and therefore independent auditor to provide greater public assurance.

Indirect care: Activities that contribute to the overall provision of services to a population as a whole or a group of patients with a particular condition, but which fall outside the scope of direct care. It covers health services management, preventative medicine, and medical research. Examples of activities would be risk prediction and stratification, service evaluation, needs assessment, financial audit.

Information governance: How organisations manage the way information and data are handled within the health and social care system in England. It covers the collection, use, access and decommissioning of information, as well as the requirements and standards that organisations and their suppliers need to achieve to fulfil the obligation that information is handled legally, securely, efficiently, effectively and in a manner which maintains public trust.

Informational privacy: An interest or right in the disclosure and withholding of information, founded in respect for persons; an aspect of privacy provided for in legal instruments such as the European Convention on Human Rights. 

Linkage: The merging of information or data from two or more sources with the object of consolidating facts concerning an individual or an event that are not available in any separate record.
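
A minimal sketch of deterministic linkage follows, assuming both sources already share a common key. The field name link_id, the sample rows and the helper link_records are hypothetical, and the sketch assumes each key value appears at most once in the second source. Real linkage often has to cope with missing or inconsistent keys, which this sketch does not attempt.

```python
# Hypothetical extracts from two sources that share a 'link_id' field.
gp_records = [
    {"link_id": "A1", "prescription": "salbutamol"},
    {"link_id": "B2", "prescription": "metformin"},
]
hospital_records = [
    {"link_id": "A1", "admission": "2014-03-02"},
    {"link_id": "C3", "admission": "2014-05-17"},
]


def link_records(left, right, key):
    """Merge rows from two sources that carry the same key value."""
    # Index the right-hand source by key (assumes keys are unique there).
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # The combined row holds facts that neither source held alone.
            linked.append({**row, **match})
    return linked


linked = link_records(gp_records, hospital_records, "link_id")
```

Only the individual present in both sources produces a linked row; records without a match are left out rather than guessed at.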

Metadata: Data that describe the contents of substantive records and the circumstances of their creation and processing, for example the size of data files, the time or location at which they were created, identity of the author, and technical characteristics of the data.

Open data: Data anyone is free to access, use, modify, and share, but which may have sharing conditions so that it is correctly attributed or that further use is not constrained. 

Personal confidential data: This term describes personal information about identified or identifiable individuals, which should be kept private or secret. For the purposes of this review ‘Personal’ includes the DPA definition of personal data, but it is adapted to include dead as well as living people and ‘confidential’ includes both information ‘given in confidence’ and ‘that which is owed a duty of confidence’ and is adapted to include ‘sensitive’ as defined in the Data Protection Act.

Personal data: Data which relate to a living individual who can be identified from those data, or from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller, and includes any expression of opinion about the individual and any indication of the intentions of the data controller or any other person in respect of the individual.

Personal information: See ‘Personal confidential data’.

Personal record/personal health and wellbeing record: See ‘Care records’.

Personalised medicine: A concept that reflects a confluence of different scientific, technological and social disciplines and approaches. It has a number of different meanings, but among these is the tailoring of medicine to the biological characteristics of particular patients or patient groups (pharmacogenetics, stratified medicine). The basic enabling technology for personalised medicine is molecular diagnostics. 

Potentially identifiable: See ‘De-identified data for limited access’.

Primary care: Primary care refers to services provided by GP practices, dental practices, community pharmacies and high street optometrists.

Privacy: The interest and right people have in controlling access to themselves, their homes, or to information about them. What counts as private can change depending on social norms, the specific context, and the relationship between the person concerned and those who might enjoy access. Privacy is exercised by selectively withholding or allowing access by others or through limits on acceptable behaviour in others. In the UK and the rest of Europe, privacy is guaranteed by the European Convention on Human Rights.

Privacy impact assessment: A systematic and comprehensive process for determining the privacy, confidentiality and security risks associated with the collection, use and disclosure of personal data prior to the introduction of or a change to a policy, process or procedure.

Processing: Processing in relation to information or data, means obtaining, recording or holding the information or data or carrying out any operation or set of operations on the information or data, including:

  • organisation, adaptation or alteration of the information or data;

  • retrieval, consultation or use of the information or data;

  • disclosure of the information or data by transmission, dissemination or otherwise making available; or

  • alignment, combination, blocking, erasure or destruction of the information or data.

Pseudonymisation: The process of distinguishing individuals in a data set by using a unique identifier, which does not reveal their ‘real world’ identity (see also ‘Anonymisation’ and ‘De-identified data’).
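
One common way to produce such an identifier is a keyed hash, sketched below with Python's standard hmac module. This is an illustration under stated assumptions, not a method prescribed by the report: the helper pseudonymise, the field name nhs_number, the sample values and the key are all hypothetical, and a real key would be generated and stored securely rather than written in code.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Truncated for readability; a shorter pseudonym raises the chance of collisions.
    return digest.hexdigest()[:16]


# Hypothetical key and rows, for illustration only.
key = b"example-key-held-by-the-data-controller"
rows = [
    {"nhs_number": "0000000001", "diagnosis": "asthma"},
    {"nhs_number": "0000000002", "diagnosis": "diabetes"},
    {"nhs_number": "0000000001", "diagnosis": "eczema"},
]

# Replace the direct identifier with the pseudonym in every row.
pseudonymised = [
    {"pseudonym": pseudonymise(r["nhs_number"], key), "diagnosis": r["diagnosis"]}
    for r in rows
]
```

The same person receives the same pseudonym in every row, so their records can still be distinguished and grouped, while the original identifier is no longer present in the output.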

Public good: A good that is non-rivalrous or non-excludable, or both. A good is non-rivalrous if my use of it does not in any way reduce the amount of it available for you to use. A good is non-excludable if it cannot be made available to you without also making it available to me and any number of others who might also wish to enjoy it. 

Public interest: Something ‘in the public interest’ is something that serves the interests of society as a whole. The ‘public interest test’ is used to determine whether the benefit of disclosing sensitive information outweighs the personal interest of the individual concerned and the need to protect the public’s trust in the confidentiality of services.

Re-identification: The process of analysing data or combining it with other data with the result that individuals become identifiable. Also known as ‘de-anonymisation’.
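
To illustrate why combining fields matters, the sketch below counts how many rows share each combination of indirect identifiers in a release. A combination held by a single row is a candidate for re-identification if someone else holds the same fields alongside a name. The field names, sample values and the helper unique_combinations are invented for illustration; this is a simple screening check, not a complete risk assessment.

```python
from collections import Counter

# Hypothetical de-identified release (no names or direct identifiers).
released = [
    {"age_band": "40-49", "area": "X1", "sex": "F"},
    {"age_band": "40-49", "area": "X1", "sex": "F"},
    {"age_band": "80-89", "area": "X2", "sex": "M"},
    {"age_band": "40-49", "area": "X1", "sex": "M"},
]


def unique_combinations(rows, fields):
    """Return the field-value combinations that occur in exactly one row."""
    sizes = Counter(tuple(row[f] for f in fields) for row in rows)
    return [combo for combo, size in sizes.items() if size == 1]


# Combinations that single out one row in this release.
at_risk = unique_combinations(released, ["age_band", "area", "sex"])
```

Here two of the four rows are unique on the chosen fields, even though no single field identifies anyone by itself.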

Safe haven: An accredited organisation with a secure electronic environment in which personal confidential data and/or de-identified data can be obtained and made available to users, generally in de-identified form. An accredited safe haven will need a secure legal basis to hold and process personal confidential data. De-identified data can be held under contract with obligations to safeguard the data.

Screening: Screening is a process of identifying apparently healthy people who may be at increased risk of a disease or condition. They can then be offered information, further tests and appropriate treatment to reduce their risk and/or any complications arising from the disease or condition.

Secondary use (‘reuse’, ‘secondary use’ or ‘repurposing’ of data): Any data use that goes beyond the use intended at the time of data collection and for which the patient gave consent. It typically means the use of GP and hospital medical records for research and administrative purposes. In this report we are also concerned with unpredictable future uses, and uses for incompatible purposes. 

Sensitive personal data: Data that identifies a living individual consisting of information as to his or her: racial or ethnic origin, political opinions, religious beliefs or other beliefs of a similar nature, membership of a trade union, physical or mental health or condition, sexual life, convictions, legal proceedings against the individual or allegations of offences committed by the individual. See also ‘Personal confidential data’.

Specialist commissioning: This relates to the purchasing and planning of specialised services for diseases and disorders. Specialised services are defined in law as those services with a planning population of more than one million people. The NHS Commissioning Board is responsible for commissioning specialised services.

Summary Care Record (SCR): A nationwide system containing a GP record summary (initially, current prescriptions and allergies) that would facilitate out-of-hours care and could also enable patients to view their own records via a mechanism called Healthspace promoted by Connecting for Health.

Third party: In relation to personal data, any person other than the subject of the data, the data controller, or a data processor.

* Partially adopted from ‘Information: To Share Or Not To Share? The Information Governance Review’; ‘The collection, linking and use of data in biomedical research and health care: ethical issues’; and ‘Emerging biotechnologies: technology, choice and the public good’.