This is the third of a four-part series reviewing and critiquing the recent Medicines and Healthcare products Regulatory Agency (MHRA) guidance for industry document on data integrity.1 The first part of the series2 provided a background to the guidance document and discussed the introduction to the document. The second part reviewed and discussed the data governance system.3 In this part, we will look at data criticality and the data life cycle.
Establishing Data Criticality and Inherent Integrity Risk
This section of the guidance first discusses the data governance system, which was covered in Part 2 of this series,3 and then moves on to look at data generation. The spectrum of data generation is purported to be represented by Figure 1 in the guidance, which turns out to be a diagram drawn by Monica Cahilly during the April 2014 training of the MHRA inspectors.4 However, that diagram is, to my mind, focused only on instruments and computer systems, so I have drawn up a more detailed description of what should be presented in a data integrity guidance; see Figure 1 of this article.
On the horizontal axis at the top of the figure are the different processes that can be used in a laboratory environment to generate data; these range from observation, through simple instruments such as balances and pH meters and chromatography data systems, to LIMS (laboratory information management systems) and ERP (enterprise resource planning) systems. The vertical axis consists of the attributes of each process, such as whether software is used and, if so, its GAMP classification, the mode of data recording, the raw data produced and the main data integrity issues of each process. Note that Figure 1 lists firmware as Category 2 software. Although this category was discontinued in GAMP version 5,5 it equates to Group B instruments in USP <1058> on Analytical Instrument Qualification (AIQ).6 When mapping USP <1058> groups against GAMP software categories,7 if Category 2 software were reinstated, there would be equivalence between Category 2 software and Group B instruments.
The first three processes, from observation to the analytical balance, have paper records as raw data, and the remaining four have electronic records. Depending on how the latter four computerized systems are used, they can be either hybrid or, by using electronic signatures, fully electronic. Furthermore, the pH meter and analytical balance are discussed here as standalone instruments rather than as instruments interfaced to a LIMS or ELN (electronic laboratory notebook). The problem with the MHRA figure is that it focusses only on instruments and computerized systems and does not consider data gathered by observation.
Figure 1 also shows that, for analytical instruments and laboratory computerized systems, the following items hold true:
- Going from left to right, there is increasing complexity.
- Increasing amounts of AIQ and / or CSV are required to demonstrate fitness for intended use as one goes from a simple instrument to a complex computerized system.
- There is increasing risk to data integrity from either inadvertent acts by users or deliberate falsification going from left to right.
- There is increasing reliance of a laboratory on a supplier’s quality management system the further to the right one goes.
Let us look at four examples of data gathering from Figure 1:
- Observation: Manual observations may be found in many laboratories for tests such as the color or odor of samples, as well as for recording data from some instruments, as shown in the first column on the left of Figure 1. As noted there, the data integrity issue is that there is no independent evidence to verify that the value or result recorded is correct, is free from transcription error (value only) and has not been falsified. Therefore, each process that relies on observation alone needs to be risk assessed to determine the criticality of the data being generated: for example, does an odor determination have the same criticality as the pH determination of an HPLC mobile phase?
- Instrument: The example used in Figure 1 is an analytical balance with a printer. Given the importance of accurately weighing reference materials and samples, and the impact that a balance can have on a regulated laboratory, it is important that the integrity of the measurement is maintained. At a minimum, a printer is essential for an analytical balance, as the MHRA guidance makes clear1 and as is discussed later in this paper. However, we need to consider more detail: what data need to be recorded when making a weighing measurement? In my view, the printer needs to record all of the weights captured during a weighing operation, e.g., the weight of the weighing vessel, the tared weight and the weight of material.
- Hybrid System: The hybrid system, typified by a UV spectrometer using GAMP Category 3 software, is the worst of both worlds, as the laboratory has to manage and co-ordinate two different and incompatible media types: paper records and electronic records. The first issue is that paper cannot be defined as raw data, as noted by the EU and FDA.8,9 The FDA Level 2 guidance9 gives a much better discussion of why paper cannot be raw data. Other data integrity issues are that the configuration of the software must be recorded, including the definitions of user types and the access privileges for each type, and that the configured software must be validated for its intended use. Many hybrid systems consist of an instrument connected to a standalone workstation, where there are potential issues with access to the operating system, the clock and the data files themselves via the operating system, and with effective and validated backup and recovery.10 This situation is specifically commented on in the definitions section of the MHRA guidance.1 Systems that use the operating system to store data files in open-access directories can suffer from the stupidity of operators performing unintended deletions, as well as from attempts at falsification by individuals. The use of a database should protect data from many falsification attacks but, in reality, when flat file systems are used, data need to be acquired to and stored securely on the network.
- Electronic System: The example here is a chromatography data system with GAMP Category 4 software and electronic signatures. In this instance, the raw data are electronic records with electronic signatures. To ensure data integrity, the application has to be configured for security and access control (definition of user types and access privileges) and also for the use of electronic signatures. Data are acquired to the network and are secured within a database. Validation for intended use will demonstrate that the configured system works. The audit trail documents changes made by authorized individuals. The issue now is the separation of system administration roles from the use of the system by chromatographers.
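The separation of administration from use mentioned in the last example can be checked mechanically. What follows is a minimal sketch only, not taken from the MHRA guidance or from any particular chromatography data system: the role names, the privilege names and the `find_conflicts` helper are all hypothetical, and simply illustrate flagging any account that combines administration privileges with analytical ones.

```python
# Hypothetical privileges; a real system defines its own user types.
ADMIN_PRIVILEGES = {"manage_users", "configure_audit_trail", "set_system_clock"}
ANALYST_PRIVILEGES = {"acquire_data", "process_data", "sign_results"}

# Hypothetical user types mapped to the privileges each one grants.
ROLES = {
    "administrator": {"manage_users", "configure_audit_trail", "set_system_clock"},
    "analyst": {"acquire_data", "process_data", "sign_results"},
    "reviewer": {"review_data", "sign_results"},
}


def privileges_for(user_roles):
    """Return the union of privileges granted by a user's roles."""
    granted = set()
    for role in user_roles:
        granted |= ROLES[role]
    return granted


def find_conflicts(assignments):
    """Return sorted user names holding both admin and analyst privileges."""
    conflicts = []
    for user, user_roles in assignments.items():
        granted = privileges_for(user_roles)
        if granted & ADMIN_PRIVILEGES and granted & ANALYST_PRIVILEGES:
            conflicts.append(user)
    return sorted(conflicts)


if __name__ == "__main__":
    assignments = {
        "it_admin": ["administrator"],
        "chemist_1": ["analyst"],
        "lab_head": ["administrator", "analyst"],  # combines both: flagged
    }
    print(find_conflicts(assignments))
```

A check of this kind only reports conflicts in the configuration; it says nothing about whether the configured privileges are actually enforced, which remains a matter for validation.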
Note that this approach can only be a generalization: knowing your instrument or system and how it operates is the key maxim here. For example, modern balances can have clocks, and their screens can act as terminals for software such as electronic laboratory notebooks or LIMS, so that the balance is both a terminal and an analytical instrument. Simply having a balance connected to such an application may not be enough: where is the time and date stamp applied in such cases, at the balance or in the software application? Can anybody change the clock in the balance and so affect the time stamp in the application?
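One way to make these questions concrete is to record, with every weighing, both the weights captured and the time stamp from each clock involved. The sketch below is illustrative only: the `WeighingRecord` fields, the two-minute tolerance and the `clock_drift_exceeded` helper are assumptions of mine, not requirements of the MHRA guidance or features of any balance or LIMS.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class WeighingRecord:
    """Illustrative record of one weighing operation (field names are assumed)."""
    balance_id: str
    vessel_weight_g: float      # weight of the weighing vessel
    tared_weight_g: float       # reading after taring the vessel
    material_weight_g: float    # weight of material added
    balance_time: datetime      # time stamp from the balance clock
    application_time: datetime  # time stamp applied by the receiving application


def clock_drift_exceeded(record: WeighingRecord,
                         tolerance: timedelta = timedelta(minutes=2)) -> bool:
    """True if the two clocks disagree by more than the assumed tolerance."""
    drift = abs(record.balance_time - record.application_time)
    return drift > tolerance


if __name__ == "__main__":
    received = datetime(2015, 6, 1, 9, 30, tzinfo=timezone.utc)
    record = WeighingRecord(
        balance_id="BAL-01",  # hypothetical identifier
        vessel_weight_g=12.3456,
        tared_weight_g=0.0,
        material_weight_g=0.1002,
        balance_time=received - timedelta(minutes=5),
        application_time=received,
    )
    print(clock_drift_exceeded(record))
```

Keeping both time stamps makes a changed balance clock visible as a discrepancy, instead of silently trusting whichever clock the application happens to use.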
The Data Life Cycle
The MHRA definition of data expects a data life cycle but does not give any clue about what one should be. In the absence of guidance, my suggestion of what such a data life cycle could be is shown in Figure 2. There are two phases of a data life cycle for laboratory data: an active phase and an inactive phase.
The active phase of the data life cycle consists of the following activities:
- Data acquisition: the process of controlling and recording the observation or generating the data from the analytical procedure
- Data processing: interpretation or processing of the original data
- Generate reportable result: calculation of the reportable result for comparison versus specification
- Information and Knowledge Use: use of the result for the immediate purpose, but also over a longer time for trending
- Short Term Retention: storage of the data and information in a secure but accessible environment for any further use e.g. complaints, investigations, as well as audits / inspections
Note that, for many laboratory computerized systems where electronic records are stored in flat files within the operating system, there may need to be a retention process performed after each stage of the active phase to ensure preservation of the record and its integrity.
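As an illustration of what part of such a retention step might involve, the sketch below records a checksum for every flat file after a stage and compares the directory against that record later. It is a sketch under assumptions: the function names `build_manifest` and `verify_manifest` are hypothetical, and checksums only detect deletion or change after the fact; they do not prevent either, and are no substitute for secured network storage and validated backup.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Map each file's relative path to its checksum at the time of retention."""
    return {
        str(p.relative_to(data_dir)): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(data_dir: Path, manifest: dict) -> dict:
    """Compare the directory with an earlier manifest and report differences."""
    current = build_manifest(data_dir)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "added": sorted(set(current) - set(manifest)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
    }
```

For the comparison to mean anything, the manifest itself would have to be stored somewhere the instrument users cannot modify.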
The inactive phase of the data lifecycle consists of the following stages:
- Long-term Archive: movement of the records into a secure archive for long-term retention
- Data Migration: if necessary or required, there may be one or more migrations of data from one system / repository to another over the retention period
- Data / Record Destruction: when the retention period has elapsed, a formal process to destroy the data / records should be executed, provided that there is no litigation pending.
However, this life cycle does not account for any other use of the data e.g. trending over time or product quality reviews where the information generated during an analysis is used as the input data for generation of additional information or knowledge abstraction.
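The two phases and their stages can be written down as a simple model. The sketch below is my own simplification for illustration: the strictly linear order of transitions is an assumption (real work may loop back, for example to reprocess data), and the names `Stage`, `can_move` and `may_destroy` are hypothetical.

```python
from datetime import date
from enum import Enum


class Stage(Enum):
    ACQUISITION = "data acquisition"
    PROCESSING = "data processing"
    REPORTABLE_RESULT = "generate reportable result"
    USE = "information and knowledge use"
    SHORT_TERM_RETENTION = "short-term retention"
    LONG_TERM_ARCHIVE = "long-term archive"
    MIGRATION = "data migration"
    DESTRUCTION = "data / record destruction"


# Stages of the active phase; every other stage is in the inactive phase.
ACTIVE = {
    Stage.ACQUISITION, Stage.PROCESSING, Stage.REPORTABLE_RESULT,
    Stage.USE, Stage.SHORT_TERM_RETENTION,
}

# Assumed transitions: a linear path, with zero or more migrations.
ALLOWED = {
    Stage.ACQUISITION: {Stage.PROCESSING},
    Stage.PROCESSING: {Stage.REPORTABLE_RESULT},
    Stage.REPORTABLE_RESULT: {Stage.USE},
    Stage.USE: {Stage.SHORT_TERM_RETENTION},
    Stage.SHORT_TERM_RETENTION: {Stage.LONG_TERM_ARCHIVE},
    Stage.LONG_TERM_ARCHIVE: {Stage.MIGRATION, Stage.DESTRUCTION},
    Stage.MIGRATION: {Stage.MIGRATION, Stage.DESTRUCTION},
    Stage.DESTRUCTION: set(),
}


def phase(stage: Stage) -> str:
    """Return 'active' or 'inactive' for a life cycle stage."""
    return "active" if stage in ACTIVE else "inactive"


def can_move(current: Stage, target: Stage) -> bool:
    """True if the assumed model allows moving from current to target."""
    return target in ALLOWED[current]


def may_destroy(retention_end: date, today: date, litigation_pending: bool) -> bool:
    """True only after the last day of retention and with no litigation pending."""
    return today > retention_end and not litigation_pending
```

A model this simple cannot represent the reuse of data for trending or product quality reviews, which is exactly the limitation of the life cycle as drawn.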
In this part of the review of the MHRA data integrity guidance, we have looked at data risk and criticality through the different ways of generating data, from observation to an electronic computerized system using electronic signatures. In addition, we have considered a data life cycle and looked at some of the issues surrounding it. In the last part of the series, we will look at the section on system design, discuss a few of the definitions that constitute the bulk of the guidance document and summarize the guidance document.
References
1. MHRA, GMP Data Integrity Definitions and Guidance for Industry, version 2, March 2015
2. R.D. McDowall, Scientific Computing, Part 1, http://www.scientificcomputing.com/articles/2015/05/review-and-critique-mrha-data-integrity-guidance-industry-%E2%80%94-part-1-overview
3. R.D. McDowall, Scientific Computing, Part 2, http://www.scientificcomputing.com/articles/2015/05/review-and-critique-mrha-data-integrity-guidance-industry-%E2%80%94-part-2-data-governance-system
4. M. Cahilly, personal communication
5. Good Automated Manufacturing Practice (GAMP) Guidelines, version 5, ISPE, Tampa, Florida, 2008
6. United States Pharmacopoeia, <1058>, Analytical Instrument Qualification
7. L. Vuolo-Schuessler, M.E. Newton, P. Smith, C. Burgess and R.D. McDowall, Pharmaceutical Engineering 46, (1), 46 – 56, 2014
8. EU GMP Chapter 4, Documentation, 2011
9. FDA Level 2 guidance, Questions and Answers on Current Good Manufacturing Practices, Good Guidance Practices, Level 2 Guidance – Records and Reports. Question 3: How do the Part 11 regulations and “predicate rule requirements” (in 21 CFR Part 211) apply to the electronic records created by computerized laboratory systems and the associated printed chromatograms that are used in drug manufacturing and testing?
10. EU GMP Annex 11, Computerised Systems, 2011
R.D. McDowall is Director of R D McDowall Ltd. He may be contacted at editor@ScientificComputing.com.
- Review and Critique of the MRHA Data Integrity Guidance for Industry — Part 1: Overview
- Review and Critique of the MRHA Data Integrity Guidance for Industry — Part 2: Data Governance System
- Review and Critique of the MRHA Data Integrity Guidance for Industry — Part 4: System Design, Definitions and Overall Assessment