Is Automated Data Review “Good Enough”? - Environmental Standards, Inc.

Data Validation vs Automated Data Review QC Software

Environmental investigations and remediation projects generate a substantial amount of laboratory data, much of which are loaded to different types of data management systems. As data management systems have advanced, many providers have developed automated data quality control (QC) review software. The automated data review process uses QC information reported in laboratory-generated electronic data deliverables (EDDs) to assess data quality based on an established validation protocol. The level of review may vary from checks of batch QC samples (i.e., matrix spike/matrix spike duplicates, laboratory control samples, and method blanks) to more comprehensive review inclusive of instrument calibrations.
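As a rough illustration of the batch QC checks described above, the following Python sketch screens spike recoveries, MS/MSD precision, and method blank results. It is a minimal sketch under stated assumptions, not any vendor's implementation: the `QCRecord` fields, the QC type codes, and the acceptance limits (70-130% recovery, 20% RPD) are hypothetical placeholders. Real limits come from the analytical method or the project's validation protocol.

```python
from dataclasses import dataclass


@dataclass
class QCRecord:
    """One batch QC result as it might appear in an EDD (hypothetical layout)."""
    batch_id: str
    qc_type: str                  # assumed codes: "LCS", "MS", "MSD", "MB"
    analyte: str
    result: float                 # measured concentration
    spike_added: float = 0.0      # amount spiked (LCS, MS, MSD)
    parent_result: float = 0.0    # unspiked parent sample result (MS, MSD)
    reporting_limit: float = 0.0  # used for the method blank check


def percent_recovery(rec):
    """Percent recovery: 100 * (spiked result - parent result) / spike added."""
    return 100.0 * (rec.result - rec.parent_result) / rec.spike_added


def relative_percent_difference(a, b):
    """RPD: 100 * |a - b| / mean(a, b)."""
    return 100.0 * abs(a - b) / ((a + b) / 2.0)


def screen_batch(records, recovery_limits=(70.0, 130.0), rpd_limit=20.0):
    """Return (batch_id, analyte, reason) flags; the limits are illustrative."""
    low, high = recovery_limits
    flags = []
    spiked = {}  # (batch_id, analyte) -> {"MS": result, "MSD": result}
    for rec in records:
        if rec.qc_type == "MB":
            # Flag a blank result at or above a known reporting limit.
            if rec.reporting_limit > 0 and rec.result >= rec.reporting_limit:
                flags.append((rec.batch_id, rec.analyte, "MB detection"))
        elif rec.qc_type in ("LCS", "MS", "MSD"):
            if rec.spike_added <= 0:
                flags.append((rec.batch_id, rec.analyte, "missing spike amount"))
                continue
            recovery = percent_recovery(rec)
            if not low <= recovery <= high:
                reason = f"{rec.qc_type} recovery {recovery:.0f}%"
                flags.append((rec.batch_id, rec.analyte, reason))
            if rec.qc_type in ("MS", "MSD"):
                key = (rec.batch_id, rec.analyte)
                spiked.setdefault(key, {})[rec.qc_type] = rec.result
    for (batch_id, analyte), pair in spiked.items():
        # Precision check only when both halves of the pair were reported.
        if "MS" in pair and "MSD" in pair and pair["MS"] + pair["MSD"] > 0:
            rpd = relative_percent_difference(pair["MS"], pair["MSD"])
            if rpd > rpd_limit:
                flags.append((batch_id, analyte, f"MS/MSD RPD {rpd:.0f}%"))
    return flags
```

Note that every value the sketch evaluates is taken from the EDD as reported; nothing in it can tell whether those values were generated or transcribed correctly.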

Automated data review can be a useful tool to efficiently screen large amounts of laboratory data for usability purposes with minimal human effort. However, automated data review has a few key drawbacks.

Automated data review assumes that the information provided in the laboratory-generated EDD is correct. Automated data review alone cannot identify problems such as incorrect units; incorrect dilution or preparation factors; incorrect or missing batch QC information; incorrectly reported results, recoveries, or RPDs; incorrect sample IDs; or incorrect analyte lists.
Automated data review assumes that the information provided in the laboratory-generated EDD is complete. Unless used in conjunction with sample planning software, automated data review cannot confirm that all requested samples and analyses were included in the EDD. Automated data review alone will not detect parameters or samples for which data were omitted.
Automated data review assumes that field data are provided and are accurate. Field data, such as sample IDs and collection dates and times, may be required for some aspects of automated data review (e.g., holding time evaluation). Field data must be submitted to the laboratory by the field sampling team for incorporation into the final laboratory EDD, or must be submitted to the database administrator to be matched with the analytical data.
Automated data review assumes that the sample results are correct as reported in the laboratory-generated EDD. Arguably the most significant drawback of automated data review is the inability to confirm that results were generated correctly and reported accurately.
1. Automated data review does not include review of sample raw data, such as chromatograms or mass spectra, to confirm that the analytes of interest were correctly identified and quantitated.
2. Automated data review will not confirm that manual integrations were performed appropriately, or that peak patterns match reference standards.
3. Automated data review cannot identify transcription errors from the raw data to the laboratory information management system (LIMS); those are identified only by comparing raw bench sheets to the final reported results.
4. Automated data review cannot evaluate whether method detection limits (MDLs) were determined appropriately and are realistic.

In short, automated data review does not critically evaluate the data to verify defensibility (viz., are the reported results even correct?).
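Two of the assumptions listed above, completeness and field data, can be partly checked when outside information is available. The Python sketch below compares an EDD against a sampling plan and evaluates holding time from a field-recorded collection time. The function names, sample IDs, analysis codes, and the 14-day limit are hypothetical and chosen only for illustration. Neither check says anything about whether the reported results themselves are correct.

```python
from datetime import datetime, timedelta


def missing_analyses(planned, reported):
    """Return planned (sample_id, analysis) pairs that are absent from the EDD."""
    return sorted(set(planned) - set(reported))


def holding_time_exceeded(collected, analyzed, limit_days):
    """True when the analysis took place after the holding time had elapsed."""
    return analyzed - collected > timedelta(days=limit_days)


# Hypothetical sampling plan versus what the laboratory actually reported.
planned = [("MW-01", "VOC"), ("MW-02", "VOC"), ("MW-02", "METALS")]
reported = [("MW-01", "VOC"), ("MW-02", "METALS")]
print(missing_analyses(planned, reported))

# The collection time comes from field records; the analysis time from the EDD.
collected = datetime(2024, 1, 1, 9, 0)
analyzed = datetime(2024, 1, 16, 9, 0)
print(holding_time_exceeded(collected, analyzed, limit_days=14))
```

Both checks are only as reliable as their inputs: a wrong collection date or an outdated sampling plan produces a wrong answer without any warning.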

Relying solely on automated data review requires an extraordinary level of confidence in a laboratory’s technical skill, quality control practices, and data reporting capabilities. For a large, ongoing project with an established baseline and relatively low risk, that confidence may be established through manual data validation by a qualified QC chemist; once the data stream has matured, automated data review may be a cost-effective alternative for monitoring data quality. However, for projects with a high degree of potential liability, visibility, and/or regulatory oversight (such as those headed toward litigation), data defensibility must be established and monitored continuously through rigorous, critical third-party data validation.