ACRG Seminar: Method validation
Date: 21 [month illegible] 2553

INTRODUCTION

Method validation is the process used to confirm that the analytical procedure employed for a specific test is suitable for its intended use. Results from method validation can be used to judge the quality, reliability and consistency of analytical results; validation is an integral part of any good analytical practice.
(Validation and Qualification in Analytical Laboratories, 2007)

Simple illustration of the classical validation strategy (upper) and fitness-for-purpose validation (lower). Dashed circle represents the precision of the method, and continuous circle the acceptability limit. Max Feinberg, 2007

(Westgard JO)

The life cycle of a method of analysis. Max Feinberg, 2007

Analytical methods need to be validated or revalidated:
- before their introduction into routine use;
- whenever the conditions for which the method has been validated change (e.g., an instrument with different characteristics or samples with a different matrix);
- whenever the method is changed and the change is outside the original scope of the method.
(Validation and Qualification in Analytical Laboratories, 2007)

Method validation has received considerable attention in the literature and from industrial committees and regulatory agencies.
1. The U.S. FDA CGMP regulations require, in section 211.165(e), that methods be validated: the accuracy, sensitivity, specificity, and reproducibility of test methods employed by the firm shall be established and documented. Such validation and documentation may be accomplished in accordance with Sec. 211.194(a). These requirements include a statement of each method used in testing the sample, showing that it meets proper standards of accuracy and reliability as applied to the tested product. The U.S. FDA has also proposed an industry guidance for Analytical Procedures and Methods Validation (2).

2. ISO/IEC 17025 includes a chapter on the validation of methods with a list of nine validation parameters. The ICH has developed a consensus text on the validation of analytical procedures. The document includes definitions for eight validation characteristics. ICH also developed a guidance with detailed methodology.

3. The U.S. EPA prepared a guidance for methods development and validation for the Resource Conservation and Recovery Act (RCRA). The AOAC, the EPA and other scientific organizations provide methods that are validated through multi-laboratory studies.

Scope of method validation

The scope of the method and its validation criteria should be defined early in the process. This involves answering the following questions:
- What analytes should be detected?
- What are the expected concentration levels?
- What are the sample matrices?
- Are there interfering substances expected, and, if so, should they be detected and quantified?
- Are there any specific legislative or regulatory requirements?
(Validation and Qualification in Analytical Laboratories, 2007)

Further questions that define the scope:
- Should information be qualitative or quantitative?
- What are the required detection and quantitation limits?
- What is the expected concentration range?
- What precision and accuracy is expected?
- How robust should the method be?
- Which type of equipment should be used?
- Will the method be used in one specific laboratory, or should it be applicable in all laboratories at one site or around the globe?
- What skills do the anticipated users of the method have?
(Validation and Qualification in Analytical Laboratories, 2007)

Steps in Method Validation
- Develop a validation protocol, an operating procedure or a validation master plan for the validation
- For a specific validation project, define owners and responsibilities
- Develop a validation project plan
- Define the application, purpose and scope of the method
- Define the performance parameters and acceptance criteria
- Define validation experiments
- Verify relevant performance characteristics of equipment
(Validation and Qualification in Analytical Laboratories, 2007)

Steps in Method Validation (continued)
- Qualify materials, e.g. standards and reagents, for purity, accurate amounts and sufficient stability
- Perform pre-validation experiments
- Adjust method parameters and/or acceptance criteria if necessary
- Perform full internal (and external) validation experiments
- Develop SOPs for executing the method in routine use
- Define criteria for revalidation
- Define type and frequency of system suitability tests and/or analytical quality control (AQC) checks for routine use
- Document validation experiments and results in the validation report
(Validation and Qualification in Analytical Laboratories, 2007)

[Figure: flowchart of the steps in method validation, from developing the validation master plan through the validation report. Validation and Qualification in Analytical Laboratories, 2007]

Steps for validation
USP defines eight steps for validation:
- Accuracy
- Precision
- Specificity
- Limit of detection
- Limit of quantitation
- Linearity and range
- Ruggedness
- Robustness

Validation parameters
- Accuracy: The measure of exactness of an analytical method, or the closeness of agreement between the value found and the value that is accepted either as a conventional true value or as an accepted reference value.
- Precision: The measure of the degree of repeatability of an analytical method under normal operation, normally expressed as the percent relative standard deviation for a statistically significant number of samples.
- Detection Limit: The point at which the instrument response for an analyte or compound can be distinguished from instrument noise, but cannot be accurately quantitated.
(Stephen Lawson)

Validation parameters (continued)
- Quantitation Limit: The lowest concentration of carbon that can be determined with acceptable precision and accuracy under the stated operational conditions of the method.
- Linearity: Linearity should be demonstrated by an r squared value indicating that the regression line will be an excellent predictor when transforming sample data.
- Range: The interval between the upper and lower levels that have been demonstrated to be determined with precision, accuracy and linearity.
- Robustness: The ability of the analytical procedure to remain unaffected by small but deliberate variations in method parameters.
(Stephen Lawson)

Accuracy & Precision

Accuracy: The comparison of methods experiment
The comparison of methods experiment is performed to estimate inaccuracy or systematic error. Patient samples are analyzed by the new method (test method) and by a comparative method, and the systematic errors are then estimated from the differences observed between the methods. The systematic differences at critical medical decision concentrations are the errors of interest.
(Westgard JO)

Accuracy: Factors to consider
Comparative method
The analytical method that is used for comparison must be carefully selected, because the interpretation of the experimental results depends on the assumption that can be made about the correctness of results from the comparative method. Any differences between a test method and a routine method must be carefully interpreted. If the differences are small, then the two methods have the same relative accuracy. If the differences are large and medically unacceptable, then it is necessary to identify which method is inaccurate.
(Westgard JO)

Accuracy: Factors to consider
Number of patient specimens
A minimum of 40 different patient specimens should be tested by the two methods. These specimens should be selected to cover the entire working range of the method and should represent the spectrum of diseases expected in routine application of the method. The actual number of specimens tested is less important than the quality of those specimens. Twenty specimens that are carefully selected on the basis of their observed concentrations will likely provide better information than a hundred specimens that are randomly received by the laboratory.
(Westgard JO)

Accuracy: Factors to consider
Single vs duplicate measurements
Ideally, duplicate measurements should be made on two different samples (or cups) that are analyzed in different runs, or at least in different order, rather than as back-to-back replicates on the same cup of sample. The duplicates provide a check on the validity of the measurements by the individual methods and help identify problems arising from sample mix-ups, transposition errors, and other mistakes. If duplicates are not performed, then it is critical to inspect the comparison results at the time they are collected, identify those specimens where the differences are large, and repeat those analyses while the specimens are still available.
(Westgard JO)

Accuracy: Factors to consider
Time period
Several different analytical runs on different days should be included to minimize any systematic errors that might occur in a single run. A minimum of 5 days is recommended, but it may be preferable to extend the experiment for a longer period of time. Since the long-term replication study will likely extend for 20 days, the comparison study could cover a similar period of time and would require only 2 to 5 patient specimens per day.
(Westgard JO)

Accuracy: Factors to consider
Specimen stability
Specimens should generally be analyzed within two hours of each other by the test and comparative methods. Specimen handling needs to be carefully defined and systematized prior to beginning the comparison of methods study. Otherwise, the differences observed may be due to variables in specimen handling rather than to analytical error.
(Westgard JO)

Accuracy: Data analysis
Graph the data
The most fundamental data analysis technique is to graph the comparison results and visually inspect the data. Ideally, this should be done at the time the data are collected in order to identify discrepant results that will complicate the data analysis. Any patient specimens with discrepant results between the test and comparative methods should be reanalyzed to confirm that the differences are real and not mistakes in recording the values or mix-ups of specimens.
(Westgard JO)

Accuracy: Data analysis
Difference plot
A difference plot displays the difference (test result minus comparative result) on the y-axis versus the comparative result on the x-axis. These differences should scatter around the line of zero difference, half being above and half being below. Any large differences will stand out and draw attention to those specimens whose results need to be confirmed by repeat measurements.
(Westgard JO)
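The data behind a difference plot can be prepared with a few lines of code. The sketch below is only an illustration: the paired results, the function names and the 20 mg/dl screening limit are all assumptions made for the example, not values from the source.

```python
# Hypothetical paired results (mg/dl); the values are illustrative only.
comparative = [95.0, 150.0, 200.0, 248.0, 310.0]
test = [97.0, 153.0, 208.0, 251.0, 345.0]


def differences(test_results, comparative_results):
    """Return (comparative, test - comparative) pairs for a difference plot."""
    return [(x, y - x) for x, y in zip(comparative_results, test_results)]


def flag_discrepant(points, limit):
    """Return the points whose absolute difference exceeds the chosen limit."""
    return [(x, d) for x, d in points if abs(d) > limit]


points = differences(test, comparative)
# The 20 mg/dl limit is an arbitrary illustration, not a recommended value.
suspect = flag_discrepant(points, limit=20.0)
print(points)
print(suspect)
```

Specimens returned in `suspect` are the ones that would be candidates for repeat measurement while they are still available.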

Accuracy: Data analysis
Comparison plot
A comparison plot displays the test result on the y-axis versus the comparative result on the x-axis, as shown by the second figure. As points are accumulated, a visual line of best fit should be drawn to show the general relationship between the methods and help identify discrepant results. The purpose of this initial graphical inspection of data is to identify discrepant results in order to reanalyze specimens while they are fresh and still available.
(Westgard JO)

Accuracy: Data analysis
Calculate appropriate statistics
These statistics allow estimation of the systematic error at more than one medical decision concentration to judge method acceptability, and also provide information about the proportional or constant nature of the systematic error to assess possible sources of error. Statistical programs typically provide linear regression (least squares) calculations for the slope (b) and y-intercept (a) of the line of best fit and the standard deviation of the points about that line (s_y/x).
(Westgard JO)
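For readers without a statistics package, the three regression statistics can be computed directly. The following is a minimal sketch of ordinary least squares; the function name and the illustrative data (constructed to lie exactly on Y = 2.0 + 1.03X) are assumptions made for the example.

```python
import math


def linear_regression(x, y):
    """Ordinary least squares fit of y = a + b*x.

    Returns (a, b, s_yx): the y-intercept, the slope, and the standard
    deviation of the points about the line (n - 2 degrees of freedom).
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired results")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residual_ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(residual_ss / (n - 2))
    return a, b, s_yx


# Illustrative data only: comparative results and noise-free test results.
x = [100.0, 150.0, 200.0, 250.0, 300.0]
y = [2.0 + 1.03 * xi for xi in x]
a, b, s_yx = linear_regression(x, y)
print(round(a, 2), round(b, 2), round(s_yx, 4))
```

A slope different from 1 points to a proportional error, and an intercept different from 0 points to a constant error.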

Accuracy: Data analysis
Systematic error at a decision level
Yc = a + b * Xc
SE = Yc - Xc
For example, given a cholesterol comparison study where the regression line is Y = 2.0 + 1.03X, i.e., the y-intercept is 2.0 mg/dl and the slope is 1.03, the Y value corresponding to a critical decision level of 200 would be 208 (Y = 2.0 + 1.03*200), which means there is a systematic error of 8 mg/dl (208 - 200) at a critical decision level of 200 mg/dl.
(Westgard JO)
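The cholesterol example can be reproduced in code. This is a small sketch; the function name is an assumption, while the intercept, slope and decision level are the values given in the example above.

```python
def systematic_error(intercept, slope, decision_level):
    """Estimate the systematic error at a decision level Xc: SE = Yc - Xc."""
    y_c = intercept + slope * decision_level
    return y_c - decision_level


se = systematic_error(intercept=2.0, slope=1.03, decision_level=200.0)
print(round(se, 2))  # 8.0 (mg/dl)
```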

Accuracy: Criteria for acceptable performance
The judgment of acceptability depends on what amount of analytical error is allowable without affecting or limiting the use and interpretation of individual test results. This is complicated by the fact that any individual test result is also subject to random error; thus the overall or total error (TE) is composed of systematic error (SE) plus random error (RE). This total error can be calculated as follows:
TEcalc = SE + RE
TEcalc = bias_meas + 3 * s_meas
(Westgard JO)
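The total error formula can be sketched as follows. The function names, the use of the absolute value of the bias, and the numbers in the example (a standard deviation of 3 mg/dl and an allowable total error of 20 mg/dl) are assumptions made for illustration; only the 8 mg/dl bias comes from the cholesterol example.

```python
def total_error(bias, sd, multiplier=3.0):
    """TEcalc = |bias| + multiplier * SD (multiplier of 3, as in the formula).

    Taking the absolute value of the bias is an assumption of this sketch,
    so that negative and positive biases are treated alike.
    """
    return abs(bias) + multiplier * sd


def is_acceptable(bias, sd, allowable_total_error):
    """Compare the calculated total error with the allowable total error."""
    return total_error(bias, sd) <= allowable_total_error


# Hypothetical values in mg/dl: bias 8, SD 3, allowable total error 20.
te = total_error(8.0, 3.0)
print(te, is_acceptable(8.0, 3.0, 20.0))
```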

Accuracy: Recommended minimum studies
- Select 40 patient specimens to cover the full working range of the method.
- Analyze 8 specimens a day within 2 hours by the test and comparative methods.
- Graph the results immediately on a difference plot and inspect for discrepancies; reanalyze any specimens that give discrepant results to eliminate outliers and identify potential interferences.
- Continue the experiment for 5 days if no discrepant results are observed.
- Continue for another 5 days if discrepancies are observed during the first 5 days.
- Prepare a comparison plot of all the data to assess the range, outliers, and linearity.
- Calculate the correlation coefficient and, if r is 0.99 or greater, calculate simple linear regression statistics and estimate the systematic error at medical decision concentrations.
(Westgard JO)
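The last step, checking r before relying on simple linear regression, might be sketched like this. The function names and both data sets are hypothetical; only the 0.99 threshold comes from the list above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of paired results."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def regression_is_reliable(x, y, threshold=0.99):
    """True when r meets the threshold for simple linear regression."""
    return pearson_r(x, y) >= threshold


# Hypothetical wide-range data: r is high, so regression can be used.
wide_x = [50.0, 100.0, 150.0, 200.0, 250.0, 300.0]
wide_y = [52.0, 101.0, 158.0, 205.0, 262.0, 309.0]
# Hypothetical narrow-range data: r is low even though differences are small.
narrow_x = [195.0, 198.0, 200.0, 202.0, 205.0]
narrow_y = [204.0, 199.0, 208.0, 203.0, 207.0]

print(regression_is_reliable(wide_x, wide_y))
print(regression_is_reliable(narrow_x, narrow_y))
```

The narrow-range case shows why r is used here to judge whether the data range is wide enough for regression, not to judge the agreement between methods.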

Precision: The replication experiment
A replication experiment is performed to estimate the imprecision or random error. It is typically performed by obtaining test results on 20 samples of the same material and then calculating the mean, standard deviation, and coefficient of variation. The purpose is to observe the variation expected in a test result under the normal operating conditions of the laboratory. Ideally, the test variation should be small, i.e., all the answers on the repeated measurements should be nearly the same.
(Westgard JO)

Precision: Sources of random error
The replication experiment estimates the random error caused by factors that vary in the operation of the method, such as the pipetting of samples, the reaction conditions that depend on timing, mixing, temperature, and heating, and even the measurement itself. In non-automated systems, variation in the techniques of individual analysts may be a large contributor to the observed variation of a test. With automated systems, the lack of uniformity and the instability of instrument and reaction conditions may still cause small variations that show up as positive and negative variations in the final test results. The distribution of these effects over time can be predicted to provide estimates of how large the random error might be.
(Westgard JO)

Precision: Factors to consider
Time period of experiment
The length of time over which the experiment is conducted affects the random error that is observed.
- "Within-run": the random error observed will generally be low (and optimistic). This is the best performance possible by the method; if this performance is not acceptable, the method should be rejected, or the causes of random error need to be identified and eliminated before any further testing is carried out.
- "Day-to-day", "between-day", or "total" imprecision: an experiment conducted over a period of twenty days is expected to provide a more realistic estimate of the variation that will be seen in patient samples over time.
(Westgard JO)

Precision: Factors to consider
Matrix of sample
The other materials present in a sample constitute its matrix. For example, the matrix of interest for a laboratory test may be whole blood, serum, urine, or spinal fluid. In evaluating method performance, it is important to use test samples that have a matrix as close as possible to the real specimen type of interest.
(Westgard JO)

Precision: Factors to consider
Number and concentrations of materials
The number of materials to be tested should depend on the concentrations that are critical for the medical use of the test. Generally, two or three materials should be selected to have analyte concentrations that are at medically important decision levels.
Example: Glucose is typically interpreted at several medical decision levels, such as 50 mg/dl for hypoglycemia, 120 mg/dl for a fasting sample, 160 mg/dl for a glucose tolerance test, and at higher elevations such as 300 mg/dl for monitoring diabetic patients.
(Westgard JO)

Precision: Factors to consider
Number of test samples
It is commonly accepted that a minimum of 20 samples should be measured in the time period of interest. A larger number of samples will give a better estimate of the random error, but cost and time considerations often dictate that the data are evaluated at the earliest time or minimum period, with additional data collected if necessary.
(Westgard JO)

Precision: Factors to consider
Data calculations
Random error is described quantitatively by calculating the mean (x̄), standard deviation (s), and coefficient of variation (CV).
(Westgard JO)
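These three statistics are available from Python's standard library. The sketch below uses 20 hypothetical replicate results; the function name and the data are assumptions made for the example, and the CV is expressed as a percentage of the mean.

```python
import statistics


def replication_statistics(results):
    """Return (mean, SD, CV%) for replicate results on one material."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)  # sample SD, n - 1 in the denominator
    cv_percent = 100.0 * sd / mean
    return mean, sd, cv_percent


# 20 hypothetical replicate results (mg/dl) on a single control material.
results = [198, 202, 200, 199, 201, 203, 197, 200, 202, 198,
           201, 199, 200, 204, 196, 200, 201, 199, 202, 198]
mean, sd, cv = replication_statistics(results)
print(mean, round(sd, 3), round(cv, 3))
```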

Precision: Criteria for acceptable performance
The judgment on acceptability depends on what amount of analytical error is allowable (we recommend using the CLIA criteria, which have been tabulated on the website). For short-term imprecision, the within-run standard deviation (s_w-run) or the within-day standard deviation (s_w-day) should be 1/4 or less of the defined allowable total error to be acceptable. For long-term imprecision, the total standard deviation (s_tot) should be 1/3 or less of the defined allowable total error.
(Westgard JO)
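The two acceptance rules are easy to encode. In this sketch the function names and the example values (an allowable total error of 20 mg/dl and the standard deviations tested) are assumptions; the fractions 1/4 and 1/3 are the ones stated above.

```python
def short_term_ok(sd, allowable_total_error):
    """Within-run or within-day SD must be 1/4 or less of the allowable TE."""
    return sd <= 0.25 * allowable_total_error


def long_term_ok(sd, allowable_total_error):
    """Total SD must be 1/3 or less of the allowable TE."""
    return sd <= allowable_total_error / 3.0


# Hypothetical allowable total error of 20 mg/dl at the decision level.
allowable = 20.0
print(short_term_ok(2.05, allowable))  # short-term limit is 5.0 mg/dl
print(long_term_ok(7.0, allowable))    # long-term limit is about 6.67 mg/dl
```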

Precision: Recommended minimum studies
- Select at least 2 different control materials that represent low and high medical decision concentrations for the test of interest.
- Analyze 20 samples of each material within a run or within a day to obtain an estimate of short-term imprecision.
- Calculate the mean, SD, and %CV for each material.
- Determine whether short-term imprecision is acceptable before proceeding with any further testing.
- Analyze 1 sample of each of the 2 materials on 20 different days to estimate long-term imprecision.
- Calculate the mean, SD, and %CV for each material.
- Determine whether long-term imprecision is acceptable.
(Westgard JO)

Other validation parameters

Specificity
The term specific generally refers to a method that produces a response for a single analyte only, while the term selective refers to a method that provides responses for a number of chemical entities that may or may not be distinguished from each other. If the response is distinguished from all other responses, the method is said to be selective.
(Validation and Qualification in Analytical Laboratories, 2007)

Limit of Quantitation
Lower limit of quantification (LLOQ)
The LLOQ is the lowest amount of an analyte in a sample that can be quantitatively determined with suitable precision and accuracy (bias). The acceptance criteria for these two parameters at LLOQ are 20% RSD for precision and ±20% for bias.
(Frank T. Peters and Hans H. Maurer)
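A check against the two LLOQ criteria could look like the sketch below. The function name, the replicate data and the definition of bias as the percent deviation of the mean from the nominal concentration are assumptions of this example; the 20% limits are the ones quoted above.

```python
import statistics


def lloq_acceptable(measured, nominal, max_rsd=20.0, max_bias=20.0):
    """Check replicate results at the candidate LLOQ.

    Returns (acceptable, rsd_percent, bias_percent). Bias is taken here as
    100 * (mean - nominal) / nominal, which is an assumption of this sketch.
    """
    mean = statistics.mean(measured)
    rsd = 100.0 * statistics.stdev(measured) / mean
    bias = 100.0 * (mean - nominal) / nominal
    return (rsd <= max_rsd and abs(bias) <= max_bias), rsd, bias


# Hypothetical replicates at a nominal concentration of 1.0 (arbitrary units).
ok, rsd, bias = lloq_acceptable([0.9, 1.1, 1.05, 0.95, 1.2], nominal=1.0)
print(ok, round(rsd, 1), round(bias, 1))

# A more scattered hypothetical set fails the precision criterion.
ok_scattered, _, _ = lloq_acceptable([0.6, 1.4, 1.0, 0.7, 1.5], nominal=1.0)
print(ok_scattered)
```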

Limit of detection
Quantification below LLOQ is by definition not acceptable. Therefore, below this value a method can only produce semiquantitative or qualitative data. However, it can still be important to know the LOD of the method. According to ICH, it is the lowest concentration of analyte in a sample which can be detected but not necessarily quantified as an exact value. According to Conference Report II, it is the lowest concentration of an analyte in a sample that the bioanalytical procedure can reliably differentiate from background noise.
(Frank T. Peters and Hans H. Maurer)

Linearity
The linearity of an analytical method is its ability to elicit test results that are directly proportional to the concentration of analytes in samples within a given range, or proportional by means of well-defined mathematical transformations. Linearity is determined by a series of 3 to 6 injections of 5 or more standards whose concentrations span 80 to 120 percent of the expected concentration range. A linear regression equation applied to the results should have an intercept not significantly different from 0. If a significant nonzero intercept is obtained, it should be demonstrated that this has no effect on the accuracy of the method.
(Validation and Qualification in Analytical Laboratories, 2007)
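One common way to judge whether the intercept differs significantly from 0 is a t-test on the intercept; the source does not name a specific test, so this choice is an assumption. The sketch below uses five hypothetical standards, and the critical value 3.182 is the two-sided 5% Student t value for 3 degrees of freedom (n - 2 with five points), so it must be changed for other data sizes.

```python
import math


def intercept_t_statistic(x, y):
    """Least squares fit of y = a + b*x; returns (a, t) with t = a / SE(a)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    s_yx = math.sqrt(residual_ss / (n - 2))
    se_intercept = s_yx * math.sqrt(1.0 / n + mean_x ** 2 / sxx)
    return intercept, intercept / se_intercept


# Hypothetical standards at 80 to 120 percent of the expected concentration
# and their (hypothetical) instrument responses.
concentration = [80.0, 90.0, 100.0, 110.0, 120.0]
response = [801.0, 905.0, 998.0, 1102.0, 1199.0]

intercept, t_stat = intercept_t_statistic(concentration, response)
T_CRITICAL_DF3 = 3.182  # two-sided 5% level, 3 degrees of freedom (n = 5)
significant = abs(t_stat) > T_CRITICAL_DF3
print(round(intercept, 2), round(t_stat, 3), significant)
```

With these illustrative numbers the intercept is not significantly different from 0, so no further demonstration would be needed under the rule stated above.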

Range
The range of an analytical method is the interval between the upper and lower levels (including these levels) that have been demonstrated to be determined with precision, accuracy and linearity using the method as written. The range is normally expressed in the same units as the test results (e.g., percentage, parts per million) obtained by the analytical method.
(Validation and Qualification in Analytical Laboratories, 2007)

Ruggedness
The ruggedness of an analytical method is the resistance to change in the results produced by an analytical method when minor deviations are made from the experimental conditions described in the procedure. Examples of the factors that a ruggedness test could address are: changes in the instrument, operator, or brand of reagent; concentration of a reagent; pH of a solution; temperature of a reaction; time allowed for completion of a process, etc.
(IUPAC, 2002)

Validation Parameters Isabel Taverniers, Marc De Loose, Erik Van Bockstaele Trends in Analytical Chemistry, Vol. 23, No. 8, 2004


Objective

Materials & Methods Imprecision

Materials & Methods Linearity

Analytical imprecision

Imprecision

Linearity Study

Method Comparison

Conclusion

Validation Master Plan

- Amniotic fluid, 1.5-7 ml
- Centrifuge; resuspend pellet with growth medium
- Plate into 1 well of a 24-well plate (Biocoat Collagen-I)
- 24 hours
- Detach with trypsin-like solution
- Plate in T75 flask
- QC: viability, FCM, sterility, mycoplasma, endotoxin
- Frozen

Interested Reagents

Cell Release Criteria