DRC-2024-006915

STATISTICAL ANALYSIS OF GROUND-WATER MONITORING DATA AT RCRA FACILITIES

INTERIM FINAL GUIDANCE

OFFICE OF SOLID WASTE
WASTE MANAGEMENT DIVISION
U.S. ENVIRONMENTAL PROTECTION AGENCY
401 M STREET, S.W.
WASHINGTON, D.C. 20460

APRIL 1989

DISCLAIMER

This document is intended to assist Regional and State personnel in evaluating ground-water monitoring data from RCRA facilities. Conformance with this guidance is expected to result in statistical methods and sampling procedures that meet the regulatory standard of protecting human health and the environment. However, EPA will not in all cases limit its approval of statistical methods and sampling procedures to those that comport with the guidance set forth herein. This guidance is not a regulation (i.e., it does not establish a standard of conduct which has the force of law) and should not be used as such. Regional and State personnel should exercise their discretion in using this guidance document as well as other relevant information in choosing a statistical method and sampling procedure that meet the regulatory requirements for evaluating ground-water monitoring data from RCRA facilities.

This document has been reviewed by the Office of Solid Waste, U.S. Environmental Protection Agency, Washington, D.C., and approved for publication. Approval does not signify that the contents necessarily reflect the views and policies of the U.S. Environmental Protection Agency, nor does mention of trade names, commercial products, or publications constitute endorsement or recommendation for use.
Guidance Document on the Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities

PREFACE

This guidance document has been developed primarily for evaluating ground-water monitoring data at RCRA (Resource Conservation and Recovery Act) facilities. The statistical methodologies described in this document can be applied to both hazardous (Subtitle C of RCRA) and municipal (Subtitle D of RCRA) waste land disposal facilities.

The recently amended regulations concerning the statistical analysis of ground-water monitoring data at RCRA facilities (53 FR 39720, October 11, 1988) provide a wide variety of statistical methods that may be used to evaluate ground-water quality. To the experienced and inexperienced water quality professional alike, the choice of which test to use under a particular set of conditions may not be apparent. The reader is referred to Section 4 of this guidance, "Choosing a Statistical Method," for assistance in choosing an appropriate statistical test. For relatively new facilities that have only limited amounts of ground-water monitoring data, it is recommended that a form of hypothesis test (e.g., parametric analysis of variance) be employed to evaluate the data. Once sufficient data are available (after 12 to 24 months or eight background samples), another method of analysis, such as the control chart methodology described in Section 7 of the guidance, is recommended. Each method of analysis and the conditions under which it will be used can be written in the facility permit. This will eliminate the need for a permit modification each time more information about the hydrogeochemistry is collected and more appropriate methods of data analysis become apparent.
This guidance was written primarily for the statistical analysis of ground-water monitoring data at RCRA facilities. The guidance has wider applications, however, if one examines the spatial relationships involved between the monitoring wells and the potential contaminant source. For example, Section 5 of the guidance describes background well (upgradient) vs. compliance well (downgradient) comparisons. This scenario can be applied to other non-RCRA situations involving the same spatial relationships and the same null hypothesis. The explicit null hypothesis (Ho) for testing contrasts between means, or where appropriate between medians, is that the means between groups (here monitoring wells) are equal (i.e., no release has been detected), or that the group means are below a prescribed action level (e.g., the ground-water protection standard). Statistical methods that can be used to evaluate these conditions are described in Sections 5.2 (Analysis of Variance), 5.3 (Tolerance Intervals), and 5.4 (Prediction Intervals).

A different situation exists when compliance wells (downgradient) are compared to a fixed standard (e.g., the ground-water protection standard). In that case, Section 6 of the guidance should be consulted. The value to which the constituent concentrations at compliance wells are compared can be any standard established by a Regional Administrator, State or county health official, or another appropriate official.
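
The two forms of the null hypothesis described above for well-to-well comparisons can be written compactly. The following is only a notational sketch: the symbols (k groups of wells with means \mu_1, ..., \mu_k, and an action level C) are introduced here for illustration and are not notation taken from the guidance itself.

```latex
% Hypothetical notation: \mu_i is the mean concentration for well group i
% (i = 1, ..., k); C is a prescribed action level.

% Equality of group means (no release detected):
H_0 : \mu_1 = \mu_2 = \cdots = \mu_k
\qquad \text{versus} \qquad
H_A : \text{not all } \mu_i \text{ are equal}

% Group means below a prescribed action level:
H_0 : \mu_i \le C \quad \text{for all } i
\qquad \text{versus} \qquad
H_A : \mu_i > C \quad \text{for at least one } i
```

Where medians are the appropriate measure, the same statements would be read with group medians in place of group means.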
A note of caution applies to Section 6. The examples used in Section 6 are used to determine whether ground water has been contaminated as a result of a release from a facility. When the lower confidence limit lies entirely above the ACL (alternate concentration limit) or MCL (maximum concentration limit), further action or assessment may be warranted. If one wishes to determine whether a cleanup standard has been attained for a Superfund site or a RCRA facility in corrective action, another EPA guidance document entitled "Statistical Methods for the Attainment of Superfund Cleanup Standards (Volume 2: Ground Water--Draft)" should be consulted. This draft Superfund guidance is a multivolume set that addresses questions regarding the success of air, ground-water, and soil remediation efforts. Information about the availability of this draft guidance, currently being developed, can be obtained by calling the RCRA/Superfund Hotline, telephone (800) 424-9346 or (202) 382-3000.

Those interested in evaluating individual uncontaminated wells or in an intrawell comparison are referred to Section 7 of the guidance, which describes the use of Shewhart-CUSUM control charts and trend analysis. Municipal water supply engineers, for example, who wish to monitor water quality parameters in supply wells may find this section useful.

Other sections of this guidance have wide applications in the field of applied statistics, regardless of the intended use or purpose. Sections 4.2 and 4.3 provide information on checking distributional assumptions and equality of variance, while Sections 8.1 and 8.2 cover limit of detection problems and outliers. Helpful advice and references for many experiments involving the use of statistics can be found in these sections.
Finally, it should be noted that this guidance is not intended to be the final chapter on the statistical analysis of ground-water monitoring data, nor should it be used as such. 40 CFR Part 264 Subpart F offers an alternative [§264.97(h)(5)] to the methods suggested and described in this guidance document. In fact, the guidance recommends a procedure (confidence intervals) for comparing monitoring data to a fixed standard that is not mentioned in the Subpart F regulations. This is neither contradictory nor inconsistent, but rather epitomizes the complexities of the subject matter and exemplifies the need for flexibility due to the site-specific monitoring requirements of the RCRA program.

CONTENTS

Preface
Figures
Tables
Executive Summary

1. Introduction
2. Regulatory Overview
   2.1 Background
   2.2 Overview of Methodology
   2.3 General Performance Standards
   2.4 Basic Statistical Methods and Sampling Procedures
3. Choosing a Sampling Interval
   3.1 Example Calculations
   3.2 Flow Through Karst and "Pseudo-Karst" Terranes
4. Choosing a Statistical Method
   4.1 Flowcharts--Overview and Use
   4.2 Checking Distributional Assumptions
   4.3 Checking Equality of Variance: Bartlett's Test
5. Background Well to Compliance Well Comparisons
   5.1 Summary Flowchart for Background Well to Compliance Well Comparisons
   5.2 Analysis of Variance
   5.3 Tolerance Intervals Based on the Normal Distribution
   5.4 Prediction Intervals
6. Comparisons with MCLs or ACLs
   6.1 Summary Chart for Comparison with MCLs or ACLs
   6.2 Statistical Procedures
7. Control Charts for Intra-Well Comparisons
   7.1 Advantages of Plotting Data
   7.2 Correcting for Seasonality
   7.3 Combined Shewhart-CUSUM Control Charts for Each Well and Constituent
   7.4 Update of a Control Chart
   7.5 Nondetects in a Control Chart
8. Miscellaneous Topics
   8.1 Limit of Detection
   8.2 Outliers

Appendices
A. General Statistical Considerations and Glossary of Statistical Terms
B. Statistical Tables
C. General Bibliography
D. Federal Register, 40 CFR, Part 264

FIGURES

Number
3-1 Hydraulic conductivity of selected rocks
3-2 Range of values of hydraulic conductivity and permeability
3-3 Conversion factors for permeability and hydraulic conductivity units
3-4 Total porosity and drainable porosity for typical geologic materials
3-5 Potentiometric surface map for computation of hydraulic gradient
4-1 Flowchart overview
4-2 Probability plot of raw chlordane concentrations
4-3 Probability plot of log-transformed chlordane concentrations
5-1 Background well to compliance well comparisons
5-2 Tolerance limits: alternate approach to background well to compliance well comparisons
6-1 Comparisons with MCLs/ACLs
7-1 Plot of unadjusted and seasonally adjusted monthly observations
7-2 Combined Shewhart-CUSUM chart

TABLES

Number
2-1 Summary of Statistical Methods
3-1 Default Values for Effective Porosity (Ne) for Use in Time of Travel (TOT) Analyses
3-2 Specific Yield Values for Selected Rock Types
3-3 Determining a Sampling Interval
4-1 Example Data for Coefficient-of-Variation Test
4-2 Example Data Computations for Probability Plotting
4-3 Cell Boundaries for the Chi-Squared Test
4-4 Example Data for Chi-Squared Test
4-5 Example Data for Bartlett's Test
5-1 One-Way Parametric ANOVA Table
5-2 Example Data for One-Way Parametric Analysis of Variance
5-3 Example Computations in One-Way Parametric ANOVA Table
5-4 Example Data for One-Way Nonparametric ANOVA--Benzene Concentrations (ppm)
5-5 Example Data for Normal Tolerance Interval
5-6 Example Data for Prediction Interval--Chlordane Levels
6-1 Example Data for Normal Confidence Interval--Aldicarb Concentrations in Compliance Wells (ppb)
6-2 Example Data for Log-Normal Confidence Interval--EDB Concentrations in Compliance Wells (ppb)
6-3 Values of M and n+1-M and Confidence Coefficients for Small Samples
6-4 Example Data for Nonparametric Confidence Interval--T-29 Concentrations (ppm)
6-5 Example Data for a Tolerance Interval Compared to an ACL
7-1 Example Computation for Deseasonalizing Data
7-2 Example Data for Combined Shewhart-CUSUM Chart--Carbon Tetrachloride Concentration (µg/L)
8-1 Methods for Below Detection Limit Values
8-2 Example Data for a Test of Proportions
8-3 Example Data for Testing Cohen's Test
8-4 Example Data for Testing for an Outlier

ACKNOWLEDGMENT

This document was developed by EPA's Office of Solid Waste under the direction of Dr. Vernon Myers, Chief of the Ground-Water Section of the Waste Management Division. The document was prepared by the joint efforts of Dr. Vernon B. Myers, Mr. James R. Brown of the Waste Management Division, Mr. James Craig of the Office of Policy Planning and Information, and Mr. Barnes Johnson of the Office of Policy, Planning, and Evaluation. Technical support in the preparation of this document was provided by Midwest Research Institute (MRI) under a subcontract to NUS Corporation, the prime contractor with EPA's Office of Solid Waste. MRI staff who assisted with the preparation of the document were Jairus D. Flora, Jr., Ph.D., Principal Statistician, Ms. Karin M. Bauer, Senior Statistician, and Mr. Joseph S. Bartling, Assistant Statistician.

EXECUTIVE SUMMARY

The hazardous waste regulations under the Resource Conservation and Recovery Act (RCRA) require owners and operators of hazardous waste facilities to utilize design features and control measures that prevent the release of hazardous waste into ground water. Further, regulated units (i.e., all surface impoundments, waste piles, land treatment units, and landfills that receive hazardous waste after July 26, 1982) are also subject to the ground-water monitoring and corrective action standards of 40 CFR Part 264, Subpart F. These regulations require that a statistical method and sampling procedure approved by EPA be used to determine whether there are releases from regulated units into ground water.
This document provides guidance to RCRA facility permit applicants and writers concerning the statistical analysis of ground-water monitoring data at RCRA facilities. Section 1 is an introduction to the guidance; it describes the purpose and intent of the document and emphasizes the need for site-specific considerations in implementing the Subpart F regulations of 40 CFR Part 264.

Section 2 provides the reader with an overview of the recently promulgated regulations concerning the statistical analysis of ground-water monitoring data (53 FR 39720, October 11, 1988). The requirements of the regulation are reviewed, and the need to consider site-specific factors in evaluating data at a hazardous waste facility is emphasized.

Section 3 discusses the important hydrogeologic parameters to consider when choosing a sampling interval. The Darcy equation is used to determine the horizontal component of the average linear velocity of ground water. This parameter provides a good estimate of time of travel for most soluble constituents in ground water and may be used to determine a sampling interval. In karst, cavernous volcanics, and fractured geologic environments, alternative methods are needed to determine an appropriate sampling interval. Example calculations are provided at the end of the section to further assist the reader.

Section 4 provides guidance on choosing an appropriate statistical method. A flow chart to guide the reader through this section, as well as procedures to test the distributional assumptions of data, are presented. Finally, this section outlines procedures to test specifically for equality of variance.
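
As a rough illustration of the time-of-travel idea summarized for Section 3 above, the sketch below applies the commonly used form of the Darcy relation, v = K * i / n_e, where K is hydraulic conductivity, i is the hydraulic gradient, and n_e is effective porosity. The function names and the numerical inputs are hypothetical and chosen only to make the arithmetic easy to follow; they are not values taken from this guidance, and the sketch does not apply to karst or fractured settings, where the text notes that alternative methods are needed.

```python
def average_linear_velocity(hydraulic_conductivity, hydraulic_gradient, effective_porosity):
    """Horizontal average linear velocity from the Darcy relation v = K * i / n_e.

    hydraulic_conductivity: K, in length per time (e.g., ft/day)
    hydraulic_gradient:     i, dimensionless (head drop per unit distance)
    effective_porosity:     n_e, a fraction between 0 and 1
    """
    if not 0.0 < effective_porosity <= 1.0:
        raise ValueError("effective porosity must be a fraction in (0, 1]")
    return hydraulic_conductivity * hydraulic_gradient / effective_porosity


def time_of_travel(distance, velocity):
    """Time for ground water to travel a given distance at a constant velocity."""
    if velocity <= 0.0:
        raise ValueError("velocity must be positive")
    return distance / velocity


# Hypothetical inputs, for illustration only.
K = 10.0      # hydraulic conductivity, ft/day (assumed)
i = 0.005     # hydraulic gradient, ft/ft (assumed)
n_e = 0.25    # effective porosity (assumed)

v = average_linear_velocity(K, i, n_e)
days = time_of_travel(100.0, v)  # travel time over an assumed 100 ft

print(f"velocity: {v:.2f} ft/day")            # velocity: 0.20 ft/day
print(f"time over 100 ft: {days:.0f} days")   # time over 100 ft: 500 days
```

How a travel time of this kind translates into a sampling interval is a site-specific judgment that this sketch does not attempt to make.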
Section 5 covers statistical methods that may be used to evaluate ground-water monitoring data when background wells have been sited hydraulically upgradient from the regulated unit, and a second set of wells is sited hydraulically downgradient from the regulated unit at the point of compliance. The data from these compliance wells are compared to data from the background wells to determine whether a release from a facility has occurred. Parametric and nonparametric analysis of variance, tolerance intervals, and prediction intervals are suggested methods for this type of comparison. Flow charts, procedures, and example calculations are given for each testing method.

Section 6 includes statistical procedures that are appropriate when comparing ground-water constituent concentrations to fixed concentration limits (e.g., alternate concentration limits or maximum concentration limits). The methods applicable to this type of comparison are confidence intervals and tolerance intervals. As in Section 5, flow charts, procedures, and examples explain the calculations necessary for each testing method.

Section 7 presents the case where the level of each constituent within a single, uncontaminated well is being compared to its historic background concentrations. This is known as an intra-well comparison. In essence, the data for each constituent in each well are plotted on a time scale and inspected for obvious features such as trends or sudden changes in concentration levels. The method suggested in this section is a combined Shewhart-CUSUM control chart.

Section 8 contains a variety of special topics that are relatively short and self-contained. These topics include methods to deal with data that are below the limit of analytical detection and methods to test for outliers or extreme values in the data.

Finally, the guidance presents appendices that cover general statistical considerations, a glossary of statistical terms, statistical tables, and a listing of references. These appendices provide necessary and ancillary information to aid the user in evaluating ground-water monitoring data.

SECTION 1

INTRODUCTION

The U.S. Environmental Protection Agency (EPA) promulgated regulations for detecting contamination of ground water at hazardous waste land disposal facilities under the Resource Conservation and Recovery Act (RCRA) of 1976. The statistical procedures specified for use to evaluate the presence of contamination have been criticized and require improvement. Therefore, EPA has revised those statistical procedures in 40 CFR Part 264, "Statistical Methods for Evaluating Ground-Water Monitoring Data From Hazardous Waste Facilities."

In 40 CFR Part 264, EPA has recently amended the Subpart F regulations with statistical methods and sampling procedures that are appropriate for evaluating ground-water monitoring data under a variety of situations (53 FR 39720, October 11, 1988). The purpose of this document is to provide guidance in determining which situation applies and consequently which statistical procedure may be used. In addition to providing guidance on selection of an appropriate statistical procedure, this document provides instructions on carrying out the procedure and interpreting the results.

The regulations provide three levels of monitoring for a regulated unit: detection monitoring; compliance monitoring; and corrective action. The regulations define conditions for a regulated unit to be changed from one level of monitoring to a more stringent level of monitoring (e.g., from detection monitoring to compliance monitoring). These conditions are that there is statistically significant evidence of contamination [40 CFR §264.91(a)(1) and (2)].
The regulations allow the benefit of the doubt to reside with the current stage of monitoring. That is, a unit will remain in its current monitoring stage unless there is convincing evidence to change it. This means that a unit will not be changed from detection monitoring to compliance monitoring (or from compliance monitoring to corrective action) unless there is statistically significant evidence of contamination (or contamination above the compliance limit).

The main purpose of this document is to guide owners, operators, Regional Administrators, State Directors, and other interested parties in the selection, use, and interpretation of appropriate statistical methods for monitoring the ground water at each specific regulated unit. Topics to be covered include sampling needed, sample sizes, selection of appropriate statistical design, matching analysis of data to design, and interpretation of results. Specific recommended methods are detailed and a general discussion of evaluation of alternate methods is provided. Statistical concepts are discussed in Appendix A. References for suggested procedures are provided as well as references to alternate procedures and general statistics texts. Situations calling for external consultation are mentioned as well as sources for obtaining expert assistance when needed.

EPA would like to emphasize the need for site-specific considerations in implementing the Subpart F regulations of 40 CFR Part 264 (especially as amended, 53 FR 39720, October 11, 1988). It has been an ongoing strategy to promulgate regulations that are specific enough to implement, yet flexible enough to accommodate a wide variety of site-specific environmental factors.
This is usually achieved by specifying criteria that are appropriate for the majority of monitoring situations, while at the same time allowing alternatives that are also protective of human health and the environment. This philosophy is maintained in the recently promulgated amendments entitled "Statistical Methods for Evaluating Ground-Water Monitoring Data From Hazardous Waste Facilities" (53 FR 39720, October 11, 1988). The sections that allow for the use of an alternate sampling procedure and statistical method [§264.97(g)(2) and §264.97(h)(5), respectively] are as viable as those that are explicitly referenced [§264.97(g)(1) and §264.97(h)(1-4)], provided they meet the performance standards of §264.97(i). Due consideration to this should be given when preparing and reviewing Part B permits and permit applications.

SECTION 2

REGULATORY OVERVIEW

In 1982, EPA promulgated ground-water monitoring and response standards for permitted facilities in Subpart F of 40 CFR Part 264, for detecting releases of hazardous wastes into ground water from storage, treatment, and disposal units at permitted facilities (47 FR 32274, July 26, 1982). The Subpart F regulations required ground-water data to be examined by Cochran's Approximation to the Behrens-Fisher Student's t-test (CABF) to determine whether there was a significant exceedance of background levels, or other allowable levels, of specified chemical parameters and hazardous waste constituents. One concern was that this procedure could result in a high rate of "false positives" (Type I error), thus requiring an owner or operator unnecessarily to advance into a more comprehensive and expensive phase of monitoring. More importantly, another concern was that the procedure could result in a high rate of "false negatives" (Type II error), i.e., instances where actual contamination would go undetected.
As a result of these concerns, EPA amended the CABF procedure with five different statistical methods that are more appropriate for ground-water monitoring (53 FR 39720, October 11, 1988). These amendments also outline sampling procedures and performance standards that are designed to help minimize the chance that a statistical method will indicate contamination when it is not present (Type I error), or fail to detect contamination when it is present (Type II error).

2.1 BACKGROUND

Subtitle C of the Resource Conservation and Recovery Act of 1976 (RCRA) creates a comprehensive program for the safe management of hazardous waste. Section 3004 of RCRA requires owners and operators of facilities that treat, store, or dispose of hazardous waste to comply with standards established by EPA that are "necessary to protect human health and the environment." Section 3005 provides for implementation of these standards under permits issued to owners and operators by EPA or authorized States. Section 3005 also provides that owners and operators of existing facilities that apply for a permit and comply with applicable notice requirements may operate until a permit determination is made. These facilities are commonly known as "interim status" facilities. Owners and operators of interim status facilities also must comply with standards set under Section 3004.
EPA promulgated ground-water monitoring and response standards for permitted facilities in 1982 (47 FR 32274, July 26, 1982), codified in 40 CFR Part 264, Subpart F. These standards establish programs for protecting ground water from releases of hazardous wastes from treatment, storage, and disposal units. Facility owners and operators were required to sample ground water at specified intervals and to use a statistical procedure to determine whether or not hazardous wastes or constituents from the facility are contaminating ground water. As explained in more detail below, the Subpart F regulations regarding statistical methods used in evaluating ground-water monitoring data that EPA promulgated in 1982 have generated criticism.

The Part 264 regulations prior to the October 11, 1988 amendments provided that the Cochran's Approximation to the Behrens-Fisher Student's t-test (CABF) or an alternate statistical procedure approved by EPA be used to determine whether there is a statistically significant exceedance of background levels, or other allowable levels, of specified chemical parameters and hazardous waste constituents. Although the regulations have always provided latitude for the use of an alternate statistical procedure, concerns were raised that the CABF statistical procedure in the regulations was not appropriate. It was pointed out that: (1) the replicate sampling method is not appropriate for the CABF procedure, (2) the CABF procedure does not adequately consider the number of comparisons that must be made, and (3) the CABF does not control for seasonal variation. Specifically, the concerns were that the CABF procedure could result in "false positives" (Type I error), thus requiring an owner or operator unnecessarily to collect additional ground-water samples, to further characterize ground-water quality, and to apply for a permit modification, which is then subject to EPA review. In addition, there was concern that CABF may result in "false negatives"
(Type II error), i.e., instances where actual contamination goes undetected. This could occur because the background data, which are often used as the basis of the statistical comparisons, are highly variable due to temporal, spatial, analytical, and sampling effects.

As a result of these concerns, on October 11, 1988 EPA amended both the statistical methods and the sampling procedures of the regulations, by requiring (if necessary) that owners or operators more accurately characterize the hydrogeology and potential contaminants at the facility, and by including in the regulations performance standards that all the statistical methods and sampling procedures must meet. Statistical methods and sampling procedures meeting these performance standards would have a low probability of indicating contamination when it is not present, and of failing to detect contamination that actually is present. The facility owner or operator would have to demonstrate that a procedure is appropriate for the site-specific conditions at the facility, and to ensure that it meets the performance standards outlined below. This demonstration holds for any of the statistical methods and sampling procedures outlined in this regulation as well as any alternate methods or procedures proposed by facility owners and operators.

EPA recognizes that the selection of appropriate monitoring parameters is also an essential part of a reliable statistical evaluation. The Agency addressed this issue in a previous Federal Register notice (52 FR 25942, July 9, 1987).

2.2 OVERVIEW OF METHODOLOGY

EPA has elected to retain the idea of general performance requirements that the regulated community must meet. This approach allows for flexibility in designing statistical methods and sampling procedures to site-specific considerations.

EPA has tried to bring a measure of certainty to these methods, while accommodating the unique nature of many of the regulated units in question.
Consistent with this general strategy, the Agency is establishing several options for the sampling procedures and statistical methods to be used in detection monitoring and, where appropriate, in compliance monitoring.

The owner or operator shall submit, for each of the chemical parameters and hazardous constituents listed in the facility permit, one or more of the statistical methods and sampling procedures described in the regulations promulgated on October 11, 1988. In deciding which statistical test is appropriate, he or she will consider the theoretical properties of the test, the data available, the site hydrogeology, and the fate and transport characteristics of potential contaminants at the facility. The Regional Administrator will review, and if appropriate, approve the proposed statistical methods and sampling procedures when issuing the facility permit.

The Agency recognizes that there may be situations where any one statistical test may not be appropriate. This is true of new facilities with little or no ground-water monitoring data. If insufficient data prohibit the owner or operator from specifying a statistical method of analysis, then contingency plans containing several methods of data analysis and the conditions under which the method can be used will be specified by the Regional Administrator in the permit. In many cases, the parametric ANOVA can be performed after six months of data have been collected. This will eliminate the need for a permit modification in the event that data collected during future sampling and analysis events indicate the need to change to a more appropriate statistical method of analysis. In the event that a permit modification is necessary to change a sampling procedure or a statistical method, the reader is referred to 53 FR 37912, September 28, 1988. These are considered Class 1 changes requiring Director approval and should follow minor modification procedures.
2.3 GENERAL PERFORMANCE STANDARDS EPA's basic concern in establishing these performance standards for sta- tistical methods is to achieve a proper balance between the risk that the pro- cedures will falsely indicate that a regulated unit is causing background values or concentration limits to be exceeded (false positives) and the risk that the procedures will fail to indicate that background values or concen- tration limits are being exceeded (false negatives).EPA's approach is designed to address that concern directly.Thus any statistical method or sampling procedure,whether specified here or as an alternative to those specified,should meet the following performance standards contained in 40 CFR §264.97(i): 2-3 1. 2. The statistical method used to evaluate ground-water monitoring data shall be appropriate for the distribution of chemical parameters or hazardous constituents.If the distribution of the chemical parameters or hazardous constituents is shown by the owner or operator to be inappropriate for a normal theory test, then the data should be transformed or a distribution-free theory test should be used.If the distributions for the constituents differ, more than one statistical method may be needed. If an individual well comparison procedure is used to compare an individual compliance well constituent concentration with background constituent concentrations or a ground-water protection standard, the test shall be done at a Type I error level of no less than 0.01 for each testing period.If a multiple comparisons procedure is used,the Type I experimentwise error rate shall be no less than 0.05 for each testing period;however, the Type I error of no less than 0.01 for individual well comparisons must be maintained. This performance standard does not apply to control charts, tolerance intervals, or prediction intervals. 
3. If a control chart approach is used to evaluate ground-water monitoring data, the specific type of control chart and its associated parameters shall be proposed by the owner or operator and approved by the Regional Administrator if he or she finds it to be protective of human health and the environment.

4. If a tolerance interval or a prediction interval is used to evaluate ground-water monitoring data, then the levels of confidence shall be proposed; in addition, for tolerance intervals, the proportion of the population that the interval must contain (with the proposed confidence) shall be proposed by the owner or operator and approved by the Regional Administrator if he or she finds these parameters to be protective of human health and the environment. These parameters will be determined after considering the number of samples in the background data base, the distribution of the data, and the range of the concentration values for each constituent of concern.

5. The statistical method will include procedures for handling data below the limit of detection with one or more procedures that are protective of human health and the environment. Any practical quantitation limit (PQL) approved by the Regional Administrator under §264.97(h) that is used in the statistical method shall be the lowest concentration level that can be reliably achieved within specified limits of precision and accuracy during routine laboratory operating conditions available to the facility.

6. If necessary, the statistical method shall include procedures to control or correct for seasonal and spatial variability as well as temporal correlation in the data.
In referring to "statistical methods," EPA means to emphasize that the concept of "statistical significance" must be reflected in several aspects of the monitoring program. This involves not only the choice of a level of significance, but also the choice of a statistical test, the sampling requirements, the number of samples, and the frequency of sampling. Since all of these parameters interact to determine the ability of the procedure to detect contamination, the statistical methods, like a comprehensive ground-water monitoring program, must be evaluated in their entirety, not by individual components. Thus a systems approach to ground-water monitoring is endorsed.

The second performance standard requires further comment. For individual well comparisons in which an individual compliance well is compared to background, the Type I error level shall be no less than 1% (0.01) for each testing period. In other words, the probability of the test resulting in a false positive is no less than 1 in 100. EPA believes that this significance level is sufficient in limiting the false positive rate while at the same time controlling the false negative (missed detection) rate.

Owners and operators of facilities that have an extensive network of ground-water monitoring wells may find it more practical to use a multiple well comparisons procedure. Multiple comparisons procedures control the experimentwise error rate for comparisons involving multiple upgradient and downgradient wells. If this method is used, the Type I experimentwise error rate for each constituent shall be no less than 5% (0.05) for each testing period.
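The interaction between the individual and experimentwise error levels described above can be illustrated with a short sketch. It assumes a Bonferroni-style allocation, in which the experimentwise level is divided evenly among the compliance wells, and then applies the 0.01 floor for individual comparisons. The function names and the example well counts are hypothetical and are not part of the regulation.

```python
def per_comparison_alpha(n_wells, experimentwise=0.05, floor=0.01):
    """Split the experimentwise Type I error evenly across wells
    (Bonferroni-style assumption), but never let the individual-well
    level drop below the floor."""
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    return max(experimentwise / n_wells, floor)


def experimentwise_bound(n_wells, alpha_each):
    """Bonferroni upper bound on the chance of at least one false positive."""
    return min(1.0, n_wells * alpha_each)


# Hypothetical numbers of compliance wells compared with background.
for wells in (3, 5, 10):
    a = per_comparison_alpha(wells)
    print(wells, round(a, 4), round(experimentwise_bound(wells, a), 4))
```

Under these assumptions, the 0.01 floor governs once more than five wells are compared, so the bound on the experimentwise rate rises above 0.05 as wells are added.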
In using a multiple well comparisons procedure, if the owner or operator chooses to use a t-statistic rather than an F-statistic, the individual well Type I error level must be maintained at no less than 1% (0.01). This provision should be considered if a facility owner or operator wishes to use a procedure that distributes the risk of a false positive evenly throughout all monitoring wells (e.g., Bonferroni t-test).

Setting these levels of significance at 1% and 5%, respectively, raises an important question in how the false positive rate will be controlled at facilities with a large number of ground-water monitoring wells and monitoring constituents. The Agency set these levels of significance on the basis of a single testing period and not on the entire operating life of the facility. Further, large facilities can reduce the false positive rate by implementing a unit-specific monitoring approach. Data from uncontaminated upgradient wells can be pooled and treated as one group. This will not only reduce the number of comparisons in a multiple well comparisons procedure but will also take into account spatial heterogeneities that may affect background ground-water quality. If the overall F-test is significant, then testing of the contrasts between the mean of each compliance well concentration and the mean background concentration must be performed for each constituent. This will identify the monitoring wells that are out of compliance. The Type I error level for the individual comparisons shall be no less than 0.01. Nonetheless, it is evident that facilities with an extensive number of ground-water monitoring wells which are monitored for many constituents may still generate a large number of comparisons during each testing period.
In these particular situations, a determination of whether a release from a facility has occurred may require the Regional Administrator to evaluate the site hydrogeology, geochemistry, climatic factors, and other environmental parameters to determine if a statistically significant result is indicative of an actual release from the facility. In making this determination, the Regional Administrator may note the relative magnitude of the concentration of the constituent(s). If the exceedance is based on an observed compliance well value that is of the same relative magnitude as the PQL (practical quantitation limit) or the background concentration level, then a false positive may have occurred, and further sampling and testing may be appropriate. If, however, the background concentration level or an action level is substantially exceeded, then the exceedance is more likely to be indicative of a release from the facility.

2.4 BASIC STATISTICAL METHODS AND SAMPLING PROCEDURES

The October 11, 1988 rule specifies five types of statistical methods to detect contamination in ground water. EPA believes that at least one of these types of procedures will be appropriate for virtually all facilities. To address situations where these methods may not be appropriate, EPA has included a provision for the owner or operator to select an alternate method which is subject to approval by the Regional Administrator.

2.4.1 The Five Statistical Methods Outlined in the October 11, 1988 Final Rule

1. A parametric analysis of variance (ANOVA) followed by multiple comparison procedures to identify specific sources of difference. The procedures will include estimation and testing of the contrasts between the mean of each compliance well and the background mean for each constituent.

2. An analysis of variance (ANOVA) based on ranks followed by multiple comparison procedures to identify specific sources of difference. The procedure will include estimation and testing of the contrasts between the median of each compliance well and the median background levels for each constituent.

3. A procedure in which a tolerance interval or a prediction interval for each constituent is established from the background data, and the level of each constituent in each compliance well is compared to its upper tolerance or prediction limit.

4. A control chart approach which will give control limits for each constituent. If any compliance well has a value or a sequence of values that lie outside the control limits for that constituent, it may constitute statistically significant evidence of contamination.

5. Another statistical method submitted by the owner or operator and approved by the Regional Administrator.

A summary of these statistical methods and their applicability is presented in Table 2-1. The table lists types of comparisons and the recommended procedure and refers the reader to the appropriate sections where a discussion and example can be found.
TABLE 2-1. SUMMARY OF STATISTICAL METHODS

COMPOUND                      TYPE OF COMPARISON              RECOMMENDED METHOD                        SECTION OF GUIDANCE DOCUMENT
ANY COMPOUND IN BACKGROUND    BACKGROUND VS COMPLIANCE WELL   ANOVA                                     5.2
                                                              TOLERANCE LIMITS                          5.3
                                                              PREDICTION INTERVALS                      5.4
                              INTRA-WELL                      CONTROL CHARTS                            7
ACL/MCL SPECIFIC              FIXED STANDARD                  CONFIDENCE INTERVALS                      6.2.1
                                                              TOLERANCE LIMITS                          6.2.2
SYNTHETIC                     MANY NONDETECTS IN DATA SET     SEE BELOW DETECTION LIMIT TABLE 8-1       8.1

EPA is specifying multiple statistical methods and sampling procedures and has allowed for alternatives because no one method or procedure is appropriate for all circumstances. EPA believes that the suggested methods and procedures are appropriate for the site-specific design and analysis of data from ground-water monitoring systems and that they can account for more of the site-specific factors than Cochran's Approximation to the Behrens-Fisher Student's t-test (CABF) and the accompanying sampling procedures in the past regulations. The statistical methods specified here address the multiple comparison problems and provide for documenting and accounting for sources of natural variation. EPA believes that the specified statistical methods and procedures consider and control for natural temporal and spatial variation.

2.4.2 Site-Specific Considerations for Sampling

The decision on the number of wells needed in a monitoring system will be made on a site-specific basis by the Regional Administrator and will consider the statistical method being used, the site hydrogeology, the fate and transport characteristics of potential contaminants, and the sampling procedure.
The number of wells must be sufficient to ensure a high probability of detecting contamination when it is present. To determine which sampling procedure should be used, the owner or operator shall consider existing data and site characteristics, including the possibility of trends and seasonality. These sampling procedures are:

1. Obtain a sequence of at least four samples taken at an interval that ensures, to the greatest extent technically feasible, that an independent sample is obtained, by reference to the uppermost aquifer's effective porosity, hydraulic conductivity, and hydraulic gradient, and the fate and transport characteristics of potential contaminants. The sampling interval that is proposed must be approved by the Regional Administrator.

2. An alternate sampling procedure proposed by the owner or operator and approved by the Regional Administrator if he or she finds it to be protective of human health and the environment.

EPA believes that the above sampling procedures will allow the use of statistical methods that will accurately detect contamination. These sampling procedures may be used to replace the sampling method present in the former Subpart F regulations. Rather than taking a single ground-water sample and dividing it into four replicate samples, a sequence of at least four samples taken at intervals far enough apart in time (daily, weekly, or monthly, depending on rates of ground-water flow and contaminant fate and transport characteristics) will help ensure the sampling of a discrete portion (i.e., an independent sample) of ground water. In hydrogeologic environments where the ground-water velocity prohibits one from obtaining four independent samples on a semiannual basis, an alternate sampling procedure approved by the Regional Administrator may be utilized [40 CFR §264.97(g)(1) and (2)].
The Regional Administrator shall approve an appropriate sampling procedure and interval submitted by the owner or operator after considering the effective porosity, hydraulic conductivity, and hydraulic gradient in the uppermost aquifer under the waste management area, and the fate and transport characteristics of potential contaminants. Most of this information is already required to be submitted in the facility's Part B permit application under §270.14(c) and may be used by the owner or operator to make this determination. Further, the number and kinds of samples collected to establish background concentration levels should be appropriate to the form of statistical test employed, following generally accepted statistical principles [40 CFR §264.97(g)]. For example, the use of control charts presumes a well-defined background of at least eight samples per well. By contrast, ANOVA alternatives might require only four samples per well.

It seems likely that most facilities will be sampling monthly over four consecutive months, twice a year. In order to maintain a complete annual record of ground-water data, the facility owner or operator may find it desirable to obtain a sample each month of the year. This will help identify seasonal trends in the data and permit evaluation of the effects of autocorrelation and seasonal variation if present in the samples.

The concentrations of a constituent determined in these samples are intended to be used in one-point-in-time comparisons between background and compliance wells. This approach will help reduce the components of seasonal variation by providing for simultaneous comparisons between background and compliance well information.
The flexibility for establishing sampling intervals was chosen to allow for the unique nature of the hydrogeologic systems beneath hazardous waste sites. This sampling scheme will give proper consideration to the temporal variation of and autocorrelation among the ground-water constituents. The specified procedure requires sampling data from background wells, at the compliance point, and according to a specific test protocol. The owner or operator should use a background value determined from data collected under this scenario if a test approved by the Regional Administrator requires it or if a concentration limit in compliance monitoring is to be based upon background data.

EPA recognizes that there may be situations where the owner or operator can devise alternate statistical methods and sampling procedures that are more appropriate to the facility and that will provide reliable results. Therefore, today's regulations allow the Regional Administrator to approve such procedures if he or she finds that the procedures balance the risk of false positives and false negatives in a manner comparable to that provided by the above specified tests and that they meet specified performance standards [40 CFR §264.97(g)]. In examining the comparability of the procedure to provide a reasonable balance between the risk of false positives and false negatives, the owner or operator will specify in the alternate plan such parameters as sampling frequency and sample size.
2.4.3 The "Reasonable Confidence" Requirement

The methods indicate that the procedure must provide reasonable confidence that the migration of hazardous constituents from a regulated unit into and through the aquifer will be detected. (The reference to hazardous constituents does not mean that this option applies only to compliance monitoring; the procedure also applies to monitoring parameters and constituents in the detection monitoring program since they are surrogates indicating the presence of hazardous constituents.) The protocols for the specific tests, however, will be used as a general benchmark to define "reasonable confidence" in the proposed procedure. If the owner or operator shows that his or her suggested test is comparable in its results to one of the specified tests, then it is likely to be acceptable under the "reasonable confidence" test. There may be situations, however, where it will be difficult to directly compare the performance of an alternate test to the protocols for the specified tests. In such cases the alternate test will have to be evaluated on its own merits.

2.4.4 Implementation

Owners and operators currently operating under a RCRA permit and employing the CABF procedure may change this procedure to a more appropriate procedure at the time of State or Regional permit review and update. Of course, these owners and operators may also apply for a permit modification under §270.41(a)(3). This change is considered a Class 1 permit modification. Class 1 permit modifications are technical in nature and generally of limited interest to the public. Class 1 modifications may be made with prior approval from the Director. The reader is referred to 53 FR 37912, September 28, 1988 for more details about the permit modification process.
Under appropriate circumstances, the owner or operator may wish to continue using the CABF procedure. This would involve a facility that has comparatively few monitoring wells (e.g., fewer than five) and monitors for only a limited number of chemical parameters and hazardous constituents (e.g., fewer than four). In this case, fewer than 20 comparisons would be made each testing period, and performing the CABF procedure at the 0.05 level of significance may result in no more than one false positive each testing period. The owner or operator should consider a similar evaluation when deciding the adequacy of the CABF procedure for his or her facility. Likewise, the owner or operator should also continually update the background concentrations in upgradient monitoring wells and simultaneously compare aggregate upgradient well data (background wells) to downgradient well data (compliance wells). This practice will help reduce the component of temporal variability associated with the CABF procedure. Further, efforts should be made to obtain independent samples from the monitoring wells. Section 3 of the guidance addresses how one might accomplish this task. If situations permit, the replicate sampling procedure should be avoided. Replicate samples provide information about analytical variability and accuracy. The goal of all RCRA ground-water sampling programs should be to provide data about the hydrogeochemical variability in the aquifers below the hazardous waste facility. Obtaining independent samples when possible will help reduce the effects of autocorrelation.

In all cases any statistical method or sampling procedure must be approved by the Regional Administrator or State Director. Changing from one statistical method or sampling procedure to another may be done at the time of Regional or State permit review and update, or at any time a Class 1 permit modification is approved (see 53 FR 37912, September 28, 1988).
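The comparison-count reasoning used above for the CABF procedure can be checked with a few lines of arithmetic. The sketch below assumes that every well/constituent pair is tested separately at the same significance level; the function name and the facility sizes are hypothetical.

```python
def expected_false_positives(n_wells, n_constituents, alpha=0.05):
    """Return (number of comparisons, expected false positives) per testing
    period, assuming each well/constituent pair is tested at level alpha."""
    comparisons = n_wells * n_constituents
    return comparisons, comparisons * alpha


# Hypothetical small facility: 4 wells, 3 constituents.
print(expected_false_positives(4, 3))
# Hypothetical larger facility: 20 wells, 10 constituents.
print(expected_false_positives(20, 10))
```

For the larger hypothetical facility, 200 comparisons at the 0.05 level correspond to an expected count of about 10 false positives per testing period, which illustrates why the CABF procedure is suggested only for small monitoring networks.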
SECTION 3

CHOOSING A SAMPLING INTERVAL

This section discusses the important hydrogeologic parameters to consider when choosing a sampling interval. The Darcy equation is used to determine the horizontal component of the average linear velocity of ground water for confined, semiconfined, and unconfined aquifers. This value provides a good estimate of time of travel for most soluble constituents in ground water, and can be used to determine a sampling interval. Example calculations are provided at the end of the section to further assist the reader. Alternative methods must be employed to determine a sampling interval in hydrogeologic environments where Darcy's law is invalid. Karst, cavernous basalt, fractured rocks, and other "pseudo-karst" terranes usually require specialized monitoring approaches.

Section 264.97(g) of 40 CFR Part 264 Subpart F provides the owner or operator of a RCRA facility with a flexible sampling schedule that will allow him or her to choose a sampling procedure that will reflect site-specific concerns. This section specifies that the owner or operator shall, on a semiannual basis, obtain a sequence of at least four samples from each well, based on an interval that is determined after evaluating the uppermost aquifer's effective porosity, hydraulic conductivity, and hydraulic gradient, and the fate and transport characteristics of potential contaminants. The intent of this provision is to set a sampling frequency that allows sufficient time to pass between sampling events to ensure, to the greatest extent technically feasible, that an independent ground-water sample is taken from each well. For further information on ground-water sampling, refer to the EPA "Practical Guide for Ground-Water Sampling," Barcelona et al., 1985.
The sampling frequency of the four semiannual sampling events required in Part 264 Subpart F can be based on estimates using the average linear velocity of ground water. Two forms of the Darcy equation stated below relate ground-water velocity (V) to effective porosity (Ne), hydraulic gradient (i), and hydraulic conductivity (K):

Vh = (Kh * i)/Ne and Vv = (Kv * i)/Ne

where Vh and Vv are the horizontal and vertical components of the average linear velocity of ground water, respectively; Kh and Kv are the horizontal and vertical components of hydraulic conductivity; i is the head gradient; and Ne is the effective porosity. In applying these equations to ground-water monitoring, the horizontal component of the average linear velocity (Vh) can be used to determine an appropriate sampling interval. Usually, field investigations will yield bulk values for hydraulic conductivity. In most cases, the bulk hydraulic conductivity determined by a pump test, tracer test, or a slug test will be sufficient for these calculations. The vertical component of the average linear velocity of ground water (Vv) may be considered in estimating flow velocities in areas with significant components of vertical velocity such as recharge and discharge zones.
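As a minimal sketch, the two forms of the Darcy equation above reduce to a single function applied once with Kh and once with Kv. The function name and the site values below are hypothetical and chosen only for illustration; note that Ne is entered as a fraction rather than a percentage.

```python
def average_linear_velocity(K, i, Ne):
    """Darcy average linear velocity V = (K * i) / Ne.

    K  : hydraulic conductivity (e.g. ft/day), horizontal or vertical component
    i  : hydraulic (head) gradient, dimensionless (ft/ft)
    Ne : effective porosity as a fraction (0.15 for 15%), not a percentage
    """
    if not 0 < Ne <= 1:
        raise ValueError("Ne must be a fraction in (0, 1]")
    return K * i / Ne


# Hypothetical site values, for illustration only.
Vh = average_linear_velocity(K=10.0, i=0.005, Ne=0.20)  # horizontal, ft/day
Vv = average_linear_velocity(K=1.0, i=0.005, Ne=0.20)   # vertical, ft/day
print(f"Vh = {Vh:.3f} ft/day, Vv = {Vv:.3f} ft/day")
```

The result carries the units of K, so a conductivity in ft/day yields a velocity in ft/day.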
To apply the Darcy equation to ground-water monitoring, one needs to determine the parameters K, i, and Ne. The hydraulic conductivity, K, is the volume of water at the existing kinematic viscosity that will move in unit time under a unit hydraulic gradient through a unit area measured at right angles to the direction of flow. The reference to "existing kinematic viscosity" relates to the fact that hydraulic conductivity is not only determined by the media (aquifer), but also by fluid properties (ground water or potential contaminants). Thus, it is possible to have several hydraulic conductivity values for many different chemical substances that are present in the same aquifer. In either case it is advisable to use the greatest value for velocity that is calculated using the Darcy equation to determine sampling intervals. This will provide for the earliest detection of a leak from a hazardous waste facility and expeditious remedial action procedures. A range of hydraulic conductivities (the transmitted fluid is water) for various aquifer materials is given in Figures 3-1 and 3-2. The conductivities are given in several units. Figure 3-3 lists conversion factors to change between various permeability and hydraulic conductivity units.

The hydraulic gradient, i, is the change in hydraulic head per unit of distance in a given direction. It can be determined by dividing the difference in head between two points on a potentiometric surface map by the orthogonal distance between those two points (see example calculation). Water level measurements are normally used to determine the natural hydraulic gradient at a facility. However, the effects of mounding in the event of a leak from a waste disposal facility may produce a steeper local hydraulic gradient in the vicinity of the monitoring well. These local changes in hydraulic gradient should be accounted for in the velocity calculations.
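The hydraulic gradient calculation described above amounts to a single division. The sketch below uses hypothetical piezometer readings; the function name and the numbers are illustrative assumptions and are not taken from the guidance.

```python
def hydraulic_gradient(head_up, head_down, distance):
    """Change in hydraulic head per unit distance along the flow direction.

    Heads and distance must share the same length unit (e.g. ft), and the
    distance is measured orthogonally between the two head contours.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return (head_up - head_down) / distance


# Hypothetical piezometer readings: 100.0 ft and 99.4 ft of head, 200 ft apart.
i = hydraulic_gradient(100.0, 99.4, 200.0)
print(f"i = {i:.4f} ft/ft")
```

If mounding near a leaking unit is suspected, the same function can be applied to heads measured close to the monitoring well to obtain the steeper local gradient.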
The effective porosity, Ne, is the ratio, usually expressed as a percentage, of the total volume of voids available for fluid transmission to the total volume of the porous medium dewatered. It can be estimated during a pump test by dividing the volume of water removed from an aquifer by the total volume of aquifer dewatered (see example calculation). Table 3-1 presents approximate effective porosity values for a variety of aquifer materials. In cases where the effective porosity is unknown, specific yield may be substituted into the equation. Specific yields of selected rock units are given in Table 3-2. In the absence of measured values, drainable porosity is often used to approximate effective porosity. Figure 3-4 illustrates representative values of drainable porosity and total porosity as a function of aquifer particle size. If available, field measurements of effective porosity are preferred.

Figure 3-1. Hydraulic conductivity of selected rocks. (Chart of hydraulic conductivity ranges on logarithmic scales in m/day, ft/day, and gal/day-ft^2 for: igneous and metamorphic rocks, unfractured and fractured; basalt, unfractured, fractured, and lava flow; sandstone, fractured and semiconsolidated; shale, unfractured and fractured; carbonate rocks, fractured and cavernous; clay; silt, loess; silty sand; glacial till; clean sand, fine and coarse; gravel.)

Source: Heath, R. C. 1983. Basic Ground-Water Hydrology. U.S. Geological Survey Water Supply Paper 2220, 84 pp.

Figure 3-3. Conversion factors for permeability and hydraulic conductivity units.

* To obtain k in ft^2, multiply k in cm^2 by 1.08 x 10^-3.

Source: Freeze, R. A., and J. A. Cherry. 1979. Ground Water. Prentice Hall, Inc., Englewood Cliffs, New Jersey. p. 29.
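The pump-test estimate of effective porosity described above (volume of water removed divided by volume of aquifer dewatered) can be sketched as follows. The sketch assumes, as Example Calculation No. 1 in Section 3.1 does, that the dewatered zone is a cone with radius r and height h, and it reuses the pump-test figures from that example. The function name is hypothetical.

```python
import math

GAL_PER_FT3 = 7.48  # gallons per cubic foot


def effective_porosity_pct(pump_rate_gpm, minutes, radius_ft, drawdown_ft):
    """Ne (%) = 100 * volume of water removed / volume of aquifer dewatered.

    Assumption: the dewatered zone is a cone of radius radius_ft and
    height drawdown_ft, so its volume is (1/3) * pi * r**2 * h.
    """
    removed_gal = pump_rate_gpm * minutes
    dewatered_ft3 = math.pi * radius_ft**2 * drawdown_ft / 3.0
    dewatered_gal = dewatered_ft3 * GAL_PER_FT3
    return 100.0 * removed_gal / dewatered_gal


# 50 gal/min for 30 min, r = 18 ft, h = 3 ft.
print(round(effective_porosity_pct(50, 30, 18, 3), 1))
```

A field-measured value obtained this way is preferred over the default values in Table 3-1 whenever a pump test is available.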
TABLE 3-1. DEFAULT VALUES FOR EFFECTIVE POROSITY (Ne) FOR USE IN TIME OF TRAVEL (TOT) ANALYSES

Soil textural classes                                    Effective porosity of saturation (a)

Unified soil classification system
  GS, GP, GM, GC, SW, SP, SM, SC                         0.20 (20%)
  ML, MH                                                 0.15 (15%)
  CL, OL, CH, OH, PT                                     0.01 (1%) (b)

USDA soil textural classes
  Clays, silty clays, sandy clays                        0.01 (1%) (b)
  Silts, silt loams, silty clay loams                    0.10 (10%)
  All others                                             0.20 (20%)

Rock units (all)
  Porous media (nonfractured rocks such as
  sandstone and some carbonates)                         0.15 (15%)
  Fractured rocks (most carbonates, shales,
  granites, etc.)                                        0.0001 (0.01%)

Source: Barari, A., and L. S. Hedges. 1985. Movement of Water in Glacial Till. Proceedings of the 17th International Congress of the International Association of Hydrogeologists, pp. 129-134.

(a) These values are estimates and there may be differences between similar units. For example, recent studies indicate that weathered and unweathered glacial till may have markedly different effective porosities (Barari and Hedges, 1985; Bradbury et al., 1985).

(b) Assumes de minimis secondary porosity. If fractures or soil structure are present, effective porosity should be 0.001 (0.1%).

TABLE 3-2. SPECIFIC YIELD VALUES FOR SELECTED ROCK TYPES

Rock type                        Specific yield (%)
Clay                             2
Sand                             22
Gravel                           19
Limestone                        18
Sandstone (semiconsolidated)     6
Granite                          0.09
Basalt (young)                   8

Source: Heath, R. C. 1983. Basic Ground-Water Hydrology. U.S. Geological Survey Water Supply Paper 2220, 84 pp.

Figure 3-4. Total porosity and drainable porosity for typical geologic materials. (Plot of porosity, in percent from 0 to 50, against maximum 10% grain size, in millimeters from 1/16 to 256.)

Source: Todd, D. K. 1980. Ground Water Hydrology. John Wiley and Sons, New York. 534 pp.

Once the values for K, i, and Ne are determined, the horizontal component of the average linear velocity of ground water can be calculated. Using the Darcy equation, we can determine the time required for ground water to pass through the complete monitoring well diameter by dividing the monitoring well diameter by the horizontal component of the average linear velocity of ground water. (If considerable exchange of water occurs during well purging, the diameter of the filter pack may be used rather than the monitoring well diameter.) This value will represent the minimum time interval required between sampling events that will yield an independent ground-water sample. (Three-dimensional mixing of ground water in the vicinity of the monitoring well will occur when the well is purged before sampling, which is one reason why this method only provides an estimation of travel time.)
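The minimum-interval calculation just described (well diameter divided by horizontal velocity) can be sketched as below. The function name is hypothetical; the inputs are the 4-in well diameter and 0.3 ft/day velocity that appear in Example Calculation No. 3 later in this section.

```python
def min_sampling_interval_days(diameter_in, velocity_ft_per_day):
    """Time for ground water to cross the well (or filter pack) diameter.

    diameter_in         : monitoring well or filter pack diameter, inches
    velocity_ft_per_day : horizontal average linear velocity Vh, ft/day
    """
    if velocity_ft_per_day <= 0:
        raise ValueError("velocity must be positive")
    velocity_in_per_day = velocity_ft_per_day * 12.0  # ft/day -> in/day
    return diameter_in / velocity_in_per_day


# 4-in well with Vh = 0.3 ft/day.
print(round(min_sampling_interval_days(4.0, 0.3), 1))
```

The value returned is a minimum only. Passing the filter pack diameter instead of the well diameter lengthens the interval, which is the adjustment suggested when considerable water exchange occurs during purging.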
In determining these sampling intervals, one should note that many chemical compounds will not travel at the same velocity as ground water. Chemical characteristics such as adsorptive potential, specific gravity, and molecular size will influence the way chemicals travel in the subsurface. Large molecules, for example, will tend to travel slower than the average linear velocity of ground water because of matrix interactions. Compounds that exhibit a strong adsorptive potential will undergo a similar fate that will dramatically change time of travel predictions using the Darcy equation. In some cases chemical interaction with the matrix material will alter the matrix structure and its associated hydraulic conductivity, which may result in an increase in contaminant mobility. This effect has been observed with certain organic solvents in clay units (see Brown and Andersen, 1981). Contaminant fate and transport models may be useful in determining the influence of these effects on movement in the subsurface. A variety of these models are available on the commercial market for private use.

3.1 EXAMPLE CALCULATIONS

EXAMPLE CALCULATION NO. 1: DETERMINING THE EFFECTIVE POROSITY (Ne)

The effective porosity, Ne, expressed in %, can be determined during a pump test using the following method:

Ne = 100% x volume of water removed/volume of aquifer dewatered

Based on a pumping rate of 50 gal/min and a pumping duration of 30 min, compute the volume of water removed as:

50 gal/min x 30 min = 1,500 gal

To calculate the volume of aquifer dewatered, use the formula:

V = (1/3) x pi x r^2 x h

where r is the radius (ft) of the area affected by pumping and h (ft) is the drop in the water level. If, for example, h = 3 ft and r = 18 ft, then:

V = (1/3) x 3.14 x 18^2 x 3 = 1,018 ft^3

Next, converting ft^3 of water to gallons of water,

V = (1,018 ft^3)(7.48 gal/ft^3) = 7,615 gal

Substituting the two volumes in the equation for the effective porosity, obtain

Ne = 100% x 1,500/7,615 = 19.7%

EXAMPLE CALCULATION NO. 2: DETERMINING THE HYDRAULIC GRADIENT (i)

Using the values given in Figure 3-5, obtain the hydraulic gradient by dividing the difference in head between the two piezometers by the orthogonal distance between them.

Figure 3-5. Potentiometric surface map for computation of hydraulic gradient.

This method provides only a very general estimate of the natural hydraulic gradient that exists in the vicinity of the two piezometers. Chemical gradients are known to exist and may override the effects of the hydraulic gradient. A detailed study of the effects of multiple chemical contaminants may be necessary to determine the actual average linear velocity (horizontal component) of ground water in the vicinity of the monitoring wells.

EXAMPLE CALCULATION NO. 3: DETERMINING THE HORIZONTAL COMPONENT OF THE AVERAGE LINEAR VELOCITY OF GROUND WATER (Vh)

A land disposal facility has ground-water monitoring wells that are screened in an unconfined silty sand aquifer. Slug tests, pump tests, and tracer tests conducted during a hydrogeologic site investigation have revealed that the aquifer has a horizontal hydraulic conductivity (Kh) of 15 ft/day and an effective porosity (Ne) of 15%. Using a potentiometric map (as in example 2), the hydraulic gradient (i) has been determined to be 0.003 ft/ft.

To estimate the minimum time interval between sampling events that will allow one to obtain an independent sample of ground water, proceed as follows. Calculate the horizontal component of the average linear velocity of ground water (Vh) using the Darcy equation, Vh = (Kh * i)/Ne. With Kh = 15 ft/day, Ne = 15%, and i = 0.003 ft/ft, calculate

Vh = (15)(0.003)/(15%) = 0.3 ft/day,

or equivalently

Vh = (0.3 ft/day)(12 in/ft) = 3.6 in/day

Discussion: The horizontal component of the average linear velocity of ground water, Vh, has been calculated and is equal to 3.6 in/day.
Monitoring well diameters at this particular facility are 4 in. We can determine the minimum time interval between sampling events that will allow one to obtain an independent sample of ground water by dividing the monitoring well diameter by the horizontal component of the average linear velocity of ground water:

Minimum time interval = (4 in)/(3.6 in/day) = 1.1 days

Based on the above calculations, the owner or operator could sample every other day. However, because the velocity can vary with recharge rates seasonally, a weekly sampling interval would be advised.

Suggested Sampling Interval

Date        Obtain Sample No.
June 1      1
June 8      2
June 15     3
June 22     4

Table 3-3 gives some results for common situations.

TABLE 3-3. DETERMINING A SAMPLING INTERVAL

UNIT            Kh (ft/day)    Ne (%)    Vh (in/mo)      SAMPLING INTERVAL
GRAVEL          10^4           19        9.6 x 10^4      DAILY
SAND            10^2           22        8.3 x 10^2      DAILY
SILTY SAND      10             14        1.3 x 10^2      WEEKLY
TILL            10^-3          2         9.1 x 10^-2     MONTHLY *
SS (SEMICON)    1              6         30              WEEKLY
BASALT          10^-1          8         2.28            MONTHLY *

The horizontal component of the average linear velocities is based on a hydraulic gradient, i, of 0.005 ft/ft.

* Use a monthly sampling interval or an alternate sampling procedure.

3.2 FLOW THROUGH KARST AND "PSEUDO-KARST" TERRANES

The Darcy equation is not valid in turbulent and nonlinear laminar flow regimes. Examples of these particular hydrogeological environments include karst and "pseudo-karst" (e.g., cavernous basalts and extensively fractured rocks) terranes. Specialized methods have been investigated by Quinlan (1989) for developing alternative monitoring procedures for karst and "pseudo-karst" terranes. Dye tracing as described by Quinlan (1989) and Mull et al.
(1988) is useful for identifying flow paths and travel times in karst and "pseudo-karst" terranes. Conventional ground-water monitoring wells in these environments are often of little value in designing an effective monitoring system. Field investigations are necessary to locate seeps and springs, which may serve as better "monitoring wells" for identifying releases of hazardous constituents into ground water and surface water.

SECTION 4

CHOOSING A STATISTICAL METHOD

This section discusses the choice of an appropriate statistical method. Section 4.1 includes a flowchart to guide this selection. Section 4.2 contains procedures to test the distributional assumptions of statistical methods and Section 4.3 has procedures to test specifically for equality of variances.

The choice of an appropriate statistical test depends on the type of monitoring and the nature of the data. The proportion of values in the data set that are below detection is one important consideration. If most of the values are below detection, a test of proportions is suggested.

One set of statistical procedures is suggested when the monitoring consists of comparisons of water sample data from the background (hydraulically upgradient) well with the sample data from compliance (hydraulically downgradient) wells. The recommended approach is analysis of variance (ANOVA). Also, for a facility with limited amounts of data, it is advisable to initially use the ANOVA method of data evaluation, and later, when sufficient amounts of data are collected, to change to a tolerance interval or a control chart approach for each compliance well. However, alternate approaches are allowed. These include adjustments for seasonality, use of tolerance intervals, and use of prediction intervals. These methods are discussed in Section 5.
When the monitoring objective is to compare the concentration of a hazardous constituent to a fixed level such as a maximum concentration limit (MCL), a different type of approach is needed. This type of comparison commonly serves as a basis of compliance monitoring. Control charts may be used, as may tolerance or confidence intervals. Methods for comparison with a fixed level are presented in Section 6.

When a long history of data from each well is available, intra-well comparisons are appropriate. That is, the data from a single uncontaminated well are compared over time to detect shifts in concentration, or gradual trends in concentration that may indicate contamination. Methods for this situation are presented in Section 7.

4.1 FLOWCHARTS--OVERVIEW AND USE

The selection and use of a statistical procedure for ground-water monitoring is a detailed process. Because a single flowchart would become too complicated for easy use, a series of flowcharts has been developed. These flowcharts are found at the beginning of each section and are intended to guide the user in the selection and use of procedures in that section. The more detailed flowcharts can be thought of as attaching to the general flowcharts at the indicated points.

Three general types of statistical procedures are presented in the flowchart overview (Figure 4-1): (1) background well to compliance well data comparisons; (2) comparison of compliance well data with a constant limit such as an alternate concentration limit (ACL) or a maximum concentration limit (MCL); and (3) intra-well comparisons. The first question to be asked in determining the appropriate statistical procedure is the type of monitoring program specified in the facility permit. The type of monitoring program may determine if the appropriate comparison is among wells, comparison of downgradient well data to a constant, intra-well comparisons, or a special case.
If the facility is in detection monitoring, the appropriate comparison is between wells that are hydraulically upgradient from the facility and those that are hydraulically downgradient. The statistical procedures for this type of monitoring are presented in Section 5. In detection monitoring, it is likely that many of the monitored constituents will yield few quantified results (i.e., much of the data are below the limit of analytical detection). If this is the case, then the test of proportions (Section 8.1.3) may be recommended. If the constituent occurs in measurable concentrations in background, then analysis of variance (Section 5.2) is recommended. This method of analysis is preferred when the data lack sufficient quantity to allow for the use of tolerance intervals or control charts.

If the facility is in compliance monitoring, the permit will specify the type of compliance limit. If the compliance limit is determined from the background, the statistical method is chosen from those that compare background well to compliance well data. Statistical methods for this case are presented in Section 5. The preferred method is the appropriate analysis of variance method in Section 5.2, or if sufficient data permit, tolerance intervals or control charts. The flowchart in Section 5 aids in determining which method is applicable.

If a facility in compliance monitoring has a constant maximum concentration limit (MCL) or alternate concentration limit (ACL) specified, then the appropriate comparison is with a constant. Methods for comparison with MCLs or ACLs are presented in Section 6, which contains a flowchart to aid in determining which method to use.
Finally, when more than one year of data have been collected from each well, the facility owner or operator may find it useful to perform intra-well comparisons over time to supplement the other methods. This is not a regulatory requirement, but it could provide the facility owner or operator with information about the site hydrogeology. This method of analysis may be used when sufficient data from an individual uncontaminated well exist and the data allow for the identification of trends. A recommended control chart procedure (Starks, 1988) suggests that a minimum background sample of eight observations is needed. Thus an intra-well control chart approach could begin after the first complete year of data collection. These methods are presented in Section 7.

Figure 4-1. Flowchart overview.

4.2 CHECKING DISTRIBUTIONAL ASSUMPTIONS

The purpose of this section is to provide users with methods to check the distributional assumptions of the statistical procedures recommended for ground-water monitoring. It is emphasized that one need not do an extensive study of the distribution of the data unless a nonparametric method of analysis is used to evaluate the data. If the owner or operator wishes to transform the data in lieu of using a nonparametric method, it must first be shown that the untransformed data are inappropriate for a normal theory test. Similarly, if the owner or operator wishes to use nonparametric methods, he or she must demonstrate that the data do violate normality assumptions.

EPA has adopted this approach because most of the statistical procedures that meet the criteria set forth in the regulations are robust with respect to departures from many of the normal distributional assumptions.
That is, only extreme violations of assumptions will result in an incorrect outcome of a statistical test. Moreover, it is only in situations where it is unclear whether contamination is present that departures from assumptions will alter the outcome of a statistical test. EPA therefore believes that it is protective of the environment to adopt the approach of not requiring testing of assumptions of a normal distribution on a wide scale.

It should be noted that the normal distributional assumptions for statistical procedures apply to the errors of the observations. Application of the distributional tests to the observations themselves may lead to the conclusion that the distribution does not fit the observations. In some cases this lack of fit may be due to differences in means for the different wells or some other cause. The tests for distributional assumptions are best applied to the residuals from a statistical analysis. A residual is the difference between the original observation and the value predicted by a model. For example, in analysis of variance, the predicted values are the group means and the residual is the difference between each observation and its group mean.

If the conclusion from testing the assumptions is that the assumptions are not adequately met, then a transformation of the data may be used or a nonparametric statistical procedure selected. Many types of concentration data have been reported in the literature to be adequately described by a lognormal distribution. That is, the natural logarithm of the original observations has been found to follow the normal distribution.
Consequently, if the normal distributional assumptions are found to be violated for the original data, a transformation by taking the natural logarithm of each observation is suggested. This assumes that the data are all positive. If the log transformation does not adequately normalize the data or stabilize the variance, one should use a nonparametric procedure or seek the consultation of a professional statistician to determine an appropriate statistical procedure.

The following sections present four selected approaches to check for normality. The first option refers to literature citation; the other three are statistical procedures. The choice is left to the user. The availability of statistical software and the user's familiarity with it will be a factor in the choice of a method. The coefficient-of-variation method, for example, requires only the computation of the mean and standard deviation of the data.

Plotting on probability paper can be done by hand but becomes tedious with many data sets. However, the commercial Statistical Analysis System (SAS) software package provides a computerized version of a probability plot in its PROC UNIVARIATE procedure. SYSTAT, a package for PCs, also has a probability plot procedure. The chi-squared test is not readily available through commercial software but can be programmed on a PC (for example in LOTUS 1-2-3) or in any other (statistical) software language with which the user is familiar. The amount of data available will also influence the choice. All tests of distributional assumptions require a fairly large sample size to detect moderate to small deviations from normality. The chi-squared test requires a minimum of 20 samples for a reasonable test.
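The two preparatory steps discussed in this section, forming residuals from group means and taking natural logarithms of positive data, can be sketched in Python as follows. This is a minimal illustration only: the function names, the well labels, and the concentration values are hypothetical and are not taken from the guidance.

```python
import math
from statistics import mean


def group_mean_residuals(groups):
    """Return residuals (observation minus its group mean) for every group.

    groups maps a well label to its list of observations, mirroring the
    analysis-of-variance case where the predicted value is the group mean.
    """
    residuals = []
    for observations in groups.values():
        group_mean = mean(observations)
        residuals.extend(x - group_mean for x in observations)
    return residuals


def log_transform(values):
    """Natural-log transform; only defined when all values are positive."""
    if any(x <= 0 for x in values):
        raise ValueError("log transformation assumes all data are positive")
    return [math.log(x) for x in values]


# Hypothetical concentrations for two wells (illustrative values only).
wells = {"background": [1.0, 2.0, 3.0], "compliance": [10.0, 20.0]}
residuals = group_mean_residuals(wells)
logged = {well: log_transform(obs) for well, obs in wells.items()}
```

The distributional checks in the following subsections would then be applied to `residuals` (or to residuals computed from the log-transformed data) rather than to the raw observations.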
Other statistical procedures are available for checking distributional assumptions. The more advanced user is referred to the Kolmogorov-Smirnov test (see, for example, Lindgren, 1976), which is used to test the hypothesis that data come from a specific (that is, completely specified) distribution. The normal distribution assumption can thus be tested. A minimum sample size of 50 is recommended for using this test. A modification to the Kolmogorov-Smirnov test has been developed by Lilliefors, who uses the sample mean and standard deviation from the data as the parameters of the distribution (Lilliefors, 1967). Again, a sample size of at least 50 is recommended.

Another alternative for testing normality is provided by the rather involved Shapiro-Wilk test. The interested user is referred to the relevant article in Biometrika by Shapiro and Wilk (1965).

4.2.1 Literature Citation

PURPOSE

An owner or operator may wish to consult literature to determine what type of distribution the ground-water monitoring data for a specific constituent are likely to follow. In cases where insufficient data prevent the use of a quantitative method for checking distributional assumptions, this approach may be necessary and may make it easier to determine whether there is statistically significant evidence of contamination.

PROCEDURE

One simple way to select a procedure based on a specific statistical distribution is by citing a relevant published reference. The owner or operator may find papers that discuss data resulting from sampling ground water and conclude that such data for a particular constituent follow a specified distribution. Citing such a reference may be sufficient justification for using a method based on that distribution, provided that the data do not show evidence that the assumptions are violated.
To justify the use of a literature citation, the owner or operator needs to make sure that the reference cited considers the distribution of data for the specific compound being monitored. In addition, he or she must evaluate the similarity of their site to the site that was discussed in the literature, especially similar hydrogeologic and potential contaminant characteristics. However, because many of the compounds may not be studied in the literature, extrapolations to compounds with similar chemical characteristics and to sites with similar hydrogeologic conditions are also acceptable. Basically, the owner or operator needs to provide some reason or justification for choosing a particular distribution.

4.2.2 Coefficient-of-Variation Test

Many statistical procedures assume that the data are normally distributed. The concentration of a hazardous constituent in ground water is inherently nonnegative, while the normal distribution allows for negative values. However, if the mean of the normal distribution is sufficiently above zero, the distribution places very little probability on negative observations and is still a valid approximation.

One simple check that can rule out use of the normal distribution is to calculate the coefficient of variation of the data. The use of this method was required by the former Part 264 Subpart F regulations pursuant to Section 264.97(h)(1). Because most owners and operators as well as Regional personnel are already familiar with this procedure, it will probably be used frequently. The coefficient of variation, CV, is the standard deviation of the observations divided by their mean. If the normal distribution is to be a valid model, there should be very little probability of negative values. The number of standard deviations by which the mean exceeds zero determines the probability of negative values.
For example, if the mean exceeds zero by one standard deviation, the normal distribution will have less than 0.159 probability of a negative observation.

Consequently, one can calculate the standard deviation of the observations, calculate the mean, and form the ratio of the standard deviation divided by the mean. If this ratio exceeds 1.00, there is evidence that the data are not normal and the normal distribution should not be used for those data. (There are other possibilities for nonnormality, but this is a simple check that can rule out obviously nonnormal data.)

PURPOSE

This test is a simple check for evidence of gross nonnormality in the ground-water monitoring data.

PROCEDURE

To apply the coefficient-of-variation check for normality, proceed as follows.

Step 1. Calculate the sample mean, X̄, of the n observations Xi, i = 1,...,n:

X̄ = (1/n) Σ Xi

Step 2. Calculate the sample standard deviation, S:*

S = sqrt[ Σ (Xi - X̄)² / (n - 1) ]

Step 3. Divide the sample standard deviation by the sample mean. This ratio is the CV.

Step 4. Determine if the result of Step 3 exceeds 1.00. If so, this is evidence that the normal distribution does not fit the data adequately.

EXAMPLE

Table 4-1 is an example data set of chlordane concentrations in 24 water samples from a fictitious site. The data are presented in order from least to greatest.

Applying the procedure steps to the data of Table 4-1, we have:

Step 1. X̄ = 1.52

Step 2. S = 1.56

Step 3. CV = 1.56/1.52 = 1.03

Step 4. Because the result of Step 3 was 1.03, which exceeds 1.00, we conclude that there is evidence that the data do not adequately follow the normal distribution. As will be discussed in other sections, one would then either transform the data, use a nonparametric procedure, or seek professional guidance.
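The four steps of the coefficient-of-variation check can be sketched in Python as below. This is a hedged illustration, not part of the guidance: the function names are hypothetical, and the concentration values are made up for the sketch because the Table 4-1 values are not reproduced here. The second function evaluates the probability of a negative value under a normal model whose mean and standard deviation match the sample, which is the reasoning behind the 1.00 cutoff.

```python
from statistics import NormalDist, mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation (n - 1 divisor) divided by sample mean."""
    sample_mean = mean(values)
    if sample_mean <= 0:
        raise ValueError("the CV check assumes a positive sample mean")
    return stdev(values) / sample_mean


def probability_negative(cv):
    """P(X < 0) for a normal model with sd/mean equal to cv, i.e. Phi(-1/cv)."""
    return NormalDist().cdf(-1.0 / cv)


# Hypothetical concentrations (ppm); NOT the Table 4-1 data.
concentrations = [0.2, 0.4, 0.5, 0.9, 1.1, 1.8, 2.6, 6.5]

cv = coefficient_of_variation(concentrations)
if cv > 1.00:
    print(f"CV = {cv:.2f} exceeds 1.00: evidence against normality")
else:
    print(f"CV = {cv:.2f} does not exceed 1.00: no gross nonnormality flagged")
```

A CV of exactly 1.00 corresponds to a mean one standard deviation above zero, for which `probability_negative(1.0)` is just under 0.159.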
*Throughout this document we use S² to denote the unbiased estimate of the population variance σ². We refer to this unbiased estimate of the population variance as the sample variance. The formula given in Step 2 above for S, the square root of the unbiased estimate of the population variance, is used as the sample estimate of the standard deviation and is referred to as the "sample standard deviation." Any computation of the sample standard deviation or the sample variance, unless explicitly noted otherwise, refers to these formulas. It should be noted that this estimate of the standard deviation is not unbiased in that its expected value is not equal to the population standard deviation. However, all of the statistical procedures have been developed using the formulas as we define them here.

TABLE 4-1. EXAMPLE DATA FOR COEFFICIENT-OF-VARIATION TEST

Chlordane concentration (ppm)
Dissolved phase
Immiscible phase

NOTE. The owner or operator may choose to use parametric tests since 1.03 is so close to the limit but should use a transformation or a nonparametric test if he or she believes that the parametric test results would be incorrect due to the departure from normality.

4.2.3 Plotting on Probability Paper

PURPOSE

Probability paper is a visual aid and diagnostic tool in determining whether a small set of data follows a normal distribution. Also, approximate estimates of the mean and standard deviation of the distribution can be read from the plot.

PROCEDURE

Let X be the variable and X1, X2,...,Xi,...,Xn the set of n observations. The values of X can be raw data, residuals, or transformed data.

Step 1. Rearrange the observations in ascending order: X(1), X(2),...,X(n).

Step 2. Compute the cumulative frequency for each distinct value X(i) as (i/(n+1)) x 100%. The divisor of (n+1) is a plotting convention to avoid cumulative frequencies of 100%, which would be at infinity on the probability paper.
If a value of X occurs more than once, then the corresponding value of i increases appropriately. For example, if X(2) = X(3), then the cumulative frequency for X(1) is 100*1/(n+1), but the cumulative frequency for X(2) or X(3) is 100*(1+2)/(n+1).

Step 3. Plot the distinct pairs [X(i), (i/(n+1)) x 100] on probability paper (this paper is commercially available) using an appropriate scale for X on the horizontal axis. The vertical axis for the cumulative frequencies is already scaled from 0.01 to 99.99%.

If the points fall roughly on a straight line (the line can be drawn with a ruler), then one can conclude that the underlying distribution is approximately normal. Also, an estimate of the mean and standard deviation can be made from the plot. The horizontal line drawn through 50% cuts the plotted line at the mean of the X values. The horizontal line going through 84% cuts the line at a value corresponding to the mean plus one standard deviation. By subtraction, one obtains the standard deviation.

REFERENCE

Dixon, W. J., and F. J. Massey, Jr. Introduction to Statistical Analysis. McGraw-Hill, Fourth Edition, 1983.

EXAMPLE

Table 4-2 lists 22 distinct chlordane concentration values (X) along with their frequencies. These are the same values as those listed in Table 4-1. There is a total of n = 24 observations.

Step 1. Sort the values of X in ascending order (column 1).

Step 2. Compute [100 x (i/25)], column 4, for each distinct value of X, based on the values of i (column 2).

Step 3. Plot the pairs [Xi, 100 x (i/25)] on probability paper (Figure 4-2).

TABLE 4-2. EXAMPLE DATA COMPUTATIONS FOR PROBABILITY PLOTTING

Concentration X    Absolute frequency    i    100 x (i/(n+1))    ln(X)

Figure 4-2. Probability plot of raw chlordane concentrations.

INTERPRETATION

The points in Figure 4-2 do not fall on a straight line; therefore, the hypothesis of an underlying normal distribution is rejected. However, the shape of the curve indicates a lognormal distribution.
This is checked in the next step.

Also, information about the solubility of chlordane in this example is helpful. Chlordane has a solubility (in water) that ranges between 0.0156 and 1.85 mg/L. Because the last six measurements exceed this solubility range, contamination is suspected.

Next, take the natural logarithm of the X-values (ln(X)) (column 5 in Table 4-2). Repeat Step 3 above using the pairs [ln(X), 100 x (i/25)]. The resulting plot is shown in Figure 4-3. The points fall approximately on a straight line (hand-drawn) and the hypothesis of lognormality of X, i.e., that ln(X) is normally distributed, can be accepted. The mean can be estimated at slightly below 0 and the standard deviation at about 1.2 on the log scale.

CAUTIONARY NOTE

The probability plot is not a formal test of whether the data follow a normal distribution. It is designed as a quick, graphical procedure to identify cases of obvious nonnormality. Figure 4-3 is an example of how a probability plot of normal data looks. Figure 4-2 is an example of how nonnormal data look on a probability plot. Data that are sufficiently nonnormal to require use of a procedure not based on the normal distribution will show a definite curve. A single point that does not fall on the straight line does not indicate nonnormality, but may be an outlier.

4.2.4 The Chi-Squared Test

The chi-squared test can be used to test whether a set of data properly fits a specified distribution within a specified probability. Most introductory courses in statistics explain the chi-squared test, and its familiarity among owners and operators as well as Regional personnel may make it a frequently used method of analysis. In this application the assumed distribution is the normal distribution, but other distributions could also be used.
The test consists of defining cells or ranges of values and determining the expected number of observations that would fall in each cell according to the hypothesized distribution. The actual number of data points in each cell is compared with that predicted by the distribution to judge the adequacy of the fit.

PURPOSE

The chi-squared test is used to test the adequacy of the assumption of normality of the data.

PROCEDURE

Step 1. Determine the appropriate number of cells, K. This number usually ranges from 5 to 10. Divide the number of observations, N, by 4. Dividing the total number of observations by 4 will guarantee a minimum of four observations necessary for each of the K = N/4 cells. Use the largest whole number of this result, using 10 if the result exceeds 10.

Figure 4-3. Probability plot of log-transformed chlordane concentrations. (Horizontal axis: ln(concentration); vertical axis: 100 x (i/(n+1)); reference lines at 50% (mean) and 84% (mean plus one standard deviation).)
Step 2. Standardize the data by subtracting the sample mean and dividing by the sample standard deviation:

Zi = (Xi - X̄)/S

Step 3. Determine the number of observations that fall in each of the cells defined according to Table 4-3. The expected number of observations for each cell is N/K, where N is the total number of observations and K is the number of cells. Let Ni denote the observed number in cell i (for i taking values from 1 to K) and let Ei denote the expected number of observations in cell i. Note that in this case the cells are chosen to make the Ei's equal.

TABLE 4-3. CELL BOUNDARIES FOR THE CHI-SQUARED TEST

Cell boundaries for equal expected cell sizes with the normal distribution

                      Number of cells (K)
     5        6        7        8        9       10
  -0.84    -0.97    -1.07    -1.15    -1.22    -1.28
  -0.25    -0.43    -0.57    -0.67    -0.76    -0.84
   0.25     0.00    -0.18    -0.32    -0.43    -0.52
   0.84     0.43     0.18     0.00    -0.14    -0.25
            0.97     0.57     0.32     0.14     0.00
                     1.07     0.67     0.43     0.25
                              1.15     0.76     0.52
                                       1.22     0.84
                                                1.28

Step 4. Calculate the chi-squared statistic by the formula below:

χ² = Σ (Ni - Ei)²/Ei, summed over the cells i = 1,...,K

Step 5. Compare the calculated result to the table of the chi-squared distribution with K-3 degrees of freedom (Table 1, Appendix B). Reject the hypothesis of normality if the calculated value exceeds the tabulated value.

REFERENCE

Remington, R. D., and M. A. Schork. Statistics with Applications to the Biological and Health Sciences. Prentice-Hall, 1970. pp. 235-236.

EXAMPLE

The data in Table 4-4 are N = 21 residuals from an analysis of variance on dioxin concentrations. The analysis of variance assumes that the errors (estimated by the residuals) are normally distributed. The chi-squared test is used to check this assumption.

TABLE 4-4. EXAMPLE DATA FOR CHI-SQUARED TEST

Observation    Residual    Standardized residual
     1          -0.45           -1.90
     2          -0.35           -1.48
     3          -0.35           -1.48
     4          -0.22           -0.93
     5          -0.16           -0.67
     6          -0.13           -0.55
     7          -0.11           -0.46
     8          -0.10           -0.42
     9          -0.10           -0.42
    10          -0.06           -0.25
    11          -0.05           -0.21
    12           0.04            0.17
    13           0.11            0.47
    14           0.13            0.55
    15           0.16            0.68
    16           0.17            0.72
    17           0.20            0.85
    18           0.21            0.89
    19           0.30            1.27
    20           0.34            1.44
    21           0.41            1.73
Step 1. Divide the number of observations, 21, by 4 to get 5.25. Keep only the integer part, 5, so the test will use K = 5 cells.

Step 2. The sample mean and standard deviation are calculated and found to be: X̄ = 0.00, S = 0.24. The data are standardized by subtracting the mean (0 in this case) and dividing by S. The results are also shown in Table 4-4.

Step 3. Determine the number of (standardized) observations that fall into the five cells determined from Table 4-3. These divisions are: (1) less than or equal to -0.84, (2) greater than -0.84 and less than or equal to -0.25, (3) greater than -0.25 and less than or equal to +0.25, (4) greater than 0.25 and less than or equal to 0.84, and (5) greater than 0.84. We find 4 observations in cell 1, 6 in cell 2, 2 in cell 3, 4 in cell 4, and 5 in cell 5.

Step 4. Calculate the chi-squared statistic. The expected number in each cell is N/K or 21/5 = 4.2.

χ² = [(4 - 4.2)² + (6 - 4.2)² + (2 - 4.2)² + (4 - 4.2)² + (5 - 4.2)²]/4.2 = 8.80/4.2 = 2.10

Step 5. The critical value at the 5% level for a chi-squared test with 2 (K-3 = 5-3 = 2) degrees of freedom is 5.99 (Table 1, Appendix B). Because the calculated value of 2.10 is less than 5.99, there is no evidence that these data are not normal.

INTERPRETATION

The cell boundaries are determined from the normal distribution so that equal numbers of observations should fall in each cell. If there are large differences between the number of observations in each cell and that predicted by the normal distribution, this is evidence that the data are not normal. The chi-squared statistic is a nonnegative statistic that increases as the difference between the predicted and observed number of observations in each cell increases.
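The worked example above can be reproduced with a short Python sketch. It uses the standardized residuals of Table 4-4, the K = 5 cell boundaries of Table 4-3, and the 5% critical value of 5.99 quoted in Step 5. The function and variable names are hypothetical, chosen for this illustration only. Each cell is treated as closed on the right, matching the "less than or equal to" wording of Step 3.

```python
from bisect import bisect_left

# Standardized residuals from Table 4-4.
standardized = [
    -1.90, -1.48, -1.48, -0.93, -0.67, -0.55, -0.46, -0.42, -0.42, -0.25,
    -0.21, 0.17, 0.47, 0.55, 0.68, 0.72, 0.85, 0.89, 1.27, 1.44, 1.73,
]

# Cell boundaries for K = 5 from Table 4-3.
boundaries = [-0.84, -0.25, 0.25, 0.84]


def cell_counts(z_values, cell_boundaries):
    """Count observations per cell; a value equal to a boundary falls in the lower cell."""
    counts = [0] * (len(cell_boundaries) + 1)
    for z in z_values:
        counts[bisect_left(cell_boundaries, z)] += 1
    return counts


def chi_squared_statistic(counts):
    """Sum of (Ni - Ei)^2 / Ei with equal expected counts Ei = N / K."""
    expected = sum(counts) / len(counts)
    return sum((n - expected) ** 2 / expected for n in counts)


counts = cell_counts(standardized, boundaries)
statistic = chi_squared_statistic(counts)
critical_value = 5.99  # 5% level, K - 3 = 2 degrees of freedom (from the text)

print(counts, round(statistic, 2), statistic > critical_value)
```

Using `bisect_left` keeps the boundary handling in one place, so switching to a different K only requires a different row of Table 4-3 and the matching critical value.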
If the calculated value of the chi-squared statistic exceeds the tabulated value, there is statistically significant evidence that the data do not follow the normal distribution. In that case, one would need to do a transformation, use a nonparametric procedure, or seek consultation before interpreting the results of the test of the ground-water data. If the calculated value of the chi-squared statistic does not exceed the tabulated critical value, there is no significant lack of fit to the normal distribution and one can proceed assuming that the assumption of normality is adequately met.

REMARK

The chi-squared statistic can be used to test whether the residuals from an analysis of variance or other procedure are normal. In this case the degrees of freedom are found by (number of cells minus one minus the number of parameters that have been estimated). This may require more than the suggested 10 cells. The chi-squared test does require a fairly large sample size in that there should be generally at least four observations per cell.

4.3 CHECKING EQUALITY OF VARIANCE: BARTLETT'S TEST

The analysis of variance procedures presented in Section 5 are often more sensitive to unequal variances than to moderate departures from normality. The procedures described in this section allow for testing to determine whether group variances are equal or differ significantly. Often in practice unequal variances and nonnormality occur together. Sometimes a transformation to stabilize or equalize the variances also produces a distribution that is more nearly normal. This sometimes occurs if the initial distribution was positively skewed with variance increasing with the number of observations. Only Bartlett's test for checking equality, or homogeneity, of variances is presented here. It encompasses checking equality of more than two variances with unequal sample sizes. Other tests are available for special cases.
The F-test is a special situation when there are only two groups to be compared. The user is referred to classical textbooks for this test (e.g., Snedecor and Cochran, 1980). In the case of equal sample sizes but more than two variances to be compared, the user might want to use Hartley's or maximum F-ratio test (see Nelson, 1987). This test provides a quick procedure to test for variance homogeneity.

PURPOSE

Bartlett's test is a test of homogeneity of variances. In other words, it is a means of testing whether a number of population variances of normal distributions are equal. Homogeneity of variances is an assumption made in analysis of variance when comparing concentrations of constituents between background and compliance wells, or among compliance wells. It should be noted that Bartlett's test is itself sensitive to nonnormality in the data. With long-tailed distributions the test too often rejects equality (homogeneity) of the variances.

PROCEDURE

Assume that data from k wells are available and that there are ni data points for well i.

Step 1. Compute the k sample variances S1²,...,Sk². Each variance has fi = ni - 1 degrees of freedom associated with it.

Step 2. Compute the test statistic

χ² = f ln(Sp²) - Σ fi ln(Si²)

where the sum runs over i = 1,...,k; f = Σ fi = (Σ ni) - k, thus f is the total sample size minus the number of wells (groups); and Sp² = (1/f) Σ fi Si² is the pooled variance across wells.

Step 3. Using the chi-squared table (Table 1, Appendix B), find the critical value for χ² with (k-1) degrees of freedom at a predetermined significance level, for example, 5%.

INTERPRETATION

If the calculated value χ² is larger than the tabulated value, then conclude that the variances are not equal at that significance level.

REFERENCE

Johnson, N. L., and F. C. Leone. Statistics and Experimental Design in Engineering and the Physical Sciences. Vol. I, John Wiley and Sons, New York, 1977.

EXAMPLE

Manganese concentrations are given for k = 6 wells in Table 4-5 below.

Note: Some numbers in Table 4-5 have been rounded.

TABLE 4-5. EXAMPLE DATA FOR BARTLETT'S TEST

Sampling date    Well 1    Well 2    Well 3    Well 4    Well 5    Well 6

This is the sum of the last line in Table 4-5.
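A minimal Python sketch of a Bartlett-type statistic is given here for orientation. It is an assumption-laden illustration, not the guidance's own worked example: the function name is hypothetical, the data are made up (the Table 4-5 values are not reproduced here), and the statistic is written in the common textbook form f ln(Sp²) - Σ fi ln(Si²), where fi = ni - 1, f is the total sample size minus the number of wells, and Sp² is the pooled variance. Many textbooks also divide by a correction factor C = 1 + [Σ(1/fi) - 1/f]/[3(k-1)]; that option is included but off by default.

```python
import math
from statistics import variance


def bartlett_statistic(groups, corrected=False):
    """Return (statistic, degrees_of_freedom) for k groups of observations.

    Uses sample variances with an (n - 1) divisor. Set corrected=True to
    divide by the textbook correction factor C (an optional refinement).
    """
    k = len(groups)
    if k < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least two groups with two observations each")
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]
    if any(s2 <= 0 for s2 in variances):
        raise ValueError("every group must have a positive sample variance")
    f = sum(dfs)  # total sample size minus number of groups
    pooled = sum(fi * s2 for fi, s2 in zip(dfs, variances)) / f
    stat = f * math.log(pooled) - sum(
        fi * math.log(s2) for fi, s2 in zip(dfs, variances)
    )
    if corrected:
        c = 1.0 + (sum(1.0 / fi for fi in dfs) - 1.0 / f) / (3.0 * (k - 1))
        stat /= c
    return stat, k - 1


# Hypothetical concentrations for two wells (illustrative values only).
wells = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
stat, df = bartlett_statistic(wells)
print(f"chi-squared = {stat:.3f} with {df} degree(s) of freedom")
```

The returned statistic would be compared with the chi-squared critical value for k - 1 degrees of freedom, as in Step 3 of the procedure.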
INTERPRETATION

The sample variances of the data from the six wells were compared by means of Bartlett's test. The test was significant at the 5% level, suggesting that the variances are significantly unequal (heterogeneous). A log transform of the data can be done and the same test performed on the transformed data. Generally, if the data followed a skewed distribution, this approach resolves the problem of unequal variances and the user can proceed with an ANOVA, for example.

On the other hand, unequal variances among well data could be a direct indication of well contamination, since the individual data could come from different distributions (i.e., different means and variances). Then the user may wish to test which variance differs from which one. The reader is referred here to the literature for a gap test of variance (Tukey, 1949; David, 1956; or Nelson, 1987).

NOTE

. In the case of k = 2 variances, the test of equality of variances is the F-test (Snedecor and Cochran, 1980).

. Bartlett's test simplifies in the case of equal sample sizes, ni = n, i = 1,...,k. The test used then is Cochran's test. Cochran's test focuses on the largest variance and compares it to the sum of all the variances. Hartley introduced a quick test of homogeneity of variances that uses the ratio of the largest over the smallest variances. Technical aids for the procedures under the assumption of equal sample sizes are given by L. S. Nelson in the Journal of Quality Technology, Vol. 19, 1987, pp. 107 and 165.

SECTION 5

BACKGROUND WELL TO COMPLIANCE WELL COMPARISONS

There are many situations in ground-water monitoring that call for the comparison of data from different wells. The assumption is that a set of uncontaminated wells can be defined. Generally these are background wells and have been sited to be hydraulically upgradient from the regulated unit.
A second set of wells is sited hydraulically downgradient from the regulated unit; these are otherwise known as compliance wells. The data from these compliance wells are compared to the data from the background wells to determine whether there is any evidence of contamination in the compliance wells that would presumably result from a release from the regulated unit.

If the owner or operator of a hazardous waste facility does not have reason to suspect that the test assumptions of equal variance or normality will be violated, then he or she may simply choose the parametric analysis of variance as a default method of statistical analysis. In the event that this method indicates a statistically significant difference between the groups being tested, then the test assumptions should be evaluated.

This situation, where the relevant comparison is between data from background wells and data from compliance wells, is the topic of this section. Comparisons between background well data and compliance well data may be called for in all phases of monitoring. This type of comparison is the general case for detection monitoring. It is also the usual approach for compliance monitoring if the compliance limits are determined by the background well constituent concentration levels. Compounds that are present in background wells (e.g., naturally occurring metals) are most appropriately evaluated using this comparison method.

Section 5.1 provides a flowchart and overview for the selection of methods for comparison of background well and compliance well data. Section 5.2 contains analysis of variance methods. These provide methods for directly comparing background well data to compliance well data.
Section 5.3 describes a tolerance interval approach, where the background well data are used to define the tolerance limits for comparison with the compliance well data. Section 5.4 contains an approach based on prediction intervals, again using the background well data to determine the prediction interval for comparison with the compliance well data. Methods for comparing data to a fixed compliance limit (an MCL or ACL) will be described in Section 6.

5.1 SUMMARY FLOWCHART FOR BACKGROUND WELL TO COMPLIANCE WELL COMPARISONS

Figure 5-1 is a flowchart to aid in selecting the appropriate statistical procedure for background well to compliance well comparisons. The first step is to determine whether most of the observations are quantified (that is, above the detection limits) or not. Generally, if more than 50% of the observations are below the detection limit (as might be the case with detection or compliance monitoring for volatile organics) then the appropriate comparison is a test of proportions. The test of proportions compares the proportion of detected values in the background wells to those in the compliance wells. See Section 8.1 for a discussion of dealing with data below the detection limit.

If the proportion of detected values is 50% or more, then an analysis of variance procedure is the first choice. Tolerance limits or prediction intervals are acceptable alternate choices that the user may select. The analysis of variance procedures give a more thorough picture of the situation at the facility. However, the tolerance limit or prediction interval approach is acceptable and requires less computation in many situations.
Figure 5-2 is a flowchart to guide the user if a tolerance limits approach is selected. The first step in using Figure 5-2 is to determine whether the facility is in detection monitoring. If so, much of the data may be below the detection limit. See Section 8.1 for a discussion of this case, which may call for consulting a statistician. If most of the data are quantified, then follow the flowchart to determine if normal tolerance limits can be used. If the data are not normal (as determined by one of the procedures in Section 4.2), then the logarithm transformation may be done and the transformed data checked for normality. If the log data are normal, the lognormal tolerance limit should be used. If neither the original data nor the log-transformed data are normal, seek consultation with a professional statistician.

If a prediction interval is selected as the method of choice, see Section 5.4 for guidance in performing the procedure.

If analysis of variance is to be used, then continue with Figure 5-1 to select the specific method that is appropriate. A one-way analysis of variance is recommended. If the data show evidence of seasonality (observed, for example, in a plot of the data over time), a trend analysis or perhaps a two-way analysis of variance may be the appropriate choice. These instances may require consultation with a professional statistician.

If the one-way analysis of variance is appropriate, the computations are performed, then the residuals are checked to see if they meet the assumptions of normality and equal variance. If so, the analysis concludes. If not, a logarithm transformation may be tried and the residuals from the analysis of variance on the log data are checked for assumptions. If these still do not adequately satisfy the assumptions, then a one-way nonparametric analysis of variance may be done, or professional consultation may be sought.
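The selection logic described in this section for the analysis of variance branch can be summarized as a small decision function. This is only a sketch of the narrative, not a substitute for the flowchart: the function name, its parameters, and the returned labels are hypothetical, and the tolerance limit and prediction interval alternatives are omitted.

```python
def choose_comparison_method(n_detected, n_total, seasonal=False,
                             residuals_ok=None, log_residuals_ok=None):
    """Suggest a background-to-compliance comparison method.

    n_detected / n_total : counts of quantified and of all observations.
    seasonal             : True if a time plot suggests seasonality.
    residuals_ok         : None if not yet checked, else whether ANOVA
                           residuals look normal with equal variance.
    log_residuals_ok     : same check for residuals of the log data.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    # More than 50% of observations below the detection limit.
    if n_detected / n_total < 0.5:
        return "test of proportions"
    if seasonal:
        return "consult statistician: trend analysis or two-way ANOVA"
    if residuals_ok is None or residuals_ok:
        return "one-way parametric ANOVA"
    if log_residuals_ok is None or log_residuals_ok:
        return "one-way parametric ANOVA on log-transformed data"
    return "one-way nonparametric ANOVA or consult statistician"
```

For example, `choose_comparison_method(8, 10, residuals_ok=False)` suggests repeating the one-way analysis on log-transformed data before falling back to the nonparametric procedure.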
Figure 5-1. Background well to compliance well comparisons.

Figure 5-2. Tolerance limits: alternate approach to background well to compliance well comparisons.

5.2 ANALYSIS OF VARIANCE

If contamination of the ground water occurs from the waste disposal facility and if the monitoring wells are hydraulically upgradient and hydraulically downgradient from the site, then contamination is unlikely to change the levels of a constituent in all wells by the same amount. Thus, contamination from a disposal site can be seen as differences in average concentration among wells, and such differences can be detected by analysis of variance.

Analysis of variance (ANOVA) is the name given to a wide variety of statistical procedures. All of these procedures compare the means of different groups of observations to determine whether there are any significant differences among the groups, and if so, contrast procedures may be used to determine where the differences lie. Such procedures are also known in the statistical literature as general linear model procedures.

Because of its flexibility and power, analysis of variance is the preferred method of statistical analysis when the ground-water monitoring is based on a comparison of background and compliance well data. The ANOVA is especially useful in situations where sample sizes are small, as is the case during the initial phases of ground-water monitoring. Two types of analysis of variance are presented: parametric and nonparametric one-way analyses of variance. Both methods are appropriate when the only factor of concern is the different monitoring wells at a given sampling period.
The hypothesis tests with parametric analysis of variance usually assume that the errors (residuals) are normally distributed with equal variance. These assumptions can be checked by saving the residuals (the difference between the observations and the values predicted by the analysis of variance model) and using the tests of assumptions presented in Section 4. Since the data will generally be concentrations and since concentration data are often found to follow the lognormal distribution, the log transformation is suggested if substantial violations of the assumptions are found in the analysis of the original concentration data. If the residuals from the transformed data do not meet the parametric ANOVA requirements, then nonparametric approaches to analysis of variance are available using the ranks of the observations. A one-way analysis of variance using the ranks is presented in Section 5.2.2.

When several sampling periods have been used and it is important to consider the sampling periods as a second factor, then two-way analysis of variance, parametric or nonparametric, is appropriate. This would be one way to test for and adjust the data for seasonality. Also, trend analysis (e.g., time series) may be used to identify seasonality in the data set. If necessary, data that exhibit seasonal trends can be adjusted. Usually, however, seasonal variation will affect all wells at a facility by nearly the same amount, and in most circumstances, corrections will not be necessary. Further, the effects of seasonality will be substantially reduced by simultaneously comparing aggregate compliance well data to background well data. Situations that require an analysis procedure other than a one-way ANOVA should be referred to a professional statistician.
5.2.1 One-Way Parametric Analysis of Variance

In the context of ground-water monitoring, two situations exist for which a one-way analysis of variance is most applicable:

* Data for a water quality parameter are available from several wells but for only one time period (e.g., monitoring has just begun).

* Data for a water quality parameter are available from several wells for several time periods. However, the data do not exhibit seasonality.

In order to apply a parametric one-way analysis of variance, a minimum number of observations is needed to give meaningful results. At least p ≥ 2 groups are to be compared (i.e., two or more wells). It is recommended that each group (here, wells) have at least three observations and that the total sample size, N, be large enough so that N-p ≥ 5. A variety of combinations of groups and number of observations in groups will fulfill this minimum. One sampling interval with four independent samples per well and at least three wells would fulfill the minimum sample size requirements. The wells should be spaced so as to maximize the probability of intercepting a plume of contamination. The samples should be taken far enough apart in time to guard against autocorrelation.

PURPOSE

One-way analysis of variance is a statistical procedure to determine whether differences in mean concentrations among wells, or groups of wells, are statistically significant. For example, is there significant contamination of one or more compliance wells as compared to background wells?

PROCEDURE

Suppose the regulated unit has p wells and that $n_i$ data points (concentrations of a constituent) are available for the ith well.
These data can be from either a single sampling period or from more than one. In the latter case, the user could check for seasonality before proceeding by plotting the data over time. Usually the computation will be done on a computer using a commercially available program. However, the procedure is presented so that computations can be done using a desk calculator, if necessary.

Step 1. Arrange the N observations in a data table with one row per well (N is the total sample size at this specific regulated unit). Each row lists the observations $X_{i1}, \ldots, X_{in_i}$ for well i, followed by the well total and the well mean.

Step 2. Compute well totals and well means as follows:

$$X_{i\cdot} = \sum_{j=1}^{n_i} X_{ij}, \qquad \bar{X}_{i\cdot} = \frac{X_{i\cdot}}{n_i}, \qquad i = 1, \ldots, p$$

These totals and means are shown in the last two columns of the table above.

Step 3. Compute the sum of squares of differences between well means and the grand mean:

$$SS_{Wells} = \sum_{i=1}^{p} n_i (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{p} \frac{X_{i\cdot}^2}{n_i} - \frac{X_{\cdot\cdot}^2}{N}$$

where $X_{\cdot\cdot} = \sum_{i=1}^{p} X_{i\cdot}$ is the grand total and $\bar{X}_{\cdot\cdot} = X_{\cdot\cdot}/N$ is the grand mean. (The formula on the far right is usually most convenient for calculation.) This sum of squares has (p-1) degrees of freedom associated with it and is a measure of the variability between wells.

Step 4. Compute the corrected total sum of squares:

$$SS_{Total} = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{p} \sum_{j=1}^{n_i} X_{ij}^2 - \frac{X_{\cdot\cdot}^2}{N}$$

(The formula on the far right is usually most convenient for calculation.) This sum of squares has (N-1) degrees of freedom associated with it and is a measure of the variability in the whole data set.

Step 5. Compute the sum of squares of differences of observations within wells from the well means. This is the sum of squares due to error and is obtained by subtraction:

SSError = SSTotal - SSWells

It has associated with it (N-p) degrees of freedom and is a measure of the variability within wells.

Step 6. Set up the ANOVA table as shown below in Table 5-1. The sums of squares and their degrees of freedom were obtained from Steps 3 through 5. The mean square quantities are simply obtained by dividing each sum of squares by its corresponding degrees of freedom.
TABLE 5-1. ONE-WAY PARAMETRIC ANOVA TABLE

Source of variation     Sums of squares   Degrees of freedom   Mean squares                F
Between wells           SSWells           p-1                  MSWells = SSWells/(p-1)     F = MSWells/MSError
Error (within wells)    SSError           N-p                  MSError = SSError/(N-p)
Total                   SSTotal           N-1

Step 7. To test the hypothesis of equal means for all p wells, compute F = MSWells/MSError (last column in Table 5-1). Compare this statistic to the tabulated F statistic with (p-1) and (N-p) degrees of freedom (Table 2, Appendix B) at the 5% significance level. If the calculated F value exceeds the tabulated value, reject the hypothesis of equal well means. Otherwise, conclude that there is no significant difference between the concentrations at the p wells and thus no evidence of well contamination.

In the case of a significant F (calculated F greater than tabulated F in Step 7), the user will conduct the next few steps to determine which compliance well(s) is (are) contaminated. This will be done by comparing each compliance well with the background well(s). Concentration differences between a pair of background wells and compliance wells or between a compliance well and a set of background wells are called contrasts in the ANOVA and multiple comparisons framework.

Step 8. Determine if the significant F is due to differences between background and compliance wells (computation of Bonferroni t-statistics).

Assume that of the p wells, u are background wells and m are compliance wells (thus u + m = p). Then m differences--m compliance wells each compared with the average of the background wells--need to be computed and tested for statistical significance. If there are more than five downgradient wells, the individual comparisons are done at the comparisonwise significance level of 1%, which may make the experimentwise significance level greater than 5%.

* Obtain the total sample size of all u background wells, $n_b$.

* Compute the average concentration from the u background wells, $\bar{X}_b$.
* Compute the m differences between the average concentration from each compliance well and the average of the background wells:

$$\bar{X}_i - \bar{X}_b, \qquad i = 1, \ldots, m$$

* Compute the standard error of each difference as

$$SE_i = \left[ MS_{Error} \left( \frac{1}{n_b} + \frac{1}{n_i} \right) \right]^{1/2}$$

where MSError is taken from the ANOVA table (Table 5-1), $n_b$ is the total sample size of the background wells, and $n_i$ is the number of observations at compliance well i.

* Compute the m quantities $D_i = SE_i \times t$ for each compliance well i, where t is the Bonferroni t-value with (N-p) degrees of freedom at level (1 - 0.05/m). If m > 5, use the entry for $t_{(N-p),(1-0.01)}$. That is, use the entry at m = 5.

* If the difference $\bar{X}_i - \bar{X}_b$ exceeds $D_i$, conclude that compliance well i has significantly higher concentrations than the average of the background wells.

Step 9. Compute the residuals. The residuals are the differences between each observation and its predicted value according to the particular analysis of variance model under consideration. In the case of a one-way analysis of variance, the predicted value for each observation is the group (that is, well) mean. Thus the residuals are given by:

$$R_{ij} = X_{ij} - \bar{X}_{i\cdot}$$

The residuals, $R_{ij}$, can be used to check for departures from normality as described in Section 4.2.

NOTE

The data can also be checked for equality of variances as described in Section 4.3. The last column of Table 5-2 contains the standard deviations estimated for each well, the $S_i$ used in Bartlett's test.

In some cases it may be appropriate to implement the ANOVA procedure independently for an individual regulated unit. If there are more than five wells at the compliance point and the waste management area consists of more than one regulated unit, then the data may be evaluated separately for each regulated unit if approved by the Regional Administrator or State Director. In many cases the monitoring well system design and site hydrogeology will determine if this approach is appropriate for a particular regulated unit. This will help reduce the number of compliance wells used in a multiple well comparisons procedure. If a single regulated unit has more than five wells at the point of compliance, refer to the caveat in the cautionary note.
CAUTIONARY NOTE

Should the regulated unit consist of more than five compliance wells, then the Bonferroni t-test should be modified by doing the individual comparisons at the 1% level so that the Part 264 Subpart F regulatory requirement pursuant to §264.97(i)(2) will be met. Alternately, a different analysis of contrasts, such as Scheffe's, may be used. The more advanced user is referred to the second reference below for a discussion of multiple comparisons.

REFERENCES

Johnson, Norman L., and F. C. Leone. 1977. Statistics and Experimental Design in Engineering and the Physical Sciences. Vol. II, Second Edition, John Wiley and Sons, New York.

Miller, Rupert G., Jr. 1981. Simultaneous Statistical Inference. Second Edition, Springer-Verlag, New York.

EXAMPLE

Four lead concentration values at each of six wells are given in Table 5-2 below. The wells consist of u=2 background and m=4 compliance wells. (The values in Table 5-2 are actually the natural logarithms of the original lead concentrations.)

Step 1. Arrange the 4 x 6 = 24 observations in a data table as follows:

TABLE 5-2. EXAMPLE DATA FOR ONE-WAY PARAMETRIC ANALYSIS OF VARIANCE

Well No.            Observations               Total    Mean    Std. dev.
Background wells
1                   4.06  3.99  3.40  3.83     15.28    3.82    0.296
2                   3.83  4.34  3.47  4.22     15.86    3.97    0.395

Step 2. The calculations are shown on the right-hand side of the data table above. Sample standard deviations have been computed also.

Step 3. Compute the between-well sum of squares, with [6 (wells) - 1] = 5 degrees of freedom.

Step 4. Compute the corrected total sum of squares, with [24 (observations) - 1] = 23 degrees of freedom.

Step 5. Obtain the within-well or error sum of squares by subtraction:

SSError = 11.92 - 5.75 = 6.17

with [24 (observations) - 6 (wells)] = 18 degrees of freedom.
Step 6. Set up the one-way ANOVA as in Table 5-3 below:

TABLE 5-3. EXAMPLE COMPUTATIONS IN ONE-WAY PARAMETRIC ANOVA TABLE

Source of variation     Sums of squares   Degrees of freedom   Mean squares        F
Between wells           5.76              5                    5.76/5 = 1.15       1.15/0.34 = 3.38
Error (within wells)    6.18              18                   6.18/18 = 0.34
Total                   11.94             23

Step 7. The calculated F statistic is 3.38. The tabulated F value with 5 and 18 degrees of freedom at the α = 0.05 level is 2.77 (Table 2, Appendix B). Since the calculated value exceeds the tabulated value, the hypothesis of equal well means must be rejected, and post hoc comparisons are necessary.

Step 8. Computation of Bonferroni t-statistics.

* Note that there are four compliance wells, so m = 4 comparisons will be made.

* Compute the quantities $D_i$. Again, due to equal sample sizes, they will all be equal.

$D_i = SE_i \times t = 0.357 \times 2.43 = 0.868$ for i = 3, ..., 6

Step 9. Compute the residuals using the data given in Table 5-2.

Residuals for Well 3:
R31 = 5.61 - 4.55 = 1.06
R32 = 5.14 - 4.55 = 0.59
R33 = 3.47 - 4.55 = -1.08
R34 = 3.97 - 4.55 = -0.58

Residuals for Well 4:
R41 = 3.53 - 4.19 = -0.66
R42 = 4.54 - 4.19 = 0.35
R43 = 4.26 - 4.19 = 0.07
R44 = 4.42 - 4.19 = 0.23

Residuals for Well 5:
R51 = 3.91 - 4.75 = -0.84
R52 = 4.29 - 4.75 = -0.46
R53 = 5.50 - 4.75 = 0.75
R54 = 5.31 - 4.75 = 0.56

Residuals for Well 6:
R61 = 5.42 - 5.25 = 0.17
R62 = 5.21 - 5.25 = -0.04
R63 = 5.29 - 5.25 = 0.04
R64 = 5.08 - 5.25 = -0.17

INTERPRETATION

All the compliance well concentrations were somewhat above the mean concentration of the background levels. The well means should be used to indicate the location of the plume. The findings should be reported to the Regional Administrator.
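The one-way parametric procedure (Steps 2 through 8) can be sketched in code as a check on hand calculations. The following is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical; the observations for Wells 1 and 2 come from Table 5-2, while those for Wells 3 through 6 are read back from the residual computations in Step 9 (the first set is assumed to belong to Well 3); and the Bonferroni t-value must be supplied from a table (2.43 is the value used in the example).

```python
import math

# Log lead concentrations; Wells 1-2 are background, Wells 3-6 compliance.
WELLS = [
    [4.06, 3.99, 3.40, 3.83],   # Well 1 (Table 5-2)
    [3.83, 4.34, 3.47, 4.22],   # Well 2 (Table 5-2)
    [5.61, 5.14, 3.47, 3.97],   # Well 3 (assumed, from Step 9 residuals)
    [3.53, 4.54, 4.26, 4.42],   # Well 4 (from Step 9 residuals)
    [3.91, 4.29, 5.50, 5.31],   # Well 5 (from Step 9 residuals)
    [5.42, 5.21, 5.29, 5.08],   # Well 6 (from Step 9 residuals)
]


def one_way_anova(wells):
    """Return the sums of squares, mean squares, and F (Steps 3-7)."""
    n_total = sum(len(w) for w in wells)
    p = len(wells)
    grand_total = sum(sum(w) for w in wells)
    correction = grand_total ** 2 / n_total          # X..^2 / N
    ss_wells = sum(sum(w) ** 2 / len(w) for w in wells) - correction
    ss_total = sum(x * x for w in wells for x in w) - correction
    ss_error = ss_total - ss_wells                   # by subtraction
    ms_wells = ss_wells / (p - 1)
    ms_error = ss_error / (n_total - p)
    return {"ss_wells": ss_wells, "ss_error": ss_error, "ss_total": ss_total,
            "ms_wells": ms_wells, "ms_error": ms_error,
            "f": ms_wells / ms_error}


def bonferroni_comparisons(wells, n_background_wells, ms_error, t_crit):
    """Compare each compliance well with the pooled background (Step 8).

    Returns tuples (well_number, difference, critical_difference, exceeds).
    """
    background = [x for w in wells[:n_background_wells] for x in w]
    n_b = len(background)
    mean_b = sum(background) / n_b
    results = []
    for number, w in enumerate(wells[n_background_wells:],
                               start=n_background_wells + 1):
        diff = sum(w) / len(w) - mean_b
        d_crit = t_crit * math.sqrt(ms_error * (1 / n_b + 1 / len(w)))
        results.append((number, diff, d_crit, diff > d_crit))
    return results


table = one_way_anova(WELLS)
comparisons = bonferroni_comparisons(WELLS, 2, table["ms_error"], t_crit=2.43)
```

With these inputs the unrounded F is about 3.35; Table 5-3 reports 3.38 because its mean squares are rounded before dividing. Under the same assumptions, only Well 6 has a difference from the background mean (about 1.36) that exceeds the critical difference (about 0.87).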
5.2.2 One-Way Nonparametric Analysis of Variance

This procedure is appropriate for interwell comparisons when the data or the residuals from a parametric ANOVA have been found to be significantly different from normal and when a log transformation fails to adequately normalize the data. In one-way nonparametric ANOVA, the assumption under the null hypothesis is that the data from each well come from the same continuous distribution and hence have the same median concentrations of a specific hazardous constituent. The alternatives of interest are that the data from some wells show increased levels of the hazardous constituent in question.

The procedure is called the Kruskal-Wallis test. For meaningful results, there should be at least three groups with a minimum sample size of three in each group. For large data sets use of a computer program is recommended. In the case of large data sets a good approximation to the procedure is to replace each observation by its rank (its numerical place when the data are ordered from least to greatest) and perform the (parametric) one-way analysis of variance (Section 5.2.1) on the ranks. Such an approach can be done with some commercial statistical packages such as SAS.

PURPOSE

The purpose of the procedure is to test the hypothesis that all wells (or groups of wells) around regulated units have the same median concentration of a hazardous constituent. If the wells are found to differ, post-hoc comparisons are again necessary to determine if contamination is present.

Note that the wells define the groups. All wells will have at least four observations. Denote the number of groups by K and the number of observations in each group by $n_i$, with N being the total number of all observations. Let $X_{ij}$ denote the jth observation in the ith group, where j runs from 1 to the number of observations in the group, $n_i$, and i runs from 1 to the number of groups, K.

PROCEDURE

Step 1. Rank all N observations of the groups from least to greatest.
Let $R_{ij}$ denote the rank of the jth observation in the ith group. As a convention, denote the background well(s) as group 1.

Step 2. Add the ranks of the observations in each group to obtain the rank sum, $R_i$, and compute the average rank for each group, $\bar{R}_i = R_i / n_i$.

Step 3. Compute the Kruskal-Wallis statistic:

$$H = \left[ \frac{12}{N(N+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} \right] - 3(N+1)$$

Step 4. Compare the calculated value H to the tabulated chi-squared value with (K-1) degrees of freedom, where K is the number of groups (Table 1, Appendix B). Reject the null hypothesis if the computed value exceeds the tabulated critical value.

Step 5. If the computed value exceeds the value from the chi-squared table, compute the critical difference for well comparisons to the background, assumed to be group 1:

$$C_i = z_{\alpha/(K-1)} \left[ \frac{N(N+1)}{12} \right]^{1/2} \left[ \frac{1}{n_1} + \frac{1}{n_i} \right]^{1/2}$$

for i taking values 2, ..., K, where $z_{\alpha/(K-1)}$ is the upper (α/(K-1))-percentile from the standard normal distribution found in Table 4, Appendix B. Note: If there are more than five compliance wells at the regulated unit (K > 6), use $z_{0.01}$, the upper one-percentile from the standard normal distribution.

Step 6. Form the differences of the average ranks for each group to the background and compare these with the critical values found in Step 5 to determine which wells give evidence of contamination. That is, compare $\bar{R}_i - \bar{R}_1$ to $C_i$ for i taking the values 2 through K. (Recall that group 1 is the background.)
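The steps above can be sketched in code as follows. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, tied observations are given the average of the ranks they span, and the sketch also returns the tie-adjusted statistic H' used in this section (H divided by one minus the sum of $t^3 - t$ over tied groups, divided by $N^3 - N$). Critical values for H and z must still be read from the tables.

```python
import math


def midranks(values):
    """Rank values from least to greatest; ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1        # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, tie-adjusted H', average rank of each group).

    Assumes the observations are not all identical (H' is then undefined).
    """
    flat = [x for g in groups for x in g]
    n = len(flat)
    ranks = midranks(flat)
    total, position, average_ranks = 0.0, 0, []
    for g in groups:
        group_ranks = ranks[position:position + len(g)]
        position += len(g)
        total += sum(group_ranks) ** 2 / len(g)      # R_i^2 / n_i
        average_ranks.append(sum(group_ranks) / len(g))
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for x in flat:
        counts[x] = counts.get(x, 0) + 1
    tie_sum = sum(t ** 3 - t for t in counts.values())
    h_adjusted = h / (1 - tie_sum / (n ** 3 - n))
    return h, h_adjusted, average_ranks


def critical_difference(z, n_total, n_background, n_i):
    """Critical difference C_i of Step 5 for one compliance well."""
    return (z * math.sqrt(n_total * (n_total + 1) / 12)
            * math.sqrt(1 / n_background + 1 / n_i))
```

A compliance well is flagged when its average rank minus the background average rank exceeds `critical_difference(...)` for that well. As an illustration, `critical_difference(2.33, 20, 4, 3)` is about 10.5 and `critical_difference(2.33, 20, 4, 4)` is about 9.8.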
While the above steps are the general procedure, some details need to be specified further to handle special cases. First, it may happen that two or more observations are numerically equal or tied. When this occurs, determine the ranks that the tied observations would have received if they had been slightly different from each other, but still in the same places with respect to the rest of the observations. Add these ranks and divide by the number of observations tied at that value to get an average rank. This average rank is used for each of the tied observations. This same procedure is repeated for any other groups of tied observations. Second, if there are any values below detection, consider all values below detection as tied at zero. (It is irrelevant what number is assigned to nondetected values as long as all such values are assigned the same number, and it is smaller than any detected or quantified value.)

The effect of tied observations is to increase the value of the statistic, H. Unless there are many observations tied at the same value, the effect of ties on the computed test statistic is negligible (in practice, the effect of ties can probably be neglected unless some group contains 10 percent of the observations all tied, which is most likely to occur for concentrations below detection limit). In the present context, the term "negligible" can be more specifically defined as follows. Compute the Kruskal-Wallis statistic without the adjustment for ties. If the test statistic is significant at the 5% level then conclude the test, since the statistic with correction for ties will be significant as well. If the test statistic falls between the 10% and the 5% critical values, then proceed with the adjustment for ties as shown below.

ADJUSTMENT FOR TIES

If there are 50% or more observations that fell below the detection limit, then this method for adjustment for ties is inappropriate.
The user is referred to Section 8, "Miscellaneous Topics." Otherwise, if there are tied values present in the data, use the following correction for the H statistic:

$$H' = \frac{H}{1 - \sum_{i=1}^{g} T_i / (N^3 - N)}$$

where g is the number of groups of tied observations and $T_i = t_i^3 - t_i$, with $t_i$ the number of observations in tied group i.

REFERENCE

Hollander, Myles, and D. A. Wolfe. 1973. Nonparametric Statistical Methods. John Wiley and Sons, New York.

EXAMPLE

The data in Table 5-4 represent benzene concentrations in water samples taken at one background and five compliance wells.

Step 1. The 20 observations have been ranked from least to greatest. The limit of detection was 1.0 ppm. Note that two values in Well 4 were below detection and were assigned value zero. These two are tied for the smallest value and have consequently been assigned the average of the two ranks 1 and 2, or 1.5. The ranks of the observations are indicated in parentheses after the observation in Table 5-4. Note that there are 3 observations tied at 1.3 that would have had ranks 4, 5, and 6 if they had been slightly different. These three have been assigned the average rank of 5 resulting from averaging 4, 5, and 6. Other ties occurred at 1.5 (ranks 7 and 8) and 1.9 (ranks 11 and 12).

Step 2. The values of the sums of ranks and average ranks are indicated at the bottom of Table 5-4.

Step 3. Compute the Kruskal-Wallis statistic.

TABLE 5-4. EXAMPLE DATA FOR ONE-WAY NONPARAMETRIC ANOVA--BENZENE CONCENTRATIONS (ppm)

         Background    Compliance wells
Date     Well 1        Well 2       Well 3       Well 4      Well 5      Well 6
Jan 1    1.7 (10)      11.0 (20)    1.3 (5)      0 (1.5)     4.9 (17)    1.6 (9)
Feb 1    1.9 (11.5)    8.0 (18)     1.2 (3)      1.3 (5)     3.7 (16)    2.5 (15)
Mar 1    1.5 (7.5)     9.5 (19)     1.5 (7.5)    0 (1.5)     2.3 (14)    1.9 (11.5)

ADJUSTMENT FOR TIES

There are four groups of ties in the data of Table 5-4:

$T_1 = (2^3 - 2) = 6$ for the 2 observations of 1.9.
$T_2 = (2^3 - 2) = 6$ for the 2 observations of 1.5.
$T_3 = (3^3 - 3) = 24$ for the 3 observations of 1.3.
$T_4 = (2^3 - 2) = 6$ for the 2 observations of 0.
Step 4. To test the null hypothesis of no contamination, obtain the critical chi-squared value with (6-1) = 5 degrees of freedom at the 5% significance level from Table 1, Appendix B. The value is 11.07. Compare the calculated value, H', with the tabulated value. Since 14.76 is greater than 11.07, reject the hypothesis of no contamination at the 5% level. If the site was in detection monitoring it should move into compliance monitoring. If the site was in compliance monitoring it should move into corrective action. If the site was in corrective action it should stay there.

In the case where the hydraulically upgradient wells serve as the background against which the compliance wells are to be compared, comparisons of each compliance well with the background wells should be performed in addition to the analysis of variance procedure. In this example, data from each of the compliance wells would be compared with the background well data. This comparison is accomplished as follows. The average ranks for each group, $\bar{R}_i$, are used to compute differences. If a group of compliance wells for a regulated unit have larger concentrations than those found in the background wells, the average rank for the compliance wells at that unit will be larger than the average rank for the background wells.

Step 5. Calculate the critical values to compare each compliance well to the background well.

In this example, K=6, so there are 5 comparisons of the compliance wells with the background wells. Using an experimentwise significance level of α = 0.05, we find the upper 0.05/5 = 0.01 percentile of the standard normal distribution to be 2.33 (Table 4, Appendix B). The total sample size, N, is 20. The approximate critical value, $C_2$, is computed for compliance Well 2, which has the largest average rank, as:

$$C_2 = 2.33 \left[ \frac{20(21)}{12} \right]^{1/2} \left[ \frac{1}{4} + \frac{1}{3} \right]^{1/2} = 10.5$$

The critical values for the other wells are: 10.5 for Wells 3, 5, and 6; and 9.8 for Well 4.
Step 6. Compute the differences between the average rank of each compliance well and the average rank of the background well, and compare each difference with the corresponding critical value.

The difference for Well 2, $D_2$ = 10.5, equals the critical value of $C_2$ = 10.5. We conclude that the concentration of benzene averaged over compliance Well 2 is significantly greater than that at the background well. None of the other compliance wells has a benzene concentration significantly higher than the average background value. Based upon these results, only compliance Well 2 can be singled out as being contaminated.

For data sets with more than 30 observations, the parametric analysis of variance performed on the rank values is a good approximation to the Kruskal-Wallis test (Quade, 1966). If the user has access to SAS, the PROC RANK procedure is used to obtain the ranks of the data. The analysis of variance procedure detailed in Section 5.2.1 is then performed on the ranks. Contrasts are tested as in the parametric analysis of variance.

INTERPRETATION

The Kruskal-Wallis test statistic is compared to the tabulated critical value from the chi-squared distribution. If the test statistic does not exceed the tabulated value, there is no statistically significant evidence of contamination and the analysis would stop and report this finding. If the test statistic exceeds the tabulated value, there is significant evidence that the hypothesis of no differences in compliance concentrations from the background level is not true. Consequently, if the test statistic exceeds the critical value, one concludes that there is significant evidence of contamination. One then proceeds to investigate where the differences lie, that is, which wells are indicating contamination.
The multiple comparisons procedure described in Steps 5 and 6 compares each compliance well to the background well. This determines which compliance wells show statistically significant evidence of contamination at an experimentwise error rate of 5 percent. In many cases, inspection of the mean or median concentrations will be sufficient to indicate where the problem lies.

5.3 TOLERANCE INTERVALS BASED ON THE NORMAL DISTRIBUTION

An alternate approach to analysis of variance to determine whether there is statistically significant evidence of contamination is to use tolerance intervals. A tolerance interval is constructed from the data on (uncontaminated) background wells. The concentrations from compliance wells are then compared with the tolerance interval. With the exception of pH, if the compliance concentrations do not fall in the tolerance interval, this provides statistically significant evidence of contamination.

Tolerance intervals are most appropriate for use at facilities that do not exhibit high degrees of spatial variation between background wells and compliance wells. Facilities that overlie extensive, homogeneous geologic deposits (for example, thick, homogeneous lacustrine clays) that do not naturally display hydrogeochemical variations may be suitable for this statistical method of analysis.

A tolerance interval establishes a concentration range that is constructed to contain a specified proportion (P%) of the population with a specified confidence coefficient, γ. The proportion of the population included, P, is referred to as the coverage. The probability with which the tolerance interval includes the proportion P% of the population is referred to as the tolerance coefficient.
A coverage of 95% is recommended. If this is used, random observations from the same distribution as the background well data would exceed the upper tolerance limit less than 5% of the time. Similarly, a tolerance coefficient of 95% is recommended. This means that one has a confidence level of 95% that the upper 95% tolerance limit will contain at least 95% of the distribution of observations from background well data. These values were chosen to be consistent with the performance standards described in Section 2. The use of these values corresponds to the selection of α of 5% in the multiple well testing situation.

The procedure can be applied with as few as three observations from the background distribution. However, doing so would result in a large upper tolerance limit. A sample size of eight or more results in an adequate tolerance interval. The minimum sampling schedule called for in the regulations would result in at least four observations from each background well. Only if a single background well is sampled at a single point in time is the sample size so small as to make use of the procedure questionable.

Tolerance intervals can be constructed assuming that the data or the transformed data are normally distributed. Tolerance intervals can also be constructed assuming other distributions. It is also possible to construct nonparametric tolerance intervals using only the assumption that the data came from some continuous population. However, the nonparametric tolerance intervals require such a large number of observations to provide a reasonable coverage and tolerance coefficient that they are impractical in this application.
The range of the concentration data in the background well samples should be considered in determining whether the tolerance interval approach should be used, and if so, what distribution is appropriate. The background well concentration data should be inspected for outliers and tests of normality applied before selecting the tolerance interval approach. Tests of normality were presented in Section 4.2. Note that in this case, the test of normality would be applied to the background well data that are used to construct the tolerance interval. These data should all be from the same normal distribution.

In this application, unless pH is being monitored, a one-sided tolerance interval or an upper tolerance limit is desired, since contamination is indicated by large concentrations of the hazardous constituents monitored. Thus, for concentrations, the appropriate tolerance interval is (0, TL), with the comparison of importance being the larger limit, TL.

PURPOSE

The purpose of the tolerance interval approach is to define a concentration range from background well data, within which a large proportion of the monitoring observations should fall with high probability. Once this is done, data from compliance wells can be checked for evidence of contamination by simply determining whether they fall in the tolerance interval. If they do not, this is evidence of contamination.

In this case the data are assumed to be approximately normally distributed. Section 4.2 provided methods to check for normality. If the data are not normal, take the natural logarithm of the data and see if the transformed data are approximately normal. If so, this method can be used on the logarithms of the data. Otherwise, seek the assistance of a professional statistician.

PROCEDURE

Step 1. Calculate the mean, $\bar{X}$, and the standard deviation, $S$, from the background well data.

Step 2. Construct the one-sided upper tolerance limit as

$$TL = \bar{X} + K S$$

where K is the one-sided normal tolerance factor found in Table 5, Appendix B.
Step 3. Compare each observation from compliance wells to the tolerance limit found in Step 2. If any observation exceeds the tolerance limit, that is statistically significant evidence that the well is contaminated. Note that if the tolerance interval was constructed on the logarithms of the original background observations, the logarithms of the compliance well observations should be compared to the tolerance limit. Alternatively, the tolerance limit may be transferred to the original data scale by taking the antilogarithm.

REFERENCE

Lieberman, Gerald J. 1958. "Tables for One-sided Statistical Tolerance Limits." Industrial Quality Control. Vol. XIV, No. 10.

EXAMPLE

Table 5-5 contains example data that represent lead concentration levels in parts per million in water samples at a hypothetical facility. The background well data are in columns 1 and 2, while the other four columns represent compliance well data.

TABLE 5-5. EXAMPLE DATA FOR NORMAL TOLERANCE INTERVAL
Lead concentrations (ppm). Columns: Date; Background well (A, B); Compliance wells (Well 1, Well 2, Well 3, Well 4). Starred entries: 225.9*, 183.1*, 198.3*, 160.8*.

Step 1. The mean and standard deviation of the n = 8 observations have been calculated for the background well. The mean is 51.4 and the standard deviation is 16.3.

Step 2. The tolerance factor for a one-sided normal tolerance interval is found from Table 5, Appendix B as 3.188. This is for 95% coverage with probability 95% and for n = 8. The upper tolerance limit is then calculated as 51.4 + (3.188)(16.3) = 103.4.

Step 3. The tolerance limit of 103.4 is compared with the compliance well data. Any value that exceeds the tolerance limit indicates statistically significant evidence of contamination. Two observations from Well 1, two observations from Well 3, and all four observations from Well 4 exceed the tolerance limit. Thus there is statistically significant evidence of contamination at Wells 1, 3, and 4.
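The arithmetic of the example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the tolerance factor K = 3.188 is typed in by hand from the example rather than looked up, and the compliance values used are four of the starred entries from Table 5-5.

```python
import statistics


def upper_tolerance_limit(mean: float, sd: float, k: float) -> float:
    """One-sided upper tolerance limit: mean + K * SD."""
    return mean + k * sd


def tolerance_limit_from_data(background: list[float], k: float) -> float:
    """Same limit computed from raw background observations (sample SD, n - 1)."""
    return upper_tolerance_limit(
        statistics.mean(background), statistics.stdev(background), k
    )


K_N8 = 3.188  # one-sided factor for n = 8, 95% coverage, 95% confidence (from the example)

# Summary statistics quoted in Step 1 of the example.
tl = upper_tolerance_limit(51.4, 16.3, K_N8)

# Four starred compliance well values from Table 5-5.
compliance_values = [225.9, 183.1, 198.3, 160.8]
flagged = [x for x in compliance_values if x > tl]

print(round(tl, 1), flagged)
```

If the limit were built on log-transformed background data, the same comparison would be made on the logarithms of the compliance values.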
INTERPRETATION

A tolerance limit with 95% coverage gives an upper bound below which 95% of the observations of the distribution should fall. The tolerance coefficient used here is 95%, implying that at least 95% of the observations should fall below the tolerance limit with probability 95%, if the compliance well data come from the same distribution as the background data. In other words, in this example, we are 95% certain that 95% of the background lead concentrations are below 104 ppm. If observations exceed the tolerance limit, this is evidence that the compliance well data are not from the same distribution, but rather are from a distribution with higher concentrations. This is interpreted as statistically significant evidence of contamination.

5.4 PREDICTION INTERVALS

A prediction interval is a statistical interval calculated to include one or more future observations from the same population with a specified confidence. This approach is algebraically equivalent to the average replicate (AR) test that is presented in the Technical Enforcement Guidance Document (TEGD), September 1986. In ground-water monitoring, a prediction interval approach may be used to make comparisons between background and compliance well data. This method of analysis is similar to that for calculating a tolerance limit, and familiarity with prediction intervals or personal preference would be the only reason for selecting them over the method for tolerance limits. The concentrations of a hazardous constituent in the background wells are used to establish an interval within which K future observations from the same population are expected to lie with a specified confidence. Then each of K future observations of compliance well concentrations is compared to the prediction interval. The interval is constructed to contain all of K future observations with the stated confidence. If any future observation exceeds the prediction interval, this is statistically significant evidence of contamination. In application, the number of future observations to be collected, K, must be specified. Thus, the prediction interval is constructed for a specified time period in the future. One year is suggested. The interval can be constructed either to contain all K individual observations with a specified probability, or to contain the K' means observed at the K' sampling periods.

The prediction interval presented here is constructed assuming that the background data all follow the same normal distribution. If that is not the case (see Section 4.2 for tests of normality), but a log transformation results in data that are adequately normal on the log scale, then the interval may still be used. In this case, use the data after transforming by taking the logarithm. The future observations also need to be transformed by taking logarithms before comparison to the interval. (Alternatively, the end points of the interval could be converted back to the original scale by taking their antilogarithms.)

PURPOSE

The prediction interval is constructed so that K future compliance well observations can be tested by determining whether they lie in the interval or not. If not, evidence of contamination is found. Note that the number of future observations, K, for which the interval is to be used, must be specified in advance. In practice, an owner or operator would need to construct the prediction interval on a periodic (at least yearly) basis, using the most recent background data. The interval is described using the 95% confidence factor appropriate for individual well comparisons. It is recommended that a one-sided prediction interval be constructed for the mean of the four observations from each compliance well at each sampling period.

PROCEDURE

Step 1. Calculate the mean, $\bar{X}$, and the standard deviation, $S$, for the background well data (used to form the prediction interval).
Step 2. Specify the number of future observations for a compliance well to be included in the interval, K. Then the interval is given by

$$\left[0,\; \bar{X} + S\, t_{(n-1,\,K,\,0.95)} \sqrt{\frac{1}{m} + \frac{1}{n}}\right]$$

where $n$ is the number of background observations, $m$ is the number of compliance well observations averaged at each sampling period, and $t_{(n-1,\,K,\,0.95)}$ is found in Table 3 of Appendix B.

Step 3. Once the interval has been calculated, at each sampling period, the mean of the m compliance well observations is obtained. This mean is compared to see if it falls in the interval. If it does, this is reported and monitoring continues. If a mean concentration at a sampling period does not fall in the prediction interval, this is statistically significant evidence of contamination. This is also reported and the appropriate action taken.

REMARK

For a single future observation, t is given by the t-distribution found in Table 6 of Appendix B. In general, the interval to contain K future means of sample size m each is given by

$$\left[0,\; \bar{X} + S\, t_{(n-1,\,K,\,0.95)} \sqrt{\frac{1}{m} + \frac{1}{n}}\right]$$

where t is as before from Table 3 of Appendix B and where m is the number of observations in each mean. Note that for K single observations, m = 1, while for the mean of four samples from a compliance well, m = 4.

Note, too, that the prediction intervals are one-sided, giving a value that should not be exceeded by the future observations. The 5% experimentwise significance level is used with the Bonferroni approach. However, to ensure that the significance level for the individual comparisons does not go below 1%, $\alpha/K$ is restricted to be 1% or larger. If more than 5 comparisons are used, the comparisonwise significance level of 1% is used, implying that the experimentwise level may exceed 5%.

EXAMPLE

Table 5-6 contains chlordane concentrations measured at a hypothetical facility. Twenty-four background observations are available and are used to develop the prediction interval. The prediction interval is applied to K = 2 sampling periods with m = 4 observations at a single compliance well each.

Step 1. Find the mean and standard deviation of the 24 background well measurements. These are 101 and 11, respectively.
Step 2. There are K = 2 future observations of means of 4 observations to be included in the prediction interval. Entering Table 3 of Appendix B at K = 2 and 20 degrees of freedom (the nearest entry to the 23 degrees of freedom), we find t(20, 2, 0.95) = 2.09. The interval is given by

$$\left[0,\; 101 + (11)(2.09)\sqrt{1/4 + 1/24}\right] = [0,\ 113.4].$$

Step 3. The mean of the four compliance well observations at each of sampling periods one and two is found and compared with the interval found in Step 2. The mean of the first sampling period is 122 and that for the second sampling period is 113. Comparing the first of these to the prediction interval for two means based on samples of size 4, we find that the mean exceeds the upper limit of the prediction interval. This is statistically significant evidence of contamination and should be reported to the Regional Administrator. Since the second sampling period mean is within the prediction interval, the Regional Administrator may allow the facility to remain in its current stage of monitoring.

INTERPRETATION

A prediction interval is a statistical interval constructed from background sample data to contain a specified number of future observations from the same distribution with specified probability. That is, the prediction interval is constructed so as to have a 95% probability of containing the next K sampling period means, provided that there is no contamination. If the future observations are found to be in the prediction interval, this is evidence that there has been no change at the facility and that no contamination is occurring. If a future observation falls outside of the prediction interval, this is statistical evidence that the new observation does not come from the same distribution, that is, from the population of uncontaminated water samples previously sampled. Consequently, if the observation is a concentration above the prediction interval's upper limit, it is statistically significant evidence of contamination.
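The chlordane example can be reproduced with a short Python sketch. It is illustrative only: the function names are hypothetical, the value t = 2.09 is entered by hand from the example instead of being looked up, and the Bonferroni floor follows the Remark in the procedure (per-comparison level of 5%/K, but never below 1%).

```python
import math


def per_comparison_alpha(k: int, alpha: float = 0.05, floor: float = 0.01) -> float:
    """Bonferroni per-comparison level, restricted to be at least `floor`."""
    return max(alpha / k, floor)


def prediction_upper_limit(mean: float, sd: float, t: float, m: int, n: int) -> float:
    """Upper end of the one-sided prediction interval for a mean of m future
    observations, based on n background observations."""
    return mean + sd * t * math.sqrt(1.0 / m + 1.0 / n)


# Values from the example: background mean 101, SD 11, n = 24, m = 4, t = 2.09.
upper = prediction_upper_limit(101.0, 11.0, 2.09, m=4, n=24)

# Sampling-period means reported in Step 3 of the example.
period_means = {1: 122.0, 2: 113.0}
exceeds = {period: value > upper for period, value in period_means.items()}

print(round(upper, 1), exceeds)
print(per_comparison_alpha(2), per_comparison_alpha(10))
```

With K = 2 the per-comparison level is 2.5%, which is the level behind the tabled t value used above; with ten comparisons the floor of 1% applies.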
TABLE 5-6. EXAMPLE DATA FOR PREDICTION INTERVAL--CHLORDANE LEVELS
Columns: Background well data--Well 1 (sampling date, chlordane concentration); Compliance well data--Well 2 (sampling date, chlordane concentration).

The prediction interval could be constructed in several ways. It can be developed for means of observations at each sampling period, or for each individual observation at each sampling period.

It should also be noted that the estimate of the standard deviation, S, that is used should be an unbiased estimator. The usual estimator, presented above, assumes that there is only one source of variation. If there are other sources of variation, such as time effects, or spatial variation in the data used for the background, these should be included in the estimate of the variability. This can be accomplished by use of an appropriate analysis-of-variance model to include the other factors affecting the variability. Determination of the components of variance in complicated models is beyond the scope of this document and requires consultation with a professional statistician.

REFERENCE

Hahn, G. and Wayne Nelson. 1973. "A Survey of Prediction Intervals and Their Applications." Journal of Quality Technology. 5: 178-188.

SECTION 6

COMPARISONS WITH MCLs OR ACLs

This section includes statistical procedures appropriate when the monitoring aims at determining whether ground-water concentrations of hazardous constituents are below or above fixed concentration limits. In this situation the maximum concentration limit (MCL) or alternate concentration limit (ACL) is a specified concentration limit rather than being determined by the background well concentrations. Thus the applicable statistical procedures are those that compare the compliance well concentrations estimated from sampling with the prespecified fixed limits. Methods for comparing compliance well concentrations to a (variable) background concentration were presented in Section 5.
The methods applicable to the type of comparisons described in this section include confidence intervals and tolerance intervals. A special section deals with cases where the observations exhibit very small or no variability.

6.1 SUMMARY CHART FOR COMPARISON WITH MCLs OR ACLs

Figure 6-1 is a flow chart to aid the user in selecting and applying a statistical method when the permit specifies an MCL or ACL.

As with each type of comparison, a determination is made first to see if there are enough data for intra-well comparisons. If so, these should be done in parallel with the other comparisons.

Here, whether the compliance limit is a maximum concentration limit (MCL) or an alternate concentration limit (ACL), the recommended procedure to compare the mean compliance well concentration against the compliance limit is the construction of a confidence interval. This approach is presented in Section 6.2.1. If the permit requires that a compliance limit is not to be exceeded more than a specified fraction of the time, then the construction of tolerance limits is the recommended procedure, discussed in Section 6.2.2. Section 6.2.3 adds a special case of limited variance in the data.

6.2 STATISTICAL PROCEDURES

This section presents the statistical procedures appropriate for comparison of ground-water monitoring data to a constant compliance limit, a fixed standard. The interpretation of the fixed compliance limit (MCL or ACL) is that the mean concentration should not exceed this fixed limit.
An alternate interpretation may be specified. The permit could specify a compliance limit as a concentration not to be exceeded by more than a small, specified proportion of the observations. A tolerance interval approach for such a situation is also presented.

[Figure 6-1. Comparisons with MCLs/ACLs. Flow chart: comparisons with MCLs/ACLs (Section 6) run alongside intra-well comparisons by control charts (Section 7) if more than 1 year of data is available. For comparisons with the mean: if the data are normal, use confidence intervals; otherwise take the log of the data and, if lognormal, use lognormal confidence intervals; otherwise use nonparametric confidence intervals or consult with a professional statistician; then draw conclusions.]

6.2.1 Confidence Intervals

When a regulated unit is in compliance monitoring with a fixed compliance limit (either an MCL or an ACL), confidence intervals are the recommended procedure pursuant to §264.97(h)(5) in the Subpart F regulations. The unit will remain in compliance monitoring unless there is statistically significant evidence that the mean concentration at one or more of the downgradient wells exceeds the compliance limit. A confidence interval for the mean concentration is constructed from the sample data for each compliance well individually. These confidence intervals are compared with the compliance limit. If the entire confidence interval exceeds the compliance limit, this is statistically significant evidence that the mean concentration exceeds the compliance limit.

Confidence intervals can generally be constructed for any specified distribution. General methods can be found in texts on statistical inference, some of which are referenced in Appendix C. A confidence limit based on the normal distribution is presented first, followed by a modification for the log-normal distribution. A nonparametric confidence interval is also presented.
6.2.1.1 Confidence Interval Based on the Normal Distribution

PURPOSE

The confidence interval for the mean concentration is constructed from the compliance well data. Once the interval has been constructed, it can be compared with the MCL or ACL by inspection to determine whether the mean concentration significantly exceeds the MCL or ACL.

PROCEDURE

Step 1. Calculate the mean, $\bar{X}$, and standard deviation, $S$, of the sample concentration values. Do this separately for each compliance well.

Step 2. For each well calculate the confidence interval as

$$\bar{X} \pm t_{(0.99,\,n-1)} \frac{S}{\sqrt{n}}$$

where $t_{(0.99,\,n-1)}$ is obtained from the t-table (Table 6, Appendix B). Generally, there will be at least four observations at each sampling period, so t will usually have at least 3 degrees of freedom.

Step 3. Compare the intervals calculated in Step 2 to the compliance limit (the MCL or ACL, as appropriate). If the compliance limit is contained in the interval or is above the upper limit, the unit remains in compliance. If any well confidence interval's lower limit exceeds the compliance limit, this is statistically significant evidence of contamination.

REMARK

The 99th percentile of the t-distribution is used in constructing the confidence interval. This is consistent with an alpha (probability of Type I error) of 0.01, since the decision on compliance is made by comparing the lower confidence limit to the MCL or ACL. Although the interval as constructed with both upper and lower limits is a 98% confidence interval, the use of it is one-sided, which is consistent with the 1% alpha level of individual well comparisons.

EXAMPLE

Table 6-1 lists hypothetical concentrations of Aldicarb in three compliance wells. For illustration purposes, the MCL for Aldicarb has been set at 7 ppb. There is no evidence of nonnormality, so the confidence interval based on the normal distribution is used.
TABLE 6-1. EXAMPLE DATA FOR NORMAL CONFIDENCE INTERVAL--ALDICARB CONCENTRATIONS IN COMPLIANCE WELLS (ppb)
Columns: Sampling date; Well 1; Well 2; Well 3. MCL = 7 ppb.

Step 1. Calculate the mean and standard deviation of the concentrations for each compliance well. These statistics are shown in the table above.

Step 2. Obtain the 99th percentile of the t-distribution with (4 - 1) = 3 degrees of freedom from Table 6, Appendix B as 4.541. Then calculate the confidence interval for each well's mean concentration, where the usual convention of expressing the upper and lower limits of the confidence interval in parentheses separated by a comma has been followed.

Step 3. Compare each confidence interval to the MCL of 7 ppb. When this is done, the confidence interval for Well 1 lies entirely above the MCL of 7, indicating that the mean concentration of Aldicarb in Well 1 significantly exceeds the MCL. Similarly, the confidence interval for Well 2 lies entirely above the MCL of 7. This is significant evidence that the mean concentration in Well 2 exceeds the MCL. However, the confidence interval for Well 3 is mostly below the MCL. Thus, there is no statistically significant evidence that the mean concentration in Well 3 exceeds the MCL.

INTERPRETATION

The confidence interval is an interval constructed so that it should contain the true or population mean with specified confidence (98% in this case). If this interval does not contain the compliance limit, then the mean concentration must differ from the compliance limit. If the lower end of the interval is above the compliance limit, then the mean concentration must be significantly greater than the compliance limit, indicating noncompliance.
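The procedure of this section can be sketched in Python as follows. The concentrations below are invented purely for illustration and are not the Table 6-1 values; the function names are hypothetical, and the t value of 4.541 (99th percentile, 3 degrees of freedom) is the one quoted in Step 2 of the example.

```python
import math
import statistics

T_99_3DF = 4.541  # 99th percentile of t with 3 degrees of freedom (from the example)


def mean_confidence_interval(values: list[float], t_value: float) -> tuple[float, float]:
    """Two-sided interval: mean -/+ t * S / sqrt(n), with S the sample SD (n - 1)."""
    n = len(values)
    mean = statistics.mean(values)
    half_width = t_value * statistics.stdev(values) / math.sqrt(n)
    return mean - half_width, mean + half_width


def exceeds_limit(values: list[float], t_value: float, limit: float) -> bool:
    """True when the lower confidence limit lies above the compliance limit."""
    lower, _ = mean_confidence_interval(values, t_value)
    return lower > limit


MCL = 7.0  # ppb, as in the example

# Hypothetical wells (illustrative values only, not from Table 6-1).
well_a = [9.0, 10.0, 11.0, 10.0]
well_b = [5.0, 9.0, 6.0, 8.0]

for name, data in (("A", well_a), ("B", well_b)):
    lower, upper = mean_confidence_interval(data, T_99_3DF)
    print(name, round(lower, 2), round(upper, 2), exceeds_limit(data, T_99_3DF, MCL))
```

Only the lower limit enters the decision, which is why a 98% two-sided interval corresponds to a 1% one-sided test.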
6.2.1.2 Confidence Interval for Log-Normal Data

PURPOSE

The purpose of a confidence interval for the mean concentration of log-normal data is to determine whether there is statistically significant evidence that the mean concentration exceeds a fixed compliance limit. The interval gives a range that includes the true mean concentration with confidence 98%. The lower limit will be below the true mean with confidence 99%, corresponding to an alpha of 1%.

PROCEDURE

This procedure is used to construct a confidence interval for the mean concentration from the compliance well data when the data are log-normal (that is, when the logarithms of the data are normally distributed). Once the interval has been constructed, it can be compared with the MCL or ACL by inspection to determine whether the mean concentration significantly exceeds the MCL or ACL. Throughout the following procedures and examples, natural logarithms (ln) are used.

Step 1. Take the natural logarithm of each data point (concentration measurement). Also, take the natural logarithm of the compliance limit.

Step 2. Calculate the sample mean and standard deviation of the log-transformed data from each compliance well. (This is Step 1 of the previous section, working now with logarithms.)

Step 3. Form the confidence intervals for each compliance well as

$$\bar{X} \pm t_{(0.99,\,n-1)} \frac{S}{\sqrt{n}}$$

where $\bar{X}$ and $S$ are the mean and standard deviation of the log-transformed data and $t_{(0.99,\,n-1)}$ is from the t-distribution in Table 6 of Appendix B. Here t will typically have 3 degrees of freedom.

Step 4. Compare the confidence intervals found in Step 3 to the logarithm of the compliance limit found in Step 1. If the lower limit of the confidence interval lies above the logarithm of the compliance limit, there is statistically significant evidence that the unit is out of compliance. Otherwise, the unit is in compliance.

EXAMPLE

Table 6-2 contains EDB concentration data from three compliance wells at a hypothetical site. The MCL is assumed to be 20 ppb.
For demonstration purposes, the data are assumed not normal; a natural log-transformation normalized them adequately. The lower part of the table contains the natural logarithms of the concentrations.

TABLE 6-2. EXAMPLE DATA FOR LOG-NORMAL CONFIDENCE INTERVAL--EDB CONCENTRATIONS IN COMPLIANCE WELLS (ppb)
Columns: Sampling date; Well 1; Well 2; Well 3. ln(MCL) = 3.00.

Step 1. The logarithms of the data are used to calculate a confidence interval. Take the natural log of the concentrations in the top part of Table 6-2 to find the values given in the lower part of the table. For example, ln(24.2) = 3.19, . . ., ln(25.3) = 3.23. Also, take the logarithm of the MCL to find that ln(20) = 3.00.

Step 2. Calculate the mean and standard deviation of the log concentrations for each compliance well. These are shown in the table.

Step 3. Form the confidence intervals for each compliance well, where 4.541 is the value obtained from the t-table (Table 6 in Appendix B) as in the previous example.

Step 4. Compare the individual well confidence intervals with the MCL (expressed on the log scale). The natural log of the MCL of 20 ppb is 3.00. None of the individual well confidence intervals for the mean has a lower limit that exceeds this value, so none of the individual well mean concentrations is significantly different from the MCL.

Note: The lower and upper limits of the confidence interval for each well's mean concentration could be converted back to the original scale by taking antilogs. For example, on the original scale, the confidence intervals would be:

Well 1: (exp(1.72), exp(4.30)) or (5.58, 73.70)
Well 2: (exp(1.67), exp(5.51)) or (5.31, 262.43)
Well 3: (exp(1.90), exp(5.44)) or (6.69, 230.44)

These limits could be compared directly with the MCL of 20 ppb. It is generally easier to take the logarithm of the MCL rather than the antilogarithm of all of the intervals for comparison.
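The log-scale procedure can be sketched in Python as below. Apart from 24.2 and 25.3, which are quoted in Step 1, the concentrations are invented for illustration and are not the Table 6-2 data; the function name is hypothetical and the t value 4.541 is taken from the example.

```python
import math
import statistics

T_99_3DF = 4.541  # 99th percentile of t with 3 degrees of freedom (from the example)


def log_scale_confidence_interval(
    concentrations: list[float], t_value: float
) -> tuple[float, float]:
    """Confidence interval for the mean of the natural logs of the data."""
    logs = [math.log(c) for c in concentrations]
    n = len(logs)
    center = statistics.mean(logs)
    half_width = t_value * statistics.stdev(logs) / math.sqrt(n)
    return center - half_width, center + half_width


mcl = 20.0               # ppb, as in the example
log_mcl = math.log(mcl)  # about 3.00

# Hypothetical well: 24.2 and 25.3 appear in the example, the rest are invented.
well = [24.2, 12.0, 8.5, 25.3]

lower, upper = log_scale_confidence_interval(well, T_99_3DF)
out_of_compliance = lower > log_mcl

# Equivalent comparison on the original scale: antilog of the log-scale limits.
lower_ppb, upper_ppb = math.exp(lower), math.exp(upper)

print(round(log_mcl, 2), out_of_compliance, lower_ppb < mcl)
```

Because the exponential function is increasing, comparing the lower limit with ln(MCL) on the log scale and comparing its antilog with the MCL on the original scale always give the same decision.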
INTERPRETATION

If the original data are not normal, but the log-transformation adequately normalizes the data, the confidence interval (on the log scale) is an interval constructed so that the lower confidence limit should be less than the true or population mean (on the log scale) with specified confidence (99% in this case). If the lower end of the confidence interval exceeds the appropriate compliance limit, then the mean concentration must exceed that compliance limit. These results provide statistically significant evidence of contamination.

6.2.1.3 Nonparametric Confidence Interval

If the data do not adequately follow the normal distribution even after the logarithm transformation, a nonparametric confidence interval can be constructed. This interval is for the median concentration (which equals the mean if the distribution is symmetric). The nonparametric confidence interval is generally wider and requires more data than the corresponding normal distribution interval, and so the normal or log-normal distribution interval should be used whenever it is appropriate. It requires a minimum of seven (7) observations in order to construct an interval with a two-sided confidence coefficient of 98%, corresponding to a one-sided confidence coefficient of 99%. Consequently, it is applicable only for the pooled concentration of compliance wells at a single point in time or for special sampling to produce a minimum of seven observations at a single well during the sampling period.

PURPOSE

The nonparametric confidence interval is used when the raw data have been found to violate the normality assumption, a log-transformation fails to normalize the data, and no other specific distribution is assumed. It produces a simple confidence interval that is designed to contain the true or population median concentration with specified confidence (here 99%).
If this confidence interval contains the compliance limit, it is concluded that the median concentration does not differ significantly from the compliance limit. If the interval's lower limit exceeds the compliance limit, this is statistically significant evidence that the concentration exceeds the compliance limit and the unit is out of compliance.

PROCEDURE

Step 1. Within each compliance well, order the n data from least to greatest, denoting the ordered data by X(1), ..., X(n), where X(i) is the ith value in the ordered data.

Step 2. Determine the critical values of the order statistics as follows. If the minimum seven observations is used, the critical values are 1 and 7. Otherwise, find the smallest integer, M, such that the cumulative binomial distribution with parameters n (the sample size) and p = 0.5 is at least 0.99. Table 6-3 gives the values of M and n+1-M together with the exact confidence coefficient for sample sizes from 4 to 11. For larger samples, take as an approximation the nearest integer value to

$$M = \frac{n}{2} + 1 + Z_{0.99}\sqrt{\frac{n}{4}}$$

where $Z_{0.99}$ is the 99th percentile from the normal distribution (Table 4, Appendix B) and equals 2.33.

TABLE 6-3. VALUES OF M AND n+1-M AND CONFIDENCE COEFFICIENTS FOR SMALL SAMPLES

n     M     n+1-M   Two-sided confidence
4     4     1       87.5%
5     5     1       93.8%
6     6     1       96.9%
7     7     1       98.4%
8     8     1       99.2%
9     9     1       99.6%
10    9     2       97.9%
11    10    2       98.8%

Step 3. Once M has been determined in Step 2, find n+1-M and take as the confidence limits the order statistics, X(n+1-M) and X(M). (With the minimum seven observations, use X(1) and X(7).)

Step 4. Compare the confidence limits found in Step 3 to the compliance limit. If the lower limit, X(n+1-M), exceeds the compliance limit, there is statistically significant evidence of contamination. Otherwise, the unit remains in compliance.
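Steps 2 and 3 can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the exact two-sided confidence is computed from the binomial distribution with p = 0.5 for a given pair (n, M), and the large-sample value of M uses the approximation M = n/2 + 1 + Z(0.99) * sqrt(n/4) with Z(0.99) = 2.33. The data passed to the order-statistic helper are placeholders, not the values of any table in this document.

```python
import math

Z_99 = 2.33  # 99th percentile of the standard normal distribution


def two_sided_confidence(n: int, m: int) -> float:
    """Exact confidence that (X(n+1-M), X(M)) contains the median.

    Each tail has probability P(B <= n - M) with B ~ Binomial(n, 0.5).
    """
    tail = sum(math.comb(n, i) for i in range(0, n - m + 1)) / 2 ** n
    return 1.0 - 2.0 * tail


def approximate_m(n: int, z: float = Z_99) -> int:
    """Large-sample approximation: nearest integer to n/2 + 1 + z * sqrt(n/4)."""
    return round(n / 2 + 1 + z * math.sqrt(n / 4))


def order_statistic_limits(values: list[float], m: int) -> tuple[float, float]:
    """Return (X(n+1-M), X(M)) from the ordered data (1-based order statistics)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n - m], ordered[m - 1]


# Exact confidence for a few small samples.
for n, m in [(4, 4), (7, 7), (10, 9), (11, 10)]:
    print(n, m, n + 1 - m, f"{100 * two_sided_confidence(n, m):.1f}%")

# Large-sample case with 16 observations (placeholder data 1..16).
m16 = approximate_m(16)
print(m16, order_statistic_limits([float(i) for i in range(1, 17)], m16))
```

The decision in Step 4 then uses only the first element of the returned pair, the lower limit, compared against the MCL or ACL.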
REMARK

The nonparametric confidence interval procedure requires at least seven observations in order to obtain a (one-sided) significance level of 1% (confidence of 99%). This means that data from two (or more) wells or sampling periods would have to be pooled to achieve this level. If only the four observations from one well taken at a single sampling period were used, the one-sided significance level would be 6.25%. This would also be the false alarm rate.

Ties do not affect the procedure. If there are ties, order the observations as before, including all of the tied values as separate observations. That is, each of the observations with a common value is included in the ordered list (e.g., 1, 2, 2, 2, 3, 4, etc.). For ties, use the average of the tied ranks as in Section 5.2.2, Step 1 of the example. The order statistics are found by counting positions up from the bottom of the list as before. Multiple values from separate observations are counted separately.

EXAMPLE

Table 6-4 contains concentrations of T-29 in parts per million from two hypothetical compliance wells. The data are assumed to consist of four samples taken each quarter for a year, so that sixteen observations are available from each well. The data are not normally distributed, neither as raw data nor when log transformed. Thus, the nonparametric confidence interval is used. The MCL is taken to be 15 ppm.

TABLE 6-4. EXAMPLE DATA FOR NONPARAMETRIC CONFIDENCE INTERVAL--T-29 CONCENTRATIONS (ppm)
Columns: Sampling date; Well 1 concentration (ppm), rank; Well 2 concentration (ppm), rank.

Step 1. Order the 16 measurements from least to greatest within each well separately. The numbers in parentheses beside each concentration in Table 6-4 are the ranks or order of the observation. For example, in Well 1, the smallest observation is 2.32, which has rank 1. The second smallest is 3.17, which has rank 2, and so forth, with the largest observation of 21.36 having rank 16.
Step 2. The sample size is large enough so that the approximation is used to find M:

$$M = \frac{16}{2} + 1 + 2.33\sqrt{\frac{16}{4}} = 13.7 \approx 14.$$

Step 3. The approximate 98% confidence limits are given by the 16 + 1 - 14 = 3rd and the 14th observations in the ordered list (counting up from the smallest). For Well 1, the 3rd observation is 3.39 and the 14th observation is 10.25. Thus the confidence limits for Well 1 are (3.39, 10.25). Similarly for Well 2, the 3rd and the 14th observations are found to give the confidence interval (2.20, 11.02). Note that for Well 2 there were two values below detection. These were assigned a value of zero and received the two smallest ranks. Had there been three or more values below the limit of detection, the lower limit of the confidence interval would have been the limit of detection because these values would have been the smallest values and so would have included the third order statistic.

Step 4. Neither of the two confidence intervals' lower limits exceeds the MCL of 15. In fact, the upper limit is less than the MCL in both cases, implying that the concentration in each well is significantly below the MCL.

INTERPRETATION

The rank-order statistics used to form the confidence interval in the nonparametric confidence interval procedure will contain the population median with confidence coefficient of 98%. The population median equals the mean whenever the distribution is symmetric. The nonparametric confidence interval is generally wider and requires more data than the corresponding normal distribution interval, and so the normal or log-normal distribution interval should be used whenever it is appropriate.
If the confidence interval contains the compliance limit (either MCL or ACL), then it is reasonable to conclude that the median compliance well concentration does not differ significantly from the compliance limit. If the lower end of the confidence interval exceeds the compliance limit, this is statistically significant evidence at the 1% level that the median compliance well concentration exceeds the compliance limit and the unit is out of compliance.

6.2.2 Tolerance Intervals for Compliance Limits

In some cases a permit may specify that a compliance limit (MCL or ACL) is not to be exceeded more than a specified fraction of the time. Since limited data will be available from each monitoring well, these data can be used to estimate a tolerance interval for concentrations from that well. If the upper end of the tolerance interval (i.e., upper tolerance limit) is less than the compliance limit, the data indicate that the unit is in compliance. That is, concentrations should be less than the compliance limit at least a specified fraction of the time. If the upper tolerance limit of the interval exceeds the compliance limit, then the concentration of the hazardous constituent could exceed the compliance limit more than the specified proportion of the time.

This procedure compares an upper tolerance limit to the MCL or ACL. With small sample sizes the upper tolerance limit can be fairly large, particularly if large coverage with high confidence is desired. If the owner or operator wishes to use a tolerance limit in this application, he/she should suggest values for the parameters of the procedure subject to the approval of the Regional Administrator. For example, the owner or operator could suggest a 95% coverage with 95% confidence. This means that the upper tolerance limit is a value which, with 95% confidence, will be exceeded less than 5% of the time.
PURPOSE

The purpose of the tolerance interval approach is to construct an interval that should contain a specified fraction of the concentration measurements from compliance wells with a specified degree of confidence. In this application it is generally desired to have the tolerance interval contain 95% of the measurements of concentration with confidence at least 95%.

PROCEDURE

It is assumed that the data used to construct the tolerance interval are approximately normal. The data may consist of the concentration measurements themselves if they are adequately normal (see Section 4.2 for tests of normality), or the data used may be the natural logarithms of the concentration data. It is important that the compliance limit (MCL or ACL) be expressed in the same units (either concentrations or logarithm of the concentrations) as the observations.

Step 1. Calculate the mean, $\bar{X}$, and the standard deviation, $S$, from the compliance well concentration data.

Step 2. Construct the one-sided upper tolerance limit as $TL = \bar{X} + K S$, where $K$ is the tolerance factor for the sample size $n$. Table 5, Appendix B contains the factors for a 95% coverage tolerance interval with confidence factor 95%.

Step 3. Compare the upper limit of the tolerance interval computed in Step 2 to the compliance limit. If the upper limit of the tolerance interval exceeds that limit, this is statistically significant evidence of contamination.

EXAMPLE

Table 6-5 contains Aldicarb concentrations at a hypothetical facility in compliance monitoring. The data are concentrations in parts per million (ppm) and represent observations at three compliance wells. Assume that the permit establishes an ACL of 50 ppm that is not to be exceeded more than 5% of the time.

Step 1. Calculate the mean and standard deviation of the observations from each well. These are given in the table.
TABLE 6-5. EXAMPLE DATA FOR A TOLERANCE INTERVAL COMPARED TO AN ACL

                   Aldicarb concentrations (ppm)
Sampling date      Well 1     Well 2     Well 3
Jan. 1             19.9       23.7       25.6
Feb. 1             29.6       21.9       23.3
Mar. 1             18.7       26.9       22.3
Apr. 1             24.2       26.1       26.9

Mean =             23.1       24.7       24.5
SD =               4.93       2.28       2.10

ACL = 50 ppm

Step 2. For n = 4, the factor, K, in Table 5, Appendix B, is found to be 5.145. Form the upper tolerance interval limits as:

Well 1: 23.1 + 5.145(4.93) = 48.5
Well 2: 24.7 + 5.145(2.28) = 36.4
Well 3: 24.5 + 5.145(2.10) = 35.3

Step 3. Compare the tolerance limits with the ACL of 50 ppm. Since the upper tolerance limits are below the ACL, there is no statistically significant evidence of contamination at any well. The site remains in detection monitoring.

INTERPRETATION

It may be desirable in a permit to specify a compliance limit that is not to be exceeded more than 5% of the time. A tolerance interval constructed from the compliance well data provides an estimated interval that will contain 95% of the data with confidence 95%. If the upper limit of this interval is below the selected compliance limit, concentrations measured at the compliance wells should exceed the compliance limit less than 5% of the time. If the upper limit of the tolerance interval exceeds the compliance limit, then more than 5% of the concentration measurements would be expected to exceed the compliance limit.

6.2.3 Special Cases with Limited Variance

Occasionally, all four concentrations from a compliance well at a particular sampling period could be identical. If this is the case, the formula for estimating the standard deviation at that specific sampling period would give zero, and the methods for calculating parametric confidence intervals would give the same limits for the upper and lower ends of the intervals, which is not appropriate.
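The tolerance limit comparison of Section 6.2.2 can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the Aldicarb example: the function and variable names are assumptions made for the sketch, and the factor K = 5.145 is simply the value quoted in the example for n = 4 (it is not computed here).

```python
from statistics import mean, stdev

# One-sided tolerance factor for n = 4, 95% coverage, 95% confidence,
# as quoted in the example (Table 5, Appendix B of the guidance).
K_N4 = 5.145
ACL = 50.0  # ppm, the compliance limit assumed in the example


def upper_tolerance_limit(values, k):
    """Return mean + k * S, where S is the sample standard deviation (n - 1 divisor)."""
    return mean(values) + k * stdev(values)


# Aldicarb concentrations (ppm) from Table 6-5.
wells = {
    "Well 1": [19.9, 29.6, 18.7, 24.2],
    "Well 2": [23.7, 21.9, 26.9, 26.1],
    "Well 3": [25.6, 23.3, 22.3, 26.9],
}

results = {name: upper_tolerance_limit(obs, K_N4) for name, obs in wells.items()}

for name, limit in results.items():
    status = "exceeds" if limit > ACL else "is below"
    print(f"{name}: upper tolerance limit {limit:.1f} ppm {status} the ACL")
```

For other sample sizes the factor K would have to be looked up again; the sketch does not attempt to reproduce the tolerance factor table.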
In the case of identical concentrations, one should assume that there is some variation in the data, but that the concentrations were rounded and give the same values after rounding. To account for the variability that was present before rounding, take the least significant digit in the reported concentration as having resulted from rounding. Assume that rounding results in a uniform error on the interval centered at the reported value with the interval ranging up or down one half unit from the reported value. This assumed rounding is used to obtain a nonzero estimate of the variance for use in cases where all the measured concentrations were found to be identical.

PURPOSE

The purpose of this procedure is to obtain a nonzero estimate of the variance when all observations from a well during a given sampling period gave identical results. Once this modified variance is obtained, its square root is used in place of the usual sample standard deviation, S, to construct confidence intervals or tolerance intervals.

PROCEDURE

Step 1. Determine the least significant value of any data point. That is, determine whether the data were reported to the nearest 10 ppm, nearest 1 ppm, nearest 100 ppm, etc. Denote this value by 2R.

Step 2. The data are assumed to have been rounded to the nearest 2R, so each observation is actually the reported value ±R. Assuming that the observations were identical because of rounding, the variance is estimated to be $R^2/3$, assuming the uniform distribution for the rounding error. This gives the estimated standard deviation as $S' = R/\sqrt{3}$.

Step 3. Take this estimated value from Step 2 and use it as the estimate of the standard deviation in the appropriate parametric procedure. That is, replace S by S'.

EXAMPLE

In calculating a confidence interval for a single compliance well, suppose that four observations were taken during a sampling period and all resulted in 590 ppm. There is no variance among the four values 590, 590, 590, and 590.
Step 1. Assume that each of the values 590 came from rounding the concentration to the nearest 10 ppm. That is, 590 could actually be any value between 585.0 and 594.99. Thus, 2R is 10 ppm (rounded off), so R is 5 ppm.

Step 2. The estimate of the standard deviation is $S' = 5/\sqrt{3} = 2.89$ ppm.

Step 3. Replace S by S' = 2.89 in the formula for the confidence interval of the mean to obtain $590 \pm 4.541 \times 2.89/\sqrt{4} = 590 \pm 6.6$, that is, (583.4, 596.6), as the 98% confidence interval of the average concentration. Note that 4.541 is the 99th percentile from the t-distribution (Table 6, Appendix B) with 3 degrees of freedom since the sample size was 4.

INTERPRETATION

When identical results are obtained from several different samples, the interpretation is that the data are not reported to enough significant figures to show the random differences. If there is no extrinsic evidence invalidating the data, the data are regarded as having resulted from rounding more precise results to the reported observations. The rounding is assumed to result in variability that follows the uniform distribution on the range ±R, where 2R is the smallest unit of reporting. This assumption is used to calculate a standard deviation for the observations that otherwise appear to have no variability.

REMARK

Assuming that the data are reported correctly to the units indicated, other distributions for the rounding variability could be assumed. The maximum standard deviation that could result from rounding when the observation is ±R is the value R.

SECTION 7

CONTROL CHARTS FOR INTRA-WELL COMPARISONS

The previous sections cover various situations where the compliance well data are compared to the background well data or to specified concentration limits (ACL or MCL) to detect possible contamination.
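The limited-variance procedure of Section 6.2.3 can be illustrated with a short Python sketch. It uses the uniform-rounding variance $R^2/3$ from the procedure and, as an assumption of this sketch, the usual t-based interval for a mean (mean ± t × S'/√n). The function names are hypothetical; the inputs (reporting unit of 10 ppm, four readings of 590 ppm, t = 4.541) are those of the example.

```python
import math


def rounding_std(reporting_unit):
    """Standard deviation of a uniform rounding error on +/- R, where 2R is the reporting unit."""
    r = reporting_unit / 2.0
    return r / math.sqrt(3.0)


def mean_confidence_interval(mean_value, s, n, t_value):
    """Two-sided interval mean +/- t * s / sqrt(n) (assumed form of the parametric interval)."""
    half_width = t_value * s / math.sqrt(n)
    return mean_value - half_width, mean_value + half_width


# Four identical readings of 590 ppm, reported to the nearest 10 ppm.
s_prime = rounding_std(10.0)
lower, upper = mean_confidence_interval(590.0, s_prime, n=4, t_value=4.541)

print(f"S' = {s_prime:.2f} ppm")
print(f"98% confidence interval: ({lower:.1f}, {upper:.1f}) ppm")
```

The t value is passed in rather than computed, so the sketch stays within the standard library; in practice it would be read from a t table for n - 1 degrees of freedom.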
This section discusses the case where the level of each constituent within a single uncontaminated well is being monitored over time. In essence, the data for each constituent in each well are plotted on a time scale and inspected for obvious features such as trends or sudden changes in concentration levels. The method suggested here is a combined Shewhart-CUSUM control chart for each well and constituent.

The control chart method is recommended for uncontaminated wells only, when data comprising at least eight independent samples over a one-year period are available. This requirement is specified under current RCRA regulations and applies to each constituent in each well.

As discussed in Section 2, a common sampling plan will obtain four independent samples from each well on a semi-annual basis. With this plan a control chart can be implemented when one year's data are available. As a result of Monte Carlo simulations, Starks (1988) recommended at least four sampling periods at a unit of eight or more wells, and at least eight sampling periods at a unit with fewer than four wells.

The use of control charts can be an effective technique for monitoring the levels of a constituent at a given well over time. It also provides a visual means of detecting deviations from a "state of control." It is clear that plotting of the data is an important part of the analysis process. Plotting is an easy task, although time-consuming if many data sets need to be plotted. Advantage should be taken of graphics software, since plotting of time series data will be an ongoing process. New data points will be added to the already existing data base each time new data are available.
The following few sections discuss, in general terms, the advantages of plotting time series data and the corrective steps one could take when seasonality is present in the data, and then present the detailed procedure for constructing a Shewhart-CUSUM control chart, along with a demonstration of that procedure.

7.1 ADVANTAGES OF PLOTTING DATA

While analyzing the data by means of any of the appropriate statistical procedures discussed in earlier sections is recommended, we also recommend plotting the data. Each data point should be plotted against time using a time scale (e.g., month, quarter). A plot should be generated for each constituent measured in each well. For visual comparison purposes, the scale should be kept identical from well to well for a given constituent.

Another important application of the plotting procedure is for detecting possible trends or drifts in the data from a given well. Furthermore, when visually comparing the plots from several wells within a unit, possible contamination of one rather than all downgradient wells could be detected, which would then warrant a closer look at that well. In general, graphs can provide highly effective illustrations of the time series, allowing the analyst to obtain a much greater sense of the data. Seasonal fluctuations or sudden changes, for example, may become quite evident, thereby supporting the analyst in his/her decision of which statistical procedure to use. General upward or downward trends, if present, can be detected and the analyst can follow up with a test for trend, such as the nonparametric Mann-Kendall test (Mann, 1945; Kendall, 1975). If, in addition, seasonality is suspected, the user can perform the seasonal Kendall test for trend developed by Hirsch et al.
(1982). The reader is also referred to Chapters 16 "Detecting and Estimating Trends" and 17 "Trends and Seasonality" of Gilbert's "Statistical Methods for Environmental Pollution Monitoring," 1987. In any of the above cases, the help of a professional statistician is recommended.

Another important use of data plots is that of identifying unusual data points (e.g., outliers). These points should then be investigated for possible QC problems, data entry errors, or whether they are truly outliers.

Many software packages are available for computer graphics, developed for mainframes, mini-, or microcomputers. For example, SAS features an easy-to-use plotting procedure, PROC PLOT; where the hardware and software are available, a series of more sophisticated plotting routines can be accessed through SAS/GRAPH. On microcomputers, almost everybody has his or her favorite graphics software that they use on a regular basis and no recommendation will be made as to the most appropriate one. The plots shown in this document were generated using LOTUS 1-2-3.

Once the data for each constituent and each well are plotted, the plots should be examined for seasonality and a correction is recommended should seasonality be present. A fairly simple-to-use procedure for deseasonalizing data is presented in the following paragraphs.

7.2 CORRECTING FOR SEASONALITY

A necessary precaution before constructing a control chart is to take into account seasonal variation of the data to minimize the chance of mistaking seasonal effect for evidence of well contamination. This could result from variations in chemical concentrations with recharge rates during different seasons throughout the years. If seasonality is present, then deseasonalizing the data prior to using the combined Shewhart-CUSUM control chart procedure is recommended.

Many approaches to deseasonalize data exist.
If the seasonal pattern is regular, it may be modeled with a sine or cosine function. Moving averages can be used, or differences (of order 12 for monthly data for example) can be used. However, time series models may include rather complicated methods for deseasonalizing the data. Another simpler method exists which should be adequate for the situations described in this document. It has the advantage of being easy to understand and apply, and of providing natural estimates of the monthly or quarterly effects via the monthly or quarterly means. The method proposed here can be applied to any seasonal cycle--typically an annual cycle for monthly or quarterly data.

NOTE

Corrections for seasonality should be used with great caution as they represent extrapolation into the future. There should be a good scientific explanation for the seasonality as well as good empirical evidence for the seasonality before corrections are made. Larger than average rainfalls for two or three Augusts in a row do not justify the belief that there will never be a drought in August, and this idea extends directly to groundwater quality. In addition, the quality (bias, robustness, and variance) of the estimates of the proper corrections must be considered even in cases where corrections are called for. If seasonality is suspected, the user might want to seek the help of a professional statistician.

PURPOSE

When seasonality is known to exist in a time series of concentrations, then the data should be deseasonalized prior to constructing control charts in order to take into account seasonal variation rather than mistaking seasonal effects for evidence of contamination.

PROCEDURE

The following instructions to adjust a time series for seasonality are based on monthly data with a yearly cycle. The procedure can be easily modified to accommodate a yearly cycle of quarterly data.
Assume that N years of monthly data are available. Let $X_{ij}$ denote the unadjusted observation for the ith month during the jth year.

Step 1. Compute the average concentration for month i over the N-year period:

$$\bar{X}_i = (X_{i1} + \cdots + X_{iN})/N$$

This is the average of all observations taken in different years but during the same month. That is, calculate the mean concentrations for all Januarys, then the mean for all Februarys and so on for each of the 12 months.

Step 2. Calculate the grand mean, $\bar{X}$, of all 12N observations:

$$\bar{X} = \sum_{i=1}^{12} \sum_{j=1}^{N} X_{ij} / (12N)$$

Step 3. Compute the adjusted concentrations,

$$Z_{ij} = X_{ij} - \bar{X}_i + \bar{X}$$

EXAMPLE

Columns 2 through 4 of Table 7-1 show monthly unadjusted concentrations of a fictitious analyte over a 3-year period.

TABLE 7-1. EXAMPLE COMPUTATION FOR DESEASONALIZING DATA

             Unadjusted concentrations   3-year monthly   Monthly adjusted concentrations
Month        1983     1984     1985      average          1983     1984     1985
January      1.99     2.01     2.15      2.05             2.10     2.13     2.27
February     2.10     2.10     2.17      2.12             2.14     2.15     2.21
March        2.12     2.17     2.27      2.19             2.10     2.15     2.25
April        2.12     2.13     2.23      2.16             2.13     2.14     2.24
May          2.11     2.13     2.24      2.16             2.12     2.13     2.25
June         2.15     2.18     2.26      2.20             2.12     2.15     2.23
July         2.19     2.25     2.31      2.25             2.11     2.16     2.23
August       2.18     2.24     2.32      2.25             2.10     2.16     2.24
September    2.16     2.22     2.28      2.22             2.11     2.17     2.22
October      2.08     2.13     2.22      2.14             2.10     2.16     2.24
November     2.05     2.08     2.19      2.11             2.11     2.14     2.25
December     2.08     2.16     2.22      2.16             2.09     2.17     2.23

Overall 3-year average = 2.17

Step 1. Compute the monthly averages across the 3 years. These values are shown in the fifth column of Table 7-1.

Step 2. The grand mean over the 3-year period is calculated to be 2.17.

Step 3. Within each month and year, subtract the average monthly concentration for that month and add the grand mean. For example, for January 1983, the adjusted concentration becomes

1.99 - 2.05 + 2.17 = 2.11

The adjusted concentrations are shown in the last three columns of Table 7-1. The reader can check that the average of all 36 adjusted concentrations equals 2.17, the average unadjusted concentration. Figure 7-1 shows the plot of the unadjusted and adjusted data. The raw data clearly exhibit seasonality as well as an upwards trend which is less evident by simply looking at the data table.

INTERPRETATION

As can be seen in Figure 7-1, seasonal effects were present in the data. After adjusting for monthly effects, the seasonality was removed as can be seen in the adjusted data plotted in the same figure.

7.3 COMBINED SHEWHART-CUSUM CONTROL CHARTS FOR EACH WELL AND CONSTITUENT

Control charts are widely used as a statistical tool in industry as well as research and development laboratories. The concept of control charts is relatively simple, which makes them attractive to use. From the population distribution of a given variable, such as concentrations of a given constituent, repeated random samples are taken at intervals over time. Statistics, for example the mean of replicate values at a point in time, are computed and plotted together with upper and/or lower predetermined limits on a chart where the x-axis represents time. If a result falls outside these boundaries, then the process is declared to be "out of control"; otherwise, the process is declared to be "in control." The widespread use of control charts is due to their ease of construction and the fact that they can provide a quick visual evaluation of a situation, and remedial action can be taken, if necessary.

In the context of ground water monitoring, control charts can be used to monitor the inherent statistical variation of the data collected within a single well, and to flag anomalous results. Further investigation of data points lying outside the established boundaries will be necessary before any direct action is taken.
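Returning to the deseasonalizing procedure of Section 7.2, the three steps (monthly means, grand mean, adjustment) can be sketched in Python as follows. The function name and data layout are assumptions of this sketch; the numbers are the unadjusted concentrations from Table 7-1. Because the sketch works with unrounded means, individual adjusted values may differ from the printed table in the last digit.

```python
def deseasonalize(table):
    """Adjust monthly data for seasonality.

    table: list of 12 rows (one per month), each holding one observation per year.
    Returns (monthly_means, grand_mean, adjusted) where
    adjusted[i][j] = table[i][j] - monthly_means[i] + grand_mean.
    """
    n_years = len(table[0])
    monthly_means = [sum(row) / n_years for row in table]
    grand_mean = sum(sum(row) for row in table) / (len(table) * n_years)
    adjusted = [
        [x - month_mean + grand_mean for x in row]
        for row, month_mean in zip(table, monthly_means)
    ]
    return monthly_means, grand_mean, adjusted


# Unadjusted concentrations for 1983, 1984, 1985 (rows are January..December).
unadjusted = [
    [1.99, 2.01, 2.15], [2.10, 2.10, 2.17], [2.12, 2.17, 2.27],
    [2.12, 2.13, 2.23], [2.11, 2.13, 2.24], [2.15, 2.18, 2.26],
    [2.19, 2.25, 2.31], [2.18, 2.24, 2.32], [2.16, 2.22, 2.28],
    [2.08, 2.13, 2.22], [2.05, 2.08, 2.19], [2.08, 2.16, 2.22],
]

monthly_means, grand_mean, adjusted = deseasonalize(unadjusted)
print(f"Grand mean: {grand_mean:.2f}")
print(f"January 1983 adjusted: {adjusted[0][0]:.2f}")
```

The adjustment shifts every month to a common level, so the mean of the adjusted values is unchanged; any trend across years is left in place, which is why the upward trend remains visible after adjustment.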
A control chart that can be used on a real time basis must be constructed from a data set large enough to characterize the behavior of a specific well. It is recommended that data from a minimum of eight samples within a year be collected for each constituent at each well to permit an evaluation of the consistency of monitoring results with the current concept of the hydrogeology of the site. Starks (1988) recommends a minimum of four sampling periods at a unit with eight or more wells and a minimum of eight sampling periods at a unit with fewer than four wells. Once the control chart for the specific constituent at a given well is acceptable, then subsequent data points can be plotted on it to provide a quick evaluation as to whether the process is in control.

[Figure 7-1. Plot of unadjusted and seasonally adjusted monthly observations. Time series of monthly observations (unadjusted, adjusted, 3-year mean) against time (month), January 1983 through September 1985; concentration axis from 1.98 to 2.32.]

The standard assumptions in the use of control charts are that the data generated by the process, when it is in control, are independently (see Section 2.4.2) and normally distributed with a fixed mean µ and constant variance σ². The most important assumption is that of independence; control charts are not robust with respect to departure from independence (e.g., serial correlation, see glossary). In general, the sampling scheme will be such that the possibility of obtaining serially correlated results is minimized, as noted in Section 2. The assumption of normality is of somewhat less concern, but should be investigated before plotting the charts. A transformation (e.g., log-transform, square root transform) can be applied to the raw data so as to obtain errors normally distributed about the mean. An additional situation which may decrease the effectiveness of control charts is seasonality in the data. The problem of seasonality can be handled by removing the seasonality effect from the data, provided that sufficient data to cover at least two seasons of the same type are available (e.g., 2 years when there is a monthly or quarterly seasonal effect). A procedure to correct a time series for seasonality was shown above in Section 7.2.

PURPOSE

Combined Shewhart-cumulative sum (CUSUM) control charts are constructed for each constituent at each well to provide a visual tool for detecting both trends and abrupt changes in concentration levels.

PROCEDURE

Assume that data from at least eight independent samples of monitoring are available to provide reliable estimates of the mean, µ, and standard deviation, σ, of the constituent's concentration levels in a given well.
Step 1. To construct a combined Shewhart-CUSUM chart, three parameters need to be selected prior to plotting:

h - a decision interval value
k - a reference value
SCL - Shewhart control limit (denoted by U in Starks (1988))

The parameter k of the CUSUM scheme is directly obtained from the value, D, of the displacement that should be quickly detected; k = D/2. It is recommended to select k = 1, which will allow a displacement of two standard deviations to be detected quickly.

When k is selected to be 1, the parameter h is usually set at values of 4 or 5. The parameter h is the value against which the cumulative sum in the CUSUM scheme will be compared. In the context of groundwater monitoring, a value of h = 5 is recommended (Starks, 1988; Lucas, 1982).

The upper Shewhart limit is set at SCL = 4.5 in units of standard deviation. This combination of k = 1, h = 5, and SCL = 4.5 was found most appropriate for the application of combined Shewhart-CUSUM charts for groundwater monitoring (Starks, 1988).

Step 2. Assume that at time period $T_i$, $n_i$ concentration measurements $X_1, \ldots, X_{n_i}$ are available. Compute their average $\bar{X}_i$.

Step 3. Calculate the standardized mean

$$Z_i = (\bar{X}_i - \mu)\sqrt{n_i}/\sigma$$

where µ and σ are the mean and standard deviation obtained from prior monitoring at the same well (at least four sampling periods in a year).

Step 4. At each time period, $T_i$, compute the cumulative sum, $S_i$, as:

$$S_i = \max\{0, (Z_i - k) + S_{i-1}\}$$

where max{A, B} is the maximum of A and B, starting with $S_0 = 0$.

Step 5. Plot the values of $S_i$ versus $T_i$ on a time chart for this combined Shewhart-CUSUM scheme. Declare an "out-of-control" situation at sampling period $T_i$ if, for the first time, $S_i > h$ or $Z_i > SCL$. This will indicate probable contamination at the well and further investigations will be necessary.

REFERENCES

Hockman, K. K., and J. M. Lucas. 1987. "Variability Reduction Through Subvessel CUSUM Control." Journal of Quality Technology. Vol. 19, pp. 113-121.

Lucas, J. M. 1982. "Combined Shewhart-CUSUM Quality Control Schemes." Journal of Quality Technology. Vol. 14, pp. 51-59.

Starks, T. H. 1988 (Draft). "Evaluation of Control Chart Methodologies for RCRA Waste Sites."

EXAMPLE

The procedure is demonstrated on a set of carbon tetrachloride measurements taken monthly at a compliance well over a 1-year period. The monthly means of two measurements each (n = 2 for all i's) are presented in the third column of Table 7-2 below. Estimates of µ and σ, the mean and standard deviation of carbon tetrachloride measurements at that particular well, were obtained from a preceding monitoring period at that well; µ = 5.5 µg/L and σ = 0.4 µg/L.

TABLE 7-2. EXAMPLE DATA FOR COMBINED SHEWHART-CUSUM CHART--CARBON TETRACHLORIDE CONCENTRATION (µg/L)

Parameters: Mean = 5.50; std = 0.4; k = 1; h = 5; SCL = 4.5.

a Indicates "out-of-control" process via Shewhart control limit (Zi > 4.5).
b CUSUM "out-of-control" signal (Si > 5).

Step 1. The three parameters necessary to construct a combined Shewhart-CUSUM chart were selected as h = 5; k = 1; SCL = 4.5 in units of standard deviation.

Step 2. The monthly means are presented in the third column of Table 7-2.

Step 3. Standardize the means within each sampling period. These computations are shown in the fourth column of Table 7-2. For example, $Z_1 = (5.52 - 5.50)\sqrt{2}/0.4 = 0.07$.

Step 4. Compute the quantities $S_i$, i = 1, ..., 12. For example,

S1 = max{0, -0.93 + 0} = 0
S2 = max{0, -0.65 + 0} = 0
...
S5 = max{0, 0.59 + S4} = 0.59
S6 = max{0, -0.86 + S5} = max{0, -0.86 + 0.59} = max{0, -0.27} = 0
etc.

These quantities are shown in the last column of Table 7-2.

Step 5. Construct the control chart. The y-axis is in units of standard deviations. The x-axis represents time, or the sampling periods. For each sampling period, $T_i$, record the value of $Z_i$ and $S_i$. Draw horizontal lines at values h = 5 and SCL = 4.5.
These two lines represent the upper control limits for the CUSUM scheme and the Shewhart control limit, respectively. The chart for this example data set is shown in Figure 7-2.

The combined chart indicates statistically significant evidence of contamination starting at sampling period T9. Both the CUSUM scheme and the Shewhart control limit were exceeded by S9 and Z9, respectively. Investigation of the situation should begin to confirm contamination and action should be required to bring the variability of the data back to its previous level.

INTERPRETATION

The combined Shewhart-CUSUM control scheme was applied to an example data set of carbon tetrachloride measurements taken on a monthly basis at a well. The statistic used in the construction of the chart was the mean of two measurements per sampling period. (It should be noted that this method can be used on an individual measurement as well, in which case ni = 1.) Estimates of the mean and standard deviation of the measurements were available from previous data collected at that well over at least four sampling periods.

The parameters of the combined chart were selected to be k = 1 unit, the reference value or allowable slack for the process; h = 5 units, the decision interval for the CUSUM scheme; and SCL = 4.5 units, the upper Shewhart control limit. All parameters are in units of σ, the standard deviation obtained from the previous monitoring results. Various combinations of parameter values can be selected. The particular values recommended here appear to be the best for the initial use of the procedure from a review of the simulations and recommendations in the references. A discussion on this subject is given by Lucas (1982), Hockman and Lucas (1987), and Starks (1988).
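The combined Shewhart-CUSUM computation can be sketched in Python as shown below. Two points are assumptions of this sketch rather than statements from the guidance: the standardized mean is taken as $Z_i = (\bar{X}_i - \mu)\sqrt{n_i}/\sigma$, which is consistent with the worked value $Z_1 - k = -0.93$ for a first monthly mean of 5.52, and the series of period means is hypothetical (only the first value, 5.52, comes from the example). The function name is also hypothetical.

```python
import math


def shewhart_cusum(period_means, n_per_period, mu, sigma, k=1.0, h=5.0, scl=4.5):
    """Return one (Z_i, S_i, out_of_control) tuple per sampling period.

    Z_i is the standardized mean (assumed form: (mean - mu) * sqrt(n) / sigma),
    S_i = max(0, (Z_i - k) + S_{i-1}) with S_0 = 0, and a period is flagged
    when S_i > h (CUSUM signal) or Z_i > scl (Shewhart signal).
    """
    results = []
    s_prev = 0.0
    for x_bar in period_means:
        z = (x_bar - mu) * math.sqrt(n_per_period) / sigma
        s = max(0.0, (z - k) + s_prev)
        results.append((z, s, s > h or z > scl))
        s_prev = s
    return results


# Hypothetical means of two measurements per period (ug/L); mu and sigma as in the example.
means = [5.52, 5.60, 5.45, 5.95, 6.10, 6.40, 6.90]
chart = shewhart_cusum(means, n_per_period=2, mu=5.5, sigma=0.4)

for period, (z, s, flagged) in enumerate(chart, start=1):
    note = "  <-- out of control" if flagged else ""
    print(f"T{period}: Z = {z:5.2f}, S = {s:5.2f}{note}")

first_signal = next((i + 1 for i, row in enumerate(chart) if row[2]), None)
```

Resetting the sum at zero (the `max(0, ...)` term) is what keeps a run of low values from masking a later increase; only upward departures accumulate.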
The choice of the parameters h and k of a CUSUM chart is based on the desired performance of the chart. The criterion used to evaluate a control scheme is the average number of samples or time periods before an out-of-control signal is obtained. This criterion is denoted by ARL or average run length. The ARL should be large when the mean concentration of a hazardous constituent is near its target value and small when the mean has shifted too far from the target. Tables have been developed by simulation methods to estimate ARLs for given combinations of the parameters (Lucas, 1982; Hockman and Lucas, 1987; Starks, 1988). The user is referred to these articles for further reading.

[Figure 7-2. Combined Shewhart-CUSUM chart (mean = 5.5; std = 0.4; k = 1; h = 5; SCL = 4.5). Standardized mean and CUSUM plotted against sampling period, 1 through 12.]

7.4 UPDATE OF A CONTROL CHART

The control chart is based on preselected performance parameters as well as on estimates of µ and σ, the parameters of the distribution of the measurements in question. As monitoring continues and the process is found to be in control, these parameters need periodic updating so as to incorporate this new information into the control charts. Starks (1988) has suggested that in general, adjustments in sample means and standard deviations be made after sampling periods 4, 8, 12, 20, and 32, following the initial monitoring period recommended to be at least eight sampling periods. Also, the performance parameters h, k, and SCL would need to be updated. The author suggests that h = 5, k = 1, and SCL = 4.5 be kept at those values for the first 12 sampling periods following the initial monitoring plan, and that k be reduced to 0.75 and SCL to 4.0 for all subsequent sampling periods.
These values and sampling period numbers are not mandatory. In the event of an out-of-control state or a trend, the control chart should not be updated.

7.5 NONDETECTS IN A CONTROL CHART

Regulations require that four independent water samples be taken at each well at a given sampling period. The mean of the four concentration measurements of a particular constituent is used in the construction of a control chart. Now situations will arise when the concentration of a constituent is below detection limit for one or more samples. The following approach is suggested for treating nondetects when plotting control charts.

If only one of the four measurements is a nondetect, then replace it with one half of the detection limit (MDL/2) or with one half of the practical quantitation limit (PQL/2) and proceed as described in Section 7.3.

If either two or three of the measurements are nondetects, use only the quantitated values (two or one, respectively) for the control chart and proceed as discussed earlier in Section 7.3.

If all four measurements are nondetects, then use one half of the detection limit or practical quantitation limit as the value for the construction of the control chart. This is an obvious situation of no contamination of the well.

In the event that a control chart requires updating and a certain proportion of the measurements is below detection limit, then adjust the mean and standard deviation necessary for the control chart by using Cohen's method described in Section 8.1.4. In that case, the proportion of nondetects applies to the pool of data available at the time of the updating and would include all nondetects up to that time, not just the four measurements taken at the last sampling period.
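The nondetect rules above for a single sampling period can be written as a small Python function. This is a sketch only: the function name, the use of `None` to mark a nondetect, and the single `limit` argument (standing for either the MDL or the PQL) are assumptions made for illustration.

```python
def control_chart_value(measurements, limit):
    """Value to plot for one sampling period of four measurements.

    measurements: four values, with None marking a nondetect.
    limit: the detection limit (MDL) or practical quantitation limit (PQL).
    """
    if len(measurements) != 4:
        raise ValueError("the rules are stated for four measurements per period")
    detects = [m for m in measurements if m is not None]
    n_nondetects = len(measurements) - len(detects)

    if n_nondetects == 0:
        return sum(detects) / 4
    if n_nondetects == 1:
        # Replace the single nondetect with half the limit, then average all four.
        return (sum(detects) + limit / 2) / 4
    if n_nondetects in (2, 3):
        # Use only the quantitated values.
        return sum(detects) / len(detects)
    # All four are nondetects: use half the limit.
    return limit / 2


one_nd = control_chart_value([4.0, 6.0, None, 5.0], limit=2.0)
two_nd = control_chart_value([4.0, None, None, 6.0], limit=2.0)
all_nd = control_chart_value([None, None, None, None], limit=2.0)
print(one_nd, two_nd, all_nd)
```

The sketch covers only the per-period value; updating the chart's mean and standard deviation when nondetects are present is handled separately by Cohen's method, which is not reproduced here.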
CAUTIONARY NOTE: Control charts are a useful supplement to other statistical techniques because they are graphical and simple to use. However, it is inappropriate to construct a control chart on wells that have shown evidence of contamination or an increasing trend (see §264.97(a)(1)(i)). Further, contamination may not be present in a well in the form of a steadily increasing concentration profile--it may be present intermittently or may increase in a step function. Therefore, the absence of an increasing trend does not necessarily prove that a release has not occurred.

SECTION 8

MISCELLANEOUS TOPICS

This chapter contains a variety of special topics that are relatively short and self contained. These topics include methods to deal with data below the limit of detection and methods to check for, and deal with, outliers or extreme values in the data.

8.1 LIMIT OF DETECTION

In a chemical analysis some compounds may be below the detection limit (DL) of the analytical procedure. These are generally reported as not detected (rather than as zero or not present) and the appropriate limit of detection is usually given. Data that include not detected results are a special case referred to as censored data in the statistical literature. For compounds not detected, the concentration of the compound is not known. Rather, it is only known that the concentration of the compound is less than the detection limit.

There are a variety of ways to deal with data that include values below detection. There is no general procedure that is applicable in all cases. However, there are some general guidelines that usually prove adequate. If these do not cover a specific situation, the user should consult a professional statistician for the most appropriate way to deal with the values below detection.
A summary of suggested approaches to deal with data below the detection limit is presented as Table 8-1. The method suggested depends on the amount of data below the detection limit. For small amounts of below detection values, simply replacing a "ND" (not detected) report with a small number, say the detection limit divided by two, and proceeding with the usual analysis is satisfactory. For moderate amounts of below detection limit data, a more detailed adjustment is appropriate, while for large amounts one may need to consider only whether a compound was detected or not as the variable of analysis.

The meaning of small, moderate, and large above is subject to judgment. Table 8-1 contains some suggested values. It should be recognized that these values are not hard and fast rules, but are based on judgment. If there is a question about how to handle values below detection, consult a statistician.

TABLE 8-1. METHODS FOR BELOW DETECTION LIMIT VALUES

Percentage of nondetects    Statistical analysis              Section of
in the data base            method                            guidance document
-------------------------------------------------------------------------------
Less than 15%               Replace NDs with MDL/2 or         Section 8.1.1
                            PQL/2, then proceed with
                            parametric procedures:
                              ANOVA                           Section 5.2.1
                              Tolerance limits                Section 5.3
                              Prediction intervals            Section 5.4
                              Control charts                  Section 7

Between 15 and 50%          Use NDs as ties, then proceed
                            with nonparametric ANOVA          Section 5.2.2
                            or use Cohen's adjustment,        Section 8.1.3
                            then proceed with:
                              Tolerance limits                Section 5.3
                              Confidence intervals            Section 6.2.1
                              Control charts                  Section 7

More than 50%               Test of proportions               Section 8.1.2

It should be noted that the nonparametric methods presented earlier automatically deal with values below detection by regarding them as all tied at a level below any quantitated results. The nonparametric methods may be used if there is a moderate amount of data below detection. If the proportion of nonquantified values in the data exceeds 25%, these methods should be used with caution. They should probably not be used if less than half of the data consists of quantified concentrations.

8.1.1 The DL/2 Method

The amount of data that are below detection plays an important role in selecting the method to deal with the limit of detection problem. If a small proportion of the observations are not detected, these may be replaced with a small number, usually the method detection limit divided by 2 (MDL/2), and the usual analysis performed. This is the recommended method for use with the analysis of variance procedure of Section 5.2.1. Seek professional help if in doubt about dealing with values below the detection limit. The results of the analysis are generally not sensitive to the specific choice of the replacement number.

As a guideline, if 15% or fewer of the values are not detected, replace them with the method detection limit divided by two and proceed with the appropriate analysis using these modified values. Practical quantitation limits (PQL) for Appendix IX compounds were published by EPA in the Federal Register (Vol 52, No 131, July 9, 1987, pp 25947-25952). These give practical quantitation limits by compound and analytical method that may be used in replacing a small amount of nondetected data with the quantitation limit divided by 2. If approved by the Regional Administrator, site-specific PQLs may be used in this procedure. If more than 15% of the values are reported as not detected, it is preferable to use a nonparametric method or a test of proportions.
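The substitution rule and the percentage guidelines above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: nondetects are represented by `None`, a single detection limit applies to the whole data set, the function names are hypothetical, the sample results are made up, and the 15% boundary follows the "15% or fewer" wording of Section 8.1.1.

```python
from typing import List, Optional


def nondetect_fraction(values: List[Optional[float]]) -> float:
    """Return the fraction of results reported as not detected (None)."""
    if not values:
        raise ValueError("at least one result is required")
    return sum(v is None for v in values) / len(values)


def substitute_half_dl(values: List[Optional[float]],
                       detection_limit: float) -> List[float]:
    """Replace each nondetect (None) with half the detection limit."""
    return [detection_limit / 2.0 if v is None else v for v in values]


def suggested_approach(values: List[Optional[float]]) -> str:
    """Map the nondetect fraction to the approach suggested in Table 8-1."""
    frac = nondetect_fraction(values)
    if frac <= 0.15:
        return "DL/2 substitution, then parametric procedures (Section 8.1.1)"
    if frac <= 0.50:
        return "nonparametric ANOVA or Cohen's adjustment (Sections 5.2.2, 8.1.3)"
    return "test of proportions (Section 8.1.2)"


# Hypothetical results (made up for illustration): 1 nondetect out of 10.
results = [0.8, 1.1, None, 0.9, 1.3, 1.0, 0.7, 1.2, 0.9, 1.1]
cleaned = substitute_half_dl(results, detection_limit=0.5)
```

The thresholds are guidelines based on judgment rather than fixed rules, so a function like `suggested_approach` is best treated as a first screen and not as a substitute for consulting a statistician.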
8.1.2 Test of Proportions

If more than 50% of the data are below detection but at least 10% of the observations are quantified, a test of proportions may be used to compare the background well data with the compliance well data. Clearly, if none of the background well observations were above the detection limit, but all of the compliance well observations were above the detection limit, one would suspect contamination. In general the difference may not be as obvious. However, a higher proportion of quantitated values in compliance wells could provide evidence of contamination. The test of proportions is a method to determine whether a difference in proportion of detected values in the background well observations and compliance well observations provides statistically significant evidence of contamination.

The test of proportions should be used when the proportion of quantified values is small to moderate (i.e., between 10% and 50%). If very few quantified values are found, a method based on the Poisson distribution may be used as an alternative approach. A method based on a tolerance limit for the number of detected compounds and the maximum concentration found for any detected compound has been proposed by Gibbons (1988). This alternative would be appropriate when the number of detected compounds is quite small relative to the number of compounds analyzed for, as might occur in detection monitoring.

PURPOSE

The test of proportions determines whether the proportion of compounds detected in the compliance well data differs significantly from the proportion of compounds detected in the background well data. If there is a significant difference, this is statistically significant evidence of contamination.

PROCEDURE

The procedure uses the normal distribution approximation to the binomial distribution. This assumes that the sample size is reasonably large. Generally, if the proportion of detected values is denoted by P, and the sample size is n, then the normal approximation is adequate, provided that nP and n(1-P) both are greater than or equal to 5.

Step 1. Check criterion for using the normal approximation.

• Determine X, the number of background well samples in which the compound was detected, and Y, the number of compliance well samples in which the compound was detected.

• Let nb be the total number of background well samples analyzed and nc be the total number of compliance well samples analyzed. Let n = nb + nc.

• Using the overall proportion of detects, P = (X + Y)/n, check that nP and n(1-P) are both greater than or equal to 5.

Step 2. Compute the proportion of detects in the background well samples:

$$P_b = X / n_b$$

Step 3. Compute the proportion of detects in the compliance well samples:

$$P_c = Y / n_c$$

Step 4. Compute the standard error of the difference in proportions:

$$S_D = \left\{ \left[\frac{X+Y}{n_b+n_c}\right] \left[1 - \frac{X+Y}{n_b+n_c}\right] \left[\frac{1}{n_b} + \frac{1}{n_c}\right] \right\}^{1/2}$$

and form the statistic:

$$Z = \frac{P_c - P_b}{S_D}$$

Step 5. Compare the absolute value of Z to the 97.5th percentile from the standard normal distribution, 1.96. If the absolute value of Z exceeds 1.96, this provides statistically significant evidence at the 5% significance level that the proportion of compliance well samples where the compound was detected differs from the proportion of background well samples where the compound was detected. A significantly higher proportion in the compliance wells would be interpreted as evidence of contamination. (The two-sided test is used to provide information about differences in either direction.)

EXAMPLE

Table 8-2 contains data on cadmium concentrations measured in background well and compliance wells at a facility. In the table, "BDL" is used for below detection limit.

Step 1. Check the criterion for using the normal approximation. With X = 8, Y = 24, and n = 24 + 64 = 88, the overall proportion of detects is P = 32/88 = 0.364, so nP = 32 and n(1-P) = 56. Since both of these exceed 5, the normal approximation is justified.

Step 2. Estimate the proportion above detection in the background wells. As shown in Table 8-2, there were 24 samples from background wells analyzed for cadmium, so nb = 24. Of these, 16 were below detection and X = 8 were above detection, so Pb = 8/24 = 0.333.
Step 3. Estimate the proportion above detection in the compliance wells. There were 64 samples from compliance wells analyzed for cadmium, with 40 below detection and 24 detected values. This gives nc = 64, Y = 24, so Pc = 24/64 = 0.375.

Step 4. Calculate the standard error of the difference in proportions.

$$S_D = \left\{ \left[\frac{8+24}{24+64}\right] \left[1 - \frac{8+24}{24+64}\right] \left[\frac{1}{24} + \frac{1}{64}\right] \right\}^{1/2} = 0.115$$

Step 5. Form the statistic Z and compare it to the normal distribution.

$$Z = \frac{0.375 - 0.333}{0.115} = 0.37$$

which is less in absolute value than the value from the normal distribution, 1.96. Consequently, there is no statistically significant evidence that the proportion of samples with cadmium levels above the detection limit differs in the background well and compliance well samples.

TABLE 8-2. EXAMPLE DATA FOR A TEST OF PROPORTIONS

Cadmium concentration (µg/L) at background well (24 samples) and cadmium concentration (µg/L) at compliance wells (64 samples):

0.1 0.12 BDL* 0.26 BDL 0.1 BDL 0.014 BDL BDL BDL BDL BDL 0.12 BDL 0.21 BDL 0.12 BDL BDL BDL 0.12 BDL 0.08 BDL BDL BDL 0.2 BDL 0.1 BDL 0.012 BDL BDL BDL BDL BDL 0.12 0.07 BDL 0.19 BDL 0.1 BDL 0.01 BDL BDL BDL BDL BDL 0.11 0.06 BDL 0.23 BDL 0.11 BDL 0.031 BDL BDL BDL BDL BDL 0.12 0.08 BDL 0.26 BDL 0.02 BDL 0.024 BDL BDL BDL BDL BDL 0.1 0.04 BDL BDL 0.1 BDL 0.01 BDL BDL BDL BDL BDL

*BDL means below detection limit.
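The five steps of the test of proportions can be sketched as a short Python function. This is a minimal illustration, not a reference implementation: the function name and the returned dictionary keys are hypothetical, the statistic is formed as the compliance proportion minus the background proportion (matching the cadmium example), and only the detect counts from the example are used.

```python
import math


def proportions_z_test(x: int, n_b: int, y: int, n_c: int) -> dict:
    """Two-sample test of proportions using the normal approximation.

    x, n_b: number of detects and total samples in the background wells.
    y, n_c: number of detects and total samples in the compliance wells.
    """
    n = n_b + n_c
    p = (x + y) / n                       # overall proportion of detects
    if p <= 0.0 or p >= 1.0:
        raise ValueError("test is undefined when all or no samples are detects")
    normal_ok = n * p >= 5 and n * (1 - p) >= 5   # Step 1 criterion
    p_b = x / n_b                         # Step 2
    p_c = y / n_c                         # Step 3
    s_d = math.sqrt(p * (1 - p) * (1 / n_b + 1 / n_c))   # Step 4
    z = (p_c - p_b) / s_d
    return {
        "normal_ok": normal_ok,
        "p_b": p_b,
        "p_c": p_c,
        "s_d": s_d,
        "z": z,
        "significant": abs(z) > 1.96,     # Step 5, two-sided 5% level
    }


# Cadmium example: 8 of 24 background and 24 of 64 compliance samples detected.
cadmium = proportions_z_test(x=8, n_b=24, y=24, n_c=64)
```

Carried through without intermediate rounding, the statistic is about 0.36; the 0.37 in the worked example comes from rounding the proportions and the standard error to three decimals first. Either way the value is far below 1.96.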
INTERPRETATION

Since the proportion of water samples with detected amounts of cadmium in the compliance wells was not significantly different from that in the background wells, the data are interpreted to provide no evidence of contamination. Had the proportion of samples with detectable levels of cadmium in the compliance wells been significantly higher than that in the background wells, this would have been evidence of contamination. Had the proportion been significantly higher in the background wells, additional study would have been required. This could indicate that contamination was migrating from an off-site source, or it could mean that the hydraulic gradient had been incorrectly estimated or had changed and that contamination was occurring from the facility, but the ground-water flow was not in the direction originally estimated. Mounding of contaminants in the ground water near the background wells could also be a possible explanation of this observation.

8.1.3 Cohen's Method

If a confidence interval or a tolerance interval based upon the normal distribution is being constructed, a technique presented by Cohen (1959) specifies a method to adjust the sample mean and sample standard deviation to account for data below the detection limit. The only requirements for the use of this technique are that the data are normally distributed and that the detection limit is always the same. This technique is demonstrated below.

PURPOSE

Cohen's method provides estimates of the sample mean and standard deviation when some (< 50%) observations are below detection. These estimates can then be used to construct tolerance, confidence, or prediction intervals.

PROCEDURE

Let n be the total number of observations, m represent the number of data points above the detection limit (DL), and Xi represent the value of the ith constituent value above the detection limit.
Step 1. Compute the sample mean from the data above the detection limit:

$$\bar{x}_d = \frac{1}{m} \sum_{i=1}^{m} X_i$$

Step 2. Compute the sample variance from the data above the detection limit:

$$S_d^2 = \frac{1}{m-1} \sum_{i=1}^{m} (X_i - \bar{x}_d)^2$$

Step 3. Compute the two parameters h and γ:

$$h = \frac{n - m}{n} \qquad \gamma = \frac{S_d^2}{(\bar{x}_d - DL)^2}$$

where n is the total number of observations (i.e., above and below the detection limit), and where DL is equal to the detection limit. These values are used to determine the value of the parameter $\hat{\lambda}$ from Table 7.

Step 4. Estimate the corrected sample mean, which accounts for the data below detection limit, as follows:

$$\bar{x} = \bar{x}_d - \hat{\lambda} (\bar{x}_d - DL)$$

Step 5. Estimate the corrected sample standard deviation, which accounts for the data below detection limit, as follows:

$$S = \left[ S_d^2 + \hat{\lambda} (\bar{x}_d - DL)^2 \right]^{1/2}$$

EXAMPLE

Table 8-3 contains data on sulfate concentrations. Three observations of the 24 were below the detection limit of 1,450 mg/L and are denoted by "< 1,450" in the table.

TABLE 8-3. EXAMPLE DATA FOR COHEN'S TEST

Sulfate concentration (mg/L)

  1,850     1,760   < 1,450     1,710     1,575     1,475
  1,780     1,790     1,780   < 1,450     1,790     1,800
< 1,450     1,800     1,840     1,820     1,860     1,780
  1,760     1,800     1,900     1,770     1,790     1,780

DL = 1,450 mg/L

Note: A symbol "<" before a number indicates that the value is not detected. The number following is then the limit of detection.

Step 1. Calculate the mean from the m = 21 values above detection: $\bar{x}_d$ = 1771.9.

Step 2. Calculate the sample variance from the 21 quantified values: $S_d^2$ = 8593.69.

Step 3. Determine

h = (24-21)/24 = 0.125

and

γ = 8593.69/(1771.9-1450)² = 0.083

Table 7 does not contain these values of h and γ exactly, so double linear interpolation is used, giving $\hat{\lambda}$ = 0.14986.

For the interested reader, the details of the double linear interpolation are provided. The values from Table 7 between which the user needs to interpolate are:

  γ       h = 0.10    h = 0.15
 0.05     0.11431     0.17935
 0.10     0.11804     0.18479

There are 0.025 units between 0.10 and 0.125 on the h-scale. There are 0.05 units between 0.10 and 0.15. Therefore, the value of interest (0.125) lies (0.025/0.05 * 100) = 50% of the distance along the interval between 0.10 and 0.15. To linearly interpolate between the tabulated values on the h axis, the range between the values must be calculated, the value that is 50% of the distance along the range must be computed, and then that value must be added to the lower point on the tabulated values. The result is the interpolated value. The interpolated points on the h-scale for the current example are:

0.17935 - 0.11431 = 0.06504     0.06504 * 0.50 = 0.03252      0.11431 + 0.03252 = 0.14683
0.18479 - 0.11804 = 0.06675     0.06675 * 0.50 = 0.033375     0.11804 + 0.033375 = 0.151415

On the γ-axis there are 0.033 units between 0.05 and 0.083. There are 0.05 units between 0.05 and 0.10. The value of interest (0.083) lies (0.033/0.05 * 100) = 66% of the distance along the interval between 0.05 and 0.10. The interpolated point on the γ-axis is:

0.151415 - 0.14683 = 0.004585     0.004585 * 0.66 = 0.0030261     0.14683 + 0.0030261 = 0.14986

Step 4. Estimate the corrected sample mean:

$$\bar{x} = 1771.9 - 0.14986 (1771.9 - 1450) = 1723.66$$

Step 5. Estimate the corrected sample standard deviation:

$$S = \left[ 8593.69 + 0.14986 (1771.9 - 1450)^2 \right]^{1/2} = 155.31$$

Step 6. These modified estimates of the mean, $\bar{x}$ = 1723.66, and of the standard deviation, S = 155.31, would be used in the tolerance or confidence interval procedure. For example, if the sulfate concentrations represent background at a facility, the upper 95% tolerance limit becomes

1723.7 + (155.3)(2.309) = 2082.3 mg/L

Observations from compliance wells in excess of 2,082 mg/L would give statistically significant evidence of contamination.
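The sulfate example can be reproduced with a short Python sketch. It assumes the usual form of Cohen's adjustment (corrected mean = mean of detects minus λ times its distance from DL; corrected variance = variance of detects plus λ times that distance squared), and it carries only the four Table 7 entries quoted in the example, so it works solely for 0.10 ≤ h ≤ 0.15 and 0.05 ≤ γ ≤ 0.10. The names `LAMBDA_TABLE`, `interpolate_lambda`, and `cohen_adjust` are hypothetical.

```python
import math
from statistics import mean, variance

# Four entries of Table 7 quoted in the example, indexed as [gamma][h].
LAMBDA_TABLE = {
    0.05: {0.10: 0.11431, 0.15: 0.17935},
    0.10: {0.10: 0.11804, 0.15: 0.18479},
}


def interpolate_lambda(h: float, gamma: float) -> float:
    """Double linear interpolation of lambda within the quoted table cell."""
    g0, g1, h0, h1 = 0.05, 0.10, 0.10, 0.15
    if not (h0 <= h <= h1 and g0 <= gamma <= g1):
        raise ValueError("h or gamma lies outside the quoted part of Table 7")
    t = (h - h0) / (h1 - h0)          # fractional distance along the h-scale
    u = (gamma - g0) / (g1 - g0)      # fractional distance along the gamma-scale
    low = LAMBDA_TABLE[g0][h0] + t * (LAMBDA_TABLE[g0][h1] - LAMBDA_TABLE[g0][h0])
    high = LAMBDA_TABLE[g1][h0] + t * (LAMBDA_TABLE[g1][h1] - LAMBDA_TABLE[g1][h0])
    return low + u * (high - low)


def cohen_adjust(detected, n_total: int, dl: float):
    """Return (corrected mean, corrected standard deviation, lambda)."""
    m = len(detected)
    xbar_d = mean(detected)            # Step 1: mean of quantified values
    s2_d = variance(detected)          # Step 2: sample variance (m - 1 divisor)
    h = (n_total - m) / n_total        # Step 3
    gamma = s2_d / (xbar_d - dl) ** 2
    lam = interpolate_lambda(h, gamma)
    xbar = xbar_d - lam * (xbar_d - dl)                   # Step 4
    s = math.sqrt(s2_d + lam * (xbar_d - dl) ** 2)        # Step 5
    return xbar, s, lam


# The 21 quantified sulfate values from Table 8-3 (mg/L); 3 of 24 were < 1,450.
SULFATE_DETECTED = [
    1850.0, 1760.0, 1710.0, 1575.0, 1475.0, 1780.0, 1790.0,
    1780.0, 1790.0, 1800.0, 1800.0, 1840.0, 1820.0, 1860.0,
    1780.0, 1760.0, 1800.0, 1900.0, 1770.0, 1790.0, 1780.0,
]
corrected_mean, corrected_sd, lam = cohen_adjust(SULFATE_DETECTED, 24, 1450.0)
```

A complete implementation would need the whole of Table 7 (or a numerical maximum likelihood fit) rather than one table cell; the sketch is limited to what the example itself supplies.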
INTERPRETATION

Cohen's method provides maximum likelihood estimates of the mean and variance of a censored normal distribution. That is, of observations that follow a normal distribution except for those below a limit of detection, which are reported as "not detected." The modified estimates reflect the fact that the not detected observations are below the limit of detection, but not necessarily zero. The large sample properties of the modified estimates allow for them to be used with the normal theory procedures as a means of adjusting for not detected values in the data. Use of Cohen's method in more complicated calculations, such as those required for analysis of variance procedures, requires special consideration from a professional statistician.

8.2 OUTLIERS

A ground-water constituent concentration value that is much different from most other values in a data set for the same ground-water constituent concentration can be referred to as an "outlier." Possible reasons for outliers can be:

• A catastrophic unnatural occurrence such as a spill;

• Inconsistent sampling or analytical chemistry methodology that may result in laboratory contamination or other anomalies;

• Errors in the transcription of data values or decimal points; and

• True but extreme ground-water constituent concentration measurements.

There are several tests to determine if there is statistical evidence that an observation is an outlier. The reference for the test presented here is ASTM paper E178-75.

PURPOSE

The purpose of a test for outliers is to determine whether there is statistical evidence that an observation that appears extreme does not fit the distribution of the rest of the data. If a suspect observation is identified as an outlier, then steps need to be taken to determine whether it is the result of an error or a valid extreme observation.

PROCEDURE

Let the sample of observations of a hazardous constituent of ground water be denoted by X1, ..., Xn. For specificity, assume that the data have been ordered and that the largest observation, denoted by Xn, is suspected of being an outlier. Generally, inspection of the data suggests values that do not appear to belong to the data set. For example, if the largest observation is an order of magnitude larger than the other observations, it would be suspect.

Step 1. Calculate the mean, $\bar{X}$, and the standard deviation, S, of the data including all observations.

Step 2. Form the statistic, Tn:

$$T_n = \frac{X_n - \bar{X}}{S}$$

Note that Tn is the difference between the largest observation and the sample mean, divided by the sample standard deviation.

Step 3. Compare the statistic Tn to the critical value given the sample size, n, in Table 8 in Appendix B. If the Tn statistic exceeds the critical value from the table, this is evidence that the suspect observation, Xn, is a statistical outlier.

Step 4. If the value is identified as an outlier, one of the actions outlined below should be taken. (The appropriate action depends on what can be learned about the observation.) The records of the sampling and analysis of the sample that led to it should be investigated to determine whether the outlier resulted from an error that can be identified.

• If an error (in transcription, dilution, analytical procedure, etc.) can be identified and the correct value recovered, the observation should be replaced by its corrected value and the appropriate statistical analysis done with the corrected value.

• If it can be determined that the observation is in error, but the correct value cannot be determined, then the observation should be deleted from the data set and the appropriate statistical analysis performed. The fact that the observation was deleted and the reason for its deletion should be reported when reporting the results of the statistical analysis.
• If no error in the value can be documented, then it must be assumed that the observation is a true but extreme value. In this case it must not be altered. It may be desirable to obtain another sample to confirm the observation. However, analysis and reporting should retain the observation and state that no error was found in tracing the sample that led to the extreme observation.

EXAMPLE

Table 8-4 contains 19 values of total organic carbon (TOC) that were obtained from a monitoring well. Inspection shows one value which, at 11,000 mg/L, is nearly an order of magnitude larger than most of the other observations. It is a suspected outlier.

TABLE 8-4. EXAMPLE DATA FOR TESTING FOR AN OUTLIER

Total organic carbon (mg/L)

 1,700    1,900    1,500    1,300   11,000
 1,250    1,000    1,300    1,200    1,450
 1,000    1,300    1,000    2,200    4,900
 3,700    1,600    2,500    1,900

Step 1. Calculate the mean and the standard deviation of the data, including all observations: $\bar{X}$ = 2300 and S = 2325.9.

Step 2. Calculate the statistic T19.

T19 = (11000-2300)/2325.9 = 3.74

Step 3. Referring to Table 8 of Appendix B for the upper 5% significance level, with n = 19, the critical value is 2.532. Since the value of the statistic T19 = 3.74 is greater than 2.532, there is statistical evidence that the largest observation is an outlier.

Step 4. In this case, tracking the data revealed that the unusual value of 11,000 resulted from a keying error and that the correct value was 1,100. This correction was then made in the data.

INTERPRETATION

An observation that is 4 or 5 times as large as the rest of the data is generally viewed with suspicion. An observation that is an order of magnitude different could arise by a common error of misplacing a decimal. The test for an outlier provides a statistical basis for determining whether an observation is statistically different from the rest of the data. If it is, then it is a statistical outlier. However, a statistical outlier may not be dropped or altered just because it has been identified as an outlier.
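The TOC calculation can be checked with a few lines of Python. This sketch assumes the largest value is the suspect observation and takes the critical value 2.532 (n = 19, upper 5% level) from the example itself, since Table 8 is not reproduced here; the function name is hypothetical.

```python
from statistics import mean, stdev

# Total organic carbon values from Table 8-4 (mg/L).
TOC = [
    1700, 1900, 1500, 1300, 11000, 1250, 1000, 1300, 1200, 1450,
    1000, 1300, 1000, 2200, 4900, 3700, 1600, 2500, 1900,
]


def outlier_statistic(data) -> float:
    """T_n = (largest observation - sample mean) / sample standard deviation."""
    xbar = mean(data)      # mean of all observations, suspect value included
    s = stdev(data)        # sample standard deviation (n - 1 divisor)
    return (max(data) - xbar) / s


t19 = outlier_statistic(TOC)
CRITICAL_VALUE_N19 = 2.532          # quoted in Step 3 of the example
is_statistical_outlier = t19 > CRITICAL_VALUE_N19
```

The flag only says that the value is statistically extreme. Whether it may be corrected or deleted still depends on documenting an error, as in Step 4.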
The test provides formal identification of an observation as an outlier, but does not identify the cause of the difference.

Whether or not a statistical test is done, any suspect data point should be checked. An observation may be corrected or dropped only if it can be determined that an error has occurred. If the error can be identified and corrected (as in transcription or keying), the correction should be made and the corrected values used. A value that is demonstrated to be incorrect may be deleted from the data. However, if no specific error can be documented, the observation must be retained in the data. Identification of an observation as an outlier but with no error documented could be used to suggest resampling to confirm the value.

APPENDIX A

GENERAL STATISTICAL CONSIDERATIONS AND GLOSSARY OF STATISTICAL TERMS

GENERAL STATISTICAL CONSIDERATIONS

FALSE ALARMS OR TYPE I ERRORS

The statistical analysis of data from ground-water monitoring at RCRA sites has as its goal the determination of whether the data provide evidence of the presence of, or an increase in the level of, contamination. In the case of detection monitoring, the goal of the statistical analysis is to determine whether statistically significant evidence of contamination exists. In the case of compliance monitoring, the goal is to determine whether statistically significant evidence of concentration levels exceeding compliance limits exists. In monitoring sites in corrective action, the goal is to determine whether levels of the hazardous constituents are still above compliance limits or have been reduced to, at, or below the compliance limit.
These questions are addressed by the use of hypothesis tests. In the case of detection monitoring, it is hypothesized that a site is not contaminated; that is, the hazardous constituents are not present in the ground water. Samples of the ground water are taken and analyzed for the constituents in question. A hypothesis test is used to decide whether the data indicate the presence of the hazardous constituent. The test consists of calculating one or more statistics from the data and comparing the calculated results to some prespecified critical levels.

In performing a statistical test, there are four possible outcomes. Two of the possible outcomes result in the correct decision: (a) the test may correctly indicate that no contamination is present or (b) the test may correctly indicate the presence of contamination. The other two possibilities are errors: (c) the test may indicate that contamination is present when in fact it is not or (d) the test may fail to detect contamination when it is present.

If the stated hypothesis is that no contamination is present (usually called the null hypothesis) and the test indicates that contamination is present when in fact it is not, this is called a Type I error. Statistical hypothesis tests are generally set up to control the probability of Type I error to be no more than a specified value, called the significance level, and usually denoted by α. Thus in detection monitoring, the null hypothesis would be that the level of each hazardous constituent is zero (or at least below detection). The test would reject this hypothesis if some measure of concentration were too large, indicating contamination. A Type I error would be a false alarm or a triggering event that is inappropriate.

In compliance monitoring, the null hypothesis is that the level of each hazardous constituent is less than or equal to the appropriate compliance limit.
For the purpose of setting up the statistical procedure, the simple null hypothesis that the level is equal to the compliance limit would be used. As in detection monitoring, the test would indicate contamination if some measure of concentration is too large. A false alarm or Type I error would occur if the statistical procedure indicated that levels exceed the appropriate compliance limits when, in fact, they do not. Such an error would be a false alarm in that it would indicate falsely that compliance limits were being exceeded.

PROBABILITY OF DETECTION AND TYPE II ERROR

The other type of error that can occur is called a Type II error. It occurs if the test fails to detect contamination that is present. Thus a Type II error is a missed detection. While the probability of a Type I error can be specified, since it is the probability that the test will give a false alarm, the probability of a Type II error depends on several factors, including the statistical test, the sample size, and the significance level or probability of Type I error. In addition, it depends on the degree of contamination present. In general, the probability of a Type II error decreases as the level of contamination increases. Thus a test may be likely to miss low levels of contamination, less likely to miss moderate contamination, and very unlikely to miss high levels of contamination.

One can discuss the probability of a Type II error as the probability of a missed detection, or one can discuss the complement (one minus the probability of Type II error) of this probability. The complement, or probability of detection, is also called the power of the test. It depends on the magnitude of the contamination so that the power or probability of detecting contamination increases with the degree of contamination.
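The relationship between significance level, degree of contamination, sample size, and power can be illustrated with a deliberately simple case. The sketch below is an assumption made for illustration, not a procedure from this guidance: a one-sided z-test on a mean with independent, normally distributed observations and known standard deviation `sigma`, where `delta` is the true increase above the null hypothesis value. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def power_one_sided_z(delta: float, sigma: float, n: int,
                      alpha: float = 0.05) -> float:
    """Probability of detection for a one-sided z-test on a mean.

    delta: true increase in mean concentration over the null value.
    sigma: known standard deviation of a single observation.
    n:     number of independent samples.
    alpha: significance level (probability of a Type I error).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)      # rejection threshold
    return 1.0 - std_normal.cdf(z_crit - delta * math.sqrt(n) / sigma)


# Power evaluated at several alternatives (a power function), with n = 8.
power_curve = [(d, power_one_sided_z(d, sigma=1.0, n=8))
               for d in (0.0, 0.5, 1.0, 2.0)]
```

At `delta = 0` the function returns `alpha`, the false alarm rate; the power rises toward 1 as `delta` or `n` grows. Evaluating it at several values of `delta`, as in `power_curve`, gives a power function rather than the power against a single alternative.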
If the probability of a Type I error is specified, then for a given statistical test, the power depends on the sample size and the alternative of interest. In order to specify a desired power or probability of detection, one must specify the alternative that should be detected. Since generally the power will increase as the alternative differs more and more from the null hypothesis, one usually tries to specify the alternative that is closest to the null hypothesis, yet enough different that it is important to detect.

In the detection monitoring situation, the null hypothesis is that the concentration of the hazardous constituent is zero (or at least below detection). In this case the alternative of interest is that there is a concentration of the hazardous constituent that is above the detection limit and is large enough so that the monitoring procedure should detect it. Since it is a very difficult problem to select a concentration of each hazardous constituent that should be detectable with specified power, a more useful approach is to determine the power of a test at several alternatives and decide whether the procedure is acceptable on the basis of this power function rather than on the power against a single alternative.

In order to increase the power, a larger sample must be taken. This would mean sampling at more frequent intervals. There is a limit to how much can be achieved, however. In cases with limited water flow, it may not be possible to sample wells as frequently as desired. If samples close together in time prove to be correlated, this correlation reduces the information available from the different samples. The additional cost of sampling and analysis will also impose practical limitations on the sample size that can be used.

Additional wells could also be used to increase the performance of the test. The additional monitoring wells would primarily be helpful in ensuring that a plume would not escape detection by missing the monitoring wells.
However, in some situations the additional wells would contribute to a larger sample size and so improve the power.

In compliance monitoring the emphasis is on determining whether additional contamination has occurred, raising the concentration above a compliance limit. If the compliance limit is determined from the background well levels, the null hypothesis is that the difference between the background and compliance well concentrations is zero. The alternative of interest is that the compliance well concentration exceeds the background concentration. This situation is essentially the same for power considerations as that of the detection monitoring situation.

If compliance monitoring is relative to a compliance limit (MCL or ACL), specified as a constant, then the situation is different. Here the null hypothesis is that the concentration is less than or equal to the compliance limit, with equality used to establish the test. The alternative is that the concentration is above the compliance limit. In order to specify power, a minimum amount above the compliance limit must be established and power specified for that alternative, or the power function evaluated for several possible alternatives.
SAMPLE DESIGNS AND ASSUMPTIONS

As discussed in Section 2, the sample design to be employed at a regulated unit will primarily depend on the hydrogeologic evaluation of the site. Wells should be sited to provide multiple background wells hydraulically upgradient from the regulated unit. The background wells allow for determination of natural spatial variability in ground-water quality. They also allow for estimation of background levels with greater precision than would be possible from a single upgradient well. Compliance wells should be sited hydraulically downgradient to each regulated unit. The location and spacing of the wells, as well as the depth of sampling, would be determined from the hydrogeology to ensure that at least one of the wells should intercept a plume of contamination of reasonable size.

Thus the assumed sample design is for a sample of wells to include a number of background wells for the site, together with a number of compliance wells for each regulated unit at the site. In the event that a site has only a single regulated unit, there would be two groups of wells, background and compliance. If a site has multiple regulated units, there would be a set of compliance wells for each regulated unit, allowing for detection monitoring or compliance monitoring separately at each regulated unit.

Data from the analysis of the water at each well are initially assumed to follow a normal distribution. This is likely to be the case for detection monitoring of analytes in that levels should be near zero and errors would likely represent instrument or other sampling and analysis variability. If contamination is present, then the distribution of the data may be skewed to the right, giving a few very large values. The assumption of normality of errors in the detection monitoring case is quite reasonable, with deviations from normality likely indicating some degree of contamination.
Tests of normality are recommended to ensure that the data are adequately represented by the normal distribution.

In the compliance monitoring case, the data for each analyte will again initially be assumed to follow the normal distribution. In this case, however, since there is a nonzero concentration of the analyte in the ground water, normality is more of an issue. Tests of normality are recommended. If evidence of nonnormality is found, the data should be transformed or a distribution-free test be used to determine whether statistically significant evidence of contamination exists.

The standard situation would result in multiple samples (taken at different times) of water from each well. The wells would form groups of background wells and compliance wells for each regulated unit. The statistical procedures recommended would allow for testing each compliance well group against the background group. Further, tests among the compliance wells within a group are recommended to determine whether a single well might be intercepting an isolated plume. The specific procedures discussed and recommended in the preceding sections should cover the majority of cases. They did not cover all of the possibilities. In the event that none of the procedures described and illustrated appears to apply to a particular case at a given regulated site, consultation with a statistician should be sought to determine an appropriate statistical procedure.

The following approach is recommended. If a regulated unit is in detection monitoring, it will remain in detection monitoring until or unless there is statistically significant evidence of contamination, in which case it would be placed in compliance monitoring. Likewise, if a regulated unit is in compliance monitoring, it will remain in compliance monitoring unless or until there is statistically significant evidence of further contamination, in which case it would move into corrective action.
In monitoring a regulated unit with multiple compliance wells, two types of significance levels are considered. One is an experimentwise significance level and the other is a comparisonwise significance level. When a procedure such as analysis of variance is used that considers several compliance wells simultaneously, the significance is an experimentwise significance. If individual well comparisons are made, each of those comparisons is done at a comparisonwise significance level.

The fact that many comparisons will be made at a regulated unit with multiple compliance wells can make the probability that at least one of the comparisons will be incorrectly significant too high. To control the false positive rate, multiple comparisons procedures are allowed that control the experimentwise significance level to be 5%. That is, the probability that one or more of the comparisons will falsely indicate contamination is controlled at 5%. However, to provide some assurance of adequate power to detect real contamination, the comparisonwise significance level for comparing each individual well to the background is required to be no less than 1%.

Control of the experimentwise significance level via multiple comparisons procedures is allowed for comparisons among several wells. However, use of an experimentwise significance level for the comparisons among the different hazardous constituents is not permitted. Each hazardous constituent to be monitored for in the permit must be treated separately.

GLOSSARY OF STATISTICAL TERMS
(underlined terms are explained subsequently)

Alpha (α)
    A Greek letter used to denote the significance level or probability of a Type I error.

Alpha-error
    Sometimes used for Type I error.

Alternative hypothesis
    An alternative hypothesis specifies that the underlying distribution differs from the null hypothesis. The alternative hypothesis usually specifies the value of a parameter, for example the mean concentration, that one is trying to detect.

Arithmetic average
    The arithmetic average of a set of observations is their sum divided by the number of observations.

Autocorrelation
    This is a measure of dependence among sequential observations from the same well. There are different orders of autocorrelation, depending on how far apart in time the correlation persists. For example, the first order autocorrelation is the correlation between successive pairs of observations.

Biased estimator
    A biased estimator is an estimator that has an expectation or average value that is not equal to the parameter it is estimating. Often the bias decreases as the sample size increases.

Bonferroni t
    This is an approach, developed by Bonferroni, to control the experimentwise error rate in multiple comparisons. The number of comparisons or hypotheses to be tested is fixed (at k) and a "t" statistic is computed to test each of these. Instead of the usual "t" table, where each of these tests would be done at the significance level alpha, a special table is used so that each test is done at level alpha/k. This ensures that the experimentwise error rate is no more than alpha.

Comparisonwise error rate
    This term is used in association with multiple comparisons. It refers to the probability of an error occurring on a single comparison of several that might be done. It is computed assuming that the single comparison or hypothesis test is the only one being done.

Composite hypothesis
    This is a hypothesis for which not all relevant parameters are specified. A composite hypothesis is made up of two or more simple hypotheses. For example, the hypothesis that the data are normally distributed with unspecified mean and variance is a composite hypothesis.
Confidence coefficient
    The confidence coefficient of a confidence interval for a parameter is the probability that the random interval constructed from the sample data contains the true value of the parameter. The confidence coefficient is related to the significance level of an associated hypothesis test by the fact that the significance level (in percent) is one hundred minus the confidence coefficient (in percent).

Confidence interval
    A confidence interval for a parameter is a random interval constructed from sample data in such a way that the probability that the interval will contain the true value of the parameter is a specified value.

Cumulative distribution function
    The distribution function for a random variable, X, is a function that specifies the probability that X is less than or equal to t, for all real values of t.

Distribution-free
    This is sometimes used as a synonym for nonparametric. A statistic is distribution-free if its distribution does not depend upon which specific distribution function (in a large class) the observations follow.

Distribution function
    This document uses "Cumulative Distribution Function" and "Distribution Function" interchangeably. See Cumulative Distribution Function.

Estimator
    An estimator is a statistic computed from the observed data. It is used to estimate a parameter of interest; for example, the population mean. Often estimators are the sample equivalents of the population parameters.

Experimentwise error rate
    This term refers to multiple comparisons. If a total of n decisions are made about comparisons (for example, of compliance wells to background wells) and x of the decisions are wrong, then the experimentwise error rate is x/n. The probability that x exceeds zero is the experimentwise significance.

Hypothesis
    This is a formal statement about a parameter of interest and the distribution of a statistic. It is usually used as a null hypothesis or an alternative hypothesis. For example, the null hypothesis might specify that ground water had a zero concentration of benzene and that analytical errors followed a normal distribution with mean zero and standard deviation 1 ppm.

Independence
    A set of events are independent if the probability of the joint occurrence of any subset of the events factors into the product of the probabilities of the events. A set of observations is independent if the joint distribution function of the random errors associated with the observations factors into the product of the distribution functions.

Mean
    Arithmetic average.

Median
    This is the middle value of a sample when the observations have been ordered from least to greatest. If the number of observations is odd, it is the middle observation. If the number of observations is even, it is customary to take the midpoint between the two middle observations. For a distribution, the median is a value such that the probability is one-half that an observation will fall above or below the median.

Multiple comparison procedure
    This is a statistical procedure that makes a large number of decisions or comparisons on one set of data. For example, at a sampling period, several compliance well concentrations may be compared to the background well concentration.

Nonparametric statistical procedure
    A nonparametric statistical procedure is a statistical procedure that has desirable properties that hold under mild assumptions regarding the data. Typically the procedure is valid for a large class of distributions rather than for a specific distribution of the data such as the normal.

Normal population, normality
    The errors associated with the observations follow the normal or Gaussian distribution function.

Null hypothesis
    A null hypothesis specifies the underlying distribution of the data completely. Often the null hypothesis specifies that there is no difference between the mean concentration in background well water samples and compliance well water samples. Typically, the null hypothesis is a simple hypothesis.

One-sided test
    A one-sided test is appropriate if concentrations higher than those specified by the null hypothesis are of concern. A one-sided test only rejects for differences that are large and in a prespecified direction.

One-sided tolerance limit
    This is an upper limit on observations from a specified distribution.

One-sided confidence limit
    This is an upper limit on a parameter of a distribution.

Order statistics
    The sample values observed after they have been arranged in increasing order.

Outlier
    An outlier is an observation that is found to lie an unusually long way from the rest of the observations in a series of replicate observations.

Parameter
    A parameter is an unknown constant associated with a population. For example, the mean concentration of a hazardous constituent in ground water is a parameter of interest.

Percentile
    A percentile of a distribution is a value below which a specified proportion or percent of the observations from that distribution will fall.

Post hoc comparison
    This is a comparison, say between hazardous constituent concentrations in two wells, that was found to be of interest after the data were collected. Special methods must be used to determine significance levels for post hoc comparisons.

Power
    The power of a test is the probability that the test will reject under a specified alternative hypothesis. This is one minus the probability of a Type II error. The power is a measure of the test's ability to detect a difference of specified size from the null hypothesis.

Sample standard deviation
    This is the square root of the sample variance.
Sample variance
    This is a statistic (computed on a sample of observations rather than on the whole population) that measures the variability or spread of the observations about the sample mean. It is the sum of the squared differences from the sample mean, divided by the number of observations less one.

Serial correlation
    This is the correlation of observations spaced a constant interval apart in a series. For example, the first order serial correlation is the correlation between adjacent observations. The first order serial correlation is found by correlating the pairs consisting of the first and second, second and third, third and fourth, etc., observations.

Significance level
    Sometimes referred to as the alpha level, the significance level of a test is the probability of falsely rejecting a true null hypothesis. The probability of a Type I error.

Simple hypothesis
    A hypothesis which completely specifies the distribution of the observed random variables. To completely define a distribution, both the type of distribution and numeric values for the parameters must be given.

Test statistic
    A test statistic is a value computed from the observed data. This value is used to test a hypothesis by relating the value to a distribution table and rejecting the hypothesis if the computed value falls in a region that has low probability under the hypothesis being tested. A "t" statistic, an "F" statistic, and a chi-squared statistic are examples.

Trend analysis
    This refers to a collection of statistical methods that analyze data to determine trends over time. The trends may be of various types, steady increases (or decreases), or a step increase at a point in time.

Type I error
    A Type I error occurs when a true null hypothesis is rejected erroneously. In the monitoring context a Type I error occurs when a test incorrectly indicates contamination or an increase in contamination at a regulated unit.

Type II error
    A Type II error occurs when one fails to reject a null hypothesis that is false. In the monitoring context, a Type II error occurs when monitoring fails to detect contamination or an increase in a concentration of a hazardous constituent.

Unbiased estimator
    An unbiased estimator is an estimator that has zero bias. That is, its expectation is equal to the parameter it is estimating. Its average value is the parameter.

APPENDIX B

STATISTICAL TABLES

TABLE 1. PERCENTILES OF THE χ² DISTRIBUTION WITH ν DEGREES OF FREEDOM, χ²(ν, p)

  ν \ p    0.750    0.900    0.950    0.975    0.990    0.995    0.999
    1      1.323    2.706    3.841    5.024    6.635    7.879    10.83
    2      2.773    4.605    5.991    7.378    9.210    10.60    13.82
    3      4.108    6.251    7.815    9.348    11.34    12.84    16.27
    4      5.385    7.779    9.488    11.14    13.28    14.86    18.47
    5      6.626    9.236    11.07    12.83    15.09    16.75    20.52
    6      7.841    10.64    12.59    14.45    16.81    18.55    22.46
    7      9.037    12.02    14.07    16.01    18.48    20.28    24.32
    8      10.22    13.36    15.51    17.53    20.09    21.96    26.12
    9      11.39    14.68    16.92    19.02    21.67    23.59    27.88
   10      12.55    15.99    18.31    20.48    23.21    25.19    29.59
   11      13.70    17.28    19.68    21.92    24.72    26.76    31.26
   12      14.85    18.55    21.03    23.34    26.22    28.30    32.91
   13      15.98    19.81    22.36    24.74    27.69    29.82    34.53
   14      17.12    21.06    23.68    26.12    29.14    31.32    36.12
   15      18.25    22.31    25.00    27.49    30.58    32.80    37.70
   16      19.37    23.54    26.30    28.85    32.00    34.27    39.25
   17      20.49    24.77    27.59    30.19    33.41    35.72    40.79
   18      21.60    25.99    28.87    31.53    34.81    37.16    42.31
   19      22.72    27.20    30.14    32.85    36.19    38.58    43.82
   20      23.83    28.41    31.41    34.17    37.57    40.00    45.32
   21      24.93    29.62    32.67    35.48    38.93    41.40    46.80
   22      26.04    30.81    33.92    36.78    40.29    42.80    48.27
   23      27.14    32.01    35.17    38.08    41.64    44.18    49.73
   24      28.24    33.20    36.42    39.36    42.98    45.56    51.18
   25      29.34    34.38    37.65    40.65    44.31    46.93    52.62
   26      30.43    35.56    38.89    41.92    45.64    48.29    54.05
   27      31.53    36.74    40.11    43.19    46.96    49.64    55.48
   28      32.62    37.92    41.34    44.46    48.28    50.99    56.89
   29      33.71    39.09    42.56    45.72    49.59    52.34    58.30
   30      34.80    40.26    43.77    46.98    50.89    53.67    59.70
   40      45.62    51.80    55.76    59.34    63.69    66.77    73.40
   50      56.33    63.17    67.50    71.42    76.15    79.49    86.66
   60      66.98    74.40    79.08    83.30    88.38    91.95    99.61
   70      77.58    85.53    90.53    95.02    100.4    104.2    112.3
   80      88.13    96.58    101.9    106.6    112.3    116.3    124.8
   90      98.65    107.6    113.1    118.1    124.1    128.3    137.2
  100      109.1    118.5    124.3    129.6    135.8    140.2    149.4

SOURCE: Johnson, Norman L. and F. C. Leone. 1977. Statistics and Experimental Design in Engineering and the Physical Sciences. Vol. I, Second Edition. John Wiley and Sons, New York.

NOTE: ν1: Degrees of freedom for numerator; ν2: Degrees of freedom for denominator.

SOURCE: Johnson, Norman L. and F. C. Leone. 1977. Statistics and Experimental Design in Engineering and the Physical Sciences. Vol. I, Second Edition. John Wiley and Sons, New York.

2.13 2.78 2.02 2.57 1.94 2.45 1.90 2.37 1.86 2.31 1.83 2.26 1.01 2.23 1.75 2.13 1.73 2.09 1.70 2.04 1.65 1.96
3.20 2.90 2.74 2.63 2.55 2.50 2.45 2.32 2.27 2.21 2.13
3.51 3.75 3.17 3.37 2.97 3.14 2.83 3.00 2.74 2.90 2.67 2.82 2.61 2.76 2.47 2.60 2.40 2.53 2.34 2.46 2.24 2.33

TABLE 4. PERCENTILES OF THE STANDARD NORMAL DISTRIBUTION, U_P

  P      0.000   0.001   0.002   0.003   0.004   0.005   0.006   0.007   0.008   0.009
 0.50   0.0000  0.0025  0.0050  0.0075  0.0100  0.0125  0.0150  0.0175  0.0201  0.0226
 0.51   0.0251  0.0276  0.0301  0.0326  0.0351  0.0376  0.0401  0.0426  0.0451  0.0476
 0.52   0.0502  0.0527  0.0552  0.0577  0.0602  0.0627  0.0652  0.0677  0.0702  0.0728
 0.53   0.0753  0.0778  0.0803  0.0828  0.0853  0.0878  0.0904  0.0929  0.0954  0.0979
 0.54   0.1004  0.1030  0.1055  0.1080  0.1105  0.1130  0.1156  0.1181  0.1206  0.1231
 0.55   0.1257  0.1282  0.1307  0.1332  0.1358  0.1383  0.1408  0.1434  0.1459  0.1484
 0.56   0.1510  0.1535  0.1560  0.1586  0.1611  0.1637  0.1662  0.1687  0.1713  0.1738
 0.57   0.1764  0.1789  0.1815  0.1840  0.1866  0.1891  0.1917  0.1942  0.1968  0.1993
 0.58   0.2019  0.2045  0.2070  0.2096  0.2121  0.2147  0.2173  0.2198  0.2224  0.2250
 0.59   0.2275  0.2301  0.2327  0.2353  0.2378  0.2404  0.2430  0.2456  0.2482  0.2508
 0.60   0.2533  0.2559  0.2585  0.2611  0.2637  0.2663  0.2689  0.2715  0.2741  0.2767
 0.61   0.2793  0.2819  0.2845  0.2871  0.2898  0.2924  0.2950  0.2976  0.3002  0.3029
 0.62   0.3055  0.3081  0.3107  0.3134  0.3160  0.3186  0.3213  0.3239  0.3266  0.3292
 0.63   0.3319  0.3345  0.3372  0.3398  0.3425  0.3451  0.3478  0.3505  0.3531  0.3558
 0.64   0.3585  0.3611  0.3638  0.3665  0.3692  0.3719  0.3745  0.3772  0.3799  0.3826
 0.65   0.3853  0.3880  0.3907  0.3934  0.3961  0.3989  0.4016  0.4043  0.4070  0.4097
 0.66   0.4125  0.4152  0.4179  0.4207  0.4234  0.4261  0.4289  0.4316  0.4344  0.4372
 0.67   0.4399  0.4427  0.4454  0.4482  0.4510  0.4538  0.4565  0.4593  0.4621  0.4649
 0.68   0.4677  0.4705  0.4733  0.4761  0.4789  0.4817  0.4845  0.4874  0.4902  0.4930
 0.69   0.4959  0.4987  0.5015  0.5044  0.5072  0.5101  0.5129  0.5158  0.5187  0.5215
 0.70   0.5244  0.5273  0.5302  0.5330  0.5359  0.5388  0.5417  0.5446  0.5476  0.5505
 0.71   0.5534  0.5563  0.5592  0.5622  0.5651  0.5681  0.5710  0.5740  0.5769  0.5799
 0.72   0.5828  0.5858  0.5888  0.5918  0.5948  0.5978  0.6008  0.6038  0.6068  0.6098
 0.73   0.6128  0.6158  0.6189  0.6219  0.6250  0.6280  0.6311  0.6341  0.6372  0.6403
 0.74   0.6433  0.6464  0.6495  0.6526  0.6557  0.6588  0.6620  0.6651  0.6682  0.6713

NOTE: For values of P below 0.5, obtain the value of U(1-P) from Table 4 and change its sign. For example, U(0.45) = -U(1 - 0.45) = -U(0.55) = -0.1257.
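The sign-change rule in the note can be checked numerically. The sketch below is a minimal illustration, not part of the guidance: it uses `statistics.NormalDist.inv_cdf` from the Python standard library in place of a table lookup, and the function name `normal_percentile` is hypothetical.

```python
from statistics import NormalDist


def normal_percentile(p: float) -> float:
    """Return U_p, the p-th percentile of the standard normal distribution.

    For p below 0.5 the value is obtained as in the table note:
    evaluate U_(1-p) and change its sign.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if p < 0.5:
        return -normal_percentile(1.0 - p)
    # inv_cdf stands in for reading the tabulated value
    return NormalDist().inv_cdf(p)


if __name__ == "__main__":
    print(round(normal_percentile(0.55), 4))  # 0.1257
    print(round(normal_percentile(0.45), 4))  # -0.1257
```

Rounded to four decimals, the two results agree with the worked example in the note.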
TABLE 4 (Continued)

  P      0.000   0.001   0.002   0.003   0.004   0.005   0.006   0.007   0.008   0.009
 0.75   0.6745  0.6776  0.6808  0.6840  0.6871  0.6903  0.6935  0.6967  0.6999  0.7031
 0.76   0.7063  0.7095  0.7128  0.7160  0.7192  0.7225  0.7257  0.7290  0.7323  0.7356
 0.77   0.7388  0.7421  0.7454  0.7488  0.7521  0.7554  0.7588  0.7621  0.7655  0.7688
 0.78   0.7722  0.7756  0.7790  0.7824  0.7858  0.7892  0.7926  0.7961  0.7995  0.8030
 0.79   0.8064  0.8099  0.8134  0.8169  0.8204  0.8239  0.8274  0.8310  0.8345  0.8381
 0.80   0.8416  0.8452  0.8488  0.8524  0.8560  0.8596  0.8633  0.8669  0.8705  0.8742
 0.81   0.8779  0.8816  0.8853  0.8890  0.8927  0.8965  0.9002  0.9040  0.9078  0.9116
 0.82   0.9154  0.9192  0.9230  0.9269  0.9307  0.9346  0.9385  0.9424  0.9463  0.9502
 0.83   0.9542  0.9581  0.9621  0.9661  0.9701  0.9741  0.9782          0.9863  0.9904
 0.84   0.9945  0.9986  1.0027  1.0069  1.0110  1.0152  1.0194          1.0279  1.0322
 0.85   1.0364  1.0407  1.0450  1.0494  1.0537  1.0581  1.0625          1.0714  1.0758
 0.86   1.0803  1.0848  1.0893  1.0939  1.0985  1.1031  1.1077          1.1170  1.1217
 0.87   1.1264  1.1311  1.1359  1.1407  1.1455  1.1503  1.1552          1.1650  1.1700
 0.88   1.1750  1.1800  1.1850  1.1901  1.1952  1.2004  1.2055          1.2160  1.2212
 0.89   1.2265  1.2319  1.2372  1.2426  1.2481  1.2536  1.2591          1.2702  1.2759
 0.90   1.2816  1.2873  1.2930  1.2988  1.3047  1.3106  1.3165  1.3225  1.3285  1.3346
 0.91   1.3408  1.3469  1.3532  1.3595  1.3658  1.3722  1.3787  1.3852  1.3917  1.3984
 0.92   1.4051  1.4118  1.4187  1.4255  1.4325  1.4395  1.4466  1.4538  1.4611  1.4684
 0.93   1.4758  1.4833  1.4909  1.4985  1.5063  1.5141  1.5220  1.5301  1.5382  1.5464
 0.94   1.5548  1.5632  1.5718  1.5805  1.5893  1.5982  1.6072  1.6164  1.6258  1.6352
 0.95   1.6449  1.6546  1.6646  1.6747  1.6849  1.6954  1.7060  1.7169  1.7279  1.7392
 0.96   1.7507  1.7624  1.7744  1.7866  1.7991  1.8119  1.8250  1.8384  1.8522  1.8663
 0.97   1.8808  1.8957  1.9110  1.9268  1.9431  1.9600  1.9774  1.9954  2.0141  2.0335
 0.98   2.0537  2.0749  2.0969  2.1201  2.1444  2.1701  2.1973  2.2262  2.2571  2.2904
 0.99   2.3263  2.3656  2.4089  2.4573  2.5121  2.5758  2.6521  2.7478  2.8782  3.0902

SOURCE: Johnson, Norman L. and F. C. Leone. 1977. Statistics and Experimental Design in Engineering and the Physical Sciences. Vol.
I, Second Edition. John Wiley and Sons, New York.

TABLE 5. TOLERANCE FACTORS (K) FOR ONE-SIDED NORMAL TOLERANCE INTERVALS WITH PROBABILITY LEVEL (CONFIDENCE FACTOR) γ = 0.95 AND COVERAGE P = 95%

TABLE 6. PERCENTILES OF STUDENT'S t-DISTRIBUTION (F = 1 - α; n = degrees of freedom)

SOURCE: CRC Handbook of Tables for Probability and Statistics. 1966. W. H. Beyer, Editor. Published by the Chemical Rubber Company, Cleveland, Ohio.

ADJUSTING FOR NONDETECTED VALUES

SOURCE: Cohen, A. C., Jr. 1961. "Tables for Maximum Likelihood Estimates: Singly Truncated and Singly Censored Samples." Technometrics.

TABLE 8. CRITICAL VALUES FOR T_n (ONE-SIDED TEST) WHEN THE STANDARD DEVIATION IS CALCULATED FROM THE SAME SAMPLE

Columns: number of observations, n; upper 0.1%, 0.5%, 1%, 2.5%, 5%, and 10% significance levels.
[Tabulated values are not legible in this copy.]

SOURCE: ASTM Designation E178-75, 1975. "Standard Recommended Practice for Dealing with Outlying Observations."

APPENDIX C

GENERAL BIBLIOGRAPHY

The following list provides the reader with those references directly mentioned in the text. It also includes, for those readers desiring further information, references to literature dealing with selected subject matters in a broader sense. This list is in alphabetical order.

ASTM Designation: E178-75. 1975. "Standard Recommended Practice for Dealing with Outlying Observations."

ASTM Manual on Presentation of Data and Control Chart Analysis. 1976. ASTM Special Technical Publication 15D.

Barari, A., and L. S.
Hedges. 1985. "Movement of Water in Glacial Till." Proceedings of the 17th International Congress of the International Association of Hydrogeologists. pp. 129-134.

Barcelona, M. J., J. P. Gibb, J. A. Helfrich, and E. E. Garske. 1985. "Practical Guide for Ground-Water Sampling." Report by Illinois State Water Survey, Department of Energy and Natural Resources, for USEPA. EPA/600/2-85/104.

Bartlett, M. S. 1937. "Properties of Sufficiency and Statistical Tests." Journal of the Royal Statistical Society, Series A. 160:268-282.

Box, G. E. P., and J. M. Jenkins. 1970. Time Series Analysis. Holden-Day, San Francisco, California.

Brown, K. W., and D. C. Andersen. 1981. "Effects of Organic Solvents on the Permeability of Clay Soils." EPA 600/2-83-016, Publication No. 83179978, U.S. EPA, Cincinnati, Ohio.

Cohen, A. C., Jr. 1959. "Simplified Estimators for the Normal Distribution When Samples Are Singly Censored or Truncated." Technometrics. 1:217-237.

Cohen, A. C., Jr. 1961. "Tables for Maximum Likelihood Estimates: Singly Truncated and Singly Censored Samples." Technometrics. 3:535-541.

Conover, W. J. 1980. Practical Nonparametric Statistics. Second Edition. John Wiley and Sons, New York, New York.

CRC Handbook of Tables for Probability and Statistics. 1966. William H. Beyer (ed.). The Chemical Rubber Company.

Current Index to Statistics. Applications, Methods and Theory. Sponsored by American Statistical Association and Institute of Mathematical Statistics. Annual series providing indexing coverage for the broad field of statistics.

David, H. A. 1956. "The Ranking of Variances in Normal Populations." Journal of the American Statistical Association. Vol. 51, pp. 621-626.

Davis, J. C. 1986. Statistics and Data Analysis in Geology. Second Edition. John Wiley and Sons, New York, New York.

Dixon, W. J., and F. J. Massey, Jr. 1983. Introduction to Statistical Analysis. Fourth Edition. McGraw-Hill, New York, New York.

Freeze, R. A., and J. A.
Cherry. 1979. Groundwater. Prentice Hall, Inc., Englewood Cliffs, New Jersey.

Gibbons, R. D. 1987. "Statistical Prediction Intervals for the Evaluation of Ground-Water Quality." Ground Water. Vol. 25, pp. 455-465.

Gibbons, R. D. 1988. "Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites." Ground Water. Vol. 26.

Gilbert, R. 1987. Statistical Methods for Environmental Pollution Monitoring. Professional Books Series, Van Nostrand Reinhold.

Hahn, G., and W. Nelson. 1973. "A Survey of Prediction Intervals and Their Applications." Journal of Quality Technology. 5:178-188.

Heath, R. C. 1983. Basic Ground-Water Hydrology. U.S. Geological Survey Water Supply Paper 2220, 84 p.

Hirsch, R. M., J. R. Slack, and R. A. Smith. 1982. "Techniques of Trend Analysis for Monthly Water Quality Data." Water Resources Research. Vol. 18, No. 1, pp. 107-121.

Hockman, K. K., and J. M. Lucas. 1987. "Variability Reduction Through Subvessel CUSUM Control." Journal of Quality Technology. Vol. 19, pp. 113-121.

Hollander, M., and D. A. Wolfe. 1973. Nonparametric Statistical Methods. John Wiley and Sons, New York, New York.

Huntsberger, D. V., and P. Billingsley. 1981. Elements of Statistical Inference. Fifth Edition. Allyn and Bacon, Inc., Boston, Massachusetts.

Johnson, N. L., and F. C. Leone. 1977. Statistics and Experimental Design in Engineering and the Physical Sciences. 2 Vol., Second Edition. John Wiley and Sons, New York, New York.

Kendall, M. G., and A. Stuart. 1966. The Advanced Theory of Statistics. 3 Vol. Hafner Publication Company, Inc., New York, New York.

Kendall, M. G., and W. R. Buckland. 1971. A Dictionary of Statistical Terms. Third Edition. Hafner Publishing Company, Inc., New York, New York.

Kendall, M. G. 1975. Rank Correlation Methods. Charles Griffin, London.

Langley, R. A. 1971. Practical Statistics Simply Explained. Second Edition. Dover Publications, Inc., New York, New York.

Lehmann, E. L. 1975. Nonparametric Statistical Methods Based on Ranks.
Holden-Day, San Francisco, California.

Lieberman, G. J. 1958. "Tables for One-Sided Statistical Tolerance Limits." Industrial Quality Control. Vol. XIV, No. 10.

Lilliefors, H. W. 1967. "On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown." Journal of the American Statistical Association. 64:399-402.

Lingren, B. W. 1976. Statistical Theory. Third Edition. McMillan.

Lucas, J. M. 1982. "Combined Shewhart-CUSUM Quality Control Schemes." Journal of Quality Technology. Vol. 14, pp. 51-59.

Mann, H. B. 1945. "Non-parametric Tests Against Trend." Econometrica. Vol. 13, pp. 245-259.

Miller, R. G., Jr. 1981. Simultaneous Statistical Inference. Second Edition. Springer-Verlag, New York, New York.

Mull, D. S., T. O. Liebermann, J. L. Smoot, and L. H. Woosley, Jr. 1988. "Application of Dye-Tracing Techniques for Determining Solute Transport Characteristics of Ground Water in Karst Terranes." USEPA, EPA 904/6-88-001, October 1988. 103 pp.

Nelson, L. S. 1987. "Upper 10%, 5%, and 1%-Points of the Maximum F-Ratio." Journal of Quality Technology. Vol. 19, p. 165.

Nelson, L. S. 1987. "A Gap Test for Variances." Journal of Quality Technology. Vol. 19, pp. 107-109.

Noether, G. E. 1967. Elements of Nonparametric Statistics. Wiley, New York.

Pearson, E. S., and H. O. Hartley. 1976. Biometrika Tables for Statisticians. Vol. 1, Biometrika Trust, University College, London.

Quade, D. 1966. "On Analysis of Variance for the K-Sample Problem." Annals of Mathematical Statistics. 37:1747-1748.

Quinlan, J. F. "Ground-Water Monitoring in Karst Terranes: Recommended Protocols and Implicit Assumptions." EPA/600/X-89/050, March 1989.

Remington, R. D., and M. A. Schork. 1970. Statistics with Applications to the Biological and Health Sciences. Prentice-Hall, pp. 235-236.

Shapiro, S. S., and M. R. Wilk. 1965. "An Analysis of Variance Test for Normality (Complete Samples)." Biometrika. Vol. 52, pp. 591-611.

Snedecor, G. W., and W. G. Cochran. 1980. Statistical Methods.
Seventh Edition. The Iowa State University Press, Ames, Iowa.

Starks, T. H. 1988 (Draft). "Evaluation of Control Chart Methodologies for RCRA Waste Sites." Report by Environmental Research Center, University of Nevada, Las Vegas, for Exposure Assessment Research Division, Environmental Monitoring Systems Laboratory-Las Vegas, Nevada. CR814342-01-3.

"Statistical Methods for the Attainment of Superfund Cleanup Standards (Volume 2: Ground Water--Draft)."

Steel, R. G. D., and J. H. Torrie. 1980. Principles and Procedures of Statistics, A Biometrical Approach. Second Edition. McGraw-Hill Book Company, New York, New York.

Todd, D. K. 1980. Ground Water Hydrology. John Wiley and Sons, New York, 534 p.

Tukey, J. W. 1949. "Comparing Individual Means in the Analysis of Variance." Biometrics. Vol. 5, pp. 99-114.

Statistical Software Packages:

BMDP Statistical Software. 1983. 1985 Printing. University of California Press, Berkeley.

Lotus 1-2-3 Release 2. 1986. Lotus Development Corporation, 55 Cambridge Parkway, Cambridge, Massachusetts 02142.

SAS: Statistical Analysis System, SAS Institute, Inc.
    SAS® User's Guide: Basics, Version 5 Edition, 1985.
    SAS® User's Guide: Statistics, Version 5 Edition, 1985.

SPSS: Statistical Package for the Social Sciences. 1982. McGraw-Hill.

SYSTAT: Statistical Software Package for the PC. Systat, Inc., 1800 Sherman Avenue, Evanston, Illinois 60201.

APPENDIX D

FEDERAL REGISTER, 40 CFR, Part 264

Tuesday, October 11, 1988

Part II

Environmental Protection Agency

40 CFR Part 264
Statistical Methods for Evaluating Ground-Water Monitoring From Hazardous Waste Facilities; Final Rule

39729 Federal Register / Vol. 53, No. 196 / Tuesday, October 11, 1988 / Rules and Regulations

final authorization will have to revise their programs to cover the additional requirements in today's announcement. Generally, these authorized State programs must be revised within one year of the date of promulgation of such standards,
or within two years if the State must amend or enact a statute in order to make the required revision (see 40 CFR 271.21). However, States may always impose requirements which are more stringent or have greater coverage than EPA's programs. Regulations which are broader in scope, however, may not be enforced as part of the federally-authorized RCRA program.

B. Regulatory Impact Analysis

Executive Order 12291 (48 FR 13191, February 9, 1981) requires that a regulatory agency determine whether a new regulation will be "major" and, if so, that a Regulatory Impact Analysis be conducted. A major rule is defined as a regulation that is likely to result in:

1. An annual effect on the economy of $100 million or more;
2. A major increase in costs or prices for consumers, individual industries, Federal, State, or local government agencies, or geographic regions; or
3. Significant adverse effects on competition, employment, investment, productivity, innovation, or the ability of United States-based enterprises to compete with foreign-based enterprises in domestic or export markets.

The Agency has determined that today's regulation is not a major rule because it does not meet the above criteria. Today's action should produce a net decrease in the cost of ground-water monitoring at each facility. This final rule has been submitted to the Office of Management and Budget (OMB) for review in accordance with Executive Order 12291. OMB has concurred with this final rule.

C. Regulatory Flexibility Act

Pursuant to the Regulatory Flexibility Act, 5 U.S.C. 601 et seq., whenever an agency is required to publish a general notice of rulemaking for any proposed or final rule, it must prepare and make available for public comment a regulatory flexibility analysis which describes the impact of the rule on small entities (e.g., small businesses, small organizations, and small governmental jurisdictions).
The Administrator may certify, however, that the rule will not have a significant economic impact on a substantial number of small entities. As stated above, this final rule will have no adverse impacts on businesses of any size. Accordingly, I hereby certify that this regulation will not have a significant economic impact on a substantial number of small entities. This final rule, therefore, does not require a regulatory flexibility analysis.
List of Subjects in 40 CFR Part 264
Hazardous material, Reporting and recordkeeping requirements, Waste treatment and disposal, Ground water, Environmental monitoring.
Date: September 28, 1988.
Lee M. Thomas, Administrator.
Therefore, 40 CFR Chapter I is amended as follows:
PART 264-STANDARDS FOR OWNERS AND OPERATORS OF HAZARDOUS WASTE TREATMENT, STORAGE, AND DISPOSAL FACILITIES
1. The authority citation for Part 264 continues to read as follows:
Authority: Secs. 1006, 2002(a), 3004, and 3005 of the Solid Waste Disposal Act, as amended by the Resource Conservation and Recovery Act, as amended (42 U.S.C. 6905, 6912(a), 6924, and 6925).
2. In §264.91 by revising paragraphs (a)(1) and (a)(2) to read as follows:
§264.91 Required programs.
(a) * * *
(1) Whenever hazardous constituents under §264.93 from a regulated unit are detected at a compliance point under §264.95, the owner or operator must institute a compliance monitoring program under §264.99. Detected is defined as statistically significant evidence of contamination as described in §264.98(f);
(2) Whenever the ground-water protection standard under §264.92 is exceeded, the owner or operator must institute a corrective action program under §264.100. Exceeded is defined as statistically significant evidence of increased contamination as described in §264.99(d);
* * * * *
3. Section 264.92 is revised to read as follows:
§264.92 Ground-water protection standard.
The owner or operator must comply with conditions specified in the facility permit that are designed to ensure that hazardous constituents under §264.93 detected in the ground water from a regulated unit do not exceed the concentration limits under §264.94 in the uppermost aquifer underlying the waste management area beyond the point of compliance under §264.95 during the compliance period under §264.96. The Regional Administrator will establish this ground-water protection standard in the facility permit when hazardous constituents have been detected in the ground water.
4. In §264.97 by removing the word “and” from the end of (a)(1), redesignating and revising (g)(3) as (a)(1)(i), adding (a)(3), revising paragraphs (g) and (h), and adding (i) and (j), to read as follows:
§264.97 General ground-water monitoring requirements.
(i) A determination of background quality may include sampling of wells that are not hydraulically upgradient of the waste management area where:
(A) Hydrogeologic conditions do not allow the owner or operator to determine what wells are hydraulically upgradient; and
(B) Sampling at other wells will provide an indication of background ground-water quality that is representative or more representative than that provided by the upgradient wells; and
* * * * *
(3) Allow for the detection of contamination when hazardous waste or hazardous constituents have migrated from the waste management area to the uppermost aquifer.
* * * * *
(g) In detection monitoring or where appropriate in compliance monitoring, data on each hazardous constituent specified in the permit will be collected from background wells and wells at the compliance point(s). The number and kinds of samples collected to establish background shall be appropriate for the form of statistical test employed, following generally accepted statistical principles.
The sample size shall be as large as necessary to ensure with reasonable confidence that a contaminant release to ground water from a facility will be detected. The owner or operator will determine an appropriate sampling procedure and interval for each hazardous constituent listed in the facility permit which shall be specified in the unit permit upon approval by the Regional Administrator. This sampling procedure shall be:
(1) A sequence of at least four samples, taken at an interval that assures, to the greatest extent technically feasible, that an independent sample is obtained, by reference to the uppermost aquifer’s effective porosity, hydraulic conductivity, and hydraulic gradient, and the fate and transport characteristics of the potential contaminants, or
(2) an alternate sampling procedure proposed by the owner or operator and approved by the Regional Administrator.
(h) The owner or operator will specify one of the following statistical methods to be used in evaluating ground-water monitoring data for each hazardous constituent which, upon approval by the Regional Administrator, will be specified in the unit permit. The statistical test chosen shall be conducted separately for each hazardous constituent in each well. Where practical quantification limits (pql’s) are used in any of the following statistical procedures to comply with §264.97(i)(5), the pql must be proposed by the owner or operator and approved by the Regional Administrator. Use of any of the following statistical methods must be protective of human health and the environment and must comply with the performance standards outlined in paragraph (i) of this section.
(1) A parametric analysis of variance (ANOVA) followed by multiple comparisons procedures to identify statistically significant evidence of contamination.
The method must include estimation and testing of the contrasts between each compliance well’s mean and the background mean levels for each constituent.
(2) An analysis of variance (ANOVA) based on ranks followed by multiple comparisons procedures to identify statistically significant evidence of contamination. The method must include estimation and testing of the contrasts between each compliance well’s median and the background median levels for each constituent.
(3) A tolerance or prediction interval procedure in which an interval for each constituent is established from the distribution of the background data, and the level of each constituent in each compliance well is compared to the upper tolerance or prediction limit.
(4) A control chart approach that gives control limits for each constituent.
(5) Another statistical test method submitted by the owner or operator and approved by the Regional Administrator.
(i) Any statistical method chosen under §264.97(h) for specification in the unit permit shall comply with the following performance standards, as appropriate:
(1) The statistical method used to evaluate ground-water monitoring data shall be appropriate for the distribution of chemical parameters or hazardous constituents. If the distribution of the chemical parameters or hazardous constituents is shown by the owner or operator to be inappropriate for a normal theory test, then the data should be transformed or a distribution-free theory test should be used. If the distributions for the constituents differ, more than one statistical method may be needed.
(2) If an individual well comparison procedure is used to compare an individual compliance well constituent concentration with background constituent concentrations or a ground-water protection standard, the test shall be done at a Type I error level no less than 0.01 for each testing period.
If a multiple comparisons procedure is used, the Type I experimentwise error rate for each testing period shall be no less than 0.05; however, the Type I error of no less than 0.01 for individual well comparisons must be maintained. This performance standard does not apply to tolerance intervals, prediction intervals or control charts.
(3) If a control chart approach is used to evaluate ground-water monitoring data, the specific type of control chart and its associated parameter values shall be proposed by the owner or operator and approved by the Regional Administrator if he or she finds it to be protective of human health and the environment.
(4) If a tolerance interval or a prediction interval is used to evaluate ground-water monitoring data, the levels of confidence and, for tolerance intervals, the percentage of the population that the interval must contain, shall be proposed by the owner or operator and approved by the Regional Administrator if he or she finds these parameters to be protective of human health and the environment. These parameters will be determined after considering the number of samples in the background data base, the data distribution, and the range of the concentration values for each constituent of concern.
(5) The statistical method shall account for data below the limit of detection with one or more statistical procedures that are protective of human health and the environment. Any practical quantification limit (pql) approved by the Regional Administrator under §264.97(h) that is used in the statistical method shall be the lowest concentration level that can be reliably achieved within specified limits of precision and accuracy during routine laboratory operating conditions that are available to the facility.
(6) If necessary, the statistical method shall include procedures to control or correct for seasonal and spatial variability as well as temporal correlation in the data.
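The two Type I error floors in paragraph (i)(2) interact when several compliance wells are tested in the same period. The sketch below is only an illustration of that arithmetic, not part of the rule. It assumes a Bonferroni-style equal split of the experimentwise rate across comparisons (the rule does not name a splitting method), and the function names are hypothetical.

```python
def per_comparison_alpha(n_comparisons, experimentwise=0.05, floor=0.01):
    """Type I error level to use for each individual well comparison.

    Assumption: the experimentwise rate is split equally across the
    comparisons (Bonferroni-style); the rule itself names no method.
    The result is never allowed to drop below the individual floor.
    """
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return max(experimentwise / n_comparisons, floor)


def experimentwise_bound(n_comparisons, alpha):
    """Bonferroni upper bound on the experimentwise Type I error rate."""
    return min(1.0, n_comparisons * alpha)


if __name__ == "__main__":
    # Print wells, per-comparison level, and the resulting upper bound.
    for wells in (1, 4, 5, 10):
        a = per_comparison_alpha(wells)
        print(wells, round(a, 4), round(experimentwise_bound(wells, a), 4))
```

Under this assumed split, up to five comparisons keep each test at or above 0.01. With more comparisons the 0.01 floor governs and the bound on the experimentwise rate rises above 0.05, which is consistent with both levels being stated as minimums.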
(j) Ground-water monitoring data collected in accordance with paragraph (g) of this section including actual levels of constituents must be maintained in the facility operating record. The Regional Administrator will specify in the permit when the data must be submitted for review.
5. In §264.98 by removing paragraphs (i), (j) and (k), and by revising paragraphs (c), (d), (f), (g), and (h) to read as follows:
§264.98 Detection monitoring program.
* * * * *
(c) The owner or operator must conduct a ground-water monitoring program for each chemical parameter and hazardous constituent specified in the permit pursuant to paragraph (a) of this section in accordance with §264.97(g). The owner or operator must maintain a record of ground-water analytical data as measured and in a form necessary for the determination of statistical significance under §264.97(h).
(d) The Regional Administrator will specify the frequencies for collecting samples and conducting statistical tests to determine whether there is statistically significant evidence of contamination for any parameter or hazardous constituent specified in the permit under paragraph (a) of this section in accordance with §264.97(g). A sequence of at least four samples from each well (background and compliance wells) must be collected at least semi-annually during detection monitoring.
* * * * *
(f) The owner or operator must determine whether there is statistically significant evidence of contamination for any chemical parameter or hazardous constituent specified in the permit pursuant to paragraph (a) of this section at a frequency specified under paragraph (d) of this section.
(1) In determining whether statistically significant evidence of contamination exists, the owner or operator must use the method(s) specified in the permit under §264.97(h). These method(s) must compare data collected at the compliance point(s) to the background ground-water quality data.
(2) The owner or operator must determine whether there is statistically significant evidence of contamination at each monitoring well at the compliance point within a reasonable period of time after completion of sampling. The Regional Administrator will specify in the facility permit what period of time is reasonable, after considering the complexity of the statistical test and the availability of laboratory facilities to perform the analysis of ground-water samples.
(g) If the owner or operator determines pursuant to paragraph (f) of this section that there is statistically significant evidence of contamination for chemical parameters or hazardous constituents specified pursuant to paragraph (a) of this section at any monitoring well at the compliance point, he or she must:
(1) Notify the Regional Administrator of this finding in writing within seven days. The notification must indicate what chemical parameters or hazardous constituents have shown statistically significant evidence of contamination;
(2) Immediately sample the ground water in all monitoring wells and determine whether constituents in the list of Appendix IX of Part 264 are present, and, if so, in what concentration.
(3) For any Appendix IX compounds found in the analysis pursuant to paragraph (g)(2) of this section, the owner or operator may resample within one month and repeat the analysis for those compounds detected. If the results of the second analysis confirm the initial results, then these constituents will form the basis for compliance monitoring. If the owner or operator does not resample for the compounds found pursuant to paragraph (g)(2) of this section, the hazardous constituents found during this initial Appendix IX analysis will form the basis for compliance monitoring.
(4) Within 90 days, submit to the Regional Administrator an application for a permit modification to establish a compliance monitoring program meeting the requirements of §264.99. The application must include the following information:
(i) An identification of the concentration of any Appendix IX constituent detected in the ground water at each monitoring well at the compliance point;
(ii) Any proposed changes to the ground-water monitoring system at the facility necessary to meet the requirements of §264.99;
(iii) Any proposed additions or changes to the monitoring frequency, sampling and analysis procedures or methods, or statistical methods used at the facility necessary to meet the requirements of §264.99;
(iv) For each hazardous constituent detected at the compliance point, a proposed concentration limit under §264.94(a) (1) or (2), or a notice of intent to seek an alternate concentration limit under §264.94(b); and
(5) Within 180 days, submit to the Regional Administrator:
(i) All data necessary to justify an alternate concentration limit sought under §264.94(b); and
(ii) An engineering feasibility plan for a corrective action program necessary to meet the requirement of §264.100, unless:
(A) All hazardous constituents identified under paragraph (g)(2) of this section are listed in Table 1 of §264.94 and their concentrations do not exceed the respective values given in that Table; or
(B) The owner or operator has sought an alternate concentration limit under §264.94(b) for every hazardous constituent identified under paragraph (g)(2) of this section.
(6) If the owner or operator determines, pursuant to paragraph (f) of this section, that there is a statistically significant difference for chemical parameters or hazardous constituents specified pursuant to paragraph (a) of this section at any monitoring well at the compliance point, he or she may demonstrate that a source other than a regulated unit caused the contamination or that the detection is an artifact caused by an error in sampling, analysis, or statistical evaluation or natural variation in the ground water. The owner or operator may make a demonstration under this paragraph in addition to, or in lieu of, submitting a permit modification application under paragraph (g)(4) of this section; however, the owner or operator is not relieved of the requirement to submit a permit modification application within the time specified in paragraph (g)(4) of this section unless the demonstration made under this paragraph successfully shows that a source other than a regulated unit caused the increase, or that the increase resulted from error in sampling, analysis, or evaluation. In making a demonstration under this paragraph, the owner or operator must:
(i) Notify the Regional Administrator in writing within seven days of determining statistically significant evidence of contamination at the compliance point that he intends to make a demonstration under this paragraph;
(ii) Within 90 days, submit a report to the Regional Administrator which demonstrates that a source other than a regulated unit caused the contamination or that the contamination resulted from error in sampling, analysis, or evaluation;
(iii) Within 90 days, submit to the Regional Administrator an application for a permit modification to make any appropriate changes to the detection monitoring program facility; and
(iv) Continue to monitor in accordance with the detection monitoring program established under this section.
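The day counts in paragraph (g) can be laid out as a simple schedule. The sketch below is illustrative only: it assumes every clock starts on the date the statistically significant evidence is determined and that days are calendar days, neither of which the text above states for every item. The dictionary and function names are hypothetical.

```python
from datetime import date, timedelta

# Day counts taken from paragraphs (g)(1), (g)(4), (g)(5) and (g)(6).
# Assumption: each period runs in calendar days from the determination
# date; the facility permit may fix the starting point differently.
DEADLINE_DAYS = {
    "(g)(1) written notice of finding": 7,
    "(g)(4) permit modification application": 90,
    "(g)(5) alternate concentration limit data and feasibility plan": 180,
    "(g)(6)(i) notice of intent to make a demonstration": 7,
    "(g)(6)(ii) demonstration report": 90,
}


def detection_deadlines(determined_on):
    """Map each paragraph (g) submission to its assumed due date."""
    return {
        step: determined_on + timedelta(days=days)
        for step, days in DEADLINE_DAYS.items()
    }


if __name__ == "__main__":
    # Example determination date chosen arbitrarily for illustration.
    for step, due in detection_deadlines(date(1989, 1, 3)).items():
        print(due.isoformat(), step)
```

A table like this is easiest to keep alongside the facility operating record, so that the seven-day notices are not overlooked while the longer submissions are being prepared.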
(h) If the owner or operator determines that the detection monitoring program no longer satisfies the requirements of this section, he or she must, within 90 days, submit an application for a permit modification to make any appropriate changes to the program.
6. In §264.99 by revising paragraph (c), revising paragraphs (d), (f), and (g), removing paragraph (h), redesignating paragraph (i) as (h), (j) as (i) and (k) as (j), revising the redesignated paragraphs (h) introductory text and (i) introductory text, and removing paragraph (l) to read as follows:
§264.99 Compliance monitoring program.
* * * * *
(c) The Regional Administrator will specify the sampling procedures and statistical methods appropriate for the constituents and the facility, consistent with §264.97 (g) and (h).
(1) The owner or operator must conduct a sampling program for each chemical parameter or hazardous constituent in accordance with §264.97(g).
(2) The owner or operator must record ground-water analytical data as measured and in form necessary for the determination of statistical significance under §264.97(h) for the compliance period of the facility.
(d) The owner or operator must determine whether there is statistically significant evidence of increased contamination for any chemical parameter or hazardous constituent specified in the permit, pursuant to paragraph (a) of this section, at a frequency specified under paragraph (f) under this section.
(1) In determining whether statistically significant evidence of increased contamination exists, the owner or operator must use the method(s) specified in the permit under §264.97(h). The method(s) must compare data collected at the compliance point(s) to a concentration limit developed in accordance with §264.94.
(2) The owner or operator must determine whether there is statistically significant evidence of increased contamination at each monitoring well at the compliance point within a reasonable time period after completion of sampling.
The Regional Administrator will specify that time period in the facility permit, after considering the complexity of the statistical test and the availability of laboratory facilities to perform the analysis of ground-water samples.
* * * * *
(f) The Regional Administrator will specify the frequencies for collecting samples and conducting statistical tests to determine statistically significant evidence of increased contamination in accordance with §264.97(g). A sequence of at least four samples from each well (background and compliance wells) must be collected at least semi-annually during the compliance period of the facility.
(g) The owner or operator must analyze samples from all monitoring wells at the compliance point for all constituents contained in Appendix IX of Part 264 at least annually to determine whether additional hazardous constituents are present in the uppermost aquifer and, if so, at what concentration, pursuant to procedures in §264.98(f). If the owner or operator finds Appendix IX constituents in the ground water that are not already identified in the permit as monitoring constituents, the owner or operator may resample within one month and repeat the Appendix IX analysis. If the second analysis confirms the presence of new constituents, the owner or operator must report the concentration of these additional constituents to the Regional Administrator within seven days after the completion of the second analysis and add them to the monitoring list. If the owner or operator chooses not to resample, then he or she must report the concentrations of these additional constituents to the Regional Administrator within seven days after completion of the initial analysis and add them to the monitoring list.
(h) If the owner or operator determines pursuant to paragraph (d) of this section that any concentration limits under §264.94 are being exceeded at any monitoring well at the point of compliance, he or she must:
* * * * *
(i) If the owner or operator determines, pursuant to paragraph (d) of this section, that the ground-water concentration limits under this section are being exceeded at any monitoring well at the point of compliance, he or she may demonstrate that a source other than a regulated unit caused the contamination or that the detection is an artifact caused by an error in sampling, analysis, or statistical evaluation or natural variation in the ground water. In making a demonstration under this paragraph, the owner or operator must:
* * * * *
[FR Doc. 88-22913 Filed 10-7-88; 8:45 am]
BILLING CODE 6560-50-M