Appendix D: Glossary
accuracy – the degree of agreement between a measured value and the actual (true) value. Together, precision and bias determine accuracy.
action level – the generic term applied to any numerical concentration criterion that is compared with environmental sample data to arrive at a decision or determination about a potential contaminant(s) of concern (from survey through remediation) for a user-defined volume of media.
aliquot – a portion of a liquid solution or solid matrix that is taken out of a sample or subsample for analysis. Aliquot is roughly synonymous with subsample and is often used incorrectly in the site cleanup industry to refer to an increment. That usage is inappropriate and confusing because the terms are not synonymous. Increment is the proper word for field samples that are added together (or pooled) to form a composite sample or an incremental sample.
analytical quality – the degree to which evidence demonstrates that all steps in an analytical process were performed with acceptable bias and precision. Quality is judged by the ability of the data to be used for their intended purpose. Since there are many types of decisions that vary in their need for data rigor, analytical quality that is acceptable for making one decision may not be acceptable for making a different decision. If documentation for the quality control in critical process steps is lacking, the data are said to be of “unknown” quality.
analytical sample – the portion of a soil (or other media) sample submitted to the laboratory that actually undergoes extraction or digestion to dissolve target analytes into a liquid that can be injected into an instrument for measurement. The term is interchangeable with the term analytical subsample.
analytical subsample – the portion of a soil (or other media) sample submitted to the laboratory that actually undergoes extraction or digestion to dissolve target analytes into a liquid that can be injected into an instrument for measurement. The term is interchangeable with the term analytical sample.
analytical variability – the imprecision in data results that are attributable to the analytical process of extraction/digestion of an analytical sample, cleanup of the ensuing extract (if performed), introduction of the extract into the analytical instrument, and the operation (calibration, signal stability, maintenance, etc.) of the analytical instrument itself. The degree and sources of analytical variability are measured by analytical quality control checks, typically measured in terms of standard deviation, relative standard deviation, or relative percent difference.
area of influence – the area of soil surrounding a sample that is considered to have the same concentration as the sample. This concept is equivalent to the sampling unit concept, which more explicitly considers soil volume, rather than just area.
arithmetic mean – the sum of n measurements divided by n (all measurements are equally weighted).
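Expressed as a formula (notation supplied here for illustration), the arithmetic mean of n measurements x_1, …, x_n is \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.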
average – see arithmetic mean.
bias-corrected accelerated (BCa) bootstrap method – a modification of the percentile bootstrap 95% upper confidence limit, which attempts to address the issue of insufficient coverage.
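As an illustrative sketch (not part of the original definition), a BCa bootstrap 95% upper confidence limit on the mean can be computed with SciPy's scipy.stats.bootstrap function; the data values below are hypothetical:

    import numpy as np
    from scipy.stats import bootstrap

    # Hypothetical ISM replicate results (mg/kg)
    results = np.array([12.0, 8.5, 22.1, 15.3, 9.8, 30.2, 11.4])

    # A two-sided 90% BCa interval leaves 5% in each tail, so its upper
    # endpoint serves as a one-sided 95% upper confidence limit on the mean.
    ci = bootstrap((results,), np.mean, confidence_level=0.90,
                   method='BCa', random_state=0).confidence_interval
    print(f"BCa 95% UCL of the mean: {ci.high:.2f} mg/kg")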
bias – the tendency for a measurement to consistently over- or underestimate the actual (true) value. Precision and bias together determine accuracy.
bulk soil – generally, native soil that has not been sieved. However, it may contain components that are not considered to be “soil” in its strictest sense, such as twigs and other macro plant fragments, living creatures (insects, worms, etc.), stones larger than the 10-mesh sieve fraction (2 mm), man-made debris, and/or trash.
bulk soil sample – a soil sample expected to be representative of native soil. Soil is defined by the U.S. Department of Agriculture as mineral and organic material that is less than the 10-mesh sieve fraction (<2 mm in diameter). A bulk soil sample contains both the fine fractions, usually considered to be the less than 60-mesh fraction (<0.25 mm in diameter), and the coarse fractions, the material between 60- and 10-mesh (0.25 to 2 mm in diameter). The soil fraction that does not pass through a 10-mesh sieve is often referred to as oversized material and removed from the bulk soil sample during processing.
co-located sample – soil samples collected a few inches to a few feet apart as a quality control check, sometimes used in traditional discrete or grab sampling. In discrete sampling plans, co-located samples provide valuable information about short-scale spatial heterogeneity and whether it is causing significant sampling error that could lead to decision errors. Because there are usually two samples involved, the variation/precision between co-located samples is usually quantified using the relative percent difference.
composite sample – a sample composed of two or more increments, which generally undergoes some preparation procedures designed to reduce the errors associated with obtaining a measurement from the combined sample. An ISM sample is a type of composite sample whose collection and preparation steps are designed using the general suggestions of Gy’s theory of sampling. Traditional composite samples generally do not consist of a large volume or a large number of increments and do not undergo the same preparation and subsampling steps suggested by Gy’s theory. The mean for the area covered by a traditional composite sample is not expected to be as accurate as a mean produced by an incremental sample.
compositional heterogeneity (CH) – the heterogeneity arising from the composition of each particle within a decision unit. CH is inherently dependent on the composition, shape, size, density, etc., of the particles or fragments making up the lot. CH is synonymous with constitution heterogeneity.
conceptual site model (CSM) – a written or pictorial representation of an environmental system and the biological, physical, and chemical processes that determine the transport of contaminants from sources through environmental media to environmental receptors within the system. The CSM should include the contamination release mechanisms; fate and transport mechanisms and how long they have been acting on the contamination and creating short-scale heterogeneity; what receptors are currently present and potential future receptors; what exposure pathways may exist currently and potential future exposure pathways; what exposure routes of entry may exist currently and potential future exposure routes for each receptor; and what, if anything, will be done about the contamination and/or intact exposure pathways.
confidence limit (CL) – the numbers at the upper (95% upper confidence limit) and lower (lower confidence limit) end of a confidence interval. The 95% CL is the most frequently used, although other values can be used.
constitution heterogeneity – synonymous with compositional heterogeneity (CH).
correct samples – the term used in the Gy-based theory of sampling to designate samples for which representativeness is known and documented. The analytical data from correct samples can be relied upon to support correct decisions.
coverage – for statisticians, the probability that a confidence interval encloses or captures the true population parameter. For example, a calculated 95% upper confidence limit is intended to have a 95% chance of being equal to or exceeding the true (population) arithmetic mean. For field investigators, coverage is the extent to which the density of sampling locations represents the sampling unit (i.e., spatial coverage).
data quality – the fitness of data for their intended use. Since soil data come from soil samples, data quality must include sampling quality as well as analytical quality. The analysis can be perfect, but if the sample was “wrong” (perhaps degraded or mislabeled or the wrong particle sizes analyzed), the data quality is “bad” in that the results can give misleading information.
data quality objective (DQO) – a qualitative and quantitative statement derived from the USEPA DQO process that clarifies a study’s technical and quality objectives, defines the appropriate type of data, and specifies tolerable levels of potential decision errors that will be used to establish the quality and quantity of data needed to support decisions.
data representativeness – a measure of the degree to which data accurately and precisely represent a characteristic of a population.
data uncertainty – a lack of confidence that data can be used for a particular application. Data are uncertain when there is insufficient documentation to explain how samples were collected, processed, or subsampled (i.e., sample representativeness is unknown); insufficient documentation of quality control (QC) is available to document sources of sampling and analytical variability so the bias and precision of the data are unknown; QC documentation shows that data are too biased; or data precision is too poor to support confident decisions. Data uncertainty is always present to some degree for soil data; however, it becomes excessive when the risk of incorrect decisions exceeds tolerable levels.
data validation – an analyte- and sample-specific process that extends the evaluation of data beyond method, procedure, or contractual compliance (i.e., data verification) to determine the quality of a specific dataset relative to the end use. Data validation focuses on the project’s specifications or needs, is designed to meet the needs of decision-makers/data users, and should note any potentially unacceptable departures from the Quality Assurance Project Plan. The validation process may look at the laboratory quality control (QC) checks for sample extraction and digestion and for extract cleanup. Data validation always includes evaluation of the laboratory QC used to evaluate instrument calibration and performance and may also check mathematical calculations. In this way, data validation is important to establish the precision and bias of the pure analytical process, but it does little to establish data representativeness or estimate data uncertainty.
decision – a determination made about a potential contaminant(s) of concern or for a volume of media using environmental sample data.
decision error – making an incorrect decision, such as deciding that cleanup is needed when it is not, or missing a cleanup that is warranted, based on site-specific information, including measurements from samples collected within the decision unit.
decision mechanism – an algorithm or protocol that results in a decision about a potential contaminant of concern or for a volume of media. A variety of decision mechanisms are possible when using ISM sampling. Each decision mechanism has strengths, weaknesses, and assumptions. In some cases, agency requirements will dictate the decision mechanism to be used. In other cases, a consensus on the decision mechanism to be employed needs to be reached among members of the planning team prior to finalization of the sampling plan.
decision threshold – any type of numerical value used for an exceedance/non-exceedance decision, such as a screening level, an action level, a cleanup level, a criterion, etc.
decision unit (DU) – the smallest volume (i.e., plan area and depth) of soil (or other media) for which a decision will be made based upon ISM sampling. A DU may consist of one or more sampling units. Using the term DU to represent all ISM sample results, regardless of decision type or intended use, is incorrect.
delimitation error – the error that results from incorrect shape or nonuniform volume of material extracted from the decision unit or sampling unit to form the sample. Often occurs when improperly shaped tools or incorrect equipment are used to collect increments.
disaggregation – the act of breaking soil peds (clods or clumps) into individual small particles while keeping the small pebbles and hard crystalline (mineral soil) particles intact. Disaggregation is often performed by crushing soil peds using fingers, a hand-operated mortar and pestle, a coffee grinder, a rubber mallet, etc. Disaggregation does NOT involve particle size reduction, which is the breaking apart or crushing of individual solid particles by milling. Mills are capable of reducing solid rock particles to the consistency of flour.
discrete soil sample – a soil sample obtained from the parent matrix by scooping or coring from a single location at a single point in time. May also be termed a “grab sample,” especially if the sample has been collected without consideration of a statistically valid sampling design or a representative sample support. Grab samples are almost always incorrect samples.
distributional heterogeneity – the heterogeneity describing the nonuniform distribution at all scales of types of fragments or particles within a sample, across a sampling unit or decision unit, or across a site.
energetics – unreacted explosive and propellant compound residues that remain after firing or detonating munitions, as defined in USEPA SW-846 Method 8330B.
exposure point concentration (EPC) – an estimate of the concentration of a constituent in an environmental medium to which a receptor will be exposed. The EPC can be determined for an entire site or for an individual exposure unit. The EPC is based on a statistical derivation of either measured data or modeled data. In risk assessment, an EPC is typically based on a 95% upper confidence limit so that risk-based decisions are protective of human health and the environment.
exposure unit (or exposure area decision unit) – a decision unit that is used to make decisions about risk; a volume of an environmental medium (for example, soil) over which a receptor is reasonably assumed to move randomly and is therefore equally likely to contact all locations.
extraction error – the error that results from incorrectly extracting the increment from the decision unit (DU) or sampling unit. An example is loose material in the bottom part of the corer falling out and back into the hole when sampling dry, sandy soils with an open-bottom corer. The sample would then over-represent the upper part of the DU volume because the portion of the core representing the lower part of the DU is lost.
field replicate samples – two or more incremental samples independently collected from the same decision unit or sampling unit using the same number of increments but from offset increment collection locations.
fundamental sampling error – the sample variability that results from the constitutional heterogeneity of soil. Fundamental sampling error is always present and can be estimated, but its magnitude depends in part on sample mass relative to particle size. It can be reduced by reducing particle size and/or increasing sample mass. For soil with particles up to 2-mm diameter and analyte concentrations of interest in the mg/kg range, a sample mass of 1 kg is typically required to control variability due to fundamental error to within a 15 to 20% relative standard deviation. Particle size must be reduced before sample mass is reduced (e.g., during subsampling) to maintain control of fundamental error.
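For illustration (drawing on the broader theory of sampling literature rather than this glossary), the relative variance of the fundamental sampling error is often written as s_{FSE}^2 \approx C\,d^3\,(1/M_s - 1/M_L), where d is the top particle diameter, M_s is the sample mass, M_L is the mass of the sampled lot, and C is a material-dependent sampling constant. When M_L is much larger than M_s, this reduces to C\,d^3/M_s, which shows why reducing particle size or increasing sample mass reduces fundamental error.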
grand mean – the arithmetic mean of all ISM replicates from the same decision unit or sampling unit.
grinding – a generic term for soil disaggregation or milling. When using the term grinding, the equipment to be used must be specified to help ensure an accurate understanding of whether the intent is disaggregation or milling. Soil that is “ground” in a coffee grinder or in a hand-operated mortar and pestle will only be disaggregated because those grinders do not have the force needed to fracture non-friable mineral and rock particles (sands, pebbles, etc.). To provide clarity of intent, the term grinding should be used when the intent is disaggregation and the term milling should be used when the intent is particle size reduction.
grouping and segregation error – sample variability resulting from the short-range distributional heterogeneity within and around the immediate area from which a sample is collected (i.e., the sampling location) and developing within the sample container after sample collection. Particles tend to associate into groups of like particles due to gravitational separation, chemical partitioning, differing moisture content, magnetism, or electrostatic charge, which can lead to sampling bias.
Gy-compliant (procedure) – sample collection, sample processing, sample splitting, and subsampling procedures that comply with the theory of sampling by using activities that minimize sampling errors, in particular, fundamental sampling error, grouping and segregation error, and increment delimitation and extraction error. Using Gy-compliant procedures for all steps of the sample collection and analysis process produces correct samples for which representativeness is known.
heterogeneity – the condition of spatial nonuniformity in the distribution of soil constituents. All soil is heterogeneous. In a first analysis, there are two fundamentally different types of heterogeneity in soil: heterogeneity due to the dissimilar and diverse constituents of the individual particles, and heterogeneity due to the nonuniform spatial distribution of different types of particles within the soil. These are identified as compositional heterogeneity (CH) and distributional heterogeneity, respectively. Compositional heterogeneity, also called micro-scale heterogeneity, is responsible for fundamental sampling error and is reduced by increasing sample mass. Distributional heterogeneity is present at all scales. Variability due to distributional heterogeneity is addressed by site stratification into logical sampling units or decision units and by adjusting the number of increments.
hot spot – generally described as an area of elevated contamination (ITRC 2008). A hot spot is not typically identified visually (i.e., stained soil, free product) but is primarily identified on the basis of chemical concentrations detected in soil sample results. For meaningful discussion, the specific size and magnitude of chemical concentrations that constitute a hot spot should be agreed on during systematic project planning.
incorrect samples – the term used in Gy-based theory of sampling to designate samples for which representativeness is not known and cannot be determined. No matter how good the analytical quality is, data from these samples are not reliable to support decision-making.
increment – a volume of soil collected from a single point within a decision unit (DU) or sampling unit (SU) that is collected with a single operation of a sampling device. Multiple increments (typically 30 or more) are collected from a DU or SU and combined to form an incremental sample. This term should be used instead of the term aliquot, which actually has the opposite meaning. An increment is something added in or added together; an aliquot is something taken out, like a portion of extract taken from a flask to inject into an analytical instrument.
incremental sample – a sample formed from multiple increments collected from a defined volume of soil, the decision unit (DU) or sampling unit (SU), which are combined, processed, and analyzed to estimate the mean concentration in that DU or SU.
independent sample – a stand-alone sample whose result is not dependent on any other samples. For example, each decision unit is often sampled by collecting three independent field replicate samples, not by splitting a single incremental sample or by collecting co-located increments.
laboratory replicate – two or more subsamples taken from a single field sample. Synonymous with subsample replicate. Not to be confused with a laboratory instrument replicate, which is repeated measurement of a sample to determine precision for the instrument.
laboratory control sample – a known matrix spiked with compound(s) representative of all target analytes.
large-scale distributional heterogeneity – nonuniform distribution of differences in analyte concentration from location to location across an area or differences in how contaminants are spatially distributed throughout the decision unit (DU) or sampling unit (SU). Variability in results due to the nonuniform distribution of analytes across the DU/SU is controlled by increasing the number of increments making up the sample. This is the spatial scale at which heterogeneity becomes important to decision-making. Also synonymous with long-scale heterogeneity.
long-scale heterogeneity – synonymous with large-scale distributional heterogeneity.
matrix spike/matrix spike duplicate – environmental samples that are spiked in the laboratory or in the field with a known concentration of a target analyte(s) to verify percent recoveries. Matrix spike/matrix spike duplicate samples are primarily used to check sample matrix interferences. They can also be used to monitor laboratory performance. A duplicate spike is used to assess bias and precision.
micro-scale heterogeneity – differences in size and composition between individual soil particles. Often due to some soil particles being composed of minerals that more readily adsorb contaminants than other soil particles. See also compositional heterogeneity. Not to be confused with short-scale distributional heterogeneity.
milling – complete particle size reduction of all soil components including hard crystalline materials to a defined maximum particle size (<250 µm or <75 µm). The terms “pulverization” and “comminution” are synonymous with milling. The types of mills commonly used with soil samples include various types of laboratory-grade ball mills and ring and puck mills. This magnitude of particle reduction reduces subsampling variability.
nature and extent decision unit – a decision unit (DU) based on the reasonably well-known location and dimensions of a source area. Synonymous with a source area DU.
nonparametric distribution – when the shape of the statistical data distribution curve cannot be described by a mathematical formula.
parametric distribution – when the shape of the statistical data distribution curve can be described by a mathematical formula. Examples of parametric data distributions commonly observed with environmental datasets are normal distributions (another name for bell-shaped curves), lognormal distributions, and gamma distributions.
percent relative standard deviation (%RSD) – a measure of imprecision when two or more replicate procedures are performed. The %RSD is the arithmetic standard deviation divided by the arithmetic mean, multiplied by 100.
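In formula form (notation supplied here for clarity), with s the standard deviation and \bar{x} the arithmetic mean of the replicate results: \%RSD = (s/\bar{x}) \times 100. The unscaled ratio s/\bar{x} is the relative standard deviation (RSD) defined below.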
population of soil – a volume of soil that shares a common characteristic. The project decision to be made defines the soil population of interest, such as the volume of soil exceeding the cleanup criteria or background concentration.
population of potential soil samples – all potential soil samples within a decision unit (DU), exposure unit, or other defining boundary. If a soil sample is considered to be 100 grams of soil in a jar, and the DU is a mass of soil weighing 2000 kg, then 20,000 potential soil samples make up the population defined by the DU.
precision – a measure of reproducibility. Together precision and bias determine accuracy.
quality – the standard of something as measured against other things of a similar kind; the degree of excellence of something.
quality assurance – a management or oversight function that deals with setting policy and running an administrative system of management controls that cover planning, implementation, and review of data collection activities and the use of data in decision-making.
quality control – a technical function that includes all the scientific precautions, such as calibrations and replication, needed to acquire data of known and adequate quality.
receptor – a human or ecological individual (e.g., recreational visitor or piping plover) or general ecological population (e.g., benthic invertebrates) that could be exposed to contaminants in environmental media.
relative difference (RD) – a measure of imprecision when only two replicate procedures (i.e., duplicates) were performed. The most common formula for RD is to subtract one replicate from the other and divide that difference by the average of the two replicate results. Unlike relative percent difference, the fractional result is not then multiplied by 100.
relative percent difference (RPD) – a measure of imprecision that can be used when only two replicate procedures (i.e., duplicates) were performed. The most common formula for RPD is to subtract one replicate from the other and divide that difference by the average of the two replicate results. The fractional result is then multiplied by 100. Note the RPD is not equal to the relative standard deviation of the same two sample results.
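Written out (notation supplied here for clarity), for duplicate results x_1 and x_2: RPD = \frac{|x_1 - x_2|}{(x_1 + x_2)/2} \times 100. The relative difference (RD) defined above is the same ratio without the factor of 100.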
relative standard deviation (RSD) – a measure of imprecision when two or more replicate procedures were performed. The RSD is the arithmetic standard deviation divided by the arithmetic mean. Also called the coefficient of variation.
replicate samples – two or more independently collected field samples or laboratory subsamples obtained from the same lot of soil by the same sampling or subsampling procedure to measure the precision of the results. Replicate samples are not split but are independently collected incremental samples. See also sampling quality.
representative soil sample – correctly answers the desired question about a decision unit or sampling unit with an acceptable level of confidence. A sample that is representative to answer one question is not likely to be representative to answer a different question.
representative analytical sample – has the same property of interest as the field soil sample from which it is collected.
representativeness – a description of the degree to which an estimate or measurement agrees with the true value of the parameter of interest. The most representative estimate is the one that has the least total error (or greatest precision and accuracy). “A sample of a universe or whole which can be expected to exhibit the average properties of the universe or whole” (40 CFR § 260.10) (USEPA 2002a).
sample – a small part or quantity intended to show what the whole is like. An incremental sample is formed by the reunion of multiple increments obtained from a defined volume of soil (the sampling unit or decision unit).
sample support – the size (mass or volume), shape, and orientation of an increment or a sample.
sampling density – the number of discrete samples or increments per area or volume of soil.
sampling error – anything during sample collection and handling that causes the measured properties of a sample to deviate from the actual properties of the population. See also sampling variability.
sampling quality – the degree to which evidence demonstrates that all steps related to acquiring representative samples in the field and preserving that representativeness through laboratory subsampling (sample collection, processing, and subsampling) were performed with acceptable bias and precision.
sampling unit – the volume of soil from which increments are collected to determine an estimated mean concentration of analytes of interest for that volume of soil.
sampling (or subsampling) variability – imprecision in data results due to various factors in sampling design, field sample collection, and laboratory sample processing and subsampling procedures. Common causes include insufficient number of increments, incorrect sample support, and inadequate laboratory sample processing. The term sampling error is synonymous with sampling variability. The word “error” is commonly used in statistics to refer to variability or imprecision. The degree and sources of sampling variability are measured by replicate sampling at various steps in the sampling, processing, and subsampling processes. Variability is typically measured in terms of standard deviation, relative standard deviation, or relative percent difference between replicate samples.
short-scale distributional heterogeneity – nonuniformity in the distribution of analytes of interest at spatial scales too small to be relevant at the scale of decision-making. The scale is too small to allow separation of “clean” versus “dirty” soil during cleanup and too small to be meaningful to the receptors identified during risk assessment. (Note: the meaningful spatial scale can be vastly different depending on the receptor, such as an earthworm vs. a human resident versus a fox.)
slabcake – an entire incremental field sample spread out in a pan (such as a foil-lined cookie sheet) to about ¼- to ½-inch depth.
slabcake subsampling – a subsampling method in which an entire incremental field sample is spread out in a pan (such as a foil-lined cookie sheet) to about ¼- to ½-inch depth, and at least 30 small increments are taken from the full thickness of the slabcake and combined to form the analytical sample.
soil – fragmented particulate material consisting of discrete rock and mineral particles less than 2.0 millimeters in size, along with small amounts of organic matter (humus). This document uses the term soil, understanding that other solid particulate media can also be assessed using this methodology.
specimen – a discrete individual example from a population. The term specimen is sometimes used in place of the term sample when the data user does not know how a sample was collected and handled, to convey the concern that the representativeness of the sample or subsample relative to the population from which it was taken is unknown.
standard deviation – a measure of the dispersion or imprecision of a sample or population distribution equal to the positive square root of the variance.
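For a sample of n measurements (the usual sample form, with notation supplied here for clarity): s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}, where \bar{x} is the arithmetic mean.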
statistic – a calculated numerical value (such as the sample mean) that characterizes some aspect of a sample set of data and that is often meant to estimate the true value of a corresponding parameter (such as the population mean) in an underlying population.
statistical data distribution – assessed by a frequency plot (similar to a histogram) of the data. Frequency plots can take any number of shapes. See also parametric distribution and nonparametric distribution.
stratification – splitting a population into subgroups that are “internally consistent with respect to a target constituent or property of interest and different from adjacent portions of the population.”
stratify – to form, deposit, or arrange into layers.
subsample – a small portion of material selected from a field sample by the laboratory for analysis. In ISM, a subsample is a representative composite of increments collected from an incremental field sample by the laboratory for analysis. See also aliquot.
subsampling – in ISM, collecting a small, representative portion of a processed incremental field sample, either by spreading the processed field sample in a two-dimensional layer (slabcake) and combining multiple small increments taken from random locations through the entire thickness of the layer or, preferably, by forming the processed field sample into an elongated pile (a one-dimensional line) and collecting increments completely through the line at random locations to form the subsample. See also slabcake subsampling.
subsample replicate – two or more subsamples taken from a single field sample.
Superfund – the Comprehensive Environmental Response, Compensation and Liability Act (CERCLA) established by Congress in 1980. CERCLA is informally called Superfund and allows EPA to clean up contaminated sites when responsible parties are either unable or unwilling to do so. It also forces the parties responsible for the contamination to either perform cleanups or reimburse the government for EPA-led cleanup work.
target particle size – the particle size of soil that is relevant to the decision to be made. Selecting the correct target particle size is important because the concentration of soil contaminants generally increases as the particle size fraction analyzed decreases.
theory of sampling – developed by Pierre Gy, a comprehensive approach to representative sampling of bulk particulate materials (including soil) that includes a complete analysis of the sources of variability in sample results and the representativeness of the sampling methods, procedures, and equipment used. The theory covers at least seven distinct ways that heterogeneous particulate materials affect sampling integrity. The resulting variability, bias, and non-representativeness of data are collectively termed sampling errors. Gy’s sampling errors originate from three general sources: the material being sampled, the effectiveness of the sampling equipment, and whether the sampling procedures use that equipment correctly (Minkkinen and Esbensen 2018).
total sampling error – the cumulative error or variability from all stages of the sampling, processing, and analytical steps.
upper confidence limit (UCL) – a statistical way to derive an upper estimate of the mean. The UCL is calculated by adding a “safety factor” to the mean obtained from the sample set. The “safety factor” takes into account the number of samples used in the calculation, the variability in the sample results, and the desired level of confidence that the estimate of the mean does not underestimate the true mean. Different mathematical formulas for UCL calculation depend on the statistical distribution of the data and the level of confidence desired that the UCL is above the true mean.
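As one common example (assuming approximately normally distributed data; other formulas apply to other distributions), the Student's-t 95% UCL is UCL_{95} = \bar{x} + t_{0.95,\,n-1}\,\frac{s}{\sqrt{n}}, where \bar{x} is the sample mean, s the sample standard deviation, n the number of samples, and t_{0.95,\,n-1} the 95th percentile of Student's t distribution with n-1 degrees of freedom; the added term is the “safety factor” described above.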
upper confidence limit, 95% – the calculated statistical value that we are 95% confident is at or above the true value of the mean.
Visual Sample Plan – a free software program developed by Pacific Northwest National Laboratory that supports the development of defensible sampling plans (including multiple increment sampling approaches) based on statistical sampling theory and the statistical analysis of sample results to support confident decision-making.