155  NONSTANDARD TERMS  2004 Jul 13  MATH 155 : STATISTICS  Dr. Luft

Certain terms in Statistics by Power are nonstandard.
Most of the nonstandard terms are introduced because no standard terms exist for the objects we need to discuss, but some terms are taken or adapted from MINITAB.

The subject of statistics consists largely of procedures to describe a data set or to draw inference about a population from characteristics of a sample.  Some of these procedures are long and complicated, and some of the steps and objects have no standard name.  This book introduces nonstandard terms for many of these steps and objects, to clarify discussion of the procedures.  Many chapters will summarize which terms in the chapter are nonstandard.

Chapter Two

"Contingency table", "one-way", and "two-way" are all standard terms, but the adjectives "dependent" and "independent" are not usually applied to tables.  The non-standard term row fraction is a variation of MINITAB's non-standard term row percent.  The term difference table is descriptive but not standard.  The non-standard term difference table refers to  a table of cell-by-cell differences of two contingency tables having the same numbers of rows and columns: the first is a dependent table and the second is an associated independent table.  Simple skewness is not a standard term.

Chapter Three

The scores X1, X2, ..., Xn on n subjects chosen randomly from the same population all represent random variables.  If repetitions of the same subject are allowed (whether they occur or not), the above random variables are independent, and we describe this process with the term independent sampling, which is a non-standard term for sampling with replacement.  But if repetitions of the same subject are not allowed, the random variables are dependent, and we describe the process with the term dependent sampling, which is a nonstandard term for sampling without replacement.
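
The two kinds of sampling can be tried out with Python's standard random module.  This is only a sketch: the five-subject population, the sample size, and the seed are made up for illustration.

```python
import random

population = ["A", "B", "C", "D", "E"]   # hypothetical subjects
n = 3
rng = random.Random(155)                 # fixed seed so a rerun gives the same draws

# Independent sampling (with replacement): a subject may be drawn more than once.
independent_sample = rng.choices(population, k=n)

# Dependent sampling (without replacement): every drawn subject is distinct.
dependent_sample = rng.sample(population, k=n)
```

Repetitions are allowed in independent_sample whether or not one actually occurs; dependent_sample can never contain one.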

Any variable computed from the scores X1, X2, ..., Xn on n subjects is called a sample statistic.  The sampling distribution of such a sample statistic is the set of its levels (possible values), together with the probability that each level will occur in a random sample.  (In a given application, one must stipulate sampling with or without replacement.)  Some writers assume the population is infinite, but throughout this book, we assume that populations are finite.  In this chapter, we discuss the sampling distribution of the count of subjects in the sample which belong to the counted category.  This sample count may be considered the sum of 1's and 0's assigned to the subject scores according to whether the subjects do or do not belong to the counted category.  From this point of view, a binomial variable is both a sample count and a sample sum.  In Chapter Six we discuss the sampling distribution of the mean of (interval) scores.
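
The sampling distribution of a sample count can be tabulated exactly for a small finite population.  The sketch below assumes a population of N subjects, K of which belong to the counted category, and a sample of size n.  With replacement it uses the binomial formula, and without replacement the hypergeometric formula.  The function names and the numbers are hypothetical.

```python
from fractions import Fraction
from math import comb

def count_distribution_with_replacement(N, K, n):
    """Each level x of the sample count with its probability (binomial, p = K/N)."""
    p = Fraction(K, N)
    return {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

def count_distribution_without_replacement(N, K, n):
    """Each level x of the sample count with its probability (hypergeometric)."""
    return {x: Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))
            for x in range(n + 1)}

# Hypothetical population: 10 subjects, 4 in the counted category, sample of 3.
with_repl = count_distribution_with_replacement(10, 4, 3)
without_repl = count_distribution_without_replacement(10, 4, 3)
```

For this population the probability of a sample count of 0 is 27/125 with replacement and 1/6 without, which shows why one must stipulate the kind of sampling.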

"Population proportion" is an old-fashioned term which means the fraction of subjects in a population which fall into a certain category.  In traditional arithmetic, the words "proportion" and "fraction" are equivalent, but outside mathematics they can have different meanings.  Hence, we use the non-standard term population fraction to make clear we mean a number which is a fraction.

Chapter Four

In many statistical analyses, we distinguish subjects belonging or not belonging to a certain category; we give the population the non-standard name two-part population, and the category the non-standard name counted category.  Thus the non-standard term population fraction means the fraction of subjects in a two-part population which belong to the counted category.  Furthermore, the non-standard term sample fraction means the fraction of subjects in a sample which belong to the counted category.

In this book we do not use the classical critical value approach to hypothesis testing, but prefer the p-value approach, because it gives more information, and because the p-value is computed by modern computer programs for applied statistics.  However, describing the p-value approach requires many non-standard terms.

Some of these non-standard terms refer to events.  A statistical event is a statement about a sample, usually involving a statistic.  A statistical event using an equality is called a sample event; but one using an inequality is called a supporting event, provided its inequality is in the same direction as the inequality in the alternative hypothesis.

Other non-standard terms refer to the probabilities of certain events under certain conditions.  The p-value is the probability of a certain supporting event, under the assumption that the population has a certain feature.  Some non-standard synonyms are observed level of significance and Bradley Efron's actual level of significance.  The question "at what level is the evidence?" is a nonstandard request for the p-value.
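
To make the link between a supporting event and the p-value concrete, here is a small sketch.  It makes several assumptions: the alternative hypothesis says the population fraction is greater than the hypothetical population fraction, the sampling is independent (so the sample count is binomial), and the supporting event is "the sample count is at least the observed count".  The function name and the numbers are hypothetical.

```python
from fractions import Fraction
from math import comb

def p_value_upper(n, observed_count, hypothetical_fraction):
    """Probability that the sample count is >= observed_count,
    assuming the population fraction equals the hypothetical fraction."""
    p0 = Fraction(hypothetical_fraction)
    return sum(comb(n, x) * p0**x * (1 - p0)**(n - x)
               for x in range(observed_count, n + 1))

# Sample event: count = 8 in a sample of 10.  Supporting event: count >= 8.
p_value = p_value_upper(10, 8, Fraction(1, 2))
```

Here the p-value is 7/128, or about 0.055.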

The hypothetical value of a parameter is the number which appears in the hypotheses concerning that parameter.  Thus we may speak of the hypothetical population fraction, median, or mean.

The contingency table statistic is the well-known quantity (computed from a contingency table) which is asymptotically chi-square distributed.  We do not call it chi-square, because its distribution often deviates significantly from that of chi-square, in a way that can mislead us to think the p-value is smaller than it is.
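
The sketch below computes such a quantity, assuming the one meant is the usual sum over cells of (observed - expected)^2 / expected, with each expected count equal to (row total)(column total)/(grand total).  The function name and the counts are hypothetical, and every row and column total is assumed to be positive.

```python
from fractions import Fraction

def contingency_table_statistic(table):
    """Sum over all cells of (observed - expected)**2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    total = Fraction(0)
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = Fraction(row_totals[i] * col_totals[j], grand)
            total += (observed - expected) ** 2 / expected
    return total

statistic = contingency_table_statistic([[30, 10], [20, 40]])
```

For these made-up counts the statistic is 50/3, or about 16.67.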

Chapter Six

In computing the Wilcoxon signed-rank sum for a matched pairs design, we perform two subtractions.  First we subtract corresponding scores within pairs, obtaining a preliminary set of differences on pairs.  Then from each of the preliminary differences we subtract the hypothetical difference of population means to obtain the final differences.
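
The two subtractions can be sketched as follows.  The scores and the hypothetical difference are made up, and the direction of the first subtraction (first score minus second score within each pair) is an assumption.

```python
def final_differences(pairs, hypothetical_difference):
    """Subtract within pairs, then subtract the hypothetical difference of means."""
    # First subtraction: preliminary differences on pairs.
    preliminary = [first - second for first, second in pairs]
    # Second subtraction: remove the hypothetical difference of population means.
    return [d - hypothetical_difference for d in preliminary]

pairs = [(12, 9), (15, 14), (8, 10), (20, 13)]   # hypothetical matched pairs
final = final_differences(pairs, 2)
```

The preliminary differences are 3, 1, -2, and 7, so the final differences are 1, -1, -4, and 5.  The signed-rank sum is then computed from the final differences.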
 


Chapter Two
§1.3  two-way table
§1.6  row fraction                   inspired by MINITAB's row percent
§1.7  dependent table
§1.7  associated independent table
§1.7  difference table (a difference of two contingency tables)
§3.3  simple skewness

Chapter Three
§3.3  independent sampling        sampling with replacement
§3.3  dependent sampling          sampling without replacement
§???  population fraction
§???  EVENT

Chapter Four
§1.1  sample fraction
§1.1  counted category
§1.3  two-part population
§2.1  at what level is the evidence
§2.1  observed level of significance §2.3?
      statistical event
      sample event
§2.2  supporting event
§4.1?/4.2?/5.1   contingency table statistic
§???  hypothetical population fraction

Chapter Five
§???  hypothetical mean

Chapter Six
§???  hypothetical mean
§1    final differences