Subject: Re: How Should We *Motivate* Students in Intro Stat?

To: EdStat-L Statistics Education Discussion List,
sci.stat.edu Usenet Newsgroup

From: Donald B. Macnaughton <donmac@matstat.com>
(formerly donmac@hookup.net)

Date: Sunday April 6, 1997
slightly revised on June 1, 1997

cc: John R. Vokey <vokey@hg.uleth.ca>

SHOULD WE NARROW THE FOCUS OF THE INTRODUCTORY COURSE?
Referring to an earlier post of mine (1997), John Vokey (1997)
suggests that

>    ( snip )
> narrowing the focus [of the introductory statistics course] to
> application and use of statistics in a particular discipline
> (or sub-discipline) rather than the broad (but often thereby
> shallow) coverage more usually attempted would help with stu-
> dent motivation

John uses the concept of the *focus* of the introductory course.
It is useful to distinguish between two senses of "focus":
- the focus of the *examples* we use in the course
- the focus we place on the *underlying statistical concepts*.

I believe we should (when possible) narrow the focus of the exam-
ples.  In particular, we can substantially heighten student in-
terest if we select the examples in an introductory course from
the students' chosen field of study.  In this sense of "focus"
(which I suspect is John's intended sense) I agree that we should
narrow the focus of the introductory course.

On the other hand, I doubt we can narrow the focus we place on
the underlying statistical concepts because, as I discuss in the
earlier post, the underlying concepts appear to be the same in
all the different fields.

SO WHAT IF STATISTICS IS THE SAME IN ALL FIELDS?
John asserts the sameness of the underlying statistical concepts
in different fields and then asks a question:

>    ( snip )
> Clearly, at some (possibly very) abstract level *most* (as I
> think Macnaughton implies with his "schema") of what is thought
> of as statistics is the same in different fields.  But what of
> it?

I believe we can use the sameness to *unify* the use of statis-
tics in empirical research, and thereby give students a substan-
tially better understanding of the role of statistics in empiri-
cal research.

In a paper (1996) I discuss an approach to using the sameness to
help students understand the role of statistics.

SO WHAT IF SCIENCE IS THE SAME IN ALL FIELDS?

> At some abstract enough level, all science is the same, too
> (although whether Popperian, Baconian, etc. is still a source
> of much discussion).  But, again, what of it?

I believe all science is simply the study of
- entities
- relationships between entities
- properties of entities (variables) and
- relationships between properties.

(Most contemporary science seems to emphasize relationships be-
tween properties.)

As I discuss further in the paper, presenting this unifying view
of science gives students both a broad overview and a deep in-
sight into the role of statistics in all branches of empirical
(scientific) research.

WHAT DO WE TEACH?

> We don't teach "science", we teach physics, biology, chemistry,
> psychology, and so on, and not just physics, but astrophysics,
> or high-temperature physics, or quantum physics (with each of
> these yet often further sub-divided into specialties) that dif-
> fer not just in content, but in *methods*, in how one thinks

I agree that at a lower level these fields all differ in content,
methods, problem approaches, questions, and answers.  However, I
believe that at a higher level there is an easy-to-understand
*unifying* point of view.  Under the unifying point of view all
these fields (and all the other fields of science) study exactly
the same set of concepts.  That is, all the fields appear to
study entities, relationships between entities, properties of en-
tities, and relationships between properties.  This seems to be
all that the various fields of science and empirical research do,
both in theory and in practice.

(My claim in the preceding paragraph is a sweeping generaliza-
tion, which is indeed a risky approach.  To make things more dif-
ficult for me, I am unable [for logical reasons] to directly
*prove* the generalization.

(On the other hand, if the generalization is false, a person can
easily demonstrate its falsity by citing concrete examples of em-
pirical research that cannot easily be viewed as studying enti-
ties, properties [variables], or relationships.  I invite inter-
ested readers to submit examples that appear to falsify the gen-
eralization.)

If all the fields of empirical research can be reasonably unified
in terms of a few simple concepts, and if the use of statistics
in empirical research can also be unified in terms of these con-
cepts, we should teach these concepts to students at the begin-
ning of the introductory statistics course.

POSSIBLE FALSIFYING EXAMPLES
John refers to some examples that might falsify the generaliza-
tion:

> Some of these methods are statistical, and, yes, they do differ
> between areas in emphasis, preferred approaches, conventional
> usages, and so on.  So much so, that what passes for statisti-
> cal training in one area is often unrecognizable (without a lot
> of translation work and an understanding of "just how things
> are done" in that area) from another.

For John's putative falsifying examples to have credence, they
must be more specific.  Are there some statistical concepts or
methods that are glaringly different in different fields (areas)
of empirical research?  That is, are there some statistical con-
cepts or methods that we cannot easily view as studying proper-
ties or relationships between properties of entities?  (I am not
aware of any.)

(Some fields [e.g., statistical mechanics] have very complicated
mathematical model equations for the relationships between the
variables that they study.  However, the complexity of these
equations should not be allowed to hide the fact that *all* these
equations are merely statements of relationships between vari-
ables [relationships between properties of entities].  Further-
more, the model equations in science are diverse as one moves
from field to field.  However, since these complicated and di-
verse model equations are all merely statements of relationships
between variables, neither their complexity nor their diversity
can be cited as evidence that the use of statistics differs [at a
high level] in different fields.)

THE "CORRECT" INTERPRETATION OF EMPIRICAL RESEARCH

> Furthermore, just to take a specific issue, just because most
> statistics CAN be placed in a random-sampling, parameter esti-
> mation context, doesn't mean that in all, many or even more
> than some (or at most only some of the time) of the areas that
> parameter estimation is the goal of the statistical enterprise,
> or that random sampling is the statistical underpinning of the
> inferential process.

Although John expresses it quite differently, I think his sen-
tence above makes a slightly less general version of the follow-
ing point:

     Just because we *can* interpret a particular empirical
     research project in terms of certain statistical concepts
     does not mean that the *correct* interpretation of the
     research project is in terms of those concepts.

If this is John's point, I would like to suggest that in science
there is *no* correct interpretation or goal of a phenomenon or
research project, because the idea of a single correct
interpretation is absolutist, and science is continually
changing.  Instead of trying to find the "correct"
interpretation, most scientists simply choose the interpretation
(and goal) that currently
1. provides the most accurate predictions or controls and
2. is easiest to understand.

Nothing beyond these two factors seems to have much sway in
choosing the contemporary point of view of phenomena in any field
of empirical (scientific) research.

(Of course, cultural momentum and the politics of science also
play roles in the adoption of the contemporary point of view.
However, these are secondary factors and always give way in the
end to the factors of accuracy and ease of understanding.)

Using an entity-property-relationship approach to interpret em-
pirical research maximizes accuracy and ease of understanding be-
cause
1. an entity-property-relationship approach relies on the
   standard procedures of statistics, which can be shown to make
   the most accurate predictions and controls possible under
   various reasonable definitions of the word "accurate", and
2. an entity-property-relationship approach is easy to understand
   because it unifies empirical research and statistics in terms
   of a few simple concepts that are mostly already intuitive in
   students' thought.

DOES THE SCHEMA INCORPORATE THEORY-CORROBORATIVE RESEARCH?

>    ( snip )
> With specific reference to Macnaughton's interesting "schema",
> I fail to see how his claim for a universal system incorporates
> theory-corroborative research (which I take to be the bulk of
> science).  There the goal is the testing of theoretical impli-
> cations, *not* directly with statistical prediction and con-
> trol.

I agree with John's point that most scientific research can be
usefully interpreted as being "theory-corroborative".  I view
theory-corroborative research as consisting of the following four
steps:
1. A scientist postulates an extension to some existing theory.
2. An implication of the extension to the theory is inferred that
   should be observable in the real (external) world.
3. A research project is performed to see if the implication can
   actually be found in the real world.
4. If the implication is found, and if there is no reasonable
   alternative explanation for the finding, the particular
   extension to the theory is said to be confirmed (corroborated)
   and will (in time) become generally accepted.

Note that this process applies to research in all branches of
science from the most abstract (e.g., wave mechanics) to the most
practical (e.g., medical research).  For example, in medical re-
search a scientist might postulate (perhaps on the basis of labo-
ratory research) that a particular drug has an effect on AIDS.
The implication of this hypothesis is that if you give AIDS
patients the drug, they will get better.  (That is, the implication
is that in AIDS patients there is a relationship between the
variables "amount of the drug" and "amount of AIDS".)  We can
test this implication with an appropriate formal experiment.  In
deciding whether to believe there is a relationship between the
two variables we can use the statistical procedure of analysis of
variance to help eliminate chance as an alternative explanation
of the results of the experiment.
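
To make the analysis-of-variance step concrete, here is a minimal
Python sketch.  Everything in it is an assumption made for
illustration and is not taken from any real study:  the three
dose groups, the made-up response scores, and the function names.
Instead of looking up the F distribution, the sketch judges
chance with a simple permutation test, which is a swapped-in
technique and not the classical F test itself.

```python
import random

# Hypothetical scores ("amount of disease") for three dose groups.
# These numbers are invented purely for illustration.
DOSE_GROUPS = [
    [8, 9, 7, 8],   # dose 0
    [6, 5, 7, 6],   # dose 10
    [4, 3, 5, 4],   # dose 20
]


def f_statistic(groups):
    """One-way ANOVA F ratio: between-group / within-group mean square."""
    values = [y for g in groups for y in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p_value(groups, n_shuffles=2000, seed=0):
    """Share of random relabelings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [y for g in groups for y in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Small tolerance so ties with the observed F are counted.
        if f_statistic(shuffled) >= observed - 1e-9:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    print("F ratio:", round(f_statistic(DOSE_GROUPS), 2))
    print("permutation p-value:", permutation_p_value(DOSE_GROUPS))
```

For these made-up scores the F ratio works out to 24.  A small
permutation p-value means that randomly relabeling the subjects
rarely produces group differences as large as those observed,
which is the sense in which chance is made less plausible as an
alternative explanation.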

As suggested earlier, I believe scientific theories are simply
statements about entities, properties, and relationships.  Simi-
larly, I believe the implications inferred from a theory in the
second step of theory-corroborative research are implications
about entities, properties, and relationships that should be ob-
servable in the real world.  Furthermore, and most importantly,
it appears that most theoretical implications that are tested in
theory-corroborative research can be easily and usefully viewed
in terms of prediction or control of the values of properties of
entities, usually on the basis of *relationships between* proper-
ties.

(As with the earlier generalization, I cannot prove the generali-
zations in the preceding sentence.  On the other hand, if either
generalization is false, a person can easily demonstrate the fal-
sity by citing a reasonably broad set of concrete counterexam-
ples.  That is, a reader can demonstrate falsity by citing either
1. concrete examples of theory-corroborative research projects
   that cannot easily be viewed in terms of prediction or control
   of values of properties of entities or
2. concrete examples of prediction or control of the values of
   properties that are not based on the study of relationships
   between properties.
[In case 2 I view studies of *univariate distributions* as a lim-
iting case of the study of relationships between properties--the
limit occurs when the number of predictor variables is reduced to
zero, leaving only the response variable to be studied (1996
appendix B2).  In appendix A below I discuss examples of some of
the (few) areas of scientific research that cannot easily be
viewed as studying relationships between properties.])
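
The limiting case of zero predictor variables can be illustrated
with a minimal Python sketch.  The data and the function names
are hypothetical.  With one predictor, least squares fits a line
to the response; with zero predictors, the least-squares
prediction of the response is simply its mean, which is a summary
of the univariate distribution.

```python
# Hypothetical data: one predictor (dose) and one response (score).
DOSE = [0, 10, 20]
SCORE = [8, 6, 4]


def fit_one_predictor(x, y):
    """Least-squares line y = a + b*x; returns (a, b)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_zero_predictors(y):
    """Least-squares constant: with no predictors, predict the mean."""
    return sum(y) / len(y)


if __name__ == "__main__":
    print("one predictor (intercept, slope):", fit_one_predictor(DOSE, SCORE))
    print("zero predictors (mean):", fit_zero_predictors(SCORE))
```

Dropping the predictor does not change the kind of question being
asked (what value of the response should we predict?); it only
removes the information on which the prediction can depend.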

In an important monograph about statistical hypothesis testing,
Chow (1996) discusses four types of empirical research projects,
giving most of his attention to research projects he calls
"theory-corroboration experiments".  Chow gives a concrete exam-
ple of each of the four types of research projects.  The theory-
corroboration example (and the other three examples) can easily
be viewed as studying a relationship between properties of enti-
ties, as I discuss further in appendix B.

Since most theory-corroborative research projects can be easily
viewed as studying relationships between properties of entities
(i.e., relationships between variables), and since the schema
interprets empirical research projects in terms of the concept of
a relationship between variables, the schema incorporates most
theory-corroborative research.

The ideas in this post are part of a broader discussion of an ap-
proach to the introductory statistics course available at

http://www.matstat.com/teach/

--------------------------------------------------------
Donald B. Macnaughton   MatStat Research Consulting Inc.
--------------------------------------------------------

APPENDIX A:  AREAS OF SCIENCE THAT DO NOT STUDY RELATIONSHIPS
BETWEEN PROPERTIES
I suggest above that *most* areas of scientific research can be
viewed as studying relationships between properties of entities.
One can find a few examples of research projects that do not
study relationships between properties in most areas of science.
In these examples the researchers are usually (always?) attempt-
ing to detect new entities, describe properties of entities, or
study relationships between entities, instead of studying rela-
tionships between properties.

One can find many examples of research projects that do not study
relationships between properties in the fields of
- molecular biology
- particle physics
- archeology
- paleontology.

Molecular biology and particle physics are young fields in the
sense that there are (apparently) still many new entities and
properties to be discovered.  Thus study of entities and proper-
ties seems to be more fruitful for researchers than study of re-
lationships between the properties.  However, as these young
fields mature, and as the rate of discovery of new entities and
properties (presumably) falls off, these fields will pay more at-
tention to relationships between properties, since such study en-
ables us to predict and control the values of properties, and
such ability is always of interest.  In the meantime, even in
these fields, measurement (of properties of entities) usually
lurks heavily in the background.  Thus one is never far from re-
lationships between properties.

Molecular biology studies three main groups of entities:
- physical objects (i.e., the many types of physical entities
  that make up the microscopic environment of the organism, such
  as cells, nuclei, proteins, and atoms)
- energy
- chemical reactions, which are a subclass of the entities we
  call processes.

Particle physics studies four main types of entities:
- particles
- waves
- energy
- reactions among particles, waves, and energy.

Archeology and paleontology are historical sciences in which the
primary goal is to describe and understand entities and proper-
ties of entities in the remote past.  Because these sciences deal
with the remote past, they usually cannot easily study relation-
ships between properties of entities, because it is hard to get
enough data.

APPENDIX B:  INTERPRETATION OF CHOW'S EXAMPLES OF FOUR TYPES OF
RESEARCH PROJECTS
As noted in the body of this post, Chow (1996) discusses four
types of empirical research projects and gives a concrete example
of each type.  (Although all of Chow's examples are from his own
field [experimental psychology], this does not limit the general-
ity of his discussion, since analogous examples can easily be
found in most other fields.)  In each of the examples it is rea-
sonable to view the main entities under study as being the human
subjects in the research project.

The following table identifies the response variable and predic-
tor variable in each of Chow's examples, thereby illustrating how
each example can be viewed as a study of a relationship between
properties of entities:
_________________________________________________________________

Type of                Type of     Definition of the Variable
Research Project       Variable    in Chow's Example
_________________________________________________________________

theory-corroboration   response    number of additional words
                                   recalled by subjects
                       predictor   number of grammatical
                                   transformations required for
                                   subjects to generate the
                                   surface structure of the
                                   sentence (pp 45-56)

utilitarian            response    test performance (i.e., test
                                   score) of subjects
                       predictor   teaching method (a binary
                                   variable with values "method
                                   E" and "method C", pp 57-59,
                                   2-3)

clinical               response    score of subjects on the C2
                       predictor   membership of subjects in
                                   category K (a binary variable
                                   with values "yes" and "no", pp
                                   59-61)

generality             response    subjects' performance score
                       predictor   subjects' amount of practice
                                   (pp 61-63)
_________________________________________________________________

REFERENCES
Chow, S. L. (1996) _Statistical Significance:  Rationale,
   Validity and Utility._  London:  Sage.

Macnaughton, D. B. (1996) "The Introductory Statistics Course:
   A New Approach."  Available at http://www.matstat.com/teach/

Macnaughton, D. B. (1997) "Response to Comments by Samuel M.
   Scheiner."  Posted to the sci.stat.edu Usenet newsgroup on
   February 23, 1997 under the title "Re: How Should We
   *Motivate* Students in Intro Stat?"  Available at
   http://www.matstat.com/teach/

Vokey, J. R. (1997) "Re: How Should We *Motivate* Students in
   Intro Stat?"  Posted to the sci.stat.edu Usenet newsgroup on
   February 23, 1997.  Available at
   gopher://jse.stat.ncsu.edu:70/7waissrc%3A/edstat/edstat
   (search for "How Should We" without the quotes).  Also
   available at http://www.dejanews.com (search the appropriate
   date range database using "~g sci.stat.edu and motivate intro
   stat" without the quotes).