EPR Approach: Response to Vokey

Subject: Re: How Should We *Motivate* Students in Intro Stat?

     To: EdStat-L Statistics Education Discussion List,
         sci.stat.edu Usenet Newsgroup

   From: Donald B. Macnaughton <donmac@matstat.com>
                      (formerly donmac@hookup.net)

   Date: Sunday April 6, 1997
         slightly revised on June 1, 1997

     cc: John R. Vokey <vokey@hg.uleth.ca>

SHOULD WE NARROW THE FOCUS OF THE INTRODUCTORY COURSE?
Referring to an earlier post of mine (1997), John Vokey (1997) 
suggests that

>    ( snip )
> narrowing the focus [of the introductory statistics course] to
> application and use of statistics in a particular discipline
> (or sub-discipline) rather than the broad (but often thereby
> shallow) coverage more usually attempted would help with stu-
> dent motivation

John uses the concept of the *focus* of the introductory course.  
It is useful to distinguish between two senses of "focus":
- the focus of the *examples* we use in the course
- the focus we place on the *underlying statistical concepts*.

I believe we should (when possible) narrow the focus of the exam-
ples.  In particular, we can substantially heighten student in-
terest if we select the examples in an introductory course from 
the students' chosen field of study.  In this sense of "focus" 
(which I suspect is John's intended sense) I agree that we should 
narrow the focus of the introductory course.

On the other hand, I doubt we can narrow the focus we place on 
the underlying statistical concepts because, as I discuss in the 
earlier post, the underlying concepts appear to be the same in 
all the different fields.


SO WHAT IF STATISTICS IS THE SAME IN ALL FIELDS?
John asserts the sameness of the underlying statistical concepts 
in different fields and then asks a question:

>    ( snip )
> Clearly, at some (possibly very) abstract level *most* (as I
> think Macnaughton implies with his "schema") of what is thought
> of as statistics is the same in different fields.  But what of
> it?  

I believe we can use the sameness to *unify* the use of statis-
tics in empirical research, and thereby give students a substan-
tially better understanding of the role of statistics in empiri-
cal research.  

In a paper (1996) I discuss an approach to using the sameness to 
help students understand the role of statistics.


SO WHAT IF SCIENCE IS THE SAME IN ALL FIELDS?

> At some abstract enough level, all science is the same, too
> (although whether Popperian, Baconian, etc. is still a source
> of much discussion).  But, again, what of it?  

I believe all science is simply the study of 
- entities
- relationships between entities
- properties of entities (variables) and 
- relationships between properties.  

(Most contemporary science seems to emphasize relationships be-
tween properties.)  

As I discuss further in the paper, presenting this unifying view 
of science gives students both a broad overview and a deep in-
sight into the role of statistics in all branches of empirical 
(scientific) research.


WHAT DO WE TEACH?

> We don't teach "science", we teach physics, biology, chemistry,
> psychology, and so on, and not just physics, but astrophysics,
> or high-temperature physics, or quantum physics (with each of
> these yet often further sub-divided into specialties) that dif-
> fer not just in content, but in *methods*, in how one thinks
> about problems in the area, asks questions, and formulates 
> answers.  

I agree that at a lower level these fields all differ in content, 
methods, problem approaches, questions, and answers.  However, I 
believe that at a higher level there is an easy-to-understand 
*unifying* point of view.  Under the unifying point of view all 
these fields (and all the other fields of science) study exactly 
the same set of concepts.  That is, all the fields appear to 
study entities, relationships between entities, properties of en-
tities, and relationships between properties.  This seems to be 
all that the various fields of science and empirical research do, 
both in theory and in practice. 

(My claim in the preceding paragraph is a sweeping generaliza-
tion, which is indeed a risky approach.  To make things more dif-
ficult for me, I am unable [for logical reasons] to directly 
*prove* the generalization.  

(On the other hand, if the generalization is false, a person can 
easily demonstrate its falsity by citing concrete examples of em-
pirical research that cannot easily be viewed as studying enti-
ties, properties [variables], or relationships.  I invite inter-
ested readers to submit examples that appear to falsify the gen-
eralization.)

If all the fields of empirical research can be reasonably unified 
in terms of a few simple concepts, and if the use of statistics 
in empirical research can also be unified in terms of these con-
cepts, we should teach these concepts to students at the begin-
ning of the introductory statistics course.


POSSIBLE FALSIFYING EXAMPLES
John refers to some examples that might falsify the generaliza-
tion:

> Some of these methods are statistical, and, yes, they do differ
> between areas in emphasis, preferred approaches, conventional
> usages, and so on.  So much so, that what passes for statisti-
> cal training in one area is often unrecognizable (without a lot
> of translation work and an understanding of "just how things
> are done" in that area) from another.  

For John's putative falsifying examples to be credible, they 
must be more specific.  Are there some statistical concepts or 
methods that are glaringly different in different fields (areas) 
of empirical research?  That is, are there some statistical con-
cepts or methods that we cannot easily view as studying proper-
ties or relationships between properties of entities?  (I am not 
aware of any.)    

(Some fields [e.g., statistical mechanics] have very complicated 
mathematical model equations for the relationships between the 
variables that they study.  However, the complexity of these 
equations should not be allowed to hide the fact that *all* these 
equations are merely statements of relationships between vari-
ables [relationships between properties of entities].  Further-
more, the model equations in science are diverse as one moves 
from field to field.  However, since these complicated and di-
verse model equations are all merely statements of relationships 
between variables, neither their complexity nor their diversity 
can be cited as evidence that the use of statistics differs [at a 
high level] in different fields.)


THE "CORRECT" INTERPRETATION OF EMPIRICAL RESEARCH

> Furthermore, just to take a specific issue, just because most
> statistics CAN be placed in a random-sampling, parameter esti-
> mation context, doesn't mean that in all, many or even more
> than some (or at most only some of the time) of the areas that
> parameter estimation is the goal of the statistical enterprise,
> or that random sampling is the statistical underpinning of the
> inferential process.

Although John expresses it quite differently, I think his sen-
tence above makes a slightly less general version of the follow-
ing point:

    Just because we *can* interpret a particular empirical 
    research project in terms of certain statistical concepts 
    does not mean that the *correct* interpretation of the 
    research project is in terms of those concepts.  

If this is John's point, I would like to suggest that in science 
there is *no* correct interpretation or goal of a phenomenon or 
research project, because the idea of a single correct 
interpretation is absolutist, and science is continually 
changing.  Instead of trying to find the "correct" 
interpretation, most scientists simply choose the interpretation 
(and goal) that currently
1. provides the most accurate predictions or controls and 
2. is easiest to understand. 

Nothing beyond these two factors seems to have much sway in 
choosing the contemporary point of view of phenomena in any field 
of empirical (scientific) research.  

(Of course, cultural momentum and the politics of science also 
play roles in the adoption of the contemporary point of view.  
However, these are secondary factors and always give way in the 
end to the factors of accuracy and ease of understanding.)

Using an entity-property-relationship approach to interpret em-
pirical research maximizes accuracy and ease of understanding be-
cause
1. an entity-property-relationship approach relies on the stan-
   dard procedures of statistics, which can be shown to make the 
   most accurate predictions and controls possible under various 
   reasonable definitions of the word "accurate"
2. an entity-property-relationship approach is easy to understand 
   because it unifies empirical research and statistics in terms 
   of a few simple concepts that are mostly already intuitive in 
   students' thought.


DOES THE SCHEMA INCORPORATE THEORY-CORROBORATIVE RESEARCH?

>    ( snip )
> With specific reference to Macnaughton's interesting "schema",
> I fail to see how his claim for a universal system incorporates
> theory-corroborative research (which I take to be the bulk of
> science).  There the goal is the testing of theoretical impli-
> cations, *not* directly with statistical prediction and con-
> trol.

I agree with John's point that most scientific research can be 
usefully interpreted as being "theory-corroborative".  I view 
theory-corroborative research as consisting of the following four 
steps:
1. A scientist postulates an extension to some existing theory.  
2. An implication of the extension to the theory is inferred that 
   should be observable in the real (external) world.  
3. A research project is performed to see if the implication can 
   actually be found in the real world.
4. If the implication is found, and if there is no reasonable al-
   ternative explanation for the finding, the particular exten-
   sion to the theory is said to be confirmed (corroborated) and 
   will (in time) become generally accepted.

Note that this process applies to research in all branches of 
science from the most abstract (e.g., wave mechanics) to the most 
practical (e.g., medical research).  For example, in medical re-
search a scientist might postulate (perhaps on the basis of labo-
ratory research) that a particular drug has an effect on AIDS.  
The implication of this hypothesis is that if you give AIDS 
patients the drug, they will get better.  (That is, the implication 
is that in AIDS patients there is a relationship between the 
variables "amount of the drug" and "amount of AIDS".)  We can 
test this implication with an appropriate formal experiment.  In 
deciding whether to believe there is a relationship between the 
two variables we can use the statistical procedure of analysis of 
variance to help eliminate chance as an alternative explanation 
of the results of the experiment.
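
As a rough illustration of the analysis of variance mentioned 
above, the following Python sketch computes the one-way F 
statistic for the relationship between "amount of the drug" (the 
predictor) and a response score.  The data, the dose labels, and 
the function name one_way_anova_f are all made up for the 
illustration; they do not come from any real study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (dose levels)
    n = len(all_values)    # total number of patients

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical (made-up) improvement scores at three dose levels.
scores_by_dose = {
    "no drug": [1, 2, 3],
    "low dose": [2, 3, 4],
    "high dose": [3, 4, 5],
}

f_statistic = one_way_anova_f(list(scores_by_dose.values()))
print(f_statistic)  # 3.0 for these made-up numbers
```

The F statistic by itself does not settle the question.  To judge 
whether chance is a reasonable explanation, we would compare it 
with the F distribution with k - 1 and n - k degrees of freedom 
(here 2 and 6), a step the sketch leaves out.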

As suggested earlier, I believe scientific theories are simply 
statements about entities, properties, and relationships.  Simi-
larly, I believe the implications inferred from a theory in the 
second step of theory-corroborative research are implications 
about entities, properties, and relationships that should be ob-
servable in the real world.  Furthermore, and most importantly, 
it appears that most theoretical implications that are tested in 
theory-corroborative research can be easily and usefully viewed 
in terms of prediction or control of the values of properties of 
entities, usually on the basis of *relationships between* proper-
ties.  

(As with the earlier generalization, I cannot prove the generali-
zations in the preceding sentence.  On the other hand, if either 
generalization is false, a person can easily demonstrate the fal-
sity by citing a reasonably broad set of concrete counterexam-
ples.  That is, a reader can demonstrate falsity by citing either
1. concrete examples of theory-corroborative research projects 
   that cannot easily be viewed in terms of prediction or control 
   of values of properties of entities or
2. concrete examples of prediction or control of the values of 
   properties that are not based on the study of relationships 
   between properties.  
[In case 2 I view studies of *univariate distributions* as a lim-
iting case of the study of relationships between properties--the 
limit occurs when the number of predictor variables is reduced to 
zero, leaving only the response variable to be studied (1996 
appendix B2).  In appendix A below I discuss some of the (few) 
areas of scientific research that cannot easily be viewed as 
studying relationships between properties.])
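
One way to picture the limiting case of zero predictor variables 
is the following Python sketch.  It assumes that "best 
prediction" means least squares, which may not be the formulation 
used in the cited appendix; the function names and the data are 
hypothetical.  With no predictor variables, the best single 
prediction of the response variable is simply its mean.

```python
from statistics import mean


def best_constant_prediction(ys):
    """Least-squares prediction when there are no predictor variables."""
    return mean(ys)


def sum_squared_error(ys, guess):
    """Total squared error if we predict `guess` for every entity."""
    return sum((y - guess) ** 2 for y in ys)


# Hypothetical values of a response variable for four entities.
response = [4, 7, 7, 10]

prediction = best_constant_prediction(response)
print(prediction)                               # 7
print(sum_squared_error(response, prediction))  # 18
print(sum_squared_error(response, 8))           # 22, worse than the mean
```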

In an important monograph about statistical hypothesis testing, 
Chow (1996) discusses four types of empirical research projects, 
giving most of his attention to research projects he calls 
"theory-corroboration experiments".  Chow gives a concrete exam-
ple of each of the four types of research projects.  The theory-
corroboration example (and the other three examples) can easily 
be viewed as studying a relationship between properties of enti-
ties, as I discuss further in appendix B.

Since most theory-corroborative research projects can be easily 
viewed as studying relationships between properties of entities 
(i.e., relationships between variables), and since the schema 
interprets empirical research projects in terms of the concept of 
a relationship between variables, the schema incorporates most 
theory-corroborative research.


LINK
The ideas in this post are part of a broader discussion of an ap-
proach to the introductory statistics course available at

               http://www.matstat.com/teach/

--------------------------------------------------------
Donald B. Macnaughton   MatStat Research Consulting Inc.
donmac@matstat.com      Toronto, Canada
--------------------------------------------------------


APPENDIX A:  AREAS OF SCIENCE THAT DO NOT STUDY RELATIONSHIPS 
BETWEEN PROPERTIES
I suggest above that *most* areas of scientific research can be 
viewed as studying relationships between properties of entities.  
In most areas of science one can find a few examples of research 
projects that do not study relationships between properties.  
In these examples the researchers are usually (always?) attempt-
ing to detect new entities, describe properties of entities, or 
study relationships between entities, instead of studying rela-
tionships between properties.

One can find many examples of research projects that do not study 
relationships between properties in the fields of
- molecular biology
- particle physics
- archeology
- paleontology.  

Molecular biology and particle physics are young fields in the 
sense that there are (apparently) still many new entities and 
properties to be discovered.  Thus study of entities and proper-
ties seems to be more fruitful for researchers than study of re-
lationships between the properties.  However, as these young 
fields mature, and as the rate of discovery of new entities and 
properties (presumably) falls off, these fields will pay more at-
tention to relationships between properties, since such study en-
ables us to predict and control the values of properties, and 
such ability is always of interest.  In the meantime, even in 
these fields, measurement (of properties of entities) usually 
lurks heavily in the background.  Thus one is never far from re-
lationships between properties.

Molecular biology studies three main groups of entities:
- physical objects (i.e., the many types of physical entities 
  that make up the microscopic environment of the organism, such 
  as cells, nuclei, proteins, and atoms)
- energy
- chemical reactions, which are a subclass of the entities we 
  call processes.

Particle physics studies four main types of entities:
- particles
- waves
- energy
- reactions among particles, waves, and energy.  

Archeology and paleontology are historical sciences in which the 
primary goal is to describe and understand entities and 
properties of entities in the remote past.  Because it is hard to 
get enough data about the remote past, these sciences usually 
cannot easily study relationships between properties of entities.  


APPENDIX B:  INTERPRETATION OF CHOW'S EXAMPLES OF FOUR TYPES OF 
RESEARCH PROJECTS
As noted in the body of this post, Chow (1996) discusses four 
types of empirical research projects and gives a concrete example 
of each type.  (Although all of Chow's examples are from his own 
field [experimental psychology], this does not limit the general-
ity of his discussion, since analogous examples can easily be 
found in most other fields.)  In each of the examples it is rea-
sonable to view the main entities under study as being the human 
subjects in the research project.

The following table identifies the response variable and predic-
tor variable in each of Chow's examples, thereby illustrating how 
each example can be viewed as a study of a relationship between 
properties of entities:
_________________________________________________________________

Type of                Type of     Definition of the Variable 
Research Project       Variable    in Chow's Example
_________________________________________________________________

theory-corroboration   response    number of additional words re-
                                   called by subjects
                       predictor   number of grammatical trans-
                                   formations required for sub-
                                   jects to generate the surface 
                                   structure of the sentence (pp 
                                   45-56)

utilitarian            response    test performance (i.e., test 
                                   score) of subjects
                       predictor   teaching method (a binary 
                                   variable with values "method 
                                   E" and "method C", pp 57-59, 
                                   2-3)

clinical               response    score of subjects on the C2
(discriminant)                     task
                       predictor   membership of subjects in 
                                   category K (a binary variable 
                                   with values "yes" and "no", pp 
                                   59-61)

generality             response    subjects' performance score
                       predictor   subjects' amount of practice 
                                   (pp 61-63)
_________________________________________________________________


REFERENCES
Chow, S. L. (1996) _Statistical Significance:  Rationale, 
   Validity and Utility._  London:  Sage.

Macnaughton, D. B. (1996) "The Introductory Statistics Course:  
   A New Approach."  Available at http://www.matstat.com/teach/

Macnaughton, D. B. (1997) "Response to Comments by Samuel M. 
   Scheiner."  Posted to the sci.stat.edu Usenet newsgroup on 
   February 23, 1997 under the title "Re: How Should We 
   *Motivate* Students in Intro Stat?"  Available at 
   http://www.matstat.com/teach/

Vokey, J. R. (1997) "Re: How Should We *Motivate* Students in 
   Intro Stat?"  Posted to the sci.stat.edu Usenet newsgroup on 
   February 23, 1997.  Available at 
   gopher://jse.stat.ncsu.edu:70/7waissrc%3A/edstat/edstat 
   (search for "How Should We" without the quotes).  Also 
   available at http://www.dejanews.com  Search the appropriate 
   date range database using "~g sci.stat.edu and motivate intro 
   stat" without the quotes.
