1. **STATISTICAL SYLLOGISM** applies a statistical generalization
about a "reference class", F, to a specific member of the class, m. For
example, a statistical syllogism may go from 'Most Fs are Gs' and 'm is
an F' to 'm is a G' (positive form), or from 'Few Fs are Gs' and 'm is
an F' to 'm is not a G' (negative form). Instead of 'most' or 'few', the
statistical premise may use a phrase like 'all or most' or 'few if any'.
Often the quantifier is a quantitative one, such as '85% of all Fs are
Gs', or 'at most 20% of Fs are Gs'. If the statistical premise is obtained
by generalizing from a sample, it may specify a margin of error, as in
'80% +/- 5% of all Fs are Gs'.

The most obviously correct examples of statistical syllogism involve items selected at random from the reference class. If all you know about a particular marble is that it was selected at random from a large collection of marbles of which at least 95% were red, then it seems very reasonable to think that the marble is red. If all you know about a particular playing card is that it was dealt at random from a deck in which fewer than 8% of the cards were Aces, it seems reasonable to suppose that the card is not an Ace.
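The marble case can be checked with a short simulation. The collection size and trial count below are invented for illustration:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

# A hypothetical collection: 10,000 marbles, 95% of them red.
marbles = ["red"] * 9_500 + ["blue"] * 500

# Draw one marble at random, many times over, and record how often it is red.
trials = 100_000
reds = sum(random.choice(marbles) == "red" for _ in range(trials))

print(reds / trials)  # very close to 0.95
```

As expected, the proportion of red draws hovers around 95%, which is what makes the conclusion 'this marble is red' reasonable for any single random draw.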

In cases where nothing quite like random selection is involved, a statistical syllogism may still be in order if the individual in question can be regarded as a "typical" member of the reference class. If most German shepherd dogs are aggressive, and there is a typical-looking German shepherd coming down the street toward you now, you should probably expect it to be aggressive even though it was not randomly selected from all the German shepherds in existence.

One of the main sources of fallacy in statistical syllogism is an inappropriate
choice of reference class. Every individual belongs to indefinitely many
different classes, and sometimes statistics on two of the classes to which
an individual belongs are on opposite sides of the 50% mark. Most physicians
do not smoke, but most people who grew up on tobacco farms do smoke. If
we know that Fritz is a physician who grew up on a tobacco farm, it would
be fallacious for us to conclude either that he probably doesn't smoke
(because he's a physician) or that he probably does (because he grew up
on a tobacco farm). Each reference class is inappropriate because Fritz
is also known to belong to the other. If we happen to have statistics on
the intersection of these two reference classes (that is, on physicians
who grew up on tobacco farms), we should use that as the reference class
and draw the conclusion (if any) supported by those statistics. If not,
we should avoid using statistical syllogism to arrive at a conclusion about
whether Fritz smokes. Otherwise we will be committing the fallacy of **ignoring
relevant evidence**.
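The logical point can be made with a few lines of arithmetic. The smoking rates below are invented for illustration, not real statistics:

```python
# Invented rates for the two reference classes Fritz belongs to.
p_smoke_given_physician = 0.15    # most physicians do not smoke
p_smoke_given_farm_raised = 0.70  # most people raised on tobacco farms do

# Each class, taken alone, supports a statistical syllogism, but the two
# conclusions contradict each other:
says_nonsmoker = p_smoke_given_physician < 0.5   # negative form
says_smoker = p_smoke_given_farm_raised > 0.5    # positive form
assert says_nonsmoker and says_smoker

# Nothing in these two figures fixes the rate for the intersection
# (physicians raised on tobacco farms): it could fall on either side of
# 50%, which is why neither class alone is a safe reference class.
```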

Note that it doesn't take actual statistics on an alternative reference
class to make a given reference class inappropriate. As long as there is
reason to suspect that the statistics are quite different for another class to which the individual
belongs, it is unsafe to assume that the individual is a "typical" member
of the class for which we do have statistics. If a study showed that most
cocker spaniels are docile, it would not be safe to conclude that Laddie,
an abused cocker spaniel, is docile.
The information that Laddie has been abused gives us reason to doubt
that he is a behaviorally "typical" cocker spaniel, even if we don't have
actual statistics on
the docility of abused cocker spaniels. If we ignore this information
and make the inference anyway, we will be **ignoring relevant evidence**.

2. **GENERALIZATION** goes from statistics on specific members (the
"sample") of a class (the "population") to a statistical conclusion about
the population as a whole. The premises state the "composition" of the
"sample" with respect to a certain attribute, that is, the proportion of
sample members that have the attribute, and the conclusion states that
the population has the same or about the same composition as the sample.
For example, pollsters may go from the premise that 63% of a large, random
sample of voters prefer candidate A to the conclusion that between 60%
and 66% of all voters prefer candidate A. Quality control experts may go
from the premise that 95% of the whatsits they randomly selected and tested
were OK to the conclusion that about 95% of the whatsits in the lot from
which their sample was taken were OK.
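The pollsters' step from 63% to "between 60% and 66%" matches the standard normal-approximation margin of error. The sample size below is an assumption (the passage doesn't give one); 1,000 respondents is a common figure:

```python
import math

# Normal-approximation 95% margin of error for a sample proportion.
p_hat = 0.63  # observed sample proportion
n = 1000      # assumed sample size (not stated in the passage)
z = 1.96      # z-score for a 95% confidence level

margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin
print(f"{low:.1%} to {high:.1%}")  # prints "60.0% to 66.0%"
```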

Generalization is the subject of oddly conflicting attitudes. Many people
are quite willing to generalize on the basis of a few cases or even a single
case, thereby committing the fallacy of **small sample** or **hasty
generalization**. Others (or perhaps in some cases the same people)
are very suspicious of the conclusions statisticians draw from much larger
numbers of cases. How, they wonder, can anyone claim to divine the opinions
of 100 million people from interviews with a couple thousand? How can anyone
declare that candidate A will win when only 5% of the votes have been counted?
Although such questions are often intended to be rhetorical, the supposedly
obvious answer that "It can't be done!" is false. In cases where the "sample"
is randomly selected from the population, the generalization from sample
to population can be backed up by a complex argument in which all but one
of the steps are deductive, and the one nondeductive step is an intuitively
correct statistical syllogism.

The foundation of the reasoning lies in mathematics, which is deductive.
The relevant mathematical result, qualitatively stated, is that **most
of the large samples that could possibly be drawn from a population have
compositions close to that of the population**. More specific results
can be given for particular sample sizes. For instance, it can be shown
mathematically that at least 95% of all possible samples of size 1600 (a
typical size for professional polls) have compositions within about
2.5% of the composition of the population. It doesn't matter whether
the population size is 100 thousand, 100 million, or 100 trillion: the
ratio of matching samples to possible samples is essentially constant for
populations much larger than the sample. Moreover, the margin of error
can be reduced by increasing the sample size. At least 95% of all possible
samples of size 6400 have compositions within about 1.25% of the composition
of the population. Each quadrupling of the sample size reduces the margin
of error by half. (This assumes a "confidence level" of 95%, the usual
figure for such studies: if the margin of error stated in the conclusion
is held constant, increasing the sample size increases the confidence level.)
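The figures quoted above can be reproduced with the normal approximation to the sampling distribution of a proportion. A worst-case population split of 50/50 is assumed, since it maximizes the sampling spread:

```python
from math import sqrt
from statistics import NormalDist

def fraction_within(margin, n, p=0.5):
    """Approximate fraction of all possible size-n samples whose composition
    lies within `margin` of the population's (normal approximation).
    p = 0.5 is the worst case: it gives the largest sampling spread."""
    se = sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    return 2 * NormalDist().cdf(margin / se) - 1

# About 95% of size-1600 samples land within 2.5 points of the population...
print(round(fraction_within(0.025, 1600), 3))   # 0.954
# ...and quadrupling the sample size halves the margin at the same level.
print(round(fraction_within(0.0125, 6400), 3))  # 0.954
```

Note that the population size never enters the calculation, reflecting the point that the ratio of matching samples to possible samples is essentially constant for populations much larger than the sample.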

The statistical syllogism underlying a generalization goes something like this:

- Most possible large samples of population P have compositions close to P's.
- S is a randomly selected large sample of population P.
- So S's composition is close to P's.

The claim that P's composition is close to S's follows deductively from the above conclusion.

We can determine by observation what S's composition is and reason deductively as follows:

- S's composition is C [established by observation]
- P's composition is close to S's [established by statistical syllogism and deduction]
- So P's composition is close to C.

Thus we can infer the approximate composition of an arbitrarily large population from the composition of a random sample that is large in absolute terms, even if it is small relative to the population, by reasoning in which the only non-deductive step is a statistical syllogism.

In cases where a truly random sample cannot be obtained, we need grounds
for assuming that our method of sampling is not biased. A sample is truly random only
when all members of the population have the same chance of being selected
for inclusion in the sample. Typically this is not the case. Some members
of the population will have a greater than average chance of being selected
and others a less than average chance for reasons related to our method
of sampling. The composition of the sample can be safely projected onto
the entire population only if it is safe to assume that the characteristics
that determine an individual's chance of being selected for the sample
are unrelated to the attribute we are studying. In a political preference
poll, selecting our sample from among homeowners will normally introduce
a bias that makes the sample worthless because homeowners are on average
wealthier than non-homeowners, and the more wealthy tend to be more conservative
than the less wealthy. The use of such a **biased sample** is a fallacy
regardless of how large the sample is, just as the use of a **small sample** is a fallacy even if the members were randomly chosen.
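A toy simulation shows why a bigger sample doesn't cure bias. All the population figures here are invented for illustration:

```python
import random

random.seed(1)  # reproducible illustration

# Invented electorate: 60% homeowners, of whom 65% are conservative;
# 40% non-homeowners, of whom 35% are conservative.
population = (
    [("homeowner", random.random() < 0.65) for _ in range(60_000)]
    + [("renter", random.random() < 0.35) for _ in range(40_000)]
)
true_rate = sum(c for _, c in population) / len(population)  # about 0.53

# A large but biased sample: homeowners only.
homeowners = [c for status, c in population if status == "homeowner"]
biased_sample = random.sample(homeowners, 10_000)
biased_estimate = sum(biased_sample) / len(biased_sample)

# The biased estimate stays near the homeowners' rate (about 0.65),
# not the whole electorate's (about 0.53), however large the sample.
print(round(true_rate, 2), round(biased_estimate, 2))
```

Enlarging the biased sample would only give a more precise estimate of the wrong quantity: the homeowners' rate rather than the electorate's.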

3. **ANALOGY** draws a conclusion about one thing from a premise
about some other thing together with premises asserting other similarities
between the two things. Given that individuals a and b both have characteristics
F1, F2, etc., and that b also has G, we might conclude that a also has
G. The two key factors in analogy are the **extent of the relevant similarity** between a and b and the **absence of relevant differences** between
them. There is no point in counting similarities and differences. One similarity
or difference may be extremely relevant and many others almost entirely
irrelevant. The relevance of a similarity or difference must be judged
on the basis of general background knowledge, including especially our
well-established theories about things of the kind(s) we are reasoning
about, about the kinds of causal processes there are in the world, and
so forth. Assessing the strength of a good analogy is usually more difficult
than assessing the strength of a generalization or a statistical syllogism,
but bad analogies are often conspicuously bad, highlighting obviously irrelevant
similarities and/or ignoring relevant differences.

4. **SIMPLE INDUCTION** goes from premises about some members of
a class (a "sample") to a conclusion about some other member of the same
class. Thus simple induction goes from premises like those of a generalization
argument to a conclusion like that of a statistical syllogism, without
involving any statistical statement about the class as a whole. As in the
case of analogy, the premises and conclusion of a simple induction are
at the same level of generality, in contrast to statistical syllogism,
which goes from general to specific, and generalization, which goes from
specific to general. The difference is roughly this. In analogy, the strength
of the inference to the conclusion that a has G depends on a's having many
things in common (being an F1, an F2, etc.) with one other individual that
is known to have G. In simple induction it depends on a's having one thing
in common (being an F) with many other individuals most of which have G.

**COMPARISON 1: Generalization versus Statistical Syllogism**. These
two types of argument go in opposite directions. Generalization has a conclusion
that is more general than any premise, while statistical syllogism has
a premise that is more general than the conclusion. Examples:

| Generalization | Statistical Syllogism |
| --- | --- |
| My cat Sloth likes tuna | Most cats like tuna |
| Al's cat Zubin likes tuna | Sloth is a cat |
| etc. | |
| **So all or most cats like tuna** | **So Sloth likes tuna** |

| Generalization | Statistical Syllogism |
| --- | --- |
| Most of the Iowans I know like corn | Few dogs lack fleas |
| | Spot is a dog |
| **So most Iowans like corn** | **So Spot has fleas** |

**COMPARISON 2: Generalization versus Simple Induction**. Both of
these types of induction involve inference from the makeup or composition
of a sample. That is, what is given in the premises is that certain specific
members of a class, say A, have been observed, and that all, or most, or
a certain percentage of these observed As are Bs. The conclusion drawn
in generalization is that some correspondingly high (or low) proportion
of all As are Bs. In other words, a conclusion is drawn, on the basis of
the makeup of the sample, about the makeup of the entire class from which
the sample was taken. By contrast, in simple induction, a conclusion is
drawn about a single case that is in the class but not in the sample, either
that it is a B, or that it isn't. Examples:

| Generalization | Simple Induction |
| --- | --- |
| All of the cats I've known love fish | All of the cats I've known love fish |
| **So all or most cats love fish** | **So this new cat will love fish** |

| Generalization | Simple Induction |
| --- | --- |
| 80% of the students polled dislike pop quizzes | 80% of the students polled dislike pop quizzes |
| | Betsy is another student |
| **So about 80% of all students dislike pop quizzes** | **So Betsy dislikes pop quizzes** |

**COMPARISON 3: Statistical Syllogism versus Simple Induction**.
Both of these types draw a conclusion about a single case from premises
about a class. The difference lies in the relationship of that individual
to the class of cases described in the premise(s). If that class is one
to which the individual belongs, the argument is a statistical syllogism.
Otherwise, the argument is a simple induction. Examples:

| Statistical Syllogism | Simple Induction |
| --- | --- |
| Most apples are bland | Most of the apples I've tasted have been bland |
| This is an apple | This is another apple |
| **So this is bland** | **So this is bland** |

| Statistical Syllogism | Simple Induction |
| --- | --- |
| Most of Al's dates have been smart | Most of Al's dates have been smart |
| Al has dated Judi | Al is going to date Tracey |
| **So Judi is smart** | **So Tracey is smart** |

**COMPARISON 4: Simple Induction versus Analogy**. In analogy, as
in simple induction, the conclusion is about one item (call it a). But
in analogy, the premises are also about a single item (b) that is said
to be similar in certain ways to a, and to be (let's say) a G. The conclusion
then says that a is also a G. In simple induction, the conclusion that
a is a G is based on the fact that numerous other items that, like a, are
Fs, are also Gs (or that most of them are). So the difference between simple induction and analogy is that between basing the claim that a is G on many other similar cases (simple induction) and basing it on one very similar case (analogy). Examples:

| Analogy | Simple Induction |
| --- | --- |
| Jim and Bill have similar interests, talents, aptitudes, habits, personalities, etc. | Most of the student athletes I know are liked by their classmates |
| Jim is liked by his classmates | Bill is another student athlete |
| **So Bill is liked by his classmates** | **So Bill is liked by his classmates** |

| Analogy | Simple Induction |
| --- | --- |
| Chickadees are closely related to titmice and regularly eat the same kinds of seeds | All birds to whom I've offered peanut butter have eaten it |
| Titmice will eat peanut butter | I will offer peanut butter to chickadees |
| **So chickadees will eat peanut butter** | **So the chickadees will eat the peanut butter** |