STAT/MATH 394, Probability I, covers the basic elements of probability theory. The goal is to provide a solid grounding in understanding and working with practical problems that involve some element of randomness. The topics covered include sample spaces, probabilistic experiments, probability axioms, cumulative distribution functions, and some common distributions.

Official Syllabus (Essentially the first part of this page).


For better or worse, I, Hoyt Koepke, am your instructor. I’m a PhD student in statistics. I did my masters in Computer Science and my undergrad in physics. My email is; I’m not on email all the time, but I will try to get back to you ASAP.


The official prerequisite is the calculus sequence, MATH 124, 125, and 126 (or equivalent). What you’ll need for this course is probably a subset of that, but known well. In particular, make sure you’re fluent in basic differentiation and integration. You also need to be comfortable working with common series (e.g. the geometric series), limits, and sets. If people request it or appear to have trouble on these topics, I’ll post links to review material on the website.


The book for the course is A First Course in Probability, 7th edition, by Sheldon Ross. If you have another edition, it covers the same material and will likely work, but it’s your responsibility to get the homework problems from a copy of this edition.


The final exam will be held on the last day of class, and the midterm will be July 2.

The exams will all be closed book, closed notes. I will, however, give out a one-page reference sheet that you can use. It will have a number of the important formulas and identities on it, and I’ll make sure you have it ahead of time for studying.

Also, at least half of the exam problems will come directly from the recommended problems (see below) with little or no modification.

NOTE: You do not need to bring a calculator to the exams. This is a change of mind on my part; apologies for that.


Because this course moves very quickly – we have only about four weeks! – homework will be assigned in almost every class and then collected the next class. This translates into about 10 homework sets, and I’ll drop the lowest one in calculating the final grade. Having homeworks due this frequently makes sure you stay on top of the material. Generally, the assignments will be short, only 2-4 graded problems, to also make sure you can have a life.

The exercises in the book are divided into two types, regular problems and self-assessment problems. The answers to all of the self-assessment problems and some of the regular problems are in the back of the book.

In addition to the graded problems, I’ll also recommend a few ungraded exercises with each homework. These will usually have answers in the back of the book, and I’ll generally take half or more of the exam questions from these. Understanding and working these problems is a great way to prepare for the exams.

Discussion Board

The course discussion board is for discussion of homeworks, concepts, and pretty much anything related directly to the material of the course. It is online at and requires a UW net ID to log on. It is restricted to this course; if you have trouble, send me an email.


The grade distribution will be as follows:

  • 25% Homework.
  • 30% (or 20%) Midterm.
  • 40% (or 50%) Final Exam.
  • 5% Class Participation / Extra Credit (may give additional credit).

For the exams, if you do better on the final than the midterm, it will count as 50% of your final grade and the midterm will count as 20%.

The class participation / extra credit grade will be based on how much above the “minimum” you participate in the course and how much I see you try to engage with probability theory. For example:

  • Asking questions in class / on the course discussion board.
  • Answering and explaining questions other students ask on the discussion board.
  • Sending me a link to a news article that deals with probabilities, along with a 2-5 sentence discussion/assessment of how the article handles them. You can talk about how well it is explained, whether they interpreted it correctly, if there’s room for people to hide something within their assessment, etc.
  • Other things I haven’t thought of yet...

Office Hours

Office hours will be from 10:45 - 11:20 MWF (right after class), and 12:30 - 2:00 on Tue/Thu (note change), and by appointment. My office is Padelford B-312 (Note that Padelford has 3 sections, A, B, and C; I’m in B).

Schedule (In Progress)

Here is the (tentative) schedule for the course. Anything that hasn’t yet happened may change. If I need to change anything that may potentially affect your grade (e.g. dropping/changing/modifying a homework problem), I’ll either email it out or announce it in class. Otherwise, it is your responsibility to keep track of the content here.

06/21 - Monday:

General course information, overview of probability theory, experiments, sample spaces, Venn Diagrams, set theory, and basic axioms of probability theory. Covers 2.1-2.3.

Homework 1, due 06/23 Wednesday: Chapter 2, problems 5, 8;
theoretical exercise 6 parts b,d,g, and h. (Note: you don’t need to do i).
Recommended Problems: Chapter 2, problem 1. Note: self-test
problem 3 is postponed until Wednesday.

Notes: For those of you who don’t have the book, or have a different edition, I’ve asked a copy to be put on reserve. I don’t think the material will have changed, but I think the problems change between editions. Until then, here they are:

HW 1 + Solutions

Due 06/23 Wednesday.

Problem 2.5, page 55.

A system is composed of 5 components, each of which is either working or failed. Consider an experiment that consists of observing the status of each component, and let the outcome of the experiment be given by the vector (x_1,x_2,x_3,x_4,x_5), where x_i is equal to 1 if component i is working and equal to 0 if component i has failed:

  1. How many outcomes are in the sample space of this experiment?
  2. Suppose that the system will work if components 1 and 2 are both working, or if components 3 and 4 are both working, or if components 1,3, 5 are all working. Let W be the event that the system will work. Specify all the outcomes in W.
  3. Let A be the event that components 4 and 5 are both failed. How many outcomes are contained in the event A?
  4. Write out all the outcomes in the event AW.


  1. Each component has 2 states, so there are 2^5 = 32 possible outcomes.

  2. You can see this as:

    W =
&\Set{(1,1,x_3,x_4,x_5)}{x_i \in \set{0,1}, i = 3,4,5} \\
&\cup\; \Set{(x_1,x_2,1,1,x_5)}{x_i \in \set{0,1}, i = 1,2,5} \\
&\cup\; \Set{(1,x_2,1,x_4,1)}{x_i \in \set{0,1}, i = 2,4}

    For notational convenience, let’s write (1,0,1,1,1) as 10111 and so on. Writing them all out then gives us

    W = &\set{11000, 11001, 11010, 11011, 11100, 11101, 11110, 11111} \\
&\cup \;\set{00110, 00111, 01110, 01111, 10110, 10111, 11110, 11111} \\
&\cup \;\set{10101, 10111,11101,11111}

    Combining these gives us

    W = \{00110, 00111, 01110, 01111, 10101, 10110, 10111, 11000, \\
11001, 11010, 11011, 11100, 11101, 11110, 11111\}

  3. In this case, A is just:

    A =
&\Set{(x_1,x_2,x_3,0,0)}{x_i \in \set{0,1}, i = 1,2,3}

    Writing this out gives

    A = \set{00000, 00100, 01000, 01100, 10000, 10100, 11000, 11100}

  4. The intersection between the two groups is just:

    AW = A \cap W = \set{11000, 11100}
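As a quick sanity check (not part of the assigned solution), the sample space is small enough to enumerate by brute force; here's a short Python sketch:

```python
from itertools import product

# Enumerate the 2^5 outcomes (x_1, ..., x_5) directly.
outcomes = list(product((0, 1), repeat=5))

def works(x):
    # Components 1,2 working, or 3,4 working, or 1,3,5 working.
    return bool((x[0] and x[1]) or (x[2] and x[3]) or (x[0] and x[2] and x[4]))

W = {x for x in outcomes if works(x)}
A = {x for x in outcomes if x[3] == 0 and x[4] == 0}  # components 4 and 5 failed

# 32 outcomes; |W| = 15, |A| = 8, and A ∩ W = {11000, 11100}, matching the above.
print(len(outcomes), len(W), len(A), sorted(A & W))
```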

Problem 2.8, page 56.

Suppose that A and B are mutually exclusive events for which P(A) = .3 and P(B) = 0.5. What is the probability that

  1. either A or B occurs?
  2. A occurs but not B?
  3. both A and B occur?


  1. A and B are mutually exclusive events, so we can just use axiom 3 to add their probabilities. Thus:

    P(A\cup B) = P(A) + P(B) = 0.3 + 0.5 = 0.8

  2. If A occurs, B doesn’t. So this is just P(A) = 0.3.

  3. P(A\cap B) = P(\emptyset) = 0 by definition of “mutually exclusive”.

Theoretical Exercise 2.6 (selected parts), page 61.

Let E, F, and G be three events. Find expressions for the following events in terms of E, F, and G:

  b. Both E and G occur but not F.
  d. At least two of the events occur.
  g. At most one of them occurs.
  h. At most two of them occur.


  b. E \cap G \cap F^c = EGF^c.
  d. (EF) \cup (EG) \cup (FG). Note that the parentheses are not needed.
  g. (EF \cup EG \cup FG)^c. This can be seen as the complement of part d.
  h. (EFG)^c. Think of this as “not all three events happen.”

06/23 - Wednesday

Continue with probability theory axioms, probabilities of events that are not mutually exclusive, basic combinatorics.

Course Reading: Section 2.4, the introduction to 2.5, and chapter 1.

Homework 1 is Due.


  • My apologies for being unclear on the bookshelf example and a few other aspects of choosing groups. I’ll spend the first part of class on Friday clarifying these things and giving a couple more examples. I did cover enough, though, that you should be able to do the homework (I’m working on writing them up for the webpage right now, and I’ll post some hints there).
  • With this homework, the recommended problems, including self-test problems 1.1 and 1.9, are easier than the homework and will help. The answers to some are in the back of the book.
  • For homework 1, I should be able to get this current set back to you on Monday, along with (hopefully) Friday’s. After that, the turnaround should be the next class.
  • The midterm is on Friday, July 2nd. I’ll get your cheat-sheet up on the website Monday, and also hand it out in class. Wednesday, we’ll have a review.
  • We didn’t get to the calculus self-test today; it will be posted below shortly.
  • Solutions to Homework 1 will be posted on Friday.

HW 2 + Solutions

Due 06/25 - Friday: Theoretical Exercise 2.12; Problems 2.12, 2.44, and 2.55.

Recommended Problems: Self-Test exercises 1.1, 1.9, and 2.3 (from Monday); Problems 2.10, 2.33.

Theoretical Exercise 2.12:

Show that the probability that exactly one of the events E or F occurs equals P(E) + P(F) - 2P(EF).


There are several ways to prove this, here’s one:

We’re interested in P(E^c F \cup F^cE). Now E^c F \cap F^cE = \emptyset, so the events are mutually exclusive. Thus

P(E^c F \cup F^cE) = P(E^c F) + P(F^c E) \quad \text{ by axiom 3} \qquad (1)


Now note that F = F \cap (E \cup E^c) = FE \cup FE^c

These are two mutually exclusive events, so

P(F) = P(FE) + P(FE^c)


Similarly, P(E) = P(FE) + P(F^cE)

Combining these with (1) above, we have that

P(E^c F \cup F^cE) = (P(F) - P(FE)) + (P(E) - P(FE)) = P(E) + P(F) - 2P(FE)
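The identity is easy to spot-check numerically on a small finite sample space; here is an illustrative Python sketch with two arbitrary (hypothetical) events under the uniform measure:

```python
from fractions import Fraction

S = set(range(12))          # uniform sample space
E = {0, 1, 2, 3, 4}         # arbitrary example events
F = {3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(S))

exactly_one = (E - F) | (F - E)   # symmetric difference: exactly one of E, F
assert P(exactly_one) == P(E) + P(F) - 2 * P(E & F)
print(P(exactly_one))
```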

Problem 2.12:

An elementary school is offering 3 language classes; one in Spanish, one in French, and one in German. These classes are open to any of the 100 students in the school. There are 28 students in the Spanish class, 26 in the French class, and 16 in the German class. There are 12 students that are in both Spanish and French, 4 in both Spanish and German, 6 in both French and German, and 2 taking all three.

  1. If a student is chosen randomly, what is the probability that he or she is not in any of these classes?
  2. If a student is chosen randomly, what is the probability that he or she is taking exactly one language course?
  3. If 2 students are chosen randomly, what is the probability that at least 1 is taking a language class?

Hints: Understand example 2.5l; also, on part (c), be careful about using the same set sizes to calculate the probability that the second student is taking a language class, since you’ve already chosen the first.


This problem is a straightforward application of the inclusion/exclusion formula for 3 events,

P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(AB) - P(AC) - P(BC) + P(ABC) \qquad (2)

  1. Let

E_S &= "Student in Spanish" \\
E_F &= "Student in French" \\
E_G &= "Student in German"


P(\text{``student in no class''})
&= 1 - P(``student in at least one class'') \\
&= 1 - P(E_S \cup E_F \cup E_G) \\
&= 1 - \Tbr{P(E_S) + P(E_F) +P(E_G) - P(E_S E_F) - P(E_S E_G) - P(E_F E_G) + P(E_S E_F E_G)} \\
&= 1 - \inv{100}\Tbr{28 + 26 + 16 - 12 - 4 - 6 + 2} \\
&= 1 - \frac{50}{100} = \inv 2

  2. Consider the Venn diagram of the three classes (not reproduced here), with the three pairwise-only overlap regions labeled 1, 2, and 3 and the triple overlap region labeled 4.

    Now consider P(E_S) + P(E_F) + P(E_G). This counts regions 1, 2, and 3 twice and region 4 three times. We need to remove the measure of the double-counted regions completely, so we subtract 2\Tbr{P(E_S E_F) + P(E_S E_G) + P(E_F E_G)}. However, this is again too much, as it subtracts the measure of region 4 – E_S E_F E_G – 6 times, requiring us to add it back in 3 times for 0 total. Our final formula is thus:

    &P(E_S) + P(E_F) +P(E_G) - 2\Tbr{P(E_S E_F) + P(E_S E_G) + P(E_F E_G)} + 3P(E_S E_F E_G) \\
&\quad = \inv{100}\Tbr{28+26+16-2(12+4+6) +3\cdot 2} \\
&\quad = \frac{32}{100} = 0.32

  3. The easiest way to do this is as follows:

&P(\text{``At least one student is taking a language class''}) \\
&\quad = 1 - P(\text{``Neither student is taking a language class''}) \\
&\quad = 1 -
 \frac{(\text{``\# students not taking language''})
            \times (\text{``\# students not taking language minus 1st student''})}
  {(\text{``\# students''}) \times (\text{``\# students minus 1st student''})} \\
&\quad = 1 - \frac{50\times 49}{100\times 99} \\
&\quad = \frac{149}{198}
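All three answers can be checked with exact rational arithmetic; here's a small Python sketch (purely a check, not a required method):

```python
from fractions import Fraction

N = 100
nS, nF, nG = 28, 26, 16            # class sizes
nSF, nSG, nFG, nSFG = 12, 4, 6, 2  # overlaps

# (a) inclusion/exclusion for P(at least one class), then complement
p_any = Fraction(nS + nF + nG - nSF - nSG - nFG + nSFG, N)
p_none = 1 - p_any

# (b) singles counted once, pairwise overlaps removed twice, triple rebalanced
p_exactly_one = Fraction(nS + nF + nG - 2 * (nSF + nSG + nFG) + 3 * nSFG, N)

# (c) complement of "neither of two randomly chosen students takes a language"
p_at_least_one = 1 - Fraction(50 * 49, 100 * 99)

print(p_none, p_exactly_one, p_at_least_one)  # 1/2 8/25 149/198
```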

Problem 2.44:

Five people, designated as A,B,C,D, and E, are arranged in linear order. Assuming that each possible order is equally likely, what is the probability that

  1. There is exactly one person between A and B?
  2. There are exactly two people between A and B?
  3. There are exactly three people between A and B?

Hints: Do part c. first, and think about representing groups as bins like we did in class. These bins, though, don’t need to be contiguous.


  1. The criterion defining the event E specifies that the arrangement has one of the forms AxByz, xAyBz, xyAzB, or the same with A and B reversed. Thus there are 6 ways to place A and B, and 3! ways to fill x, y, and z. Thus

    \sizeof{E} &= 6\cdot 3! \\
\So \Pof{E} &= \frac{6\cdot 3!}{5!} = \frac{36}{120} = \frac{3}{10}

  2. This is the same as part a, but now there are 4 ways of finding the location of A and B, namely AxyBz, xAyzB, BxyAz, and xByzA. Thus

    \sizeof{E} &= 4\cdot 3! \\
\So \Pof{E} &= \frac{4\cdot 3!}{5!} = \frac{24}{120} = \frac{1}{5}

  3. There are only two ways now, AxyzB and BxyzA. So

    \sizeof{E} &= 2\cdot 3! \\
\So \Pof{E} &= \frac{2\cdot 3!}{5!} = \frac{12}{120} = \frac{1}{10}
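With only 5! = 120 orderings, brute-force enumeration confirms all three answers (an illustrative check):

```python
from fractions import Fraction
from itertools import permutations

orders = list(permutations("ABCDE"))

def p_between(k):
    """P(exactly k people sit between A and B) under a uniform random order."""
    hits = sum(1 for o in orders if abs(o.index("A") - o.index("B")) == k + 1)
    return Fraction(hits, len(orders))

print(p_between(1), p_between(2), p_between(3))  # 3/10 1/5 1/10
```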

Problem 2.55:

Compute the probability that a hand of 13 cards contains:

  1. The ace and king of at least one suit.
  2. All 4 of at least 1 of the 13 denominations.

Hints: Understand examples 2.5f, 2.5g, and 2.5h. Solving this problem will involve breaking the events up into smaller events that you can calculate the probabilities of by counting how many orderings match these events.


This problem was presented in class. The final solutions are:

&\text{ a. } \qquad
   \frac{\nchoosem{4}{1}\nchoosem{50}{11}
    - \nchoosem{4}{2}\nchoosem{48}{9}
    + \nchoosem{4}{3}\nchoosem{46}{7}
    - \nchoosem{4}{4}\nchoosem{44}{5}}{\nchoosem{52}{13}} = 0.2198 \\
&\text{ b. } \qquad
   \frac{ \nchoosem{13}{1}\nchoosem{48}{9}
    - \nchoosem{13}{2}\nchoosem{44}{5}
    + \nchoosem{13}{3}\nchoosem{40}{1}}{\nchoosem{52}{13}} = 0.0342
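Both inclusion/exclusion sums are easy to evaluate with Python's `math.comb` (an illustrative numerical check):

```python
from math import comb

total = comb(52, 13)

# (a) ace AND king of at least one suit: inclusion/exclusion over the 4 suits
p_a = (comb(4, 1) * comb(50, 11) - comb(4, 2) * comb(48, 9)
       + comb(4, 3) * comb(46, 7) - comb(4, 4) * comb(44, 5)) / total

# (b) all 4 cards of at least one denomination: over the 13 denominations
p_b = (comb(13, 1) * comb(48, 9) - comb(13, 2) * comb(44, 5)
       + comb(13, 3) * comb(40, 1)) / total

print(round(p_a, 4), round(p_b, 4))  # 0.2198 0.0342
```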

Calculus Self-Test

The following four problems are similar to things we’ll need to do regularly in the material after the midterm.

  1. Calculate \int_1^a \inv{x^{\beta + 1}}\ud x, \quad \beta > 0, a \geq 1

  2. Calculate \int_0^\infty e^{-\lambda x} \ud x

  3. Let

    g(x) = \fOOO
{2 - x} {x\in\Icc{1,2}}

What is \int_{-\infty}^{\infty} g(x) \ud x?
  4. Calculate \sum_{n = 0}^\infty 3^{-n}.

I’ll present the solutions to these briefly on Friday.

06/25 - Friday

Finish up the introduction to combinatorics (chapter 1) with some more examples, clarify seeing experiments as groups or as orderings, present HW problem 2.55, and discuss identifiability issues.


  • Office hours have changed!!! They are now at 12:30 - 2:00 on both Tuesday / Thursday, as well as for the 40 minutes after each class. Those of you who can’t make the switch can still come at the earlier time on Tuesday.

HW 3 + Solutions

Due 06/28 - Monday: Problems 2.28, 2.37, 2.46.

Recommended Problems: Problems 1.9, 2.29 a and b, 2.35, and self-test 2.17.

Problem 2.28:

An urn contains 5 red, 6 blue, and 8 green balls. If a set of 3 balls is randomly selected, what is the probability that each of the balls will be (a) of the same color; (b) of different colors? Repeat under the assumption that whenever a ball is selected, its color is noted and it is then replaced in the urn before the next selection. The first is known as sampling without replacement, and the latter is known as sampling with replacement.

Hint: Understand examples 2.5b-e and problem 2.29 from the recommended problems first.


First consider the case of sampling without replacement. For (a), we can divide the event into 3 mutually exclusive events:

E_R &= \set{\text{All 3 balls are red.}} \\
E_B &= \set{\text{All 3 balls are blue.}} \\
E_G &= \set{\text{All 3 balls are green.}}

Then P(E_R) is just the probability that all three of the balls drawn are red, or just

P(E_R) &= \frac{\text{\# of ways to pick out 3 red balls}}
               {\text{\# of ways to pick out 3 balls}}
       = \frac{\nchoosem{5}{3}}{\nchoosem{19}{3}}

P(E_B) and P(E_G) are similar. Since they are mutually exclusive, we can just add them, so

P(\text{all balls are same color})
&= P(E_R) + P(E_B) + P(E_G) \\
&= \frac{\nchoosem{5}{3} + \nchoosem{6}{3} + \nchoosem{8}{3}}{\nchoosem{19}{3}} \\
&\simeq 0.0887513

For b, it’s even easier:

P(\text{one ball of each color})
&= \frac{\text{\# of ways to pick out 1 ball from each bin.}}
        {\text{\# of ways to pick out 3 balls}} \\
&= \frac{\nchoosem{5}{1}\nchoosem{6}{1}\nchoosem{8}{1}}{\nchoosem{19}{3}} \\
&= 0.247678

When sampling with replacement, the composition of the urn doesn’t change between draws. Thus there are now, e.g., 5^3 ways of choosing 3 red balls, and the size of the sample space is 19^3. This gives us, for part a,

P(\text{All 3 balls are red.}) &= \frac{5^3}{19^3} \\
\So P(\text{all balls are same color})
&= \frac{5^3 + 6^3 + 8^3}{19^3} \\
&= 0.124362

which is a little higher than for the case of sampling without replacement. This is what we’d expect, as not replacing a ball of a certain color makes drawing that color again less likely on the next try.

For b., the easiest way of doing it, without using conditional probability, is to divide the problem up into 3! mutually exclusive events. Let

E_{R1} &= \set{\text{Ball 1 is red.}} \\
E_{R2} &= \set{\text{Ball 2 is red.}} \\
&\vdots \\
E_{G3} &= \set{\text{Ball 3 is green.}}


P(\text{one ball of each color})
&= \underbrace{P(E_{R1}E_{B2}E_{G3}) + P(E_{R1}E_{G2}E_{B3}) + \cdots + P(E_{G1}E_{B2}E_{R3})}
_{\text{3! mutually exclusive events, all with the same probability.}} \\
&= 3! \times \frac{5}{19}\times \frac{6}{19}\times\frac{8}{19}\\
&= 0.209943

which is a little lower than the case of sampling without replacement. Again, this is what we’d expect.
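All four probabilities can be verified exactly in a few lines (an illustrative check):

```python
from fractions import Fraction
from math import comb

# Sampling without replacement
same_wo = Fraction(comb(5, 3) + comb(6, 3) + comb(8, 3), comb(19, 3))
diff_wo = Fraction(5 * 6 * 8, comb(19, 3))

# Sampling with replacement (ordered outcomes, sample space size 19^3)
same_w = Fraction(5**3 + 6**3 + 8**3, 19**3)
diff_w = Fraction(6 * 5 * 6 * 8, 19**3)   # 3! equally likely orderings

print(float(same_wo), float(diff_wo), float(same_w), float(diff_w))
```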

Problem 2.37:

An instructor gives her class a set of 10 problems with the information that the final exam will consist of a random selection of 5 of them. If a student has figured out how to solve 7 of the problems, what is the probability that he or she will answer correctly (a) all 5 problems or (b) at least 4 of the problems?


  1. This is just the number of groups of five possible out of the 7 prepared problems, divided by the total number of groups of five possible:

    \frac{\nchoosem{7}{5}}{\nchoosem{10}{5}} = \frac{21}{252} = \inv{12}
  2. For this, to make sure we don’t count overlapping events, we need to break it up into the mutually exclusive events of getting exactly five correct and exactly 4 correct. Thus there are

    \nchoosem{7}{5} + \nchoosem{7}{4}\nchoosem{3}{1}

    different outcomes from the \nchoosem{10}{5} possible outcomes. Thus the probability of getting at least 4 correct is:

    \frac{\nchoosem{7}{5} + \nchoosem{7}{4}\nchoosem{3}{1}}{\nchoosem{10}{5}} = \frac{126}{252} = \inv{2}
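A two-line check with exact arithmetic (illustrative only):

```python
from fractions import Fraction
from math import comb

p_all_five = Fraction(comb(7, 5), comb(10, 5))
p_at_least_four = Fraction(comb(7, 5) + comb(7, 4) * comb(3, 1), comb(10, 5))
print(p_all_five, p_at_least_four)  # 1/12 1/2
```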

Problem 2.46:

How many people have to be in a room in order that the probability that at least two of them celebrate their birthday in the same month is at least \onehalf? Assume that all possible monthly outcomes are equally likely.

Hint: understand example 2.5i.


The probability that there are at least 2 people in a group of n who have birthdays in the same month is just 1 minus the probability that all n have birthdays in distinct months. n people can fit distinctly into 12 slots in

12\times 11 \times \cdots \times (12 - n + 1)

different ways, and the size of the sample space – the number of ways to assign n people to 12 months with repetition allowed – is just 12^n, yielding the probability of a match being

P(\text{at least two birthdays in the same month})
&= 1 - \frac{12\times 11 \times \cdots \times (12 - n + 1)}{12^n}

Trying various values of n shows that when n=5, the probability of a match is 89/144, or about 0.618056. When n=4, this probability is only 0.427083, so n=5 is the answer.
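The search over n is mechanical; here's a short Python sketch:

```python
from fractions import Fraction

def p_shared_month(n):
    """P(at least two of n people share a birth month), months equally likely."""
    p_distinct = Fraction(1)
    for k in range(n):
        p_distinct *= Fraction(12 - k, 12)
    return 1 - p_distinct

n = 2
while p_shared_month(n) < Fraction(1, 2):
    n += 1
print(n, p_shared_month(n))  # 5 89/144
```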

06/28 - Monday

Conditional Probability, Independence, and the “baby” version of Bayes’ Formula. Basically sections 3.1-3.2, 3.4, and the basic idea of 3.3.


  • Midterm Exam on Friday! Be prepared.

HW 4 + Solutions

Due 06/30 - Wednesday: Problems 3.9, 3.10, 3.14, 3.62.

Recommended Problems: Problems 3.20, 3.37; Self-Test problems 3.9, 3.13, 3.18.

Problem 3.9:

Consider 3 urns. Urn A contains 2 white and 4 red balls; urn B contains 8 white and 4 red balls; and urn C contains 1 white and 3 red balls. If 1 ball is selected from each urn, what is the probability that the ball chosen from urn A was white, given that exactly 2 white balls were selected?


This problem requires us to use conditional probability. Let

W_A &= \set{\text{Ball 1 is white}}  \\
W_B &= \set{\text{Ball 2 is white}}  \\
W_C &= \set{\text{Ball 3 is white}}

and similarly for red. Then the event we’re conditioning on, F, is the event that there are exactly 2 white balls selected, i.e.

F = W_A W_B R_C \cup W_A R_B W_C \cup R_A W_B W_C.

We’re then interested in the probability

P(W_A \C F)
&= \frac{P(F \cap W_A) }{P(F)}


F\cap W_A &= (W_A W_B R_C \cup W_A R_B W_C \cup R_A W_B W_C) \cap W_A \\
     &= W_AW_BR_C \cup W_AR_BW_C


P(W_A \C  F)
&= \frac{P(W_A W_B R_C \cup W_A R_B W_C)}{P(W_AW_BR_C \cup W_AR_BW_C \cup R_AW_BW_C)}  \\
&= \frac{P(W_A W_B R_C) + P(W_A R_B W_C)}{P(W_AW_BR_C) + P(W_AR_BW_C) + P(R_AW_BW_C)}

where the last step comes from the events being mutually exclusive. Now the urns are independent, and these probabilities can be easily calculated. We thus have that:

P(W_A W_B R_C) &= \frac{2}{6}\times\frac{8}{12}\times\frac{3}{4} \\
P(W_A R_B W_C) &= \frac{2}{6}\times\frac{4}{12}\times\frac{1}{4} \\
P(R_A W_B W_C) &= \frac{4}{6}\times\frac{8}{12}\times\frac{1}{4} \\

Plugging in the numbers and simplifying gives the final solution, \sdfrac{7}{11}.
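The arithmetic can be double-checked with exact fractions (illustrative):

```python
from fractions import Fraction

pA, pB, pC = Fraction(2, 6), Fraction(8, 12), Fraction(1, 4)  # P(white) per urn

wwr = pA * pB * (1 - pC)        # white, white, red
wrw = pA * (1 - pB) * pC        # white, red, white
rww = (1 - pA) * pB * pC        # red, white, white

print((wwr + wrw) / (wwr + wrw + rww))  # 7/11
```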

Problem 3.10:

Three cards are randomly selected, without replacement, from an ordinary deck of 52 playing cards. Compute the conditional probability that the first card selected is a spade given that the second and third cards are spades.


There are several ways of doing this problem. First, one can note that the reduced sample space – formed by “fixing” two cards in the draw of three cards to be spades – is just a regular deck of cards minus 2 spades. (Convince yourself that it’s fine to condition on events that occur after the first card is dealt out.) The answer would then be

P(S_1 \C S_2 S_3) = \frac{\nchoosem{11}{1}}{\nchoosem{50}{1}} \simeq 0.22

It’s also possible to do this using the definition of conditional probability. In this case, we would have

P(S_1 \C S_2 S_3) &= \frac{P(S_1 S_2 S_3)}{P(S_2S_3)} \\
&= \frac{\nchoosem{13}{3}}{\nchoosem{52}{3}}
\quad \left/ \quad \frac{\nchoosem{13}{2}}{\nchoosem{52}{2}} \right. \\
&= \frac{11}{50}
which works out to be the same solution.
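Both routes can be checked numerically (an illustrative sketch):

```python
from fractions import Fraction
from math import comb

# Reduced sample space: 50 unseen cards, 11 of them spades.
direct = Fraction(11, 50)

# Definition of conditional probability: P(S1 S2 S3) / P(S2 S3)
ratio = Fraction(comb(13, 3), comb(52, 3)) / Fraction(comb(13, 2), comb(52, 2))

print(direct, ratio)  # 11/50 11/50
```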

Problem 3.14:

An urn initially contains 5 white and 7 black balls. Each time a ball is selected, its color is noted and it is replaced in the urn along with 2 other balls of the same color. Compute the probability that

  1. The first 2 balls selected are black and the next 2 white;
  2. Of the first 4 balls selected, exactly 2 are black.


a. We’ll do this using conditional probability, though it could also be done in a way similar to part (b).

We want to find P(B_1B_2W_3W_4). Using the multiplication rule which we discussed earlier, we can expand this as:

P(B_1B_2W_3W_4) &= P(B_1)P(B_2\C B_1) P(W_3 \C B_2 B_1) P(W_4 \C W_3 B_2 B_1)

Each of these probabilities is easily calculated:

P(B_1) &= \frac{7}{12} \text{ Now there's 5 white and 7 black.} \\
P(B_2 \C B_1) &= \frac{9}{14} \text{ Now there's 5 white and 9 black.} \\
P(W_3 \C B_2 B_1) &= \frac{5}{16} \text{ Now there's 5 white and 11 black.} \\
P(W_4 \C W_3 B_2 B_1) &= \frac{7}{18} \text{ Now there's 7 white and 11 black.}

Multiplying these together gives us \frac{35}{768}, or 0.0455729.

b. This problem can be done two ways. The “sure” way is to do the same as above for the \nchoosem{4}{2} = 6 different cases. This is easier than it looks, as one can notice that all that matters are what balls were drawn prior to the current one. Thus

P(W_4 \C W_3 B_2 B_1) = P(W_4 \C B_3 W_2 B_1) = P(W_4 \C B_3 B_2 W_1)

and so on. Furthermore, when one does this for all 6 terms, we find that they are all equal; i.e. the order doesn’t matter.

One could guess, however, that this would be the case initially. If we look at the probability in (a):

\frac{7 \times 9 \times 5 \times 7}{12\times 14\times 16\times 18}

The number of balls is always increasing by 2, so the denominator is the same in all cases. Furthermore, there will always be 7 black balls the first time a black ball is drawn and 9 black balls the second time a black ball is drawn. Same with white; there will always be 5 white balls when a white ball is first drawn and 7 the second time it is drawn. In general, this is an example of exchangeability; I don’t expect you to know it, but it’s a weaker condition than independence and essentially says that all orders are equally likely.

So, summarizing,

P(\text{2 white and 2 black})
&= \nchoosem{4}{2}\frac{7 \times 9 \times 5 \times 7}{12\times 14\times 16\times 18} \\
&= \frac{210}{768}
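The urn scheme is easy to evaluate exactly with fractions; this sketch reproduces both parts and the exchangeability observation:

```python
from fractions import Fraction
from itertools import permutations

def p_sequence(seq, white=5, black=7, add=2):
    """P(drawing the given color sequence), replacing each ball plus `add` more."""
    w, b, p = white, black, Fraction(1)
    for color in seq:
        if color == "B":
            p *= Fraction(b, w + b)
            b += add
        else:
            p *= Fraction(w, w + b)
            w += add
    return p

part_a = p_sequence("BBWW")
orderings = set(permutations("BBWW"))                    # the C(4,2) = 6 orders
assert all(p_sequence(o) == part_a for o in orderings)   # exchangeability
part_b = len(orderings) * part_a
print(part_a, part_b)  # 35/768 35/128
```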

Problem 3.62:

Barbara and Dianne go target shooting. Suppose that each of Barbara’s shots hits the wooden duck target with probability p_1 and each of Dianne’s shots hits the target with probability p_2. Suppose they both shoot simultaneously at the same target. If the wooden duck is knocked over, indicating it was hit, what is the probability that

  1. Both shots hit the duck.
  2. Barbara’s shot hit the duck?

What independence assumptions have you made?


a. Let B be the event that Barbara hit the duck, and let D be the event that Dianne hit the duck; then B \cup D is the event that the duck was hit. We’re interested in

P(BD\C B\cup D)
&= \frac{P(B \cup D \C BD) P(BD)}{P(B \cup D)} \\
&= \frac{1 \cdot P(BD)}{P(B) + P(D) - P(BD)}

Now, if we assume that B and D are independent (which is what I asked for),

P(BD) = P(B)P(D) = p_1p_2


P(BD\C B\cup D) = \frac{p_1p_2}{p_1 + p_2 - p_1p_2}

  b. Similarly, we’re interested in the conditional probability:

P(B \C B\cup D)
&= \frac{P(B \cup D \C B)P(B)}{P(B \cup D)} \\
&= \frac{1 \cdot p_1}{p_1 + p_2 - p_1p_2}

If we suspect that the shots hitting the target are not independent – e.g. Dianne’s bullet knocks it over before Barbara’s bullet gets there – then P(BD) \neq P(B)P(D). However, we don’t have enough information to consider this case mathematically.
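With concrete (hypothetical) values for p_1 and p_2, the two formulas are easy to evaluate exactly:

```python
from fractions import Fraction

p1, p2 = Fraction(3, 5), Fraction(1, 2)   # hypothetical hit probabilities

p_both = p1 * p2                # independence assumption: P(BD) = P(B)P(D)
p_hit = p1 + p2 - p_both        # P(B ∪ D)

both_given_hit = p_both / p_hit     # part a: p1 p2 / (p1 + p2 - p1 p2)
barb_given_hit = p1 / p_hit         # part b: p1 / (p1 + p2 - p1 p2)

print(both_given_hit, barb_given_hit)  # 3/8 3/4
```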

06/30 - Wednesday

No homework, more on conditional probability, exam review.

07/02 - Friday

Midterm Exam

The midterm exam will be closed book, closed notes. 50% - 75% of the material will be directly from the recommended problems, so make sure you understand them.

The exam will be one hour (the first hour), and then I’ll jump into the next topic – more advanced use of Bayes’ theorem.

You will not be able to use a calculator on the exam.

Here is a reference sheet of the basic equations, propositions, and axioms that you can refer to on the exam. The exam is intended to test your ability to reason about probability problems similar to what we’ve encountered thus far both formally and intuitively.

The test material will cover the following sections:

Chapter 1:

Sections 1-6, except for the binomial and multinomial theorems for polynomial equations and related examples.

Chapter 2:

Sections 1-5. You won’t be responsible for the general inclusion/exclusion formula (proposition 4.4) except what’s on the reference sheet, the proofs of propositions 4.1-4.3, or examples 5j,5k,5m,5n, and 5o.

Chapter 3:

Sections 1,2, 4, and the basic idea of 5; that P(E\C F) is a probability measure. You won’t be responsible for example 4j and the subsequent material in section 4.

In general, I won’t require you to know concepts for the exam that you haven’t seen in the homework, recommended problems, or in examples discussed in class. If you’ve kept up and understand all the problems, you should do fine and there should be no surprises.

HW 5 + Solutions

Due Wednesday, 07/05: Problems 3.31, 3.70, 3.76, 3.83, Short Essay (see below).

Recommended Problems: Problems 3.44 and 3.48.

Problem 3.31:

Ms. Aquina has just had a biopsy on a possibly cancerous tumor. Not wanting to spoil a weekend family event, she does not want to hear any bad news in the next few days. But if she tells the doctor to only call if the news is good, then if the doctor does not call, Ms. Aquina can conclude that the news is bad. So, being a student of probability, she instructs the doctor to flip a coin. If it comes up heads, the doctor is to call if the news is good and not call if the news is bad. If the coin comes up tails, the doctor is not to call. In this way, even if the doctor doesn’t call, the news is not necessarily bad.

Let \alpha be the probability that the tumor is cancerous; let \beta be the conditional probability that the tumor is cancerous given that the doctor does not call.

  1. Which should be larger, \alpha or \beta?
  2. Find \beta in terms of \alpha, and prove your answer in part (a).


  1. One would think that \beta would be larger, since the conditional sample space is the unconditional sample space with the event that the doctor calls excluded, and this event only occurs if the tumor is not cancerous.

  2. This is a straightforward application of Bayes’ rule. Let A be the event that the tumor is cancerous, and let H be the event that the doctor’s coin flip comes up heads. Then the event that doctor does not call is just AH \cup H^c. Continuing,

\beta &= \Pof{A \C AH \cup H^c} \\
&= \frac{\Pof{A\cap (AH \cup H^c)}}{\Pof{AH \cup H^c}} \\
&= \frac{\Pof{AH \cup AH^c}}{\Pof{AH} + \Pof{H^c}} \\
&= \frac{\Pof{AH} + \Pof{AH^c}}{\Pof{AH} + \Pof{H^c}}

    Now \Pof{AH} = \Pof{A}\Pof{H} = \alpha/2 and \Pof{AH^c} = \Pof{A}\Pof{H^c} = \alpha/2 as the two events are independent. Thus we have that

    \beta = \frac{\alpha}{\alpha / 2 + 1/2} = \frac{2\alpha}{\alpha + 1}
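We can confirm \beta = 2\alpha/(\alpha + 1), and the answer to part (a), by enumerating the four (cancerous, coin) outcomes (an illustrative sketch):

```python
from fractions import Fraction

def beta(alpha):
    """P(cancerous | no call): the doctor calls only on (good news, heads)."""
    half = Fraction(1, 2)
    # No call happens on (cancer, heads), (cancer, tails), (no cancer, tails).
    p_no_call = alpha * half + alpha * half + (1 - alpha) * half
    p_cancer_and_no_call = alpha      # if cancerous, the doctor never calls
    return p_cancer_and_no_call / p_no_call

for a in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    assert beta(a) == 2 * a / (a + 1)
    assert beta(a) > a                # part (a): beta exceeds alpha
print(beta(Fraction(1, 2)))  # 2/3
```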

Problem 3.70:

There is a 50-50 chance that the queen carries the gene for hemophilia. If she is a carrier, then each prince has a 50-50 chance of having hemophilia. (a) If the queen has had three princes without the disease, what is the probability that the queen is a carrier? (b) If there is a fourth prince, what is the probability that he will have hemophilia?


  1. Let H_1, H_2, etc. be the events that the first, second, etc. princes have hemophilia, and let C be the event that the queen is a carrier. Again, using Bayes’ rule,

    \Pof{C\C H_1^c H_2^c H_3^c}
&= \frac{\Pof{H_1^c H_2^c H_3^c \C C} \Pof{C}}{\Pof{H_1^c H_2^c H_3^c}} \\
&= \frac
{\Pof{H_1^c H_2^c H_3^c \C C} \Pof{C}}
{\Pof{H_1^c H_2^c H_3^c \C C} \Pof{C} + \Pof{H_1^c H_2^c H_3^c \C C^c} \Pof{C^c}}

    Now the events that the princes have hemophilia are independent only when you condition on the state of the queen. Thus

    \Pof{H_1^c H_2^c H_3^c \C C}
&= \Pof{H_1^c \C C}\Pof{H_2^c \C C}\Pof{H_3^c \C C}  \\
&= \inv{8}


    \Pof{H_1^c H_2^c H_3^c \C C^c} = 1


    \Pof{C\C H_1^c H_2^c H_3^c}
&= \frac{\inv{8}\cdot\inv{2}}
{\inv{8}\cdot\inv{2} + 1\cdot\inv{2}} = \inv{9}

  2. For this problem, we’re looking at \Pof{H_4\C H_1^cH_2^cH_3^c}. The concern with this problem is just that we need to be careful about the independence assumptions. In particular, the state of the fourth prince depends on the state of the previous three princes only through the event that the queen is a carrier. Thus we know that we’ll have to include the state of the queen somewhere in our analysis. Here’s one way to do it:

    \Pof{H_4\C H_1^cH_2^cH_3^c}
&= \frac
{\Pof{H_4 H_1^cH_2^cH_3^c}}
{\Pof{H_1^cH_2^cH_3^c}} \\
&= \frac
{\Pof{H_4 H_1^cH_2^cH_3^c\C C}\Pof{C} + \Pof{H_4 H_1^cH_2^cH_3^c\C C^c}\Pof{C^c}}
{\Pof{H_1^cH_2^cH_3^c \C C}\Pof{C} + \Pof{H_1^cH_2^cH_3^c \C C^c}\Pof{C^c}}

    Looking at the individual terms, we have

    \Pof{H_4 H_1^cH_2^cH_3^c\C C}
&= \Pof{H_4 \C C}\Pof{H_1^c \C C}\Pof{H_2^c \C C}\Pof{H_3^c\C C} = \frac{1}{16} \\
\Pof{H_4 H_1^cH_2^cH_3^c\C C^c} &= 0 \\
\Pof{H_1^cH_2^cH_3^c \C C} &= \frac{1}{8} \\
\Pof{H_1^cH_2^cH_3^c \C C^c} &= 1

    Thus we can plug these values in:

    \Pof{H_4\C H_1^cH_2^cH_3^c}
&= \frac{ \inv{16} \cdot \inv{2} }
{\inv{8}\cdot \inv{2} + \inv{2} } \\
&= \inv{18}
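Both parts can be confirmed with exact arithmetic. This Python sketch (illustrative only, not part of the assignment) just evaluates the same Bayes' rule expressions with fractions:

```python
from fractions import Fraction

half = Fraction(1, 2)
p_no3_C  = half ** 3        # P(three healthy princes | carrier) = 1/8
p_no3_Cc = Fraction(1)      # P(three healthy princes | not a carrier)
den = p_no3_C * half + p_no3_Cc * half          # P(three healthy princes)

p_carrier = p_no3_C * half / den                # part (a)
p_fourth  = half * p_no3_C * half / den         # part (b): carrier AND 4th prince sick

print(p_carrier, p_fourth)  # 1/9 1/18
```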

Problem 3.76:

Suppose that E and F are mutually exclusive events of an experiment. Show that if independent trials of this experiment are performed, then E will occur before F with probability P(E) / [P(E) + P(F)].


There are two ways to do this problem, one a bit more formally correct than the other; the slightly less formal one is all I expect you to know, so I’ll just present that solution here. Informally, consider repeating the experiment and then stopping when either event E or event F occurs. Then the event that you stop the experiment is the same as the event E \cup F. So

P(E \C E \cup F)
&= \frac
{P( E \cap (E \cup F) )}
{P(E \cup F)} \\
&= \frac{P(E)}{P(E) + P(F)}

where the last step follows as E and F are mutually exclusive.

Problem 3.83:

Die A has 4 red and 2 white faces, whereas die B has 2 red and 4 white faces. A fair coin is flipped once. If it lands on heads, the game continues with die A; if it lands on tails, then die B is to be used.

  1. Show that the probability of red at any throw is \onehalf.
  2. If the first two throws result in red, what is the probability of red at the third throw?
  3. If red turns up at the first two throws, what is the probability that it is die A that is being used?


  1. Let R_i be the event that throw i is red, and let A and B be the events that the die chosen is die A and die B, respectively. Then

\Pof{R_i}
&= \Pof{R_i \C A}\Pof{A} + \Pof{R_i \C B}\Pof{B} \\
&= \frac{4}{6}\times\frac{1}{2} + \frac{2}{6}\times\frac{1}{2} \\
&= \frac{4}{12} + \frac{2}{12} = \frac{1}{2}

  2. I think it’s easier to do (c) first, so here we go. The probability we’re interested in is

    \Pof{A \C R_1R_2}
&= \frac
{\Pof{R_1R_2 \C A}\Pof{A}}
{\Pof{R_1R_2\C A}\Pof{A} + \Pof{R_1R_2\C B}\Pof{B}}

    Now once we know the die being used, the events of getting reds are independent. Thus

    \Pof{R_1R_2 \C A} &= \Pof{R_1 \C A}\Pof{R_2 \C A} = \frac{2}{3}\times\frac{2}{3} = \frac{4}{9}\\
\Pof{R_1R_2 \C B} &= \Pof{R_1 \C B}\Pof{R_2 \C B} = \frac{1}{3}\times\frac{1}{3} = \frac{1}{9}


    \Pof{A \C R_1R_2} &= \frac
{\frac{4}{9}\times\frac{1}{2}}
{\frac{4}{9}\times\frac{1}{2} + \frac{1}{9}\times\frac{1}{2}} \\
&= \frac{4}{5}

  3. This time we’re interested in \Pof{R_3 \C R_2 R_1}:

    \Pof{R_3 \C R_2 R_1 }
&= \frac{\Pof{R_3 R_2 R_1 }}{\Pof{R_2 R_1 }} \\
&= \frac
{\Pof{R_3 R_2 R_1 \C A}\Pof{A} + \Pof{R_3 R_2 R_1 \C B}\Pof{B} }
{\Pof{R_2 R_1 \C A}\Pof{A} + \Pof{R_2 R_1 \C B}\Pof{B} } \\
&= \frac
{\T{\frac{2}{3}}^3\inv{2} + \T{\frac{1}{3}}^3\inv{2}}
{\T{\frac{2}{3}}^2\inv{2} + \T{\frac{1}{3}}^2\inv{2}} \\
&= \frac{3}{5}
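All three parts reduce to short exact computations; the following Python sketch (illustrative only) evaluates the same expressions with fractions:

```python
from fractions import Fraction

pA = pB = Fraction(1, 2)                 # fair coin picks the die
rA, rB = Fraction(4, 6), Fraction(2, 6)  # P(red | die A), P(red | die B)

p_red   = rA * pA + rB * pB                                        # part (a)
p_A_RR  = rA**2 * pA / (rA**2 * pA + rB**2 * pB)                   # part (c)
p_R3_RR = (rA**3 * pA + rB**3 * pB) / (rA**2 * pA + rB**2 * pB)    # part (b)

print(p_red, p_R3_RR, p_A_RR)  # 1/2 3/5 4/5
```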

Short Essay

The fifth problem is to look at one of four classic probabilistic paradoxes and write a short essay on it. The four paradoxes to choose from are Berkson’s paradox, Bertrand’s Box paradox, the Sleeping Beauty problem, and the Three Prisoners problem. For whichever one you choose, I want you to cover 3 things:

  1. Why is it counterintuitive? In other words, why do people often think of it wrongly? What is easy to forget?
  2. What is a correct way of thinking about it? Is there a correct but intuitive way of reasoning about it?
  3. Mathematically, what is going on? What are the conditional probabilities? Do they work out correctly? What is easy to miss?

A suggested format is to write a paragraph / equations-with-explanation for each of the three points above. In class, we already talked about similar paradoxes, namely the Monty Hall problem and the Boy-Girl problem. I tried to discuss the above three points on each, and I don’t expect more than that. However, going above and beyond these guidelines may earn you some extra credit points. Ways to do this include discussing several articles, looking at generalizations of the same phenomena, discussing a real-life scenario where people might make this mistake, etc.

07/05 - Monday (Holiday)

Holiday, no class.

07/07 - Wednesday

Introduce Random Variables as extensions of previous events. Read sections 4.1, 4.2, 4.6, and 4.8. You may skip discussions about the expectation and variance of these random variables. Pay attention to the examples.

HW 6 + Solutions

Due 07/09 - Friday: Problems 4.41, 4.48, 4.71.

Recommended Problems: Self Test 4.3 and 4.8; problems 4.7 and 4.18

Extra Credit: 4.11.

Problem 4.41:

A man claims to have extrasensory perception. As a test, a fair coin is flipped 10 times, and the man is asked to predict the outcome in advance. He gets 7 out of 10 correct. What is the probability that he would have done at least this well if he had no ESP?


Let X be a random variable representing the number of correctly guessed flips. Now X has a Binomial distribution with parameters n=10 and p = \onehalf. Thus the probability mass function of X is

p(x) &= \binom{10}{x}\T{\frac{1}{2}}^{x}\T{\frac{1}{2}}^{10-x} \\
&=\binom{10}{x} \frac{1}{2^{10}}


\Pof{\text{at least 7}}
&= P(X=7)+ P(X=8)+ P(X=9)+ P(X=10) \\
&= \frac{120 + 45 + 10 + 1}{1024} \\
&=\frac{176}{1024} \simeq 0.172
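If you want to double-check the arithmetic, the binomial sum takes two lines of Python (math.comb is in the standard library):

```python
from math import comb

# P(at least 7 of 10 correct) for a pure guesser, p = 1/2
p = sum(comb(10, k) for k in range(7, 11)) / 2 ** 10
print(p)  # 0.171875, i.e. 176/1024
```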

Problem 4.48:

It is known that diskettes produced by a certain company will be defective with probability 0.01, independently of each other. The company sells the diskettes in packages of size 10 and offers a money-back guarantee that at most 1 of the 10 diskettes in the package will be defective. If someone buys 3 packages, what is the probability that he or she will return exactly 1 of them?


Let X be a random variable representing the number of defective disks in a package of 10. Then, X\sim\Binomial{x;\,n=10,p=0.01}. Thus the p.m.f. of X is

p(x) &= \binom{10}{x}(.01)^{x}(.99)^{10-x}

We are interested in the probability that 2 or more disks are defective, the condition for returning a package, which is just

P(\text{2 or more defective})
&= 1 - P(X = 0) - P(X = 1) \\
&= 1 - p(0) - p(1) \\
&\simeq 0.0042662

This is the probability that an individual package is returned. Now, let Y be the number of returned packages in a shipment of 3. Then Y is also going to be a random variable with a Binomial distribution, this time with parameters n=3 and p=0.0042662. Thus the probability that exactly one is returned is just

P(Y = 1)
&= \Binomial{1;\,n=3,p=0.0042662} \simeq 0.0126896
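The two-stage calculation is easy to verify numerically. This Python sketch (illustrative only; the helper name is mine) recomputes both probabilities:

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# a single package is returned when 2 or more of its 10 disks are defective
p_return = 1 - binom_pmf(0, 10, 0.01) - binom_pmf(1, 10, 0.01)
# exactly one of the three packages is returned
p_one = binom_pmf(1, 3, p_return)
print(p_return, p_one)  # ~0.0042662 and ~0.0126896, matching the values above
```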

Problem 4.71:

Consider a roulette wheel consisting of 38 numbers – 1 through 36, 0, and double 0. If Smith always bets that the outcome will be one of the numbers 1 through 12, what is the probability that

  1. Smith will lose his first 5 bets;
  2. his first win will occur on his fourth bet?


  1. This can be calculated many ways; the easiest are as \Binomial{0;n=5,p=12/38} or as the probability that 5 independent trials in a row are all failures.

    \Pof{\text{5 failures}} &= 0.149951

  2. This is the probability that a geometric random variable having a probability of success 12/38 takes on the value 4. Thus,

    \Geometric{4; p = 12/38} = \T{\frac{26}{38}}^3\frac{12}{38} = 0.10115
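Both parts are one-line computations; as a sanity check (illustrative, not part of the assignment):

```python
p = 12 / 38                      # P(win a single bet)
p_lose_5 = (1 - p) ** 5          # part (a): 5 losses in a row
p_win_on_4th = (1 - p) ** 3 * p  # part (b): Geometric(p) evaluated at 4
print(p_lose_5, p_win_on_4th)    # ~0.14995 and ~0.10115
```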

07/09 - Friday

Expectation, variance, more on probability mass functions, and the cumulative distribution function. Sections 4.3-4.5.

HW 7 + Solutions

Due 07/12 - Monday: Problems 4.33, 4.43, 4.75; Theory Exercise 4.32.

Recommended Problems: Problems 4.18, 4.21; Self-Test Problems 4.21, 4.9, 4.22.

Problem 4.33:

A newsboy purchases papers at 10 cents and sells them at 15 cents. However, he is not allowed to return unsold papers. If his daily demand is a binomial random variable with n=10 and p=\sdfrac{1}{3}, approximately how many papers should he purchase so as to maximize his expected profit?

Hint: Understand / use example 4.4b.


For this problem, define the following variables:

s&=\text{\# items stocked}\\
b&=\text{net profit for each item sold}=15-10=5\\
\ell&=\text{net loss for each item not sold}=10\\
X&=\text{\# items sold} \sim \Binomial{x;10,\frac{1}{3}}

The profit, as a function of items sold, is then

P(X) = \fOO
{bX - (s-X)\ell}{X \leq s}
{bs}{X > s}

The expected profit is just the expected value of this function, i.e.

E[P(X)] &= \sum_{x = 0}^{10} P(x) \times \Binomial{x;10,\frac{1}{3}}\\
&= \sum_{x = 0}^s (bx - (s - x)\ell) \times \Binomial{x;10,\frac{1}{3}}
   + \sum_{x = s+1}^{10} bs \times \Binomial{x;10,\frac{1}{3}}

There are several ways to maximize this function; the simplest is just to plug in different values of s. The example in the book, 4.4b, gives a comprehensive treatment of various ways to make this calculation easier. From there, we have the following useful criterion: stocking s+1 units is better than stocking s units whenever

\sum_{x=0}^s p(x) < \frac{b}{b+\ell}

where p(x) = \Binomial{x;n=10,p=\frac{1}{3}}. Using any of these ways should show that the optimal number of papers to stock up on is 3.
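A brute-force check of the "plug in different values of s" approach (a Python sketch, illustrative only; the helper names are mine) confirms the optimum:

```python
from math import comb

def pmf(x):                      # Binomial(10, 1/3) demand
    return comb(10, x) * (1 / 3) ** x * (2 / 3) ** (10 - x)

def expected_profit(s, b=5, loss=10):
    # sell min(X, s) papers at b cents each; lose `loss` cents per unsold paper
    return sum((b * min(x, s) - loss * max(s - x, 0)) * pmf(x)
               for x in range(11))

best = max(range(11), key=expected_profit)
print(best)  # 3, agreeing with the criterion above
```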

Problem 4.43:

A communications channel transmits the digits 0 and 1. However, due to static, the digit transmitted is incorrectly received with probability 0.2. Suppose that we want to transmit an important message consisting of one binary digit. To reduce the chance of error, we transmit 00000 instead of 0 and 11111 instead of 1. If the receiver of the message uses “majority” decoding, what is the probability that the message will be wrong when decoded? What independence assumptions are you making?


Assume that bit errors occur independently of each other. The message is decoded incorrectly exactly when at least 3 of the 5 bits are flipped. Let X represent the number of bits flipped.

P(\text{wrong message})
&=P(\text{more than 2 digits in error})\\
&=\binom{5}{3}\left(\frac{1}{5}\right)^{3}\left(\frac{4}{5}\right)^{2}
  + \binom{5}{4}\left(\frac{1}{5}\right)^{4}\left(\frac{4}{5}\right)^{1}
  + \binom{5}{5}\left(\frac{1}{5}\right)^{5}\left(\frac{4}{5}\right)^{0}\\
&=\frac{181}{3125}\approx .0579
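The binomial tail is quick to verify in Python (a sketch, not part of the solution):

```python
from math import comb

p_err = 0.2   # per-bit flip probability
p_wrong = sum(comb(5, k) * p_err ** k * (1 - p_err) ** (5 - k)
              for k in range(3, 6))
print(p_wrong)  # ~0.0579 = 181/3125
```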

Problem 4.75:

A fair coin is continually flipped until heads appears for the tenth time. Let X denote the number of tails that occur. Compute the probability mass function of X.


Let Y represent the number of flips needed until the tenth head occurs. Then Y has a negative binomial distribution with parameters r = 10 and p = 1/2. Now let X denote the number of tails. Because every coin flip that doesn’t come up heads comes up tails, we know that X = Y - 10. We’re interested in the probability mass function of X, so we can proceed as follows:

p_X(x) &= \Pof{Y = x + 10} \\
&=\mathpzc{NegativeBinomial}\T{x+10; r=10,p} \\
&=\dbinom{x+9}{9}p^{10}(1-p)^{x}, \qquad x = 0, 1, 2, \ldots
Theoretical Exercise 4.32:

A jar contains n chips. Suppose that a boy successively draws a chip from the jar, each time replacing the one drawn before drawing another. This continues until the boy draws a chip that he has previously drawn before. Let X denote the number of draws, and find the probability mass function of X.


E_i&=\text{event of not drawing a previous chip on the $i^{th}$ draw}\\
p_i&=\text{probability of stopping on the $i^{th}$ draw}\\
n&=\text{\# of chips}\\
X&=\text{\# draws before ending}


  P(X=x)
  &=P(\text{not stopping on any previous draw and stopping on the $x^{th}$ draw})\\
  &=P(E_1E_2\cdots E_{x-1}(E_x)^{c})\\
  &=P(E_x^c \C E_{x-1}E_{x-2}\cdots E_1) P(E_{x-1} \C E_{x-2}E_{x-3}\cdots E_1)
      \cdots P(E_2 \C E_1) P(E_1)

Now the probability of not drawing a previously seen chip given that there have been no repeat draws is just the number of unseen chips divided by the total number of chips. On draw i, this is just (n-i+1) / n. Similarly, the probability of drawing a previously seen chip on draw x, given that there have been no repeats thus far, is just (x-1)/n. Thus the above becomes:

  P(X=x)
  &=\frac{x-1}{n} \times \prod_{i = 1}^{x-1} \frac{n-i+1}{n}

Finally, since it is impossible to stop on the first draw, and impossible to draw a new chip on the (n+1)\text{st} draw, we can give the following formula that includes the domain information:

P(X=x) = \fOO
{\frac{x-1}{n} \times \prod_{i = 1}^{x-1} \frac{n-i+1}{n}}
{x \in \set{2,3,...,n+1}}
{0}{\text{otherwise}}
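A good way to gain confidence in a derived p.m.f. is to check that it sums to 1 over its support. This Python sketch (illustrative only; n = 6 is an arbitrary choice) does exactly that with exact fractions:

```python
from fractions import Fraction
from math import prod

def pmf(x, n):
    # no repeat on draws 1..x-1, then a previously seen chip on draw x
    return Fraction(x - 1, n) * prod(Fraction(n - i + 1, n)
                                     for i in range(1, x))

n = 6
total = sum(pmf(x, n) for x in range(2, n + 2))
print(total)  # 1
```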

07/12 - Monday

Poisson random variables, begin discussion of continuous random variables.

HW 8 + Solutions

Due 07/14 - Wednesday: Problems 4.60, 4.65, 5.2, 5.4

Recommended Problems: Problems 4.29, 4.30, 4.55, Self Test 4.22, 5.2.

Problem 4.60:

The number of times that a person contracts a cold in a given year is a Poisson random variable with parameter \lambda = 5. Suppose that a new wonder drug (based on large quantities of vitamin C) has just been marketed that reduces the Poisson parameter to \lambda = 3 for 75 percent of the population. For the other 25 percent of the population, the drug has no appreciable effect on colds. If an individual tries the drug for a year and has 2 colds in that time, how likely is it that the drug is beneficial for him or her? (I.e. they are in that group.)


This is just a standard Bayes probability problem in which we calculate the final probabilities using the given distribution functions. Let

E&= \text{event of getting 2 colds in a year}\\
D&= \text{event that the new drug is effective}

Then we are interested in P(D\C E):

\Pof{D\C E}
&= \frac{\Pof{E \C D} \Pof{D}}{\Pof{E}}\\
&= \frac{\Pof{E \C D} \Pof{D}}{\Pof{E\C D}\Pof{D} + \Pof{E\C D^c}\Pof{D^c}}


\Pof{E\C D} &= \Poisson{2; \lambda = 3} \\
\Pof{E\C D^c} &= \Poisson{2; \lambda = 5} \\
\Pof{D} &= 0.75 \\
\Pof{D^c} &= 0.25


\Pof{D\C E}
&= \frac{\left(\frac{e^{-3}\cdot3^2}{2!}\right)(.75)}
{\left(\frac{e^{-3}\cdot3^2}{2!}\right)(.75)
  + \left(\frac{e^{-5}\cdot5^2}{2!}\right)(.25)}\\
&\approx .8886
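Plugging the Poisson masses into Bayes' rule is easy to reproduce numerically (a Python sketch, illustrative only):

```python
from math import exp, factorial

def poisson(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

num = poisson(2, 3) * 0.75                        # effective group
p_beneficial = num / (num + poisson(2, 5) * 0.25) # Bayes' rule
print(p_beneficial)                               # ~0.8886
```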

Problem 4.65:

Each of 500 soldiers in an army company independently has a certain disease with probability 1/10^3. This disease will show up in a blood test, and to facilitate matters, blood samples from all 500 are pooled and tested.

  1. What is the (approximate) probability that the blood test will be positive (and so at least one person has the disease)?

Suppose now that the blood test yields a positive result.

  2. What is the probability, under this circumstance, that more than one person has the disease?

One of the 500 people is Jones, who knows that he has the disease.

  3. What does Jones think is the probability that more than one person has the disease?

As the pooled test was positive, the authorities have decided to test each individual separately. The first i-1 of these tests were negative, and the i\text{th} one – which was on Jones – was positive.

  4. Given the preceding, as a function of i, what is the probability that any of the remaining people have the disease?


  1. Let X be the number of sick soldiers. In this case, it’s justified to use the Poisson approximation since n = 500 is large and p = 1/10^3 is small; the Poisson parameter is np = 500/1000 = 1/2.

    Using the Poisson approximation,

    \Pof{X \geq 1}
&= 1 - P(X = 0) \\
&\simeq 1 - \Poisson{0; \lambda = np} \\
&= 1 - \frac{e^{-\onehalf} \T{\frac{1}{2}}^0}{0!} \\
&= 1 - e^{-\onehalf} \\
&\simeq 0.3935

  2. For this part of the problem we need to first define some more events. Let

    T&=\text{Event that the test of the pooled samples is positive.}\\
M&=\text{Event that more than one soldier is infected.}

    Then the probability that more than one soldier has the disease, given that the test is positive, is as follows. Let p(x) represent the probability mass of the Poisson at x. Then

    P(M|T) &\approx \frac{P(T|M)P(M)}{P(T)} \\
&= \frac
{1 \times \T{1 - \Pof{X=0} - \Pof{X = 1}}}
{1 - \Pof{X = 0}} \\
&= \frac{
1  - \frac{e^{-\frac{1}{2}}\T{\frac{1}{2}}^0}{0!}
   - \frac{e^{-\frac{1}{2}}\T{\frac{1}{2}}^{1}}{1!}}
{1 - e^{-\onehalf}} \\
&\approx .2293

  3. Since Jones knows he has the disease, and that entirely explains the result of the test, the probability that more than one person has the disease (from his point of view) is really the probability that someone other than he has it. Or in other words, the probability that at least one of the remaining 499 people has the disease. Thus the problem is the same as (a), but with 499 people instead of 500. Thus, let Y be the number of people in this group that has the disease. In this case, again using the Poisson approximation,

    P(Y \geq 1)
&= 1-P(Y = 0) \\
&= 1-\frac{e^{-.499}(0.499)^0}{0!} \\
&\approx .3929

    (Note that the answer in the back of the book is wrong.)

  4. Even though we are given that the test was positive, this does not provide any more information about the probability of the remaining (500-i) soldiers having the disease since, with Jones being the i\text{th} person, the positive test result is completely explained. Thus it is simply the probability that at least one person in the remaining group of 500-i people has it. Let Y_i be the number of such people who have the disease. Then, using the Poisson approximation – which is still valid, since p is small and the group is still large – gives

    P(Y_i \geq 1)
&\sim 1-P(Y_i = 0) \\
&= 1-\frac{e^{-\frac{500-i}{1000}}\T{\frac{500 - i}{1000}}^0}{0!} \\
&= 1 - e^{-\frac{500-i}{1000}}

    Now, let’s see how close this really is to the true Binomial probability. If we plot the probabilities given by the two different methods as a function of i, we get the following.

The two curves are so close that they are indistinguishable on the graph. Looking at their difference, shown in the bottom plot, indeed confirms that they are very similar; the difference is several orders of magnitude less than the values themselves. Thus the Poisson approximation is quite good.
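Even without the plot, the comparison is a few lines of Python (a sketch, illustrative only): for each i, compute the exact Binomial tail 1 - (1-p)^{500-i} and the Poisson approximation 1 - e^{-(500-i)p}, and look at the largest gap.

```python
from math import exp

p = 1 / 1000
max_diff = max(
    abs((1 - (1 - p) ** (500 - i)) - (1 - exp(-(500 - i) * p)))
    for i in range(1, 501)
)
print(max_diff)  # on the order of 1e-4, far below the probabilities themselves
```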

Problem 5.2:

A system consisting of one original unit plus a spare can function for a random amount of time X. If the density of X is given (in units of months) by

f(x) = \fOO
{Cxe^{-x/2}} {x > 0}
{0}          {x \leq 0}

what is the probability that the system functions for at least 5 months?

Hint: Use the assumption that \int_{-\infty}^\infty f(x) \ud x = 1 to find C.


Since f(x) is a density function, the first step is to find the constant C such that the total area under f(x) is 1.

1 &= \int_{-\infty}^\infty f(x) \ud x = \int_0^\infty Cxe^{-\frac{x}{2}} \ud x \\
&= C \Tcbr{\Tevalat{-2xe^{-\frac{x}{2}}}_0^\infty
   + 2 \int_0^\infty e^{-\frac{x}{2}} \ud x} \\
&= C \cdot \T{0 + 2 \cdot \Tevalat{2e^{-\frac{x}{2}}}_\infty^0} \\
&= C \cdot 2\cdot 2 \\
\So C&=\frac{1}{4}

where we know that \int_0^\infty e^{-\lambda x}\ud x=1/\lambda from the formula for the exponential distribution (\int_0^\infty \lambda e^{-\lambda x}\ud x=1).

Now f(x) can be evaluated to find the probability that the system functions for at least 5 months. Specifically, we have that

\Pof{X \geq 5}
&= \int_5^\infty \frac{x}{4}e^{-\frac{x}{2}} \ud x \\
&= \inv{4} \Tcbr{\Tevalat{-2xe^{-\frac{x}{2}}}_5^\infty
     + 2 \int_5^\infty e^{-\frac{x}{2}} \ud x} \\
&= \inv{4} \Tcbr{2\cdot 5 e^{-\frac{5}{2}}
     + 2 \Tevalat{2e^{-\frac{x}{2}}}_\infty^5} \\
&= \inv{4} \Tcbr{2\cdot 5 e^{-\frac{5}{2}}
     + 2 \cdot 2e^{-\frac{5}{2}}} \\
&= \frac{7}{2} e^{-\frac{5}{2}}
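The closed form (7/2)e^{-5/2} is easy to confirm with a crude numerical integration (a Python sketch, illustrative only; the trapezoid rule and the cutoff at x = 60 are my choices):

```python
from math import exp

def f(x):                     # the density with C = 1/4
    return 0.25 * x * exp(-x / 2)

# trapezoid rule on [5, 60]; the tail beyond x = 60 is negligible
N, a, b = 100_000, 5.0, 60.0
h = (b - a) / N
integral = h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, N)))
closed_form = 3.5 * exp(-2.5)     # (7/2) e^{-5/2} from the solution above
print(integral, closed_form)
```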

Problem 5.4:

A filling station is supplied with gasoline once a week. If its weekly volume of sales in thousands of gallons is a random variable with probability density function

f(x) = \fOO
{5(1-x)^4}{0 < x < 1}
{0}{\text{otherwise}}

what need the capacity of the tank be so that the probability of the supply’s being exhausted in a given week is 0.01?

Hint: Draw a graph of the pdf; the x axis is the number of gallons, which can’t be above 1 (units are in thousands of gallons). Then set up the correct integral for the problem.


Plotting this probability density helps visualize what’s going on and how to set up the integral. We are looking for a constant a, representing the capacity of the tank, such that the probability mass above a is exactly 0.01. Solving for this boundary gives us:

0.01 &=\int_a^1 f(x)dx\\
&=\int_a^1 5(1-x)^4dx\\
&=\left . -(1-x)^5 \right| _a^1 = (1-a)^5\\
\So a&=1-\sqrt[5]{.01}\\
&\approx .6020
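Checking the boundary value numerically takes two lines (a Python sketch, illustrative only):

```python
capacity = 1 - 0.01 ** (1 / 5)   # a = 1 - (0.01)^(1/5), in thousands of gallons
check = (1 - capacity) ** 5      # probability mass above the capacity
print(capacity, check)           # ~0.6020 and 0.01
```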

07/14 - Wednesday

More continuous random variables, exponential distribution, uniform distribution.

HW 9 + Solution

Due Friday, 07/16: Problems 5.32, 5.40; Theory Exercises 4.27, 5.26. On 5.40, also plot the p.d.f. of the new random variable. On T.E. 5.26, find the new R.V. in terms of X.

Recommended Problems: 5.4, 5.17; Self-Test 5.1, 5.7; Theory Exercise 5.29.

Problem 5.32:

The time (in hours) needed to repair a machine is an exponentially distributed random variable with parameter \lambda=\onehalf. What is

  1. The probability that a repair time exceeds 2 hours;
  2. The conditional probability that a repair takes at least 10 hours, given that its duration exceeds 9 hours?


  1. Let T be the number of hours required to repair the machine, with T \sim \Exponential{t; \lambda = \onehalf}. Then

    \Pof{T > 2}
&= \int_2^\infty \lambda e^{-\lambda x} \ud x \\
&= \Tevalat{e^{-\lambda x}}^2_\infty \\
&= e^{-2\lambda} \\
&= e^{-1}

  2. Now the probability we are interested in is \Pof{T > 10 \C T > 9}. However, by the memoryless property of the exponential, this is just \Pof{T > 1}. Following the same steps as above, we get \Pof{T > 1} = e^{-\onehalf}.
Problem 5.40:

If X is uniformly distributed over \Ioo{0,1}, find the density function f_Y(y) of Y = e^X. Plot this new density function. Hint: be careful about the limits, and verify your final p.d.f. integrates to 1.


We’re interested in the density f_Y(y), but we need to get there through the density function of X, f_X(x).

F_Y(y) &= P(Y \leq y)\\
&= P(e^X \leq y) \\
&= P(X \leq \log(y) ) = F_X(\log(y)) \\
\So f_Y(y) &= \der{y}F_Y(y) = \der{y} F_X(\log(y)) \\
&= f_X(\log(y)) \inv{y} \\
&=\fOO{\inv{y}}{0 \leq \log{y} \leq 1}{0}{\text{otherwise}} \\
&= \fOO{\inv{y}}{1 \leq y \leq e}{0}{\text{otherwise}}

where the second to last step follows since f_X(x) equals 1 if x \in \Icc{0,1} and 0 elsewhere. As a quick check, let’s make sure this indeed integrates to 1:

\int_{1}^e \inv{y}\ud y = \Tevalat{\log(y)}^e_1 = \log(e) - \log(1) = 1 - 0 = 1

So we’re good to go. Plotting this gives:
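The change of variables can also be checked by simulation (a Python sketch, illustrative only; the sample size and check point y = 2 are arbitrary): the empirical CDF of Y = e^X should match F_Y(y) = \log(y) on [1, e].

```python
import random
from math import exp, log

rng = random.Random(7)
ys = [exp(rng.random()) for _ in range(200_000)]   # Y = e^X, X ~ Uniform(0,1)

# empirical CDF at y = 2 should match F_Y(2) = log(2)
emp = sum(y <= 2 for y in ys) / len(ys)
print(emp, log(2))
```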

Theory Exercise 4.27:

If X is a geometric random variable, show analytically that

P(X = n + k \C X > n) = P(X = k)

Give a verbal argument, using the interpretation of a geometric random variable, as to why the preceding equation is true.


First, we prove this formally. The geometric distribution has a p.m.f. of p(x)=(1-p)^{x-1}p. Thus we proceed directly:

\Pof{X=n+k \C X>n}
&=\frac{\Pof{X>n\C X=n+k}\Pof{X=n+k}}{\Pof{X>n}}\\
&=\frac{1 \cdot \Pof{X=n+k}}{\Pof{X>n}}\\
&=\frac{p(n+k)}{\sum_{j=1}^\infty p(n+j)} \\
&=\frac{(1-p)^{n+k-1}p}{\sum_{j=1}^\infty (1-p)^{n + j -1}p} \\
&=\frac{(1-p)^n (1-p)^{k-1}p}{(1-p)^{n}\sum_{j=1}^\infty (1-p)^{j -1}p} \\
&=\frac{(1-p)^{k-1}p}{1} = \Pof{X = k}

where the sum in the denominator is just one since that is a sum over all the points of a geometric p.m.f., which must sum to 1.

Intuitively, we would expect a geometric random variable to possess a “memoryless” property, as it counts the number of independent trials until the first success. Since the trials are independent, the fact that you’ve reached a certain point without success has no bearing on future trials; thus the experiment is the same as when you started.

Theory Exercise 5.26:

If X is uniformly distributed over \Ioo{a,b}, what random variable, expressed as a linear function of X, is uniformly distributed over (0,1)?


Consider, first, that the probability density function of X is constant on the interval \Icc{a,b} and 0 elsewhere. We want a new random variable, Y, that is uniform on the interval \Icc{0,1} and 0 elsewhere. Thus we must effectively “slide” and “squish/stretch” the first interval, on which X takes values uniformly, to fit the form of the second.

The operation of shifting it is done by subtracting a, thus X - a is uniformly distributed on the interval \Icc{0, b-a}.

Squishing or stretching it is done by dividing by the original length so that the final end point is 1 instead of b-a. This means that

Y = \frac{X - a}{b - a}

has non-zero density exactly on the interval \Icc{0,1}. A quick check (not required for the hw) ensures that it is also uniform on this interval:

\Pof{Y \leq y}
&= \Pof{\frac{X - a}{b - a} \leq y} \\
&= \Pof{X \leq y(b - a) + a} \\
&= F_X(y(b - a) + a) \\
&= \fOOO
{\inv{b-a}\Tbr{\T{y(b - a) + a} - a}}{ \T{y(b - a) + a} \in \Icc{a,b}}
{0}{ \T{y(b - a) + a} < a}
{1}{ \T{y(b - a) + a} > b} \\
&= \fOOO
{y}{y\in \Icc{0,1}}
{0}{y < 0}
{1}{y > 1}

which is exactly the cumulative distribution function of a R.V. uniform on \Icc{0,1}.

One could also note that linear mappings should preserve the uniform distribution property, as we are simply shifting and scaling the input to the uniform p.d.f.; i.e. if f_X(x) is constant on an interval, then f_X(cx + d) should also be constant but on a (possibly) different interval.

07/16 - Friday

The normal distribution.

HW 10 + Solutions

Due Monday 07/19: Problems 5.26, 5.29.

Recommended Problems: Self Test Problems 5.18, 5.19; Theory 5.29.

Bonus problem: Theory Exercise 5.31.

Problem 5.26:

Two types of coins are produced at a factory: a fair coin and a biased one that comes up heads 55 percent of the time. We have one of these coins but do not know whether it is a fair coin or a biased one. In order to ascertain which type of coin we have, we shall perform the following statistical test: We shall toss the coin 1000 times. If the coin lands on heads 525 or more times, then we shall conclude that it is a biased coin, whereas, if it lands heads less than 525 times, then we shall conclude that it is the fair coin. If the coin is actually fair, what is the probability that we shall reach a false conclusion? What would it be if the coin were biased?



F &= \text{event that the test concludes the coin is fair}\\
B &= \text{event that the test concludes the coin is biased}\\
X_F, X_B &= \text{number of heads in the test, dependent on which coin it is}\\
X_F &\sim \Binomial{x; p_F = 0.5,\, n=1000} \\
X_B &\sim \Binomial{x; p_B = 0.55,\, n=1000}


\Pof{X_F \geq 525} = \Pof{X_F \geq 524.5}
&=P\left(\frac{X_F-np_F}{\sqrt{np_F\cdot (1-p_F)}}\geq
                \frac{524.5-np_F}{\sqrt{np_F\cdot \left(1-p_F\right)}}\right)\\[8pt]
&=P\left(\frac{X_F-500}{\sqrt{500\cdot (\frac{1}{2})}}\geq
                \frac{524.5-500}{\sqrt{500\cdot (\frac{1}{2})}}\right)\\[8pt]
&\simeq P(Z\geq 1.550) \quad\text{by DeMoivre-Laplace approximation}\\
&= 1 - \Phi(1.550) \approx .0606

where \Phi(z) is the cdf of a standard normal random variable.

Going the other way, the probability of incorrectly deciding it’s a fair coin if it’s actually biased is:

\Pof{X_B < 525} = \Pof{X_B \leq 524.5}
=&P\left(\frac{X_B-550}{\sqrt{550\cdot (.45)}}\leq
                \frac{524.5-550}{\sqrt{550\cdot (.45)}}\right)\\[8pt]
=&P(Z\leq -1.6210)\\
=&\Phi(-1.6210) \approx .0525
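Both normal-approximation tails can be computed directly with math.erf (a Python sketch, illustrative only; phi is my helper for the standard normal CDF):

```python
from math import erf, sqrt

def phi(z):                             # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

# fair coin wrongly declared biased
z_f = (524.5 - 500) / sqrt(1000 * 0.5 * 0.5)
p_fair_wrong = 1 - phi(z_f)
# biased coin wrongly declared fair
z_b = (524.5 - 550) / sqrt(1000 * 0.55 * 0.45)
p_biased_wrong = phi(z_b)
print(p_fair_wrong, p_biased_wrong)     # roughly .0606 and .0525
```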

Problem 5.29:

A model for the movement of a stock supposes that if the present price of the stock is s, then after one time period it will be either us with probability p or ds with probability 1-p. Assuming that successive movements are independent, approximate the probability that the stock’s price will be up at least 30 percent after the next 1000 time periods if u=1.012, d=0.990, and p=0.52. (Note: do not assume that d = 1/u.)


The tricky part of this problem is figuring out how many steps up or down need to be taken. Note that u \neq 1/d; surprisingly, the small difference between u and 1/d makes a big difference numerically.


x=&\text{number of times the stock goes up}; \quad x\geq 0\\
y=&\text{number of times the stock goes down};\quad y\geq 0\\

Note that x + y = 1000. Then the price after 1000 movements is just su^xd^y; each movement is multiplicative, so the operations commute. Thus it doesn’t matter which order we record them in, as all that matters is the total number up and the total number down.

We now need to find the minimum number of movements up required to put the final stock price above 1.3s, a 30% increase:

su^xd^{1000-x} \geq 1.3s
\iff x(\ln u - \ln d) \geq \ln 1.3 - 1000 \ln d
\iff x \geq \frac{\ln 1.3 - 1000 \ln d}{\ln u - \ln d} \approx 469.2

so at least 470 of the 1000 movements must be up. Since the number of up movements follows a p(x) = \Binomial{x; 1000, .52} distribution, we can reasonably approximate \sum_{x=470}^{1000} p(x) with a normal distribution using the DeMoivre-Laplace approximation. Let X be the number of movements up. Then:

P(X \geq 470) =&P(X \geq 469.5) \\
=&P\left(\frac{X-np}{\sqrt{np\cdot(1-p)}}
   \geq \frac{469.5-np}{\sqrt{np\cdot(1-p)}}\right)\\[8pt]
=&P\left(\frac{X-520}{\sqrt{520\cdot(.48)}} \geq \frac{469.5-520}{\sqrt{520\cdot(.48)}}\right)\\[8pt]
\simeq&P(Z \geq -3.1965)\\
=&1 - \Phi(-3.1965) \\
\approx& .9993

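Both the threshold of 470 up-moves and the normal approximation can be checked directly (a Python sketch, illustrative only; the exact Binomial tail is feasible here because math.comb handles big integers):

```python
from math import comb

u, d, p, n = 1.012, 0.990, 0.52, 1000

x_min = 0                          # smallest number of up-moves giving a 30% gain
while u ** x_min * d ** (n - x_min) < 1.3:
    x_min += 1
print(x_min)                       # 470, matching the x = 470 used in the solution

exact = sum(comb(n, x) * p ** x * (1 - p) ** (n - x)
            for x in range(x_min, n + 1))
print(exact)                       # exact Binomial tail, close to the normal answer
```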
07/19 - Monday

Review for final exam.

07/21 - Wednesday (Final)

Final Exam

The final will be the entire class period (a little over 2 hours long) and will cover the entire course with an emphasis on the material after the midterm. There will be 8-10 problems of a similar format to the midterm. Again, 50-75 percent will come directly from the recommended problems, and pretty much everything will directly relate to the homeworks or examples presented in class. Thus a good way of preparing is making sure you understand the recommended problems and homework.

Sections Covered.

While the material on the exam is pretty much bounded by what you’ve seen in the homework, recommended problems, or in class, here’s a reading guide to the sections of the book that are covered:

Chapter 1 (from midterm) :

Sections 1-6, except for the binomial and multinomial theorems for polynomial equations and related examples.

Chapter 2 (from midterm):

Sections 1-5. You won’t be responsible for the general inclusion/exclusion formula (proposition 4.4) except what’s on the reference sheet, the proofs of propositions 4.1-4.3, or examples 5j,5k,5m,5n, and 5o.

Chapter 3 (mostly from midterm):

Sections 1,2, 3, 4, and the basic idea of 5; that P(E\C F) is a probability measure. You won’t be responsible for examples 3b, 3g, 3i, 3m, 3o, 4j, and the subsequent material in section 4.

Chapter 4 (new, random variables):

All material except for sections 4.6.2, 4.7.1, 4.8.4 and examples 1e, 4b, 4c, 6f, 6h, 6i, 7d, 7f, 8c, 8e, 8f, 8j. I’ll give you the formulas for the various distributions along with their expectations and variances. You should know when to use what distribution.

Chapter 5 (new, continuous random variables):

All material except for sections 5.5.1 (hazard rates) and section 5.6. In section 5.7, you will not need theorem 5.7.1 for variable transformations, but you will need to do one problem similar to what I’ve presented in class and in the homework. Thus the examples motivating this theorem are quite useful.

Reference Sheet:

Download here. This reference sheet will be given out with the exam.

(If you’re interested, here is the LaTeX source; you’ll need latexopts.sty to compile it.)