
Probability IV, Michaelmas 2022
About these notes
Lecture notes for Michaelmas 2022. Lectures will follow this content quite closely, but there may be things
said in lectures that are not in these notes and vice versa. The exercises are an integral part of the notes
and cover examinable material. It is highly recommended that you form your own notes as part of actively
engaging with the material (mathematics, as every course introduction reminds you, is not a spectator
sport).
The sections titled ‘Going further’ concern more advanced material that will only appear on exams as
‘unseen’ material.
These lecture notes are based on previous versions by Ostap Hryniv and Andrew Wade.
Notation
This section briefly reviews some notation that is frequently used in the notes (and course).
Sets
A set S is a collection of elements. If x is an element of S, we write x ∈ S. The set with no elements is the
empty set ∅. A set S is a subset of a set T, written S ⊆ T (or T ⊇ S), if x ∈ S implies x ∈ T.
Given two sets S and T, we can construct new sets via:
• Intersection: S ∩ T, the set of elements in both S and T.
• Union: S ∪ T, the set of elements in S or in T (or in both).
• Difference: S \ T, the set of elements x where x ∈ S but x ∉ T.
• Complement of S in T: S^c ≡ T \ S, the set of elements x where x ∈ T but x ∉ S.
• Symmetric difference: S △ T = (S \ T) ∪ (T \ S), the set of elements in exactly one of S and T.
A set is countable if it is finite or countably infinite, the latter meaning that it is in bijection with the set
N := {1, 2, 3, . . .} of natural numbers.
Given a collection of sets (Aα)α∈I, compact notation for their union or intersection is ∪_{α∈I} Aα or
∩_{α∈I} Aα. Particular examples are ∪_{i=1}^{n} Ai, ∩_{i=1}^{n} Ai (finitely many sets) and ∪_{i=1}^{∞} Ai, ∩_{i=1}^{∞} Ai (a countable
infinity of sets). If the meaning is clear from the context, we may just write ∪n An, ∩n An for countable
unions or intersections.
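The operations above can be checked mechanically; here is a minimal sketch using Python's built-in set type (the particular sets S and T are arbitrary examples, not from the notes):

```python
# A minimal sketch of the set operations above, checked with Python's
# built-in set type; the particular sets S and T are arbitrary examples.
S = {1, 2, 3, 4}
T = {3, 4, 5, 6}

assert S & T == {3, 4}              # intersection S ∩ T
assert S | T == {1, 2, 3, 4, 5, 6}  # union S ∪ T
assert S - T == {1, 2}              # difference S \ T
assert T - S == {5, 6}              # complement of S in T
assert S ^ T == {1, 2, 5, 6}        # symmetric difference S △ T

# The symmetric-difference identity from the notes: S △ T = (S \ T) ∪ (T \ S).
assert S ^ T == (S - T) | (T - S)
print("all set identities hold")
```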
Limits
Recall that if x1, x2, x3, . . . is a sequence of real numbers, then

lim inf_{n→∞} xn = lim_{n→∞} inf_{m≥n} xm = sup_{n} inf_{m≥n} xm ,
lim sup_{n→∞} xn = lim_{n→∞} sup_{m≥n} xm = inf_{n} sup_{m≥n} xm ;

here −∞ ≤ lim inf_{n→∞} xn ≤ lim sup_{n→∞} xn ≤ +∞.
If (and only if) lim inf_{n→∞} xn = lim sup_{n→∞} xn then their common value in [−∞, +∞] is lim_{n→∞} xn.
If (xn) is a non-decreasing sequence of real numbers, i.e., xn+1 ≥ xn for all n, and x = limn→∞ xn
exists, we write xn ↑ x to indicate that xn converges to x monotonically (from below). Similarly, if xn
is non-increasing with limit x, we write xn ↓ x to indicate that xn converges to x monotonically (from
above). As an example of combining these notational concepts, infm≥n xm is non-decreasing in n, and so
infm≥n xm ↑ lim infn→∞ xn.
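As a hypothetical numerical illustration (not part of the notes), lim inf and lim sup can be approximated by taking minima and maxima over tails of a long finite prefix; the example sequence xn = (−1)^n (1 + 1/n), with lim inf = −1 and lim sup = +1, is chosen here purely for illustration:

```python
# Hypothetical numerical illustration (not from the notes): approximate
# lim inf and lim sup of x_n = (-1)^n * (1 + 1/n) by taking minima and
# maxima over tails of a long finite prefix.
N = 10_000
x = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]

# inf over the tail {m >= n} is non-decreasing in n; sup is non-increasing.
tail_infs = [min(x[n:]) for n in range(0, N, 1000)]
tail_sups = [max(x[n:]) for n in range(0, N, 1000)]

print(tail_infs[-1], tail_sups[-1])  # close to lim inf = -1 and lim sup = +1
```

The monotonicity of the tail infima is exactly the statement inf_{m≥n} xm ↑ lim inf_{n→∞} xn from above.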
Miscellaneous
As usual, 1A denotes the indicator random variable of the event A, i.e., the function 1A(ω) taking value
1 if ω ∈ A and value 0 otherwise.
We will occasionally use the notation a ∧ b = min{a, b} and a ∨ b = max{a, b} for real numbers a and
b. Finally, we use ⌊x⌋ for the floor function of a real number x, i.e., the largest integer that is at most x.
1 Introductory examples: from finite to infinite spaces
1.1 Coin tossing I
A coin is tossed n times. Let ξk be equal to 1 if the kth toss comes down ‘heads’ and equal to −1 if the
kth toss comes down ‘tails’. We assume that the coin is fair, so P(ξk = 1) = P(ξk = −1) = 1/2, and that
coin tosses are independent.
The sample space of this experiment can be described by the sequences of ±1s: each sequence
(x1, . . . , xn), where xk ∈ {−1, +1}, is equally likely; there are 2^n such sequences and each
has probability 2^{−n}. This is a finite sample space and we know all about these from earlier probability
courses.
An alternative description of the experiment is given by the sequence not of the tosses themselves but of
the partial sums ξ1 + · · · + ξk. That is, define S0 := 0 and, for k ∈ {1, . . . , n}, Sk = ∑_{ℓ=1}^{k} ξℓ. Then each
realization (x1, . . . , xn) of (ξ1, . . . , ξn) corresponds to a realization (s0, s1, . . . , sn) of (S0, S1, . . . , Sn). The
random process S0, . . . , Sn is called an n-step simple random walk, and any sequence (s0, s1, . . . , sn) of its
possible values is called an n-step trajectory; it has the properties s0 = 0 and |sk+1 − sk| = 1, and there are 2^n
such trajectories. So an equivalent sample space for the experiment is the space of all n-step trajectories;
again every trajectory has probability 2^{−n}.
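The construction above is easy to simulate; a short sketch (fair coin assumed, as in the text) that builds the partial-sum trajectory from the tosses:

```python
import random

# Simulation sketch: build an n-step simple random walk from fair coin
# tosses, following the construction in the text.
random.seed(1)
n = 10
xi = [random.choice([-1, 1]) for _ in range(n)]  # independent fair tosses

S = [0]                                          # S_0 = 0
for step in xi:
    S.append(S[-1] + step)                       # partial sums S_k

# An n-step trajectory: starts at 0 and changes by exactly 1 each step.
assert S[0] == 0
assert all(abs(S[k + 1] - S[k]) == 1 for k in range(n))
print(S)
```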
For trajectories of n-step random walks we will ask questions such as
• What is the distribution of the fraction of the time that the walk is positive?
• What is the distribution of the position of the maximum of the walk?
• What is the probability that the walk never returns to the origin?
These questions have interesting and sometimes surprising answers.
1.2 Coin tossing II
Suppose now that the coin is tossed an infinite number of times, giving a sequence ξ1, ξ2, . . . indexed by
the natural numbers N. This corresponds to an infinite random walk trajectory S0, S1, . . . .
What is the sample space for this process? It must have the property that if we stop at any finite n,
we recover the properties already discussed. The idea is to represent the infinite sequence of coin tosses
(x1, x2, . . .) as a point in [0, 1] via dyadic expansion:
u = ∑_{k=1}^{∞} 2^{−k} · (1 + xk)/2 .

For example, the sequence (1, 1, −1, −1, . . .) corresponds to 3/4. Now take U uniformly distributed on [0, 1],
and define (ξ1, ξ2, . . .) as the dyadic expansion of U. Then the sequence ξk has all the right properties: for
example,
P(ξ1 = 1) = P(U ∈ [1/2, 1]) = 1/2 ;
P(ξ2 = 1) = P(U ∈ [1/4, 1/2] ∪ [3/4, 1]) = 1/2 ;

and

P(ξ1 = 1, ξ2 = 1) = P(U ∈ [3/4, 1]) = 1/4 .
So our sample space for the infinite coin-tossing experiment can be [0, 1] with the uniform distribution.
To formalize this we need measure theory, but you should be happy that this is possible
because you have all seen the uniform distribution before! Note there is a minor complication here in that
the dyadic expansion may not be unique; (1, −1, −1, . . .) and (−1, 1, 1, . . .) both correspond to 1/2. But there
are relatively few such sequences (only countably many) and they do not affect the probabilities.
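The probability computations above can be checked empirically; the digit extraction in the sketch below is illustrative (it reads off only the first two dyadic digits of U and ignores the countably many non-unique expansions, which occur with probability zero):

```python
import random

# Empirical check of the computations above. The digit extraction is an
# illustrative sketch: it reads off the first two dyadic digits of U and
# ignores the countably many non-unique expansions (probability zero).
random.seed(0)

def first_two_tosses(u):
    b1 = 1 if u >= 0.5 else -1
    u = 2 * u - 1 if b1 == 1 else 2 * u   # shift to the next dyadic digit
    b2 = 1 if u >= 0.5 else -1
    return b1, b2

trials = 100_000
count11 = sum(first_two_tosses(random.random()) == (1, 1) for _ in range(trials))
print(count11 / trials)   # should be near P(xi_1 = 1, xi_2 = 1) = 1/4
```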
We are not often concerned directly with infinite random walks in this course, but we will ask questions
such as
• What is the probability that the walk eventually returns to the origin?
• What is the expected time taken for the walk to return to the origin?
1.3 Some warm-up exercises
Exercise 1.1. The following task gives you an opportunity to test your intuition. Use your imagination to
“generate” a sequence of 200 random bits (i.e., 0s and 1s). Split your sequence into individual runs (a run
is a maximal subsequence of consecutive identical bits) and construct the histogram of the run lengths.
Compare your results to those generated by a computer, using, e.g., the R script available from DUO.
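For comparison, here is a Python sketch of the computer-generated side of this exercise, offered in place of the R script mentioned (which is not reproduced here):

```python
import random
from collections import Counter

# Python sketch of the computer-generated side of Exercise 1.1, offered
# in place of the R script mentioned in the notes.
random.seed(2022)
bits = [random.randint(0, 1) for _ in range(200)]

runs = []                 # lengths of maximal runs of identical bits
length = 1
for prev, cur in zip(bits, bits[1:]):
    if cur == prev:
        length += 1
    else:
        runs.append(length)
        length = 1
runs.append(length)

# For truly random bits, a run has length k with probability about 2^(-k).
print(Counter(runs))
```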
Exercise 1.2. Let Πn denote the set of all n-step paths started at the origin, i.e., the set of s =
(s0, s1, . . . , sn) with s0 = 0 and |sk − sk−1| = 1 for all 1 ≤ k ≤ n. For s = (s0, s1, . . . , sn) ∈ Πn,
denote

u(s) = ∑_{k=1}^{n} 1{sk − sk−1 = +1} and d(s) = ∑_{k=1}^{n} 1{sk − sk−1 = −1} ,

the number of ‘up’ and ‘down’ steps in s, respectively.
(i) Let p ∈ [0, 1]. Assume that S = (S0, S1, . . . , Sn) is a random element of Πn with distribution given
for s = (s0, s1, . . . , sn) ∈ Πn by

P(S = s) = P(S0 = s0, S1 = s1, . . . , Sn = sn) = p^{u(s)} (1 − p)^{d(s)} .

Define for k ∈ {1, . . . , n}, ξk := Sk − Sk−1. Show that ξ1, . . . , ξn are independent random variables
with

P(ξi = +1) = p and P(ξi = −1) = 1 − p .
(ii) Let p ∈ [0, 1]. Let ξ1, . . . , ξn be independent random variables with

P(ξi = +1) = p and P(ξi = −1) = 1 − p .

Define S0 := 0, and, for 1 ≤ k ≤ n,

Sk = ξ1 + · · · + ξk .

Show that for any s = (s0, s1, . . . , sn) ∈ Πn we have

P(S0 = s0, S1 = s1, . . . , Sn = sn) = p^{u(s)} (1 − p)^{d(s)} .
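As a sanity check on the distribution in Exercise 1.2, the weights p^{u(s)} (1 − p)^{d(s)} should sum to 1 over Πn; a brute-force sketch for small n (the values n = 6 and p = 0.3 are arbitrary):

```python
from itertools import product

# Sanity check for the distribution in Exercise 1.2: the weights
# p^u(s) * (1 - p)^d(s) should sum to 1 over all of Pi_n. Brute force
# for small n; the values n = 6, p = 0.3 are arbitrary.
n, p = 6, 0.3
total = 0.0
for steps in product([-1, 1], repeat=n):   # step sequences <-> paths in Pi_n
    u = steps.count(+1)                    # number of 'up' steps
    d = steps.count(-1)                    # number of 'down' steps
    total += p ** u * (1 - p) ** d

print(total)   # 1.0 up to floating-point rounding
```

The check works because summing over all 2^n step sequences gives (p + (1 − p))^n = 1 by the binomial theorem.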
Exercise 1.3. Using Definition 2.3, show that the simple random walk is temporally and spatially homogeneous, i.e., show that for any n, x and y,
P(Sn = x | S0 = 0) = P(Sn = x + y | S0 = y)
and for any n, m, x and y,
P(Sn = y | S0 = x) = P(Sn+m = y | Sm = x).
2 Trajectories of random walks
Goals of this section:
1. Use path-counting and associated arguments such as the reflection principle and time-reversal to
investigate properties of finite duration simple random walks.
2. Understand the origin of the renewal relation involving visits and first returns to the origin, and its
implications for the first return time distribution.
3. Investigate phenomena associated with long leads, first maxima, and first-passage probabilities.
Alternative presentations of the material in this section can be found in [GS01, Sections 3.9–3.10]
or [Fel68, Section III].
2.1 The Ballot problem
We start by considering the following ballot problem:
Example 2.1. In a ballot contested by two candidates, votes are counted one by one. It is found that
candidate A scored a votes and candidate B scored b votes, with a > b. Suppose that every possible order in
which the a + b votes are counted is equally likely. What is the probability that throughout the counting
process there are more votes for A than for B?
It will be shown below (see Theorem 2.8) that the answer to the ballot problem equals (a − b)/(a + b).
We will work towards a proof of this result, which is essentially combinatorial: it boils down to counting.
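Before the proof, a Monte Carlo sketch can make the answer plausible; the values a = 6, b = 4 below are arbitrary illustrative choices:

```python
import random

# Monte Carlo sketch of the ballot problem for illustrative values a = 6,
# b = 4; the exact answer (a - b)/(a + b) is what Theorem 2.8 asserts.
random.seed(7)
a, b = 6, 4
trials = 200_000

def a_leads_throughout():
    votes = [+1] * a + [-1] * b
    random.shuffle(votes)          # a uniformly random counting order
    lead = 0
    for v in votes:
        lead += v
        if lead <= 0:              # A not strictly ahead at some point
            return False
    return True

estimate = sum(a_leads_throughout() for _ in range(trials)) / trials
print(estimate, (a - b) / (a + b))   # both should be near 0.2
```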
Let n = a + b denote the total number of votes in the ballot. The counting process can be described
by a sequence (ξ1, ξ2, . . . , ξn) where

ξk = +1 if the kth vote is for A ,
ξk = −1 if the kth vote is for B .

Then if we define S0 := 0 and, for 1 ≤ k ≤ n,

Sk := ∑_{i=1}^{k} ξi ,
the sequence (S0, S1, . . . , Sn) describes the evolution of the counting process, with
Sk = the lead of A over B after k ballots have been counted .
It is very instructive to plot Sk against k. Notice that every possible trajectory arising from the counting
process connects the origin (0, S0) = (0, 0) and the final point (n, Sn) = (a + b, a ? b). By assumption,
every such trajectory is equally likely.
Definition 2.2. A sequence of integers (s0, s1, s2, . . . , sn) is an n-step path from x to y if s0 = x, sn = y,
and |sk − sk−1| = 1 for all 1 ≤ k ≤ n.
In the ballot problem, the counting sequence (S0, S1, . . . , Sn) is an n-step path (n = a + b) from 0 to
a − b > 0, chosen uniformly at random from all such paths. The event that we are interested in for the
ballot problem is that Sk > 0 for all 1 ≤ k ≤ n.
Let Nn(x, y) denote the total number of n-step paths from x to y. Call sk − sk−1 the kth step or
jump. Since s0 = x and sn = y, and clearly

sn − s0 = ∑_{k=1}^{n} (sk − sk−1) = “number of +1 steps” − “number of −1 steps” ,

we have that every trajectory that contributes to Nn(x, y) must be such that of its n jumps, there are
exactly (n + y − x)/2 jumps ‘up’ and (n + x − y)/2 jumps ‘down’. Each trajectory is specified by choosing which of its
n steps are ‘up’ and which are ‘down’. Consequently,
Nn(x, y) = C(n, (n + y − x)/2) ,

the binomial coefficient ‘n choose (n + y − x)/2’.
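The path-counting formula can be cross-checked by brute-force enumeration for small n; a sketch (the values n = 8, x = 0, y = 2 are arbitrary examples):

```python
from itertools import product
from math import comb

# Brute-force cross-check of the path-counting formula: enumerate all
# n-step paths from x to y and compare with binomial(n, (n + y - x)/2).
# The values n = 8, x = 0, y = 2 are arbitrary examples.
def N(n, x, y):
    count = 0
    for steps in product([-1, 1], repeat=n):
        if x + sum(steps) == y:    # path ends at y
            count += 1
    return count

n, x, y = 8, 0, 2
assert N(n, x, y) == comb(n, (n + y - x) // 2)
print(N(n, x, y))
```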