STAT 134 – Instructor: Adam Lucas
Midterm 2
Friday, April 22, 2022
Print your name:
SID Number:
Exam Information and Instructions:
You will have 48 hours to take this exam. Open book/notes but no internet resources
outside of the Stat134.org website. You are allowed to use a calculator.
We will be using Gradescope to grade this exam. Write any work you want graded on the
front of each page, in the space below each question. Additionally, write your SID number
in the top right corner on every page.
If you are writing your answers on a separate sheet or on an iPad, please use a different
page (or side) for each question (answers to several parts of a question can be
on the same page). Then sort the pages according to the order of the questions.
Provide calculations and reasoning in every answer.
Unless stated otherwise, you may leave answers as unsimplified numerical and algebraic
expressions, and in terms of the Normal c.d.f. Φ. Finite sums are fine, but simplify any
infinite sums.
I certify that all materials in the enclosed exam are my own original work and I have not violated
the UC Berkeley honor code.
Sign your name:
GOOD LUCK!
Honor Code (1 pt) After reading the exam information and instructions on the first page,
please circle all choices that violate the rules of this test or the UC Berkeley Honor Code.
(a) post or read on Chegg or an online forum
(b) use a calculator
(c) leave your answer unsimplified
(d) use your notes or textbook
(e) communicate with non-staff about the test before we tell you that everyone
has taken the test
Problem 1 (10 points)
Hoeffding’s inequality provides an upper bound on the probability that the sum of bounded
independent random variables deviates from its expected value by more than a certain amount.
The inequality has further been generalized for unbounded random variables and finds many
applications in modern machine learning theory. In this exercise, we will prove a one-sided
Hoeffding’s inequality. The two-sided version can be argued by symmetry, which we will omit
for this problem.
Suppose a random variable $X$ has moment generating function $M_X(t) = E e^{tX}$. For simplicity
assume that $EX = 0$. Assume that there exists $\sigma > 0$ such that
$$M_X(t) \le \exp\left\{\frac{\sigma^2 t^2}{2}\right\} \quad \text{for all } t \in \mathbb{R}.$$
(a) (2 points) Show that for any $t > 0$,
$$P(X > \delta) = P(e^{tX} > e^{\delta t}) \le \exp(\sigma^2 t^2/2 - \delta t). \tag{1}$$
Hint: Use Markov's inequality: for any non-negative random variable $Y$, $P(Y > a) \le \frac{EY}{a}$.
(b) (2 points) Note that since inequality (1) holds for all $t > 0$,
$$P(X > \delta) \le \min_{t > 0} \exp(\sigma^2 t^2/2 - \delta t).$$
Using this, show that
$$P(X > \delta) \le \exp\left(-\frac{\delta^2}{2\sigma^2}\right).$$
(Hint: Find the $t > 0$ that minimizes $\exp(\sigma^2 t^2/2 - \delta t)$.)
(c) (2 points) Now suppose that $X_1, \ldots, X_n$ are independent and satisfy
$$M_{X_i}(t) \le \exp\left\{\frac{\sigma_i^2 t^2}{2}\right\} \quad \text{for all } i \in \{1, \ldots, n\} \text{ and all } t \in \mathbb{R}.$$
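Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$. Show that
$$M_{\bar{X}_n}(t) \le \exp\left(\frac{t^2 \sum_{i=1}^n \sigma_i^2}{2n^2}\right) = \exp\left(\frac{t^2 \left(\frac{1}{n^2}\sum_{i=1}^n \sigma_i^2\right)}{2}\right).$$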
(d) (2 points) Use parts (b) and (c) to show that
$$P(\bar{X}_n > \delta) \le \exp\left\{-\frac{n^2 \delta^2}{2\sum_{i=1}^n \sigma_i^2}\right\}. \tag{2}$$
(e) (2 points) When $\sigma_i = \sigma$ for all $i = 1, \ldots, n$, conclude from part (d) that
$$P(\bar{X}_n > \delta) \le \exp\left\{-\frac{n\delta^2}{2\sigma^2}\right\}.$$
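As a sanity check on the bound in part (e), and not a substitute for the proof, the following sketch compares it with an exact tail probability. It assumes the $X_i$ are independent fair $\pm 1$ coin flips, which have mean 0 and moment generating function $\cosh(t) \le \exp(t^2/2)$, so the hypothesis holds with $\sigma = 1$. The function names and the chosen values of $n$ and $\delta$ are illustrative only.

```python
from math import comb, exp


def rademacher_mean_tail(n: int, delta: float) -> float:
    """Exact P(sample mean of n independent fair +/-1 flips > delta)."""
    # With k plus-ones among n flips, the sample mean equals (2k - n) / n.
    favorable = sum(comb(n, k) for k in range(n + 1) if (2 * k - n) / n > delta)
    return favorable / 2 ** n


def hoeffding_bound(n: int, delta: float, sigma: float = 1.0) -> float:
    """Right-hand side of part (e): exp(-n * delta^2 / (2 * sigma^2))."""
    return exp(-n * delta ** 2 / (2 * sigma ** 2))


# Illustrative grid of sample sizes and thresholds (an assumption, not from the exam).
for n in (10, 50, 200):
    for delta in (0.1, 0.3, 0.5):
        tail = rademacher_mean_tail(n, delta)
        bound = hoeffding_bound(n, delta)
        print(f"n={n:3d} delta={delta:.1f} exact tail={tail:.3e} bound={bound:.3e}")
```

In every row the exact tail should sit below the bound, and the gap shows how conservative the inequality can be for this particular distribution.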
Problem 2 (10 pts)
The variance $\sigma^2 = E[(X - \mu)^2]$ (where $\mu = EX$) tells you how far the distribution is spread out
from its mean. Similarly, $E[(X - \mu)^4/\sigma^4]$, which is called the kurtosis, tells you how heavy the tail
of your distribution is. We say a distribution has a heavy tail if its density is relatively large at
points that are far away from the mean.
For example, the kurtosis of an exponential random variable and of a normal random variable are 9 and 3,
respectively. In fact, the exponential distribution has a heavier tail than the normal distribution, as
you can see in the following figure.
In this question, you will show that the kurtosis of a normal random variable is 3, using some facts
you learned in this class. You may find pages 358–360 of the textbook helpful. Suppose
$$X, Y \overset{\text{iid}}{\sim} N(\mu, \sigma^2).$$
You may use the results from previous parts even if you couldn't solve them.
(a1) (1pt) What is the density of
$$R = \sqrt{\left(\frac{X - \mu}{\sigma}\right)^2 + \left(\frac{Y - \mu}{\sigma}\right)^2},$$
and what is the name of this distribution? Proof is not required.
(a2) (1pt) What is the density of $S = R^2$? Provide the name and the parameter of this
distribution. Proof is not required.
(b) (4pts) Show that $E[S^2] = 8$.
(c) (4pts) Show that
$$E\left[\left(\frac{X - \mu}{\sigma}\right)^4\right] = 3.$$
(Hint: Expand $S^2$ and use the fact that $X, Y$ are independent.)
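The two target values in parts (b) and (c) can be checked numerically before attempting the proofs. The sketch below is a Monte Carlo estimate using only the standard library; the values of `mu`, `sigma`, the sample size, and the seed are arbitrary assumptions chosen for illustration, and the function name is hypothetical.

```python
import random


def simulate_moments(mu: float, sigma: float, n: int, seed: int = 0):
    """Monte Carlo estimates of E[S^2] and E[((X - mu) / sigma)^4]."""
    rng = random.Random(seed)
    sum_s2 = 0.0
    sum_z4 = 0.0
    for _ in range(n):
        zx = (rng.gauss(mu, sigma) - mu) / sigma  # standardized X
        zy = (rng.gauss(mu, sigma) - mu) / sigma  # standardized Y
        s = zx * zx + zy * zy                     # S = R^2
        sum_s2 += s * s
        sum_z4 += zx ** 4
    return sum_s2 / n, sum_z4 / n


# mu and sigma are arbitrary; the standardized quantities do not depend on them.
es2, kurt = simulate_moments(mu=1.5, sigma=2.0, n=200_000)
print(f"estimate of E[S^2]:   {es2:.3f}")
print(f"estimate of kurtosis: {kurt:.3f}")
```

The estimates carry sampling error, so they should land near the stated values rather than match them exactly.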
Problem 3 (10 pts)
Recall that the gamma function is defined as
$$\Gamma(r) = \int_0^\infty x^{r-1} e^{-x}\,dx$$
for $r > 0$. You can use without proof that $\Gamma(r + 1) = r\Gamma(r)$ for $r > 0$ in this problem.
Let $\nu$ be a positive real number and $X$ be a random variable with probability density
$$f_X(x) = \begin{cases} \dfrac{2^{-\nu/2}}{\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2} & \text{if } x > 0, \\ 0 & \text{otherwise.} \end{cases}$$
(a) (5pts) Let Y = 1/X. Specify the distribution of Y including parameters.
(b) (5pts) Find E[Y ].
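A closed-form answer to part (b) can be compared against a numerical integral. The sketch below checks the recursion $\Gamma(r + 1) = r\Gamma(r)$, confirms that $f_X$ integrates to approximately 1, and approximates $E[1/X]$ with a midpoint rule. The choice $\nu = 5$, the truncation of the integral at $x = 80$, and the helper names are assumptions made for illustration only.

```python
from math import exp, gamma


def f_X(x: float, nu: float) -> float:
    """Density of X from Problem 3, evaluated at x."""
    if x <= 0:
        return 0.0
    return 2 ** (-nu / 2) / gamma(nu / 2) * x ** (nu / 2 - 1) * exp(-x / 2)


def midpoint_integral(f, a: float, b: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


nu = 5.0       # illustrative value, not specified by the exam
upper = 80.0   # truncation point; the density beyond it is assumed negligible
r = nu / 2

print("Gamma(r + 1) =", gamma(r + 1), " r * Gamma(r) =", r * gamma(r))

total = midpoint_integral(lambda x: f_X(x, nu), 0.0, upper)
mean_inverse = midpoint_integral(lambda x: f_X(x, nu) / x, 0.0, upper)
print(f"integral of f_X ~ {total:.6f}")
print(f"E[1/X] for nu = {nu} ~ {mean_inverse:.6f}")
```

Rerunning with a few other values of `nu` gives a quick way to test a candidate formula for $E[Y]$ as a function of $\nu$.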
Problem 4 (9 pts)
A chord of a circle is a line segment connecting two points on the circle. To pick a point uniformly
in the unit circle means that you are randomly throwing a dart at the unit disk. Pick a point
$Q$ uniformly at random in the unit circle, at distance $R$ from the origin with $0 \le R \le 1$, and pick a chord
of the circle with that point as its midpoint. Let $L$ be the length of this chord.
In the following steps, you will find E[L].
1. (3 pts) Notice that if we rotate the circle, we can assume that the chord is always vertical
with non-negative horizontal coordinate (i.e. assume $Q = (R, 0)$ for $R \ge 0$).
Show that the CDF of $R$ is $F_R(r) = r^2$ for $0 \le r \le 1$. (Hint: What is the probability that the radius $R$
of the point $Q$ is less than $r$?)
2. (3 pts) Find the density of R.
3. (3 pts) Find E[L].
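The setup in the steps above can be simulated to check an analytic answer. The sketch below draws points uniformly from the unit disk by rejection sampling, compares the empirical CDF of $R$ with $r^2$, and averages the chord length. It relies on the Pythagorean fact that a chord whose midpoint is at distance $R$ from the center has half-length $\sqrt{1 - R^2}$. The sample size, seed, and function name are illustrative assumptions.

```python
import random
from math import sqrt


def sample_radius(rng: random.Random) -> float:
    """Distance from the origin of a point drawn uniformly from the unit disk."""
    # Rejection sampling: draw from the enclosing square until the point lands in the disk.
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        r2 = x * x + y * y
        if r2 <= 1:
            return sqrt(r2)


rng = random.Random(0)
radii = [sample_radius(rng) for _ in range(100_000)]

# Step 1: the empirical CDF of R should be close to r^2.
for r in (0.25, 0.5, 0.75):
    empirical = sum(radius <= r for radius in radii) / len(radii)
    print(f"r={r:.2f} empirical CDF={empirical:.4f} r^2={r * r:.4f}")

# Step 3: chord length with midpoint at distance R from the center is 2 * sqrt(1 - R^2).
lengths = [2 * sqrt(max(0.0, 1 - radius ** 2)) for radius in radii]
mean_length = sum(lengths) / len(lengths)
print(f"Monte Carlo estimate of E[L]: {mean_length:.4f}")
```

The printed estimate is subject to sampling error and is meant only as a cross-check on the integral computed in step 3.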
Problem 5 (10 pts)
Let $X = U_{(4)}$ and $Y = U_{(5)}$ be the 4th and 5th order statistics of 8 independent standard
uniform random variables.
1. (5 pts) Derive the joint density $f_{X,Y}(x, y)$.
2. (5 pts) Evaluate $P(Y < \frac{1}{3} + X)$. Hint: it might be easier to find the distribution of $Y - X$.
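A simulation gives a quick cross-check for the probability in part 2. The sketch below sorts 8 independent Uniform(0, 1) draws, takes the 4th and 5th smallest as $X$ and $Y$, and estimates $P(Y < \frac{1}{3} + X)$ by the fraction of trials in which the event occurs. The number of trials, the seed, and the function name are assumptions for illustration.

```python
import random


def sample_x_y(rng: random.Random):
    """Return (U_(4), U_(5)) from 8 independent Uniform(0, 1) draws."""
    ordered = sorted(rng.random() for _ in range(8))
    return ordered[3], ordered[4]  # zero-based indices 3 and 4 are the 4th and 5th smallest


rng = random.Random(0)
trials = 100_000
hits = 0
for _ in range(trials):
    x, y = sample_x_y(rng)
    if y < 1 / 3 + x:
        hits += 1

estimate = hits / trials
print(f"Monte Carlo estimate of P(Y < 1/3 + X): {estimate:.4f}")
```

The estimate has sampling error on the order of a few thousandths at this number of trials, so it can confirm a closed-form answer only approximately.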
Scratch Paper