Assignment 2
DSC 212: Probability and Statistics for Data Science
Due date: 5:00 pm PST, 13th March, 2023

1. Consider Bernoulli($p$) observations 0 0 1 0 1 0 1 1. Plot the posterior distribution for $p$ under each of the following prior distributions: Beta(1, 1), Beta(10, 10), Beta(1, 10). Mark the inflection points in the plots. Note: Hand-drawn plots are sufficient.

2. Let $X_1, X_2, \ldots, X_n \overset{\text{i.i.d.}}{\sim} \mathrm{Uniform}(0, \theta)$. Calculate the posterior distribution corresponding to the prior density $f(\theta) \propto 1/\theta$.

3. In this problem, we derive a general approach, called iteratively reweighted least squares, for obtaining the empirical risk minimization solution with a smooth loss and an unconstrained linear function class. Consider minimizing a function $L : \mathbb{R}^d \to \mathbb{R}$, and assume that $L$ is twice continuously differentiable. The Newton iteration for obtaining the minimizer of $L$ is

$$\theta_{t+1} = \theta_t - \left[\nabla^2 L(\theta_t)\right]^{-1} \nabla L(\theta_t) \tag{1}$$

where $\nabla^2 L$ and $\nabla L$ are the Hessian and gradient of $L$, respectively.

(a) Show that iteration (1) is equivalent to minimizing the second-order Taylor expansion of $L$ around $\theta_t$.

(b) Let $L(\theta) = \sum_{i=1}^{n} \ell(x_i^\top \theta, y_i)$, where $\ell : \mathbb{R} \times \mathcal{Y} \to \mathbb{R}$ is smooth in its first argument. Argue that

$$\nabla L(\theta) = \sum_{i=1}^{n} \alpha_i(\theta)\, x_i, \qquad \nabla^2 L(\theta) = \sum_{i=1}^{n} w_i(\theta)\, x_i x_i^\top \tag{2}$$

for weights $\alpha_i(\theta)$ and $w_i(\theta)$ that you specify.

(c) Let $X \in \mathbb{R}^{n \times d}$ be the matrix whose $i$th row is $x_i^\top$. Let $\alpha(\theta) = (\alpha_1(\theta), \ldots, \alpha_n(\theta))$ and $W(\theta) = \mathrm{diag}(w_1(\theta), \ldots, w_n(\theta))$. Show that iteration (1) is equivalent to solving

$$\theta_{t+1} = \operatorname*{argmin}_{\theta \in \mathbb{R}^d} \left\| W^{1/2}(\theta_t)\bigl(X\theta - z(\theta_t)\bigr) \right\|^2 \tag{3}$$

for $z(\theta_t) \in \mathbb{R}^n$ which you specify. Thus the Newton iteration in this context is equivalent to solving a reweighted least-squares problem at each iteration.
Hint: $\nabla L(\theta) = X^\top \alpha(\theta)$ and $\nabla^2 L(\theta) = X^\top W(\theta) X$.

(d) Consider the case where $\ell(t, y) = -yt + \log(1 + e^{t})$ is the logistic loss. Show that in this case $w_i(\theta) = \sigma(\theta^\top x_i)\bigl(1 - \sigma(\theta^\top x_i)\bigr)$, and find an expression for $z_i(\theta)$.

4. (Bayesian linear model) Consider the fixed-design linear model with $X = [x_1, x_2, \ldots, x_n]^\top$ and observations $y_i = x_i^\top \beta + \sigma \varepsilon_i$, where the $\varepsilon_i$ are i.i.d. standard normal noise.
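The likelihood and prior are given below.

$$f_{Y \mid \beta}(y \mid w) \propto \exp\left(-\frac{1}{2\sigma^2} \|y - Xw\|^2\right) \tag{4}$$

$$f_{\beta}(w) \propto \exp\left(-\|w\|_{\Gamma}^2\right) \tag{5}$$

Here $\Gamma$ is a positive definite precision matrix and $\|w\|_{\Gamma}^2 := w^\top \Gamma w$. Find the posterior distribution $P_{\beta \mid Y}$.

5. Let $X \sim \mathcal{N}(\mu, \Sigma)$, where $\mu \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ is a positive definite matrix. Find the distribution of the vector $AX \in \mathbb{R}^p$ for a fixed matrix $A \in \mathbb{R}^{p \times d}$.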
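
As a numerical illustration of the Newton iteration (1) in Problem 3, here is a minimal sketch for the logistic loss of part (d). It is not a reference solution. It assumes $d = 2$ (an intercept plus one feature) so that the Hessian can be inverted by hand, and it uses a small made-up data set. The names `US`, `YS`, `XS`, `gradient_and_hessian`, and `newton_step` are hypothetical, chosen only for this example. The gradient coefficient `p - y` comes from differentiating $\ell(t, y) = -yt + \log(1 + e^{t})$ in $t$, and the Hessian weight `p * (1 - p)` is the $w_i(\theta)$ stated in part (d).

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def logistic_loss(theta, xs, ys):
    # L(theta) = sum_i [ -y_i * t_i + log(1 + exp(t_i)) ], with t_i = x_i^T theta
    total = 0.0
    for x, y in zip(xs, ys):
        t = x[0] * theta[0] + x[1] * theta[1]
        total += -y * t + math.log1p(math.exp(t))
    return total


def gradient_and_hessian(theta, xs, ys):
    # Accumulate grad L = sum_i a_i x_i and hess L = sum_i w_i x_i x_i^T, as in (2).
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        p = sigmoid(x[0] * theta[0] + x[1] * theta[1])
        a = p - y          # derivative of the logistic loss in its first argument
        w = p * (1.0 - p)  # second derivative, the weight from part (d)
        for j in range(2):
            g[j] += a * x[j]
            for k in range(2):
                h[j][k] += w * x[j] * x[k]
    return g, h


def newton_step(theta, xs, ys):
    # One application of (1): theta - [hess L(theta)]^{-1} grad L(theta).
    g, h = gradient_and_hessian(theta, xs, ys)
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    # Explicit 2x2 solve (assumption: d = 2 and the Hessian is invertible).
    d0 = (h[1][1] * g[0] - h[0][1] * g[1]) / det
    d1 = (-h[1][0] * g[0] + h[0][0] * g[1]) / det
    return [theta[0] - d0, theta[1] - d1]


# Made-up, non-separable toy data: x_i = (1, u_i), so theta = (intercept, slope).
US = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
YS = [0, 0, 0, 1, 1, 0, 1, 1]
XS = [(1.0, u) for u in US]

theta = [0.0, 0.0]
for _ in range(25):
    theta = newton_step(theta, XS, YS)

grad, _ = gradient_and_hessian(theta, XS, YS)
grad_norm = math.hypot(grad[0], grad[1])
print("theta:", theta)
print("gradient norm:", grad_norm)
```

The toy labels are deliberately non-separable, so a finite minimizer exists and the plain Newton iteration, started at zero with no step-size control, drives the gradient norm to essentially zero. On separable data the minimizer would not exist and this sketch would not be appropriate.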