Programming Example: F71MA

F79MA/F71MA 2021-22 Assessed Project 1

In this project we will look at maximum likelihood estimation for a gamma model and show that computer simulation can be a very useful tool to characterise estimator performance.

Project description

You are a statistical trainee actuary working for an insurance company. You are part of a statistical modelling team that is developing models based on independent, non-identically distributed random variables. You consider the following model:

$$X_i \sim \mathrm{Gamma}(\alpha\omega_i, \beta), \quad i \in \{1, \dots, N\},$$

where $\alpha$, $\beta$, and $\omega_1, \dots, \omega_N$ are positive scalar parameters. The gamma distribution has several common parametrisations; here we define the p.d.f. of a random variable $X \sim \mathrm{Gamma}(\alpha, \beta)$ as

$$f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \, x^{\alpha - 1} \exp\{-\beta x\} \quad \text{for } x > 0,$$

where $\Gamma(\cdot)$ denotes the gamma function (see https://mathworld.wolfram.com/GammaFunction.html).

You have been asked to explore the statistical properties of the maximum likelihood estimator of $\beta$, assuming that $\alpha$ and $\omega_1, \dots, \omega_N$ are known. Assume that $\alpha > 2$ and $\sum_{i=1}^{N} \omega_i \geq 1$. Let $X = (X_1, \dots, X_N)$. You decide to perform the following analyses:

1. State the likelihood function $L(\beta; X) = f(X; \beta)$. [2 marks]
2. Derive the maximum likelihood estimator for $\beta$, denoted by $\hat{\beta}(X)$. [2 marks]
3. Show that the bias of $\hat{\beta}(X)$ is given by $B(\beta) = \beta / \left(\alpha \sum_{i=1}^{N} \omega_i - 1\right)$. [3 marks]
4. Derive the expression for the Cramér-Rao lower bound for a biased estimator of $\beta$ with bias $B(\beta) = \beta / \left(\alpha \sum_{i=1}^{N} \omega_i - 1\right)$. [3 marks]
5. Let $\alpha = 4$ and $\omega_i = i$ for $i \in \{1, \dots, N\}$. Use simulation in R to calculate the variance of $\hat{\beta}(X)$ for $\beta \in \{1, 2, 4, 8, 16\}$ and for different sample sizes $N \in [1, 20]$. Briefly discuss your results and present appropriate graphical summaries. How close are the variances of $\hat{\beta}(X)$ to the theoretical lower bound found in Part 4? [6 marks]

Your findings should be presented in the form of a report, which should:

- have a clear and logical structure;
- include an introduction and clearly stated conclusions that can be understood by any numerate scientist;
- include detail of your mathematical calculations so that your results could be reproduced by another statistician;
- include clearly labelled and correctly referenced tables and diagrams, as appropriate;
- include the R code you used in an appendix (you do not need to explain individual R commands, but some comments should be included to indicate the purpose of each section of code);
- include citation and referencing for any material (books, papers, websites etc.) used;
- have a maximum length of two (2) pages (11-point font, A4 size). The R code in the appendix does not count towards this page limit.

A total of 4 marks is available for these aspects of your report. This will be marked according to the rubric given in the Appendix.

[Total: 20 Marks]

Notes

This assignment counts for 20% of the course assessment.

You may have face-to-face discussions with me or your colleagues, but your report must be your own work. Plagiarism is a serious academic offence and carries a range of penalties, some very serious. Copying a friend's report or code, or copying text into your report from another source (such as a book or website) without citing and referencing that source, is plagiarism. Collusion is also a serious academic offence. You must not share a copy of your report (as a hard copy or in electronic form) or your computer code with anyone else. Penalties for plagiarism or collusion can include voiding of your mark for the course. See https://www.hw.ac.uk/students/studies/examinations/plagiarism.htm for more details.

Your report should be submitted through Canvas on Friday 15 October 2021. Students based in the Edinburgh campus should submit by 18h00 UK time.
Students based in the Malaysia campus should submit by 18h00 Malaysia time. A link to the submission page is available through the 'Assignment' tab of the course Canvas page. For late submissions, 30% will be deducted for work submitted at most 5 days late. Submissions that are more than 5 days late will receive 0 marks. You will receive feedback on your submission within 15 teaching days.

Appendix: Rubric for marking of the report

The four marks available for the exposition of your report will be awarded according to the scale below:

0 Marks will be awarded for:
- Lack of clear and logical structure
- Conclusions missing or not suitable for a non-statistician
- Statistical calculations and methodology not clearly set out for the reader
- Tables and figures unclear, badly labelled or not correctly referred to
- R code not included, or no comments included in it
- Sources used not clearly referenced

2 Marks will be awarded for:
- Clear and logical structure
- Conclusions generally suitable for a non-statistician
- Statistical calculations and methodology generally set out clearly for the reader
- Tables and figures often clear and correctly referred to
- R code included with some comments
- Sources used clearly referenced

4 Marks will be awarded for:
- Clear and logical structure
- Conclusions suitable for a non-statistician
- Statistical calculations and methodology set out clearly for the reader
- Tables and figures clear, correctly referred to and easy to interpret
- R code included with comments
- Sources used clearly and correctly referenced
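
The simulation described in Part 5 of the project can be sketched roughly as follows. This is only an illustration, written in Python with the standard library rather than in R as the brief requires. It assumes the maximum likelihood estimator has the closed form $\hat{\beta}(X) = \alpha \sum_i \omega_i / \sum_i X_i$ (the result Part 2 asks you to derive), and the function names `simulate_beta_hat` and `summarise` are hypothetical. The sketch compares the simulated bias with the expression $B(\beta)$ stated in Part 3 and reports the simulated variance; it does not compute the lower bound from Part 4.

```python
import random
import statistics


def simulate_beta_hat(alpha, omegas, beta, n_reps=20000, seed=1):
    """Return n_reps simulated values of the assumed MLE of beta.

    Assumption: beta_hat = alpha * sum(omega_i) / sum(X_i), with
    X_i ~ Gamma(shape = alpha * omega_i, rate = beta).
    """
    rng = random.Random(seed)
    k = alpha * sum(omegas)  # total shape parameter, alpha * sum(omega_i)
    estimates = []
    for _ in range(n_reps):
        # gammavariate takes (shape, scale), so a rate of beta is a scale of 1 / beta.
        total = sum(rng.gammavariate(alpha * w, 1.0 / beta) for w in omegas)
        estimates.append(k / total)
    return estimates


def summarise(alpha, n, beta, **kwargs):
    """Simulated bias and variance for omega_i = i, i = 1, ..., n."""
    omegas = list(range(1, n + 1))
    estimates = simulate_beta_hat(alpha, omegas, beta, **kwargs)
    k = alpha * sum(omegas)
    return {
        "empirical_bias": statistics.fmean(estimates) - beta,
        "stated_bias": beta / (k - 1),  # B(beta) as given in Part 3
        "empirical_variance": statistics.variance(estimates),
    }


if __name__ == "__main__":
    # Small grid for illustration; the brief asks for N in [1, 20].
    for beta in (1, 2, 4, 8, 16):
        for n in (1, 5, 20):
            print(beta, n, summarise(4, n, beta, n_reps=5000))
```

In R, `rgamma(n, shape, rate)` draws from the same shape and rate parametrisation used in the model, so the same loop structure carries over directly. A fixed seed is used here only so that repeated runs give the same numbers.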