
Hypothesis testing & Confidence intervals
Data Analysis for Psychology in R 1
Semester 2, Week 4

Dr Umberto Noè
Department of Psychology
The University of Edinburgh

Learning objectives

1. Interpret a confidence interval as the plausible values of a parameter that would not be rejected in a two-sided hypothesis test.
2. Determine the decision for a two-sided hypothesis test from an appropriately constructed confidence interval.
3. Be able to explain the difference between a significant result and an important result.

Part A: Introduction

Population vs Sample

We performed a study by selecting a sample from the population. The goal is to use the sample data to answer questions about the population.

For example:

What's the population mean? Goal: estimating a parameter.
Could the population mean be zero? Goal: testing hypotheses about a parameter.

Confidence interval

We said that the SE of the mean measures how far $\bar{x}$ is likely to be from the population mean $\mu$.

Figure 1 is a drawing of an invisible lady walking her dog. The dog, which is visible, is on a leash. The leash is such that the dog is within $t^* \cdot SE$ of the lady 95% of the time.

In one picture we can see the dog, but we would like to know where the lady is. Since the lady and the dog are usually within $t^* \cdot SE$ of each other, we can take the following interval as a range that would typically include the lady:

$$[\text{dog} - t^* \cdot SE,\ \text{dog} + t^* \cdot SE]$$

or, in short:

$$\text{dog} \pm t^* \cdot SE$$

We could say that we are 95% confident that the lady is within this interval.

Figure 1. Invisible lady walking her dog.

We have one picture, which corresponds to one sample. We would like to know the value of the population mean $\mu$, which corresponds to the invisible lady, but we cannot directly observe it. We can however see the sample mean $\bar{x}$, corresponding to the dog. We use what we can see, $\bar{x}$, along with its standard error, as a way of constructing an interval that we hope will include what we cannot see, i.e.
the population mean $\mu$.

The interval $\text{dog} \pm t^* \cdot SE$ is a 95% confidence interval for the position of the lady.
The interval $\bar{x} \pm t^* \cdot SE$ is a 95% confidence interval for the population mean $\mu$.

Recall that the standard error of the mean is computed as $SE = s / \sqrt{n}$.

Hypothesis testing

Consider this example:

Sample of $n = 30$, with mean $\bar{x} = 47$ and $s = 5.5$.
Want to test whether $H_0: \mu = 50$ vs $H_1: \mu \neq 50$.

$$SE = s / \sqrt{n} = 5.5 / \sqrt{30} = 1.004$$

$$t = \frac{47 - 50}{1.004} = -2.99$$

The sample mean (47) is about 3 standard errors smaller than the claimed mean (50).

Critical values for a $t(29)$ distribution with $\alpha = 0.05$:

    tstar <- qt(c(0.025, 0.975), df = 29)
    tstar
    ## [1] -2.045  2.045

As $t \leq -t^*$, we reject $H_0$.

Part B: Hypothesis testing & Confidence intervals

Hypothesis testing

Consider again the following setting, where we wish to test the following claim at the $\alpha = 0.05$ significance level:

$$H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu \neq \mu_0$$

We start by computing the observed value of the t-statistic for our sample:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

P-value method

Compute the p-value, i.e. the probability of obtaining a test statistic at least as extreme as the observed one, when $H_0$ is true.

    2 * pt(abs(tobs), df = n-1, lower.tail = FALSE)

Make a decision:

Reject $H_0$ if $p \leq \alpha$
Do not reject $H_0$ if $p > \alpha$

Critical value method

Compute the lower and upper critical values $-t^*$ and $+t^*$:

    qt(c(0.025, 0.975), df = n-1)

Make a decision:

Reject $H_0$ if $t \leq -t^*$ or $t \geq +t^*$
Do not reject $H_0$ if $-t^* < t < +t^*$

Do not reject $H_0: \mu = \mu_0$ if:

$$-t^* < t < +t^*$$

$$-t^* < \frac{\bar{x} - \mu_0}{s / \sqrt{n}} < +t^*$$

$$-t^* \frac{s}{\sqrt{n}} < \bar{x} - \mu_0 < +t^* \frac{s}{\sqrt{n}}$$

$$-\bar{x} - t^* \frac{s}{\sqrt{n}} < -\mu_0 < -\bar{x} + t^* \frac{s}{\sqrt{n}}$$

$$\bar{x} + t^* \frac{s}{\sqrt{n}} > \mu_0 > \bar{x} - t^* \frac{s}{\sqrt{n}}$$

$$\bar{x} - t^* \frac{s}{\sqrt{n}} < \mu_0 < \bar{x} + t^* \frac{s}{\sqrt{n}}$$

That is, we do not reject $H_0: \mu = \mu_0$ when $\mu_0$ lies within the CI.

Do not reject $H_0: \mu = \mu_0$ when:

$$\underbrace{\bar{x} - t^* \frac{s}{\sqrt{n}}}_{\text{lower limit of CI}} < \mu_0 < \underbrace{\bar{x} + t^* \frac{s}{\sqrt{n}}}_{\text{upper limit of CI}}$$

Reject $H_0: \mu = \mu_0$ when either:

$$\mu_0 \geq \underbrace{\bar{x} + t^* \frac{s}{\sqrt{n}}}_{\text{upper limit of CI}} \quad \text{or} \quad \mu_0 \leq \underbrace{\bar{x} - t^* \frac{s}{\sqrt{n}}}_{\text{lower limit of CI}}$$

Do not reject $H_0$ if $\left[\bar{x} - t^* \frac{s}{\sqrt{n}},\ \bar{x} + t^* \frac{s}{\sqrt{n}}\right]$ includes $\mu_0$.

Part C: Significance vs Importance

Body temperature example

Recall the body temperature study, where the goal was to assess whether the sample data provide evidence, at the 5% significance level, that the population mean body temperature for healthy humans changed from the long thought 37 °C.

Data:

    library(tidyverse)
    tempsample <- read_csv('https://uoepsy.github.io/data/BodyTemp.csv')

    head(tempsample)
    ## # A tibble: 6 × 2
    ##   BodyTemp Pulse
    ##      <dbl> <dbl>
    ## 1     36.4    69
    ## 2     37.4    77
    ## 3     37.2    75
    ## 4     37.1    84
    ## 5     36.7    71
    ## 6     37.2    76

    dim(tempsample)
    ## [1] 50  2

    nrow(tempsample)
    ## [1] 50

Null and alternative hypotheses: $H_0: \mu = 37$ vs $H_1: \mu \neq 37$

Chosen significance level: $\alpha = 0.05$

Test statistic:

$$t = \frac{\bar{x} - 37}{s / \sqrt{n}}$$

    n <- nrow(tempsample)                # sample size
    xbar <- mean(tempsample$BodyTemp)    # sample mean
    s <- sd(tempsample$BodyTemp)         # sample standard deviation
    se <- s / sqrt(n)                    # standard error of the mean
    mu0 <- 37                            # hypothesised value
    tobs <- (xbar - mu0) / se            # observed t-statistic
    tobs
    ## [1] -3.141

If you prefer to use tidyverse, you could equivalently do:

    stats <- tempsample %>%
      summarise(
        n = n(),
        xbar = mean(BodyTemp),
        s = sd(BodyTemp),
        se = s / sqrt(n),
        mu0 = 37,
        tobs = (xbar - mu0) / se
      )
    stats
    ## # A tibble: 1 × 6
    ##       n  xbar      s      se   mu0   tobs
    ##   <int> <dbl>  <dbl>   <dbl> <dbl>  <dbl>
    ## 1    50 36.81 0.4252 0.06013    37 -3.141

Recall the value of the t-statistic for the observed sample:

    tobs
    ## [1] -3.141

To make a decision, we compare the observed t-statistic, $t = -3.141$, to the critical values $\pm t^*$ from a $t(49)$ distribution.

    tstar <- qt(c(0.025, 0.975), df = n - 1)
    tstar
    ## [1] -2.01  2.01

Two-sided alternative, so we ask: is the observed $t$ as or more extreme than the two critical values?

Is $t \leq -t^*$ or $t \geq +t^*$?

Yes, since $-3.141 \leq -2.01$, so reject $H_0$.

If you were to compute the p-value, you would reach the same conclusion.
    # Two-sided p-value: probability in both tails beyond |tobs|
    pvalue <- pt(-abs(tobs), df = n - 1) +
      pt(abs(tobs), df = n - 1, lower.tail = FALSE)
    pvalue
    ## [1] 0.002851

We could say:

At the 5% significance level, the sample data provide very strong evidence to reject the null hypothesis in favour of the alternative one that the mean body temperature of healthy humans is different from the long thought 37 °C; $t(49) = -3.14, p = 0.003$, two-sided.

At the $\alpha = 0.05$ significance level, the sample data provide very strong evidence to reject $H_0: \mu = 37$ in favour of $H_1: \mu \neq 37$.

However, this doesn't give us any idea of what the actual population mean may be. It just tells us there is very strong evidence that it's not 37 °C.

Good practice

When you find a statistically significant result, it is good practice to follow up the hypothesis test with a confidence interval, in order to provide the reader with an idea of what a plausible value of the population parameter may be.

    tstar <- qt(c(0.025, 0.975), df = n - 1)
    tstar
    ## [1] -2.01  2.01

    xbar + tstar * se
    ## [1] 36.69 36.93

We are 95% confident that the population mean body temperature for healthy humans is between 36.69 °C and 36.93 °C.

Hypothesis testing and Confidence intervals

Notice that the 95% confidence interval [36.69, 36.93] does not include the hypothesised value in the null hypothesis, $\mu_0 = 37$.

As 37 is not within [36.69, 36.93], we reject the null hypothesis at the 5% significance level.

The 95% CI gives us a range of plausible values for the population mean. As the claimed value 37 is not within the range, that is not a plausible value for $\mu$. If you test whether the population mean could be equal to 37, you end up rejecting that value.

Generic fact: The 95% CI gives you a range of values for the parameter that would not be rejected at the 5% significance level in a two-sided hypothesis test.
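The generic fact above can be checked numerically. The following is a minimal sketch, assuming only the rounded summary statistics printed earlier (n = 50, sample mean 36.81, s = 0.4252) rather than the raw data; the helper `p_two_sided()` and the grid `mu0_grid` are hypothetical names introduced for illustration. For each candidate value of $\mu_0$, it compares the two-sided p-value with $\alpha = 0.05$ and checks that the values that are not rejected are exactly those inside the 95% CI.

```r
# Rounded summary statistics from the body temperature example
n    <- 50
xbar <- 36.81
s    <- 0.4252
se   <- s / sqrt(n)

# 95% confidence interval for the population mean
tstar <- qt(c(0.025, 0.975), df = n - 1)
ci    <- xbar + tstar * se

# Hypothetical helper: two-sided p-value for H0: mu = mu0
p_two_sided <- function(mu0) {
  tobs <- (xbar - mu0) / se
  2 * pt(abs(tobs), df = n - 1, lower.tail = FALSE)
}

# Hypothetical grid of candidate values for mu0
mu0_grid <- seq(36.5, 37.1, by = 0.01)

not_rejected <- p_two_sided(mu0_grid) > 0.05
inside_ci    <- mu0_grid > ci[1] & mu0_grid < ci[2]

# TRUE if the two criteria agree for every candidate value
all(not_rejected == inside_ci)

# TRUE if the hypothesised value 37 is rejected at the 5% level
p_two_sided(37) <= 0.05
```

Because the inputs are rounded, the t-statistic computed here may differ slightly from the value obtained from the raw data, but the agreement between the two decision rules does not depend on that.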
Body temperature example

The p-value is 0.003, indicating that the sample data provide very strong evidence that the population mean is significantly different from 37 °C.

This is about statistical significance. That is, the result is very unlikely to come from a population where $H_0$ is true and the discrepancy between the sample and population mean is only due to random sampling.

We are 95% confident that the population mean body temperature for healthy humans is between 36.69 °C and 36.93 °C.

The confidence interval tells us that perhaps the true population mean body temperature is somewhere between 36.69 and 36.93, rather than 37. Is this departure from 37 big enough to be of any practical impact for decision makers? Pretty much no, it's negligible. The result, while being significant, is not an important result.

Hence, statistical significance is different from practical significance (which we call importance), and a statistically significant result may not be of practical significance at all.

Significance vs Importance

In scientific research, and in this course too, the word significance has a special meaning which is different from its everyday language meaning.

It is a keyword reserved to mean statistical significance, i.e. the sample results being unlikely to be observed just because of random variation due to sampling when in fact $H_0$ is true in the population.

This is not the same meaning as the sample results being important or practically significant.

You can have a significant result, but this doesn't necessarily mean that it's important.

It's good practice to always follow up a significant result (i.e. when you reject the null hypothesis) with a confidence interval, to provide the reader with a measure of the "magnitude" of the departure from the hypothesised value $\mu_0$.
In our case, the 95% confidence interval [36.69, 36.93] °C indicates that the departure from the currently accepted standard of 37 °C is not big enough to be of much impact for decision makers.

This is an example of a statistically significant result that is not of practical importance.

Why both?

If a confidence interval can also be used to perform a hypothesis test, why do we need both?

The answer is that they both provide useful and complementary information.

A p-value gives you the strength of evidence that the sample data bring against the null hypothesis. This is not conveyed by a confidence interval.

The confidence interval gives you an idea of the "magnitude" of the population parameter, which the hypothesis test doesn't convey.

In summary, report both if you have a significant result, i.e. if you reject the null hypothesis! Always follow up a significant hypothesis test with a confidence interval.

Side note. If the hypothesis test is not significant, and you do not reject $H_0: \mu = 0$ say, then you do have an idea about the magnitude of the population mean. In fact, by not rejecting the claim that $\mu = 0$ you are told that 0 is a plausible value for the population mean.
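In practice, the p-value and the confidence interval can be obtained together. The following is a minimal sketch using R's built-in `t.test()` function. The data are simulated with `rnorm()` purely for illustration: the vector `x`, the seed, and the simulation settings are assumptions and are not the BodyTemp sample.

```r
# Simulated data for illustration only (NOT the BodyTemp sample)
set.seed(1)
x   <- rnorm(50, mean = 36.8, sd = 0.4)
mu0 <- 37

# One call returns both the test and the 95% confidence interval
res <- t.test(x, mu = mu0, alternative = "two.sided", conf.level = 0.95)
res$p.value    # strength of evidence against H0
res$conf.int   # range of plausible values for the population mean

# The same interval computed by hand: xbar +/- tstar * SE
n         <- length(x)
se        <- sd(x) / sqrt(n)
manual_ci <- mean(x) + qt(c(0.025, 0.975), df = n - 1) * se

# Rejecting at the 5% level corresponds to mu0 lying outside the 95% CI
reject     <- res$p.value <= 0.05
outside_ci <- mu0 < res$conf.int[1] | mu0 > res$conf.int[2]
reject == outside_ci
```

Reporting `res$p.value` alongside `res$conf.int` gives the reader both the strength of evidence and the magnitude of the parameter.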