Program Case: ALY6010-21661

Module 4 Project: Two-sample Confidence Intervals & Hypothesis Testing
ALY6010-21661 (Feb 2022)

Overview and Rationale
This assignment is designed to give you hands-on experience in estimation and hypothesis testing with two samples of interest. The data set is provided in an Excel workbook and contains a wide range of data types that you will need to work with. For your references, use books, scientific journals, and other strictly academic sources.

Course Outcomes
This assignment is directly linked to the following key learning outcomes from the course syllabus:
CO1: Explore the use of statistical software in data analysis through hands-on applications.
CO4: Perform estimations of population parameters using confidence intervals based on one sample, and perform estimations of the difference between two population parameters of the same kind based on two samples.
CO6: Perform various hypothesis tests, including those for a population parameter (single sample) and the difference between two population parameters of the same kind (two samples), and perform analysis of variance (ANOVA).
CO7: Interpret meaningful relationships and patterns in the data in relation to a given business question.

Prepare the following assignment using R Markdown. Submit (1) your HTML report and (2) your original Rmd file. Your report will be reviewed using Turnitin. Review your Turnitin score; if it is higher than 20%, fix your report, submit again, and repeat until your score is lower than 20%.

Do not use t.test() in your report; always use the formulas discussed in class, which are also presented in the book. The same applies to other tests.

Title. Present a title.

Introduction: Prepare a well-informed introduction, supported by academic references. Demonstrate your understanding of the following topics:
1. Hypothesis testing and its application in an industry of your interest.
2. The different applications of the z test, t test, and F test for two-sample comparisons.
3. The importance of proper referencing in academic writing.
4. A brief description of all data sets used in this report and their purpose.
Use at least 2 academic references per topic, besides our course book.

Analysis section: Include in this section all the tasks described below. For every task, present a title and an explanation. This is a professional report, and your readers need to know what each task is about (a short title and a short explanation will do). In addition, writing your own title and explanation for each task will decrease your Turnitin score.

Conclusions: Make an overall observation about the whole project and about what the results you obtained mean for the direction of the data or project, and explain any new analytical and R programming skills you gained. Also, imagine you are preparing this report for a company or research institution: you must make meaningful contributions, so think about what recommendations you can provide.

Bibliography: Use APA format. References must be cited in the main body of your report. If you do not mention any references in the main text, it is as if you did not use any, even if you add a list at the end. Cite each reference in the main body at the place where you use it as an information source. Use either the first author's last name and year, e.g., (Bluman, 2017), listing the references in the bibliography section in alphabetical order, or a number in order of appearance, listing the references in the bibliography section in that numerical order.

Appendix: Mention the attached Rmd file.

Note: Before you begin this assignment, you must install and load the library "MASS" in RStudio so that you can use the sample data available within this package.
https://cran.r-project.org/web/packages/MASS/MASS.pdf
Use the following code: install.packages("MASS"), then library(MASS). Do not present the install.packages() code in your report; run it directly in the RStudio Console tab.

Remember: Do not present long raw data sets in your report, ONLY the results of your data analyses.

Task 1 (for each task, add your own title in your report)
Present some descriptive statistics of the public data set cats. Inspect the data set by typing cats in your Console. Be organized and make sure to:
1. Select the appropriate descriptive statistics to present.
2. Select the appropriate visualizations to present the data (tables and/or graphs).
This is basically an open question, in which you must show your analytical, organizational, and data presentation skills.

Task 2
Assuming that the samples are independent of each other and that the variances of both populations are unknown, answer the following research question: Do male and female cats have the same body weight (in kilograms)? (Bluman, chapter 9-2)
H0: μ1 = μ2
Ha: μ1 ≠ μ2
Hint: one way to get separate R vectors for male and female cat body weight values is to use the subset function as follows: male = subset(cats, subset=(cats$Sex=="M")), and similarly for females.
Present your hypotheses. Use α = 0.01 for your hypothesis testing procedure.
* Present the critical value and compare it to your test value.
* Explain the result of your test.
* All your code for this task should be contained in one single R chunk.
Prepare a table or use inline R code to present your answers.

Task 3
In Task 2 you tested the hypothesis about the difference in mean body weight between female and male cats. In Task 3, test for the difference in their variances (Bluman, chapter 9-5). Use α = 0.01.
* Present the critical value and compare it to your test value.
* Explain the result of your test.
* All your code for this task should be contained in one single R chunk.
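The test values for the mean and variance comparisons can be computed directly from the formulas rather than with t.test() or var.test(). The following is a minimal sketch, not a reference solution. The function names two_sample_t and two_sample_f are hypothetical, the Bwt column name comes from the MASS cats data set, and the sketch assumes the conservative degrees-of-freedom rule (the smaller of n1 - 1 and n2 - 1) for the t test and the larger sample variance in the numerator for the F test. Check both conventions against the formulas used in class.

```r
# Sketch only: test values computed from formulas, without t.test() or var.test().
# Function and variable names are illustrative, not part of the assignment.

# Two-sample t test value, variances unknown and not assumed equal.
# Degrees of freedom: conservative rule, the smaller of n1 - 1 and n2 - 1.
two_sample_t <- function(x1, x2, alpha) {
  n1 <- length(x1)
  n2 <- length(x2)
  se <- sqrt(var(x1) / n1 + var(x2) / n2)
  t_value <- (mean(x1) - mean(x2)) / se
  df <- min(n1, n2) - 1
  t_crit <- qt(1 - alpha / 2, df)  # two-tailed critical value (use as +/-)
  list(t = t_value, df = df, crit = t_crit, reject = abs(t_value) > t_crit)
}

# F test value for two variances, larger sample variance in the numerator.
two_sample_f <- function(x1, x2, alpha) {
  v1 <- var(x1)
  v2 <- var(x2)
  if (v1 >= v2) {
    f_value <- v1 / v2
    df_n <- length(x1) - 1
    df_d <- length(x2) - 1
  } else {
    f_value <- v2 / v1
    df_n <- length(x2) - 1
    df_d <- length(x1) - 1
  }
  f_crit <- qf(1 - alpha / 2, df_n, df_d)  # two-tailed: alpha/2 in the right tail
  list(f = f_value, df_n = df_n, df_d = df_d, crit = f_crit,
       reject = f_value > f_crit)
}

library(MASS)  # provides the cats data set

# subset() returns a data frame; $Bwt extracts the body weight vector.
male_bwt   <- subset(cats, Sex == "M")$Bwt
female_bwt <- subset(cats, Sex == "F")$Bwt

task2 <- two_sample_t(male_bwt, female_bwt, alpha = 0.01)
task3 <- two_sample_f(male_bwt, female_bwt, alpha = 0.01)
```

The returned list elements (for example task2$t and task2$crit) can be referenced from inline R code or collected into a small table.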
Prepare a table or use inline R code to present your answers.

Task 4
Repeat Task 2, this time testing the hypothesis that the mean heart weights (in grams) of female and male cats are not equal. Use α = 0.001 for your hypothesis testing procedure.
* Present the critical value and compare it to your test value.
* Explain the result of your test.
* All your code for this task should be contained in one single R chunk.
Prepare a table or use inline R code to present your answers.

Task 5
Repeat Task 3, this time testing the hypothesis that the heart weight variances of female and male cats are not equal. Use α = 0.001 for your hypothesis testing procedure.
* Present the critical value and compare it to your test value.
* Explain the result of your test.
* All your code for this task should be contained in one single R chunk.
Prepare a table or use inline R code to present your answers.

Task 6
Hypothesis testing for two dependent paired samples (Bluman, chapter 9-3). To evaluate whether meditation has an effect on sleep quality, 13 students were recruited for a meditation workshop. They agreed to wear sleep evaluators to measure their sleep quality. Sleep quality is scored on a scale of 0-10 (the higher the better).
The average sleep quality scores in the week before the workshop were: 5.7, 7.8, 5.9, 5.6, 5.9, 6.8, 5.7, 3.9, 4.6, 4.5, 7.7, 6.3
The average sleep quality scores in the week following the workshop were: 6.8, 8.7, 7.6, 6.2, 6.1, 7.7, 5.9, 4.5, 5.5, 6.1, 6.9, 5.2
The researchers claimed that meditation improves sleep quality. Is it true?
Present your report in an organized manner, using the following order of events. List all steps of your testing procedure clearly (using α = 0.01). Explain why this is a test of two dependent paired samples. State your two hypotheses. Prepare and present all your code using only one R chunk. Present the table where you calculated D = (X1 - X2) and D² = (X1 - X2)².
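A table of differences for the paired samples can be built from the two score vectors. The following is a minimal sketch under stated assumptions: X1 is taken as the "before" scores and X2 as the "after" scores, the variable names are illustrative, and n = 12 because the text lists 12 scores per week. With D = X1 - X2, the claim that meditation improves sleep quality corresponds to a negative mean difference, so the sketch treats the test as left-tailed.

```r
# Sketch only: D table and paired t test value from the shortcut formulas.
# Variable names are illustrative. The source lists 12 scores per week, so n = 12.
before <- c(5.7, 7.8, 5.9, 5.6, 5.9, 6.8, 5.7, 3.9, 4.6, 4.5, 7.7, 6.3)
after  <- c(6.8, 8.7, 7.6, 6.2, 6.1, 7.7, 5.9, 4.5, 5.5, 6.1, 6.9, 5.2)

# Table of differences: D = X1 - X2 and D2 = D squared.
d_table <- data.frame(X1 = before, X2 = after)
d_table$D  <- d_table$X1 - d_table$X2
d_table$D2 <- d_table$D^2

n      <- nrow(d_table)
sum_d  <- sum(d_table$D)
sum_d2 <- sum(d_table$D2)

# Mean and standard deviation of the differences (shortcut formula).
d_bar <- sum_d / n
s_d   <- sqrt((n * sum_d2 - sum_d^2) / (n * (n - 1)))

# Test value under H0: mean difference = 0, with n - 1 degrees of freedom.
t_value <- d_bar / (s_d / sqrt(n))
df      <- n - 1

# Left-tailed critical value at alpha = 0.01 (assumes D = before - after).
alpha  <- 0.01
t_crit <- qt(alpha, df)
```

Printing d_table (for example with knitr::kable(d_table) in R Markdown) shows the X1, X2, D, and D2 columns together.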
For an example of this table, check the book. Present the critical value and compare it to your test value. Explain your findings along with your interpretation.

Task 7
When you prepared Tasks 2 and 4, you used the critical values to test your hypotheses. In task 6, use the critical values to obtain their corresponding p-values; show the code and explain it. Use the p-values to test your hypotheses again, using the same alpha values as in Tasks 2 and 4. Remember to explain your findings.
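A p-value for a two-tailed t test can be obtained from the t distribution function pt(). The following is a minimal sketch: the function name p_two_tailed_t is hypothetical, and the degrees of freedom in the check are for illustration only. The check shows the link between the two approaches: the p-value evaluated at the critical value equals α, so a test value beyond the critical value gives a p-value below α.

```r
# Sketch only: two-tailed p-value from the t distribution.
# The function name is illustrative, not part of the assignment.
p_two_tailed_t <- function(t_value, df) {
  2 * pt(-abs(t_value), df)  # area in both tails beyond |t_value|
}

# Link between the critical-value and p-value approaches:
alpha  <- 0.01
df     <- 46                       # hypothetical degrees of freedom
t_crit <- qt(1 - alpha / 2, df)    # two-tailed critical value
p_at_crit <- p_two_tailed_t(t_crit, df)  # equals alpha up to rounding

# Decision rule with p-values: reject H0 when the p-value is at most alpha.
reject_h0 <- function(p_value, alpha) p_value <= alpha
```

Applied to a test value from Task 2 or Task 4, p_two_tailed_t() takes that test value and its degrees of freedom, and reject_h0() compares the result with the task's alpha.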