Stuart Urban Computational Finance 2021 Fall II

Computational Finance – HW 3

1. Generate 5,000 random rolls of a fair six-sided die, i.e. one in which each number 1 through 6 is equally likely. Fix your random number generator seed at 200 for reproducibility. Write your code flexibly, so that changing the number of sides of the die, or making it loaded (i.e. giving some sides higher probabilities than others), takes no more than one additional or different line of code.
   a. Plot and label a histogram showing how many rolls produced each number.
   b. Plot and label a histogram showing how many of the first 50 rolls produced each number.
   c. If you did not know whether the die was fair or not, what would you conclude?
   d. I offer you the opportunity to bet 20 cents on any number to win $1 if you roll that number, and $0 otherwise. You are neutral to risk. If you did not know that the die is fair, and judged only by what you observed in the first 50 rolls, would you take the bet? If yes, which number would you bet on?
   e. If you decided to bet on a particular number in part (d), how does your bet do over the next 50 rolls of the die, i.e. how much do you make or lose?
   f. What does this exercise tell you about learning from small samples? Connect your answer to the Law of Large Numbers.

2. One common distribution we often encounter in real-world data is the Pareto distribution. For instance, it approximately describes the upper end of distributions of income and wealth in a population, the size of cities in a country, or the size of companies. Pareto-distributed data has a “fat tail”: extremely high positive values are much more likely than what would have been predicted by some of the other distributions we know (e.g. normal). The CDF of the Pareto distribution is

   $F(x) = 1 - \left(\frac{x_m}{x}\right)^{\alpha}, \quad x \ge x_m > 0, \ \alpha > 0$

   where $x_m$ is the smallest value the Pareto-distributed random variable can take and $\alpha$ is the rate at which the probability decreases for large values.
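The CDF above can be inverted in closed form, which is what makes the Pareto distribution convenient to simulate. As a sketch, writing $x_m$ for the smallest value and $\alpha$ for the decay rate (the standard Pareto parameterization, assumed here), setting $u = F(x)$ and solving for $x$ gives:

```latex
\begin{aligned}
u &= 1 - \left(\frac{x_m}{x}\right)^{\alpha} \\
\left(\frac{x_m}{x}\right)^{\alpha} &= 1 - u \\
x &= F^{-1}(u) = x_m \, (1 - u)^{-1/\alpha}, \qquad 0 \le u < 1 .
\end{aligned}
```

So if $U$ is uniformly distributed on $[0, 1)$, then $x_m (1 - U)^{-1/\alpha}$ is Pareto-distributed with parameters $x_m$ and $\alpha$.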
   In this question, we will approximate the distribution of market capitalizations of companies in the S&P 500 index as of 12/31/2020.
   a. Import market cap data from market_cap.xlsx and use it to estimate the minimum value $x_m$.
   b. Sample 500 x 100,000 uniformly distributed random numbers.
   c. Use the inverse transform method to transform them into a sample of Pareto-distributed random numbers, assuming the decay rate $\alpha = 1.5$.
   d. Sort each column of the resulting matrix using the sort() function so that its elements run from largest to smallest, then average across the columns within each row to get a 500x1 vector. What does each element in this vector represent?
   e. Use a scatterplot of actual vs. simulated market caps to evaluate your simulations. To make it easier to gauge results from the graph, I recommend (i) plotting both market caps in logs, and (ii) also drawing a 45 degree (i.e. y=x) line.
   f. Repeat steps (c) through (e) for $\alpha = 2.0$ and $\alpha = 1.0$. Which simulation seems to describe the data best? Why?
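Steps (b) through (d) can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not a reference solution: the function name `simulate_pareto_order_stats`, its arguments, and the value `x_min=1.0` are hypothetical placeholders, and the sketch does not read market_cap.xlsx because the file's layout is not given. The demo call uses 1,000 simulations to keep memory modest; the assignment asks for 100,000.

```python
import numpy as np


def simulate_pareto_order_stats(x_min, alpha, n_firms=500, n_sims=1_000, seed=200):
    """Average the sorted Pareto draws across simulations.

    x_min and alpha are the Pareto minimum and decay rate; the remaining
    arguments are illustrative defaults, not values fixed by the assignment.
    Returns a vector of length n_firms, ordered largest to smallest.
    """
    rng = np.random.default_rng(seed)

    # Step (b): one column of n_firms uniform draws per simulation, in [0, 1).
    u = rng.random((n_firms, n_sims))

    # Step (c): inverse transform, x = x_min * (1 - u)^(-1/alpha).
    # Using 1 - u keeps the base in (0, 1], so there is no division by zero.
    draws = x_min * (1.0 - u) ** (-1.0 / alpha)

    # Step (d): sort each column ascending, then flip so row 0 is the largest.
    draws = np.sort(draws, axis=0)[::-1]

    # Average each row over all simulated columns.
    return draws.mean(axis=1)


# Small demonstration with a placeholder minimum of 1.0.
expected_caps = simulate_pareto_order_stats(x_min=1.0, alpha=1.5)
```

For step (e), the returned vector can be plotted against the actual market caps sorted in the same descending order, with both axes in logs. For step (f), only the `alpha` argument needs to change.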