stata-MG4F7-Assignment 20

Project Assignment 2021 MG4F7
Due date: December 6, 2021

The Empirical Analysis of the Wealth of Nations

Why are some countries rich and others poor? We are going to study
several possible drivers of economic development: countries’ human
capital; countries’ efforts to develop new technologies; countries’
business environments; and countries’ political institutions. I have
provided you with data from the World Bank’s World Development
Indicators database (for more information, see
http://databank.worldbank.org/data/reports.aspx?source=World-Development-Indicators).
For all 217 countries, and for the year 2010, I have extracted
available data on the following variables:

- GDP per capita, PPP (constant 2011 international $); this is a
  measure of national income produced in a country, per person, in a
  given year.
- Life expectancy at birth, total (years); this is a measure of human
  capital as health and well-being.
- Research and development expenditure (% of GDP); this is a measure
  of engagement in technological progress.
- Cost of business start-up procedures (% of GNI per capita); this is
  a measure of the business-friendliness of a country (something like
  a measure of “red tape”).
- CPIA transparency, accountability, and corruption in the public
  sector rating (1=low to 6=high); this is a measure of the quality of
  government institutions.

1. Examine the
   outcome variable (GDP per capita).

   a. What is the median?
   b. What is the mean?
   c. Does the difference between median and mean suggest the presence
      of a skew in the distribution and, if so, in which direction?

   Make a histogram plotting national income (save the graph and
   include it in your responses; please do likewise any time you are
   asked for a figure or table). Does it look as you’d expect?

2. Examine the
   explanatory variables.

   a. Make a correlation table that includes the outcome variable (y)
      and all of the explanatory variables (x1, x2, x3, …). The Stata
      command is “corr y x1 x2 x3 …”. What is the correlation between
      R&D expenditure and income? Does it surprise you?
   b. Now examine the univariate correlation between income and R&D
      expenditure (corr y x1). What is the correlation you see now?
      Any idea what might be happening?
   c. Create a variable that indicates that all variables are
      non-missing. Examine the univariate correlation between income
      and R&D expenditure (corr y x1) if the non-missing indicator is
      equal to 1. What is the correlation you see? Does this clarify
      the findings in 2(a) and 2(b)? What does it suggest about how
      you should run your multivariate regressions?
   d. Examine and report each pair-wise correlation between
      explanatory variables (corr x1 x2, corr x1 x3, corr x1 x4, corr
      x2 x3, corr x2 x4, corr x3 x4). Where do you see the greatest
      potential collinearity problem?

3. We next examine the
   simple relationships between each explanatory variable and the
   outcome variable (GDP per capita).

   a. Please produce scatter plots in which income per capita is
      plotted against each of the explanatory variables of interest.
      What key limitation can you see in the graph in which the
      quality of government institutions is the explanatory variable?
      What key concern do you see in the graph in which the cost of
      business start-up is the explanatory variable? How can you
      address this concern?
   b. Estimate the simple regressions predicting income per capita
      with each of the explanatory variables of interest. Please
      report the slope and intercept for each regression (or just
      incorporate the regression output into your project submission).
   c. What is the y-intercept in the simple regression of income per
      capita on the cost of business start-up procedures? What does it
      mean in practice? Is it a “realistic” y-intercept in the sense
      of describing a potential reality?
   d. What is the slope in the simple regression of income per capita
      on the cost of business start-up procedures? How is income
      predicted to change if a country were to see a decline in the
      cost of opening a business from 60% of national income per
      capita to 10%?
   e. Based on the simple
      regression of income per capita on the cost of business start-up
      procedures, what is the predicted level of income per capita in
      a country with start-up costs equal to 100% of national income
      per capita? What is the approximate 95% prediction interval for
      income in a country with start-up costs equal to 100% of
      national income per capita? Is this level of start-up costs an
      outlier in the data?
   f. What is the y-intercept in the simple regression of income per
      capita on R&D expenditures? What does it mean in practice? Is it
      a “realistic” y-intercept in the sense of describing a potential
      reality?
   g. What is the slope in the simple regression of income per capita
      on R&D expenditures? Suppose a government minister proposes an
      ambitious policy increasing R&D expenditures by 0.5% of national
      income (GDP). The minister argues that this will increase income
      per capita by $10,000. Do you think this is likely? Explain.
   h. What is the y-intercept in the
      simple regression of income per capita on life expectancy? What
      does it mean in practice? Is it a “realistic” y-intercept in the
      sense of describing a potential reality?
   i. What is the R-squared in the simple regression of income per
      capita on life expectancy? How does it compare to the R-squared
      in the other simple regressions?
   j. Based on part 3(i), do you think that life expectancy has an
      important role in causing higher incomes? Propose a mechanism
      that would produce such a causal relationship.
   k. Suppose you were skeptical that the observed relationship
      between income per capita and life expectancy is causal. Propose
      one reverse causality mechanism and one omitted variables
      mechanism that would produce the positive relationship observed.

4. Let’s see what we would observe if we happened to draw particular
   subsamples for our estimates. Make sure your data are sorted by
   country name. Generate a country code that is increasing as you go
   down the dataset (so Afghanistan is 1, Albania 2, etc., down to
   Zimbabwe at 217).
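
The sorting and numbering step might be sketched in Stata as follows.
This is only an illustration on a four-row toy dataset: the variable
names (country, gdppc, rd, code) and every value are made up here and
are not taken from the course data.

```stata
* Toy data only: hypothetical names and made-up numbers, out of order.
clear
input str10 country gdppc rd
"Delta"   12000 0.8
"Alpha"    3000 0.2
"Charlie" 25000 1.9
"Bravo"    8000 0.5
end

* Sort alphabetically, then number the rows; _n is the row number.
sort country
generate code = _n
list code country, noobs

* An if-qualifier restricts a command to one block of codes.
summarize gdppc if inrange(code, 1, 2)
```

On the real data, the same kind of qualifier (for example
"if inrange(code, 51, 100)") could be attached to a regression command
to restrict it to one block of countries.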
   a. Estimate the regression line predicting income using R&D
      expenditure for country code 1-50; 51-100; 101-150; and 151-217.
      Report the estimated slopes. Why do the slopes differ from one
      regression to the next?
   b. If you were trying to infer the relationship between R&D
      spending and income per capita for the entire set of 217
      countries from just one of these subsamples, how would you do it
      (hint: you can think of this as having a few different “sample”
      signals, and you are trying to estimate where a “population”
      parameter is likely to be)? Would each of the subsamples allow
      you to produce a reasonable inference about the relationship
      present in all 217 countries?

5. We’ll next look at multivariate regressions. Estimate a model
   predicting income per capita using life expectancy, business
   start-up costs and R&D expenditure. (Hint: make sure you have
   addressed the issue with the start-up cost data.)

   a. Present evidence of a collinearity problem arising from the
      inclusion of both life expectancy and business start-up costs.
   b. Can you make an argument for including life expectancy and
      dropping business start-up costs from the empirical model?
   c. Can you make an argument for including business start-up costs
      and dropping life expectancy from the empirical model?
   d. Produce a path diagram indicating the relationships among
      business start-up costs, life expectancy, and income per capita.
      If you do not have a clear idea about the direction of
      causality, just draw in a line with arrows in both directions,
      but make clear the signs (positive or negative) of the
      relationships.
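
One common way to look for the kind of collinearity evidence asked for
in 5(a) is to pair the correlation between the two regressors with
variance inflation factors from the fitted model. The sketch below is
an assumption about how that could be done, not a prescribed method.
The variable names (gdppc, lifeexp, startcost, rd) are hypothetical,
the eight observations are made-up toy numbers, and the start-up cost
issue mentioned in the hint is deliberately left unaddressed.

```stata
* Toy data only: eight made-up observations, not World Bank values.
clear
input gdppc lifeexp startcost rd
 1500 55 120 0.1
 2500 60  95 0.3
 4000 63  80 0.2
 9000 68  40 0.6
12000 71  35 0.4
18000 74  20 1.1
35000 78   8 2.0
42000 81   3 1.5
end

* Pairwise correlation between the two suspect regressors.
corr lifeexp startcost

* Multivariate regression; regress drops any observation with a
* missing value in the listed variables (listwise deletion).
regress gdppc lifeexp startcost rd

* Variance inflation factors for the regressors just estimated.
estat vif
```

A rule of thumb often quoted is that a VIF well above 10 is a warning
sign, but the threshold is a convention rather than a test, so the
correlation and the change in standard errors across specifications
are worth reporting alongside it.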