Econometrics Midterm Exam Solutions: Regression Analysis and OLS Estimators, Exams of Economics

University of Waterloo Economics

Answers to an econometrics midterm exam, covering topics such as random samples, random variables, the impact of control variables, and the properties of the OLS estimator. It includes detailed explanations and derivations related to linear regression models, hypothesis testing, and potential issues like endogeneity. The document also addresses the interpretation of regression coefficients and the validity of statistical tests in the context of wage analysis. This material is useful for students studying econometrics, providing insights into the application of econometric techniques and the interpretation of results.

Typology: Exams

2024/2025

Available from 05/29/2025

elam-dennis 🇨🇦

12 documents


ECO5185 Midterm Answers; Fall 2025.

Question 1

a. Briefly explain (in words) the idea/concept of each of the following, and briefly explain their implication for econometric analyses

(a) random sample

Answer: A random sample is a sample whose draws are independent and identically distributed. It means that, for the simple linear regression model, A3, i.e.

$$E(\epsilon_i \mid X) = 0 \tag{1}$$

simplifies to

$$E(\epsilon_i \mid x_{i2}) = 0. \tag{2}$$

As such, for the OLS estimator to be unbiased we only need to worry about the explanatory variable not being correlated with the error term within each draw. It will also have implications for the structure of the variance of our estimator.

(b) random variable

Answer: A random variable is a variable whose outcome is uncertain/unknown. Given that $y_i$ and $x_{ik}$ ($i = 1, \ldots, n$ and $k = 1, \ldots, K$) are random variables, the OLS estimator of $\beta_k$ ($k = 1, \ldots, K$) is also a random variable. It has a mean and a variance, and as such, has statistical properties and we can carry out hypothesis testing.

b. Discuss how the addition of a control variable has implications for

(a) the variation used to estimate a parameter of interest

Answer: By the FWL theorem, the addition of a control variable means that we are not using some of the variation in $x_2$ when estimating its parameter (i.e. $\beta_2$). More precisely, we are not using the variation in $x_2$ that is correlated with the new control.
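The FWL point can be checked numerically. The sketch below uses simulated data (the coefficients, the sample size and the control named `x3` are illustrative assumptions, not part of the exam): it compares the coefficient on `x2` from the full regression with the one obtained after first removing from `x2` the part explained by the control, and reports the share of the variation in `x2` that is still used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data: x2 is correlated with the control x3.
x3 = rng.normal(size=n)
x2 = 0.8 * x3 + rng.normal(size=n)
y = 1.0 + 2.0 * x2 - 1.5 * x3 + rng.normal(size=n)


def ols(X, y):
    """Return the OLS coefficients from a least-squares fit."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)

# Full regression of y on (1, x2, x3).
b_full = ols(np.column_stack([const, x2, x3]), y)

# FWL: residualize x2 on (1, x3), then regress y on that residual.
Z = np.column_stack([const, x3])
x2_resid = x2 - Z @ ols(Z, x2)
b_fwl = (x2_resid @ y) / (x2_resid @ x2_resid)

# Share of the variation in x2 that remains once x3 is controlled for.
share_used = (x2_resid @ x2_resid) / np.sum((x2 - x2.mean()) ** 2)

print(b_full[1], b_fwl, share_used)
```

The two slope estimates agree up to rounding error, while `share_used` is below one: only the part of `x2` that is not linearly related to the control identifies the parameter.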

(b) the validity of a test

Answer: Adding a control may result in assumption A3 holding, if the additional variable belongs in the model and is correlated with the explanatory variable. Recall that if A3 does not hold, the t-test and F-test are not valid.

c. Briefly evaluate the following statement

“The $R^2$ measures the proportion of the variation in the dependent variable that is explained by the regression line, and as such, a high $R^2$ implies the estimate of the parameter of interest is probably close to the true parameter.”

Answer: The first part of the statement is true. The $R^2$ measures how much of the variation in the dependent variable is explained by the OLS regression line. Having said this, the second part of the statement is not true. Having a high $R^2$ has no bearing on whether assumption A3 holds, and as such no bearing on whether the estimator is unbiased or consistent. As such, it has no bearing on whether the estimate is close (or probably close) to the true parameter.
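A simulated example can illustrate why a high $R^2$ says nothing about bias. In the sketch below, all numbers are assumptions chosen for illustration: a relevant variable `x3` is omitted and is strongly correlated with `x2`, so A3 fails in the short regression even though the fit is very good.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
beta2 = 2.0  # true parameter of interest (assumed for the simulation)

x3 = rng.normal(size=n)              # omitted variable
x2 = x3 + 0.3 * rng.normal(size=n)   # strongly correlated with x3
y = 1.0 + beta2 * x2 + 3.0 * x3 + 0.5 * rng.normal(size=n)

# Short regression of y on (1, x2): x3 ends up in the error term, so A3 fails.
X = np.column_stack([np.ones(n), x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# R^2 = SSR / SST for the short regression.
y_hat = X @ b
r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"R^2 = {r2:.3f}, slope estimate = {b[1]:.3f}, true beta2 = {beta2}")
```

Under these assumptions the $R^2$ is above 0.9 while the slope estimate is far from the true value of 2, because `x2` picks up the effect of the omitted `x3`.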

d. You are interested in exploring whether men, on average, have a higher hourly wage than women in the Canadian labour market. Your co-author suggests the following estimator

$$\frac{\sum_{i \in male} wage_i}{n_m} - \frac{\sum_{i \in female} wage_i}{n_f}$$

where $wage_i$ is the hourly wage of individual $i$, and $n_m$ and $n_f$ are the number of males and females in the sample, respectively.

Would this estimator, when applied to data, generate a guess that is close to the true parameter of interest? Justify your answer.

Answer: This is a method of moments estimator (i.e. the sample analogs of the population moments). It is made up of sample means, and we know that sample means converge in probability to population means if we have a random sample. Now, if the estimator is consistent (i.e. we have a random sample) and we have a large sample, one can conclude that the estimator is probably close to the true population parameter. We cannot, however, say it with certainty.
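The consistency argument can be sketched with a small simulation. The population below is hypothetical (mean wages of 32 and 28 dollars, so a true gap of 4, with normally distributed wages); the function name `mean_wage_gap` is likewise just an illustrative choice.

```python
import random
from statistics import fmean


def mean_wage_gap(wages, is_male):
    """Male sample mean wage minus female sample mean wage."""
    male = [w for w, m in zip(wages, is_male) if m]
    female = [w for w, m in zip(wages, is_male) if not m]
    return fmean(male) - fmean(female)


random.seed(0)


def draw_sample(n):
    """Random sample from a hypothetical population with a true gap of 4."""
    is_male = [random.random() < 0.5 for _ in range(n)]
    wages = [random.gauss(32.0 if m else 28.0, 10.0) for m in is_male]
    return wages, is_male


for n in (100, 10_000, 200_000):
    wages, is_male = draw_sample(n)
    print(n, round(mean_wage_gap(wages, is_male), 3))
```

With a random sample, the estimates tend to settle near the true gap as `n` grows, although any single estimate can still miss it, which is the "probably close, not certain" point made in the answer.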

Question 2

Assume the true population model takes the form

$$y_i = \beta_1 + \beta_2 x_{i2} + \epsilon_i$$

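Answer:

$$\begin{aligned}
b_2 &= \frac{\sum_{i=1}^{n} y_i x_{i2}}{\sum_{i=1}^{n} x_{i2}^2} \\
&= \frac{\sum_{i=1}^{n} (\beta_2 x_{i2} + \epsilon_i) x_{i2}}{\sum_{i=1}^{n} x_{i2}^2} \\
&= \frac{\sum_{i=1}^{n} [\beta_2 x_{i2}^2 + \epsilon_i x_{i2}]}{\sum_{i=1}^{n} x_{i2}^2} \\
&= \frac{\beta_2 \sum_{i=1}^{n} x_{i2}^2 + \sum_{i=1}^{n} \epsilon_i x_{i2}}{\sum_{i=1}^{n} x_{i2}^2} \\
&= \beta_2 + \frac{\sum_{i=1}^{n} \epsilon_i x_{i2}}{\sum_{i=1}^{n} x_{i2}^2}
\end{aligned}$$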
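The decomposition above can be checked on simulated data. The sketch assumes, as the substitution in the derivation does, that $y_i = \beta_2 x_{i2} + \epsilon_i$; the value of `beta2`, the distribution of `x` and the sample sizes are illustrative assumptions.

```python
import random

random.seed(42)
beta2 = 1.5  # assumed true slope for the simulation


def draw(n):
    """Draw (x, eps, y) from y_i = beta2 * x_i2 + eps_i with E[eps | x] = 0."""
    x = [random.uniform(1.0, 5.0) for _ in range(n)]
    eps = [random.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta2 * xi + ei for xi, ei in zip(x, eps)]
    return x, eps, y


def b2_estimator(x, y):
    """b2 = sum(y_i * x_i2) / sum(x_i2 ** 2)."""
    return sum(yi * xi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# One sample: the estimator equals beta2 plus the sampling-error term.
x, eps, y = draw(50)
b2 = b2_estimator(x, y)
error_term = sum(ei * xi for xi, ei in zip(x, eps)) / sum(xi * xi for xi in x)
print(b2, beta2 + error_term)

# Many samples: the average of b2 across draws should be close to beta2.
estimates = []
for _ in range(2_000):
    xs, _eps, ys = draw(50)
    estimates.append(b2_estimator(xs, ys))
mean_b2 = sum(estimates) / len(estimates)
print(mean_b2)
```

The first two printed numbers coincide up to rounding error, and the Monte Carlo average sits near `beta2` because the error term is drawn independently of `x`.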

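c. Will this slope parameter estimator be unbiased? Show your work.

Answer:

$$\begin{aligned}
E[b_2 \mid X] &= E\left[\beta_2 + \frac{\sum_{i=1}^{n} \epsilon_i x_{i2}}{\sum_{i=1}^{n} x_{i2}^2} \,\Big|\, X\right] \\
&= E[\beta_2 \mid X] + E\left[\frac{\sum_{i=1}^{n} \epsilon_i x_{i2}}{\sum_{i=1}^{n} x_{i2}^2} \,\Big|\, X\right] \\
&= \beta_2 + \frac{\sum_{i=1}^{n} x_{i2} E[\epsilon_i \mid X]}{\sum_{i=1}^{n} x_{i2}^2} \\
&= \beta_2 + \frac{\sum_{i=1}^{n} x_{i2} \cdot 0}{\sum_{i=1}^{n} x_{i2}^2} \\
&= \beta_2
\end{aligned}$$

It will be unbiased if A3 holds.

Question 3

a. Show the following

(a)

$$PM = 0$$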

Answer:

$$\begin{aligned}
PM &= X(X'X)^{-1}X'[I - X(X'X)^{-1}X'] \\
&= [X(X'X)^{-1}X'I] - [X(X'X)^{-1}X'X(X'X)^{-1}X'] \\
&= [X(X'X)^{-1}X'I] - [X(X'X)^{-1}IX'] \\
&= [X(X'X)^{-1}X'] - [X(X'X)^{-1}X'] \\
&= 0
\end{aligned}$$

(b)

$M$ is idempotent

Answer:

$$\begin{aligned}
MM &= [I - X(X'X)^{-1}X'][I - X(X'X)^{-1}X'] \\
&= [II] - [IX(X'X)^{-1}X'] - [X(X'X)^{-1}X'I] + [X(X'X)^{-1}X'X(X'X)^{-1}X'] \\
&= I - X(X'X)^{-1}X' - X(X'X)^{-1}X' + X(X'X)^{-1}IX' \\
&= I - X(X'X)^{-1}X' - X(X'X)^{-1}X' + X(X'X)^{-1}X' \\
&= I - X(X'X)^{-1}X' \\
&= M
\end{aligned}$$
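Both results can be verified numerically for any full-rank $X$. The sketch below builds $P = X(X'X)^{-1}X'$ and $M = I - P$ for a small randomly generated design matrix (the dimensions and the random data are assumptions made for illustration) and checks the two identities up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 20, 3

# Hypothetical design matrix: a constant plus two random regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])

P = X @ np.linalg.inv(X.T @ X) @ X.T  # projection onto the column space of X
M = np.eye(n) - P                     # residual-maker matrix

pm_is_zero = np.allclose(P @ M, 0.0)      # PM = 0
m_is_idempotent = np.allclose(M @ M, M)   # MM = M

print(pm_is_zero, m_is_idempotent)
```

Forming the explicit inverse is fine for a check of this size; for estimation on real data a least-squares solver is usually preferred.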

(c)

$$[(M_1X_2)'(M_1X_2)]^{-1}(M_1X_2)'(M_1y) = [X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2]^{-1}[X_2'y - X_2'X_1(X_1'X_1)^{-1}X_1'y]$$

where the underlying population model is

$$y = X_1\beta_1 + X_2\beta_2 + \varepsilon$$

Answer: Since $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$ is symmetric and idempotent, $(M_1X_2)'(M_1X_2) = X_2'M_1'M_1X_2 = X_2'M_1X_2$ and $(M_1X_2)'(M_1y) = X_2'M_1y$. Therefore

$$\begin{aligned}
[(M_1X_2)'(M_1X_2)]^{-1}(M_1X_2)'(M_1y) &= [X_2'M_1X_2]^{-1}[X_2'M_1y] \\
&= [X_2'(I - X_1(X_1'X_1)^{-1}X_1')X_2]^{-1}[X_2'(I - X_1(X_1'X_1)^{-1}X_1')y] \\
&= [(X_2' - X_2'X_1(X_1'X_1)^{-1}X_1')X_2]^{-1}[(X_2' - X_2'X_1(X_1'X_1)^{-1}X_1')y] \\
&= [X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2]^{-1}[X_2'y - X_2'X_1(X_1'X_1)^{-1}X_1'y]
\end{aligned}$$

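(c) Another co-author suggests the following estimator

$$b = (X'P_ZX)^{-1}X'P_Zy$$

where $P_Z = Z(Z'Z)^{-1}Z'$, with $Z$ being the $X$ matrix where the information for the variable $x_K$ is replaced with information on $z_1$ and $z_2$. Therefore $Z$ is $n \times (K + 1)$. $P_Z$ is both symmetric and idempotent. Show that, under certain identifying assumptions, this estimator will be consistent. (5 marks)

Answer:

$$\begin{aligned}
b &= (X'P_ZX)^{-1}X'P_Zy \\
&= (X'P_ZX)^{-1}X'P_Z(X\beta + \varepsilon) \\
&= \beta + (X'P_ZX)^{-1}X'P_Z\varepsilon \quad \text{(prop. of matrices)} \\
&= \beta + (X'Z(Z'Z)^{-1}Z'X)^{-1}X'Z(Z'Z)^{-1}Z'\varepsilon \\
&= \beta + \left[\frac{X'Z}{n}\left(\frac{Z'Z}{n}\right)^{-1}\frac{Z'X}{n}\right]^{-1}\frac{X'Z}{n}\left(\frac{Z'Z}{n}\right)^{-1}\frac{Z'\varepsilon}{n}
\end{aligned}$$

Taking probability limits,

$$\begin{aligned}
\operatorname{plim}(b) &= \operatorname{plim}\left\{\beta + \left[\frac{X'Z}{n}\left(\frac{Z'Z}{n}\right)^{-1}\frac{Z'X}{n}\right]^{-1}\frac{X'Z}{n}\left(\frac{Z'Z}{n}\right)^{-1}\frac{Z'\varepsilon}{n}\right\} \\
&= \operatorname{plim}\beta + \left[\operatorname{plim}\frac{X'Z}{n}\left(\operatorname{plim}\frac{Z'Z}{n}\right)^{-1}\operatorname{plim}\frac{Z'X}{n}\right]^{-1}\operatorname{plim}\frac{X'Z}{n}\left(\operatorname{plim}\frac{Z'Z}{n}\right)^{-1}\operatorname{plim}\frac{Z'\varepsilon}{n} \\
&= \beta + (Q_{XZ}Q_{ZZ}^{-1}Q_{ZX})^{-1}Q_{XZ}Q_{ZZ}^{-1} \cdot 0 \\
&= \beta
\end{aligned}$$

where $Q_{XZ} = \operatorname{plim} X'Z/n$, $Q_{ZZ} = \operatorname{plim} Z'Z/n$ and $Q_{ZX} = \operatorname{plim} Z'X/n$. The identifying assumptions used in the last two steps are that $\operatorname{plim} Z'\varepsilon/n = 0$ and that $Q_{ZZ}$ and $Q_{XZ}Q_{ZZ}^{-1}Q_{ZX}$ are invertible.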
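A simulation can illustrate the consistency result. The data-generating process below is an assumption made for illustration: `x2` is endogenous because it shares an unobserved component `v` with the error term, while `z1` and `z2` are correlated with `x2` but not with the error. The sketch compares OLS with the estimator $b = (X'P_ZX)^{-1}X'P_Zy$.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0])  # assumed true parameters


def estimates(n):
    """Return (OLS, IV) coefficient vectors for one simulated sample."""
    z1 = rng.normal(size=n)
    z2 = rng.normal(size=n)
    v = rng.normal(size=n)                   # unobserved confounder
    eps = v + rng.normal(size=n)             # error term, unrelated to z1, z2
    x2 = 0.7 * z1 + 0.5 * z2 + v + rng.normal(size=n)  # endogenous regressor

    X = np.column_stack([np.ones(n), x2])
    Z = np.column_stack([np.ones(n), z1, z2])  # x2 replaced by z1 and z2
    y = X @ beta + eps

    b_ols = np.linalg.solve(X.T @ X, X.T @ y)

    # P_Z X computed without forming the n x n matrix P_Z.
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    # Because P_Z is symmetric, X' P_Z X = X_hat' X and X' P_Z y = X_hat' y.
    b_iv = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    return b_ols, b_iv


for n in (1_000, 200_000):
    b_ols, b_iv = estimates(n)
    print(n, round(float(b_ols[1]), 3), round(float(b_iv[1]), 3))
```

Under these assumptions the OLS slope stays away from 2 even in the large sample, while the slope from the instrumented estimator approaches it.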

Question 5

The STATA output needed to answer this question can be found in the following pages.

Assume the following population model

$$wage_i = \beta_1 + \beta_2\, union_i + \beta_3\, indigenous_i + \beta_4\, west_i + \beta_5\, east_i + \beta_6\, indigenous_i \cdot west_i + \beta_7\, indigenous_i \cdot east_i + \varepsilon_i$$

where $wage_i$ is the hourly wage (in dollars) of individual $i$. $union$ is a binary variable equal to one if the individual is unionized or covered by a union (and zero otherwise), and $indigenous$ is a binary variable that equals one if the person identifies themselves as being indigenous (and zero otherwise). Finally, $west$ is a binary variable that equals one if the person lives in Western Canada (and zero otherwise) and $east$ is a binary variable that equals one if the person lives in Eastern Canada (and zero otherwise).¹

a. Interpret the coefficient estimate of $\beta_2$.

Answer: Holding all other factors constant, a worker who is unionized (or covered by a union) makes, on average, 5.04 dollars more per hour than a worker who is not unionized (nor covered by a union).

b. Test whether being unionized or covered by a union has an impact on the wage. Show your work. (5 marks)

Answer:

Step 1:

$$H_0: \beta_2 = 0$$

$$H_a: \beta_2 \neq 0$$

Step 2: Under A1, A2a, A3, A4, A5, and A6

$$t^{stat} = \frac{b_2^{ols} - \beta_2}{se(b_2^{ols})} \sim t_{n-K} \quad (t_{53779})$$

Step 3:

$$t^{stat} = 35.652$$

Furthermore, the critical values are 1.962 and -1.962.

Step 4 (conclusion): Since

$$t^{stat} = 35.652 > 1.962$$

one rejects $H_0$ in favour of $H_a$ at the 5% level of significance. Said less formally, being unionized (or covered by a union) impacts the wage.
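The decision rule in Steps 3 and 4 can be sketched in a few lines. The statistic 35.652 is taken from the answer above; using the standard normal in place of the $t_{53779}$ distribution is an assumption of this sketch, reasonable with so many degrees of freedom. It gives a critical value of about 1.960, marginally below the 1.962 used above, and the conclusion is unchanged.

```python
from statistics import NormalDist


def two_sided_t_test(t_stat, alpha=0.05):
    """Two-sided test using the normal approximation to the t distribution.

    Returns the critical value, the approximate p-value and the decision.
    """
    std_normal = NormalDist()
    crit = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(t_stat)))
    return crit, p_value, abs(t_stat) > crit


crit, p_value, reject = two_sided_t_test(35.652)  # t statistic reported above
print(round(crit, 3), p_value, reject)
```

A statistic this large lies far in the tail, so the null hypothesis is rejected at any conventional significance level.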

c. Do you believe that the test carried out above is valid? Justify your answer. (5 marks)

Answer: Gender probably belongs in the wage equation and women tend to be more unionized than men. As such, assumption A3 probably does not hold, which means that the test is not valid.

¹ Central Canada is the reference group.

Equations

Properties of sums

$$\sum_{i=1}^{n} (x_{i2} + y_i) = \sum_{i=1}^{n} x_{i2} + \sum_{i=1}^{n} y_i$$

$$\sum_{i=1}^{n} a y_i = a \sum_{i=1}^{n} y_i$$

$$\sum_{i=1}^{n} a = n \cdot a$$

Other equations

$$\sum_{i=1}^{n} (y_i - \bar{y}) = 0$$

$$\sum_{i=1}^{n} (x_{i2} - \bar{x}_2)(y_i - \bar{y}) = \sum_{i=1}^{n} (x_{i2} - \bar{x}_2) y_i$$

Expectation properties (with $a$ being a constant)

$$E(w_i + z_i) = E(w_i) + E(z_i)$$

$$E(a w_i) = a E(w_i)$$

$$E(a) = a$$

$$E(w_i) = E_z\{E[w_i \mid z_i]\}$$

Variance and covariance

$$\begin{aligned}
Var(w_i) &= E[(w_i - E(w_i))^2] \\
&= E(w_i^2) - E(w_i)E(w_i)
\end{aligned}$$

$$\begin{aligned}
Cov(w_i, z_i) &= E[(w_i - E(w_i))(z_i - E(z_i))] \\
&= E(w_i z_i) - E(w_i)E(z_i)
\end{aligned}$$

Sample variance and sample covariance

$$s_w^2 = \frac{\sum_{i=1}^{n} (w_i - \bar{w})^2}{n - 1}$$

$$s_{w,z} = \frac{\sum_{i=1}^{n} (w_i - \bar{w})(z_i - \bar{z})}{n - 1}$$

Simple linear regression model

$$y_i = \beta_1 + \beta_2 x_{i2} + \varepsilon_i$$

$$b_2^{ols} = \frac{\sum_{i=1}^{n} (x_{i2} - \bar{x}_2)(y_i - \bar{y})}{\sum_{i=1}^{n} (x_{i2} - \bar{x}_2)^2} \qquad b_1^{ols} = \bar{y} - b_2^{ols}\bar{x}_2$$

$$var(b_2^{ols} \mid X) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_{i2} - \bar{x}_2)^2}$$

$$est.\ var(b_2^{ols} \mid X) = \frac{s^2}{\sum_{i=1}^{n} (x_{i2} - \bar{x}_2)^2}$$

where

$$s^2 = \frac{1}{n - 2} \sum_{i=1}^{n} e_i^2$$

$$se(b_2^{ols}) = \sqrt{\frac{s^2}{\sum_{i=1}^{n} (x_{i2} - \bar{x}_2)^2}}$$

$$y_i = \hat{y}_i + e_i \quad \text{where} \quad \hat{y}_i = b_1 + b_2 x_{i2}$$

Simple linear regression model assumptions

A1: Linearity (in the parameters) of the regression model

A2a: Variation in $x_2$ (in the sample)

A3 (Regarding the error term ($\varepsilon$)): Exogeneity of $X$, $E(\varepsilon_i \mid X) = 0$

A4: Homoskedasticity and absence of autocorrelation, $var(\varepsilon_i) = \sigma^2$ and $cov(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$

A5: Data Generation (stochastic process for the explanatory variable(s))
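The simple-regression formulas above translate directly into code. The sketch below is a plain-Python illustration; the function name `simple_ols` and the five data points are made up for the example and are not from the exam.

```python
from math import sqrt


def simple_ols(x, y):
    """OLS for y_i = b1 + b2 * x_i2 + e_i using the textbook sums."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar

    residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)  # estimate of sigma^2
    se_b2 = sqrt(s2 / sxx)                         # standard error of b2
    return b1, b2, s2, se_b2


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, b2, s2, se_b2 = simple_ols(x, y)
print(b1, b2, s2, se_b2)
```

Dividing the sum of squared residuals by `n - 2` rather than `n` reflects the two estimated parameters, matching the formula for $s^2$.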

R-squared

$$R^2 = \frac{SSR}{SST}$$

where SST (total sum of squares) is

$$\sum_{i=1}^{n} (y_i - \bar{y})^2$$

SSR (regression sum of squares) is

$$\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$

and SSE (sum of squared errors) is

$$\sum_{i=1}^{n} e_i^2$$

Properties of Matrices

a. Given two matrices ($A$ and $B$) of equal dimensions

$$A + B = B + A$$

b. Given three matrices ($A$, $B$ and $C$) of equal dimensions

$$(A + B) + C = A + (B + C)$$

c. Given two matrices ($A$ and $B$) of equal dimensions

$$(A + B)' = A' + B'$$

d.

$$(A')' = A$$

e. In general,

$$AB \neq BA$$

f.

$$(AB)C = A(BC)$$

g.

$$A(B + C) = AB + AC \quad \text{and} \quad (A + B)C = AC + BC$$

h.

$$(AB)' = B'A'$$

i.

$$AI = IA = A$$

where $I$ is the identity matrix

j. If $A$ is square and of full rank, $A^{-1}$ exists, and

$$A^{-1}A = AA^{-1} = I$$

k.

$$(A^{-1})^{-1} = A$$

l.

$$(A^{-1})' = (A')^{-1}$$

m.

$$(AB)^{-1} = B^{-1}A^{-1}$$

Multiple linear regression model assumptions

A1: Linearity (in the parameters) of the regression model

A2: $X$ is full rank (i.e. $rank(X) = K$).

A3 (Regarding the error term ($\varepsilon$)): Exogeneity of $X$, $E(\varepsilon \mid X) = 0$

A4: Homoskedasticity and absence of autocorrelation

$$E(\varepsilon\varepsilon' \mid X) = \sigma^2 I_n$$

A5: Data Generation (stochastic process for the explanatory variable(s))

A6: Normality of the error terms - The disturbance terms are normally distributed, conditional on $X$

$$F^{stat} = \frac{(R^2 - R_*^2)/J}{(1 - R^2)/(n - K)}$$
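The F statistic above is straightforward to compute once the two $R^2$ values are known. The sketch below assumes that $R_*^2$ denotes the $R^2$ of the restricted model and that $J$ is the number of restrictions; the function name `f_stat` and the numbers passed to it are hypothetical and serve only to illustrate the arithmetic.

```python
def f_stat(r2_unrestricted, r2_restricted, n, k, j):
    """F statistic comparing an unrestricted model with a restricted one.

    n: sample size, k: number of parameters in the unrestricted model,
    j: number of restrictions being tested.
    """
    numerator = (r2_unrestricted - r2_restricted) / j
    denominator = (1.0 - r2_unrestricted) / (n - k)
    return numerator / denominator


# Hypothetical values, for illustration only.
f_value = f_stat(r2_unrestricted=0.30, r2_restricted=0.25, n=1_000, k=7, j=2)
print(round(f_value, 3))
```

The statistic grows with the loss of fit from imposing the restrictions, scaled by the unexplained variation per degree of freedom in the unrestricted model.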
