









Econometrics Study Notes: Reviewing Assumptions — Dummy Variables, Structural Change, Specification Analysis, Model Building, Non-Nested (Competing) Models, Least Squares
Assumption 1: “Correctly specified, linear model”
Dummy variables
o Case 1: Two Categories (e.g. female/male)
y_i = β_1 + β_2 D_i + ε_i, with D_i = 1 if female and 0 if male; include a dummy for one category only, not both
o Case 2: Several Categories
y_i = c_1 + c_2 D_spring,i + c_3 D_summer,i + c_4 D_fall,i + ε_i, where:
c_1 = winter intercept
c_1 + c_2 = spring intercept
c_1 + c_3 = summer intercept
c_1 + c_4 = fall intercept
Alternately: y_i = c_1 D_winter,i + c_2 D_spring,i + c_3 D_summer,i + c_4 D_fall,i + ε_i (no overall constant), where:
c_1 = winter intercept
c_2 = spring intercept
c_3 = summer intercept
c_4 = fall intercept
o Case 3: (many categories, many values — just examined)
o Case 4: Threshold Effects
If one wanted to measure the effect of increasing levels of, say, education, one could not set up a single variable where E_i = 1 if high school, 2 if bachelor's, 3 if master's, 4 if PhD, etc., since this assumes that each 'jump' (1 to 2, 2 to 3) has equal 'value.' Instead, use a dummy per level:
income_i = β_1 + β_2 age_i + Σ_{j=1}^{3} δ_j E_j,i + ε_i
(notice that the dummy variable trap is avoided by summing through 3, not 4, which is the total number of categories)
o Case 5: Interaction Terms
Starting from income_i = β_1 + β_2 age_i + β_3 E_i + β_4 D_i + ε_i, allow for interaction by adding a product term such as β_5 (D_i × E_i), so that the education effect can differ across groups (a sketch follows below).
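A minimal sketch in Python of these dummy-variable cases, assuming pandas and statsmodels; the data, column names (income, age, season), and the particular interaction chosen are all hypothetical:

import pandas as pd
import statsmodels.api as sm

# Hypothetical data: 12 observations with a seasonal category.
df = pd.DataFrame({
    "income": [50, 55, 53, 60, 52, 58, 54, 62, 51, 57, 55, 63],
    "age":    [30, 34, 29, 41, 33, 38, 31, 45, 28, 36, 32, 47],
    "season": ["winter", "spring", "summer", "fall"] * 3,
})
df["season"] = pd.Categorical(
    df["season"], categories=["winter", "spring", "summer", "fall"])

# Case 2: drop_first=True keeps 3 of the 4 seasonal dummies, so winter is
# the base category absorbed by the constant -- avoiding the dummy trap.
dummies = pd.get_dummies(df["season"], prefix="D", drop_first=True)
X = sm.add_constant(pd.concat([df[["age"]], dummies], axis=1).astype(float))
print(sm.OLS(df["income"], X).fit().params)

# Case 5: an interaction term lets the age effect differ in summer.
X["age_x_summer"] = X["age"] * X["D_summer"]
print(sm.OLS(df["income"], X).fit().params)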
Structural Change
o We generally assume that β is the same for all y; however, the coefficients may differ across parts of the sample (a structural break).
o H_0: Rβ = q (will tell us whether β applies to all y)
Unrestricted model:
[y_1]   [X_1   0 ] [β_1]   [ε_1]
[y_2] = [ 0   X_2] [β_2] + [ε_2]
where y_1 (n_1×1) and y_2 (n_2×1) are the subsample dependent variables and X_1 (n_1×k), X_2 (n_2×k) the corresponding regressor matrices. Fit each subsample separately,
b_1 = (X_1′X_1)⁻¹X_1′y_1 and b_2 = (X_2′X_2)⁻¹X_2′y_2
and the total residual sum of squares: e′e = e_1′e_1 + e_2′e_2
Restricted Model
o H_0 imposes β_1 = β_2 via R = [I_k  −I_k] (k × 2k) and q = 0 (k × 1)
o Under H_0, pool the subsamples and estimate a single β:
[y_1]   [X_1]
[y_2] = [X_2] β + ε
giving restricted residuals e_* with sum of squares e_*′e_*
o Chow test statistic:
F(k, n_1 + n_2 − 2k) = [(e_*′e_* − e′e)/k] / [e′e/(n_1 + n_2 − 2k)]
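A minimal sketch of the Chow test above, assuming numpy and scipy; chow_test and ssr are hypothetical helper names, and X1, X2 are assumed to already contain a constant column:

import numpy as np
from scipy import stats

def chow_test(y1, X1, y2, X2):
    # Residual sum of squares from an OLS fit on one subsample.
    def ssr(y, X):
        e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        return e @ e

    n1, k = X1.shape
    n2 = X2.shape[0]
    ee = ssr(y1, X1) + ssr(y2, X2)            # unrestricted: e'e = e1'e1 + e2'e2
    ee_star = ssr(np.concatenate([y1, y2]),   # restricted (pooled) fit
                  np.vstack([X1, X2]))
    F = ((ee_star - ee) / k) / (ee / (n1 + n2 - 2 * k))
    return F, stats.f.sf(F, k, n1 + n2 - 2 * k)   # statistic and p-value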
o CUSUM test, based on the recursive residuals w_r:
W_t = Σ_{r=k+1}^{t} w_r / σ̂,  w_r ~ N(0, σ²)
where σ̂² = Σ_{r=k+1}^{T} (w_r − w̄)² / (T − k − 1) and w̄ = Σ_{r=k+1}^{T} w_r / (T − k)
Plot W_t against t; a stable model stays inside the confidence bounds.
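A sketch of the CUSUM computation from recursive residuals, assuming numpy; the helper name cusum is hypothetical, and the standardization follows the formulas above:

import numpy as np

def cusum(y, X):
    T, k = X.shape
    w = []
    for t in range(k, T):
        # Estimate on the first t observations, then predict observation t+1.
        b = np.linalg.lstsq(X[:t], y[:t], rcond=None)[0]
        V = np.linalg.inv(X[:t].T @ X[:t])
        fe = y[t] - X[t] @ b                           # one-step-ahead forecast error
        w.append(fe / np.sqrt(1.0 + X[t] @ V @ X[t]))  # recursive residual w_r
    w = np.array(w)
    sigma_hat = w.std(ddof=1)        # divisor T - k - 1, matching sigma-hat above
    return np.cumsum(w) / sigma_hat  # W_t for t = k+1, ..., T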
Specification Analysis
o Omitting relevant variables:
Biased parameters
Variance is smaller than in the true model, therefore you get higher t-ratios
s² is a biased estimator of σ²
o Including an irrelevant variable:
Unbiased parameters
Variance is greater than the true variance
s² is an unbiased estimator of σ²
Model Building
o R² or adjusted R² (slowly add variables to increase it)
o Akaike Information Criterion: AIC(k) = s_y²(1 − R²)e^{2k/n} (choose the model with the lower AIC)
o Bayesian Information Criterion: BIC(k) = s_y²(1 − R²)n^{k/n} (likewise, lower is better)
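Both criteria in the exact form used here, as a small Python helper (aic_bic is a hypothetical name); note that s_y²(1 − R²) reduces to e′e/n:

import numpy as np

def aic_bic(e, n, k):
    # s_y^2 * (1 - R^2) = e'e / n, so both criteria need only the residuals.
    s2 = (e @ e) / n
    aic = s2 * np.exp(2 * k / n)   # AIC(k) = s_y^2 (1 - R^2) e^{2k/n}
    bic = s2 * n ** (k / n)        # BIC(k) = s_y^2 (1 - R^2) n^{k/n}
    return aic, bic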
Non-Nested or Competing Models
o Macroeconomics makes use of this
o Encompassing Test
(Look for the variables in common.) W = X ∩ Z, where X̃ and Z̃ are the variables remaining in each model once W is removed
o J-Test: y = (1 − α)Xβ + αZγ + ε
Construct Zγ̂ (fit the Z model first and form its fitted values)
Run y = (1 − α)Xβ + α(Zγ̂) + ε and test H_0: α = 0 with the t-ratio on α̂
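A sketch of the J-test procedure, assuming statsmodels; j_test is a hypothetical helper, and a significant t-ratio on α is evidence against the X model:

import numpy as np
import statsmodels.api as sm

def j_test(y, X, Z):
    gamma_hat = sm.OLS(y, Z).fit().params   # first step: fit the Z model
    zg = Z @ gamma_hat                      # construct Z @ gamma-hat
    res = sm.OLS(y, np.column_stack([X, zg])).fit()
    # Last coefficient is alpha; its t-ratio tests H0: alpha = 0.
    return res.tvalues[-1], res.pvalues[-1]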
o Cox Test: this was not discussed in class (there is a complex discussion in the book)
[Figure: w_t plotted against time — a stable model stays within the bounds; an unstable model breaks out.]
Assumption 2: The matrix X has rank K
Multicollinearity
o Two cases:
Perfect multicollinearity — solution: drop a variable (if possible)
Near or high multicollinearity
o Detect with (a sketch follows below):
Variance Inflation Factor: VIF_k = 1/(1 − R_k²), where R_k² comes from regressing x_k on the other regressors (15–20 is a large number)
Condition number: √(LCR/SCR) (if > 20, then problems), where LCR and SCR are the largest and smallest characteristic roots of X′X
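A sketch of both diagnostics, assuming numpy and statsmodels; collinearity_diagnostics is a hypothetical helper, and X is assumed to exclude the constant column:

import numpy as np
import statsmodels.api as sm

def collinearity_diagnostics(X):
    vifs = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        # R_j^2 from regressing column j on the remaining columns.
        r2 = sm.OLS(X[:, j], sm.add_constant(others)).fit().rsquared
        vifs.append(1.0 / (1.0 - r2))          # VIF_j = 1 / (1 - R_j^2)
    roots = np.linalg.eigvalsh(X.T @ X)        # characteristic roots of X'X
    cond = np.sqrt(roots.max() / roots.min())  # > 20 signals trouble
    return vifs, cond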
o Fix: remove observations or variables (however, not always possible due to theory, etc.)
Missing observations
o Ignorable – data are unavailable for unknown reasons:
Case A: Y_A, X_A — n_A observations on both X and Y are available
Case B: —, X_B — n_B observations missing on Y
Case C: Y_C, — — n_C observations missing on X
With only the n_A complete observations, the slope estimate is
b = Σ_{i=1}^{n_A} (x_i − x̄)(y_i − ȳ) / Σ_{i=1}^{n_A} (x_i − x̄)²
so when we add an observation filled in at x = x̄, it makes no addition, since (x̄ − x̄) = 0.
o R² will be lower (recall that R² = 1 − e′e/(Y′M⁰Y); each additional y_i adds other variation), and s² will change
o Systematic – the data are missing for a reason (a sample-selection-bias issue)
Outliers or Influential Observations
Large-Sample Results
Let X_n and Y_n be matrices whose elements are random variables, with plim X_n = A and plim Y_n = B:
plim X_n⁻¹ = A⁻¹
Convergence in distribution: lim_{n→∞} F(x_n) − F(x) = 0 at all continuity points of F(·); notation: x_n →d x
Rules (let x_n →d x and plim y_n = c):
x_n y_n →d cx
x_n + y_n →d x + c
x_n / y_n →d x/c, if c ≠ 0
If x_n − y_n →p 0, then y_n has the same limiting distribution as x_n
Least Squares:
o bLS is a consistent estimator of β, plim b = β
Use the assumption that plim(X′X/n) = Q, with Q⁻¹ existing; then
b = β + (X′X/n)⁻¹(X′ε/n)
plim b = β + Q⁻¹ plim(X′ε/n)
It remains to show that plim(X′ε/n) = 0, somehow… using convergence in mean square:
E[X′ε/n] = 0, and
Var[X′ε/n] = (σ²/n)(X′X/n), with lim_{n→∞} (σ²/n)Q = 0
so plim(X′ε/n) = 0 and plim b = β
o b has a limiting distribution that is normal
o Stabilizing transformation: √n(b − β) converges in distribution to a normal distribution
Lindeberg–Lévy univariate Central Limit Theorem: for x̄_n = (1/n) Σ_{i=1}^{n} x_i,
√n(x̄_n − μ) →d N(0, σ²) (this is the distribution of the statistic)
√n(b − β) = (X′X/n)⁻¹ (1/√n)X′ε
Here we need to show that the latter term is distributed normally, since Q is a constant:
o E[(1/√n)X′ε] = 0, by assumptions 3 and 5
o Var[(1/√n)X′ε] = σ²(X′X/n) →p σ²Q
o Therefore (1/√n)X′ε →d N(0, σ²Q)
o And:
√n(b − β) →d N(0, Q⁻¹(σ²Q)Q⁻¹) = N(0, σ²Q⁻¹)
b →d N(β, (σ²/n)Q⁻¹) ≈ N(β, σ²(X′X)⁻¹)
o Implications for t- and F-tests
t-test: we need plim s² = σ²; write
s² = e′e/(n − k) = [n/(n − k)](e′e/n)
plim s² = plim[n/(n − k)] · plim(e′e/n) = 1 · σ² = σ²
so the asymptotic covariance matrix of b is estimated by
Est. Asy. Var[b] = σ̂²(X′X)⁻¹ = (σ̂²/n)(X′X/n)⁻¹
o Heteroskedasticity:
Var(ε|X) = σ²Ω = diag(σ_11, σ_22, …, σ_nn) — different variances down the diagonal, zeros off it (n × n)
Often found in cross-section data, also in high-frequency data
o Autocorrelation (memory, persistence):
Var(ε|X) = σ²Ω with 1's on the diagonal and nonzero correlations (ρ_1, ρ_2, …) filling the off-diagonals (n × n)
o Least Squares in the general context:
Var(b|X) = σ²(X′X)⁻¹(X′ΩX)(X′X)⁻¹
Not an efficient estimator
Assume plim(X′ΩX/n) = Q*; then b_LS is still consistent
Also asymptotically normal: b_LS →d N(β, (σ²/n) Q⁻¹Q*Q⁻¹)
However, asymptotically inefficient: it does not achieve the CRLB
o Omega knowledge:
Ω is known: use GLS directly (see below)
If only the structure of Ω is known, run a regression, forecast, and yield Ω̂, which can then be used; if that auxiliary regression has spherical disturbances (no autocorrelation or heteroskedasticity), the resulting σ̂² can be used as well
Ω is completely unknown: Ω contains n(n + 1)/2 parameters, which is greater than n (i.e. it cannot be estimated); but X′ΩX involves only k(k + 1)/2 < n parameters, therefore (1/n)X′ΩX can be approximated
o Define Q* = (1/n) Σ_{i=1}^{n} Σ_{j=1}^{n} σ_ij x_i x_j′ — this is (1/n)X′ΩX written element by element
o White's (Heteroskedasticity) Consistent Covariance Matrix:
S_0 = (1/n) Σ_{i=1}^{n} e_i² x_i x_i′, with plim S_0 = Q*
Est. Var[b_LS] = (1/n)(X′X/n)⁻¹ S_0 (X′X/n)⁻¹
— conduct inference using White's heteroskedasticity-consistent covariance matrix in place of s²(X′X)⁻¹
o Newey–West Autocorrelation Consistent Covariance Matrix:
Q̂ = S_0 + (1/n) Σ_{l=1}^{L} Σ_{t=l+1}^{T} w_l e_t e_{t−l} (x_t x_{t−l}′ + x_{t−l} x_t′)
where w_l = 1 − l/(L + 1) and L is the lag length
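In practice both corrections are available in statsmodels through the cov_type argument of OLS.fit; a small sketch with simulated heteroskedastic data (all names and the lag choice are hypothetical):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = sm.add_constant(x)
y = 1.0 + 2.0 * x + (1 + x**2) * rng.normal(size=200)  # heteroskedastic errors

ols = sm.OLS(y, X)
print(ols.fit(cov_type="HC0").bse)    # White's heteroskedasticity-consistent SEs
print(ols.fit(cov_type="HAC",         # Newey-West, lag length L = 4
              cov_kwds={"maxlags": 4}).bse)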
o Generalized Least Squares:
Transform the data in such a way that assumption 4, Var(ε|X) = σ²I, is satisfied
Model: Y = Xβ + ε, Var(ε|X) = σ²Ω; transform the data into Y* = PY, X* = PX with P = Ω^{−1/2}
Spectral decomposition: Ω = CΛC′, so P = Λ^{−1/2}C′
β̂_GLS = (X*′X*)⁻¹X*′Y* = (X′Ω⁻¹X)⁻¹X′Ω⁻¹Y
Assuming plim(X*′X*/n) = Q*, β̂_GLS is consistent and √n(β̂_GLS − β) →d N(0, σ²Q*⁻¹)
Good to use in smaller-sample case, MLE for larger samples
o Weighted Least Squares
Size-of-industry case: divide all observations through by size s_i:
y_i/s_i = β_1(1/s_i) + β_2(x_i2/s_i) + … + β_k(x_ik/s_i) + ε_i/s_i
so that the transformed disturbances have constant variance.
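A sketch of the size-weighting idea, assuming statsmodels; the simulated "size" variable and all names are hypothetical. Passing weights = 1/s_i² to WLS is equivalent to dividing every observation through by s_i:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
size = rng.uniform(1.0, 10.0, 100)     # hypothetical industry sizes s_i
x = rng.normal(size=100)
X = sm.add_constant(x)
y = 2.0 + 0.5 * x + size * rng.normal(size=100)   # Var(eps_i) grows with s_i^2

# weights = 1/s_i^2 reproduces the divide-everything-by-s_i transformation.
print(sm.WLS(y, X, weights=1.0 / size**2).fit().params)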
o What is the structure of Ω? (Case when Ω is unknown but its structure is known)
White's Test
Regress the squared residuals on the regressors, their squares, and their cross products:
e_i² = α_1 + α_2 x_2i + α_3 x_3i + α_4 x_2i² + α_5 x_3i² + α_6 x_2i x_3i + v_i
H_0: σ_i² = σ², H_1: not H_0
nR² ~ χ²(P), where P is the number of regressors in f(·) above
Goldfeld–Quandt Test
H_0: σ_1² = σ_2², H_1: not H_0
Sort the observations by x and cut the sample in half; sample 1: e_1′e_1, sample 2: e_2′e_2
F(n_2 − k, n_1 − k) = [e_2′e_2/(n_2 − k)] / [e_1′e_1/(n_1 − k)], where e_2′e_2 is on the numerator because we expect σ_1² < σ_2²
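A sketch of the sort-split-compare procedure, assuming numpy and scipy; goldfeld_quandt and ssr are hypothetical helpers. (statsmodels also ships ready-made versions: het_white and het_goldfeldquandt in statsmodels.stats.diagnostic.)

import numpy as np
from scipy import stats

def goldfeld_quandt(y, x, X):
    def ssr(yy, XX):
        e = yy - XX @ np.linalg.lstsq(XX, yy, rcond=None)[0]
        return e @ e

    order = np.argsort(x)            # sort observations by the suspect variable
    y, X = y[order], X[order]
    half = len(y) // 2
    k = X.shape[1]
    e1e1, n1 = ssr(y[:half], X[:half]), half
    e2e2, n2 = ssr(y[half:], X[half:]), len(y) - half
    # e2'e2 on the numerator because sigma_1^2 < sigma_2^2 is expected.
    F = (e2e2 / (n2 - k)) / (e1e1 / (n1 - k))
    return F, stats.f.sf(F, n2 - k, n1 - k)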
o Feasible GLS once heteroskedasticity is found (H_0: σ_i² = σ² rejected):
β̂_FGLS = (X′Ω̂⁻¹X)⁻¹X′Ω̂⁻¹Y, with Est. Var[β̂_FGLS] = σ̂²(X′Ω̂⁻¹X)⁻¹
Run the regression implied by σ_i² = σ² x_i^λ ⇔ ln σ_i² = ln σ² + λ ln x_i, i.e. regress ln e_i² on ln x_i to get λ̂.
Conduct the forecast and, since it is in logs, exponentiate and put the results on the diagonals of Ω̂.
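A sketch of this two-step FGLS, assuming statsmodels and that the heteroskedasticity is driven by a strictly positive variable z (fgls_multiplicative is a hypothetical helper):

import numpy as np
import statsmodels.api as sm

def fgls_multiplicative(y, X, z):
    e = sm.OLS(y, X).fit().resid           # step 1: OLS residuals
    # step 2: ln e_i^2 = ln sigma^2 + lambda * ln z_i
    aux = sm.OLS(np.log(e**2), sm.add_constant(np.log(z))).fit()
    omega_diag = np.exp(aux.fittedvalues)  # exponentiate the log forecast
    # step 3: weight by the estimated diagonal of Omega-hat.
    return sm.WLS(y, X, weights=1.0 / omega_diag).fit()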
o Maximum Likelihood Estimator
Likelihood function of all parameters (including Ω)
Likelihood ratio test: LR = −2(ln L_0 − ln L_1) ~ χ²(P), where P = the number of α's and L_0 is the restricted (homoskedastic) likelihood
Time-Series Model
o y_t = d_0 + d_1 x_t + ε_t; note that {Y_t}_{t = −∞}^{∞} is the whole process, while t = 1, …, T is a "time-window"
o With time-series data we run into two problems:
Independence
Randomness
o Properties:
Stationarity – expect the mean, variance, and covariances to be finite; we know they are well behaved
Ergodicity – (has to do with independence)
Martingale sequences – (fixes the randomness problem)
o With these three ideas we define a Central Limit Theorem based on martingale sequences; we continue to maintain E(ε_t | X) = 0 while allowing Cov(ε_t, ε_{t−s}) ≠ 0 for t ≠ s
o Ways to determine autocorrelation / capture the covariance:
Autoregressive process — portmanteau (Box–Pierce) statistic:
Q = T Σ_{j=1}^{P} r_j² ~ χ²(P), where
r_j = Σ_{t=j+1}^{T} e_t e_{t−j} / Σ_{t=1}^{T} e_t²
— essentially a correlation coefficient between the residuals and the lag-j residuals.
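The Q statistic as a small numpy function (box_pierce is a hypothetical name), following the r_j formula above:

import numpy as np
from scipy import stats

def box_pierce(e, P):
    T = len(e)
    denom = e @ e
    # r_j = sum_{t=j+1}^T e_t e_{t-j} / sum_t e_t^2
    r = np.array([(e[j:] @ e[:-j]) / denom for j in range(1, P + 1)])
    Q = T * (r @ r)                # Q = T * sum_j r_j^2
    return Q, stats.chi2.sf(Q, P)  # asymptotically chi^2(P) under H0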
o Durbin–Watson Test
D = Σ_{t=2}^{T} (e_t − e_{t−1})² / Σ_{t=1}^{T} e_t² (notice only one lag here)
T is the number of observations and k is the number of parameters
Lower limit: d_L(T, k)
Upper limit: d_U(T, k)
Hypothesis testing: reject H_0 (no autocorrelation) if D < d_L, do not reject if D > d_U; between the limits the test is inconclusive
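The statistic as a one-line numpy computation (statsmodels also provides it as statsmodels.stats.stattools.durbin_watson):

import numpy as np

def durbin_watson(e):
    # D = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_t e_t^2, roughly 2(1 - r).
    return np.sum(np.diff(e)**2) / (e @ e)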
With a lagged dependent variable the DW test breaks down:
y_t = β_1 + β_2 x_t + β_3 y_{t−1} + ε_t, with ε_t = ρε_{t−1} + u_t
⇒ y_t − ρy_{t−1} = β_1(1 − ρ) + β_2(x_t − ρx_{t−1}) + β_3(y_{t−1} − ρy_{t−2}) + u_t
Now the regressor y_{t−1} is dependent on ε, therefore least squares is not consistent. Can use the Durbin h test:
h = r √(T / (1 − T·Var(ĉ))), where ĉ is the estimated coefficient on y_{t−1} and
r = Σ_{t=2}^{T} e_t e_{t−1} / Σ_{t=1}^{T} e_t²
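A sketch of the h statistic, assuming numpy; durbin_h is a hypothetical helper, and var_c is the estimated variance of the coefficient on y_{t−1} taken from the fitted model:

import numpy as np

def durbin_h(e, var_c, T):
    r = (e[1:] @ e[:-1]) / (e @ e)   # lag-1 residual autocorrelation
    # h ~ N(0, 1) under H0; undefined when T * var_c >= 1.
    return r * np.sqrt(T / (1.0 - T * var_c))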
Moving Average: