Ordered & Multinomial Models

MacroEconometricModels.jl provides ordered logit/probit and multinomial logit regression for categorical dependent variables with three or more outcomes, estimated via Newton-Raphson maximum likelihood. All models produce Stata-style output and integrate with the package's StatsAPI interface, marginal effects infrastructure, and specification testing.

Ordered logit (cumulative logistic link) for ordinal outcomes (McCullagh 1980)
Ordered probit (cumulative normal link) for ordinal outcomes
Multinomial logit (softmax) for unordered categorical outcomes (McFadden 1974)
Average marginal effects (AME) for all three models with proper K × J output (Cameron & Trivedi 2005)
Brant test of the proportional odds assumption for ordered logit (Brant 1990)
Hausman-McFadden IIA test for multinomial logit (Hausman & McFadden 1984)
Robust inference: MLE, HC0, HC1, and cluster-robust standard errors
StatsAPI interface: coef, vcov, predict, confint, stderror, nobs, loglikelihood

using MacroEconometricModels, Random, Distributions
Random.seed!(42)
# Helper: generate ordinal outcome from cumulative logistic model
function _gen_ordered_logit(rng, n, X, beta, cuts)
    xb = X * beta
    y = Vector{Int}(undef, n)
    for i in 1:n
        u = rand(rng)
        y[i] = length(cuts) + 1  # default = highest category
        for (j, c) in enumerate(cuts)
            if u < 1.0 / (1.0 + exp(-(c - xb[i])))
                y[i] = j; break
            end
        end
    end
    y
end
# Helper: generate ordinal outcome from cumulative probit model
function _gen_ordered_probit(rng, n, X, beta, cuts)
    xb = X * beta
    d = Normal()
    y = Vector{Int}(undef, n)
    for i in 1:n
        u = rand(rng)
        y[i] = length(cuts) + 1
        for (j, c) in enumerate(cuts)
            if u < cdf(d, c - xb[i])
                y[i] = j; break
            end
        end
    end
    y
end

<< @setup-block not executed in draft mode >>

Quick Start

Recipe 1: Ordered logit

n = 1000
X = randn(n, 2)
y = _gen_ordered_logit(Random.default_rng(), n, X, [1.0, -0.5], [0.0, 1.5])
m = estimate_ologit(y, X; varnames=["income", "education"])
report(m)

<< @example-block not executed in draft mode >>

Recipe 2: Ordered probit

n = 1000
X = randn(n, 2)
y = _gen_ordered_probit(Random.default_rng(), n, X, [0.8, -0.5], [0.0, 1.0])
m = estimate_oprobit(y, X; varnames=["income", "education"])
report(m)

<< @example-block not executed in draft mode >>

Recipe 3: Multinomial logit

n = 1000
X = hcat(ones(n), randn(n, 2))
# True coefficients: K=3 covariates × (J-1)=2 alternatives
beta_true = [0.5 -0.3; 1.0 -0.5; -0.5 0.8]
V = X * beta_true
eV = exp.(V)
P = hcat(ones(n), eV) ./ (1.0 .+ sum(eV, dims=2))
# Draw from categorical distribution
y = [findfirst(cumsum(P[i, :]) .>= rand()) for i in 1:n]
m = estimate_mlogit(y, X; varnames=["const", "x1", "x2"])
report(m)

<< @example-block not executed in draft mode >>

Recipe 4: Marginal effects for ordered logit

n = 1000
X = randn(n, 2)
y = _gen_ordered_logit(Random.default_rng(), n, X, [1.0, -0.5], [0.0, 1.5])
m = estimate_ologit(y, X; varnames=["income", "education"])
me = marginal_effects(m)
# K × J matrix: each row sums to zero across categories
round.(me.effects, digits=4)

<< @example-block not executed in draft mode >>

Recipe 5: Brant test (proportional odds)

n = 1000
X = randn(n, 2)
y = _gen_ordered_logit(Random.default_rng(), n, X, [1.0, -0.5], [0.0, 1.5])
m = estimate_ologit(y, X; varnames=["income", "education"])
bt = brant_test(m)
round(bt.pvalue, digits=4)

<< @example-block not executed in draft mode >>

Recipe 6: Hausman IIA test

# Need J >= 4 so that omitting one category leaves >= 3
n = 1500
X = hcat(ones(n), randn(n, 2))
beta_true = [0.3 -0.2 0.1; 0.8 -0.4 0.5; -0.3 0.6 -0.2]  # K=3 × (J-1)=3 → J=4
V = X * beta_true
eV = exp.(V)
P = hcat(ones(n), eV) ./ (1.0 .+ sum(eV, dims=2))
y = [findfirst(cumsum(P[i, :]) .>= rand()) for i in 1:n]
m = estimate_mlogit(y, X; varnames=["const", "x1", "x2"])
# Test IIA by omitting category 4
iia = hausman_iia(m; omit_category=4)
round(iia.pvalue, digits=4)

<< @example-block not executed in draft mode >>

Ordered Logit

The ordered logit (proportional odds) model relates an ordinal outcome $y_i \in \{1, 2, \ldots, J\}$ to regressors $x_i$ through the cumulative logistic distribution (McCullagh 1980):

\[P(y_i \leq j \mid x_i) = \Lambda(\alpha_j - x_i' \beta), \quad j = 1, \ldots, J-1\]

where:

$y_i$ is the ordinal dependent variable with $J$ categories
$x_i$ is the $K \times 1$ regressor vector (no intercept — absorbed by cutpoints)
$\beta$ is the $K \times 1$ slope coefficient vector
$\alpha_1 < \alpha_2 < \cdots < \alpha_{J-1}$ are the cutpoints (thresholds)
$\Lambda(\cdot)$ is the logistic CDF

The category probabilities follow from differencing:

\[P(y_i = j \mid x_i) = F(\alpha_j - x_i' \beta) - F(\alpha_{j-1} - x_i' \beta)\]

with the convention $F(\alpha_0) = 0$ and $F(\alpha_J) = 1$.

No intercept in X

The intercept is absorbed into the cutpoints. Including a column of ones in X causes identification failure.

Estimation

The model is estimated by Newton-Raphson MLE using the BHHH (outer product of gradients) approximation to the Hessian. The optimizer enforces cutpoint ordering $\alpha_1 < \alpha_2 < \cdots < \alpha_{J-1}$ at each iteration.

n = 1000
X = randn(n, 3)
y = _gen_ordered_logit(Random.default_rng(), n, X, [1.0, -0.5, 0.3], [0.0, 1.5])
m = estimate_ologit(y, X; varnames=["income", "education", "age"], cov_type=:hc1)
report(m)

<< @example-block not executed in draft mode >>

The output reports slope coefficients and cutpoints separately. A positive $\beta_k$ shifts probability mass toward higher categories. McFadden's pseudo $R^2 = 1 - \ell(\hat{\beta}) / \ell_0$ measures improvement over the null (cutpoints-only) model.

Keywords

Keyword	Type	Default	Description
`cov_type`	`Symbol`	`:ols`	Covariance estimator: `:ols` (MLE), `:hc0`, `:hc1`, `:cluster`
`varnames`	`Vector{String}`	auto	Coefficient names
`clusters`	`AbstractVector`	`nothing`	Cluster assignments (required for `:cluster`)
`maxiter`	`Int`	`200`	Maximum Newton-Raphson iterations
`tol`	`Real`	$10^{-8}$	Convergence tolerance

Return Values

Field	Type	Description
`beta`	`Vector{T}`	Slope coefficients (K)
`cutpoints`	`Vector{T}`	Estimated cutpoints (J-1)
`vcov_mat`	`Matrix{T}`	Joint vcov of $[\beta; \alpha]$
`fitted`	`Matrix{T}`	Predicted probabilities (n × J)
`loglik`	`T`	Maximized log-likelihood
`pseudo_r2`	`T`	McFadden's pseudo R-squared
`categories`	`Vector`	Original category values
`converged`	`Bool`	Convergence flag

Ordered Probit

The ordered probit model uses the standard normal CDF $\Phi(\cdot)$ as the link function:

\[P(y_i \leq j \mid x_i) = \Phi(\alpha_j - x_i' \beta)\]

The API is identical to ordered logit. The choice between logit and probit is largely a matter of convention — both produce similar marginal effects in practice.

n = 1000
X = randn(n, 2)
y = _gen_ordered_probit(Random.default_rng(), n, X, [0.8, -0.5], [0.0, 1.0])
m = estimate_oprobit(y, X; varnames=["income", "education"])
report(m)

<< @example-block not executed in draft mode >>

Multinomial Logit

The multinomial logit model relates an unordered categorical outcome $y_i \in \{1, 2, \ldots, J\}$ to regressors through the softmax function (McFadden 1974):

\[P(y_i = j \mid x_i) = \frac{\exp(x_i' \beta_j)}{\sum_{k=1}^{J} \exp(x_i' \beta_k)}\]

where:

$\beta_1 = 0$ (base category normalization)
$\beta_j$ is the $K \times 1$ coefficient vector for alternative $j = 2, \ldots, J$
The model estimates $K \times (J-1)$ free parameters

Technical Note

The implementation uses the log-sum-exp trick for numerical stability and computes the analytical Hessian (not BHHH) for fast convergence. The coef(m) function returns vec(beta) of length $K(J-1)$.

Estimation

n = 1000
X = hcat(ones(n), randn(n, 2))
beta_true = [0.5 -0.3; 1.0 -0.5; -0.5 0.8]
V = X * beta_true
eV = exp.(V)
P = hcat(ones(n), eV) ./ (1.0 .+ sum(eV, dims=2))
y = [findfirst(cumsum(P[i, :]) .>= rand()) for i in 1:n]
m = estimate_mlogit(y, X; varnames=["const", "x1", "x2"])
report(m)

<< @example-block not executed in draft mode >>

The output displays one coefficient table per alternative (relative to the base category). Positive $\beta_{j,k}$ means higher $x_k$ increases the probability of choosing alternative $j$ relative to the base.

Keywords

Keyword	Type	Default	Description
`cov_type`	`Symbol`	`:ols`	Covariance estimator: `:ols` (MLE), `:hc0`, `:hc1`, `:cluster`
`varnames`	`Vector{String}`	auto	Coefficient names
`clusters`	`AbstractVector`	`nothing`	Cluster assignments (required for `:cluster`)
`maxiter`	`Int`	`200`	Maximum Newton-Raphson iterations
`tol`	`Real`	$10^{-8}$	Convergence tolerance

Return Values

Field	Type	Description
`beta`	`Matrix{T}`	Coefficient matrix (K × (J-1)), base = category 1
`vcov_mat`	`Matrix{T}`	Vcov of `vec(beta)`, K(J-1) × K(J-1)
`fitted`	`Matrix{T}`	Predicted probabilities (n × J)
`loglik`	`T`	Maximized log-likelihood
`pseudo_r2`	`T`	McFadden's pseudo R-squared
`categories`	`Vector`	Original category values

Marginal Effects

Coefficients in ordered and multinomial models do not have direct marginal effect interpretations because the probability functions are nonlinear. Average marginal effects (AME) compute the mean derivative across all observations (Cameron & Trivedi 2005).

Ordered Models

For the ordered logit, the AME of variable $k$ on outcome $j$ is:

\[\text{AME}_{k,j} = \frac{1}{n} \sum_{i=1}^{n} \left[ f(\alpha_{j-1} - x_i' \beta) - f(\alpha_j - x_i' \beta) \right] \beta_k\]

where $f(\cdot)$ is the logistic PDF. The AMEs sum to zero across categories for each variable — increasing probability in one category must decrease it elsewhere.

n = 1000
X = randn(n, 2)
y = _gen_ordered_logit(Random.default_rng(), n, X, [1.0, -0.5], [0.0, 1.5])
m = estimate_ologit(y, X; varnames=["income", "education"])
me = marginal_effects(m)
# Row k = variable, column j = category
round.(me.effects, digits=4)

<< @example-block not executed in draft mode >>

Multinomial Logit

For the multinomial logit, the AME formula accounts for the softmax structure:

\[\text{AME}_{k,j} = \frac{1}{n} \sum_{i=1}^{n} p_{ij} \left( \beta_{j,k} - \sum_{m=1}^{J} p_{im} \beta_{m,k} \right)\]

n = 1000
X = hcat(ones(n), randn(n, 2))
beta_true = [0.5 -0.3; 1.0 -0.5; -0.5 0.8]
V = X * beta_true
eV = exp.(V)
P = hcat(ones(n), eV) ./ (1.0 .+ sum(eV, dims=2))
y = [findfirst(cumsum(P[i, :]) .>= rand()) for i in 1:n]
m = estimate_mlogit(y, X; varnames=["const", "x1", "x2"])
me = marginal_effects(m)
round.(me.effects, digits=4)

<< @example-block not executed in draft mode >>

Specification Tests

Brant Test (Proportional Odds)

The Brant test evaluates the proportional odds (parallel regression) assumption underlying ordered logit (Brant 1990). Under the null, the slope coefficients are equal across all $J-1$ binary logits that split the outcome at each cutpoint.

The test fits $J-1$ separate binary logits ($y \leq j$ vs $y > j$) and constructs a Wald statistic comparing the binary logit coefficients to each other. Rejection suggests the ordered logit model is misspecified — different cutpoints produce different slope estimates.

n = 1000
X = randn(n, 2)
y = _gen_ordered_logit(Random.default_rng(), n, X, [1.0, -0.5], [0.0, 1.5])
m = estimate_ologit(y, X; varnames=["income", "education"])
bt = brant_test(m)
# Overall test
println("Chi-squared: ", round(bt.statistic, digits=3), ", p-value: ", round(bt.pvalue, digits=4))
# Per-variable p-values
round.(bt.per_variable, digits=4)

<< @example-block not executed in draft mode >>

Field	Type	Description
`statistic`	`T`	Overall Wald statistic
`pvalue`	`T`	P-value (chi-squared with $K(J-2)$ df)
`df`	`Int`	Degrees of freedom
`per_variable`	`Vector{T}`	Per-variable p-values ($J-2$ df each)
`binary_coefs`	`Matrix{T}`	K × (J-1) matrix of binary logit coefficients

Hausman-McFadden IIA Test

The independence of irrelevant alternatives (IIA) assumption states that the odds ratio between any two alternatives is independent of other alternatives. The Hausman-McFadden test (1984) re-estimates the multinomial logit excluding one category and compares the coefficients to the full model.

# Need J >= 4 so that omitting one leaves >= 3
n = 1500
X = hcat(ones(n), randn(n, 2))
beta_true = [0.3 -0.2 0.1; 0.8 -0.4 0.5; -0.3 0.6 -0.2]
V = X * beta_true
eV = exp.(V)
P = hcat(ones(n), eV) ./ (1.0 .+ sum(eV, dims=2))
y = [findfirst(cumsum(P[i, :]) .>= rand()) for i in 1:n]
m = estimate_mlogit(y, X; varnames=["const", "x1", "x2"])
iia = hausman_iia(m; omit_category=4)
println("Hausman stat: ", round(iia.statistic, digits=3), ", p-value: ", round(iia.pvalue, digits=4))
nothing # hide

<< @example-block not executed in draft mode >>

Failure to reject supports the IIA assumption. If IIA is rejected, consider nested logit or mixed logit alternatives.

Field	Type	Description
`statistic`	`T`	Hausman test statistic
`pvalue`	`T`	P-value (chi-squared)
`df`	`Int`	Degrees of freedom
`omitted_category`	any	Label of the omitted category

Out-of-Sample Prediction

All three models support predict(m, X_new) for computing predicted probabilities on new data:

n = 1000
X = randn(n, 2)
y = _gen_ordered_logit(Random.default_rng(), n, X, [1.0, -0.5], [0.0, 1.5])
m = estimate_ologit(y, X; varnames=["income", "education"])
# Predict on new observations
X_new = randn(5, 2)
probs = predict(m, X_new)
round.(probs, digits=3)

<< @example-block not executed in draft mode >>

Each row sums to 1. For multinomial logit, the same interface applies via predict(m, X_new).

Complete Example

A full workflow estimating ordered and multinomial models on the same simulated survey data:

# Simulate ordinal satisfaction data (5 categories)
n = 2000
X = randn(n, 3)
xb = X * [0.8, -0.4, 0.3]
cuts = [-1.5, -0.5, 0.5, 1.5]
d = Normal()
F = hcat([cdf.(d, c .- xb) for c in cuts]...)
u = rand(n)
y = ones(Int, n) .* 5
for j in 4:-1:1
    y[u .< F[:, j]] .= j
end

# Ordered logit
m_ologit = estimate_ologit(y, X; varnames=["income", "education", "age"])
report(m_ologit)

<< @example-block not executed in draft mode >>

# Marginal effects
me = marginal_effects(m_ologit)
round.(me.effects, digits=4)

<< @example-block not executed in draft mode >>

# Brant test
bt = brant_test(m_ologit)
println("Brant test p-value: ", round(bt.pvalue, digits=4))
nothing # hide

<< @example-block not executed in draft mode >>

# Ordered probit on same data
m_oprobit = estimate_oprobit(y, X; varnames=["income", "education", "age"])
report(m_oprobit)

<< @example-block not executed in draft mode >>

# Multinomial logit on unordered version
m_mlogit = estimate_mlogit(y, hcat(ones(n), X); varnames=["const", "income", "education", "age"])
report(m_mlogit)

<< @example-block not executed in draft mode >>

Common Pitfalls

Including an intercept in ordered models. The cutpoints absorb the intercept. Adding a column of ones to X causes multicollinearity and estimation failure.
Fewer than 3 categories. Both ordered and multinomial models require $J \geq 3$. For binary outcomes, use estimate_logit or estimate_probit instead.
Ignoring the proportional odds assumption. Always run brant_test after ordered logit. If rejected, the coefficients vary across cutpoints and a generalized ordered logit or multinomial logit is more appropriate.
Interpreting multinomial logit coefficients as marginal effects. The $\beta_{j,k}$ coefficients measure log-odds ratios relative to the base category, not marginal changes in probability. Always compute marginal_effects(m) for substantive interpretation.
IIA violations in multinomial logit. If hausman_iia rejects for any omitted category, the multinomial logit model is misspecified. The relative odds between remaining alternatives should not change when an alternative is removed.

References

Agresti, A. (2010). Analysis of Ordinal Categorical Data. 2nd ed. Wiley. ISBN 978-0-470-08289-8.
Brant, R. (1990). Assessing Proportionality in the Proportional Odds Model for Ordinal Logistic Regression. Biometrics 46(4), 1171-1178. DOI
Cameron, A. C. & Trivedi, P. K. (2005). Microeconometrics: Methods and Applications. Cambridge University Press. ISBN 978-0-521-84805-3.
Greene, W. H. (2012). Econometric Analysis. 7th ed. Prentice Hall. ISBN 978-0-131-39538-1.
Hausman, J. A. & McFadden, D. (1984). Specification Tests for the Multinomial Logit Model. Econometrica 52(5), 1219-1240. DOI
McCullagh, P. (1980). Regression Models for Ordinal Data. Journal of the Royal Statistical Society: Series B 42(2), 109-142. DOI
McFadden, D. (1974). Conditional Logit Analysis of Qualitative Choice Behavior. In P. Zarembka (Ed.), Frontiers in Econometrics (pp. 105-142). Academic Press.
Train, K. E. (2009). Discrete Choice Methods with Simulation. 2nd ed. Cambridge University Press. ISBN 978-0-521-74738-7.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. 2nd ed. MIT Press. ISBN 978-0-262-23258-6.