The problem with odds ratios

Many researchers use logit models to estimate the effect of specific variables on a binary (i.e., 0 or 1) outcome.  How are these models derived?  How are odds ratios calculated?  What are the problems with odds ratios?  I answer all these questions in this post, following a lovely summary by Norton and Dowd (2018).

Deriving logit models

These models are typically of the form:

• y*i = βXi + εi

where β is the coefficient vector, X is the matrix of explanatory variables, and εi is the residual. The dependent variable y*i is an unobserved continuous latent variable. The observed binary variable yi is typically assumed to equal 1 when y*i > 0 and 0 otherwise. The probability that the event occurs (i.e., that the binary variable equals 1) is then:

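• Pr(yi = 1| Xi) = Pr(y*i > 0| Xi) = Pr(εi > -βXi| Xi) = Pr(εi ≤ βXi| Xi)

The last equality holds because the error distribution is assumed to be symmetric around zero. However, as the distribution of ε (and in particular its scale) is typically not known, we divide both sides of the inequality by the standard deviation of the residual, σ.  Thus, we get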

• Pr(yi = 1| Xi)=Pr(εi / σ ≤ (β /σ) Xi| Xi)

In the probit model, we assume that (εi/σ)~N(0,1), so that Pr(yi = 1| normal, Xi) = Φ [(β /σ) Xi].

In the logit model, we assume a logistic distribution where (εi/σ)~logistic(0, π/√3), so that Pr(yi = 1| logistic, Xi) = 1/[1+exp{-(β /σ) Xi}].
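To make the two formulas concrete, here is a minimal Python sketch that evaluates both response probabilities for a given scaled index (β/σ)Xi. The function names and the example index values are my own illustrative choices, not something taken from Norton and Dowd.

```python
import math

def probit_prob(index):
    """Pr(y = 1 | X) under the probit model: the standard normal CDF at `index`."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def logit_prob(index):
    """Pr(y = 1 | X) under the logit model: 1 / (1 + exp(-index))."""
    return 1.0 / (1.0 + math.exp(-index))

# `index` stands for the scaled linear index (β/σ)Xi; these values are made up.
for index in (-2.0, 0.0, 2.0):
    print(index, round(probit_prob(index), 4), round(logit_prob(index), 4))
```

Both functions return 0.5 at an index of zero, but the same index maps to different probabilities elsewhere, because the two models normalize the residual differently.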

How do you calculate odds ratios?

The odds of an event are the probability that it occurs divided by the probability that it does not, and an odds ratio is the ratio of two such odds.  We already derived the probability that the event occurs in the logit case, which is Pr(yi = 1| logistic, Xi) = 1/[1+exp{-(β /σ) Xi}].  The probability that it does not occur is exp{-(β /σ) Xi}/[1+exp{-(β /σ) Xi}].  The odds are the ratio of the two, {1/[1+exp{-(β /σ) Xi}]} / {exp{-(β /σ) Xi}/[1+exp{-(β /σ) Xi}]}, which simplifies to exp{(β /σ) Xi}.  The log odds are therefore (β /σ) Xi.

For a binary independent variable, the log odds ratio is simply β/σ for the specific coefficient of interest, so the odds ratio is exp(β/σ).
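The arithmetic can be checked numerically. The sketch below computes the odds at x = 0 and x = 1 for a single binary regressor and confirms that their ratio equals the exponentiated scaled coefficient. The intercept and coefficient values are hypothetical and chosen only for illustration.

```python
import math

def logit_prob(index):
    """Pr(y = 1 | X) under the logit model."""
    return 1.0 / (1.0 + math.exp(-index))

def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical scaled coefficients (β/σ): an intercept and one binary regressor.
intercept = -0.5
coef = 0.8

p0 = logit_prob(intercept)         # probability when x = 0
p1 = logit_prob(intercept + coef)  # probability when x = 1

odds_ratio = odds(p1) / odds(p0)
print(odds_ratio, math.exp(coef))  # the two numbers agree up to rounding error
```

Note that the odds ratio does not depend on the intercept, whereas the two probabilities p0 and p1 do.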

What is the problem with odds ratios?

One issue is that the specification used (e.g., logit vs. probit) can affect your odds ratio estimates.  Specifically,

…the logit and probit models postulate error distributions with different values of σ (the standard normal distribution has a variance of 1, the standard logistic distribution has a variance of π²/3). This explains why the estimated logit and probit coefficients are different. The normalizations are different. A rule of thumb is that logit coefficients are larger by a factor of about 1.6.
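The ratio of the two standard deviations is π/√3, or roughly 1.81, which is somewhat above the quoted rule of thumb. One plausible reading, offered here as my own illustration rather than something stated in the quoted passage, is that the rule of thumb reflects how closely the two probability curves track each other rather than the ratio of standard deviations alone. The sketch below compares the largest gap between the probit curve and a rescaled logit curve for both factors; the grid and the function names are assumptions.

```python
import math

def probit_prob(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_prob(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

def max_gap(factor, lim=4.0, step=0.01):
    """Largest |probit_prob(z) - logit_prob(factor * z)| on a grid of z values."""
    n = int(round(2.0 * lim / step))
    gap = 0.0
    for k in range(n + 1):
        z = -lim + k * step
        gap = max(gap, abs(probit_prob(z) - logit_prob(factor * z)))
    return gap

sd_ratio = math.pi / math.sqrt(3.0)  # ratio of the two standard deviations
print(round(sd_ratio, 3))
print(max_gap(sd_ratio), max_gap(1.6))
```

On this grid, a factor of 1.6 keeps the two curves closer together than the pure ratio of standard deviations does.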

More importantly, the set of explanatory variables included in the model will affect the odds ratio.  If you include more explanatory variables in the model, they will account for more of the variation in the dependent variable, and thus reduce the variance of the residual, ε.  Because the estimated coefficient is β/σ, a smaller σ means a larger coefficient in absolute value.  Thus, adding more explanatory variables will increase the magnitude of the odds ratio (moving it further from 1), whereas removing explanatory variables will decrease it (moving it toward 1), even when the added or removed variables are unrelated to the variable of interest.
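A small numerical sketch can illustrate this. It assumes a true logit model with a binary regressor x1 and a second regressor x2 that is standard normal and independent of x1; the coefficient values and function names are hypothetical. When x2 is left out, a logit model with only x1 is saturated, so in large samples it recovers the odds implied by the probabilities averaged over x2. The code computes those averaged probabilities by numerical integration instead of simulating data.

```python
import math

def logit_prob(index):
    return 1.0 / (1.0 + math.exp(-index))

def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def prob_averaged_over_x2(x1, b0, b1, b2, n=4001, lim=8.0):
    """Pr(y = 1 | x1) after averaging over x2 ~ N(0, 1), using the trapezoid rule."""
    step = 2.0 * lim / (n - 1)
    total = 0.0
    for k in range(n):
        u = -lim + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * logit_prob(b0 + b1 * x1 + b2 * u) * normal_pdf(u)
    return total * step

def odds(p):
    return p / (1.0 - p)

# Hypothetical scaled coefficients for the full model with x1 (binary) and x2.
b0, b1, b2 = 0.0, 1.0, 2.0

# Odds ratio for x1 when x2 is included in the model.
or_full = math.exp(b1)

# Large-sample odds ratio for x1 when x2 is omitted from the model.
or_short = (odds(prob_averaged_over_x2(1, b0, b1, b2))
            / odds(prob_averaged_over_x2(0, b0, b1, b2)))

print(or_full, or_short)
```

Under these assumptions the odds ratio from the model without x2 is closer to 1 than the odds ratio from the full model, even though x2 is independent of x1 and so is not a confounder.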

The authors claim that there are a few implications to this finding.

1. There is no single odds ratio.  The odds ratio will depend on the explanatory variables included in the model (and of course the underlying data used as well).
2. Comparing odds ratios across studies is challenging.  Differences in odds ratios across studies may be due to differences in the estimated effect, β, or could be due to differences in the variability of the residual, σ.
3. Comparing odds ratios across model specifications is challenging.  Whereas with linear models, observing similar results across model specifications is prima facie evidence of finding a strong result, with odds ratios this is not the case as we would expect the odds ratios to change across model specifications due to differences in the variance of the residual.

The authors recommend reporting the marginal or incremental effect (a.k.a. partial effects), as is more commonly done in economics, as this estimate does not directly depend on σ.  These marginal effects are estimated for a specific patient population, often patients with average characteristics.  However, one can show how marginal effects vary in a population by plotting the marginal effect conditional on a given patient characteristic on the y-axis and the domain of that characteristic on the x-axis.
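As a rough illustration of why incremental effects are more stable, the sketch below computes the effect of switching a binary regressor x1 from 0 to 1 on Pr(y = 1). It assumes a true logit model with a second regressor x2 that is standard normal and independent of x1; the coefficients and function names are hypothetical. It compares the average incremental effect from the full model with the difference in probabilities that a model omitting x2 would recover in large samples, and then lists the incremental effect at a few values of x2, which is the quantity one would plot.

```python
import math

def logit_prob(index):
    return 1.0 / (1.0 + math.exp(-index))

def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def average_over_x2(func, n=4001, lim=8.0):
    """E[func(x2)] for x2 ~ N(0, 1), using the trapezoid rule."""
    step = 2.0 * lim / (n - 1)
    total = 0.0
    for k in range(n):
        u = -lim + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * func(u) * normal_pdf(u)
    return total * step

# Hypothetical scaled coefficients for the full model with x1 (binary) and x2.
b0, b1, b2 = 0.0, 1.0, 2.0

def incremental_effect(x2):
    """Change in Pr(y = 1) when x1 goes from 0 to 1, holding x2 fixed."""
    return logit_prob(b0 + b1 + b2 * x2) - logit_prob(b0 + b2 * x2)

# Full model: incremental effect averaged over the population distribution of x2.
effect_full = average_over_x2(incremental_effect)

# Model omitting x2: difference between the two probabilities averaged over x2.
p1 = average_over_x2(lambda u: logit_prob(b0 + b1 + b2 * u))
p0 = average_over_x2(lambda u: logit_prob(b0 + b2 * u))
effect_short = p1 - p0

print(effect_full, effect_short)

# Incremental effect at selected values of x2 (the points one would plot).
for x2 in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x2, round(incremental_effect(x2), 4))
```

Under these assumptions the average incremental effect is the same with or without x2 in the model, while the effect at specific values of x2 varies considerably, which is why plotting it across the range of a characteristic can be informative.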

Source: