Female vs Male Hurricanes — Commentary

John Maindonald

Dec 26 2014.

The notes that follow comment on analyses that are reported in Jung et al (2014), and in particular on the analysis that leads to their Figure 1. Diagnostic plots reveal serious problems with their analysis. Use of the Box-Cox methodology to identify a suitable power transformation leads to a model that fits the data well. It has NDAM$^{0.13}$ as the only explanatary variable, with addition of terms involving the pressure at landfall and the masculinity-feminity index giving no improvement to the fit. Diagnostics are very satisfactory, with no outliers or highly influential points identified. For the modeling reported here, highly crucial roles attach both to preliminary graphical exploration and to careful examination of model diagnostics.

A further analysis works with the updated ICAT damage estimates that are in 2014 US dollars, and investigates inclusion of Audrey and Katrina that were omitted from the Jung et al analysis. Also examined is the model fit to data from 1979 onwards, when male and female names were alternated.

A further comment is that the regression coefficients are not capable of sustaining the interpretation that Jung et al wish to place on them.

Aside from the main theme are comments on the variety of functions, in different R packages, that are available for the fitting of regression models for data with a negative binomial error.

The data for Audrey and Katrina is corrected relative to that used in the June 20 2014 version of this document.

Preparatory Steps

Download and input data

fnam <- paste0("http://www.pnas.org/lookup/suppl/doi:10.1073/",
              "pnas.1402786111/-/DCSupplemental/pnas.1402786111.sd01.xlsx")
if(!file.exists('hurricane-names.xlsx')){
download.file(fnam, destfile='hurricane-names.xlsx', mode="wb")
}
library(xlsx)

Loading required package: rJava
Loading required package: xlsxjars

hurricJ <- read.xlsx('~/r/_rn/Data/hurricane-names.xlsx', 
                         sheetIndex=1, rowIndex=1:93)
rnam <- with(hurricJ, paste0(as.character(Name), Year))
row.names(hurricJ) <- rnam
hurricJ[,"Gender_MF"] <- factor(hurricJ[,"Gender_MF"], labels=c("m","f"))

Check variables

keyVar <- hurricJ[, c(“MasFem”,“alldeaths”,“MinPressure_before”,“Category”,“ZNDAM”,“Year”)] sapply(keyVar,mean)

Examine plots

Use a log(alldeaths+0.5) scale on the y-axis in order to separate points. Smooth curves have been added, separately for hurricanes with male and female names.

library(lattice)
ypos <- seq(from=0, to=200, by=50)
sc <- list(x=list(relation="free"), y=list(at=sqrt(ypos), labels=paste(ypos)))
xyplot(log(alldeaths+0.5) ~ NDAM+log(NDAM), groups=Gender_MF, xlab="",
       par.settings=simpleTheme(pch=c(15,16),cex=1.1), data=hurricJ, 
       scale=sc, auto.key=list(columns=2), type=c('p',"smooth"))

Figure 1: Plot of log(alldeaths+0.5) against log(NDAM)

Use of a logarithmic x-axis scale ameliorates severe leverage issues. It also gives variables that are closer to linearly related. Curvsture in the smooths, at least for female names, suggests that a logarithmic transformation may be too extreme.

Now check the range of values of MinPressure_before.

with(hurricJ, range(MinPressure_before))

[1]  909 1002

The ratio is 1.1 to 1, too small for a logarithmic or power transformation to make much difference.

Now (Figure 2) plot against MinPressure_before

xyplot(log(alldeaths+0.5) ~ MinPressure_before, 
       groups=Gender_MF, data=hurricJ, 
       par.settings=simpleTheme(pch=c(15,16)),
       scale=sc, auto.key=list(columns=2), type=c('p',"smooth"))

Figure 2: Plot of log(alldeaths+0.5) against MinPressure_before

Negative Binomial versus `lm` style Linear Models

Once the counts are large enough, there is little to distinguish a model that uses log(alldeaths+0.5) as the outcpme variabke from a Negative Binomial model that has var[deaths] proportional to E[deaths]$^2$. For these data, with 41 of the counts 4 or less, the approximation is rough. It may nevertheless be useful for exploratory purposes.

The following fits the preferred Jung at al model.

library(MASS)
hurricJ0.nb <- glm.nb(alldeaths ~ 1, data=hurricJ)
hurricJ3.nb <- glm.nb(alldeaths ~ 
                    MasFem*(MinPressure_before+NDAM), 
                    data=hurricJ)
## MasFem:NDAM indicates that the product of MasFem and NDAM 
## is to be included as a variable in the model.

Now compare the two models.

anova(hurricJ0.nb, hurricJ3.nb)

Likelihood ratio tests of Negative Binomial Models

Response: alldeaths
                                 Model theta Resid. df    2 x log-lik.
1                                    1 0.446        91            -705
2 MasFem * (MinPressure_before + NDAM) 0.811        86            -644
    Test    df LR stat.  Pr(Chi)
1                               
2 1 vs 2     5     60.6 9.29e-12

## Pearson chi-squared for Jung et al preferred model
sum(resid(hurricJ3.nb, type='pearson')^2)/hurricJ3.nb[["df.residual"]]

[1] 1.09

## 
## Check model coefficients
coef(summary(hurricJ3.nb))

                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                7.13e+01   1.87e+01    3.81 1.37e-04
MasFem                    -6.16e+00   2.36e+00   -2.61 9.10e-03
MinPressure_before        -7.13e-02   1.92e-02   -3.71 2.09e-04
NDAM                      -4.78e-05   2.83e-05   -1.69 9.12e-02
MasFem:MinPressure_before  6.32e-03   2.43e-03    2.60 9.45e-03
MasFem:NDAM                1.69e-05   3.59e-06    4.70 2.62e-06

Figure 3 shows the diagnostic plots.

opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurricJ3.nb)

Figure 3: Diagnostic plots for the preferred Jung et al model

par(opar)

A small number of data points have very high leverage, relative to other points. While Diane is the only hurricane that triggers the standard Cook’s distance criterion, there may well be combined effects that are distorting the model.

Notice that the coefficient of NDAM is negative. The interaction with MasFem is accounting for a major part of the effect of NDAM.

Figures 1 and 3 are together good reason not to go with the Jung et al model. Reasonable alternatives are then to work with log(NDAM), or with a power transformation of NDAM. The right panel in Figure 1 suggests a power transformation.

Use of `NDAM`$^\lambda$ as explanatory variable

For checking for the most appropriate transformation, Cook (1998) argues for examining the regressions of powers of the explanatory variable on the outcome variable. This reverses the roles of NDAMandlog(deaths+0.5)`, relative to their roles in the model fit. See further Weisberg (2014), Fox and Weisberg (2014), and Maindonald and Braun (2012),

The following uses the function powerTransform() in the {} package to find a suitable transformation. This assumes, of course, that log(deaths+0.5) is a reasonable choice for dependent variable. As already noted, this is roughly equivalent to use of a log link function in a negative binomial model.

library(car)
powerTransform(NDAM ~ log(alldeaths+0.5), data=hurricJ)

Estimated transformation parameters 
   Y1 
0.134

The function powerTransform() chooses a transformation so that residuals are as close as possible to normal. The same calculation with Minpressure_before in place of log(deaths+0.5) suggests a power of 0.129, i.e., for all practical purposes the same.

Working then with NDAM$^{0.13}$, we have:

## Negative binomial
hurric.nb <- glm.nb(alldeaths ~ I(NDAM^0.13),
                     data=hurricJ)
coef(summary(hurric.nb))

             Estimate Std. Error z value Pr(>|z|)
(Intercept)     -2.43      0.441    -5.5 3.75e-08
I(NDAM^0.13)     1.77      0.154    11.5 1.33e-30

## Calculate Pearson chi-squared
sum(resid(hurric.nb, type='pearson')^2)/hurric.nb[["df.residual"]]

[1] 1.11

## Roughly equivalent lm model, for comparison
hurric.lm <- lm(log(alldeaths+0.5) ~ I(NDAM^0.13),
                    data=hurricJ)
coef(summary(hurric.lm))

             Estimate Std. Error t value Pr(>|t|)
(Intercept)     -2.17      0.400   -5.41 5.18e-07
I(NDAM^0.13)     1.53      0.145   10.52 2.43e-17

The estimates are about as similar as is reasonable to expect. The diagnostic plots for the negative binomial model are:

opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurric.nb)

$Figure 4: Diagnostic plots for the model that has NDAM$^{0.13}$ as the only explanatory variable$

par(opar)

Is there a case for adding further terms?

library(MASS)
hurric2.nb <- glm.nb(alldeaths ~ (I(NDAM^0.13)+MasFem)^2+MinPressure_before+
                 MasFem:MinPressure_before, maxit=100, 
                 data=hurricJ)
anova(hurric.nb, hurric2.nb)

Likelihood ratio tests of Negative Binomial Models

Response: alldeaths
                                                                       Model
1                                                               I(NDAM^0.13)
2 (I(NDAM^0.13) + MasFem)^2 + MinPressure_before + MasFem:MinPressure_before
  theta Resid. df    2 x log-lik.   Test    df LR stat. Pr(Chi)
1  1.08        90            -619                              
2  1.13        86            -616 1 vs 2     4     3.02   0.554

Adding further terms to the model that has NDAM$^{0.13}$ as explanatory variable does not improve the explanatory power.

The plot method gives the following:

opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurric.nb)

par(opar)
par(mfrow=c(1,1))

There are no outliers, and no highly influential points.

Re-inclusion of Audrey (1957) and Katrina (2005)

Note that the following corrects mistakes in the figures for Audrey and Katrina that were used in the June 20 2014 version of this document.

Wih the more acceptable model, can we now re-include the omitted hurricanes? Unfortunately Jung et al left Audrey and Katrina out of their dataset. (It is good practice to report data for outliers. They may turn out not to be outliers when a different model is used. Or they may hold important information.)

Data that included Audrey and Katrina, with damage estimates in 2014 dollars, were therefore obtained from http://www.icatdamageestimator.com/commonsearch . NOAA Monthly Weather Reports (MWRs) supplied the numbers of deaths for most of the hurricanes in Jung et al’s data. Wikipedia wiki pages for individual hurricanes provide what will often be better attested data, certainly for more damaging hurricanes where time is required to pull together the evidence. For present purposes, the number of deaths data from Jung et al’s data set was supplemented with the numbers given on the wiki pages for Hurricane_Katrina and Hurricane_Audrey.

if(!file.exists('hurric2014.csv')){
url <- "http://maths-people.anu.edu.au/~johnm/stats-issues/data/hurric2014.csv"
download.file(url, destfile="hurric2014.csv")
}
hurric2014 <- read.csv('hurric2014.csv', row.names=1)
hurric2014 <- within(hurric2014, Gender_MF <- factor(Gender_MF, levels=c("m","f")))

Figure 5 shows a scatterplot of log(alldeaths+0.5) against log(NDAM2014)

library(latticeExtra)

Loading required package: RColorBrewer

xpos <- c(1,200, 4*10^4, 8*10^6)
ypos <- c(0, 2, 6, 24, 100, 400, 1600)
sc <- list(y=list(at=log(ypos+0.5), labels=ypos),
           x=list(at=log(xpos), labels=xpos))
gph <- xyplot(log(alldeaths+0.5) ~ log(NDAM2014), groups=Gender_MF,
              xlab="Damage to property ($000,000)",
              ylab="Total deaths",
              par.settings=simpleTheme(pch=c(15,16),cex=1.35),
              data=hurric2014, scales=sc, auto.key=list(columns=2))
hnames <- c("Audrey1957", "Katrina2005")
nlabs <- match(hnames, row.names(hurric2014))
gph+layer(panel.text(x[nlabs], y[nlabs], labels=row.names(hurric2014)[nlabs], pos=2, ...))

Figure 5: Plot of log(alldeaths+0.5) against log(NDAM2014)

Check the appropriate power transformation

The following again uses the function powerTransform() in the {} package to check the transformation that might be used:

powerTransform(NDAM2014 ~ log(alldeaths+0.5), data=hurric2014)

Estimated transformation parameters 
   Y1 
0.146

As the change from 0.13 to 0.15 is of no consequence, we will then work with NDAM2014$^{0.13}$, as before.

## Negative binomial
hurric2014.nb <- glm.nb(alldeaths ~ I(NDAM2014^0.13),
                     data=hurric2014)
coef(summary(hurric2014.nb))

                 Estimate Std. Error z value Pr(>|z|)
(Intercept)         -2.94      0.507   -5.81 6.40e-09
I(NDAM2014^0.13)     2.04      0.175   11.66 2.02e-31

## lm model as alternative
hurric2014.lm <- lm(log(alldeaths+0.5) ~ I(NDAM2014^0.13),
                    data=hurric2014)
coef(summary(hurric2014.lm))

                 Estimate Std. Error t value Pr(>|t|)
(Intercept)         -2.47      0.430   -5.74 1.23e-07
I(NDAM2014^0.13)     1.65      0.153   10.77 5.62e-18

The estimates are about as similar as is reasonable to expect. The diagnostic plots for the negative binomial model are:

opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurric2014.nb)

Figure 6: Diagnostic plots, model with Audrey and Katrina included.

par(opar)

Audrey stands out very clearly as an outlier and as a high influence point. Omitting Audrey gives:

## Negative binomial
hurric2014a.nb <- glm.nb(alldeaths ~ I(NDAM2014^0.13),
                     data=hurric2014[-13,])
coef(summary(hurric2014a.nb))

                 Estimate Std. Error z value Pr(>|z|)
(Intercept)         -3.05      0.460   -6.64 3.13e-11
I(NDAM2014^0.13)     1.99      0.157   12.72 4.67e-37

opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurric2014a.nb)

Figure 7: Diagnostic plots, with Audrey excluded.

par(opar)

Katrina now appears as a high influence point.

Restriction to hurricanes from 1979 or later

Until 1979, all names used were female. Since then, male and female names have alternated. The following limits the analysis to data from 1979 onwards:

hurric1979on.nb <- glm.nb(alldeaths ~ I(NDAM2014^0.13),
                     data=subset(hurric2014, Year>=1979))

Warning: glm.fit: algorithm did not converge

coef(summary(hurric1979on.nb))

                 Estimate Std. Error z value Pr(>|z|)
(Intercept)         -2.74      0.558   -4.91 9.21e-07
I(NDAM2014^0.13)     1.90      0.190   10.00 1.51e-23

hurric1979onALL.nb <- glm.nb(alldeaths ~ I(NDAM2014^0.13)+
                      Minpressure_Updated.2014+
                    Gender_MF+ Gender_MF:log(NDAM2014)+ 
                      Gender_MF:Minpressure_Updated.2014,
                    data=subset(hurric2014, Year>=1979), maxit=100)
anova(hurric1979on.nb, hurric1979onALL.nb)

Likelihood ratio tests of Negative Binomial Models

Response: alldeaths
                                                                                                                   Model
1                                                                                                       I(NDAM2014^0.13)
2 I(NDAM2014^0.13) + Minpressure_Updated.2014 + Gender_MF + Gender_MF:log(NDAM2014) + Gender_MF:Minpressure_Updated.2014
  theta Resid. df    2 x log-lik.   Test    df LR stat. Pr(Chi)
1  1.06        53            -384                              
2  1.27        48            -374 1 vs 2     5     9.42  0.0933

Now examine the diagnostics:

opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurric1979on.nb)

Figure 8: Diagnostic plots, data from 1979 only

par(opar)

Katrina has a large influence on the fit. Hence, omit Katrina.

hurric1979onOmitK.nb <- glm.nb(alldeaths ~ I(NDAM2014^0.13),
                     data=subset(hurric2014[-84,], Year>=1979))
coef(summary(hurric1979onOmitK.nb))

                 Estimate Std. Error z value Pr(>|z|)
(Intercept)         -1.72      0.500   -3.44 5.75e-04
I(NDAM2014^0.13)     1.48      0.173    8.52 1.55e-17

## Now examine the diagnostics:
opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurric1979onOmitK.nb)

Figure 9: Diagnostic plots from 1979 on, model that excludes Katrina

par(opar)

The only point of note is a clear suggestion of increasing variance, beyond what the model accommodates, with increasing fitted values. A model that adequately accommodates data from 1979 on perhaps still has to be found! With a more appropriate model, perhaps Katrina would not be a point of high influence!

Other Aspects of the Methodology

Jung et al’s argument, as I understand it, is that the variables NDAM (property damage in 2013 monetary values) and MinPressure_before together account for the potential of hurricanes to cause human deaths, and that MasFem, perhaps in interaction with NDAM and/or MinPressure_before, then accounts for the difference between potential deaths and observed deaths. The implied assumption that NDAM and MinPressure_before are together an effective proxy for the potential to cause human deaths may then be regarded with scepticism.

Some damage to property, in some cases a large part. will have occurred away from urban centres. In effect, then, the analysis is missing a variable that measures the extent of mismatch between the combined effect of NDAM and MinPressure_before and the potential for human damage. Yule-Simpson type effects, involving continuous as well as categorical variables, can in these circumstances bias or induce effects in variables that are included in the model. With observational data, omission of a variable that has a substantial effect will commonly induce such bias. Look for example under the index entries for Lord’s paradox in Tu and Gilthorpe (2012).

An analysis of this data that passes standard checks has not shown an effect from MasFem or from interactions with MasFem. The issue of what interpretations can be placed on the regression coefficents does not then arise, but remains a damaging criticism of the methodology. See also the points raised in comments at the urls:

http://scatter.wordpress.com/2014/06/03/my-thoughts-on-that-hurricane-study/.

http://andrewgelman.com/2014/06/06/hurricanes-vs-himmicanes/

Note: This document was created from an R Markdown markup file. The file is available as: http://maths-people.anu.edu.au/~johnm/stats-issues/hurricnames.Rmd . Under RStudio, click on “File”, then on “Open File”, and select the file. Then click on “Knit HTML” to recreate this document.

Additional Notes

A dispersed Poisson model

This fits without problem, but fails the diagnostic tests! This model assumes a variance that is proportional to E[deaths] rather than to E[deaths]$^2$.

library(MASS)
hurric.glm <- glm(alldeaths ~ log(NDAM)+MinPressure_before+MasFem+
                 I(MasFem*log(NDAM))+I(MasFem*MinPressure_before),
                 data=hurricJ, family=quasipoisson)
summary(hurric.glm)


Call:
glm(formula = alldeaths ~ log(NDAM) + MinPressure_before + MasFem + 
    I(MasFem * log(NDAM)) + I(MasFem * MinPressure_before), family = quasipoisson, 
    data = hurricJ)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-10.34   -2.78   -1.17    1.15   16.92  

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                    48.20894   28.04869    1.72    0.089
log(NDAM)                      -0.03181    0.27639   -0.12    0.909
MinPressure_before             -0.04751    0.02749   -1.73    0.088
MasFem                         -5.66132    3.21028   -1.76    0.081
I(MasFem * log(NDAM))           0.08356    0.03368    2.48    0.015
I(MasFem * MinPressure_before)  0.00525    0.00314    1.67    0.099

(Dispersion parameter for quasipoisson family taken to be 21.3)

    Null deviance: 4031.9  on 91  degrees of freedom
Residual deviance: 1579.6  on 86  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 6

opar <- par(mar=c(3.5,3.5,2.1,0.6), mgp=c(2.25,0.5,0), mfrow=c(2,2))
plot(hurric.glm)

par(opar)

The linear downward trend in the red line in the “Residuals vs Fitted” plot presumably arises because it is a “smooth” that has been fitted using a smoothing approach that is appropriate for a normal errors model. The scale-location plot is the big concern. The deviance residuals adjust for a variance proportional to mean increase in the variance. Here, mean/sqrt(variance) increases very markedly with the mean. The SEs from the negative binomial model are accordingly suspect.

Functions in R Packages that fit negative binomial models

The above has benefitted from discussion on the ANZSTAT email list. Negative binomial regression provides a somewhat extreme example of overlap between R packages, with notation that is not consistent between these different implementations.

In addition to MASS::glm.nb(), note msme::nbinomial(), aod::negbin() and gamlss::gamlss(). Mark Donaghue noted that gamlss() in the gamlss package fits two different types of NB model, either family = NBI (quadratic; var = mu(1 + sigma * mu)) as I think for all the functions above, or family=NBII (linear; var = mu(1 + sigma)).

Note that the notation is not consistent across the different packages and functions that allow a negative binomial error structure.

References

Cook, R (1998). Regression Graphics. Wiley.

Fox and Weisberg (2nd edn, 2014). An R Companion to Applied Regression. Sage.

Jung, Shavitta, Viswanathana, and Hilbe (2013). “Female hurricanes are deadlier than male hurricanes”. http://www.pnas.org/cgi/doi/10.1073/pnas.1402786111.

Maindonald and Braun (3rd edn, 2010). Data Analysis and Graphics Using R. CUP.

Tu and Gilthorpe (2012). “Statistical Thinking in Epidemiology”. CRC Press.

Weisberg, S (4th edn, 2014). “Applied Linear Regression”. Wiley.