\chapter{Bayesian Statistics} \label{chap:bayesianstatistics} \SweaveOpts{keep.source=TRUE, pdf=FALSE, prefix.string=Chap04, grdevice=tikz.Swd} <>= options(digits=3, show.signif.stars=FALSE, width=53) rm(list=ls()) require(tikzDevice) source("../SweaveTikZ.R") @ %data files: read.table("ATL.txt", T); read.table("H.txt", T); read.csv("hurart.txt", T); read.bugs(modelc, quiet=TRUE); load("modelupdates.RData") %packages: rjags, R2WinBUGS, lattice %source code: source("gsp.R") %third party: JAGS (C++); WinBUGS, OpenBUGS \begin{quote} ``Probability does not exist.'' \end{quote} \indent---Bruno de Finetti\\ Classical inference involves ways to test hypotheses and estimate confidence intervals. Bayesian inference involves methods to calculate a probability that your hypothesis is correct. The result is a posterior distribution that combines information from your data with your prior beliefs. The term `Bayesian' comes from Bayes theorem---a single formula for obtaining the posterior distribution. It is the cornerstone of modern statistical methods. Here we introduce Bayesian statistics. We begin by considering the problem of learning about the population proportion (percentage) of all hurricanes that make landfall. We then consider the problem of estimating how many days it takes your manuscript to get accepted for publication. \section{Learning About the Proportion of Landfalls} \label{sec:learningaboutproportions} Statistical models that have skill at forecasting the frequency of hurricanes can be made more relevant to society if they also include an estimate of the proportion of those that make landfall. The approach is as follows \citep{Albert2009}. Before examining the data, you hold a belief about the value of this proportion. You model your belief in terms of a prior distribution. Then after examining some data, you update your belief about the proportion by computing the posterior distribution. This setting also allows you to predict the likely outcomes of a new sample taken from the population. It allows you to predict, for example, the proportion of landfalls for next year. The use of the pronoun `you' focuses attention on the Bayesian viewpoint that probabilities are simply how much {\it you} personally believe that something is true. All probabilities are subjective and are based on all the relevant information available to you. Here you think of a population consisting of all past and future hurricanes in the North Atlantic. Then let $\pi$ represent the proportion of this population that hit the United States at hurricane intensity. The value of $\pi$ is unknown. You are interested in learning about what the value $\pi$ could be. Bayesian statistics requires you to represent your belief about the uncertainty in this percentage with a probability distribution. The distribution reflects your subjective prior opinion about plausible values of $\pi$. Before you examine a sample of hurricanes, you think about what the value of $\pi$ might be. If all hurricanes make landfall $\pi$ is one if none make landfall $\pi$ is zero. This is the nature of a proportions, bounded below by zero and above by one. Also, while the number of hurricanes is an integer count, the percentage that make landfall is a real value. As a climatologist you are also aware that not all hurricanes make it to land. The nature of percentages together with your background provide you with information about $\pi$ that is called `your prior.' From this information suppose you believe that the percentage of hurricanes making landfall in the United States is equally likely to be smaller or larger than 0.3. Moreover, suppose you are 90\% confident that $\pi$ is less than 0.5. A convenient family of densities for a proportion is the beta. The beta density, here representing your prior belief about the population percentage of all hurricanes making landfall $\pi$, is proportional to \begin{equation} \label{eq:beta} g(\pi) \propto \pi^{a-1} (1-\pi)^{b-1} \end{equation} where the parameters $a$ and $b$ are chosen to reflect your prior beliefs about $\pi$. The mean of a beta density is $m = a/(a+b)$ and the variance is $v = m(1-m)/(a+b+1)$. Unfortunately it is difficult to assess values of $m$ and $v$ for distributions like the beta that are not symmetric. It is easier to obtain $a$ and $b$ indirectly through statements about the percentiles of the distribution. Here you have a belief about the median (.3) and the 90th percentile (.5). The \verb@beta.select@ function in the {\bf LearnBayes} package \citep{Albert2011} is useful for finding the two parameters (shape and scale) of the beta density that match this prior knowledge. The inputs are two lists, \verb@q1@ and \verb@q2@, that define these two percentiles, and the function returns the values of the corresponding beta parameters, $a$ and $b$. <>= require(LearnBayes) q1 = list(p=.5, x=.3) q2 = list(p=.9, x=.5) beta.select(q1, q2) @ Note the argument \verb@p@ is the distribution percentile and the argument \verb@x@ is a value for $\pi$. Now you have your prior specified as a continuous distribution. You should plot your prior using the \verb@curve@ function to see if it looks reasonable relative to your beliefs. <>= a = beta.select(q1, q2)[1] b = beta.select(q1, q2)[2] curve(dbeta(x, a, b), from=0, to=1, xlab="Proportion of Landfalls", ylab="Density", lwd=4, las=1, col="green") abline(v=.5, lty=2) abline(v=.3, lty=2) @ \begin{figure} \centering <>= par(las=1, mgp=c(2, .5, 0), tcl=-.3) a = beta.select(q1, q2)[1] b = beta.select(q1, q2)[2] curve(dbeta(x, a, b), from=0, to=1, xlab="Proportion of landfalls", ylab="Density", lwd=4, las=1, col="green") abline(v=.5, lty=2) abline(v=.3, lty=2) curve(dbeta(x, a, b), lwd=4, col="green", add=TRUE) @ \vspace{-.75cm} \caption{Beta density describing hurricane landfall proportion.} \label{fig:beta:prior} \end{figure} As seen in Fig.~\ref{fig:beta:prior}, the distribution appears to adequately reflect your prior knowledge of land fall percentages. For reference, the vertical dashed lines are the values you specified for the median and the 90th percentile. The \verb@abline@ function is used after the curve is plotted. The \verb@v@ argument in the function specifies where along the horizontal axis to draw the vertical line on the graph. Recall, to learn more about the function use \verb@?abline@. This is a start, but you need to take your prior distribution and combine it with a likelihood function. The likelihood function comes from your data. It is a function that describes how likely it is, given your model, that you observed the data you actually have. It might sound a bit confusing, but consider that if you think you have a good model for your data, then the probability that your model will replicate your data should be relatively high. For example, consider the number of hurricanes over the most recent set of years. Read the data files containing the annual basin-wide (\verb@A@) and landfall (\verb@US@) counts and compute the sum.\footnote{The basin counts were obtained from \url{http://www.aoml.noaa.gov/hrd/tcfaq/E11.html} on January 17, 2012.} <>= A = read.table("ATL.txt", header=TRUE) US = read.table("H.txt", header=TRUE) Yr = 2006 sum(A$H[A$Year >= Yr]) sum(US$All[US$Year >= Yr]) @ You find \Sexpr{sum(US$All[US$Year>=Yr])} of the \Sexpr{sum(A$H[A$Year>=Yr])} hurricanes made land fall in the United States. You could simply report that $\pi$ = \Sexpr{round(sum(US$All[US$Year>=Yr])/sum(A$H[A$Year>=Yr]),2)} or \Sexpr{round(sum(US$All[US$Year>=Yr])/sum(A$H[A$Year>=Yr])*100,0)}\% of hurricanes make land fall. However this is the single sample estimate and it does not consider your prior belief about $p$. In addition you might be interested in predicting the number of land falls in a new sample of next year's hurricanes. If you regard a `success' as a hurricane making land fall and you take a random sample of $n$ hurricanes with $s$ successes and $f=n-s$ failures, then the {\it likelihood} function (or simply, the likelihood) is given by \begin{equation} \label{eq:binomlike} L(s, f | \pi) \propto \pi^s(1-\pi)^f,\ \hspace{1cm} 0 < \pi < 1 \end{equation} The likelihood depends on the model parameters. Here the only parameter is $\pi$. The likelihood function is defined by the observed set of outcomes. Specifically, the likelihood of $\pi$ given that you observed \Sexpr{sum(US$All[US$Year>=Yr])} land falls in a set of \Sexpr{sum(A$H[A$Year>=Yr])} hurricanes is equal to the probability of that particular observed set of outcomes. Here the model is the beta density with parameters $a=s+1$ and $b=f+1$. You can convince yourself that the likelihood is maximized when $\pi$ = \Sexpr{round(sum(US$All[US$Year>=Yr])/sum(A$H[A$Year>=Yr]),2)} for $s$ = \Sexpr{sum(US$All[US$Year>=Yr])} and $f$ = \Sexpr{sum(A$H[A$Year>=Yr])-sum(US$All[US$Year>=Yr])} by plotting $L(\pi)$ for $\pi$ from 0 to 1. Then with a likelihood function describing your data along with your prior beliefs, your posterior density for $\pi$ is obtained up to a proportionality constant, by multiplying the prior density ($g(\pi)$) by the likelihood ($L(\hbox{data}|\pi)$), which is Bayes rule given as \begin{equation} \label{eq:Bayesrule} g(\pi | \hbox{data}) \propto g(\pi) L(\hbox{data} | \pi) \end{equation} Your prior represents your thinking about the parameter (in terms of probability) of interest before examining data. Afterwards you have a likelihood function representing the probability of your data, given values for the parameter. Finally the posterior probability distribution is computed using Bayes rule. It can be shown that if the prior is a beta density with parameters $a$ and $b$ and the likelihood is a beta density with parameters $s$ and $f$, then the posterior density is also a beta density with parameters $a+s$ and $b+f$. This is an example of a Bayesian conjugate model, where the prior and posterior densities have the same functional form. Since the prior, likelihood, and posterior are all in the same beta family of densities, you can use \verb@dbeta@ to compute the values. To compute and plot them together, type <>= s = sum(US$All[US$Year >= Yr]) f = sum(A$H[A$Year >= Yr]) - sum(US$All[US$Year >= Yr]) curve(dbeta(x, a + s, b + f), from=0, to=1, xlab="Proportion of Landfalls", ylab="Density", col=1, lwd=4, las=1) curve(dbeta(x, s + 1, f + 1), add=TRUE, col=2, lwd=4) curve(dbeta(x, a, b), add=TRUE, col=3, lwd=4) legend("topright", c("Prior", "Likelihood", "Posterior"), col=c(3, 2, 1), lwd=c(3, 3, 3)) @ \begin{figure} \centering <>= par(las=1, mgp=c(2, .5, 0), tcl=-.3) s = sum(US$All[US$Year >= Yr]) f = sum(A$H[A$Year >= Yr]) - sum(US$All[US$Year >= Yr]) curve(dbeta(x, a + s, b + f), from=0, to=1, xlab="Proportion of landfalls", ylab="Density", col=1, lwd=4, las=1) grid() curve(dbeta(x, a + s, b + f), add=TRUE, col=1, lwd=4) curve(dbeta(x, s + 1, f + 1), add=TRUE, col=2, lwd=4) curve(dbeta(x, a, b), add=TRUE, col=3, lwd=4) legend("topright",c("Prior", "Likelihood", "Posterior"), col=c(3, 2, 1), lwd=c(3, 3, 3), bg="white") @ \vspace{-.75cm} \caption{Densities describing hurricane landfall proportion.} \label{fig:prior:like:post} \end{figure} The densities are shown in Fig.~\ref{fig:prior:like:post}. Note that the posterior density resembles the likelihood but it is shifted in the direction of the prior. This is always the case. The posterior is a weighted average of the prior and the likelihood where the weights are proportional to the precision. The greater the precision, the more weight it carries in determining the posterior. For your prior, the precision is related to how committed you are to particular values. Believing there is a 90\% chance that the proportion is less than 0.5 provides a quantitative level of commitment. The less committed you are, the flatter the prior distribution, and the less weight it plays in the posterior. For your data, the precision is inversely related to the standard error, so directly related to the sample size. The more data you have, the more weight the likelihood plays in determining the posterior. \section{Inference} \label{sec:inference} A benefit of this Bayesian approach to statistics is that the posterior distribution contains all the information you need to make inferences about your statistic of interest. In this example since the posterior is a beta distribution, the \verb@pbeta@ (cumulative distribution) and \verb@qbeta@ (quantile) functions can be used. For example, given your prior beliefs and your sample of data, how likely is it that the population percentage of land falls is less than or equal to 25\%? This is answered by computing the posterior probability $P(\pi \leq 0.25 | \hbox{data, prior})$, which is given by <>= pbeta(q=.25, a + s, b + f) @ You interpret this probability in a natural way. You can state that given the evidence in hand (data and your prior belief), there is a \Sexpr{round(pbeta(.25,a+s,b+f)*100,1)}\% chance that less than or equal to a quarter of all hurricanes hit the United States (at hurricane intensity). Or you can state that it is quite unlikely (\Sexpr{round(100-pbeta(.3,a+s,b+f)*100,1)}\%) that more than 3 in 10 hurricanes make U.S. landfall (\verb@1-pbeta(.3,a+s,b+f)@). You can't do this with classical statistics. Instead, the $p$-value resulting from a significance test is the evidence in support of a null hypothesis. Suppose the null hypothesis is that the population proportion exceeds 3 in 10. You state $H_o: \pi > 0.3$ against the alternative $H_a: \pi \leq 0.3$ and decide between the two on the basis of a $p$-value obtained by typing <>= prop.test(s, s + f, p=.3, alt="less")$p.value @ where the argument \verb@p@ specifies the value of the null hypothesis and the argument \verb@alt@ specifies the direction of the alternative hypothesis. The $p$-value of \Sexpr{round(prop.test(s,s+f,p=.3,alt="less")$p.value,3)} indicates that if the null is true (the proportion is greater than 0.3), the evidence seems unusual. You conclude there is moderate evidence against the null. However, it is incorrect to conclude there is only a \Sexpr{round(prop.test(s,s+f,p=.3,alt="less")$p.value*100,1)}\% chance that the proportion of landfalls is greater than 0.3. \section{Credible Interval} The natural interpretation of an inference extends to the interpretation of a posterior summary. For instance, the 95\% interval estimate for the percentage of landfalls is obtained from the 2.5th and 97.5th percentiles of the posterior density. This is done with the \verb@qbeta@ function by typing <>= qbeta(c(.025, .975), a + s, b + f) @ You state the 95\% {\it credible interval} for the proportion is [\Sexpr{round(qbeta(.025,a+s,b+ f),3)}, \Sexpr{round(qbeta(.975,a+s,b+f),3)}] and conclude you are 95\% confident that the true proportion lies inside this interval. Note the difference in interpretation. With a credible interval you say that given the data and your prior beliefs there is a 95\% chance that population proportion lies within this particular interval. With a confidence interval you say you are 95\% confident that the method produces an interval that contains the true proportion. In the former the population parameter is the random variable, in the latter the interval is the random variable. The 95\% confidence interval estimate for the proportion $\pi$ using a large enough sample of data, where $\hat p$ is the sample proportion and $n$ is the sample size, is given by \begin{equation} \label{eq:prop:95ci} \hat p \pm \sqrt{\frac{\hat p(1-\hat p)}{n}} \times \Phi^{-1}(0.975) \end{equation} where $\Phi^{-1}(0.975)$ (inverse of the cumulative distribution function) is the 97.5th percentile value from a standard normal distribution. The confidence interval is obtained by typing <>= prop.test(s, s + f)$conf.int @ The results allow you to state the 95\% confidence interval for the proportion is [\Sexpr{round(prop.test(s,s+f)$conf.int[1],digits=3)}, \Sexpr{round(prop.test(s,s+f)$conf.int[2],digits=3)}] and conclude that if you had access to additional samples and computed the interval in this way for each sample, 95\% of the intervals will cover the true proportion. In some cases the intervals produced (confidence and credible) from the same data set are identical. They are almost certainly different if an informative prior is included and they may be different even if the prior is relatively uninformative. However, the interpretations are alway different. One attractive feature of Bayesian statistics is that inferential statements are easy to communicate. It's natural to talk about the probability that $\pi$ falls within an interval or the probability that a hypothesis is true. Also, Bayes rule (Eq.~\ref{eq:Bayesrule}) is the only thing you need to remember. It's used for small and large samples. It is the only consistent way to update your probabilities. The cost is that you need to specify your prior beliefs. \section{Predictive Density} So far you focused on learning about the population proportion of hurricanes that make landfall. Suppose your interest is predicting the number of U.S. landfalls ($\tilde l$) in a future sample of seven hurricanes. If your current understanding of $\pi$ is contained in the density $g(\pi)$ (e.g., the posterior density), then the predictive density for $\tilde l$ is given by \begin{equation} \label{eq:preddensity} f(\tilde l) = \int f(\tilde l |\pi) g(\pi) d\pi \end{equation} With a beta distribution you can integrate to get an expression for the predictive density. The beta predictive probabilities are computed using the function \verb@pbetap@ from the {\bf LearnBayes} package. The inputs are a vector \verb@ab@ containing the beta parameters, the size of the future sample $m$, and a vector of the number of future landfalls \verb@lf@. <>= ab = c(a + s, b + f) m = 7; lf = 0:m plf = pbetap(ab, m, lf) round(cbind(lf, plf), digits=3) @ Simulation provides a convenient way to compute a predictive density. For example, to obtain $\tilde l$, first simulate $p$ from $g(p)$, and then simulate $\tilde l$ from the binomial distribution, $f_B(l | n, p)$. You can use this approach on your beta posterior. First simulate 1000 draws from the posterior and store them in \verb@p@. <>= p = rbeta(n=1000, a + s, b + f) @ Then simulate values of $\tilde l$ for these random values using the \verb@rbinom@ function and tabulate them. <>= lc = rbinom(n=1000, size=7, prob=p) table(lc) @ The table indicates that, of the 1000 simulations, \Sexpr{table(lc)[1]} of them resulted in no landfalls, \Sexpr{table(lc)[1]} of them resulted in one landfall, etc. You save the frequencies in a vector and then convert them to probabilities by dividing by the total number of simulations. Finally plot the probabilities using the histogram argument (\verb@type="h"@). <>= freq = table(lc) pp = freq/sum(freq) plot(pp, type="h", xlab="Number of U.S. Hurricanes", las=1, lwd=3, ylab="Predictive Probability") @ \begin{figure} \centering <>= par(las=1, mgp=c(2, .5, 0), tcl=-.3) freq = table(lc) pp = freq/sum(freq) plot(pp, type="h", xlab="Number of hurricanes", las=1, lwd=8, ylim=c(0, .4), ylab="Predictive probability") @ \vspace{-.75cm} \caption{Predictive probabilities for the number of land falling hurricanes.} \label{fig:predictedlandfallprobs} \end{figure} The plot is shown in Fig.~\ref{fig:predictedlandfallprobs}. It is most likely that one of the seven will hit the United States. The probability of four or more doing so is \Sexpr{(1-sum(pp[1:4]))*100}\%. The cumulative sum of the probabilities is found by typing <>= cumsum(pp) @ The probability of three or fewer landfalls is \Sexpr{cumsum(pp)[4]}. Suppose you wish to summarize this discrete predictive distribution by an interval that covers a certain amount of the probability. This can be done using the function \verb@discint@ from the {\bf LearnBayes} package. You combine the vector of probabilities with a vector of counts and specify a coverage probability as <>= mc = length(pp) - 1 int = discint(cbind(0:mc, pp), .95) int @ The output has two lists, the minimum probability (\verb@$prob@) greater than the specified coverage probability (here 95\%) and the list of counts (\verb@$set@) over which the individual probabilities are summed. You can see the probability is \Sexpr{discint(cbind(0:mc,pp),.95)$prob*100}\% that the number of U.S. hurricanes is either one of these counts. The interval covers a range of \Sexpr{length(int$set)} counts, which is necessarily wider than the range of counts computed by using the bounds on the \Sexpr{int$prob*100}\% credible interval around the population proportion. You check this by typing <>= lp = (1 - as.numeric(int$prob))/2 up = 1 - lp (qbeta(up, a + s, b + f) - qbeta(lp, a + s, b + f)) * 7 @ This is because in predicting a specific value, as you saw in Chapter~\ref{chap:classicalstatistics} with the linear regression model, there are two sources of uncertainty. Here the uncertainty about the population proportion $\pi$ and the uncertainty about the count given an estimate of $\pi$. \section{Is Bayes Rule Needed?} Since all probabilities are ultimately subjective, why is Bayes rule needed? Why don't you simply examine your data and subjectively determine your posterior probability distribution, updating what you already know? Wouldn't this save you trouble? Strange as it sounds, it would actually cause you more trouble. The catch is that it is hard to assess probabilities rationally. For example, probabilities must be non-negative and sum to one. If these are not met, then your probabilities are said to be inconsistent. In fact they should obey certain axioms of coherent assessment \citep{Winkler2003}. For example, if you consider event A to be more likely than event B, and event B to more likely than event C, then you should consider event A to be more likely than event C. If your probabilities fail to obey transitivity, then your assessment will be inconsistent in a decision-making sense. This is not a serious impediment since you can easily remove your inconsistency once you're made aware of it. If someone notes an arithmetic error you would certainly correct it. Similarly, if someone notes that your prior probabilities are not transitive, you would change them accordingly. In this regard, it is sometimes useful to attempt to assess the same set of probabilities in more than one way in order to `check' your assessments. You might assess probabilities using the mean and variance and then compare these probabilities to an assessment based on quantiles. Importantly, if you obey the axioms of coherence, then to revise your probabilities you must use Bayes rule. To do otherwise would be inconsistent. Numerous psychological studies have indicated that people do not always intuitively revise probabilities on the basis of this rule. People tend to give too little `weight' to the data. In general, this means that they do not change their probabilities as much as Bayes theorem tells them that they should. By using the rule you safeguard against inconsistent assessment of the available information. \section{Bayesian Computation} The models and examples presented in this chapter are reasonably simple: e.g., inference for an unknown success probability, a proportion, the mean of a normal, etc. For these problems, the likelihood functions are standard, and, if you use a conjugate model where the prior and posterior densities have the same functional form, deriving the posterior density poses no great computational burden. Indeed this is why conjugate models like the beta are widely used in Bayesian analysis \citep{Jackman2009}. You showed above the probability of a random hurricane hitting the United States can be modeled using a beta prior. In this case, computation is nothing but addition. But Bayesian computation quickly becomes more challenging when working with more complicated models, or when you use non-conjugate models. Characterizing the posterior density in these cases becomes a non-trivial exercise. Many statistical models in geography and the social sciences have this feature. Options still exist. One is brute force. When the posterior distribution is not a familiar function, you can compute values over a grid and then approximate the continuous posterior by a discrete one \cite{Albert2009}. This is possible for models with one or two parameters. In situations where you can directly sample from the posterior, a Monte Carlo (MC) algorithm gives an estimate for the posterior mean for any function of your parameters. In the situation where can't directly sample you can use rejection sampling provided you have a suitable proposal density or, in the most general case, you can adopt a Markov chain Monte Carlo (MCMC) approach. \subsection{Time-to-Acceptance} \label{sec:tta} Publication is central to science. Authors, wanting to be first in an area, are keen about the speed of the manuscript review process. Editors, focused on ensuring a thorough review, are well aware that timeliness is important to a journal's reputation. It can also be a practical matter. For example, there is usually a cutoff date after which a manuscript will not be considered by an assessment report. This presentation follows closely the work of \cite{HodgesEtAl2012}. Here you assume information about manuscript review time might be useful to authors and editors, especially if it can say something about future submissions. This motivates you to collect data on publication times from recent journals and to model them using a Bayesian approach. A Bayesian model has the advantage that output is in the form of a predictive distribution. Here you can't exploit conjugacy so you approximate the posterior with a discrete distribution. You sample directly from the distribution and use a MC algorithm for summarizing your samples. We use the American Meteorological Society (AMS) journals and the keyword `hurricane' appearing in published titles over the years 2008--2010. The search is done from the website \url{http://journals.ametsoc.org/action/doSearch}. Selecting `Full Text' on a particular article brings up the abstract, keywords, and two dates: received and accepted---in month, day, year format. The data are entered manually but are available to you by typing <>= art = read.csv("hurart.txt", header=TRUE) @ For each article both dates are provided along with the lead author's last name and the name of the journal. There are \Sexpr{dim(art)[1]} articles with the word `hurricane' in the title over the three-year period. Of these \Sexpr{sum(art$FinYr==2008)}, \Sexpr{sum(art$FinYr==2009)}, and \Sexpr{sum(art$FinYr==2010)} had `accepted' years of 2008, 2009, and 2010, respectively. Articles appearing in 2008 with accepted years before 2008 are not included. Journals publishing these articles are shown in Fig.~\ref{fig:barplotjournals}. Ten different journals are represented. Thirty-seven of these articles are published in {\it Monthly Weather Review}, 22 in {\it Journal of the Atmospheric Sciences}, 13 in the {\it Journal of Climate}, and 12 in {\it Weather and Forecasting}. \begin{figure} \centering <>= nm = c("Monthly Weather Review", "Journal of the Atmospheric Sciences", "Journal of Climate", "Weather and Forecasting", "Journal of Applied Meteorology and Climatology", "Journal of Physical Oceanography", "Journal of Atmospheric and Ocean Technology", "Earth Interations", "Journal of Hydrology", "Weather, Climate, and Society") par(mar = c(5, 15, 3, 2) + .1, mgp=c(2.2, .5, 0), tcl=-.3) barplot(sort(table(art$Jo)), names=rev(nm), las=1, horiz=TRUE, cex.axis=.9, cex.names=.7, xlab="Number of articles") @ \vspace{-.75cm} \caption{AMS journals publishing articles with `hurricane' in the title.} \label{fig:barplotjournals} \end{figure} First you compute the time period between received and accepted dates. This is done by converting the numeric values of year, month, and day to an actual date using the \verb@ISOdate@ function and then using the \verb@difftime@ to compute the time difference in days. <>= rec = ISOdate(art$RecYr, art$RecMo, art$RecDay) acc = ISOdate(art$FinYr, art$FinMo, art$FinDay) tau = difftime(acc, rec, units="days") @ You call the temporal difference the time-to-acceptance ($\tau$). This is the statistic of your interest. The mean $\tau$ is \Sexpr{round(as.integer(mean(tau[art$FinYr==2008])),0)} in 2008 (type \verb@mean(tau[art$FinYr==2008])@), \Sexpr{round(as.integer(mean(tau[art$FinYr==2009])),0)} in 2009, and \Sexpr{round(as.integer(mean(tau[art$FinYr==2010])),0)} in 2010. The mean $\tau$ for each of the four journals with the most articles is \Sexpr{round(as.integer(mean(tau[art$Jo=="MWR"])),0)} ({\it Monthly Weather Review}), \Sexpr{round(as.integer(mean(tau[art$Jo=="JAS"])),0)} ({\it Journal of the Atmospheric Sciences}), \Sexpr{round(as.integer(mean(tau[art$Jo=="JC"])),0)} ({\it Journal of Climate}), and \Sexpr{round(as.integer(mean(tau[art$Jo=="WF"])),0)} ({\it Weather and Forecasting}). On average the {\it Journal of Climate} is slowest and {\it Monthly Weather Review} is fastest, but the difference is less than 3.5 weeks. Undoubtedly there are many factors that influence the value of $\tau$. On the editor's side there is pre-review, review requests, dispensation decision among others. On the reviewer's side there is work load as well as breaks for travel and vacation, and on the author's side there is the effort needed to revise and resubmit. The goal is a predictive distribution for $\tau$. This will allow you to make inferences about the time-to-acceptance for future manuscript submissions. You assume $\tau$ is a random variable having a gamma density given by \begin{equation} f(\tau|\alpha , \beta) = \frac{\tau^{\alpha -1}{\exp(\frac{-\tau}{\beta})}}{{\beta^\alpha}\Gamma(\alpha)} \end{equation} where $\alpha$ and $\beta$ are the shape and scale parameters, respectively and $\Gamma(\alpha) = (\alpha-1)!$. The gamma density is commonly used to model time periods (wait times, phone call lengths, etc). If you place a uniform prior distribution on the parameter vector $(\alpha, \beta)$ the posterior density is given by \begin{equation} g(\alpha, \beta | \tau) \propto g(\alpha, \beta) f(\tau|\alpha, \beta) \label{eq:gammaposterior} \end{equation} The uniform prior is consistent with a judgement that the parameters are the same regardless of author or journal. Random draws from this joint posterior density are summarized and also used to draw predictive samples for $\tau$ from a gamma density. Following \cite{Albert2009}, to make the computation easier the posterior density is reformulated in terms of $\log \alpha$ and $\log \mu$, where $\mu = \alpha \times \beta$ is the posterior mean of $\tau$. The posterior density is available in the \verb@gsp.R@ file (from Jim Albert). Here you source the function by typing <>= source("gsp.R") @ Remember to include quotes around the file name. Given a pair of parameter values ($\log \alpha, \log \mu$) the function computes the posterior probability given the data and the posterior density using \begin{equation} \log g(\alpha, \frac{\mu}{\alpha} | \tau) = \log \mu + \sum_{i=1}^n \log f(\tau_i|\alpha, \frac{\mu}{\alpha}) \label{eq:gammasampling} \end{equation} It performs this computation using pairs of parameters defined as a two-dimensional grid spanning the domain of the posterior. Alternatively, you could use a MCMC approach to compute the posterior density. This would allow you to use non-uniform priors for the parameters. The \verb@gsp@ function is used to determine the posterior density as defined in Eq.~\ref{eq:gammasampling} over a two-dimensional grid of parameter values (on log scales) using the \verb@mycontour@ function from the {\bf LearnBayes} package. It is also used to sample from the posterior using the \verb@simcontour@ function from the same package. Using these functions you contour the joint posterior of the two parameters and then add 1000 random draws from the distribution by typing <>= lim = c(.8, 2.1, 5, 5.45) mycontour(gsp, limits=lim, data=tau, xlab="log alpha", ylab="log mu") s = simcontour(gsp, lim, data=tau, m=1000) points(s$x, s$y, pch=19, col="gray", cex=.5) @ <>= dev.off() @ These computations may take a few seconds to complete. The result is shown in Fig.~\ref{fig:jointposterior}. The contours from inside-out are the 10\%, 1\%, and 0.1\% probability intervals on a log scale. \begin{figure} \centering <>= par(mar=c(5, 4, 4, 2) + .1, mgp=c(2, .5, 0), tcl=-.3) plot(lim[1:2], lim[3:4], type="n", las=1, xlab="log($\\alpha$)", ylab="log($\\mu$)") grid() points(s$x, s$y, pch=19, col="gray", cex=.5) mycontour(gsp, limits=lim, data=tau, add=TRUE) @ \vspace{-.75cm} \caption{Posterior density of the gamma parameters for a model describing time-to-acceptance.} \label{fig:jointposterior} \end{figure} The graph shows the mean time to acceptance is about 181 days [on the ordinate $\exp(5.2)$]. This might be useful to an editor. But the posterior gives additional information. For example, the editor can use your model to estimate the probability that the average time to acceptance will exceed 200 days for the set of manuscripts arriving next month. Since the model provides draws, the question is answered by finding the percentage of draws exceeding the logarithm of 200. Assuming the set of manuscripts is a random sample the model predicts \Sexpr{round(sum(s$y>log(200))/1000*100,1)}\%. Averages are not particularly useful to an author. An author would like to know if a recently submitted manuscript will be accepted in less than 120 days. To answer this question, the author takes random draws of $\tau$'s from a gamma density using the random draws from the posterior parameter distribution. This is done with the \verb@rgamma@ function and then finding the percentage of these draws less than this many days. <>= alpha = exp(s$x) mu = exp(s$y) beta = mu/alpha taum = rgamma(n=1000, shape=alpha, scale=beta) @ Here the model predicts a probability of \Sexpr{round(sum(taum<120)/1000*100,1)}\%. Note that this probability is lower than the average percentage less than 120 days as it includes additional uncertainty associated with modeling an individual estimate rather than a parameter estimate. Changes to review rules and manuscript timetables and tracking will influence time to acceptance. To the extent that these changes occur during the period of data collection or subsequently, they will influence your model's ability to accurately anticipate time to acceptance. Model fit is checked by examining quantile statistics from the data against the same statistics from the posterior draws. For instance, the percentage of articles in the data with $\tau$ less than 90 days is \Sexpr{round(sum(tau<90)/length(tau)*100,1)}, which compares with a percentage of \Sexpr{round(sum(taum<90)/length(taum)*100,1)} from the posterior draws. Continuing, the percentage of articles with $\tau$ longer than 360 days from the data is \Sexpr{round(sum(tau>360)/length(tau)*100,1)}, which compares with a percentage of \Sexpr{round(sum(taum>360)/length(taum)*100,1)} from the posterior draws. The model has practical applications for authors wishing to meet deadlines. As an example, in order for research to be considered by the 5th Assessment Report of the Intergovernmental Panel on Climate Change (IPCC) authors must have had their relevant manuscript accepted for publication by March 15, 2013. \begin{figure} \centering <>= s = simcontour(gsp, c(.8, 2.1, 5, 5.45), tau, 1000) pu = numeric() pl = numeric() pm = numeric() for(t in 1:361){ p = numeric() for(j in 1:1000){ alpha = exp(s$x) mu = exp(s$y) beta = mu/alpha taum = rgamma(n=1000, shape=alpha, scale=beta) p[j] = sum(taum < t)/1000 } pu[t] = quantile(p, prob=.975) pl[t] = quantile(p, prob=.025) pm[t] = quantile(p, prob=.5) } x = as.Date("March 16, 2013", format='%B %d, %Y') t = seq(361, 1, -1) days = x - t pur = rev(pu); plr = rev(pl); pmr = rev(pm) par(mar=c(6, 6, 3, 2) + .1, las=2, mgp=c(2.4, 1, 0), tcl=-.3) plot(days, pmr, pch=19, type="l", lwd=2, cex=.6, ylim=c(0, 1), ylab="Probability of acceptance\n before March 15, 2013", xlab="", xaxt="n") xx = as.numeric(days[seq(1, 361, 30)]) yy = par("usr")[3] - .05 axis(1, at=xx, labels=rep("", length(xx))) text(xx, yy, labels=as.character(days[seq(1, 361, 30)]), srt=45, adj=c(1, 1), xpd=TRUE, cex=.8) mtext("Submit date [year-month-day]", 1, line=4, las=1) abline(h=seq(0, 1, .2), col="lightgray", lty="dotted") abline(v=xx, col="lightgray", lty="dotted") xxx = c(days, rev(days)) yyy = c(plr, rev(pur)) polygon(xxx, yyy, border=NA, col="gray") lines(days, pmr, lwd=2) @ \vspace{-.5cm} \caption{Probability of paper acceptance as a function of submit date.} \label{fig:probvsdate} \end{figure} The graph in Fig.~\ref{fig:probvsdate} shows your model's predictive probability of meeting this deadline versus date. Results are based on 1000 random draws from a gamma distribution where the scale and shape parameter are derived from a 1000 draws from the joint posterior given the data. The probability is the percentage of posterior $\tau$ draws less than $\tau_o$ days from March 15, 2013. The 95\% credible interval shown by the gray band is obtained by repeating the 1000 draws from the gamma distribution 1000 times and taking the 0.025 and 0.975 quantile values of the probabilities. The probability is high in the early months of 2012 when the deadline is still a year in the future. However, by mid September of 2012 the probability drops below 50\% and by mid January 2013 the probability is less than 10\%. The predictive probabilities from your model reflect what can be expected if an arbitrary author submits to an arbitrary AMS journal with `hurricane' in the title under the assumption that the paper will be accepted. Certain journals and authors could have a faster or slower turn-around time. For instance, we note that the two papers in the data with Kossin as lead author have $\tau$'s of 84 and 56 days, values which are substantially smaller than the mean. In this case you need to use a hierarchical model (see Chapter~\ref{chap:bayesianmodels}) to accommodate author and journal differences. With a hierachical model you use a Markov chain Monte Carlo approach to obtain the posterior probabilities. \subsection{Markov chain Monte Carlo approach} \label{sec:mcmc} Markov chain Monte Carlo (MCMC) is a class of algorithms for sampling from a probability distribution using a Markov chain that, over many samples, has the desired distribution. It is a way to obtain samples from your posterior distribution. It is flexible, easy to implement, and requires little input from you. It is one reason behind the recent popularity of Bayesian approaches. Gibbs sampling is an example of an MCMC approach. Suppose your parameter vector of interest is $\theta= (\theta_1, \theta_2, \ldots, \theta_p)$. The joint posterior distribution of theta, which is denoted [$\theta$| data] may be of high dimension and difficult to summarize. Instead suppose you define the set of conditional distributions as \begin{equation} \left[\theta_1| \theta_2, \ \ldots, \theta_p, \ \hbox{data}\right], \left[\theta_2| \theta_1, \theta_3 \ \ldots, \theta_p, \ \hbox{data}\right], \cdots \left[\theta_p| \theta_1, \ \ldots, \theta_{p-1}, \ \hbox{data}\right], \end{equation} where [$X$|$Y,Z$] represents the distribution of $X$ conditional on values of random variables $Y$ and $Z$. The idea is that you can set up a Markov chain from the joint posterior distribution by simulating individual parameters from the set of $p$ conditional distributions. Drawing one value for each individual parameter from these distributions in turn is called one update (iteration) of the Gibbs sampling. Under general conditions, draws will converge to the target distribution (joint posterior of $\theta$). Unfortunately this theoretical result provides no practical guidance on how to decide if the simulated sample provides a reasonable approximation to the posterior density \citep{Jackman2009}. This must be done on a model-by-model case. \subsection{JAGS} \label{sec:jags} A popular general purpose MCMC software that implements Gibbs sampling is WinBUGS (Windows version of Bayesian inference Using Gibbs Sampling) \citep{LunnEtAl2000}. It is stand-alone with a Graphical User Interface (GUI). JAGS (Just Another Gibbs Sampler) is an open-source project written in C++. It runs on any computing platform and can be accessed from R with functions from the {\bf rjags} package \cite{Plummer2011}. Here you use JAGS on the earlier problem of making inferences about the proportion of U.S. landfalls. The example is simple and an MCMC algorithm is not necessary, but it's instructive for you to see the work flow. To begin, download and install the latest version of JAGS. This is C++ code that gets installed outside of R. Next open a text file and write the model code in a language that JAGS understands. You can copy and paste from here. Call the file {\bf JAGSmodel.txt}. \begin{verbatim} ___JAGS code___ model { h ~ dbin(pi, n) #data pi ~ dbeta(a, b) #prior a <- 3.26 b <- 7.19 } _______________ \end{verbatim} Note the similarity of the syntax to R. Every JAGS (and BUGS) program begins with a model statement followed by an opening curly brace and ends with an enclosing curly brace. The second line specifies the landfall count {\tt h} using the `twiddle' or `tilde' character `~' which denotes a stochastic relation. The stochastic relation is of the form \verb@node~ddens(arguments)@ where {\tt node} is your parameter and {\tt ddens} is the name of a function calling one of the distributions supported in JAGS. These functions all start with the letter {\tt d}, as here in {\tt dbin} for the binomial mass function and {\tt dbeta} for the beta density used in third line for your prior. Here the values of the parameters {\tt a} and {\tt b} are the same as you used earlier in the chapter and are set deterministically using the \verb@<-@ assignment operator. JAGS and BUGS are declarative languages. R is a procedural language. The JAGS syntax specifies the model, but it does not define the computational steps. When the JAGS model is compiled, the model syntax is turned into a set of sequential instructions, but this is hidden from you. Because of this, the statement order is not important. What matters is that the compiler can resolve the names and links in the model. Think of the JAGS model as a directed acyclic graph (DAG) as shown in Fig.~\ref{fig:DAG}. Nodes are the parameters and data and arrows indicate the conditional dependency, either stochastic or deterministic. Hurricane landfall $h$ depends on the number of hurricanes $n$ and the proportion $\pi$, where the proportion $\pi$ depends on your prior assumptions encoded in the beta parameters $a$ and $b$. \begin{figure} \centering <>= require(grid) hc = grid.circle(r=.05, x=.6, y=.2) grid.text("h", x=.6, y=.2) pic = grid.circle(r=.05, x=.4, y=.5) grid.text("$\\pi$", x=.4, y=.5) nc = grid.circle(r=.05, x=.8, y=.5) grid.text("n", x=.8, y=.5) grid.curve(.43, .45, .55, .25, arrow=arrow(type="closed", length=unit(2, "mm")), angle=10, curvature=0, gp=gpar(fill="black")) grid.curve(.77, .45, .65, .25, arrow=arrow(type="closed", length=unit(2, "mm")), angle=10, curvature=0, gp=gpar(fill="black")) ac = grid.circle(r=.05, x=.2, y=.8) grid.text("a", x=.2, y=.8) bc = grid.circle(r=.05, x=.6, y=.8) grid.text("b", x=.6, y=.8) grid.curve(.23, .75, .35, .55, arrow=arrow(type="closed", length=unit(2, "mm")), angle=10, curvature=0, gp=gpar(fill="black")) grid.curve(.57, .75, .45, .55, arrow=arrow(type="closed", length=unit(2, "mm")), angle=10, curvature=0, gp=gpar(fill="black")) @ \vspace{-.75cm} \caption{Directed acyclic graph of the landfall proportions model.} \label{fig:DAG} \end{figure} Landfall proportion $\pi$ is conditionally dependent on the scale (a) and shape (b) values through the beta density. The number of landfalls $h$ is dependent on the basin-wide count $n$ and the proportion of landfalls $\pi$ through the binomial distribution. The conditional independence structure of your model is easier to see when represented as a DAG. Note that BUGS allows you to generate model code from a DAG. Next open R and type <>= require(rjags) @ You call JAGS from R with the \verb@jags.model@ function by typing, <>= model = jags.model('JAGSmodel.txt', data = list('h'=4, 'n'=34), inits = list('pi'=.5, .RNG.seed=3042, .RNG.name="base::Super-Duper")) @ The function sends the model in file {\bf JAGSmodel.txt} to JAGS for parsing and compiling. Make sure the file is in your working directory. The {\tt data} argument specifies the number of landfalling hurricanes $h$ and number of hurricanes $n$ as a list. The {\tt inits} argument directs JAGS to initialize the MCMC although that is not necessary with this simple model. The last two elements in the {\tt inits} list specify the random number generator (RNG) and an initial seed value for the generator. This allows the results to be replicated as the sequence of random numbers will be identical. The object {\tt model} is of class {\tt jags}. By default the function generates 1000 samples (updates) of the MCMC algorithm. This is enough for successive values of $\pi$ to depart from the initial value and reach the posterior density. In fact, this `burn-in' is instantaneous with this simple model. If more iterations are needed you specify the number in the function with the argument {\tt n.adapt} or you can use the \verb@update@ method on the model object by typing, <>= update(model, 1000) @ An additional 1000 iterations from the MCMC are generated but not saved. Finally to generate posterior samples of $\pi$ and save them, you use the \verb@coda.samples@ function. It continues updating the chain for the number of iterations specified by \verb@n.iter@, but this time it saves them if the parameter is listed in the \verb@variable.names@ argument. <>= out = coda.samples(model, variable.names='pi', n.iter=1000) @ CODA stands for COnvergence Diagnostic and Analysis. It describes a suite of functions for analyzing outputs generated from BUGS software. The object returned is of class {\tt mcmc.list}. You use the \verb@summary@ method on this object to obtain a summary of the posterior distribution for $\pi$. <>= summary(out) @ The output begins with the characteristics of your MCMC and then provides statistics on the samples from the posterior distribution. The posterior mean indicating the proportion of all North Atlantic hurricanes that make landfall in the United States is \Sexpr{round(summary(out)$statistics[1],2)}. Quantiles of $\pi$ from the posterior samples indicate the 95\% credible interval for the proportion is (\Sexpr{round(summary(out)$quantiles[1],2)}, \Sexpr{round(summary(out)$quantiles[5],2)}). You obtain other statistics from the posterior samples by extracting the array corresponding to $\pi$ from the MCMC list and performing the appropriate computations. For instance, given your prior and your data, how likely is it that the population percentage of land falls is less than or equal to 0.25? This is answered by typing <>= pi = as.array(out) sum(pi <= .25)/length(pi) @ Here since there is only one chain and one variable you extract the vector from the array using \verb@as.array@. The answer compares well with what you obtained in \S\ref{sec:inference} where you used the conjugate model. Additional analysis can be done with the posterior samples and the plot method will produces a trace and density plot of your $\pi$ values by typing, <>= plot(out) @ The trace plot shows your MCMC samples versus the simulation index. It is useful in assessing whether your chain has converged. Although the samples vary the mean and variance are relatively constant indicating convergence. This is expected with a simple model. The density plot indicates the distribution of all the samples. The MCMC approach is flexible and it allows you to easily answer related inferential questions. For instance, suppose you are interested in the probability that any two consecutive hurricanes that form in the North Atlantic will hit the United States. You can add a node to your model of the form \verb@pi2 <- pow(pi,2)@. This is changed in your file {\bf JAGSmodel.txt}. You then monitor this node and draw inferences from the posterior samples. <>= model=jags.model('JAGSmodel.txt', data=list('h'=4, 'n'=34), inits=list('pi'=.5)) update(model, 1000) out = coda.samples(model, variable.names='pi2', n.iter=1000) summary(out) @ The median posterior probabilty indicates only a \Sexpr{round(summary(out)$quantiles[3],3)*100}\% chance of consecutive hurricanes hitting the United States. The result assumes consecutive hurricanes are independent. \subsection{WinBUGS} \label{sec:winbugs} WinBUGS and OpenBUGS have functions for modeling spatial data that are not yet available in JAGS. In Chapter~\ref{chap:bayesianmodels} you will build a Bayesian spatial model for your hurricane data. Here we show you the work flow for implementing a Bayesian model in WinBUGS through R. If you are using Windows there is no problem. Unfortunately if you are using a Mac your options are limited. One is to run WinBUGS with Wine. Another is to use OpenBUGS without using R. Here we assume you are using a Windows machine or running Windows on a partition of your Mac (e.g., bootcamp). First download and install WinBUGS.\footnote{\url{http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml}} Make sure you have permission to write files in the directory where the executable file is stored. Next, write a WinBUGS model describing the time to acceptance for an article on hurricanes submitted to the American Meteorological Society (see \S\ref{sec:tta}). The code is given below. \begin{verbatim} ___WinBUGS code___ model { for (i in 1:N) { TTA[i] ~ dgamma(shape, rate) } ttastar ~ dgamma(shape, rate) ht <- step(120-ttastar) shape ~ dgamma(.1, .1) rate ~ dgamma(.1, .1) } _______________ \end{verbatim} The syntax is nearly identical to JAGS and in this case you can run this as a JAGS model using the work flow described above. Save the code in {\bf WinBUGSmodel.txt} outside of R. The \verb@step@ function is 1 if its argument is positive and 0 otherwise. The parameter \verb@ht@ thus indicates whether the sampled time to acceptance is less than 120 days. Next prepare the data inputs required by the \verb@bugs@ function from the {\bf R2WinBUGS} package. This is a list containing the name of each data vector. <>= require(R2WinBUGS) N = length(tau) TTA = as.numeric(tau) data = list("TTA","N") @ Using these data together with your model, you run an MCMC simulation to get estimates for \verb@ttastar@. Beforehand you decide how many chains to run for how many iterations. If the length of the burn-in is not specified, then the burn-in is taken as half the number of iterations. You also specify the starting values for the chains. Here you do this by writing the following function <>= inits=function(){ list(shape=runif(1, .3, .5), rate=runif(1, .3, .5), ttastar = runif(1, 100, 120)) } @ You then start the MCMC by typing <>= model = bugs(data, inits, model.file="WinBUGSmodel.txt", parameters=c("ttastar", "ht", "shape", "rate"), n.chains=3, n.iter=5000, n.thin=1, bugs.directory="C:/Program Files/WinBUGS14") @ The argument \verb@bugs.directory@ must point to the directory containing the WinBUGS executable and you must have write permission in this directory. The results are saved in the bugs object \verb@model@. The \verb@bugs@ function uses the parameter names you gave in the WinBUGS text file to structure the output into scalar, vector, and arrays containing the parameter samples. The object \verb@model$sims.array@, as as example, contains the posterior samples as an array. In addition the samples are stored in a long vector. You can use the \verb@print@ method to get a summary of your results. Additionally you use the \verb@plot@ method to create graphs summarizing the posterior statistics of your parameters and displaying diagnostics relating to convergence to the chains. The chains are convergent when your samples are taken from the posterior. By default \verb@bugs@ removes the first half of the iterations as `burn-in', where the samples are moving away from the initial set of values towards the posterior distribution. Samples can be thinned if successive values are highly correlated (see Chapter~\ref{chap:bayesianmodels}). The burn-in and thinning determine the final number of samples (saved in \verb@model$n.keep@) available for inference. Here you are interested in the posterior samples of the time to acceptance \verb@ttastar@. To plot the values as a sequence (trace plot) and as a distribution, type <>= plot(model$sims.array[, 1, 1], type="l") plot(density(model$sims.array[, 1, 1])) @ \begin{figure} \centering <>= load("modelupdates.RData") sim = model$sims.array[, 1, 1] par(las=1, mex=.8, mgp=c(2.2, .5, 0), tcl=-.3) layout(matrix(c(1, 2), 1, 2, byrow=TRUE), widths=c(.53, .47)) plot(sim, type="l", xlab="Simulation number", ylab="Acceptance time [days]") grid() lines(sim) mtext("a", side=3, line=1, adj=0, cex=1.1) hist(sim, xlab="Acceptance time [days]", main="") rug(sim) mtext("b", side=3, line=1, adj=0, cex=1.1) @ \vspace{-.75cm} \caption{Posterior samples of time to acceptance. (a) Trace plot and (b) histogram.} \label{fig:UScounts} \end{figure} The trace plot shows the sequence of sample values by the simulation number. The mean and variance of the values does not appear to change across the sequence indicating the samples are coming from the posterior distribution. A view of the marginal distribution of time-to-acceptance is shown with a histogram. Since your model contains a node indicating whether the sample time to acceptance was greater than 120 days you can easily determine the posterior probability of this occurrence by typing <>= sum(model$sims.array[, 1, 2])/model$n.keep @ The answer of \Sexpr{round(sum(model$sims.array[,1,2])/model$n.keep*100,1)}\% is close to what you obtained in \S\ref{sec:tta}. Greater flexibility in summarizing and plotting your results are available with functions in the {\bf coda} package \citep{PlummerEtAl2010}. You turn on the \verb@codaPkg@ switch and rerun the simulations. <>= modelc = bugs(data, inits, model.file="WinBUGSmodel.txt", parameters=c("ttastar", "ht", "shape", "rate"), n.chains=3, n.iter=5000, n.thin=1, codaPkg=TRUE, bugs.directory="C:/Program Files/WinBUGS14") @ The saved object \verb@modelc@ is a character vector of files names with each file containing coda output for one of the three chains. You create a MCMC list object (see \S\ref{sec:jags}) with the \verb@read.bugs@ function. <>= out = read.bugs(modelc, quiet=TRUE) @ Additional plotting options are available using the {\bf lattice} package. To plot the density of all the model parameters type <>= require(lattice) densityplot(out) @ The plots contain distributions of the MCMC samples from all the model nodes and for the three chains. Density overlap across the chains is another indication that the samples converge to the posterior. This chapter introduced Bayesian statistics for hurricane climate. The focus was on the proportion of land falling hurricanes. We examined a conjugate model for this proportion. We also looked at the time to acceptance for a manuscript. The conjugate model revealed the mechanics of the Bayesian approach. We also introduced MCMC samplers. This innovation allows you greater flexibility in creating realistic models for your data. The next chapter shows you how to graph and plot your data.