Independence

Author

Russell Almond

Published

October 5, 2020

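Imagine a population which is split into two groups, \(A\) and \(B\). We select 100 people at random and ask them a question which has two answers, yes and no. Define the following quantities: \(N_{Ay}\), \(N_{An}\), \(N_{By}\), and \(N_{Bn}\), the number of people in each group who gave each answer.

Define the following values (row and column totals): \(N_{A+} = N_{Ay} + N_{An}\) and \(N_{B+} = N_{By} + N_{Bn}\) for the groups, \(N_{+y} = N_{Ay} + N_{By}\) and \(N_{+n} = N_{An} + N_{Bn}\) for the answers, and \(N_{++} = 100\) for the whole sample.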

Dividing any of those counts \(N_{xx}\) by the total sample size (100 here) produces the corresponding proportion \(P_{xx}\), which can be interpreted as a probability.
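As a minimal sketch of this bookkeeping in base R, assume some made-up counts for the 100 respondents. The numbers below are purely illustrative and are not data from this page; the variable names are likewise introduced only for the example.

```r
# Hypothetical counts for 100 respondents (illustrative only).
counts <- matrix(c(30, 20,
                   15, 35), 2, 2, byrow = TRUE,
                 dimnames = list(Group = c("A", "B"), Answer = c("y", "n")))

n_row <- rowSums(counts)   # group totals (A, B)
n_col <- colSums(counts)   # answer totals (y, n)
n_all <- sum(counts)       # total sample size

# Proportions: every count divided by the total sample size.
props <- counts / n_all
p_row <- rowSums(props)    # marginal proportions for the groups
p_col <- colSums(props)    # marginal proportions for the answers
```

The marginal proportions `p_row["A"]` and `p_col["y"]` play the role of \(P_{A+}\) and \(P_{+y}\).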

Suppose group membership and the answer to the question are statistically independent. In the interactive app below, adjust \(P_{A+}\) and \(P_{+y}\) to generate a two-by-two table:

#| standalone: true
#| viewerHeight: 300
library(shiny)
N <- 100
ui <- fluidPage(
inputPanel(
  sliderInput("pa", label = "P(Member of Group A)",
              min = 0, max = 1, value = .5, step = 0.05),
  sliderInput("py", label = "P(Answered `yes`)",
              min = 0, max = 1, value = .5, step = 0.05)
),
mainPanel(
  tableOutput("table"),
  plotOutput("mosaic")))

server <- function (input,output) {

tabi <- reactive(
  matrix(N*c(input$pa*input$py,
           input$pa*(1-input$py),
           (1-input$pa)*input$py,
           (1-input$pa)*(1-input$py)),2,2,
         byrow=TRUE,
         dimnames=list(Group=c("A","B"),Answer=c("y","n")))
)
output$table <- renderTable(tabi(),rownames=TRUE, digits=1)
# color and main are arguments of mosaicplot(), not renderPlot()
output$mosaic <- renderPlot(mosaicplot(tabi(),color=TRUE,main="Independent data"))
}
shinyApp(ui=ui,server=server)

There are two things you should notice about the independent data.

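We could say that the row proportions are the same in every row and the column proportions are the same in every column: the fraction answering yes is the same in group \(A\) as in group \(B\), and the fraction belonging to group \(A\) is the same among the yes answers as among the no answers.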

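Another way to think about this is to say that each cell proportion is the product of its row and column proportions: \[ P_{Ay} = P_{A+} P_{+y} \] and likewise for the other three cells.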
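One way to check this numerically is sketched below in base R. The margins 0.4 and 0.7 are arbitrary illustrative values, and the table is built as the product of its margins, the same way the app's server code builds it.

```r
p_a <- 0.4   # assumed proportion in group A (illustrative)
p_y <- 0.7   # assumed proportion answering yes (illustrative)

# Under independence each cell is the product of its margins.
indep <- outer(c(A = p_a, B = 1 - p_a), c(y = p_y, n = 1 - p_y))

# Answer proportions within each group (each row sums to 1).
row_props <- prop.table(indep, margin = 1)

# Group proportions within each answer (each column sums to 1).
col_props <- prop.table(indep, margin = 2)

row_props   # both rows are identical
col_props   # both columns are identical
```

Changing `p_a` or `p_y` changes the cells, but the two rows of `row_props` and the two columns of `col_props` stay identical.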

Dependent

To make the table show dependence, we need to add another parameter to the model that specifies the degree of dependence.

For a two-by-two table, the odds ratio is a fairly easy-to-understand choice: \[ OR = \frac{P_{Ay}/P_{An}}{P_{By}/P_{Bn}}\] When group and answer are independent, the odds ratio (also called the cross-product ratio) is 1.

If Group \(A\) is more likely to answer yes, then the ratio should be bigger than 1.

If Group \(B\) is more likely to answer yes, then the ratio should be less than 1.
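These three cases can be checked with a small helper. This is a sketch in base R: `odds_ratio` is a name introduced here for illustration, and the tables are made-up proportions with groups in rows and answers in columns.

```r
# Odds ratio of a 2x2 table laid out as rows (A, B) by columns (y, n).
odds_ratio <- function(tab) {
  (tab[1, 1] / tab[1, 2]) / (tab[2, 1] / tab[2, 2])
}

# Independent table: product of the margins.
indep <- outer(c(A = 0.5, B = 0.5), c(y = 0.3, n = 0.7))
or_indep <- odds_ratio(indep)        # equal to 1 up to rounding

# Group A more likely to answer yes (illustrative proportions).
dep <- matrix(c(0.25, 0.25,
                0.05, 0.45), 2, 2, byrow = TRUE,
              dimnames = list(Group = c("A", "B"), Answer = c("y", "n")))
or_a <- odds_ratio(dep)              # greater than 1

# Swapping the groups makes B the more likely one: the ratio inverts.
or_b <- odds_ratio(dep[c("B", "A"), ])   # less than 1
```

Swapping the two rows turns the odds ratio into its reciprocal, which is why the app offers choices such as 1/4 and 4 in matching pairs.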

#| standalone: true
#| viewerHeight: 300
library(shiny)
N <- 100
ui1 <- fluidPage(
inputPanel(
  sliderInput("pad", label = "P(Member of Group A)",
              min = 0, max = 1, value = .5, step = 0.05),
  sliderInput("pyd", label = "P(Answered `yes`)",
              min = 0, max = 1, value = .5, step = 0.05),
  selectInput("OR",label="Odds Ratio",
              choices=c("1/4","1/3","1/2","2/3","1","3/2","2","3","4"), selected ="1")
),
mainPanel(
  tableOutput("table1"),
  plotOutput("mosaic1")))

server1 <- function (input,output) {

tabd <- reactive({
  pa <-input$pad
  py <-input$pyd
  OR <- eval(str2lang(input$OR))
  cat(pa,py,OR,"\n")
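  if (OR == 1) {
    # Independence: the cell is the product of the margins.
    pay <- pa*py
  } else {
    # Otherwise P(Ay) solves the quadratic
    # (OR-1)*x^2 - (1+(pa+py)*(OR-1))*x + OR*pa*py = 0;
    # S is the square root of its discriminant.
    S <- sqrt((1+(pa+py)*(OR-1))^2 + 4*OR*(1-OR)*pa*py)
    cat(S,"\n")
    pay <- (1+(pa+py)*(OR-1)-S)/2/(OR-1)
    cat(pay,"\n")
  }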
  matrix(N*c(pay,(pa-pay),(py-pay),(1-py-pa+pay)),
        2,2, byrow=TRUE,
        dimnames=list(Group=c("A","B"),Answer=c("y","n")))
})
output$table1 <- renderTable(tabd(),rownames=TRUE, digits=1)
# color and main are arguments of mosaicplot(), not renderPlot()
output$mosaic1 <- renderPlot(mosaicplot(tabd(),color=TRUE,main="Dependent data"))
}
shinyApp(ui=ui1,server=server1)
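The server code fixes the two margins and the odds ratio and then solves for the single free cell \(P_{Ay}\). Substituting \(P_{An} = P_{A+} - P_{Ay}\), \(P_{By} = P_{+y} - P_{Ay}\), and \(P_{Bn} = 1 - P_{A+} - P_{+y} + P_{Ay}\) into the odds ratio gives the quadratic \[ (OR-1) P_{Ay}^2 - \left[1 + (P_{A+} + P_{+y})(OR-1)\right] P_{Ay} + OR \, P_{A+} P_{+y} = 0 \] The sketch below pulls that calculation out of the app into a standalone function and checks that the resulting table has the requested odds ratio. The function name `cell_from_or` is introduced here for illustration, and the tolerance-based test for an odds ratio of 1 is a choice made in this sketch, not something taken from the app.

```r
# Solve for P(Ay) given the two margins and the odds ratio,
# using the same root as the app's server code.
cell_from_or <- function(pa, py, or) {
  if (isTRUE(all.equal(or, 1))) return(pa * py)   # independence
  b <- 1 + (pa + py) * (or - 1)
  s <- sqrt(b^2 - 4 * or * (or - 1) * pa * py)    # root of the discriminant
  (b - s) / (2 * (or - 1))
}

pa <- 0.5
py <- 0.5
pay <- cell_from_or(pa, py, 4)

# Rebuild the table of proportions from the margins and the solved cell.
tab <- matrix(c(pay, pa - pay, py - pay, 1 - pa - py + pay),
              2, 2, byrow = TRUE,
              dimnames = list(Group = c("A", "B"), Answer = c("y", "n")))

# Recover the odds ratio from the table as a check.
recovered <- (tab[1, 1] / tab[1, 2]) / (tab[2, 1] / tab[2, 2])
```

With both margins at 0.5 and an odds ratio of 4, the solved cell is 1/3 and `recovered` comes back as 4 up to rounding error.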