Conduct Web experiments using PHP, Part 2

By Paul Meagher
2005-03-18


Independence model

Another model that you might want to test is one that assumes that cell probabilities are the simple product of the marginal probabilities.

$p_{ij} = p_{i+} \, p_{+j}$

with row marginals computed using this formula

$p_{i+} = n_{i+} / N$

and column marginals using this formula

$p_{+j} = n_{+j} / N$

Table 4 illustrates how to use these formulas to convert a table of frequency counts (see Table 2) to a table of response probability estimates.
Table 4. Converting observed frequencies to probability estimates
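| IMAGE \ TEXT | short | long | sum |
|---|---|---|---|
| person | $p_{11}$ = (10/18) * (8/18) = 0.2469 | $p_{12}$ = (10/18) * (10/18) = 0.3086 | $p_{1+}$ = 10/18 |
| product | $p_{21}$ = (8/18) * (8/18) = 0.1975 | $p_{22}$ = (8/18) * (10/18) = 0.2469 | $p_{2+}$ = 8/18 |
| sum | $p_{+1}$ = 8/18 | $p_{+2}$ = 10/18 | N = 18 |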
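The conversion in Table 4 can be sketched in a few lines of PHP. This is a minimal illustration that assumes PHP 7 or later; the function name `independenceProbabilities()` is hypothetical and is not part of Chi2D.php. It uses only the marginal totals shown in Table 4.

```php
<?php
/**
 * Hypothetical helper (not part of Chi2D.php): cell probabilities under
 * the independence model, p_ij = p_i+ * p_+j.
 */
function independenceProbabilities(array $rowTotals, array $colTotals): array
{
    $n = array_sum($rowTotals);
    if ($n <= 0 || $n != array_sum($colTotals)) {
        throw new InvalidArgumentException(
            'Row and column totals must sum to the same positive N.'
        );
    }

    $p = [];
    foreach ($rowTotals as $i => $rowTotal) {
        foreach ($colTotals as $j => $colTotal) {
            // Row marginal probability times column marginal probability.
            $p[$i][$j] = ($rowTotal / $n) * ($colTotal / $n);
        }
    }
    return $p;
}

// Marginal totals as they appear in Table 4.
$rowTotals = [10, 8]; // IMAGE: person, product
$colTotals = [8, 10]; // TEXT: short, long

$p = independenceProbabilities($rowTotals, $colTotals);

foreach ($p as $row) {
    $cells = array_map(function ($value) {
        return number_format($value, 4);
    }, $row);
    echo implode('  ', $cells), PHP_EOL;
}
```

Because each set of marginal probabilities sums to 1, the four cell probabilities also sum to 1, which is a quick sanity check on the result.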


You can use these probability estimates to derive the expected cell counts, where $E_{ij} = N \, p_{i+} \, p_{+j}$.
Table 5. Converting probability estimates to expected counts
| IMAGE \ TEXT | short | long | sum |
|---|---|---|---|
| person | $E_{11}$ = 18 * 0.2469 = 4.4442 | $E_{12}$ = 18 * 0.3086 = 5.5548 | 10 |
| product | $E_{21}$ = 18 * 0.1975 = 3.555 | $E_{22}$ = 18 * 0.2469 = 4.4442 | 8 |
| sum | 8 | 10 | 18 |
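The expected counts can also be computed directly from the marginal totals, since N * p_i+ * p_+j simplifies to n_i+ * n_+j / N. The sketch below assumes PHP 7 or later, and `expectedCounts()` is a hypothetical name, not a Chi2D.php method.

```php
<?php
/**
 * Hypothetical helper (not part of Chi2D.php): expected cell counts under
 * the independence model, E_ij = N * p_i+ * p_+j = n_i+ * n_+j / N.
 */
function expectedCounts(array $rowTotals, array $colTotals): array
{
    $n = array_sum($rowTotals);
    if ($n <= 0 || $n != array_sum($colTotals)) {
        throw new InvalidArgumentException(
            'Row and column totals must sum to the same positive N.'
        );
    }

    $expected = [];
    foreach ($rowTotals as $i => $rowTotal) {
        foreach ($colTotals as $j => $colTotal) {
            $expected[$i][$j] = $rowTotal * $colTotal / $n;
        }
    }
    return $expected;
}

// Marginal totals from the "sum" row and column of Table 5.
$expected = expectedCounts([10, 8], [8, 10]);

foreach ($expected as $row) {
    $cells = array_map(function ($value) {
        return number_format($value, 4);
    }, $row);
    echo implode('  ', $cells), PHP_EOL;
}
```

Working from unrounded marginals gives 4.4444, 5.5556, 3.5556, and 4.4444, which differ slightly from the Table 5 entries because the table multiplies N by probabilities already rounded to four decimal places. The unrounded expected counts reproduce the row and column sums exactly.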


The product rule $p_{ij} = p_{i+} \, p_{+j}$ expresses the idea of factor independence, the idea that Factor A exerts a constant factor-level effect regardless of the level of Factor B (and vice versa).

Test this "independence model" (and the expected cell counts derived from it) using the chi-square goodness-of-fit procedure. A chi-square score (the sum of the squared differences between observed and expected counts, each scaled by the expected count) that is large relative to the critical value for the table's degrees of freedom tells you that your factors are probably not independent. Your theoretical goal might then be viewed as trying to find the simplest model that explains your results.
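As a rough sketch of that test, the following PHP computes the chi-square statistic for a two-dimensional table against independence-model expected counts. It assumes PHP 7 or later. The function `chiSquare2D()` is hypothetical and does not reproduce the Chi2D.php interface, and the observed counts are invented for illustration: they are not the Table 2 data, only a table whose marginals happen to match Tables 4 and 5.

```php
<?php
/**
 * Hypothetical helper (not the Chi2D.php API): Pearson chi-square statistic
 * for a two-dimensional table, using the independence model for E_ij.
 */
function chiSquare2D(array $observed): array
{
    $rowTotals = array_map('array_sum', $observed);
    $colTotals = [];
    foreach ($observed as $row) {
        foreach ($row as $j => $count) {
            $colTotals[$j] = ($colTotals[$j] ?? 0) + $count;
        }
    }
    $n = array_sum($rowTotals);

    $chiSquare = 0.0;
    foreach ($observed as $i => $row) {
        foreach ($row as $j => $count) {
            $expected = $rowTotals[$i] * $colTotals[$j] / $n;
            if ($expected == 0) {
                throw new InvalidArgumentException('Empty row or column.');
            }
            // Squared difference, scaled by the expected count.
            $chiSquare += ($count - $expected) ** 2 / $expected;
        }
    }

    // Degrees of freedom for an r x c table: (r - 1) * (c - 1).
    $df = (count($rowTotals) - 1) * (count($colTotals) - 1);

    return ['chiSquare' => $chiSquare, 'df' => $df];
}

// Hypothetical observed counts (NOT Table 2): rows sum to 10 and 8,
// columns sum to 8 and 10.
$observed = [
    [6, 4],
    [2, 6],
];

$result = chiSquare2D($observed);
printf("chi-square = %.3f, df = %d\n", $result['chiSquare'], $result['df']);
```

For this made-up table the statistic is about 2.205 with 1 degree of freedom. That is below 3.841, the 0.05 critical value of the chi-square distribution with 1 degree of freedom, so these illustrative counts would not lead you to reject independence.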

The most complex model, called the saturated model, requires at least one parameter to represent each cell in the table. When modeling your data, your aim might be to reduce that number (use the same parameter estimate for more than one cell) while accurately accounting for the data patterns.

If your observed chi-square score is not significant (as in a null interaction), then examine each factor separately to determine whether there were any main effects and, if so, how large they are. You can use the one-dimensional chi-square procedure to assess main effects (such as factor-level differences for one factor) once you recompute your cell totals by collapsing over (or ignoring) the levels of the other factor. You can think of one-dimensional chi-square analysis as doing main effects analyses on the row or column marginals.

The Chi1D.php and Chi2D.php classes also have a showResidualErrors() method that reports the residual error between your expected and observed counts. Examination of residuals is a critical part of the chi-square model-fitting procedure.
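Collapsing over a factor and running a one-dimensional test on the resulting marginals might look like the sketch below. It assumes PHP 7 or later and an equal-probability null model for the collapsed counts, which is an assumption made here for illustration. The function names are hypothetical and do not reproduce the Chi1D.php interface or its showResidualErrors() output, and the observed counts are invented, not the Table 2 data.

```php
<?php
/**
 * Hypothetical helper: collapse a two-dimensional table over its columns,
 * leaving one total per row (the row marginals).
 */
function collapseOverColumns(array $observed): array
{
    return array_map('array_sum', $observed);
}

/**
 * Hypothetical helper (not the Chi1D.php API): one-dimensional chi-square
 * against an equal-probability null model, with raw residuals.
 */
function chiSquare1D(array $counts): array
{
    $n = array_sum($counts);
    if ($n <= 0) {
        throw new InvalidArgumentException('Counts must sum to a positive N.');
    }
    // Assumption: every level is equally likely under the null model.
    $expected = $n / count($counts);

    $chiSquare = 0.0;
    $residuals = [];
    foreach ($counts as $k => $count) {
        $residuals[$k] = $count - $expected; // observed minus expected
        $chiSquare += ($count - $expected) ** 2 / $expected;
    }

    return [
        'chiSquare' => $chiSquare,
        'df'        => count($counts) - 1,
        'residuals' => $residuals,
    ];
}

// Hypothetical observed counts (NOT Table 2); rows are the IMAGE levels.
$observed = [
    [6, 4],
    [2, 6],
];

$rowMarginals = collapseOverColumns($observed); // totals per IMAGE level
$result = chiSquare1D($rowMarginals);

printf("chi-square = %.3f, df = %d\n", $result['chiSquare'], $result['df']);
foreach ($result['residuals'] as $level => $residual) {
    printf("level %d residual = %+.2f\n", $level, $residual);
}
```

With these illustrative counts the row marginals are 10 and 8, the expected count per level is 9, and the statistic is about 0.222 with 1 degree of freedom, so there is no sign of a main effect for that factor. The residuals of +1 and -1 show the direction and size of each level's departure from the null model.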

I use the independence model as the default model in Chi2D.php to compute the expected frequencies for use in the two-dimensional chi-square analysis. This is because the two-dimensional chi-square procedure is most commonly used in experimental contexts to test for possible interactions between your categorical variables where the null model is the factor independence model.



First published by IBM developerWorks

