Take Web data analysis to the next level with PHP
By Paul Meagher, 2004-06-15
Chi Square instance variables
The PHP-based Chi Square software package I developed consists of classes for analyzing frequency data that is classified along one or two dimensions (ChiSquare1D.php and ChiSquare2D.php). I'll limit my discussion to explaining how the ChiSquare1D.php class works and how it can be applied to one-dimensional Web poll data.
Before moving on, I should note that classifying data along two dimensions (for instance, beer preference by gender) allows you to begin to explain your outcomes by looking for systematic relationships, or conditional probabilities, among your "contingency" table cells. While much of the discussion that follows will help you understand how the ChiSquare2D.php software works, you will need to address additional experimental, analysis, and visualization issues, which are not discussed in this article, before using that class.
Listing 3 looks at a fragment of the ChiSquare1D.php class which consists of:
1. A file that is included
2. The class instance variables
Listing 3. Fragment of Chi Square class with included file and instance variables
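<?php
// ChiSquare1D.php
// Copyright 2003, Paul Meagher
// Distributed under LGPL
require_once PHP_MATH . "dist/Distribution.php";
class ChiSquare1D {
    var $Total;              // Total number of observations
    var $ObsFreq = array();  // Observed frequencies
    var $ExpFreq = array();  // Expected frequencies
    var $ExpProb = array();  // Expected probabilities
    var $NumCells;           // Number of cells (response categories)
    var $ChiSqObt;           // Obtained Chi Square value
    var $DF;                 // Degrees of freedom
    var $Alpha;              // Significance level for the test
    var $ChiSqProb;          // Tail probability of the obtained Chi Square value
    var $ChiSqCrit;          // Critical Chi Square value at the chosen Alpha
}
?>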
A file called Distribution.php is included at the top of the script in Listing 3. The include path is built from a PHP_MATH constant, which is set in an init.php file that the calling script is assumed to have already included.
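The article does not show init.php, so the following is only a minimal sketch of what such a file might contain. The choice of directory is an assumption made for illustration; the one requirement taken from Listing 3 is that PHP_MATH ends with a path separator, because the class appends "dist/Distribution.php" to it directly.

```php
<?php
// init.php (hypothetical sketch; the real file is not shown in this article)
// Assumption: the math package lives in the same directory as this file.
// The trailing slash matters because ChiSquare1D.php concatenates
// PHP_MATH . "dist/Distribution.php" without adding a separator.
if (!defined('PHP_MATH')) {
    define('PHP_MATH', __DIR__ . '/');
}
```

A calling script would include this file before ChiSquare1D.php so that the constant exists when require_once is evaluated.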
The included file, Distribution.php, contains methods that generate sampling-distribution statistics for several commonly used sampling distributions (Student T, Fisher F, Chi Square). The ChiSquare1D.php class needs access to the Chi Square methods in Distribution.php to compute the tail probability of an obtained Chi Square value.
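To make "tail probability" concrete, here is a small standalone sketch. It is not the Distribution.php implementation, whose methods are not shown here, and the function name chiSquareTailEvenDF is hypothetical. It relies on a closed-form series that holds only when the degrees of freedom are even, so it is narrower than a general-purpose routine.

```php
<?php
// Hypothetical helper, not part of Distribution.php.
// Upper-tail probability P(X >= $x) for a Chi Square variable with an
// EVEN number of degrees of freedom, using the closed form
//   exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
function chiSquareTailEvenDF($x, $df) {
    if ($df < 2 || $df % 2 !== 0) {
        throw new InvalidArgumentException('This sketch only handles even df >= 2.');
    }
    if ($x <= 0) {
        return 1.0; // the whole distribution lies above zero
    }
    $half = $x / 2;
    $term = 1.0;    // (x/2)^0 / 0!
    $sum  = 1.0;
    for ($i = 1; $i < $df / 2; $i++) {
        $term *= $half / $i;  // builds (x/2)^i / i! incrementally
        $sum  += $term;
    }
    return exp(-$half) * $sum;
}
```

For a three-option poll (two degrees of freedom), calling chiSquareTailEvenDF(5.991, 2) returns roughly 0.05, which matches the familiar critical value for that case.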
The list of instance variables in this class is worth noting because they define the result object that is generated by the analysis procedure. This result object contains all the important details about the test, including three critical Chi Square statistics -- ChiSqObt, ChiSqProb, and ChiSqCrit. For details on how each instance variable is computed, see the constructor method for the class, where all of these values are derived.
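As a rough illustration of how several of these values relate to one another, the sketch below applies the standard one-dimensional Chi Square formulas to a set of observed frequencies. It is not the ChiSquare1D constructor; the function name chiSquareSummary and the default of equal expected probabilities are assumptions made for this example.

```php
<?php
// Hypothetical sketch, not the ChiSquare1D constructor.
// Assumes equal expected probabilities unless $expProb is supplied.
function chiSquareSummary(array $obsFreq, array $expProb = array()) {
    $obsFreq  = array_values($obsFreq);
    $numCells = count($obsFreq);
    $total    = array_sum($obsFreq);
    if (count($expProb) === 0) {
        $expProb = array_fill(0, $numCells, 1 / $numCells);
    }
    $expFreq  = array();
    $chiSqObt = 0.0;
    foreach ($obsFreq as $i => $observed) {
        // Expected frequency = expected probability * total observations
        $expFreq[$i] = $expProb[$i] * $total;
        // Each cell contributes (observed - expected)^2 / expected
        $chiSqObt += pow($observed - $expFreq[$i], 2) / $expFreq[$i];
    }
    return array(
        'Total'    => $total,
        'NumCells' => $numCells,
        'ExpFreq'  => $expFreq,
        'ChiSqObt' => $chiSqObt,
        'DF'       => $numCells - 1, // one-dimensional case: cells minus one
    );
}

// Example: 100 poll responses spread across three options.
$result = chiSquareSummary(array(20, 30, 50));
```

ChiSqProb and ChiSqCrit are left out here because both depend on the Chi Square sampling distribution, which the class obtains from Distribution.php.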
First published by IBM developerWorks
