Take Web data analysis to the next level with PHP

By Paul Meagher
2004-06-15


Model the null hypothesis: The Chi Square statistic

So far you have summarized the results of your Web poll using a table that reports frequency counts (and percentages) for each response option. To test the null hypothesis (that no difference exists between table cell frequencies), it helps to compute a single overall measure of how far the table cells deviate from the values you would expect under the null hypothesis.

In the case of this beer poll, the expected frequency under the null hypothesis is the following:

Expected Frequency = Number of Observations / Number of Response Options
Expected Frequency = 1000 / 4
Expected Frequency = 250
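
The same arithmetic can be sketched in PHP. This is a minimal illustration rather than code from the tutorial's Chi Square class: the function name expectedFrequency() is an assumption, and the observed counts are the four values used in the deviation sum in this section.

```php
<?php
// Observed counts for the four response options of the beer poll.
$observed = [285, 250, 215, 250];

/**
 * Expected frequency per cell when every response option is equally likely.
 * expectedFrequency() is a hypothetical helper name; it assumes a non-empty array.
 */
function expectedFrequency(array $observed): float
{
    // Number of observations divided by number of response options.
    return array_sum($observed) / count($observed);
}

printf("Expected frequency: %.0f\n", expectedFrequency($observed)); // prints "Expected frequency: 250"
```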

A first attempt at such an overall measure is simply to sum the differences between the observed frequencies and the expected frequency: (285 - 250) + (250 - 250) + (215 - 250) + (250 - 250).

If you do this, you find that the sum is 0, because deviations from a mean always sum to 0. To get around this problem, square each difference score (hence the "Square" in Chi Square). Finally, to make the score comparable across samples with different numbers of observations (in other words, to standardize it), divide each squared difference by the expected frequency. So, the formula for the Chi Square statistic looks like this ("O" means "observed frequency" and "E" means "expected frequency"):

Figure 1. The formula for the Chi Square statistic

\chi^2 = \sum \frac{(O - E)^2}{E}
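
To see why squaring is needed, the following sketch sums the raw deviations and the squared deviations for the poll counts side by side. The variable names are illustrative assumptions, not identifiers from the tutorial's code.

```php
<?php
// Observed counts from the beer poll; the expected count is their mean.
$observed = [285, 250, 215, 250];
$expected = array_sum($observed) / count($observed);

$rawSum = 0;      // sum of (O - E): positive and negative deviations cancel
$squaredSum = 0;  // sum of (O - E)^2: squaring removes the sign

foreach ($observed as $o) {
    $deviation = $o - $expected;
    $rawSum += $deviation;
    $squaredSum += $deviation * $deviation;
}

printf("Sum of raw deviations: %d\n", $rawSum);         // prints "Sum of raw deviations: 0"
printf("Sum of squared deviations: %d\n", $squaredSum); // prints "Sum of squared deviations: 2450"
```

The raw deviations cancel out, while the squared deviations do not, which is why the squared form is the one carried into the statistic.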

If you calculate the Chi Square statistic for the beer poll data, you obtain a value of 9.80. To test your null hypothesis, you want to know the probability of obtaining a value at least this extreme when the deviations are due only to random sampling variability. To find this probability, you need to understand what the sampling distribution for Chi Square looks like.
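
The value of 9.80 can be reproduced with a few lines of PHP. This is a sketch under the assumption that every response option has the same expected frequency; chiSquareUniform() is a hypothetical helper name and is not the class developed later in this tutorial.

```php
<?php
/**
 * Chi Square statistic for observed counts tested against equal expected
 * frequencies. chiSquareUniform() is a hypothetical helper name; it assumes
 * a non-empty array with a non-zero total.
 */
function chiSquareUniform(array $observed): float
{
    // Expected frequency per cell under the null hypothesis.
    $expected = array_sum($observed) / count($observed);

    $chiSquare = 0.0;
    foreach ($observed as $o) {
        // (O - E)^2 / E for one cell, accumulated over all cells.
        $chiSquare += (($o - $expected) ** 2) / $expected;
    }
    return $chiSquare;
}

printf("Chi Square = %.2f\n", chiSquareUniform([285, 250, 215, 250])); // prints "Chi Square = 9.80"
```

Two of the four cells match the expected frequency exactly, so the whole statistic comes from the other two: 1225 / 250 + 1225 / 250 = 9.80.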



First published by IBM developerWorks

