Session 5: Probability - Module D: Plotting Sampling Outcomes

1. Overview

Learn how evenly distributed samples behave.
Use Python to plot sampled data.

2. Setting Up

Download the program and the dataset:

  1. Download the Python program sampleData.py into your programs directory.
  2. Download the dataset heightInches.csv into your programs directory.

Start IPython:

  1. Open a terminal window.
  2. Change the directory to your programs directory with the cd command.
  3. Start IPython with the ipython command.
  4. From IPython, you can run your program by typing run sampleData.py.

3. Plotting Sampling Outcomes

Start IPython and run sampleData.py; it will generate a histogram. Open the code in gedit and read through it. Can you determine what the figure represents? How can you tell?

Looking at the figure, what distribution does the sampled data appear to have?

A few configuration variables control how the data is collected: barWidth, numBars, numRepetitions, and numSamples. Try experimenting with their values! How does changing each one influence the histogram?
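
If it helps to see the moving parts in isolation, here is a minimal sketch of one way such an experiment could be set up. It is not the contents of sampleData.py. The function names, the stand-in data, and the meaning given to each parameter are assumptions made for illustration: num_samples is taken to be the number of values drawn per repetition, num_repetitions the number of sample means collected, and bar_width and num_bars the size and count of the histogram bins. It prints a rough text histogram rather than drawing a figure, so it runs without any plotting library.

```python
import random


def sample_means(data, num_samples, num_repetitions, rng):
    """Collect num_repetitions means, each from num_samples random draws.

    Draws are made with replacement (an assumption of this sketch).
    """
    means = []
    for _ in range(num_repetitions):
        draws = [rng.choice(data) for _ in range(num_samples)]
        means.append(sum(draws) / num_samples)
    return means


def bin_counts(values, start, bar_width, num_bars):
    """Count how many values land in each of num_bars bins of width bar_width.

    Bin i covers [start + i * bar_width, start + (i + 1) * bar_width).
    Values outside every bin are ignored.
    """
    counts = [0] * num_bars
    for value in values:
        index = int((value - start) // bar_width)
        if 0 <= index < num_bars:
            counts[index] += 1
    return counts


# Stand-in data: the whole numbers 60 to 75, each equally likely.
# This is NOT heightInches.csv; it only gives the sketch something to sample.
data = list(range(60, 76))

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
means = sample_means(data, num_samples=10, num_repetitions=1000, rng=rng)
counts = bin_counts(means, start=60.0, bar_width=1.0, num_bars=16)

# Rough text histogram: one '#' per 10 sample means in the bin.
for i, count in enumerate(counts):
    print(f"{60 + i:>3} | {'#' * (count // 10)}")
```

Try calling sample_means with num_samples=1 and then with a larger value, and compare the two text histograms before you go back to the real program.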

Remember to save one of your histograms, and answer these questions on your webpage!