
Monday, February 24, 2014

Introducing Histograms


A histogram is a graphical representation of a statistical frequency distribution. It is usually used for a continuous variable. The data are represented as rectangles of varying heights and constant width. The width of each rectangle is the class interval of the data set, and its height is proportional to the frequency of the class that it represents.

The following histogram example will help you understand how to construct a histogram.

Sample problem:
The following data table gives the frequency distribution of fuel economy, in miles per gallon, for 17 persons using a particular car model.


Miles per gallon    Frequency

 0 – 5                  0
 5 – 10                 1
10 – 15                 2
15 – 20                 4
20 – 25                 4
25 – 30                 2
30 – 35                 2
35 – 40                 1
40 – 45                 1
45 – 50                 0

Make a histogram representing the above data.

Solution: Since we are given the frequency distribution directly in this case, constructing the graphical representation is relatively easy. If we were given only the raw data set, we would first need to build a frequency distribution table like the one above before we could construct the histogram. The graph for this data would look as follows:




Note that the width of each rectangle is the same. That is because the class intervals given to us in the data have equal widths. The heights of the rectangles are proportional to the corresponding frequencies.
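To make the two steps concrete (tabulating raw data, then drawing bars), here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `frequency_table` and `render_histogram` are made up for this example, the 17 raw readings are invented so that they reproduce the table above (the original raw data is not given), and a value on a class boundary, such as 10, is assumed to belong to the higher class (10 – 15).

```python
def frequency_table(values, start, width, n_classes):
    """Count values into n_classes equal-width classes [lo, lo + width)."""
    freqs = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} falls outside the classes")
        freqs[index] += 1
    return freqs


def render_histogram(start, width, freqs, symbol="#"):
    """Return one text bar per class; bar length equals the class frequency."""
    lines = []
    for i, f in enumerate(freqs):
        lo = start + i * width
        label = f"{lo:>2} - {lo + width:<2}"
        lines.append(f"{label} | {symbol * f} ({f})")
    return "\n".join(lines)


# Hypothetical raw readings, invented only to match the article's table.
readings = [7, 11, 14, 16, 17, 18, 19, 21, 22, 23, 24, 26, 28, 31, 33, 37, 42]

freqs = frequency_table(readings, start=0, width=5, n_classes=10)
print(freqs)  # [0, 1, 2, 4, 4, 2, 2, 1, 1, 0]
print(render_histogram(0, 5, freqs))
```

Every bar covers the same class width (one row per class), and only the bar length changes with the frequency, which is the property described above, turned on its side.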

Histograms are extremely useful tools for the graphical representation of data, especially when our target viewers are laypeople and not statisticians. A histogram has a greater impact than the tabular form of a frequency distribution and is a particularly convenient way of presenting one. It gives the viewer a gist of the underlying frequency curve of the variable under study. Some statistical measures can also be found (or calculated) using a histogram. Because it is a diagram, it also makes it easier to compare the frequencies of different classes.

Histogram analysis can be done by visual inspection. Let us take a look at the following histogram as an example.


 

The above picture represents the heights of 30 people. We can see from the picture that the largest group of these 30 persons falls between the heights of 149.5 and 159.5 cm. Thus we can say that the modal class of the given data is 149.5 to 159.5 cm, with a frequency of 9. (Recall that earlier in this article I said that we could find some statistical measures from such a graphical representation of data. The mode, which we just found, is one of them.) We can also see that the minimum frequency is 1. That means there is only one person whose height is between 189.5 and 199.5 cm. We know that the total number of people is 30, so the median lies between the heights of the 15th and 16th persons. In this case, the first bar has 6 persons and the second bar has 9 persons, and 6 + 9 = 15. The 15th person therefore falls at the top of the second class and the 16th at the bottom of the third, so the median height is approximately the upper boundary of the second bar, which is 159.5 cm.
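The same reading can be done by calculation. The sketch below uses the usual interpolation formula for the median of grouped data, median = L + (n/2 − cf) / f × h, where L is the lower boundary of the median class, cf the cumulative frequency before it, f its frequency and h the class width. This formula is not named in the article; it is brought in here because it gives the same answer as the counting argument above. The function names are made up for the example. Since the frequencies of the middle bars of the height histogram are not stated, the functions are run on the miles-per-gallon table from the sample problem, and the height median is checked with the figures that are stated (30 people, 6 in the first bar, 9 in the second).

```python
def modal_classes(start, width, freqs):
    """Return every class (lo, hi) whose frequency equals the maximum."""
    peak = max(freqs)
    return [
        (start + i * width, start + (i + 1) * width)
        for i, f in enumerate(freqs)
        if f == peak
    ]


def grouped_median(start, width, freqs):
    """Interpolated median of equal-width grouped data."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the frequency table is empty")
    half = n / 2
    cumulative = 0  # total frequency of the classes before the current one
    for i, f in enumerate(freqs):
        if f > 0 and cumulative + f >= half:
            lower = start + i * width
            return lower + (half - cumulative) / f * width
        cumulative += f


# Miles-per-gallon table from the sample problem (classes 0-5, 5-10, ..., 45-50).
mpg_freqs = [0, 1, 2, 4, 4, 2, 2, 1, 1, 0]
print(modal_classes(0, 5, mpg_freqs))   # [(15, 20), (20, 25)]
print(grouped_median(0, 5, mpg_freqs))  # 21.875

# Height example: median class 149.5 to 159.5 cm, width 10, frequency 9,
# with 6 people in the class before it and 30 people in total.
heights_median = 149.5 + (30 / 2 - 6) / 9 * 10
print(heights_median)  # 159.5
```

Note that the miles-per-gallon data has two modal classes, because the 15 – 20 and 20 – 25 bars are equally tall; a histogram makes that kind of tie visible at a glance.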

This and a lot more information can be obtained from a histogram. Histograms are among the commonly used tools in corporate reports and government censuses.
