Health Statistics & Information Systems
An in-depth, comprehensive masterclass covering sources of health data, presentation methods, statistical analysis, sampling, and significance testing.
Sources of Health Information & Health Information System
Health statistics form the empirical foundation of public health. To measure the health status of a community, plan healthcare delivery, and evaluate interventions, reliable data is paramount. The Health Information System (HIS) is a mechanism designed for the collection, processing, analysis, and transmission of information required for organizing and operating health services.
Primary Sources of Health Information
The primary sources of data include the Census (a complete population count conducted every ten years), Registration of Vital Events (births, deaths, marriages), and the Sample Registration System (SRS), which provides reliable estimates of birth and death rates at national and state levels. Other critical sources include epidemiological surveillance, hospital records, and disease registries.
Flowchart: Components of a Health Information System
50 In-depth Bullet Points on Sources & HIS
Tabulation, Graphic and Diagrammatic Presentation of Data
Once data is collected, it must be organized to be comprehensible. Tabulation is the systematic arrangement of data in rows and columns. Good tabulation is concise, self-explanatory, and follows strict rules regarding titles, stubs (row headings), and captions (column headings).
Graphic and Diagrammatic Methods
The choice of visual tool depends on whether the data is quantitative (continuous or discrete) or qualitative. For continuous quantitative data, Histograms, Frequency Polygons, and Ogives (cumulative frequency curves) are standard. For qualitative or discrete data, Bar Charts, Pie Charts, and Pictograms are preferred.
| Age Group (Years) | Frequency (Cases) | Cumulative Frequency (Cases) |
|---|---|---|
| 0 – 10 | 15 | 15 |
| 10 – 20 | 25 | 40 |
| 20 – 30 | 40 | 80 |
| 30 – 40 | 20 | 100 |
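The cumulative frequency column in the table above is a running total of the frequency column. A minimal Python sketch of that calculation follows; the variable names are illustrative, and the cumulative percentage column is an added convenience that does not appear in the table.

```python
from itertools import accumulate

# Class intervals and case counts copied from the table above.
age_groups = ["0-10", "10-20", "20-30", "30-40"]
frequencies = [15, 25, 40, 20]

# A running total of the frequencies gives the cumulative frequency column.
cumulative = list(accumulate(frequencies))

# Cumulative percentage (not in the table) is a common y-axis for an ogive.
total = cumulative[-1]
cumulative_pct = [100 * c / total for c in cumulative]

for group, f, cf, pct in zip(age_groups, frequencies, cumulative, cumulative_pct):
    print(f"{group:>6} | {f:>3} | {cf:>3} | {pct:5.1f}%")
```

Plotting the upper limit of each class against its cumulative frequency would give the points of an ogive.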
Simple Bar Chart vs Pie Chart
50 In-depth Bullet Points on Tabulation & Presentation
Statistical Methods: Central Tendency & Variability
To summarize large datasets, we use single numerical values that represent the dataset. These are the Measures of Central Tendency: the Mean (arithmetic average), the Median (middle value), and the Mode (most frequent value).
However, central tendency alone does not show how scattered the data is. For this, we use Measures of Variability (Dispersion). The most common are Range, Mean Deviation, Variance, and Standard Deviation (SD). The coefficient of variation (CV) expresses the SD as a percentage of the mean, which allows the relative variability of two distributions to be compared even when they are measured in different units.
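As a rough illustration of the three measures of central tendency and the range, the sketch below uses Python's standard `statistics` module. The blood pressure readings are hypothetical values invented for this example, chosen so that one extreme reading pulls the mean above the median.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [110, 120, 120, 125, 130, 135, 180]

mean = statistics.mean(readings)       # arithmetic average
median = statistics.median(readings)   # middle value of the sorted data
mode = statistics.mode(readings)       # most frequent value
value_range = max(readings) - min(readings)  # simplest measure of dispersion

print(f"Mean:   {mean:.1f}")
print(f"Median: {median}")
print(f"Mode:   {mode}")
print(f"Range:  {value_range}")
```

In this invented dataset the single reading of 180 raises the mean but leaves the median unchanged, which is why the median is often preferred for skewed data.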
Key Formulas
Mean (\(\bar{x}\)): $$ \bar{x} = \frac{\sum x}{n} $$
Standard Deviation (\(s\)): $$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}} $$
Coefficient of Variation (CV): $$ CV = \left(\frac{s}{\bar{x}}\right) \times 100 $$
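The formulas above can be traced step by step in code. The following is a minimal sketch, assuming the sample form of the standard deviation (divisor \(n-1\)) as written above; the function names and the height and weight values are hypothetical and serve only to show how the CV compares variability across different units.

```python
import math

def sample_sd(values):
    """Sample standard deviation using the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

def coefficient_of_variation(values):
    """SD expressed as a percentage of the mean."""
    mean = sum(values) / len(values)
    return sample_sd(values) / mean * 100

# Hypothetical measurements, invented for illustration.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

print(f"Height: SD = {sample_sd(heights_cm):.2f} cm, "
      f"CV = {coefficient_of_variation(heights_cm):.1f}%")
print(f"Weight: SD = {sample_sd(weights_kg):.2f} kg, "
      f"CV = {coefficient_of_variation(weights_kg):.1f}%")
```

Because the CV is unit-free, it shows that weight varies relatively more than height in this invented sample, a comparison the two SDs cannot make directly.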
Normal Distribution Curve
50 In-depth Bullet Points on Statistical Methods
Sampling Size, Survey, Significance, Correlation & Regression
Studying an entire population is often impractical, so a Sample is studied instead. Sample size determination depends on the desired confidence level, the acceptable margin of error, and the estimated prevalence of the condition. Sampling techniques are divided into Probability methods (Simple Random, Systematic, Stratified, Cluster) and Non-Probability methods (Convenience, Purposive, Quota, Snowball).
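The three inputs named above can be combined in a formula. The text does not state which one it has in mind, so the sketch below assumes the commonly used formula for estimating a single proportion, \(n = z^2 p(1-p) / d^2\), where \(z\) is the standard normal value for the confidence level, \(p\) the estimated prevalence, and \(d\) the margin of error. The function name and example inputs are hypothetical.

```python
import math

def sample_size_for_proportion(z, prevalence, margin_of_error):
    """Assumed formula: n = z^2 * p * (1 - p) / d^2, rounded up to a whole person."""
    n = (z ** 2) * prevalence * (1 - prevalence) / (margin_of_error ** 2)
    return math.ceil(n)

# Hypothetical inputs: 95% confidence (z = 1.96), 20% prevalence, 5% margin of error.
n = sample_size_for_proportion(z=1.96, prevalence=0.20, margin_of_error=0.05)
print(n)

# When prevalence is unknown, 0.5 gives the largest (most conservative) sample size.
print(sample_size_for_proportion(z=1.96, prevalence=0.50, margin_of_error=0.05))
```

This sketch ignores design effects, non-response, and finite population correction, all of which would change the figure in a real survey.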
Tests of Significance (such as the t-test, Chi-square test, and Z-test) help determine whether an observed difference is real or could plausibly have arisen by chance. Correlation measures the strength and direction of the linear relationship between two variables, while Regression predicts the value of one variable from another.
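To make these ideas concrete, the sketch below computes a Pearson correlation coefficient, a least-squares regression line, and a Chi-square statistic for a 2×2 table using plain Python. All data values and function names are hypothetical, and the choice of Pearson's r and simple linear regression is an assumption, since the text does not name specific methods.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def linear_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    grand_total = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Hypothetical data: age in years and systolic blood pressure in mmHg.
ages = [20, 30, 40, 50, 60]
systolic = [115, 120, 128, 135, 142]
r = pearson_r(ages, systolic)
slope, intercept = linear_regression(ages, systolic)
print(f"r = {r:.3f}; predicted BP = {intercept:.1f} + {slope:.2f} * age")

# Hypothetical 2x2 table: exposed (30 diseased, 70 healthy), unexposed (15, 85).
chi2 = chi_square_2x2(30, 70, 15, 85)
# 3.84 is the 5% critical value of Chi-square with 1 degree of freedom.
print(f"Chi-square = {chi2:.2f}; significant at 5%: {chi2 > 3.84}")
```

For real analyses, an established statistics package would normally be used so that p-values and confidence intervals are also reported.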
Sampling Methods Tree
- Probability Sampling
  - Simple Random
  - Systematic Random
  - Stratified Random
  - Cluster Sampling
- Non-Probability Sampling
  - Convenience
  - Purposive / Judgment
  - Quota
  - Snowball
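The four probability methods in the tree above can be sketched with Python's standard `random` module. The sampling frame of 100 households, the urban and rural strata, and all function names are hypothetical, chosen only to show how the methods differ in the way units are selected.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 100 households, the first 50 urban and the rest rural.
frame = [{"id": i, "stratum": "urban" if i < 50 else "rural"} for i in range(100)]

def simple_random_sample(units, n):
    """Every unit has an equal chance of selection."""
    return rng.sample(units, n)

def systematic_sample(units, n):
    """Pick a random start, then take every k-th unit."""
    k = len(units) // n              # sampling interval
    start = rng.randrange(k)         # random start within the first interval
    return [units[start + i * k] for i in range(n)]

def stratified_sample(units, n_per_stratum):
    """Draw a simple random sample separately within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit["stratum"], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(units, cluster_size, n_clusters):
    """Select whole clusters at random and include every unit in them."""
    clusters = [units[i:i + cluster_size] for i in range(0, len(units), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return [unit for cluster in chosen for unit in cluster]

print([u["id"] for u in simple_random_sample(frame, 10)])
print([u["id"] for u in systematic_sample(frame, 10)])
print([u["id"] for u in stratified_sample(frame, 5)])
print(len(cluster_sample(frame, cluster_size=10, n_clusters=2)))
```

The non-probability methods are not sketched because selection there depends on the investigator's access or judgment, not on a random mechanism.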
50 In-depth Bullet Points on Sampling, Testing & Regression
Comprehensive 100 MCQs Assessment