Statistical analysis

Many find the statistical analysis the most difficult part of research. It is also the most commonly criticised part of papers written by other clinicians.

Many useful books about statistics can be consulted; if in any doubt, a statistician will be pleased to assist with the analysis. Statisticians prefer to be consulted before the research is conducted rather than being presented with the results at the end of the trial; they often give helpful advice on study design.

The following terms are frequently used when summarising statistical data:

Mean: the result of dividing the total by the number of observations (the average).

Median: the middle value, with an equal number of observations above and below; used for numerical or ranked data.

Mode: the value observed with the highest frequency; used for nominal data.

Range: the spread from the smallest to the largest value, quoted either as the two extremes or as the difference between them.
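As a rough illustration of the four summary terms, the sketch below uses Python's standard `statistics` module. The observations are invented for the example and do not come from any study.

```python
from statistics import mean, median, mode

# Hypothetical observations (for example, length of stay in days).
observations = [2, 3, 3, 5, 7, 8, 12]

summary = {
    "mean": mean(observations),      # total divided by number of observations
    "median": median(observations),  # middle value of the sorted data
    "mode": mode(observations),      # most frequently observed value
    "range": (min(observations), max(observations)),  # smallest and largest
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean is larger than the median because the single high value of 12 pulls the average upwards, which is one reason the median is often preferred for skewed data.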

The most important decision for analysis is whether the results follow a normal distribution, in which case parametric tests can be used. Normally distributed results form a symmetrical, bell-shaped curve, and the mean, median and mode all lie at the same value. Some types of data, such as blood group (a nominal category) or numerical data that are markedly skewed, are not normally distributed and require other (nonparametric) methods of analysis.

Parametric tests

When the results are normally distributed, a t-test can be used to compare the outcome between intervention and control groups. Confidence intervals are the best guide to the range in which the true difference is likely to lie. A confidence interval for a difference that includes zero usually implies a lack of statistical significance.
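The link between the t-test and the confidence interval can be sketched as follows. This is a minimal example of one common form, the pooled-variance two-sample t-test, and it assumes both groups are normally distributed with similar variances. The data are invented, the function name `pooled_t_test` is hypothetical, and the critical value 2.101 is the two-sided 5% value for 18 degrees of freedom taken from standard t tables; a statistical package would normally look this up for you.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_test(group_a, group_b, t_critical):
    """Two-sample t statistic and confidence interval for the mean difference.

    Assumes equal variances in the two groups (pooled estimate).
    t_critical must match the degrees of freedom, len(a) + len(b) - 2.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    difference = mean(group_a) - mean(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    margin = t_critical * standard_error
    return {
        "difference": difference,
        "t": difference / standard_error,
        "df": df,
        "ci": (difference - margin, difference + margin),
    }

# Hypothetical outcome measurements, ten patients per group.
intervention = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2, 5.7, 5.9, 5.5]
control = [4.8, 4.6, 5.0, 5.2, 4.9, 4.7, 5.1, 4.5, 5.0, 4.8]

T_CRITICAL_DF18 = 2.101  # from t tables: two-sided 5% level, 18 df
result = pooled_t_test(intervention, control, T_CRITICAL_DF18)

low, high = result["ci"]
significant = abs(result["t"]) > T_CRITICAL_DF18
print(f"difference = {result['difference']:.2f}, 95% CI {low:.2f} to {high:.2f}")
print("CI excludes zero:", not (low <= 0 <= high), "| significant:", significant)
```

The two printed flags always agree: the 95% confidence interval excludes zero exactly when the t statistic exceeds the critical value, which is why an interval containing zero signals a lack of significance.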

Nonparametric tests

Statistical tests such as the chi-squared test (for counts in categories), the Wilcoxon signed-rank test (single sample or paired observations) and the Mann-Whitney U-test (compares two independent samples) can be used because they do not assume that the underlying population is normally distributed.
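To show what one of these tests actually computes, here is a minimal sketch of the chi-squared test for a 2 x 2 table of counts, without continuity correction. The counts are invented and the function name `chi_squared_2x2` is hypothetical. The P value uses the exact relationship between the chi-squared distribution with one degree of freedom and the complementary error function, so it is valid only for 2 x 2 tables.

```python
from math import erfc, sqrt

def chi_squared_2x2(table):
    """Chi-squared statistic and P value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows are treatment groups, columns are outcome yes/no.
counts = [[10, 20], [20, 10]]
statistic, p_value = chi_squared_2x2(counts)
print(f"chi-squared = {statistic:.2f}, P = {p_value:.4f}")
```

The test is unreliable when expected counts are small; in that situation a statistician would usually suggest an exact method instead.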

Scientists usually employ P values to describe the role of chance. A P value <0.05 is conventionally taken to indicate a statistically significant difference. It is important not to forget that P = 0.05 simply means that, if there were no true difference, a difference at least as large as the one observed would still arise by chance about one time in 20. If enough variables are examined in any study, significant differences will occur simply due to chance.
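The effect of examining many variables can be put into numbers. The sketch below assumes the tests are independent and that no true differences exist, so each test has a 5% chance of a false positive; the function name `chance_of_false_positive` is hypothetical.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests gives P < alpha
    when there is no true difference in any of them."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 5, 10, 20):
    probability = chance_of_false_positive(n_tests)
    print(f"{n_tests:2d} tests: {probability:.0%} chance of a spurious result")
```

With 20 independent comparisons, the chance of at least one spurious "significant" finding is roughly 64%.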

Statistics simply deal with the probability that observed differences between populations arose by chance. Clinical results should show clear differences. If statistical tests are needed to reveal a difference between results, that difference is unlikely to be of major clinical significance.

Computer software packages available

Statistical computer packages offer a quick way to analyse descriptive statistics such as mean, median and the range, as well as the most commonly used statistical tests such as the chi-squared test. Various packages are available commercially and are useful tools in data analysis.