Clear-Sighted Statistics: An OER Textbook
Appendix 3: Common Statistical Symbols and Formulas
I. Introduction
This appendix lists common statistical symbols and formulas. The terms and formulas presented here are explained in the appropriate modules of Clear-Sighted Statistics.
II. Common Statistical Symbols and Formulas
A. Module 4: Picturing Data with Tables and Charts
Symbol/Formula | Description |
2 to the k formula | 2^k > n. Formula used to determine the number of categories, classes, buckets, or bins in a Frequency Distribution |
Class Interval or Width, i | i ≥ (H – L) / k |
Class Midpoint | |
f | Frequency or the number of observations |
H | The highest value in a distribution |
k | Number of categories, classes, buckets, or bins in a Frequency Distribution |
L | The smallest value in a distribution |
N | Number of observations, or items, in a population |
n | Number of observations, or items, in a sample |
% or RF | Relative frequency or the proportion of the total number of observations |
Table 1: Module 4 Symbols and Formulas
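The 2 to the k rule can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own procedure: the function names and the data values are hypothetical, and the class width is assumed to follow the common rule i ≥ (H – L) / k.

```python
import math

def number_of_classes(n: int) -> int:
    """Return the smallest k for which 2**k > n (the "2 to the k" rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def class_interval(highest: float, lowest: float, k: int) -> float:
    """Minimum class width, assuming the rule i >= (H - L) / k."""
    return (highest - lowest) / k

n = 80                        # hypothetical number of observations
k = number_of_classes(n)      # 2**6 = 64 is not greater than 80, but 2**7 = 128 is
i = class_interval(highest=98, lowest=12, k=k)   # hypothetical H and L
print(k, math.ceil(i))        # in practice the width is rounded up to a convenient value
```

Rounding the width up, rather than to the nearest value, keeps all observations inside the k classes.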
B. Module 5: Statistical Measures
Symbol/Formula | Description |
Coefficient of Variation as a percentage | |
Coefficient of Variation as an index | |
D | Stands for Decile. Deciles divide a distribution into ten groups of equal frequency |
Interquartile Range (IQR) | IQR = Q3 – Q1 |
Location of a Decile | |
Lower Outlier (Extreme Lower Outlier) | |
Location of a Percentile | |
Location of a Quartile | |
X̅ (The Sample Mean, X-Bar) | X̅ = ΣX / n |
μ (The Population Mean, mu) | μ = ΣX / N |
X̅W (Weighted Mean) | X̅W = Σ(wX) / Σw |
Sample Mean, Grouped Data | |
Mean Deviation (MD) or Mean Absolute Deviation (MAD) | |
Median | M or Med or |
Mode | Mo |
P | Stands for Percentile. P75 or P₇₅ means the 75th percentile. Percentiles divide a distribution into a hundred groups of equal frequency. |
Range | Range = H (highest value) – L (lowest value) |
Q | Stands for Quartile: Q1 (1st Quartile), Q2 (2nd Quartile), Q3 (3rd Quartile) and Q4 (4th Quartile). Quartiles divide a distribution into four groups of equal frequency. |
Pearson’s Coefficient of Skewness | |
Σ | Capital Greek letter sigma. It denotes the operation of summation or addition |
σ (Population Standard Deviation, sigma) | σ = √[Σ(X – μ)² / N] |
s (Sample Standard Deviation, s) | s = √[Σ(X – X̅)² / (n – 1)] |
σ² (Population Variance, sigma-squared) | σ² = Σ(X – μ)² / N |
s² (Sample Variance, s-squared) | s² = Σ(X – X̅)² / (n – 1) |
Sample Standard Deviation, Grouped Data | |
Skewness or SK | |
Trimean | |
Upper Outlier (Extreme Upper Outlier) | |
X | X stands for the random variable |
Table 2: Module 5 Descriptive Statistics Measures
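Several of the measures in Table 2 can be computed with Python's standard library. The sketch below uses a small hypothetical sample. It assumes the usual textbook definitions (sample standard deviation with n – 1 in the denominator, population standard deviation with N) and relies on the default "exclusive" method of statistics.quantiles, which places a quartile at position (n + 1) times the quartile number divided by 4. Other quartile conventions give slightly different values.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical sample

mean = statistics.mean(data)              # X-bar = sum of X divided by n
median = statistics.median(data)
data_range = max(data) - min(data)        # Range = H - L

# Mean absolute deviation: average distance of each value from the mean
mad = sum(abs(x - mean) for x in data) / len(data)

s = statistics.stdev(data)                # sample standard deviation (n - 1)
sigma = statistics.pstdev(data)           # population standard deviation (N)
cv_percent = s / mean * 100               # coefficient of variation as a percentage

# Quartiles with the default "exclusive" method, then the interquartile range
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(mean, median, data_range, mad, round(s, 3), sigma, iqr)
```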
C. Module 6: Index Numbers
Symbol/Formula | Description |
Fisher’s Ideal Index | |
Laspeyres Index | |
Paasche Index | |
Simple Aggregate Price Index | |
Simple Index Number | |
Simple Price Index | |
Value Index |
Table 3: Module 6: Index Numbers
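The weighted price indexes in Table 3 can be illustrated with a short Python sketch. The prices and quantities are hypothetical, and the formulas are the standard textbook definitions, assumed here because the formula cells are blank in this copy: Laspeyres weights prices by base-period quantities, Paasche weights them by current-period quantities, and Fisher's Ideal Index is the geometric mean of the two.

```python
import math

# Hypothetical prices (p) and quantities (q) for two items
p0, q0 = [1.00, 2.00], [10, 5]    # base period
pt, qt = [2.00, 3.00], [8, 6]     # current period

def weighted_sum(prices, quantities):
    """Sum of price times quantity across all items."""
    return sum(p * q for p, q in zip(prices, quantities))

# Laspeyres: base-period quantities as weights
laspeyres = weighted_sum(pt, q0) / weighted_sum(p0, q0) * 100

# Paasche: current-period quantities as weights
paasche = weighted_sum(pt, qt) / weighted_sum(p0, qt) * 100

# Fisher's Ideal Index: geometric mean of Laspeyres and Paasche
fisher = math.sqrt(laspeyres * paasche)

print(round(laspeyres, 1), round(paasche, 1), round(fisher, 1))
```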
D. Module 7: Basic Concepts of Probability
Symbol/Formula | Description |
Bayes Theorem | |
Combinations | nCr = n! / [r!(n – r)!] |
nCr | nCr is pronounced “the combination of r things selected from n things.” Note: With combinations, the order of selection does not matter. |
Complement Rule (Subtraction Rule) | |
Factorial Number | n! (The factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n: n! = n × (n – 1) × (n – 2) × … × 2 × 1. By definition, 0! = 1.) |
General Rule of Addition (for non-mutually exclusive events) | P(A or B) = P(A) + P(B) – P(A and B) or P(A ∪ B) = P(A) + P(B) – P(A ∩ B). Note: ∩ is pronounced “intersection.” It is equivalent to the word “and.” |
General Rule of Multiplication (for dependent events) | P(A and B) = P(A)P(B|A) or P(A ∩ B) = P(A)P(B|A) |
Multiplication Formula | |
Permutations | nPr = n! / (n – r)! |
nPr | nPr is pronounced “the permutation of r things selected from n things.” Note: With permutations, the order of selection matters. |
P(A) | The probability of event “A” |
P(~A) | The probability of the event not A. This is called the complement of event A. It is sometimes written as P(Aᶜ) or P(not A). |
P(A|B) | The probability of event A given that event B has happened. This is called conditional probability. |
Special Rule of Addition (for mutually exclusive events) | P(A or B) = P(A) + P(B) or P(A ∪ B) = P(A) + P(B). Note: ∪ is pronounced “union” and is equivalent to the word “or.” |
Special Rule of Multiplication (for independent events) | P(A and B) = P(A)P(B) or P(A ∩ B) = P(A)P(B) |
Table 4: Module 7: Basic Concepts of Probability
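The counting rules and the General Rule of Addition in Table 4 can be checked with a few lines of Python. This is an illustrative sketch with hypothetical values. math.comb and math.perm are standard-library functions (Python 3.8 and later), and the card example assumes a standard 52-card deck.

```python
import math
from fractions import Fraction

n, r = 5, 2
combinations = math.comb(n, r)    # n! / (r!(n - r)!): order does not matter
permutations = math.perm(n, r)    # n! / (n - r)!: order matters

# Each combination of r items can be arranged in r! different orders
assert permutations == combinations * math.factorial(r)

# General Rule of Addition: one card drawn from a standard 52-card deck
p_king = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_and_heart = Fraction(1, 52)            # the king of hearts
p_king_or_heart = p_king + p_heart - p_king_and_heart

# Complement Rule: P(not A) = 1 - P(A)
p_not_king = 1 - p_king

print(combinations, permutations, p_king_or_heart, p_not_king)
```

Subtracting P(A and B) prevents the king of hearts from being counted twice.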
E. Module 8: Discrete Probability Distributions
1) Mean of a Probability Distribution, μ
μ = Σ[xP(x)], found by multiplying each value of x by its probability and then summing the products.
2) Variance of a Probability Distribution, σ²
σ² = Σ[(x – μ)²P(x)], found by: 1) subtracting the mean from each value of x, 2) squaring each difference, (x – μ), 3) multiplying each squared difference by its probability, and 4) summing the resulting values to arrive at σ².
3) Standard Deviation of a Probability Distribution, σ
σ = √σ²; the standard deviation is the positive square root of the variance.
4) Binomial Probability Formula
P(x) = nCx π^x (1 – π)^(n – x), where C denotes combinations, n is the number of trials, x is the number of successful trials, and π is the probability of a success on each trial. Note: π, or pi, is not the mathematical constant 3.14159 that you used in your geometry class to find the circumference of a circle.
5) Mean of a Binomial Distribution
μ = nπ
6) Variance of a Binomial Distribution
σ² = nπ(1 – π)
7) Hypergeometric Distribution
P(x) = [(SCx)((N – S)C(n – x))] / (NCn), where N is the size of the population; S is the number of successes in the population; x is the number of successes in the sample (it could be 0, 1, 2, 3, 4, …); n is the size of the sample (number of trials); and C denotes combinations.
8) Poisson Distribution
P(x) = (μ^x e^(–μ)) / x!, where μ is the mean number of successes in a particular interval; e is the constant 2.71828, the base of the Napierian (natural) logarithmic system; x is the number of successes; and P(x) is the probability of a specified value of x.
9) Mean of a Poisson Distribution
μ = nπ
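The formulas in this module can be sketched in Python using only the standard library. The distribution values and parameters below are hypothetical, and the function names are illustrative. The hypergeometric and Poisson functions assume the standard forms of those formulas, stated in the comments.

```python
import math

# 1) to 3): mean, variance, and standard deviation of a discrete distribution
x_values = [0, 1, 2, 3]
probabilities = [0.1, 0.3, 0.4, 0.2]          # must sum to 1
mu = sum(x * p for x, p in zip(x_values, probabilities))
variance = sum((x - mu) ** 2 * p for x, p in zip(x_values, probabilities))
sigma = math.sqrt(variance)

# 4) to 6): binomial probability, mean, and variance
def binomial_probability(x: int, n: int, pi: float) -> float:
    """P(x) = nCx * pi**x * (1 - pi)**(n - x)."""
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)

n_trials, pi = 5, 0.2
p_binomial = binomial_probability(2, n_trials, pi)
binomial_mean = n_trials * pi
binomial_variance = n_trials * pi * (1 - pi)

# 7): hypergeometric probability (sampling without replacement)
def hypergeometric_probability(x: int, N: int, S: int, n: int) -> float:
    """P(x) = [SCx * (N - S)C(n - x)] / NCn."""
    return math.comb(S, x) * math.comb(N - S, n - x) / math.comb(N, n)

p_hyper = hypergeometric_probability(4, N=50, S=40, n=5)

# 8): Poisson probability
def poisson_probability(x: int, mean: float) -> float:
    """P(x) = mean**x * e**(-mean) / x!."""
    return mean ** x * math.exp(-mean) / math.factorial(x)

p_poisson = poisson_probability(0, mean=0.3)

print(round(mu, 2), round(variance, 2), round(sigma, 2))
print(round(p_binomial, 4), round(p_hyper, 4), round(p_poisson, 4))
```

A quick sanity check for any of these functions is that the probabilities over all possible values of x sum to 1.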
F. Module 9: Continuous Probability Distributions
Symbol/Formula | Description |
Solving for X | X = μ + zσ Note: z can be either a positive or negative number. |
Standard Error for the Mean, sigma sub x-bar or SEM | |
Standard Normal Value | z = (X – μ) / σ |
z-value, μ and σ known | |
Table 5: Module 9: Continuous Probability Distributions
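The relationship between X and z in Table 5 can be sketched with Python's statistics.NormalDist. The mean, standard deviation, and observed value are hypothetical, and the z formula is the rearrangement of X = μ + zσ.

```python
from statistics import NormalDist

mu, sigma = 100, 15          # hypothetical population mean and standard deviation

def z_value(x: float, mu: float, sigma: float) -> float:
    """Standard normal value: z = (X - mu) / sigma."""
    return (x - mu) / sigma

def solve_for_x(z: float, mu: float, sigma: float) -> float:
    """Inverse relationship: X = mu + z * sigma."""
    return mu + z * sigma

z = z_value(130, mu, sigma)              # how many standard deviations above the mean
x_back = solve_for_x(z, mu, sigma)       # recovers the original value
area_below = NormalDist().cdf(z)         # area under the standard normal curve below z

print(z, x_back, round(area_below, 4))
```

A negative z works the same way and returns a value of X below the mean.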
G. Module 10: Sampling and Sampling Errors
Symbol/Formula | Description |
Mean of the Sample Means (mu sub x-bar) | |
Sampling Error | X̅ – μ ≠ 0 or X̅ ≠ μ |
Standard Error of the Mean, SEM, or σX̅ | |
z-value for sample | |
Table 6: Module 10: Sampling and Sampling Errors
H. Module 11: Confidence Intervals
Symbol/Formula | Description |
c | The selected confidence level; usually 95%, but in some cases 99% or 90%. |
Confidence Interval for Means using t | |
Confidence Interval for Means using z | |
Confidence Interval for Proportions | |
Critical Value | The value a test statistic must exceed to fall outside the confidence interval, or the value a test statistic must exceed to reject the Null Hypothesis. A test statistic is a value derived from a sample for the purposes of hypothesis testing and confidence intervals. Do not report the Critical Value as CV. CV is the Coefficient of Variation. |
d.f., df, or ν (the lower-case or small Greek letter nu) | Degrees of freedom. Note: The formula for degrees of freedom depends on the type of distribution used. |
Margin of Error for the Mean using t | |
Margin of Error for the Mean using z | |
Population Proportion | Population Proportion = π. Some use a capital P to symbolize the Population Proportion. In Clear-Sighted Statistics, population parameters are always symbolized with Greek letters. |
Sample Proportion, p | Sample Proportion = p (a lower-case p). A commonly used symbol for the sample proportion is p-hat, p̂. |
Sample Proportion formula | |
Standard Error of the Mean (σX̅ or SEM) | |
Standard Error for the Proportion (σp or SEP) | |
Table 7: Module 11: Confidence Intervals
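As an illustration of a confidence interval for the mean using z, the sketch below assumes the standard form X̅ ± z(σ / √n), where σ / √n is the Standard Error of the Mean. The function name and the sample values are hypothetical. The critical value comes from statistics.NormalDist in the standard library.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) limits of X-bar +/- z * sigma / sqrt(n).

    Assumes the population standard deviation, sigma, is known.
    """
    # Two-tailed critical value, about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sigma / math.sqrt(n)
    margin_of_error = z * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Hypothetical sample: mean of 50, known sigma of 10, n = 100
lower, upper = z_confidence_interval(sample_mean=50, sigma=10, n=100)
print(round(lower, 2), round(upper, 2))
```

When σ is unknown, the same structure applies with s in place of σ and a t critical value in place of z. The standard library has no t distribution, so that case is not shown here.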
I. Module 12: Estimating Sample Size
Symbol/Formula | Description |
Estimating Sample Size for the Mean | |
Estimating Sample Size for the Proportion | |
Table 8: Module 12: Estimating Sample Size
J. Module 13: Introduction to Null Hypothesis Significance Testing
Symbol/Formula | Description |
α (alpha) | The level of significance. The level of significance is selected by the researcher or analyst. Alpha is also the likelihood of a Type I Error. |
P(α) | The probability of a Type I Error, or rejecting a Null Hypothesis when we should fail to reject it. |
β (beta) | A Type II Error or failing to reject a Null Hypothesis that should be rejected. |
H0 | The Null Hypothesis. H0 is pronounced “H sub-zero” or “H sub naught.” H0 is a hypothesis about a population parameter. The Null Hypothesis states that there is no effect. Any difference between the parameter and the statistic is due to sampling error. |
H1 or HA | The Alternate Hypothesis, sometimes called the Research Hypothesis. The Alternate Hypothesis is pronounced “H sub-one” when the H1 symbol is used or “H sub-A” when the HA symbol is used. Like the Null Hypothesis, the Alternate Hypothesis is a statement about a population parameter. The Alternate Hypothesis states that there is an effect, which means the difference between the parameter and statistic is too big to have occurred by chance. |
p-value | The p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the one obtained, assuming the Null Hypothesis is true. If the p-value is greater than the level of significance, fail to reject the Null Hypothesis. When the p-value is equal to or less than the level of significance, reject the Null Hypothesis. |
Statistical Power | 1 – β |
Table 9: Module 13: Introduction to Null Hypothesis Significance Testing
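The p-value decision rule in Table 9 can be expressed as a short Python sketch. The function names and test statistics are hypothetical, and the p-value calculation assumes a two-tailed test with a z test statistic.

```python
from statistics import NormalDist

def two_tailed_p_value(z: float) -> float:
    """Probability of a z statistic at least this extreme, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decision(p_value: float, alpha: float = 0.05) -> str:
    """Reject H0 when the p-value is equal to or less than alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

for z in (2.5, 1.0):                      # hypothetical test statistics
    p = two_tailed_p_value(z)
    print(z, round(p, 4), decision(p))
```

The level of significance, alpha, is chosen before the test is run. The p-value is then compared against it.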
K. Module 14: One-Sample Tests of Hypothesis (Normal and Student t Distributions)
Symbol/Formula | Description |
Cohen’s d Effect Size | |
delta, δ, for the mean | |
delta, δ, for the proportion | |
One-Sample test for the Mean when σ is known | |
One-Sample test for the Mean when σ is unknown | |
One-Sample test for the Proportion | |
Probability of a Type II Error P(β) | |
Statistical Power |
Table 10: Module 14: One-Sample Tests of Hypothesis
L. Module 15: Two-Sample Tests of Hypothesis (Normal and Student t Distributions)
Symbol/Formula | Description |
Cohen’s d | |
Cohen’s h (ES for proportions) | |
d | The difference between paired or dependent samples. |
d̅ | The mean of the difference between paired or dependent samples. |
df for Unequal Variance t-test | |
F-Test for comparing two sample variances | |
Paired t-test for dependent samples | |
Pooled Proportion | |
Pooled Variance | |
Pooled Variance t-test for Means (equal Variance) | |
Pooled Standard Deviations | |
Two-sample z-test of Means | |
Two-sample z-test of Proportions | |
Two-Sample t-test for Means (Unequal Variance) | |
Variance of the Distribution of differences in Means |
Table 11: Module 15: Two-Sample Tests of Hypothesis
M. Module 16: ANOVA
Symbol/Formula | Description |
Confidence Interval for difference in Treatment Means | |
Eta-squared, η², Effect Size | |
Sum of Squares, error | |
Sum of Squares, total | |
Sum of Squares, treatment | |
Table 12: Module 16 ANOVA
N. Module 17: Chi-Square Tests
Symbol/Formula | Description |
Chi-Square (χ²) Goodness of Fit Test | χ² = Σ[(fo – fe)² / fe], where fo stands for the Observed Frequencies for each category and fe stands for the Expected Frequencies for each category. |
Chi-Square Expected Frequency for a Contingency Table | |
Cohen’s w Effect Size | |
Degrees of Freedom for a Goodness of Fit | df = k – 1 |
Degrees of Freedom for a contingency table | df = (number of rows – 1)(number of columns – 1) |
Degrees of Freedom for a Goodness of Fit test for Normality | df = k – 3 (Two additional degrees of freedom are lost because the sample mean and sample standard deviation are used to estimate the population parameters.) |
Table 13: Module 17 Chi-Square Tests
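The Chi-Square Goodness of Fit statistic can be computed directly from observed and expected frequencies. The frequencies below are hypothetical. The sketch assumes the standard form χ² = Σ[(fo – fe)² / fe] with df = k – 1.

```python
observed = [20, 30, 50]      # fo: hypothetical observed frequencies
expected = [25, 25, 50]      # fe: expected frequencies under the Null Hypothesis

# Observed and expected totals must match for a goodness of fit test
assert sum(observed) == sum(expected)

# Chi-square: sum of squared differences, each scaled by its expected frequency
chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

k = len(observed)            # number of categories
df = k - 1                   # degrees of freedom for a goodness of fit test

print(chi_square, df)
```

The computed statistic is then compared with the critical value for the chosen level of significance and df, for example from Excel's CHISQ.INV.RT function.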
O. Module 18: Linear Correlation and Regression
Symbol/Formula | Description |
Coefficient of Correlation or r | |
Coefficient of Determination or r² | or |
Confidence Interval | |
Intercept of the Regression Line | |
Linear Regression Equation (y-hat) | |
Prediction Interval | |
ρ, or the lower-case Greek letter rho | |
Slope of the Regression Line | |
Standard Error of the Estimate | |
Test for Zero Slope | |
Test statistic for the significance of r | |
Table 14: Module 18 Linear Correlation and Regression
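The core correlation and regression quantities in Table 14 can be sketched from sums of squared deviations. The paired data are hypothetical, and the formulas are the standard least-squares definitions, assumed here because the formula cells are blank in this copy. The variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]          # hypothetical independent variable
y = [2, 4, 5, 4, 5]          # hypothetical dependent variable

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and cross-products
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)

r = ss_xy / math.sqrt(ss_xx * ss_yy)     # coefficient of correlation
r_squared = r ** 2                       # coefficient of determination
slope = ss_xy / ss_xx                    # slope of the regression line
intercept = y_bar - slope * x_bar        # intercept of the regression line

def predict(x_new: float) -> float:
    """Linear regression equation: y-hat = intercept + slope * x."""
    return intercept + slope * x_new

print(round(r, 4), round(r_squared, 2), round(slope, 2), round(intercept, 2))
```

Excel's CORREL, RSQ, SLOPE, and INTERCEPT functions return the same four quantities.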
P. Microsoft Excel Statistical Functions
Analysis ToolPak
Anova: Single Factor
Correlation
Descriptive Statistics
F-Test Two-Sample for Variances
Histogram
Moving Average
Rank and Percentile
Regression
Sampling
t-Test: Paired Two Sample for Means
t-Test: Two-Sample Assuming Equal Variances
t-Test: Two-Sample Assuming Unequal Variances
z-Test: Two Sample for Means
Math Functions
ABS, POWER, ROUND, ROUNDDOWN, ROUNDUP, SQRT, SUM, SUMIF, SUMIFS, SUMPRODUCT, SUMSQ
Frequency Distribution Functions:
FREQUENCY
Descriptive Statistics Functions:
AVEDEV, AVERAGE, AVERAGEA, AVERAGEIF, AVERAGEIFS, COUNT, COUNTA, COUNTBLANK, COUNTIF, COUNTIFS, FREQUENCY, GEOMEAN, HARMEAN, KURT, LARGE, MAX, MAXA, MAXIFS, MEDIAN, MIN, MINA, MINIFS, MODE.MULT, MODE.SNGL, PERCENTILE.EXC, PERCENTILE.INC, PERCENTRANK.EXC, PERCENTRANK.INC, QUARTILE.EXC, QUARTILE.INC, RANK.AVG, RANK.EQ, SKEW, SKEW.P, SMALL, STDEV.P, STDEVA, STDEVPA, STDEV.S, VAR.P, VAR.S
Probability Functions:
COMBIN, FACT, PERMUT, PROB
Binomial Distribution Functions:
BINOM.DIST, BINOM.INV
Exponential Distribution Functions:
EXPON.DIST
Hypergeometric Distribution Functions:
HYPGEOM.DIST
Poisson Distribution Functions:
POISSON.DIST
Normal Distribution Functions:
NORM.DIST, NORM.INV, NORM.S.DIST, NORM.S.INV, STANDARDIZE
t Distribution Functions:
T.DIST, T.DIST.2T, T.DIST.RT, T.INV, T.INV.2T, and T.TEST
Confidence Interval Functions:
CONFIDENCE.NORM and CONFIDENCE.T
F Distribution Functions:
F.DIST, F.DIST.RT, F.INV, F.INV.RT
Chi-Square Functions:
CHISQ.DIST, CHISQ.DIST.RT, CHISQ.INV, and CHISQ.INV.RT
Correlation and Regression Functions:
CORREL, DEVSQ, INTERCEPT, LINEST, PEARSON, RSQ, SLOPE, STEYX, TREND
Q. Greek Letters Commonly Used in Statistics
Greek Letter | Upper Case | Lower Case | Common Use in Statistics |
Alpha | Α | α | α = level of significance, Type I Error |
Beta | Β | β | β = Type II Error; 1 – β = Power of the test |
Gamma | Γ | γ | |
Delta | Δ | δ | |
Epsilon | Ε | ε | |
Zeta | Ζ | ζ | |
Eta | Η | η | η² = Eta-Squared Effect Size |
Theta | Θ | θ | |
Iota | Ι | ι | |
Kappa | Κ | κ | |
Lambda | Λ | λ | |
Mu | Μ | μ | μ = population mean |
Nu | Ν | ν | ν = degrees of freedom (df) |
Xi | Ξ | ξ | |
Omicron | Ο | ο | |
Pi | Π | π | π = population proportion |
Rho | Ρ | ρ | ρ = linear correlation of a population |
Sigma | Σ | σ | Σ = “Sum of” or summation; σ² = population variance; σ = population standard deviation |
Tau | Τ | τ | |
Upsilon | Υ | υ | |
Phi | Φ | φ | |
Chi | Χ | χ | χ² = Chi-Square statistic |
Psi | Ψ | ψ | |
Omega | Ω | ω |
Except where otherwise noted, Clear-Sighted Statistics is licensed under a
Creative Commons License. You are free to share derivatives of this work for
non-commercial purposes only. Please attribute this work to Edward Volchok.