Notation and symbology


To clarify the expressions used here and elsewhere in the text, we use the notation shown in the table below. Italics are used within the text and formulas to denote variables and parameters. Typically in statistical literature, the Roman alphabet is used to denote sample variables and sample statistics, whilst Greek letters are used to denote population measures and parameters. A more broad-ranging table of mathematical and statistical notation is provided on Wikipedia (see References).

Notation used in this Handbook

Item

Description

[a,b]

A closed interval of the Real line, for example [0,1] means the infinite set of all values between 0 and 1, including 0 and 1

(a,b)

An open interval of the Real line, for example (0,1) means the infinite set of all values between 0 and 1, NOT including 0 and 1. This should not be confused with the notation for coordinate pairs, (x,y), or its use within bivariate functions such as f(x,y) - the meaning should be clear from the context

{xi}

A set of n values x1, x2, x3, … xn, typically continuous interval- or ratio-scaled variables in the range (‑∞,∞) or [0,∞). The values may represent measurements or attributes of distinct objects, or values that represent a collection of objects (for example the population of a census tract)

{Xi}

An ordered set of n values X1, X2, X3, … Xn, such that Xi ≤ Xi+1 for all i
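As a brief illustration of the ordered set, the sketch below sorts a small dataset and checks the ordering condition. It is written in Python purely for illustration; the data values and the names x, X and is_ordered are hypothetical, and non-decreasing order (ties allowed) is assumed.

```python
# Hypothetical unordered dataset {x_i}
x = [4.2, 1.0, 3.3, 1.0]

# Ordered set {X_i}: the same values arranged in non-decreasing order
X = sorted(x)

# Check the ordering condition for every adjacent pair (assumes ties are allowed)
is_ordered = all(X[i] <= X[i + 1] for i in range(len(X) - 1))

print(X, is_ordered)
```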

X,x

The use of bold symbols in expressions indicates matrices (upper case) and vectors (lower case)

{fi}

A set of k frequencies (k ≤ n), derived from a dataset {xi}. If {xi} contains discrete values, some of which occur multiple times, then {fi} represents the number of occurrences or the count of each distinct value. {fi} may also represent the number of occurrences of values that lie in a range or set of ranges, {ri}. If a dataset contains n values, then the sum ∑fi=n. The set {fi} can also be written f(xi). If {fi} is regarded as a set of weights (for example attribute values) associated with the {xi}, it may be written as the set {wi} or w(xi). If a set of frequencies, {fi}, has been standardized by dividing each value fi by their sum, ∑fi, then {fi} may be regarded as a set of estimated probabilities and ∑fi=1
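The relationship between a dataset, its frequencies and the standardized frequencies can be sketched as follows. This is a minimal Python illustration, not part of the Handbook's own material; the data values and the names x, f, n, k and p are hypothetical.

```python
from collections import Counter

# Hypothetical dataset {x_i} of n = 8 discrete values
x = [2, 3, 3, 5, 5, 5, 7, 7]

# {f_i}: the count of each distinct value in the dataset
f = Counter(x)
n = len(x)   # number of values in the dataset
k = len(f)   # number of distinct values, so k <= n

# Standardize by dividing each f_i by the sum of the frequencies
p = {value: count / n for value, count in f.items()}

print(dict(f))  # frequencies, which sum to n
print(p)        # estimated probabilities, which sum to 1
```

Grouping values into ranges {ri} rather than distinct values would follow the same pattern, with the counts taken per range.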

∑

Summation symbol, e.g. x1+x2+x3+… +xn. If no limits are shown the sum is assumed to apply to all subsequent elements; otherwise upper and/or lower limits for summation are provided

∩

Set intersection. The notation P(A∩B) is used to indicate the probability of A and B

∪

Set union. The notation P(A∪B) is used to indicate the probability of A or B

Δ

Set symmetric difference. The set of objects in A that are not in B plus the set of objects in B that are not in A
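The three set operations described above (intersection, union and symmetric difference) can be illustrated with Python's built-in set type. This is a sketch only; the sets A and B and their members are hypothetical.

```python
# Two hypothetical sets of objects
A = {1, 2, 3, 4}
B = {3, 4, 5}

intersection = A & B   # objects in both A and B
union = A | B          # objects in A or B (or both)
sym_diff = A ^ B       # objects in exactly one of A and B

# The symmetric difference matches its definition:
# (objects in A not in B) plus (objects in B not in A)
same_as_definition = sym_diff == (A - B) | (B - A)

print(intersection, union, sym_diff, same_as_definition)
```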

∫

Integration symbol. If no limits are shown the integral is assumed to apply over the whole range of the variable; otherwise upper and/or lower limits for integration are provided

∏

Product symbol, e.g. x1∙x2∙x3∙ … ∙xn. If no limits are shown the product is assumed to apply to all subsequent elements; otherwise upper and/or lower limits for multiplication are provided
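As an illustration of summation and product over a set of values, with and without explicit limits, a minimal Python sketch follows. The values and variable names are hypothetical, and math.prod assumes Python 3.8 or later.

```python
import math

# Hypothetical set of n = 3 values x1, x2, x3
x = [1.5, 2.0, 4.0]

total = sum(x)            # x1 + x2 + x3, no limits: all elements
product = math.prod(x)    # x1 * x2 * x3, no limits: all elements

# With explicit limits i = 1..2 (Python slices are 0-based and exclude the end)
partial_sum = sum(x[0:2])

print(total, product, partial_sum)
```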

^

Hat or caret symbol: used in conjunction with Greek symbols (directly above) to indicate that a value is an estimate of a parameter or of the true population value

→

Tends to, typically applied to indicate the limit as a variable tends to 0 or ∞

¯

Overbar symbol: used directly above a variable (for example x̄) to indicate that a value is the mean of a set of sample values

~

Two meanings apply, depending on the context: (i) "is distributed as", for example y~N(0,1) means the variable y has a distribution that is Normal with a mean of 0 and standard deviation of 1; (ii) negation, as in ~A means NOT A, or sometimes referred to as the complement of A. Note that the R language uses this symbol when defining regression models
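The "is distributed as" meaning can be illustrated by drawing pseudo-random values from a Normal distribution and checking that the sample mean and standard deviation are close to 0 and 1. This Python sketch is an illustration only; the sample size, the seed and the names rng, y, y_bar and s are arbitrary assumptions.

```python
import random
import statistics

# Fixed seed so that the run is repeatable (the value 42 is arbitrary)
rng = random.Random(42)

# Draw 10,000 values of y ~ N(0,1): mean 0, standard deviation 1
y = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

y_bar = statistics.fmean(y)   # sample mean
s = statistics.stdev(y)       # sample standard deviation

print(round(y_bar, 3), round(s, 3))
```

The sample statistics estimate the population parameters, so they will be close to, but not exactly, 0 and 1.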

!

Factorial symbol. z=n! means z=n(n‑1)(n‑2)…1. n>=0. Note that 0! is defined as 1. Usually applied to integer values of n. May be defined for fractional values of n using the Gamma function. Note that for large n Stirling's approximation may be used. R: factorial(n) – computes n!; if a range is specified, for example 1:5 then all the factorials from 1 to 5 are computed
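The factorial, its extension through the Gamma function and Stirling's approximation can be compared numerically. The sketch below uses Python's standard math module rather than R, purely for illustration; the variable names are hypothetical, and Stirling's approximation is taken in its usual form n! ≈ √(2πn)(n/e)^n.

```python
import math

n = 10

exact = math.factorial(n)        # n! for a non-negative integer n
via_gamma = math.gamma(n + 1)    # Gamma(n + 1) equals n!

# Stirling's approximation: n! ~ sqrt(2*pi*n) * (n/e)**n
stirling = math.sqrt(2 * math.pi * n) * (n / math.e) ** n

# Fractional argument via the Gamma function: "0.5!" = Gamma(1.5)
half_factorial = math.gamma(1.5)

# Factorials of 1 to 5, comparable to factorial(1:5) in R
first_five = [math.factorial(i) for i in range(1, 6)]

print(exact, via_gamma, stirling, half_factorial, first_five)
```

For n = 10 the approximation is already within about 1% of the exact value, and the relative error shrinks as n grows.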

$\binom{n}{r}$

Binomial expansion coefficients, also written as nCr, or similar, and shorthand for n!/[(n-r)!r!]
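The binomial coefficient can be checked against its factorial definition with a short Python sketch. It is an illustration only; the values of n and r and the variable names are hypothetical, and math.comb assumes Python 3.8 or later.

```python
import math

n, r = 5, 2

# Built-in binomial coefficient
coeff = math.comb(n, r)

# The same value from the factorial formula n!/[(n-r)! r!]
by_formula = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

# All coefficients in the expansion of (a + b)**n
row = [math.comb(n, j) for j in range(n + 1)]

print(coeff, by_formula, row)
```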

≡

‘Equivalent to’ symbol

≈

‘Approximately equal to’ symbol

∝

Proportional to

∈

‘Belongs to’ symbol, e.g. x∈[0,2] means that x belongs to/is drawn from the set of all values in the closed interval [0,2]; x∈{0,1} means that x can take the values 0 and 1

≤

Less than or equal to, represented in the text where necessary by <= (provided in this form to support display by some web browsers)

≥

Greater than or equal to, represented in the text where necessary by >= (provided in this form to support display by some web browsers)

⌊x⌋

Floor function. Interpreted as the largest integer value not greater than x. Sometimes, but not always, implemented in software as int(x), where int() is the integer part of a real valued variable (the two differ for negative non-integer x when int() truncates toward zero)

⌈x⌉

Ceiling function. Interpreted as the smallest integer value not less than x. Sometimes, but not always, implemented in software as int(x+1), where int() is the integer part of a real valued variable (this form gives x+1 rather than x when x is already an integer)
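The difference between the floor and ceiling functions and the int()-based shortcuts can be seen on a few test values. The Python sketch below is an illustration only; the test values and the name rows are hypothetical, and Python's int() truncates toward zero, which may not match every software package.

```python
import math

# Hypothetical test values: positive, negative and already-integer cases
values = (2.7, -2.7, 3.0)

# Each row: (x, floor(x), int(x), ceil(x), int(x + 1))
rows = [(x, math.floor(x), int(x), math.ceil(x), int(x + 1)) for x in values]

for row in rows:
    print(row)
```

The shortcuts agree with floor and ceiling for 2.7, but int(x) differs from the floor for -2.7, and int(x+1) differs from the ceiling for both -2.7 and 3.0.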

A|B

"given", as in P(A|B) is the probability of A given B or A conditional upon B

 

References

Wikipedia: Table of mathematical symbols: http://en.wikipedia.org/wiki/Table_of_mathematical_symbols