In mathematics, a moment is, loosely speaking, a quantitative measure of the shape of a set of points. The "second moment", or more specifically the "second central moment", for example, is widely used and measures the "width" (in a particular sense) of a set of points in one dimension, or in higher dimensions measures the shape of a cloud of points as it could be fit by an ellipsoid. Other moments describe other aspects of a distribution such as how the distribution is skewed from its mean. The mathematical concept is closely related to the concept of moment in physics, although moment in physics is often represented somewhat differently. Any distribution can be characterized by a number of features (such as the mean, the variance, the skewness, etc.), and the moments of a random variable's probability distribution are related to these features. The probability distribution itself can be expressed as a probability density function, probability mass function, cumulative distribution function, characteristic function, or moment-generating function.
The first raw moment, also called the first moment about zero or simply the first moment, is the distribution's mean. The mean of the distribution of the random variable X, if it exists, is written with the expectation operator as E(X).
In higher orders, the central moments (moments about the mean) are more interesting than the moments about zero, because they provide clearer information about the distribution's shape.
Other moments may also be defined. For example, the nth inverse moment about zero is $\operatorname{E}(X^{-n})$ and the nth logarithmic moment about zero is $\operatorname{E}(\ln^n(X))$.
Significance of the moments
The nth moment of a real-valued continuous function f(x) of a real variable about a value c is
 $\mu'_n = \int_{-\infty}^{\infty} (x - c)^n\, f(x)\, dx.$
It is possible to define moments for random variables in a more general fashion than moments for real values—see moments in metric spaces. The moment of a function, without further explanation, usually refers to the above expression with c = 0.
Usually, except in the special context of the problem of moments, the function f(x) will be a probability density function. The nth moment about zero of a probability density function f(x) is the expected value of $X^n$ and is called a raw moment or crude moment.[1] The moments about its mean μ are called central moments; these describe the shape of the function, independently of translation.
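The integral definition can be checked numerically. The following is a minimal sketch, assuming a normal density with illustrative parameters μ = 1 and σ = 2 and a simple midpoint rule on a truncated interval; the function names `moment` and `normal_pdf` are hypothetical and not part of any library.

```python
import math

def moment(f, n, c=0.0, lo=-30.0, hi=30.0, steps=20000):
    """Midpoint-rule approximation of the integral of (x - c)**n * f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += (x - c) ** n * f(x)
    return total * h

def normal_pdf(x, mu=1.0, sigma=2.0):
    """Density of a normal distribution (illustrative choice of mu and sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

zeroth = moment(normal_pdf, 0)             # total probability
mean = moment(normal_pdf, 1)               # first raw moment (c = 0)
second_raw = moment(normal_pdf, 2)         # second raw moment
variance = moment(normal_pdf, 2, c=mean)   # second central moment (c = mean)

print(zeroth, mean, second_raw, variance)
```

For this density the four values should come out close to 1, 1, 5 and 4 respectively, since the second raw moment equals σ² + μ². The interval [-30, 30] is an assumption that suits this particular density; a heavier-tailed density would need a wider range or a different quadrature.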
If f is a probability density function, then the value of the integral above is called the nth moment of the probability distribution. More generally, if F is a cumulative probability distribution function of any probability distribution, which may not have a density function, then the nth moment of the probability distribution is given by the Riemann–Stieltjes integral
 $\mu'_n = \operatorname{E}(X^n) = \int_{-\infty}^{\infty} x^n\, dF(x)$
where X is a random variable that has this cumulative distribution F, and E is the expectation operator or mean.
When
 $\operatorname{E}(|X^n|) = \int_{-\infty}^{\infty} |x^n|\, dF(x) = \infty,$
then the moment is said not to exist. If the nth moment about any point exists, so does the (n − 1)th moment (and thus, all lower-order moments) about every point.
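As a worked illustration that is not part of the text above, consider a Pareto density with an assumed shape parameter α > 0 and scale $x_m > 0$. Its nth raw moment exists only for n < α, which agrees with the statement that existence of a moment implies existence of all lower-order moments:

```latex
f(x) = \frac{\alpha\, x_m^{\alpha}}{x^{\alpha + 1}}, \qquad x \ge x_m,
\qquad
\operatorname{E}(X^n)
  = \alpha\, x_m^{\alpha} \int_{x_m}^{\infty} x^{\,n - \alpha - 1}\, dx
  = \begin{cases}
      \dfrac{\alpha\, x_m^{n}}{\alpha - n}, & n < \alpha, \\[1ex]
      \infty, & n \ge \alpha .
    \end{cases}
```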
The zeroth moment of any probability density function is 1, since the area under any probability density function must be equal to one.
Significance of moments (raw, central, standardized) and cumulants (raw, standardized), in connection with named properties of distributions

| Moment number | Raw moment | Central moment | Standardized moment | Raw cumulant | Standardized cumulant |
|---|---|---|---|---|---|
| 1 | mean | 0 | 0 | mean | N/A |
| 2 | – | variance | 1 | variance | 1 |
| 3 | – | – | skewness | – | skewness |
| 4 | – | – | historical kurtosis (or flatness) | – | modern kurtosis (i.e. excess kurtosis) |
| 5 | – | – | hyperskewness | – | – |
| 6 | – | – | hyperflatness | – | – |
| 7+ | – | – | – | – | – |

Mean
The first raw moment is the mean.
Variance
The second central moment is the variance. Its positive square root is the standard deviation σ.
Normalized moments
The normalized nth central moment or standardized moment is the nth central moment divided by $\sigma^n$; that is, the normalized nth central moment of X is $\operatorname{E}((X - \mu)^n)/\sigma^n$. These normalized central moments are dimensionless quantities, which represent the distribution independently of any linear change of scale.
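The scale independence can be illustrated on a small data set. This sketch treats a list of numbers as an entire population (so it divides by n, an assumption) and uses the hypothetical helper names `central_moment` and `standardized_moment`. The rescaling uses a positive factor; a negative factor would flip the sign of the odd standardized moments.

```python
import math

def central_moment(xs, n):
    """nth central moment of the values in xs, treated as a whole population."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** n for x in xs) / len(xs)

def standardized_moment(xs, n):
    """nth central moment divided by sigma**n."""
    sigma = math.sqrt(central_moment(xs, 2))
    return central_moment(xs, n) / sigma ** n

data = [1.0, 2.0, 2.0, 3.0, 9.0]            # illustrative values with a long right tail
rescaled = [3.0 * x + 7.0 for x in data]    # shift and positive change of scale

for n in (3, 4):
    print(n, standardized_moment(data, n), standardized_moment(rescaled, n))
```

Each printed pair should agree up to rounding error, while the unnormalized central moments of `rescaled` are larger by a factor of 3 to the power n.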
Skewness
The third central moment is a measure of the lopsidedness of the distribution; any symmetric distribution will have a third central moment, if defined, of zero. The normalized third central moment is called the skewness, often γ. A distribution that is skewed to the left (the tail of the distribution is heavier on the left) will have a negative skewness. A distribution that is skewed to the right (the tail of the distribution is heavier on the right), will have a positive skewness.
For distributions that are not too different from the normal distribution, the median will be somewhere near μ − γσ/6; the mode about μ − γσ/2.
Kurtosis
The fourth central moment is a measure of whether the distribution is tall and skinny or short and squat, compared to the normal distribution of the same variance. Since it is the expectation of a fourth power, the fourth central moment, where defined, is always nonnegative; and except for a point distribution, it is always strictly positive. The fourth central moment of a normal distribution is $3\sigma^4$.
The kurtosis κ is defined to be the normalized fourth central moment minus 3 (equivalently, as in the section on cumulants below, it is the fourth cumulant divided by the square of the variance). Some authorities [2][3] do not subtract three, but it is usually more convenient to have the normal distribution at the origin of coordinates. If a distribution has a peak at the mean and long tails, the fourth moment will be high and the kurtosis positive (leptokurtic); conversely, bounded distributions tend to have low kurtosis (platykurtic).
The kurtosis can be positive without limit, but κ must be greater than or equal to $\gamma^2 - 2$; equality holds only for binary distributions. For unbounded skew distributions not too far from normal, κ tends to be somewhere in the area of $\gamma^2$ and $2\gamma^2$.
The inequality can be proven by considering
 $\operatorname{E}\left((T^2 - aT - 1)^2\right)$
where T = (X − μ)/σ. This is the expectation of a square, so it is nonnegative for all a; however it is also a quadratic polynomial in a. Its discriminant must be nonpositive, which gives the required relationship.
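The steps of this argument can be written out as follows, using E(T) = 0, E(T²) = 1, E(T³) = γ and E(T⁴) = κ + 3, which follow from the definitions of T, the skewness and the kurtosis given above:

```latex
\begin{aligned}
\operatorname{E}\left((T^2 - aT - 1)^2\right)
  &= \operatorname{E}(T^4) - 2a\,\operatorname{E}(T^3)
     + (a^2 - 2)\,\operatorname{E}(T^2) + 2a\,\operatorname{E}(T) + 1 \\
  &= (\kappa + 3) - 2a\gamma + (a^2 - 2) + 1 \\
  &= a^2 - 2\gamma\, a + (\kappa + 2) \;\ge\; 0 \quad \text{for all } a, \\[1ex]
\text{hence}\quad (2\gamma)^2 - 4(\kappa + 2) &\le 0
  \quad\Longleftrightarrow\quad \kappa \ge \gamma^2 - 2 .
\end{aligned}
```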
Mixed moments
Mixed moments are moments involving multiple variables.
Some examples are covariance, coskewness and cokurtosis. While there is a unique covariance, there are multiple coskewnesses and cokurtoses.
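A minimal sketch of why the covariance is unique while third-order mixed moments are not: the helper `mixed_central_moment` below (a hypothetical name) computes the unnormalized mixed central moment with exponents a and b for paired data treated as a whole population. Coskewness as usually defined would further divide by powers of the standard deviations; that normalization is omitted here.

```python
def mixed_central_moment(xs, ys, a, b):
    """Population estimate of E[(X - mean_x)**a * (Y - mean_y)**b] for paired data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) ** a * (y - my) ** b for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0, 4.0]      # illustrative paired data
ys = [1.0, 4.0, 9.0, 16.0]

cov = mixed_central_moment(xs, ys, 1, 1)    # the single second-order mixed moment
m21 = mixed_central_moment(xs, ys, 2, 1)    # one third-order mixed moment
m12 = mixed_central_moment(xs, ys, 1, 2)    # a different third-order mixed moment

print(cov, m21, m12)
```

Swapping the two variables leaves the (1, 1) moment unchanged, but the (2, 1) and (1, 2) moments are in general different numbers, which is why several coskewnesses exist for a pair of variables.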
Higher moments
High-order moments are moments beyond 4th-order moments. As with variance, skewness, and kurtosis, these are higher-order statistics, involving nonlinear combinations of the data, and can be used for description or estimation of further shape parameters. The higher the moment, the harder it is to estimate, in the sense that larger samples are required in order to obtain estimates of similar quality. This is due to the excess degrees of freedom consumed by the higher orders. Further, they can be subtle to interpret, often being most easily understood in terms of lower-order moments; compare the higher derivatives of jerk and jounce in physics. For example, just as the 4th-order moment (kurtosis) can be interpreted as "relative importance of tails versus shoulders in causing dispersion" (for a given dispersion, high kurtosis corresponds to heavy tails, while low kurtosis corresponds to heavy shoulders), the 5th-order moment can be interpreted as measuring "relative importance of tails versus center (mode, shoulders) in causing skew" (for a given skew, high 5th moment corresponds to heavy tail and little movement of mode, while low 5th moment corresponds to more change in shoulders).
Cumulants
The first moment and the second and third unnormalized central moments are additive in the sense that if X and Y are independent random variables then
 $\mu_1(X+Y) = \mu_1(X) + \mu_1(Y)$
and
 $\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$
and
 $\mu_3(X+Y) = \mu_3(X) + \mu_3(Y).$
(These can also hold for variables that satisfy weaker conditions than independence. The first always holds; if the second holds, the variables are called uncorrelated).
In fact, these are the first three cumulants and all cumulants share this additivity property.
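The additivity can be checked exactly for small discrete distributions. The sketch below represents a distribution as a dictionary from values to probabilities (an assumed representation) and builds the distribution of the sum of two independent variables by convolution. It also uses the standard identity that the fourth cumulant equals the fourth central moment minus three times the squared variance; the function names are hypothetical.

```python
from itertools import product

def central_moment(pmf, n):
    """nth central moment of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** n * p for x, p in pmf.items())

def sum_pmf(pa, pb):
    """Distribution of X + Y for independent X ~ pa and Y ~ pb."""
    out = {}
    for (x, p), (y, q) in product(pa.items(), pb.items()):
        out[x + y] = out.get(x + y, 0.0) + p * q
    return out

def fourth_cumulant(pmf):
    """Fourth cumulant: mu_4 - 3 * mu_2**2."""
    return central_moment(pmf, 4) - 3.0 * central_moment(pmf, 2) ** 2

X = {0: 0.7, 1: 0.2, 5: 0.1}    # illustrative skewed distribution
Y = {0: 0.5, 2: 0.5}            # illustrative symmetric two-point distribution
S = sum_pmf(X, Y)

for n in (2, 3, 4):
    print(n, central_moment(S, n), central_moment(X, n) + central_moment(Y, n))
print("k4", fourth_cumulant(S), fourth_cumulant(X) + fourth_cumulant(Y))
```

The second and third central moments of the sum match the sums of the individual moments, the fourth central moment does not (it exceeds the sum by six times the product of the variances), and the fourth cumulant is again additive.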
Sample moments
For all k, the kth raw moment of a population can be estimated using the kth raw sample moment
 $\frac{1}{n}\sum_{i = 1}^{n} X^k_i$
applied to a sample $X_1, X_2, \ldots, X_n$ drawn from the population.
It can be shown that the expected value of the raw sample moment is equal to the kth raw moment of the population, if that moment exists, for any sample size n. It is thus an unbiased estimator. This contrasts with the situation for central moments, whose computation uses up a degree of freedom by using the sample mean. So for example an unbiased estimate of the population variance (the second central moment) is given by
 $\frac{1}{n-1}\sum_{i = 1}^{n} (X_i - \bar{X})^2$
in which the previous denominator n has been replaced by the degrees of freedom n − 1, and in which $\bar{X}$ refers to the sample mean. This estimate of the population moment is greater than the unadjusted observed sample moment by a factor of $\tfrac{n}{n-1}$, and it is referred to as the "adjusted sample variance" or sometimes simply the "sample variance".
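The two estimators can be sketched in a few lines. The function names below are hypothetical, and the sample values are purely illustrative; Python's standard `statistics` module is used only as a cross-check, since its `variance` uses the n − 1 denominator and its `pvariance` uses n.

```python
import statistics

def raw_sample_moment(xs, k):
    """kth raw sample moment: (1/n) * sum of x_i**k."""
    return sum(x ** k for x in xs) / len(xs)

def unadjusted_variance(xs):
    """Second central sample moment with denominator n."""
    mean = raw_sample_moment(xs, 1)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def adjusted_variance(xs):
    """Unbiased variance estimate with denominator n - 1."""
    mean = raw_sample_moment(xs, 1)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # illustrative data, n = 8

print(raw_sample_moment(sample, 1))      # sample mean
print(unadjusted_variance(sample))       # denominator n
print(adjusted_variance(sample))         # denominator n - 1
print(statistics.variance(sample))       # library value for comparison
```

The ratio of the adjusted to the unadjusted value is n/(n − 1), here 8/7.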
Problem of moments
Main article: Moment problem
The problem of moments seeks characterizations of sequences $\{\mu'_n : n = 1, 2, 3, \ldots\}$ that are sequences of moments of some function f.
Partial moments
Partial moments are sometimes referred to as "one-sided moments." The nth-order lower and upper partial moments with respect to a reference point r may be expressed as
 $\mu_n^-(r) = \int_{-\infty}^{r} (r - x)^n\, f(x)\, dx,$
 $\mu_n^+(r) = \int_{r}^{\infty} (x - r)^n\, f(x)\, dx.$
Partial moments are normalized by being raised to the power 1/n. The upside potential ratio may be expressed as a ratio of a first-order upper partial moment to a normalized second-order lower partial moment. They have been used in the definition of some financial metrics, such as the Sortino ratio, as they focus purely on upside or downside.
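An empirical analogue of these integrals replaces the density with a sample average. The sketch below is one such reading, assuming n ≥ 1 and a list of illustrative returns; the function names and the data are hypothetical and are not taken from any financial library.

```python
def lower_partial_moment(xs, r, n):
    """Sample average of (r - x)**n over observations below r (n >= 1)."""
    return sum(max(r - x, 0.0) ** n for x in xs) / len(xs)

def upper_partial_moment(xs, r, n):
    """Sample average of (x - r)**n over observations above r (n >= 1)."""
    return sum(max(x - r, 0.0) ** n for x in xs) / len(xs)

def upside_potential_ratio(xs, r):
    """First-order upper partial moment over the normalized second-order lower one."""
    return upper_partial_moment(xs, r, 1) / lower_partial_moment(xs, r, 2) ** 0.5

returns = [-0.02, 0.01, 0.03, -0.01, 0.04]   # illustrative data
target = 0.0                                  # reference point r

print(upper_partial_moment(returns, target, 1))
print(lower_partial_moment(returns, target, 2))
print(upside_potential_ratio(returns, target))
```

For first-order moments, the upper partial moment minus the lower partial moment equals the sample mean minus r, which gives a quick consistency check on the two functions.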
Central moments in metric spaces
Let (M, d) be a metric space, and let B(M) be the Borel σ-algebra on M, the σ-algebra generated by the d-open subsets of M. (For technical reasons, it is also convenient to assume that M is a separable space with respect to the metric d.) Let 1 ≤ p ≤ +∞.
The pth central moment of a measure μ on the measurable space (M, B(M)) about a given point $x_0$ in M is defined to be
 $\int_{M} d(x, x_0)^{p}\, \mathrm{d}\mu(x).$
μ is said to have finite pth central moment if the pth central moment of μ about $x_0$ is finite for some $x_0 \in M$.
This terminology for measures carries over to random variables in the usual way: if (Ω, Σ, P) is a probability space and X : Ω → M is a random variable, then the pth central moment of X about $x_0 \in M$ is defined to be
 $\int_{M} d(x, x_0)^{p}\, \mathrm{d}\left(X_{*}(\mathbf{P})\right)(x) \equiv \int_{\Omega} d(X(\omega), x_0)^{p}\, \mathrm{d}\mathbf{P}(\omega),$
and X has finite pth central moment if the pth central moment of X about $x_0$ is finite for some $x_0 \in M$.
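For a measure concentrated on finitely many points, the integral reduces to a weighted sum. The sketch below assumes such a discrete probability measure on the plane with the Euclidean metric and a finite exponent p; the function name `metric_central_moment` and the example points are hypothetical.

```python
import math

def metric_central_moment(points, weights, x0, p, d):
    """Sum of w * d(x, x0)**p over the atoms x of a discrete measure with weights w."""
    return sum(w * d(x, x0) ** p for x, w in zip(points, weights))

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]   # atoms of the measure
weights = [0.5, 0.25, 0.25]                      # probabilities, summing to 1
origin = (0.0, 0.0)                              # reference point x0

first = metric_central_moment(points, weights, origin, 1, math.dist)
second = metric_central_moment(points, weights, origin, 2, math.dist)
print(first, second)
```

Any function satisfying the metric axioms can be passed as `d`; `math.dist` (Python 3.8 or later) is used here only because it implements the Euclidean distance.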