The Theil index is a statistic used to measure economic inequality.^{[1]} It has also been used to measure the lack of racial diversity.^{[2]} The basic Theil index T_{T} is the same as redundancy in information theory, which is the maximum possible entropy of the data minus the observed entropy. It is a special case of the generalized entropy index. It can be viewed as a measure of redundancy, lack of diversity, isolation, segregation, inequality, non-randomness, and compressibility. It was proposed by the econometrician Henri Theil at the Erasmus University Rotterdam.
Contents

Formula
Derivation from Entropy
Decomposability
Applications
See also
References
External links
Formula
The Theil index is^{[3]}

T_T=T_{\alpha=1}=\frac{1}{N}\sum_{i=1}^N \left( \frac{x_i}{\overline{x}} \cdot \ln{\frac{x_i}{\overline{x}}} \right)
where {\overline{x}} is the mean of x.
If everyone has the same income, then T_{T} is 0, which, counterintuitively, is when the population's income has maximum disorder. If one person has all the income, then T_{T} is \ln N, which is maximum order. Dividing T_{T} by \ln N normalizes the index to range from 0 to 1.
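The two limiting cases can be checked numerically. The following is a minimal sketch in Python, not a reference implementation; the function name `theil_t` and the sample incomes are illustrative assumptions, and the term for a zero income is taken as 0, the limiting value of x \ln x as x approaches 0.

```python
import math

def theil_t(incomes):
    """Theil T index of a sequence of non-negative incomes (hypothetical helper)."""
    n = len(incomes)
    mean = sum(incomes) / n
    if mean <= 0:
        raise ValueError("total income must be positive")
    total = 0.0
    for x in incomes:
        if x > 0:  # a zero income contributes 0 (limit of r * ln r as r -> 0)
            ratio = x / mean
            total += ratio * math.log(ratio)
    return total / n

equal = [10.0, 10.0, 10.0, 10.0]      # everyone has the same income
one_has_all = [0.0, 0.0, 0.0, 40.0]   # one person has all the income

print(theil_t(equal))                                      # 0.0
print(theil_t(one_has_all))                                # ln 4, about 1.386
print(theil_t(one_has_all) / math.log(len(one_has_all)))   # 1.0 after normalizing
```

Dividing by `math.log(len(incomes))` gives the normalized index, which is only defined for populations of at least two people.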
The Theil index measures an entropic "distance" the population is away from the "ideal" egalitarian state of everyone having the same income. The numerical result is in terms of negative entropy so that a higher number indicates more order that is further away from the "ideal" of maximum disorder. Formulating the index to represent negative entropy instead of entropy allows it to be a measure of inequality rather than equality.
Derivation from Entropy
The Theil index is derived from Shannon's measure of information entropy S, where entropy is a measure of randomness in a given set of information. In information theory, physics, and the Theil index, the general form of entropy is

S = k \sum_{i=1}^N \left( p_i \log{\frac{1}{p_i}} \right) = - k \sum_{i=1}^N \left( p_i \log{p_i} \right)
where p_i is the probability of finding member i from a random sample of the population. In physics, k is Boltzmann's constant. In information theory, when information is given in binary digits, k=1 and the log base is 2. In physics, and also in the computation of the Theil index, the natural logarithm is used. When p_i is taken to be the income of person i, x_i, it must be normalized by dividing by the total population income, N\overline{x}. This gives the observed entropy S_\text{Theil} of a population as:

S_\text{Theil} = \sum_{i=1}^N \left( \frac{x_i}{N \overline{x}} \ln{\frac{N \overline{x}}{x_i}} \right)
The Theil index is T_T = S_\text{max} - S_\text{Theil}, where S_\text{max} is the theoretical maximum entropy, reached when all incomes are equal, i.e. x_i=\overline{x} for all i. Substituting this into S_\text{Theil} gives S_\text{max} = \ln N, a constant determined solely by the population size. So the Theil index gives a value in terms of an entropy that measures how far S_\text{Theil} is away from the "ideal" S_\text{max}. The index is a "negative entropy" in the sense that it gets smaller as the disorder gets larger, hence it is a measure of order rather than disorder.
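That the difference \ln N - S_\text{Theil} reproduces the formula for T_T given in the Formula section can be verified in a few steps. The derivation below is a sketch that uses only the definitions above and the fact that the income shares x_i / (N\overline{x}) sum to 1.

```latex
\begin{align}
S_\text{max} - S_\text{Theil}
  &= \ln N - \sum_{i=1}^N \frac{x_i}{N\overline{x}} \ln\frac{N\overline{x}}{x_i} \\
  &= \sum_{i=1}^N \frac{x_i}{N\overline{x}} \left( \ln N - \ln\frac{N\overline{x}}{x_i} \right)
     && \text{since } \sum_{i=1}^N \frac{x_i}{N\overline{x}} = 1 \\
  &= \sum_{i=1}^N \frac{x_i}{N\overline{x}} \ln\frac{x_i}{\overline{x}} \\
  &= \frac{1}{N} \sum_{i=1}^N \left( \frac{x_i}{\overline{x}} \cdot \ln\frac{x_i}{\overline{x}} \right)
   = T_T
\end{align}
```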
When x is in units of population/species, S_\text{Theil} is a measure of biodiversity and is called the Shannon index. If the Theil index is used with x=population/species, it is a measure of inequality of population among a set of species, or "bioisolation" as opposed to "wealth isolation".
The Theil index measures what is called redundancy in information theory.^{[3]} It is the leftover "information space" that was not utilized to convey information, which reduces the effectiveness of the price signal. The Theil index is a measure of the redundancy of income (or other measure of wealth) in some individuals. Redundancy in some individuals implies scarcity in others. A high Theil index indicates that the total income is not distributed evenly among individuals, in the same way that an uncompressed text file does not have a similar number of byte locations assigned to each of the available unique byte characters.
Notation       | Information theory            | Theil index T_{T}
N              | number of unique characters   | number of individuals
i              | a particular character        | a particular individual
x_i            | count of ith character        | income of ith individual
N\overline{x}  | total characters in document  | total income in population
T_T            | unused information space      | unused potential in price mechanism
               | data compression              | progressive tax
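The correspondence in the table can be illustrated by treating character counts as "incomes". The following Python sketch is an illustration under stated assumptions rather than a standard routine: the helper names `redundancy` and `theil_t` and the sample string are hypothetical. It computes the redundancy of a text as \ln N minus the observed entropy of its character frequencies and compares it with the Theil index of the same counts.

```python
import math
from collections import Counter

def theil_t(values):
    """Theil T index of positive values (hypothetical helper)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v / mean) * math.log(v / mean) for v in values) / n

def redundancy(text):
    """ln(N) minus the observed entropy, where N is the number of unique characters."""
    counts = Counter(text)                 # x_i: count of the ith character
    total = sum(counts.values())           # N * mean: total characters in the document
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.log(len(counts)) - entropy

text = "aaab"                              # 'a' appears three times, 'b' once
counts = list(Counter(text).values())

# Both values agree up to floating-point rounding.
print(redundancy(text))
print(theil_t(counts))
```

A text in which every character occurs equally often, such as "abab", has a redundancy of zero in this sense, just as a population with equal incomes has a Theil index of zero.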

Decomposability
One of the advantages of the Theil index is that it is a weighted average of inequality within subgroups, plus inequality among those subgroups. For example, inequality within the United States is the average inequality within each state, weighted by state income, plus the inequality among states.
If the population is divided into m subgroups, where s_i is the income share of group i, T_{T_i} is the Theil index of that subgroup, and \overline{x}_i is the average income in group i, then the Theil index is

T_T = \sum_{i=1}^m s_i T_{T_i} + \sum_{i=1}^m s_i \ln{\frac{\overline{x}_i}{\overline{x}}}
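The decomposition can be checked numerically. The sketch below is a minimal Python illustration, assuming every group has positive total income; the function names `theil_t` and `decompose` and the sample groups are hypothetical. It computes the within-group and between-group terms of the formula above and compares their sum with the index of the pooled population.

```python
import math

def theil_t(incomes):
    """Theil T index of non-negative incomes (hypothetical helper)."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((x / mean) * math.log(x / mean) for x in incomes if x > 0) / n

def decompose(groups):
    """Return (within, between) components for a list of income lists."""
    pooled = [x for g in groups for x in g]
    total_income = sum(pooled)
    overall_mean = total_income / len(pooled)
    within = 0.0
    between = 0.0
    for g in groups:
        share = sum(g) / total_income          # s_i: income share of group i
        group_mean = sum(g) / len(g)           # mean income in group i
        within += share * theil_t(g)           # s_i * T_{T_i}
        between += share * math.log(group_mean / overall_mean)
    return within, between

groups = [[10.0, 20.0, 30.0], [40.0, 80.0], [5.0, 5.0, 5.0, 5.0]]
within, between = decompose(groups)
pooled = [x for g in groups for x in g]

print(within, between)
print(within + between)   # matches the pooled index up to rounding
print(theil_t(pooled))
```

An individual between-group term s_i \ln(\overline{x}_i / \overline{x}) is negative for any group whose mean income is below the overall mean, although the sum of these terms is not negative.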

Note: The figure that accompanied this section showed each area's contribution to the US Theil index, not the Theil index within each area. The Theil index itself is never negative, whereas individual contributions to it may be negative or positive.
The decomposition of the overall Theil index, which identifies the share attributable to the between-region component, is a helpful tool for the positive analysis of regional inequality, as it suggests the relative importance of the spatial dimension of inequality.^{[4]}
Decomposability is a property of the Theil index that the more popular Gini coefficient does not offer. The Gini coefficient is more intuitive to many people since it is based on the Lorenz curve, but it is not easily decomposable like the Theil index.
Applications
In addition to a multitude of economic applications, the Theil index has been applied to assess the performance of irrigation systems^{[5]} and the distribution of software metrics.^{[6]}
See also
References

[1] Introduction to the Theil index from the University of Texas

[2] http://geodacenter.asu.edu/node/236

[3] http://www.poorcity.richcity.org (Redundancy, Entropy and Inequality Measures)

[4] Novotny, J. (2007). "On the measurement of regional inequality: Does spatial dimension of income inequality matter?" (PDF). Annals of Regional Science 41 (3): 563–580.

[5] Rajan K. Sampath. Equity Measures for Irrigation Performance Evaluation. Water International, 13(1), 1988.

[6] A. Serebrenik, M. van den Brand. Theil index for aggregation of software metrics values. 26th IEEE International Conference on Software Maintenance. IEEE Computer Society.
External links

Software:

Free Online Calculator computes the Gini Coefficient, plots the Lorenz curve, and computes many other measures of concentration for any dataset

Free Calculator: Online and downloadable scripts (Python and Lua) for Atkinson, Gini, and Hoover inequalities

Users of the R data analysis software can install the "ineq" package which allows for computation of a variety of inequality indices including Gini, Atkinson, Theil.

A MATLAB Inequality Package, including code for computing Gini, Atkinson, Theil indexes and for plotting the Lorenz Curve. Many examples are available.
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.