
Theil index




The Theil index[1], derived by econometrician Henri Theil, is a statistic used to measure economic inequality.

Mathematics

The formulas[2] are

$$T_T = \frac{1}{N}\sum_{i=1}^N \left( \frac{x_i}{\overline{x}} \cdot \ln{\frac{x_i}{\overline{x}}} \right)$$

$$T_L = \frac{1}{N}\sum_{i=1}^N \left( \ln{\frac{\overline{x}}{x_i}} \right)$$

where $x_i$ is the income of the $i$th person, $\overline{x} = \frac{1}{N} \sum_{i=1}^N x_i$ is the mean income, and $N$ is the number of people. In $T_T$, the first factor inside the sum, taken together with the prefactor $1/N$, can be considered the individual's share of aggregate income, and the second factor is the logarithm of that person's income relative to the mean. If everyone has the same (i.e., mean) income, then both indices are 0. If one person has all the income, then $T_T$ is $\ln N$, while $T_L$ has no finite value because it contains $\ln(\overline{x}/x_i)$ for incomes of zero.
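
The two definitions can be sketched directly in code. This is a minimal illustration, not a reference implementation; the function names `theil_t` and `theil_l` and the sample incomes are assumptions made for this example.

```python
import math

def theil_t(incomes):
    """Theil-T index of a list of non-negative incomes (not all zero)."""
    n = len(incomes)
    mean = sum(incomes) / n
    total = 0.0
    for x in incomes:
        if x > 0:  # r*ln(r) tends to 0 as r -> 0, so zero incomes add nothing
            r = x / mean
            total += r * math.log(r)
    return total / n

def theil_l(incomes):
    """Theil-L index; every income must be strictly positive."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum(math.log(mean / x) for x in incomes) / n

equal = [100.0] * 4                   # hypothetical: everyone earns the mean
one_has_all = [0.0, 0.0, 0.0, 400.0]  # hypothetical: one person earns everything

print(theil_t(equal), theil_l(equal))        # both indices vanish
print(theil_t(one_has_all), math.log(4))     # T_T equals ln N
```

`theil_l` is deliberately not called on `one_has_all`, since a zero income would require the logarithm of an unbounded ratio.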

The Theil index is derived from Shannon's measure of information entropy. Letting $T$ be the Theil index (in the form $T_T$ above) and $S$ be Shannon's information entropy measure,

$$T = \ln(N) - S.$$


Shannon derived his entropy measure in terms of the probability of an event occurring. This can be interpreted in the Theil index as the probability a dollar drawn at random from the population came from a specific individual. This is the same as the first term, the individual's share of aggregate income.
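
The relation can be checked numerically. The sketch below assumes that $S$ is the Shannon entropy (natural logarithm) of the income shares, as the interpretation above suggests; the function names and the sample incomes are hypothetical.

```python
import math

def shannon_entropy_of_income_shares(incomes):
    """Entropy of the probabilities p_i = x_i / sum(x), natural logarithm."""
    total = sum(incomes)
    shares = [x / total for x in incomes]
    return -sum(p * math.log(p) for p in shares if p > 0)

def theil_t(incomes):
    """Theil-T index; zero incomes contribute nothing to the sum."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((x / mean) * math.log(x / mean) for x in incomes if x > 0) / n

incomes = [10.0, 20.0, 30.0, 40.0]  # hypothetical sample
s = shannon_entropy_of_income_shares(incomes)
t = theil_t(incomes)

# The two printed values should agree up to rounding error.
print(t, math.log(len(incomes)) - s)
```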

With reference to information theory[3], Theil's measure is a redundancy rather than an entropy. The redundancy of a system at a given time is the difference between its maximum entropy and its present entropy at that time.[4]

Application of the Theil index

Theil's index takes an equal distribution as its reference, which is similar to distributions in statistical physics. The index of an actual system is an actual redundancy, that is, the difference between the maximum entropy and the actual entropy of that system.

Theil's measure can be converted[4] into one of the indexes of Anthony Barnes Atkinson. The result of the conversion is also called the normalized Theil index[5]. James E. Foster[6] used such a measure to replace the Gini coefficient in Amartya Sen's welfare function W=f(income,inequality). Here the income is, for example, the average income of the individuals in a group of income earners. Thus, Foster's welfare function can be computed directly from the Theil index, if the conversion is built into the computation of the average per capita welfare function:

$$W = \overline{\text{income}} \cdot e^{-T_L}.$$

Here the "Theil-L" index should be used. The difference from the "Theil-T" index is described later.

Theil Index and Hoover Index

[Figure: For the income distributions provided by the World Income Inequality Database (2007-05)[7], the differences between their symmetrized Theil indices and their Hoover indices are plotted over their respective Gini indices. The difference illustrates the impact of the different inequalities on the information generated by them. Negative values occur for Theil indices that are smaller than the respective Hoover indices.]

For the following formulas, a notation[8] is used in which the number $N$ of quantiles appears only as the upper limit of summations. Thus, inequality measures can be computed for quantiles with different widths $A_i$. For example, $E_i$ could be the income in quantile $i$ and $A_i$ could be the number (absolute or relative) of earners in quantile $i$. $E_\text{total}$ would then be the sum of the incomes of all $N$ quantiles, and $A_\text{total}$ would be the sum of the income earners in all $N$ quantiles.

Theil Index

Computation of the (asymmetric) Theil index $T$[9]:

A first variant of the Theil index refers to $E$ as a base:

$$T_T = \ln{\frac{A_\text{total}}{E_\text{total}}} - \frac{\sum_{i=1}^N E_i \ln{\frac{A_i}{E_i}}}{E_\text{total}}.$$

With normalized data, $E'_i = E_i/E_\text{total}$ and $A'_i = A_i/A_\text{total}$ would apply. This would simplify the formula:

$$T_T = 0 - \frac{\sum_{i=1}^N E'_i \ln{\frac{A'_i}{E'_i}}}{1} = \sum_{i=1}^N E'_i \ln{\frac{E'_i}{A'_i}}$$

The second variant of the Theil index refers to $A$ as a base[10]:

$$T_L = \ln{\frac{E_\text{total}}{A_\text{total}}} - \frac{\sum_{i=1}^N A_i \ln{\frac{E_i}{A_i}}}{A_\text{total}}.$$

With normalized data, $A'_i = A_i/A_\text{total}$ and $E'_i = E_i/E_\text{total}$ would apply. This would simplify the formula:

$$T_L = 0 - \frac{\sum_{i=1}^N A'_i \ln{\frac{E'_i}{A'_i}}}{1} = \sum_{i=1}^N A'_i \ln{\frac{A'_i}{E'_i}}$$

Computation of the symmetrized Theil index $T_s$:

$$T_s = \frac{1}{2} \left( \ln{\frac{A_\text{total}}{E_\text{total}}} - \frac{\sum_{i=1}^N E_i \ln{\frac{A_i}{E_i}}}{E_\text{total}} + \ln{\frac{E_\text{total}}{A_\text{total}}} - \frac{\sum_{i=1}^N A_i \ln{\frac{E_i}{A_i}}}{A_\text{total}} \right).$$

This leads to:

$$T_s = \frac{1}{2} \sum_{i=1}^N \ln{\frac{E_i}{A_i}} \left( \frac{E_i}{E_\text{total}} - \frac{A_i}{A_\text{total}} \right).$$
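
The three quantile-based formulas can be sketched as one small function. The name `theil_indices` and the three-quantile data are hypothetical, and the sketch assumes that every $E_i$ and $A_i$ is strictly positive.

```python
import math

def theil_indices(E, A):
    """Return (T_T, T_L, T_s) for quantile incomes E and quantile sizes A.

    All E_i and A_i must be strictly positive.
    """
    e_total = sum(E)
    a_total = sum(A)
    t_t = math.log(a_total / e_total) - sum(
        e * math.log(a / e) for e, a in zip(E, A)) / e_total
    t_l = math.log(e_total / a_total) - sum(
        a * math.log(e / a) for e, a in zip(E, A)) / a_total
    t_s = 0.5 * sum(
        math.log(e / a) * (e / e_total - a / a_total) for e, a in zip(E, A))
    return t_t, t_l, t_s

# Hypothetical data: three quantiles of unequal width
E = [10.0, 30.0, 60.0]   # income per quantile
A = [50.0, 30.0, 20.0]   # earners per quantile
t_t, t_l, t_s = theil_indices(E, A)
print(t_t, t_l, t_s)     # t_s is the mean of t_t and t_l
```

Because the formulas only use the ratios to the totals, absolute counts and relative shares give the same result.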


Hoover Index

The formula for the Hoover index (also called Robin Hood index) $H$ is:

$$H = \frac{1}{2} \sum_{i=1}^N \left| \frac{E_i}{E_\text{total}} - \frac{A_i}{A_\text{total}} \right|.$$
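
A minimal sketch of the Hoover formula in the same $E$/$A$ notation follows. The function name `hoover_index` and the two-quantile data are assumptions made for illustration.

```python
def hoover_index(E, A):
    """Hoover index for quantile incomes E and quantile sizes A."""
    e_total = sum(E)
    a_total = sum(A)
    return 0.5 * sum(abs(e / e_total - a / a_total) for e, a in zip(E, A))

# Hypothetical 80:20 society: 80% of the earners share 20% of the income
print(hoover_index([20.0, 80.0], [80.0, 20.0]))  # close to 0.6
```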


Difference between both indexes

The Hoover index and the symmetrized Theil index differ only in the operation applied to the deviation from equity $E_i/E_\text{total} - A_i/A_\text{total}$.

Comparing the Hoover index with the Theil index clarifies the meaning of both:

  • For the Hoover index, the relative deviations in each quantile are summed up. Each deviation is weighted by its own sign (+1 or −1). Thus, the Hoover index is the simplest inequality measure. It has no normative foundations and does not refer to any models from physics or information theory.
  • For the symmetrized Theil index, the relative deviations in each quantile are summed up as well, but each deviation is weighted by its relative information weight $\ln(E_i/A_i)$. Thus, the Theil index indicates not only the plain relative inequality; it also attempts to indicate how much attention inequality can get.

Pareto principle

Understanding the range of the Theil index

Unlike the Gini index, the Theil index is not a measure with a closed scale between 0 and 1 (or 0% and 100%). This is a barrier that seems difficult to overcome even for famous scientists: Theil's index "is not a measure that is exactly overflowing with intuitive sense," wrote Amartya Sen in a book[6] in which his co-author James Foster nevertheless used the Theil index. One way to overcome this obstacle is the normalized[5] Theil index $T_\text{normalized} = 1 - e^{-T}$.

The alternative is not to normalize the index but to use it as it is, because of an interesting property: for resource distributions described by only two quantiles, the Theil index is 0 for 50:50 distributions and reaches 1 at 82:18[11], which is very close to a distribution often referred to as the "Pareto Principle". Higher inequities yield Theil indices above 1. This leads to a comparison that appeals to intuition:
[Figure: Illustration of the relation between the Theil index $T$ and the Hoover index $H$ for societies divided into two quantiles, where a share of A dollars is assigned to a share of B people and a share of B dollars is assigned to a share of A people, with A+B=1 (e.g. “Pareto Principle” with A=0.8 and B=0.2). For societies grouped in such a manner, the Theil index, the Theil index with swapped parameters and the symmetrized Theil index are equal to each other. Also, the Gini index $G$ is equal to the Hoover index $H$.]
  • The Gini index is 0 if the distribution is completely equal. It is 1 at maximum inequality.
  • The Theil index
    • is 0 for a 50:50 distribution (the distribution is completely equal),
    • is about 0.5 for a 74:26 distribution,
    • is about 1 for an 82:18[11] distribution (which is slightly more unequal than the frequently cited 80:20 distribution),
    • is about 2 for a 92:8 distribution and
    • is about 3.7 for a 98:2 distribution (a value of 4 is reached near 98.4:1.6).
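
The values in this list can be reproduced from the normalized formula $T_T = \sum_i E'_i \ln(E'_i/A'_i)$ applied to two quantiles. The following sketch is only an illustration; the function name `theil_two_quantiles` is hypothetical.

```python
import math

def theil_two_quantiles(a):
    """Theil index when a share `a` of the income goes to a share 1-a of the
    people and the remaining 1-a of the income goes to a share `a` of the
    people (0 < a < 1)."""
    b = 1.0 - a
    income_shares = [a, b]
    people_shares = [b, a]
    return sum(e * math.log(e / p) for e, p in zip(income_shares, people_shares))

for a in (0.5, 0.74, 0.824, 0.92, 0.98):
    print(f"{100 * a:.1f}:{100 * (1 - a):.1f} -> T = {theil_two_quantiles(a):.3f}")
```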

Computing the Theil index from an A:B distribution

A Theil index $T$ can be found for any A:B distribution in societies that are split into two quantiles. In the 1st quantile, a share $A$ of the resources is assigned to a share $B$ of the people; in the 2nd quantile, a share $B$ of the resources is assigned to a share $A$ of the people, with $A + B = 1$. First the Gini index $G$ (which in this case is equal to the Hoover index) is calculated from the A:B distribution (the range of the variables is 0 to 1 instead of 0% to 100%):

$$G = H = \left| 2A - 1 \right|$$

Then:

$$T_T = T_L = T_s = 2 \cdot G \cdot \operatorname{artanh}(G).$$
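
The closed form follows from the normalized two-quantile formula in a few steps. The derivation below is a sketch that assumes $A \geq B$, so that $G = A - B$, $A = (1+G)/2$ and $B = (1-G)/2$; the case $A < B$ follows by swapping the two quantiles.

```latex
\begin{align*}
T_T &= A \ln\frac{A}{B} + B \ln\frac{B}{A}
     = (A - B)\,\ln\frac{A}{B} \\
    &= G \,\ln\frac{(1+G)/2}{(1-G)/2}
     = G \,\ln\frac{1+G}{1-G}
     = 2\,G\,\operatorname{artanh}(G),
\end{align*}
```

using $\operatorname{artanh}(G) = \tfrac{1}{2}\ln\frac{1+G}{1-G}$. Exchanging the roles of income shares and population shares gives the same two summands, which is why $T_L$ and $T_s$ take the same value.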

Reverse computation

The reverse computation is an iteration:

  • Initialization:

$$G = T$$

  • Repeat the following two operations until the error $\left| 2 \cdot G \cdot \operatorname{artanh}(G) - T \right|$ is small enough:

$$G_0 = G$$

$$G = \tanh\left( \frac{T}{G + G_0} \right)$$

  • Change to the format of the "Pareto principle":

$$A:B = \left( \frac{1+G}{2} \right) : \left( \frac{1-G}{2} \right)$$
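
The steps above can be sketched as a short loop. The function name `gini_from_theil`, the tolerance and the iteration cap are assumptions added for this example, as is the special case for $T = 0$, which would otherwise divide by zero.

```python
import math

def gini_from_theil(t, tol=1e-9, max_iter=100_000):
    """Invert T = 2*G*artanh(G) for G with the fixed-point iteration above.

    Returns the last iterate if the tolerance is not reached in max_iter steps.
    """
    if t <= 0:
        return 0.0  # T = 0 corresponds to the 50:50 distribution
    g = t  # initialization G = T
    for _ in range(max_iter):
        g0 = g
        g = math.tanh(t / (g + g0))
        # atanh is only defined below 1, so check g before evaluating the error
        if g < 1.0 and abs(2 * g * math.atanh(g) - t) < tol:
            break
    return g

g = gini_from_theil(1.0)
a, b = (1 + g) / 2, (1 - g) / 2
print(f"G = {g:.4f}, A:B = {100 * a:.1f}:{100 * b:.1f}")
```

For $T = 1$ this lands near the 82.4:17.6 split cited in the text. The iteration appears to converge more slowly the closer $T$ is to 0, which is why the sketch carries a generous iteration cap.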


Decomposability

One of the advantages of the Theil index is that it is a weighted average of inequality within subgroups, plus inequality among those subgroups. For example, inequality within the United States is the average inequality within each state, weighted by state income, plus the inequality among states.

If for the Theil-T index the population is divided into $m$ subgroups, $s_i$ is the income share of group $i$, $T_{T_i}$ is the Theil-T index for that subgroup, and $\overline{x}_i$ is the average income in group $i$, then the Theil index is

$$T_T = \sum_{i=1}^m s_i T_{T_i} + \sum_{i=1}^m s_i \ln{\frac{\overline{x}_i}{\overline{x}}}$$
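
The decomposition can be verified numerically by comparing the index of the whole population with the sum of the within-group and between-group parts. The incomes and the grouping below are hypothetical.

```python
import math

def theil_t(incomes):
    """Theil-T index of strictly positive incomes."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((x / mean) * math.log(x / mean) for x in incomes) / n

# Hypothetical incomes, split into two subgroups of different size
groups = [[10.0, 20.0, 30.0], [40.0, 100.0]]
everyone = [x for g in groups for x in g]
total = sum(everyone)
mean = total / len(everyone)

# s_i = sum(g) / total is the income share of group i
within = sum((sum(g) / total) * theil_t(g) for g in groups)
between = sum((sum(g) / total) * math.log((sum(g) / len(g)) / mean) for g in groups)

# The two printed values should agree up to rounding error.
print(theil_t(everyone), within + between)
```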


For the Theil-L index, with $m$ subgroups of equal size, the formula is:

$$T_L = \frac{1}{m} \sum_{i=1}^m T_{L_i} + \frac{1}{m} \sum_{i=1}^m \ln{\frac{\overline{x}}{\overline{x}_i}}$$


[Figure: Map of economic inequality in the United States using the Theil index. A high positive value indicates more income than population, while a negative value shows more population than income. A value of zero shows equality between population and income. Note: the map does not show the Theil index of each area of the United States, but each area's contribution to the US Theil index (the Theil index is never negative, whereas individual contributions to it may be negative or positive).]


If the aggregated groups have different numbers of members, these formulas apply:

$$T_T = \ln{\frac{A_\text{total}}{E_\text{total}}} - \frac{\sum_{i=1}^N E_i \left( \ln{\frac{A_i}{E_i}} - T_{T_i} \right)}{E_\text{total}}$$

$$T_L = \ln{\frac{E_\text{total}}{A_\text{total}}} - \frac{\sum_{i=1}^N A_i \left( \ln{\frac{E_i}{A_i}} - T_{L_i} \right)}{A_\text{total}}$$

$$T_s = \frac{1}{2} \sum_{i=1}^N \left[ \ln \frac{E_i}{A_i} \left( \frac{E_i}{E_\text{total}} - \frac{A_i}{A_\text{total}} \right) + \frac{E_i}{E_\text{total}} T_{T_i} + \frac{A_i}{A_\text{total}} T_{L_i} \right]$$
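
As a plausibility check, the grouped Theil-L formula can be compared with the index computed from the individual incomes. The sketch below uses hypothetical data and hypothetical function names; it assumes strictly positive incomes.

```python
import math

def theil_l(incomes):
    """Theil-L index of strictly positive individual incomes."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum(math.log(mean / x) for x in incomes) / n

def grouped_theil_l(E, A, within):
    """T_L from group incomes E_i, group sizes A_i and within-group T_L_i."""
    e_total, a_total = sum(E), sum(A)
    weighted = sum(a * (math.log(e / a) - t) for e, a, t in zip(E, A, within))
    return math.log(e_total / a_total) - weighted / a_total

groups = [[10.0, 20.0, 30.0], [40.0, 100.0]]   # groups of different size
E = [sum(g) for g in groups]                    # income per group
A = [len(g) for g in groups]                    # members per group
within = [theil_l(g) for g in groups]           # within-group indices
everyone = [x for g in groups for x in g]

# The two printed values should agree up to rounding error.
print(theil_l(everyone), grouped_theil_l(E, A, within))
```

Passing zeros for `within` yields only the between-group part, which is smaller than the full index whenever there is inequality inside the groups.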


Decomposability is a property of the Theil index that the more popular Gini coefficient does not offer. The Gini coefficient is more intuitive to many people since it is based on the Lorenz curve. However, it is not as easily decomposable as the Theil index.

Welfare Function

Amartya Sen proposed using the Gini index to compute a welfare function that would yield the per capita income earned by anyone who is randomly selected from a population within which the total income is distributed unequally:

$$W_\text{Gini} = \overline{\text{Income}} \cdot \left( 1 - G \right)$$
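
A small sketch of this welfare function follows. The text does not say how $G$ is computed; the code assumes one common sample definition (the mean absolute difference between all pairs of incomes, divided by twice the mean), and the function names and incomes are hypothetical.

```python
def gini(incomes):
    """Gini index as mean absolute pairwise difference over twice the mean.

    This is an assumed definition; the article does not specify one.
    """
    n = len(incomes)
    mean = sum(incomes) / n
    diff_sum = sum(abs(x - y) for x in incomes for y in incomes)
    return diff_sum / (2 * n * n * mean)

def welfare_gini(incomes):
    """Sen's welfare function: mean income times (1 - G)."""
    mean = sum(incomes) / len(incomes)
    return mean * (1 - gini(incomes))

incomes = [1000.0, 2000.0, 3000.0, 6000.0]  # hypothetical monthly incomes
print(gini(incomes), welfare_gini(incomes))  # G is 1/3, welfare is about 2000
```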


Later, James E. Foster, as co-author of the second edition of Amartya Sen's On Economic Inequality[12], proposed using one of the entropy inequality measures from Atkinson. Due to the relation between that measure and the Theil index, Foster's proposal can be implemented by this formula:

$$W_\text{Theil-L} = \overline{\text{Income}} \cdot e^{-T_L} = \frac{E_\text{total}}{A_\text{total}} \, e^{-T_L}$$


The same welfare function can be computed from the right-hand term of the Theil-L formula:

$$W_\text{Theil-L} = \exp\left( \frac{\sum_{i=1}^N A_i \left( \ln{\frac{E_i}{A_i}} - T_{L_i} \right)}{A_\text{total}} \right) = \prod_{i=1}^N \left( \frac{E_i}{A_i} \, e^{-T_{L_i}} \right)^{\frac{A_i}{A_\text{total}}}$$

(As the Theil index is decomposable, Theil indices can also be specified for the individual groups in this formula and in the following formulas. Usually, however, those indices are not known; in that case they are set to zero.)

For the welfare function, the Theil-L index is used. It yields a per capita income that is close to the lower end of middle-class incomes. The inverse value of a welfare function computed with the Theil-T index yields an income that is close to the upper end of middle-class incomes:

$$W^{-1}_\text{Theil-T} = \overline{\text{Income}} \cdot e^{T_T} = \frac{E_\text{total}}{A_\text{total}} \, e^{T_T} = \exp\left( \frac{\sum_{i=1}^N E_i \left( \ln{\frac{E_i}{A_i}} + T_{T_i} \right)}{E_\text{total}} \right) = \prod_{i=1}^N \left( \frac{E_i}{A_i} \, e^{T_{T_i}} \right)^{\frac{E_i}{E_\text{total}}}$$


Example: The average monthly per capita income before taxes in Germany (2001)[13] was 2800€. A welfare function with a Theil-L index of 0.578 yields 1570€ per month. Using a Theil-T index of 0.520, the inverse value of the monthly welfare function was 4700€. In comparison, collective wage agreements between the labor union and the employers of the electrical and metal industry in Bavaria cover the salary range between 1649€ and 4000€. This example does not use welfare functions to define the bounds of middle-class incomes. It just relates the welfare functions to real-world incomes.
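
The arithmetic of this example can be retraced in a few lines. The variable names are hypothetical; the three input figures are the ones quoted in the example.

```python
import math

mean_income = 2800.0  # EUR per month, from the example
theil_l = 0.578       # Theil-L index quoted in the example
theil_t = 0.520       # Theil-T index quoted in the example

welfare_l = mean_income * math.exp(-theil_l)        # W_Theil-L
inverse_welfare_t = mean_income * math.exp(theil_t)  # W^-1_Theil-T

print(round(welfare_l), round(inverse_welfare_t))
```

The results come out near 1571 and 4710, which the text rounds to 1570€ and 4700€.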

$W_\text{Theil-L}$ is one of several possible incomes that could be earned by a person who is randomly selected from a population with a certain distribution of incomes. Similar to the median, this welfare function marks the income that a randomly selected person is most likely to have. This income will be smaller than the average per capita income.

$W^{-1}_\text{Theil-T}$ is one of several possible incomes to which a Euro could belong that is randomly selected from the sum of all unequally distributed incomes. This welfare function marks the income to which a randomly selected Euro most likely belongs. This income will be larger than the average per capita income.

References

  1. Introduction to the Theil index from the University of Texas
  2. http://economicsbulletin.vanderbilt.edu/2008/volume15/EB-07O10036A.pdf
  3. ISO/IEC DIS 2382-16:1996 Information theory
  4. http://www.poorcity.richcity.org (Redundancy, Entropy and Inequality Measures)
  5. Juana Domínguez-Domínguez, José Javier Núñez-Velázquez: The Evolution of Economic Inequality in the EU Countries During the Nineties, 2005
  6. James E. Foster and Amartya Sen, 1996, On Economic Inequality, expanded edition with annexe, ISBN 0-19-828193-5
  7. UNU-WIDER: Database (WIID)
  8. The notation using E and A follows the notation of a short calculation published by Lionnel Maugis: Inequality Measures in Mathematical Programming for the Air Traffic Flow Management Problem with En-Route Capacities (for IFORS 96), 1996
  9. (1) The first part of the formula is the maximum entropy of the E-A-system. The second part (after the minus symbol) is the real entropy of the E-A-system at a certain time. Such a difference is called redundancy (ISO/IEC DIS 2382-16, information theory).
    (2) This version of Theil's formula allows quantiles with different widths $A_i$ to be processed. $N$ only serves as the upper limit of the summation.
    (3) Besides comparing this formula mathematically with the formulas found in the literature, you can compare the results 1A and 1B yielded by this formula with the examples 1A and 1B given in The Theoretical Basics of Popular Inequality Measures (Travis Hale, University of Texas Inequality Project, 2003).
  10. Elhanan Helpman: The Mystery of Economic Growth, 2004, ISBN 0-674-01572-X (See page 150 for a similar computation of $T_T$ and $T_L$ by two formulas.)
  11. Example: 82.4% of the people own 17.6% of all resources and 17.6% own 82.4% of all resources. For computation see also http://www.poorcity.richcity.org/calculator/?quantiles=82.4,17.6|17.6,82.4
  12. James E. Foster and Amartya Sen, 1996, On Economic Inequality, expanded edition with annexe, ISBN 0-19-828193-5
  13. Online Calculator: Distribution of incomes (before taxation) in Germany, 2001

Copyright © 2009. Knowledgehunter.