Talk:Gini coefficient
From Wikipedia, the free encyclopedia
Figure Incorrect?
According to the map showing Gini coefficients for all countries, Greenland has a Gini coefficient below 0,25. However, Statistics Greenland gives these coefficients in its latest publication on income (based on 2004 data): income: 0,46; income after taxes: 0,44; disposable income (includes social benefits from the state): 0,41. For those of you who read Danish, see www.statgreen.gl for further info.
I have found data from the Census Bureau that conflicts with the Gini values in the diagram. The figure appears to be incorrect. For example, the US did not have a Gini lower than .4 after 1977.
I am pretty sure this map is wrong, dated, or both. Russia has at least 40%, according to the Wiki page on post-Soviet Russia, and Hungary is certainly no Greenland-like outlier in Central Europe! varbal 00:23, 12 September 2006 (UTC)
Hi, I have a question about calculating the Gini coefficient: at this site they say you can calculate it as A/(A+B), but my question is, how do you calculate A and B?
- How is your integral calculus? If you have curves available, you integrate the area under each curve. If no curves have been created yet, then you need to construct them. mydogategodshat 16:36, 11 May 2004 (UTC)
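- For income data given as a plain list, the integration can be done numerically. What follows is only a minimal sketch, assuming that B is the area under the Lorenz curve, that A + B = 1/2 (the area under the line of equality), and that the curve is piecewise linear between data points; the function names and the sample incomes are made up for illustration.

```python
def lorenz_points(incomes):
    """Return cumulative population shares xs and income shares ys, both starting at 0."""
    values = sorted(incomes)
    total = sum(values)
    n = len(values)
    xs, ys = [0.0], [0.0]
    running = 0.0
    for i, v in enumerate(values, start=1):
        running += v
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


def gini_from_lorenz(xs, ys):
    """Gini = A / (A + B), with B the area under the Lorenz curve and A + B = 1/2."""
    # Trapezoid rule for B: each strip has width (x1 - x0) and mean height (y0 + y1) / 2.
    b = sum((x1 - x0) * (y0 + y1) / 2
            for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))
    a = 0.5 - b
    return a / (a + b)


xs, ys = lorenz_points([1, 1, 3, 3])   # hypothetical incomes
print(gini_from_lorenz(xs, ys))
```

With everyone on the same income the Lorenz curve coincides with the diagonal, so A is 0 and the coefficient is 0.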
Yes, I think what you mean is this: suppose, for instance, that in a perfectly egalitarian society everyone has the same income. The curves are degenerate. The correct way to think of this is in terms of probability distribution functions, with the Gini coefficient as a measure of non-uniformity, such as Renyi entropy (but most certainly not Shannon entropy). I'll think about this. CSTAR 22:50, 17 May 2004 (UTC)
I added some information on Gini coefficients in the U.S. It'd be better to have it for other countries as well — does anyone have that sort of data? Factitious 16:00, Oct 13, 2004 (UTC)
- Yes, they are in the UN Human development report linked in the page - I will add some... - Marcika 22:29, 18 Nov 2004 (UTC)
Long tail
I suggest that the following sentence be removed:
"There is an implication built into the Gini coefficient that a straight-line distribution is a desirable outcome, which in the newly evolving long tail economics may not be the case."
First, I see no such implication. Second, "the newly evolving long tail economics" is far from achieving widespread recognition. Third, the comment is highly speculative. TomSlee 17:50, 26 Jun 2005 (UTC)
- I strongly agree: the quoted sentence should be removed from the article. One further point. There is a big problem about which raw data should be used for calculation. In particular, survey data on household expenditures yield much higher Gini coefficients than national accounting data. There are arguments for/against each choice. This should be mentioned, and then the choice underlying the data given in the article should be stated. --Mario 12:09, 16 July 2005 (UTC)
- I removed the "long tail" phrase. I read the long tail article and it gave no indication of how that idea relates to the Gini coefficient and wealth distribution. AdamRetchless 18:54, 5 August 2005 (UTC)
---
What are the advantages of using the Gini coefficient instead of the variance? I think this should be pointed out.
Moreover, I do not understand this sentence: "The small sample variance properties of G are not known, and large sample approximations to the variance of G are poor." Is this unclear, or is it only me?
---
Are you sure the formula is correct?
I think it should be (X_{k+1} - X_{k}) * Y_{k+1}
Y is said to be "cumulative" already, so I don't see why you would sum Y_k and Y_{k+1}. Alternatively, you could multiply by (Y_k + y_{k+1}), where Y_k is cumulative up to k and y_{k+1} is the exact value for sample k+1.
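- On the sum Y_k + Y_{k+1}: a sketch of where it comes from, assuming the article's formula is the trapezoid-rule approximation and that X_k, Y_k are cumulative shares with X_0 = Y_0 = 0 and X_n = Y_n = 1. Each strip under the Lorenz curve is treated as a trapezoid whose mean height is (Y_k + Y_{k+1})/2, and the factor 1/2 cancels against the 2 in G = 1 - 2B, where B is the area under the curve.

```latex
% B: area under the Lorenz curve, approximated strip by strip with trapezoids.
% X_k, Y_k: cumulative population and income shares, X_0 = Y_0 = 0, X_n = Y_n = 1.
\[
B \approx \sum_{k=0}^{n-1} (X_{k+1} - X_k)\,\frac{Y_k + Y_{k+1}}{2},
\qquad
G = 1 - 2B \approx 1 - \sum_{k=0}^{n-1} (X_{k+1} - X_k)\,(Y_k + Y_{k+1}).
\]
```

So the two cumulative values are not being double counted; they are the heights at the left and right edges of the strip. Using Y_{k+1} alone would replace each trapezoid by a rectangle of the right-edge height, which overestimates B whenever the curve is rising.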
---
A simple description is missing.
I can find no place in the article that gives the value of a perfectly "equal" distribution. A reader might think it is .45 or 0 or 1. A clarification should be in the overview.
- A clarification is in the overview. The third sentence of the article reads: "The Gini coefficient is a number between 0 and 1, where 0 corresponds with perfect equality (where everyone has the same income) and 1 corresponds with perfect inequality (where one person has all the income, and everyone else has zero income)." I think it cannot be made much clearer. -- Marcika 14:35, 27 July 2005 (UTC)
abused concept
The Gini is a much-abused concept, and this article doesn't reflect that.
Firstly, socialists use it to imply that unequal income distribution is bad, whereas this is absolutely not the case. In itself, the use of the phrase "perfect" distribution implies that a perfect straight line is some form of goal. A straight-line distribution is neither desirable nor necessarily achievable.
Inequality is the natural product of individual choices expressed as preferences for differentiated options. See Albert-László Barabási: Linked, the new science of networks. These preferential attachments, as Barabási calls them, are mathematically shown to result in power law distributions, which are distributions of vast inequality.
The underlying principle can be explained simply: small differences in a set of options yield vast differences in outcome.
The implication is very clear: inequality in society can be the natural result of fair and free trade, so the factors that alter inequality, whether positively or negatively, cannot be compared using the Gini.
The Gini coefficient tells you nothing about any particular society other than as a fairly meaningless comparative number.
Deus777
- What about large Gini coefficients, say more than 0.5, in comparison to other countries or the world average? Does that tell us anything about the country? It sounds rather unhealthy. --Vsion 02:06, 26 August 2005 (UTC)
- No, because you can't tell what the normal degree of inequality is or what factors changed it. You also have to consider the dynamic nature of a society: what is the rate of change over time? You can't compare two societies with vastly different tax systems, because they create inequality at different speeds, but eventually the higher-taxing system will arrive at the same place as the lower-taxing system. I think you can only meaningfully compare one set of data against itself over time, and only if you know the causes of the differences.
- Some distributions are extremely unequal, for example the distribution of market share among search engines, but this is a good thing because it means more people are using the better search engines.
- A single number doesn't really tell you where in its evolution a country is, why its inequality deviates from expectation, or even what its expectation is.
- The UK has a Gini index of 36. Uzbekistan has a Gini index of 26 and Papua New Guinea 51, yet the latter two are both very poor countries. You can't infer much from these numbers without looking at what is happening in each country. Vietnam has an index of 36.1, which is nearly identical to the UK's, yet you would expect the UK to have a vastly different level of inequality from Vietnam.
- Deus777
- If a country has a high Gini index, say more than .50, then I would say the country has an income distribution problem, leading to social instability and crime. Unless it is caused by natural disaster, the problem probably indicates an unfair distribution of the country's resources (fertile land, mineral resources, govt. revenue, restriction on internal migration, ethnic discrimination, etc). Of course, we can't summarize a whole society in a single number; we always need to examine further to understand better. --Vsion 03:35, 26 August 2005 (UTC)
- Perhaps you can point to a study that shows a correlation between the Gini number and crime or social instability. Denmark has more property crime than the US and as much as the UK, but has one of the lowest Gini coefficients in the world. The Gini doesn't measure opportunity, and restriction on opportunity is a better indicator of a society with a problem. Uzbekistan has a very low Gini yet has a lot of social instability; the same goes for most of the former Yugoslav republics. Rwanda is another example of a very low Gini with catastrophic social instability.
- A counterexample is Hong Kong, above .5 for a long time, but the only social unrest there is against China, not inequality.
- Inequality in itself is fairly benign if opportunity is present. In fact an excess of equality can be a sign of a society with serious problems. Deus777
Anyway, a low one is generally accepted today as better. Think, too, about envy in an unequal society.
- Granite26
- I think the issue is that according to this, a 'perfect' distribution has a doctor (8+ years of schooling) making the same income as a janitor (high school, maybe), and somebody working 50 hours a week making the same as somebody working 30.
VOTE!! - HDI in country infobox/template?
The Human Development Index (HDI) is a standard UN measure/rank of how developed a country is or is not. It is a composite index based on GDP per capita (PPP), literacy, life expectancy, and school enrollment. However, as it is a composite index/rank, some may challenge its usefulness or applicability as information.
Thus, the following question is put to a vote:
Should any, some, or all of the following be included in the Wikipedia country infobox/template:
- (1) Human Development Index (HDI) for applicable countries, with year;
- (2) Rank of country’s HDI;
- (3) Category of country’s HDI (high, medium, or low)?
YES / NO / UNDECIDED/ABSTAIN - vote here
Thanks!
E Pluribus Anthony 01:52, 20 September 2005 (UTC)
Effect of adding populations
User DL5MDA made some remarks on the effect of calculating the index separately for partial populations or for the whole together. They were incorrect. See for example the extreme case of two regions, each of which has perfect equality of income. However, in one region each person earns double the income of a person in the other region. Assume that equally many persons live in both regions. Now join the regions together. Everyone in the poor region will be on the left half of the curve, which reaches 1/3 of total income at the halfway point. The remaining 2/3 of income is in the right half of the curve. A simple calculation shows that the index will now be 1/6 (about 17%). So merging these two populations, each with index 0, yields an index of 17% for the whole. −Woodstone 12:14, 24 September 2005 (UTC)
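- The two-region case can be checked numerically. This is a small sketch, assuming the population form of the coefficient (half the relative mean absolute difference); the region sizes and income levels are hypothetical, and only the 1:2 income ratio and the equal sizes matter.

```python
def gini(incomes):
    """Gini coefficient as half the relative mean absolute difference."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)


poor = [1.0] * 50    # hypothetical region where everyone earns 1
rich = [2.0] * 50    # hypothetical region where everyone earns 2

print(gini(poor), gini(rich))   # each region on its own is perfectly equal
print(gini(poor + rich))        # the merged population
```

The merged value agrees with the Lorenz-curve argument: the curve has a single kink at (1/2, 1/3), and for a one-kink curve the coefficient is the horizontal coordinate minus the vertical one, 1/2 - 1/3 = 1/6.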
Disadvantages
I don't know what this quote means:
- The Gini coefficient is an often abused measure, ie it is often used to imply that one value is better or worse then another. This is not the case as other then the very extremes in most cases there is no way to decide if any number if better or worse then any other.
Any measurement can be "abused" -- is there something about the Gini that makes it more vulnerable to abuse than any other statistic? Afelton 17:51, 1 November 2005 (UTC)
- Actually, yes; it condenses the Lorenz curve into a single number that hides a great deal of information. Extremely different shapes of Lorenz curves can give the same Gini coefficient, and those who do not understand the Gini coefficient often assume that different countries with the same Gini coefficient have similar income distributions. This is just one among the many ways in which the Gini coefficient can be used in misleading ways. The Gini coefficient can be very useful, but it needs to be properly used, and it often is not. —Lowellian (reply) 12:14, 15 March 2006 (UTC)
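- A tiny illustration of that point, using two made-up four-person populations (the numbers are hypothetical, chosen only so that the arithmetic is easy to follow). Both give the same coefficient, but the bottom half holds a different share of income in each.

```python
def gini(incomes):
    """Gini coefficient as half the relative mean absolute difference."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)


def bottom_share(incomes, fraction):
    """Share of total income held by the poorest `fraction` of the population."""
    values = sorted(incomes)
    k = int(len(values) * fraction)
    return sum(values[:k]) / sum(values)


two_classes = [1, 1, 3, 3]   # hypothetical: half earn 1, half earn 3
one_outlier = [2, 2, 2, 6]   # hypothetical: three earn 2, one earns 6

for incomes in (two_classes, one_outlier):
    print(gini(incomes), bottom_share(incomes, 0.5))
```

The first Lorenz curve bends at the median and the second at the 75th percentile, yet the single number cannot tell them apart.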
People or households?
The definition at the beginning of the article is:
"...It is a number between 0 and 1, where 0 corresponds to perfect equality (e.g. everyone has the same income) and 1 corresponds to perfect inequality (e.g. one person has all the income, and everyone else has zero income)." (My bold).
The definition in The Economist's Essential Economics is:
"...It varies between zero, which indicates perfect equality, with every household earning exactly the same, to one, which implies absolute inequality, with a single household earning a country's entire income." (My bold).
Is there a difference between "people" and "households"? Which is correct? Tamino 08:04, 3 May 2006 (UTC)
- Both are incorrect, of course; the number will never reach one, even if there is a single person in a single household (though it will be very very close).
- My understanding is that, technically, the income to be used for calculating the Gini coefficient should be supplemented with an imputed income/loss of income due to other members of a household. I don't really know what's done in practice, but I wouldn't be surprised if a constant household size were assumed.
- RandomP 15:49, 1 July 2006 (UTC)
- People or households, which is correct? It depends; neither is right or wrong all the time. It depends on how and in what context you use it. The Gini coefficient is like any other descriptive statistic. You wouldn't ask generically: average income per household or average income per individual, which is correct? And you wouldn't hear a sports fan ask generically: which is correct, average points per game for a team or average points per game for an individual player? It depends on what you want to do. Just be careful about mixing apples and oranges.
- Up to but not including 1: RandomP is correct. That is addressed in the mean difference article, which is a little more technically detailed and precise than the Gini coefficient article. For example, the statement about being between 0 and 1 also depends on negative values not being allowed for the underlying measured values. -DCary 00:08, 3 July 2006 (UTC)
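- Both remarks are easy to check on small made-up populations. A sketch, assuming the population form of the coefficient (half the relative mean absolute difference): when one person out of n holds all the income the value is 1 - 1/n, so it approaches 1 without reaching it, and once negative values are allowed the value can exceed 1.

```python
def gini(incomes):
    """Gini coefficient as half the relative mean absolute difference."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)


# One person holds everything: compare with 1 - 1/n.
for n in (2, 10, 1000):
    incomes = [0.0] * (n - 1) + [1.0]
    print(n, gini(incomes), 1 - 1 / n)

# Hypothetical case with a negative value (for example a net loss).
print(gini([-2.0, 4.0]))
```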
Calculation
The supporting details for the Brown formula don't make sense. X_k is being used on the left side to denote a cumulated amount, while X_m is being used on the right side to denote a non-cumulated amount. Since m runs from 1 to k, this appears to be a circular or implicit definition of X_k, but it is not supposed to be. Likewise for Y_k.
It would be nice to explicitly list separate formulas or explain the application of formulas for:
- a numerical approximation to the true value. (This appears to be one of the uses of the Brown formula.)
- a population (applicable especially to small populations)
- a discrete probability function (the article on the Lorenz curve does not cover this case)
- a sample from a population.
DCary 21:19, 25 May 2006 (UTC)
- You are right: Xn and Yn are used in two conflicting ways. I removed the unnecessary and faulty formulae that were added at some point in time. −Woodstone 21:48, 25 May 2006 (UTC)
- And yes, it would be interesting to see the Gini coefficient of a normal distribution. Might take a while to find out. −Woodstone 21:51, 25 May 2006 (UTC)
- I calculated the Gini coefficient for a normal distribution with a mean of 1 and standard deviation of 1: G(N(1,1)) = 0.56418958. That means that for an arbitrary mean m and standard deviation s, G(N(m,s)) = 0.56418958 * s / m. Not tremendously difficult if you have some of the basic formulas. I'll work on adding them to the article. −DCary 02:39, 1 June 2006 (UTC)
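- The constant is 1/sqrt(pi). A sketch of the check, assuming the coefficient is defined as half the relative mean difference and using the standard result that the mean absolute difference of a normal distribution is 2s/sqrt(pi). Since a normal distribution takes negative values, this is a formal value, not one tied to a Lorenz curve. The function names, the seed and the sample size are arbitrary choices made for illustration.

```python
import math
import random


def gini_normal(mean, sd):
    """Closed form for N(mean, sd): sd / (mean * sqrt(pi))."""
    return sd / (mean * math.sqrt(math.pi))


def gini_sample(values):
    """Sample Gini via the sorted-rank form of the mean absolute difference."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * sum(xs))


print(gini_normal(1.0, 1.0))             # 1 / sqrt(pi)

rng = random.Random(0)                    # fixed seed, arbitrary choice
draws = [rng.gauss(1.0, 1.0) for _ in range(20000)]
print(gini_sample(draws))                 # simulated value, close to the closed form
```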
The statement about multiplying the Gini coefficient of a sample by n/(n-1) to get an unbiased estimator of the population value is wrong. It needs to be removed or qualified in some way. In fact, it appears not difficult to show that it is impossible in the general case to calculate from a sample an unbiased estimator for the population value. −DCary 22:33, 31 May 2006 (UTC)
The statement "large sample approximations to the variance of G are poor" needs some clarification. What is meant by "large sample approximations to the variance of G"? In what sense, by what measure are they poor? −DCary 22:33, 31 May 2006 (UTC)
I removed the Brown eponymy for the formula based on the trapezoid rule, for three reasons: it is a straightforward application of the trapezoid rule, which is a generic math formula; the only association I could find of a Brown with the formula was in (Brown, 1994); and the formula for approximating the Gini coefficient has published uses at least as early as (Morgan, 1962). If there is a good reason to name the formula after Brown, please explain.
Material that addresses some of the other issues in this section of discussion was put in the new article about the mean difference and relative mean difference. -DCary 16:19, 27 June 2006 (UTC)
"Note how this corresponds to"
"Note how this corresponds to the lowering of the highest tax bracket, for example, from 70% in the 1960s to 35% by 2000." I don't understand this sentence. It sits outside any other paragraph. What does "this" refer to?
"This" refers to the rise in Gini coefficient from 0.394 in 1970 to 0.469 in 2005.--Patchouli 23:17, 28 October 2006 (UTC)
Credit risk use
Thank you, Bluemoose, for pointing out that a citation is needed for the use of the Gini coefficient in credit risk modelling. For people working in that field it is one of the basic tools; at meetings we usually hear things like "the model has a Gini of 73.46 %, it is performing quite well". But a quick googling of "gini coefficient credit risk" gives you many citations. E.g. in [1] on page 14 you can find: "The K-S statistic and the Gini coefficient are common measures of a model’s ability to separate risk." Separate risk = discriminate between good and bad. And so on. It is really a very basic tool. --Ruziklan 10:49, 28 September 2006 (UTC)
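- For readers outside that field, a sketch of how such a figure is commonly obtained. It assumes the scoring Gini is defined as 2 * AUC - 1, where AUC is the probability that a randomly chosen good account scores higher than a randomly chosen bad one; whether the source cited above uses exactly this definition is an assumption. The scores and function names are invented for illustration.

```python
def auc(scores_good, scores_bad):
    """Probability that a random good account outscores a random bad one (ties count half)."""
    wins = 0.0
    for g in scores_good:
        for b in scores_bad:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(scores_good) * len(scores_bad))


def model_gini(scores_good, scores_bad):
    """Scoring-model Gini under the assumed definition 2 * AUC - 1."""
    return 2 * auc(scores_good, scores_bad) - 1


good = [700, 650, 720, 610]   # hypothetical scores of accounts that repaid
bad = [600, 640, 580]         # hypothetical scores of accounts that defaulted

print(auc(good, bad), model_gini(good, bad))
```

Under this definition a model that separates the two groups perfectly has a Gini of 1, and one whose scores carry no information has a Gini of 0.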
"optimal Gini coefficient": please defend.
The "optimal Gini-coefficient" section, at least, is in severe need of a rework: it appears to be based on a single empirical "study" which in turn appears to be concerned pretty much exclusively with claiming that the difference in economic development in the 20-year period between Sweden and Ireland is due to their different policies, and advocating the Irish policy over the Swedish one.
Common sense would suggest that the correlation demonstrated could just as easily be due to chance or, even more likely, due to a third effect not considered in the "study".
I'm not even sure the document can be used as a reliable source. It appears extremely dodgy to me. Note that the basic statement, that, all other things being equal, strengthening a redistribution system to lower the Gini coefficient below a certain value is likely to have overwhelming negative effects in some fairly simple (and working) economic models, is perfectly okay. The "Sweden would have a wonderful economy if only they raised their Gini coefficient" statement hinted at in the study and the section is quite ridiculous, though.
Statements like "Extreme egalitarianism leads to [...] corruption in the redistribution system" appear to me to warrant removal rather than qualification, simply for the way they misrepresent causality. Certainly you could criticise various redistribution systems for being amenable to corruption, and some of those might lead to extreme egalitarianism, too, but that's quite a different kettle of tea.
Again, the "study" used as a reference rings alarm bells on many fronts (typos (in addition to those spelling mistakes I assume were caused by overly direct transcriptions from the authors' first languages), the treatment of inflation, near-total lack of academic credentials for the authors). Unless a fairly good defence is coming up, I'm tempted to treat this as a vanity reference.
RandomP 00:10, 1 November 2006 (UTC)