Wednesday, November 17, 2021

A simple R/Shiny app to illustrate two properties of means and medians

Recently I saw two interesting discussions on Twitter that had to do with means and medians and the relationship between the two. Both were in Dutch. The first one was between Casper Albers (@CaAl) and Zihni Özdil (@ZihniOzdil) about student loans (see here). The bottom line is that Casper Albers reminded us that a positive random variable can never have a median that is more than twice the mean.
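For readers who want to see why the first property holds, here is a short sketch based on Markov's inequality. The symbols are my own notation, not taken from the Twitter thread: X is the positive random variable, μ its mean, and m a candidate median.

```latex
\[
P(X \ge m) \;\le\; \frac{\mu}{m} \;<\; \frac{\mu}{2\mu} \;=\; \frac{1}{2}
\qquad \text{whenever } m > 2\mu .
\]
```

A median must satisfy P(X ≥ m) ≥ 1/2, so no value larger than 2μ can be a median.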

The second discussion was between Joël De Ceulaer (@jdceulaer), Karel Anthonissen (@KAnthonissen), Youssef Kobo (@Youssef_Kobo) and Koen Fillet (@filletk) about young people buying houses and the support they get from their parents (see here). That discussion also involved means and medians. I mentioned that if a distribution has a finite variance, the absolute difference between the mean and the median is at most one standard deviation (see here).
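One common way to see the second property is the chain of inequalities below. Again the notation is mine: μ is the mean, m a median and σ the standard deviation of X.

```latex
\[
|\mu - m| \;=\; |E(X - m)| \;\le\; E|X - m| \;\le\; E|X - \mu|
\;\le\; \sqrt{E\,(X - \mu)^2} \;=\; \sigma .
\]
```

The second inequality uses the fact that a median minimises the expected absolute deviation E|X − c| over all constants c; the last one is Jensen's (or the Cauchy-Schwarz) inequality.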

These two properties are not well known and are surprising to some. I will admit that I only learned about them many years after I had left university, and even then rather by accident.

To illustrate this I made a small R/Shiny app that allows you to flexibly specify a distribution. The app will then generate some data according to that distribution, calculate the mean and the median, and plot the results to show that the properties hold.

To specify the distribution I use a mixture of two 5-parameter beta distributions. The first component of the mixture specifies the bulk of the distribution. Optionally, one can use the second component to specify outliers that lie much further away from the first component.
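The idea can be tried outside the app with a few lines of R. This is only a minimal sketch and not the app's actual code: the helper names `r_scaled_beta` and `r_mixture`, the parameter values, and the parametrisation (a beta distribution rescaled to an interval, plus a mixing probability) are all my assumptions for illustration.

```r
# Sketch only: the parametrisation below is an assumption, not the app's code.
set.seed(42)

# Draw n values from a beta distribution rescaled to the interval [lower, upper].
r_scaled_beta <- function(n, shape1, shape2, lower, upper) {
  lower + (upper - lower) * rbeta(n, shape1, shape2)
}

# Draw n values from a two-component mixture: each observation comes from
# the outlier component with probability p_outlier, otherwise from the bulk.
r_mixture <- function(n, p_outlier) {
  is_outlier <- rbinom(n, size = 1, prob = p_outlier) == 1
  x <- r_scaled_beta(n, shape1 = 2, shape2 = 5, lower = 0, upper = 10)
  x[is_outlier] <- r_scaled_beta(sum(is_outlier), shape1 = 2, shape2 = 2,
                                 lower = 80, upper = 100)
  x
}

x   <- r_mixture(10000, p_outlier = 0.05)
m   <- mean(x)
med <- median(x)
s   <- sd(x)

# Property 1 (positive data): the median is at most twice the mean.
print(med <= 2 * m)

# Property 2: mean and median differ by at most one standard deviation.
print(abs(m - med) <= s)
```

Both checks should hold whatever shapes, bounds and outlier probability you choose, as long as the data stay positive for the first one; changing `p_outlier` or the bounds of the outlier component is a quick way to see how far the mean can be pulled away from the median.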

You can access the R/Shiny app here. Enjoy!