Control parameters of different styles of DistributionChart


Using DistributionChart, you can choose among several styles, i.e. settings for ChartElementFunction:

ChartElementData["DistributionChart"]

{"Density", "DensityQuantile", "FadingQuantile",
 "GlassQuantile", "HistogramDensity", "LineDensity", "PointDensity", 
 "Quantile", "SmoothDensity"}

Is there a way to control their parameters? The documentation does not seem to say.


The problem is that the defaults are less than optimal for my data. Consider these:

Using SmoothDensity:

[image: SmoothDensity]

Using HistogramDensity:

[image: HistogramDensity]

Using LineDensity:

[image: LineDensity]

Clearly, SmoothDensity grossly misrepresents the data (just compare with the LineDensity version); the violins should all look like the ones for 2 and 9. HistogramDensity does a better job but its resolution is horrible; there are 100 data points per size, about 70 of which fall into the upper class -- that should be plenty of points to draw more bars.

I would like to tell HistogramDensity to use more/smaller bins, and/or SmoothDensity to smooth less. How is this possible?

Raphael

Posted 2012-09-25T19:19:10.473

Reputation: 509

We briefly tried writing our own ChartElementFunction, but that turned out to be very daunting indeed. – Raphael – 2012-09-25T20:17:21.170

Re: your comment on the blogpost, I left you a reply there, but am commenting here again because I don't know if you'll be pinged/emailed for that or not. See this question for ways to find undocumented options.

– rm -rf – 2012-10-16T16:12:42.937

Answers

To get the options available for the various ChartElementDataFunction styles, you can use:

 {#, Column[ChartElementData[#, "Options"]]} & /@ 
   ChartElementData["DistributionChart"] // Grid[#, Frame -> All] &
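If you only care about one style, the same query can be run for a single name. This is a small sketch built on the call used above; the variable names are mine, and since these suboptions are undocumented, the list you get back may differ between versions.

```wolfram
(* Options reported for one chart element style *)
opts = ChartElementData["HistogramDensity", "Options"];

(* Just the suboption names, i.e. the left-hand sides of the rules *)
optNames = First /@ opts
```

Whatever shows up in optNames can then be tried as a suboption inside ChartElementDataFunction["HistogramDensity", ...].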


For "HistogramDensity", any bin specification accepted by Histogram (see the "More Information" section of its documentation) can be used as the setting for the suboption "Bins":

data = Table[RandomVariate[NormalDistribution[RandomInteger[5], 1], 100], {3}];

Partition[
  Table[
   DistributionChart[data, ChartStyle -> "SolarColors",
    ChartElementFunction -> ChartElementDataFunction["HistogramDensity", "Bins" -> i],
    PlotLabel -> Row[{"\"Bins\"", "->", ToString@i}], ImageSize -> 200],
   {i, {10, 5, {.3}, {0, 8, .5}, {{0, 1, 2, 5, 6, 8}},
     Automatic, "Sturges", "Scott", "FreedmanDiaconis", "Knuth", "Wand", "Log",
     {"Log", "Sturges"}, {"Log", "Scott"}, {"Log", "FreedmanDiaconis"}, {"Log", "Knuth"}}}],
  4] // Grid[#, Frame -> All, Spacings -> 5] &

... including custom bin specifications like

binFunc1 = Union[IntegerPart[#]] &;
binFunc2 = Quantile[#, {0, .05, .1, .25, .5, .75, .9, .95, 1.}] &;
binFunc3 = First[HistogramList[#, "FreedmanDiaconis"]] &;
binFunc4 = Sort@#[[RandomSample[Range@Length@#, 10]]] &;

Partition[Table[DistributionChart[data, ChartStyle -> "Rainbow",
 ChartElementFunction -> (ChartElementDataFunction[
   "HistogramDensity", "Bins" -> i]),
 PlotLabel -> Row[{"\"Bins\"", "->\n", ToString@i}], 
 ImageSize -> 300],
 {i, {binFunc1, binFunc2, binFunc3, binFunc4}}], 2] //
 Grid[#, Frame -> All, Spacings -> 5] &
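A bin function is applied to the data and should return a list of bin boundaries, so it can be checked on its own before it goes into a chart. The sketch below uses a small made-up sample (not the data above) and passes the resulting boundaries to HistogramList as an explicit bin list.

```wolfram
(* Made-up sample, for illustration only *)
sample = {0.2, 1.1, 1.7, 2.4, 2.5, 3.9, 4.0, 5.6};

(* Same definitions as binFunc1 and binFunc2 above *)
binFunc1 = Union[IntegerPart[#]] &;
binFunc2 = Quantile[#, {0, .05, .1, .25, .5, .75, .9, .95, 1.}] &;

(* Boundaries each function would hand to the histogram *)
bins1 = binFunc1[sample]  (* {0, 1, 2, 3, 4, 5} *)
bins2 = binFunc2[sample]

(* Bins and counts obtained with the first set of boundaries *)
HistogramList[sample, {bins1}]
```

Note that with bins1 the value 5.6 lies above the last boundary, so it is not counted in any bin; a bin function that is meant to cover all the data has to include the extremes.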


For "Quantile", "FadingQuantile", "GlassQuantile" and "DensityQuantile", the setting for the suboption "Quantile" can be either an integer n (short for the n-1 quantiles 100 i/n, i = 1, ..., n-1) or an explicit list of integers between 0 and 100. Each explicitly specified quantile can also be styled individually using the suboption "QuantileStyle".

Partition[Table[DistributionChart[data, 
ChartElementFunction -> (ChartElementDataFunction["GlassQuantile",
   "Quantile" -> i,
   "QuantileStyle" -> (Directive[Thick, Hue[#/100]] & /@ i),
   "QuantileShading" -> True]),
PlotLabel -> Row[{"Quantiles:  ", ToString@i}], ImageSize -> 300],
{i, {4, {25, 50, 75}, {10, 90}, {5, 10, 25, 50, 75, 90, 95}}}], 2] //
Grid[#, Frame -> All, Spacings -> 5] &
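To make the integer shorthand concrete, here is the expansion described above written out as a helper. quantileLevels is a hypothetical name of mine, not a built-in, and it only encodes my reading of how an integer setting is interpreted.

```wolfram
(* Hypothetical helper: the n-1 percentage levels that an integer
   "Quantile" setting n is assumed to stand for *)
quantileLevels[n_Integer?Positive] := Table[100 i/n, {i, 1, n - 1}]

quantileLevels[4]  (* {25, 50, 75} *)
quantileLevels[2]  (* {50} *)
```

So under this reading, "Quantile" -> 4 and "Quantile" -> {25, 50, 75} should mark the same levels, which matches the first two settings in the table above.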


The "Threshold" setting seems to control symmetric trimming at the two tails, as the following examples suggest. (Further fishing may reveal that it accepts additional values that control the bandwidth.)

Row@Table[DistributionChart[data, 
 ChartElementFunction -> (ChartElementDataFunction["SmoothDensity", 
  "ColorScheme" -> "DeepSeaColors", "Threshold" -> i]), 
 ImageSize -> 300], {i, {.05, .1, .5}}]

Row@Table[DistributionChart[data, ChartStyle -> "SolarColors", 
 ChartElementFunction -> (ChartElementDataFunction["Density", "Threshold" -> i]),
 ImageSize -> 300], {i, {.05, .1, .5}}]
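If none of the suboptions gives enough control over the smoothing, one possible workaround is a hand-written element function that builds the kernel density estimate itself with an explicit bandwidth. This is only a rough sketch, not something taken from the built-in styles: kdeViolin is a hypothetical name, the bandwidth values are arbitrary, and it assumes the element function is called with the box region {{xmin, xmax}, {ymin, ymax}}, the list of data values, and metadata, as documented for ChartElementFunction.

```wolfram
(* Hypothetical helper, not a built-in: draws one violin from a kernel
   density estimate with an explicit bandwidth bw. *)
kdeViolin[bw_][{{xmin_, xmax_}, {ymin_, ymax_}}, values_, ___] :=
 Module[{dist, ys, dens, half, xmid},
  dist = SmoothKernelDistribution[values, bw];
  (* 101 evenly spaced points across the region's vertical extent *)
  ys = Table[ymin + k (ymax - ymin)/100., {k, 0, 100}];
  dens = PDF[dist, #] & /@ ys;
  xmid = (xmin + xmax)/2;
  (* scale so the widest point of the violin fills the slot *)
  half = (xmax - xmin)/2*dens/Max[dens];
  Polygon[Join[
    Transpose[{xmid - half, ys}],
    Reverse[Transpose[{xmid + half, ys}]]]]]

(* Example data, generated the same way as in the answer *)
data = Table[RandomVariate[NormalDistribution[RandomInteger[5], 1], 100], {3}];

(* Smaller bandwidths follow the data more closely; 0.2 is arbitrary *)
DistributionChart[data, ChartElementFunction -> kdeViolin[0.2]]
```

Each violin is normalized to the same maximum width, so widths are not comparable across datasets, and the outline is cut off at the edges of the region instead of tapering to zero.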

kglr

Posted 2012-09-25T19:19:10.473

Reputation: 302 076

I was about to post a "Please teach me how to catch the fish!" comment, but you were faster. Awesome, thanks a lot! – Raphael – 2012-09-26T00:32:04.183

Unfortunately, there does not seem to be a parameter suitable for making the (smooth) charts match the data better; Threshold only controls the fraction of data used at all. (By the way, how do you find out which values are valid for e.g. Shape?) – Raphael – 2012-09-26T00:43:07.373

@Raphael, as far as I know there is no organized documentation yet on these "Options". Some possible settings for a few of these options are in the Chart Element Schemes palette as drop-down choices. – kglr – 2012-09-26T00:59:43.673