Data visualization

A scatterplot shows the relationship between x and y; a histogram shows the distribution of x alone, with frequency on the y axis.

In this case you could bin it into, say, bins 20 NK cells wide (e.g. 20-40). Within one bin you might have, say, 10 people, so rather than stack 10 dots you draw a bar that is 10 high.
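To make the binning idea concrete, here is a minimal sketch in plain Python. The NK cell values are made up for illustration, and the names `nk_counts`, `BIN_WIDTH` and `bin_counts` are hypothetical, not from the paper or any package.

```python
from collections import Counter

# Hypothetical NK cell counts, one per person (made-up values, illustration only).
nk_counts = [22, 25, 31, 38, 39, 41, 47, 55, 58, 63, 64, 71, 79, 85, 102]

BIN_WIDTH = 20  # each bin covers 20 units, e.g. 20 up to (but not including) 40


def bin_counts(values, width):
    """Return {bin_lower_edge: number of values in [edge, edge + width)}."""
    return Counter((v // width) * width for v in values)


counts = bin_counts(nk_counts, BIN_WIDTH)

# Text histogram: one '#' per person in the bin, so the bar length is the
# number of dots you would otherwise have stacked. Empty bins print no bar.
for edge in range(min(counts), max(counts) + BIN_WIDTH, BIN_WIDTH):
    print(f"{edge:>3}-{edge + BIN_WIDTH:<3} {'#' * counts[edge]}")
```

The bar for each bin is just the head count in that range, which is all a histogram is doing.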

Either way histogram > this funny jitter boxplot stuff, but biologists are not the best at presenting data, since their focus is biology, not data visualisation, so it's just one of those things.

You can see from their jitter plot, 250 is the 95th percentile for ME but 250 is the median / 50th percentile for HC. It's a big difference.
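For anyone wanting to check a statement like this against raw values, a rough sketch of the calculation follows. The two lists are invented numbers chosen only to show the arithmetic, not data from the study, and `percentile_rank` is a hypothetical helper.

```python
def percentile_rank(values, threshold):
    """Percentage of values at or below the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return 100 * sum(v <= threshold for v in values) / len(values)


# Made-up numbers, NOT taken from the paper; chosen only to show the calculation.
me_group = [90, 110, 120, 135, 140, 150, 160, 170, 175, 180,
            185, 190, 200, 205, 210, 220, 230, 240, 250, 310]
hc_group = [150, 180, 200, 230, 250, 260, 280, 300, 330, 360]

print(percentile_rank(me_group, 250))  # share of the ME group at or below 250
print(percentile_rank(hc_group, 250))  # share of the HC group at or below 250
```

The same threshold can sit near the top of one group and in the middle of the other, which is the comparison being read off the jitter plot.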

In the Fluge and Mella study, the 3 responders with the highest NK cell counts were above 250.
 
I think the main value of box plots (and/or jitter plots) is that it’s infinitely neater to add brackets for significance, especially when doing multiple group comparisons.
 
Can’t you add that to any visualisation that has multiple groups side by side, if you’re talking about brackets like these at the top?
IMG_0561.png
 
@jnmaciuch
Thoughts about this one?
IMG_0564.jpeg

From this paper.
Each “cloud” represents a half-violin plot showing the kernel density estimation of the scores, while the “rain” consists of individual data points jittered to reveal the underlying density of the patient population. The central boxplots indicate the median and interquartile ranges (IQR).
 
Sure, I tend to use violins often. I'm not sure the dots on the right-hand side are the best way to convey information about density, though.
I agree about the dots, I’d leave them out. But it shows that you can combine histogram-like plots with box plots, and annotations about p-values etc.
 
Yup, violin plots are definitely used for that reason, though often jitter plots convey sample density better than a violin, because violins get scaled to the width of their "lane" on the x axis. If it was important to convey that one group had way more samples than another, I might opt for jitter plots over this.
 
Good point about the scaling of the X axis. It seems like you can change it in some programs so the X axis has the same amount of units for all plots instead. That would make the width of the violin with a smaller sample size a lot smaller, so it would be more difficult to see the distribution. Maybe both could be used at the same time?
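A rough sketch of the two scaling choices, in plain Python so it doesn't depend on any plotting package. Everything here is an assumption for illustration: the data are made up, the bandwidth is picked by hand, and `gaussian_kde` and `violin_half_widths` are hypothetical helpers, not the API of any real library.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm
        for x in grid
    ]


def violin_half_widths(samples, grid, bandwidth, lane_half_width, count_share=1.0):
    """Half-width of the violin at each grid point.

    With count_share=1.0 the widest point fills the lane (lane scaling).
    Passing n / largest_n instead shrinks the violin in proportion to its
    sample size (count scaling).
    """
    density = gaussian_kde(samples, grid, bandwidth)
    peak = max(density)
    return [lane_half_width * count_share * d / peak for d in density]


# Made-up groups with very different sample sizes (illustration only).
large_group = [40 + 2 * i for i in range(50)]  # n = 50
small_group = [60, 75, 80, 90, 110]            # n = 5
grid = list(range(0, 201, 10))
BANDWIDTH = 15  # hand-picked smoothing width, an assumption
LANE = 0.4      # half the width of a violin's "lane" on the x axis

largest_n = max(len(large_group), len(small_group))
for name, group in [("large", large_group), ("small", small_group)]:
    lane_scaled = violin_half_widths(group, grid, BANDWIDTH, LANE)
    count_scaled = violin_half_widths(group, grid, BANDWIDTH, LANE,
                                      len(group) / largest_n)
    print(name, round(max(lane_scaled), 3), round(max(count_scaled), 3))
```

With lane scaling both violins reach the same maximum width, so the difference in sample size is invisible. With count scaling the small group's violin is about a tenth as wide, which shows the sample size but makes its shape hard to read.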
 
Yeah, potentially.

I think many biology grad students will be completely upfront that the reason they keep producing the same somewhat ugly jitter plots is just because they have zero coding knowledge, know how to use one plotting software with a UI, and that's the type of plot it produces.

But just as many grad students are perfectly familiar with all the plotting options in R or python and regularly spend hours fighting with packages to figure out the best way to present their data within several logistical constraints (I am guilty of perhaps wasting way too much time on this). It's not that we've never considered a violin plot or histogram, it's just that there tend to be several reasons why box plots with jitter are the best option in many cases.
 