



















Statistical Layers: Transforming Data Before Plotting
Every geom in ggplot2 applies a statistical transformation that processes your raw data before plotting. So far, you’ve mostly used geoms and let their default stats do the work behind the scenes. But sometimes the defaults are not what you need.
In this lesson, we’ll explore how to explicitly control statistical transformations in layers. Understanding the link between geoms and stats — and when to switch or customize them — helps you create clearer, more advanced visualizations.
🔭 Just a Matter of Perspective
So far, you’ve worked with a bunch of geom_*() functions, each one building a visual layer from some data, a set of aesthetic mappings, and a position specification.
But there’s one important argument we haven’t talked about yet.
Every geom also performs a statistical transformation, and the stat argument determines which one. Most of the time, the geom silently uses its default stat in the background. But that transformation is always there.
And, as usual, we can override the default if we want the geom_*() function to behave differently.
By default, geom_point() draws one point for each observation that has a value for class and hwy. Its default stat is "identity", so no transformation is applied.
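To make the difference concrete, here is a minimal sketch that draws the same mapping twice: once with the default stat and once with the stat overridden. It assumes the mpg dataset that ships with ggplot2, and the choice of "summary" with the mean is only an illustration, not the only way to override a stat.

```r
library(ggplot2)

# Default: stat = "identity", so every row of mpg becomes one point.
p_raw <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_point()

# Override: stat = "summary" collapses each class to a single value.
# The fun argument is passed through to the stat (ggplot2 3.3.0 or later).
p_mean <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_point(stat = "summary", fun = mean)

# layer_data() returns the data a layer actually draws.
nrow(layer_data(p_raw))   # one row per observation
nrow(layer_data(p_mean))  # one row per class
```

Printing p_raw and p_mean side by side shows the same geom drawing very different pictures, purely because the stat changed.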
Instead of changing the stat argument in geom_*() functions, you can also flip your perspective and start from the transformation itself.
That’s where stat_*() functions come in: they do the same work, but from the other direction.
Instead of focusing on how to draw the data:
“I want to plot bars.”
…you’re now focusing on how to transform the data before drawing:
“I want to show my data as counts.”
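As a rough sketch of that second perspective, the layer below starts from the transformation with stat_count() and only then decides how the counts are drawn. The mpg dataset and the choice of points as the alternative geom are assumptions made for illustration.

```r
library(ggplot2)

# Start from the transformation: count the rows in each class.
# The default geom for stat_count() is "bar".
p_bars <- ggplot(mpg, aes(x = class)) +
  stat_count()

# Same counts, drawn differently: the geom argument picks the shape.
p_points <- ggplot(mpg, aes(x = class)) +
  stat_count(geom = "point")
```

Both plots show identical numbers. Only the drawing changes, which is exactly what the geom argument of a stat_*() function controls.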
As you've already learned, geom_bar() calculates the number of observations for each group because its default stat is "count".
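If you want to see what the "count" stat hands to the geom, one option is to inspect the layer's computed data. The sketch below assumes the mpg dataset and uses layer_data() from ggplot2. The variable names p and bar_data are just placeholders.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class)) +
  geom_bar()   # equivalent to geom_bar(stat = "count")

# layer_data() returns the data frame the layer draws,
# including the count column computed by the stat.
bar_data <- layer_data(p)
bar_data[, c("x", "count")]

# The computed counts match a plain tabulation of the raw column.
table(mpg$class)
```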