Title: | Create Elegant Data Visualisations Using the Grammar of Graphics |
---|---|
Description: | A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details. |
Authors: | Hadley Wickham [aut] , Winston Chang [aut] , Lionel Henry [aut], Thomas Lin Pedersen [aut, cre] , Kohske Takahashi [aut], Claus Wilke [aut] , Kara Woo [aut] , Hiroaki Yutani [aut] , Dewey Dunnington [aut] , Teun van den Brand [aut] , Posit, PBC [cph, fnd] |
Maintainer: | Thomas Lin Pedersen <[email protected]> |
License: | MIT + file LICENSE |
Version: | 3.5.1.9000 |
Built: | 2024-09-16 09:21:05 UTC |
Source: | https://github.com/tidyverse/ggplot2 |
+
is the key to constructing sophisticated ggplot2 graphics. It
allows you to start simple, then get more and more complex, checking your
work at each step.
## S3 method for class 'gg'
e1 + e2

e1 %+% e2
e1 |
|
e2 |
A plot component, as described below. |
You can add any of the following types of objects:
An aes()
object replaces the default aesthetics.
A layer created by a geom_
or stat_
function adds a
new layer.
A scale
overrides the existing scale.
A theme()
modifies the current theme.
A coord
overrides the current coordinate system.
A facet
specification overrides the current faceting.
To replace the current default data frame, you must use %+%
,
due to S3 method precedence issues.
You can also supply a list, in which case each element of the list will be added in turn.
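As a minimal sketch of the component types listed above, each `+` call below adds or overrides one part of the plot. The `mpg` dataset ships with ggplot2; the particular scale, theme setting, coord and facet chosen here are illustrative assumptions, not choices prescribed by the text.

```r
library(ggplot2)

# Start simple: data plus default aesthetics.
p <- ggplot(mpg, aes(displ, hwy))

p <- p + geom_point()                            # adds a new layer
p <- p + aes(colour = drv)                       # updates the default aesthetics
p <- p + scale_colour_brewer(palette = "Set1")   # overrides the colour scale
p <- p + theme(legend.position = "bottom")       # modifies the current theme
p <- p + coord_cartesian(ylim = c(10, 45))       # overrides the coordinate system
p <- p + facet_wrap(vars(cyl))                   # overrides the faceting

# A list of components is added element by element.
extras <- list(geom_smooth(method = "lm", se = FALSE), labs(x = "Displacement"))
p2 <- p + extras
```

Printing `p` after each step is a convenient way to check the work incrementally.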
base <-
  ggplot(mpg, aes(displ, hwy)) +
  geom_point()
base + geom_smooth()

# To override the data, you must use %+%
base %+% subset(mpg, fl == "p")

# Alternatively, you can add multiple components with a list.
# This can be useful to return from a function.
base + list(subset(mpg, fl == "p"), geom_smooth())
Aesthetic mappings describe how variables in the data are mapped to visual
properties (aesthetics) of geoms. Aesthetic mappings can be set in
ggplot()
and in individual layers.
aes(x, y, ...)
x , y , ...
|
< |
This function also standardises aesthetic names by converting color
to colour
(also in substrings, e.g., point_color
to point_colour
) and translating old style
R names to ggplot names (e.g., pch
to shape
and cex
to size
).
A list with class uneval
. Components of the list are either
quosures or constants.
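The name standardisation described above can be checked by inspecting the names of the object that `aes()` returns. This is a small sketch; the column names `cyl`, `gear` and `wt` are only placeholders, since `aes()` quotes its inputs and does not need the data to exist yet.

```r
library(ggplot2)

# Base-R style and American-spelling names are translated on input.
m <- aes(color = cyl, pch = gear, cex = wt)
names(m)

# The conversion also applies to substrings of a name.
sub_m <- aes(point_color = cyl)
names(sub_m)

# Each component is a quosure (or a constant) waiting to be evaluated.
m$colour
```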
aes()
is a quoting function. This means that
its inputs are quoted to be evaluated in the context of the
data. This makes it easy to work with variables from the data frame
because you can name those directly. The flip side is that you have
to use quasiquotation to program with
aes()
. See a tidy evaluation tutorial such as the dplyr programming vignette
to learn more about these techniques.
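One common programming case is a wrapper that receives column names as strings. A hedged sketch of one way to handle it uses the `.data` pronoun from tidy evaluation; the helper name `scatter_by_name` is hypothetical and not part of ggplot2.

```r
library(ggplot2)

# Hypothetical helper: column names arrive as character strings.
scatter_by_name <- function(data, x_col, y_col) {
  ggplot(data, aes(.data[[x_col]], .data[[y_col]])) +
    geom_point()
}

p <- scatter_by_name(mtcars, "disp", "drat")
p
```

Subsetting `.data` with `[[` looks the column up in the layer data at evaluation time, so the strings can come from function arguments or a configuration file.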
vars()
for another quoting function designed for
faceting specifications.
Run vignette("ggplot2-specs")
to see an overview of other aesthetics
that can be modified.
Delayed evaluation for working with computed variables.
Other aesthetics documentation:
aes_colour_fill_alpha
,
aes_group_order
,
aes_linetype_size_shape
,
aes_position
aes(x = mpg, y = wt)
aes(mpg, wt)

# You can also map aesthetics to functions of variables
aes(x = mpg ^ 2, y = wt / cyl)

# Or to constants
aes(x = 1, colour = "smooth")

# Aesthetic names are automatically standardised
aes(col = x)
aes(fg = x)
aes(color = x)
aes(colour = x)

# aes() is passed to either ggplot() or specific layer. Aesthetics supplied
# to ggplot() are used as defaults for every layer.
ggplot(mpg, aes(displ, hwy)) + geom_point()
ggplot(mpg) + geom_point(aes(displ, hwy))

# Tidy evaluation ----------------------------------------------------
# aes() automatically quotes all its arguments, so you need to use tidy
# evaluation to create wrappers around ggplot2 pipelines. The
# simplest case occurs when your wrapper takes dots:
scatter_by <- function(data, ...) {
  ggplot(data) + geom_point(aes(...))
}
scatter_by(mtcars, disp, drat)

# If your wrapper has a more specific interface with named arguments,
# you need the "embrace operator":
scatter_by <- function(data, x, y) {
  ggplot(data) + geom_point(aes({{ x }}, {{ y }}))
}
scatter_by(mtcars, disp, drat)

# Note that users of your wrapper can use their own functions in the
# quoted expressions and all will resolve as it should!
cut3 <- function(x) cut_number(x, 3)
scatter_by(mtcars, cut3(disp), drat)
These aesthetic parameters change the colour (colour
and fill
) and the
opacity (alpha
) of geom elements on a plot. Almost every geom has either
colour or fill (or both), and can also have its alpha modified.
Modifying colour on a plot is a useful way to enhance the presentation of data,
especially when a plot graphs more than two variables.
The colour
aesthetic is used to draw lines and strokes, such as in
geom_point()
and geom_line()
, but also the line contours of
geom_rect()
and geom_polygon()
. The fill
aesthetic is used to
colour the inside areas of geoms, such as geom_rect()
and
geom_polygon()
, but also the insides of shapes 21-25 of geom_point()
.
Colours and fills can be specified in the following ways:
A name, e.g., "red"
. R has 657 built-in named colours, which can be
listed with grDevices::colors()
.
An rgb specification, with a string of the form "#RRGGBB"
where each of the
pairs RR
, GG
, BB
consists of two hexadecimal digits giving a value in the
range 00
to FF
. You can optionally make the colour transparent by using the
form "#RRGGBBAA"
.
An NA
, for a completely transparent colour.
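The three kinds of specification can be compared with base R's `grDevices::col2rgb()`, which decomposes a colour into its channels. This is a sketch using only base R; the specific hex values are arbitrary examples.

```r
# Named colours built into R.
n_named <- length(grDevices::colors())
head(grDevices::colors())

# Decompose a name, an "#RRGGBB" string and an "#RRGGBBAA" string
# into red, green, blue and alpha channels (each 0 to 255).
specs <- c("red", "#00ABFF", "#00ABFF80")
channels <- grDevices::col2rgb(specs, alpha = TRUE)
channels

# NA is treated as fully transparent: its alpha channel is 0.
na_channels <- grDevices::col2rgb(NA, alpha = TRUE)
na_channels
```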
Alpha refers to the opacity of a geom. Values of alpha
range from 0 to 1,
with lower values corresponding to more transparent colors.
Alpha can additionally be modified through the colour
or fill
aesthetic
if either aesthetic provides color values using an rgb specification
("#RRGGBBAA"
), where AA
refers to transparency values.
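The two routes to transparency can be sketched side by side: setting `alpha` directly, or folding the opacity into the colour string. The value 0.2 is chosen here only because it corresponds exactly to the hexadecimal pair `33` (51 out of 255).

```r
library(ggplot2)

# alpha() (re-exported by ggplot2 from the scales package) appends the AA pair.
faded_red <- alpha("red", 0.2)
faded_red

# Opacity set through the alpha aesthetic ...
p_alpha <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = "red", alpha = 0.2)

# ... or carried inside the colour itself as "#RRGGBBAA".
p_hex <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = "#FF000033")
```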
Other options for modifying colour:
scale_colour_brewer()
,
scale_colour_gradient()
, scale_colour_grey()
,
scale_colour_hue()
, scale_colour_identity()
,
scale_colour_manual()
, scale_colour_viridis_d()
Other options for modifying fill:
scale_fill_brewer()
,
scale_fill_gradient()
, scale_fill_grey()
,
scale_fill_hue()
, scale_fill_identity()
,
scale_fill_manual()
, scale_fill_viridis_d()
Other options for modifying alpha:
scale_alpha()
, scale_alpha_manual()
, scale_alpha_identity()
Run vignette("ggplot2-specs")
to see an overview of other aesthetics that
can be modified.
Other aesthetics documentation:
aes()
,
aes_group_order
,
aes_linetype_size_shape
,
aes_position

# Bar chart example
p <- ggplot(mtcars, aes(factor(cyl)))
# Default plotting
p + geom_bar()
# To change the interior colouring use fill aesthetic
p + geom_bar(fill = "red")
# Compare with the colour aesthetic which changes just the bar outline
p + geom_bar(colour = "red")
# Combining both, you can see the changes more clearly
p + geom_bar(fill = "white", colour = "red")
# Both colour and fill can take an rgb specification.
p + geom_bar(fill = "#00abff")
# Use NA for a completely transparent colour.
p + geom_bar(fill = NA, colour = "#00abff")

# Colouring scales differ depending on whether a discrete or
# continuous variable is being mapped. For example, when mapping
# fill to a factor variable, a discrete colour scale is used.
ggplot(mtcars, aes(factor(cyl), fill = factor(vs))) + geom_bar()

# When mapping fill to continuous variable a continuous colour
# scale is used.
ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_raster(aes(fill = density))

# Some geoms only use the colour aesthetic but not the fill
# aesthetic (e.g. geom_point() or geom_line()).
p <- ggplot(economics, aes(x = date, y = unemploy))
p + geom_line()
p + geom_line(colour = "green")
p + geom_point()
p + geom_point(colour = "red")

# For large datasets with overplotting the alpha
# aesthetic will make the points more transparent.
set.seed(1)
df <- data.frame(x = rnorm(5000), y = rnorm(5000))
p <- ggplot(df, aes(x, y))
p + geom_point()
p + geom_point(alpha = 0.5)
p + geom_point(alpha = 1/10)

# Alpha can also be used to add shading.
p <- ggplot(economics, aes(x = date, y = unemploy)) + geom_line()
p
yrng <- range(economics$unemploy)
p <- p +
  geom_rect(
    aes(NULL, NULL, xmin = start, xmax = end, fill = party),
    ymin = yrng[1], ymax = yrng[2],
    data = presidential
  )
p
p + scale_fill_manual(values = alpha(c("blue", "red"), .3))
Most aesthetics are mapped from variables found in the data. Sometimes, however, you want to delay the mapping until later in the rendering process. ggplot2 has three stages of the data that you can map aesthetics from, and three functions to control at which stage aesthetics should be evaluated.
after_stat()
replaces the old approaches of using either stat()
, e.g.
stat(density)
, or surrounding the variable names with ..
, e.g.
..density..
.
# These functions can be used inside the `aes()` function
# used as the `mapping` argument in layers, for example:
# geom_density(mapping = aes(y = after_stat(scaled)))

after_stat(x)

after_scale(x)

from_theme(x)

stage(start = NULL, after_stat = NULL, after_scale = NULL)
x |
< |
start |
< |
after_stat |
< |
after_scale |
< |
Below follows an overview of the three stages of evaluation and how aesthetic evaluation can be controlled.
The default is to map at the beginning, using the layer data provided by the user. If you want to map directly from the layer data you should not do anything special. This is the only stage where the original layer data can be accessed.
# 'x' and 'y' are mapped directly
ggplot(mtcars) + geom_point(aes(x = mpg, y = disp))
The second stage is after the data has been transformed by the layer
stat. The most common example of mapping from stat transformed data is the
height of bars in geom_histogram()
: the height does not come from a
variable in the underlying data, but is instead mapped to the count
computed by stat_bin()
. In order to map from stat transformed data you
should use the after_stat()
function to flag that evaluation of the
aesthetic mapping should be postponed until after stat transformation.
Evaluation after stat transformation will have access to the variables
calculated by the stat, not the original mapped values. The 'computed
variables' section in each stat lists which variables are available to
access.
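To see which computed variables a stat actually produces, one option is to build the plot and inspect the layer data with `layer_data()`. This is a sketch; `bins = 30` is set explicitly only to silence the default-binwidth message.

```r
library(ggplot2)

p <- ggplot(faithful, aes(x = waiting)) +
  geom_histogram(bins = 30)

# layer_data() returns the layer's data after the stat has been applied.
computed <- layer_data(p)
names(computed)

# By default the bar heights (y) are the per-bin counts from stat_bin(),
# and the counts add up to the number of observations.
heights_are_counts <- all(computed$y == computed$count)
total_matches <- sum(computed$count) == nrow(faithful)
```

Any of the columns listed by `names(computed)` that the stat documents as computed variables, such as `density` or `ncount`, can be referenced inside `after_stat()`.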
# The 'y' values for the histogram are computed by the stat
ggplot(faithful, aes(x = waiting)) +
  geom_histogram()

# Choosing a different computed variable to display, matching up the
# histogram with the density plot
ggplot(faithful, aes(x = waiting)) +
  geom_histogram(aes(y = after_stat(density))) +
  geom_density()
The third and last stage is after the data has been transformed and
mapped by the plot scales. An example of mapping from scaled data could
be to use a desaturated version of the stroke colour for fill. You should
use after_scale()
to flag evaluation of mapping for after data has been
scaled. Evaluation after scaling will only have access to the final
aesthetics of the layer (including non-mapped, default aesthetics).
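The effect of `after_scale()` can be verified on the built plot: the fill column should equal the scaled stroke colour with the requested opacity applied. The sketch below assumes `layer_data()` reflects the final aesthetics, which is how it behaves in released ggplot2 versions that provide `after_scale()`.

```r
library(ggplot2)

p <- ggplot(mpg, aes(cty, colour = factor(cyl))) +
  geom_density(aes(fill = after_scale(alpha(colour, 0.3))))

built <- layer_data(p)

# One row per group: the scaled colour and the fill derived from it.
unique(built[c("colour", "fill")])

# The fill is the final colour made 30% opaque.
fill_follows_colour <- all(built$fill == alpha(built$colour, 0.3))
```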
# The exact colour is known after scale transformation
ggplot(mpg, aes(cty, colour = factor(cyl))) +
  geom_density()

# We re-use colour properties for the fill without a separate fill scale
ggplot(mpg, aes(cty, colour = factor(cyl))) +
  geom_density(aes(fill = after_scale(alpha(colour, 0.3))))
If you want to map the same aesthetic multiple times, e.g. map x
to a
data column for the stat, but remap it for the geom, you can use the
stage()
function to collect multiple mappings.
# Use stage to modify the scaled fill
ggplot(mpg, aes(class, hwy)) +
  geom_boxplot(aes(fill = stage(class, after_scale = alpha(fill, 0.4))))

# Using data for computing summary, but placing label elsewhere.
# Also, we're making our own computed variable to use for the label.
ggplot(mpg, aes(class, displ)) +
  geom_violin() +
  stat_summary(
    aes(
      y = stage(displ, after_stat = 8),
      label = after_stat(paste(mean, "±", sd))
    ),
    geom = "text",
    fun.data = ~ round(data.frame(mean = mean(.x), sd = sd(.x)), 2)
  )
The from_theme()
function can be used to access the element_geom()
fields of the theme(geom)
argument. Using aes(colour = from_theme(ink))
and aes(colour = from_theme(accent))
allows swapping between foreground and
accent colours.
# Default histogram display
ggplot(mpg, aes(displ)) +
  geom_histogram(aes(y = after_stat(count)))

# Scale tallest bin to 1
ggplot(mpg, aes(displ)) +
  geom_histogram(aes(y = after_stat(count / max(count))))

# Use a transparent version of colour for fill
ggplot(mpg, aes(class, hwy)) +
  geom_boxplot(aes(colour = class, fill = after_scale(alpha(colour, 0.4))))

# Use stage to modify the scaled fill
ggplot(mpg, aes(class, hwy)) +
  geom_boxplot(aes(fill = stage(class, after_scale = alpha(fill, 0.4))))

# Making a proportional stacked density plot
ggplot(mpg, aes(cty)) +
  geom_density(
    aes(
      colour = factor(cyl),
      fill = after_scale(alpha(colour, 0.3)),
      y = after_stat(count / sum(n[!duplicated(group)]))
    ),
    position = "stack", bw = 1
  ) +
  geom_density(bw = 1)

# Imitating a ridgeline plot
ggplot(mpg, aes(cty, colour = factor(cyl))) +
  geom_ribbon(
    stat = "density", outline.type = "upper",
    aes(
      fill = after_scale(alpha(colour, 0.3)),
      ymin = after_stat(group),
      ymax = after_stat(group + ndensity)
    )
  )

# Labelling a bar plot
ggplot(mpg, aes(class)) +
  geom_bar() +
  geom_text(
    aes(
      y = after_stat(count + 2),
      label = after_stat(count)
    ),
    stat = "count"
  )

# Labelling the upper hinge of a boxplot,
# inspired by June Choe
ggplot(mpg, aes(displ, class)) +
  geom_boxplot(outlier.shape = NA) +
  geom_text(
    aes(
      label = after_stat(xmax),
      x = stage(displ, after_stat = xmax)
    ),
    stat = "boxplot", hjust = -0.5
  )
The group
aesthetic is by default set to the interaction of all discrete variables
in the plot. This choice often partitions the data correctly, but when it does not,
or when no discrete variable is used in the plot, you will need to explicitly define the
grouping structure by mapping group
to a variable that has a different value
for each group.
For most applications the grouping is set implicitly by mapping one or more
discrete variables to x
, y
, colour
, fill
, alpha
, shape
, size
,
and/or linetype
. This is demonstrated in the examples below.
There are three common cases where the default does not display the data correctly.
geom_line()
where there are multiple individuals and the plot tries to
connect every observation, even across individuals, with a line.
geom_line()
where a discrete x-position implies groups, whereas observations
span the discrete x-positions.
When the grouping needs to be different over different layers, for example when computing a statistic on all observations when another layer shows individuals.
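The first case can be reproduced without any extra package. The sketch below uses a made-up data frame (`toy`, invented purely for illustration) and inspects the `group` column of the built layer to show what the default and the explicit grouping look like.

```r
library(ggplot2)

# Made-up longitudinal data: two individuals, three visits each.
toy <- data.frame(
  id    = rep(c("a", "b"), each = 3),
  visit = rep(1:3, times = 2),
  value = c(1, 2, 3, 2, 4, 6)
)

# No discrete variable is mapped, so every row falls into one group
# and a single line connects all six observations.
p_default <- ggplot(toy, aes(visit, value)) + geom_line()

# Mapping group to the individual gives one line per individual.
p_grouped <- ggplot(toy, aes(visit, value)) + geom_line(aes(group = id))

default_groups <- unique(layer_data(p_default)$group)
explicit_groups <- unique(layer_data(p_grouped)$group)
```

In the built data, `-1` marks the absence of a grouping, while an explicit mapping produces one integer per group.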
The examples below use a longitudinal dataset, Oxboys
, from the nlme package to demonstrate
these cases. Oxboys
records the heights (height) and centered ages (age) of 26 boys (Subject),
measured on nine occasions (Occasion).
Geoms commonly used with groups: geom_bar()
, geom_histogram()
, geom_line()
Run vignette("ggplot2-specs")
to see an overview of other aesthetics that
can be modified.
Other aesthetics documentation:
aes()
,
aes_colour_fill_alpha
,
aes_linetype_size_shape
,
aes_position

p <- ggplot(mtcars, aes(wt, mpg))
# A basic scatter plot
p + geom_point(size = 4)
# Using the colour aesthetic
p + geom_point(aes(colour = factor(cyl)), size = 4)
# Using the shape aesthetic
p + geom_point(aes(shape = factor(cyl)), size = 4)

# Using fill
p <- ggplot(mtcars, aes(factor(cyl)))
p + geom_bar()
p + geom_bar(aes(fill = factor(cyl)))
p + geom_bar(aes(fill = factor(vs)))

# Using linetypes
ggplot(economics_long, aes(date, value01)) +
  geom_line(aes(linetype = variable))

# Multiple groups with one aesthetic
p <- ggplot(nlme::Oxboys, aes(age, height))
# The default is not sufficient here. A single line tries to connect all
# the observations.
p + geom_line()
# To fix this, use the group aesthetic to map a different line for each
# subject.
p + geom_line(aes(group = Subject))

# Different groups on different layers
p <- p + geom_line(aes(group = Subject))
# Using the group aesthetic with both geom_line() and geom_smooth()
# groups the data the same way for both layers
p + geom_smooth(aes(group = Subject), method = "lm", se = FALSE)
# Changing the group aesthetic for the smoother layer
# fits a single line of best fit across all boys.
# Line thickness is set with linewidth; size is deprecated for lines.
p + geom_smooth(aes(group = 1), linewidth = 2, method = "lm", se = FALSE)

# Overriding the default grouping
# Sometimes the plot has a discrete scale but you want to draw lines
# that connect across groups. This is the strategy used in interaction
# plots, profile plots, and parallel coordinate plots, among others.
# For example, we draw boxplots of height at each measurement occasion.
p <- ggplot(nlme::Oxboys, aes(Occasion, height)) + geom_boxplot()
p
# There is no need to specify the group aesthetic here; the default grouping
# works because Occasion is a discrete variable. To overlay individual
# trajectories, we again need to override the default grouping for that layer
# with aes(group = Subject)
p + geom_line(aes(group = Subject), colour = "blue")
The linetype
, linewidth
, size
, and shape
aesthetics modify the
appearance of lines and/or points. They also apply to the outlines of
polygons (linetype
and linewidth
) or to text (size
).
The linetype
aesthetic can be specified with either an integer (0-6), a
name (0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash,
6 = twodash), a mapping to a discrete variable, or a string of an even number
(up to eight) of hexadecimal digits which give the lengths in consecutive
positions in the string. See examples for a hex string demonstration.
The linewidth
aesthetic sets the widths of lines, and can be specified
with a numeric value (for historical reasons, these units are about 0.75
millimetres). Alternatively, they can also be set via mapping to a continuous
variable. The stroke
aesthetic serves the same role for points, but is
distinct for discriminating points from lines in geoms such as
geom_pointrange()
.
The size
aesthetic controls the size of points and text, and can be
specified with a numerical value (in millimetres) or via a mapping to a
continuous variable.
The shape
aesthetic controls the symbols of points, and can be specified
with an integer (between 0 and 25), a single character (which uses that
character as the plotting symbol), a .
to draw the smallest rectangle that
is visible (i.e., about one pixel), an NA
to draw nothing, or a mapping to
a discrete variable. Symbols and filled shapes are described in the examples
below.
geom_line()
and geom_point()
for geoms commonly used
with these aesthetics.
aes_group_order()
for using linetype
, size
, or
shape
for grouping.
Scales that can be used to modify these aesthetics: scale_linetype()
,
scale_linewidth()
, scale_size()
, and scale_shape()
.
Run vignette("ggplot2-specs")
to see an overview of other aesthetics that
can be modified.
Other aesthetics documentation:
aes()
,
aes_colour_fill_alpha
,
aes_group_order
,
aes_position

df <- data.frame(x = 1:10, y = 1:10)
p <- ggplot(df, aes(x, y))
p + geom_line(linetype = 2)
p + geom_line(linetype = "dotdash")

# An example with hex strings; the string "33" specifies three units on followed
# by three off and "3313" specifies three units on followed by three off followed
# by one on and finally three off.
p + geom_line(linetype = "3313")

# Mapping line type from a grouping variable
ggplot(economics_long, aes(date, value01)) +
  geom_line(aes(linetype = variable))

# Linewidth examples
ggplot(economics, aes(date, unemploy)) +
  geom_line(linewidth = 2, lineend = "round")
ggplot(economics, aes(date, unemploy)) +
  geom_line(aes(linewidth = uempmed), lineend = "round")

# Size examples
p <- ggplot(mtcars, aes(wt, mpg))
p + geom_point(size = 4)
p + geom_point(aes(size = qsec))
# Points take size; the reference line takes linewidth
p + geom_point(size = 2.5) +
  geom_hline(yintercept = 25, linewidth = 3.5)

# Shape examples
p + geom_point()
p + geom_point(shape = 5)
p + geom_point(shape = "k", size = 3)
p + geom_point(shape = ".")
p + geom_point(shape = NA)
p + geom_point(aes(shape = factor(cyl)))

# A look at all 25 symbols
df2 <- data.frame(x = 1:5, y = 1:25, z = 1:25)
p <- ggplot(df2, aes(x, y))
p + geom_point(aes(shape = z), size = 4) +
  scale_shape_identity()
# While all symbols have a foreground colour, symbols 21-25 also take a
# background colour (fill)
p + geom_point(aes(shape = z), size = 4, colour = "Red") +
  scale_shape_identity()
p + geom_point(aes(shape = z), size = 4, colour = "Red", fill = "Black") +
  scale_shape_identity()
The following aesthetics can be used to specify the position of elements:
x
, y
, xmin
, xmax
, ymin
, ymax
, xend
, yend
.
x
and y
define the locations of points or of positions along a line
or path.
x
, y
and xend
, yend
define the starting and ending points of
segment and curve geometries.
xmin
, xmax
, ymin
and ymax
can be used to specify the position of
annotations and to represent rectangular areas.
In addition, there are position aesthetics that are contextual to the
geometry that they're used in. These are xintercept
, yintercept
,
xmin_final
, ymin_final
, xmax_final
, ymax_final
, xlower
, lower
,
xmiddle
, middle
, xupper
, upper
, x0
and y0
. Many of these are used
and automatically computed in geom_boxplot()
.
width
and height
The position aesthetics mentioned above like x
and y
are all location
based. The width
and height
aesthetics are closely related length
based aesthetics, but are not position aesthetics. Consequently, x
and y
aesthetics respond to scale transformations, whereas the length based
width
and height
aesthetics are not transformed by scales. For example,
if we have the pair x = 10, width = 2
, that gets translated to the
locations xmin = 9, xmax = 11
when using the default identity scales.
However, the same pair becomes xmin = 1, xmax = 100
when using log10 scales,
as width = 2
in log10-space spans a 100-fold change.
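The arithmetic behind this example can be written out in a few lines of base R. This is a worked sketch of the calculation only, not of ggplot2's internal code: the width is split evenly around the position in the scale's transformed space, then mapped back to data units.

```r
x <- 10
width <- 2

# Identity scale: the width is applied directly around x.
identity_range <- c(xmin = x - width / 2, xmax = x + width / 2)
identity_range

# log10 scale: the width is applied around log10(x) = 1,
# giving 0 and 2 in log10-space, then back-transformed with 10^.
log10_range <- 10^(log10(x) + c(xmin = -1, xmax = 1) * width / 2)
log10_range
```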
Geoms that commonly use these aesthetics: geom_crossbar()
,
geom_curve()
, geom_errorbar()
, geom_line()
, geom_linerange()
,
geom_path()
, geom_point()
, geom_pointrange()
, geom_rect()
,
geom_segment()
Scales that can be used to modify positions:
scale_continuous()
,
scale_discrete()
,
scale_binned()
,
scale_date()
.
See also annotate()
for placing annotations.
Other aesthetics documentation:
aes()
,
aes_colour_fill_alpha
,
aes_group_order
,
aes_linetype_size_shape
# Generate data: means and standard errors of means for prices # for each type of cut dmod <- lm(price ~ cut, data = diamonds) cut <- unique(diamonds$cut) cuts_df <- data.frame( cut, predict(dmod, data.frame(cut), se = TRUE)[c("fit", "se.fit")] ) ggplot(cuts_df) + aes( x = cut, y = fit, ymin = fit - se.fit, ymax = fit + se.fit, colour = cut ) + geom_pointrange() # Using annotate p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() p p + annotate( "rect", xmin = 2, xmax = 3.5, ymin = 2, ymax = 25, fill = "dark grey", alpha = .5 ) # Geom_segment examples p + geom_segment( aes(x = 2, y = 15, xend = 2, yend = 25), arrow = arrow(length = unit(0.5, "cm")) ) p + geom_segment( aes(x = 2, y = 15, xend = 3, yend = 15), arrow = arrow(length = unit(0.5, "cm")) ) p + geom_segment( aes(x = 5, y = 30, xend = 3.5, yend = 25), arrow = arrow(length = unit(0.5, "cm")) ) # You can also use geom_segment() to recreate plot(type = "h") # from base R: set.seed(1) counts <- as.data.frame(table(x = rpois(100, 5))) counts$x <- as.numeric(as.character(counts$x)) with(counts, plot(x, Freq, type = "h", lwd = 10)) ggplot(counts, aes(x = x, y = Freq)) + geom_segment(aes(yend = 0, xend = x), size = 10)
This function adds geoms to a plot, but unlike a typical geom function, the properties of the geoms are not mapped from variables of a data frame, but are instead passed in as vectors. This is useful for adding small annotations (such as text labels) or if you have your data in vectors, and for some reason don't want to put them in a data frame.
annotate( geom, x = NULL, y = NULL, xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL, xend = NULL, yend = NULL, ..., na.rm = FALSE )
geom |
name of geom to use for annotation |
x , y , xmin , ymin , xmax , ymax , xend , yend
|
positioning aesthetics - you must specify at least one of these. |
... |
Other arguments passed on to
|
na.rm |
If |
Note that all position aesthetics are scaled (i.e. they will expand the limits of the plot so they are visible), but all other aesthetics are set. This means that layers created with this function will never affect the legend.
Due to their special nature, reference line geoms geom_abline(), geom_hline(), and geom_vline() can't be used with annotate(). You can use these geoms directly for annotations.
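As a minimal sketch of using the reference line geoms directly, the positions below are set as constants rather than mapped from data. The specific intercepts and slope are arbitrary values chosen for the mtcars data.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Reference lines are added as ordinary layers with fixed parameters,
# instead of going through annotate().
p_lines <- p +
  geom_hline(yintercept = 20, linetype = "dashed") +
  geom_vline(xintercept = 3.5, colour = "red") +
  geom_abline(intercept = 37, slope = -5, colour = "blue")

p_lines
```

Because the intercepts are set rather than mapped inside aes(), these lines should not add legend entries.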
The custom annotations section of the online ggplot2 book.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p + annotate("text", x = 4, y = 25, label = "Some text")
p + annotate("text", x = 2:5, y = 25, label = "Some text")
p + annotate("rect", xmin = 3, xmax = 4.2, ymin = 12, ymax = 21,
  alpha = .2)
p + annotate("segment", x = 2.5, xend = 4, y = 15, yend = 25,
  colour = "blue")
p + annotate("pointrange", x = 3.5, y = 20, ymin = 12, ymax = 28,
  colour = "red", size = 2.5, linewidth = 1.5)

p + annotate("text", x = 2:3, y = 20:21, label = c("my label", "label 2"))

p + annotate("text", x = 4, y = 25, label = "italic(R) ^ 2 == 0.75",
  parse = TRUE)
p + annotate("text", x = 4, y = 25,
  label = "paste(italic(R) ^ 2, \" = .75\")", parse = TRUE)
This is a special geom intended for use as static annotations that are the same in every panel. These annotations will not affect scales (i.e. the x and y axes will not grow to cover the range of the grob, and the grob will not be modified by any ggplot settings or mappings).
annotation_custom(grob, xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf)
grob |
grob to display |
xmin , xmax
|
x location (in data coordinates) giving horizontal location of raster |
ymin , ymax
|
y location (in data coordinates) giving vertical location of raster |
Most useful for adding tables, inset plots, and other grid-based decorations.
annotation_custom() expects the grob to fill the entire viewport defined by xmin, xmax, ymin, ymax. Grobs with a different (absolute) size will be center-justified in that region. Inf values can be used to fill the full plot panel (see examples).
# Dummy plot
df <- data.frame(x = 1:10, y = 1:10)
base <- ggplot(df, aes(x, y)) +
  geom_blank() +
  theme_bw()

# Full panel annotation
base + annotation_custom(
  grob = grid::roundrectGrob(),
  xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf
)

# Inset plot
df2 <- data.frame(x = 1 , y = 1)
g <- ggplotGrob(ggplot(df2, aes(x, y)) +
  geom_point() +
  theme(plot.background = element_rect(colour = "black")))
base +
  annotation_custom(grob = g, xmin = 1, xmax = 10, ymin = 8, ymax = 10)
This function is superseded by using guide_axis_logticks().
This annotation adds log tick marks with diminishing spacing. These tick marks probably make sense only for base 10.
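A minimal sketch of the superseding approach, assuming ggplot2 3.5.0 or later (where guide_axis_logticks() is available). Here the log ticks are drawn by the axis guide rather than by an annotation layer; the dataset and aesthetics mirror the examples for this function.

```r
library(ggplot2)

# Log-log plot where the axis guides draw the log tick marks.
p <- ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point(na.rm = TRUE) +
  scale_x_log10(guide = guide_axis_logticks()) +
  scale_y_log10(guide = guide_axis_logticks()) +
  theme_bw()

p
```

Since the ticks belong to the axis, they sit outside the panel by default and no clipping adjustment should be needed.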
annotation_logticks( base = 10, sides = "bl", outside = FALSE, scaled = TRUE, short = unit(0.1, "cm"), mid = unit(0.2, "cm"), long = unit(0.3, "cm"), colour = "black", linewidth = 0.5, linetype = 1, alpha = 1, color = NULL, ..., size = deprecated() )
base |
the base of the log (default 10) |
sides |
a string that controls which sides of the plot the log ticks appear on.
It can be set to a string containing any of |
outside |
logical that controls whether to move the log ticks outside
of the plot area. Default is off ( |
scaled |
is the data already log-scaled? This should be |
short |
a |
mid |
a |
long |
a |
colour |
Colour of the tick marks. |
linewidth |
Thickness of tick marks, in mm. |
linetype |
Linetype of tick marks ( |
alpha |
The transparency of the tick marks. |
color |
An alias for |
... |
Other parameters passed on to the layer |
size |
scale_y_continuous(), scale_y_log10() for log scale transformations.
coord_trans() for log coordinate transformations.
# Make a log-log plot (without log ticks)
a <- ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point(na.rm = TRUE) +
  scale_x_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  ) +
  scale_y_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  ) +
  theme_bw()

a + annotation_logticks()                # Default: log ticks on bottom and left
a + annotation_logticks(sides = "lr")    # Log ticks for y, on left and right
a + annotation_logticks(sides = "trbl")  # All four sides

a + annotation_logticks(sides = "lr", outside = TRUE) +
  coord_cartesian(clip = "off")  # Ticks outside plot

# Hide the minor grid lines because they don't align with the ticks
a + annotation_logticks(sides = "trbl") + theme(panel.grid.minor = element_blank())

# Another way to get the same results as 'a' above: log-transform the data before
# plotting it. Also hide the minor grid lines.
b <- ggplot(msleep, aes(log10(bodywt), log10(brainwt))) +
  geom_point(na.rm = TRUE) +
  scale_x_continuous(name = "body", labels = scales::label_math(10^.x)) +
  scale_y_continuous(name = "brain", labels = scales::label_math(10^.x)) +
  theme_bw() +
  theme(panel.grid.minor = element_blank())

b + annotation_logticks()

# Using a coordinate transform requires scaled = FALSE
t <- ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point() +
  coord_trans(x = "log10", y = "log10") +
  theme_bw()
t + annotation_logticks(scaled = FALSE)

# Change the length of the ticks
a + annotation_logticks(
  short = unit(.5,"mm"),
  mid = unit(3,"mm"),
  long = unit(4,"mm")
)
Display a fixed map on a plot. This function predates the geom_sf() framework and does not work with sf geometry columns as input. However, it can be used in conjunction with geom_sf() layers and/or coord_sf() (see examples).
annotation_map(map, ...)
map |
Data frame representing a map. See |
... |
Other arguments used to modify visual parameters, such as
|
## Not run:
if (requireNamespace("maps", quietly = TRUE)) {
# location of cities in North Carolina
df <- data.frame(
  name = c("Charlotte", "Raleigh", "Greensboro"),
  lat = c(35.227, 35.772, 36.073),
  long = c(-80.843, -78.639, -79.792)
)

p <- ggplot(df, aes(x = long, y = lat)) +
  annotation_map(
    map_data("state"),
    fill = "antiquewhite", colour = "darkgrey"
  ) +
  geom_point(color = "blue") +
  geom_text(
    aes(label = name),
    hjust = 1.105, vjust = 1.05, color = "blue"
  )

# use without coord_sf() is possible but not recommended
p + xlim(-84, -76) + ylim(34, 37.2)

if (requireNamespace("sf", quietly = TRUE)) {
# use with coord_sf() for appropriate projection
p +
  coord_sf(
    crs = sf::st_crs(3347),
    default_crs = sf::st_crs(4326),  # data is provided as long-lat
    xlim = c(-84, -76),
    ylim = c(34, 37.2)
  )

# you can mix annotation_map() and geom_sf()
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
p +
  geom_sf(
    data = nc, inherit.aes = FALSE,
    fill = NA, color = "black", linewidth = 0.1
  ) +
  coord_sf(crs = sf::st_crs(3347), default_crs = sf::st_crs(4326))
}}
## End(Not run)
This is a special version of geom_raster() optimised for static annotations that are the same in every panel. These annotations will not affect scales (i.e. the x and y axes will not grow to cover the range of the raster, and the raster must already have its own colours). This is useful for adding bitmap images.
annotation_raster(raster, xmin, xmax, ymin, ymax, interpolate = FALSE)
raster |
raster object to display, may be an |
xmin , xmax
|
x location (in data coordinates) giving horizontal location of raster |
ymin , ymax
|
y location (in data coordinates) giving vertical location of raster |
interpolate |
If |
# Generate data
rainbow <- matrix(hcl(seq(0, 360, length.out = 50 * 50), 80, 70), nrow = 50)
ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  annotation_raster(rainbow, 15, 20, 3, 4)

# To fill up whole plot
ggplot(mtcars, aes(mpg, wt)) +
  annotation_raster(rainbow, -Inf, Inf, -Inf, Inf) +
  geom_point()

rainbow2 <- matrix(hcl(seq(0, 360, length.out = 10), 80, 70), nrow = 1)
ggplot(mtcars, aes(mpg, wt)) +
  annotation_raster(rainbow2, -Inf, Inf, -Inf, Inf) +
  geom_point()

rainbow2 <- matrix(hcl(seq(0, 360, length.out = 10), 80, 70), nrow = 1)
ggplot(mtcars, aes(mpg, wt)) +
  annotation_raster(rainbow2, -Inf, Inf, -Inf, Inf, interpolate = TRUE) +
  geom_point()
autolayer() uses ggplot2 to draw a particular layer for an object of a particular class in a single command. This defines the S3 generic that other classes and packages can extend.
autolayer(object, ...)
object |
an object, whose class will determine the behaviour of autolayer |
... |
other arguments passed to specific methods |
a ggplot layer
Other plotting automation topics: automatic_plotting, autoplot(), fortify()
There are three functions to make plotting particular data types easier: autoplot(), autolayer() and fortify(). These are S3 generics for which other packages can write methods to display classes of data. The three functions are complementary and allow different levels of customisation. Below we'll explore implementing this series of methods to automate plotting of some class.
Let's suppose we are writing a package that has a class called 'my_heatmap' that wraps a matrix, and we'd like users to be able to plot this heatmap easily.
my_heatmap <- function(...) {
  m <- matrix(...)
  class(m) <- c("my_heatmap", class(m))
  m
}

my_data <- my_heatmap(volcano)
One of the things we have to do is ensure that the data is shaped in the long format so that it is compatible with ggplot2. This is the job of the fortify() function. Because 'my_heatmap' wraps a matrix, we can let the fortify method 'melt' the matrix to a long format. If your data is already based on a long-format <data.frame>, you can skip implementing a fortify() method.
fortify.my_heatmap <- function(model, ...) {
  data.frame(
    row = as.vector(row(model)),
    col = as.vector(col(model)),
    value = as.vector(model)
  )
}

fortify(my_data)
When you have implemented the fortify() method, it should be easier to construct a plot with the data than with the matrix.
ggplot(my_data, aes(x = col, y = row, fill = value)) + geom_raster()
A next step in automating plotting of your data type is to write an autolayer() method. These are typically wrappers around geoms or stats that automatically set aesthetics or other parameters. If you haven't implemented a fortify() method for your data type, you might have to reshape the data in autolayer().
If you require multiple layers to display your data type, you can use an autolayer() method that constructs a list of layers, which can be added to a plot.
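The multi-layer case can be sketched as follows. This is a hypothetical variant of the 'my_heatmap' method, not the version used elsewhere on this page: the object is built directly with structure(), the data is reshaped explicitly with fortify(), and the contour layer is an illustrative choice of second layer.

```r
library(ggplot2)

# Hypothetical 'my_heatmap' object wrapping the volcano matrix.
my_data <- structure(volcano, class = c("my_heatmap", class(volcano)))

# Melt the matrix into long format (same idea as the fortify method above).
fortify.my_heatmap <- function(model, ...) {
  data.frame(
    row   = as.vector(row(model)),
    col   = as.vector(col(model)),
    value = as.vector(model)
  )
}

# Return a list of layers; adding the list to a plot adds each layer in turn.
autolayer.my_heatmap <- function(object, ...) {
  df <- fortify(object)
  list(
    geom_raster(
      aes(x = col, y = row, fill = value),
      data = df, inherit.aes = FALSE
    ),
    geom_contour(
      aes(x = col, y = row, z = value),
      data = df, colour = "white", inherit.aes = FALSE
    )
  )
}

p <- ggplot() + autolayer(my_data)
p
```

Returning a plain list keeps the method composable: users can still add their own scales or themes after the call.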
autolayer.my_heatmap <- function(object, ...) {
  geom_raster(
    mapping = aes(x = col, y = row, fill = value),
    data = object,
    ...,
    inherit.aes = FALSE
  )
}

ggplot() + autolayer(my_data)
As a quick tip: if you define a mapping in autolayer(), you might want to set inherit.aes = FALSE to not have aesthetics set in other layers interfere with your layer.
The last step in automating plotting is to write an autoplot() method for your data type. The expectation is that these return a complete plot. In the example below, we're exploiting the autolayer() method that we have already written to make a complete plot.
autoplot.my_heatmap <- function(object, ..., option = "magma") {
  # Use the object passed to the method, not the global my_data
  ggplot() +
    autolayer(object) +
    scale_fill_viridis_c(option = option) +
    theme_void()
}

autoplot(my_data)
If you don't want to implement a base R plotting method, you can set the plot method for your class to the autoplot method.
plot.my_heatmap <- autoplot.my_heatmap
plot(my_data)
Other plotting automation topics: autolayer(), autoplot(), fortify()
autoplot() uses ggplot2 to draw a particular plot for an object of a particular class in a single command. This defines the S3 generic that other classes and packages can extend.
autoplot(object, ...)
object |
an object, whose class will determine the behaviour of autoplot |
... |
other arguments passed to specific methods |
a ggplot object
Other plotting automation topics: autolayer(), automatic_plotting, fortify()
This is a quick and dirty way to get map data (from the maps package) onto your plot. This is a good place to start if you need some crude reference lines, but you'll typically want something more sophisticated for communication graphics.
borders( database = "world", regions = ".", fill = NA, colour = "grey50", xlim = NULL, ylim = NULL, ... )
database |
map data, see |
regions |
map region |
fill |
fill colour |
colour |
border colour |
xlim , ylim
|
latitudinal and longitudinal ranges for extracting map
polygons, see |
... |
Arguments passed on to
|
if (require("maps")) {

ia <- map_data("county", "iowa")
mid_range <- function(x) mean(range(x))
seats <- do.call(rbind, lapply(split(ia, ia$subregion), function(d) {
  data.frame(lat = mid_range(d$lat), long = mid_range(d$long), subregion = unique(d$subregion))
}))

ggplot(ia, aes(long, lat)) +
  geom_polygon(aes(group = group), fill = NA, colour = "grey60") +
  geom_text(aes(label = subregion), data = seats, size = 2, angle = 45)
}

if (require("maps")) {
data(us.cities)
capitals <- subset(us.cities, capital == 2)
ggplot(capitals, aes(long, lat)) +
  borders("state") +
  geom_point(aes(size = pop)) +
  scale_size_area() +
  coord_quickmap()
}

if (require("maps")) {
# Same map, with some world context
ggplot(capitals, aes(long, lat)) +
  borders("world", xlim = c(-130, -60), ylim = c(20, 50)) +
  geom_point(aes(size = pop)) +
  scale_size_area() +
  coord_quickmap()
}
The Cartesian coordinate system is the most familiar, and common, type of coordinate system. Setting limits on the coordinate system will zoom the plot (like you're looking at it with a magnifying glass), and will not change the underlying data like setting limits on a scale will.
coord_cartesian( xlim = NULL, ylim = NULL, expand = TRUE, default = FALSE, clip = "on" )
xlim , ylim
|
Limits for the x and y axes. |
expand |
If |
default |
Is this the default coordinate system? If |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
# There are two ways of zooming the plot display: with scales or
# with coordinate systems. They work in two rather different ways.

p <- ggplot(mtcars, aes(disp, wt)) +
  geom_point() +
  geom_smooth()
p

# Setting the limits on a scale converts all values outside the range to NA.
p + scale_x_continuous(limits = c(325, 500))

# Setting the limits on the coordinate system performs a visual zoom.
# The data is unchanged, and we just view a small portion of the original
# plot. Note how smooth continues past the points visible on this plot.
p + coord_cartesian(xlim = c(325, 500))

# By default, the same expansion factor is applied as when setting scale
# limits. You can set the limits precisely by setting expand = FALSE
p + coord_cartesian(xlim = c(325, 500), expand = FALSE)

# Similarly, we can use expand = FALSE to turn off expansion with the
# default limits
p + coord_cartesian(expand = FALSE)

# You can see the same thing with this 2d histogram
d <- ggplot(diamonds, aes(carat, price)) +
  stat_bin_2d(bins = 25, colour = "white")
d

# When zooming the scale, we get 25 new bins that are the same
# size on the plot, but represent smaller regions of the data space
d + scale_x_continuous(limits = c(0, 1))

# When zooming the coordinate system, we see a subset of the original 25 bins,
# displayed bigger
d + coord_cartesian(xlim = c(0, 1))
A fixed scale coordinate system forces a specified ratio between the physical representation of data units on the axes. The ratio represents the number of units on the y-axis equivalent to one unit on the x-axis. The default, ratio = 1, ensures that one unit on the x-axis is the same length as one unit on the y-axis. Ratios higher than one make units on the y-axis longer than units on the x-axis, and vice versa. This is similar to MASS::eqscplot(), but it works for all types of graphics.
coord_fixed(ratio = 1, xlim = NULL, ylim = NULL, expand = TRUE, clip = "on")
ratio |
aspect ratio, expressed as |
xlim , ylim
|
Limits for the x and y axes. |
expand |
If |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
# ensures that the ranges of axes are equal to the specified ratio by
# adjusting the plot aspect ratio

p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p + coord_fixed(ratio = 1)
p + coord_fixed(ratio = 5)
p + coord_fixed(ratio = 1/5)
p + coord_fixed(xlim = c(15, 30))

# Resize the plot to see that the specified aspect ratio is maintained
This function is superseded because in many cases, coord_flip() can easily be replaced by swapping the x and y aesthetics, or optionally setting the orientation argument in geom and stat layers.
coord_flip() is useful for geoms and statistics that do not support the orientation setting, and converting the display of y conditional on x, to x conditional on y.
coord_flip(xlim = NULL, ylim = NULL, expand = TRUE, clip = "on")
xlim , ylim
|
Limits for the x and y axes. |
expand |
If |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
Coordinate systems interact with many parts of the plotting system. You can expect the following for coord_flip():
It does not change the facet order in facet_grid() or facet_wrap().
The scale_x_*() functions apply to the vertical direction, whereas scale_y_*() functions apply to the horizontal direction. The same holds for the xlim and ylim arguments of coord_flip() and the xlim() and ylim() functions.
The x-axis theme settings, such as axis.line.x, apply to the horizontal direction. The y-axis theme settings, such as axis.text.y, apply to the vertical direction.
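The interactions listed above can be sketched in one small plot. The dataset and the particular limits and theme values are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  # ylim refers to the y aesthetic (mpg), which is drawn horizontally
  coord_flip(ylim = c(15, 30)) +
  # the x scale (cyl) now labels the vertical axis
  scale_x_discrete(name = "Cylinders") +
  # theme settings follow the drawn direction: axis.text.y styles the
  # vertical axis, which here shows the cyl labels
  theme(axis.text.y = element_text(face = "bold"))

p
```

In short, scales and limits follow the aesthetics, while theme elements follow the directions on the page.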
# The preferred method of creating horizontal instead of vertical boxplots
ggplot(diamonds, aes(price, cut)) +
  geom_boxplot()

# Using `coord_flip()` to make the same plot
ggplot(diamonds, aes(cut, price)) +
  geom_boxplot() +
  coord_flip()

# With swapped aesthetics, the y-scale controls the left axis
ggplot(diamonds, aes(y = carat)) +
  geom_histogram() +
  scale_y_reverse()

# In `coord_flip()`, the x-scale controls the left axis
ggplot(diamonds, aes(carat)) +
  geom_histogram() +
  coord_flip() +
  scale_x_reverse()

# In line and area plots, swapped aesthetics require an explicit orientation
df <- data.frame(a = 1:5, b = (1:5) ^ 2)
ggplot(df, aes(b, a)) +
  geom_area(orientation = "y")

# The same plot with `coord_flip()`
ggplot(df, aes(a, b)) +
  geom_area() +
  coord_flip()
coord_map() projects a portion of the earth, which is approximately spherical, onto a flat 2D plane using any projection defined by the mapproj package. Map projections do not, in general, preserve straight lines, so this requires considerable computation. coord_quickmap() is a quick approximation that does preserve straight lines. It works best for smaller areas closer to the equator.
Both coord_map() and coord_quickmap() are superseded by coord_sf(), and should no longer be used in new code. All regular (non-sf) geoms can be used with coord_sf() by setting the default coordinate system via the default_crs argument. See also the examples for annotation_map() and geom_map().
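A minimal sketch of using regular geoms with coord_sf(), assuming the sf package is installed. The city coordinates and the CRS codes (4326 for long-lat input, 3347 for the output projection) are taken from the annotation_map() examples; any other projected CRS could be substituted.

```r
library(ggplot2)

# Long-lat positions of three cities in North Carolina.
cities <- data.frame(
  name = c("Charlotte", "Raleigh", "Greensboro"),
  lat  = c(35.227, 35.772, 36.073),
  long = c(-80.843, -78.639, -79.792)
)

if (requireNamespace("sf", quietly = TRUE)) {
  # default_crs tells coord_sf() how to interpret non-sf layers;
  # crs is the projection used for drawing.
  p <- ggplot(cities, aes(x = long, y = lat)) +
    geom_point() +
    geom_text(aes(label = name), vjust = -1) +
    coord_sf(
      crs = sf::st_crs(3347),
      default_crs = sf::st_crs(4326)
    )
  print(p)
}
```

The same pattern should work for other non-sf geoms such as geom_polygon() or geom_path().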
coord_map(
  projection = "mercator",
  ...,
  parameters = NULL,
  orientation = NULL,
  xlim = NULL,
  ylim = NULL,
  clip = "on"
)

coord_quickmap(xlim = NULL, ylim = NULL, expand = TRUE, clip = "on")
projection |
projection to use, see
|
... , parameters
|
Other arguments passed on to
|
orientation |
projection orientation, which defaults to
|
xlim , ylim
|
Manually specify x/y limits (in degrees of longitude/latitude) |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
expand |
If |
Map projections must account for the fact that the actual length (in km) of one degree of longitude varies between the equator and the pole. Near the equator, the ratio between the lengths of one degree of latitude and one degree of longitude is approximately 1. Near the pole, it tends towards infinity because the length of one degree of longitude tends towards 0. For regions that span only a few degrees and are not too close to the poles, setting the aspect ratio of the plot to the appropriate lat/lon ratio approximates the usual mercator projection. This is what coord_quickmap() does, and is much faster (particularly for complex plots like geom_tile()) at the expense of correctness.
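The idea of the aspect ratio approximation can be sketched with coord_fixed(). This is a simplified illustration, not the computation coord_quickmap() performs internally: it assumes a spherical earth, on which one degree of longitude at latitude phi is about cos(phi) times as long as one degree of latitude. The city coordinates are approximate and serve only as sample data.

```r
library(ggplot2)

# Approximate long-lat positions, used only as illustrative data.
nz_cities <- data.frame(
  name = c("Auckland", "Wellington", "Christchurch"),
  long = c(174.76, 174.78, 172.64),
  lat  = c(-36.85, -41.29, -43.53)
)

# Ratio between the length of one degree of latitude and one degree of
# longitude, evaluated at the middle latitude of the data.
lat_mid <- mean(range(nz_cities$lat))
ratio <- 1 / cos(lat_mid * pi / 180)

# coord_fixed() stretches y units relative to x units by this ratio.
p <- ggplot(nz_cities, aes(long, lat)) +
  geom_point() +
  geom_text(aes(label = name), vjust = -1) +
  coord_fixed(ratio = ratio)

p
```

The ratio is 1 at the equator and grows towards the poles, which matches the behaviour described above.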
The polygon maps section of the online ggplot2 book.
if (require("maps")) {
nz <- map_data("nz")
# Prepare a map of NZ
nzmap <- ggplot(nz, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "white", colour = "black")

# Plot it in cartesian coordinates
nzmap
}

if (require("maps")) {
# With correct mercator projection
nzmap + coord_map()
}

if (require("maps")) {
# With the aspect ratio approximation
nzmap + coord_quickmap()
}

if (require("maps")) {
# Other projections
nzmap + coord_map("azequalarea", orientation = c(-36.92, 174.6, 0))
}

if (require("maps")) {
states <- map_data("state")
usamap <- ggplot(states, aes(long, lat, group = group)) +
  geom_polygon(fill = "white", colour = "black")

# Use cartesian coordinates
usamap
}

if (require("maps")) {
# With mercator projection
usamap + coord_map()
}

if (require("maps")) {
# See ?mapproject for coordinate systems and their parameters
usamap + coord_map("gilbert")
}

if (require("maps")) {
# For most projections, you'll need to set the orientation yourself
# as the automatic selection done by mapproject is not available to
# ggplot
usamap + coord_map("orthographic")
}

if (require("maps")) {
usamap + coord_map("conic", lat0 = 30)
}

if (require("maps")) {
usamap + coord_map("bonne", lat0 = 50)
}

## Not run:
if (require("maps")) {
# World map, using geom_path instead of geom_polygon
world <- map_data("world")
worldmap <- ggplot(world, aes(x = long, y = lat, group = group)) +
  geom_path() +
  scale_y_continuous(breaks = (-2:2) * 30) +
  scale_x_continuous(breaks = (-4:4) * 45)

# Orthographic projection with default orientation (looking down at North pole)
worldmap + coord_map("ortho")
}

if (require("maps")) {
# Looking up at South Pole
worldmap + coord_map("ortho", orientation = c(-90, 0, 0))
}

if (require("maps")) {
# Centered on New York (currently has issues with closing polygons)
worldmap + coord_map("ortho", orientation = c(41, -74, 0))
}
## End(Not run)
The polar coordinate system is most commonly used for pie charts, which are a stacked bar chart in polar coordinates. coord_radial() has extended options.
coord_polar(theta = "x", start = 0, direction = 1, clip = "on")

coord_radial(
  theta = "x",
  start = 0,
  end = NULL,
  expand = TRUE,
  direction = 1,
  clip = "off",
  r.axis.inside = NULL,
  rotate.angle = FALSE,
  inner.radius = 0,
  r_axis_inside = deprecated(),
  rotate_angle = deprecated()
)
theta |
variable to map angle to ( |
start |
Offset of starting point from 12 o'clock in radians. Offset
is applied clockwise or anticlockwise depending on value of |
direction |
1, clockwise; -1, anticlockwise |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
end |
Position from 12 o'clock in radians where plot ends, to allow
for partial polar coordinates. The default, |
expand |
If |
r.axis.inside |
One of the following:
|
rotate.angle |
If |
inner.radius |
A |
r_axis_inside , rotate_angle
|
In coord_radial(), position guides can be defined by using guides(r = ..., theta = ..., r.sec = ..., theta.sec = ...). Note that these guides require r and theta as available aesthetics. The classic guide_axis() can be used for the r positions and guide_axis_theta() can be used for the theta positions. Using the theta.sec position is only sensible when inner.radius > 0.
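As a minimal sketch of the guide placement described above, the following assigns guide_axis_theta() to the theta position and guide_axis() to the r position. The mtcars mapping, the inner.radius value, and angle = 0 are illustrative choices rather than requirements.

```r
library(ggplot2)

# Illustrative data and mapping; any continuous x/y pair would do.
p <- ggplot(mtcars, aes(disp, mpg)) +
  geom_point() +
  coord_radial(inner.radius = 0.3) +
  guides(
    theta = guide_axis_theta(angle = 0), # keep the angle labels upright
    r     = guide_axis()                 # classic axis guide for the radius
  )
```

Printing p renders the plot; the guides only take effect because coord_radial() exposes r and theta as position aesthetics.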
The polar coordinates section of the online ggplot2 book.
# NOTE: Use these plots with caution - polar coordinates have
# major perceptual problems. The main point of these examples is
# to demonstrate how these common plots can be described in the
# grammar. Use with EXTREME caution.

# A pie chart = stacked bar chart + polar coordinates
pie <- ggplot(mtcars, aes(x = factor(1), fill = factor(cyl))) +
  geom_bar(width = 1)
pie + coord_polar(theta = "y")

# A coxcomb plot = bar chart + polar coordinates
cxc <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar(width = 1, colour = "black")
cxc + coord_polar()
# A new type of plot?
cxc + coord_polar(theta = "y")

# The bullseye chart
pie + coord_polar()

# Hadley's favourite pie chart
df <- data.frame(
  variable = c("does not resemble", "resembles"),
  value = c(20, 80)
)
ggplot(df, aes(x = "", y = value, fill = variable)) +
  geom_col(width = 1) +
  scale_fill_manual(values = c("red", "yellow")) +
  coord_polar("y", start = pi / 3) +
  labs(title = "Pac man")

# Windrose + doughnut plot
if (require("ggplot2movies")) {
  movies$rrating <- cut_interval(movies$rating, length = 1)
  movies$budgetq <- cut_number(movies$budget, 4)

  doh <- ggplot(movies, aes(x = rrating, fill = budgetq))

  # Wind rose
  doh + geom_bar(width = 1) + coord_polar()
  # Race track plot
  doh + geom_bar(width = 0.9, position = "fill") + coord_polar(theta = "y")
}

# A partial polar plot
ggplot(mtcars, aes(disp, mpg)) +
  geom_point() +
  coord_radial(start = -0.4 * pi, end = 0.4 * pi, inner.radius = 0.3)
coord_trans() is different to scale transformations in that it occurs after statistical transformation and will affect the visual appearance of geoms - there is no guarantee that straight lines will continue to be straight.
coord_trans(
  x = "identity",
  y = "identity",
  xlim = NULL,
  ylim = NULL,
  limx = deprecated(),
  limy = deprecated(),
  clip = "on",
  expand = TRUE
)
x , y
|
Transformers for x and y axes or their names. |
xlim , ylim
|
Limits for the x and y axes. |
limx , limy
|
|
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
expand |
If |
Transformations only work with continuous values: see scales::new_transform() for a list of transformations, and instructions on how to create your own.
The coord transformations section of the online ggplot2 book.
# See ?geom_boxplot for other examples

# Three ways of doing transformation in ggplot:
#  * by transforming the data
ggplot(diamonds, aes(log10(carat), log10(price))) +
  geom_point()
#  * by transforming the scales
ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()
#  * by transforming the coordinate system:
ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  coord_trans(x = "log10", y = "log10")

# The difference between transforming the scales and
# transforming the coordinate system is that scale
# transformation occurs BEFORE statistics, and coordinate
# transformation afterwards. Coordinate transformation also
# changes the shape of geoms:

d <- subset(diamonds, carat > 0.5)

ggplot(d, aes(carat, price)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_x_log10() +
  scale_y_log10()

ggplot(d, aes(carat, price)) +
  geom_point() +
  geom_smooth(method = "lm") +
  coord_trans(x = "log10", y = "log10")

# Here I used a subset of diamonds so that the smoothed line didn't
# drop below zero, which obviously causes problems on the log-transformed
# scale

# With a combination of scale and coordinate transformation, it's
# possible to do back-transformations:
ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_x_log10() +
  scale_y_log10() +
  coord_trans(x = scales::transform_exp(10), y = scales::transform_exp(10))

# cf.
ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  geom_smooth(method = "lm")

# Also works with discrete scales
set.seed(1)
df <- data.frame(a = abs(rnorm(26)), letters)
plot <- ggplot(df, aes(a, letters)) + geom_point()

plot + coord_trans(x = "log10")
plot + coord_trans(x = "sqrt")
This set of geom, stat, and coord are used to visualise simple feature (sf) objects. For simple plots, you will only need geom_sf() as it uses stat_sf() and adds coord_sf() for you. geom_sf() is an unusual geom because it will draw different geometric objects depending on what simple features are present in the data: you can get points, lines, or polygons. For text and labels, you can use geom_sf_text() and geom_sf_label().
coord_sf(
  xlim = NULL,
  ylim = NULL,
  expand = TRUE,
  crs = NULL,
  default_crs = NULL,
  datum = sf::st_crs(4326),
  label_graticule = waiver(),
  label_axes = waiver(),
  lims_method = "cross",
  ndiscr = 100,
  default = FALSE,
  clip = "on"
)

geom_sf(
  mapping = aes(),
  data = NULL,
  stat = "sf",
  position = "identity",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)

geom_sf_label(
  mapping = aes(),
  data = NULL,
  stat = "sf_coordinates",
  position = "identity",
  ...,
  parse = FALSE,
  nudge_x = 0,
  nudge_y = 0,
  label.padding = unit(0.25, "lines"),
  label.r = unit(0.15, "lines"),
  label.size = 0.25,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  fun.geometry = NULL
)

geom_sf_text(
  mapping = aes(),
  data = NULL,
  stat = "sf_coordinates",
  position = "identity",
  ...,
  parse = FALSE,
  nudge_x = 0,
  nudge_y = 0,
  check_overlap = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  fun.geometry = NULL
)

stat_sf(
  mapping = NULL,
  data = NULL,
  geom = "rect",
  position = "identity",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)
xlim , ylim
|
Limits for the x and y axes. These limits are specified
in the units of the default CRS. By default, this means projected coordinates
( |
expand |
If |
crs |
The coordinate reference system (CRS) into which all data should be projected before plotting. If not specified, will use the CRS defined in the first sf layer of the plot. |
default_crs |
The default CRS to be used for non-sf layers (which
don't carry any CRS information) and scale limits. The default value of
|
datum |
CRS that provides datum to use when generating graticules. |
label_graticule |
Character vector indicating which graticule lines should be labeled
where. Meridians run north-south, and the letters This parameter can be used alone or in combination with |
label_axes |
Character vector or named list of character values
specifying which graticule lines (meridians or parallels) should be labeled on
which side of the plot. Meridians are indicated by This parameter can be used alone or in combination with |
lims_method |
Method specifying how scale limits are converted into
limits on the plot region. Has no effect when |
ndiscr |
Number of segments to use for discretising graticule lines; try increasing this number when graticules look incorrect. |
default |
Is this the default coordinate system? If |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
You can also set this to one of "polygon", "line", and "point" to override the default legend. |
inherit.aes |
If |
... |
Other arguments passed on to
|
parse |
If |
nudge_x , nudge_y
|
Horizontal and vertical adjustment to nudge labels by.
Useful for offsetting text from points, particularly on discrete scales.
Cannot be jointly specified with |
label.padding |
Amount of padding around label. Defaults to 0.25 lines. |
label.r |
Radius of rounded corners. Defaults to 0.15 lines. |
label.size |
Size of label border, in mm. |
fun.geometry |
A function that takes a |
check_overlap |
If |
geom |
The geometric object to use to display the data for this layer.
When using a
|
geom_sf() uses a unique aesthetic: geometry, giving a column of class sfc containing simple features data. There are three ways to supply the geometry aesthetic:
Do nothing: by default geom_sf() assumes it is stored in the geometry column.
Explicitly pass an sf object to the data argument. This will use the primary geometry column, no matter what it's called.
Supply your own using aes(geometry = my_column).
Unlike other aesthetics, geometry will never be inherited from the plot.
coord_sf() ensures that all layers use a common CRS. You can either specify it using the crs param, or coord_sf() will take it from the first layer that defines a CRS.
Most regular geoms, such as geom_point(), geom_path(), geom_text(), geom_polygon() etc. will work fine with coord_sf(). However when using these geoms, two problems arise. First, what CRS should be used for the x and y coordinates used by these non-sf geoms? The CRS applied to non-sf geoms is set by the default_crs parameter, and it defaults to NULL, which means positions for non-sf geoms are interpreted as projected coordinates in the coordinate system set by the crs parameter. This setting allows you complete control over where exactly items are placed on the plot canvas, but it may require some understanding of how projections work and how to generate data in projected coordinates. As an alternative, you can set default_crs = sf::st_crs(4326), the World Geodetic System 1984 (WGS84). This means that x and y positions are interpreted as longitude and latitude, respectively. You can also specify any other valid CRS as the default CRS for non-sf geoms.
The second problem that arises for non-sf geoms is how straight lines should be interpreted in projected space when default_crs is not set to NULL. The approach coord_sf() takes is to break straight lines into small pieces (i.e., segmentize them) and then transform the pieces into projected coordinates. For the default setting where x and y are interpreted as longitude and latitude, this approach means that horizontal lines follow the parallels and vertical lines follow the meridians. If you need a different approach to handling straight lines, then you should manually segmentize and project coordinates and generate the plot in projected coordinates.
The simple feature maps section of the online ggplot2 book.
if (requireNamespace("sf", quietly = TRUE)) {
  nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
  ggplot(nc) +
    geom_sf(aes(fill = AREA))

  # If not supplied, coord_sf() will take the CRS from the first layer
  # and automatically transform all other layers to use that CRS. This
  # ensures that all data will correctly line up
  nc_3857 <- sf::st_transform(nc, 3857)
  ggplot() +
    geom_sf(data = nc) +
    geom_sf(data = nc_3857, colour = "red", fill = NA)

  # Unfortunately if you plot other types of feature you'll need to use
  # show.legend to tell ggplot2 what type of legend to use
  nc_3857$mid <- sf::st_centroid(nc_3857$geometry)
  ggplot(nc_3857) +
    geom_sf(colour = "white") +
    geom_sf(aes(geometry = mid, size = AREA), show.legend = "point")

  # You can also use layers with x and y aesthetics. To have these interpreted
  # as longitude/latitude you need to set the default CRS in coord_sf()
  ggplot(nc_3857) +
    geom_sf() +
    annotate("point", x = -80, y = 35, colour = "red", size = 4) +
    coord_sf(default_crs = sf::st_crs(4326))

  # To add labels, use geom_sf_label().
  ggplot(nc_3857[1:3, ]) +
    geom_sf(aes(fill = AREA)) +
    geom_sf_label(aes(label = NAME))
}

# Thanks to the power of sf, a geom_sf nicely handles varying projections
# setting the aspect ratio correctly.
if (requireNamespace('maps', quietly = TRUE)) {
  library(maps)
  world1 <- sf::st_as_sf(map('world', plot = FALSE, fill = TRUE))
  ggplot() + geom_sf(data = world1)

  world2 <- sf::st_transform(
    world1,
    "+proj=laea +y_0=0 +lon_0=155 +lat_0=-90 +ellps=WGS84 +no_defs"
  )
  ggplot() + geom_sf(data = world2)
}
cut_interval() makes n groups with equal range, cut_number() makes n groups with (approximately) equal numbers of observations; cut_width() makes groups of width width.
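To make the contrast between the three helpers concrete, here is a small sketch on the integers 1 to 100. The choice of four groups, a width of 25, and boundary = 0 is purely illustrative.

```r
library(ggplot2)

x <- 1:100

by_range <- cut_interval(x, n = 4)                 # 4 bins covering equal ranges
by_count <- cut_number(x, n = 4)                   # 4 bins with (about) equal counts
by_width <- cut_width(x, width = 25, boundary = 0) # bins that are 25 units wide

# Compare how many observations fall into each bin
table(by_range)
table(by_count)
table(by_width)
```

On evenly spaced data such as this the first two results look alike; on skewed data cut_interval() keeps the ranges equal while cut_number() keeps the counts equal.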
cut_interval(x, n = NULL, length = NULL, ...)

cut_number(x, n = NULL, ...)

cut_width(x, width, center = NULL, boundary = NULL, closed = "right", ...)
x |
numeric vector |
n |
number of intervals to create, OR |
length |
length of each interval |
... |
Arguments passed on to
|
width |
The bin width. |
center , boundary
|
Specify either the position of edge or the center of a bin. Since all bins are aligned, specifying the position of a single bin (which doesn't need to be in the range of the data) affects the location of all bins. If not specified, uses the "tile layers algorithm", and sets the boundary to half of the binwidth. To center on integers, |
closed |
One of |
Randall Pruim contributed most of the implementation of cut_width().
table(cut_interval(1:100, 10))
table(cut_interval(1:100, 11))

set.seed(1)

table(cut_number(runif(1000), 10))

table(cut_width(runif(1000), 0.1))
table(cut_width(runif(1000), 0.1, boundary = 0))
table(cut_width(runif(1000), 0.1, center = 0))
table(cut_width(runif(1000), 0.1, labels = FALSE))
A dataset containing the prices and other attributes of almost 54,000 diamonds. The variables are as follows:
diamonds
A data frame with 53940 rows and 10 variables:
price: price in US dollars ($326–$18,823)
carat: weight of the diamond (0.2–5.01)
cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color: diamond colour, from D (best) to J (worst)
clarity: a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x: length in mm (0–10.74)
y: width in mm (0–58.9)
z: depth in mm (0–31.8)
depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)
table: width of top of diamond relative to widest point (43–95)
Each geom has an associated function that draws the key when the geom needs to be displayed in a legend. These functions are called draw_key_*(), where * stands for the name of the respective key glyph. The key glyphs can be customized for individual geoms by providing a geom with the key_glyph argument (see layer() or the examples below).
draw_key_point(data, params, size)

draw_key_abline(data, params, size)

draw_key_rect(data, params, size)

draw_key_polygon(data, params, size)

draw_key_blank(data, params, size)

draw_key_boxplot(data, params, size)

draw_key_crossbar(data, params, size)

draw_key_path(data, params, size)

draw_key_vpath(data, params, size)

draw_key_dotplot(data, params, size)

draw_key_linerange(data, params, size)

draw_key_pointrange(data, params, size)

draw_key_smooth(data, params, size)

draw_key_text(data, params, size)

draw_key_label(data, params, size)

draw_key_vline(data, params, size)

draw_key_timeseries(data, params, size)
data |
A single row data frame containing the scaled aesthetics to display in this key |
params |
A list of additional parameters supplied to the geom. |
size |
Width and height of key in mm. |
A grid grob.
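Since a key function only has to accept data, params and size and return a grid grob, you can also write your own. The sketch below is an assumption-laden illustration: draw_key_diamond is a hypothetical name, not part of ggplot2, and it only reads the colour column of the key data.

```r
library(ggplot2)

# Hypothetical key function (not part of ggplot2): draws a diamond
# filled and outlined in the layer's colour.
draw_key_diamond <- function(data, params, size) {
  grid::polygonGrob(
    x = c(0.5, 0.9, 0.5, 0.1),
    y = c(0.9, 0.5, 0.1, 0.5),
    gp = grid::gpar(col = data$colour, fill = data$colour)
  )
}

p <- ggplot(economics, aes(date, psavert, colour = "savings rate")) +
  geom_line(key_glyph = draw_key_diamond)
```

The coordinates passed to polygonGrob() are in the default npc units, so the diamond scales with whatever key size the legend allocates.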
p <- ggplot(economics, aes(date, psavert, color = "savings rate"))
# key glyphs can be specified by their name
p + geom_line(key_glyph = "timeseries")

# key glyphs can be specified via their drawing function
p + geom_line(key_glyph = draw_key_rect)
This dataset was produced from US economic time series data available from https://fred.stlouisfed.org/. economics is in "wide" format, economics_long is in "long" format.
economics

economics_long
A data frame with 574 rows and 6 variables:
date: Month of data collection
pce: personal consumption expenditures, in billions of dollars, https://fred.stlouisfed.org/series/PCE
pop: total population, in thousands, https://fred.stlouisfed.org/series/POP
psavert: personal savings rate, https://fred.stlouisfed.org/series/PSAVERT/
uempmed: median duration of unemployment, in weeks, https://fred.stlouisfed.org/series/UEMPMED
unemploy: number of unemployed in thousands, https://fred.stlouisfed.org/series/UNEMPLOY
An object of class tbl_df (inherits from tbl, data.frame) with 2870 rows and 4 columns.
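The relationship between the two layouts can be sketched in base R: stacking the five measure columns of economics under a single value column yields one row per month and measure. The object names measures and long are illustrative, and this sketch does not attempt to reproduce the fourth column of economics_long.

```r
library(ggplot2)

# Every column except the date is treated as a measure
measures <- setdiff(names(economics), "date")

# Stack each measure column under a single `value` column
long <- data.frame(
  date     = rep(economics$date, times = length(measures)),
  variable = rep(measures, each = nrow(economics)),
  value    = unlist(lapply(measures, function(m) economics[[m]]), use.names = FALSE)
)

# Same number of rows as the packaged long-format data
nrow(long) == nrow(economics_long)
```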
In conjunction with the theme system, the element_ functions specify how non-data components of the plot are drawn.
element_blank(): draws nothing, and assigns no space.
element_rect(): borders and backgrounds.
element_line(): lines.
element_text(): text.
element_geom(): defaults for drawing layers.
rel() is used to specify sizes relative to the parent, margin() is used to specify the margins of elements.
element_blank()

element_rect(
  fill = NULL,
  colour = NULL,
  linewidth = NULL,
  linetype = NULL,
  color = NULL,
  inherit.blank = FALSE,
  size = deprecated()
)

element_line(
  colour = NULL,
  linewidth = NULL,
  linetype = NULL,
  lineend = NULL,
  color = NULL,
  arrow = NULL,
  arrow.fill = NULL,
  inherit.blank = FALSE,
  size = deprecated()
)

element_text(
  family = NULL,
  face = NULL,
  colour = NULL,
  size = NULL,
  hjust = NULL,
  vjust = NULL,
  angle = NULL,
  lineheight = NULL,
  color = NULL,
  margin = NULL,
  debug = NULL,
  inherit.blank = FALSE
)

element_geom(
  ink = NULL,
  paper = NULL,
  accent = NULL,
  linewidth = NULL,
  borderwidth = NULL,
  linetype = NULL,
  bordertype = NULL,
  family = NULL,
  fontsize = NULL,
  pointsize = NULL,
  pointshape = NULL
)

rel(x)

margin(t = 0, r = 0, b = 0, l = 0, unit = "pt")
fill |
Fill colour. |
colour , color
|
Line/border colour. Color is an alias for colour. |
linewidth , borderwidth
|
Line/border size in mm. |
linetype , bordertype
|
Line type for lines and borders respectively. An integer (0:8), a name (blank, solid, dashed, dotted, dotdash, longdash, twodash), or a string with an even number (up to eight) of hexadecimal digits which give the lengths in consecutive positions in the string. |
inherit.blank |
Should this element inherit the existence of an
|
size , fontsize
|
text size in pts. |
lineend |
Line end style (round, butt, square) |
arrow |
Arrow specification, as created by |
arrow.fill |
Fill colour for arrows. |
family |
Font family |
face |
Font face ("plain", "italic", "bold", "bold.italic") |
hjust |
Horizontal justification (in |
vjust |
Vertical justification (in |
angle |
Angle (in |
lineheight |
Line height |
margin |
Margins around the text. See |
debug |
If |
ink |
Foreground colour. |
paper |
Background colour. |
accent |
Accent colour. |
pointsize |
Size for points in mm. |
pointshape |
Shape for points (1-25). |
x |
A single number specifying size relative to parent element. |
t , r , b , l
|
Dimensions of each margin. (To remember order, think trouble). |
unit |
Default units of dimensions. Defaults to "pt" so it can be most easily scaled with the text. |
An S3 object of class element, rel, or margin.
plot <- ggplot(mpg, aes(displ, hwy)) + geom_point()

plot + theme(
  panel.background = element_blank(),
  axis.text = element_blank()
)

plot + theme(
  axis.text = element_text(colour = "red", size = rel(1.5))
)

plot + theme(
  axis.line = element_line(arrow = arrow())
)

plot + theme(
  panel.background = element_rect(fill = "white"),
  plot.margin = margin(2, 2, 2, 2, "cm"),
  plot.background = element_rect(
    fill = "grey90",
    colour = "black",
    linewidth = 1
  )
)

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(formula = y ~ x, method = "lm") +
  theme(geom = element_geom(
    ink = "red", accent = "black",
    pointsize = 1, linewidth = 2
  ))
Sometimes you may want to ensure limits include a single value, for all panels or all plots. This function is a thin wrapper around geom_blank() that makes it easy to add such values.
expand_limits(...)
... |
named list of aesthetics specifying the value (or values) that should be included in each scale. |
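To illustrate the "thin wrapper around geom_blank()" description, the sketch below builds the same limit extension both ways and compares the trained x limits. The helper x_limits is a hypothetical convenience defined only for this comparison, and the hand-written geom_blank() call is an approximation of what the wrapper does rather than a copy of its source.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()

# Using the helper
with_helper <- p + expand_limits(x = 0)

# Roughly equivalent hand-written invisible layer
with_blank <- p +
  geom_blank(aes(x = x), data = data.frame(x = 0), inherit.aes = FALSE)

# Hypothetical helper: trained limits of the x scale after building the plot
x_limits <- function(plot) layer_scales(plot)$x$get_limits()

x_limits(with_helper)
x_limits(with_blank)
```

Both plots end up with an x scale that starts at 0, even though no visible point sits there.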
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p + expand_limits(x = 0)
p + expand_limits(y = c(1, 9))
p + expand_limits(x = 0, y = 0)

ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = cyl)) +
  expand_limits(colour = seq(2, 10, by = 2))
ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = factor(cyl))) +
  expand_limits(colour = factor(seq(2, 10, by = 2)))
This is a convenience function for generating scale expansion vectors for the expand argument of scale_(x|y)_continuous and scale_(x|y)_discrete. The expansion vectors are used to add some space between the data and the axes.
expansion(mult = 0, add = 0)

expand_scale(mult = 0, add = 0)
mult |
vector of multiplicative range expansion factors.
If length 1, both the lower and upper limits of the scale
are expanded outwards by |
add |
vector of additive range expansion constants.
If length 1, both the lower and upper limits of the scale
are expanded outwards by |
# No space below the bars but 10% above them
ggplot(mtcars) +
  geom_bar(aes(x = factor(cyl))) +
  scale_y_continuous(expand = expansion(mult = c(0, .1)))

# Add 2 units of space on the left and right of the data
ggplot(subset(diamonds, carat > 2), aes(cut, clarity)) +
  geom_jitter() +
  scale_x_discrete(expand = expansion(add = 2))

# Reproduce the default range expansion used
# when the 'expand' argument is not specified
ggplot(subset(diamonds, carat > 2), aes(cut, price)) +
  geom_jitter() +
  scale_x_discrete(expand = expansion(add = .6)) +
  scale_y_continuous(expand = expansion(mult = .05))
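To make the arithmetic behind the first example concrete, the sketch below applies a multiplicative and an additive expansion to a range by hand and compares the result with the y range ggplot2 computes for the panel. expand_manually() is a hypothetical helper written for this illustration and is not part of ggplot2; reading the panel range through ggplot_build() relies on the structure of the built plot object, which should be treated as an assumption.

```r
library(ggplot2)

# expansion() returns a plain numeric vector of length four.
e <- expansion(mult = c(0, 0.1))

# Hypothetical helper (not part of ggplot2): apply multiplicative and
# additive expansion to a continuous range c(lower, upper).
expand_manually <- function(range, mult = c(0, 0), add = c(0, 0)) {
  width <- diff(range)
  c(range[1] - mult[1] * width - add[1],
    range[2] + mult[2] * width + add[2])
}

# The bars start at 0 and the tallest bar is the largest cylinder count,
# so the unexpanded y range is c(0, max(counts)).
counts <- table(mtcars$cyl)
manual <- expand_manually(c(0, max(counts)), mult = c(0, 0.1))

# Compare with the y range ggplot2 computes for the panel.
p <- ggplot(mtcars) +
  geom_bar(aes(x = factor(cyl))) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1)))
built <- ggplot_build(p)$layout$panel_params[[1]]$y.range
```

With mult = c(0, 0.1) the lower limit stays at 0 and the upper limit grows by 10% of the data range, which is what "no space below the bars but 10% above them" means.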
facet_grid() forms a matrix of panels defined by row and column faceting variables. It is most useful when you have two discrete variables, and all combinations of the variables exist in the data. If you have only one variable with many levels, try facet_wrap().
facet_grid(
  rows = NULL,
  cols = NULL,
  scales = "fixed",
  space = "fixed",
  shrink = TRUE,
  labeller = "label_value",
  as.table = TRUE,
  switch = NULL,
  drop = TRUE,
  margins = FALSE,
  axes = "margins",
  axis.labels = "all",
  facets = deprecated()
)
rows , cols
|
A set of variables or expressions quoted by
For compatibility with the classic interface, |
scales |
Are scales shared across all facets (the default,
|
space |
If |
shrink |
If |
labeller |
A function that takes one data frame of labels and
returns a list or data frame of character vectors. Each input
column corresponds to one factor. Thus there will be more than
one with |
as.table |
If |
switch |
By default, the labels are displayed on the top and
right of the plot. If |
drop |
If |
margins |
Either a logical value or a character
vector. Margins are additional facets which contain all the data
for each of the possible values of the faceting variables. If
|
axes |
Determines which axes will be drawn. When |
axis.labels |
Determines whether to draw labels for interior axes when
the |
facets |
The facet grid section of the online ggplot2 book.
p <- ggplot(mpg, aes(displ, cty)) + geom_point()

# Use vars() to supply variables from the dataset:
p + facet_grid(rows = vars(drv))
p + facet_grid(cols = vars(cyl))
p + facet_grid(vars(drv), vars(cyl))

# To change plot order of facet grid,
# change the order of variable levels with factor()

# If you combine a facetted dataset with a dataset that lacks those
# faceting variables, the data will be repeated across the missing
# combinations:
df <- data.frame(displ = mean(mpg$displ), cty = mean(mpg$cty))
p +
  facet_grid(cols = vars(cyl)) +
  geom_point(data = df, colour = "red", size = 2)

# When scales are constant, duplicated axes can be shown with
# or without labels
ggplot(mpg, aes(cty, hwy)) +
  geom_point() +
  facet_grid(year ~ drv, axes = "all", axis.labels = "all_x")

# Free scales -------------------------------------------------------
# You can also choose whether the scales should be constant
# across all panels (the default), or whether they should be allowed
# to vary
mt <- ggplot(mtcars, aes(mpg, wt, colour = factor(cyl))) +
  geom_point()

mt + facet_grid(vars(cyl), scales = "free")

# If scales and space are free, then the mapping between position
# and values in the data will be the same across all panels. This
# is particularly useful for categorical axes
ggplot(mpg, aes(drv, model)) +
  geom_point() +
  facet_grid(manufacturer ~ ., scales = "free", space = "free") +
  theme(strip.text.y = element_text(angle = 0))

# Margins ----------------------------------------------------------
# Margins can be specified logically (all yes or all no) or for specific
# variables as (character) variable names
mg <- ggplot(mtcars, aes(x = mpg, y = wt)) + geom_point()
mg + facet_grid(vs + am ~ gear, margins = TRUE)
mg + facet_grid(vs + am ~ gear, margins = "am")
# when margins are made over "vs", since the facets for "am" vary
# within the values of "vs", the marginal facet for "vs" is also
# a margin over "am".
mg + facet_grid(vs + am ~ gear, margins = "vs")
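The examples mention that panel order follows the order of the factor levels but give no code for it. A minimal sketch, assuming the desired row order is "r", "f", "4" (an arbitrary choice for illustration); the layout is inspected through ggplot_build(), whose internal structure is assumed here rather than documented above.

```r
library(ggplot2)

# Copy mpg and reorder the levels of drv; panels follow the level order.
mpg_reordered <- mpg
mpg_reordered$drv <- factor(mpg_reordered$drv, levels = c("r", "f", "4"))

p <- ggplot(mpg_reordered, aes(displ, cty)) +
  geom_point() +
  facet_grid(rows = vars(drv))

# Inspect the panel layout computed for the plot: one row per panel.
panel_layout <- ggplot_build(p)$layout$layout
row_order <- as.character(panel_layout$drv[order(panel_layout$ROW)])
```

Here row_order lists the drv value shown in each row of panels, from top to bottom.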
facet_wrap() wraps a 1d sequence of panels into 2d. This is generally a better use of screen space than facet_grid() because most displays are roughly rectangular.
facet_wrap(
  facets,
  nrow = NULL,
  ncol = NULL,
  scales = "fixed",
  space = "fixed",
  shrink = TRUE,
  labeller = "label_value",
  as.table = TRUE,
  switch = deprecated(),
  drop = TRUE,
  dir = "h",
  strip.position = "top",
  axes = "margins",
  axis.labels = "all"
)
facets |
A set of variables or expressions quoted by For compatibility with the classic interface, can also be a
formula or character vector. Use either a one sided formula, |
nrow , ncol
|
Number of rows and columns. |
scales |
Should scales be fixed ( |
space |
If |
shrink |
If |
labeller |
A function that takes one data frame of labels and
returns a list or data frame of character vectors. Each input
column corresponds to one factor. Thus there will be more than
one with |
as.table |
If |
switch |
By default, the labels are displayed on the top and
right of the plot. If |
drop |
If |
dir |
Direction: either |
strip.position |
By default, the labels are displayed on the top of
the plot. Using |
axes |
Determines which axes will be drawn in case of fixed scales.
When |
axis.labels |
Determines whether to draw labels for interior axes when
the scale is fixed and the |
The facet wrap section of the online ggplot2 book.
p <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# Use vars() to supply faceting variables:
p + facet_wrap(vars(class))

# Control the number of rows and columns with nrow and ncol
p + facet_wrap(vars(class), nrow = 4)

# You can facet by multiple variables
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(cyl, drv))

# Use the `labeller` option to control how labels are printed:
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(cyl, drv), labeller = "label_both")

# To change the order in which the panels appear, change the levels
# of the underlying factor.
mpg$class2 <- reorder(mpg$class, mpg$displ)
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(class2))

# By default, the same scales are used for all panels. You can allow
# scales to vary across the panels with the `scales` argument.
# Free scales make it easier to see patterns within each panel, but
# harder to compare across panels.
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(class), scales = "free")

# When scales are constant, duplicated axes can be shown with
# or without labels
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(class), axes = "all", axis.labels = "all_y")

# To repeat the same data in every panel, simply construct a data frame
# that does not contain the faceting variable.
ggplot(mpg, aes(displ, hwy)) +
  geom_point(data = transform(mpg, class = NULL), colour = "grey85") +
  geom_point() +
  facet_wrap(vars(class))

# Use `strip.position` to display the facet labels at the side of your
# choice. Setting it to `bottom` makes it act as a subtitle for the axis.
# This is typically used with free scales and a theme without boxes around
# strip labels.
ggplot(economics_long, aes(date, value)) +
  geom_line() +
  facet_wrap(vars(variable), scales = "free_y", nrow = 2, strip.position = "top") +
  theme(strip.background = element_blank(), strip.placement = "outside")

# The two letters determine the starting position, so 'tr' starts
# in the top-right.
# The first letter determines direction, so 'tr' fills top-to-bottom.
# `dir = "tr"` is equivalent to `dir = "v", as.table = FALSE`
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(class), dir = "tr")
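To see how nrow interacts with the number of panels, the following sketch builds the nrow = 4 example and reads the resulting grid dimensions from the built plot. Accessing the layout through ggplot_build() is an assumption about the built object's structure and is meant for inspection only.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(class), nrow = 4)

# One row of panel_layout per panel, with its ROW and COL position.
panel_layout <- ggplot_build(p)$layout$layout
n_panels <- nrow(panel_layout)
n_rows <- max(panel_layout$ROW)
n_cols <- max(panel_layout$COL)
```

With nrow fixed, the number of columns is the smallest value that still fits every panel, that is ceiling(n_panels / n_rows).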
A 2d density estimate of the waiting and eruptions variables in the faithful data.
faithfuld
A data frame with 5,625 observations and 3 variables:
eruptions: Eruption time in mins
waiting: Waiting time to next eruption in mins
density: 2d density estimate
Rather than using this function, I now recommend using the broom package, which implements a much wider range of methods. fortify() may be deprecated in the future.
fortify(model, data, ...)
model |
model or other R object to convert to data frame |
data |
original dataset, if needed |
... |
Arguments passed to methods. |
Other plotting automation topics: autolayer(), automatic_plotting, autoplot()
These geoms add reference lines (sometimes called rules) to a plot, either horizontal, vertical, or diagonal (specified by slope and intercept). These are useful for annotating plots.
geom_abline(
  mapping = NULL,
  data = NULL,
  ...,
  slope,
  intercept,
  na.rm = FALSE,
  show.legend = NA
)

geom_hline(
  mapping = NULL,
  data = NULL,
  position = "identity",
  ...,
  yintercept,
  na.rm = FALSE,
  show.legend = NA
)

geom_vline(
  mapping = NULL,
  data = NULL,
  position = "identity",
  ...,
  xintercept,
  na.rm = FALSE,
  show.legend = NA
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
xintercept , yintercept , slope , intercept
|
Parameters that control the
position of the line. If these are set, |
These geoms act slightly differently from other geoms. You can supply the parameters in two ways: either as arguments to the layer function, or via aesthetics. If you use arguments, e.g. geom_abline(intercept = 0, slope = 1), then behind the scenes the geom makes a new data frame containing just the data you've supplied. That means that the lines will be the same in all facets; if you want them to vary across facets, construct the data frame yourself and use aesthetics.
Unlike most other geoms, these geoms do not inherit aesthetics from the plot default, because they do not understand x and y aesthetics which are commonly set in the plot. They also do not affect the x and y scales.
These geoms are drawn using geom_line() so they support the same aesthetics: alpha, colour, linetype and linewidth. They also each have aesthetics that control the position of the line:
geom_vline(): xintercept
geom_hline(): yintercept
geom_abline(): slope and intercept
See geom_segment() for a more general approach to adding straight line segments to a plot.
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Fixed values
p + geom_vline(xintercept = 5)
p + geom_vline(xintercept = 1:5)
p + geom_hline(yintercept = 20)

p + geom_abline() # Can't see it - outside the range of the data
p + geom_abline(intercept = 20)

# Calculate slope and intercept of line of best fit
coef(lm(mpg ~ wt, data = mtcars))
p + geom_abline(intercept = 37, slope = -5)
# But this is easier to do with geom_smooth:
p + geom_smooth(method = "lm", se = FALSE)

# To show different lines in different facets, use aesthetics
p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  facet_wrap(~ cyl)

mean_wt <- data.frame(cyl = c(4, 6, 8), wt = c(2.28, 3.11, 4.00))
p + geom_hline(aes(yintercept = wt), mean_wt)

# You can also control other aesthetics
ggplot(mtcars, aes(mpg, wt, colour = wt)) +
  geom_point() +
  geom_hline(aes(yintercept = wt, colour = wt), mean_wt) +
  facet_wrap(~ cyl)
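The examples above type the fitted coefficients and the per-facet means in by hand as rounded numbers. As a variation, the same values can be computed from the data and passed along directly. This is only a sketch using base R helpers (coef(), lm(), aggregate()); the object names are chosen for this illustration.

```r
library(ggplot2)

# Line of best fit: take the coefficients directly from the model
# instead of typing rounded values.
fit <- coef(lm(mpg ~ wt, data = mtcars))
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_fit <- p + geom_abline(intercept = fit[["(Intercept)"]], slope = fit[["wt"]])

# Per-facet reference lines: compute the mean weight per cylinder count.
# The cyl column lets each line land in the matching facet.
mean_wt <- aggregate(wt ~ cyl, data = mtcars, FUN = mean)
p_facets <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  geom_hline(aes(yintercept = wt), data = mean_wt) +
  facet_wrap(~ cyl)
```

Because mean_wt is a data frame containing the faceting variable, the horizontal lines vary across facets, as described for the aesthetic-based interface.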
There are two types of bar charts: geom_bar() and geom_col(). geom_bar() makes the height of the bar proportional to the number of cases in each group (or if the weight aesthetic is supplied, the sum of the weights). If you want the heights of the bars to represent values in the data, use geom_col() instead. geom_bar() uses stat_count() by default: it counts the number of cases at each x position. geom_col() uses stat_identity(): it leaves the data as is.
geom_bar(
  mapping = NULL,
  data = NULL,
  stat = "count",
  position = "stack",
  ...,
  just = 0.5,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_col(
  mapping = NULL,
  data = NULL,
  position = "stack",
  ...,
  just = 0.5,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_count(
  mapping = NULL,
  data = NULL,
  geom = "bar",
  position = "stack",
  ...,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
just |
Adjustment for column placement. Set to |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom , stat
|
Override the default connection between |
A bar chart uses height to represent a value, and so the base of the bar must always be shown to produce a valid visual comparison. Proceed with caution when using transformed scales with a bar chart. It's important to always use a meaningful reference point for the base of the bar. For example, for log transformations the reference point is 1. In fact, when using a log scale, geom_bar() automatically places the base of the bar at 1. Furthermore, never use stacked bars with a transformed scale, because scaling happens before stacking. As a consequence, the height of bars will be wrong when stacking occurs with a transformed scale.
By default, multiple bars occupying the same x position will be stacked atop one another by position_stack(). If you want them to be dodged side-to-side, use position_dodge() or position_dodge2(). Finally, position_fill() shows relative proportions at each x by stacking the bars and then standardising each bar to have the same height.
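The three position adjustments described above can be compared side by side. The sketch below builds one plot per adjustment and then checks, through layer_data(), that position_fill() rescales every stack to a total height of 1; the object names are illustrative only.

```r
library(ggplot2)

g <- ggplot(mpg, aes(class, fill = drv))

p_stack <- g + geom_bar()                    # default: position_stack()
p_dodge <- g + geom_bar(position = "dodge")  # bars placed side by side
p_fill  <- g + geom_bar(position = "fill")   # stacked, rescaled to height 1

# After position_fill(), the top of every stack should sit at 1.
filled <- layer_data(p_fill)
stack_tops <- tapply(filled$ymax, as.numeric(filled$x), max)
```

stack_tops holds one value per x position, which makes it easy to confirm that the filled bars all share the same height.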
This geom treats each axis differently and can thus have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation parameter, which can be either "x" or "y". The value gives the axis that the geom should run along, "x" being the default orientation you would expect for the geom.
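A case where the mappings alone cannot settle the orientation is a column chart with two numeric variables. The sketch below uses a small made-up data frame (df is purely illustrative) and sets orientation = "y" explicitly; the flipped_aes column returned by layer_data() is used here as a convenient way to check which orientation was applied.

```r
library(ggplot2)

# Both columns are numeric, so the mappings alone do not say which
# axis the bars should run along.
df <- data.frame(x = c(1, 2, 3), y = c(2, 4, 6))

p_default <- ggplot(df, aes(x, y)) + geom_col()                   # let ggplot2 decide
p_flipped <- ggplot(df, aes(x, y)) + geom_col(orientation = "y")  # bars run along y

# flipped_aes in the built layer data records the orientation used.
flipped <- layer_data(p_flipped)$flipped_aes
```

With orientation = "y" the bars are positioned along the y axis and their length is read from x.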
geom_bar() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
geom_col() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
stat_count() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(count): number of points in bin.
after_stat(prop): groupwise proportion
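A common use of these computed variables is a bar chart of proportions instead of counts. In the sketch below, group = 1 is set so that all bars belong to a single group and prop is computed relative to the whole data set; without it, each bar forms its own group and its proportion is 1. The object names are illustrative.

```r
library(ggplot2)

# group = 1 puts all bars in one group, so prop is each class's share
# of the whole data set rather than 1 for every bar.
p_prop <- ggplot(mpg, aes(class)) +
  geom_bar(aes(y = after_stat(prop), group = 1))

# The computed variables are visible in the built layer data.
computed <- layer_data(p_prop)
```

The count and prop columns of computed sit next to each other, so the relation prop = count / sum(count) within the group can be checked directly.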
geom_histogram() for continuous data, position_dodge() and position_dodge2() for creating side-by-side bar charts.
stat_bin(), which bins data in ranges and counts the cases in each range. It differs from stat_count(), which counts the number of cases at each x position (without binning into ranges). stat_bin() requires continuous x data, whereas stat_count() can be used for both discrete and continuous x data.
# geom_bar is designed to make it easy to create bar charts that show
# counts (or sums of weights)
g <- ggplot(mpg, aes(class))
# Number of cars in each class:
g + geom_bar()
# Total engine displacement of each class
g + geom_bar(aes(weight = displ))
# Map class to y instead to flip the orientation
ggplot(mpg) + geom_bar(aes(y = class))

# Bar charts are automatically stacked when multiple bars are placed
# at the same location. The order of the fill is designed to match
# the legend
g + geom_bar(aes(fill = drv))

# If you need to flip the order (because you've flipped the orientation)
# call position_stack() explicitly:
ggplot(mpg, aes(y = class)) +
  geom_bar(aes(fill = drv), position = position_stack(reverse = TRUE)) +
  theme(legend.position = "top")

# To show (e.g.) means, you need geom_col()
df <- data.frame(trt = c("a", "b", "c"), outcome = c(2.3, 1.9, 3.2))
ggplot(df, aes(trt, outcome)) +
  geom_col()
# But geom_point() displays exactly the same information and doesn't
# require the y-axis to touch zero.
ggplot(df, aes(trt, outcome)) +
  geom_point()

# You can also use geom_bar() with continuous data, in which case
# it will show counts at unique locations
df <- data.frame(x = rep(c(2.9, 3.1, 4.5), c(5, 10, 4)))
ggplot(df, aes(x)) + geom_bar()
# cf. a histogram of the same data
ggplot(df, aes(x)) + geom_histogram(binwidth = 0.5)

# Use `just` to control how columns are aligned with axis breaks:
df <- data.frame(x = as.Date(c("2020-01-01", "2020-02-01")), y = 1:2)
# Columns centered on the first day of the month
ggplot(df, aes(x, y)) + geom_col(just = 0.5)
# Columns begin on the first day of the month
ggplot(df, aes(x, y)) + geom_col(just = 1)
Divides the plane into rectangles, counts the number of cases in each rectangle, and then (by default) maps the number of cases to the rectangle's fill. This is a useful alternative to geom_point() in the presence of overplotting.
geom_bin_2d(
  mapping = NULL,
  data = NULL,
  stat = "bin2d",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_bin_2d(
  mapping = NULL,
  data = NULL,
  geom = "tile",
  position = "identity",
  ...,
  bins = 30,
  binwidth = NULL,
  drop = TRUE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom , stat
|
Use to override the default connection between
|
bins |
numeric vector giving number of bins in both vertical and horizontal directions. Set to 30 by default. |
binwidth |
Numeric vector giving bin width in both vertical and
horizontal directions. Overrides |
drop |
if |
stat_bin_2d() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(count): number of points in bin.
after_stat(density): density of points in bin, scaled to integrate to 1.
after_stat(ncount): count, scaled to a maximum of 1.
after_stat(ndensity): density, scaled to a maximum of 1.
stat_bin_hex() for hexagonal binning
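By default the fill shows the count, but any of the computed variables listed above can be mapped instead. The sketch below maps fill to after_stat(density) using the built-in faithful data (chosen here only because it needs no axis limits) and reads the computed values back with layer_data().

```r
library(ggplot2)

# Map the fill to the computed density instead of the default count.
p_density <- ggplot(faithful, aes(eruptions, waiting)) +
  geom_bin_2d(aes(fill = after_stat(density)), bins = 10)

# One row per non-empty bin, including the computed variables.
binned <- layer_data(p_density)
total_density <- sum(binned$density)
```

Since the density is scaled to integrate to 1, total_density should come out as 1 up to floating point error.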
d <- ggplot(diamonds, aes(x, y)) + xlim(4, 10) + ylim(4, 10)
d + geom_bin_2d()

# You can control the size of the bins by specifying the number of
# bins in each direction:
d + geom_bin_2d(bins = 10)
d + geom_bin_2d(bins = 30)

# Or by specifying the width of the bins
d + geom_bin_2d(binwidth = c(0.1, 0.1))
The blank geom draws nothing, but can be a useful way of ensuring common scales between different plots. See expand_limits() for more details.
geom_blank(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
ggplot(mtcars, aes(wt, mpg)) # Nothing to see here!
The boxplot compactly displays the distribution of a continuous variable. It visualises five summary statistics (the median, two hinges and two whiskers), and all "outlying" points individually.
geom_boxplot(
  mapping = NULL,
  data = NULL,
  stat = "boxplot",
  position = "dodge2",
  ...,
  outliers = TRUE,
  outlier.colour = NULL,
  outlier.color = NULL,
  outlier.fill = NULL,
  outlier.shape = NULL,
  outlier.size = NULL,
  outlier.stroke = 0.5,
  outlier.alpha = NULL,
  notch = FALSE,
  notchwidth = 0.5,
  staplewidth = 0,
  varwidth = FALSE,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_boxplot(
  mapping = NULL,
  data = NULL,
  geom = "boxplot",
  position = "dodge2",
  ...,
  coef = 1.5,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
outliers |
Whether to display ( |
outlier.colour , outlier.color , outlier.fill , outlier.shape , outlier.size , outlier.stroke , outlier.alpha
|
Default aesthetics for outliers. Set to In the unlikely event you specify both US and UK spellings of colour, the US spelling will take precedence. |
notch |
If |
notchwidth |
For a notched box plot, width of the notch relative to
the body (defaults to |
staplewidth |
The relative width of staples to the width of the box. Staples mark the ends of the whiskers with a line. |
varwidth |
If |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom , stat
|
Use to override the default connection between
|
coef |
Length of the whiskers as multiple of IQR. Defaults to 1.5. |
This geom treats each axis differently and can thus have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation
parameter, which can be either "x"
or "y"
. The value gives the axis that the geom should run along, "x"
being the default orientation you would expect for the geom.
The lower and upper hinges correspond to the first and third quartiles
(the 25th and 75th percentiles). This differs slightly from the method used
by the boxplot()
function, and may be apparent with small samples.
See boxplot.stats()
for more information on how hinge
positions are calculated for boxplot()
.
The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the inter-quartile range, or distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR of the hinge. Data beyond the end of the whiskers are called "outlying" points and are plotted individually.
In a notched box plot, the notches extend 1.58 * IQR / sqrt(n)
.
This gives a roughly 95% confidence interval for comparing medians.
See McGill et al. (1978) for more details.
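The hinge, whisker, and notch definitions above can be checked by hand in base R. The following is a minimal sketch rather than ggplot2's own implementation: it assumes unweighted data, R's default quantile() method, and the default coef = 1.5. The vector y and all variable names are made up for illustration.

```r
# Hypothetical sample with one extreme value
y <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)
coef <- 1.5  # whisker length as a multiple of the IQR

# Hinges and median: 25th, 50th and 75th percentiles
qs <- quantile(y, c(0.25, 0.5, 0.75), names = FALSE)
iqr <- qs[3] - qs[1]

# Whiskers end at the most extreme observations inside the fences
lower_whisker <- min(y[y >= qs[1] - coef * iqr])
upper_whisker <- max(y[y <= qs[3] + coef * iqr])

# Anything beyond the whiskers is an "outlying" point
outlying <- y[y < lower_whisker | y > upper_whisker]

# Notch limits around the median
notch <- qs[2] + c(-1, 1) * 1.58 * iqr / sqrt(length(y))
```

Here the hinges are 3.25 and 7.75, so the whiskers stop at 1 and 9 and the value 30 would be drawn as an individual point.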
geom_boxplot()
understands the following aesthetics (required aesthetics are in bold):
lower
or xlower
upper
or xupper
middle
or xmiddle
weight
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation. stat_boxplot()
provides the following variables, some of which depend on the orientation:
after_stat(width)
width of boxplot.
after_stat(ymin)
or after_stat(xmin)
lower whisker = smallest observation greater than or equal to lower hinge - 1.5 * IQR.
after_stat(lower)
or after_stat(xlower)
lower hinge, 25% quantile.
after_stat(notchlower)
lower edge of notch = median - 1.58 * IQR / sqrt(n).
after_stat(middle)
or after_stat(xmiddle)
median, 50% quantile.
after_stat(notchupper)
upper edge of notch = median + 1.58 * IQR / sqrt(n).
after_stat(upper)
or after_stat(xupper)
upper hinge, 75% quantile.
after_stat(ymax)
or after_stat(xmax)
upper whisker = largest observation less than or equal to upper hinge + 1.5 * IQR.
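One way to inspect these computed variables is to build a plot and look at the data of its first layer. The sketch below assumes ggplot2 is installed and uses its layer_data() helper with the bundled mpg dataset; the object names are illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(class, hwy)) + geom_boxplot()

# One row per box, holding the statistics computed by stat_boxplot()
box_stats <- layer_data(p)
box_stats[, c("ymin", "lower", "middle", "upper", "ymax")]
```

Each row should satisfy ymin <= lower <= middle <= upper <= ymax, which is a quick sanity check when supplying your own values with stat = "identity".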
McGill, R., Tukey, J. W. and Larsen, W. A. (1978) Variations of box plots. The American Statistician 32, 12-16.
geom_quantile()
for continuous x
,
geom_violin()
for a richer display of the distribution, and
geom_jitter()
for a useful technique for small data.
p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot()
# Orientation follows the discrete axis
ggplot(mpg, aes(hwy, class)) + geom_boxplot()

p + geom_boxplot(notch = TRUE)
p + geom_boxplot(varwidth = TRUE)
p + geom_boxplot(fill = "white", colour = "#3366FF")
# By default, outlier points match the colour of the box. Use
# outlier.colour to override
p + geom_boxplot(outlier.colour = "red", outlier.shape = 1)
# Remove outliers when overlaying boxplot with original data points
p + geom_boxplot(outlier.shape = NA) + geom_jitter(width = 0.2)

# Boxplots are automatically dodged when any aesthetic is a factor
p + geom_boxplot(aes(colour = drv))

# You can also use boxplots with continuous x, as long as you supply
# a grouping variable. cut_width is particularly useful
ggplot(diamonds, aes(carat, price)) +
  geom_boxplot()
ggplot(diamonds, aes(carat, price)) +
  geom_boxplot(aes(group = cut_width(carat, 0.25)))
# Adjust the transparency of outliers using outlier.alpha
ggplot(diamonds, aes(carat, price)) +
  geom_boxplot(aes(group = cut_width(carat, 0.25)), outlier.alpha = 0.1)

# It's possible to draw a boxplot with your own computations if you
# use stat = "identity":
set.seed(1)
y <- rnorm(100)
df <- data.frame(
  x = 1,
  y0 = min(y),
  y25 = quantile(y, 0.25),
  y50 = median(y),
  y75 = quantile(y, 0.75),
  y100 = max(y)
)
ggplot(df, aes(x)) +
  geom_boxplot(
    aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
    stat = "identity"
  )
ggplot2 cannot draw true 3D surfaces, but you can use geom_contour()
,
geom_contour_filled()
, and geom_tile()
to visualise 3D surfaces in 2D.
These functions require regular data, where the x
and y
coordinates
form an equally spaced grid, and each combination of x
and y
appears
once. Missing values of z
are allowed, but contouring will only work for
grid points where all four corners are non-missing. If you have irregular
data, you'll need to first interpolate on to a grid before visualising,
using interp::interp()
, akima::bilinear()
, or similar.
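The regular-grid requirement can be illustrated in base R. This is only a sketch: is_regular_grid() is a hypothetical helper, not part of ggplot2, and the surface data are invented. It builds a complete, evenly spaced grid with expand.grid() and checks that every x/y combination appears exactly once.

```r
# Build a regular grid: every x/y combination appears exactly once
surface <- expand.grid(
  x = seq(-2, 2, length.out = 50),
  y = seq(-2, 2, length.out = 50)
)
surface$z <- exp(-(surface$x^2 + surface$y^2))

# Hypothetical helper (not part of ggplot2): TRUE when x and y form a
# complete, evenly spaced grid with no repeated combinations
is_regular_grid <- function(df) {
  evenly_spaced <- function(v) {
    steps <- diff(sort(unique(v)))
    length(steps) < 2 || isTRUE(all.equal(steps, rep(steps[1], length(steps))))
  }
  n_cells <- length(unique(df$x)) * length(unique(df$y))
  nrow(df) == n_cells &&
    anyDuplicated(df[c("x", "y")]) == 0 &&
    evenly_spaced(df$x) &&
    evenly_spaced(df$y)
}

is_regular_grid(surface)        # complete grid
is_regular_grid(surface[-1, ])  # one grid point removed
```

A data frame that passes this check can be mapped with aes(x, y, z = z) and drawn with geom_contour(); data that fail it would first need interpolating onto a grid.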
geom_contour(
  mapping = NULL,
  data = NULL,
  stat = "contour",
  position = "identity",
  ...,
  bins = NULL,
  binwidth = NULL,
  breaks = NULL,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_contour_filled(
  mapping = NULL,
  data = NULL,
  stat = "contour_filled",
  position = "identity",
  ...,
  bins = NULL,
  binwidth = NULL,
  breaks = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_contour(
  mapping = NULL,
  data = NULL,
  geom = "contour",
  position = "identity",
  ...,
  bins = NULL,
  binwidth = NULL,
  breaks = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_contour_filled(
  mapping = NULL,
  data = NULL,
  geom = "contour_filled",
  position = "identity",
  ...,
  bins = NULL,
  binwidth = NULL,
  breaks = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
bins |
Number of contour bins. Overridden by |
binwidth |
The width of the contour bins. Overridden by |
breaks |
One of:
Overrides |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom |
The geometric object to use to display the data for this layer.
When using a
|
geom_contour()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
geom_contour_filled()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
stat_contour()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
stat_contour_filled()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation. The computed variables differ somewhat for contour lines (computed by stat_contour()
) and contour bands (filled contours, computed by stat_contour_filled()
). The variables nlevel
and piece
are available for both, whereas level_low
, level_high
, and level_mid
are only available for bands. The variable level
is a numeric or a factor depending on whether lines or bands are calculated.
after_stat(level)
Height of contour. For contour lines, this is a numeric vector that represents bin boundaries. For contour bands, this is an ordered factor that represents bin ranges.
after_stat(level_low)
, after_stat(level_high)
, after_stat(level_mid)
(contour bands only) Lower and upper bin boundaries for each band, as well as the mid point between boundaries.
after_stat(nlevel)
Height of contour, scaled to a maximum of 1.
after_stat(piece)
Contour piece (an integer).
z
After contouring, the z values of individual data points are no longer available.
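The difference between the variables computed for contour lines and for contour bands can be seen by comparing the layer data of the two geoms. This sketch assumes ggplot2 is installed and uses its layer_data() helper with the bundled faithfuld dataset; the object names are illustrative.

```r
library(ggplot2)

v <- ggplot(faithfuld, aes(waiting, eruptions, z = density))

line_data <- layer_data(v + geom_contour())
band_data <- layer_data(v + geom_contour_filled())

# Contour lines: level holds the numeric bin boundaries
is.numeric(line_data$level)

# Contour bands: level is an ordered factor of bin ranges, and the
# numeric limits are carried in separate columns
is.ordered(band_data$level)
head(band_data[, c("level", "level_low", "level_mid", "level_high")])
```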
geom_density_2d()
: 2d density contours
# Basic plot
v <- ggplot(faithfuld, aes(waiting, eruptions, z = density))
v + geom_contour()

# Or compute from raw data
ggplot(faithful, aes(waiting, eruptions)) +
  geom_density_2d()

# use geom_contour_filled() for filled contours
v + geom_contour_filled()

# Setting bins creates evenly spaced contours in the range of the data
v + geom_contour(bins = 3)
v + geom_contour(bins = 5)

# Setting binwidth does the same thing, parameterised by the distance
# between contours
v + geom_contour(binwidth = 0.01)
v + geom_contour(binwidth = 0.001)

# Other parameters
v + geom_contour(aes(colour = after_stat(level)))
v + geom_contour(colour = "red")
v + geom_raster(aes(fill = density)) +
  geom_contour(colour = "white")
This is a variant of geom_point()
that counts the number of
observations at each location, then maps the count to point area. It is
useful when you have discrete data and overplotting.
geom_count(
  mapping = NULL,
  data = NULL,
  stat = "sum",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_sum(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom , stat
|
Use to override the default connection between
|
geom_point()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(n)
Number of observations at position.
after_stat(prop)
Proportion of points in that panel at that position (a value between 0 and 1).
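The two computed variables can be reproduced by hand in base R. This is a rough sketch under simplifying assumptions (a single panel and a single group, no weights); the data frame obs is invented for illustration.

```r
# Hypothetical discrete data with repeated positions
obs <- data.frame(
  x = c(1, 1, 1, 2, 2, 3),
  y = c(1, 1, 1, 2, 2, 3)
)

# n: number of observations at each distinct (x, y) position
obs$n <- 1
counts <- aggregate(n ~ x + y, data = obs, FUN = sum)

# prop: share of observations at each position (one group assumed)
counts$prop <- counts$n / sum(counts$n)
counts
```

With more than one group, the proportions would be computed within each group instead, which is why the group aesthetic matters when mapping after_stat(prop).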
For continuous x
and y
, use geom_bin_2d()
.
ggplot(mpg, aes(cty, hwy)) +
  geom_point()

ggplot(mpg, aes(cty, hwy)) +
  geom_count()

# Best used in conjunction with scale_size_area which ensures that
# counts of zero would be given size 0. Doesn't make much difference
# here because the smallest count is already close to 0.
ggplot(mpg, aes(cty, hwy)) +
  geom_count() +
  scale_size_area()

# Display proportions instead of counts -------------------------------------
# By default, all categorical variables in the plot form the groups.
# Specifying geom_count without a group identifier leads to a plot which is
# not useful:
d <- ggplot(diamonds, aes(x = cut, y = clarity))
d + geom_count(aes(size = after_stat(prop)))
# To correct this problem and achieve a more desirable plot, we need
# to specify which group the proportion is to be calculated over.
d + geom_count(aes(size = after_stat(prop), group = 1)) +
  scale_size_area(max_size = 10)

# Or group by x/y variables to have rows/columns sum to 1.
d + geom_count(aes(size = after_stat(prop), group = cut)) +
  scale_size_area(max_size = 10)
d + geom_count(aes(size = after_stat(prop), group = clarity)) +
  scale_size_area(max_size = 10)
Various ways of representing a vertical interval defined by x
,
ymin
and ymax
. Each case draws a single graphical object.
geom_crossbar(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  fatten = 2.5,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_errorbar(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_linerange(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_pointrange(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  fatten = 4,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
fatten |
A multiplicative factor used to increase the size of the
middle bar in |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
This geom treats each axis differently and can thus have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation
parameter, which can be either "x"
or "y"
. The value gives the axis that the geom should run along, "x"
being the default orientation you would expect for the geom.
geom_linerange()
understands the following aesthetics (required aesthetics are in bold):
Note that geom_pointrange()
also understands size
for the size of the points.
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
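Because these geoms use stat = "identity" by default, the interval limits have to be present in the data. The sketch below shows one possible way to derive them in base R, using the mean plus or minus one standard error as an arbitrary choice; the data and the helper summarise_group() are hypothetical.

```r
# Hypothetical raw measurements for two treatments
raw <- data.frame(
  trt = rep(c("a", "b"), each = 4),
  resp = c(1, 2, 3, 4, 2, 4, 6, 8)
)

# Summarise each treatment to a centre plus lower/upper limits
summarise_group <- function(d) {
  m <- mean(d$resp)
  se <- sd(d$resp) / sqrt(nrow(d))
  data.frame(trt = d$trt[1], resp = m, lower = m - se, upper = m + se)
}
intervals <- do.call(rbind, lapply(split(raw, raw$trt), summarise_group))
intervals
```

The resulting data frame can be passed to ggplot() with aes(trt, resp, ymin = lower, ymax = upper) and drawn with any of the four geoms.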
stat_summary()
for examples of these geoms in use,
geom_smooth()
for continuous analogue,
geom_errorbarh()
for a horizontal error bar.
# Create a simple example dataset
df <- data.frame(
  trt = factor(c(1, 1, 2, 2)),
  resp = c(1, 5, 3, 4),
  group = factor(c(1, 2, 1, 2)),
  upper = c(1.1, 5.3, 3.3, 4.2),
  lower = c(0.8, 4.6, 2.4, 3.6)
)

p <- ggplot(df, aes(trt, resp, colour = group))
p + geom_linerange(aes(ymin = lower, ymax = upper))
p + geom_pointrange(aes(ymin = lower, ymax = upper))
p + geom_crossbar(aes(ymin = lower, ymax = upper), width = 0.2)
p + geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2)

# Flip the orientation by changing mapping
ggplot(df, aes(resp, trt, colour = group)) +
  geom_linerange(aes(xmin = lower, xmax = upper))

# Draw lines connecting group means
p +
  geom_line(aes(group = group)) +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2)

# If you want to dodge bars and errorbars, you need to manually
# specify the dodge width
p <- ggplot(df, aes(trt, resp, fill = group))
p +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = lower, ymax = upper), position = "dodge", width = 0.25)

# Because the bars and errorbars have different widths
# we need to specify how wide the objects we are dodging are
dodge <- position_dodge(width = 0.9)
p +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = lower, ymax = upper), position = dodge, width = 0.25)

# When using geom_errorbar() with position_dodge2(), extra padding will be
# needed between the error bars to keep them aligned with the bars.
p +
  geom_col(position = "dodge2") +
  geom_errorbar(
    aes(ymin = lower, ymax = upper),
    position = position_dodge2(width = 0.5, padding = 0.5)
  )
Computes and draws a kernel density estimate, which is a smoothed version of the histogram. This is a useful alternative to the histogram for continuous data that comes from an underlying smooth distribution.
geom_density(
  mapping = NULL,
  data = NULL,
  stat = "density",
  position = "identity",
  ...,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE,
  outline.type = "upper"
)

stat_density(
  mapping = NULL,
  data = NULL,
  geom = "area",
  position = "stack",
  ...,
  bw = "nrd0",
  adjust = 1,
  kernel = "gaussian",
  n = 512,
  trim = FALSE,
  na.rm = FALSE,
  bounds = c(-Inf, Inf),
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
outline.type |
Type of the outline of the area; |
geom , stat
|
Use to override the default connection between
|
bw |
The smoothing bandwidth to be used.
If numeric, the standard deviation of the smoothing kernel.
If character, a rule to choose the bandwidth, as listed in
|
adjust |
A multiplicative bandwidth adjustment. This makes it possible
to adjust the bandwidth while still using a bandwidth estimator.
For example, |
kernel |
Kernel. See list of available kernels in |
n |
number of equally spaced points at which the density is to be
estimated, should be a power of two, see |
trim |
If |
bounds |
Known lower and upper bounds for estimated data. Default
|
This geom treats each axis differently and can thus have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation
parameter, which can be either "x"
or "y"
. The value gives the axis that the geom should run along, "x"
being the default orientation you would expect for the geom.
geom_density()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(density)
density estimate.
after_stat(count)
density * number of points - useful for stacked density plots.
after_stat(wdensity)
density * sum of weights. In absence of weights, the same as count
.
after_stat(scaled)
density estimate, scaled to maximum of 1.
after_stat(n)
number of points.
after_stat(ndensity)
alias for scaled
, to mirror the syntax of stat_bin()
.
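The relationship between these variables can be sketched with stats::density() in base R. This is an approximation for illustration, assuming unweighted data and that the bw, adjust, kernel and n arguments mirror those of stats::density(); the evaluation range used by stat_density() may differ from the default used here, and the sample x is made up.

```r
set.seed(1)
x <- rnorm(200)  # hypothetical sample

# Kernel density estimate using the same defaults as the usage above
kde <- density(x, bw = "nrd0", adjust = 1, kernel = "gaussian", n = 512)

est <- data.frame(x = kde$x, density = kde$y)
est$count  <- est$density * length(x)         # density * number of points
est$scaled <- est$density / max(est$density)  # maximum of 1

# `adjust` multiplies the bandwidth chosen by the estimator
half_bw <- density(x, adjust = 0.5)$bw
```

Since half_bw is half of kde$bw, adjust = 1/2 gives a less smooth curve while still letting the "nrd0" rule pick the baseline bandwidth.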
See geom_histogram()
, geom_freqpoly()
for
other methods of displaying continuous distribution.
See geom_violin()
for a compact density display.
ggplot(diamonds, aes(carat)) +
  geom_density()
# Map the values to y to flip the orientation
ggplot(diamonds, aes(y = carat)) +
  geom_density()

ggplot(diamonds, aes(carat)) +
  geom_density(adjust = 1/5)
ggplot(diamonds, aes(carat)) +
  geom_density(adjust = 5)

ggplot(diamonds, aes(depth, colour = cut)) +
  geom_density() +
  xlim(55, 70)
ggplot(diamonds, aes(depth, fill = cut, colour = cut)) +
  geom_density(alpha = 0.1) +
  xlim(55, 70)

# Use `bounds` to adjust computation for known data limits
big_diamonds <- diamonds[diamonds$carat >= 1, ]
ggplot(big_diamonds, aes(carat)) +
  geom_density(color = 'red') +
  geom_density(bounds = c(1, Inf), color = 'blue')

# Stacked density plots: if you want to create a stacked density plot, you
# probably want to use the 'count' (density * n) variable instead of the
# default density

# Loses marginal densities
ggplot(diamonds, aes(carat, fill = cut)) +
  geom_density(position = "stack")
# Preserves marginal densities
ggplot(diamonds, aes(carat, after_stat(count), fill = cut)) +
  geom_density(position = "stack")

# You can use position="fill" to produce a conditional density estimate
ggplot(diamonds, aes(carat, after_stat(count), fill = cut)) +
  geom_density(position = "fill")
Perform a 2D kernel density estimation using MASS::kde2d()
and
display the results with contours. This can be useful for dealing with
overplotting. This is a 2D version of geom_density()
. geom_density_2d()
draws contour lines, and geom_density_2d_filled()
draws filled contour
bands.
geom_density_2d(
  mapping = NULL,
  data = NULL,
  stat = "density_2d",
  position = "identity",
  ...,
  contour_var = "density",
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_density_2d_filled(
  mapping = NULL,
  data = NULL,
  stat = "density_2d_filled",
  position = "identity",
  ...,
  contour_var = "density",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_density_2d(
  mapping = NULL,
  data = NULL,
  geom = "density_2d",
  position = "identity",
  ...,
  contour = TRUE,
  contour_var = "density",
  n = 100,
  h = NULL,
  adjust = c(1, 1),
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_density_2d_filled(
  mapping = NULL,
  data = NULL,
  geom = "density_2d_filled",
  position = "identity",
  ...,
  contour = TRUE,
  contour_var = "density",
  n = 100,
  h = NULL,
  adjust = c(1, 1),
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Arguments passed on to
|
contour_var |
Character string identifying the variable to contour
by. Can be one of |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom , stat
|
Use to override the default connection between
|
contour |
If |
n |
Number of grid points in each direction. |
h |
Bandwidth (vector of length two). If |
adjust |
A multiplicative bandwidth adjustment to be used if 'h' is
'NULL'. This makes it possible to adjust the bandwidth while still
using a bandwidth estimator. For example, |
geom_density_2d()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
geom_density_2d_filled()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation. stat_density_2d()
and stat_density_2d_filled()
compute different variables depending on whether contouring is turned on or off. With contouring off (contour = FALSE
), both stats behave the same, and the following variables are provided:
after_stat(density)
The density estimate.
after_stat(ndensity)
Density estimate, scaled to a maximum of 1.
after_stat(count)
Density estimate * number of observations in group.
after_stat(n)
Number of observations in each group.
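The variables listed above can be inspected directly. The following sketch uses layer_data() to pull the computed values out of a built plot; the grid size n = 25 is an arbitrary choice for the example, and the MASS package is assumed to be installed because the stat relies on it.

```r
library(ggplot2)

# Sketch: inspect the variables computed by stat_density_2d() with contouring
# turned off.
p <- ggplot(faithful, aes(eruptions, waiting)) +
  stat_density_2d(
    geom = "raster",
    aes(fill = after_stat(ndensity)),
    contour = FALSE,
    n = 25
  )

computed <- layer_data(p)
head(computed[, c("x", "y", "density", "ndensity", "count", "n")])

# ndensity is density rescaled so that its maximum is 1
max(computed$ndensity)

# count is density multiplied by the number of observations in the group
all.equal(computed$count, computed$density * computed$n)
```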
With contouring on (contour = TRUE
), either stat_contour()
or
stat_contour_filled()
(for contour lines or contour bands,
respectively) is run after the density estimate has been obtained,
and the computed variables are determined by these stats.
Contours are calculated for one of the three types of density estimates
obtained before contouring, density
, ndensity
, and count
. Which
of those should be used is determined by the contour_var
parameter.
z
After density estimation, the z values of individual data points are no longer available.
If contouring is enabled, then similarly density
, ndensity
, and count
are no longer available after the contouring pass.
geom_contour()
, geom_contour_filled()
for information about
how contours are drawn; geom_bin_2d()
for another way of dealing with
overplotting.
m <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point() +
  xlim(0.5, 6) +
  ylim(40, 110)

# contour lines
m + geom_density_2d()

# contour bands
m + geom_density_2d_filled(alpha = 0.5)

# contour bands and contour lines
m + geom_density_2d_filled(alpha = 0.5) +
  geom_density_2d(linewidth = 0.25, colour = "black")

set.seed(4393)
dsmall <- diamonds[sample(nrow(diamonds), 1000), ]
d <- ggplot(dsmall, aes(x, y))
# If you map an aesthetic to a categorical variable, you will get a
# set of contours for each value of that variable
d + geom_density_2d(aes(colour = cut))

# If you draw filled contours across multiple facets, the same bins are
# used across all facets
d + geom_density_2d_filled() + facet_wrap(vars(cut))
# If you want to make sure the peak intensity is the same in each facet,
# use `contour_var = "ndensity"`.
d + geom_density_2d_filled(contour_var = "ndensity") + facet_wrap(vars(cut))
# If you want to scale intensity by the number of observations in each group,
# use `contour_var = "count"`.
d + geom_density_2d_filled(contour_var = "count") + facet_wrap(vars(cut))

# If we turn contouring off, we can use other geoms, such as tiles:
d + stat_density_2d(
  geom = "raster",
  aes(fill = after_stat(density)),
  contour = FALSE
) + scale_fill_viridis_c()
# Or points:
d + stat_density_2d(geom = "point", aes(size = after_stat(density)), n = 20, contour = FALSE)
In a dot plot, the width of a dot corresponds to the bin width (or maximum width, depending on the binning algorithm), and dots are stacked, with each dot representing one observation.
geom_dotplot( mapping = NULL, data = NULL, position = "identity", ..., binwidth = NULL, binaxis = "x", method = "dotdensity", binpositions = "bygroup", stackdir = "up", stackratio = 1, dotsize = 1, stackgroups = FALSE, origin = NULL, right = TRUE, width = 0.9, drop = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
binwidth |
When |
binaxis |
The axis to bin along, "x" (default) or "y" |
method |
"dotdensity" (default) for dot-density binning, or "histodot" for fixed bin widths (like stat_bin) |
binpositions |
When |
stackdir |
Which direction to stack the dots: "up" (default), "down", "center", or "centerwhole" (centered, but with dots aligned). |
stackratio |
How close to stack the dots. The default is 1, where dots just touch. Use smaller values for closer, overlapping dots. |
dotsize |
The diameter of the dots relative to |
stackgroups |
Should dots be stacked across groups? This has the effect
that |
origin |
When |
right |
When |
width |
When |
drop |
If TRUE, remove all bins with zero counts |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
There are two basic approaches: dot-density and histodot.
With dot-density binning, the bin positions are determined by the data and
binwidth
, which is the maximum width of each bin. See Wilkinson
(1999) for details on the dot-density binning algorithm. With histodot
binning, the bins have fixed positions and fixed widths, much like a
histogram.
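To make the difference concrete, here is a simplified base R sketch of dot-density binning in the spirit of Wilkinson (1999). It is an illustration only, not ggplot2's internal code, and dot_density_bins() is a hypothetical helper name.

```r
# Simplified sketch of dot-density binning: bin positions follow the data,
# and `binwidth` is the maximum width of a bin.
# `dot_density_bins()` is a hypothetical helper, not part of ggplot2.
dot_density_bins <- function(x, binwidth) {
  x <- sort(x)
  bin <- integer(length(x))
  current <- 0L
  bin_end <- -Inf
  for (i in seq_along(x)) {
    # Start a new bin at the first point that falls past the current bin
    if (x[i] >= bin_end) {
      bin_end <- x[i] + binwidth
      current <- current + 1L
    }
    bin[i] <- current
  }
  # Each bin is centred between its smallest and largest member
  data.frame(
    center = as.vector(tapply(x, bin, function(v) (min(v) + max(v)) / 2)),
    count  = as.vector(tapply(x, bin, length))
  )
}

bins <- dot_density_bins(c(1, 1.2, 1.9, 3, 3.4, 5), binwidth = 1)
bins
```

Because the centres are derived from the observations, they need not be evenly spaced. With histodot binning the centres would instead sit on a fixed, regular grid.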
When binning along the x axis and stacking along the y axis, the numbers on the y axis are not meaningful, due to technical limitations of ggplot2. You can hide the y axis, as in one of the examples, or manually scale it to match the number of dots.
geom_dotplot()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(x)
center of each bin, if binaxis
is "x"
.
after_stat(y)
center of each bin, if binaxis
is "y"
.
after_stat(binwidth)
maximum width of each bin if method is "dotdensity"
; width of each bin if method is "histodot"
.
after_stat(count)
number of points in bin.
after_stat(ncount)
count, scaled to a maximum of 1.
after_stat(density)
density of points in bin, scaled to integrate to 1, if method is "histodot"
.
after_stat(ndensity)
density, scaled to maximum of 1, if method is "histodot"
.
Wilkinson, L. (1999) Dot plots. The American Statistician, 53(3), 276-281.
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot()

ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5)

# Use fixed-width bins
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(method = "histodot", binwidth = 1.5)

# Some other stacking methods
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5, stackdir = "center")

ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5, stackdir = "centerwhole")

# y axis isn't really meaningful, so hide it
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5) +
  scale_y_continuous(NULL, breaks = NULL)

# Overlap dots vertically
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5, stackratio = .7)

# Expand dot diameter
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5, dotsize = 1.25)

# Change dot fill colour, stroke width
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5, fill = "white", stroke = 2)

# Examples with binning along y axis instead of x
ggplot(mtcars, aes(x = 1, y = mpg)) +
  geom_dotplot(binaxis = "y", stackdir = "center")

ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_dotplot(binaxis = "y", stackdir = "center")

ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_dotplot(binaxis = "y", stackdir = "centerwhole")

ggplot(mtcars, aes(x = factor(vs), fill = factor(cyl), y = mpg)) +
  geom_dotplot(binaxis = "y", stackdir = "center", position = "dodge")

# binpositions = "all" ensures that the bins are aligned between groups
ggplot(mtcars, aes(x = factor(am), y = mpg)) +
  geom_dotplot(binaxis = "y", stackdir = "center", binpositions = "all")

# Stacking multiple groups, with different fill
ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_dotplot(stackgroups = TRUE, binwidth = 1, binpositions = "all")

ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_dotplot(stackgroups = TRUE, binwidth = 1, method = "histodot")

ggplot(mtcars, aes(x = 1, y = mpg, fill = factor(cyl))) +
  geom_dotplot(binaxis = "y", stackgroups = TRUE, binwidth = 1, method = "histodot")
A rotated version of geom_errorbar()
.
geom_errorbarh( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom_errorbarh()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
df <- data.frame(
  trt = factor(c(1, 1, 2, 2)),
  resp = c(1, 5, 3, 4),
  group = factor(c(1, 2, 1, 2)),
  se = c(0.1, 0.3, 0.3, 0.2)
)

# Define the left and right ends of the errorbars
p <- ggplot(df, aes(resp, trt, colour = group))
p +
  geom_point() +
  geom_errorbarh(aes(xmax = resp + se, xmin = resp - se))

p +
  geom_point() +
  geom_errorbarh(aes(xmax = resp + se, xmin = resp - se, height = .2))
Visualise the distribution of a single continuous variable by dividing
the x axis into bins and counting the number of observations in each bin.
Histograms (geom_histogram()
) display the counts with bars; frequency
polygons (geom_freqpoly()
) display the counts with lines. Frequency
polygons are more suitable when you want to compare the distribution
across the levels of a categorical variable.
geom_freqpoly( mapping = NULL, data = NULL, stat = "bin", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_histogram( mapping = NULL, data = NULL, stat = "bin", position = "stack", ..., binwidth = NULL, bins = NULL, na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE ) stat_bin( mapping = NULL, data = NULL, geom = "bar", position = "stack", ..., binwidth = NULL, bins = NULL, center = NULL, boundary = NULL, breaks = NULL, closed = c("right", "left"), pad = FALSE, na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that takes x after scale transformation as input and
returns a single numeric value. When specifying a function along with a
grouping structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is measured in days; the bin width of a time variable is measured in seconds. |
bins |
Number of bins. Overridden by |
orientation |
The orientation of the layer. The default ( |
geom , stat
|
Use to override the default connection between
|
center , boundary
|
bin position specifiers. Only one, |
breaks |
Alternatively, you can supply a numeric vector giving
the bin boundaries. Overrides |
closed |
One of |
pad |
If |
stat_bin()
is suitable only for continuous x data. If your x data is
discrete, you probably want to use stat_count()
.
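As a small illustration of the discrete case, the sketch below counts cases per value with geom_bar(), whose default stat is stat_count(), and compares the result with a plain tabulation. The mpg dataset ships with ggplot2; the object names p and counted are arbitrary.

```r
library(ggplot2)

# Sketch: for a discrete x variable, count cases per value with stat_count()
# (the default stat of geom_bar()) rather than binning with stat_bin().
p <- ggplot(mpg, aes(class)) +
  geom_bar()

counted <- layer_data(p)
counted[, c("x", "count")]

# The counts agree with a plain tabulation of the variable
table(mpg$class)
```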
By default, the underlying computation (stat_bin()
) uses 30 bins;
this is not a good default, but the idea is to get you experimenting with
different numbers of bins. You can also experiment by modifying the binwidth
with the
center
or boundary
arguments. binwidth
overrides bins
so you should do
one change at a time. You may need to look at a few options to uncover
the full story behind your data.
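A minimal sketch of this kind of experiment follows. It fixes the bin width, anchors a bin edge at zero with boundary, and then inspects the resulting bins with layer_data(). The data frame toy is made up purely for illustration.

```r
library(ggplot2)

# Sketch: fix the bin width and align bin edges with `boundary`.
# `toy` is a made-up data frame used only for illustration.
toy <- data.frame(x = c(0.2, 0.7, 0.9, 1.4, 2.6))

p <- ggplot(toy, aes(x)) +
  geom_histogram(binwidth = 0.5, boundary = 0)

bins <- layer_data(p)
bins[, c("xmin", "xmax", "count")]
```

Every bin is 0.5 wide and its edges fall on multiples of 0.5. Replacing boundary = 0 with center = 0 would instead centre a bin on zero.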
In addition to geom_histogram()
, you can create a histogram plot by using
scale_x_binned()
with geom_bar()
. This method by default plots tick marks
in between each bar.
This geom treats each axis differently and can therefore have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation
parameter, which can be either "x"
or "y"
. The value gives the axis that the geom should run along, "x"
being the default orientation you would expect for the geom.
geom_histogram()
uses the same aesthetics as geom_bar()
;
geom_freqpoly()
uses the same aesthetics as geom_line()
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(count)
number of points in bin.
after_stat(density)
density of points in bin, scaled to integrate to 1.
after_stat(ncount)
count, scaled to a maximum of 1.
after_stat(ndensity)
density, scaled to a maximum of 1.
after_stat(width)
widths of bins.
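The relationships between these variables can be written out in a few lines of base R. The counts and widths below are made-up values chosen for illustration, not output from ggplot2.

```r
# Base R sketch of how the computed variables relate to one another.
# The counts and widths are made-up values, not output from ggplot2.
count <- c(2, 5, 3)
width <- c(0.5, 0.5, 0.5)

density  <- count / (sum(count) * width)  # integrates to 1 over the bins
ncount   <- count / max(count)            # largest bin has height 1
ndensity <- density / max(density)        # largest density has height 1

sum(density * width)
```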
weight
After binning, weights of individual data points (if supplied) are no longer available.
stat_count()
, which counts the number of cases at each x
position, without binning. It is suitable for both discrete and continuous
x data, whereas stat_bin()
is suitable only for continuous x data.
ggplot(diamonds, aes(carat)) +
  geom_histogram()
ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth = 0.01)
ggplot(diamonds, aes(carat)) +
  geom_histogram(bins = 200)
# Map values to y to flip the orientation
ggplot(diamonds, aes(y = carat)) +
  geom_histogram()

# For histograms with tick marks between each bin, use `geom_bar()` with
# `scale_x_binned()`.
ggplot(diamonds, aes(carat)) +
  geom_bar() +
  scale_x_binned()

# Rather than stacking histograms, it's easier to compare frequency
# polygons
ggplot(diamonds, aes(price, fill = cut)) +
  geom_histogram(binwidth = 500)
ggplot(diamonds, aes(price, colour = cut)) +
  geom_freqpoly(binwidth = 500)

# To make it easier to compare distributions with very different counts,
# put density on the y axis instead of the default count
ggplot(diamonds, aes(price, after_stat(density), colour = cut)) +
  geom_freqpoly(binwidth = 500)

if (require("ggplot2movies")) {
# Often we don't want the height of the bar to represent the
# count of observations, but the sum of some other variable.
# For example, the following plot shows the number of movies
# in each rating.
m <- ggplot(movies, aes(rating))
m + geom_histogram(binwidth = 0.1)

# If, however, we want to see the number of votes cast in each
# category, we need to weight by the votes variable
m +
  geom_histogram(aes(weight = votes), binwidth = 0.1) +
  ylab("votes")

# For transformed scales, binwidth applies to the transformed data.
# The bins have constant width on the transformed scale.
m +
  geom_histogram() +
  scale_x_log10()
m +
  geom_histogram(binwidth = 0.05) +
  scale_x_log10()

# For transformed coordinate systems, the binwidth applies to the
# raw data. The bins have constant width on the original scale.

# Using log scales does not work here, because the first
# bar is anchored at zero, and so when transformed becomes negative
# infinity. This is not a problem when transforming the scales, because
# no observations have 0 ratings.
m +
  geom_histogram(boundary = 0) +
  coord_trans(x = "log10")
# Use boundary = 0, to make sure we don't take sqrt of negative values
m +
  geom_histogram(boundary = 0) +
  coord_trans(x = "sqrt")

# You can also transform the y axis. Remember that the base of the bars
# has value 0, so log transformations are not appropriate
m <- ggplot(movies, aes(x = rating))
m +
  geom_histogram(binwidth = 0.5) +
  scale_y_sqrt()
}

# You can specify a function for calculating binwidth, which is
# particularly useful when faceting along variables with
# different ranges because the function will be called once per facet
ggplot(economics_long, aes(value)) +
  facet_wrap(~variable, scales = 'free_x') +
  geom_histogram(binwidth = function(x) 2 * IQR(x) / (length(x)^(1/3)))
Computes and draws a function as a continuous curve. This makes it easy to superimpose a function on top of an existing plot. The function is called with a grid of evenly spaced values along the x axis, and the results are drawn (by default) with a line.
geom_function( mapping = NULL, data = NULL, stat = "function", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_function( mapping = NULL, data = NULL, geom = "function", position = "identity", ..., fun, xlim = NULL, n = 101, args = list(), na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
Ignored by |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom |
The geometric object to use to display the data for this layer.
When using a
|
fun |
Function to use. Either 1) an anonymous function in the base or
rlang formula syntax (see |
xlim |
Optionally, specify the range of the function. |
n |
Number of points to interpolate along the x axis. |
args |
List of additional arguments passed on to the function defined by |
geom_function()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(x)
x
values along a grid.
after_stat(y)
values of the function evaluated at corresponding x
.
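What the stat computes can be reproduced by hand in base R. The sketch below is an illustration of the idea rather than ggplot2's own code; the names fun, xlim, n and grid simply mirror the arguments described above.

```r
# Base R sketch of what the stat computes: evaluate `fun` on an evenly spaced
# grid of `n` x values and keep the results as y.
fun  <- dnorm
xlim <- c(-5, 5)
n    <- 101

x <- seq(xlim[1], xlim[2], length.out = n)
# The extra named values play the role of the `args` list
y <- do.call(fun, c(list(x), list(mean = 0, sd = 1)))

grid <- data.frame(x = x, y = y)
head(grid, 3)
```

A larger n gives a smoother curve at the cost of more function evaluations.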
# geom_function() is useful for overlaying functions
set.seed(1492)
ggplot(data.frame(x = rnorm(100)), aes(x)) +
  geom_density() +
  geom_function(fun = dnorm, colour = "red")

# To plot functions without data, specify range of x-axis
base <-
  ggplot() +
  xlim(-5, 5)

base + geom_function(fun = dnorm)

base + geom_function(fun = dnorm, args = list(mean = 2, sd = .5))

# The underlying mechanics evaluate the function at discrete points
# and connect the points with lines
base + stat_function(fun = dnorm, geom = "point")

base + stat_function(fun = dnorm, geom = "point", n = 20)

base + stat_function(fun = dnorm, geom = "polygon", color = "blue", fill = "blue", alpha = 0.5)

base + geom_function(fun = dnorm, n = 20)

# Two functions on the same plot
base +
  geom_function(aes(colour = "normal"), fun = dnorm) +
  geom_function(aes(colour = "t, df = 1"), fun = dt, args = list(df = 1))

# Using a custom anonymous function
base + geom_function(fun = function(x) 0.5 * exp(-abs(x)))
# or using lambda syntax:
# base + geom_function(fun = ~ 0.5 * exp(-abs(.x)))
# or in R4.1.0 and above:
# base + geom_function(fun = \(x) 0.5 * exp(-abs(x)))
# or using a custom named function:
# f <- function(x) 0.5 * exp(-abs(x))
# base + geom_function(fun = f)

# Using xlim to restrict the range of function
ggplot(data.frame(x = rnorm(100)), aes(x)) +
  geom_density() +
  geom_function(fun = dnorm, colour = "red", xlim = c(-1, 1))

# Using xlim to widen the range of function
ggplot(data.frame(x = rnorm(100)), aes(x)) +
  geom_density() +
  geom_function(fun = dnorm, colour = "red", xlim = c(-7, 7))
Divides the plane into regular hexagons, counts the number of cases in
each hexagon, and then (by default) maps the number of cases to the hexagon
fill. Hexagon bins avoid the visual artefacts sometimes generated by
the very regular alignment of geom_bin_2d()
.
geom_hex( mapping = NULL, data = NULL, stat = "binhex", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_bin_hex( mapping = NULL, data = NULL, geom = "hex", position = "identity", ..., bins = 30, binwidth = NULL, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom , stat
|
Override the default connection between |
bins |
numeric vector giving number of bins in both vertical and horizontal directions. Set to 30 by default. |
binwidth |
Numeric vector giving bin width in both vertical and
horizontal directions. Overrides |
geom_hex()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
stat_binhex()
understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs")
.
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(count)
number of points in bin.
after_stat(density)
density of points in bin, scaled to integrate to 1.
after_stat(ncount)
count, scaled to maximum of 1.
after_stat(ndensity)
density, scaled to maximum of 1.
stat_bin_2d()
for rectangular binning
d <- ggplot(diamonds, aes(carat, price))
d + geom_hex()

# You can control the size of the bins by specifying the number of
# bins in each direction:
d + geom_hex(bins = 10)
d + geom_hex(bins = 30)

# Or by specifying the width of the bins
d + geom_hex(binwidth = c(1, 1000))
d + geom_hex(binwidth = c(.1, 500))
The jitter geom is a convenient shortcut for
geom_point(position = "jitter")
. It adds a small amount of random
variation to the location of each point, and is a useful way of handling
overplotting caused by discreteness in smaller datasets.
geom_jitter( mapping = NULL, data = NULL, stat = "identity", position = "jitter", ..., width = NULL, height = NULL, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
width , height
|
Amount of vertical and horizontal jitter. The jitter is added in both positive and negative directions, so the total spread is twice the value specified here. If omitted, defaults to 40% of the resolution of the data: this means the jitter values will occupy 80% of the implied bins. Categorical data is aligned on the integers, so a width or height of 0.5 will spread the data so it's not possible to see the distinction between the categories. |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
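The default described for width and height can be worked through by hand. The sketch below uses base R only and assumes that "resolution" means the smallest gap between distinct values; default_jitter is a hypothetical helper written for this illustration, not a ggplot2 function.

```r
# Hypothetical helper: 40% of the smallest gap between distinct values
default_jitter <- function(x) {
  gaps <- diff(sort(unique(x)))
  0.4 * min(gaps)
}

cyl <- c(4, 6, 8, 4, 6, 8)          # categories two units apart
gap <- min(diff(sort(unique(cyl))))

w <- default_jitter(cyl)            # jitter amount in each direction
share <- (2 * w) / gap              # total spread as a share of the gap

# Jitter is applied in both the negative and positive direction
set.seed(42)
jittered <- cyl + runif(length(cyl), min = -w, max = w)

print(w)
print(share)
```

Under these assumptions the points of one category occupy 80% of the space between neighbouring categories, which leaves a visible gap. A width of half the gap or more would make neighbouring categories touch.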
geom_point() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
geom_point() for regular, unjittered points,
geom_boxplot() for another way of looking at the conditional
distribution of a variable
p <- ggplot(mpg, aes(cyl, hwy))
p + geom_point()
p + geom_jitter()

# Add aesthetic mappings
p + geom_jitter(aes(colour = class))

# Use smaller width/height to emphasise categories
ggplot(mpg, aes(cyl, hwy)) +
  geom_jitter()
ggplot(mpg, aes(cyl, hwy)) +
  geom_jitter(width = 0.25)

# Use larger width/height to completely smooth away discreteness
ggplot(mpg, aes(cty, hwy)) +
  geom_jitter()
ggplot(mpg, aes(cty, hwy)) +
  geom_jitter(width = 0.5, height = 0.5)
Text geoms are useful for labeling plots. They can be used by themselves as
scatterplots or in combination with other geoms, for example, for labeling
points or for annotating the height of bars. geom_text() adds only text
to the plot. geom_label() draws a rectangle behind the text, making it
easier to read.
geom_label(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  parse = FALSE,
  nudge_x = 0,
  nudge_y = 0,
  label.padding = unit(0.25, "lines"),
  label.r = unit(0.15, "lines"),
  label.size = 0.25,
  size.unit = "mm",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_text(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  parse = FALSE,
  nudge_x = 0,
  nudge_y = 0,
  check_overlap = FALSE,
  size.unit = "mm",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer.
Cannot be jointly specified with
|
... |
Other arguments passed on to
|
parse |
If |
nudge_x , nudge_y
|
Horizontal and vertical adjustment to nudge labels by.
Useful for offsetting text from points, particularly on discrete scales.
Cannot be jointly specified with |
label.padding |
Amount of padding around label. Defaults to 0.25 lines. |
label.r |
Radius of rounded corners. Defaults to 0.15 lines. |
label.size |
Size of label border, in mm. |
size.unit |
How the |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check_overlap |
If |
Note that when you resize a plot, text labels stay the same size, even though the size of the plot area changes. This happens because the "width" and "height" of a text element are 0. Obviously, text labels do have height and width, but they are physical units, not data units. For the same reason, stacking and dodging text will not work by default, and axis limits are not automatically expanded to include all text.
geom_text() and geom_label() add labels for each row in the
data, even if coordinates x, y are set to single values in the call
to geom_label() or geom_text().
To add labels at specified points use annotate() with
annotate(geom = "text", ...) or annotate(geom = "label", ...).
To automatically position non-overlapping text labels see the ggrepel package.
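The difference between a text layer with fixed coordinates and an annotation can be checked by counting the rows each layer draws. This is a minimal sketch, assuming ggplot2 is installed; the label strings and the coordinates (3, 30) are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# geom_text() with constant x and y still draws one label per row of mtcars,
# so the same string is overplotted many times
repeated <- p + geom_text(x = 3, y = 30, label = "drawn once per row")

# annotate() builds its own small data set, so the label is drawn once
single <- p + annotate("text", x = 3, y = 30, label = "drawn once")

# Layer 2 is the text layer in both plots
n_repeated <- nrow(layer_data(repeated, 2))
n_single <- nrow(layer_data(single, 2))
print(c(geom_text = n_repeated, annotate = n_single))
```

Overplotted copies of the same label tend to look heavier and more jagged than a single label, which is usually the visible symptom of this problem.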
geom_text() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
geom_label()
Currently geom_label() does not support the check_overlap argument. Also,
it is considerably slower than geom_text(). The fill aesthetic controls
the background colour of the label.
You can modify text alignment with the vjust and hjust aesthetics. These
can either be a number between 0 (left/bottom) and 1 (right/top) or a
character ("left", "middle", "right", "bottom", "center", "top"). There
are two special alignments: "inward" and "outward". Inward always aligns
text towards the center, and outward aligns it away from the center.
The text labels section of the online ggplot2 book.
p <- ggplot(mtcars, aes(wt, mpg, label = rownames(mtcars)))

p + geom_text()
# Avoid overlaps
p + geom_text(check_overlap = TRUE)
# Labels with background
p + geom_label()
# Change size of the label
p + geom_text(size = 10)

# Set aesthetics to fixed value
p +
  geom_point() +
  geom_text(hjust = 0, nudge_x = 0.05)
p +
  geom_point() +
  geom_text(vjust = 0, nudge_y = 0.5)
p +
  geom_point() +
  geom_text(angle = 45)
## Not run:
# Doesn't work on all systems
p +
  geom_text(family = "Times New Roman")
## End(Not run)

# Add aesthetic mappings
p + geom_text(aes(colour = factor(cyl)))
p + geom_text(aes(colour = factor(cyl))) +
  scale_colour_discrete(l = 40)
p + geom_label(aes(fill = factor(cyl)), colour = "white", fontface = "bold")

# Scale size of text, and change legend key glyph from a to point
p + geom_text(aes(size = wt), key_glyph = "point")
# Scale height of text, rather than sqrt(height)
p +
  geom_text(aes(size = wt), key_glyph = "point") +
  scale_radius(range = c(3,6))

# You can display expressions by setting parse = TRUE. The
# details of the display are described in ?plotmath, but note that
# geom_text uses strings, not expressions.
p +
  geom_text(
    aes(label = paste(wt, "^(", cyl, ")", sep = "")),
    parse = TRUE
  )

# Add a text annotation
p +
  geom_text() +
  annotate(
    "text", label = "plot mpg vs. wt",
    x = 2, y = 15, size = 8, colour = "red"
  )

# Aligning labels and bars --------------------------------------------------
df <- data.frame(
  x = factor(c(1, 1, 2, 2)),
  y = c(1, 3, 2, 1),
  grp = c("a", "b", "a", "b")
)

# ggplot2 doesn't know you want to give the labels the same virtual width
# as the bars:
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp), position = "dodge") +
  geom_text(aes(label = y), position = "dodge")
# So tell it:
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp), position = "dodge") +
  geom_text(aes(label = y), position = position_dodge(0.9))
# You can't nudge and dodge text, so instead adjust the y position
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp), position = "dodge") +
  geom_text(
    aes(label = y, y = y + 0.05),
    position = position_dodge(0.9),
    vjust = 0
  )

# To place text in the middle of each bar in a stacked barplot, you
# need to set the vjust parameter of position_stack()
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp)) +
  geom_text(aes(label = y), position = position_stack(vjust = 0.5))

# Justification -------------------------------------------------------------
df <- data.frame(
  x = c(1, 1, 2, 2, 1.5),
  y = c(1, 2, 1, 2, 1.5),
  text = c("bottom-left", "top-left", "bottom-right", "top-right", "center")
)
ggplot(df, aes(x, y)) +
  geom_text(aes(label = text))
ggplot(df, aes(x, y)) +
  geom_text(aes(label = text), vjust = "inward", hjust = "inward")
Display polygons as a map. This is meant as annotation, so it does not
affect position scales. Note that this function predates the geom_sf()
framework and does not work with sf geometry columns as input. However,
it can be used in conjunction with geom_sf() layers and/or coord_sf()
(see examples).
geom_map(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  ...,
  map,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
... |
Other arguments passed on to
|
map |
Data frame that contains the map coordinates. This will
typically be created using |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom_map() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
# First, a made-up example containing a few polygons, to explain
# how `geom_map()` works. It requires two data frames:
# One contains the coordinates of each polygon (`positions`), and is
# provided via the `map` argument. The other contains the
# values associated with each polygon (`values`). An id
# variable links the two together.

ids <- factor(c("1.1", "2.1", "1.2", "2.2", "1.3", "2.3"))

values <- data.frame(
  id = ids,
  value = c(3, 3.1, 3.1, 3.2, 3.15, 3.5)
)

positions <- data.frame(
  id = rep(ids, each = 4),
  x = c(2, 1, 1.1, 2.2, 1, 0, 0.3, 1.1, 2.2, 1.1, 1.2, 2.5, 1.1, 0.3,
        0.5, 1.2, 2.5, 1.2, 1.3, 2.7, 1.2, 0.5, 0.6, 1.3),
  y = c(-0.5, 0, 1, 0.5, 0, 0.5, 1.5, 1, 0.5, 1, 2.1, 1.7, 1, 1.5,
        2.2, 2.1, 1.7, 2.1, 3.2, 2.8, 2.1, 2.2, 3.3, 3.2)
)

ggplot(values) +
  geom_map(aes(map_id = id), map = positions) +
  expand_limits(positions)
ggplot(values, aes(fill = value)) +
  geom_map(aes(map_id = id), map = positions) +
  expand_limits(positions)
ggplot(values, aes(fill = value)) +
  geom_map(aes(map_id = id), map = positions) +
  expand_limits(positions) + ylim(0, 3)

# Now some examples with real maps
if (require(maps)) {

  crimes <- data.frame(state = tolower(rownames(USArrests)), USArrests)

  # Equivalent to crimes %>% tidyr::pivot_longer(Murder:Rape)
  vars <- lapply(names(crimes)[-1], function(j) {
    data.frame(state = crimes$state, variable = j, value = crimes[[j]])
  })
  crimes_long <- do.call("rbind", vars)

  states_map <- map_data("state")

  # without geospatial coordinate system, the resulting plot
  # looks weird
  ggplot(crimes, aes(map_id = state)) +
    geom_map(aes(fill = Murder), map = states_map) +
    expand_limits(x = states_map$long, y = states_map$lat)

  # in combination with `coord_sf()` we get an appropriate result
  ggplot(crimes, aes(map_id = state)) +
    geom_map(aes(fill = Murder), map = states_map) +
    # crs = 5070 is a Conus Albers projection for North America,
    #   see: https://epsg.io/5070
    # default_crs = 4326 tells coord_sf() that the input map data
    #   are in longitude-latitude format
    coord_sf(
      crs = 5070, default_crs = 4326,
      xlim = c(-125, -70), ylim = c(25, 52)
    )

  ggplot(crimes_long, aes(map_id = state)) +
    geom_map(aes(fill = value), map = states_map) +
    coord_sf(
      crs = 5070, default_crs = 4326,
      xlim = c(-125, -70), ylim = c(25, 52)
    ) +
    facet_wrap(~variable)
}
geom_path() connects the observations in the order in which they appear
in the data. geom_line() connects them in order of the variable on the
x axis. geom_step() creates a stairstep plot, highlighting exactly
when changes occur. The group aesthetic determines which cases are
connected together.
geom_path(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  arrow = NULL,
  arrow.fill = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_line(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)

geom_step(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  direction = "hv",
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
arrow |
Arrow specification, as created by |
arrow.fill |
fill colour to use for the arrow head (if closed). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
orientation |
The orientation of the layer. The default ( |
direction |
direction of stairs: 'vh' for vertical then horizontal, 'hv' for horizontal then vertical, or 'mid' for step half-way between adjacent x-values. |
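What "horizontal then vertical" and "vertical then horizontal" mean for the drawn path can be spelled out with a small base R sketch. The helpers step_hv and step_vh are hypothetical and written only for this illustration; they are not ggplot2's internal code, and the "mid" variant is left out.

```r
# Hypothetical helper: move horizontally to the next x, then vertically to its y
step_hv <- function(x, y) {
  n <- length(x)
  if (n < 2) return(data.frame(x = x, y = y))
  data.frame(
    x = c(x[1], rep(x[-1], each = 2)),
    y = c(rep(y[-n], each = 2), y[n])
  )
}

# Hypothetical helper: move vertically to the next y, then horizontally to its x
step_vh <- function(x, y) {
  n <- length(x)
  if (n < 2) return(data.frame(x = x, y = y))
  data.frame(
    x = c(rep(x[-n], each = 2), x[n]),
    y = c(y[1], rep(y[-1], each = 2))
  )
}

obs_x <- c(1, 2, 3)
obs_y <- c(1, 3, 2)

hv <- step_hv(obs_x, obs_y)
vh <- step_vh(obs_x, obs_y)
print(hv)
print(vh)
```

With "hv" the old y value is held until the next observation's x position, so a change appears at the moment it is observed; with "vh" the new value is shown from the previous observation onwards.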
An alternative parameterisation is geom_segment(), where each line
corresponds to a single case which provides the start and end coordinates.
This geom treats each axis differently and can thus have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation parameter, which can be either "x" or "y". The value gives the axis that the geom should run along, "x" being the default orientation you would expect for the geom.
geom_path() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
geom_path(), geom_line(), and geom_step() handle NA as follows:
If an NA occurs in the middle of a line, it breaks the line. No warning is shown, regardless of whether na.rm is TRUE or FALSE.
If an NA occurs at the start or the end of the line and na.rm is FALSE (default), the NA is removed with a warning.
If an NA occurs at the start or the end of the line and na.rm is TRUE, the NA is removed silently, without warning.
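These three cases can be tried out by counting the warnings raised while a plot is rendered. This is a minimal sketch, assuming ggplot2 is installed; count_warnings is a hypothetical helper written for this illustration, and the data frames are made up.

```r
library(ggplot2)

# Hypothetical helper: render a plot off-screen and count the warnings raised
count_warnings <- function(plot) {
  n <- 0
  pdf(NULL)                 # null device, so nothing is written to disk
  on.exit(dev.off())
  withCallingHandlers(
    ggplotGrob(plot),
    warning = function(w) {
      n <<- n + 1
      invokeRestart("muffleWarning")
    }
  )
  n
}

df_mid <- data.frame(x = 1:5, y = c(1, 2, NA, 4, 5))   # NA in the middle
df_end <- data.frame(x = 1:5, y = c(NA, 2, 3, 4, 5))   # NA at the start

mid <- ggplot(df_mid, aes(x, y)) + geom_line()
end_default <- ggplot(df_end, aes(x, y)) + geom_line()
end_silent <- ggplot(df_end, aes(x, y)) + geom_line(na.rm = TRUE)

n_mid <- count_warnings(mid)             # line is broken, no warning expected
n_default <- count_warnings(end_default) # NA removed, warning expected
n_silent <- count_warnings(end_silent)   # NA removed silently
print(c(mid = n_mid, end_default = n_default, end_silent = n_silent))
```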
geom_polygon(): Filled paths (polygons);
geom_segment(): Line segments
# geom_line() is suitable for time series
ggplot(economics, aes(date, unemploy)) + geom_line()
# separate by colour and use "timeseries" legend key glyph
ggplot(economics_long, aes(date, value01, colour = variable)) +
  geom_line(key_glyph = "timeseries")

# You can get a timeseries that runs vertically by setting the orientation
ggplot(economics, aes(unemploy, date)) + geom_line(orientation = "y")

# geom_step() is useful when you want to highlight exactly when
# the y value changes
recent <- economics[economics$date > as.Date("2013-01-01"), ]
ggplot(recent, aes(date, unemploy)) + geom_line()
ggplot(recent, aes(date, unemploy)) + geom_step()

# geom_path lets you explore how two variables are related over time,
# e.g. unemployment and personal savings rate
m <- ggplot(economics, aes(unemploy/pop, psavert))
m + geom_path()
m + geom_path(aes(colour = as.numeric(date)))

# Changing parameters ----------------------------------------------
ggplot(economics, aes(date, unemploy)) +
  geom_line(colour = "red")

# Use the arrow parameter to add an arrow to the line
# See ?arrow for more details
c <- ggplot(economics, aes(x = date, y = pop))
c + geom_line(arrow = arrow())
c + geom_line(
  arrow = arrow(angle = 15, ends = "both", type = "closed")
)

# Control line join parameters
df <- data.frame(x = 1:3, y = c(4, 1, 9))
base <- ggplot(df, aes(x, y))
base + geom_path(linewidth = 10)
base + geom_path(linewidth = 10, lineend = "round")
base + geom_path(linewidth = 10, linejoin = "mitre", lineend = "butt")

# You can use NAs to break the line.
df <- data.frame(x = 1:5, y = c(1, 2, NA, 4, 5))
ggplot(df, aes(x, y)) + geom_point() + geom_line()

# Setting line type vs colour/size
# Line type needs to be applied to a line as a whole, so it can
# not be used with colour or size that vary across a line
x <- seq(0.01, .99, length.out = 100)
df <- data.frame(
  x = rep(x, 2),
  y = c(qlogis(x), 2 * qlogis(x)),
  group = rep(c("a","b"), each = 100)
)
p <- ggplot(df, aes(x=x, y=y, group=group))
# These work
p + geom_line(linetype = 2)
p + geom_line(aes(colour = group), linetype = 2)
p + geom_line(aes(colour = x))
# But this doesn't
should_stop(p + geom_line(aes(colour = x), linetype=2))
The point geom is used to create scatterplots. The scatterplot is most
useful for displaying the relationship between two continuous variables.
It can be used to compare one continuous and one categorical variable, or
two categorical variables, but a variation like geom_jitter(),
geom_count(), or geom_bin_2d() is usually more appropriate. A bubblechart
is a scatterplot with a third variable mapped to the size of points.
geom_point(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
The biggest potential problem with a scatterplot is overplotting: whenever
you have more than a few points, points may be plotted on top of one
another. This can severely distort the visual appearance of the plot.
There is no one solution to this problem, but there are some techniques
that can help. You can add additional information with geom_smooth(),
geom_quantile() or geom_density_2d(). If you have few unique x values,
geom_boxplot() may also be useful.
Alternatively, you can summarise the number of points at each location and
display that in some way, using geom_count(), geom_hex(), or
geom_density2d().
Another technique is to make the points transparent (e.g.
geom_point(alpha = 0.05)) or very small (e.g. geom_point(shape = ".")).
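Several of these techniques can be tried side by side on the same data. The following is a minimal sketch, assuming ggplot2 is installed; the choice of the mpg variables cty and hwy and the alpha value are arbitrary and only for illustration.

```r
library(ggplot2)

base <- ggplot(mpg, aes(cty, hwy))

# Transparency: overlapping points build up into darker areas
transparent <- base + geom_point(alpha = 0.2)

# Very small points
tiny <- base + geom_point(shape = ".")

# Summarise the number of points at each location as point size
counted <- base + geom_count()

# Add a smoothed trend on top of the (transparent) points
trend <- base + geom_point(alpha = 0.2) + geom_smooth()

# geom_count() draws one point per distinct location, not one per row
n_locations <- nrow(layer_data(counted))
print(c(rows = nrow(mpg), locations = n_locations))
```

Which variant reads best depends on the data: counting suits discrete values such as these, while transparency and binning tend to suit larger continuous data sets.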
geom_point() understands the following aesthetics (required aesthetics are in bold):
The fill aesthetic only applies to shapes 21-25.
Learn more about setting these aesthetics in vignette("ggplot2-specs").
p <- ggplot(mtcars, aes(wt, mpg))
p + geom_point()

# Add aesthetic mappings
p + geom_point(aes(colour = factor(cyl)))
p + geom_point(aes(shape = factor(cyl)))
# A "bubblechart":
p + geom_point(aes(size = qsec))

# Set aesthetics to fixed value
ggplot(mtcars, aes(wt, mpg)) + geom_point(colour = "red", size = 3)

# Varying alpha is useful for large datasets
d <- ggplot(diamonds, aes(carat, price))
d + geom_point(alpha = 1/10)
d + geom_point(alpha = 1/20)
d + geom_point(alpha = 1/100)

# For shapes that have a border (like 21), you can colour the inside and
# outside separately. Use the stroke aesthetic to modify the width of the
# border
ggplot(mtcars, aes(wt, mpg)) +
  geom_point(shape = 21, colour = "black", fill = "white", size = 5, stroke = 5)

# You can create interesting shapes by layering multiple points of
# different sizes
p <- ggplot(mtcars, aes(mpg, wt, shape = factor(cyl)))
p +
  geom_point(aes(colour = factor(cyl)), size = 4) +
  geom_point(colour = "grey90", size = 1.5)
p +
  geom_point(colour = "black", size = 4.5) +
  geom_point(colour = "pink", size = 4) +
  geom_point(aes(shape = factor(cyl)))

# geom_point warns when missing values have been dropped from the data set
# and not plotted, you can turn this off by setting na.rm = TRUE
set.seed(1)
mtcars2 <- transform(mtcars, mpg = ifelse(runif(32) < 0.2, NA, mpg))
ggplot(mtcars2, aes(wt, mpg)) +
  geom_point()
ggplot(mtcars2, aes(wt, mpg)) +
  geom_point(na.rm = TRUE)
Polygons are very similar to paths (as drawn by geom_path())
except that the start and end points are connected and the inside is
coloured by fill. The group aesthetic determines which cases
are connected together into a polygon. From R 3.6 and onwards it is possible
to draw polygons with holes by providing a subgroup aesthetic that
differentiates the outer ring points from those describing holes in the
polygon.
geom_polygon(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  rule = "evenodd",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
rule |
Either |
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom_polygon() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
geom_path()
for an unfilled polygon,
geom_ribbon()
for a polygon anchored on the x-axis
# When using geom_polygon, you will typically need two data frames:
# one contains the coordinates of each polygon (positions), and the
# other the values associated with each polygon (values). An id
# variable links the two together

ids <- factor(c("1.1", "2.1", "1.2", "2.2", "1.3", "2.3"))

values <- data.frame(
  id = ids,
  value = c(3, 3.1, 3.1, 3.2, 3.15, 3.5)
)

positions <- data.frame(
  id = rep(ids, each = 4),
  x = c(2, 1, 1.1, 2.2, 1, 0, 0.3, 1.1, 2.2, 1.1, 1.2, 2.5, 1.1, 0.3,
        0.5, 1.2, 2.5, 1.2, 1.3, 2.7, 1.2, 0.5, 0.6, 1.3),
  y = c(-0.5, 0, 1, 0.5, 0, 0.5, 1.5, 1, 0.5, 1, 2.1, 1.7, 1, 1.5,
        2.2, 2.1, 1.7, 2.1, 3.2, 2.8, 2.1, 2.2, 3.3, 3.2)
)

# Currently we need to manually merge the two together
datapoly <- merge(values, positions, by = c("id"))

p <- ggplot(datapoly, aes(x = x, y = y)) +
  geom_polygon(aes(fill = value, group = id))
p

# Which seems like a lot of work, but then it's easy to add on
# other features in this coordinate system, e.g.:

set.seed(1)
stream <- data.frame(
  x = cumsum(runif(50, max = 0.1)),
  y = cumsum(runif(50, max = 0.1))
)

p + geom_line(data = stream, colour = "grey30", linewidth = 5)

# And if the positions are in longitude and latitude, you can use
# coord_map to produce different map projections.

if (packageVersion("grid") >= "3.6") {
  # As of R version 3.6 geom_polygon() supports polygons with holes
  # Use the subgroup aesthetic to differentiate holes from the main polygon

  holes <- do.call(rbind, lapply(split(datapoly, datapoly$id), function(df) {
    df$x <- df$x + 0.5 * (mean(df$x) - df$x)
    df$y <- df$y + 0.5 * (mean(df$y) - df$y)
    df
  }))
  datapoly$subid <- 1L
  holes$subid <- 2L
  datapoly <- rbind(datapoly, holes)

  p <- ggplot(datapoly, aes(x = x, y = y)) +
    geom_polygon(aes(fill = value, group = id, subgroup = subid))
  p
}
geom_qq() and stat_qq() produce quantile-quantile plots. geom_qq_line() and stat_qq_line() compute the slope and intercept of the line connecting the points at specified quartiles of the theoretical and sample distributions.
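The slope and intercept described above can be reproduced by hand in base R. This is a sketch of the arithmetic only, assuming the defaults shown in the usage (distribution = stats::qnorm, line.p = c(0.25, 0.75)); it is not a copy of ggplot2's internal code, and the variable names are illustrative.

```r
set.seed(42)
y <- rt(200, df = 5)       # a sample, similar to the one used in the examples
line_p <- c(0.25, 0.75)    # the default line.p

# Quantiles of the sample and of the theoretical distribution at line_p.
sample_q      <- quantile(y, line_p, names = FALSE)
theoretical_q <- qnorm(line_p)

# Line through the two (theoretical, sample) quantile points.
slope     <- diff(sample_q) / diff(theoretical_q)
intercept <- sample_q[1] - slope * theoretical_q[1]
c(intercept = intercept, slope = slope)
```

By construction the line passes through both quantile points, so it summarises the central half of the sample and is less sensitive to heavy tails than a least-squares fit would be.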
geom_qq_line(
  mapping = NULL,
  data = NULL,
  geom = "path",
  position = "identity",
  ...,
  distribution = stats::qnorm,
  dparams = list(),
  line.p = c(0.25, 0.75),
  fullrange = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_qq_line(
  mapping = NULL,
  data = NULL,
  geom = "path",
  position = "identity",
  ...,
  distribution = stats::qnorm,
  dparams = list(),
  line.p = c(0.25, 0.75),
  fullrange = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_qq(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  ...,
  distribution = stats::qnorm,
  dparams = list(),
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_qq(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  ...,
  distribution = stats::qnorm,
  dparams = list(),
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
distribution |
Distribution function to use, if x not specified |
dparams |
Additional parameters passed on to |
line.p |
Vector of quantiles to use when fitting the Q-Q line, defaults to c(0.25, 0.75). |
fullrange |
Should the q-q line span the full range of the plot, or just the data |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
stat_qq() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
stat_qq_line() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
Variables computed by stat_qq():
after_stat(sample)
Sample quantiles.
after_stat(theoretical)
Theoretical quantiles.
Variables computed by stat_qq_line():
after_stat(x)
x-coordinates of the endpoints of the line segment connecting the points at the chosen quantiles of the theoretical and the sample distributions.
after_stat(y)
y-coordinates of the endpoints.
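One way to look at these computed variables is to build the plot and inspect the layer's data with layer_data(). The sketch below assumes the computed columns appear in that data frame under the names listed above; df_qq, p_qq and computed are hypothetical names.

```r
library(ggplot2)

set.seed(1)
df_qq <- data.frame(y = rnorm(50))

p_qq <- ggplot(df_qq, aes(sample = y)) +
  stat_qq()

# Data of the first layer after the stat has been computed.
computed <- layer_data(p_qq, 1)
head(computed[, c("theoretical", "sample")])
```

The same names can be used inside after_stat() in a mapping, for example aes(colour = after_stat(theoretical)).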
df <- data.frame(y = rt(200, df = 5))
p <- ggplot(df, aes(sample = y))
p + stat_qq() + stat_qq_line()

# Use fitdistr from MASS to estimate distribution params
params <- as.list(MASS::fitdistr(df$y, "t")$estimate)
ggplot(df, aes(sample = y)) +
  stat_qq(distribution = qt, dparams = params["df"]) +
  stat_qq_line(distribution = qt, dparams = params["df"])

# Using a Q-Q plot to explore the distribution of a variable
ggplot(mtcars, aes(sample = mpg)) +
  stat_qq() +
  stat_qq_line()
ggplot(mtcars, aes(sample = mpg, colour = factor(cyl))) +
  stat_qq() +
  stat_qq_line()
This fits a quantile regression to the data and draws the fitted quantiles with lines. This is a continuous analogue to geom_boxplot().
geom_quantile(
  mapping = NULL,
  data = NULL,
  stat = "quantile",
  position = "identity",
  ...,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_quantile(
  mapping = NULL,
  data = NULL,
  geom = "quantile",
  position = "identity",
  ...,
  quantiles = c(0.25, 0.5, 0.75),
  formula = NULL,
  method = "rq",
  method.args = list(),
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom , stat
|
Use to override the default connection between
|
quantiles |
conditional quantiles of y to calculate and display |
formula |
formula relating y variables to x variables |
method |
Quantile regression method to use. Available options are |
method.args |
List of additional arguments passed on to the modelling
function defined by |
geom_quantile() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
after_stat(quantile)
Quantile of distribution.
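A possible use of the computed quantile variable is to colour each fitted line by the quantile it represents. This is a sketch that assumes the quantreg package is installed (the default method "rq" comes from it); p_q is a hypothetical name, and the formula is given explicitly only to make the model visible.

```r
library(ggplot2)

if (requireNamespace("quantreg", quietly = TRUE)) {
  p_q <- ggplot(mpg, aes(displ, 1 / hwy)) +
    geom_point() +
    geom_quantile(
      # Delayed evaluation: `quantile` is only available after the stat runs.
      aes(colour = after_stat(factor(quantile))),
      quantiles = c(0.1, 0.5, 0.9),
      formula = y ~ x
    )
  print(p_q)
}
```

Wrapping the variable in factor() gives a discrete legend with one entry per quantile rather than a continuous colour bar.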
m <-
  ggplot(mpg, aes(displ, 1 / hwy)) +
  geom_point()
m + geom_quantile()
m + geom_quantile(quantiles = 0.5)
q10 <- seq(0.05, 0.95, by = 0.05)
m + geom_quantile(quantiles = q10)

# You can also use rqss to fit smooth quantiles
m + geom_quantile(method = "rqss")
# Note that rqss doesn't pick a smoothing constant automatically, so
# you'll need to tweak lambda yourself
m + geom_quantile(method = "rqss", lambda = 0.1)

# Set aesthetics to fixed value
m + geom_quantile(colour = "red", linewidth = 2, alpha = 0.5)
geom_rect() and geom_tile() do the same thing, but are parameterised differently: geom_tile() uses the center of the tile and its size (x, y, width, height), while geom_rect() can use those or the locations of the corners (xmin, xmax, ymin and ymax). geom_raster() is a high performance special case for when all the tiles are the same size, and no pattern fills are applied.
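The two parameterisations are related by simple arithmetic: each corner is the centre plus or minus half the size. The sketch below converts a small, hypothetical set of tiles into the equivalent corner coordinates; tiles, rects, p_tile and p_rect are illustrative names.

```r
library(ggplot2)

# Hypothetical tiles given by centre and size.
tiles <- data.frame(
  x = c(2, 5), y = c(1, 2),
  width = c(4, 2), height = c(1, 1)
)

# The same rectangles expressed by their corners.
rects <- data.frame(
  xmin = tiles$x - tiles$width / 2,
  xmax = tiles$x + tiles$width / 2,
  ymin = tiles$y - tiles$height / 2,
  ymax = tiles$y + tiles$height / 2
)

# Both plots should show identical rectangles.
p_tile <- ggplot(tiles, aes(x, y, width = width, height = height)) +
  geom_tile(fill = "grey80", colour = "black")
p_rect <- ggplot(rects, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)) +
  geom_rect(fill = "grey80", colour = "black")
```

The corner form tends to be more convenient when the edges come directly from the data, such as start and end dates, while the centre form suits regular grids.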
geom_raster(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  hjust = 0.5,
  vjust = 0.5,
  interpolate = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_rect(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  linejoin = "mitre",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_tile(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  linejoin = "mitre",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
hjust , vjust
|
horizontal and vertical justification of the grob. Each justification value should be a number between 0 and 1. Defaults to 0.5 for both, centering each pixel over its data location. |
interpolate |
If |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
linejoin |
Line join style (round, mitre, bevel). |
Please note that the width and height aesthetics are not true position aesthetics and therefore are not subject to scale transformation. It is only after transformation that these aesthetics are applied.
geom_rect() understands the following aesthetics (required aesthetics are in bold):
geom_tile() understands only the x/width and y/height combinations.
Note that geom_raster() ignores colour.
Learn more about setting these aesthetics in vignette("ggplot2-specs").
# The most common use for rectangles is to draw a surface. You always want
# to use geom_raster here because it's so much faster, and produces
# smaller output when saving to PDF
ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_raster(aes(fill = density))

# Interpolation smooths the surface & is most helpful when rendering images.
ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_raster(aes(fill = density), interpolate = TRUE)

# If you want to draw arbitrary rectangles, use geom_tile() or geom_rect()
df <- data.frame(
  x = rep(c(2, 5, 7, 9, 12), 2),
  y = rep(c(1, 2), each = 5),
  z = factor(rep(1:5, each = 2)),
  w = rep(diff(c(0, 4, 6, 8, 10, 14)), 2)
)
ggplot(df, aes(x, y)) +
  geom_tile(aes(fill = z), colour = "grey50")
ggplot(df, aes(x, y, width = w)) +
  geom_tile(aes(fill = z), colour = "grey50")
ggplot(df, aes(xmin = x - w / 2, xmax = x + w / 2, ymin = y, ymax = y + 1)) +
  geom_rect(aes(fill = z), colour = "grey50")

# Justification controls where the cells are anchored
df <- expand.grid(x = 0:5, y = 0:5)
set.seed(1)
df$z <- runif(nrow(df))
# default is compatible with geom_tile()
ggplot(df, aes(x, y, fill = z)) +
  geom_raster()
# zero padding
ggplot(df, aes(x, y, fill = z)) +
  geom_raster(hjust = 0, vjust = 0)

# Inspired by the image-density plots of Ken Knoblauch
cars <- ggplot(mtcars, aes(mpg, factor(cyl)))
cars + geom_point()
cars + stat_bin_2d(aes(fill = after_stat(count)), binwidth = c(3, 1))
cars + stat_bin_2d(aes(fill = after_stat(density)), binwidth = c(3, 1))

cars +
  stat_density(
    aes(fill = after_stat(density)),
    geom = "raster",
    position = "identity"
  )
cars +
  stat_density(
    aes(fill = after_stat(count)),
    geom = "raster",
    position = "identity"
  )
For each x value, geom_ribbon() displays a y interval defined by ymin and ymax. geom_area() is a special case of geom_ribbon(), where the ymin is fixed to 0 and y is used instead of ymax.
geom_ribbon(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE,
  outline.type = "both"
)

geom_area(
  mapping = NULL,
  data = NULL,
  stat = "align",
  position = "stack",
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE,
  ...,
  outline.type = "upper"
)

stat_align(
  mapping = NULL,
  data = NULL,
  geom = "area",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
outline.type |
Type of the outline of the area; |
geom |
The geometric object to use to display the data for this layer.
When using a
|
An area plot is the continuous analogue of a stacked bar chart (see geom_bar()), and can be used to show how composition of the whole varies over the range of x. Choosing the order in which different components are stacked is very important, as it becomes increasingly hard to see the individual pattern as you move up the stack. See position_stack() for the details of the stacking algorithm. To facilitate stacking, the default stat = "align" interpolates groups to a common set of x-coordinates. To turn off this interpolation, stat = "identity" can be used instead.
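The idea of interpolating groups to a common set of x-coordinates can be sketched in base R with stats::approx(). This illustrates the concept only and is not ggplot2's implementation, which may treat the ends of each group differently; df_area, x_common and aligned are hypothetical names.

```r
# Two groups observed at different x positions.
df_area <- data.frame(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(1, 3, 5, 2, 4, 6),
  y = c(2, 5, 1, 3, 6, 7)
)

# Common grid: the union of all observed x values.
x_common <- sort(unique(df_area$x))

# Linearly interpolate each group onto the common grid.
# approx() returns NA outside a group's observed range.
aligned <- lapply(split(df_area, df_area$g), function(d) {
  data.frame(
    g = d$g[1],
    x = x_common,
    y = approx(d$x, d$y, xout = x_common)$y
  )
})
aligned <- do.call(rbind, aligned)
aligned
```

Once every group has a value at every x, stacking reduces to adding the groups' y values position by position, which is why unaligned data cannot be stacked directly.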
This geom treats each axis differently and can thus have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation parameter, which can be either "x" or "y". The value gives the axis that the geom should run along, "x" being the default orientation you would expect for the geom.
geom_ribbon() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
geom_bar() for discrete intervals (bars), geom_linerange() for discrete intervals (lines), geom_polygon() for general polygons
# Generate data
huron <- data.frame(year = 1875:1972, level = as.vector(LakeHuron))
h <- ggplot(huron, aes(year))

h + geom_ribbon(aes(ymin = 0, ymax = level))
h + geom_area(aes(y = level))

# Orientation cannot be deduced by mapping, so must be given explicitly for
# flipped orientation
h + geom_area(aes(x = level, y = year), orientation = "y")

# Add aesthetic mappings
h +
  geom_ribbon(aes(ymin = level - 1, ymax = level + 1), fill = "grey70") +
  geom_line(aes(y = level))

# The underlying stat_align() takes care of unaligned data points
df <- data.frame(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(1, 3, 5, 2, 4, 6),
  y = c(2, 5, 1, 3, 6, 7)
)
a <- ggplot(df, aes(x, y, fill = g)) +
  geom_area()

# Two groups have points on different X values.
a + geom_point(size = 8) + facet_grid(g ~ .)

# stat_align() interpolates and aligns the value so that the areas can stack
# properly.
a + geom_point(stat = "align", position = "stack", size = 8)

# To turn off the alignment, the stat can be set to "identity"
ggplot(df, aes(x, y, fill = g)) +
  geom_area(stat = "identity")
A rug plot is a compact visualisation designed to supplement a 2d display with the two 1d marginal distributions. Rug plots display individual cases so are best used with smaller datasets.
geom_rug(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  outside = FALSE,
  sides = "bl",
  length = unit(0.03, "npc"),
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
outside |
logical that controls whether to move the rug tassels outside of the plot area. Default is off (FALSE). You will also need to use coord_cartesian(clip = "off"). |
sides |
A string that controls which sides of the plot the rugs appear on.
It can be set to a string containing any of |
length |
A |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
By default, the rug lines are drawn with a length that corresponds to 3% of the total plot size. Since the default scale expansion for continuous variables is 5% at both ends of the scale, the rug will not overlap with any data points under the default settings.
geom_rug() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()
p
p + geom_rug()
p + geom_rug(sides = "b")    # Rug on bottom only
p + geom_rug(sides = "trbl") # All four sides

# Use jittering to avoid overplotting for smaller datasets
ggplot(mpg, aes(displ, cty)) +
  geom_point() +
  geom_rug()

ggplot(mpg, aes(displ, cty)) +
  geom_jitter() +
  geom_rug(alpha = 1/2, position = "jitter")

# move the rug tassels to outside the plot
# remember to set clip = "off".
p +
  geom_rug(outside = TRUE) +
  coord_cartesian(clip = "off")

# set sides to top right, and then move the margins
p +
  geom_rug(outside = TRUE, sides = "tr") +
  coord_cartesian(clip = "off") +
  theme(plot.margin = margin(1, 1, 1, 1, "cm"))

# increase the line length and
# expand axis to avoid overplotting
p +
  geom_rug(length = unit(0.05, "npc")) +
  scale_y_continuous(expand = c(0.1, 0.1))
geom_segment() draws a straight line between points (x, y) and (xend, yend). geom_curve() draws a curved line. See the underlying drawing function grid::curveGrob() for the parameters that control the curve.
geom_segment(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  arrow = NULL,
  arrow.fill = NULL,
  lineend = "butt",
  linejoin = "round",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_curve(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  curvature = 0.5,
  angle = 90,
  ncp = 5,
  arrow = NULL,
  arrow.fill = NULL,
  lineend = "butt",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
arrow |
specification for arrow heads, as created by |
arrow.fill |
fill colour to use for the arrow head (if closed). |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
curvature |
A numeric value giving the amount of curvature. Negative values produce left-hand curves, positive values produce right-hand curves, and zero produces a straight line. |
angle |
A numeric value between 0 and 180, giving an amount to skew the control points of the curve. Values less than 90 skew the curve towards the start point and values greater than 90 skew the curve towards the end point. |
ncp |
The number of control points used to draw the curve. More control points create a smoother curve. |
Both geoms draw a single segment/curve per case. See geom_path() if you need to connect points across multiple cases.
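To make the "one segment per case" point concrete, the sketch below turns an ordered set of points into explicit segments, where each row carries its own start and end, and contrasts that with geom_path(), which connects consecutive rows by itself. The names pts, segs, p_seg and p_path are hypothetical.

```r
library(ggplot2)

# Hypothetical ordered points.
pts <- data.frame(x = c(1, 2, 4, 5), y = c(1, 3, 2, 4))
n <- nrow(pts)

# One row per segment: start at point i, end at point i + 1.
segs <- data.frame(
  x    = pts$x[-n],
  y    = pts$y[-n],
  xend = pts$x[-1],
  yend = pts$y[-1]
)

# geom_segment() needs the reshaped data; geom_path() uses the points as-is.
p_seg <- ggplot(segs) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend))
p_path <- ggplot(pts, aes(x, y)) +
  geom_path()
```

Four points give three segments. The explicit form is mainly worthwhile when segments need their own aesthetics or arrows; for a simple connected line, geom_path() avoids the reshaping.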
geom_segment() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
geom_path() and geom_line() for multi-segment lines and paths.
geom_spoke() for a segment parameterised by a location (x, y), and an angle and radius.
b <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()

df <- data.frame(x1 = 2.62, x2 = 3.57, y1 = 21.0, y2 = 15.0)
b +
  geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2, colour = "curve"), data = df) +
  geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2, colour = "segment"), data = df)

b + geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2), data = df, curvature = -0.2)
b + geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2), data = df, curvature = 1)
b + geom_curve(
  aes(x = x1, y = y1, xend = x2, yend = y2),
  data = df,
  arrow = arrow(length = unit(0.03, "npc"))
)

if (requireNamespace('maps', quietly = TRUE)) {
  ggplot(seals, aes(long, lat)) +
    geom_segment(aes(xend = long + delta_long, yend = lat + delta_lat),
      arrow = arrow(length = unit(0.1, "cm"))) +
    borders("state")
}

# Use lineend and linejoin to change the style of the segments
df2 <- expand.grid(
  lineend = c('round', 'butt', 'square'),
  linejoin = c('round', 'mitre', 'bevel'),
  stringsAsFactors = FALSE
)
df2 <- data.frame(df2, y = 1:9)
# linewidth (rather than size) sets the thickness of lines
ggplot(df2, aes(x = 1, y = y, xend = 2, yend = y, label = paste(lineend, linejoin))) +
  geom_segment(
    lineend = df2$lineend, linejoin = df2$linejoin,
    linewidth = 3, arrow = arrow(length = unit(0.3, "inches"))
  ) +
  geom_text(hjust = 'outside', nudge_x = -0.2) +
  xlim(0.5, 2)

# You can also use geom_segment to recreate plot(type = "h") :
set.seed(1)
counts <- as.data.frame(table(x = rpois(100, 5)))
counts$x <- as.numeric(as.character(counts$x))
with(counts, plot(x, Freq, type = "h", lwd = 10))

ggplot(counts, aes(x, Freq)) +
  geom_segment(aes(xend = x, yend = 0), linewidth = 10, lineend = "butt")
Aids the eye in seeing patterns in the presence of overplotting. geom_smooth() and stat_smooth() are effectively aliases: they both use the same arguments. Use stat_smooth() if you want to display the results with a non-standard geom.
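As a rough illustration of displaying the smoother with a non-standard geom, the sketch below swaps the default ribbon-and-line for a bare line and for points. The specific choices here (`method = "lm"`, `formula = y ~ x`, `n = 20`, the red colour, and the object names `base`, `p_line` and `p_points`) are assumptions made for the example, not recommended settings.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(alpha = 0.3)

# Draw only the fitted line (no confidence ribbon) by swapping the geom.
p_line <- base +
  stat_smooth(geom = "line", method = "lm", formula = y ~ x)

# Draw the fitted values as points, evaluated at n = 20 positions.
p_points <- base +
  stat_smooth(geom = "point", method = "lm", formula = y ~ x, n = 20,
              colour = "red")

p_line
p_points
```

The statistical computation is the same in both plots; only the geom used to draw the computed values changes.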
geom_smooth(
  mapping = NULL,
  data = NULL,
  stat = "smooth",
  position = "identity",
  ...,
  method = NULL,
  formula = NULL,
  se = TRUE,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_smooth(
  mapping = NULL,
  data = NULL,
  geom = "smooth",
  position = "identity",
  ...,
  method = NULL,
  formula = NULL,
  se = TRUE,
  n = 80,
  span = 0.75,
  fullrange = FALSE,
  xseq = NULL,
  level = 0.95,
  method.args = list(),
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improve the display. |
... |
Other arguments passed on to
|
method |
Smoothing method (function) to use, accepts either
For If you have fewer than 1,000 observations but want to use the same |
formula |
Formula to use in smoothing function, eg. |
se |
Display confidence band around smooth? (TRUE by default.) |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If |
geom, stat |
Use to override the default connection between geom_smooth() and stat_smooth(). |
n |
Number of points at which to evaluate smoother. |
span |
Controls the amount of smoothing for the default loess smoother. Smaller numbers produce wigglier lines, larger numbers produce smoother lines. Only used with loess. |
fullrange |
If |
xseq |
A numeric vector of values at which the smoother is evaluated. |
level |
Level of confidence band to use (0.95 by default). |
method.args |
List of additional arguments passed on to the modelling function defined by method. |
Calculation is performed by the (currently undocumented) predictdf() generic and its methods. For most methods the standard error bounds are computed using the predict() method; the exceptions are loess(), which uses a t-based approximation, and glm(), where the normal confidence band is constructed on the link scale and then back-transformed to the response scale.
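The glm() case can be sketched in base R. The code below follows the description above (a normal band on the link scale, back-transformed with the inverse link); it is not a copy of the ggplot2 internals, and the model `am ~ wt` on `mtcars`, the 80-point grid and the variable names are assumptions chosen for illustration.

```r
# Fit a logistic regression of am (0/1) on wt; mtcars ships with R.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Evaluate on a grid, asking for predictions and standard errors
# on the link (log-odds) scale.
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 80))
pred <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)

# Normal-theory band on the link scale for a 95% level.
level <- 0.95
z <- qnorm(level / 2 + 0.5)
lower_link <- pred$fit - z * pred$se.fit
upper_link <- pred$fit + z * pred$se.fit

# Back-transform to the response scale with the inverse link.
inv <- family(fit)$linkinv
band <- data.frame(
  wt   = grid$wt,
  y    = inv(pred$fit),
  ymin = inv(lower_link),
  ymax = inv(upper_link)
)
head(band)
```

Because the band is built before back-transforming, it always stays inside the valid range of the response (here, between 0 and 1) and is generally not symmetric around the fitted value.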
This geom treats each axis differently and can therefore have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation parameter, which can be either "x" or "y". The value gives the axis that the geom should run along, "x" being the default orientation you would expect for the geom.
geom_smooth() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation. stat_smooth() provides the following variables, some of which depend on the orientation:
after_stat(y) or after_stat(x): Predicted value.
after_stat(ymin) or after_stat(xmin): Lower pointwise confidence band around the mean.
after_stat(ymax) or after_stat(xmax): Upper pointwise confidence band around the mean.
after_stat(se): Standard error.
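One way to look at these computed variables is to pull the layer's data out of a built plot with layer_data(). This is a minimal sketch; the linear model, the explicit `formula = y ~ x` and the object names `p` and `computed` are assumptions for the example.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_smooth(method = "lm", formula = y ~ x)

# layer_data() returns the data after the stat has run;
# layer 1 is the smooth, evaluated at the default n = 80 positions.
computed <- layer_data(p, 1)
head(computed[, c("x", "y", "ymin", "ymax", "se")])
```

With the default "x" orientation the predictions are in `y` and the band in `ymin` and `ymax`, matching the list above.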
See individual modelling functions for more details: lm() for linear smooths, glm() for generalised linear smooths, and loess() for local smooths.
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth()

# If you need the fitting to be done along the y-axis set the orientation
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(orientation = "y")

# Use span to control the "wiggliness" of the default loess smoother.
# The span is the fraction of points used to fit each local regression:
# small numbers make a wigglier curve, larger numbers make a smoother curve.
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(span = 0.3)

# Instead of a loess smooth, you can use any other modelling function:
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = lm, se = FALSE)

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ splines::bs(x, 3), se = FALSE)

# Smooths are automatically fit to each group (defined by categorical
# aesthetics or the group aesthetic) and for each facet.
ggplot(mpg, aes(displ, hwy, colour = class)) +
  geom_point() +
  geom_smooth(se = FALSE, method = lm)

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(span = 0.8) +
  facet_wrap(~drv)

binomial_smooth <- function(...) {
  geom_smooth(method = "glm", method.args = list(family = "binomial"), ...)
}

# To fit a logistic regression, you need to coerce the values to
# a numeric vector lying between 0 and 1.
ggplot(rpart::kyphosis, aes(Age, Kyphosis)) +
  geom_jitter(height = 0.05) +
  binomial_smooth()

ggplot(rpart::kyphosis, aes(Age, as.numeric(Kyphosis) - 1)) +
  geom_jitter(height = 0.05) +
  binomial_smooth()

ggplot(rpart::kyphosis, aes(Age, as.numeric(Kyphosis) - 1)) +
  geom_jitter(height = 0.05) +
  binomial_smooth(formula = y ~ splines::ns(x, 2))

# But in this case, it's probably better to fit the model yourself
# so you can exercise more control and see whether or not it's a good model.
This is a polar parameterisation of geom_segment(). It is useful when you have variables that describe direction and distance. The angles start from east and increase counterclockwise.
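The relationship to geom_segment() can be sketched by converting an angle and a radius to segment end points by hand. The data frame below is hypothetical, and the conversion assumes angles in radians measured counterclockwise from east, as described above.

```r
library(ggplot2)

# Hypothetical data: two spokes starting at the origin,
# one pointing east (angle 0) and one pointing north (angle pi / 2).
df <- data.frame(x = 0, y = 0, angle = c(0, pi / 2), radius = c(1, 2))

# Convert the polar description (angle, radius) to segment end points.
df$xend <- df$x + cos(df$angle) * df$radius
df$yend <- df$y + sin(df$angle) * df$radius

# The thin red segments should lie on top of the thick grey spokes.
ggplot(df, aes(x, y)) +
  geom_spoke(aes(angle = angle, radius = radius),
             colour = "grey60", linewidth = 3) +
  geom_segment(aes(xend = xend, yend = yend), colour = "red")
```

If the source data store direction in degrees or as compass bearings, they would need converting to this convention before being mapped to `angle`.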
geom_spoke(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer. |
position |
A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improve the display. |
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If |
geom_spoke() understands the following aesthetics (required aesthetics are in bold):
Learn more about setting these aesthetics in vignette("ggplot2-specs").
df <- expand.grid(x = 1:10, y = 1:10)

set.seed(1)
df$angle <- runif(100, 0, 2*pi)
df$speed <- runif(100, 0, sqrt(0.1 * df$x))

ggplot(df, aes(x, y)) +
  geom_point() +
  geom_spoke(aes(angle = angle), radius = 0.5)

ggplot(df, aes(x, y)) +
  geom_point() +
  geom_spoke(aes(angle = angle, radius = speed))