# Specify additional aesthetics for points

This article is originally published at https://www.quantargo.com/blog

**ggplot2** implements the grammar of graphics to map attributes from a data set to plot features through *aesthetics*. This framework can be used to adjust the `size`, `color` and transparency (`alpha`) of points in a scatter plot.

- Add additional plotting dimensions through aesthetics
- Adjust the point size of a scatter plot using the `size` parameter
- Change the point color of a scatter plot using the `color` parameter
- Set a parameter `alpha` to change the transparency of all points
- Differentiate between aesthetic mappings and constant parameters

```r
ggplot(___) + 
  geom_point(
    mapping = aes(x = ___, y = ___, color = ___, size = ___),
    alpha = ___
  )
```

## Adding more plot aesthetics

In their most basic form, scatter plots can only visualize datasets in two dimensions through the `x` and `y` aesthetics of the `geom_point()` layer. However, most data sets have more than two variables and thus might require additional plotting dimensions. `ggplot()` makes it very easy to map additional variables to different plotting aesthetics like `size`, transparency `alpha` and `color`.

Let’s consider the `gapminder_2007` dataset which contains the variables GDP per capita `gdpPercap` and life expectancy `lifeExp` for 142 countries in the year 2007:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp))
```

Mapping the `continent` variable through the point `color` aesthetic and the population `pop` (in millions) through the point `size`, we obtain a much richer plot including 4 different variables from the data set:
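A plot like the one described above could be produced along the following lines. This is a minimal sketch, not the course's own code: it assumes the **gapminder** package is installed and that `gapminder_2007` can be approximated by filtering that package's data to the year 2007 and expressing `pop` in millions.

```r
library(ggplot2)
library(gapminder)

# Assumption: rebuild gapminder_2007 from the gapminder package,
# keeping only the year 2007 and converting population to millions
gapminder_2007 <- subset(gapminder, year == 2007)
gapminder_2007$pop <- gapminder_2007$pop / 1e6

# Map four variables: x, y, color (continent) and size (population)
p <- ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, color = continent, size = pop))

p
```

Each aesthetic inside `aes()` adds one variable to the plot, so the two extra mappings give ggplot2 what it needs to draw a legend for each of them.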

## Quiz: geom_point() Aesthetics

Which aesthetics can be specified for `geom_point()`?

- `geom_line`
- `color`
- `point`
- `alpha`
- `size`

## Adjusting point color

```r
ggplot(___) + 
  geom_point(
    mapping = aes(x = ___, y = ___, color = ___, size = ___),
    alpha = ___
  )
```

Typically, the point color is used to introduce a new dimension to a scatter plot. In ggplot2 we use the `color` aesthetic to specify the mapping of a variable to the color of the points.

For the `gapminder_2007` dataset we can plot the GDP per capita `gdpPercap` vs. the life expectancy `lifeExp` as follows:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp))
```

To color each point based on the `continent` of each country we can use:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, color = continent))
```

We see that in the resulting plot each point is colored based on the `continent` of its country. Because `continent` is a categorical variable, ggplot2 chooses a discrete color scheme with one distinct color per category.

By contrast, let’s see what the plot looks like if we color the points by the numeric variable population `pop`:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, color = pop))
```

The color scale now changes to a continuous one, as can be seen in the legend. The light-blue points are the countries with the largest populations (China and India).
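The choice between a discrete and a continuous color scale follows from the type of the mapped variable, which can be checked directly. The sketch below assumes the **gapminder** package as a stand-in for the course's `gapminder_2007` data (filtered to 2007, `pop` in millions); the column types in the course dataset may differ.

```r
library(gapminder)

# Assumption: approximate gapminder_2007 from the gapminder package
gapminder_2007 <- subset(gapminder, year == 2007)
gapminder_2007$pop <- gapminder_2007$pop / 1e6

# A factor (or character) variable leads to a discrete color scale
continent_is_categorical <- is.factor(gapminder_2007$continent)

# A numeric variable leads to a continuous color scale
pop_is_numeric <- is.numeric(gapminder_2007$pop)

# The two most populous countries, which appear lightest in the plot
top_two <- as.character(
  gapminder_2007$country[order(-gapminder_2007$pop)][1:2]
)
top_two
```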

## Exercise: Reconstruct Gapminder graph

Reconstruct the following graph which shows the relationship between GDP per capita and life expectancy for the year 2007:

- Use the `ggplot()` function and specify the `gapminder_2007` dataset as input
- Add a `geom_point` layer to the plot and create a scatter plot showing the GDP per capita `gdpPercap` on the x-axis and the life expectancy `lifeExp` on the y-axis
- Make the `color` aesthetic of the points unique for each `continent`

## Exercise: Create a colored scatter plot with DavisClean

The `DavisClean` dataset contains the height and weight measurements of 199 people.

- Use the `ggplot()` function and specify the `DavisClean` dataset as input
- Add a `geom_point()` layer to the plot and create a scatter plot showing the `weight` on the x-axis and the `height` on the y-axis
- Make the `color` aesthetic of the points unique by the `sex` of each individual

## Adjusting point size

```r
ggplot(___) + 
  geom_point(
    mapping = aes(x = ___, y = ___, color = ___, size = ___),
    alpha = ___
  )
```

For the `gapminder_2007` dataset we can plot the GDP per capita `gdpPercap` vs. the life expectancy `lifeExp` as follows:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp))
```

To adjust the point size based on the population (`pop`) of each country we can use:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, size = pop))
```

We see that the point sizes in the plot above do not clearly reflect the population differences between countries. If we compare the point size representing a population of 250 million people with the one representing 750 million, we can see that their sizes are not proportional. This is because the default size scale maps the smallest value to a non-zero minimum point size instead of mapping a value of zero to a size of zero. To make the point area proportional to the population we can use the `scale_size_area()` function instead. The scaling information can be added like any other ggplot object with the `+` operator:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, size = pop)) + 
  scale_size_area(max_size = 10)
```

Note that we have also set `max_size` to `10`, which results in bigger points overall.
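To make the idea of area-proportional sizing concrete, the arithmetic can be sketched in base R. This illustrates the principle only and is not ggplot2's internal code: it assumes that the size value behaves like a diameter, so it must grow with the square root of the data value for the area to grow linearly. The variable names are illustrative.

```r
# Two populations (in millions) taken from the comparison above
pop <- c(250, 750)

# Size assigned to the largest value, as in scale_size_area(max_size = 10)
max_size <- 10

# Area-proportional sizing: a value of 0 would get size 0, and the
# size grows with the square root of the value
point_size <- max_size * sqrt(pop / max(pop))

# Ratio of the two point areas (area grows with the square of the size)
area_ratio <- (point_size[2] / point_size[1])^2
area_ratio
```

Under this scheme a country with three times the population is drawn with three times the point area, which is the proportionality that `scale_size_area()` aims for.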

## Exercise: Create a Gapminder scatter plot using size

Create a scatter plot with **ggplot2** which shows the relationship between GDP per capita and life expectancy for the year 2007 using the `gapminder_2007` dataset.

- Use the `ggplot()` function and specify the `gapminder_2007` dataset as input
- Add a `geom_point()` layer to the plot and create a scatter plot showing the GDP per capita `gdpPercap` on the x-axis and the life expectancy `lifeExp` on the y-axis
- Use the `size` aesthetic to adjust the point size by the population `pop`
- Use the `scale_size_area()` function so that the point sizes reflect actual population differences and set the `max_size` of each point to `10`

## Setting global aesthetics: transparency

```r
ggplot(___) + 
  geom_point(
    mapping = aes(x = ___, y = ___, color = ___, size = ___),
    alpha = ___
  )
```

Plotting many points with similar x- and y-coordinates in one graph can produce dense point clouds. Many points in these clouds are overplotted, so the true number of observations in a given area is no longer visible. As a solution, we can set the transparency of each point using the ggplot2 parameter `alpha`.

Since we do **not** want to set the point transparency **individually** for each point but **globally** for all points, we do not set the `alpha` parameter as an aesthetic mapping (within `aes()`) but outside of it.

We set the **opacity** of each point to 50% by passing `alpha` **outside** of `aes()` as a constant parameter:

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, size = pop), alpha = 0.5)
```

With the opacity of each point set to `0.5`, we can now clearly see where points overlap each other.
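One way to see the difference between a mapped aesthetic and a constant parameter is to inspect the layer that `geom_point()` creates. The sketch below assumes the **gapminder** package as a stand-in for `gapminder_2007`. It also relies on the `mapping` and `aes_params` fields of ggplot2 layer objects, which are implementation details that may change between versions.

```r
library(ggplot2)
library(gapminder)

# Assumption: approximate gapminder_2007 from the gapminder package
gapminder_2007 <- subset(gapminder, year == 2007)
gapminder_2007$pop <- gapminder_2007$pop / 1e6

p <- ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, size = pop), alpha = 0.5)

# The first (and only) layer of the plot
point_layer <- p$layers[[1]]

# Aesthetics mapped to variables inside aes()
mapped <- names(point_layer$mapping)

# Constants set outside aes(), applied to every point
constants <- point_layer$aes_params

mapped
constants
```

Here `alpha` appears only among the constants, which is why it applies uniformly to all points and does not produce a legend entry.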

## Quiz: Gapminder Plot

```r
ggplot(gapminder_2007) + 
  geom_point(aes(x = gdpPercap, y = lifeExp, size = pop, alpha = 0.5, color = "red"))
```

Which statements about the plot above are correct?

- Constant plot parameters should be set outside of an aesthetic mapping `aes()`.
- The reason for the legend entries `alpha` and `color` is that they are set as aesthetic mappings instead of global parameters.
- The parameter `lifeExp` should be set as a global parameter.
- The parameter `gdpPercap` should be set as a global parameter.

## Exercise: Reproduce Gapminder scatter plot

Try to reproduce the following plot:

- Use the `ggplot()` function and specify the `gapminder_2007` dataset as input
- Add a `geom_point` layer to the plot and create a scatter plot showing the GDP per capita `gdpPercap` on the x-axis and the life expectancy `lifeExp` on the y-axis
- Use the `color` aesthetic to indicate each `continent` by a different color
- Use the `size` aesthetic to adjust the point size by the population `pop`
- Use `scale_size_area()` so that the point sizes reflect the actual population differences and set the `max_size` of each point to `15`
- Set the opacity/transparency of each point to 70% using the `alpha` parameter

Specify additional aesthetics for points is an excerpt from the course Introduction to R, which is available for free at quantargo.com
