K is for Keep or Drop Variables
This article was originally published at http://www.deeplytrivial.com/
A few times in this series, I've wanted to display part of a dataset, such as key variables like Title, Rating, and Pages. The tidyverse lets you easily keep or drop variables, either temporarily or permanently, with the select function. For instance, we can use select along with other tidyverse functions to create a quick descriptive table of my dataset. Let's filter down to books that are fantasy and/or sci-fi and that took me the longest to read, then select a few descriptives to display.
library(tidyverse)

reads2019 <- read_csv("~/Downloads/Blogging A to Z/SaraReads2019_allrated.csv", col_names = TRUE)

reads2019 %>%
  group_by(Fantasy, SciFi) %>%
  filter(read_time == max(read_time) & (Fantasy == 1 | SciFi == 1)) %>%
  select(Title, Author, Pages, read_time)
## # A tibble: 4 x 6
## # Groups:   Fantasy, SciFi
##   Fantasy SciFi Title                              Author        Pages read_time
##     <dbl> <dbl> <chr>                              <chr>         <dbl>     <dbl>
## 1       1     1 1Q84                               Murakami, Ha…   925         7
## 2       0     1 The End of All Things (Old Man's … Scalzi, John    380        10
## 3       0     1 The Long Utopia (The Long Earth #… Pratchett, T…   373        10
## 4       1     0 Tik-Tok of Oz (Oz, #8)             Baum, L. Fra…   272        25
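The table above has six columns even though select named only four. That's because select keeps the grouping variables of a grouped tibble and adds them back at the front. Here's a minimal sketch of that behavior. The books tibble and its values are made up for illustration and are not part of my reading data.

```r
library(dplyr)

# Hypothetical toy data standing in for reads2019
books <- tibble(
  Title     = c("A", "B", "C", "D"),
  Fantasy   = c(1, 1, 0, 0),
  SciFi     = c(0, 0, 1, 1),
  Pages     = c(300, 450, 200, 380),
  read_time = c(5, 9, 3, 3)
)

# select() on a grouped tibble keeps the grouping columns,
# so Fantasy and SciFi come along even though they aren't named
grouped <- books %>%
  group_by(Fantasy, SciFi) %>%
  filter(read_time == max(read_time)) %>%
  select(Title, Pages, read_time)

names(grouped)

# Calling ungroup() before select() returns only the named columns
ungrouped <- books %>%
  group_by(Fantasy, SciFi) %>%
  filter(read_time == max(read_time)) %>%
  ungroup() %>%
  select(Title, Pages, read_time)

names(ungrouped)
```

If you only want the columns you asked for, ungroup before you select.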
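# Assigning the result back to reads2019 (reads2019 <- reads2019 %>% select(...))
# would make the change permanent. Saving it under a new name keeps the original intact.
# A minus sign in front of a variable name drops that variable.
small_reads2019 <- reads2019 %>%
  select(-AdditionalAuthors, -AverageRating, -OriginalPublicationYear)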
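To make the temporary versus permanent distinction concrete, here's a small sketch on made-up data. The books tibble is hypothetical. Only the column names AverageRating and OriginalPublicationYear echo my actual dataset.

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration
books <- tibble(
  Title                   = c("A", "B"),
  Pages                   = c(300, 450),
  AverageRating           = c(3.9, 4.2),
  OriginalPublicationYear = c(1999, 2015)
)

# Temporary: the result is printed but not saved, so books is unchanged
books %>% select(-AverageRating)
cols_before <- ncol(books)

# New object: drop two variables and keep the original intact
small_books <- books %>%
  select(-AverageRating, -OriginalPublicationYear)

names(small_books)

# Permanent: overwrite the original with only the variables to keep
books <- books %>% select(Title, Pages)
```

Saving to a new name is the safer choice when you might still need the dropped variables later in the same session.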
Tomorrow we'll talk about a variable transformation that makes plotting skewed variables much easier. Stay tuned!