This article is originally published at https://www.tidyverse.org/blog/
We’re happy to announce the release of purrr 1.0.0! purrr enhances R’s functional programming toolkit by providing a complete and consistent set of tools for working with functions and vectors. In the words of ChatGPT:
With purrr, you can easily “kitten” your functions together to perform complex operations, “paws” for a moment to debug and troubleshoot your code, while “feline” good about the elegant and readable code that you write. Whether you’re a “cat”-egorical beginner or a seasoned functional programming “purr”-fessional, purrr has something to offer. So why not “pounce” on the opportunity to try it out and see how it can “meow”-velously improve your R coding experience?
You can install it from CRAN with:

install.packages("purrr")
purrr is 7 years old and it’s finally made it to 1.0.0! This is a big release, adding some long-needed functionality (like progress bars!) as well as really refining the core purpose of purrr. In this post, we’ll start with an overview of the breaking changes, then briefly review some documentation changes. Then we’ll get to the good stuff: improvements to the map family, new keep_at() and discard_at() functions, and improvements to flattening and simplification. You can see a full list of changes in the release notes.
We’ve used the 1.0.0 release as an opportunity to really refine the core purpose of purrr: facilitating functional programming in R. We’ve been more aggressive with deprecations and breaking changes than usual, because a 1.0.0 release signals that purrr is now stable, making it our last opportunity for major changes.
These changes will break some existing code, but we’ve done our best to make it affect as little code as possible. Out of the ~1400 CRAN packages that use purrr, only ~40 were negatively affected, and I made pull requests to fix them all. Making these fixes helped give me confidence that, though we’re deprecating quite a few functions and changing a few special cases, it shouldn’t affect too much code in the wild.
There are four important changes that you should be aware of:
- pluck() behaves differently when extracting 0-length vectors.
- The map() family uses the tidyverse rules for coercion and recycling.
- All functions that modify lists handle NULL consistently.
- We’ve deprecated functions that aren’t related to the core purpose of purrr.
pluck() and zero-length vectors
Previously, pluck() replaced 0-length vectors with the value of .default. Now .default is only used for NULLs and absent elements:

x <- list(y = list(a = character(), b = NULL))
x |> pluck("y", "a", .default = NA)
#> character(0)
x |> pluck("y", "b", .default = NA)
#> [1] NA
x |> pluck("y", "c", .default = NA)
#> [1] NA
This also influences the map family because using an integer vector, character vector, or list instead of a function automatically calls pluck():

x <- list(list(1), list(), list(NULL), list(character()))
x |> map(1, .default = 0) |> str()
#> List of 4
#>  $ : num 1
#>  $ : num 0
#>  $ : num 0
#>  $ : chr(0)
We made this change because it makes purrr more consistent with the rest of the tidyverse and it looks like it was a bug in the original implementation of the function.
We’ve tweaked the map family of functions to be more consistent with general tidyverse coercion and recycling rules, as implemented by the vctrs package. map_lgl(), map_int(), and map_dbl() now follow the same coercion rules as vctrs. In particular:

- map_chr(TRUE, identity), map_chr(0L, identity), and map_chr(1.5, identity) have been deprecated because we believe that converting a logical/integer/double to a character vector is potentially dangerous and should require an explicit coercion.

# previously you could write
map_chr(1:4, \(x) x + 1)
#> Warning: Automatic coercion from double to character was deprecated in purrr 1.0.0.
#> ℹ Please use an explicit call to `as.character()` within `map_chr()` instead.
#> [1] "2.000000" "3.000000" "4.000000" "5.000000"

# now you need something like this:
map_chr(1:4, \(x) as.character(x + 1))
#> [1] "2" "3" "4" "5"
- map_int() requires that the numeric results be close to integers, rather than silently truncating to integers.
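As a minimal sketch of that difference, assuming purrr 1.0.0 or later is installed: the tryCatch() wrapper and the variable names `doubled` and `halved` are only there so the snippet runs to completion, and are not part of the original post.

```r
library(purrr)

# Doubles that are whole numbers are converted to integers without complaint
doubled <- map_int(1:3, \(x) x * 2)

# Fractional results are no longer truncated; purrr signals an error instead.
# tryCatch() is used here only so the script keeps running.
halved <- tryCatch(
  map_int(1:3, \(x) x / 2),
  error = function(e) "error"
)
```

If you really do want truncation, you would now need to ask for it explicitly inside the mapped function, for example with as.integer().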
map2(), modify2(), and pmap() use tidyverse recycling rules, which means that vectors of length 1 are recycled to any size but all other vectors must have the same length. This has two major consequences:
- Previously, the presence of a zero-length input generated a zero-length output. Now it’s recycled using the same rules as any other input.
- You must now explicitly recycle vectors that aren’t length 1.
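The recycling rules can be sketched as follows, assuming purrr 1.0.0 or later. The tryCatch() wrappers and the variable names are illustrative only; they stand in for the error messages you would see interactively.

```r
library(purrr)

# A length-1 vector is recycled to match the other input
sums <- map2_dbl(1:3, 10, `+`)

# Any other length mismatch is an error, including a zero-length input
zero_len <- tryCatch(
  map2(1:2, character(), paste),
  error = function(e) "error"
)
mismatch <- tryCatch(
  map2_int(1:4, c(10, 20), `+`),
  error = function(e) "error"
)

# Recycle explicitly when that is what you want
recycled <- map2_int(1:4, rep(c(10, 20), 2), `+`)
```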
purrr has a number of functions that modify a list, including modify_at() and list_modify(). Previously, these functions had inconsistent behaviour when you attempted to modify an element with NULL: some functions would delete that element, and some would set it to NULL. That inconsistency arose because base R handles NULL in different ways depending on whether you use $ (or [[) or [:

x1 <- x2 <- x3 <- list(a = 1, b = 2)

x1$a <- NULL
str(x1)
#> List of 1
#>  $ b: num 2

x2["a"] <- list(NULL)
str(x2)
#> List of 2
#>  $ a: NULL
#>  $ b: num 2
Now functions that edit a list will create an element containing NULL:

x3 |> list_modify(a = NULL) |> str()
#> List of 2
#>  $ a: NULL
#>  $ b: num 2

x3 |> modify_at("b", \(x) NULL) |> str()
#> List of 2
#>  $ a: num 1
#>  $ b: NULL
If you want to delete the element, you can use the special zap() sentinel:

x3 |> list_modify(a = zap()) |> str()
#> List of 1
#>  $ b: num 2
zap() does not work in modify*() because those functions are designed to always return the same top-level structure as the input.
Core purpose refinements
We have deprecated a number of functions to keep purrr focused on its core purpose: facilitating functional programming in R. Deprecation means that the functions will continue to work, but you’ll be warned once every 8 hours if you use them. In several years’ time, we’ll release an update which causes the warnings to occur every time you use them, and a few years after that they’ll be transformed to throwing errors.
- cross() and all its variants have been deprecated because they’re slow and buggy, and a better approach already exists in tidyr::expand_grid().
- update_list(), rerun(), and the use of tidyselect with map_at() and friends have been deprecated because we no longer believe that non-standard evaluation is a good fit for purrr.
- The lift_* family of functions has been superseded because they promote a style of function manipulation that is not commonly used in R.
- prepend(), rdunif(), rbernoulli(), when(), and list_along() have been deprecated because they’re not directly related to functional programming.
- splice() has been deprecated because we no longer believe that automatic splicing makes for good UI and there are other ways to achieve the same result.
Consult the documentation for the alternatives that we now recommend.
Deprecating these functions makes purrr easier to maintain because it reduces the surface area for bugs and issues, and it makes purrr easier to learn because there’s a clearer common thread that ties together all functions.
As you’ve seen in the code above, we are moving from magrittr’s pipe (%>%) to the base pipe (|>) and from formula syntax (~ .x + 1) to R’s new anonymous function shorthand (\(x) x + 1). We believe that it’s better to use these new base tools because they work everywhere: the base pipe doesn’t require that you load magrittr and the new function shorthand works everywhere, not just in purrr functions. Additionally, being able to specify the argument name for the anonymous function can often lead to clearer code.

# Previously we wrote
1:10 %>%
  map(~ rnorm(10, .x)) %>%
  map_dbl(mean)
#>  [1]  0.5586355  1.8213041  2.8764412  4.1521664  5.1160393  6.1271905
#>  [7]  6.9109806  8.2808301  9.2373940 10.6269104

# Now we recommend
1:10 |>
  map(\(mu) rnorm(10, mu)) |>
  map_dbl(mean)
#>  [1]  0.4638639  2.0966712  3.4441928  3.7806185  5.3373228  6.1854820
#>  [7]  6.5873300  8.3116138  9.4824697 10.4590034
We also recommend using an anonymous function instead of passing additional arguments to map. This avoids a certain class of moderately esoteric argument matching woes and, we believe, is generally easier to read.
mu <- c(1, 10, 100)

# Previously we wrote
mu |> map_dbl(rnorm, n = 1)
#> [1]  0.5706199 11.3604613 99.9291426

# Now we recommend
mu |> map_dbl(\(mu) rnorm(1, mean = mu))
#> [1]   0.7278463   7.5533200 100.0654866
Due to the tidyverse R dependency policy, purrr works in R 3.5, 3.6, 4.0, 4.1, and 4.2, but the base pipe and anonymous function syntax are only available in R 4.0 and later. So the examples are automatically disabled on R 3.5 and 3.6 to allow purrr to continue to pass R CMD check.
With that out of the way, we can now talk about the exciting new features in purrr 1.0.0. We’ll start with the map family of functions which have three big new features:
- Progress bars.
- Better errors.
- A new family member: map_vec().
These are described in the following sections.
The map family can now produce a progress bar, which is very useful for long-running jobs.
(For interactive use, the progress bar uses some simple heuristics so that it doesn’t show up for very simple jobs.)
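A minimal sketch of turning the progress bar on, assuming purrr 1.0.0 or later. The helper `slow_step()` and its sleep time are hypothetical stand-ins for real work; in a non-interactive session the bar may not be displayed at all.

```r
library(purrr)

# Each step sleeps briefly to simulate slow work; the sleep time is arbitrary
slow_step <- function(i) {
  Sys.sleep(0.01)
  i
}

# .progress = TRUE asks map() to display a progress bar while it iterates
results <- map(1:50, slow_step, .progress = TRUE)
```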
In most cases, we expect that .progress = TRUE is enough, but if you’re wrapping map() in another function, you might want to set .progress to a string that identifies the progress bar.
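For instance, a wrapper function might label its bar so users can tell which step is running. This is only a sketch: `square_all()` and the label text are made up for illustration.

```r
library(purrr)

# Hypothetical wrapper: the string passed to .progress names the progress bar
square_all <- function(xs) {
  map_dbl(xs, \(x) {
    Sys.sleep(0.01)  # simulate slow work
    x^2
  }, .progress = "Squaring values")
}

squares <- square_all(1:20)
```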
If there’s an error in the function you’re mapping, map() and friends now tell you which element caused the problem:

x <- sample(1:500)
x |> map(\(x) if (x == 1) stop("Error!") else 10)
#> Error in `map()`:
#> ℹ In index: 51.
#> Caused by error in `.f()`:
#> ! Error!
We hope that this makes your debugging life just a little bit easier! (Don’t forget about safely() and possibly() if you expect failures and want to either ignore or capture them.)
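As a brief reminder of how possibly() fits in, here is a small sketch; the input list and the name `safe_sqrt` are invented for the example.

```r
library(purrr)

inputs <- list(4, "a", 9)

# possibly() wraps a function so that an error yields `otherwise` instead
safe_sqrt <- possibly(sqrt, otherwise = NA_real_)

# sqrt("a") would normally stop the whole map; here it becomes NA
roots <- map_dbl(inputs, safe_sqrt)
```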
We have also generally reviewed the error messages throughout purrr in order to make them more actionable. If you hit a confusing error message, please let us know!
We’ve added map_vec() (along with map2_vec() and pmap_vec()) to handle more types of vectors. map_vec() extends map_lgl(), map_int(), map_dbl(), and map_chr() to arbitrary types of vectors, like dates, factors, and date-times:

1:3 |> map_vec(\(i) factor(letters[i]))
#> [1] a b c
#> Levels: a b c
1:3 |> map_vec(\(i) factor(letters[i], levels = letters[4:1]))
#> [1] a b c
#> Levels: d c b a

1:3 |> map_vec(\(i) as.Date(ISOdate(i + 2022, 10, 5)))
#> [1] "2023-10-05" "2024-10-05" "2025-10-05"
1:3 |> map_vec(\(i) ISOdate(i + 2022, 10, 5))
#> [1] "2023-10-05 12:00:00 GMT" "2024-10-05 12:00:00 GMT"
#> [3] "2025-10-05 12:00:00 GMT"
map_vec() exists somewhat in the middle of base R’s sapply() and vapply(). Unlike sapply() it will always return a simpler vector, erroring if there’s no common type:

list("a", 1) |> map_vec(identity)
#> Error in `map_vec()`:
#> ! Can't combine `<list>[[1]]` <character> and `<list>[[2]]` <double>.
If you want to require a certain type of output, supply .ptype, making map_vec() behave more like vapply(). ptype is short for prototype, and should be a vector that exemplifies the type of output you expect.

x <- list("a", "b")
x |> map_vec(identity, .ptype = character())
#> [1] "a" "b"

# will error if the result can't be automatically coerced
# to the specified ptype
x |> map_vec(identity, .ptype = integer())
#> Error in `map_vec()`:
#> ! Can't convert `<list>[[1]]` <character> to <integer>.
We don’t expect you to know or memorise the rules that vctrs uses for coercion; our hope is that they’ll become second nature as we steadily ensure that every tidyverse function follows the same rules.
purrr has gained a new pair of functions, keep_at() and discard_at(), that work like keep() and discard() but operate on names rather than values:

x <- list(a = 1, b = 2, c = 3, D = 4, E = 5)

x |> keep_at(c("a", "b", "c")) |> str()
#> List of 3
#>  $ a: num 1
#>  $ b: num 2
#>  $ c: num 3

x |> discard_at(c("a", "b", "c")) |> str()
#> List of 2
#>  $ D: num 4
#>  $ E: num 5
Alternatively, you can supply a function that is called with the names of the elements and should return a logical vector describing which elements to keep/discard:
is_lower_case <- function(x) x == tolower(x)

x |> keep_at(is_lower_case)
#> $a
#> [1] 1
#> 
#> $b
#> [1] 2
#> 
#> $c
#> [1] 3
You can now also pass such a function to all other _at() functions:

x |> modify_at(is_lower_case, \(x) x * 100) |> str()
#> List of 5
#>  $ a: num 100
#>  $ b: num 200
#>  $ c: num 300
#>  $ D: num 4
#>  $ E: num 5
Flattening and simplification
Last, but not least, we’ve reworked the family of functions that flatten and simplify lists. These caused us a lot of confusion internally because folks (and different packages) used the same words to mean different things. Now there are three main functions that share a common prefix that makes it clear that they all operate on lists:
- list_flatten() removes a single level of hierarchy from a list; the output is always a list.
- list_simplify() reduces a list to a homogeneous vector; the output is always the same length as the input.
- list_c(), list_rbind(), and list_cbind() concatenate the elements of a list to produce a vector or data frame. There are no constraints on the output.
These functions have led us to supersede a number of functions. This means that they are not going away but we no longer recommend them, and they will receive only critical bug fixes.
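- flatten() has been superseded by list_flatten().
- flatten_lgl(), flatten_int(), flatten_dbl(), and flatten_chr() have been superseded by list_c().
- flatten_dfr() and flatten_dfc() have been superseded by list_rbind() and list_cbind() respectively. flatten_dfr() had some particularly puzzling edge cases when the inputs would be flattened into columns.
- map_dfr() and map_dfc() (and their map2 and pmap variants) have been superseded in favour of using the appropriate map function along with list_rbind() or list_cbind().
- simplify(), simplify_all(), and as_vector() have been superseded in favour of list_simplify().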
list_flatten() removes one layer of hierarchy from a list. In other words, if any of the children of the list are themselves lists, the contents of those lists are inlined into the parent:
x <- list(1, list(2, list(3, 4), 5))
x |> str()
#> List of 2
#>  $ : num 1
#>  $ :List of 3
#>   ..$ : num 2
#>   ..$ :List of 2
#>   .. ..$ : num 3
#>   .. ..$ : num 4
#>   ..$ : num 5

x |> list_flatten() |> str()
#> List of 4
#>  $ : num 1
#>  $ : num 2
#>  $ :List of 2
#>   ..$ : num 3
#>   ..$ : num 4
#>  $ : num 5

x |> list_flatten() |> list_flatten() |> str()
#> List of 5
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3
#>  $ : num 4
#>  $ : num 5
list_flatten() always returns a list; once a list is as flat as it can get (i.e. none of its children contain lists), it leaves the input unchanged.
x |>
  list_flatten() |>
  list_flatten() |>
  list_flatten() |>
  str()
#> List of 5
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3
#>  $ : num 4
#>  $ : num 5
list_simplify() maintains the length of the input, but produces a simpler type:
list(1, 2, 3) |> list_simplify()
#> [1] 1 2 3

list("a", "b", "c") |> list_simplify()
#> [1] "a" "b" "c"
Because the length must stay the same, it will only succeed if every element has length 1:
list_simplify(list(1, 2, 3:4))
#> Error in `list_simplify()`:
#> ! `x[[3]]` must have size 1, not size 2.

list_simplify(list(1, 2, integer()))
#> Error in `list_simplify()`:
#> ! `x[[3]]` must have size 1, not size 0.
Because the result must be a simpler vector, all the components must be compatible:
list_simplify(list(1, 2, "a"))
#> Error in `list_simplify()`:
#> ! Can't combine `<list>[[1]]` <double> and `<list>[[3]]` <character>.
If you need to simplify if it’s possible, but otherwise leave the input unchanged, use strict = FALSE:

list_simplify(list(1, 2, "a"), strict = FALSE)
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] "a"
If you want to be specific about the type you want, list_simplify() can take the same prototype argument as map_vec():

list(1, 2, 3) |> list_simplify(ptype = integer())
#> [1] 1 2 3

list(1, 2, 3) |> list_simplify(ptype = factor())
#> Error in `list_simplify()`:
#> ! Can't convert `<list>[[1]]` <double> to <factor<>>.
list_c(), list_cbind(), and list_rbind() concatenate all elements together in a similar way to using do.call(c) or do.call(rbind) [1]. Unlike list_simplify(), this allows the elements to be different lengths.
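A short sketch of concatenating elements of differing lengths, assuming purrr 1.0.0 or later; the inputs and variable names are invented for illustration.

```r
library(purrr)

# list_c() concatenates elements of any length into a single vector;
# the zero-length element simply contributes nothing
combined <- list(1, 2, 3:4, integer()) |> list_c()

# list_rbind() stacks data frames by row, filling missing columns with NA
frames <- list(data.frame(x = 1:2), data.frame(y = "a"))
stacked <- frames |> list_rbind()
```

Here `combined` has four elements even though the input list also has four (one element contributed two values and one contributed none), and `stacked` has three rows from two data frames.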
The downside of this flexibility is that these functions break the connection between the input and the output. This reveals that map_dfr() and map_dfc() don’t really belong to the map family because they don’t maintain a 1-to-1 mapping between input and output: there’s no reliable way to associate a row in the output with an element in the input.
For this reason, map_dfr() and map_dfc() (and the map2 and pmap variants) are superseded and we recommend switching to an explicit call to list_rbind() or list_cbind() instead:

paths |> map_dfr(read_csv, .id = "path")
# now
paths |> map(read_csv) |> list_rbind(names_to = "path")
This new behaviour also affects accumulate() and accumulate2(), which previously had an idiosyncratic approach to simplification.
There’s one other new function that isn’t directly related to flattening and friends, but shares the list_ prefix: list_assign(). list_assign() is similar to list_modify() but it doesn’t work recursively. Recursion is a mildly confusing feature of list_modify() that’s easy to miss in the documentation.

list(x = 1, y = list(a = 1)) |>
  list_modify(y = list(b = 1)) |>
  str()
#> List of 2
#>  $ x: num 1
#>  $ y:List of 2
#>   ..$ a: num 1
#>   ..$ b: num 1
list_assign() doesn’t recurse into sublists, making it a bit easier to reason about:

list(x = 1, y = list(a = 1)) |>
  list_assign(y = list(b = 2)) |>
  str()
#> List of 2
#>  $ x: num 1
#>  $ y:List of 1
#>   ..$ b: num 2
A massive thanks to all 162 contributors who have helped make purrr 1.0.0 happen! @adamroyjones, @afoltzm, @agilebean, @ahjames11, @AHoerner, @alberto-dellera, @alex-gable, @AliciaSchep, @ArtemSokolov, @AshesITR, @asmlgkj, @aubryvetepi, @balwierz, @bastianilso, @batpigandme, @bebersb, @behrman, @benjaminschwetz, @billdenney, @Breza, @brunj7, @BrunoGrandePhD, @CGMossa, @cgoo4, @chsafouane, @chumbleycode, @ColinFay, @CorradoLanera, @CPRyan, @czeildi, @dan-reznik, @DanChaltiel, @datawookie, @dave-lovell, @davidsjoberg, @DavisVaughan, @deann88, @dfalbel, @dhslone, @dlependorf, @dllazarov, @dpprdan, @dracodoc, @echasnovski, @edo91, @edoardo-oliveri-sdg, @erictleung, @eyayaw, @felixhell2004, @florianm, @florisvdh, @flying-sheep, @fpinter, @frankzhang21, @gaborcsardi, @GarrettMooney, @gdurif, @ge-li, @ggrothendieck, @grayskripko, @gregleleu, @gregorp, @hadley, @hendrikvanb, @holgerbrandl, @hriebl, @hsloot, @huftis, @iago-pssjd, @iamnicogomez, @IndrajeetPatil, @irudnyts, @izahn, @jameslairdsmith, @jedwards24, @jemus42, @jennybc, @jhrcook, @jimhester, @jimjam-slam, @jnolis, @joelgombin, @jonathan-g, @jpmarindiaz, @jxu, @jzadra, @karchjd, @karjamatti, @kbzsl, @krlmlr, @lahvak, @lambdamoses, @lasuk, @lionel-, @lorenzwalthert, @LukasWallrich, @LukaszDerylo, @malcolmbarrett, @MarceloRTonon, @mattwarkentin, @maxheld83, @Maximilian-Stefan-Ernst, @mccroweyclinton-EPA, @medewitt, @meowcat, @mgirlich, @mine-cetinkaya-rundel, @mitchelloharawild, @mkoohafkan, @mlane3, @mmuurr, @moodymudskipper, @mpettis, @nealrichardson, @Nelson-Gon, @neuwirthe, @njtierney, @oduilln, @papageorgiou, @pat-s, @paulponcet, @petyaracz, @phargarten2, @philiporlando, @q-w-a, @QuLogic, @ramiromagno, @rcorty, @reisner, @Rekyt, @roboes, @romainfrancois, @rorynolan, @salim-b, @sar8421, @ScoobyQ, @sda030, @sgschreiber, @sheffe, @Shians, @ShixiangWang, @shosaco, @siavash-babaei, @stephenashton-dhsc, @stschiff, @surdina, @tdawry, @thebioengineer, @TimTaylor, @TimTeaFan, @tomjemmett, @torbjorn, @tvatter, 
@TylerGrantSmith, @vorpalvorpal, @vspinu, @wch, @werkstattcodes, @williamlai2, @yogat3ch, @yutannihilation, and @zeehio.
[1] But if they used the tidyverse coercion rules.