case_when(), case_match(), and consecutive_id()
dplyr 1.1.0
Install dplyr 1.1.0 with:
pak::pak("cran/dplyr@1.1.0")
Load the package with:
library(dplyr)
case_when()
case_when() is a general vectorised if-else.
NA
Have you ever run case_when() and gotten the error message:
x <- c(1, 12, -5, 6, -2, NA, 0)
case_when(
  x >= 10 ~ "large",
  x >= 0 ~ "small",
  x < 0 ~ NA
)
Error: `NA` must be <character>, not <logical>.
In this case, you had to use NA_character_ instead of NA.
But not anymore!
In dplyr 1.1.0, the switch to vctrs means that the above code now “just works”:
case_when(
  x >= 10 ~ "large",
  x >= 0 ~ "small",
  x < 0 ~ NA
)
[1] "small" "large" NA      "small" NA      NA      "small"
TRUE
To set a default in case_when(), you used to have to add a catch-all TRUE condition as the last case:
[1] "small"   "large"   "other"   "small"   "other"   "missing" "small"  
Now there’s an explicit argument .default:
[1] "small"   "large"   "other"   "small"   "other"   "missing" "small"  
Using TRUE as the catch-all isn’t deprecated yet, but the team is planning on deprecating it in the future.
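The calls behind the two outputs above are not shown, so here is a sketch reconstructed from the printed results. It assumes x is still the numeric vector from the NA example and that an is.na(x) clause produced "missing"; both are inferences from the output, not the original code.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 is installed

# Assumption: the same numeric vector as in the NA example.
x <- c(1, 12, -5, 6, -2, NA, 0)

# Older idiom: a final TRUE condition acts as the catch-all.
old_style <- case_when(
  x >= 10 ~ "large",
  x >= 0 ~ "small",
  is.na(x) ~ "missing",  # assumed clause, inferred from the "missing" output
  TRUE ~ "other"
)

# dplyr 1.1.0 idiom: the fallback moves into the .default argument.
new_style <- case_when(
  x >= 10 ~ "large",
  x >= 0 ~ "small",
  is.na(x) ~ "missing",  # assumed clause, as above
  .default = "other"
)

# Both spellings give the same character vector.
identical(old_style, new_style)
```

Keeping the fallback in .default separates it from the conditions, so the list of cases reads as cases only.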
case_match()
Sometimes, case_when() can be a bit repetitive:
x <-
  c("USA", "Canada", "Wales", "UK", "China", NA, "Mexico", "Russia")
case_when(
  x %in% c("USA", "Canada", "Mexico") ~ "North America",
  x %in% c("Wales", "UK") ~ "Europe",
  x %in% "China" ~ "Asia"
)
[1] "North America" "North America" "Europe"        "Europe"       
[5] "Asia"          NA              "North America" NA             
case_match() is a special case of case_when() that matches on values rather than logical conditions, which makes it a nice way to do replacements. You can streamline your code:
case_match(
  x,
  c("USA", "Canada", "Mexico") ~ "North America",
  c("France", "UK") ~ "Europe",
  "China" ~ "Asia"
)
[1] "North America" "North America" NA              "Europe"       
[5] "Asia"          NA              "North America" NA             
The left-hand sides are no longer logical vectors, just values. You can also put NA on the left-hand side:
case_match(
  x,
  c("USA", "Canada", "Mexico") ~ "North America",
  c("France", "UK") ~ "Europe",
  "China" ~ "Asia",
  NA ~ "missing"
)
[1] "North America" "North America" NA              "Europe"       
[5] "Asia"          "missing"       "North America" NA             
It also works with .default:
case_match(
  x,
  c("USA", "Canada", "Mexico") ~ "North America",
  c("France", "UK") ~ "Europe",
  "China" ~ "Asia",
  NA ~ "missing",
  .default = "unknown"
)
[1] "North America" "North America" "unknown"       "Europe"       
[5] "Asia"          "missing"       "North America" "unknown"      
if_else() has received the same updates as case_when(). In particular, it is no longer as strict about typed missing values.
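As a small illustration of the if_else() change, the sketch below passes a bare NA where a typed NA_character_ was previously needed. The vector x and the labels are made up for this example, and it assumes dplyr 1.1.0 or later.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 is installed

# Hypothetical input, chosen to include negative and missing values.
x <- c(1, 12, -5, 6, -2, NA, 0)

# A bare NA in `false` is cast to character to match `true`.
sign_label <- if_else(x >= 0, "non-negative", NA)

# The existing `missing` argument fills in where the condition itself is NA.
sign_label_filled <- if_else(x >= 0, "non-negative", NA, missing = "missing")

sign_label
sign_label_filled
```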
consecutive_id()
Here’s an example transcript:
friends_dialogue
# A tibble: 10 × 2
   text                                                                  speaker
   <chr>                                                                 <chr>  
 1 There's nothing to tell! He's just some guy I work with!              Monica…
 2 C'mon, you're going out with the guy! There's gotta be something wro… Joey T…
 3 All right Joey, be nice. So does he have a hump? A hump and a hairpi… Chandl…
 4 Wait, does he eat chalk?                                              Phoebe…
 5 Just, 'cause, I don't want her to go through what I went through wit… Phoebe…
 6 Okay, everybody relax. This is not even a date. It's just two people… Monica…
 7 Sounds like a date to me.                                             Chandl…
 8 Alright, so I'm back in high school, I'm standing in the middle of t… Chandl…
 9 Then I look down, and I realize there's a phone... there.             Chandl…
10 Instead of...?                                                        Joey T…
What if we want to put the continuous dialogue together in one line?
friends_dialogue |>
  summarise(text = stringr::str_flatten(text, collapse = " "),
            .by = speaker)
# A tibble: 4 × 2
  speaker        text                                                           
  <chr>          <chr>                                                          
1 Monica Geller  There's nothing to tell! He's just some guy I work with! Okay,…
2 Joey Tribbiani C'mon, you're going out with the guy! There's gotta be somethi…
3 Chandler Bing  All right Joey, be nice. So does he have a hump? A hump and a …
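4 Phoebe Buffay  Wait, does he eat chalk? Just, 'cause, I don't want her to go …
This smushes together everything each speaker says, regardless of order. What if we want to keep the consecutive runs separate?
Enter consecutive_id()! It generates an integer id that increments every time the value of speaker changes: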
friends_dialogue |>
  mutate(id = consecutive_id(speaker))
# A tibble: 10 × 3
   text                                                            speaker    id
   <chr>                                                           <chr>   <int>
 1 There's nothing to tell! He's just some guy I work with!        Monica…     1
 2 C'mon, you're going out with the guy! There's gotta be somethi… Joey T…     2
 3 All right Joey, be nice. So does he have a hump? A hump and a … Chandl…     3
 4 Wait, does he eat chalk?                                        Phoebe…     4
 5 Just, 'cause, I don't want her to go through what I went throu… Phoebe…     4
 6 Okay, everybody relax. This is not even a date. It's just two … Monica…     5
 7 Sounds like a date to me.                                       Chandl…     6
 8 Alright, so I'm back in high school, I'm standing in the middl… Chandl…     6
 9 Then I look down, and I realize there's a phone... there.       Chandl…     6
10 Instead of...?                                                  Joey T…     7
With this, we can correctly group the dialogue:
friends_dialogue |>
  mutate(id = consecutive_id(speaker)) |>
  summarise(text = stringr::str_flatten(text, collapse = " "),
            .by = c(id, speaker))
# A tibble: 7 × 3
     id speaker        text                                                     
  <int> <chr>          <chr>                                                    
1     1 Monica Geller  There's nothing to tell! He's just some guy I work with! 
2     2 Joey Tribbiani C'mon, you're going out with the guy! There's gotta be s…
3     3 Chandler Bing  All right Joey, be nice. So does he have a hump? A hump …
4     4 Phoebe Buffay  Wait, does he eat chalk? Just, 'cause, I don't want her …
5     5 Monica Geller  Okay, everybody relax. This is not even a date. It's jus…
6     6 Chandler Bing  Sounds like a date to me. Alright, so I'm back in high s…
7     7 Joey Tribbiani Instead of...?
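To see the numbering on its own, here is a minimal sketch that runs consecutive_id() on a plain character vector. The speaker vector uses abbreviated first names in the same order as the transcript and is an assumption made for illustration; the rle() comparison is a base R approximation added here, not part of dplyr.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 is installed

# Hypothetical stand-in for the speaker column (first names only).
speaker <- c("Monica", "Joey", "Chandler", "Phoebe", "Phoebe",
             "Monica", "Chandler", "Chandler", "Chandler", "Joey")

# The id increases by one each time the value differs from the previous one.
ids <- consecutive_id(speaker)

# Base R comparison: number the runs found by run-length encoding.
runs <- rle(speaker)
base_ids <- rep(seq_along(runs$lengths), times = runs$lengths)

ids
all(ids == base_ids)
```

Because a repeated speaker gets a new id each time they start speaking again, grouping by the id keeps Monica's first and second turns apart.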