stringr 1.5.0

stringr has accumulated several new functions since its last release three years ago.
Published December 2, 2022

Install stringr 1.5.0 with:

pak::pak("cran/stringr@1.5.0")

Load the package with:

library(stringr)

New str_* functions

str_view()

str_view() lets you clearly see a string with special characters:

x <- "a\n'\b\n\"c"
x
[1] "a\n'\b\n\"c"

In base R, you can use writeLines() to get a good look at the string:

writeLines(x)
a
'
"c

Now you can use str_view() for the same purpose:

str_view(x)
[1] │ a
    │ '
    │ "c

str_view() also highlights special characters that are otherwise hard to see, such as non-breaking spaces and tabs:

nbsp <- "Hi\u00A0you"
nbsp
[1] "Hi you"
nbsp == "Hi you"
[1] FALSE
str_view(nbsp)
[1] │ Hi{\u00a0}you
tab_space <- "\t"
str_view(tab_space)
[1] │ {\t}

Finally, str_view() makes matches stand out:

str_view(c("abc", "def", "fghi"), "[aeiou]")
[1] │ <a>bc
[2] │ d<e>f
[3] │ fgh<i>
str_view(c("abc", "def", "fghi"), ".$")
[1] │ ab<c>
[2] │ de<f>
[3] │ fgh<i>
str_view(fruit, "(.)\\1")
 [1] │ a<pp>le
 [5] │ be<ll> pe<pp>er
 [6] │ bilbe<rr>y
 [7] │ blackbe<rr>y
 [8] │ blackcu<rr>ant
 [9] │ bl<oo>d orange
[10] │ bluebe<rr>y
[11] │ boysenbe<rr>y
[16] │ che<rr>y
[17] │ chili pe<pp>er
[19] │ cloudbe<rr>y
[21] │ cranbe<rr>y
[23] │ cu<rr>ant
[28] │ e<gg>plant
[29] │ elderbe<rr>y
[32] │ goji be<rr>y
[33] │ g<oo>sebe<rr>y
[38] │ hucklebe<rr>y
[47] │ lych<ee>
[50] │ mulbe<rr>y
... and 9 more

str_equal()

Use str_equal() to determine if two strings are equivalent:

str_equal("a", "A")
[1] FALSE

You have the option to ignore case:

str_equal("a", "A", ignore_case = TRUE)
[1] TRUE
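
Unlike ==, str_equal() compares strings using Unicode rules, so two strings that encode the same text in different ways still compare as equal:

# These two strings encode "a" with an accent in two different ways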
a1 <- "\u00e1"
a2 <- "a\u0301"
c(a1, a2)
[1] "á" "á"
a1 == a2
[1] FALSE
str_equal(a1, a2)
[1] TRUE
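
To make the difference between a1 and a2 concrete, here is a minimal sketch that counts the code points in each string and then normalises both to NFC before comparing with ==. It assumes the stringi package (which stringr builds on) is installed; the variable names are illustrative only.

```r
library(stringr)

a1 <- "\u00e1"   # precomposed: the single code point U+00E1
a2 <- "a\u0301"  # decomposed: "a" followed by the combining acute accent U+0301

# The two strings contain a different number of code points
n_code_points <- c(length(utf8ToInt(a1)), length(utf8ToInt(a2)))

# `==` sees different strings, str_equal() sees the same text
base_equal <- a1 == a2
stringr_equal <- str_equal(a1, a2)

# After normalising both strings to NFC, even `==` agrees
nfc_equal <- stringi::stri_trans_nfc(a1) == stringi::stri_trans_nfc(a2)
```

Normalising by hand works, but str_equal() saves you from having to remember to do it on both sides of every comparison.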

str_rank()

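str_rank() completes the set of sorting functions alongside str_sort() and str_order(). It returns the rank of each value, and tied values share the lowest rank: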

str_rank(c("a", "c", "b", "b"))
[1] 1 4 2 2
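
As a rough illustration of how str_rank() relates to the existing str_sort() and str_order(), the sketch below applies all three to the same vector. The variable names are only for illustration.

```r
library(stringr)

x <- c("a", "c", "b", "b")

sorted <- str_sort(x)   # the values themselves, in sorted order
ord    <- str_order(x)  # the positions that would sort x
ranks  <- str_rank(x)   # the rank of each element, in the original order

# Indexing x by its order reproduces the sorted vector
order_matches_sort <- identical(x[ord], sorted)
```

str_order() answers "which element comes first?", while str_rank() answers "where does this element fall?", which is often what you want when adding a rank column next to the original values.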

str_unique()

str_unique() returns unique values:

str_unique(c("a", "a", "A"))
[1] "a" "A"

You have the option to ignore case:

str_unique(c("a", "a", "A"), ignore_case = TRUE)
[1] "a"

str_split_1()

str_split_1() splits a single string. It returns a character vector, not a list:

unlist(str_split("x-y-z", "-"))
[1] "x" "y" "z"
str_split_1("x-y-z", "-")
[1] "x" "y" "z"

str_split_i()

str_split_i() extracts a single piece from each split string. A negative index counts from the end, and a string with too few pieces yields NA:

x <- c("a-b-c", "d-e", "f-g-h-i")
str_split_i(x, "-", 2)
[1] "b" "e" "g"
str_split_i(x, "-", 4)
[1] NA  NA  "i"
str_split_i(x, "-", -1)
[1] "c" "e" "i"
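
For comparison, here is a sketch of what the same extraction might look like with str_split() and vapply(). pick_piece() is a hypothetical helper written for this example, not part of stringr, and it only handles positive indices.

```r
library(stringr)

x <- c("a-b-c", "d-e", "f-g-h-i")

# Hypothetical helper: take the i-th piece (positive i only) of each split string,
# returning NA when a string has fewer than i pieces
pick_piece <- function(string, pattern, i) {
  pieces <- str_split(string, pattern)
  vapply(
    pieces,
    function(p) if (i <= length(p)) p[[i]] else NA_character_,
    character(1)
  )
}

manual_2 <- pick_piece(x, "-", 2)
direct_2 <- str_split_i(x, "-", 2)

manual_4 <- pick_piece(x, "-", 4)
direct_4 <- str_split_i(x, "-", 4)
```

Both approaches should give the same pieces; str_split_i() simply removes the need for the helper and also accepts negative indices.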

str_like()

str_like() works like str_detect() but uses SQL’s LIKE syntax, where % matches any number of characters and _ matches exactly one character:

fruit <- c("apple", "banana", "pear", "pineapple")
fruit[str_like(fruit, "%apple")]
[1] "apple"     "pineapple"
fruit[str_like(fruit, "p__r")]
[1] "pear"
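
If you are more used to regular expressions, the following sketch shows one way the two LIKE patterns above could be written for str_detect(). The regular expressions are hand-written equivalents assumed for this lowercase input, not the translation stringr performs internally.

```r
library(stringr)

fruit <- c("apple", "banana", "pear", "pineapple")

# "%apple": anything, then "apple" at the end of the string
like_suffix  <- str_like(fruit, "%apple")
regex_suffix <- str_detect(fruit, "apple$")

# "p__r": "p", exactly two characters, then "r", matching the whole string
like_fixed  <- str_like(fruit, "p__r")
regex_fixed <- str_detect(fruit, "^p..r$")
```

Note that a LIKE pattern always has to match the whole string, which is why the regular expression versions need explicit anchors.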

Learn more