regex - How to extract the substrings that match a particular regular expression from a string in R
I am trying to write a function that extracts the substrings of a string that match a regular expression. For example:
str <- "hello brother how are you"
I want to extract the substrings of str that match the regular expression "[a-z]+ [a-z]+", including overlapping matches,
which should result in:
"hello brother" "brother how" "how are" "are you"
Is there a library function that can do this?
You can use the str_match_all function from the stringr library with the method Tim Pietzcker described in his answer (capturing inside an unanchored positive lookahead):
> library(stringr)
> str <- "hello brother how are you"
> res <- str_match_all(str, "(?=\\b([[:alpha:]]+ [[:alpha:]]+))")
> l <- unlist(res)
> l[l != ""]
## [1] "hello brother" "brother how" "how are" "are you"
Or, to keep only the unique values:
> unique(l[l != ""])
## [1] "hello brother" "brother how" "how are" "are you"
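If the goal is specifically pairs of adjacent words, rather than overlapping matches of an arbitrary pattern, a different technique that needs no lookahead is to split the string and paste neighbouring words together. The following is only a minimal sketch in base R: the helper name adjacent_word_pairs is hypothetical, and it assumes the words are separated by spaces.

```r
# Hypothetical helper (not part of stringr or base R): builds overlapping
# two-word substrings by splitting on spaces instead of using a regex lookahead.
adjacent_word_pairs <- function(s) {
  words <- strsplit(s, " +")[[1]]   # split on runs of spaces
  words <- words[words != ""]       # drop the empty piece a leading space produces
  if (length(words) < 2) return(character(0))
  # pair each word with the one that follows it
  paste(head(words, -1), tail(words, -1))
}

str <- "hello brother how are you"
pairs <- adjacent_word_pairs(str)
print(pairs)
## [1] "hello brother" "brother how"   "how are"       "are you"
```

Unlike the lookahead approach, this does not check that each word consists only of letters, so it is a substitute only when the input is already known to be plain words.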
I advise using [[:alpha:]] instead of [a-z], since it matches more letters: uppercase as well as lowercase, for example.
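The difference between the two character classes can be checked with a small base R sketch. The sample words below are made up for illustration, and perl = TRUE is used so that the [a-z] range is not interpreted in a locale-dependent way.

```r
# Illustrative input, not from the question
words <- c("hello", "Hello", "HELLO")

# [a-z] only accepts lowercase ASCII letters
lower_only <- grepl("^[a-z]+$", words, perl = TRUE)

# [[:alpha:]] accepts uppercase letters as well
any_alpha <- grepl("^[[:alpha:]]+$", words, perl = TRUE)

print(lower_only)
## [1]  TRUE FALSE FALSE
print(any_alpha)
## [1] TRUE TRUE TRUE
```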