This is my dataset:

id   text
 1    "red"
 1    "blue"
 2    "light blue"
 2    "red"
 2    "yellow"
 3    "dark green"

This is the result I want to obtain:

 id  text2
 1   "red, blue"
 2   "light blue, red, yellow"
 3   "dark green"

Basically, for each id I need to combine the values in column 'text' into a single string, with commas separating the elements.

2 Answers

Using aggregate and toString.

aggregate(. ~ id, d, toString)
#   id                    text
# 1  1               red, blue
# 2  2 light blue, red, yellow
# 3  3              dark green

Note: This won't work with factor columns, i.e. if is.factor(d$text) yields TRUE, you need a slightly different approach. Demonstration:

d$text <- as.factor(d$text)  # make it a factor
is.factor(d$text)
#  [1] TRUE

Do:

aggregate(. ~ id, transform(d, text=as.character(text)), toString)

Data:

d <- structure(list(id = c(1L, 1L, 2L, 2L, 2L, 3L), text = c("red", 
"blue", "light blue", "red", "yellow", "dark green")), row.names = c(NA, 
-6L), class = "data.frame")

2 Comments

I'm not sure how to convert my data frame using the structure command. It looks like this: id <- c(1,1,2,2,2,3); text <- c("red", "blue", "light blue", "red", "yellow", "dark green"); data <- cbind.data.frame(id, text). What is c(NA, -6L)? How can I restructure my data frame so that aggregate works properly? When I run aggregate, it shows numbers in the text column instead of the proper text.
Your column seems to be of class "factor". Please see edit to my answer. You could also use cbind.data.frame(id, text, stringsAsFactors=FALSE), though, to prevent factors beforehand. (The structure(.) thing is just the output of dput(d) which is the way we share data here on Stack Overflow, see stackoverflow.com/questions/5963269/…)
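A minimal sketch of the suggestion in the comment above, assuming the data frame is built from the two vectors shown by the commenter. The names d2 and res are made up for illustration. Passing stringsAsFactors = FALSE keeps text as a character column on any R version (on R 4.0.0 and later this is already the default for data.frame()).

```r
# Vectors as given in the comment
id <- c(1, 1, 2, 2, 2, 3)
text <- c("red", "blue", "light blue", "red", "yellow", "dark green")

# Build the data frame while keeping 'text' as character, not factor
d2 <- data.frame(id, text, stringsAsFactors = FALSE)
is.factor(d2$text)
# [1] FALSE

# Collapse the text values per id into one comma-separated string
res <- aggregate(text ~ id, data = d2, FUN = toString)
res
```

With a character column, the aggregated text column contains the labels themselves rather than numbers.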
We can use dplyr:

library(dplyr)
df1 %>%
    group_by(id) %>%
    summarise(text2 = toString(text))

Data:

df1 <- structure(list(id = c(1L, 1L, 2L, 2L, 2L, 3L), text = c("red", 
"blue", "light blue", "red", "yellow", "dark green")), row.names = c(NA, 
-6L), class = "data.frame")
