Separating columns using regex in R

Question

I have a data set where the columns are separated using a ton of white space, so that when you open it in a text editor, the columns are aligned.

The problem is that I can't read this file using whitespace as the separator, because one of the columns contains sentences that have spaces. I was wondering if I could somehow open this file in R by using a regex as the separator, like \s{2,}.

I've tried typing sep='\s{2,}', but that doesn't work.

You can read your file with readLines, separate the elements with strsplit, and then assemble the data into a data.frame, but it would be tiresome.
– droopy, Commented Mar 19, 2014 at 10:27

sgibb · Accepted Answer · 2014-03-19 10:26:01Z

1

You could use readLines to read all lines and strsplit+rbind to create your data.frame afterwards:

ll <- readLines(
  textConnection("Column1          Column2
Stupid sentence  Stupid sentence 2
foobar           foobar 2"))

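# Split each line on runs of two or more spaces
l <- strsplit(ll, " {2,}")

# Bind the data rows into a matrix; the first line supplies the column names
df <- as.data.frame(do.call(rbind, l[-1]))
colnames(df) <- l[[1]]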
df
#          Column1           Column2
#1 Stupid sentence Stupid sentence 2
#2          foobar          foobar 2
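As background for why the original attempt fails: read.table documents sep as a single field separator character rather than a pattern, and '\s' is not a valid escape inside an R string literal, whereas strsplit treats its split argument as a regular expression by default. The approach above could be wrapped in a small helper, sketched below. The function name split_on_wide_gaps is hypothetical, and the trimming, blank-line, and field-count checks are assumptions about what a real file might need, not part of the original answer.

```r
# Hypothetical helper: turn space-aligned lines into a data.frame by
# splitting on runs of two or more spaces.
split_on_wide_gaps <- function(lines) {
  lines <- trimws(lines)              # drop leading/trailing padding
  lines <- lines[nzchar(lines)]       # skip blank lines
  fields <- strsplit(lines, " {2,}")  # regex: two or more spaces

  # Assumption: every line must yield the same number of fields,
  # otherwise rbind would silently recycle values.
  n <- unique(lengths(fields))
  if (length(n) != 1L) {
    stop("lines have differing numbers of fields: ",
         paste(n, collapse = ", "))
  }

  # Data rows become a character matrix; the first line is the header.
  df <- as.data.frame(do.call(rbind, fields[-1]), stringsAsFactors = FALSE)
  colnames(df) <- fields[[1]]
  df
}

lines <- c("Column1          Column2",
           "Stupid sentence  Stupid sentence 2",
           "foobar           foobar 2")
df <- split_on_wide_gaps(lines)
```

For a real file, lines would come from readLines on the file path. Note that two columns separated by only a single space cannot be told apart from a space inside a sentence with this pattern.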


Ravindra · Accepted Answer · 2014-03-19 10:24:12Z

0

You can use a regex to strip the extra whitespace between the columns.


1 Comment

Galadude Over a year ago

I've tried doing this with Sublime Text, but the text file is pretty big and it crashes.
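
If the file is too large for a text editor, the same substitution could be done inside R instead. The following is a minimal sketch, assuming the data contains no tab characters of its own; the temporary file only stands in for the real data file so that the example is self-contained.

```r
# Write a small sample file so the sketch is self-contained
# (in practice `path` would point at the real data file).
path <- tempfile(fileext = ".txt")
writeLines(c("Column1          Column2",
             "Stupid sentence  Stupid sentence 2",
             "foobar           foobar 2"), path)

raw <- readLines(path)

# Collapse every run of two or more spaces into a single tab.
# Assumption: the data itself contains no tabs.
tabbed <- gsub(" {2,}", "\t", trimws(raw))

# Parse the tab-delimited text. quote = "" keeps quote characters
# literal; check.names = FALSE leaves the header names unchanged.
df <- read.delim(text = tabbed, sep = "\t", quote = "",
                 stringsAsFactors = FALSE, check.names = FALSE)
```

Going through a tab-delimited intermediate lets read.delim handle the header row and column type conversion, rather than leaving every column as character.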
