I have a data set where the columns are separated using a ton of white space, so that when you open it in a text editor, the columns are aligned.

The problem is that I can't open this file using the white space separator, because one of the columns contains sentences that have spaces. I was wondering if I could somehow open this file in R by using a regex separator, like \s{2,}.

I've tried typing sep='\s{2,}', but that doesn't work.

You can read your file with readLines, separate the elements with strsplit, and then assemble your data into a data.frame, but it would be tiresome. Commented Mar 19, 2014 at 10:27

2 Answers

You could use readLines to read all lines and strsplit+rbind to create your data.frame afterwards:

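# Read the raw lines (a textConnection stands in for the real file here)
ll <- readLines(
  textConnection("Column1          Column2
Stupid sentence  Stupid sentence 2
foobar           foobar 2"))

# Split each line wherever two or more spaces occur
l <- strsplit(ll, " {2,}")

# Bind the data rows into a data.frame; the first line holds the column names
df <- as.data.frame(do.call(rbind, l[-1]))
colnames(df) <- l[[1]]
df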
#          Column1           Column2
#1 Stupid sentence Stupid sentence 2
#2          foobar          foobar 2
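
For an actual file rather than an inline string, the same readLines + strsplit + rbind idea could be wrapped in a small helper. This is only a sketch: the function name read_ws_table and the temporary file are made up for illustration, and it assumes every row has the same number of fields and that no cell is empty. All columns come back as character.

```r
# Hypothetical helper (not part of base R): read a file whose columns
# are separated by runs of two or more spaces.
read_ws_table <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(trimws(lines))]       # drop blank lines
  parts <- strsplit(trimws(lines), " {2,}")   # split on 2+ spaces
  n <- lengths(parts)
  if (any(n != n[1])) {
    # Row numbers are counted after blank lines have been dropped
    stop("rows with an unexpected number of fields: ",
         paste(which(n != n[1]), collapse = ", "))
  }
  df <- as.data.frame(do.call(rbind, parts[-1]), stringsAsFactors = FALSE)
  colnames(df) <- parts[[1]]                  # first line is the header
  df
}

# Usage with a temporary file standing in for the real data set
tmp <- tempfile(fileext = ".txt")
writeLines(c("Column1          Column2",
             "Stupid sentence  Stupid sentence 2",
             "foobar           foobar 2"), tmp)
df <- read_ws_table(tmp)
```

The field-count check is there because a sentence that happens to contain two consecutive spaces would otherwise silently shift the columns.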

You can use a regex to replace the runs of white space between the columns with a single delimiter, and then read the file normally.
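
The replacement does not have to happen in a text editor; it could be done inside R itself. A minimal sketch, assuming a tab never occurs in the data and no cell is empty (the sample lines are taken from the other answer, and "data.txt" is a made-up file name):

```r
raw <- c("Column1          Column2",
         "Stupid sentence  Stupid sentence 2",
         "foobar           foobar 2")
# For a real file: raw <- readLines("data.txt")  # hypothetical file name

# Turn every run of 2+ spaces into a single tab
tabbed <- gsub(" {2,}", "\t", trimws(raw))

# Parse the tab-separated lines; quote = "" keeps quotes in sentences literal
df2 <- read.delim(text = tabbed, header = TRUE, quote = "",
                  stringsAsFactors = FALSE, check.names = FALSE)
```

Unlike the strsplit + rbind route, read.delim also guesses column types, so numeric columns would come back as numbers rather than character strings.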

1 Comment

I've tried doing this with Sublime Text, but the text file is pretty big and it crashes.
