25

$data contains tabs, leading spaces, and runs of multiple spaces. I want to replace every tab with a space, collapse multiple spaces into one single space, and remove leading spaces.

In fact, something that would turn this input data:

[    asdf asdf     asdf           asdf   ] 

Into output data:

[asdf asdf asdf asdf]

How do I do this?

2
  • Are you trying to strip all tabs regardless, or do you want to maintain columnar formatting? Commented Jan 13, 2013 at 20:46
  • I think you should re-evaluate the answers and accept a different one. Commented Oct 3, 2017 at 22:48

9 Answers

25

Trim, replace tabs and extra spaces with single spaces:

$data = preg_replace('/[ ]{2,}|[\t]/', ' ', trim($data));

2 Comments

First, this is a clunky pattern. Second, as you wrote it, it doesn't rescan, and so it will produce multiple spaces for a run of tab space tab space tab space. Did you test this?
This works only for defined variables, not variables assigned at runtime (at least in my testing).
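The "doesn't rescan" objection in the first comment can be checked with a small sketch. The sample string and the variable names $once and $fixed are made up for illustration; the second pattern is one possible fix, not the answerer's code.

```php
<?php
// Between the two words: tab, space, tab, space, tab.
$data = "asdf\t \t \tasdf";

// Pattern from the answer: every tab becomes a space, but the single
// spaces between the tabs are not runs of two or more, so they stay.
$once = preg_replace('/[ ]{2,}|[\t]/', ' ', trim($data));

// One character class with + treats any mix of spaces and tabs as one run.
$fixed = preg_replace('/[ \t]+/', ' ', trim($data));

echo json_encode($once), "\n";  // the words end up five spaces apart
echo json_encode($fixed), "\n"; // the words end up one space apart
```

Because preg_replace() makes a single left-to-right pass, the spaces it inserts for the tabs are never re-examined by the [ ]{2,} branch.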
16
$data = trim(preg_replace('/\s+/g', '', $data));

6 Comments

You also forgot to mention trim to get rid of leading spaces. Probably want to mention ltrim too, since he asked for leading spaces then illustrated both ends.
Yeah, thanks for pointing that out. The example shows that both leading and trailing spaces should be removed, so I updated my code.
helpful. preg_replace defaults to replace all occurrences of the pattern, unless you specify the limit parameter.
The "g" modifier doesn't seem to work. According to php.net/manual/en/reference.pcre.pattern.modifiers.php, no "g" is needed to repeat the match across the text: PHP treats the whole text as one line, even if it contains newlines.
minus one for a replacement string that removes ALL whitespace, even the single space that the OP wants between his "words"
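Taking the comments above together, a corrected sketch of this approach might look like the following. The sample input and the names $clean, $first and $count are assumptions for illustration; the limit and count arguments are the standard fourth and fifth parameters of preg_replace().

```php
<?php
$data = "    asdf asdf     asdf\t\t  asdf   ";

// No "g" modifier: PHP's PCRE rejects it, and preg_replace() already
// replaces every match. The replacement is a single space, not ''.
$clean = trim(preg_replace('/\s+/', ' ', $data));
echo $clean, "\n"; // asdf asdf asdf asdf

// The optional 4th argument caps the number of replacements and the
// 5th receives how many were actually made.
$first = preg_replace('/\s+/', ' ', $data, 1, $count);
echo $count, "\n"; // 1
```

With a limit of 1, only the leading run of spaces is collapsed, which is the behaviour you would otherwise get by default in engines that need a global flag.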
6
$data = trim($data);

That gets rid of your leading (and trailing) spaces.

$pattern = '/\s+/';
$data = preg_replace($pattern, ' ', $data);

That turns any run of one or more whitespace characters (spaces, tabs, newlines) into just one space.

$data = str_replace("\t", " ", $data);

That gets rid of your tabs.

1 Comment

If you're using \s you don't need \t
4

Assuming the square brackets aren't part of the string and you're just using them for illustrative purposes, then:

$new_string = trim(preg_replace('!\s+!', ' ', $old_string));

You might be able to do that with a single regex but it'll be a fairly complicated regex. The above is much more straightforward.

Note: I'm also assuming you don't want to replace "AB\t\tCD" (\t is a tab) with "AB CD".

1 Comment

This should be the accepted answer. I posted my answer just to refute your claim that trimming with preg_replace is "complicated".
2
$new_data = preg_replace("/[\t\s]+/", " ", trim($data));

1 Comment

if you're using \s you don't need \t
0

This answer takes the question completely literally: it is only concerned with spaces and tabs. Granted, the OP probably also wants to include other kinds of whitespace in what gets trimmed/compressed, but let's pretend he wants to preserve embedded CR and/or LF.

First, let's set up some constants. This will allow for both ease of understanding and maintainability, should modifications become necessary. I put in some extra spaces so that you can compare the similarities and differences more easily.

define( 'S', '[ \t]+'      ); # Stuff you want to compress; in this case ONLY spaces/tabs
define( 'L', '/\A'.S.'/'   ); # stuff on the Left edge will be trimmed
define( 'M',   '/'.S.'/'   ); # stuff in the Middle will be compressed
define( 'R',   '/'.S.'\Z/' ); # stuff on the Right edge will be trimmed
define( 'T', ' '           ); # what we want the stuff compressed To

We are using \A and \Z escape characters to specify the beginning and end of the subject, instead of the typical ^ and $ which are line-oriented meta-characters. This is not so much because they are needed in this instance as much as "defensive" programming, should the value of S change to make them needed in the future.

Now for the secret sauce: we are going to take advantage of some special semantics of preg_replace, namely (emphasis added)

If there are fewer elements in the replacement array than in the pattern array, any extra patterns will be replaced by an empty string.

function trim_press( $data ){
    return preg_replace( [ M, L, R ], [ T ], $data );
}

So instead of a pattern string and replacement string, we are using a pattern array and replacement array, which results in the extra patterns L and R being trimmed.
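As a usage sketch of the same idea, the version below uses local variables instead of the constants, so it can run on its own. The function name trim_press_local and the sample input are hypothetical. Note how the space after the embedded newline survives, because \A and \Z anchor to the subject rather than to each line.

```php
<?php
// Hypothetical restatement of trim_press() without global constants.
function trim_press_local(string $data): string
{
    $s = '[ \t]+'; // only spaces and tabs are compressed
    return preg_replace(
        ['/' . $s . '/', '/\A' . $s . '/', '/' . $s . '\Z/'],
        [' '], // only the first pattern has a replacement; the rest get ''
        $data
    );
}

$input  = "  \tasdf \t asdf\n\tqwer   zxcv \t";
$output = trim_press_local($input);

echo json_encode($output), "\n"; // "asdf asdf\n qwer zxcv"
```

The patterns are applied in array order, so the middle pattern first reduces every run (including the edge runs) to one space, and the two edge patterns then delete what is left at the very start and end.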

Comments

0

In case you need to remove &nbsp; too.

$data = trim(preg_replace('/(?:\s|&nbsp;)+/', ' ', $data));

Comments

0

After much frustration I found this to be the best solution, as it also removes non-breaking spaces, which can be two bytes long:

$data = html_entity_decode(str_replace('&nbsp;', ' ', htmlentities($data)));
$data = trim(preg_replace('/\h+/', ' ', $data)); // \h replaces more space character types than \s

See billynoah
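A shorter route to the same effect, assuming the input is valid UTF-8, is to let \h do the work with the u modifier. This is a sketch rather than the answerer's method; the sample string is made up, and "\xC2\xA0" is the two-byte UTF-8 encoding of a non-breaking space.

```php
<?php
// Two non-breaking spaces followed by a plain space, then a newline.
$data = "asdf\xC2\xA0\xC2\xA0 asdf\nqwer";

// With /u the subject is read as UTF-8, and \h (horizontal whitespace)
// matches U+00A0 as well as space and tab, but not the newline.
$clean = trim(preg_replace('/\h+/u', ' ', $data));

echo json_encode($clean), "\n"; // "asdf asdf\nqwer"
```

Unlike \s, \h never matches line breaks, so embedded newlines are preserved.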

Comments

-1

Just use this regex

$str = trim(preg_replace('/\s\s+/', ' ', $str));

It replaces runs of tabs and spaces with one space.

Here the + sign means "one or more times", so the pattern matches wherever there are two or more whitespace characters in a row and replaces them with one space.

1 Comment

minus one for a pattern that fails to match a solitary tab (and for not testing your solution before posting).
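The objection in the comment is easy to reproduce. In this sketch the sample string and the names $missed and $caught are assumptions for illustration.

```php
<?php
// A solitary tab between the first two words, two spaces before the third.
$str = "asdf\tasdf  asdf";

// \s\s+ needs at least two whitespace characters, so the lone tab is skipped.
$missed = trim(preg_replace('/\s\s+/', ' ', $str));

// \s+ also matches a run of length one, so the tab becomes a space.
$caught = trim(preg_replace('/\s+/', ' ', $str));

echo json_encode($missed), "\n"; // "asdf\tasdf asdf"
echo json_encode($caught), "\n"; // "asdf asdf asdf"
```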
