Using PHP to trim white space/tab for a string

Question

Following on from a previous topic, I want to strip the whitespace/tabs between tags in a string in PHP.

$html = '<tr>       <td>A     </td>                <td>B   </td>      <td>C    </td>       </tr>';

should be converted to

$html = '<tr><td>A     </td><td>B   </td><td>C    </td></tr>';

How do I write the PHP equivalent of the JavaScript statement str.replace(/>\s+</g,'><');?

@Tomalak In this case I'd say it's valid. The HTML is not being parsed anyway; it's simple string manipulation. Also, it was a direct answer to the question.
– Phil, Commented Aug 2, 2011 at 1:38
@Tomalak ... further, were it an entire document, I'd definitely agree with you.
– Phil, Commented Aug 2, 2011 at 1:46

Wrikken · Accepted Answer · 2011-08-02 01:43:08Z

4

$str = preg_replace('/(?<=>)\s+(?=<)/', '', $str);
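
As a rough sketch of how the lookaround pattern behaves, the snippet below runs it on the sample string from the question and compares it with a direct port of the JavaScript call. The variable names $viaLookaround and $viaDirectPort are made up for illustration; only preg_replace and the two patterns come from the thread.

```php
<?php
// Sample input taken from the question.
$html = '<tr>       <td>A     </td>                <td>B   </td>      <td>C    </td>       </tr>';

// Lookaround version: only the whitespace itself is matched, so it is replaced by ''.
$viaLookaround = preg_replace('/(?<=>)\s+(?=<)/', '', $html);

// Direct port of the JavaScript call: the match includes '>' and '<', so they are put back.
$viaDirectPort = preg_replace('/>\s+</', '><', $html);

echo $viaLookaround, "\n";                          // <tr><td>A     </td><td>B   </td><td>C    </td></tr>
var_dump($viaLookaround === $viaDirectPort);        // bool(true)
```

Whitespace that follows text, such as the spaces after "A", is left alone by both patterns because it is not preceded by ">".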

Less prone to breakage, but uses some more resources:

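<?php
$html = '<tr>       <td>A     </td>                <td>B   </td>      <td>C    </td>       </tr>';
$d = new DOMDocument();
$d->loadHTML($html);
$x = new DOMXPath($d);
// Select every text node that is empty or consists only of whitespace.
foreach($x->query('//text()[normalize-space()=""]') as $textnode){
    // Empty the node in place rather than removing it from the tree.
    $textnode->deleteData(0,strlen($textnode->wholeText));
}
// loadHTML() wraps the fragment in <html><body>, so the <tr> is reached via html -> body -> tr.
echo $d->saveXML($d->documentElement->firstChild->firstChild);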
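
The DOM approach above could be wrapped in a reusable function along these lines. This is only a sketch: stripBlankTextNodes and its $tagName parameter are hypothetical names, it removes the blank text nodes instead of emptying them, and it assumes the parser keeps the fragment's element in the tree so it can be looked up by name.

```php
<?php
// Hypothetical helper (not from the answer): drops whitespace-only text
// nodes from an HTML fragment and returns the first $tagName element.
function stripBlankTextNodes(string $fragment, string $tagName): string
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($fragment);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // The XPath result is a static list, so nodes can be removed while looping.
    foreach ($xpath->query('//text()[normalize-space()=""]') as $textNode) {
        $textNode->parentNode->removeChild($textNode);
    }

    // Look the element up by name rather than relying on the html/body wrapper depth.
    $node = $doc->getElementsByTagName($tagName)->item(0);
    return $node === null ? '' : $doc->saveHTML($node);
}

$out = stripBlankTextNodes('<tr>  <td>A  </td>   <td>B </td>  </tr>', 'tr');
echo $out, "\n";
```

Looking the element up by name avoids hard-coding the documentElement->firstChild->firstChild chain, at the cost of an extra parameter.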

edited Aug 2, 2011 at 1:43

answered Aug 2, 2011 at 1:32

Wrikken


2 Comments

Lightness Races in Orbit Over a year ago

Not a wholly robust solution, but depending on the input it may be sufficient in practice.

Wrikken Over a year ago

@Tomalak: indeed, somewhat iffy when talking SGML/HTML/XML of course; breakage possibilities aplenty. I'll offer an alternative.

genesis · Accepted Answer · 2011-08-02 01:27:54Z

0

http://sandbox.phpcode.eu/g/54ba6.php

result

<tr><td>A     </td><td>B   </td><td>C    </td></tr>

code

<?php 
$html = '<tr>       <td>A     </td>                <td>B   </td>      <td>C    </td>       </tr>'; 
$html = preg_replace('~(</td>)([\s]+)(<td>)~', '$1$3', $html); 
$html = preg_replace('~(<tr>)([\s]+)(<td>)~', '$1$3', $html); 
echo preg_replace('~(</td>)([\s]+)(</tr>)~', '$1$3', $html);

answered Aug 2, 2011 at 1:27

genesis


5 Comments

Charles Yeung Over a year ago

Thanks for the solution. Could you explain what "~" means in "'~(</td>)([\s]+)(<td>)~'", and what "$1$3" means? Thanks.

Lightness Races in Orbit Over a year ago

@Charles: Substitution. Take a look at the preg_replace manual page, and read a book on regular expressions.

Phil Over a year ago

@Charles PHP can use characters other than the forward slash as regular expression delimiters. That's useful if your pattern contains forward slashes (as in this case), as you don't need to escape them.

Phil Over a year ago

@genesis Why capture the whitespace in a character class?

Lightness Races in Orbit Over a year ago

@Phil: Good question. It's redundant, and genesis, you've done this before!
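
The points raised in these comments can be sketched as follows, assuming a shortened sample string; the variable names are invented for illustration. It shows the same replacement written with "/" and with "~" as the delimiter, and with and without the character class around \s. In the answer's patterns the whitespace is itself the second capture group, which is why the replacement there is "$1$3"; here the whitespace is not captured, so the tags are groups 1 and 2.

```php
<?php
$html = '<tr>   <td>A  </td>   <td>B </td>   </tr>';

// With '/' as the delimiter, the slash in '</td>' has to be escaped.
$withSlash = preg_replace('/(<\/td>)\s+(<td>)/', '$1$2', $html);

// With '~' as the delimiter, no escaping is needed.
// $1 and $2 re-insert the two captured tags; the whitespace between them is dropped.
$withTilde = preg_replace('~(</td>)\s+(<td>)~', '$1$2', $html);

// '[\s]+' and '\s+' match the same text; the character class adds nothing here.
$withClass = preg_replace('~(</td>)[\s]+(<td>)~', '$1$2', $html);

var_dump($withSlash === $withTilde && $withTilde === $withClass); // bool(true)
echo $withTilde, "\n"; // <tr>   <td>A  </td><td>B </td>   </tr>
```

Only the gap between "</td>" and "<td>" is closed, because that is the only pair of tags this pattern names.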
