3

I need to display, in a web page, a plain-text file that contains columns of data aligned with tabs (two spaces per tab).

What I did was use PHP to read the text file and print it between <pre> tags, so that it is rendered in a monospace font, like so:

<pre>
<?php
  $fn="data.txt";
  $fi=fopen($fn, "r");
  $fc=fread($fi, filesize($fn));         //open and read text file
  fclose($fi);
  $fc=str_replace("\t", "  ", $fc);      //replace tabs with two spaces
  print($fc);                            //print data between PRE tags
?>
</pre>

It almost works, but the tabs are troublesome. Replacing each tab with two spaces is trivial, but then the text that follows is pushed over instead of being absorbed into the tab. A true tab advances only to the next tab stop, so it absorbs up to n-1 preceding non-whitespace characters (where n is the number of spaces per tab).

For example, the following table should be displayed like this:
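
The tab-stop rule can be written as arithmetic: a tab that occurs at zero-based column c expands to n - (c mod n) spaces. A minimal sketch of that rule follows; the function name spacesForTab is hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper: number of spaces a tab expands to when it occurs
// at zero-based column $col, with tab stops every $n columns.
function spacesForTab(int $col, int $n = 2): int
{
    return $n - ($col % $n);
}

// With two-space tabs, a tab at column 0 fills two columns,
// while a tab at column 1 fills only the one remaining column.
echo spacesForTab(0), "\n";    // 2
echo spacesForTab(1), "\n";    // 1
echo spacesForTab(3, 4), "\n"; // 1
```

This is why a blind str_replace misaligns the table: it always emits n spaces, regardless of the column at which the tab occurs.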

|   | 43| 43|  7|   |   |
| 12|128|128|128|   | 53|
|  3|  3|  3|  3|   |   |
|   |   | 21| 21| 39|   |

However, by blindly replacing every tab with two spaces, we get this:

|    |  43|  43|    7|   |   |
|  12|128|128|128|   | 53|
|   3|   3|   3|   3|   |   |
|   |   |  21|  21|  39|   |

I’m trying to figure out a (reasonably easy) way to convert the tabs to spaces while accounting for tabs that don’t take up the full n spaces.

5 Comments
  • Unfortunately, I believe you need a second pass to accurately calculate the column's width. Commented Jan 13, 2013 at 21:22
  • What is reasonably easy? It's not hard to do if you allow for some looping. I'm thinking str_pad. Commented Jan 13, 2013 at 21:29
  • Can you post the contents of your data.txt? Commented Jan 13, 2013 at 21:30
  • You could probably achieve this effect with preg_replace_callback and printf, read up on those two. Commented Jan 13, 2013 at 21:33
  • @MadaraUchiha, printf is a great idea, but the callback receives an array of matches, in other words, it would get an array of tabs without the non-whitespace characters that come after them. I suppose it might be possible to parse using the column delimiters (assuming they exist like in this particular case). Of course at that point, it would probably be easier to process the file line-by-line instead of doing a string-replace. Commented Jan 13, 2013 at 22:44
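
One way around the limitation raised in the last comment is to have the pattern capture the text that precedes each tab, so the callback can measure it. This is only a sketch under stated assumptions: the text is single-byte, lines end in "\n", and the name expandTabsRegex is made up for illustration.

```php
<?php
// Hypothetical helper: expand every tab to the next tab stop via a callback.
// Assumes single-byte characters (strlen counts bytes) and "\n" line endings.
function expandTabsRegex(string $text, int $n = 2): string
{
    // Each match is a run of non-tab, non-newline characters plus one tab.
    // Matches on a line are contiguous, and every expanded chunk has a length
    // that is a multiple of $n, so each chunk starts on a tab stop.
    return preg_replace_callback(
        '/([^\t\n]*)\t/',
        function (array $m) use ($n): string {
            $len = strlen($m[1]);
            return $m[1] . str_repeat(' ', $n - ($len % $n));
        },
        $text
    );
}

echo expandTabsRegex("a\tb\nab\tc\n");
```

Because the pattern never crosses a newline, this variant can be applied to the whole file contents at once rather than line by line.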

3 Answers

6

I wrote this function some time ago; it might be helpful:

function tab2space($line, $tab = 4, $nbsp = FALSE) {
    // Placeholder (BEL) marks inserted padding so that only it becomes &nbsp;
    $pad = $nbsp ? chr(7) : ' ';
    while (($t = mb_strpos($line, "\t")) !== FALSE) {
        $preTab = mb_substr($line, 0, $t);
        // Pad from the current column up to the next tab stop
        $line = $preTab . str_repeat($pad, $tab - (mb_strlen($preTab) % $tab)) . mb_substr($line, $t + 1);
    }
    return $nbsp ? str_replace($pad, '&nbsp;', $line) : $line;
}

It was written to handle multibyte strings. If your data contains only single-byte characters such as digits, you can drop the mb_ prefixes, which will speed up the function.

Note that this is meant to work with one line, so you will need to process the file line by line with fgets instead of the whole file at once.
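
A line-by-line loop with fgets might look like the sketch below. The helpers expandLine and expandStream are hypothetical stand-ins (any single-line expander would do in place of expandLine), the text is assumed to be single-byte, and a php://memory stream is used in place of a real data file so the example is self-contained.

```php
<?php
// Hypothetical single-line expander; assumes single-byte characters.
function expandLine(string $line, int $n = 2): string
{
    $out = '';
    foreach (str_split($line) as $ch) {
        if ($ch === "\t") {
            // strlen($out) is the current column on this line.
            $out .= str_repeat(' ', $n - (strlen($out) % $n));
        } else {
            $out .= $ch;
        }
    }
    return $out;
}

// Read an open stream line by line so columns restart at 0 on every line.
function expandStream($handle, int $n = 2): string
{
    $result = '';
    while (($line = fgets($handle)) !== false) {
        // Strip the line ending first so it never counts as a column.
        $result .= expandLine(rtrim($line, "\r\n"), $n) . "\n";
    }
    return $result;
}

$h = fopen('php://memory', 'w+');
fwrite($h, "a\tb\n\t\tc\n");
rewind($h);
echo expandStream($h);
fclose($h);
```

When the result is printed between <pre> tags, passing it through htmlspecialchars first is advisable if the data may contain <, >, or &.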


1 Comment

"Note that this is meant to work with one line [...]" Ah, that must be why lines (after the first one) that start with a double tab come out one space short. No problem, it was simple enough to fix: $lines=explode("\n", $fc); foreach ($lines as $l) print(tab2space($l)."\n");
1

You can try using the printf function.

Here is an example:

printf("%4d",'37');  // will print '  37' (with 2 spaces before 37)
printf("%6d",'37');  // will print '    37' (with 4 spaces before 37)
printf("%6d",'337'); // will print '   337' (with 3 spaces before 337)

More information about the format string is available here.

(For your information, the same trick is available with C)

2 Comments

But you must remove the tabs first (e.g., with $fc = str_replace("\t", "", $fc);).
"You can try to use printf function": printf("%d", '37'); But how would you extract the value? "(For your information, the same trick is available with C)": Something tells me I know that already. ;-)
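
On the question of how to extract the values: for pipe-delimited rows like the sample table, one option is to split each row on the delimiter, trim each cell, and re-pad it with sprintf. The sketch below assumes '|' delimiters, a fixed cell width of 3, and an input row shaped like the sample (the actual contents of data.txt were not posted); formatRow is a hypothetical name.

```php
<?php
// Hypothetical helper: rebuild one pipe-delimited row with right-aligned cells.
function formatRow(string $row, int $width = 3): string
{
    // Split on pipes and drop the empty pieces outside the outer pipes.
    $cells = array_slice(explode('|', trim($row)), 1, -1);

    $out = '|';
    foreach ($cells as $cell) {
        // %s rather than %d, so that an empty cell stays blank instead of 0.
        $out .= sprintf("%{$width}s", trim($cell)) . '|';
    }
    return $out;
}

echo formatRow("|\t|\t43|\t43|\t7|\t|\t|"), "\n";
```

Unlike true tab expansion, this approach depends on the delimiter and the column width, so it would need adjusting for other formats.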
0

First, get rid of all tabs and spaces:

$fc = str_replace("\t", "", $fc);
$fc = str_replace(" ", "", $fc);

Then apply these replacements. The loops are because the replacements may not hit all possible cases the first time they are run:

//deal with the case of two pipes next to each other
while (strpos($fc, "||") !== false)
    $fc = str_replace("||", "|   |", $fc);

//deal with the case of |XX|
//(=== 1 so the loop also stops if preg_match returns false on error)
while (preg_match('/\|[0-9][0-9]\|/', $fc) === 1)
    $fc = preg_replace('/\|([0-9])([0-9])\|/', '| ${1}${2}|', $fc);

//deal with the case of |X|
while (preg_match('/\|([0-9])\|/', $fc) === 1)
    $fc = preg_replace('/\|([0-9])\|/', '|  ${1}|', $fc);

Since your columns are three characters wide, nothing needs to be done for three-digit numbers (|XXX|).

This should work!

2 Comments

Actually, I had already come up with a similar workaround. I used the following (smaller) code, and while it suits my current needs, it is specific to this one format. It would have to be changed manually for other column widths, delimiters, spaces per tab, etc., and would quickly become untenable: $fc=str_replace("\n\t\t", " ", $fc);     $fc=str_replace("\t\t", " ", $fc);     $fc=str_replace("\t|", " ", $fc);     $fc=str_replace("\t", " ", $fc);
Ah, I see. I was not aware you'd be working with other formats.
