I've read a dozen answers to questions like this, but none of them work for me.
I don't know how to write a regex (it's beyond my abilities), so I need some help.
I'm trying to parse an Apache access log in the default format (I'm using XAMPP and haven't changed the logging configuration).
I've tried the following patterns but all of them miss some lines:
$pat='/(\d+\.\d+\.\d+\.\d+) ([^\s]+) ([^\s]+) \[(\d+)\/(\w+)\/(\d+):(\d{1,2}:\d{1,2}:\d{1,2} ?[\+\-]?\d*)\] "(.*) (HTTP\/\d\.\d)" (\d+) (\d+) "([^"]*)" "([^"]*)"/';
$pat="/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] [\w.]+ \"(\S+) (.*?) (\S+)\" (\S+) (\S+) (\".*?\") (\".*?\")$/";
$pat='/^(\S+) \S+ \S+ \[([^\]]+)\] "([A-Z]+)[^"]*" \d+ \d+ "[^"]*" "([^"]*)"$/m';
$pat='/^(\S+) \S+ \S+ \[(.*?)\] "(\S+).*?" \d+ \d+ "(.*?)" "(.*?)"/';
$pat='/^(\S+)\s \S+\s+ (?:\S+\s+)+ \[([^]]+)\]\s "(\S*)\s? (?:((?:[^"]*(?:\\")?)*)\s ([^"]*)"\s| ((?:[^"]*(?:\\")?)*)"\s) (\S+)\s (\S+)\s "((?:[^"]*(?:\\")?)*)"\s "(.*)"$/';
preg_match($pat, $b, $m);
The first one is the best so far (113 misses on 1,000 records).
Here is a sample of a line it misses:
10.21.142.253 - - [25/Oct/2014:07:42:36 -0200] "GET / HTTP/1.1" 302 - "-" "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0"
It does work on lines like this:
127.0.0.1 - - [25/Oct/2014:08:49:51 -0200] "GET /xampp/ HTTP/1.1" 401 1392 "-" "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0"
The problem seems to be the response size field: the line that is missed has "-" there instead of a number, while the line that works has 1392.
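To check that suspicion, here is a minimal sketch that takes the first pattern and changes only the size group from (\d+) to (\d+|-), then runs it against the two sample lines above. Apache writes "-" for the size when no response body was sent, so this is a guess at one cause, not necessarily the explanation for all 113 misses. The $lines and $results variables are names made up for this test.

```php
<?php
// First pattern from above, with one change: the size group is (\d+|-)
// so that it accepts "-" as well as digits.
$pat = '/(\d+\.\d+\.\d+\.\d+) ([^\s]+) ([^\s]+) \[(\d+)\/(\w+)\/(\d+):(\d{1,2}:\d{1,2}:\d{1,2} ?[\+\-]?\d*)\] "(.*) (HTTP\/\d\.\d)" (\d+) (\d+|-) "([^"]*)" "([^"]*)"/';

// The two sample lines quoted in the question.
$lines = [
    '10.21.142.253 - - [25/Oct/2014:07:42:36 -0200] "GET / HTTP/1.1" 302 - "-" "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0"',
    '127.0.0.1 - - [25/Oct/2014:08:49:51 -0200] "GET /xampp/ HTTP/1.1" 401 1392 "-" "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0"',
];

$results = [];
foreach ($lines as $b) {
    if (preg_match($pat, $b, $m)) {
        // Group 1 is the client IP, group 10 the status code,
        // group 11 the size ("-" when no body was sent).
        $results[] = [$m[1], $m[10], $m[11]];
    } else {
        // null marks a line the pattern still misses.
        $results[] = null;
    }
}

print_r($results);
```

If both entries come back non-null, the size field was at least part of the problem. Any lines that still miss would need to be inspected separately.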