I'm trying to filter down a heap of Apache log files to EXCLUDE all requests that have:

  • The pattern /static/ (this is my images/js folder that I want to exclude)
  • 10.xxx.xxx.xxx (where x is any number - I don't want internal requests included)
  • Any response other than "GET / HTTP/1.1" 200 - only want successes

I have a folder containing multiple .gz files. Is there a Linux command that will do this filtering and save the results in a file called apache_log.txt?

My Linux knowledge is very limited, so I'd greatly appreciate any help!

1 Answer

For each *.gz file: uncompress it, drop the unwanted lines (static files and internal 10.x.x.x addresses), keep only the lines containing "GET / HTTP/1.1" 200, and append what is left to the result file.

for f in *.gz ; do zcat "$f" | grep -v '/static/' | grep -v '10\.[0-9]\+\.[0-9]\+\.[0-9]\+' | grep 'GET / HTTP/1.1" 200' >> apache_log.txt ; done

Or on multiple lines.

for f in *.gz
do
    # uncompress to stdout, then filter line by line
    zcat "$f" \
        | grep -v '/static/' \
        | grep -v '10\.[0-9]\+\.[0-9]\+\.[0-9]\+' \
        | grep 'GET / HTTP/1.1" 200' \
        >> apache_log.txt
done
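
A more portable variant of the same pipeline can be sketched as follows. This is only a sketch under a few assumptions: `filter_logs` is a hypothetical helper name, the client IP is assumed to be the first field on each line (as in Apache's common and combined log formats), and `gzip -dc` is used in place of `zcat` because the `zcat` shipped with macOS looks for a file with `.Z` appended. Anchoring the address pattern at the start of the line also keeps addresses such as 210.x.x.x from being excluded by accident.

```shell
#!/bin/sh
# filter_logs: hypothetical helper, not part of the original answer.
# Assumes the client IP is the first field of every log line.
filter_logs() {
    for f in "$@"; do
        # gzip -dc writes the uncompressed data to stdout
        gzip -dc "$f"
    done \
        | grep -v '/static/' \
        | grep -Ev '^10\.[0-9]+\.[0-9]+\.[0-9]+ ' \
        | grep -F 'GET / HTTP/1.1" 200'
}
```

It would be called as `filter_logs *.gz > apache_log.txt`. Redirecting with `>` once, rather than appending with `>>` inside the loop, means a second run does not duplicate earlier results.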

6 Comments

Thanks for the quick reply. I am getting this error for each file in the directory (it seems to have a .Z appended to the filename): zcat: access_log.2011072803.gz.Z: No such file or directory
Forgot the do, sorry: Try for f in *.gz ; do ls -l $f ; done to list the files.
That worked gave me a list (not complete) -rwxr-xr-x 1 chrispaynter staff 463010049 12 Jul 00:01 access_log.2011071103.gz -rwxr-xr-x 1 chrispaynter staff 502073947 12 Aug 10:02 access_log.2011071203.gz -rwxr-xr-x 1 chrispaynter staff 468224970 14 Jul 00:01 access_log.2011071303.gz -rwxr-xr-x 1 chrispaynter staff 464609043 12 Aug 10:02 access_log.2011071403.gz -rwxr-xr-x 1 chrispaynter staff 483168852 16 Jul 00:00 access_log.2011071503.gz
So there's no reason the for loop should fail with the zcat command, or that a .Z should magically get appended to the filenames.
I just tried the same command with zmore instead and it worked. I should have mentioned I am using a Mac; not sure if that makes a difference or not. Could I ask one more favor? Could you please show me what to add if I only want to output the IP address of the request?
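
On the follow-up about printing only the IP address: assuming the logs use Apache's default layout, where the client address is the first whitespace-separated field, one possible approach is to add an `awk` stage at the end of the pipeline. The name `extract_ips` below is hypothetical and only there to keep the example self-contained.

```shell
#!/bin/sh
# extract_ips: hypothetical helper; reads log lines on stdin and
# prints the first whitespace-separated field of each line.
# Assumes the client IP is that first field.
extract_ips() {
    awk '{ print $1 }'
}
```

In the loop from the answer, this would go after the last grep, for example `... | grep 'GET / HTTP/1.1" 200' | extract_ips >> apache_log.txt`. Piping the final file through `sort -u` would then reduce it to one line per distinct address.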