
Parsing Apache access log files using PHP


This is a bit dated, but I still come back to it: a small script (using regex) that parses Apache access log files. The data breakdown required:

Client IP Address
Server Date / Time [SPACE]
"GET /path/to/page
HTTP/Type Request"
Success Code
Bytes Sent To Client
Referer
Client Software

Here's the code that does all the legwork:

// Read one record per line. Splitting the whole file on anything that looks like
// an IP address also splits on dotted version numbers inside user agents.
$ac_arr = file('/path/to/copy/access_log', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

$new_format = array();
$each_rec = 0;
foreach ($ac_arr as $all) {
  // client address: the first field on the line
  if (!preg_match("/^(\S+)/", $all, $match)) {
    continue;
  }
  $ip = $match[1];
  // access time: [^\]] stops at the first closing bracket
  if (!preg_match("/\[([^\]]+)\]/", $all, $match)) {
    continue;
  }
  $access_time = $match[1];
  // request: page and protocol follow the method inside the first quoted string
  if (!preg_match("/\"[A-Z]{3,7} ([^\"]+)\"/", $all, $match)) {
    continue;
  }
  $http = $match[1];
  $link = array_pad(explode(" ", $http), 2, "");
  // drop everything up to the end of the request, so digits in the URL
  // cannot be mistaken for the success code
  $all = substr($all, strpos($all, $match[0]) + strlen($match[0]));
  // success code and bytes sent come straight after the request
  if (!preg_match("/^ ([0-9]{3}) ([0-9]+|-)/", $all, $match)) {
    continue;
  }
  $success_code = $match[1];
  $bytes = $match[2];
  // referer and browser are the next two quoted strings
  preg_match_all("/\"([^\"]*)\"/", $all, $quoted);
  $ref = isset($quoted[1][0]) ? $quoted[1][0] : "-";
  $browser = isset($quoted[1][1]) ? $quoted[1][1] : "-";
  // escape before printing: page, referer and browser are supplied by the client
  $out = array_map('htmlspecialchars', array($ip, $access_time, $link[0], $link[1], $success_code, $bytes, $ref, $browser));
  vprintf("<br>IP: %s<br>Access Time: %s<br>Page: %s<br>Type: %s<br>Success Code: %s<br>Bytes Transferred: %s<br>Referer: %s <br>Browser: %s<hr>", $out);
  // the $new_format line shown further down belongs here, inside the loop
}
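
A more compact variation, offered as a sketch rather than the script's original method: match each line against one anchored pattern for Apache's combined log format, so every field is pinned to its position and digits in the URL cannot leak into the success code. The function name `parse_access_line`, the array keys and the sample line are made up for this example. The pattern assumes the standard combined layout (host, identity, user, time, request, status, bytes, referer, user agent), ignores any extra trailing fields, and does not handle escaped quotes inside quoted fields.

```php
<?php
// Hypothetical helper: parse one line of an Apache "combined" format log.
// Returns an associative array of fields, or null if the line does not match.
function parse_access_line($line)
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // The request is normally "METHOD page protocol"; pad in case parts are missing.
    list($method, $page, $type) = array_pad(explode(' ', $m[3], 3), 3, '');
    return array(
        'ip'           => $m[1],
        'access_time'  => $m[2],
        'method'       => $method,
        'page'         => $page,
        'type'         => $type,
        'success_code' => (int) $m[4],
        'bytes'        => $m[5] === '-' ? 0 : (int) $m[5],
        'referer'      => $m[6],
        'browser'      => $m[7],
    );
}

// Sample line (host, URL and user agent are invented for the example).
$line = '192.0.2.10 - - [02/Jul/2014:06:25:36 +0000] "GET /jobs?max=400 HTTP/1.0" 301 26 "-" "ExampleBot/2.1"';
$rec = parse_access_line($line);
print($rec['success_code'] . " " . $rec['page'] . "\n"); // 301 /jobs?max=400
```

Because the success code is captured only in the position right after the quoted request, the `400` in the query string is left alone.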

Once each record is parsed into fields, it can be stored in a format that is friendlier to database import: comma-delimited, pipe-delimited, tab-delimited, etc. This line goes inside the loop, once the fields have been extracted:

$new_format[$each_rec++] = "$ip\t$access_time\t$link[0]\t$link[1]\t$success_code\t$bytes\t$ref\t$browser";
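
If the import target is a MySQL DATETIME column, the Apache timestamp (for example `02/Jul/2014:06:25:36 +0000`) needs converting first. A minimal sketch using PHP's DateTime class: the helper name `apache_time_to_mysql` is made up here, and normalising to UTC is an assumption about how the times should be stored.

```php
<?php
// Hypothetical helper: convert an Apache log timestamp such as
// "02/Jul/2014:06:25:36 +0000" into "Y-m-d H:i:s" (UTC) for a DATETIME column.
function apache_time_to_mysql($access_time)
{
    // d/M/Y:H:i:s matches "02/Jul/2014:06:25:36"; O matches the "+0000" offset.
    $dt = DateTime::createFromFormat('d/M/Y:H:i:s O', $access_time);
    if ($dt === false) {
        return null; // not in the expected format
    }
    $dt->setTimezone(new DateTimeZone('UTC'));
    return $dt->format('Y-m-d H:i:s');
}

print(apache_time_to_mysql('02/Jul/2014:06:25:36 +0000') . "\n"); // 2014-07-02 06:25:36
```

The converted value could replace `$access_time` in the delimited line before it is written out.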

Now for creating a new file that is ready for importing into MySQL:

if ($fhandle = fopen("/path/to/import_file.txt", "w")) {
  foreach ($new_format as $data) {
    fputs($fhandle, "$data\n");
  }
  fclose($fhandle);
}
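
For the comma-delimited variant, joining fields by hand breaks as soon as a referer or user agent contains a comma or a quote. PHP's built-in `fputcsv` quotes such fields. A sketch, assuming each record is kept as an array of fields rather than a pre-joined string; the `$rows` data and the use of `php://memory` in place of a real file are only for illustration.

```php
<?php
// Illustrative records: each one is an array of fields, not a joined string.
$rows = array(
    array('192.0.2.10', '02/Jul/2014:06:25:36 +0000', '/jobs?max=400', 'HTTP/1.0',
          301, 26, '-', 'ExampleBot/2.1 (test, "quoted")'),
);

// php://memory keeps the example self-contained; use a real path in practice.
$fhandle = fopen('php://memory', 'w+');
foreach ($rows as $fields) {
    // Fields containing commas, quotes or spaces are enclosed automatically.
    fputcsv($fhandle, $fields, ',', '"', '\\');
}

// Read the result back to show what was written.
rewind($fhandle);
$csv = stream_get_contents($fhandle);
fclose($fhandle);
print($csv);
```

The same loop works with a tab or pipe as the third argument to `fputcsv` if a different delimiter is preferred.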


Comments (3)
getting wrong value for status code
3 Wednesday, 10 September 2014 04:38
SR - - [02/Jul/2014:06:25:36 +0000] "GET /browse/jobs/human-resources/all/all?contract=permanent&salary_range%5Bmax%5D=400 HTTP/1.0" 301 26 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +" hosting_site=mpagesiglt pid=15300 request_time=2319996

I have the log line above, but I am getting a different value for the status code. Can you suggest another regex for this log? Also, after repeated parsing it stops parsing correctly. Can you suggest an alternative?
re: testing component
2 Tuesday, 06 March 2012 09:09
Thanks - it works well.
testing component
1 Tuesday, 06 March 2012 09:07
just testing the comment component :-)

Copyright © 1999 - 2018 Virtual Helpme | Technical Support and Maintenance | Original Template: Allrounder