Windows Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 4.11

Web Log Analysis

Windows Tutorials - Herong's Notes © 2006 Dr. Herong Yang

This chapter describes:

  • Web Log File
  • Analog - Web Log File Analysis Tool
  • Configuring Analog to Run My Logs

Web Log File

Web Log File: A file produced by a Web server to record activities on the Web server. It usually has the following features:

  • The log file is a text file. All of its records have the same format.
  • Each record in the log file represents a single HTTP request.
  • A log file record contains important information about a request: the client-side host name or IP address, the date and time of the request, the requested file name, the HTTP response status and size, the referring URL, and the browser information.
  • A browser may send multiple HTTP requests to the Web server to display a single Web page. This is because a Web page needs not only the main HTML document but possibly also additional files, like images and JavaScript files. The main HTML document and each additional file require separate HTTP requests.
  • Each Web server has its own log file format; see the log file examples below.
  • If your Web site is hosted by an ISP (Internet Service Provider), they may not keep the log files for you, because log files can be very large if the site is very busy. Instead, they may give you only statistics reports generated from the log files.
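
To illustrate the point that displaying a single page can produce several log records, here is a minimal Python sketch that separates page requests from requests for supporting files. It assumes, purely for illustration, that a file extension is enough to tell the two apart; the names SUPPORT_EXTENSIONS and is_page_request are hypothetical and the list of paths is illustrative.

```python
import posixpath

# Hypothetical set of extensions treated as supporting files (an assumption).
SUPPORT_EXTENSIONS = {".gif", ".jpg", ".png", ".ico", ".css", ".js"}


def is_page_request(path: str) -> bool:
    """Return True when the requested path does not look like a supporting file."""
    extension = posixpath.splitext(path)[1].lower()
    return extension not in SUPPORT_EXTENSIONS


# Illustrative requested paths, one per log record.
requested_paths = [
    "/index.html",
    "/images/logo.gif",
    "/style.css",
    "/js/ads.js",
    "/search.php",
]

page_count = sum(1 for path in requested_paths if is_page_request(path))
print(f"{len(requested_paths)} requests, {page_count} of them for pages")
```

A real analysis tool would likely use a configurable rule rather than a fixed set of extensions.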

1. IIS (Internet Information Services) Samples: Here are some sample records from an IIS server log file:

02:49:12 127.0.0.1 GET / 200
02:49:35 127.0.0.1 GET /index.html 200
03:01:06 127.0.0.1 GET /images/sponsered.gif 304
03:52:36 127.0.0.1 GET /search.php 200
04:17:03 127.0.0.1 GET /admin/style.css 200
05:04:54 127.0.0.1 GET /favicon.ico 404
05:38:07 127.0.0.1 GET /js/ads.js 200

The record format is very simple. It has five fields: time, client IP address, request command (the HTTP method), requested file, and response status code.
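
As a minimal sketch, assuming each record has exactly the five space-separated fields shown above, a record could be split in Python as follows. The names IisRecord and parse_iis_line are hypothetical, chosen only for this illustration; log files with more fields or with header lines are not handled.

```python
from typing import NamedTuple


class IisRecord(NamedTuple):
    """One record in the five-field format shown above (hypothetical name)."""
    time: str
    client_ip: str
    method: str
    path: str
    status: int


def parse_iis_line(line: str) -> IisRecord:
    """Split one record into its five space-separated fields."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    time, client_ip, method, path, status = fields
    return IisRecord(time, client_ip, method, path, int(status))


# One of the sample records above.
record = parse_iis_line("05:04:54 127.0.0.1 GET /favicon.ico 404")
print(record.path, record.status)
```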

2. Apache Samples: Here are some sample records from an Apache server log file:

192.168.198.92 - - [22/Dec/2002:23:08:37 -0400] "GET 
   / HTTP/1.1" 200 6394 www.yahoo.com 
   "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1...)" "-"
192.168.198.92 - - [22/Dec/2002:23:08:38 -0400] "GET 
   /images/logo.gif HTTP/1.1" 200 807 www.yahoo.com 
   "http://www.some.com/" "Mozilla/4.0 (compatible; MSIE 6...)" "-"
192.168.72.177 - - [22/Dec/2002:23:32:14 -0400] "GET 
   /news/sports.html HTTP/1.1" 200 3500 www.yahoo.com 
   "http://www.some.com/" "Mozilla/4.0 (compatible; MSIE ...)" "-"
192.168.72.177 - - [22/Dec/2002:23:32:14 -0400] "GET 
   /favicon.ico HTTP/1.1" 404 1997 www.yahoo.com 
   "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3)..." "-"
192.168.72.177 - - [22/Dec/2002:23:32:15 -0400] "GET 
   /style.css HTTP/1.1" 200 4138 www.yahoo.com 
   "http://www.yahoo.com/index.html" "Mozilla/5.0 (Windows..." "-"
192.168.72.177 - - [22/Dec/2002:23:32:16 -0400] "GET 
   /js/ads.js HTTP/1.1" 200 10229 www.yahoo.com 
   "http://www.search.com/index.html" "Mozilla/5.0 (Windows..." "-"
192.168.72.177 - - [22/Dec/2002:23:32:19 -0400] "GET 
   /search.php HTTP/1.1" 400 1997 www.yahoo.com 
   "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; ...)" "-"

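The record format is more complex, and the records are very long, so I have broken each one across multiple lines. Some fields are easy to understand: client IP address, date and time, request command line, response status and size, referring URL, and browser name. The two "-" fields after the client IP address are the remote logname and the authenticated user name of the Apache Common Log Format; both are empty here. I don't know what the other fields are.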
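
A minimal sketch of how such records might be parsed in Python, assuming the field order seen in the samples above and records joined back into single lines. The pattern and the names RECORD_PATTERN and parse_apache_line are hypothetical; the group called "extra" is only a placeholder for the unidentified field after the response size, and the trailing "-" field is ignored.

```python
import re

# Field layout inferred from the sample records above (an assumption).
# Group names are hypothetical; "extra" is the unidentified field that
# follows the response size.
RECORD_PATTERN = re.compile(
    r'(?P<client>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'(?P<extra>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_apache_line(line: str) -> dict:
    """Return the named fields of one record, or raise ValueError."""
    match = RECORD_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized record: {line!r}")
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # The request line holds the method, the requested file, and the protocol.
    fields["method"], fields["path"], fields["protocol"] = fields["request"].split()
    return fields


# The fourth sample record, joined back into a single line.
sample = (
    '192.168.72.177 - - [22/Dec/2002:23:32:14 -0400] '
    '"GET /favicon.ico HTTP/1.1" 404 1997 www.yahoo.com '
    '"-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3)..." "-"'
)
parsed = parse_apache_line(sample)
print(parsed["client"], parsed["path"], parsed["status"])
```

Named groups keep the field positions in one place, so the pattern can be adjusted if the server uses a different log format.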

(Continued on next part...)

Dr. Herong Yang, updated in 2006