Apache access log structure

Recommended reading:

https://httpd.apache.org/docs/2.4/logs.html

A typical access log structure looks as follow

LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

%h: IP of the visitor (it might be a VPN IP for instance).

%l RFC 1413 identity of the client (not check unless IdentityCheck set to On). Highly unreliable.

%u user id. If the document is not password protected, the value will be set to – as above.

%t The time the request was received

\”%r\” Request that the server received, between double quote. It starts with the method, then the requested resource, then the protocol (e.g. HTTP/2.0)

%>s The status code that the server sends back to the client

%b size of the object returned byt the server to the client.

Example:

213.18X.X.XXX – – [11/Oct/2019:11:52:08 +0300] “GET /index.phpXXXXXXX HTTP/2.0” 303 314 “https://example.com/” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36” (38FF183F-0.107)

You can see the two dash represented the fact that, for the first dash, the RFC 1413 identity is not used, and for the second, that the resource is public and no identification is requested.

Here you can see there are extra info after the size of the object returned. This indicates the server is using combined log format which goes as follow:

 "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" 

\”%{Referer}i\” being the referer (e.g. a link to your site on another web site or a form with a POST action to your server)

\”%{User-agent}i\” the user agent (e.g. the browser specification used to navigate your site)

In a more readable way, you will often see your access log as follow:

IP – – [DATE and TIME] “METHOD /URLREQUESTED PROTOCOL” SERVER_RESPONSE_CODE OBJECT_SIZE “REFERER_URL” “USER_AGENT”