Apache access log structure

Recommended reading:

https://httpd.apache.org/docs/2.4/logs.html

A typical access log structure looks as follow

LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

%h: IP of the visitor (it might be a VPN IP for instance).

%l RFC 1413 identity of the client (not check unless IdentityCheck set to On). Highly unreliable.

%u user id. If the document is not password protected, the value will be set to – as above.

%t The time the request was received

\”%r\” Request that the server received, between double quote. It starts with the method, then the requested resource, then the protocol (e.g. HTTP/2.0)

%>s The status code that the server sends back to the client

%b size of the object returned byt the server to the client.

Example:

213.18X.X.XXX – – [11/Oct/2019:11:52:08 +0300] “GET /index.phpXXXXXXX HTTP/2.0” 303 314 “https://example.com/” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36” (38FF183F-0.107)

You can see the two dash represented the fact that, for the first dash, the RFC 1413 identity is not used, and for the second, that the resource is public and no identification is requested.

Here you can see there are extra info after the size of the object returned. This indicates the server is using combined log format which goes as follow:

 "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" 

\”%{Referer}i\” being the referer (e.g. a link to your site on another web site or a form with a POST action to your server)

\”%{User-agent}i\” the user agent (e.g. the browser specification used to navigate your site)

In a more readable way, you will often see your access log as follow:

IP – – [DATE and TIME] “METHOD /URLREQUESTED PROTOCOL” SERVER_RESPONSE_CODE OBJECT_SIZE “REFERER_URL” “USER_AGENT”

Understanding joomla .htaccess

IndexIgnore *

## No directory listings
<IfModule autoindex>
IndexIgnore *
</IfModule>

The indexIgnore directive prevent files in a directory with index on from being displayed in the directory listings. The star match all files which will prevent any file in the public directory to be displayed in a directory listings even if it directory listings is on ( apache.conf with the indexes option).

Options +FollowSymlinks

# The line 'Options +FollowSymLinks' may cause problems with some server configurations.
# It is required for the use of mod_rewrite, but it may have already been set by your
# server administrator in a way that disallows changing it in this .htaccess file.
# If using it causes your site to produce an error, comment it out (add # to the
# beginning of the line), reload your site in your browser and test your sef urls. If
# they work, then it has been set by your server administrator and you do not need to
# set it here.
#Can be commented out if causes errors, see notes above.
Options +FollowSymlinks
Options -Indexes

The options directive allow to set the directory behavior. Here FollowSymLinks indicate to follow the symbolic links in the directory (in that case this is the public directory as there is no directory specified).

Indexes indicates that if a directory path is entered and there is no index file in this directory, then a directory listing will be displayes. The minus sign preceding the Indexes option indicate that we remove this behavior, (meaning that it will return a 404?)

This settings might have been set by the host on your server so they might cause issue, in that case you can comment those lines, as stated in the note.

## Mod_rewrite in use.
RewriteEngine On

Turn the rewrite engine on so you can perform redirection from .htaccess according to the conditions you set.

Rewrite rules to block out some common exploits.

## Begin - Rewrite rules to block out some common exploits.
# If you experience problems on your site then comment out the operations listed
# below by adding a # to the beginning of the line.
# This attempts to block the most common type of exploit `attempts` on Joomla!
#
# Block any script trying to base64_encode data within the URL.
RewriteCond %{QUERY_STRING} base64_encode[^(]*\([^)]*\) [OR]
# Block any script trying to set a PHP GLOBALS variable via URL.
RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
# Block any script trying to modify a _REQUEST variable via URL.
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
# Return 403 Forbidden header and show the content of the root home page
RewriteRule .* index.php [F]

As a platform concerned with security, joomla try to deflect some common attack straight from the start.

Let’s have a look at the rewriteCond directives that they use.

Rewrite condition syntax:
RewriteCond TestString CondPattern [flags]

The Teststring is set to match the query string %{QUERY_STRING}

The condition pattern CondPattern is a regular expression.

Base64_encode:

base64_encode is a php function that will encode a string in base 64

Now let’s have a look at the regular expression that follows:

[^(] in the square brakets is the expression to evaluate. The circumflex accent indicate that it is a negative match (match everything NOT in te list), followed by the list of character to NOT match, here a single opening bracket. The star after the bracket indicates to do that any number of time. We have then an escape string followed by an opening bracket to match an opening bracket (otherwise we would just start a marked sub expression). Then [^)]* will match any character that is not a closing bracket any number of time. And we finish with an escaped closing bracket. So our condition tell us to catch any string ” base64_encode ” followed by any suit of characters containing any string between brackets.

https://regexr.com/

Let’s say this evaluate to true, the rewrite rule will then apply and the whole URL including the query strng will be substituted by index.php and it will be displayed instead.

Rewrite rule syntax:

. match any character and * indicatesto do that any number of time.

RewriteRule Pattern Substitution [flags]

The [F] flag indicate to return a 403 response (forbidden)

The flag is [OR] which indicates to evaluate the next condition instead (instead of skipping it?) (or next condition)

GLOBALS and _Request

Here it will match the string GLOBALS. The opening bracket indicate that we will match a group. This group will start with an equal sign OR ( or being the pipe symbol ‘|’) an (escaped) square bracket OR a percent signed folowed by any number or upper case alphabetical characters. The numbers within the curly brackets indicates that the preceding rule shall at least 0 but nor more than 2 times. Check the figure below to have an illustrated example.

The following condition does the same buit with the string _REQUEST instead of GLOBALS

Catching script injection

And there is one more condition to catch the query string containing a script tag: (<|%3C)([^s]s)+cript.(>|%3E)

%3C is the unicode for less than. So the first group will catch “less than” character. the second group will catch any character preceding a s and the s character one or more time (that is the plus, the star * catch zero or more time). Then it will catch “cript”. The dot “.” match any character and the last group match the closing “greater than” character .

That was a big part! Now the rest should be easy peasy.

So we have a rewrite base that indicates with what string the URL should be prefixed. In our case it is just a slash

RewriteBase /

SEF Section


RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]


Pattern: .* any URL (any suit of any character)

Substitution: – (dash) indicates that no substitution should be performed.

Flag:

E= Causes an environment variable VAR to be set (to the value VAL if provided). The form !VAR causes the environment variable VAR to be unset.

HTTP_AUTHORIZATION: is an environment variable sent in the header of the http request.

%{HTTP:Authorization} obtains the actual value from the header.

Read more about authentication.

RewriteCond %{REQUEST_URI} !^/index\.php

%{REQUEST_URI}  match against the full URL-path in a per-directory RewriteRule

┬áThe NOT character (‘!‘) as a pattern prefix inverse the condition, so it returns true if it does not match the following pattern. Here, the condition returns true if the full URL path is NOT index.php

%{REQUEST_FILENAME} !-f if the requested file is not a file (so if does not exist).

RewriteCond %{REQUEST_FILENAME} !-d or not a directory (so if such directory does not exist)

RewriteRule .* index.php [L]

then rewrite the full path to index.php

[L] is the Last flag which stops the rewriting process.