Understanding joomla .htaccess

IndexIgnore *

## No directory listings
<IfModule autoindex>
IndexIgnore *
</IfModule>

The indexIgnore directive prevent files in a directory with index on from being displayed in the directory listings. The star match all files which will prevent any file in the public directory to be displayed in a directory listings even if it directory listings is on ( apache.conf with the indexes option).

Options +FollowSymlinks

# The line 'Options +FollowSymLinks' may cause problems with some server configurations.
# It is required for the use of mod_rewrite, but it may have already been set by your
# server administrator in a way that disallows changing it in this .htaccess file.
# If using it causes your site to produce an error, comment it out (add # to the
# beginning of the line), reload your site in your browser and test your sef urls. If
# they work, then it has been set by your server administrator and you do not need to
# set it here.
#Can be commented out if causes errors, see notes above.
Options +FollowSymlinks
Options -Indexes

The options directive allow to set the directory behavior. Here FollowSymLinks indicate to follow the symbolic links in the directory (in that case this is the public directory as there is no directory specified).

Indexes indicates that if a directory path is entered and there is no index file in this directory, then a directory listing will be displayes. The minus sign preceding the Indexes option indicate that we remove this behavior, (meaning that it will return a 404?)

This settings might have been set by the host on your server so they might cause issue, in that case you can comment those lines, as stated in the note.

## Mod_rewrite in use.
RewriteEngine On

Turn the rewrite engine on so you can perform redirection from .htaccess according to the conditions you set.

Rewrite rules to block out some common exploits.

## Begin - Rewrite rules to block out some common exploits.
# If you experience problems on your site then comment out the operations listed
# below by adding a # to the beginning of the line.
# This attempts to block the most common type of exploit `attempts` on Joomla!
#
# Block any script trying to base64_encode data within the URL.
RewriteCond %{QUERY_STRING} base64_encode[^(]*\([^)]*\) [OR]
# Block any script trying to set a PHP GLOBALS variable via URL.
RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
# Block any script trying to modify a _REQUEST variable via URL.
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
# Return 403 Forbidden header and show the content of the root home page
RewriteRule .* index.php [F]

As a platform concerned with security, joomla try to deflect some common attack straight from the start.

Let’s have a look at the rewriteCond directives that they use.

Rewrite condition syntax:
RewriteCond TestString CondPattern [flags]

The Teststring is set to match the query string %{QUERY_STRING}

The condition pattern CondPattern is a regular expression.

Base64_encode:

base64_encode is a php function that will encode a string in base 64

Now let’s have a look at the regular expression that follows:

[^(] in the square brakets is the expression to evaluate. The circumflex accent indicates that it is a negative match (match everything NOT in the list), followed by the list of character to NOT match, here a single opening bracket. The star after the bracket indicates to do that any number of time. We have then an escape string followed by an opening bracket to match an opening bracket (otherwise we would just start a marked sub expression). Then [^)]* will match any character that is not a closing bracket any number of time. And we finish with an escaped closing bracket. So our condition tell us to catch any string ” base64_encode ” followed by any suit of characters containing any string between brackets.

https://regexr.com/

Let’s say this evaluate to true, the rewrite rule will then apply and the whole URL including the query strng will be substituted by index.php and it will be displayed instead.

Rewrite rule syntax:

. match any character and * indicates to do that any number of time.

RewriteRule Pattern Substitution [flags]

The [F] flag indicate to return a 403 response (forbidden)

The flag is [OR] which indicates to evaluate the next condition instead (instead of skipping it?) (or next condition)

GLOBALS and _Request

Here it will match the string GLOBALS. The opening bracket indicate that we will match a group. This group will start with an equal sign OR ( or being the pipe symbol ‘|’) an (escaped) square bracket OR a percent signed folowed by any number or upper case alphabetical characters. The numbers within the curly brackets indicates that the preceding rule shall at least 0 but nor more than 2 times. Check the figure below to have an illustrated example.

The following condition does the same but with the string _REQUEST instead of GLOBALS

Catching script injection

And there is one more condition to catch the query string containing a script tag: (<|%3C)([^s]s)+cript.(>|%3E)

%3C is the unicode for less than. So the first group will catch “less than” character. the second group will catch any character preceding a s and the s character one or more time (that is the plus, the star * catch zero or more time). Then it will catch “cript”. The dot “.” match any character and the last group match the closing “greater than” character .

That was a big part! Now the rest should be easy peasy.

So we have a rewrite base that indicates with what string the URL should be prefixed. In our case it is just a slash

RewriteBase /

SEF Section


RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]


Pattern: .* any URL (any suit of any character)

Substitution: – (dash) indicates that no substitution should be performed.

Flag:

E= Causes an environment variable VAR to be set (to the value VAL if provided). The form !VAR causes the environment variable VAR to be unset.

HTTP_AUTHORIZATION: is an environment variable sent in the header of the http request.

%{HTTP:Authorization} obtains the actual value from the header.

Read more about authentication.

RewriteCond %{REQUEST_URI} !^/index\.php

%{REQUEST_URI}  match against the full URL-path in a per-directory RewriteRule

 The NOT character (‘!‘) as a pattern prefix inverse the condition, so it returns true if it does not match the following pattern. Here, the condition returns true if the full URL path is NOT index.php

%{REQUEST_FILENAME} !-f if the requested file is not a file (so if does not exist).

RewriteCond %{REQUEST_FILENAME} !-d or not a directory (so if such directory does not exist)

RewriteRule .* index.php [L]

then rewrite the full path to index.php

[L] is the Last flag which stops the rewriting process.

My 2cents on Google position on Responsive images

Google insists on serving images at the right size using srcset. It has two advantages, the first one is that the page render faster ( as the picture does not have to be resized by the browser to be displayed, if I understand correctly) and that the content in the page does not jump around when the picture is rendered in the browser.

Avoiding content bouncing around when page is rendered

To avoid bad user experience while your page is rendered on the user browser, namely, to avoid your content position being recalculated when the image is loaded, you have two options: one is to set the image width and height in the style and the other, in case of lazyloading is to load a placeholder of the appropriate size in place of the picture. If you are using pictures of a standard size in your page, you justhave to load the picture once to be all set. Both technics are easy to implement.

When setting the width and height of the picture in CSS, you can then serve a larger image to accomodate high dpi screen. Although it leads to a better user experience in terms of quality of image, at the expense of a some milliseconds rendering, google page speeds insight will penalize the practice in its scoring system.

The team at google does not specify why they are so insistant on this. It is a benefit for the user and for the webmaster as the user experience improves but I assume the guys at google might have a special interest in this.

Actually one example they are using to illustrate the consequence of not using properly sized picture is when the user go to click on a link on a page and an image starts rendering, pushing the content down and leading the user to click on whatever was pushed where is mouse or finger landed. The new place being potentially a google ad. On top of a bad user experience, this “practice”, be it volontary or not lead to bad quality trafic to google ad, and I assume this drag the price of google ad clicks down.

Of course google wants to promote quality sites that provide quality user experience, but I don’t see them insisting that much on other practices so I suspect there should be an extra catch here.

As an example of practices leading to bad user experience is the sudden trend there was in web design to display images that slowly, almost imperceptibly zoom in. It can be onload, or on mouse over, etc. I understand the desire to make the page dynamic and modern by displaying sleek animation but the fact is that it gives the sensation of vertigo and unease (think about how you create this sensation in a movie for instance, that’s what you gonna use).

Vertigo stairs from Alfred Hitchcok vertigo movie

Jaws, by Steven Spielberg

So you are using javascript, load time, etc and gives your user the feeling he is sick and needs to take a break from computer right now. But he won’t be clicking inadvertantly on Google ad so that’s ok.

[EDIT: it seems this nonsense trend has already faded to oblivion. ]

Google I/O 2019 take away

The number one take away is to have a light weight website that is mobile friendly.

We have covered some of the very important topic in previous articles about optimization so we will focus here on the surprise features and highlight some point we have skipped previously.

Schema markup for rich results

One of the topic I didn’t cover so far is schema markup for rich results. Google likes it as it allow its search engine to extract the useful information from your page and display that on their search resuls. One can ponder why we should provide content for Google. There is the hope of figuring prominenttly and rank higher but if people get their answer directly from the search page, what economical model do you have left to make money from your website, unless you are paid directly by google. That would be the case for a recipe site.

One thing that has been highlighted in the talk is that schema information might be used with assistive technology, to walk you through a recipe for instance.

Concerning product page, if you are competitive, you might attract customer with your super price or the super brands you have available at your store.

So that’s the two core benefits you can expect from product mark-up. Event mark up might actually be useful to provide straight from the search information about your live event with clear information for google about when the information will be out of date. Leading for a better user experience, unlike the recipe case where you don’t really your recipe to be displayed directly on search page unless you got some compensation from Google for using your content.

For product markup, in sector where negotiation is important, you might not want to have any kind of price displayed publicly to avoid giving any cue to your competitors and allow you larger freedom regarding your pricing. For instance the price might largely vary depending on the quantity of a single item or the total amount of the order due to saving on scale, distance , options, it would be a daunting task to give all available options and integrate all factors influencing the final price and integrate that in a schema markup, plus as seen above it is totally counter productive in terms of conversion as your competition would only have to check your site to propose a more competitive offer.

Needless to say that all schema markup are not made equal and you might consider the benefit of implementing specific schema markup depending on your activity.

Users like large, high definition images

Google aknowledge that people like large, high definition images and will feature those images in their search.

Using proper compression will help reduce the bandwidth necessary to load the image.

Google will start to show in search 3D models directly available in the search results as AR.

There is a talk dedicated to adding 3D to your website:

All those topic needs further investigation, needless to say that product markup is already implemented on my some of my sites.

Light weight website

Time is money and on the web every millisecond of load time count.

Javascript libraries have become too heavy and css frameworks do not use the most advance capabilities of modern css.

Time to refresh your code and crack down on resource call, scripting and rendering.

One of the main reason I used bootsrap was the grid system, simple and efficient. Nowadays css includes grid-column markup:

https://developer.mozilla.org/en-US/docs/Web/CSS/grid-column

https://drafts.csswg.org/css-grid/#propdef-grid-column

The other good reason I had to use Bootstrap framework was the modals. I like their modals, so much I built a webshop with products popping up in modals. The good thing is that you don’t have to go back and forth product pages and category pages and the down side is that category pages code started to grow a tad whith the store products line up.

Avoid webfonts

Replace glyphicon and fontawesome with css sprites use Hover css selector to change color, icon or size on hover and point to another version of the icon etc.

Inline your scripts

Load essential css and js inline for faster rendering

Avoid libraries. Inline your scripts. Defer everything that can be defered.

Optimize image serving

Use fixed size images and use srcset to load properly sized mobile version.

Lazyload images.

One point I need to clarify is the following: properly displaying compressed images on high DPI screen requires to serve an image larger than the size it will be displayed to accomodate the higher density of screen pixel. The questions being: does it impact user experience positively? and Does it impact SEO positively? And John Müller would proabably answers that only the former matters and he would be right in that respect.

How to test locally hosted page with google tool before release

There are two solutions available:

One is to simply test the code directly on the test page. The three google search testing tools, namely Mobile Friendly tool, AMP testing page, and the Rich Snippet/ Strucure Data testing tool offer this possibility. You have to keep in mind that external resources such as your css and js files should also be available for rendering in the case of the Mobile friendly tool.

You can modify your code directly online and test your modification which is a really nice feature.

If you want to integrate the test in your workflow but your pages are hosted behind a firewall or are simply available on your local machine, google wrote a nice article on how to make this file available for testing online.

https://developers.google.com/search/docs/guides/debug