Sources of keywords

Google Search Console

It lists the queries for which Google Search returned your site as a result.

Google Ad Keyword ideas

Primarily meant to help you find ideas for your Google Ad campaigns, it is also a good source of keyword ideas for your SEO. In addition, you can assess the value of a keyword using the volume forecast, the recommended minimum bid, and the competition rating.

Latent Semantic Analysis

While the two keyword sources above are based on user queries (the most important signal), we can also extract valuable keywords from the pages already online. Latent semantic analysis makes it possible to find the most statistically significant keywords in a group of pages by comparing them with a larger set of documents that is supposed to be representative of word frequencies in the language.
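As an illustration of the comparison idea, here is a minimal sketch that ranks words by how over-represented they are in a group of pages relative to a reference set. It uses a simple smoothed log-ratio of relative frequencies, which is only a stand-in for the comparison described above and not a full latent semantic analysis with matrix decomposition. The function names, the naive tokenizer, and the tiny sample texts are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only; a deliberately naive tokenizer.
    return re.findall(r"[a-z]+", text.lower())


def distinctive_terms(topic_pages, reference_docs, top_n=5):
    """Rank words over-represented in topic_pages relative to reference_docs."""
    topic = Counter(w for page in topic_pages for w in tokenize(page))
    ref = Counter(w for doc in reference_docs for w in tokenize(doc))
    vocab = set(topic) | set(ref)
    # Add-one smoothing so words absent from the reference do not divide by zero.
    topic_total = sum(topic.values()) + len(vocab)
    ref_total = sum(ref.values()) + len(vocab)
    scores = {}
    for word, count in topic.items():
        p_topic = (count + 1) / topic_total
        p_ref = (ref[word] + 1) / ref_total
        scores[word] = math.log(p_topic / p_ref)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical sample data: two pages on one topic and a small "general" corpus.
topic_pages = [
    "Espresso machine reviews: the best espresso machine for home baristas.",
    "How to descale an espresso machine and keep the grinder clean.",
]
reference_docs = [
    "The weather was fine and the train arrived on time.",
    "How to keep the garden clean and the house warm for the winter.",
    "The best way to learn is to read and to practice.",
]
ranked = distinctive_terms(topic_pages, reference_docs)
```

With this sample data, "espresso" and "machine" come out on top, while common words such as "the" are pushed down because they are just as frequent in the reference set. A real reference corpus would need to be much larger for the scores to be meaningful.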

The quality of the results yielded by this method depends greatly on the dataset and on the quality of the content for the specific topic you want to analyse. Topics with poor descriptions and little written content (and no SEO effort) are likely to yield poor results.

Evaluating the quality of the LSA results and tuning the process to extract the maximum information will require some adjustment. For example, is lemmatization a good idea for n-grams? Doesn’t it make the results more difficult to analyze? Lemmatization is supposed to strengthen the presence of a given word by aggregating all its different forms under a single lemma.
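To make the lemmatization trade-off concrete, the sketch below counts bigrams with and without lemmatization. The lemma table is a hypothetical, hand-written dictionary used only for illustration; a real pipeline would rely on a proper lemmatizer.

```python
from collections import Counter

# Hypothetical, hand-written lemma table; a real pipeline would use a lemmatizer.
LEMMAS = {"machines": "machine", "reviews": "review", "reviewed": "review"}


def bigrams(tokens):
    # Pair each token with its successor.
    return list(zip(tokens, tokens[1:]))


def count_bigrams(sentences, lemmatize=False):
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        if lemmatize:
            # Replace each known form by its lemma, leave other tokens as they are.
            tokens = [LEMMAS.get(t, t) for t in tokens]
        counts.update(bigrams(tokens))
    return counts


sentences = [
    "espresso machine review",
    "espresso machines reviews",
    "espresso machines reviewed",
]
raw = count_bigrams(sentences)
lemmatized = count_bigrams(sentences, lemmatize=True)
```

Without lemmatization the counts are spread over five different bigrams. With it, they collapse into two bigrams that each occur three times, which strengthens the signal but hides the difference between forms such as "reviews" and "reviewed" that a reader may want to see.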

One way to evaluate the quality of the subset for a topic is to check how many words from the first two lists (after lemmatization if necessary) are present in the documents. If few are found, it tends to indicate that this field is not optimized for user queries and possibly that little attention is given to the content, or that the content and the user queries for this topic are “disconnected” and that we have a good opportunity to fill the gap.
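This check can be sketched as a simple coverage ratio. The function name, the sample keywords, and the sample documents below are assumptions made for the example. A keyword phrase is counted as found when all of its words appear somewhere in the documents, not necessarily next to each other, and no lemmatization is applied.

```python
import re


def tokenize(text):
    # Set of lowercase alphabetic words found in the text.
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_coverage(keywords, documents):
    """Return (share of keywords found, list of missing keywords)."""
    doc_words = set()
    for doc in documents:
        doc_words |= tokenize(doc)
    # A keyword is found when every one of its words occurs in the documents.
    found = [k for k in keywords if tokenize(k) <= doc_words]
    missing = [k for k in keywords if k not in found]
    share = len(found) / len(keywords) if keywords else 0.0
    return share, missing


# Hypothetical keywords, as they might come from the two query-based sources.
keywords = ["espresso machine", "descale espresso", "coffee grinder", "milk frother"]
documents = [
    "How to descale an espresso machine.",
    "Choosing a grinder for espresso.",
]
share, missing = keyword_coverage(keywords, documents)
```

Here half of the keywords are covered, and the missing ones ("coffee grinder", "milk frother") are the candidates for filling the gap between content and user queries.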