An excellent bit of dance music from the early 90s, the Shamen were bang on with LSI. So how has it come to be that LSI is now better known as Latent Semantic Indexing – not quite as interesting, is it? But Latent Semantic Indexing is one of the hottest subjects in the world of SEO, and if there’s one thing guaranteed to cause a fight between SEO professionals it is the mere mention of LSI. So what is LSI, does it really exist and should you worry about it?
Latent Semantic Indexing Explained
Let’s break down what LSI actually is:
Semantic – This is what the words mean, much more than how they are spelt. It’s dealing with what the words actually stand for in the same way as humans understand words. For example we know that the words bird and avian are very closely related, but if you looked at just the spelling there is no correlation. We know that, computers don’t.
Latent – Semantics are cool, but artificially recreating the human understanding of words and their meaning would require huge CPU power. It’s just not realistic. So the Latent in LSI relates to finding the meaning of words based on how often they are used and how they are used in relation to other words. It’s a sort of shortcut for the human process, based on many examples.
Indexing – Maths fluff.
So in short, LSI works with the meaning of words and allows computers to figure out how one word relates to another based on a whole sample of examples of how that word has been used in the past. There’s a lot of maths involved. Got it?
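To make the "based on many examples" idea a bit more concrete, here is a minimal sketch in Python. It is not full LSI: real LSI runs a matrix factorisation (singular value decomposition) over a large term-document matrix, whereas this only shows the raw co-occurrence intuition behind it. The tiny corpus, the stopword list and the function names are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stopword list and toy corpus, invented for this example.
STOPWORDS = {"the", "a", "in", "from", "and", "for"}

DOCS = [
    "the bird built a nest in the tree",
    "avian species build a nest in a tree",
    "the bird sang from the nest",
    "avian song carries from the tree",
    "the car needs fuel and a new engine",
    "engine fuel for the car",
]


def context_vectors(docs):
    """For each word, count the other words that share a document with it."""
    vectors = defaultdict(Counter)
    for doc in docs:
        words = {w for w in doc.lower().split() if w not in STOPWORDS}
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


VECTORS = context_vectors(DOCS)


def similarity(a, b):
    """How alike are two words, judged only by the company they keep?"""
    return cosine(VECTORS[a], VECTORS[b])


if __name__ == "__main__":
    print("bird vs avian:", round(similarity("bird", "avian"), 3))
    print("bird vs car:  ", round(similarity("bird", "car"), 3))
```

Notice that bird and avian never appear in the same sentence here, yet they still score as related because they keep turning up next to nest and tree. That is the "latent" part in miniature.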
If a search engine were able to implement LSI in its search algorithm it would significantly improve the results, or so you would hope. Articles written in a natural way using variations (synonyms, plurals etc) would be far more likely to be recognised than the standard “use the target keyword over and over” approach favoured by spammers/SEO experts. The good news/bad news (depending on your point of view) is that it is widely believed that Google are using LSI in their algorithm.
Are Google Using LSI?
It’s a fact that Google have never patented anything LSI-related for their algorithm. They did patent a load of things in 2006 that seemed to be related to LSI, but nothing that was explicit. That doesn’t mean that they’re not using it; after all, the common belief is that LSI is a major part of Google’s algorithm. Personally I don’t think so. If LSI were part of Google’s algorithm then words with the same meaning, plurals etc should return very similar if not identical results. Try it yourself.
A Google search for beach returns roughly 116 million results, a search for beaches returns roughly 59 million. There is almost no overlap in the top 10. If a page is authoritative for beach then surely (if LSI is applied) it would also be authoritative for beaches and vice versa? Not on Google. It’s my own opinion that it’s just not in there. How about fabric and material, pen and pens – give it a go.
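If you’d rather measure the overlap than eyeball it, something like this rough Python sketch would do the job. You paste in the top 10 URLs for each search yourself; the function name is my own invention and the URLs below are placeholders, not real Google results.

```python
def top_n_overlap(results_a, results_b, n=10):
    """Return the URLs shared by two top-n result lists and their Jaccard score.

    The Jaccard score is shared / (all distinct URLs across both lists):
    1.0 means identical top-n sets, 0.0 means nothing in common.
    """
    top_a = set(results_a[:n])
    top_b = set(results_b[:n])
    shared = top_a & top_b
    union = top_a | top_b
    score = len(shared) / len(union) if union else 0.0
    return sorted(shared), score


if __name__ == "__main__":
    # Placeholder lists: replace with the results you actually get.
    beach = ["https://example.com/1", "https://example.com/2", "https://example.com/3"]
    beaches = ["https://example.com/3", "https://example.com/4", "https://example.com/5"]

    shared, score = top_n_overlap(beach, beaches)
    print("Shared URLs:", shared)
    print("Jaccard overlap:", score)
```

A score near 1.0 for beach and beaches would suggest the two words are being treated as the same thing; a score near zero suggests they are not.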
The thing is that there are so many myths out there about search rankings and what is and isn’t important. One of the best examples of this was the fuss surrounding reciprocal linking. They don’t count anymore? Reciprocal links are worthless? Google just doesn’t count them? At the very time all this was at its peak, a certain Mr John Chow found his way to the number 1 spot in Google for the search phrase “make money online” using nothing more than reciprocal linking. He asked people to link to him with those words and in return he’d provide a link back. Google’s algorithm did NOT stop this occurring; the only way John got binned was via a manual penalty. He made Google look stupid.
I really don’t believe that LSI is part of what Google are doing. Sure, they have their own way of grouping related words together, but it’s actually at a much simpler level than the analysis that goes on with LSI. I know there are whole guides on how best to set out your site to comply with LSI, but for my money I still think the best way of achieving great rankings is as simple as:
1. Getting backlinks from pages which are themselves indexed
2. Link text that matches what you want to rank for
3. CREATE CONTENT THAT PEOPLE WANT TO LINK TO
Just do that and you won’t go far wrong. (Yeah, “just”, because it’s as simple as that.)

4 Responses to “LSI – Love Sex Intelligence?”
Paul
May 28, 2009
I wonder what Mr C is doing these days!
Paul B
May 29, 2009
I sometimes wonder the same thing about dance music in general, where did it go?
Alan
May 30, 2009
You are quite right as well, Paul. The popular belief that LSI is the be all and end all is nothing more than a new take on something that has been around for years – something called theming.
Denis over at Semiologic – http://www.semiologic.com/resources/seo/silo/ has a good take on Siloing and LSI more or less saying that neither G nor any other SE has the computing power available to process the data required for LSI in a reasonable time.
I have always heard that English is the most difficult of languages to learn and that is only from a human perspective. So many similar words that mean totally different things. Meanings that change again if you add another word before or after. Then add in the English spellings and the American spellings of the same words to the mix. If the human brain has difficulties, computers don’t stand a chance!!
Stick with simple – it always works.
R Kumar
June 3, 2009
LSI, SEO etc. sound like jargon to newbies starting out in the internet marketing business. While most internet sales pitches promote their products as ones that do not require any HTML knowledge or technical skills, somewhere down the line all of this jargon gets in the way of their success. So it becomes imperative to learn the meaning and use of all these things.
A nice way to explain LSI. Though I personally feel that the talk and the hype around various search engines using LSI is just to make it look as if they have something extra. Otherwise, why would I bother to go to Google instead of MSN?