As SEOs, we spend much of our time trying to figure out how Google's indexing algorithms work. This article looks at several techniques the search engine is known or believed to use, including latent semantic indexing, local inter-connectivity, and link analysis. We'll discuss the implications of these algorithms and what you can do to give them what they look for.
Latent Semantic Indexing or LSI
Latent semantic indexing is a natural language processing technique. LSI analyzes the relationships between words across many documents, which helps a search engine distinguish naturally written text from keyword-stuffed documents. It considers both synonyms and words that naturally occur together.
For example, if you're examining an article about an airplane, LSI will look for synonyms: aircraft, plane, aeroplane. It will also look for related words, which are NOT synonymous with the word but are often mentioned when discussing airplanes: ailerons, turbulence, fuel, clouds, sky, roll, pitch, etc.
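The idea of "related words" can be made concrete with a small sketch. The code below is not LSI itself, which applies a matrix factorization (singular value decomposition) to term counts from a very large corpus. It only builds the raw term-by-document counts that LSI starts from and compares terms by cosine similarity. The corpus, the stopword list, and the function names are all made up for illustration.

```python
import math

# Tiny made-up corpus for illustration only.
DOCS = [
    "the airplane hit turbulence and the pilot adjusted the ailerons",
    "the aircraft burned fuel while climbing through clouds",
    "the airplane began to roll and pitch in turbulence",
    "the pilot checked fuel before the airplane left",
    "bake the bread with flour and yeast",
    "the recipe needs flour sugar and butter",
]

# Hypothetical stopword list; real systems use much larger ones.
STOPWORDS = {"the", "and", "to", "in", "with", "while", "through", "before"}


def term_vectors(docs):
    """Map each term to a list of its counts, one entry per document."""
    vectors = {}
    for i, doc in enumerate(docs):
        for word in doc.split():
            if word not in STOPWORDS:
                vectors.setdefault(word, [0] * len(docs))[i] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def related_terms(term, vectors, top_n=5):
    """Return up to top_n (term, score) pairs that co-occur with `term`."""
    scored = [
        (other, cosine(vectors[term], vec))
        for other, vec in vectors.items()
        if other != term
    ]
    # Keep only terms that share at least one document with `term`.
    scored = [pair for pair in scored if pair[1] > 0]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


if __name__ == "__main__":
    vectors = term_vectors(DOCS)
    for other, score in related_terms("airplane", vectors):
        print(f"{other}: {score:.2f}")
```

In this toy corpus, "turbulence" and "pilot" score highest for "airplane", while baking terms such as "flour" score zero. Note that the raw counts also give "aircraft" a score of zero, because it never shares a document with "airplane". Bridging that gap through shared neighbors such as "fuel" is what the dimensionality reduction step in LSI is meant to do.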
The point of LSI is to detect natural writing and distinguish it from robotic copy written to manipulate search results. In 2003 Google purchased the company Applied Semantics, which developed semantic text-analysis technology of this kind. Their know-how was incorporated into AdSense, and possibly into the search algorithms.
As a webmaster, write naturally and forget about measurements such as keyword density. Also mix related and synonymous phrases into your anchor text.
Ranking Search Results by Reranking the Results Based on Local Inter-Connectivity
This is the title of a patent granted to Google on February 25, 2003.
The patent's abstract begins: "A search engine for searching a corpus improves the relevancy of the results by refining a standard relevancy score based on the inter-connectivity of the initially returned set of documents."
The search engine first finds a good set of documents using other algorithms (such as PageRank and TrustRank), and then re-ranks those results based on how the documents in that set link to one another.
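A minimal sketch of this two-step process might look like the following. It is loosely modeled on the patent's description rather than taken from it: the local score here is simply the number of other documents in the initial set, on a different host, that link to a page, and the way the two scores are combined (including the constants a and b) is an assumption. The URLs, scores, and the function name rerank are hypothetical.

```python
from urllib.parse import urlparse


def rerank(initial, links, a=1.0, b=1.0):
    """Re-rank an initial result set by links within that set.

    initial: dict mapping URL -> relevance score from the first pass.
    links:   dict mapping URL -> set of URLs that page links to.
    a, b:    assumed weighting constants, not values from the patent.
    """
    if not initial:
        return []

    def host(url):
        return urlparse(url).netloc

    # Local score: how many other documents in the initial set, hosted
    # elsewhere, link to this document. Links from the same host are
    # ignored as a simplifying assumption.
    local = {}
    for target in initial:
        local[target] = sum(
            1
            for source in initial
            if source != target
            and host(source) != host(target)
            and target in links.get(source, ())
        )

    # Normalize both scores by their maximum (falling back to 1 if zero).
    max_local = max(local.values()) or 1
    max_old = max(initial.values()) or 1

    new = {
        url: (a + local[url] / max_local) * (b + initial[url] / max_old)
        for url in initial
    }
    # Highest combined score first; ties broken by URL for stable output.
    return sorted(new, key=lambda url: (-new[url], url))


# Hypothetical first-pass results and link graph.
INITIAL = {
    "https://a.example/page": 0.9,
    "https://b.example/page": 0.8,
    "https://c.example/page": 0.7,
    "https://d.example/page": 0.6,
}
LINKS = {
    "https://a.example/page": {"https://c.example/page"},
    "https://b.example/page": {"https://c.example/page"},
    "https://d.example/page": {"https://c.example/page", "https://b.example/page"},
}

if __name__ == "__main__":
    for url in rerank(INITIAL, LINKS):
        print(url)
```

In this made-up data, the page on c.example starts in third place but is linked to by all three other results, so it moves to the top after re-ranking. The page on a.example has the best initial score but no links from the rest of the set, so it drops to third.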
If you have many links from authoritative domains but still struggle to rank, you might need links from several of the websites included in the "set" determined by this algorithm. In other words, you need links from sites that already rank at the top for your terms, or at least better links from that group than your competitors have. If you lack this "community" exposure, your rankings may be recalculated and lowered in favor of sites that have more "community" links, even if those sites have somewhat less authority.