LSI (Latent Semantic Indexing) keywords are words that are commonly found together within a single topic and are semantically related to each other.
Here is an LSI keyword example. Let's say your article topic is "horticulture". Typically, you would find multiple related keywords like "agriculture", "crops", "botany" and "plants". As you write naturally about a topic that you have researched well, a certain number of common keyword phrases will appear for that topic. These phrases are called LSI keywords. They are semantically linked to each other through the topic (or the seed keyword), and search engines expect to find them in articles on that topic.
LSI keywords therefore help search engines figure out the main topic of your article. As another example, say your article topic is "cars". If your article had the words "clutch", "gas", "price", "mileage", "aftermarket" and "used", then the search engines would know that your article is about vehicles for transportation. However, if your article had words like "Disney", "animated", "characters", "McQueen" or "Mater", the search engines would know that the article is about the Disney movie "Cars". Hence, having related keywords in your content is critical to sending the right signals about your topic to the search engines.
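The "cars" example can be sketched as a simple overlap count. This is only an illustration of the idea, not how any search engine actually works. The `TOPIC_TERMS` table and the `guess_topic` helper are hypothetical names made up for this sketch, and the term lists are just the words from the example above.

```python
import re

# Hypothetical related-term sets, taken from the "cars" example.
TOPIC_TERMS = {
    "cars (vehicles)": {"clutch", "gas", "price", "mileage", "aftermarket", "used"},
    "Cars (Disney movie)": {"disney", "animated", "characters", "mcqueen", "mater"},
}

def guess_topic(text):
    """Return the topic whose related terms overlap most with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Score each topic by how many of its related terms appear in the text.
    scores = {topic: len(terms & words) for topic, terms in TOPIC_TERMS.items()}
    return max(scores, key=scores.get)

article = "This used car has low mileage, a new clutch and a fair price."
print(guess_topic(article))
```

A real system would weigh terms rather than count them equally, but the sketch shows why the surrounding words, not the seed keyword alone, decide which meaning of "cars" a page is about.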
LSI SEO is the method of optimizing the on-page content of a webpage for a specific search query by ensuring that other naturally occurring words are found around that specific search query.
Using LSI keywords in your content helps the search engines understand the exact topic of the webpage and proves to them that the webpage is relevant to the search query. It also improves your chances to rank for other high-volume keywords that are semantically related to your search query.
When content creators write an article, they focus on the main head keyword, and the associated long-tail keywords. However, without enough LSI keywords, their content does not read naturally.
LSI makes it easy for search engines to figure out how natural a piece of content is based on whether enough LSI keywords show up. It has been speculated that Google's Panda update uses LSI keywords to figure out the quality of the content. If enough LSI keywords are not present, the content looks unnatural and hence of poor quality.
LSIKeywords.com is a free keywords generator tool for the English language.
You can enter up to 10 seed keywords in the search box at the top and hit the “Generate LSI Keywords” button. Our software will then find up to 10,000 LSI and long-tail keywords for you, along with their respective global monthly search volume and cost-per-click metrics.
You should ensure that you target these generated keywords in your content to improve your relevancy for the search query, and to target other high-volume keywords.
Long tail keywords are phrases derived from a seed keyword that consist of more words than the seed keyword and have lower search volume. Examples of long tail keywords for our example seed keyword "horticulture" are "jobs in horticulture", "horticulture therapy", "horticulture online degree" and so on. As you can see, each of these phrases has the seed keyword in it, along with a few other words. Long tail keywords are important for two reasons: they are very specific, which leads to better conversion rates, and they have lower volume, which makes them easier to rank for. When you write content, you want the content to be focused on a single long tail keyword.
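As a rough sketch of the structural part of that definition, the snippet below keeps only the phrases that contain the seed keyword and have more words than it. The `long_tail_candidates` helper is a hypothetical name for this illustration, and the check ignores search volume, which would have to come from a keyword tool.

```python
def long_tail_candidates(seed, phrases):
    """Keep phrases that contain the seed keyword and have more words than it."""
    seed_words = seed.lower().split()
    result = []
    for phrase in phrases:
        words = phrase.lower().split()
        # Look for the seed as a run of consecutive whole words in the phrase.
        contains_seed = any(
            words[i:i + len(seed_words)] == seed_words
            for i in range(len(words) - len(seed_words) + 1)
        )
        if contains_seed and len(words) > len(seed_words):
            result.append(phrase)
    return result

phrases = [
    "jobs in horticulture",
    "horticulture therapy",
    "horticulture online degree",
    "horticulture",  # the seed itself, not a long tail keyword
    "botany",        # related, but does not contain the seed
]
print(long_tail_candidates("horticulture", phrases))
```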
LSI keywords, on the other hand, are not used as the main keyword to focus content on. They are keywords that should also be mentioned in the article to let the search engines know that the article is truly about the long tail keyword you have focused on. LSI keywords, therefore, work together to help the article rank for that long tail keyword.
LSI (Latent Semantic Indexing) and TF-IDF (Term Frequency - Inverse Document Frequency) are both algorithm-based techniques for finding contextually relevant terms and phrases.
LSI finds clusters of words related to your target keyword through semantic analysis. TF-IDF finds terms related to the keyword by scoring how often each term appears in a document and discounting terms that are common across many documents.
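To make the TF-IDF half of this comparison concrete, here is a minimal sketch in plain Python. It assumes one common variant of the formula (term count divided by document length, multiplied by the logarithm of the number of documents over the number of documents containing the term); real tools may weight or smooth the scores differently. The `tokenize` and `tf_idf` helpers and the sample documents are made up for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf(documents):
    """Return one {term: score} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # tf = share of the document's words; idf = log(N / df).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

docs = [
    "the plants in the garden need water",
    "the crops in the field need rain",
    "the price of the used car",
]
for doc_scores in tf_idf(docs):
    top_terms = sorted(doc_scores, key=doc_scores.get, reverse=True)[:3]
    print(top_terms)
```

In this sample, "the" appears in every document, so its score is zero, while a word such as "plants" that appears in only one document scores highest there. That is the "weeding out more common terms" behaviour in practice.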
In some ways they are similar, but because the core algorithms are different, you will get different results from each. While there may be some overlap, both are great approaches to add to your keyword research arsenal. It is likely that Google uses both of these, as well as some other proprietary techniques.
To find TF-IDF keywords, you can use a TF-IDF tool.
The Wikipedia article on Latent Semantic Analysis is a good dive into the technical aspects of LSI. For those who are even more mathematically inclined, check out Stanford's Introduction to Information Retrieval. For those who prefer video, check out Stanford's NLP course. Quora also has a few good questions on this topic that can be found here and here.