Check Out The Best Link Building Strategies

Link building is one of the most important factors in achieving a high-ranking website in search engines.

It is also a core concept in search engine optimization. Here I highlight the important factors in the area of link building:

  • Make sure your site has something that other webmasters in your niche or vertical would be interested in linking to.
  • Create content that people will be willing to link to, even if it is not directly easy to monetize. These link-worthy pages will lift the authority and rankings of all pages on your site.
  • Create content that legitimate webmasters interested in your topic would be interested in linking to.
  • When possible, try to get your keywords in many of the links pointing to your pages. For example, if it is a digital marketing training niche, you can include the following keywords in your content – “digital marketing training”, “online marketing course”, “digital marketing institute”, “Google analytics course”, “web analytics course”, “Google Analytics”, etc.
  • Register with, participate in, or trade links with topical hubs and related sites. Be in the discussion or at least be near the discussion.
  • Look for places from which you can get high-quality free links (like local libraries or chambers of commerce).
  • If you have some good internal content, try to get direct links to your inner pages.
  • Produce articles and get them syndicated to more authoritative sites.
  • Start an interesting and unique blog and write about your topics, products, news, and other sites in your community.
  • Comment on other sites with useful relevant and valuable comments.
  • Participate in forums to learn about what your potential consumers think is important. What questions do they frequently have? How do you solve those problems or pain points?
  • Issue press releases with links to your site.
  • Leave testimonials for people and products you really like. Oftentimes when the product owner or person posts the testimonials, they will include a link back to your site.
  • Sponsor charities, blogs, or websites related to your site.
  • Consider renting links if you are in an extremely competitive industry.
  • Adult, gaming, credit, and pharmacy categories will likely require link rentals and/or building topical link networks.
  • Mix your link text up. Adding words like buying or store to the keywords in some of your link text can make it look like more natural linkage data and help you rank well for many targeted secondary phrases.
  • Survey your vertical and related verticals. What ideas/tools/articles have become industry standard tools or well-cited information? What are ideas missing from the current market space that could also fill that niche?
  • If you have a large site, make sure you create legitimate reasons for people to want to reference more than just your homepage.

Other Related Posts:

  1. On-Page SEO Best Practices
  2. Must-Have WordPress Plugins

150+ High PR (DA) DoFollow Social Bookmarking Sites

What is Social Bookmarking?

Social bookmarking is a popular method of organizing, storing, managing, and searching for bookmarks of online resources. SEO practitioners look for relevant social bookmarking websites where they can bookmark their own pages to earn backlinks.

When we discover a web page that we find interesting, instead of having to remember its address, we simply save the address as a “bookmark” in our browser. Social bookmarking is similar to saving favorites in our browser, except that we are saving to a website that we can access from any computer in the world. The key component of social bookmarking is the ‘social’ element: everyone can look at everyone else’s bookmarks, which means we can find content that other people have already bookmarked.

Most social bookmarking websites display a number next to each piece of content representing how many times it has been saved by different users. These sites also show a continually updated list of popular web pages. This can be an excellent way of finding remarkable content that we might not otherwise come across.

There are many social bookmarking websites.

The most popular websites are Diigo, Folkd, Plurk, Mix, Scoop, and Fark.

We have listed 150+ of the best high-PR (DA) DoFollow social bookmarking sites below. These links were checked and working as of 25-01-2020.



Why do you need to do SEO on a daily basis?

The very purpose of SEO is high rankings in search engine results pages (SERPs). Websites are always competing with each other, and more active competitors occupy the top positions ahead of passive sites. Therefore, you need to constantly develop your website and increase its relevancy.

“A holy place is never empty”

It must be understood that the first positions in search results for commercial topics are always occupied by someone. Therefore, if it is not you, then your competitors are receiving the quality visitors and potential buyers. And once your site has occupied high positions for targeted queries, it will be under constant pressure from less successful but hard-working competitors.

Maintaining high positions is usually easier than promoting a site from the bottom to the top. It is all the more unfortunate when, having conquered those positions, a web resource stops developing. The owner makes a mistake by deciding to save on SEO, and thus deceives himself. We at SKARTEC believe that our goal is to hold the first position for all customer-related queries in all major search engines. Naturally, this goal is utopian and unattainable in reality, but the pursuit of such a maximally high bar drives long-term growth and maintenance of the site in search, and gives an incentive to never stop and never relax.

SEO is a powerful but inert tool

And this has its pros and cons. SEO is often compared (and correlated) with paid search advertising, or PPC (pay-per-click), for example Google AdWords. Both digital marketing tools are related to search, together forming SEM (Search Engine Marketing). Moreover, the costs of SEO and PPC are interdependent: they balance each other. The client always chooses what to invest in and, as a rule, chooses the cheaper and more effective method, which increases competition and raises the entry price. The surest strategy, however, is not to choose between these two tools but to use them together, complementing and enhancing the effect of advertising in search.

If you compare SEO with PPC: by paying for clicks in Google Ads, you can get visitors to the site on any given day, while when promoting a web resource with SEO we are always waiting for search engines to react to changes and to website optimization. SEO work always takes time: as the website's content is built up, the site being promoted becomes more and more popular among other sites on the network, and, reflecting these events, search engines re-evaluate its chances for high positions.

At the same time, modern high-quality SEO always improves the quality of a site: working with conversion and analytics, optimizing behavioural scenarios and, as a result, increasing the website's conversion across all its visitors from all possible channels, including search engine marketing.

And when a company receives targeted traffic and sales as a result of, say, 6-12 months of well-conducted search engine optimization, it will continue to receive traffic even after stopping SEO, possibly for quite a long time. After all, the SEO effect is as inert as the growth of the site's positions: search engines will notice the absence of targeted SEO changes with a delay and will respond with a delay. But the right decision is to update the targeted queries, expand the semantics, and continue with search engine optimization, allowing the “pot to cook further.”

Benefits of Continuous SEO Work

In addition to supporting and increasing the position of the site for targeted search queries, there are other important advantages arising from the features of the search engines.

Search engines are constantly improving their mechanisms, trying to improve the quality of search and the user experience of interacting with them. These mechanisms – algorithms – are sets of criteria and rules on the basis of which search engines decide who will rank higher and who lower. Professional SEO companies always keep track of such changes, and continuously promoted sites are kept in line with changing algorithms and hence do not lose their relevance.

On the other hand, the interests of the audience are also not constant, which means that the semantics will also change. Therefore, both the site and its SEO optimization must change in order to reach the audience.

It is always good when a site contains a lot of useful, necessary content relevant to the interests of the user (that is, covering key topics). Constant work on the creation of quality content will therefore increase the efficiency and usefulness of the resource, expanding its semantics. At the same time, it is important that an SEO specialist takes part in the work on the content, so as to include keywords in the right places and proportions. This helps such websites get into the SERPs and develop quality traffic.

Also, the importance of a website on the network increases with the number of links to it. Search engines take this important ranking factor into account: the more high-quality natural links lead to a site, the higher its position. We can definitely say that the more natural links point to the pages of the target site from significant, high-quality resources, the better for it. Building link mass is therefore targeted work for SEO specialists, and it makes sense to carry it out constantly.

Use SEO wisely as a tool to attract an audience to the site, and do not stop there. A site, like a business, cannot stand still – it is either developing or fading. Be closer to your audience, study their interests, and work to ensure that your web resource is focused on that audience, attracts it, and converts it.

Remember – everything is achieved by constancy, and this is especially true of SEO!

Top 45 Search Engine Research Papers – A Must Read

Presented here is a rather eclectic collection of search engine research papers written by some of the leading individuals in the search engine scene. I neither propose nor suppose that any or all of the ideas mooted in this collection have made it into actual search engine usage (though obviously some of them have), but simply that they are at least indicative of the thinking of the individuals who are instrumental in formulating actual search engine algorithms.

It may be interesting to note the affiliations of some of the individuals involved in these research papers:

  • Lawrence Page – Co-founder of Google
  • Sergey Brin – Co-founder of Google
  • Sanjay Ghemawat – Google Engineer
  • Krishna Bharat – Google Engineer
  • George A. Mihaila – Google Engineer
  • Jon M. Kleinberg – Cornell University Associate Professor
  • Monika R. Henzinger – Research Director Google
  • Taher Haveliwala – Google Engineer
  • Sepandar Kamvar – Assistant Professor Stanford
  • Chris Manning – Assistant Professor Stanford
  • Narayanan Shivakumar – Google Engineer
  • Hector Garcia-Molina – Professor, Stanford; on the technical advisory board of Yahoo and many more well-known companies…
There are many more but I’m sure you get the idea.

List of search engine research papers

  • The Anatomy of a Large-Scale Hypertextual Web Search Engine (1998), by Sergey Brin and Lawrence Page (Stanford University)

In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available online. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to the rapid advances in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine — the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system that can exploit the additional information present in hypertext. Also, we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

  • The PageRank Citation Ranking: Bringing Order to the Web (1998), by Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd (Stanford University)

The importance of a Web page is an inherently subjective matter, which depends on the reader’s interests, knowledge, and attitudes. But there is still much that can be said objectively about the relative importance of Web pages. This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. We compare PageRank to an idealized random Web surfer. We show how to efficiently compute PageRank for large numbers of pages. And, we show how to apply PageRank to search and to user navigation.
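
The random-surfer computation this abstract describes can be illustrated with a short, hedged sketch. This is not the paper's implementation; the toy graph, the damping factor of 0.85, and the iteration count are illustrative assumptions:

```python
def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}               # start from a uniform vector
    for _ in range(iterations):
        # Teleport term: the random surfer jumps to any page with prob. 1 - d.
        new = {p: (1.0 - d) / n for p in pages}
        for p, outlinks in links.items():
            if outlinks:
                share = d * rank[p] / len(outlinks)  # split rank over outlinks
                for q in outlinks:
                    new[q] += share
            else:                                    # dangling page: spread evenly
                for q in pages:
                    new[q] += d * rank[p] / n
        rank = new
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

On the toy graph, page “c” (linked from both “a” and “b”) ends up scoring above “b”, and the scores sum to 1, as expected for a probability distribution over pages.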

  • MapReduce: Simplified Data Processing on Large Clusters (2004), by Jeffrey Dean and Sanjay Ghemawat (Google)

MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a _map_ function that processes a key/value pair to generate a set of intermediate key/value pairs and a _reduce_ function that merges all intermediate values associated with the same intermediate key. Many real-world tasks are expressible in this model, as shown in the paper. (Note: this is a program that Google uses to recompile its index in addition to other tasks)
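
The map/reduce model described here can be sketched in a few lines of single-process Python, with word counting as the canonical example. The function names and toy documents below are illustrative; the real system distributes both phases across thousands of machines:

```python
from collections import defaultdict

def map_fn(doc):
    # Emit an intermediate (word, 1) pair for every word in the document.
    for word in doc.split():
        yield word, 1

def reduce_fn(word, counts):
    # Merge all intermediate values that share the same key.
    return word, sum(counts)

def map_reduce(docs, map_fn, reduce_fn):
    # Map phase: collect intermediate key/value pairs, grouped by key.
    intermediate = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            intermediate[key].append(value)
    # The "shuffle" is implicit in the grouping above; reduce merges each key.
    return dict(reduce_fn(k, v) for k, v in intermediate.items())

counts = map_reduce(["the cat", "the dog"], map_fn, reduce_fn)
# counts == {"the": 2, "cat": 1, "dog": 1}
```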

  • The Google File System (2003), by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (Google)

We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This has led us to reexamine traditional choices and explore radically different design points. The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients. In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real-world use.

  • Learning to Rank using Gradient Descent (2005), by Christopher J. C. Burges et al. (Microsoft Research)

We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data from a commercial internet search engine.
  • Patterns in Unstructured Data (2003), by Clara Yu, John Cuadrado, Maciej Ceglowski, J. Scott Payne (National Institute for Technology and Liberal Education)
This presentation on the need for smarter search engines suggests several methods of improving search engine relevancy, including latent semantic indexing and multi-dimensional scaling.
  • Hilltop: A Search Engine Based on Expert Documents (2000), by Krishna Bharat and George A. Mihaila

In response to a query, a search engine returns a ranked list of documents. If the query is on a popular topic (i.e., it matches many documents) then the returned list is usually too long to view fully. Studies show that users usually look at only the top 10 to 20 results. However, the best targets for popular topics are usually linked to by enthusiasts in the same domain which can be exploited. In this paper, we propose a novel ranking scheme for popular topics that places the most authoritative pages on the query topic at the top of the ranking. Our algorithm operates on a special index of “expert documents.” These are a subset of the pages on the WWW identified as directories of links to non-affiliated sources on specific topics. Results are ranked based on the match between the query and relevant descriptive text for hyperlinks on expert pages pointing to a given result page. We present a prototype search engine that implements our ranking scheme and discuss its performance. With a relatively small (2.5 million page) expert index, our algorithm was able to perform comparably on popular queries with the best of the mainstream search engines.
  • Authoritative Sources in a Hyperlinked Environment (1999), by Jon M. Kleinberg (Cornell University)

The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of “authoritative” information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections, in turn, motivate additional heuristics for link-based analysis.
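
The hub/authority idea in this abstract (widely known as the HITS algorithm) can be sketched as a simple mutual-reinforcement iteration. The toy graph and iteration count below are illustrative assumptions, not the paper's code:

```python
def hits(links, iterations=50):
    """links maps each page to the pages it points to."""
    pages = set(links) | {q for out in links.values() for q in out}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the summed hub weight of pages linking to it.
        auth = {p: sum(hub[q] for q in links if p in links[q]) for p in pages}
        # A page's hub score is the summed authority of the pages it links to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded between iterations.
        a_norm = sum(v * v for v in auth.values()) ** 0.5
        h_norm = sum(v * v for v in hub.values()) ** 0.5
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

# "h2" links to two authorities; "a" is linked from two hubs.
graph = {"h1": ["a"], "h2": ["a", "b"], "a": [], "b": []}
hub, auth = hits(graph)
```

On this toy graph, “a” earns more authority than “b”, and “h2” earns a higher hub score than “h1”, mirroring the mutual reinforcement the abstract describes.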
  • Combating Web Spam with TrustRank (2004), by Zoltán Gyöngyi, Hector Garcia-Molina, and Jan Pedersen (Stanford University / Yahoo!)

Web spam uses various techniques to achieve higher-than-deserved rankings in search engine results. While human experts can identify spam, it is too expensive to manually evaluate a large number of pages. Instead, we propose techniques to semi-automatically separate reputable, good pages from spam. We first select a small set of seed pages to be evaluated by an expert. Once we manually identify the reputable seed pages we use the linking structure of the web to discover other pages that are likely to be good… Our results show that we can effectively filter out spam from a significant fraction of the web, based on a good seed set of fewer than 200 sites.
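
The seed-based propagation this abstract outlines can be sketched as a PageRank-style iteration whose teleport vector is concentrated on the hand-verified seed pages. The graph, seed set, and damping factor below are illustrative assumptions, not the paper's algorithm verbatim:

```python
def trustrank(links, seeds, d=0.85, iterations=50):
    """links maps each page to its outlinks; seeds are hand-verified pages."""
    pages = list(links)
    n_seeds = len(seeds)
    trust = {p: (1.0 / n_seeds if p in seeds else 0.0) for p in pages}
    for _ in range(iterations):
        # Teleport only to trusted seeds, not uniformly to all pages.
        new = {p: ((1.0 - d) / n_seeds if p in seeds else 0.0) for p in pages}
        for p, outlinks in links.items():
            if outlinks:
                for q in outlinks:
                    new[q] += d * trust[p] / len(outlinks)
        trust = new
    return trust

graph = {
    "seed":  ["good"],
    "good":  ["seed"],
    "spam":  ["spam2"],   # spam pages only link among themselves
    "spam2": ["spam"],
}
trust = trustrank(graph, seeds={"seed"})
```

Pages reachable from the seed accumulate trust, while the isolated spam cluster receives none: trust flows only along links out of the verified set.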
Experienced users who query search engines have complex behavior. They explore many topics in parallel, experiment with query variations, consult multiple search engines, and gather information over many sessions. In the process, they need to keep track of search context — namely useful queries and promising result links, which can be hard. We present an extension to search engines called SearchPad that makes it possible to keep track of “search context” explicitly. We describe an efficient implementation of this idea deployed on four search engines: AltaVista, Excite, Google and Hotbot. Our design of SearchPad has several desirable properties: (i) portability across all major platforms and browsers, (ii) instant start requiring no code download or special actions on the part of the user, (iii) no server-side storage, and (iv) no added client-server communication overhead. An added benefit is that it allows search services to collect valuable relevant information about the results shown to the user. In the context of each query, SearchPad can log the actions taken by the user, and in particular record the links that were considered relevant by the user in the context of the query. The service was tested in a multi-platform environment with over 150 users for 4 months and found to be usable and helpful. We discovered that the ability to maintain a search context explicitly seems to affect the way people search. Repeat SearchPad users looked at more search results than is typical on the web, suggesting that availability of search context may partially compensate for non-relevant pages in the ranking.
  • Improved Algorithms for Topic Distillation in a Hyperlinked Environment (1998), by Krishna Bharat and Monika R. Henzinger

This paper addresses the problem of topic distillation on the World Wide Web, namely, given a typical user query, to find quality documents related to the query topic. Connectivity analysis has been shown to be useful in identifying high-quality pages within a topic-specific graph of hyperlinked documents. The essence of our approach is to augment a previous connectivity analysis based algorithm with content analysis. We identify three problems with the existing approach and devise algorithms to tackle them. The results of a user evaluation are reported that show an improvement of precision at 10 documents by at least 45% over pure connectivity analysis.
A search engine for searching a corpus improves the relevancy of the results by refining a standard relevancy score based on the interconnectivity of the initially returned set of documents. The search engine obtains an initial set of relevant documents by matching a user’s search terms to an index of a corpus. A re-ranking component in the search engine then refines the initially returned document rankings so that documents that are frequently cited in the initial set of relevant documents are preferred over documents that are less frequently cited within the initial set.
  • Adaptive Methods for the Computation of PageRank (2003), by Sepandar Kamvar, Taher Haveliwala, and Gene Golub (Stanford University)

We observe that the convergence patterns of pages in the PageRank algorithm have a nonuniform distribution. Specifically, many pages converge to their true PageRank quickly, while relatively few pages take a much longer time to converge. Furthermore, we observe that these slow-converging pages are generally those pages with high PageRank. We use this observation to devise a simple algorithm to speed up the computation of PageRank, in which the PageRank of pages that have converged is not recomputed at each iteration after convergence. This algorithm, which we call Adaptive PageRank, speeds up the computation of PageRank by nearly 30%.

  • Exploiting the Block Structure of the Web for Computing PageRank (2003), by Sepandar Kamvar, Taher Haveliwala, Christopher Manning, and Gene Golub (Stanford University)

The web link graph has a nested block structure: the vast majority of hyperlinks link pages on a host to other pages on the same host, and many of those that do not link pages within the same domain. We show how to exploit this structure to speed up the computation of PageRank by a 3-stage algorithm whereby (1) the local PageRanks of pages for each host are computed independently using the link structure of that host, (2) these local PageRanks are then weighted by the “importance” of the corresponding host, and (3) the standard PageRank algorithm is then run using as its starting vector the weighted aggregate of the local PageRanks. Empirically, this algorithm speeds up the computation of PageRank by a factor of 2 in realistic scenarios. Further, we develop a variant of this algorithm that efficiently computes many different “personalized” PageRanks, and a variant that efficiently recomputes PageRank after node updates.

  • Extrapolation Methods for Accelerating PageRank Computations (2003), by Sepandar Kamvar, Taher Haveliwala, Christopher Manning, and Gene Golub (Stanford University)

We present a novel algorithm for the fast computation of PageRank, a hyperlink-based estimate of the “importance” of Web pages. The original PageRank algorithm uses the Power Method to compute successive iterates that converge to the principal eigenvector of the Markov matrix representing the Web link graph. The algorithm presented here, called Quadratic Extrapolation, accelerates the convergence of the Power Method by periodically subtracting off estimates of the non-principal eigenvectors from the current iterate of the Power Method. In Quadratic Extrapolation, we take advantage of the fact that the first eigenvalue of a Markov matrix is known to be 1 to compute the non-principal eigenvectors using successive iterates of the Power Method. Empirically, we show that using Quadratic Extrapolation speeds up PageRank computation by 25–300% on a web graph of 80 million nodes, with minimal overhead. Our contribution is useful to the PageRank community and the numerical linear algebra community in general, as it is a fast method for determining the dominant eigenvector of a matrix that is too large for standard fast methods to be practical.

We identify crucial design issues in building a distributed inverted index for a large collection of Web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We suggest and compare different strategies for collecting global statistics from distributed inverted indexes. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented.

A great challenge for data mining techniques is the huge space of potential rules which can be generated. If there are tens of thousands of items, then potential rules involving three items number in the trillions. Traditional data mining techniques rely on downward-closed measures such as support to prune the space of rules. However, in many applications, such pruning techniques either do not sufficiently reduce the space of rules, or they are overly restrictive. We propose a new solution to this problem, called Dynamic Data Mining (DDM). DDM foregoes the completeness offered by traditional techniques based on downward-closed measures in favor of the ability to drill deep into the space of rules and provide the user with a better view of the structure present in a data set. Instead of a single deterministic run, DDM runs continuously, exploring more and more of the rule space. Instead of using a downward-closed measure such as support to guide its exploration, DDM uses a user-defined measure called weight, which is not restricted to be downward closed. The exploration is guided by a heuristic called the Heavy Edge Property. The system incorporates user feedback by allowing weight to be redefined dynamically. We test the system on a particularly difficult data set – the word usage in a large subset of the World Wide Web. We find that Dynamic Data Mining is an effective tool for mining such difficult data sets.

  • Computing Iceberg Queries Efficiently (1998), by Min Fang, Narayanan Shivakumar, Hector Garcia-Molina, Rajeev Motwani, and Jeffrey D. Ullman (Stanford University)

Many applications compute aggregate functions (such as COUNT, SUM) over an attribute (or set of attributes) to find aggregate values above some specified threshold. We call such queries iceberg queries because the number of above-threshold results is often very small (the tip of an iceberg), relative to a large amount of input data (the iceberg). Such iceberg queries are common in many applications, including data warehousing, information-retrieval, market basket analysis in data mining, clustering, and copy detection. We propose efficient algorithms to evaluate iceberg queries using very little memory and significantly fewer passes over data, as compared to current techniques that use sorting or hashing. We present an experimental case study using over three gigabytes of Web data to illustrate the savings obtained by our algorithms.
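
One of the paper's core ideas, coarse counting into a small array of hash buckets followed by an exact second pass over only the candidate items, can be sketched as follows. The bucket count, threshold, and data are illustrative assumptions:

```python
from collections import Counter

def iceberg(items, threshold, n_buckets=8):
    # Pass 1: coarse counts. A bucket's total is an upper bound for every
    # item hashing into it, so light buckets can be ruled out entirely,
    # using far less memory than one counter per distinct item.
    buckets = [0] * n_buckets
    for item in items:
        buckets[hash(item) % n_buckets] += 1
    # Pass 2: exact counts, but only for items that fell into heavy buckets.
    candidates = Counter(
        item for item in items if buckets[hash(item) % n_buckets] >= threshold
    )
    return {item: c for item, c in candidates.items() if c >= threshold}

data = ["a"] * 5 + ["b"] * 2 + ["c"] * 1
result = iceberg(data, threshold=3)
# result == {"a": 5}
```

Hash collisions can only add false candidates (never lose true ones), and the exact second pass removes them, so the final answer is correct.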

  • Extracting Patterns and Relations from the World Wide Web (1998), by Sergey Brin (Stanford University)

The World Wide Web is a vast resource for information. At the same time, it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of extracting a relation for such a data type from all of these sources automatically. We present a technique that exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample. To test our technique we use it to extract a relation of (author, title) pairs from the World Wide Web.
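
The pattern/relation duality can be sketched with a toy regex-based loop: occurrences of known pairs yield context patterns, and the patterns in turn extract new pairs. The corpus, seed pair, and pattern format below are made-up illustrations, far simpler than the paper's URL-aware patterns:

```python
import re

def grow_relation(corpus, seed_pairs, rounds=2):
    pairs = set(seed_pairs)
    for _ in range(rounds):
        patterns = set()
        # Step 1: find occurrences of known pairs; keep the text between them.
        for text in corpus:
            for a, b in pairs:
                m = re.search(re.escape(a) + r"(.{1,20}?)" + re.escape(b), text)
                if m:
                    patterns.add(m.group(1))
        # Step 2: apply each pattern elsewhere to extract new candidate pairs.
        for text in corpus:
            for middle in patterns:
                for m in re.finditer(
                    r"(\w[\w ]*?)" + re.escape(middle) + r"([\w ]+\w)", text
                ):
                    pairs.add((m.group(1), m.group(2)))
    return pairs

corpus = [
    "Herman Melville wrote Moby Dick",
    "Jane Austen wrote Emma",
]
pairs = grow_relation(corpus, {("Herman Melville", "Moby Dick")})
```

Starting from one seed (author, title) pair, the loop learns the “ wrote ” context and uses it to pull a second pair out of the other sentence.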

We consider how to efficiently compute the overlap between all pairs of web documents. This information can be used to improve web crawlers, web archivers, and the presentation of search results, among other applications. We report statistics on how common replication is on the web, and on the cost of computing the above information for a relatively large subset of the web – about 24 million web pages, which corresponds to about 150 gigabytes of textual information.

Web searching and browsing can be improved if browsers and search engines know which pages users frequently visit. ‘Web tracking’ is the process of gathering that information. The goal for Web tracking is to obtain a database describing Web page download times and users’ page traversal patterns. The database can then be used for data mining or for suggesting popular or relevant pages to other users. We implemented three Web tracking systems and compared their performance. In the first system, rather than connecting directly to Web sites, a client issues URL requests to a proxy. The proxy connects to the remote server and returns the data to the client, keeping a log of all transactions. The second system uses “sniffers” to log all HTTP traffic on a subnet. The third system periodically collects browser log files and sends them to a central repository for processing. Each of the systems differs in its advantages and pitfalls. We present a comparison of these techniques.

In this paper, we study how to make web servers (e.g., Apache) more crawler friendly. Current web servers offer the same interface to crawlers and regular web surfers, even though crawlers and surfers have very different performance requirements. We evaluate simple and easy-to-incorporate modifications to web servers so that there are significant bandwidth savings. Specifically, we propose that web servers export meta-data archives describing their content.

Many information sources on the web are relevant primarily to specific geographical communities. For instance, web sites containing information on restaurants, theatres, and apartment rentals are relevant primarily to web users in geographical proximity to these locations. We make the case for identifying and exploiting the geographical location information of web sites so that web applications can rank information in a geographically sensitive fashion. For instance, when a user in Palo Alto issues a query for “Italian Restaurants,” a web search engine can rank results based on how close such restaurants are to the user’s physical location rather than based on traditional IR measures. In this paper, we first consider how to compute the geographical location of web pages. Subsequently, we consider how to exploit such information in one specific “proof-of-concept” application we implemented in JAVA.

Today’s Internet search engines help users locate information based on the textual similarity of a query and potential documents. Given a large number of documents available, the user often finds too many documents, and even if the textual similarity is high, in many cases the matching documents are not relevant or of interest. Our goal is to explore other ways to decide if documents are “of value” to the user, i.e., to perform what we call “value filtering.” In particular, we would like to capture access information that may tell us-within limits of privacy concerns-which user groups are accessing what data, and how frequently. This information can then guide users, for example, helping identify information that is popular, or that may have helped others before. This is a type of collaborative filtering or community-based navigation. Access information can either be gathered by the servers that provide the information, or by the clients themselves. Tracing accesses at servers is simple, but often information providers are not willing to share this information. We, therefore, are exploring client-side gathering. Companies like Alexa are currently using client gathering in the large. We are studying client gathering at a much smaller scale, where a small community of users with shared interest collectively track their information accesses. For this, we have developed a proxy system called the Knowledge Sharing System (KSS) that monitors the behavior of a community of users. Through this system, we hope to 1. Develop mechanisms for sharing browsing expertise among a community of users; and 2. Better understand the access patterns of a group of people with common interests, and develop good schemes for sharing this information.

In the face of small, one- or two-word queries, high volumes of diverse documents on the Web are overwhelming search and ranking technologies that are based on document similarity measures. The increase in multimedia data within documents sharply exacerbates the shortcomings of these approaches. Recently, research prototypes and commercial experiments have added techniques that augment similarity-based search and ranking. These techniques rely on judgments about the ‘value’ of documents. Judgments are obtained directly from users, are derived by conjecture based on observations of user behavior, or are surmised from analyses of documents and collections. All these systems have been pursued independently, and no common understanding of the underlying processes has been presented. We survey existing value-based approaches, develop a reference architecture that helps compare the approaches, and categorize the constituent algorithms. We explain the options for collecting value metadata, and for using that metadata to improve search, the ranking of results, and the enhancement of information browsing. Based on our survey and analysis, we then point to several open problems.

  • Searching the Web (2001), by Arvind Arasu, Junghoo Cho, Hector Garcia-Molina, Andreas Paepcke, and Sriram Raghavan (Stanford University)

We offer an overview of the current Web search engine design. After introducing a generic search engine architecture, we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the use of link analysis for boosting search performance. The most common design and implementation techniques for each of these components are presented. We draw for this presentation from the literature, and from our own experimental search engine testbed. Emphasis is on introducing the fundamental concepts, and the results of several performance analyses we conducted to compare different designs.

In this paper, we study in what order a crawler should visit the URLs it has seen, in order to obtain more “important” pages first. Obtaining important pages rapidly can be very useful when a crawler cannot visit the entire Web in a reasonable amount of time. We define several important metrics, ordering schemes, and performance evaluation measures for this problem. We also experimentally evaluate the ordering schemes on the Stanford University Web. Our results show that a crawler with a good ordering scheme can obtain important pages significantly faster than one without.

Many online data sources are updated autonomously and independently. In this paper, we make the case for estimating the change frequency of the data, to improve web crawlers, web caches and to help data mining. We first identify various scenarios, where different applications have different requirements on the accuracy of the estimated frequency. Then we develop several “frequency estimators” for the identified scenarios. In developing the estimators, we analytically show how precise/effective the estimators are, and we show that the estimators that we propose can improve precision significantly.

Many web documents (such as Java FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times. In this paper, we make the case for identifying replicated documents and collections to improve web crawlers, archivers, and ranking functions used in search engines. The paper describes how to efficiently identify replicated documents and hyperlinked document collections. The challenge is to identify these replicas from an input data set of several tens of millions of web pages and several hundreds of gigabytes of textual data. We also present two real-life case studies where we used replication information to improve a crawler and a search engine. We report these results for a data set of 25 million web pages (about 150 gigabytes of HTML data) crawled from the web.

  • Parallel Crawlers (2002), by Junghoo Cho, Hector Garcia-Molina (Stanford University)

In this paper, we study how we can design an effective parallel crawler. As the size of the Web grows, it becomes imperative to parallelize a crawling process, in order to finish downloading pages in a reasonable amount of time. We first propose multiple architectures for a parallel crawler and identify fundamental issues related to parallel crawling. Based on this understanding, we then propose metrics to evaluate a parallel crawler and compare the proposed architectures using 40 million pages collected from the Web. Our results clarify the relative merits of each architecture and provide a good guideline on when to adopt which architecture.

In this paper, we study how to build an effective incremental crawler. The crawler selectively and incrementally updates its index and/or local collection of web pages, instead of periodically refreshing the collection in batch mode. The incremental crawler can improve the “freshness” of the collection significantly and bring in new pages in a more timely manner. We first present results from an experiment conducted on more than half-million web pages over 4 months, to estimate how web pages evolve over time. Based on these experimental results, we compare various design choices for an incremental crawler and discuss their trade-offs. We propose an architecture for the incremental crawler, which combines the best design choices.

In this paper, we study how to refresh a local copy of an autonomous data source to maintain the copy up-to-date. As the size of the data grows, it becomes more difficult to maintain the copy “fresh,” making it crucial to synchronize the copy effectively. We define two freshness metrics, change models of the underlying data, and synchronization policies. We analytically study how effective the various policies are. We also experimentally verify our analysis, based on data collected from 270 web sites for more than 4 months, and we show that our new policy improves the “freshness” very significantly compared to current policies in use.

This paper discusses efficient techniques for computing PageRank, a ranking metric for hypertext documents. We show that PageRank can be computed for very large subgraphs of the web (up to hundreds of millions of nodes) on machines with limited main memory. Running-time measurements on various memory configurations are presented for PageRank computation over the 24-million-page Stanford WebBase archive. We discuss several methods for analyzing the convergence of PageRank based on the induced ordering of the pages. We present convergence results helpful for determining the number of iterations necessary to achieve a useful PageRank assignment, both in the absence and presence of search queries.
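The PageRank computation the abstract refers to is, at its core, a power iteration over the link graph: each page's score is repeatedly redistributed along its outlinks until the scores stop changing. The sketch below is a generic illustration of that iteration, not the paper's implementation; the tiny example graph, the damping factor of 0.85, and the convergence tolerance are all assumptions.

```python
# Minimal PageRank power iteration over a small link graph.
# Graph, damping factor, and tolerance are illustrative assumptions.

def pagerank(links, damping=0.85, tol=1e-8, max_iter=100):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start uniform
    for _ in range(max_iter):
        new_rank = {p: (1.0 - damping) / n for p in pages}  # teleport mass
        for p, outs in links.items():
            if outs:
                share = rank[p] / len(outs)      # split rank across outlinks
                for q in outs:
                    new_rank[q] += damping * share
            else:                                # dangling page: spread evenly
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        if sum(abs(new_rank[p] - rank[p]) for p in pages) < tol:
            rank = new_rank
            break
        rank = new_rank
    return rank

ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

The memory-efficiency results in the paper come from block-organizing exactly this kind of iteration so that the link structure streams from disk while only the rank vectors stay in memory.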

Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages queries, but comparative evaluation by user studies are expensive, especially when large strategy spaces must be searched (e.g., when tuning parameters). We present a technique for automatically evaluating strategies using Web hierarchies, such as Open Directory, in place of user feedback. We apply this evaluation methodology to a mix of document representation strategies, including the use of text, anchor-text, and links. We discuss the relative advantages and disadvantages of the various approaches examined. Finally, we describe how to efficiently construct a similarity index out of our chosen strategies, and provide sample results from our index.

Clustering is one of the most crucial techniques for dealing with the massive amount of information present on the web. Clustering can either be performed once offline, independent of search queries or performed online on the results of search queries. Our offline approach aims to efficiently cluster similar pages on the web, using the technique of Locality-Sensitive Hashing (LSH), in which web pages are hashed in such a way that similar pages have a much higher probability of collision than dissimilar pages. Our preliminary experiments on the Stanford WebBase have shown that the hash-based scheme can be scaled to millions of URLs.

Allowing users to find pages on the web similar to a particular query page is a crucial component of modern search engines. A variety of techniques and approaches exist to support “Related Pages” queries. In this paper, we discuss shortcomings of previous approaches and present a unifying approach that puts special emphasis on the use of text, both within anchors and surrounding anchors. In the central contribution of our paper, we present a novel technique for automating the evaluation process, allowing us to tune our parameters to maximize the quality of the results. Finally, we show how to scale our approach to millions of web pages, using the established Locality-Sensitive-Hashing technique.
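To make the Locality-Sensitive-Hashing idea concrete, here is a toy MinHash-with-banding sketch: similar pages produce similar signatures, so they collide in at least one hash bucket with high probability, and full comparisons are only needed within buckets. The shingle size, number of hash functions, and band layout below are illustrative assumptions, not the authors' parameters.

```python
# Illustrative MinHash + banded LSH for grouping similar pages.
# Shingle size, hash count, and band width are assumed parameters.
import hashlib
from collections import defaultdict

def shingles(text, k=4):
    """Overlapping k-word shingles of a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def minhash_signature(shings, num_hashes=32):
    """One min-hash value per seeded hash function."""
    return [min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
                for s in shings)
            for seed in range(num_hashes)]

def lsh_buckets(docs, bands=8, rows=4):
    """Hash each signature band; docs sharing a band bucket are candidates."""
    buckets = defaultdict(set)
    for name, text in docs.items():
        sig = minhash_signature(shingles(text))
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(name)
    return {k: v for k, v in buckets.items() if len(v) > 1}
```

Because the signature is a fixed-size summary, this scheme scales to millions of URLs: dissimilar pages almost never share a band, so the quadratic all-pairs comparison is avoided.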

In the original PageRank algorithm for improving the ranking of search-query results, a single PageRank vector is computed, using the link structure of the Web, to capture the relative “importance” of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. By using these (precomputed) biased PageRank vectors to generate query-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared.
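The topic biasing described above amounts to restricting PageRank's random-jump (teleport) mass to a set of representative topic pages instead of spreading it uniformly. The following is a rough, self-contained sketch of that idea; the example graph, damping factor, and iteration count are assumptions, not the paper's setup.

```python
# Sketch of topic-biased PageRank: teleport mass goes only to topic pages.
# Graph, damping, and iteration count are illustrative assumptions.

def topic_pagerank(links, topic_pages, damping=0.85, iters=50):
    """links: page -> outlink list; topic_pages: pages receiving teleport mass."""
    pages = list(links)
    n = len(pages)
    topic = set(topic_pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Teleport mass is concentrated on the topic set, not all pages.
        new = {p: ((1 - damping) / len(topic) if p in topic else 0.0)
               for p in pages}
        for p, outs in links.items():
            share = rank[p] / len(outs) if outs else rank[p] / n
            targets = outs if outs else pages   # dangling page: spread evenly
            for q in targets:
                new[q] += damping * share
        rank = new
    return rank
```

Precomputing one such vector per representative topic, then mixing them at query time according to the query's topic distribution, is what yields the query-specific importance scores the abstract describes.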

In this paper, we consider the design of a reliable multicast facility over an unreliable multicast network. Our multicast facility has several interesting properties: it has different numbers of clients interested in each data packet, allowing us to tune our strategy for each data transmission; has recurring data items, so that missed data items can be rescheduled for later transmission; and allows the server to adjust the schedule according to loss information. We exploit the properties of our system to extend traditional reliability techniques for our case and use performance evaluation to highlight the resulting differences. We find that our reliability techniques can reduce the average client wait time by over thirty percent.

Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of web pages reachable purely by following hypertext links, ignoring search forms and pages that require authorization or prior registration. In particular, they ignore the tremendous amount of high-quality content “hidden” behind search forms, in large searchable electronic databases. In this paper, we provide a framework for addressing the problem of extracting content from this hidden Web. At Stanford, we have built a task-specific hidden Web crawler called the Hidden Web Exposer (HiWE). We describe the architecture of HiWE and present a number of novel techniques that went into its design and implementation. We also present results from experiments we conducted to test and validate our techniques.

A Web repository is a large special-purpose collection of Web pages and associated indexes. Many useful queries and computations over such repositories involve traversal and navigation of the Web graph. However, efficient traversal of huge Web graphs containing several hundred million vertices and a few billion edges is a challenging problem. This is further complicated by the lack of a schema to describe the structure of Web graphs. As a result, naive graph representation schemes can significantly increase query execution time and limit the usefulness of Web repositories. In this paper, we propose a novel representation for Web graphs, called an S-Node representation. We demonstrate that S-Node representations are highly space-efficient, enabling in-memory processing of very large Web graphs. In addition, we present detailed experiments that show that by exploiting some empirically observed properties of Web graphs, S-Node representations significantly reduce query execution times when compared with other schemes for representing Web graphs.

We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We propose and compare different strategies for addressing various issues relevant to distributed index construction. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented.
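The structure being constructed here is an inverted index: a map from each term to the set of documents containing it (the posting list). A toy single-machine version helps fix the idea; the tokenization and document IDs are illustrative, and real systems add term positions, compression, and the distributed partitioning the paper studies.

```python
# Toy inverted index: term -> set of document IDs containing the term.
# Real index builders add positions, compression, and partitioning.
from collections import defaultdict

def build_inverted_index(docs):
    """docs: dict of doc_id -> text."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, term):
    """Return the posting list (set of doc IDs) for a term."""
    return index.get(term.lower(), set())
```

The pipelining technique in the paper overlaps exactly these phases (loading documents, building postings, flushing them to disk) across machines to cut construction time.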

Web crawlers generate significant loads on Web servers and are difficult to operate. Instead of repeatedly running crawlers at many “client” sites, we propose a central crawler and Web repository that multicasts appropriate subsets of the central repository, and their subsequent changes, to subscribing clients. Loads at Web servers are reduced because a single crawler visits the servers, as opposed to all the client crawlers. In this paper, we model and evaluate such a central Web multicast facility for subscriber clients, and for mixes of subscriber and one-time downloader clients. We consider different performance metrics and multicast algorithms for such a multicast facility and develop guidelines for its design under various conditions.

In response to a query, a search engine returns a ranked list of documents. If the query is broad (i.e., it matches many documents) then the returned list is usually too long to view fully. Studies show that users usually look at only the top 10 to 20 results. In this paper, we propose a novel ranking scheme for broad queries that places the most authoritative pages on the query topic at the top of the ranking. Our algorithm operates on a special index of “expert documents.” These are a subset of the pages on the WWW identified as directories of links to non-affiliated sources on specific topics. Results are ranked based on the match between the query and relevant descriptive text for hyperlinks on expert pages pointing to a given result page. We present a prototype search engine that implements our ranking scheme and discuss its performance. With a relatively small (2.5 million page) expert index, our algorithm was able to perform comparably on broad queries with the best of the mainstream search engines.

In this paper, we study the problem of constructing and maintaining a large shared repository of web pages. We discuss the unique characteristics of such a repository, propose an architecture, and identify its functional modules. We focus on the storage manager module and illustrate how traditional techniques for storage and indexing can be tailored to meet the requirements of a web repository. To evaluate design alternatives, we also present experimental results from a prototype repository called “WebBase”, that is currently being developed at Stanford University.

We describe the design, prototyping, and evaluation of ARC, a system for automatically compiling a list of authoritative Web resources on any (sufficiently broad) topic. The goal of ARC is to compile resource lists similar to those provided by Yahoo! or Infoseek. The fundamental difference is that these services construct lists either manually or through a combination of human and automated effort, while ARC operates fully automatically. We describe the evaluation of ARC, Yahoo!, and Infoseek resource lists by a panel of human users. This evaluation suggests that the resources found by ARC frequently fare almost as well as, and sometimes better than, lists of resources that are manually compiled or classified into a topic. We also provide examples of ARC resource lists for the reader to examine.

Search Engine Patents

There are literally hundreds of patents which have either been applied for or awarded to search engines and researchers in the search and information retrieval fields.

I do not believe, as some seem to, that the fact that a patent has been applied for or granted is in itself evidence that the search engine in question has actually applied any portion of it to a live search engine (though there are obvious cases where they have), but prefer to think that patents are taken out for many reasons, including:

  • Protection of techniques that are in fact used by search engines.
  • Protection of promising research that search engines believe may be useful one day.
  • Protection of techniques that search engines intend to use at a later date perhaps when new technologies make it feasible.
  • Protection of techniques that search engines intend to use in the near future.
  • Securing rights to techniques that search engines do not intend to use immediately but which they do not want available to their competitors.
  • It may even be that some portions of some patents are placed there to send us off down the wrong trail.


These are papers that I have found and that interest me. I am always interested in finding additional resources, so if you have any suggestions for papers that might be interesting, please feel free to email me for review and possible inclusion.

Are Black-hat SEO Techniques Good or Bad?

Cloaking – The unethical uses

Not all cloaking is bad, despite the oft-heard cries that cloaking will get you banned; but if you use cloaking in an attempt to deceive the search engines, you can expect trouble.

Cloaking is the common term for using scripts or programs to show one page to search engines and quite another to viewers. Since search engines want their rankings to be based on the page the viewer actually sees, cloaking distorts the relationship between a page’s relevancy and the rankings it receives. Search engines frown on all attempts to deceive them and will most likely remove your site from their index if they find you using cloaking deceptively.

Cloaking – The ethical uses

Like most tools, cloaking can be used ethically and usefully, and in most cases search engines have no problem with such uses; in fact, they often use cloaking and redirects themselves. If you use a script to detect which browser your visitor is using and, based on that, show a slightly different page, this should not be a problem. Likewise, detecting the location or language of a user and redirecting to pages created for that location or language is an ethical technique used to improve the user experience.
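As a concrete sketch of the language-detection case, a server-side script might inspect the browser's Accept-Language request header and pick a localized page. The URL mapping and default below are made up for illustration; the point is that crawlers and users get the same treatment, which is why search engines have no problem with it.

```python
# Minimal sketch: choose a localized page from an Accept-Language header.
# The language-to-URL mapping and default are illustrative assumptions.

LOCALIZED_PAGES = {
    "en": "/en/index.html",
    "fr": "/fr/index.html",
    "de": "/de/index.html",
}

def pick_redirect(accept_language, default="/en/index.html"):
    """Parse 'fr-CA,fr;q=0.9,en;q=0.8' style headers; highest q-value wins."""
    candidates = []
    for part in (accept_language or "").split(","):
        piece = part.strip()
        if not piece:
            continue
        lang, _, q = piece.partition(";q=")
        try:
            weight = float(q) if q else 1.0   # no q-value means weight 1.0
        except ValueError:
            weight = 0.0
        primary = lang.strip().split("-")[0].lower()
        if primary in LOCALIZED_PAGES:
            candidates.append((weight, primary))
    if not candidates:
        return default
    return LOCALIZED_PAGES[max(candidates)[1]]
```

The deceptive variant, by contrast, would branch on whether the visitor is a known crawler and serve it entirely different content, which is exactly what gets sites removed from the index.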

I realize that many will disagree with my suggestion that there are any ethical uses for cloaking, but on examination you will find that search engines themselves employ cloaking to redirect users to their various national and language sites in order to improve the user experience.

Doorway pages are antiquated

Back in the dark days when search engines were slowly groping their way to relevancy, it seemed that every search engine had a different algorithm and what worked for one search engine would not work for another. These days all the search engines are using some combination of on-page content combined with link popularity and anchor text links to rank their results.

Back in those days webmasters often resorted to building several pages on any one topic, each of them optimized in the fashion of the day for one popular search engine.

Every page a doorway page

Today the arguably most important search engine is Google, which again uses a combination of on-page content analysis, PageRank link popularity, and a strong emphasis on inbound anchor text links to produce its ranking results. I find that pages optimized to rank well in Google will usually rank well in both Yahoo and Bing, which taken together cover 90% of the search engine market. You do have to do things like add meta description tags and keywords meta tags to help rank in the last two engines, and you will need more anchor text links to rank well in Google, but if you do your optimization for Google correctly you should rank even higher in these other search engines.

Following the KISS principle, I believe that doorway pages are well… err as dead as a doornail.


Be sure you clearly understand the do’s and don’ts of cloaking before you attempt to use it.

If you have any questions, make sure you are clear and certain about the likely search engine reaction; otherwise, don’t do it.

The Fantomaster site is perhaps one of the best on the web with regard to cloaking. If you have questions about cloaking, that may be a good place to go for answers.

Read this post: Top Off-Page SEO Techniques

Important Google Algorithm Updates

For any SEO, it’s critical to keep up with Google’s new algorithm updates to understand traffic patterns on your website as well as best practices.

In this list of Google Algorithm Updates, we’ll not only document the Search Engine Optimization updates, but the full story of what’s REALLY going on in the trenches including best practices, strategies, and case studies from the community. 

2017 Updates

Intrusive Interstitial Penalty (The Popup Penalty) – Jan 10, 2017

Google’s penalty on publisher sites with intrusive popups went into effect.

Google Mobile Popup Update (HOTH Blog)

2016 Updates

Google Penguin 4.0 Announced – Friday, September 23, 2016

Google Penguin 4 was announced and includes a few pieces. First, it is now part of the core algorithm and updates in real time. Second, it is more “granular,” or page-specific, as opposed to affecting the entire domain.

Google Penguin 4.0, Possum, And Craziness Update September 2016 (HOTH Blog)
Penguin 4 Official Announcement (Google Webmaster Central)

Google Went Crazy September – Early to Mid September 2016

Around Sept 1-2 many tools reported high SERP fluctuations, especially in local search. Unfortunately, there hasn’t been a lot of data to support what exactly changed. Google’s results started changing again around the 15th, so we are waiting for things to calm down.

Is a big Google search update happening? (SEL)
Google downplays the algorithm ranking update this week as “normal fluctuations” (SEL)

Mobile-Friendly Boost Update – May 12, 2016

This was another update that gave a slight boost, within mobile search results, to sites that are mobile-friendly. As with the AMP Project, Google seems to be really focused on mobile, and with good reason.

Continuing to make the web more mobile-friendly (Google)

Adwords Change – Feb 23, 2016

Google removes sidebar ads in the search results and adds a 4th ad to the top block.

The Winners and Losers of The Adwords Shakeup – (HOTH)
Google AdWords Switching to 4 Ads on Top, None on Sidebar

Ghost Update – Jan 8, 2016

Lots of tools reported changes / SERP fluctuations around these dates in early January. Most SEOs expected this to be the new Penguin update, but Google denies this. Google said later on that this was a core algorithm update. There were no reports of huge losses.

Google Core Ranking Update (SER)

2015 Updates

RankBrain Algo Change – Announced Oct 26, 2015, Went Live Months Before

Google announced a change to its algorithm called RankBrain, which applies machine learning (artificial intelligence) to query processing. There are no glaring new differences in ranking factors, however.

RankBrain & Artificial Intelligence? (Bloomberg)

Google Zombie Update – Oct 14th / Oct 15th, 2015

This wasn’t an official update; however, many webmasters reported big fluctuations around this time. There was a huge thread at WebmasterWorld about it.

Google Zombie Update? (SER)

Google Snack Pack / Local 3 Pack – August 2015

Not an algo change but an important update – Google rolled out a new design for local, getting rid of the normal 7 pack (map) and changing it to a 3 pack. This raised a few different points of discussion as we noted in our article on the 3 pack change here.

Google Local Snack Pack Shakeup (Moz)

Panda 4.2 – July 17, 2015

Google announced a Panda update; not much happened. They said it would take months to roll out.

Everything We Know About Panda 4.2 (SEM Post)

Google Quality Update – May 3, 2015

This was called a “Phantom 2” update, and obviously something happened, but it wasn’t confirmed until after the fact. Google didn’t specify anything except a change to “quality signals.”

The Quality Update (SEL)

Google Mobilegeddon Mobile Update – April 22, 2015

Google updated its algorithm to change the way results are ranked on mobile devices. It gave preference to sites that were mobile-friendly and demoted sites that were not mobile-friendly/responsive.

How To Prepare For Mobilegeddon (The HOTH)
Google: Mobile Friendly Update (SEL)

What Really Happened & How to Beat This Update:
When Google released this update, the impact was less than expected. We created an article with all the information on how to check if your site is affected here: Google Mobile Update

No Name Update – February 4, 2015

There was no official update but many SERP tracking tools reported movement.

Google Update Feb 2015 (SER)

2014 Updates

Penguin 3 – Oct 18, 2014

After a year since the last major Penguin update, Penguin 3 started rolling out this past weekend. What was expected to be a brutal release seems to be relatively light in comparison to other updates. According to Google, it affected 1% of US English queries, and this is a multi-week rollout. For comparison, the original Penguin update affected more than 3% of queries (3x as many). There are many reports of recoveries from those who had previous penalties and did link remediation/disavowal.

Google Releases Penguin 3.0 (SEL)
Google Confirms Penguin 3.0 Update (SEJ)
Penguin Update Official (Google)

What Really Happened & How to Beat This Update:
Seems like this update was lighter than expected. Across the sites we track, we haven’t seen much out of the ordinary. Keep in mind that Penguin is traditionally keyword specific and not a site-wide penalty, so take a look at any specific keywords that dropped or pages that dropped and adjust accordingly.

We’ve seen a lot of reports of recovery. Usually, if you were hit by a Penguin penalty in the past, you would need to fix/remove/disavow over-optimized links and wait for an update. Many webmasters have been waiting this whole year for an update and it finally arrived.

Take a look at our Penguin recovery guide here.

Panda 4.1 – Sep 26, 2014

Panda 4.1 started earlier this week and will continue into next week, affecting 3-5% of queries (which is substantial). According to Google “Based on user (and webmaster!) feedback, we’ve been able to discover a few more signals to help Panda identify low-quality content more precisely. This results in a greater diversity of high-quality small- and medium-sized sites ranking higher, which is nice.”

Panda 4.1 — Google’s 27th Panda Update — Is Rolling Out (SEL)

Google Starts De-Indexing Private Blog Networks En Masse – Sep 18, 2014

Although Google has publicly de-indexed public blog networks for years, we started hearing the first reports of the de-indexing of private/semi-private networks. Notable articles from NoHat, ViperChill, NichePursuits, and others came out with varying opinions on the matter.

Special Note To HOTH Users: This update doesn’t affect HOTH users, as we point exactly 0 PBN links to your sites. See our link building strategy here.

No Hat Digital: PBN Sites De-Indexed, How Bad Were We Hit?
ViperChill: Why I’ll Keep Growing My Private Link Network After Google’s “Crackdown”
Niche Pursuits: Alright Google, You Win…I’ll Never Use Private Blog Networks Again!
Source Wave: The Death of PBNS

What Really Happened & How to Beat This Update:
Some amount of de-indexing is normal, and anyone who has ever run a network of any substantial size knows this. What is not normal is having 50% of your network taken out in one fell swoop.

There isn’t a lot of data about what “footprint” has been caught, but common footprints include using SEO hosting, not changing WHOIS info, not using co-citations, and more.

Even if you do things “correctly”, there may not have been a single footprint that makes a site get caught and it could be a combination of factors, including a low website Quality Score.

PBNs are not dead. And if you did things right, or at least pretty close, you shouldn’t see a big change in de-indexing. With that said, times do evolve, SEOs get smarter and we diversify strategies. In the meantime, High PR links still work (Google usually goes after what is currently working).

The best “keep my PBN safe” info that has come out has been from Veteran SEO Stephen Floyd (SEOFloyd) in his Bullet Proof SEO Course.

Google Drops Authorship From Search Results Completely – Aug 28, 2014

After dropping authorship photos from search results, Google completely removed authorship from its search results.

It’s Over: The Rise & Fall Of Google Authorship For Search Results (SEL)
Official Announcement

SSL Becomes Ranking Factor – Aug 7, 2014

Google says it will give sites using SSL a minor boost in rankings. No one cares because this is going to be such a minor minor minor factor we even feel bad including this as an update here.

Google Starts Giving A Ranking Boost To Secure HTTPS/SSL Sites

Google Pigeon – July 24, 2014

Google updated its local search algo to include more signals from traditional search, like the knowledge graph, spelling correction, synonyms, and more. The language used is vague, but early evidence shows a significant drop in the number of “local packs” being served. Search Engine Land made up the name for this update, not to be confused with one of Google’s previous April Fools’ Day jokes. The other feature of this update is that Google is now blending 7-pack rankings with organic factors, meaning that the domain authority of the organic site linked to the Google Local page will help 7-pack rankings.

Google Makes Significant Changes to Local Search

What Really Happened & How to Beat This Update:
Big decline in 7 packs
Local Directories Got A Boost – You should make sure you are in all the relevant local directories since they are getting a boost now. (P.S. You can do that with HOTH Local!)

No Photos on Authorship – June 28, 2014

Photos from the author no longer appear in the SERPs for results with authorship markup. Now they just display the author’s name in text format.

Google Announces End of Author Photos (Moz)

Panda 4.0 – May 19, 2014

The latest addition to the Panda update family. Sources say this was a softer update, and some sites that were previously hit got a boost. We noticed some sites getting dinged for on-site over-optimization. There has been some discussion that rankings are now taking longer than normal: up to 6 weeks for links to be recognized and re-calculated.

Official Announcement
Google Begins Rolling Panda 4.0 Out Now (SEL)
Panda 4 (Moz)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)
The Answers to Google’s 2014 First Algorithm Update (Search Highway) – Relevant
Panda 4 + 5 Steps To Avoid Any Penalty (Source-Wave)
Press releases took a big hit in visibility
4 Week Factor / Slow Rankings

Payday Loan / High-Spam Searches Update 2.0 — May 16, 2014

Right before Panda 4.0, Google rolled out an update that targets queries that are traditionally spammed (SEO-wise). Google says the update happened around 5/20, which makes it hard to isolate, as Panda 4 came out at almost the same time.

Official Announcement

No Name Update — March 24, 2014

Lots of rank trackers and data reported heavy fluctuations, but no update was confirmed by Google.

Did Google Do An Algorithm Update Yesterday? (SERT)

Page Layout #3 — February 6, 2014

A refresh to the page layout algo, originally from Jan 2012 which targets sites that have tons of ads, especially above the fold (in the top section of the website).

Official Announcement

Unnamed Update – January 8th, 2014

An unnamed update appeared to roll out around this time; Google never officially confirmed it.

Is Google Search Updating? January 8th & 9th (SER)

What Really Happened & How to Beat This Update:
The Answers to Google’s 2014 First Algorithm Update (Search Highway)

2013 Updates

Authorship Change — December 19, 2013

Matt Cutts leaked that authorship markup was going to play less of a part going forward, and around Dec 19 we saw a drop-off of about 15% over a period of a month.

Authorshipocalypse! The Great Google Authorship Purge Has Begun (Virante)

No Name Update — December 17, 2013

Almost all algo change trackers showed high activity around Dec 17th, although Google did not confirm an update.

Google Denies A Major Update On December 17th (SEL)

No Name Update — November 14, 2013

Reports of unusual activity went out, appearing alongside reports of widespread DNS errors in Google Webmaster Tools. This was not official, and Google did not confirm any update.

Was There a November 14th Google Update? (Moz)

Penguin 2.1 (#5) — October 4, 2013

This does not appear to be a major change to the Penguin algo, just an update.

Penguin 5, With The Penguin 2.1 Spam-Filtering Algorithm, Is Now Live (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Penguin (HOTH)
You’ll want to just continue to maintain diverse contextual link building strategies.

Hummingbird — August 20, 2013

Google announced the new update on Sep 26 and suggested that “Hummingbird” had actually rolled out about a month earlier, around August 20th. Hummingbird better interprets the intent behind the queries typed into Google, especially conversational ones. There were no widespread reports of penalties like those from Penguin or Panda.

Google Reveals “Hummingbird” Search Algorithm (SEL)

More About Hummingbird:
Learn more about Google Hummingbird (HOTH)

In-depth Article Update — August 6, 2013

Google is now featuring a new type of content in its search results called “In-depth articles,” meant for long articles that cover a topic from A to Z.

How To Appear in Google’s In-Depth Articles Feature

No Name Update — July 26, 2013

Another unconfirmed Google update; however, there were large spikes in rank-tracking activity.

Was There A Weekend Google Update? (SER)

Expansion of Knowledge Graph — July 19, 2013

Significantly more Knowledge Graph data started appearing in search results, showing up in nearly 25% of all searches.

The Day the Knowledge Graph Exploded (Moz)

Panda Update (Fine Tuning) — July 18, 2013

A new Panda update that sources reported as “softer” than others, possibly loosening the effects of previous updates. This one rolled out over a 10-day period.

Confirmed: Google Panda Update: The “Softer” Panda Algorithm (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Multi-Week Update — June 27, 2013

Screaming Frog tweeted that someone was spamming for “Car Insurance” and ranking #2; Matt Cutts replied, “Multi-week rollout going on now, from next week all the way to the week after July 4th.”

Twitter Thread
Multi-Week Update Rolling Out (SER)

Panda Dance — June 11, 2013

Matt Cutts clarified that Google rolls out Panda updates constantly over a period of about 10 days, almost every month. They also said they are unlikely to announce future Panda updates since they are on-going.

Google’s Panda Dance: Matt Cutts Confirms Panda Rolls Out Monthly Over 10 Of 30 Days (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Payday Loan / Spam Query Update — June 11, 2013

Google announced an update to the algo that specifically targets queries that are regularly spammed for SEO, including payday loans, porn, and others.

Google Payday Loan Algorithm: Google Search Algorithm Update To Target Spammy Queries (SEL)

Penguin 2.0 (#4) — May 22, 2013

Google rolled out Penguin 2.0, the 4th iteration of Penguin, affecting 2.3% of English queries. This was an update to the algo itself, not just a data refresh, and it was long-awaited since it came ~6 months after the last one.

Penguin 4 Live (SERT)
Official Announcement

What Really Happened & How to Beat This Update:
Learn more about Google Penguin (HOTH)

Domain Crowding / Diversity — May 21, 2013

An update to help increase the amount of diversity in the SERPs. Previously there were problems where one domain would take up too many spots on the page.

Google’s Matt Cutts: Domain Clustering To Change Again; Fewer Results From Same Domain

No Name Update — May 9, 2013

Reports of algo activity, but nothing official from Google.

Large Google Update Happening Now (SER)

Panda Update 25 — March 13-14, 2013

No exact confirmation, but data suggests that an update to panda hit around Mar 13-14.

Google Panda Update 25 Seems To Have Hit (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda Update 24 — January 22, 2013

Google announced a 24th update to Panda, affecting 1.2% of search queries.

Google Panda Update Version #24; 1.2% Of Search Queries Impacted (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

2012 Updates

Panda #23 — December 21, 2012

Panda hit right before the holidays. Afterward, Google said it would try to avoid updates around the holidays.

Official: It’s Google Panda Update 23, Impacting ~1.3% Of Queries

Knowledge Graph Expansion — December 4, 2012

Knowledge graph becomes available on foreign language queries in Spanish, French, German, Portuguese, Japanese, Russian, and Italian.

Official Announcement

Panda #22 — November 21, 2012

Google confirms Panda refresh 22, affecting 0.8% of queries.

Official Google Panda #22 Update: November 21 (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda #21 — November 5, 2012

Another Google Panda update about a month and a half after the last update. Reported to have affected 1.1% of English queries.

Google Releases Panda Update 21 (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Page Layout Update #2 — October 9, 2012

An update to the page layout update which affected sites that had too many ads above the fold.

It’s “Top Heavy 2″ As Google Rolls Out Update To Its Page Layout Algorithm (SEL)

Penguin #3 — October 5, 2012

Penguin 3 wasn’t as bad as expected.

Google Released 3rd Penguin Update: Not Jarring Or Jolting (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Penguin (HOTH)

August & September 65 Changes Pack — October 4, 2012

Google published a post with 65 updates that went live between August & September 2012. The list includes updates to Knowledge Graph, a Panda refresh, & Knowledge Graph Carousel.

Official Announcement

Exact-Match Domain (EMD) Update — September 27, 2012

Before this update, it seemed as if EMDs were getting a boost. For instance, if you wanted to rank for “Dallas dentist,” having the exact-match domain would be a huge boost. This technique caught on and became widespread in the SEO community, so Google updated its algo to reduce the boost these domains got. This also plays into the over-optimization penalties. It is now recommended that you get brandable, non-keyword-rich domains to help avoid over-optimization of URLs.

Google’s EMD Algo Update – Early Data (Moz)
Deconstructing the EMD update (SEL)

Panda #20 — September 27, 2012

This update came out right alongside the EMD update and was pretty big – Affecting 2.4% of queries.

Google Panda Update 20 Released, 2.4% Of English Queries Impacted (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda 3.9.2 (#19) — September 18, 2012

A refresh to Panda; nothing major, affecting 0.7% of queries.

Google Rolls Out Panda 3.9.2 Refresh (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda 3.9.1 (#18) — August 20, 2012

Another small Panda update.

Confirmed: Google Panda 3.9.1 Update (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

7-Result SERPs — August 14, 2012

Google started displaying only 7 results on the front page for approx 18% of queries.

Google Showing Fewer Results & More From Same Domain

The Pirate / DMCA Penalty — August 10-13, 2012

Google says they will start penalizing sites that get repeatedly accused of copyright infringement.

An update to our search algorithms (Official Announcement)
The Pirate Update: Google Will Penalize Sites Repeatedly Accused Of Copyright Infringement (SEL)

86 Google updates in June & July — August 10, 2012

Big update from the Inside Search blog. 86 updates rolled out in June & July including Panda updates, synonyms, freshness, events in knowledge graph and more.

Search quality highlights: 86 changes for June and July (Official Announcement)

Panda 3.9 (#17) — July 24, 2012

Google pushed out another panda refresh affecting 1% of queries.

Official: Google Panda 3.9 Refresh (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Webmaster Tool Link Warnings — July 19, 2012

Another batch of WMT unnatural link warnings went out. The insane thing is that in June, Google said you needed to pay attention to these warnings and your site would probably drop if you ignored them. But now in July, Google says you may be able to ignore them — basically saying the exact opposite. The best thing to do is to just watch your traffic/rankings.

Insanity: Google Sends New Link Warnings, Then Says You Can Ignore Them (SEL)

What Really Happened & How to Beat This Update:
Learn more about WMT Unnatural Link Warnings (HOTH)

Panda 3.8 (#16) — June 25, 2012

Another panda data refresh.

Official Google Panda Update Version 3.8 On June 25th (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda 3.7 (#15) — June 8, 2012

Another Panda refresh.

Official Announcement
Confirmed: Google Panda 3.7 Update (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

39 Google Updates May 2012 — June 7, 2012

Google posted an official blog post highlighting 39 changes in May including “Better application of inorganic backlinks signals” “Improvements to Penguin” and more.

Search quality highlights: 39 changes for May (Official Announcement)
Google’s May Updates: Inorganic Backlinks, Page Titles, Fresh Results & More (SEL)

Penguin 1.1 (#2) — May 25, 2012

Google posted its first update to the Penguin algo – just a data refresh.

Understanding Penguin 1.1: Be Safe from Updates in 3 Easy Steps (SEJ)

What Really Happened & How to Beat This Update:
Learn more about Google Penguin (HOTH)

Knowledge Graph — May 16, 2012

Google released some additions to search results – Knowledge graph is intended to do a few things. One is to be able to tell the difference between people, places, and things. The second is bringing answers and summaries directly into the search results so you can quickly get facts or information without actually visiting any sites.

Cool Knowledge Graph Video
Introducing the Knowledge Graph: things, not strings (Official Announcement)

52 Google Updates for April 2012 — May 4, 2012

Google posted a blog post with 52 updates that occurred in April 2012 including improvements to freshness signal, (no freshness boost for low-quality content), updates for showing public data, a 15% increase in the base index, improvements to site links & more.

Search quality highlights: 52 changes for April (Official Update)

Panda 3.6 (#14) — April 27, 2012

Another Panda data refresh.

Confirmed: Panda Update 3.6 Happened On April 27th (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Google Penguin — April 24, 2012

The update that shook the SEO world. Known for aggressively punishing sites using too many exact match anchors, Penguin impacted 3.1% of English queries (big update). Google claimed this affects keyword stuffing, but is mostly associated with off-site factors.

Another step to reward high-quality sites (Official Announcement)
Google Launches “Penguin Update” Targeting Webspam In Search Results (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Penguin (HOTH)

Panda 3.5 (#13) — April 19, 2012

Another Panda refresh.

Panda Update 3.5 Is Live: Winners & Losers (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Parked Domain Bug — April 16, 2012

While webmasters reported drops in rankings, Google claimed that there was a bug in the way they classify parked domains.

Google Parked Domain Classifier Error Blamed for Lost Search Rankings

50 Google Updates in March 2012 Pack — April 3, 2012

Another group of updates, including Panda 3.4, changes to how anchor text is calculated, image search improvements, and local query updates.

Search quality highlights: 50 changes for March (Official Announcement)

Panda 3.4 (#12) — March 23, 2012

Another panda refresh.

Google Says Panda 3.4 Is ‘Rolling Out Now’ (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda 3.3 (#11) — February 27, 2012

Another Panda refresh, however, multiple updates happened around this time as well.

Google Confirms Panda 3.3 Update (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

February 40-Pack (2) — February 27, 2012

Google posted a blog post on their Inside Search blog with 40 updates that have been launched in February including updates to “related searches, site links, autocomplete, UI elements, indexing, synonyms, SafeSearch and more.”

Search quality highlights: 40 changes for February (Official Post)

Venice — February 27, 2012

Big update to the way Google displays results – local sites now start showing up even when you type in queries without a geo-modifier. For instance, if you type in “attorney,” you may get localized results based on your IP. Great news for local SEOs and for usability in general: even generic terms like “coffee” bring up locally relevant results!

Understand and Rock the Google Venice Update (SEOmoz)

17 January 2012 Google Updates — February 3, 2012

Another inside search blog post detailing 17 updates released in Jan 2012 including Panda’s integration into the main algo, updates to “freshness.”

17 search quality highlights: January (Official)
Google’s January Search Update: Panda In The Pipelines, Fresher Results, Date Detection & More (SEL)

Page Layout Update — January 19, 2012

Updates to the way pages are judged – If you have too many ads above the fold, you could lose rankings.

Page layout algorithm improvement (Google)

Panda 3.2 (#10) — January 18, 2012

Another Panda update/refresh.

Google Panda 3.2 Update Confirmed (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Search + Your World — January 10, 2012

Now if you use Google+, Google will attempt to put more relevant content into your searches. Things you’ve shared in the past, pictures from your Google+ profile, and things your friends have shared will start appearing in your search results in an effort to surface the most relevant information. Mostly, Google just loves Google+ and wants to force it on everyone.

Search, plus Your World (Google)
Real-Life Examples Of How Google’s “Search Plus” Pushes Google+ Over Relevancy (SEL)

30 Google Updates in January 2012 — January 5, 2012

Another Inside Search blog post highlighting 30 updates including site links, image search improvements and more.

30 search quality highlights – with codenames! (Google)

2011 Updates

10 Google Updates in December 2011 — December 1, 2011

Blog post on updates made in late 2011 in addition to the last list. Updates include a parked domain classifier (reduce the number of parked domains shown to users in results), autocomplete, and image freshness.

Search quality highlights: new monthly series on algorithm changes (Official Update)

Panda 3.1 (#9) — November 18, 2011

Google Panda refresh.

Google Panda 3.1 Update: 11/18 (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

10 Updates made in November 2011 — November 14, 2011

No timeline was given, but Google posted a blog post highlighting 10 updates, including fresher results and improved snippets.

Ten recent algorithm changes (Google)
Improved Snippets, Rank Boost For “Official” Pages Among 10 New Google Algorithm Changes (SEL)

Fresher Results Update — November 3, 2011

A change to the algo where Google wants to display fresher results, especially on queries that are time-sensitive.

Giving you fresher, more recent search results (Google)

Query Encryption — October 18, 2011

SEOs hated this update because it was the beginning of (not provided) showing up in Analytics. Basically, Google started encrypting search data when users are signed in, for privacy reasons. This makes it much harder to understand where your organic traffic is coming from. Good alternatives include SEMrush and Webmaster Tools (GWMT will tell you some of the terms you’re ranking for).

Making search more secure (Official Update)
Google Hides Search Referral Data with New SSL Implementation (SEOmoz)

Panda “Flux” (Update #8) — October 5, 2011

Matt Cutts tweeted: “expect some Panda-related flux in the next few weeks” and gave a figure of “~2%”. There had been lots of Panda activity around this time.

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda 2.5 (Update #7) — September 28, 2011

Another Panda update.

Google Panda 2.5: Losers Include Today Show, The Next Web; Winners Include YouTube, Fox News: (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Pagination Elements — September 15, 2011

Google introduced the rel=”next” and rel=”prev” link attributes to help with pagination crawl/index issues.

Pagination with rel=“next” and rel=“prev” (Official)
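
As a sketch of how these attributes were used (the URLs below are placeholders), page 2 of a paginated series would point to its neighbors from the page’s head:

```html
<!-- Hypothetical page 2 of a 3-page article series -->
<link rel="prev" href="https://example.com/article?page=1">
<link rel="next" href="https://example.com/article?page=3">
```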

Expanded Sitelinks — August 16, 2011

Google expands the display of site links, making navigating to specific content right from search easier.

The evolution of site links: expanded and improved (Google)

Panda 2.4 (Update #6) — August 12, 2011

Panda rolls out internationally.

High-quality sites algorithm launched in additional languages (Google)
Official: Google Panda 2.4 Rolls Out to Most Languages (SER)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda 2.3 (#5) — July 23, 2011

Another manual push of Panda data.

Official: Google Panda 2.3 Update Is Live (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Google+ — June 28, 2011

Google launches Google+, a Facebook competitor of sorts that promises to integrate multiple Google services for a more personalized experience. It differentiates itself with “circles,” so you can share differently with different groups of people. Google+ gained lots of users quickly, but the launch also drew negative criticism.

Google Plus+ Video
Introducing the Google+ project: Real-life sharing, rethought for the web (Google)
Google’s Launch Of Google + Is, Once Again, Deeply Embarrassing — Facebook Must Be Rolling Its Eyes (Business Insider)

Panda 2.2 (Update #4) — June 21, 2011

Panda update.

Official: Google Panda Update 2.2 Is Live (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Schema.org — June 2, 2011

A collaboration between Google, Yahoo, and Microsoft to create structured data. With a home base at schema.org, webmasters can now use a standardized markup for all kinds of data.

Google, Bing & Yahoo Unite To Make Search Listings Richer Through Structured Data (SEL)
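
For illustration, here is a minimal snippet using schema.org’s microdata syntax of the era (the product name and price are invented placeholders, not from the announcement):

```html
<!-- Hypothetical product markup using schema.org microdata -->
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">Acme Widget</span>
  <span itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    $<span itemprop="price">19.99</span>
  </span>
</div>
```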

Panda 2.1 (Update #3) — May 9, 2011

Another small Panda update.

It’s Panda Update 2.1, Not Panda 3.0, Google Says (SEL)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

Panda 2.0 (Update #2) — April 11, 2011

The first update to Panda, which also took it global.

High-quality sites algorithm goes global, incorporates user feedback (Official Update)

What Really Happened & How to Beat This Update:
Learn more about Google Panda (HOTH)

The Google +1 Button — March 30, 2011

Basically a button that’s like a Facebook “like” button – Influences results for people in your circles to help bring trusted content to the top.

+1’s: the right recommendations right when you want them—in your search results (Official Google Update)

Panda Update (Also called Farmer) — February 23, 2011

Big update, the first of its kind, affecting up to 12% of search results. Panda targeted “content farms” – huge sites with low-quality content, thin affiliate sites without much content, sites with large ad-to-content ratios, and on-site over-optimization.

Google’s Farmer/Panda Update: Analysis of Winners vs. Losers (SEOmoz)

Attribution Update — January 28, 2011

This update was to help stop scrapers from stealing content. It affected ~2% of search queries.

Algorithm Change Launched (Matt Cutts)

JCPenney Penalty — January 2011

Google started publicly calling out sites for their SEO practices, starting the PR-scare-train in prep for upcoming updates.

Google Penalizes Overstock for Search Tactics (WSJ)
Lands in Google’s Penalty Box Over Links-for-Discounts Deal (SEW)
New York Times Exposes J.C. Penney Link Scheme That Causes Plummeting Rankings in Google (SEL)

2010 Updates

Start Using Social Signals — December 2010

Google & Bing confirm they use social signals, including those from Twitter & Facebook, to influence rankings.

What Social Signals Do Google & Bing Really Count? (SEL)

Negative Reviews — December 2010

After a big story broke about how a brand was being pushed up the search results as users complained (and left links to the website), Google updated its algo to fix the problem.

A Bully Finds a Pulpit on the Web (NY Times)

Being bad to your customers is bad for business (Google)

Instant Visual Previews — November 2010

Google released an update that allowed you to see a visual preview of a website within the search results. It didn’t last too long.

Beyond Instant Results: Instant Previews (Google)

Google Instant — September 2010

This is an addition to Google Suggest, where Google displays actual results before the query is finished.

About Google Instant (Google)
Google Instant: Fewer Changes to SEO than the Average Algo Update (SEOmoz)

Brand Update — August 2010

Google changed to allow some brands/domains to appear multiple times (up to 8+ times) on page one on certain searches.

Google Search Results Dominated By One Domain (SEL)

Caffeine (Rollout) — June 2010

Caffeine is a new web indexing system that Google rolled out around June 2010. It was intended to speed up the rate of indexing and provide fresher results to users.

Our new search index: Caffeine (Google)
Google’s New Indexing Infrastructure “Caffeine” Now Live (SEL)

May Day — May 2010

An update happened April 28 – May 3 and some webmasters noticed drops in traffic, especially in long-tail traffic. This was an algo shift to help combat content farms and was a precursor to the panda update.

Video: Google’s Matt Cutts On May Day Update (SER)

Google Places — April 2010

The Local Business Center became Google Places. This included all the same features as the previous Local Business Center, but added a few extras like advertising, service areas, and more.

Introducing Google Places (Official)
Google Local Business Center Becomes “Google Places” (SEL)

2009 Updates

Real-time Search — December 2009

Real-time became the real deal: newly indexed content, Google News stories, Twitter feeds, and other sources were pushed together on some SERPs in a real-time feed format. Social media, as well as other sources, kept on growing.

Google Launches Real-Time Search Results (SEL)

Caffeine (Preview) — August 2009

This was Google’s preview of a gigantic change in infrastructure. This change was intended to speed up crawling, enlarge the total index, and incorporate ranking and indexation immediately. In the US, the preview lasted for the rest of the year and wasn’t fully active until early 2010.

Google Caffeine: A Detailed Test of the New Google (Mashable)
Help test some next-generation infrastructure (Google)

Rel-canonical Tag — February 2009

Yahoo, Google, and Microsoft announced their support for the canonical tag. This allowed webmasters to send canonicalization signals without any effect on human visitors.

Learn about the Canonical Link Element in 5 minutes
Canonical URL Tag – The Most Important Advancement in SEO Practices Since Sitemaps (SEOmoz)
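
As a sketch of how the tag works (URLs here are placeholders), the duplicate page declares the preferred version of itself in its head:

```html
<!-- On https://example.com/product?sessionid=123 (a duplicate URL),
     pointing search engines at the preferred version -->
<link rel="canonical" href="https://example.com/product">
```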

Vince — February 2009

This was the major update that many SEOs claim started to support big brands. Even though it was called a “minor change” by Matt Cutts, this update had immense and long-lasting repercussions.

Big Brands – Google Brand Promotion: New Search Engine Rankings Place Heavy Emphasis on Branding (SEO Book)
Google’s Vince Update Produces Big Brand Rankings; Google Calls It A Trust “Change” (SEL)

2008 Updates

Google Suggest — August 2008

Google introduces “Suggest” and makes large changes to the logo/box style homepage. “Suggest” displays suggested searches in a new menu below where the visitor is typing. Later, this would continue to power Google Instant.

News: Finally Gets Google Suggest Feature (SEL)

Dewey — April 2008

In what seemed like a bigger move in late March/early April, it was suspected that Google was pushing its own internal properties, including Google Books, but actual evidence of this is not easily accessible.

Google’s Cutts Asking for Feedback on March/April ’08 Update (SERoundtable)

2007 Updates

Buffy — June 2007

This update was nicknamed “Buffy” because Vanessa Fox was leaving Google. It is pretty unclear what actually happened with this change, but Matt Cutts suggested that Buffy was just an accumulation of smaller changes.

Google “Buffy” Update – June Update (SERoundtable)
SMX Seattle wrap-up

Universal Search — May 2007

This algorithm update integrated old-school search results with Video, Local, Images, News, and other vertical results. After this, the 10-listing SERP was done for.

Google 2.0: Google Universal Search (SEL)

2006 Updates

False Alarm — December 2006

While Google didn’t actually report any changes, there was quite a fuss in the SEO community about potential major changes in November and December 2006.

Google Update Debunked By Matt Cutts (SERoundtable)

Supplemental Update — November 2006

2006 was the year of supplemental index changes, which completely changed how the filtering of pages was handled. Even though it could feel like a penalty, Google said the supplemental index was not intended as one.

Confusion Over Google’s Supplemental Index (SERoundtable)

2005 Updates

Big Daddy — December 2005

Like the later “Caffeine” update, Big Daddy was an infrastructure update. It rolled out over the course of about three months, finishing in March, and changed how canonicalization and redirects (301/302) were handled.

Indexing timeline
Todd, Greg & Matt Cutts on WebMasterRadio (SEOmoz)

Jagger — October 2005

Aimed at targeting low-quality links, Google unleashed the Jagger series of updates in October. These low-quality links included reciprocal links, paid links, and link farms. This update came out from September-November.

A Review Of The Jagger 2 Update (SERoundtable)
Dealing With Consequences of Jagger Update (WMW)

Google Local/Maps — October 2005

Following the launch of the Local Business Center, Google put all of its Maps data into the LBC and asked businesses to update their information. This would eventually lead to quite a few changes for local SEO.

Google Merges Local and Maps Products (Google)

Gilligan — September 2005

This was originally described as a false update. Google claimed they had made no big change to the algorithm, but webmasters were seeing changes all over the board. Matt Cutts posted a blog saying that Google’s index data was updated daily, but that Toolbar PageRank and other metrics were only changed on a quarterly basis.

Google’s Cutts Says Not An Update – I Say An Update, Just Not A Dance (SEW)
What’s an update?

Personalized Search — June 2005

Personalized search “truly” rolled out with this update. Unlike previous versions where users had to create custom settings on their profiles, the Personalized Search update streamlined it completely. While this started out small, Google would use this along with search history for many applications to come.

Google Relaunches Personal Search – This Time, It Really Is Personal (SEW)
Search gets personal (Google)

XML Sitemaps — June 2005

This update gave webmasters the ability to submit XML sitemaps through Webmaster Tools. This bypassed the old HTML sitemaps and gave SEOs some influence over crawling and indexation.

New “Google Sitemaps” Web Page Feed Program (SEW)
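
A minimal XML sitemap in the format introduced here, with placeholder URLs and dates, looks something like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want crawled -->
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2005-06-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>
```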

Bourbon — May 2005

Someone under the name “GoogleGuy” posted that Google would be coming out with “something like 3.5 changes in search quality.” What was a 0.5 change going to be?! Webmaster World members thought that Bourbon influenced how duplicate content, as well as non-canonical URLs, were treated.

Google Update “Bourbon” (Battelle Media)
Bourbon Update Survival Kit (SERoundtable)

Allegra — February 2005

Many webmasters saw ranking changes across the board, but what this update really did was never crystal clear. Some considered that Allegra affected the “sandbox,” while others thought LSI had been changed. This is about the time people began feeling that Google was penalizing suspicious-looking links.

Google’s Feb. 2005 Update (SEW)

Nofollow — January 2005

The “nofollow” attribute was introduced to combat spam and control outbound link quality. Google, Yahoo, and Microsoft all introduced it at the same time. Without being a traditional algorithm update, it really helped clean up spammy blog comments and unvouched-for links, and it was significant in terms of link-graph impact.

Google, Yahoo, MSN Unite On Support For Nofollow Attribute For Links (SEW)
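
In practice, the attribute is added to individual links the site owner does not want to vouch for, such as a link left in a blog comment (the URL below is a placeholder):

```html
<!-- A comment link the site owner does not vouch for -->
<a href="https://example.com/some-site" rel="nofollow">visitor’s link</a>
```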

2004 Updates

Google IPO — August 2004

This definitely was not an algorithm update, but it was a huge event when it comes to Google. 19M shares of Google were sold, it raised $1.67 billion, and the market value was set at just over $20 billion. 4 months later, Google’s share prices would be double that.

Google IPO priced at $85 a share (CNN)

Brandy — February 2004

In February, quite a few changes came out: increased attention to anchor-text relevance, Latent Semantic Indexing (LSI), and link “neighborhoods.” LSI gave Google an increased ability to find synonyms for search terms and boosted keyword analysis.

Google’s Brandy Update (OnPage)
How To Beat Google’s “Brandy” Update (SitePoint)

Austin — January 2004

Austin fixed the problems left over by Florida. Continuing the crackdown on tricky on-page tactics, Google targeted invisible text and META-tag stuffing. There was serious speculation that Google dusted off the “Hilltop” algorithm, beginning to take page relevance extremely seriously.

The latest on update Austin (Google’s January update) (SEJ)
Google Update Austin: Google Update Florida Again

2003 Updates

Florida — November 2003

Here it is. The update that put “SEO” into real play. Numerous sites lost all ranking and there were even more unhappy business owners. Low-value SEO tactics from the late 90s were finally dead. Keyword stuffing was a thing of the past, and the game was getting serious.

What Happened To My Site On Google? (SEW)

Supplemental Index — September 2003

The “Supplemental Index” was introduced so that Google could index more documents without hurting performance. It remained a hot issue among SEOs until the supplemental results were later reintegrated with the main index.

Search Engine Size Wars & Google’s Supplemental Results (SEW)

Fritz — July 2003

The monthly “Google Dance” finally came to an end with the “Fritz” update. Instead of completely overhauling the index on a roughly monthly basis, Google switched to an incremental, “bit by bit” approach. The index was now changing daily.

Explaining algorithm updates and data refreshes (Matt Cutts)
Exclusive: How Google’s Algorithm Rules the Web (Wired)

Esmeralda — June 2003

This was the last scheduled monthly update from Google. After this, a continuous update process was adopted, and the “Google Dance” was replaced with something called “Everflux”. This update marked major infrastructure changes at Google.

Google Update Esmeralda (Kuro5hin)

Dominic — May 2003

This was one of many changes that happened in May, and not the easiest to describe. Bots nicknamed “Freshbot” and “Deepcrawler” were crawling the internet and bouncing sites left and right. Also, the way backlinks were counted and reported changed considerably.

Understanding Dominic – Part 2 (WMW)

Cassandra — April 2003

With Cassandra, Google got down to business on basic link-quality issues. Massive linking from co-owned domains was one of the main focuses of this update, along with hidden text and hidden links.

Google – Update “Cassandra” is here (Econsultancy)

Boston — February 2003

The first “named” Google update, Boston was also the first major monthly update. The first few of these updates combined algorithm changes with index refreshes (the Google Dance). The monthly cadence was abandoned once more frequent updates became necessary.

2002 Updates

1st Documented Update — September 2002

Before the first named update (Boston), there was another in the fall of 2002. While very few details are known about this update, it seemed to include more than the usual Google Dance and PageRank updates.

September 2002 Google Update Discussion – Part 1 (WMW)
Dancing The Google Dance (Level343)

2000 Updates

Google Toolbar — December 2000

This is the one that started all the SEO arguments. Google launched the toolbar for browsers along with Toolbar PageRank (TBPR). When everyone started paying attention to TBPR, that’s when Google began dancing!

Google Launches The Google Toolbar (Google)
