What is the Google Panda SEO Update?
From time to time, Google rolls out major algorithm updates that significantly change its search engine results pages. One of the biggest was Google Panda, which was initially released on February 23, 2011.
The stated purpose of the Google Panda algorithm update was to reward high-quality websites and diminish the presence of low-quality websites in Google’s organic search engine results.
Google’s Panda algorithm shook up the search landscape when it arrived. In its announcement of the release on February 24, 2011, Google said the following:
Many of the changes we make are so subtle that very few people notice them. But in the last day or so we launched a pretty big algorithmic improvement to our ranking—a change that noticeably impacts 11.8% of our queries—and we wanted to let people know what’s going on.
This update is designed to reduce rankings for low-quality sites—sites that are low-value add for users, copy content from other websites, or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis, and so on.
Other information also came out pretty quickly about this release, which Danny Sullivan of Search Engine Land initially called the Farmer update. Between this additional information and the initial announcement, several aspects of Panda quickly became clear:
- This was a very large change. Very few algorithm updates by Google impact more than 10% of all search queries.
- This algorithm is focused on analyzing content quality.
- Scraper sites were targeted.
- Sites lacking substantial unique information were also targeted. In particular, content farms were part of what Google was looking to address—a move that Eric Enge predicted would happen three weeks before this release.
- Google clearly states a strong preference for new research, in-depth reports, and thoughtful analysis.
- The Panda algorithm does not use the link graph as a ranking factor.
The second release of Panda came out on April 11, 2011. What made this release particularly interesting is that it incorporated data gathered from Google’s Chrome Blocklist Extension. This extension to Chrome allowed users to indicate that they wanted pages removed from the search results.
This was the first time that Google ever publicly confirmed that it was using a form of direct user input as a ranking factor in any of its algorithms. Initially, Panda was focused only on the United States but was rolled out internationally on August 12, 2011, to most of the rest of the world except Japan, China, and Korea.
Since that time, Google has confirmed only three Panda updates, with the most recent one being Panda 4.0 on May 20, 2014. To track all Google updates over time, you can check the Google algorithm change history page on the Moz website.
How does Google Panda work?
According to Google, Panda’s initial rollout, which took place over several months, affected up to 12% of English-language search results. The update addressed a number of triggers in Google’s search engine results:
- Thin content: weak pages with very little relevant content.
- Duplicate content: copied content that appears on the internet in more than one place.
- Low-quality content: pages that provide little value to human readers because they lack in-depth information.
- Lack of authority or trustworthiness.
- Low-quality user-generated content (UGC).
- High ad-to-content ratio: pages made up mostly of paid advertisements rather than original content.
- Websites blocked by users.
- Content that mismatches the search query: pages that promise relevant answers in the search results but fail to deliver them when clicked.
For example, a website might have a page titled “Coupons for Whole Foods,” but when you click through, you land on a page that has no coupons whatsoever. Panda had a massive effect on sites like this. So how do you know if you’ve been hit by Google Panda?
One signal of potential Panda penalization is a sudden drop in your website’s organic traffic or search engine rankings correlating with a known date of the algorithm update.
Because Panda was first released on February 23, 2011, you can log into your Google Analytics account and check whether your organic traffic dropped sharply around that date.
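The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `traffic_change`, the 14-day window, and the date-to-sessions dictionary are assumptions standing in for whatever export your analytics tool provides, and a drop near an update date is a hint, not proof, that Panda was the cause.

```python
from datetime import date, timedelta
from statistics import mean


def traffic_change(daily_sessions, update_day, window=14):
    """Return the fractional change in mean daily sessions around update_day.

    daily_sessions maps a date to that day's organic sessions (a hypothetical
    export format). The `window` days before update_day are compared with the
    `window` days starting on update_day. Days missing from the data are skipped.
    """
    before = [
        daily_sessions[d]
        for d in (update_day - timedelta(days=i) for i in range(1, window + 1))
        if d in daily_sessions
    ]
    after = [
        daily_sessions[d]
        for d in (update_day + timedelta(days=i) for i in range(window))
        if d in daily_sessions
    ]
    if not before or not after:
        raise ValueError("not enough data around the update date")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("no traffic before the update date")
    return (mean(after) - baseline) / baseline


# Made-up example data: 1,000 sessions a day before the update, 600 after.
PANDA_RELEASE = date(2011, 2, 23)
sessions = {
    PANDA_RELEASE + timedelta(days=i): (1000 if i < 0 else 600)
    for i in range(-14, 14)
}
change = traffic_change(sessions, PANDA_RELEASE)
print(f"{change:.0%}")  # prints -40%
```

Comparing averages over a window, rather than single days, keeps ordinary weekday and weekend swings from looking like a penalty.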
How to optimize content for Google Panda?
Google has historically offered relatively vague information on how the Panda algorithm works to determine the quality of a site. For example, on May 6, 2011, Amit Singhal offered his advice on building high-quality sites. In it, he suggested a list of questions that you could use to determine if you were on such a site:
- Would you trust the information presented in this article?
- Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?
- Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
- Would you be comfortable giving your credit card information to this site?
- Does this article have spelling, stylistic, or factual errors?
- Are the topics driven by the genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
- Does the article provide original content or information, original reporting, original research, or original analysis?
- Does the page provide substantial value when compared to other pages in search results?
- How much quality control is done on content?
- Does the article describe both sides of a story?
- Is the site a recognized authority on its topic?
- Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?
- Was the article edited well, or does it appear sloppy or hastily produced?
- For a health-related query, would you trust information from this site?
- Would you recognize this site as an authoritative source when mentioned by name?
- Does this article provide a complete or comprehensive description of the topic?
- Does this article contain insightful analysis or interesting information that is beyond obvious?
- Is this the sort of page you’d want to bookmark, share with a friend, or recommend?
- Does this article have an excessive amount of ads that distract from or interfere with the main content?
- Would you expect to see this article in a printed magazine, encyclopedia, or book?
- Are the articles short, unsubstantial, or otherwise lacking in helpful specifics?
- Are the pages produced with great care and attention to detail versus less attention to detail?
- Would users complain when they see pages from this site?
There are a few key points that can be extracted from this advice, and the industry has been able to determine and clarify a number of Panda’s target areas. These include:
- Thin content. As you might expect, this is defined as pages with very little content. Examples might be user profile pages on forum sites with very little information filled in, or an eCommerce site with millions of products, but very little information provided about each one.
- Duplicate content. These may be scraped pages or pages that are only slightly rewritten, and Google can detect them relatively easily. Sites with even a small number of these types of pages can be impacted by Panda.
- Unoriginal content. Even if you create all original articles, this may not be enough. If every page on your site covers topics that have been written about by others hundreds or thousands of times before, then you really have nothing new to add to the Web with your site.
- Poor-quality content. This is content that is inaccurate or poorly assembled. In many cases, this may be hard to detect, but as mentioned in Amit Singhal’s article, one indicator is content that includes poor grammar or a lot of spelling mistakes. Google could also potentially use fact-checking as another way to determine poor-quality content.
- Curated content. Sites that have large numbers of pages with lists of curated links do get hit by Panda. Content curation is not inherently bad, but if you are going to do it, it’s important to incorporate a significant amount of thoughtful commentary and analysis. Pages that simply include lots of links will not do well, nor will pages that include links and only a small amount of unique text.
- Many articles on keyword variations of one topic. This was believed to be one of the original triggers for the Panda algorithm, as it was a popular tactic for content farms. Imagine you wanted to publish content on the topic of schools with nursing programs. Content farm sites would publish many articles on the same topic, with titles such as: “nursing schools,” “nursing school,” “nursing colleges,” “nursing universities,” “nursing education,” and so forth. There is no need for all of those different articles, which prompted Google to target this practice with Panda.
- Database-driven content. The practice of using a database to generate web pages is not inherently bad, but many companies were doing it to an excessive scale. This led to lots of thin-content pages or poor-quality pages, so many of these types of sites were hit by Panda.
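As a rough self-audit for two of these target areas, thin pages and near-duplicate pages, you could run something like the Python sketch below over your own content. Everything in it is an assumption made for illustration: the helper names, the 300-word minimum, the 0.8 similarity cutoff, and the use of Jaccard similarity over three-word shingles. Google has not published how Panda measures either property, so treat the output as a list of pages worth a human look, not as a prediction.

```python
import re
from itertools import combinations


def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(text, size=3):
    """Return the set of overlapping `size`-word sequences in the text."""
    tokens = words(text)
    if len(tokens) < size:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def jaccard(a, b):
    """Share of shingles two pages have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def audit(pages, min_words=300, max_similarity=0.8):
    """Flag thin pages and near-duplicate page pairs.

    pages maps a URL to its main text. Both thresholds are arbitrary
    illustrative values, not figures published by Google.
    """
    thin = sorted(url for url, text in pages.items() if len(words(text)) < min_words)
    sets = {url: shingles(text) for url, text in pages.items()}
    near_duplicates = [
        (a, b)
        for a, b in combinations(sorted(pages), 2)
        if jaccard(sets[a], sets[b]) >= max_similarity
    ]
    return thin, near_duplicates
```

Calling `audit({"/a": text_a, "/b": text_b})` returns a list of URLs under the word minimum and a list of URL pairs whose text overlaps heavily. Comparing every pair of pages is fine for a few thousand URLs; a much larger site would need a faster technique.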
How do I recover from a Panda penalty?
Weak content in even a single section of a larger site can cause Panda to lower the rankings for the whole site. This is true even if the content in question makes up less than 20% of the site’s pages. When you are putting together a plan to recover from Panda, it is important to take this into account.
The road to recovery from a Panda penalty may be a long one. Oftentimes it requires a substantial reevaluation of your site’s business model. You need to be prepared to look at your site with a highly critical eye, and this is often very hard to do with your own site.
Thus, it’s a good idea to consider bringing in an external perspective to evaluate your site. You need someone who is willing to look you in the eye and tell you that your baby is ugly. Once you go through this reevaluation process, you may realize that even the basic premise of your site is broken and that you need to substantially restructure it. Making these types of decisions is quite hard, but you need to be prepared to do it.
As you consider these tough choices, it can be helpful to look at your competition that did not get hit. Understand, however, that you may see instances of thin content, weak content, “me too” content, and other poor-quality pages on competitors’ sites that look just as bad as the content penalized on your site, and they may not appear to have been impacted by Panda. Don’t let this type of analysis deter you from making the hard choices.
There are so many factors that Google uses in its ranking algorithms that you will never really know why your site was hit by Panda and your competitor’s site was not. What you do know is that Google’s Panda algorithm does not like something about your site.
This may include complex signals based on how users interact with your listings in the search results, which is data you don’t have access to. To rebuild your traffic, it’s best to dig deep and take on hard questions about how you can build a site full of fantastic content that gets lots of user interaction and engagement.
While it is not believed that social media engagement is a factor in Panda, there is likely a strong correlation between high numbers of social shares and what Google considers to be good content. Highly differentiated content that people really want, enjoy, share, and link to is what you want to create on your site.
There is a science to creating content that people will engage with. We know that picking engaging titles for the content is important, and that including compelling images matters too.
Make a point of studying how to create engaging content that people will love, and apply those principles to every page you create. In addition, measure the engagement you get, test different methods, and improve your ability to produce great content over time.
Ways to address weak pages
As you examine your site, a big part of your focus should be addressing its weak pages. They may come in the form of an entire section of weak content, or a number of pages interspersed among the higher-quality content on your site. Once you have identified those pages, there are a few different paths you can take to address the problems you find:
- Improve the content. This may involve rewriting the content on the page, and making it more compelling to users who visit.
- Add the noindex robots meta tag to the page. This tells Google not to include the page in its index, and thus takes it out of the Panda equation.
- Delete the pages altogether, and 301-redirect visitors to other pages on your site. Use this option only if there are quality pages that are relevant to the deleted ones.
- Delete the pages and return a 410 HTTP status code when someone tries to visit the deleted page. This tells the search engine that the pages have been removed from your site.
- Use the URL removal tool to take the page out of Google’s index. This should be done with great care. You don’t want to accidentally delete other quality pages from the Google index!
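The noindex, 301, and 410 options above can be sketched as follows. The meta tag is the standard robots noindex tag. The `plan_response` function, its arguments, and the example paths are hypothetical, and in practice this mapping would live in your web server or CMS configuration.

```python
# Standard robots meta tag; place it in the <head> of a page you want
# kept out of the index.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def plan_response(path, redirects, removed):
    """Return (status_code, headers) for a requested path.

    redirects: maps a deleted path to a relevant replacement path (301).
    removed:   set of deleted paths with no relevant replacement (410).
    Any other path is served normally (200).
    """
    if path in redirects:
        return 301, {"Location": redirects[path]}
    if path in removed:
        return 410, {}
    return 200, {}


# Hypothetical example paths.
redirects = {"/nursing-colleges": "/nursing-schools"}
removed = {"/profiles/empty-user"}

print(plan_response("/nursing-colleges", redirects, removed))
print(plan_response("/profiles/empty-user", redirects, removed))
```

Checking the redirect map first means a page with a relevant replacement always sends visitors there, and the 410 is reserved for pages that have nowhere sensible to point.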
Expected timeline for recovery
Even though they are no longer announced, Panda releases come out roughly once per month. However, once you have made the necessary changes, you will still need to wait. Google has to recrawl your site to see what changes you have made. It may take Google several months before it has seen enough of the changed or deleted pages to tilt the balance in your favor.
What if you don’t recover?
Sadly, if your results don’t change, this usually means that you have not done enough to please the Panda algorithm. It may be that you were not strict enough in deleting poorer-quality content from your site.
Or, it may mean that Google is not getting enough signals that people really care about what they find on your site. Either way, this means that you have to keep working to make your site more interesting to users.
Go beyond viewing this process as a way to deal with Panda, and instead see it as a mission to make your site one of the best on the Web. This requires substantial vision and creativity. Frankly, it’s not something that everybody can accomplish without making significant investments of time and money.
One thing is clear: you can’t afford to cut corners when trying to address the impact of the Panda algorithm.
Make sure your website has no thin content, no duplicate content, and no low-quality content. In summary, Google Panda was a massive algorithm update, and in my opinion, it made the internet a much better place. I think you would agree, as no one likes to see thin or duplicate content on page one of Google.