Tuesday, January 25, 2011

Google Search and Search Engine Spam




January brought a spate of stories about Google's search quality. Reading through some of these recent articles, you might ask whether our search quality has gotten worse. The short answer is that according to the evaluation metrics that we've refined over more than a decade, Google's search quality is better than it has ever been in terms of relevance, freshness and comprehensiveness. Today, English-language spam in Google's results is less than half what it was five years ago, and spam in most other languages is even lower than in English. However, we have seen a slight uptick in spam in recent months, and while we've already made progress, we have new efforts underway to continue to improve our search quality.

Just as a reminder, webspam is junk you see in search results when websites try to cheat their way into higher positions in search results or otherwise violate search engine quality guidelines. A decade ago, the spam situation was so bad that search engines would regularly return off-topic webspam for many different searches. For the most part, Google has successfully beaten back that type of "pure webspam"—even while some spammers resort to sneakier or even illegal tactics such as hacking websites.

As we've increased both the size and freshness of our index in recent months, we've naturally indexed a lot of good content and some spam as well. To respond to that challenge, we recently launched a redesigned document-level classifier that makes it harder for spammy on-page content to rank highly. The new classifier is better at detecting spam on individual web pages, such as repeated spammy words of the sort you tend to see in junky, automated, self-promoting blog comments. We've also radically improved our ability to detect hacked sites, which were a major source of spam in 2010. And we're evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others' content and sites with low levels of original content. We'll continue to explore ways to reduce spam, including new ways for users to give more explicit feedback about spammy and low-quality sites.
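To make the idea of spotting "repeated spammy words" on a single page concrete, here is a toy sketch in Python. It is purely illustrative and is not how Google's classifier works: the term list, the scoring rule and the threshold are all assumptions invented for this example.

```python
import re
from collections import Counter

# Hypothetical term list, for illustration only (not a real spam lexicon).
SPAM_TERMS = {"cheap", "pills", "casino", "free", "click"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def spam_score(text, spam_terms=SPAM_TERMS):
    """Return the fraction of tokens that are repeat occurrences of a listed term.

    The first occurrence of each term is not counted, so a single mention
    of a word like "free" does not raise the score.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(t for t in tokens if t in spam_terms)
    repeated = sum(c - 1 for c in counts.values() if c > 1)
    return repeated / len(tokens)


def looks_spammy(text, threshold=0.2):
    """Flag a page whose repeated-term fraction meets an assumed threshold."""
    return spam_score(text) >= threshold
```

A blog comment such as "cheap pills cheap pills cheap pills buy now" scores 0.5 under this rule and would be flagged, while ordinary prose that mentions a listed word once scores 0. A real document-level classifier would weigh many more signals than word repetition.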

As "pure webspam" has decreased over time, attention has shifted instead to "content farms," which are sites with shallow or low-quality content. In 2010, we launched two major algorithmic changes focused on low-quality sites. Nonetheless, we hear the feedback from the web loud and clear: people are asking for even stronger action on content farms and sites that consist primarily of spammy or low-quality content. We take pride in Google search and strive to make each and every search perfect. The fact is that we're not perfect, and combined with users' skyrocketing expectations of Google, these imperfections get magnified in perception. However, we can and should do better.

One misconception that we've seen in the last few weeks is the idea that Google doesn't take as strong action on spammy content in our index if those sites are serving Google ads. To be crystal clear:
  • Google absolutely takes action on sites that violate our quality guidelines regardless of whether they have ads powered by Google;
  • Displaying Google ads does not help a site's rankings in Google; and
  • Buying Google ads does not increase a site's rankings in Google's search results.
These principles have always applied, but it's important to affirm they still hold true.

People care enough about Google to tell us—sometimes passionately—what they want to see improved. We deeply appreciate this feedback. Combined with our own scientific evaluations, user feedback allows us to explore every opportunity for possible improvements. Please tell us how we can do a better job, and we'll continue to work towards a better Google.


Recent statistics have shown a decline in the number of U.S. students taking computer science AP classes, which in turn leads to a decline in students declaring computer science as their major. This is a concerning trend in the U.S. as we try to remain competitive in the global economy. With programs like Computer Science for High School (CS4HS), we hope to increase the number of CS majors, and therefore the number of people entering careers in CS, by promoting computer science curriculum at the high school level.

For the fourth consecutive year, we're funding CS4HS to invest in the next generation of computer scientists and engineers. CS4HS is a workshop for high school and middle school computer science teachers that introduces new and emerging concepts in computing and provides tips, tools and guidance on how to teach them. The ultimate goals are to "train the trainer," develop a thriving community of high school CS teachers and spread the word about the awe and beauty of computing.

In 2011 we're expanding the program considerably and hope to double the number of schools we funded in 2010. If you're a university, community college, or technical school in the U.S., Canada, Europe, the Middle East or Africa and are interested in hosting a workshop at your institution, please visit www.cs4hs.com to submit an application for grant funding. Applications will be accepted between January 18, 2011 and February 18, 2011.

In addition to submitting your application, on the CS4HS website you'll find info on how to organize a workshop, as well as websites and agendas from last year's participants to give you an idea of how the workshops were structured in the past. There's also a collection of CS4HS curriculum modules that previous participating schools have shared for future organizers to use in their own program.

Previous organizers have told us that teachers have left their workshops excited about the new materials they learned and the innovative ideas they've discussed with other teachers. We're hopeful that they'll pass on to their students not only the skills that they learned but also that passion.


When I joined Google in 2001 I never imagined—even in my wildest dreams—that we would get as far, as fast as we have today. Search has quite literally changed people's lives—increasing the collective sum of the world's knowledge and revolutionizing advertising in the process. And our emerging businesses—display, Android, YouTube and Chrome—are on fire. Of course, like any successful organization we've had our fair share of good luck, but the entire team—now over 24,000 Googlers globally—deserves most of the credit.

And as our results today show, the outlook is bright. But as Google has grown, managing the business has become more complicated. So Larry, Sergey and I have been talking for a long time about how best to simplify our management structure and speed up decision making—and over the holidays we decided now was the right moment to make some changes to the way we are structured.

For the last 10 years, we have all been equally involved in making decisions. This triumvirate approach has real benefits in terms of shared wisdom, and we will continue to discuss the big decisions among the three of us. But we have also agreed to clarify our individual roles so there's clear responsibility and accountability at the top of the company.

Larry will now lead product development and technology strategy, his greatest strengths, and starting from April 4 he will take charge of our day-to-day operations as Google's Chief Executive Officer. In this new role I know he will merge Google's technology and business vision brilliantly. I am enormously proud of my last decade as CEO, and I am certain that the next 10 years under Larry will be even better! Larry, in my clear opinion, is ready to lead.

Sergey has decided to devote his time and energy to strategic projects, in particular working on new products. His title will be Co-Founder. He's an innovator and entrepreneur to the core, and this role suits him perfectly.

As Executive Chairman, I will focus wherever I can add the greatest value: externally, on the deals, partnerships, customers and broader business relationships, government outreach and technology thought leadership that are increasingly important given Google's global reach; and internally as an advisor to Larry and Sergey.


From left to right: Eric, Larry and Sergey in a self-driving car, in a photo taken earlier today

We are confident that this focus will serve Google and our users well in the future. Larry, Sergey and I have worked exceptionally closely together for over a decade—and we anticipate working together for a long time to come. As friends, co-workers and computer scientists we have a lot in common, most important of all a profound belief in the potential for technology to make the world a better place. We love Google—our people, our products and most of all the opportunity we have to improve the lives of millions of people around the world.


This is the latest in our series of YouTube highlights. Every couple of weeks, we bring you regular updates on new product features, interesting programs to watch and tips you can use to grow your audience on YouTube. Just look for the label "YouTube Highlights" and subscribe to the series. – Ed.

Since our last update, we've featured new music programs, brought you closer to what's going on in government and highlighted some of the best ads of 2010.

Music videos now on YouTube app for Android
We've welcomed VEVO's extensive library of official music videos from artists like Lady Gaga, Rihanna, Kanye West and U2 onto the YouTube 2.0 app for Android, available for mobile phones running Android 2.2 (Froyo). Enjoy!

Broken Social Scene goes live on YouTube
Earlier this week, Canada's indie rock collective Broken Social Scene kicked off their Winter 2011 tour with a live performance at NYC's Terminal 5. You can still catch the show on http://www.youtube.com/bowerypresents.



Your window into the 112th U.S. Congress
John Boehner, the new Speaker of the United States House, and House Oversight Committee Chairman Darrell Issa are making the activities of the House of Representatives more accessible to U.S. citizens via YouTube. Starting in this 112th Congress, all committee hearings of the House Oversight Committee will be available on YouTube, on a new channel called HouseResourceOrg. This was made possible via a Google Project 10^100 grant made to Carl Malamud at PublicResource.org, who will be working with the House to access and upload all of the hearings that the Oversight Committee holds.

Meet the YouTube Symphony Orchestra 2011
The new members of the YouTube Symphony Orchestra 2011 have been selected: 101 people from more than 30 countries around the world are heading to Sydney Opera House to rehearse together for the first time under the conductorship of Michael Tilson Thomas. Come meet the winners and stay tuned for the final performance on Sunday, March 20, which will be streamed live to the world on YouTube.

A sneak peek at "Life in a Day"
In anticipation of the world premiere of "Life in a Day" at the 2011 Sundance Film Festival next week, we're releasing a series of clips between now and then. "Life in a Day" is a documentary film directed by Oscar-winner Kevin Macdonald, produced by Ridley Scott, and filmed on July 24, 2010 by thousands of YouTube users around the world. Watch the first teaser below.



Looking back at the best YouTube ads of 2010
2010 was a breakout year for online video advertising. Earning people's attention has become ever more challenging—but that's only making advertising more fun. Old Spice's "The Man Your Man Could Smell Like" was ranked number one among YouTube ads in an informal poll of the YouTube advertising team and reporters in the industry. Find out what other ads topped last year's list.



Until next time, visit the YouTube Blog for news and updates.