Tuesday, January 25, 2011

Google Search and Search Engine Spam




January brought a spate of stories about Google's search quality. Reading through some of these recent articles, you might ask whether our search quality has gotten worse. The short answer is that according to the evaluation metrics that we've refined over more than a decade, Google's search quality is better than it has ever been in terms of relevance, freshness and comprehensiveness. Today, English-language spam in Google's results is less than half what it was five years ago, and spam in most other languages is even lower than in English. However, we have seen a slight uptick of spam in recent months, and while we've already made progress, we have new efforts underway to continue to improve our search quality.

Just as a reminder, webspam is junk you see in search results when websites try to cheat their way into higher positions in search results or otherwise violate search engine quality guidelines. A decade ago, the spam situation was so bad that search engines would regularly return off-topic webspam for many different searches. For the most part, Google has successfully beaten back that type of "pure webspam"—even while some spammers resort to sneakier or even illegal tactics such as hacking websites.

As we've increased both our size and freshness in recent months, we've naturally indexed a lot of good content and some spam as well. To respond to that challenge, we recently launched a redesigned document-level classifier that makes it harder for spammy on-page content to rank highly. The new classifier is better at detecting spam on individual web pages, e.g., repeated spammy words—the sort of phrases you tend to see in junky, automated, self-promoting blog comments. We've also radically improved our ability to detect hacked sites, which were a major source of spam in 2010. And we're evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others' content and sites with low levels of original content. We'll continue to explore ways to reduce spam, including new ways for users to give more explicit feedback about spammy and low-quality sites.
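To make "repeated spammy words" concrete, here is a minimal sketch of one possible on-page repetition signal. It is purely illustrative and is not the classifier described above; the function name `repeated_phrase_ratio` and the choice of two-word phrases are assumptions made for the example.

```python
import re
from collections import Counter


def repeated_phrase_ratio(text: str, n: int = 2) -> float:
    """Return the fraction of n-word phrases that repeat an earlier phrase.

    Hypothetical toy signal: 0.0 means every phrase is unique; values near
    1.0 mean the page is mostly the same phrase over and over.
    """
    # Lowercase and split into simple word tokens.
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Build the sliding window of n-word phrases.
    phrases = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not phrases:
        return 0.0
    counts = Counter(phrases)
    # Every occurrence after the first counts as a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(phrases)


if __name__ == "__main__":
    print(repeated_phrase_ratio("cheap pills cheap pills cheap pills"))
    print(repeated_phrase_ratio("the quick brown fox jumps"))
```

A real system would combine many signals and be tuned against labeled data; a single ratio like this would misfire on legitimate pages such as song lyrics or product lists.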

As "pure webspam" has decreased over time, attention has shifted instead to "content farms," which are sites with shallow or low-quality content. In 2010, we launched two major algorithmic changes focused on low-quality sites. Nonetheless, we hear the feedback from the web loud and clear: people are asking for even stronger action on content farms and sites that consist primarily of spammy or low-quality content. We take pride in Google search and strive to make each and every search perfect. The fact is that we're not perfect, and combined with users' skyrocketing expectations of Google, these imperfections get magnified in perception. However, we can and should do better.

One misconception that we've seen in the last few weeks is the idea that Google doesn't take as strong action on spammy content in our index if those sites are serving Google ads. To be crystal clear:
  • Google absolutely takes action on sites that violate our quality guidelines regardless of whether they have ads powered by Google;
  • Displaying Google ads does not help a site's rankings in Google; and
  • Buying Google ads does not increase a site's rankings in Google's search results.
These principles have always applied, but it's important to affirm they still hold true.

People care enough about Google to tell us—sometimes passionately—what they want to see improved. We deeply appreciate this feedback. Combined with our own scientific evaluations, user feedback allows us to explore every opportunity for possible improvements. Please tell us how we can do a better job, and we'll continue to work towards a better Google.


Recent statistics have shown a decline in the number of U.S. students taking computer science AP classes, which also leads to a decline in students declaring computer science as their majors. That is a concerning trend in the U.S. as we try to remain competitive in the global economy. With programs like Computer Science for High School (CS4HS), we hope to increase the number of CS majors, and therefore the number of people entering into careers in CS, by promoting computer science curriculum at the high school level.

For the fourth consecutive year, we're funding CS4HS to invest in the next generation of computer scientists and engineers. CS4HS is a workshop for high school and middle school computer science teachers that introduces new and emerging concepts in computing and provides tips, tools and guidance on how to teach them. The ultimate goals are to "train the trainer," develop a thriving community of high school CS teachers and spread the word about the awe and beauty of computing.

In 2011 we're expanding the program considerably and hope to double the number of schools we funded in 2010. If you're a university, community college, or technical school in the U.S., Canada, Europe, the Middle East or Africa and are interested in hosting a workshop at your institution, please visit www.cs4hs.com to submit an application for grant funding. Applications will be accepted between January 18, 2011 and February 18, 2011.

In addition to submitting your application, on the CS4HS website you'll find info on how to organize a workshop, as well as websites and agendas from last year's participants to give you an idea of how the workshops were structured in the past. There's also a collection of CS4HS curriculum modules that previous participating schools have shared for future organizers to use in their own program.

Previous organizers have told us that teachers have left their workshops excited about the new materials they learned and the innovative ideas they've discussed with other teachers. We're hopeful that they'll pass on to their students not only the skills that they learned but also that passion.


When I joined Google in 2001 I never imagined—even in my wildest dreams—that we would get as far, as fast as we have today. Search has quite literally changed people's lives—increasing the collective sum of the world's knowledge and revolutionizing advertising in the process. And our emerging businesses—display, Android, YouTube and Chrome—are on fire. Of course, like any successful organization we've had our fair share of good luck, but the entire team—now over 24,000 Googlers globally—deserves most of the credit.

And as our results today show, the outlook is bright. But as Google has grown, managing the business has become more complicated. So Larry, Sergey and I have been talking for a long time about how best to simplify our management structure and speed up decision making—and over the holidays we decided now was the right moment to make some changes to the way we are structured.

For the last 10 years, we have all been equally involved in making decisions. This triumvirate approach has real benefits in terms of shared wisdom, and we will continue to discuss the big decisions among the three of us. But we have also agreed to clarify our individual roles so there's clear responsibility and accountability at the top of the company.

Larry will now lead product development and technology strategy, his greatest strengths, and starting from April 4 he will take charge of our day-to-day operations as Google's Chief Executive Officer. In this new role I know he will merge Google's technology and business vision brilliantly. I am enormously proud of my last decade as CEO, and I am certain that the next 10 years under Larry will be even better! Larry, in my clear opinion, is ready to lead.

Sergey has decided to devote his time and energy to strategic projects, in particular working on new products. His title will be Co-Founder. He's an innovator and entrepreneur to the core, and this role suits him perfectly.

As Executive Chairman, I will focus wherever I can add the greatest value: externally, on the deals, partnerships, customers and broader business relationships, government outreach and technology thought leadership that are increasingly important given Google's global reach; and internally as an advisor to Larry and Sergey.


From left to right - Eric, Larry and Sergey in a self-driving car in a photo taken earlier today

We are confident that this focus will serve Google and our users well in the future. Larry, Sergey and I have worked exceptionally closely together for over a decade—and we anticipate working together for a long time to come. As friends, co-workers and computer scientists we have a lot in common, most important of all a profound belief in the potential for technology to make the world a better place. We love Google—our people, our products and most of all the opportunity we have to improve the lives of millions of people around the world.


This is the latest in our series of YouTube highlights. Every couple of weeks, we bring you regular updates on new product features, interesting programs to watch and tips you can use to grow your audience on YouTube. Just look for the label "YouTube Highlights" and subscribe to the series. – Ed.

Since our last update, we've featured new music programs, brought you closer to what's going on in government and highlighted some of the best ads of 2010.

Music videos now on YouTube app for Android
We've welcomed VEVO's extensive library of official music videos from artists like Lady Gaga, Rihanna, Kanye West and U2 onto the YouTube 2.0 app for Android, available for mobile phones running Android 2.2 (Froyo). Enjoy!

Broken Social Scene goes live on YouTube
Earlier this week, Canada's indie rock collective Broken Social Scene kicked off their Winter 2011 tour with a live performance at NYC's Terminal 5. You can still catch the show on http://www.youtube.com/bowerypresents.



Your window into the 112th U.S. Congress
John Boehner, the new Speaker of the United States House, and House Oversight Committee Chairman Darrell Issa are making the activities of the House of Representatives more accessible to U.S. citizens via YouTube. Starting in this 112th Congress, all committee hearings of the House Oversight committee will be available on YouTube, on a new channel called HouseResourceOrg. This was made possible via a Google Project 10^100 grant made to Carl Malamud at PublicResource.org, who will be working with the House to access and upload all of the hearings that the Oversight Committee holds.

Meet the YouTube Symphony Orchestra 2011
The new members of the YouTube Symphony Orchestra 2011 have been selected: 101 people from more than 30 countries around the world are heading to Sydney Opera House to rehearse together for the first time under the conductorship of Michael Tilson Thomas. Come meet the winners and stay tuned for the final performance on Sunday, March 20, which will be streamed live to the world on YouTube.

A sneak peek at "Life in a Day"
In anticipation of the world premiere of "Life in a Day," at the 2011 Sundance Film Festival next week, we're releasing a series of clips between now and then. Life in a Day is a documentary film directed by Oscar-winner Kevin Macdonald, produced by Ridley Scott, and filmed on July 24, 2010 by thousands of YouTube users around the world. Watch the first teaser below.



Looking back at the best YouTube ads of 2010
2010 was a breakout year for online video advertising. Earning people's attention has become ever more challenging—but that's only making advertising more fun. Old Spice's "The Man Your Man Could Smell Like" was ranked number one among YouTube ads in an informal poll of the YouTube advertising team and reporters in the industry. Find out what other ads topped last year's list.



Until next time, visit the YouTube Blog for news and updates.


Thursday, December 2, 2010

Place trust and credibility above everything else


For any business to get noticed on the Web, it's vital that decision makers understand how to promote their digital presence on search engines. That's easy for larger companies with budgets for ongoing SEO services, but what about small businesses?

  We've plucked a few of the tips that are most relevant to your business with our Web solution!

  1. Place trust and credibility above everything else.
Matt McGee from the Small Business Search Marketing blog wrote that "Trust is the #1 SEO ranking factor." To that end, he advises readers to earn trust from site visitors by "providing great content," assuring them that "search engines will follow suit."
"Establish yourself as an expert - create excellent content that people will want to link to and share."
If you are a business owner, this should be your highest priority! If you use your website to build credibility, everything else will fall into place naturally! This ties together nicely with another of Campbell's tips...
  2. Use your content to target humans, not search engines.
Marty Lamers at the blog Articulayers suggested that readers write site content with humans in mind rather than to attract the attention of search engine crawlers.
"Create stronger content versus increasing the density of targeted keywords," he wrote. This will make the content more valuable to a wider audience.
  3. Keep it local!
Since most consumers use search engines to find local businesses, businesses should include as much local-oriented content as possible, according to Vedran Tomic of SEO Rabbit.
"To attract visitors from your local area, include words on your website that your customers use to describe your business (not necessarily the words you use). If you serve a specific geographic area, describe that in detail on your website."
  4. Integrate social media.
    This suggestion was provided to Campbell by Tamar Weinberg at Techipedia. According to Weinberg, the popularity of social media sites like Twitter and Facebook, and the trust that consumers place in them, mean that "search engines may rank these pages higher than other destinations on the 'net."
    The secret? It's a no-brainer, and it's so easy with our Web solution! "Create profiles on social media sites (e.g., Twitter) and add content to such sites regularly. Fresh content means search engines will visit often!"

Ready Set - Track Santa - Google Earth Engine




(Cross-posted from the Google.org blog)

Today, we launched a new Google Labs product called Google Earth Engine at the International Climate Change Conference in sunny Cancun, Mexico. Google Earth Engine is a new technology platform that puts an unprecedented amount of satellite imagery and data—current and historical—online for the first time. It enables global-scale monitoring and measurement of changes in the earth's environment. The platform will enable scientists to use our extensive computing infrastructure—the Google "cloud"—to analyze this imagery. Last year, we demonstrated an early prototype. Since then, we have developed the platform, and are excited now to offer scientists around the world access to Earth Engine to implement their applications.

Why is this important? The images of our planet from space contain a wealth of information, ready to be extracted and applied to many societal challenges. Scientific analysis can transform these images from a mere set of pixels into useful information, such as the location and extent of global forests, how those forests are changing over time, and where to direct resources for disaster response or water resource mapping.

The challenge has been to cope with the massive scale of satellite imagery archives, and the computational resources required for their analysis. As a result, many of these images have never been seen, much less analyzed. Now, scientists will be able to build applications to mine this treasure trove of data on Google Earth Engine, providing several advantages:
  • Landsat satellite data archives over the last 25 years for most of the developing world available online, ready to be used together with other datasets including MODIS. And we will soon offer a complete global archive of Landsat.
  • Reduced time to do analyses, using Google's computing infrastructure. By running analyses across thousands of computers, for example, unthinkable tasks are now possible for the first time.
  • New features that will make analysis easier, such as tools that pre-process the images to remove clouds and haze.
  • Collaboration and standardization by creating a common platform for global data analysis.
Google Earth Engine can be used for a wide range of applications—from mapping water resources to ecosystem services to deforestation. It's part of our broader effort at Google to build a more sustainable future. We're particularly excited about an initial use of Google Earth Engine to support development of systems to monitor, report and verify (MRV) efforts to stop global deforestation.

Deforestation releases a significant amount of carbon into the atmosphere, accounting for 12-18% of annual greenhouse gas emissions. The world loses 32 million acres of tropical forests every year, an area the size of Greece. The United Nations has proposed a framework known as REDD (Reducing Emissions from Deforestation and Forest Degradation in Developing Countries) that would provide financial incentives to tropical nations to protect their forests. Reaching an agreement on early development of REDD is a key agenda item here in Cancun.

Today, we announced that we are donating 10 million CPU-hours a year over the next two years on the Google Earth Engine platform, to strengthen the capacity of developing world nations to track the state of their forests, in preparation for REDD. For the least developed nations, Google Earth Engine will provide critical access to terabytes of data, a growing set of analytical tools and our high-performance processing capabilities. We believe Google Earth Engine will bring transparency and more certainty to global efforts to stop deforestation.

Over the past two years, we've been working with several top scientists to fully develop this platform and integrate their desktop software to work online with the data available in Google Earth Engine. Those scientists—Greg Asner of the Carnegie Institution for Science, Carlos Souza of Imazon and Matt Hansen of the Geographic Information Science Center at South Dakota State University—are at the cutting edge of forest monitoring in support of climate science.

In collaboration with Matt Hansen and CONAFOR, Mexico's National Forestry Commission, we've produced a forest cover and water map of Mexico. This is the finest-scale forest map produced of Mexico to date. The map required 15,000 hours of computation, but was completed in less than a day on Google Earth Engine, using 1,000 computers over more than 53,000 Landsat scenes (1984-2010). CONAFOR provided National Forest Inventory ground-sampled data to calibrate and validate the algorithm.

A forest cover and water map of Mexico (southern portion, including the Yucatan peninsula), produced in collaboration with scientist Matthew Hansen and CONAFOR.
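The figures quoted for the Mexico map can be sanity-checked with simple arithmetic. The sketch below assumes the work splits evenly across machines with no coordination overhead, which is an idealization; the function name is hypothetical.

```python
def ideal_wall_clock_hours(total_compute_hours: float, machines: int) -> float:
    """Wall-clock time if the work divides perfectly across all machines.

    Assumes perfect parallelism (no scheduling, I/O, or merge overhead),
    so the result is a lower bound on the real elapsed time.
    """
    if machines <= 0:
        raise ValueError("machines must be a positive integer")
    return total_compute_hours / machines


if __name__ == "__main__":
    # 15,000 hours of computation spread over 1,000 computers.
    hours = ideal_wall_clock_hours(15_000, 1_000)
    print(f"Ideal elapsed time: {hours} hours")
```

Under that assumption the job takes about 15 hours, which is consistent with the statement that the map was completed in less than a day.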

We hope that Google Earth Engine will be an important tool to help institutions around the world manage forests more wisely. As we fully develop the platform, we hope more scientists will use the new Earth Engine API to integrate their applications online for deforestation, disease mitigation, disaster response, water resource mapping and other beneficial uses. If you're interested in partnering with us, we want to hear from you. Visit our website! We look forward to seeing what's possible when scientists, governments, NGOs, universities, and others gain access to data and computing resources to collaborate online to help protect the earth's environment.


(Cross-posted on the Google Enterprise Blog)

The U.S. General Services Administration (GSA) today announced its decision to move 17,000 employees and contractors to Google Apps for Government. GSA oversees the business of the U.S. federal government, providing real estate and building management services as well as acquisition and procurement assistance to other federal agencies.

GSA's decision to switch to Google Apps resulted from a competitive request for proposal (RFP) process that took place over the past six months, during which the agency evaluated multiple proposals for replacing their existing on-premises email system. GSA selected Google partner Unisys as the prime contractor to migrate all employees in 17 locations around the world to an integrated, flexible and robust email and collaboration service in 2011.

By making this switch, GSA will benefit in a number of ways. Modern email and collaboration tools will help make employees more efficient and effective. Google Apps will bring GSA a continual stream of new and innovative features, helping the agency keep pace with advances in technology in the years ahead. And taxpayers will benefit too—by reducing the burden of in-house maintenance and eliminating the need to replace hardware to host its email systems, GSA expects to lower costs by 50 percent over the next five years.

Earlier this year, Google Apps became the first suite of cloud computing email and collaboration applications to receive Federal Information Security Management Act (FISMA) certification, enabling agencies to compare the security features of Google Apps to those of existing systems.

GSA is leading the way in embracing the federal government's "cloud first" policy, under which agencies should opt for hosted applications when secure, reliable, cost-effective options are available. We are thrilled that GSA has chosen to move to the cloud with Google and look forward to expanding our productive partnership with them.


(Cross-posted from the Google Mobile Blog)

While we've had oodles of Google doodles on our desktop homepage since Larry and Sergey created our very first in 1998, doodles on our mobile homepage have been few and far between. Today, we're happy to announce that we're bringing more doodles to your phone, beginning with Android 2.0+ and iOS 3+ devices worldwide. In fact, almost all of the doodles we show on our desktop homepage will now have corresponding mobile versions on these phones. When the doodles are available, just go to google.com in your mobile browser to see them.


Want your doodles within easy reach? You can get to google.com quickly by adding a shortcut to your home screen.

A recent article in the New York Times related a disturbing story. By treating your customers badly, one merchant told the paper, you can generate complaints and negative reviews that translate to more links to your site, which in turn make it more prominent in search engines. The main premise of the article was that being bad on the web can be good for business.

We were horrified to read about Ms. Rodriguez's dreadful experience. Even though our initial analysis pointed to this being an edge case and not a widespread problem in our search results, we immediately convened a team that looked carefully at the issue. That team developed an initial algorithmic solution, implemented it, and the solution is already live. I am here to tell you that being bad is, and hopefully will always be, bad for business in Google's search results.

As always, we learned a lot from this experience, and we wanted to share some of that with you. Consider the obvious responses we could have tried to fix the problem:

  • Block the particular offender. That would be easy and might solve the immediate problem for that specific business, but it wouldn't solve the larger issue in a general way. Our first reaction in search quality is to look for ways to solve problems algorithmically.

  • Use sentiment analysis to identify negative remarks and turn negative comments into negative votes. While this proposal initially sounds promising, it turns out to be based on a misconception. First off, the terrible merchant in the story wasn't really ranking because of links from customer complaint websites. In fact, many consumer community sites such as Get Satisfaction added a simple attribute called rel=nofollow to their links. The rel=nofollow attribute is a general mechanism that allows websites to tell search engines not to give weight to specific links, and it's perfect for the situation when you want to link to a site without endorsing it. Ironically, some of the most reputable links to Decor My Eyes came from mainstream news websites such as the New York Times and Bloomberg. The Bloomberg article was about someone suing the company behind Decor My Eyes, but the language of the article was neutral, so sentiment analysis wouldn't have helped here either.

    As it turns out, Google has a world-class sentiment analysis system (Large-Scale Sentiment Analysis for News and Blogs). But if we demoted web pages that have negative comments against them, you might not be able to find information about many elected officials, not to mention a lot of important but controversial concepts. So far we have not found an effective way to significantly improve search using sentiment analysis. Of course, we will continue trying.

  • Yet another option is to expose user reviews and ratings for various merchants alongside their results. Though still on the table, this would not demote poor quality merchants in our results and could still lead users to their websites.
Instead, in the last few days we developed an algorithmic solution which detects the merchant from the Times article along with hundreds of other merchants that, in our opinion, provide an extremely poor user experience. The algorithm we incorporated into our search rankings represents an initial solution to this issue, and Google users are now getting a better experience as a result.
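As an illustration of the rel=nofollow attribute discussed in the options above, the following sketch separates a page's outbound links into those marked nofollow and those that are not. It uses only Python's standard `html.parser` module; the class and function names are hypothetical, and this is not a description of how any search engine processes links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect <a href> targets, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list; compare case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def split_links(html: str):
    """Return (followed_links, nofollow_links) found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.followed, collector.nofollow


if __name__ == "__main__":
    sample = (
        '<a href="https://example.com/news">news story</a>'
        '<a rel="nofollow" href="https://example.com/merchant">complaint</a>'
    )
    print(split_links(sample))
```

A complaint site that marks its links this way can point readers at a merchant without those links being treated as endorsements.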

We can't say for sure that no one will ever find a loophole in our ranking algorithms in the future. We know that people will keep trying: attempts to game Google's ranking, like the ones mentioned in the article, go on 24 hours a day, every single day. That's why we cannot reveal the details of our solution—the underlying signals, data sources, and how we combined them to improve our rankings—beyond what we've already said. We can say with reasonable confidence that being bad to customers is bad for business on Google. And we will continue to work hard towards a better search.



From feasting on a turkey dinner to singing carols around the fire, there are certainly plenty of traditions to enjoy during the holiday season. Much to the delight of the child in each of us, the ritual of gift-giving continues today, and I know I still find cheer at the bottom of my stocking every Christmas morning.

Another tradition that brings joy to youngsters everywhere is the one started in 1955 by NORAD, the North American Aerospace Defense Command, which every year counts down to Christmas Eve and tracks Santa's whereabouts as he delivers presents across the globe. Google similarly started tracking Santa in 2004 and has been partnering with NORAD on this fun project since 2007. Keeping the tradition alive, today marks the kick-off of this year's countdown at www.noradsanta.org. On the NORAD website, kids can play holiday-themed games (a new one is released each day) and get updates from the North Pole as Santa prepares for his big sleigh ride.

If you haven't tracked Santa in years past, we hope this is the year you'll start a new tradition of visiting www.noradsanta.org and following Santa's journey all around the world. Starting at 2 a.m. EST on December 24, you'll be able to track him in real-time on Google Maps from your computer or phone as well as on Google Earth with the plug-in by searching for [santa].

So this year, along with my family's usual tradition of gathering around to hear my mum read "'Twas the Night Before Christmas," we'll gather around the computer to see when Santa might be coming to our neighborhood. In honor of the occasion, I wrote a new opening verse:
'Twas the night before Christmas, and Santa was near
According to NORAD, he would soon be right here
So we hopped into bed and dreamt of new toys
And awoke in the morning to much Christmas joy
Happy holidays to all, and to tide you over till Christmas Eve, enjoy this video with highlights from Santa's journey last year!




(Cross-posted from the LatLong Blog)

Occasionally, we invite distinguished guests to contribute to our blogs and we're very happy to have Wangari Maathai share her perspective here. In collaboration with Wangari Maathai's Green Belt Movement and several other partners, the Google Earth Outreach team has created several narrated tours on the topic of climate change in preparation for the UNFCCC's COP16 Climate Summit 2010 in Cancun, Mexico. Fly underwater to learn about the effects of ocean acidification on sea life with Oceana. Zoom around Mexican mangroves in 3D and learn about the importance of this biodiverse habitat... and what must be done to protect it for future generations. Visit google.com/landing/cop16/climatetours.html to experience these tours. -Ed.

Ask most people what trees mean to them and the first thing that comes to mind is the tree outside their bedroom window or the forest where they played as a child. Trees do occupy a powerful place in our emotions, but the most powerful argument to protect our world's trees is not based on sentiment. There is a vital interdependency between communities and the trees they rely on for survival. Trees are our watersheds, protectors of the natural environment, and sources of food. Remove the trees from the equation and the community feels the impact.

I came to this realization in the 1970s in Kenya. I was talking to women in my community about their problems: hunger, access to water, poverty, wood fuel. I saw a link between their needs and the condition of the land and thought, "Why not plant trees to address these issues?" Trees hold the soil to the ground so that we can grow food in it, they protect watersheds and facilitate harvesting of rainwater, fruit trees supplement food, and trees give us domestic energy and wood with which to build our shelters. So while still working at the University of Nairobi, I established a tree nursery in my backyard, planted seven trees at a public park and founded the Green Belt Movement. The organization works to empower communities, to build their capacity to restore Africa's forests and put an end to the problems that deforestation and other forms of environmental degradation cause. As a result of this idea, more than 40 million trees have been planted to restore the environment and improve the lives of the people who are linked to the land.

When we were offered a unique opportunity to partner with the Google Earth Outreach team on a project using new Google Earth technology to visualize trees in 3D, we were thrilled. For accuracy and integrity we worked very closely with Google, advising them on the modeling of unique African trees like the broad-leaved Croton, the Nile tulip tree and the East African Cordia. These tree models illustrate the biodiversity in our tree planting sites, especially in the forests, and how we carefully select trees that are indigenous and sustainable to the natural surroundings.

Broad-leaved Croton, the Nile tulip tree and the East African Cordia (from left to right)

We then used data from real planting locations to "plant" the tree models in Google Earth and create 3D visualizations. Now, for the first time in Google Earth, people from all over the world will be able to virtually visit these planting sites, explore the 3D trees and connect with the work that we are doing.

Green Belt Movement planting site in 3D on Google Earth

Tree planting is a simple activity with tangible results, and anyone can participate. It helps people come together to address common problems and work collectively towards community improvement and sustainability. I hope that seeing our beautiful tree planting sites in 3D on Google Earth will be a source of inspiration for people to engage, plant trees and organize planting activities in their own communities. Taking charge of our lives and the environment around us can help ensure a lasting legacy and healthy future for our children.



Learn more about the Green Belt Movement and support our work at http://www.greenbeltmovement.org.