Monday, October 20, 2008

The future of search

The Internet has had an enormous impact on people's lives around the world in the 10 years since Google's founding. It has changed politics, entertainment, culture, business, health care, the environment and just about every other topic you can think of. Which got us to thinking, what's going to happen in the next 10 years? How will this phenomenal technology evolve, how will we adapt, and (more importantly) how will it adapt to us? We asked 10 of our top experts this very question, and over the next three weeks we will present their responses. As computer scientist Alan Kay has famously observed, the best way to predict the future is to invent it, so we will be doing our best to make good on our experts' words every day. - Karen Wickre and Alan Eagle, series editors.

I am a search addict. I'm naturally inquisitive – I've always liked finding things out. Plus, I've worked at Google on search for the past 9 years and 3 months. Of course I search - a lot. Yet I would guess that on any given day, I only do about 20% of the searches that I could. This past Saturday, I kept track of the things that came up in conversation that I wanted to search for right then but couldn't:

Are "fab," "goy" and "eely" words? (There was a Scrabble game going on.) What time does J.C. Penney open on Saturday? Which school has a team called the Banana Slugs? What is the team mascot for San Jose State? How much power does that hydroelectric dam generate? What do you call a group of turkeys? What time does Tropic Thunder show? What's the name of that great Irish flute player, first name James? What's the name of the largest city in Russia after Moscow and St. Petersburg? Which is older, a redwood or a cypress? What's the oldest living thing and how old is it? Who sings "Queen of Hearts"? What kind of bird is that flying over there? Is the "LF" in San Francisco on Union Square or Union Street? What are the dance steps to the Charleston? What day of the week was The Lawrence Welk Show on? What are the lyrics to "In the Mood"? How does Coumadin differ from aspirin in its blood thinning effects? What was the story behind the naming of the number "googol"?

And those are just the ones that I remember. Looking at this list, two things are very clear: (1) I could do a lot more searches and (2) search still has a lot of opportunity for innovation, change, and progress. There are lots of ways that search will need to evolve in order to easily meet user needs. Let's look at some of my unanswered questions from Saturday and consider how search might change over the next 10 years.

First, why couldn't I do these searches right then, when I needed to? Because search still isn't accessible enough or easy enough. Search needs to be more mobile – it should be available and easy to use in cell phones and in cars and on handheld, wearable devices that we don't even have yet. For example, when the topic of the oldest living thing came up during a boat ride, everyone in the conversation was curious about it, but no one wanted to break out an awkward, slow device to do a search. It would be much nicer if we had a device with great connectivity that could do searches without interruption. One far-fetched idea: how about a wearable device that does searches in the background based on the words it picks up from conversations, and then flashes relevant facts?

This notion brings up yet another way that "modes" of search will change – voice and natural language search. You should be able to talk to a search engine in your voice. You should also be able to ask questions verbally or by typing them in as natural language expressions. You shouldn't have to break everything down into keywords.

Further, why should a search be words at all? Why can't I enter my query as a picture of the birds overhead and have the search engine identify what kind of bird it is? Why can't I capture a snippet of audio and have the search engine identify and analyze it (a song or a stream of conversation) and tell me any relevant information about it? Services that do parts of that are available today, but not in an easy-to-use, integrated way.

In the next 10 years, we will see radical advances in modes of search: mobile devices offering us easier search, Internet capabilities deployed in more devices, and different ways of entering and expressing your queries by voice, natural language, picture, or song, just to name a few. It's clear that while keyword-based searching is incredibly powerful, it's also incredibly limiting. These new modes will be one of the most sweeping changes in search.

Then there's the media aspect. The 10 blue links offered as results for Internet search can be amazing and even life-changing, but when you are trying to remember the steps to the Charleston, a textual web page isn't going to be nearly as helpful as a video. The media of the results matters.

Universal search, which we released last May, was an important first step that included images, videos, news, books, and maps/local information in our main Google search results. Yet our presentation is still very linear (the results are just a list) and even (no one result is more important or larger than the next). What if the results page began to transform radically to really harness these different types of results into something that felt much more like an answer rather than just 10 independent guesses? What if results pages pulled the best media together and laid it out such that the most useful content was not only first but largest? What if we laid out content in columns to use more of the width available on newer, wider screens?

We've barely scratched the surface with universal search, but it's an important first step toward exploring the full range of what we can do with rich media. For the past year, our goal has been to take advantage of these new types of results and evolve the interface design and user experience in response. You'll see the fruits of this experimentation in the coming months, but even these changes are just the beginning. The face of search will change dramatically over the next 10 years. Maybe it should contain even more videos and images, maybe it should differentiate more sharply between results by relative weight and accuracy, or maybe it should be more interactive in terms of refinements. We're not sure yet, but we do know that the one thing that the search experience can't be - especially in the face of the online media explosion we're currently experiencing - is stagnant.

Search engines 10 years from now will be a lot better than the ones we have now. We know this because Google itself gets a little better each day. We're constantly writing and revising new notions of search relevance, and we release improvements almost daily. Those improvements add up for us and for other search engines, so it follows that search engines 10 years from now will be markedly better. Therefore, the real question is not will search be better, but rather how will it be better?

One answer is clear: search engines of the future will be better in part because they will understand more about you, the individual user. Of course, you will be in control of your personal information, and whatever personal information the search engine uses will be with your permission and will be transparent to you. But even with the most rudimentary user information, search engines can and will provide drastically better search results. Maybe the search engines of the future will know where you are located, maybe they will know what you know already or what you learned earlier today, or maybe they will fully understand your preferences because you have chosen to share that information with us. We aren't sure which personal signals will be most valuable, but we're investing in research and experimentation on personalized search now because we think this will be very important later.

Your location is one potentially useful facet of personalized information. Looking at my questions, the answers to a number of them (What time does J.C. Penney open? How much power does that hydroelectric dam generate? What time does Tropic Thunder play?) require the search engine to know that I was in Yankton, South Dakota and Crofton, Nebraska when I asked. Since location is relevant to a lot of searches, incorporating user location and context will be pivotal in increasing the relevance and ease of search in the future.

Another element of personalization is social context. Who am I friends with, and how do I relate to them? How can I harness their knowledge more efficiently? For example, I have a friend who works at a store called LF in Los Angeles (hence, the question about LF in San Francisco). By itself, "LF" is a very ambiguous acronym. According to the first page of search results on Google, it could refer to my friend's trendy fashion store, but it could also refer to Leapfrog Enterprises, low frequency, Lebhar-Friedman, Li & Fung Investment Group, LF Driscoll Construction Management, large format, or a future concept car design from Lexus. Today, the person typing "LF" has to figure out which is the right result – to "disambiguate" the ambiguous term – but this is something that the search engine needs to get better at. Perhaps we'll understand the semantics of the question about where LF in San Francisco is, and infer that LF is a store. Or maybe, search could analyze my social graph and realize that one of my friends works at LF, that I saw that friend this weekend, and that in that context "LF" refers to her place of employment. Algorithmic analysis of the user's social graph to further refine a query or disambiguate it could prove very useful in the future.
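As a rough illustration of the social-graph idea above, here is a minimal Python sketch. Everything in it is hypothetical: the candidate meanings of "LF" come from the list above, but the `Sense` and `UserContext` types, the signal names, and the scoring weights are assumptions made up for this example, not a description of how Google ranks results.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One possible meaning of an ambiguous query term (hypothetical type)."""
    name: str
    category: str             # e.g. "store", "company", "term"
    base_score: float = 1.0   # stand-in for ordinary relevance


@dataclass
class UserContext:
    """Signals the user has chosen to share (all hypothetical)."""
    friend_employers: set = field(default_factory=set)
    query_hints: set = field(default_factory=set)  # categories implied by the query wording


def disambiguate(senses, context):
    """Rank senses, boosting those supported by social or query context."""
    def score(sense):
        s = sense.base_score
        if sense.name in context.friend_employers:
            s += 2.0  # assumed weight: a friend works there
        if sense.category in context.query_hints:
            s += 1.0  # assumed weight: the query wording implies this category
        return s
    return sorted(senses, key=score, reverse=True)


senses = [
    Sense("Leapfrog Enterprises", "company"),
    Sense("low frequency", "term"),
    Sense("LF (fashion store)", "store"),
    Sense("large format", "term"),
]
context = UserContext(
    friend_employers={"LF (fashion store)"},
    query_hints={"store"},  # "where is LF in San Francisco" suggests a place
)
best = disambiguate(senses, context)[0]
print(best.name)
```

With no shared context, every sense ties and the original order is kept; with the friend's employer and the "store" hint, the fashion store moves to the top. A real system would presumably learn such weights rather than hard-code them.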

In addition, there are searches where actually asking a friend helps. I was having a hard time finding out the answer to the question about aspirin versus Coumadin because I was spelling it 'cumitin' and Google wasn't correcting me. A quick email to a doctor friend, and I was back on the right track - equipped with the right spelling and his explanation of the difference, so I could search and learn even more about how these two drugs are used to thin blood. There's a lot of expertise, knowledge, and context in users' social graphs, so putting tools in place to make "friend-augmented" search easy could make search more efficient and more relevant.

The above examples show how modes, media, and various forms of personalization have the potential to vastly improve search – but what about language? We know there are cases where an answer exists on the web, but not in a language you read. This is why Google is investing in machine translation. We want to be able to unlock the power of web search for anyone speaking any language. The basic concept is – if the answer exists online anywhere in any language, we'll go get it for you, translate it and bring it back in your native tongue. This is an incredibly empowering idea that could really change the way that users experience the web and communicate with each other, particularly in languages where not a lot of native content is available. You can see our early explorations in this space by visiting our cross-language information retrieval tool.
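The "go get it, translate it, bring it back" idea can be sketched as a three-step loop. The Python below is only a toy under stated assumptions: `toy_translate`, `toy_search`, the canned dictionary, and the tiny index are all invented for illustration, standing in for a real machine translation system and a real search index.

```python
# Toy stand-ins: a real system would call a translation service and a search index.
TOY_DICTIONARY = {
    ("en", "es", "oldest tree"): "árbol más antiguo",
    ("es", "en", "El árbol más antiguo es un pino."): "The oldest tree is a pine.",
}

TOY_INDEX = {
    "es": {"árbol más antiguo": ["El árbol más antiguo es un pino."]},
    "en": {},
}


def toy_translate(text, source, target):
    """Look up a canned translation; fall back to the untranslated text."""
    return TOY_DICTIONARY.get((source, target, text), text)


def toy_search(query, lang):
    """Return documents indexed under the exact query in the given language."""
    return TOY_INDEX.get(lang, {}).get(query, [])


def cross_language_search(query, user_lang, target_langs,
                          translate=toy_translate, search=toy_search):
    """Translate the query out, search each language, translate results back."""
    results = []
    for lang in target_langs:
        local_query = query if lang == user_lang else translate(query, user_lang, lang)
        for doc in search(local_query, lang):
            text = doc if lang == user_lang else translate(doc, lang, user_lang)
            results.append((lang, text))
    return results


hits = cross_language_search("oldest tree", "en", ["en", "es"])
print(hits)
```

Here the English index has nothing for the query, but the Spanish index does, so the one result comes back translated into English and tagged with the language it was found in.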

We're all familiar with 80-20 problems, where the last 20% of the solution is 80% of the work. Search is a 90-10 problem. Today, we have a 90% solution: I could answer all of my unanswered Saturday questions, not ideally or easily, but I could get it done with today's search tool. (If you're curious, the answers are below.) However, that remaining 10% of the problem really represents 90% (in fact, more than 90%) of the work. Coming up with elegant, fitting and relevant solutions to meet the challenges of mobility, modes, media, personalization, location, socialization, and language will take decades. Search is a science that will develop and advance over hundreds of years. Think of it like biology and physics in the 1500s or 1600s: it's a new science where we make big and exciting breakthroughs all the time. However, it could be a hundred years or more before we have microscopes and an understanding of the proverbial molecules and atoms of search. Just like biology and physics several hundred years ago, the biggest advances are yet to come. That's what makes the field of Internet search so exciting.

So what's our straightforward definition of the ideal search engine? Your best friend with instant access to all the world's facts and a photographic memory of everything you've seen and know. That search engine could tailor answers to you based on your preferences, your existing knowledge and the best available information; it could ask for clarification and present the answers in whatever setting or media worked best. That ideal search engine could have easily and elegantly quenched my withdrawal and fueled my addiction on Saturday. I'm very proud that Google in its first 10 years has changed expectations around information and how quickly and easily it can be retrieved. But I'm even more excited about what Google search can achieve in the future.

And here, in order, are the answers to my Saturday questions.

Are fab, goy, and eely words? Yes, yes, and yes, according to Merriam-Webster:
Search: [fab]
Search: [goy]
Search: [eely]

What time does J.C. Penney open on Saturday? 10 a.m.
Search: [jc penney yankton]
Hours on results page:

Which school has a team called the Banana Slugs? University of California, Santa Cruz
Search: [banana slugs]

What is the team mascot for San Jose State? The San Jose State Spartans
Search: [san jose state mascot]
On results page:

How much power does that hydroelectric dam generate? $35M of electricity annually
Search: [hydroelectric dam crofton yankton]
Search: [gavins point dam]

What do you call a group of turkeys? A rafter of turkeys
Search: [group of turkeys]
On results page:

What time does Tropic Thunder show? 7 p.m.
Search: [movies yankton mall]

What's the name of that great Irish flute player, first name James? James Galway
Search: [irish flute player james]
On results page:

What's the name of the largest city in Russia after Moscow and St. Petersburg? Novosibirsk
Search: [largest Russian cities]

What's older, a redwood or a cypress? Cypresses (4500 years old is oldest known) are older than redwoods (2200 years old is oldest known)
Search: [cypress tree age]
Search: [redwood tree age]

What's the oldest living thing and how old is it? The bristlecone pine, living for 5,000-11,000 years
Search: [oldest living thing]

Who sings "Queen of Hearts"? Juice Newton
Search: ["queen of hearts" song]
On results page:

What kind of bird is that flying over there? A turkey vulture
Search: [turkey vulture flying] on Google image search
Pictures that match on results page:

Is the LF in San Francisco on Union Square or Union Street? 1870 Union Street
Search: [lf san francisco]
Address on results page:

What are the dance steps to the Charleston? Shown in the video below
Search: [Charleston dance demonstration]
Video result:

What day of the week was The Lawrence Welk Show on? Saturday
Search: [lawrence welk show]

What are the lyrics to "In the Mood"?
"In the mood, that's what he told me,
In the mood, and when he told me,
In the mood, my heart was skippin',
It didn't take me long to say 'I'm in the mood now'."
Search: ["in the mood" lyrics]

How does Coumadin differ from aspirin in its blood thinning effects? Aspirin is an anti-platelet agent that prevents clotting. Coumadin also prevents clotting but the mechanism is different. Both thin the blood, but Coumadin is stronger and much more effective in certain instances like atrial fibrillation.
Search: [aspirin Coumadin how different]

Link - from The Official Google Blog