Waterloo research: Why can Google read your mind?

When Professor Charles Clarke and one of his students entered a name into search engines on their computers recently, they received wildly different results.

For Clarke, associate director of undergraduate studies at Waterloo’s Cheriton School of Computer Science, information about a local computer scientist was the first link in a list of responses.

But the student’s top link led to information about an Iranian pop singer.

Getting such different responses isn’t as strange as it might seem. After all, the student was from Iran, and the web search history on her computer reflected her interests.

Charles Clarke (Photo credit: Mirko Petricevic)

“Why can Google read your mind?” Clarke asks. “It’s seen your mind before.”

Clarke himself is partly responsible.

Drowning in data

He’s working to make your digital information searches more relevant to you. Clarke says academics, like himself, and researchers who work for businesses need to work together — partly because commercial search providers have so much information.

“I have a drop, they have an ocean,” says Clarke. Drowning in data, large companies sometimes provide real-world search data for Clarke’s research.

But while online search engines occupy a huge part of the field known as information retrieval, or information access, they aren’t the only technologies being used.

Researchers are also working to make search results more relevant for people looking for information that’s hidden in files on their personal computers or lurking in electronic libraries.

Sometimes a person needs only two paragraphs from a 300-page book, Clarke says. So he’s trying to help users zero in on those relevant paragraphs.

Research on the effectiveness of search tools

In a recent paper co-authored by Clarke and Mark Smucker, of Waterloo’s Department of Management Sciences, the researchers explain a new method they devised to measure the effectiveness of information retrieval systems.

In their method, a user’s experience is considered more valuable when he or she has spent more time consuming useful material than searching for it, a variable known as “Time Well Spent.”

The new method can be used to estimate:

the difference in Time Well Spent for the general population using one system compared with another
the probability that a randomly selected individual user, or topic, would prefer one retrieval system over another

Clarke and Smucker’s method produces measurements on a per-query basis. Traditionally, researchers used cruder measurements, calculated from assumptions about larger populations, to gauge the effectiveness of information retrieval systems.
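To make the two estimates concrete, here is a minimal Python sketch. It is not Clarke and Smucker's actual method, which the article does not spell out. It assumes we already have, for the same handful of users, the number of seconds each spent on useful material under two systems. The numbers, the names `useful_seconds_a` and `useful_seconds_b`, and both helper functions are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical data: seconds each of five users spent consuming useful
# material under retrieval systems A and B (made-up numbers).
useful_seconds_a = [120.0, 45.0, 300.0, 80.0, 10.0]
useful_seconds_b = [90.0, 60.0, 280.0, 80.0, 40.0]


def mean_difference(a, b):
    """Average paired difference in useful time (A minus B)."""
    return mean(x - y for x, y in zip(a, b))


def preference_probability(a, b):
    """Share of users who got more useful time from A than from B.

    Ties count as half a win for each system (an assumption of this sketch).
    """
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x, y in zip(a, b))
    return wins / len(a)


print(mean_difference(useful_seconds_a, useful_seconds_b))
print(preference_probability(useful_seconds_a, useful_seconds_b))
```

The first number is a population-level comparison, while the second asks how often a single, randomly chosen user would come out ahead with system A. The two can disagree: a system can win on average yet be preferred by fewer than half of its users.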

Search on mobile devices is new frontier

Clarke says the new frontier in information access is making the search experience better for people using mobile devices.

Retrieving information based on a user’s location, automatic suggestions and automatic completion of queries are just some of researchers’ current concerns.

And when he envisions the future, Clarke turns to the history of the tablet computer. Few people predicted its popularity. But now, he says, people are “glued” to their iPads.

“In information access, there’s still room to provide these types of magical experiences.”