The internet giant is reportedly interested in a new measure for ranking search results: how factually accurate a site is.
image via (cc) flickr user __Andrew
For those who run a website, playing the search engine optimization game is a necessary evil if you want your stuff to be seen by anyone who isn’t in your immediate family. Mastering the dark alchemy of SEO to ensure your site becomes one of the top hits on Google is one of the more difficult forms of internet manipulation, but doing so can mean the difference between showing up on the first page of a web search, or the 947th.
Certain factors, like keywords and the number of other sites linking to yours, have, to date, helped Google determine where to rank a page. Now a new report says the search engine giant is looking at another way to potentially determine the order in which it presents search results: how factually accurate a site is.
A Google research team is adapting that model [of inbound links] to measure the trustworthiness of a page, rather than its reputation across the web. Instead of counting incoming links, the system—which is not yet live—counts the number of incorrect facts within a page. "A source that has few false facts is considered to be trustworthy," says the team (arxiv.org/abs/1502.03519v1). The score they compute for each page is its Knowledge-Based Trust score.
The study cited above, entitled "Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources," explains:
On synthetic data, we show that our method can reliably compute the true trustworthiness levels of the sources. We then apply it to a database of 2.8B facts extracted from the web, and thereby estimate the trustworthiness of 119M webpages. Manual evaluation of a subset of the results confirms the effectiveness of the method.
In effect, the team has developed a way to determine what data should be pulled from a webpage in order to be evaluated for truthfulness, and then, by measuring that against information stored in Google’s “Knowledge Vault” of accepted true facts, can determine just how truthful a web page is. They can then factor that factualness into Google’s search results, and—voilà—the truth-ier pages rise to the top.
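To make the idea concrete, here is a minimal sketch of that scoring step in Python. This is not Google’s actual system (the paper describes a joint probabilistic model over extractors and sources); the vault contents, fact format, and the simple correct-over-checked ratio below are all illustrative assumptions.

```python
# A hypothetical "knowledge vault" of accepted (subject, attribute) -> value facts.
KNOWLEDGE_VAULT = {
    ("Barack Obama", "office"): "President of the United States",
    ("Paris", "country"): "France",
    ("Water", "formula"): "H2O",
}

def kbt_score(extracted_facts):
    """Toy trust score: the fraction of a page's extracted facts that
    agree with the knowledge vault. Facts the vault doesn't cover are
    skipped, since there's no ground truth to check them against."""
    checked = 0
    correct = 0
    for (subject, attribute), value in extracted_facts.items():
        known = KNOWLEDGE_VAULT.get((subject, attribute))
        if known is None:
            continue  # no accepted fact to compare against
        checked += 1
        if value == known:
            correct += 1
    return correct / checked if checked else None  # None: nothing to judge

# A page asserting one true fact and one false one scores 0.5.
page = {
    ("Barack Obama", "office"): "President of the United States",
    ("Paris", "country"): "Italy",  # a false fact
}
print(kbt_score(page))  # 0.5
```

A page with “few false facts” keeps a ratio near 1.0, which is the intuition behind the quote above; the real model additionally accounts for extraction errors, so a page isn’t penalized for facts that were merely scraped incorrectly.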
Separating the factual from the bullshit is, in theory, great—especially given the degree to which ambiguity, half-truths, and outright lies can thrive online. And, as New Scientist details, Google’s Knowledge Vault consistently cross-indexes existing “facts” with newly incorporated data to determine the continued validity of a claim. Truthfulness is determined by informed probability, rather than simply as a “true/false” binary. For example, nearly all websites probably agree that Barack Obama is the President of the United States, so that “fact” would be considered to have an exceptionally high probability of being true.
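One way to picture that probabilistic approach, rather than a true/false binary, is to treat each site’s claim as a vote weighted by that site’s trust score. The function and numbers below are purely illustrative, not taken from the paper:

```python
def claim_probability(votes):
    """votes: list of (site_trust, agrees) pairs for one claim.
    Returns the trust-weighted share of agreement, read as a rough
    probability that the claim is true."""
    total = sum(trust for trust, _ in votes)
    agree = sum(trust for trust, agrees in votes if agrees)
    return agree / total if total else 0.0

# Nearly all (trusted) sites agree Barack Obama is the President;
# a single low-trust site disagrees, barely denting the estimate.
votes = [(0.9, True), (0.8, True), (0.95, True), (0.2, False)]
print(round(claim_probability(votes), 2))  # 0.93
```

Because the weights come from each source’s own track record, a swarm of low-trust pages repeating a falsehood moves the estimate far less than a handful of reliable ones contradicting it.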
Still, given the make-or-break power of Google’s ubiquitous search ranking over what information a web user does, and doesn’t, typically see, there's something a bit unsettling about a corporation looking to become an arbiter of truthfulness. It sounds as if Google’s algorithm is both effective and accurate, but when it comes to internet “truths” we’re probably all better off taking everything with a few grains of salt. And that’s a fact.