Friday, March 7th, 2014

Despite its success, Google's famous PageRank algorithm has never understood a word of the billions of web pages it has led people to over the years. That is why, in 2010, Google acquired Metaweb, a company dedicated to building a database that would give computers the ability to understand the world. Two years later, the company's technology resurfaced as the Knowledge Graph. John Giannandrea, a Google vice president of engineering and co-founder of Metaweb, says it will help future Google products truly understand the people who use them and the things that matter to them. He spoke with Tom Simonite of MIT Technology Review about how to do this with a data store designed to bring together all of the world's knowledge.

What is the Knowledge Graph?

It is a synthesis of what Google knows about the world. Maps are an analogy I often use. For a maps product, you have to build a database of the real world and know that there are things called streets, rivers, and countries in the physical world. That creates a symbolic structure for the physical world, and the Knowledge Graph does the same thing for the world of ideas and common sense. We have entities in the Knowledge Graph for foods, recipes, products, ideas in philosophy and history, and famous people. We can create relationships between them, so we can say that two people are married, that this place is in this country, or that this film has to do with this person.
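As an editorial illustration of the entities and relationships described in this answer, here is a minimal sketch in Python. It assumes facts are stored as (subject, relation, object) triples; the class name `TinyGraph`, the relation names, and the sample facts are hypothetical choices for this example and say nothing about how Google actually stores its data.

```python
from collections import defaultdict


class TinyGraph:
    """Toy store of (subject, relation, object) facts. Illustrative only."""

    def __init__(self):
        # Maps (subject, relation) to the set of related objects.
        self._facts = defaultdict(set)

    def add(self, subject, relation, obj):
        """Record that `subject` is linked to `obj` by `relation`."""
        self._facts[(subject, relation)].add(obj)

    def objects(self, subject, relation):
        """Return the sorted objects linked to `subject` by `relation`."""
        return sorted(self._facts.get((subject, relation), set()))


graph = TinyGraph()
# The three kinds of relationship mentioned in the interview:
graph.add("Barack Obama", "spouse", "Michelle Obama")      # two people are married
graph.add("Tower Bridge", "located_in", "United Kingdom")  # a place is in a country
graph.add("Lincoln (film)", "about", "Abraham Lincoln")    # a film relates to a person

print(graph.objects("Barack Obama", "spouse"))
```

Keying the store on the (subject, relation) pair keeps lookups such as "who is this person married to" a single dictionary access, which is enough for a sketch of this size.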

What does it change about a Google web search?

We have gone from working only at the level of the text to talking about what something actually is. We can now add a layer of understanding to the crawling and indexing of documents. If a document is about famous tennis players, we know that it is about sports and tennis. Every piece of information that we crawl, index, or search is analyzed in the context of the Knowledge Graph. It is not the same as understanding the text the way you or I would, but it is a step in that direction.

Now we can do question answering, for example a search for “how old is Barack Obama?”. We also do things related to exploration. We have a feature called the carousel for exploring categories of entities, so if you type “London bridges” it shows you several bridges.

Of course, being able to understand what people are looking for helps target search ads. But does the Knowledge Graph have applications beyond search?

Inside Google, the Knowledge Graph is an increasingly large, broad, and deep piece of infrastructure. It is a company-wide effort. Almost all of the structured data from all of our products, such as maps, music, movies, and finance, is in the Knowledge Graph, so you can reasonably say that everything we know is held in one canonical structure. It allows product managers in every corner of the company to be more ambitious.

Generally, we are trying to go beyond search and actually know things. We believe that is essential, because we want to understand what you are trying to do and be able to help. Google Now is an example of a product that tries to work out the situation you are in and make suggestions. To do that well, it has to understand that people travel and that flights can be delayed.

One of the main areas of work is trying to understand, at a slightly higher level, what texts are about. The words you see in a text are fundamentally ambiguous, but if you have the Knowledge Graph and understand how the words relate to each other, you can resolve those ambiguities. If you see a document that talks about George Bush, Saddam Hussein, and Norman Schwarzkopf, you might be able to guess which Bush it is, because only one of them had Norman Schwarzkopf reporting to him. It is a small step toward understanding what the document really means.
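The disambiguation idea in this answer can be sketched as a simple overlap count: pick the candidate entity that is linked to the most other entities mentioned in the same document. The `RELATED` table and the function name `disambiguate` are assumptions made up for this editorial example; the interview does not describe the actual method.

```python
# Hypothetical relationship data: the entities each candidate is linked to.
RELATED = {
    "George H. W. Bush": {"Norman Schwarzkopf", "Saddam Hussein", "Dan Quayle"},
    "George W. Bush": {"Saddam Hussein", "Dick Cheney"},
}


def disambiguate(candidates, mentioned):
    """Return the candidate linked to the most co-mentioned entities.

    On a tie, max() keeps the first candidate in the list.
    """
    return max(candidates, key=lambda c: len(RELATED[c] & mentioned))


# Other entities found in the same document as the ambiguous name "Bush".
mentioned = {"Saddam Hussein", "Norman Schwarzkopf"}
best = disambiguate(["George H. W. Bush", "George W. Bush"], mentioned)
print(best)
```

Both candidates are linked to Saddam Hussein, so that mention alone cannot separate them; the link to Norman Schwarzkopf is what tips the count.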

Is the Knowledge Graph already finished?

It grows every second. If a local business updates its listing data with Google, for example, that eventually goes into the Knowledge Graph, and there are algorithms that look for changes on many public websites, such as Wikipedia. Basically, we take in this raw data and filter it to determine our level of confidence and see whether we need to modify the graph. If a famous person dies, for example, we notice, and the Knowledge Graph is updated.
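A rough sketch of the filtering step described in this answer: candidate facts arrive with a confidence score, and only those above a threshold change the graph. The threshold value, the score scale, the function name `apply_updates`, and the sample data are all hypothetical; the interview does not say how confidence is computed or what cutoff is used.

```python
def apply_updates(graph, candidates, threshold=0.9):
    """Write candidate facts into `graph` when confidence >= threshold.

    `graph` maps (subject, relation) to a value; each candidate is a
    (subject, relation, value, confidence) tuple. Returns the accepted keys.
    """
    accepted = []
    for subject, relation, value, confidence in candidates:
        if confidence >= threshold:
            graph[(subject, relation)] = value
            accepted.append((subject, relation))
    return accepted


# Hypothetical existing fact and two incoming raw observations.
graph = {("Example Cafe", "opening_hours"): "9-17"}
candidates = [
    ("Example Cafe", "opening_hours", "8-18", 0.95),  # confident: applied
    ("Example Cafe", "phone", "555-0100", 0.40),      # uncertain: skipped
]
accepted = apply_updates(graph, candidates)
print(accepted)
```

Low-confidence observations are simply dropped here; a fuller design might hold them until more sources agree, but that goes beyond what the interview describes.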

In the past, people proposed building this kind of common-sense representation using artificial intelligence. I think what distinguishes the Knowledge Graph is that it is a very large, practical implementation of that idea. The scale and accuracy of the Knowledge Graph are probably unique in history.

What about subjective information, for example whether a restaurant is romantic?

That is an area of ongoing work, although the Knowledge Graph does contain some subjective data. Sometimes you can see certain phrases, such as that this restaurant is known for X or Z. Genres in general are difficult, and some are harder still because people do not agree on them. But most databases make an attempt to list the genre, and that is something we can use.

Why does the Knowledge Graph seem different from the vision of the semantic web developed by Tim Berners-Lee and others?

The original idea of the semantic web was that people would publish their data in standard formats, and then a search engine like Google could aggregate it and offer all kinds of wonderful services. That powerful idea of teaching computers about the world's knowledge was not moving fast enough, and we wanted to speed it up by gathering a critical mass of items. We recognize that we do not have all the world's data, but we believe this model is useful. We still run a public website called Freebase, where people can contribute data to an open database, and Google provides public APIs to access it. Usage of Freebase and contributions to it keep growing.

