Intelligence comes in many flavors, from the kinesthetic intelligence of the sports all-star to the logical processing of the math genius to the interpersonal skills of the charismatic salesperson. No matter the domain, intelligence is often measured by what people know or what people can do. Similarly, for artificial intelligence to become a reality, computers need to learn information across a variety of domains and demonstrate an understanding of common knowledge.
Why computers are dumb…
A point which is often overlooked by the public, but jumps out to people working in the field, is that AI can only do very narrow, specific tasks. (This is one reason why apocalyptic movies like I, Robot or The Terminator are still a long way off.) Lots of time and effort is required to frame a problem in a way computers can solve. The famous AlphaGo system took several years to build and could compete with the world's best Go players. However, this same system doesn't know the difference between Go and chess. Since software is written to complete only very specific tasks, it often appears idiotic: like when Siri doesn't understand the difference between a highway and a "high" way. Or when Google pulls up all the wrong links. Or when Alexa misunderstands what you're saying.
This is because software doesn't have any notion of what we call common sense: the things that 99% of people understand by the time they are seven years old. Things like: a cup will fall off the table if you push it over the edge, a furrowed brow and frowny face mean someone is unhappy, and a person rides in a car, not the other way around. People are hardwired to learn patterns, something our unconscious minds started doing the moment we were born.
In contrast, computers learn from data. Google's vision app learned to identify objects from tens of millions of labeled images. Natural Language Processing (NLP) models learn to read by scanning billions of documents to identify patterns in how words are used together.
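As a toy illustration of what "patterns in how words are used together" means, here is a minimal sketch that simply counts which word pairs appear in the same sentence. Real NLP models learn far richer statistics from billions of documents (and today mostly use neural networks), but co-occurrence counting captures the basic idea; the tiny corpus below is made up.

```python
from collections import Counter
from itertools import combinations

# Toy example: count which word pairs co-occur within the same sentence.
# The three-sentence "corpus" below is made up purely for illustration.
corpus = [
    "the cup fell off the table",
    "she put the cup on the table",
    "he pushed the cup over the edge",
]

pair_counts = Counter()
for sentence in corpus:
    # Drop a few common stop words so the counts focus on content words.
    words = set(sentence.split()) - {"the", "on", "off", "over"}
    pair_counts.update(combinations(sorted(words), 2))

print(pair_counts.most_common(3))  # ('cup', 'table') co-occurs most often
```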
Therefore, in order for software to gain a commonsense understanding of the world, we need to create datasets that capture all of these relationships between objects and how they can interact with each other. Something that demonstrates that people read books, boxes hold stuff, and CEOs lead organizations.
One way to store these types of insights is in what's referred to as a “Knowledge Graph.”
Knowledge Graph: a computer's hippocampus
Imagine writing down the name of everything around you on a piece of paper and then drawing a circle around each item. Now draw a line between items that are related and label the relationship between the two. For example, a person sits on a chair, a light brightens a room, or a person wears shoes. Ambiverse has a collection of over 10.5 million named entities (like people, places, and things). Here's an example from their website:
The knowledge graph captures the simple relationships between objects and, in doing so, forms a web of information. In 2012, Google released the first version of its knowledge graph, which allows people to obtain instant answers to their questions rather than hunting through multiple pages to find them. Concerns have been raised about this implementation because it doesn't provide references, which can easily mislead people if the underlying source is not valid. These concerns need to be addressed, but the knowledge graph is still another small step toward the goal of Artificial General Intelligence (AGI).
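Under the hood, the relationships described above can be stored as simple subject-relation-object facts, often called triples. Here is a minimal sketch of that idea in Python; the entities and relations are just the illustrative examples from this article, not the contents of any real graph.

```python
# A tiny knowledge graph stored as (subject, relation, object) triples.
# The facts below are illustrative examples, not data from a real graph.
triples = [
    ("person", "sits on", "chair"),
    ("light", "brightens", "room"),
    ("person", "wears", "shoes"),
    ("person", "reads", "book"),
    ("box", "holds", "stuff"),
    ("CEO", "leads", "organization"),
]

def facts_about(entity):
    """Return every stored fact that mentions the given entity."""
    return [t for t in triples if entity in (t[0], t[2])]

def related(subject, relation):
    """Follow one labeled edge: what does `subject` <relation>?"""
    return [obj for s, r, obj in triples if s == subject and r == relation]

print(facts_about("person"))    # every fact mentioning "person"
print(related("CEO", "leads"))  # ['organization']
```

Production systems like Google's Knowledge Graph or Amazon's Product Graph apply the same basic structure at the scale of billions of facts, typically stored in dedicated graph databases rather than a Python list.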
Amazon is also working on a similar system, dubbed a Product Graph, which captures the relationships among all the products within its vast catalog. According to Luna Dong, who helped pioneer the Knowledge-Based Trust system for Google in 2015, a product graph has some overlap with a knowledge graph but also contains a lot of new information that is unique to products. The knowledge graph allows Google to help you find what you are looking for on the web, and the product graph will help Amazon users find what they are shopping for online.
Why is a Knowledge Graph useful?
Beyond providing results for searches, Knowledge and Product graphs provide useful insights into the world around us. They allow for quick retrieval of information, just like the bios that pop up when you search for a person, place, or event on Google. Here's a quick video from Google highlighting the concept:
It also helps to improve search results by giving the computer greater insight into the context and meaning of each word. A knowledge graph helps software understand the meaning of words within sentences and do a better job of converting speech to text or autocorrecting.
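As a toy illustration of how graph context can help with word meaning, consider an ambiguous word like "bat": pick the sense whose knowledge-graph neighbors overlap most with the other words in the sentence. The senses and neighbor sets below are made up for illustration; real systems use much richer signals than simple word overlap.

```python
# Toy word-sense disambiguation: score each candidate sense of an ambiguous
# word by how many of its (hypothetical) knowledge-graph neighbors appear
# in the sentence, and pick the highest-scoring sense.
sense_neighbors = {
    "bat (animal)":   {"cave", "wings", "nocturnal", "fly"},
    "bat (baseball)": {"ball", "pitcher", "swing", "hit"},
}

def disambiguate(sentence):
    """Return the sense whose neighbors overlap most with the sentence's words."""
    words = set(sentence.lower().split())
    return max(sense_neighbors, key=lambda sense: len(sense_neighbors[sense] & words))

print(disambiguate("the bat flew out of the cave at night"))   # bat (animal)
print(disambiguate("he swung the bat and hit the ball hard"))  # bat (baseball)
```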
In the future, a knowledge graph could help software interpret the world better and begin to obtain an understanding of people, places, and things. As these graphs grow in complexity, a similar system could be the foundation for an artificial “semantic memory” which describes the relationships between ideas. Rather than simply relating individual nouns, the system could learn to capture the relationships between more abstract ideas or concepts. For example, what is the relationship between BYU and the University of Utah, or laziness and hard work, or the rich and the poor? People naturally develop an intuition about these concepts based on real-life experience, but computers need data to learn these relationships and gain an understanding of the world.
Unfortunately, our experiences can sometimes lead to biases, resulting in unjustified discrimination, which needs to be addressed head-on. Similarly, computers will simply learn from the data we provide them, and therefore if the data sources used to talk about ideas and concepts (e.g., Wikipedia, Twitter, blogs, etc.) are implicitly discriminatory, the computer will learn that same discrimination. To combat this, society and AI developers will need to be intentional and cognizant of the implicit biases contained within the datasets used (for example, Wikipedia, news sources, Twitter, etc.). We won't be able to identify all of the biases upfront, so we need to start asking what can be done about them after the fact, when they are discovered. How do we monitor for discrimination within an AI model?
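There is no single answer to that question, but one simple starting point is to audit the graph itself: count how often different groups of entities appear in a given relation and flag large imbalances for human review. The triples, groups, and relation below are fabricated purely for illustration; this is a sketch of the idea, not a complete fairness audit.

```python
from collections import Counter

# Hypothetical audit: for a chosen relation and object, count how many
# subjects from each group are linked to it, then review the imbalance.
# All of the data below is made up for illustration.
triples = [
    ("Alice", "works as", "engineer"),
    ("Bob",   "works as", "engineer"),
    ("Carol", "works as", "nurse"),
    ("Dave",  "works as", "engineer"),
]
group_of = {"Alice": "group A", "Carol": "group A", "Bob": "group B", "Dave": "group B"}

def audit(relation, obj):
    """Count, per group, how many subjects are linked to `obj` via `relation`."""
    return Counter(group_of[s] for s, r, o in triples if r == relation and o == obj)

print(audit("works as", "engineer"))  # Counter({'group B': 2, 'group A': 1})
```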
Great progress has been made recently in automatically creating these knowledge and product graphs. As they become more widespread, let's be mindful of the data being used to develop more complex concept or idea graphs in the future and consider how they can be checked for implicit biases or unwanted associations. Nevertheless, the knowledge graph is yet another way that software continues to grow smarter, allowing it to complete more complex tasks.
What tasks do you wish software could take off your hands? How do you think we could decide what biases to eliminate from a knowledge graph?