Knowledge Graph Optimization

March 10 2014 // KGO + SEO // 40 Comments

A few months ago I offhandedly made a reference to KGO which stands for Knowledge Graph Optimization.

Now, I know what you’re thinking. We need another acronym like we need another hole in the head! But over the past year I’ve come to feel that there is a set of tactics that can help you optimize your site’s connection to the Knowledge Graph. And that can yield material gains in search visibility.

The Knowledge Graph

Here’s a brief explanation from Google for those not familiar with the Knowledge Graph.

The Knowledge Graph enables you to search for things, people or places that Google knows about—landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more—and instantly get information that’s relevant to your query. This is a critical first step towards building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do.

It’s about searching for things instead of strings. Or without the rhyming, it’s about entities instead of text.

Take the query ‘Golden State Warriors’. From a string standpoint you’d be looking at the individual keywords, which might be confusing. Now, Google got extremely good at understanding which terms were most frequently used together, using bigrams and other methods, so that this query would yield a result about the NBA basketball team.

But with the Knowledge Graph Google can instead identify ‘Golden State Warriors’ as an entity (a thing) that has a specific entry in the Knowledge Graph and return a much richer result.

Knowledge Graph Result for Golden State Warriors

Pretty amazing stuff really. (Go Warriors!) Hummingbird was largely an infrastructure update that allowed Google to take advantage of burgeoning entity data. So we’re just getting started with the application of entities to search.

Entity Challenge

Challenge Accepted

You need only look to the Entity Recognition and Disambiguation Challenge co-sponsored by Microsoft and Google to see the writing on the wall.

The objective of an Entity Recognition and Disambiguation (ERD) system is to recognize mentions of entities in a given text, disambiguate them, and map them to the entities in a given entity collection or knowledge base.

Can it be any more clear? Well, actually, it can.

The Challenge is composed of two parallel tracks. In the “long text” track, the challenge targets are pages crawled from the Web; these contain documents that are meant to be easily understood by humans. The “short text” track, on the other hand, consists of web search queries that are intended for a machine. As a result, the text is typically short and often lacks proper punctuation and capitalization.

Search engines are chomping at the bit to get better at extracting entities from documents and queries so they can return more relevant and valuable search results.
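
To make the ERD idea concrete, here is a toy sketch of recognizing and linking entity mentions in a short, lowercase query. It is an illustration only, not how Google or the challenge entrants do it: the knowledge base, the entity IDs and the `link_entities` function are all made up for this example, and the only "disambiguation" rule is to prefer the longest matching name.

```python
import re

# Hypothetical mini knowledge base: surface form -> entity ID (all IDs are made up).
KNOWLEDGE_BASE = {
    "golden state warriors": "entity/golden_state_warriors",
    "golden state": "entity/california",
    "warriors": "entity/the_warriors_film",
}


def link_entities(text):
    """Return (mention, entity_id) pairs, preferring the longest matching name."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    max_len = max(len(name.split()) for name in KNOWLEDGE_BASE)
    results = []
    i = 0
    while i < len(tokens):
        # Try the longest span first so the full team name beats its parts.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in KNOWLEDGE_BASE:
                results.append((candidate, KNOWLEDGE_BASE[candidate]))
                i += n
                break
        else:
            # No entity starts at this token, so move on.
            i += 1
    return results


print(link_entities("golden state warriors schedule"))
```

Treated as strings, ‘golden state’ and ‘warriors’ could each point somewhere else entirely. Treated as one entity, the query maps to a single entry, which is the "things instead of strings" shift in miniature.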

So …

Wile E. Coyote Bat Suit

But what exactly are we supposed to do? There has been little in the way of real rubber-meets-the-road content that describes how you might go about optimizing for this new world full of entities. One of the exceptions would be Aaron Bradley’s Semantic SEO post, though it mixes both theory and tactics.

Now, I love theory. That’s pretty clear from my writing. But today I want to talk more about tactics, about the actual stuff we can do as marketers to effect change in the Knowledge Graph.

Nouns

Noun

The first thing we can do is make sure we’re using the entity names in our writing. That ERD challenge above? Well, the systems they’re designing are looking to extract entities from text.

So if you’re not using the entity names – the nouns – in your writing then you’re going to make it vastly more difficult for search engines to identify and match entities. This does not mean you should engage in entity stuffing and mention every associated entity you can think of in your content.

Write clearly so that both humans and search engines know what the hell you’re talking about.

Connect

Connect All The Things!

Stop hoarding authority and ‘link juice’ by not linking out to other sites. The connections between sites and pages are important and not just in a traditional PageRank formula.

I think of it this way. The entities that are contained on one page are transmitted to linked pages and vice versa.

Entities are meta information passed in links.

Structured Data

You can make the identification of entities easier for search engines by using schema.org markup along with some other forms of structured data. Not only will this ensure that the number of entities that are transmitted via links increases, it can often make connections to the Knowledge Graph with a very limited amount of data.
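
As a minimal sketch of what that markup can look like, the snippet below builds a schema.org `LocalBusiness` description and wraps it in a script tag. JSON-LD is just one of the syntaxes schema.org supports (microdata and RDFa are others), and choosing it here is my assumption, not something the examples in this post use. The function name and the business details are hypothetical.

```python
import json


def local_business_jsonld(name, url, telephone, street, city, region, postal_code):
    """Build a schema.org LocalBusiness description as a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical business details, for illustration only.
print(local_business_jsonld(
    "Example Mortgage Co.", "https://example.com", "+1-619-555-0100",
    "123 Main St", "San Diego", "CA", "92101",
))
```

The point is how little data it takes: a name, a URL, a phone number and an address are enough to describe an entity in a form a machine can match.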

Google Maps Entity Hack

So, here’s the actual bit of discovery that I’ve been holding onto for six months and that is the real impetus for this entire post. If you go to Google Maps and search for a branded term coupled with a geographic location you often get some very interesting results. Take ‘zillow san diego, ca’ for instance.

Google Maps Result for Zillow San Diego CA

Look at all those results and red dots! I didn’t ask for realtors, mortgage brokers or appraisers in my query. I simply used the term Zillow in combination with a geography and got these very related and relevant results. They’re not simply looking for a Zillow office located in San Diego.

So, let’s look at the details here to see what’s going on. I’ll take one of the red dots and investigate further.

Mesa Pacific Mortgage Google Maps Result

So why is this on the map results? First I go to the linked website.

Mesa Pacific Mortgage Website

So, there are no links to Zillow anywhere on the site and the address and phone number here don’t match the one on Google Maps. But they are the ones listed on his Zillow Profile.

Zillow Structured Data Example

Now the link to the website closes the connection here so it’s not completely linkless, but I still find it pretty amazing. And this is without Zillow fully optimizing the markup. They declare the page as an organization.

Zillow Organization Schema

But they don’t detail out the professional information with schema markup.

Zillow Definition List Markup

Instead they’re using some old(er) school definition list markup (term and description). Combined with the organization scope it looks like Google can put two and two together.

Google+

In doing due diligence I found Mesa Pacific Mortgage also has a Google+ page which reinforces the right address and phone number. So the connection isn’t as startling as it might seem but it’s still intriguing.

And I have no idea in what order these things came into existence. It’s pretty clear the Zillow listing probably came first based on the 2006 Member Since date on his profile. Whether the Google+ Local page and associated map listing came directly as a result is unknown.

In fact, as you do more and more investigation as to what shows up on the map and what doesn’t, it seems like a Google+ Local page is required. However, a fair amount of them have been created by Google. Obviously Google uses a multitude of sources to create these listings. If you can be one of those sources, all the better. But even if you’re not, connecting to these entities delivers value to all involved.
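
The address and phone comparison I did by hand above can be roughed out in code. This is a sketch under my own assumptions: the normalization rules are deliberately crude (US-style phone numbers, a handful of address abbreviations), the function names are made up, and the sample listings are invented rather than taken from the real profiles.

```python
import re


def normalize_phone(raw):
    """Keep digits only and drop a leading US country code."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def normalize_address(raw):
    """Lowercase, strip punctuation and collapse a few common abbreviations."""
    abbreviations = {"street": "st", "avenue": "ave", "suite": "ste",
                     "boulevard": "blvd", "road": "rd", "drive": "dr"}
    words = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    return " ".join(abbreviations.get(word, word) for word in words)


def compare_listings(listings):
    """Group sources by normalized phone and address.

    listings maps a source name to {"phone": ..., "address": ...}.
    """
    report = {"phone": {}, "address": {}}
    for source, fields in listings.items():
        phone = normalize_phone(fields["phone"])
        address = normalize_address(fields["address"])
        report["phone"].setdefault(phone, []).append(source)
        report["address"].setdefault(address, []).append(source)
    return report


# Invented sample data, not the real listings discussed in this post.
sample = {
    "website": {"phone": "(619) 555-0100", "address": "500 Harbor Drive"},
    "zillow": {"phone": "619-555-0142", "address": "123 Main Street, Suite 4"},
    "google_maps": {"phone": "+1 619 555 0142", "address": "123 Main St. Ste 4"},
}
print(compare_listings(sample))
```

Sources that land in the same group agree with each other. In this invented sample the Zillow and Google Maps entries match while the website is the odd one out, which mirrors the pattern in the first example.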

Let’s look at another Google Maps result.

Google Maps Result for Pacific Sotheby's

If you follow that reviews link you wind up on their Google+ page.

Pacific Sotheby's Google+ Page

Odd that Google isn’t sucking in the reviews from Zillow, which would show a greater connection. Google+ Local Pages provide a vast database of entities for Google. And they rely on the data in Google+ more than that from other sources.

Zillow Profile for Keke Jones

Here the phone number on Zillow doesn’t match the one on Google+ or Google Maps. A quick aside that you’re also seeing the potential to create a relationship between Keke Jones (person) and Pacific Sotheby’s Int’l Realty (place). But I digress.

Outside of the website connection and address match in that Professional Information section, the other reason this result shows up for this search is because they use Zillow products on their website.

Pacific Sotheby's Links to Zillow

The rest of you can run away from these types of implementations based on poor analysis of a Matt Cutts video if you like, but that would be a mistake in my view.

Okay, one last example. Let’s zoom in and find another result.

Google Maps Result for Roger Ma

The hours data indicates that Roger probably has a Google+ Page. Yup.

Google+ Page for Roger Ma

Now we can see that they’re pulling in reviews from Zillow and Roger does have a profile on Zillow. So why he shows up for a Zillow+Geography search is pretty straightforward.

Interestingly, searching for ‘homethinking san diego, ca’ on Google Maps does not return Roger Ma. Perhaps because they don’t include an address line 1 or because they only use hreview-aggregate and don’t declare a schema.org scope (thank you handy structured data testing tool bookmarklet).

Tough to say but you can see how important it might be to ensure you did what was necessary to confirm these connections.
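
If you want a quick way to check what a page actually declares, here is a rough sketch that scans HTML for schema.org microdata scopes and a few microformat class names. It is a crude stand-in for a proper structured data testing tool, not a replacement for one. The class and function names are mine, the list of microformat classes is a small assumed sample, and the two test pages are invented.

```python
from html.parser import HTMLParser


class StructuredDataScan(HTMLParser):
    """Collect schema.org itemtype declarations and some microformat classes."""

    MICROFORMATS = ("hreview", "hreview-aggregate", "vcard")

    def __init__(self):
        super().__init__()
        self.schema_types = []
        self.microformats = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        itemtype = attributes.get("itemtype") or ""
        # A microdata scope needs itemscope plus an itemtype naming the type.
        if "itemscope" in attributes and "schema.org" in itemtype:
            self.schema_types.append(itemtype)
        for css_class in (attributes.get("class") or "").split():
            if css_class in self.MICROFORMATS:
                self.microformats.append(css_class)


def scan(html):
    parser = StructuredDataScan()
    parser.feed(html)
    parser.close()
    return {"schema_types": parser.schema_types,
            "microformats": parser.microformats}


# Invented markup for illustration.
with_scope = ('<div itemscope itemtype="http://schema.org/Organization">'
              '<span itemprop="name">Example Realty</span></div>')
reviews_only = ('<div class="hreview-aggregate">'
                '<span class="rating">4.5</span></div>')
print(scan(with_scope))
print(scan(reviews_only))
```

A page like the second one carries review markup but never says what kind of thing it describes, which is one of the possible gaps suggested above.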

People Talk About

People Talk About Amber Bistro

Now let’s home in (pun intended) on the ‘People talk about’ feature. These terms are generated through some process/algorithm that analyzes the review text and pulls out the relevant (depending on who you ask) key phrases.

Now, I’m not going to go too far down this rabbit hole, though I think it’s possible Google might be using both review text and query syntax to create these phrases. Bill Slawski did a nice job teasing out how Google finds ‘known for’ terms for entities.

What’s important in my view is that these key phrases become more meta information that gets passed back and forth through entity connections.

Google is assigning this entity (Roger Ma) a certain cluster of key phrases including ‘sell a home’ and ‘great realtor’. Zillow is connected to this entity, as we’ve demonstrated, which means that those key phrases are, on some level, applied to Zillow’s page and site.

Now imagine the aggregated key phrases from connected entities that are flowing into Zillow. Do you think that might give Google a better idea of exactly when and for what queries they should return Zillow content?

And Google might very well know the terms people used to get to Roger Ma’s page on Zillow and use that to inform all of the other connected entities. That’s speculation but it’s made with over six months of experimentation and observation.

I can’t share many of the details because I’m under various NDAs, but once you make these connections using structured data there seems to be an increased ability to rank for relevant terms.

SameAs

Okay, we veered off a bit into theory so let’s get back to tactics. If you have a page that is about a known entity you may want to use the sameAs schema.org property.

sameAs Schema Property

If I had to describe it plainly, I’d say sameAs acts as an entity canonical. Sure, it’s a bit more complicated than that and has a lot to do with confirming identity but in my experience using sameAs properly can be a valuable (and more direct) way of telling search engines what entity that page contains or represents.

sameAs Schema Example

Here you see that a page about Leonardo DiCaprio has a sameAs property pointing to his Wikipedia entry. Now, obviously you could try to spam this property but there would be a number of ways to catch this type of behavior. Sadly, I know that won’t stop some of you.
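
For those who want something to copy from, here is a minimal sketch of a sameAs declaration for that kind of page. It is written as JSON-LD, which is my choice for brevity and may not match the syntax in the example pictured. The function name and the example.com page URL are hypothetical; the Wikipedia URL is the real entry.

```python
import json


def person_with_same_as(name, page_url, same_as_urls):
    """Describe a Person and point sameAs at authoritative pages for that entity."""
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": page_url,
        "sameAs": list(same_as_urls),
    }
    return json.dumps(data, indent=2)


markup = person_with_same_as(
    "Leonardo DiCaprio",
    "https://example.com/people/leonardo-dicaprio",  # hypothetical page
    ["https://en.wikipedia.org/wiki/Leonardo_DiCaprio"],
)
print(markup)
```

Keep the sameAs list short and unambiguous. It only works as an entity canonical if every URL in it really is about the same thing as your page.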

Wikipedia

Cat Editing Wikipedia

Like it or not Wikipedia is still a primary source of data for the Knowledge Graph. If you’ve got a lot of time and patience and can be objective rather than subjective, you can wade into Wikipedia to help create company profiles, provide reference links (more important than you may imagine) and generally ensure that your brand is represented in as many legitimate places as possible.

Your goal here isn’t to spam Wikipedia but to simply crack the Kafka-like nature of Wikipedia moderation and provide a real representation of your site or brand that adds value to the entire corpus and platform.

Freebase

Freebase on the other hand has a different type of challenge. Instead of obstinate editors and human drama, Freebase is just … a byzantine structure of updates. The good news? It’s a direct line to the Knowledge Graph.

For instance if you search for Twitter this is the Knowledge Card you get as a result.

Knowledge Graph Result for Twitter

There’s no Google+ part of the Knowledge Card because there is no reference to a Google+ Page under Social Media Presence.

Twitter Freebase Profile

Turns out they don’t have a Google+ Page. Seriously? Man, get with it Twitter. Compare this to StumbleUpon.

Knowledge Card for StumbleUpon

They’ve got the business specific information as well as the Google+ integration with the Recent posts unit. Why? They’ve got a Google+ entry in their Social Media Presence on Freebase.

StumbleUpon Freebase Profile

How about Foursquare?

Knowledge Card for Foursquare

Oy! Not so good. They’ve got their Google+ account in Freebase.

Foursquare Freebase Entry

However, the business section on their ‘Inc.’ entry in Freebase (different from the standard entry) is empty.

Foursquare Business on Freebase

Now, the interplay between a standard entry and a business entry on Freebase can be strange and some entities don’t even need this dual set-up, which makes understanding how to enter it all really complex. So, it’s not just you who thinks updating Freebase is hard. But … it’s totally worth it.

Because Freebase really is where the Knowledge Graph flows from, as I’ve just shown. For just one more example, look at the Knowledge Card for Garret Dillahunt and then look at the data in his Freebase entry. Match the elements that show up in the Knowledge Card. Convinced?

You might ask why Google links to Wikipedia in the Knowledge Cards and not Freebase? Have you looked at Freebase!? It’s not a destination site anyone on the Google search team would wish on a user. That and Wikipedia has a solid brand that likely resonates with a majority of users.

KGO

Knowledge Graph Optimization is just getting started but here are the real things (pun intended) you can do to start meeting this new world head on.

Use Entities (aka Nouns) In Your Writing

Make it easy for users and search engines to know what you’re talking about by using the actual names of the entities in your writing.

Get Connected and Link Out To Relevant Sites

Stop hoarding link juice and link out to relevant sites so that the entity information can begin to flow between sites.

Use Structured Data To Increase Entity Detection

Make it easier for search engines to detect, extract and connect entities to the Knowledge Graph by using various forms of structured data.

Go A Step Further and Use the sameAs Property

When appropriate use the sameAs property to reference the exact Freebase or Wikipedia entry for that entity. Think of it as an entity canonical.

Claim and Optimize Your Google+ Presence

There’s no doubt that Google+ sits in the middle of a lot of the Knowledge Graph, particularly about places. So claim and optimize your presence, which also extends to getting reviews.

Get Exposure on Wikipedia

Put on some music and slug it out with Wikipedians who seem straight from Monty Python’s Argument sketch, then edit your profile and add some appropriate references.

Edit and Update Your Freebase Entry

Update your Freebase entry and make it as complete as possible. I hope to have a more instructive post on editing Freebase some time in the near future.

Knowledge Graph Optimization (KGO) is about making it easy to connect to as many relevant entities as possible so that search engines better understand your site on a ‘thing’ level and can pass important meta information between connected entities.