You Won’t Remember That Infographic

June 25 2014 // Marketing + SEO // 40 Comments

Infographics are (still) popular. Clients ask me about them all the time. I ask them to tell me about the last three infographics they remembered.

The response is generally full of stammering as they grope for an answer. Rarely do I get specifics. Even when I do they say things like 'that infographic about craft beer'. When I ask where the infographic came from? Crickets.

Can you name the brands associated with infographics? The brands that come up most often are Mint and OK Cupid. Everyone else is an also-ran. And that's the thing. For all of their popularity, you won't remember that infographic.

Or, at least, you won't remember it the right way.

Triangle Of Memory

To understand why infographics are so problematic we need to look at how we remember content.

The triangle of memory is a variant of the project management triangle that includes better, faster and cheaper attributes, of which you can only have two at any given time. You can have a project fast and cheap but it won't be better. You can have a project fast and better but it will cost you an arm and a leg.

In terms of memory, we don't have a massive tag based annotation system in our brains. (That's what Delicious is for.) Instead, we remember content at a very basic level: site, author and topic. This is why I tell clients to make their content cocktail party ready.

Because you remember 'that post on Moz about Hummingbird' or 'Danny Sullivan's analysis of New York Times subscription costs'.

It's site and topic, but not the author. It's author and topic but not site. Rarely it is author and site but not topic. Examples of this might be 'the latest column by Krugman in the New York Times' or 'last week's episode of John Oliver on HBO'.

I'm not saying you never get all three. You hit the three cherries jackpot once in a while. But it's rare. Counting on it is like counting on winning at the casino.

The Infographics Monster

Infographics Monster

The problem with infographics is that they destroy the triangle of memory. They gobble up one of those three memory attributes leaving you with only one left to use. It's always 'that infographic'. And like it or not the attribute most people select is the topic, resulting in the phrase 'that infographic about ...'.

That means your site or brand disappears! And no. No one remembers (and may not even see) the logo you've slapped on there.

'That infographic about AdWords conversion rates' is done by who exactly? Where do I find it again? Ah, never mind. Or worse yet they search for it and they find something or someone else instead.

If users don't remember that it's your brand or site, have you really succeeded?

Wasted Attention

Chocolate Covered Donut

Not only are infographics often costly (both in time and money) but you've wasted that sliver of attention you've worked so hard to earn.

Here you've got the eyeballs of a user and they leave without remembering who you are or where they saw it. Heck they might even attribute it to the platform where they discovered it such as Facebook, Pinterest or Google+.

Winning the attention auction isn't easy and when you do win it you better ensure you're using that attention wisely. I'd argue an infographic is wasted attention. It's attention without any lasting value. It's empty (branding) calories.

When Infographics Work

LOL Cat vs OMG Cat

By and large I steer clients away from infographics and prefer to have them work on other content initiatives where they'll build brand equity. But that's not to say that infographics can't work. They can. But it takes a serious commitment and attention to execution.

Doing one or two infographics is like flushing a fistful of hundred dollar bills down the toilet. If you're going to do infographics, do infographics. Commit to producing one every month for 18 months.

Consistent, engaging infographics are what make your brand stick. It's why Mint and OK Cupid succeeded where so many others failed.

I'd also argue that infographics must make users LOL or OMG. If they don't provoke one of those two reactions then you're not going to gain traction or attention.

The other way to go is to leverage the infographic into other channels and make it repeatable. Search Engine Land's Periodic Table of SEO Success Factors (a bit of a mouthful) was printed and handed out at SMX Advanced and has been updated three (?) times now.

It's an iconic piece pushed through multiple marketing channels to reinforce the site and brand. That's how you do it.

Don't Talk To Me About Links

I know some of you are flexing your fingers, about to type out a comment about how your infographic obtained 12 links with an average DA of 49.

Velma Says You Stop That!

Links aren't the goal of your infographic campaign. Your customers don't care if you're on some cheesoid infographic aggregator site. Instead I want to know if that infographic won the brand more true fans. Did it increase the brand's visibility? Because those things will lead to long-term authority and, by the way, downstream links.

If you're in such desperate need of links there are far better and cheaper ways to earn them than the branding black hole known as the infographic.

Visibility

Zero Visibility

Another argument for infographics is that they provide you with more visibility. If I see an infographic and then I see a Slideshare deck and then I search and I find a blog post over the course of weeks or months, then perhaps the brand or site begins to sink in.

In principle, I agree. But that only works if I associate that infographic with the other pieces of content, and if those other pieces exist in the first place, all supporting the same site or brand.

In other words, you better have a comprehensive content strategy (including promotion) that doesn't rely on just one tactic or medium. I like Jason Miller's idea around Big Rock Content, though I think the missing ingredient is being memorable.

TL;DR

Infographics are a poor way to build your brand and earn true fans because they destroy the triangle of memory. A successful infographic campaign must be part of a larger content strategy, focusing on repeatable efforts that make people LOL or OMG and can be pushed through multiple marketing channels to reinforce the site or brand.

Social Signals and SEO

April 07 2014 // SEO + Social Media // 99 Comments

Do social signals (Tweets, Likes and Pluses) impact search rankings? The answer to this question is yes, but not in the traditional sense. That's why so much misinformation exists on the topic.

So before you run off and get all your friends to Tweet your post (or worse yet buy Likes etc.), read on to understand the math and real reason why social works.

Social Signals Are Not Part Of The Algorithm

Cat On A Leash

No matter how much we want it, or how many times we think it would make sense, it's just not happening.

Social is not currently part of Google's search algorithm.

At SMX West 2014 Amit Singhal stated that Google+ doesn't have an impact on the relevance of non-personalized search results. (I was there and heard those words come out of his mouth.)

That's the head of Google's search effort telling you that they're not even using their own social signals to improve search. So they sure as heck aren't using Twitter or Facebook, sources in which they have less visibility and trust.

Using social signals in the algorithm is wicked hard for a number of reasons. While I'm sure smart people at Google and Bing are working on ways to use them, they aren't currently being used. Period. End of story.

But ... Correlation!

Correlation Does Not Equal Causation

Of course you've seen all the correlation studies that seem to show that social improves rankings. Now, the thing is, social is correlated with improved rankings, just as ice cream consumption and amount of clothing worn are correlated.

The key is to find the confound or confounding variable, that thing that explains why those two things are correlated. In the case of ice cream and clothing the confound is (of course) temperature. This is what is generally missing in the conversation around social signals and SEO.

Finding The Confound

It's not the actual social activity that matters, but what happens as a result of that activity. 

One of the best things that can happen is if your content is seen by creators, the 1% of users who create all the content floating around the Internet.

Before we continue, you might want to acquaint yourself with the concept of participation inequality, something I talk about frequently, most recently as it relates to blog commenting. Because I'm going to mash-up social, participation inequality and the link graph to make my point.

Creators power the link graph and that's why social can be so important if you follow the math.

Social SEO Math

How Social Signals Impact SEO

Say I get 100 Tweets on a blog post. Those 100 Tweets are seen by 10,000 people. I'm using round numbers here to make the math easier. But the idea is to understand the reach of those social shares.

If we use the standard distribution of participation inequality we determine that 1% of those 10,000 people are creators who might decide to include your brand or site in a future piece of content.

So, if 10,000 people see your content and (on average) 1% of those are creators then you've reached the eyeballs of 100 creators (10,000 x 1%), the folks who power the link graph.
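Here's that arithmetic as a quick sketch. The function name and the per-Tweet reach of 100 are my own illustrative assumptions (chosen so that 100 Tweets add up to the 10,000 impressions above), not measured values.

```python
def creator_impressions(shares: int, reach_per_share: int, creator_mix: float) -> int:
    """Estimate how many creators could see a piece of shared content.

    shares          -- number of social shares (e.g. Tweets)
    reach_per_share -- assumed average number of people who see each share
    creator_mix     -- assumed fraction of that audience who are creators
    """
    impressions = shares * reach_per_share
    # People come in whole numbers, so round the estimate.
    return round(impressions * creator_mix)


# 100 Tweets seen by 10,000 people, with the 1% creator share
# taken from participation inequality.
print(creator_impressions(shares=100, reach_per_share=100, creator_mix=0.01))  # 100
```

Swap in your own reach and creator mix to see how quickly the number of creator eyeballs moves.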

Some of those creators will follow through and include you (links and mentions) in their content. It's something I've referred to as the 'Social Echo' in the past. But how do we measure and steer our efforts with this math in mind?

All Social Shares Are Not Equal

Does the share from your buddy with 10 followers (half of which are actually accounts for his pets) mean as much as a share from an industry leader with 20,000 followers? Of course not.

This is one of the reasons why buying Tweets or Likes just for the sake of pumping up that number is a waste of money. Shares that fail to find an audience with the appropriate creator mix will do nothing for SEO ... or your marketing efforts in general.

Even the size of the following might not help you. It all depends on the creator mix.

Creator Mix of Followers Matters

For instance, 50,000 followers with a creator mix of 0.1% (a tenth of a percent) would only give you the opportunity to get in front of 50 creators. On the other hand, 10,000 followers with a 3% creator mix would give you the opportunity to get in front of 300 creators. (Note to self. Someone should come up with a way to quantify the creator mix of someone's followers.)
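To make the comparison concrete, here's a minimal sketch that ranks audiences by creators reached rather than by raw follower count. The audience labels and the helper name are hypothetical; the follower counts and creator mixes are the two from the example above.

```python
def creators_in_audience(followers: int, creator_mix: float) -> int:
    """Followers multiplied by the fraction of them who are creators."""
    return round(followers * creator_mix)


# (followers, creator mix) for the two sharers in the example.
audiences = {
    "big following, thin creator mix": (50_000, 0.001),  # 0.1%
    "smaller following, rich creator mix": (10_000, 0.03),  # 3%
}

# Sort by creators reached, largest first.
ranked = sorted(
    audiences.items(),
    key=lambda item: creators_in_audience(*item[1]),
    reverse=True,
)

for label, (followers, mix) in ranked:
    print(f"{label}: {followers:,} followers -> {creators_in_audience(followers, mix)} creators")
```

The smaller account wins by a factor of six, and that's before any re-shares enter the picture.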

The caveat here is that some of those 50,000 followers might re-share that content and they might have a better creator mix and get you to more creator eyeballs. You can see how this can quickly get complicated.

Long story short, the number of creators following someone who shares your content is important.

Did They See It?

Polar Bear Covering Eyes

You'll notice that I say that you have the opportunity to reach a certain number of creators with those social shares. But there's no guarantee that those creators actually see that one specific share amid all the other content passing through their social feeds. And there's an argument here that creators might be more difficult to reach based on their time constraints.

So while I'm not in love with the idea of timing your social shares, it actually makes a bit of sense. Because you want to maximize the potential for creators to see your content. Be warned, this is highly dependent on your vertical and will change over time so don't get lazy and rely on cookie-cutter data.

You must win the attention auction. That means optimizing your social snippets, using paid organic amplification to get things off the ground and sharing your content more than once (second chance Tweets etc.) among other things. At the end of the day you want to do everything you can to ensure creators are seeing your stuff.

Optimize and maximize creator impressions.

Creator Conversion Rate

Red Neon Yes No Maybe So

The last variable in the equation might be the most important one of all - the percentage of creators who wind up linking to you as a result of a social impression.

So let's go back to my initial math: 100 shares produce 10,000 impressions of which 1% or 100 are creators. How many of them are going to do something with your content that will impact the link graph?

I don't have any hard data on this and, frankly, it is super dependent on the content. Really awesome content that's relevant, timely and memorable might have a high conversion rate. Content that makes creators roll their eyes and curse themselves for clicking through in the first place may not get a single link.

I tend to use a 1% conversion rate when discussing this with clients. So in my example, those initial 100 shares would net 1 link.
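Here's the whole funnel in one place, as a rough sketch. The function and field names are mine, and the alternative conversion rates in the loop are purely illustrative since, as I said, I don't have hard data on them.

```python
def social_seo_funnel(shares, reach_per_share, creator_mix, creator_conversion_rate):
    """Walk shares -> impressions -> creator impressions -> expected links."""
    impressions = shares * reach_per_share
    creator_impressions = impressions * creator_mix
    expected_links = creator_impressions * creator_conversion_rate
    return {
        "impressions": impressions,
        "creator_impressions": round(creator_impressions),
        "expected_links": round(expected_links, 2),
    }


# The running example: 100 shares, 10,000 impressions, 1% creators, 1% conversion.
baseline = social_seo_funnel(100, 100, 0.01, 0.01)
print(baseline)

# Illustrative only: how the result shifts with the creator conversion rate.
for rate in (0.005, 0.01, 0.05):
    result = social_seo_funnel(100, 100, 0.01, rate)
    print(f"{rate:.1%} conversion -> {result['expected_links']} expected links")
```

Every stage is a multiplier, so improving any one of them (shares, reach, creator mix or conversion) lifts the expected number of links.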

That's it folks. Links are the confound in the correlation between social shares and rankings.

Content that hits that sweet spot, getting a high number of shares that create downstream links from creators (particularly in a short period of time), produces wildly successful results. Those additional references by creators often create a tailwind of sharing on the original content, reinforcing the correlation we all recognize exists.

Fuzzy Math

Evil Distribution Plushies

Now, I've provided math on why I believe social is a valuable part of SEO. Downstream links matter. No doubt about it.

But it's more than just a mathematical equation of links. Social drives more people to your site who might convert and become a reader or customer. Those people might wind up sharing in the future and the traditional math above kicks in again.

You'll gain additional followers and true fans who help to distribute your future content. Guess what? You're just optimizing the top of the Social SEO funnel. More shares lead to more impressions lead to more creator impressions and more opportunities for gaining authoritative references (i.e. - links).

You also might get more direct traffic as a result, as the mere exposure effect takes hold and they begin to associate you with specific topics and visit your site as needed. Even this could probably be reduced to math if you really wanted to go down the rabbit hole.

Good things happen when your brand is seen by more people.

TL;DR

Social has an indirect but powerful impact on search rankings. It's not the actual social activity that matters, but what happens as a result of that activity. Optimizing and maximizing creator impressions increases the chance of obtaining links from the group of people who power the link graph.

The Ridiculous Power of Blog Commenting

March 25 2014 // Marketing + SEO // 131 Comments

Blog commenting is the not-so-secret weapon to building your brand and authority. I'm not talking about comment spam or finding do follow blogs and littering them with links. No, the blog commenting I'm talking about lets you cut through the clutter and tap into the attention of creators.

Participation Inequality

To understand why blog commenting is so powerful you first need to grasp the concept of participation inequality.

In most online communities, 90% of users are lurkers who never contribute, 9% of users contribute a little, and 1% of users account for almost all the action.

You might also hear this concept referred to as the 90:9:1 Principle or The 1% Rule. You could even stretch a bit and mention the Pareto Principle in this discussion. The idea here is that the vast majority of people lurk and never participate. They are consumers of content.

A small minority, the 9%, may comment, share or participate in other ways. But it's the 1% left that actually create the content that is consumed. When I explain it to people I refer to these groups as lurkers, reactors and creators respectively.
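As a quick illustration, here's the 90:9:1 split applied to an audience of a given size. The function name is hypothetical and the percentages are the rule-of-thumb figures quoted above, not measurements of any particular community.

```python
def participation_split(audience: int) -> dict[str, int]:
    """Split an audience into lurkers, reactors and creators using 90:9:1."""
    creators = audience // 100       # the 1% who create the content
    reactors = audience * 9 // 100   # the 9% who comment or share
    lurkers = audience - creators - reactors  # everyone else
    return {"lurkers": lurkers, "reactors": reactors, "creators": creators}


print(participation_split(10_000))
# {'lurkers': 9000, 'reactors': 900, 'creators': 100}
```

Integer division keeps the groups as whole people, with any remainder landing in the lurkers.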

Participation Inequality Pyramid

What's surprising (to me at least) is that many people still haven't caught on to this idea. They remain shocked and appalled that 90%+ of Yelp reviews come from 1% of users. They use low activity (defined as contributing) on services like Twitter and Google+ to argue that they're not viable.

Twopcharts Twitter Activity by Year

Now what type of person do you think was more likely to sign up and use Twitter back in 2008? Lurker, reactor or creator? Give yourself a gold star if you answered creator. And that's why the percentage still Tweeting from those years is higher. As the service became more mainstream, more lurkers joined the service.

And that's okay!

Trying to 'fix' participation inequality is a losing battle against human nature. Most people simply aren't going to create content for a wide variety of reasons. Sure, technology may slide the percentages a small amount but material changes to this dynamic won't happen.

Creators

Hand Painted Saucer

If creators are responsible for nearly all of the content we consume, that makes them ... pretty powerful. I dare say, you might call them influencers. Now, I don't particularly like that term but there's a certain amount of truth in it.

The sad thing is that most of the 'influencer outreach' content I've seen talks about how to identify (zzzzzz) and email these people or ways to interact with them on Twitter. I suppose that can work once in a while but the odds of securing their attention in these ways are low and the approach is inefficient.

People continue to do this type of outreach because of the huge upside in gaining the attention of a creator. Creators often have a large audience so a mention or link in the content they create can provide a real boost to your brand and authority.

If you didn't put the pieces together already, creators power the link graph.

Attention

Hangout Cat

Attention is at a premium and it's your job to win the attention auction as many times as you can. It's even more important to win the attention of creators. Yet, creators may have a more limited amount of attention to give. Why? They're busy creating content! Seriously, it takes time (and lots of it) if you're doing it right.

Not only that but if they're a successful creator, the demands on their time increase. They get more email, more requests, more clients.

So how do you get the attention of a creator? Funny thing, there's actually a really easy way to hack the attention of a creator. That's right. Blog commenting.

You know that creators are going to be paying attention to the comments on their content. They worked hard to produce it and they're looking to see how it's received. Make no mistake, creators thrive on feedback and validation.

Creators hang out in the comments section. So take advantage of the implicit focus creators have on comments.

Blog Commenting

The problem with blog commenting is that most people suck at it. I'm not even talking about the cesspool of comments that often overwhelms YouTube videos or the comment spam, with its ever-present and overly complimentary prose, clogging up moderation queues.

Commenting is your chance to get the undivided attention of that creator, if only for a few seconds as they determine whether the comment is interesting.

"Nice post. Very helpful."

Is that comment interesting? Nope. Is it memorable? Nope. Comments like this do absolutely nothing for you. In fact, if a creator associates you with these types of moronic bland comments, you reduce your chances of securing their attention in the future.

Remember, attention is a habit. You figure out which people are worthy of your attention and which are not. The more times I choose not to pay attention to you, the more likely I am to do that in the future.

When you comment, your job is to add value to that content. That means you come with an opinion and point of view. You come with other related content that you'll link to in your comment. Those links should not always be to your own site. No one likes the person who always talks about me, me, me.

Most creators want a reaction. They want a debate. They want a conversation. They want to learn. They want to be challenged. They want to be mentally stimulated.

Who Is This Person?

Thought Bubble

If you've done your job right and provided a comment that engages the creator, a thought bubble should appear over their head reading 'who is this person?'

At that point they're clicking on the links in your comment or on the 'site' link you provided in the comment meta that's on nearly every comment platform.

A good comment gets a creator curious about that person.

They click around and do some research. Maybe you have a blog yourself and they read your latest post (or more). Maybe they like it enough they add it to their RSS reader or they find your Twitter handle and follow you there.

Of course this means you need those exploratory clicks to land somewhere that showcases your brand. Don't make the mistake of leaving a great comment and then have the creator come through to a site that hasn't been updated in over a year or a half-ass product page with a broken image.

If you've engaged the creator enough to garner more attention, don't squander it with poor content assets.

Putting It All Together

I've been wanting to write this post for a few months but it wasn't until I bumped into Larry Kim (who is a great guy) at SMX West that everything fell into place. I was chatting with Larry about this topic and he gave me a perfect example of the power of blog commenting in practice.

On February 25th the talented Elisa Gabbert compiled the opinions of SEO experts on the 'dwindling value of links' (bollocks, but that's another story).

Wordstream PageRank Post

The post was popular and garnered 37 comments, many from other notable creators. One of those was a very comprehensive comment by Russ Jones from Virante.

Comment by Russ Jones on Wordstream PageRank post

On February 28th (three days later) the indomitable Rand Fishkin released a Whiteboard Friday video that not only linked to the Wordstream post but referenced comments by Russ Jones.

Whiteboard Friday on Link Value

And if you watch the video or read the transcript it's crystal clear that Rand has read the comments. Heck, he uses them as the basis for a material amount of this video! It might have been nice if Moz had also linked to Virante but c'est la vie.

Do you see what just happened here!? Have I convinced you how powerful blog commenting can be in getting the attention of creators? That those creators can then give your brand, site or product exposure by including it in their content?

But ... Reasons Excuses

Cheese

I know some of you are going to complain that blog commenting like this is too time consuming. Oh? Can I offer you some cheese for that whine?

Seriously! There are few better ways to interact with creators. Not every comment will result in a mention or link in three days' time but done right you're going to build your expertise and authority with the 'right' people.

Would you rather send out a bunch of email pitches to influencers which are essentially interruptions and attacks on their attention or instead build lasting content assets (comments my friend) while gaining exposure with said influencers? Choose wisely.

Others are rightly frustrated with comment censorship, both human and algorithmic (i.e. - spam filters). But the answer is not to remove comments (and chase away creators) but to figure out a better way to have these discussions.

TL;DR

A small number of creators are responsible for the vast majority of the content we consume. They have a limited amount of attention yet wield a lot of influence through their ability to reference sites, products, brands or content in the content they produce.

Creators hang out in (aka devote their attention to) the comments section of their content and that of others. Thus, memorable blog comments that provoke creator curiosity (and clicks) build your authority and improve your chances of gaining a mention or link in their content in the future.

SEO Is Stone Soup

March 17 2014 // SEO // 24 Comments

Each year my wife and I take our daughter to the Chevron Family Theatre Festival at the Lesher Center for the Arts. It's a great local tradition where kids sit on stage during fairy tale plays, get their face painted and eat ice cream among other things.

Last year we also saw The Pushcart Players, a smaller touring group, put on Stone Soup and Other Stories. They were excellent. Not only that but the Stone Soup fable they performed was new to me and, oddly, fit my view of SEO.

Stone Soup

The Stone Soup story is "an old folk story in which hungry strangers persuade local people of a town to give them food." The scene opens with a traveller going through a Russian town way back when. In this instance the traveller is acquainted with the townspeople but it's been a hard year. Everyone is hungry and they're in no mood to share.

The traveller sets up his pot, puts water and a stone in it and gets a fire going, telling the passing townspeople that he's making delicious Stone Soup. The townspeople are super curious (and hungry) so he promises to share the recipe if they help him make it.

He's got them hooked! So as he's stirring the traveller says "Well, you know what makes Stone Soup even better? Carrots." And a townsperson runs off to fetch a carrot which is added to the soup.

Then the traveller says, "But, you know what else makes Stone Soup great? Onions." And so another townsperson runs off to fetch an onion. The pattern continues with the traveller asking for a potato and finally a chicken until the Stone Soup is truly a delicious and hearty soup.

SEO is the stone.

SEO is Stone Soup

Unopened Geode

Watching and listening to the story I was struck that the way I approach SEO is just like Stone Soup.

A client might come to me for what many might think of as traditional SEO, including technical issues, keyword research, on-page optimization and (the dreaded) link building. But as I work with a client I take on the role of the traveller.

You know what would make your SEO even better? Some conversion rate optimization. Even more so? Focusing on user experience. Oh, and do you have an email marketing program? No? Oh, well that's critical. You're doing social right? How about remarketing? No! Well that's a no-brainer. Have you looked at these business development opportunities? Can we talk about your product? And let's start promoting benefits instead of features, okay?

The list is pretty long and at the end of the day SEO is simply digital marketing.

Why Sell SEO?

If SEO is just digital marketing why do I continue to sell SEO? That's how my clients talk about this problem! I've done my intent and user syntax research.

Google Trends for SEO, Inbound Marketing and Growth Hacking

Honestly, I don't really care what we call ourselves. All of the above terms are just fine and you might have found a way to use other terms to sell your services. Awesome! More power to you.

But in my experience, when people are looking for help with their business (aka more traffic) they often use the term 'SEO' as a proxy. So I want to be there when those clients come knocking.

In the end it doesn't matter what was said about SEO on The Good Wife. It only matters how prospective clients talk about and express this intent. Because the intent behind those looking for SEO matches what I offer, whether they realize it or not.

Baggage

There's a tremendous amount of angst around the 'baggage' that comes with the term SEO. People say that the mainstream thinks of us as spammers. That the term SEO is toxic.

Every objection is an opportunity.

That's what I used to tell my employees when I ran telemarketing programs at George Washington University, American University and UCSD. One of my jobs was to arm my callers with answers to predictable objections.

Objection: "The university doesn't need any more money. They get plenty from the state of California."

Response: "Actually, the amount the university gets from the state has dropped from 80% to 20% in the last 10 years. We need your help more than ever."

I'm not sure I have the facts right here but it's something like this and you get the gist of it, right? When someone raises an objection it's actually an opportunity to have a dialog.

Why not use that to your advantage?

Predictability

Cats Sitting in Boxes

One of the other reasons I still use the term SEO is that the questions (or objections) I'm going to get are predictable. I can anticipate the questions and can have great answers to them ready to roll off my tongue or fingertips.

I've used this technique in many places. For instance, what do you think one of the most frequent questions is about Blind Five Year Old? The name! So I get to tell them about my philosophy of SEO right away. #winning

Back in the day, I left a gap in employment on my resume knowing that a good interviewer would ask me about that gap. I had a great answer lined up. And I'd rather talk about what I want to talk about than leave everything up to chance.

Same thing when you're creating a pitch deck. Engineer the presentation so that you can anticipate the questions! I'm not saying look like an idiot. Just be smart and you'll increase your odds of getting the questions you want to answer.

So with the term SEO I know a lot of the questions are going to be about spam, links and other dated techniques. I have my answers. Not only that but I can qualify clients by how they respond to my answers.

Marketing

I Love Marketing

In the end I approach things as a marketer. I'm using what is sometimes referred to as a bowling pin strategy. I'm following the user syntax of my clients and matching their intent.

I know the funnel is going to be dirty, chock full of the misinformed working off of 2007 rank-fast SEO tactics. But it's a ginormous funnel that isn't going anywhere anytime soon. So my job is to quickly find the diamonds in the rough.

Mind you, I no longer have to do a lot of this since nearly 100% of my business comes via referrals. But I got to this point using the strategy outlined here and hope I'd do it again (nearly) the same way if I were starting from scratch.

TL;DR

SEO as a profession is much larger than the specific acronym indicates. Yet, SEO remains a powerful term because of user behavior and intent, providing an opportunity to deliver other important digital marketing techniques to a larger audience. So I sell SEO as if I'm making Stone Soup.

Knowledge Graph Optimization

March 10 2014 // KGO + SEO // 36 Comments

A few months ago I offhandedly made a reference to KGO which stands for Knowledge Graph Optimization.

Now, I know what you're thinking. We need another acronym like another hole in the head! But over the past year I've come to feel that there is a set of tactics that can help you optimize your site's connection to the Knowledge Graph. And that can yield material gains in search visibility.

The Knowledge Graph

Here's a brief explanation from Google for those not familiar with the Knowledge Graph.

The Knowledge Graph enables you to search for things, people or places that Google knows about—landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more—and instantly get information that’s relevant to your query. This is a critical first step towards building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do.

It's about searching for things instead of strings. Or without the rhyming, it's about entities instead of text.

Take the query 'Golden State Warriors'. From a string standpoint you'd be looking at the individual keywords which might be confusing. Now, Google got extremely good at understanding terms that were most frequently used together using bigrams and other methods so that this query would yield a result about the NBA basketball team.

But with the Knowledge Graph Google can instead identify 'Golden State Warriors' as an entity (a thing) that has a specific entry in the Knowledge Graph and return a much richer result.

Knowledge Graph Result for Golden State Warriors

Pretty amazing stuff really. (Go Warriors!) Hummingbird was largely an infrastructure update that allowed Google to take advantage of burgeoning entity data. So we're just getting started with the application of entities on search.

Entity Challenge

Challenge Accepted

You need only look to the Entity Recognition and Disambiguation Challenge co-sponsored by Microsoft and Google to see the writing on the wall.

The objective of an Entity Recognition and Disambiguation (ERD) system is to recognize mentions of entities in a given text, disambiguate them, and map them to the entities in a given entity collection or knowledge base.

Can it be any more clear? Well, actually, it can.

The Challenge is composed of two parallel tracks. In the “long text” track, the challenge targets are pages crawled from the Web; these contain documents that are meant to be easily understood by humans. The “short text” track, on the other hand, consists of web search queries that are intended for a machine. As a result, the text is typically short and often lacks proper punctuation and capitalization.

Search engines are chomping at the bit to get better at extracting entities from documents and queries so they can return more relevant and valuable search results.

So ...

Wile E. Coyote Bat Suit

But what exactly are we supposed to do? There has been little in the way of real rubber-meets-the-road content that describes how you might go about optimizing for this new world full of entities. One of the exceptions would be Aaron Bradley's Semantic SEO post, though it mixes both theory and tactics.

Now, I love theory. That's pretty clear from my writing. But today I want to talk more about tactics, about the actual stuff we can do as marketers to effect change in the Knowledge Graph.

Nouns

Noun

The first thing we can do is make sure we're using the entity names in our writing. That ERD challenge above? Well, the systems they're designing are looking to extract entities from text.

So if you're not using the entity names - the nouns - in your writing then you're going to make it vastly more difficult for search engines to identify and match entities. This does not mean you should engage in entity stuffing and mention every associated entity you can think of in your content.

Write clearly so that both humans and search engines know what the hell you're talking about.

Connect

Connect All The Things!

Stop hoarding authority and 'link juice' by not linking out to other sites. The connections between sites and pages are important and not just in a traditional PageRank formula.

I think of it this way. The entities that are contained on one page are transmitted to linked pages and vice versa.

Entities are meta information passed in links.

Structured Data

Structured Data

You can make the identification of entities easier for search engines by using schema.org markup along with some other forms of structured data. Not only will this ensure that the number of entities that are transmitted via links increases, it can often make connections to the Knowledge Graph with a very limited amount of data.
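
As a rough illustration of what schema.org markup can look like, here is a minimal Python sketch that builds a LocalBusiness description and wraps it in a JSON-LD script tag. JSON-LD is only one of the syntaxes schema.org supports, and choosing it here is an assumption rather than anything the examples in this post rely on. The business name, address, phone number and URL are all hypothetical placeholders.

```python
import json


def local_business_jsonld(name, street, city, region, phone, url):
    """Return a JSON-LD <script> tag describing a local business.

    The property names come from schema.org; the values passed in
    below are hypothetical placeholders.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
        },
        "telephone": phone,
        "url": url,
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical business used purely for illustration.
snippet = local_business_jsonld(
    "Example Mortgage Co.",
    "123 Example St",
    "San Diego",
    "CA",
    "+1-619-555-0100",
    "https://www.example.com/",
)
print(snippet)
```

The point is simply that the name, address and phone number are stated as explicit properties rather than left for a search engine to infer from the surrounding text.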

Google Maps Entity Hack

So, here's the actual bit of discovery that I've been holding onto for six months and is the real impetus for this entire post. If you go to Google Maps and search for a branded term coupled with a geographic location you often get some very interesting results. Take 'zillow san diego, ca' for instance.

Google Maps Result for Zillow San Diego CA

Look at all those results and red dots! I didn't ask for realtors, mortgage brokers or appraisers in my query. I simply used the term Zillow in combination with a geography and got these very related and relevant results. They're not simply looking for a Zillow office located in San Diego.

So, let's look at the details here to see what's going on. I'll take one of the red dots and investigate further.

Mesa Pacific Mortgage Google Maps Result

So why is this on the map results? First I go to the linked website.

Mesa Pacific Mortgage Website

So, there are no links to Zillow anywhere on the site and the address and phone number here don't match the one on Google Maps. But they are the ones listed on his Zillow Profile.

Zillow Structured Data Example

Now the link to the website closes the connection here so it's not completely linkless, but I still find it pretty amazing. And this is without Zillow fully optimizing the markup. They declare the page as an organization.

Zillow Organization Schema

But they don't detail out the professional information with schema markup.

Zillow Definition List Markup

Instead they're using some old(er) school definition list markup for list term and description. Combined with the organization scope it looks like Google can put 1 and 1 together.
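
To show how little it takes to get at that kind of data, here is a small Python sketch that pulls term and description pairs out of definition list markup sitting inside an Organization scope. The HTML is a hypothetical stand-in, not Zillow's actual markup, and the parser is only an illustration of how simple the extraction can be. It says nothing about how Google actually does it.

```python
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Collect (term, description) pairs from <dt>/<dd> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current_tag = None  # "dt" or "dd" while inside one
        self._term = None         # last <dt> text seen

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._current_tag = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._current_tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._current_tag == "dt":
            self._term = text
        elif self._current_tag == "dd" and self._term is not None:
            self.pairs.append((self._term, text))
            self._term = None


# Hypothetical profile markup: an Organization scope wrapping a plain <dl>.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Organization">
  <dl>
    <dt>Address</dt><dd>123 Example St, San Diego, CA</dd>
    <dt>Phone</dt><dd>(619) 555-0100</dd>
  </dl>
</div>
"""

parser = DefinitionListParser()
parser.feed(SAMPLE_HTML)
profile = dict(parser.pairs)
print(profile)
```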

Google+

In doing due diligence I found Mesa Pacific Mortgage also has a Google+ page which reinforces the right address and phone number. So the connection isn't as startling as it might seem but it's still intriguing.

And I have no idea in what order these things came into existence. It's pretty clear the Zillow listing probably came first based on the 2006 Member Since date on his profile. Whether the Google+ Local page and associated map listing came directly as a result is unknown.

In fact, as you do more and more investigation as to what shows up on the map and what doesn't it seems like a Google+ Local page is required. However, a fair amount of them have been created by Google. Obviously Google uses a multitude of sources to create these listings. If you can be one of those sources, all the better. But even if you're not, connecting to these entities delivers value to all involved.

Let's look at another Google Maps result.

Google Maps Result for Pacific Sotheby's

If you follow that reviews link you wind up on their Google+ page.

Pacific Sotheby's Google+ Page

Odd that Google isn't sucking in the reviews from Zillow, which would show a greater connection. Google+ Local Pages provide a vast database of entities for Google. And they rely on the data in Google+ more than that from other sources.

Zillow Profile for Keke Jones

Here the phone number on Zillow doesn't match the one on Google+ or Google Maps. A quick aside that you're also seeing the potential to create a relationship between Keke Jones (person) and Pacific Sotheby's Int'l Realty (place). But I digress.

Outside of the website connection and address match in that Professional Information section, the other reason this result shows up for this search is because they use Zillow products on their website.

Pacific Sotheby's Links to Zillow

The rest of you can run away from these types of implementations based on poor analysis of a Matt Cutts video if you like, but that would be a mistake in my view.

Okay, one last example. Let's zoom in and find another result.

Google Maps Result for Roger Ma

The hours data indicates that Roger probably has a Google+ Page. Yup.

Google+ Page for Roger Ma

Now we can see that they're pulling in reviews from Zillow and Roger does have a profile on Zillow. So why he shows up for a Zillow+Geography search is pretty straight-forward.

Interestingly, searching for 'homethinking san diego, ca' on Google Maps does not return Roger Ma. Perhaps because they don't include an address line 1 or because they only use hreview-aggregate and don't declare a schema.org scope (thank you handy structured data testing tool bookmarklet).

Tough to say but you can see how important it might be to ensure you did what was necessary to confirm these connections.

People Talk About

People Talk About Amber Bistro

Now let's home in (pun intended) on the 'People talk about' feature. These terms are generated through some process/algorithm that analyzes the review text and pulls out the relevant (depending on who you ask) key phrases.

Now, I'm not going to go too far down this rabbit hole, though I think it's possible Google might be using both review text and query syntax to create these phrases. Bill Slawski did a nice job teasing out how Google finds 'known for' terms for entities.

What's important in my view is that these key phrases become more meta information that gets passed back and forth through entity connections.

Google is assigning this entity (Roger Ma) a certain cluster of key phrases including 'sell a home' and 'great realtor'. Zillow is connected to this entity, as we've demonstrated, which means that those key phrases are, on some level, applied to Zillow's page and site.

Now imagine the aggregated key phrases from connected entities that are flowing into Zillow. Do you think that might give Google a better idea of exactly when and for what queries they should return Zillow content?
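
To picture that aggregation, here is a toy Python sketch that sums key phrases across connected entities. It is purely an illustration of the idea and not a description of anything Google is known to do. Only the two Roger Ma phrases come from the example above. The other entities and phrases are hypothetical.

```python
from collections import Counter

# Key phrases per connected entity. Only the Roger Ma phrases come from
# the example above; the other entries are hypothetical.
connected_entities = {
    "Roger Ma": ["sell a home", "great realtor"],
    "Hypothetical Agent B": ["great realtor", "first time buyer"],
    "Hypothetical Lender C": ["refinance", "first time buyer"],
}


def aggregate_phrases(entities):
    """Count how often each key phrase appears across all entities."""
    totals = Counter()
    for phrases in entities.values():
        totals.update(phrases)
    return totals


hub_phrases = aggregate_phrases(connected_entities)
for phrase, count in hub_phrases.most_common():
    print(f"{count} x {phrase}")
```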

And Google might very well know the terms people used to get to Roger Ma's page on Zillow and use that to inform all of the other connected entities. That's speculation but it's made with over six months of experimentation and observation.

I can't share many of the details because I'm under various NDAs, but once you make these connections using structured data there seems to be an increased ability to rank for relevant terms.

SameAs

Okay, we veered off a bit into theory so let's get back to tactics. If you have a page that is about a known entity you may want to use the sameAs schema.org property.

sameAs Schema Property

If I had to describe it plainly, I'd say sameAs acts as an entity canonical. Sure, it's a bit more complicated than that and has a lot to do with confirming identity but in my experience using sameAs properly can be a valuable (and more direct) way of telling search engines what entity that page contains or represents.

sameAs Schema Example

Here you see that a page about Leonardo DiCaprio has a sameAs property pointing to his Wikipedia entry. Now, obviously you could try to spam this property but there would be a number of ways to catch this type of behavior. Sadly, I know that won't stop some of you.
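
For those who want something to copy from, here is a minimal Python sketch of a Person entity with a sameAs property, serialized as JSON-LD. The sameAs property and the Person type are real schema.org vocabulary, but the JSON-LD syntax and the example.com page URL are assumptions made for illustration. The screenshot above may well use a different syntax.

```python
import json


def person_with_same_as(name, page_url, same_as_urls):
    """Describe a person and point sameAs at authoritative references."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": page_url,  # hypothetical page on your own site
        "sameAs": list(same_as_urls),
    }


entity = person_with_same_as(
    "Leonardo DiCaprio",
    "https://www.example.com/people/leonardo-dicaprio",
    ["https://en.wikipedia.org/wiki/Leonardo_DiCaprio"],
)
print(json.dumps(entity, indent=2))
```

Because sameAs accepts a list, the same structure works if you also want to reference other authoritative profiles for that entity.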

Wikipedia

Cat Editing Wikipedia

Like it or not Wikipedia is still a primary source of data for the Knowledge Graph. If you've got a lot of time, patience and can be objective rather than subjective you can wade into Wikipedia to help create company profiles, provide reference links (more important than you may imagine) and generally ensure that your brand is represented in as many legitimate places as possible.

Your goal here isn't to spam Wikipedia but to simply crack the Kafka-like nature of Wikipedia moderation and provide a real representation of your site or brand that adds value to the entire corpus and platform.

Freebase

Freebase on the other hand has a different type of challenge. Instead of obstinate editors and human drama, Freebase is just ... a byzantine structure of updates. The good news? It's a direct line to the Knowledge Graph.

For instance if you search for Twitter this is the Knowledge Card you get as a result.

Knowledge Graph Result for Twitter

There's no Google+ part of the Knowledge Card because there is no reference to a Google+ Page under Social Media Presence.

Twitter Freebase Profile

Turns out they don't have a Google+ Page. Seriously? Man, get with it Twitter. Compare this to StumbleUpon.

Knowledge Card for StumbleUpon

They've got the business specific information as well as the Google+ integration with the Recent posts unit. Why? They've got a Google+ entry in their Social Media Presence on Freebase.

StumbleUpon Freebase Profile

How about Foursquare?

Knowledge Card for Foursquare

Oy! Not so good. They've got their Google+ account in Freebase.

Foursquare Freebase Entry

However, the business section on their 'Inc.' entry in Freebase (different from the standard entry) is empty.

Foursquare Business on Freebase

Now, the interplay between a standard entry and a business entry on Freebase can be strange and some entities don't even need this dual set-up, which makes understanding how to enter it all really complex. So, it's not just you who thinks updating Freebase is hard. But ... it's totally worth it.

Because Freebase really is where the Knowledge Graph flows as I've just shown. For just one more example, look at the Knowledge Card for Garret Dillahunt and then look at the data in his Freebase entry. Match the elements that show up in the Knowledge Card. Convinced?

You might ask why Google links to Wikipedia in the Knowledge Cards and not Freebase? Have you looked at Freebase!? It's not a destination site anyone on the Google search team would wish on a user. That and Wikipedia has a solid brand that likely resonates with a majority of users.

KGO

Knowledge Graph Optimization is just getting started but here are the real things (pun intended) you can do to start meeting this new world head on.

Use Entities (aka Nouns) In Your Writing

Make it easy for users and search engines to know what you're talking about by using the actual names of the entities in your writing.

Get Connected and Link Out To Relevant Sites

Stop hoarding link juice and link out to relevant sites so that the entity information can begin to flow between sites.

Use Structured Data To Increase Entity Detection

Make it easier for search engines to detect, extract and connect entities to the Knowledge Graph by using various forms of structured data.

Go A Step Further and Use the sameAs Property 

When appropriate use the sameAs property to reference the exact Freebase or Wikipedia entry for that entity. Think of it as an entity canonical.

Claim and Optimize Your Google+ Presence

There's no doubt that Google+ sits in the middle of a lot of the knowledge graph, particularly about places. So claim and optimize your presence, which also extends to getting reviews.

Get Exposure on Wikipedia

Put on some music and slug it out with Wikipedians who seem straight from Monty Python's Argument sketch and edit your profile and add some appropriate references.

Edit and Update Your Freebase Entry

Update your Freebase entry and make it as complete as possible. I hope to have a more instructive post on editing Freebase some time in the near future.

Knowledge Graph Optimization (KGO) is about making it easy to connect to as many relevant entities as possible so that search engines better understand your site on a 'thing' level and can pass important meta information between connected entities.

Are You Winning The Attention Auction?

January 20 2014 // Marketing + SEO + Social Media // 30 Comments

Every waking minute of every day we choose to do one thing or another.

For a long time we didn't have many choices. Hunt the mammoths or mind the fire. Read the bible or tend the crops. I can remember when we only got six television stations on an old black and white TV.

But as technology advances we're afforded more choices more often.

Freedom of Choice by Devo

We can decide to talk about the weather with the person next to us in the doctor's waiting room or stare into our phone and chuckle at a stupid BuzzFeed article. We can focus on that Excel spreadsheet or we can scroll through our Facebook feed.

You can sit on the couch and watch The Blacklist or you can sit on that same couch and read Gridlinked by Neal Asher on a Kindle. You could go out and play tennis or you could go out and play Ingress and hack some portals.

I was going to overwhelm you with statistics that showed how many choices we have in today's digital society, such as the fact that the typical email subscriber gets 416 commercial emails every month. That's more than 10 a day!

I could go on and on because there's a litany of surveys and data that tell the same story. But ... we all know this from experience. We live and breathe it every day.

We all choose to look, hear and do only so many things. Because there are only so many hours in each day.

Our time and attention is becoming our most valued resource. (Frankly, we should really guard it far more fiercely than we do.) As marketers we must understand and adapt to this evolving environment. But ... it's not new.

The Attention Auction

Content Doge Meme

There's always been an auction on attention. That critical point in time where people decide to give their attention to one thing over the other.

Recently, there's been quite a kerfuffle over the idea of content shock. That there's too much content. There are some interesting points in that debate but I tend to believe the number of times content comes up in the auction has increased quite a bit. We consume far more content due to ubiquitous access.

Sure there's more content vying for attention. But there are more opportunities to engage and a large amount of content never comes up in the auction because of poor quality or mismatched interest.

There are hundreds of TV channels but really only a handful that are contextually relevant to you at any given time. Even if there are 68 sports channels the odds that you are in the mood to watch sports and that there will be something on each of those stations at the same time that you want to watch is very small. If you're looking to watch NFL Football then Women's College Badminton isn't really an option.

More importantly, I believe that we've adapted to the influx of content. It's knowing how we've adapted that can help marketers win the attention auction more often.

We Are Internet Old!

Sample Geocities Page

Adolescents often do very reckless things. They run red lights. They engage in binge drinking. They have unprotected sex. While some point to brain development as the cause (and there's some truth to that), I tend to believe Dr. Valerie Reyna has it right.

The researchers found that while adults scarcely think about engaging in many high-risk behaviors because they intuitively grasp the risks, adolescents take the time to mull over the risks and benefits.

It's not that adolescents don't weigh the pros and cons. They do and actually overestimate the potential cons. But despite that, they choose to play the odds and risk it more often than adults. In large part, this can be attributed to less life experience. They've had fewer opportunities to land on the proverbial whammy.

As we grow older we actually think less about many decisions because we have more experience and we can make what is referred to as 'gist' decisions. From my perspective it simply means we grok the general idea and can quickly say yea or nay.

So what does any of this have to do with the Internet, attention or content?

When it comes to consuming digital content, we're old. We've had plenty of opportunities to experience all sorts of content to the point where we don't have to think too hard about whether we're going to click or not. If it fits a certain pattern we have a certain response.

Nigerian Email Scam

Nay! A thousand times nay.

The vast majority of content being produced is, to put it bluntly, crap. Technology has a lot to do with this. It is both easy and free to create content in written or visual formats. From WordPress to Tumblr to Instagram, nearly anyone can add to the content tidal wave.

Of course, the popularity of 'content marketing' has increased the number of bland, "me too" articles, not to mention the eyesore round-up posts that are a simulacrum of true curation.

People have wasted too much time and attention on shitty content. The result? We're making decisions faster and faster by relying on those past experiences.

We create internal shortcuts in our mind for what is good or bad. It's a shortcut that protects us from wasting our time and attention, but may also prevent us from finding new legitimate content. So how do we address this cognitive shortcut? How do you win the attention auction?

You can ensure that you fit that shortcut and you can add yourself to that shortcut.

Fit The Shortcut

Getting Attention

Purple Goldfish

Fitting the shortcut is simple to say, but often difficult to execute. Make sure that, at a glance, you get the attention of your user. There are plenty of ways to do this from writing good titles to using appropriate images to leveraging social sharing.

When '1-800 service' pops up on caller ID you're probably making a snap decision that it's a telemarketer and you'll ignore the call. When it's the name of your doctor or someone from your family you pick up the phone. This same type of process happens on nearly all social platforms as people scan feeds on Twitter, Google+ and Facebook.

Recently Facebook even admitted to the issues revolving around feed consumption.

The fact that less and less of brands' content will surface is described as a result of increased competition for limited space, since "content that is eligible to be shown in news feed is increasing at a faster rate than people's ability to consume it."

Now this is a bit disingenuous since Facebook is crowding out legitimate content for ads (a whole lot of ads) but the essence of this statement is true. Not only that but your content is at a disadvantage on Facebook since much of the content is personal in nature. Cute pictures of your cousin's kids are going to trump and squeeze out content from brands.

So with what space you're left with on these platforms, you better make certain it has the best chance of getting noticed and fitting that shortcut. The thing is, too many still don't do what's necessary to give their content the best chance of success.

If you're not optimizing your social snippet you're shooting your content in the foot.

Be sure your title is compelling, that you have an eye catching image, that the description is (at a minimum) readable and at best engages and entices. Of course, none of this matters unless that content finds its way to social platforms.
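
As one way to picture a social snippet, here is a short Python sketch that emits Open Graph meta tags for the title, description and image. Open Graph is a real protocol that social platforms can read, but using it here is an assumption since this post doesn't name a specific markup. The description, image and page URLs are hypothetical.

```python
from html import escape


def social_snippet_tags(title, description, image_url, page_url):
    """Build Open Graph <meta> tags for a page's social snippet."""
    properties = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
        "og:url": page_url,
    }
    # Escape values so quotes or ampersands can't break the attribute.
    return "\n".join(
        f'<meta property="{name}" content="{escape(value, quote=True)}">'
        for name, value in properties.items()
    )


# Hypothetical values used purely for illustration.
tags = social_snippet_tags(
    "Are You Winning The Attention Auction?",
    'Why "good enough" content loses & what to do about it.',
    "https://www.example.com/images/attention-auction.png",
    "https://www.example.com/attention-auction",
)
print(tags)
```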

Make sure you're encouraging social sharing. Don't make me hunt down where you put the sharing options or jump through hoops once I get there.

Ensure your content is optimized for both social and search. And when you're doing the latter rely on user centric syntax and intent to guide your optimization efforts.

Your job is to fit into that cognitive shortcut by making it easy for users to see and understand your content in the shortest amount of time possible.

Keeping Attention

Bored One Ear To Death LOLcat

Getting them to your content is the first step in winning their attention. At that point they're giving you the opportunity to take up more of their time and attention. They made a choice but they're going to be looking to confirm whether it was a good one with almost the same amount of speed.

When you land on a new website you instantly (perhaps unconsciously) make a decision about the quality and authority of that site and whether you'll stick around.

A websites’ first impression is known to be a crucial moment for capturing the users interest. Within a fraction of time, people build a first visceral “gut feeling” that helps them to decide whether they are going to stay at this place or continue surfing to other sites. Research in this area has been mainly stimulated by a study of Lindgaard et al. (2006), where the authors were able to show that people are able to form stable attractiveness judgments of website screenshots within 50 milliseconds.

That's from a joint research paper from the University of Basel and Google Switzerland about the role of visual complexity and prototypicality regarding first impression of websites (pdf).

Once they get to the content you need to ensure they instantly get positive reinforcement. Because at the same time there are other pieces of content, other things, battling for attention.

Grumpy Cat Nope

So if they don't instantly see what they're looking for you're giving them a reason to say nope. If what they see on that page looks difficult to read. Nope. If they see grammatical errors. Nope. If they feel the site is spammy looking. Nope.

There is a drum beat of research, examples and terms that underscore the importance of reducing friction.

Books On Reducing Friction

Call it cognitive fluency or cognitive ease, either way we seek out things that are familiar and look like we expect. Books such as Barry Schwartz's Paradox of Choice and Steve Krug's Don't Make Me Think make it clear that too many choices reduce action and satisfaction. And we should all internalize the fact that the majority of people don't read but instead skim articles.

That doesn't mean that the actual content has to suffer. I still write what are considered long-form posts but format them in ways that allow people to get meaning from them without having to read them word for word.

Do I hope they're poring over every sentence? Absolutely! I'm passionate about my writing and writing in general. But I'm a realist and would prefer that more people learn or take something from my writing than have a select few read every word and laud me for sentence construction.

I still point people to my post on readability as a way to get started down this road. Make no mistake, those who optimize for readability will succeed (even with lesser content) more often than those who refuse to do so out of ego or other rationalizations (I'm looking at you Google blogs).

I will shout in the face of the next person who whines that they shouldn't have to use an image in their post or that they only want people who are 'serious about the subject' to read their article. Wake up before you're the Geocities of the Internet.

Tomato

The one thing I do know is that being authentic and having a personality can help you stand out. It can help you to at least get and retain attention and sometimes even become memorable. Here's a bit of writing advice from Charles Stross.

Third and final piece of advice: never commit to writing something at novel length that you aren't at least halfway in love with. Because if you're phoning it in, your readers will spot it and throw rotten tomatoes at you. And because there's no doom for a creative artist that's as dismal as being chained to a treadmill and forced to play a tune they secretly hate for the rest of their working lives.

The emphasis is mine. Don't. Phone. It. In.

Add To The Shortcut

Using Attention

Dude Where's My Car?

When you do get someone's attention, what are you doing with it? You want them to add your site, product or brand to that cognitive shortcut. So the next time a piece of that content comes up in the attention auction you've got the inside track. They recognize it and select it intuitively.

For instance, every time I see something new from Matthew Inman of The Oatmeal, I give it my attention. He's delivered quality and memorable content enough times that he doesn't have to fight so hard for my attention. I have a preconceived notion of quality that I bring to each successive interaction with his content.

Welcome to branding 101.

Consistently creating positive and memorable interactions (across multiple channels) will cause users to associate your site, product or brand as being worthy of attention.

Let me be more explicit about that term 'interactions'. Every time you're up in the attention auction counts as an interaction. So if I choose to pass on reading your content, that counts and not in a good way. We're creatures of habit so the more times I pass on something the more likely I am to continue passing on it.

Add to that the perception (or reality) that we have less time per piece of content and each opportunity you have to get in front of a user is critical.

Now, if I actually get someone to share a piece of content, will it be presented in a way that will win the attention auction? If it isn't, not only have I squandered that user action but I may have created a disincentive for sharing in the future. If I share something and no one gives me a virtual high five of thanks for doing so, will I continue to share content from that source?

Poor social snippet optimization is like putting a kick-me sign on your user's back.

Memorable

Make A Short Cut

If you want to be added to that cognitive shortcut you need to make it easy for them to do so. You need them to remember and remember in the 'right' way.

I've read quite a bit lately about ensuring your content is useful. I find this bit of advice exceedingly dull. I mean, are you creating content to be useless? I'm sure content spammers might but by and large most aren't. Not only that but there's plenty of great content that isn't traditionally useful unless you count tickling the funny bone as useful.

Of course you've also probably read about how tapping into emotion can propel your content to the top! Well, there's some truth to that but that's often at odds with being useful such as creating a handy bookmarklet or a tutorial on Excel. I suppose you could link it to frustration but you're not going to have some Dove soap tear-jerker piece mashed up with Excel functions. Even Annie Cushing can't pull that off.

Story telling is also a fantastic device but it's not a silver bullet either. Mind you, I think it has a better chance than most but even then you're really retaining attention instead of increasing memory.

Cocktail Party

You have to make your content cocktail party ready. Your content has to roll off the tongue in conversation.

I read this piece on Global Warming in The New York Times.

I heard this song by Katy Perry about believing in yourself.

I saw this funny ad where Will Ferrell tosses eggs at a Dodge.

Seriously, when you're done with a piece of your content, describe it to someone out loud in one sentence. That's what it'll be reduced to for the most part.

As humans we categorize or tag things so we can easily recall them. I think the scientific term here is 'coding' of information. If we can't easily do so it's tough for us to talk about them again, much less find them again. As an aside, re-finding content is something we do far more often than we realize and is something Google continues to try to solve.

Even when we can easily categorize and file away that bit of information, we're not divvying it up into a very fine structure. Only the highlights make it into memory. We only take a few things from the source information. A sort of whisper down the lane effect takes place. You suddenly don't remember who wrote it, or where you saw it.

We're trying to optimize the ability to recall that information by using the right coding structure, one that we'll be able to remember.

Shh Armpit

It's the reason you need to be careful about if or how you go about guest blogging. This is also why I generally despise (strong word I know) Infographics. Because more often than not if you hear someone refer to one they say 'That Infographic on Water Conservation' or 'That Infographic on The History of Beer'.

Guess what, they have no clue where they saw it or what brand it represents. Seriously. Because usually the only two things remembered are the format (Infographic) and the topic. When I ask people to name the brands behind Infographics I usually get two responses: Mint and OK Cupid. Kudos to them but a big raspberry for the rest of you.

"But the links" I hear some of you moan. Stop. Stop it right now! That lame ass link (no don't tell me about the DA number) is nothing compared to the attention you just squandered.

I'm not saying that Infographics can't work, but they have to be done thoughtfully, for the right reasons and to support your brand. Okay, rant over.

Ensuring people walk away with a concise meaning increases satisfaction. And getting them to repeat it to someone else helps secure your content in memory. The act of sharing helps add your site or brand to that user's shortcut.

If there were a formula you could follow that would guarantee great content, why is there so much crap? If we all knew what makes a hit song or a hit movie why isn't every song and film a success? This isn't easy and anyone telling you different is lying.

Consistent

Janet Jackson

You can also add to the shortcut by creating an expectation. This can be around the quality of your content but that's pretty tough to execute on. I mean, I completely failed at generating enough blog content last year. I'm not advocating a paint-by-numbers schedule, but I had more to say and at some point if your name isn't out there they begin to forget you.

There's a fair amount of research that shows that memory is a new mapping of neurons and that the path becomes stronger with repeated exposure. You inherently know this by studying. The more you study the more you remember.

But what if the memory of your site or brand, that path you're creating in your user's mind, isn't clear? What if the first time you associate the brand with one thing and the next time it's not quite the thing you thought it was? Or what if the time between exposures is so great that you can't find that path anymore and inadvertently create a new path? How many times have you saved something only to realize you already saved it at some point in the past?

Now, I'm out there in other ways. I keep my Twitter feed going with what I hope is a great source of curated content across a number of industries. My Google+ feed is full of the same plus a whole bunch of other content that serves as a sort of juxtaposition to the industry specific content.

One of the more successful endeavors on Google+ is my #ididnotwakeupin series where I share photos from places around the world. It's a way for me to vicariously travel. So every morning for more than two years I've posted a photo tagged with #ididnotwakeupin.

The series gets a decent amount of engagement and if I tried harder (i.e. - interacted with other travel and photography folks) I'm pretty sure I could turn it into something bigger. I even had an idea of turning it into a coffee table book. I haven't though. Why? Because there's only so much time in every day. See what I did there?

Another example of this is Moz's Whiteboard Friday series. You aren't even sure what the topic is going to be but over time people expect it to be good so they tune in.

Or there's It's Grace (formerly Daily Grace) on YouTube where people expect and get a new video from Grace Helbig every Monday through Friday. Want to double-down on consistent? Tell me what phrase you remember after watching this video from Grace (might be NSFW depending on your sensitivity).

Very ... yeah, you know.

That's right. Repetition isn't a bad thing. The mere exposure effect demonstrates that the more times we're exposed to something the better chance we'll wind up liking it. This is what so many digital marketing gurus don't want you to hear.

Saturation marketing (still) works because more exposure equals familiarity which improves cognitive fluency which makes it easier to remember.

It's sort of like the chorus in a song, right? Maybe you don't know all the words to each verse but you know the chorus! Particularly if you can't get away from hearing it on the radio every 38 minutes.

In some ways, the number of exposures necessary is inversely proportional to the quality of the content. Great content or ads don't need much repetition but for me to know that it's JanuANY at Subway this month might take a while.

Climbing Mount Diablo

And the biggest mistake I see people make is stopping. "We blogged for a few months and saw some progress but not enough to keep investing in it." This is like stopping your new diet and exercise regimen because you only lost 6 pounds.

You always have to be out there securing and reinforcing your brand as a cognitive shortcut.

Does Pepsi decide that they just don't need to do any more advertising? Everyone knows about Pepsi so why spend a billion dollars each year marketing it? You just can't coast. Well, you can, but you're taking a huge risk. Because someone or something else might fill the void. (Note to self, I need to take this advice.)

Shared

Everywhere

The act of sharing content likely means it will be remembered. To me it's almost like having to describe that content in your head again as you share it. You have that small moment where you have to ask questions about what you're sharing, with whom and why it's interesting.

So sharing isn't just about getting your content in front of other people; it's helping to cement your content in the mind of that user.

Of course, having the same piece of content float in front of your face a number of times from different sources helps tremendously. Not only are you hitting on the mere exposure effect, you're also introducing some social proof to the equation.

To me the goal isn't really to 'go viral' but to increase the number of times I'm winning the attention auction by getting there more often with an endorsement.

You might not click on that 'What City Should You Actually Live In?' quiz on Facebook the first time but after four people have posted their answers you just might cave and click through. (Barcelona by the way.)

Examples

Breaking Bad

Walt and Jesse Suited Up on The Couch Eating

How did Breaking Bad become such a huge hit? It wasn't when it first started out. I didn't watch the first two seasons live.

But enough people did and AMC kept the faith and kept going. Because enough people were talking about it. It was easy to talk about too. "This show where a chemistry teacher becomes a meth dealer." Bonus points that the plot made it stand out from anything else on TV.

And then you figured out that you could watch it on Netflix! People gave it a try. Then they began to binge watch seasons and they were converts. They wanted more. MOAR!

Of course none of it would have happened if it weren't a great show. But Breaking Bad was also consistent, persistent, memorable and available.

BuzzFeed

BuzzFeed Logo

I know what you're thinking. BuzzFeed? Come on, their content sucks! And for the most part I'd have to agree. But it's sort of a guilty pleasure isn't it?

Here's why I think BuzzFeed works. You've found yourself on a BuzzFeed 'article' a number of times. It's not high quality in most senses of the word but it does often entertain. Not only that, it does so very quickly.

If I'm 'reading' the 25 Times Anna Kendrick Was Painfully Accurate post I'm only scrolling through briefly and I do get a chuckle or two out of it. This has happened enough times that I know what to expect from BuzzFeed.

I've created a cognitive shortcut that tells me that I can safely click-through on a BuzzFeed post because I'll get a quick laugh out of it. They entertain and they respect my time. For my wife that same function is filled by Happy Place.

Blind Five Year Old

Blind Five Year Old Logo

How about my site and personal brand? I've done pretty well but it took me quite a while to get there, figuring out a bunch of stuff along the way.

Seriously, I blogged in relative obscurity from 2008 to 2010. But over time the quality of my posts won over a few people. But quality wasn't enough. I also got better and better at optimizing my content for readability and for sharing.

I use a lot of images in my content. And I spend a lot of time on selecting and placing them. I still think I botched the placement of an image in my Keywords Still Matter post. And it still irks me. No, I'm not joking.

The images make it easier to read. Not only do they give people a rest, they allow me to connect on a different level. Sometimes I might be able to communicate an idea better with the help of that image. It helps to make it all click.

I use a lot of music references as images. Part of it is because I like music but part of it is because if you're suddenly singing that song in your head, then you're associating my content with that song, if even just a little. When I do that I have a better chance of you remembering that content. I've helped create a tag in your mental filing system.

I try to build more ways for you to connect my content in your head.

TL;DR

We have more choices more often when it comes to content. In response to this we're protecting our time and attention by making decisions on content faster. Knowing this, marketers must work harder to fit cognitive shortcuts we've created, based on experience, for what is perceived as clickable or authoritative content.

Alternatively, the consistent delivery and visibility of memorable content can help marketers create a cognitive shortcut, giving themselves an unfair advantage when their content comes up in the attention auction.

Stop Carly Rae Content Marketing

December 17 2013 // Marketing + SEO // 11 Comments

Lately I've gotten a few too many Carly Rae content marketing emails, which makes me both sad and grouchy. This is not the way to promote content, gain fans or build a brand. Stop it.

What Is Carly Rae Content Marketing?

Carly Rae Content Marketing

The term comes from Carly Rae Jepsen's popular Call Me Maybe song which contains the following lyrics.

Hey I just met you
And this is crazy
But here's my number
So call me maybe

I've changed the lyrics slightly to reflect the emails I'm increasingly receiving from folks.

Hey I just met you
And this is crazy
But here's my content
So promote me maybe

Carly Rae content marketing is the practice of sending out-of-the-blue outreach emails to people you have no relationship with, asking them to promote your content or engage in some other activity. In the end it's just shoddy email spam.

It's An Honor To Be Nominated?

The Oscars

I'm sure some of you are thinking that I'm ungrateful. The fact that I'm getting these emails shows that people want my endorsement. Perhaps it is better to be noticed than not but if I am some sort of influencer wouldn't you want to put your best foot forward?

First impressions matter and this one isn't going to win me over. In fact, I might remember you, your site or brand for the lousy outreach instead.

Win Over The Persnickety

I might demand a higher level of quality than others. So you could simply write me off as some anal-retentive prat with outrageous expectations and a self-inflated ego. But that would be a mistake.

Mr. Fussy

Because if you can put together a pitch that doesn't make me vomit in my mouth a little bit then you're likely going to have better luck with everyone else too. In short, win over your toughest critic and you'll have a powerful outreach message.

Content Marketing Basics

Johns

If you're doing outreach there are a few things you must get right. A recent post by Tadeusz Szewczyk about the perfect outreach message covered some of the basics. (It's not perfect in my view but it's certainly above average.)

You must be relevant, have a decent subject line, get my name right, respect my time and show that you've done some rudimentary homework about me. The sad part is that 50% of people fail to even get my name correct. Yup, somehow AJ Kohn is transformed into John. (Clicks trash icon.)

Respect My Time And Brain

Do or Do Not Dumbledore

One of the things that has bothered me lately is the number of people asking me to take time to provide feedback on their content. Feedback! Some of these people might actually want feedback but I'm going to call bullshit on the vast majority. You don't want feedback. You want me to share and promote your content.

Do you really want me to tell you that your infographic is an eyesore and then not share or promote it? Probably not. I mean, kudos if you really are open to that sort of thing but I'm guessing you're in promotion mode at this stage and you won't be asking for a redesign.

Getting me (or anyone) to do something is a high-friction event. Don't waste it asking them to do the wrong thing.

Honest Teasing

Teased Hair with Aqua Net

Being transparent about what you're trying to accomplish is almost always the best way to go. If you're looking for a link, tell them you're looking for a link. Stop beating around the bush.

I'd also argue that you should be applying marketing principles to outreach. Half the battle is getting me to actually click and read the content. So tease me! Get me interested in what you have to say. Give me a cliff-hanger! Don't put me to sleep or ask me to promote the content without reading it.

Get me interested so that I view or read your content. At that point you have to be confident that the content is good enough that I'll share and promote it. Stop trying to do everything all at once in your outreach email.

TL;DR

Stop Carly Rae content marketing! Fast and shoddy outreach might get you a handful of mentions but it won't lead to long-term success and may actually prevent it in many cases.

Google Now Topics

November 26 2013 // SEO + Technology // 16 Comments

Have you visited your Google Now Topics page? You should if you want to get a peek at how Google is translating queries into topics, which is at the core of the Hummingbird Update.

Google Now Topics

If you are in the United States and have Google Web History turned on you can go to your Google Now Topics page and see your query and click behavior turned into specific topics.

Google Now Topics Example

This is what my Google Now Topics page looked like a few weeks back. It shows specific topics that I've researched in the last day, week and month. If you're unfamiliar with this page this alone might be eye opening. But it gets even more interesting when you look at the options under each topic.

Topic Intent

The types of content offered under each topic differ.

Why is this exciting? To me it shows that Google understands the intent behind each topic. So the topic of New York City brings up 'attractions and photos' while the topic of Googlebot just brings up 'articles'. Google clearly understands that Back to the Future is a movie and that I'd want reviews for the Toyota Prius Plug-in Hybrid.

In essence, words map to a topic which in turn tells Google what type of content should most likely be returned. You can see how these topics were likely generated by looking back at Web History.

Search of Google Web History for Moto X

This part of my web history likely triggered a Moto X topic. I used the specific term 'Moto X' a number of times in a query which made it very easy to identify. (I did wind up getting the Moto X and love it.)

Tripping Google Now Topics

When I first saw this page back in March and then again in June I wanted to start playing around with what combination of queries would produce a Google Now Topic. However, I've been so busy with client work that I never got a chance to do that until now.

Here's what I did. Logged into my Google account and using Chrome I tried the following series of queries (without clicking through on any results) at 1:30pm on November 13th.

the stranger
allentown
downeaster alexa
big shot
pressure
uptown girl
piano man

But nothing ever showed up in Google Now Topics. So I took a similar set of terms but this time engaged with the results at 8:35am on November 16th.

piano man (clicked through on Wikipedia)
uptown girl (clicked through on YouTube)
pressure (no click)
big shot (clicked through on YouTube)
the stranger lyrics (clicked through on atozlyrics, then YouTube)
scenes from an italian restaurant (no click)

Then at 9:20am a new Google Now Topic shows up!

Google Now Topic for Billy Joel Songs

Interestingly it understands that this is about music but it hasn't made a direct connection to Billy Joel. I had purposefully not used his name in the queries to see if Google Now Topics would return him as the topic instead of just songs. Maybe Google knows but I had sort of hoped to get a Billy Joel topic to render and think that might be the better result.

YouTube Categories

Engagement certainly seems to count based on my limited tests. But I couldn't help but notice that every one of the songs in that Google Now Topic was also a YouTube click. Could I get a Google Now Topic to render without a YouTube click?

The next morning I tried again with a series of queries at 7:04am.

shake it up (no click)
my best friend's girl (lyricsfreak click)
let the good times roll (click on Wikipedia, click to disambiguated song)
hello again (no click)
just what i needed (lastfm click)
tonight she comes (songmeanings click)
shake it up lyrics (azlyrics click)

At 10:04 nothing showed up so I decided to try another search.

let the good times roll (clicked on YouTube)

At 10:59 nothing showed up and I was getting antsy, which was probably not smart. I should have waited! But instead I performed another query.

the cars (clicked on knowledge graph result for Ric Ocasek)

And at 12:04 I get a new Google Now Topic.

Let The Good Times Roll Google Now Topic

I'm guessing that if I'd waited a bit longer after my YouTube click this would have appeared, regardless of the click on the knowledge graph result. It seems that YouTube is a pretty important part of the equation. It's not the only way to generate a Google Now Topic but it's one of the faster ways to do so right now.

Perhaps it's easier to identify the topic because of the more rigid categorization on YouTube?

The Cars on YouTube

I didn't have time to do more research here but am hoping others might begin to compile a larger corpus of tests so we can tease out some conclusions.

Topic Stickiness

I got busy again and by the time I was ready to write this piece I found that my topics had changed.

New Google Now Topics

It was fairly easy to deduce why each had been produced, though the Ice Bath result could have been simply from a series of queries. But what was even more interesting was what my Google Now Topics looked like this morning.

My Google Now Topics Today

Some of my previous topics are gone! Both Ice Bath and Let The Good Times Roll are nowhere to be found. This seems to indicate that there's a depth of interaction and distance from event (time) factor involved in identifying relevant topics.

It would make sense for Google to distinguish intent that was more consistent from intent that was more ephemeral. I was interested in ice baths because my daughter has some plantar fascia issues. But I've never researched it before and likely (fingers crossed) won't again. So it would make sense to drop it.

There are a number of ways that Google could determine which topics are more important to a user, including frequency of searching, query chains, depth of interaction as well as type and variety of content.
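To make the idea of frequency plus distance from event concrete, here is a minimal sketch of how a topic score could decay over time. This is purely my own illustration and not how Google does it: the interaction weights, the seven day half-life, the threshold and the sample history are all made-up assumptions.

```python
# Hypothetical weights: deeper interactions count for more than a bare query.
INTERACTION_WEIGHT = {"query": 1.0, "click": 2.0, "video": 3.0}


def topic_score(events, now_day, half_life_days=7.0):
    """Sum interaction weights, halving each event's value every half_life_days."""
    score = 0.0
    for day, kind in events:
        age = now_day - day
        score += INTERACTION_WEIGHT[kind] * 0.5 ** (age / half_life_days)
    return score


def sticky_topics(history, now_day, threshold=1.0):
    """Return the topics whose decayed score is still at or above the threshold."""
    scores = {topic: topic_score(events, now_day) for topic, events in history.items()}
    return sorted(topic for topic, score in scores.items() if score >= threshold)


# Made-up history as (day, interaction) pairs: one burst of interest versus
# a topic that keeps coming back over four weeks.
history = {
    "Ice Bath": [(0, "query"), (0, "query"), (0, "click")],
    "Moto X": [(0, "query"), (5, "click"), (12, "query"), (20, "click"), (27, "video")],
}

print(sticky_topics(history, now_day=28))
```

In this toy setup the one-off Ice Bath research fades below the threshold after four weeks while the repeated Moto X activity stays above it, which is the behavior I'd expect from any model that mixes frequency, depth and recency.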

Google Now Topics and Hummingbird

OMG It's Full of Stars Cat

My analysis of the Hummingbird Update focused largely on the ability to improve topic modeling through a combination of traditional natural language text analysis and entity detection.

Google Now Topics looks like a Hummingbird learning lab.

Watching how queries and click behavior turn into topics (there's that word again) and what types of content are displayed for each topic is a window into Google's evolving abilities and application of entities into search results.

It may not be the full picture of what's going on but there's enough here to put a lot of paint on the canvas.

TL;DR

Google Now Topics provide a glimpse into the Hummingbird Update by showing how Google takes words, queries and behavior and turns them into topics with defined intent.

What Does The Hummingbird Say?

November 07 2013 // SEO + Technology // 29 Comments

What Does The Fox Say Video Screencap

Dog goes woof
Cat goes meow
Bird goes tweet
and mouse goes squeak

Cow goes moo
Frog goes croak
and the elephant goes toot

Ducks say quack
and fish go blub
and the seal goes ow ow ow ow ow

But there's one sound
That no one knows
What does the hummingbird say?

What Does The Hummingbird Say?

For the last month or so the search industry has been trying to figure out Google's new Hummingbird update. What is it? How does it work? How should you react?

There's been a handful of good posts on Hummingbird including those by Danny Sullivan, Bill Slawski, Gianluca Fiorelli, Eric Enge (featuring Danny Sullivan), Ammon Johns and Aaron Bradley. I suggest you read all of these given the chance.

I share many of the views expressed in the referenced posts but with some variations and additions, which is the genesis of this post.

Entities, Entities, Entities

Are you sick of hearing about entities yet? You probably are but you should get used to it because they're here to stay in a big way. Entities are at the heart of Hummingbird if you parse statements from Amit Singhal.

We now get that the words in the search box are real world people, places and things, and not just strings to be managed on a web page.

Long story short, Google is beginning to understand the meaning behind words and not just the words themselves. And in August 2013 Google published something specifically on this topic in relation to an open source toolkit called word2vec, which is short for word to vector.

Word2vec uses distributed representations of text to capture similarities among concepts. For example, it understands that Paris and France are related the same way Berlin and Germany are (capital and country), and not the same way Madrid and Italy are. This chart shows how well it can learn the concept of capital cities, just by reading lots of news articles -- with no human supervision:

Example of Getting Meaning Behind Words

So that's pretty cool isn't it? It gets even cooler when you think about how these words are actually places that have a tremendous amount of metadata surrounding them.
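If you want a feel for the vector arithmetic behind that chart, here is a toy sketch. The two-dimensional vectors are hand-made by me so that every capital sits roughly the same offset away from its country. They are not real word2vec output, and the `analogy` helper is a hypothetical name of my own.

```python
import math

# Hand-made 2-D vectors (an assumption for illustration, NOT real word2vec output).
# Each capital sits about the same offset away from its country.
vectors = {
    "France": (1.0, 1.0), "Paris": (1.0, 3.0),
    "Germany": (2.0, 1.2), "Berlin": (2.0, 3.2),
    "Italy": (3.0, 0.8), "Rome": (3.0, 2.8),
    "Spain": (4.0, 1.1), "Madrid": (4.0, 3.1),
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by finding the word nearest to b - a + c."""
    target = tuple(vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c]))
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(vectors[word], target))


print(analogy("France", "Paris", "Germany"))
print(analogy("France", "Paris", "Italy"))
```

The point of the sketch is only that 'capital of' becomes a direction you can add to a country. The real toolkit learns hundreds of dimensions from text instead of having someone type the numbers in.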

Topic Modeling

It's my belief that the place where Hummingbird has had the most impact is in the topic modeling of sites and documents. We already know that Google is aggressively parsing documents and extracting entities.

When you type in a search query -- perhaps Plato -- are you interested in the string of letters you typed? Or the concept or entity represented by that string? But knowing that the string represents something real and meaningful only gets you so far in computational linguistics or information retrieval -- you have to know what the string actually refers to. The Knowledge Graph and Freebase are databases of things, not strings, and references to them let you operate in the realm of concepts and entities rather than strings and n-grams.

Reading this I think it becomes clear that once those entities are extracted Google is then performing a lookup on an entity database(s) and learning about what that entity means. In particular Google wants to know the topic/concept/subject to which that entity is connected.

Google seems to be pretty focused on that if you look at the Freebase home page today.

Freebase Topic Count

Tamar Yehoshua, VP of Search, also said as much during the Google Search Turns 15 event.

So the Knowledge Graph is great at letting you explore topics and sets of topics.

One of the examples she used was the search for impressionist artists. Google returned a list of artists and allowed you to navigate to different genres like cubists. It's clear that Google is relating specific entities, artists in this case, to a concept or topic like impressionist artists, and further up to a parent topic of art.

Do you think that having those entities on a page might then help Google better understand what the topic of that page is about? You better believe it.
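Here's a rough sketch of what that extract, look up and roll up process could look like. It's a toy under stated assumptions: the `ENTITY_TOPIC` and `PARENT_TOPIC` dictionaries are hypothetical stand-ins for an entity database like Freebase, and the substring matching is far cruder than real entity detection.

```python
from collections import Counter

# Hypothetical, tiny stand-in for an entity database (entity -> topic).
ENTITY_TOPIC = {
    "claude monet": "impressionist artists",
    "edgar degas": "impressionist artists",
    "pablo picasso": "cubist artists",
    "toyota prius": "hybrid cars",
}

# Hypothetical parent topics, so votes roll up one level.
PARENT_TOPIC = {
    "impressionist artists": "art",
    "cubist artists": "art",
    "hybrid cars": "cars",
}


def extract_entities(text):
    """Naive entity detection: look for known entity names in the text."""
    lowered = text.lower()
    return [name for name in ENTITY_TOPIC if name in lowered]


def page_topics(text):
    """Count one vote per entity for its topic and for that topic's parent."""
    votes = Counter()
    for entity in extract_entities(text):
        topic = ENTITY_TOPIC[entity]
        votes[topic] += 1
        votes[PARENT_TOPIC[topic]] += 1
    return votes


page = "Claude Monet and Edgar Degas painted in Paris; Pablo Picasso came later."
print(page_topics(page).most_common())
```

Three artist entities on the page all vote for the parent topic of art, with impressionist artists as the strongest specific topic. Swap the names out for vague pronouns and the page has nothing to vote with.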

Based on client data I think that the May 2013 Phantom Update was the first application of a combined topic model (aka Hummingbird). Two weeks later it was rolled back and then later reapplied with some adjustments.

Hummingbird refined the topic modeling of sites and pages that is essential to delivering relevant results.

Strings AND Things

Hybrid Car

This doesn't mean that text based analysis has gone the way of the dodo bird. First off, Google still needs text to identify entities. Anyone who thinks that keywords (or perhaps it's easier to call them subjects) in text aren't meaningful is missing the boat.

In almost all cases you don't have as much labeled data as you'd really like.

That's a quote from a great interview with Jeff Dean and while I'm taking the meaning of labeled data out of context I think it makes sense here. Writing properly (using nouns and subjects) will help Google to assign labels to your documents. In other words, make it easy for Google to know what you're talking about.

Google can still infer a lot about what that page is about and return it for appropriate queries by using natural language processing and machine learning techniques. But now they've been able to extract entities, understand the topics to which they refer and then feed that back into the topic model. So in some ways I think Hummingbird allows for a type of recursive topic modeling effort to take place.

If we use the engine metaphor favored by Amit and Danny, Hummingbird is a hybrid engine instead of a combustion or electric only engine.

From Caffeine to Hummingbird

Electrical Outlet with USB and Normal Sockets

One of the head scratching parts of the announcement was the comparison of Hummingbird to Caffeine. The latter was a huge change in the way that Google crawled and indexed data. In large part Caffeine was about the implementation of Percolator (incremental processing), Dremel (ad-hoc query analysis) and Pregel (graph analysis). It was about infrastructure.

So we should be thinking about Hummingbird in the same way. If we believe that Google now wants to use both text and entity based signals to determine quality and relevance they'd need a way to plug both sources of data into the algorithm.

Imagine a hybrid car that didn't have a way to recharge the battery. You might get some initial value out of that hybrid engine but it would be limited. Because once out of juice you'd have to take the battery out and replace it with a new one. That would suck.

Instead, what you need is a way to continuously recharge the battery so the hybrid engine keeps humming along. So you can think of Hummingbird as the way to deliver new sources of data (fuel!) to the search engine.

Right now that new source of data is entities but, as Danny Sullivan points out, it could also be used to bring social data into the engine. I still don't think that's happening right now, but the infrastructure may now be in place to do so.

The algorithms aren't really changing but the amount of data Google can now process allows for greater precision and insight.

Deep Learning

Mr. Fusion Home Reactor

What we're really talking about is a field that is being referred to as deep learning, which you can think of as machine learning on steroids.

This is a really fascinating (and often dense) area that looks at the use of labeled and unlabeled data and the use of supervised and unsupervised learning models. These concepts are somewhat related and I'll try to quickly explain them, though I may mangle the precise definitions. (Scholarly types are encouraged to jump in and provide correction or guidance.)

The vast majority of data is unlabeled, which is a fancy way of saying that it hasn't been classified or doesn't have any context. Labeled data has some sort of classification or identification to it from the start.

Unlabeled data would be the tub of old photographs while labeled data might be the same tub of photographs but with 'Christmas 1982', 'Birthday 1983', 'Joe and Kelly' etc. scrawled in black felt tip on the back of each one. (Here's another good answer to the difference between labeled and unlabeled data.)

Why is this important? Let's return to Jeff Dean (who is a very important figure in my view) to tell us.

You're always going to have 100x, 1000x as much unlabeled data as labeled data, so being able to use that is going to be really important.

The difference between supervised learning and unsupervised learning is similar. Supervised learning means that the model is looking to fit things into a pre-conceived classification. Look at these photos and tell me which of them are cats. You already know what you want it to find. Unsupervised learning on the other hand lets the model find its own classifications.

If I have it right, supervised learning has a training set of labeled data whereas unsupervised learning starts with no labeled training set. All of this is wrapped up in the fascinating idea of neural networks.
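A small sketch might make the distinction easier to see. I'm swapping in two very simple techniques here, a nearest-centroid classifier for the supervised side and plain k-means for the unsupervised side, neither of which is a neural network. The two numeric 'photo features' and all of the data points are invented for illustration.

```python
def centroid(points):
    """Average position of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_supervised(labeled):
    """Supervised: labels are given up front, so learn one centroid per label."""
    groups = {}
    for point, label in labeled:
        groups.setdefault(label, []).append(point)
    return {label: centroid(points) for label, points in groups.items()}


def predict(model, point):
    """Assign the label whose centroid is nearest to the point."""
    return min(model, key=lambda label: dist2(model[label], point))


def cluster_unsupervised(points, k=2, rounds=10):
    """Unsupervised: plain k-means seeded with the first k points, no labels."""
    centers = list(points[:k])
    assignment = []
    for _ in range(rounds):
        assignment = [min(range(k), key=lambda i: dist2(centers[i], p)) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centers[i] = centroid(members)
    return assignment


# Invented features per photo: (ear pointiness, whisker length).
labeled = [((0.9, 0.8), "cat"), ((0.8, 0.9), "cat"),
           ((0.1, 0.2), "not cat"), ((0.2, 0.1), "not cat")]
model = train_supervised(labeled)
print(predict(model, (0.7, 0.75)))

unlabeled = [(0.9, 0.8), (0.1, 0.2), (0.8, 0.9), (0.2, 0.1)]
print(cluster_unsupervised(unlabeled))
```

The supervised half can say 'cat' because someone wrote 'cat' on the back of a few photos. The unsupervised half only reports that there are two groups and which photo falls into which. Naming the groups is a separate step.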

The different models for learning via neural nets, and their variations and refinements, are myriad. Moreover, researchers do not always clearly understand why certain techniques work better than others. Still, the models share at least one thing: the more data available for training, the better the methods work.

The emphasis here is mine because I think it's extremely relevant. Caffeine and Hummingbird allow Google to both use more data and to process that data quickly. Maybe Hummingbird is the ability to deploy additional layers of unsupervised learning across a massive corpus of documents?

And that cat reference isn't just because I like LOLcats. A team at Google (including Jeff Dean) was able to use unsupervised learning on unlabeled data to identify cats (among other things) in YouTube thumbnails (PDF).

So what does this all have to do with Hummingbird? Quite a bit if I'm connecting the dots the right way. Once again I'll refer back to the Jeff Dean interview (which I seem to get something new out of each time I read it).

We're also collaborating with a bunch of different groups within Google to see how we can solve their problems, both in the short and medium term, and then also thinking about where we want to be four years, five years down the road. It's nice to have short-term to medium-term things that we can apply and see real change in our products, but also have longer-term, five to 10 year goals that we're working toward.

Remember at the end of Back to The Future when Doc shows up and implores Marty to come to the future with him? The flux capacitor used to need plutonium to reach critical mass but this time all it takes is some banana peels and the dregs from some Miller Beer in a Mr. Fusion home reactor.

So not only is Hummingbird a hybrid engine but it's hooked up to something that can turn relatively little into a whole lot.

Quantum Computing

So let's take this a little bit further and look at Google's interest in quantum computing. Back in 2009 Hartmut Neven was talking about the use of quantum algorithms in machine learning.

Over the past three years a team at Google has studied how problems such as recognizing an object in an image or learning to make an optimal decision based on example data can be made amenable to solution by quantum algorithms. The algorithms we employ are the quantum adiabatic algorithms discovered by Edward Farhi and collaborators at MIT. These algorithms promise to find higher quality solutions for optimization problems than obtainable with classical solvers.

This seems to have yielded positive results because in May 2013 Google upped the ante and entered into a quantum computer partnership with NASA. As part of that announcement we got some insight into Google's use of quantum algorithms.

We’ve already developed some quantum machine learning algorithms. One produces very compact, efficient recognizers -- very useful when you’re short on power, as on a mobile device. Another can handle highly polluted training data, where a high percentage of the examples are mislabeled, as they often are in the real world. And we’ve learned some useful principles: e.g., you get the best results not with pure quantum computing, but by mixing quantum and classical computing.

A highly polluted set of training data where many examples are mislabeled? Makes you wonder what that might be doesn't it? Link graph analysis perhaps?

Are quantum algorithms part of Hummingbird? I can't be certain. But I believe that Hummingbird lays the groundwork for these types of leaps in optimization.

What About Conversational Search?

Dog Answering The Phone

There's also a lot of talk about conversational search (pun intended). I think many are conflating Hummingbird with the gains in conversational search. Mind you, the basis of voice and conversational search is still machine learning. But Google's focus on conversational search is largely a nod to the future.

We believe that voice will be fundamental to building future interactions with the new devices that we are seeing.

And the first area where they've made advances is the ability to resolve pronouns in query chains.

Google understood my context. It understood what I was talking about. Just as if I was having a conversation with you and talking about the Eiffel Tower, I wouldn't have to keep repeating it over and over again.

Does this mean that Google can resolve pronouns within documents? They're getting better at that (there's a huge corpus of research, actually) but I doubt it's at the level we see in this distinct search microcosm.

Conversational search has a different syntax and demands a slightly different language model to better return results. So Google's betting that conversational search will be the dominant method of searching and is adapting as necessary.
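To make the idea of resolving pronouns in a query chain concrete, here is a toy sketch. It is not how Google does it. The entity list, the pronoun pattern and the `resolve_query_chain` function are all assumptions made up for illustration, and a real system would rely on a knowledge graph and far richer language models.

```python
import re

# Hypothetical, hand-made entity list; a real system would use a knowledge graph.
KNOWN_ENTITIES = ["Eiffel Tower", "Golden Gate Bridge"]
PRONOUNS = re.compile(r"\b(it|he|she|they)\b", re.IGNORECASE)


def resolve_query_chain(queries):
    """Replace pronouns with the most recently mentioned known entity."""
    last_entity = None
    resolved = []
    for query in queries:
        mentioned = [e for e in KNOWN_ENTITIES if e.lower() in query.lower()]
        if mentioned:
            # The query names an entity, so remember it for later queries.
            last_entity = mentioned[-1]
        elif last_entity is not None:
            # No entity named: assume any pronoun refers to the remembered one.
            query = PRONOUNS.sub("the " + last_entity, query)
        resolved.append(query)
    return resolved


chain = ["how tall is the Eiffel Tower", "when was it built", "who designed it"]
print(resolve_query_chain(chain))
```

The point is simply that context carries forward. The second and third queries only make sense because the first one named the Eiffel Tower.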

What Does Hummingbird Do?

What's That Mean Far Field Productions

This seems to be the real conundrum when people look at Hummingbird. If it affects 90% of searches worldwide, why didn't we notice the change?

Hummingbird makes results even more useful and relevant, especially when you ask Google long, complex questions.

That's what Amit says of Hummingbird. I think this makes sense and maps back to the idea of synonyms (which are still quite powerful). But now, instead of taking a long query and applying only word synonyms, Google could also be applying entity synonyms.
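As a rough illustration of what an entity synonym could look like, the sketch below maps different surface forms to the same entity ID. The alias table, the IDs and the `entities_in_query` function are hypothetical, and the matching is deliberately naive (lowercased, whole words, no punctuation handling).

```python
# Hypothetical alias table: surface form -> made-up entity ID.
ENTITY_ALIASES = {
    "big apple": "entity:new_york_city",
    "nyc": "entity:new_york_city",
    "new york city": "entity:new_york_city",
    "picasso": "entity:pablo_picasso",
    "pablo picasso": "entity:pablo_picasso",
}


def entities_in_query(query):
    """Return the set of entity IDs whose aliases appear in the query."""
    padded = " " + query.lower() + " "
    return {
        entity_id
        for alias, entity_id in ENTITY_ALIASES.items()
        if " " + alias + " " in padded
    }


a = entities_in_query("best pizza in the big apple")
b = entities_in_query("where can I find the best pizza in NYC")
print(a == b)  # the two queries share no city words but name the same entity
```

Word synonyms alone would never connect 'big apple' and 'NYC'. Treating both as names for one entity does.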

Understanding the meaning of the query might be more important than the specific words used in the query. It reminds me a bit of Aardvark, which was purchased by Google in February 2010.

Aardvark analyzes questions to determine what they're about and then matches each question to people with relevant knowledge and interests to give you an answer quickly.

I remember using the service and seeing how it would interpret messy questions and then deliver a 'scrubbed' question to potential candidates for answering. There was a good deal of technology at work in the background and I feel like I'm seeing it magnified with Hummingbird.

And it resonates with what Jeff Dean has to say about analyzing sentences.

I think we will have a much better handle on text understanding, as well. You see the very slightest glimmer of that in word vectors, and what we'd like to get to where we have higher level understanding than just words. If we could get to the point where we understand sentences, that will really be quite powerful. So if two sentences mean the same thing but are written very differently, and we are able to tell that, that would be really powerful. Because then you do sort of understand the text at some level because you can paraphrase it.

My take is that 90% of the searches were affected because documents that appear in those results were re-scored or refined through the addition of entity data and the application of machine learning across a larger data set.

It's not that those results have changed but that they have the potential to change based on the new infrastructure in place.

Hummingbird Response

Le homard et le chat

How should you respond to Hummingbird? Honestly, there's not a whole lot to do in many ways if you've been practicing a certain type of SEO.

Despite the advice to simply write like no one's watching, you should make sure your writing is tight and uses subjects that can be identified by people and search engines. "It is a beautiful thing" won't do as well as "Picasso's Lobster and Cat is a beautiful painting".

You'll want to make your content easy to read and remember, link out to relevant and respected sources, build your authority by demonstrating your subject expertise, engage in the type of social outreach that produces true fans and conduct more traditional marketing and brand building efforts.

TL;DR

Hummingbird is an infrastructure change that allows Google to take advantage of additional sources of data, such as entities, as well as leverage new deep learning models that increase the precision of current algorithms. The first application of Hummingbird was the refinement of Google's document topic modeling, which is vital to delivering relevant search results.

Authorship Is Dead, Long Live Authorship

October 24 2013 // SEO // 61 Comments

Google's Authorship program is still a hot topic. A constant string of blog posts, conference sessions and 'research' projects about Authorship and the idea that it can be used as a ranking signal fill our community.

I Do Not Think It Means What You Think It Does

Yet focusing on the actual markup and clock-watching for when AuthorRank might show up may not be the best use of time.

Would it surprise you to learn that the Authorship Project at Google has been shuttered? Or that this signals not the death of Authorship but a different method of assigning Authorship?

Here's my take on where Authorship stands today.

RIP Authorship Project

The Authorship Project at Google was headed up by Othar Hansson. He's an incredibly smart and amiable guy, who from time to time was kind enough to provide answers and insight into Authorship. I was going to reach out to him again the other day and discovered something.

Othar Hansson Google+ About

Othar no longer works on the Authorship Project. He's now a principal engineer on the Android search team, which is a pretty sweet gig. Congratulations!

Remember that it was Othar who announced the new markup back in June of 2011 and then appeared with Matt Cutts in the Authorship Markup video. His departure is meaningful. More so because I can't locate a replacement. (That doesn't mean there isn't one but ... usually I'm pretty good at connecting with folks.)

Not only that but there was no replacement for Sagar Kamdar, who left as product manager of Authorship (among other things) in July of 2012 to work at Google X and, ultimately, Project Loon.

At the time I thought the writing was on the wall. The Authorship Project wasn't getting internal resources and wasn't a priority for Google.

Authorship Adoption

Walter White with his Pontiac Aztek

The biggest problem with Authorship markup is adoption. Not everyone is participating. Study after study after study shows that there are material gaps in who is and isn't using the markup. Even the rosiest study of Authorship adoption by technology writers isn't anything to write home about.

Google is unable to use Authorship as a ranking signal if important authors aren't participating.

That means people like Neil Gaiman and Kevin Kelly wouldn't rank as well since they don't employ Authorship markup. It doesn't take a lot of work to find important people who aren't participating and that makes any type of AuthorRank that relies on markup a non-starter.

Authorship SERP Benefits

Search Result Heatmap For Authorship Snippet

Don't get me wrong. Google still supports Authorship markup and there are clear click-through rate benefits to having an Authorship snippet on a search result. Even if you don't believe me or Cyrus Shepard, you should believe Google and the research they've done on social annotations in 2012 (PDF) and 2013 (PDF).

So if you haven't implemented Google Authorship yet it's still a good idea to do so. You'll receive a higher click-through rate and will build authority (different from AuthorRank), both of which may help you rank better over time.

Google knows users respond to Authorship.

Inferred Authorship

I Know What You Did Last Summer

It's clear that Google still wants to do something about identifying authority and expertise. Any monkey with a keyboard can add content to the Internet. So increasingly it's about who is creating that content and why you should trust and value their opinion.

One of the first ways Google was able to infer identity (aka authorship) was by crawling the public social graph. Rapleaf took the brunt of the backlash for this but Google was quietly mapping all of your social profiles as well.

So even if you don't have Authorship markup on a Quora or Slideshare profile Google probably knows about it and could assign Authorship. All this data used to be available via social circles but Google removed this feature a few years ago. But that doesn't mean Google isn't mining the social graph.

Heck, Google could even employ usernames as a way to identify accounts from the same person. What we're really talking about here is how Google can identify people and their areas of expertise.

Authors are People are Entities

But what if Google took another approach to identifying authors? Instead of looking for specific markup, what if they looked for entities that happen to be people?

Authors are people are entities.

This would solve the adoption issue. And that's what the Freebase Annotations of the ClueWeb Corpora (FACC) seems to indicate.

Identifying Authors in Text

The picture makes it pretty clear in my mind. Here we're seeing that Google has been able to identify an entity (a person in this instance) within the text of a document and match it to a Freebase identifier.

Based on review of a sample of documents, we believe the precision is about 80-85%, and recall, which is inherently difficult to measure in situations like this, is in the range of 70-85%. Not every ClueWeb document is included in this corpus; documents in which we found no entities were excluded from the set. A document might be excluded because there were no entities to be found, because the entities in question weren’t in Freebase, or because none of the entities were resolved at a confidence level above the threshold.

At a glance you might think this means that Google still has a 'coverage' problem if they were to use entities as their approach to Authorship. But think about who is and isn't in Freebase (or Wikipedia). In some ways, these repositories are biased towards those who have achieved some level of notoriety.
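For anyone rusty on the precision and recall figures in that quote, here is a minimal worked example of the standard definitions. The counts are made up for illustration and are not figures from the FACC data.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: share of emitted annotations that are correct.
    Recall: share of real entity mentions that were annotated."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Made-up counts: 85 correct annotations, 15 wrong ones, 30 mentions missed.
p, r = precision_recall(true_positives=85, false_positives=15, false_negatives=30)
print(f"precision={p:.2f} recall={r:.2f}")
```

High precision with lower recall means the annotations that exist are mostly right, but some people mentioned in documents never get matched at all. That is the 'coverage' concern.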

Would Google prefer to rely on self-referring markup or a crowd-based approach to identifying experts?

Google+ Is An Entity Platform

AJ Kohn Cheltenham High School ID

While Google might prefer to use a smaller set of crowdsourced entities to assign Authorship initially, I think they'd ultimately like to have a larger corpus of Authors. That's where Google+ fits into the puzzle.

I think most people understand that Google+ is an identity platform. But if people are entities (and so are companies) then Google+ is a huge entity platform, a massive database of people.

Google+ is the knowledge graph of everyday people.

And if we then harken back to social circles, to mapping the social graph and to measuring engagement and activity, we can begin to see how a comprehensive Authorship program might take shape.

Extract, Match and Measure

Concentration Board Game

Authorship then becomes about Google's ability to extract entities from documents, match those entities to a corpus that contains descriptors of each entity (e.g., social profiles, official pages, subjects) and then measure the activity around that entity.

Perhaps Google could even go so far as to understand triples on a very detailed (document) level, noting which documents I might have authored as well as the documents in which I've been mentioned.

The presence of Authorship markup might increase the confidence level of the match but it will likely play a supporting and refining role instead of the defining role in the process.
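The extract, match and measure steps above can be sketched in a few lines. This is a speculative toy, not Google's method. The entity corpus, the IDs, the confidence weights and the threshold are all invented for illustration. The only idea it tries to capture is that markup nudges the confidence of a match rather than defining it.

```python
from dataclasses import dataclass

# Hypothetical entity corpus: name -> made-up ID and descriptors.
ENTITY_CORPUS = {
    "Neil Gaiman": {"id": "person:neil_gaiman", "subjects": {"fiction", "comics"}},
    "Kevin Kelly": {"id": "person:kevin_kelly", "subjects": {"technology"}},
}


@dataclass
class AuthorMatch:
    entity_id: str
    confidence: float


def extract_people(text):
    """Extract step: find known person names in the document text."""
    return [name for name in ENTITY_CORPUS if name in text]


def match_author(text, topics, markup_author=None, threshold=0.5):
    """Match step: score each extracted person; markup only boosts confidence."""
    best = None
    for name in extract_people(text):
        entity = ENTITY_CORPUS[name]
        confidence = 0.4  # arbitrary base score for a name match
        if entity["subjects"] & topics:
            confidence += 0.3  # document topics overlap the entity's subjects
        if markup_author == name:
            confidence += 0.2  # Authorship markup refines, it does not define
        if best is None or confidence > best.confidence:
            best = AuthorMatch(entity["id"], confidence)
    return best if best and best.confidence >= threshold else None


def measure(entity_id, mentions):
    """Measure step: count documents whose entity set includes this entity."""
    return sum(1 for doc_entities in mentions if entity_id in doc_entities)


doc = "By Kevin Kelly. Thoughts on where technology is heading."
match = match_author(doc, topics={"technology"})
print(match.entity_id)
print(measure(match.entity_id, [{"person:kevin_kelly"}, {"person:neil_gaiman"}]))
```

Notice that the match succeeds without any markup at all, which is the whole point of an entity-based approach to Authorship.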

Trust and Authority

Trust Me Sign

I'm reminded that Google talks frequently about trust and authority. For years that was about how it assessed sites but that same terminology can (and should) be applied to people as well.

Authorship markup is but one part of the equation but that alone won't translate into some magical silver bullet of algorithmic success. Building authority is what will ultimately matter and be reflected in any related ranking signal.

Are the documents you author well regarded by your peers? Are they shared? By whom? How often? With what velocity? And are you mentioned (or cited) by other documents? Do they sit on respected sites? Who are they authored by? What text surrounds your mention?

So part of this is doing the hard work of producing memorable content, marketing yourself and engaging with your community. The other part will be ensuring that your entity information is both comprehensive and up-to-date. That means filling out your entire Google+ profile and potentially finding ways to add yourself to traditional entity resources such as Wikipedia and Freebase.

Just as links are the result and not the goal of your efforts, any sort of AuthorRank will be the result of building your own trust and authority through content and engagement.

TL;DR

The Authorship Project at Google has been abandoned. But that doesn't mean Authorship is dead. Instead it signals a change in tactics from Authorship markup to entity extraction as a way to identify experts and a pathway to using Authorship as a ranking signal.