Algorithm Analysis In The Age of Embeddings

// November 19th 2018 // Analytics + SEO

On August 1st, 2018 an algorithm update took 50% of traffic from a client site in the automotive vertical. An analysis of the update made me certain that the best course of action was … to do nothing. So what happened?

Algorithm Changes Google Analytics

Sure enough, on October 5th, that site regained all of its traffic. Here’s why I was sure doing nothing was the right thing to do and why I dismissed any E-A-T chatter.

E-A-T My Shorts

Eat Pant

I find the obsession with the Google Rating Guidelines to be unhealthy for the SEO community. If you’re unfamiliar with the acronym, E-A-T stands for Expertise, Authoritativeness and Trustworthiness, and it’s central to the published Google Rating Guidelines.

The problem is those guidelines and E-A-T are not algorithm signals. Don’t believe me? Believe Ben Gomes, long-time search quality engineer and new head of search at Google.

“You can view the rater guidelines as where we want the search algorithm to go,” Ben Gomes, Google’s vice president of search, assistant and news, told CNBC. “They don’t tell you how the algorithm is ranking results, but they fundamentally show what the algorithm should do.”

So I am triggered when I hear someone say they “turned up the weight of expertise” in a recent algorithm update. Even if the premise were true, you have to connect that to how the algorithm would reflect that change. How would Google make changes algorithmically to reflect higher expertise?

Google doesn’t have three big knobs in a dark office protected by biometric scanners that allow them to change E-A-T at will.

Tracking Google Ratings

Before I move on I’ll do a deeper dive into quality ratings. I poked around to see if there are material patterns to Google ratings and algorithmic changes. It’s pretty easy to look at referring traffic from the sites that perform ratings.

Tracking Google Ratings in Analytics

The four sites I’ve identified are raterlabs.com, raterhub.com, leapforceathome.com and appen.com. At present there are really only variants of appen.com, which rebranded in the last few months. Either way, create an advanced segment and you can start to see when raters have visited your site.
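As a minimal sketch of that analysis (the domain list is the four above; the CSV and its columns are hypothetical stand-ins for a referral export), you could flag rater sessions like this:

```python
import pandas as pd

# The rater-program referral domains identified above.
RATER_PATTERN = r"(raterlabs|raterhub|leapforceathome|appen)\.com"

# Hypothetical referral export with date, source, landing_page and sessions columns.
referrals = pd.read_csv("referrals.csv")
raters = referrals[referrals["source"].str.contains(RATER_PATTERN, case=False)]

# When did raters visit, and which pages were they evaluating?
print(raters.groupby("date")["sessions"].sum())
print(raters["landing_page"].value_counts())
```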

And yes, these are ratings. A quick look at the referral path makes it clear.

Raters Program Referral Path

The /qrp/ stands for quality rating program and the needs_met_simulator seems pretty self-explanatory.

It can be interesting to then look at the downstream traffic for these domains.

SEMRush Downstream Traffic for Raterhub.com

Go the extra distance and you can determine what page(s) the raters are accessing on your site. Oddly, they generally seem to focus on one or two pages, using them as representatives for quality.

Beyond that, the patterns are hard to tease out, particularly since I’m unsure what tasks are truly being performed. A much larger set of this data across hundreds (perhaps thousands) of domains might produce some insight but for now it seems a lot like reading tea leaves.

Acceptance and Training

The quality rating program has been described in many ways so I’ve always been hesitant to label it one thing or another. Is it a way for Google to see if their recent algorithm changes were effective or is it a way for Google to gather training data to inform algorithm changes?

The answer seems to be yes.

Appen Home Page Messaging

Appen is the company that recruits quality raters. And their pitch makes it pretty clear that they feel their mission is to provide training data for machine learning via human interactions. Essentially, they crowdsource labeled data, which is highly sought after in machine learning.

The question then becomes how much Google relies on and uses this set of data for their machine learning algorithms.

“Reading” The Quality Rating Guidelines

Invisible Ink

To understand how much Google relies on this data, I think it’s instructive to look at the guidelines again. But for me it’s more about what the guidelines don’t mention than what they do mention.

What query classes and verticals does Google seem to focus on in the rating guidelines and which ones are essentially invisible? Sure, the guidelines can be applied broadly, but one has to think about why there’s a larger focus on … say, recipes and lyrics, right?

Beyond that, do you think Google could rely on ratings that cover a microscopic percentage of total queries? Seriously. Think about that. The query universe is massive! Even the query class universe is huge.

And Google doesn’t seem to be adding resources here. Instead, in 2017 they actually cut resources for raters. Now perhaps that’s changed but … I still can’t see this being a comprehensive way to inform the algorithm.

The raters clearly function as a broad acceptance check on algorithm changes (though I’d guess these qualitative measures wouldn’t outweigh the quantitative measures of success) but also seem to be deployed more tactically when Google needs specific feedback or training data for a problem.

Most recently that was the case with the fake news problem. And at the beginning of the quality rater program I’m guessing they were struggling with … lyrics and recipes.

So if we think back to what Ben Gomes says, the way we should be reading the guidelines is about what areas of focus Google is most interested in tackling algorithmically. As such I’m vastly more interested in what they say about queries with multiple meanings and understanding user intent.

At the end of the day, while the rating guidelines are interesting and provide excellent context, I’m looking elsewhere when analyzing algorithm changes.

Look At The SERP

This Tweet by Gianluca resonated strongly with me. There’s so much to be learned after an algorithm update by actually looking at search results, particularly if you’re tracking traffic by query class. Doing so I came to a simple conclusion.

For the last 18 months or so most algorithm updates have been what I refer to as language understanding updates.

This is part of a larger effort by Google around Natural Language Understanding (NLU), sort of a next generation of Natural Language Processing (NLP). Language understanding updates have a profound impact on what type of content is more relevant for a given query.

For those that hang on John Mueller’s every word, you’ll recognize that many times he’ll say that it’s simply about content being more relevant. He’s right. I just don’t think many are listening. They’re hearing him say that, but they’re not listening to what it means.

Neural Matching

The big news in late September 2018 was around neural matching.

But we’ve now reached the point where neural networks can help us take a major leap forward from understanding words to understanding concepts. Neural embeddings, an approach developed in the field of neural networks, allow us to transform words to fuzzier representations of the underlying concepts, and then match the concepts in the query with the concepts in the document. We call this technique neural matching. This can enable us to address queries like: “why does my TV look strange?” to surface the most relevant results for that question, even if the exact words aren’t contained in the page. (By the way, it turns out the reason is called the soap opera effect).

Danny Sullivan went on to refer to them as super synonyms and a number of blog posts sought to cover this new topic. And while neural matching is interesting, I think the underlying field of neural embeddings is far more important.

Watching search results and analyzing keyword trends you can see how the content Google chooses to surface for certain queries changes over time. Seriously folks, there’s so much value in looking at how the mix of content changes on a SERP.

For instance, the query ‘Toyota Camry Repair’ is part of a query class that has fractured intent. What is it that people are looking for when they search this term? Are they looking for repair manuals? For repair shops? For do-it-yourself content on repairing that specific make and model?

Google doesn’t know. So it’s been cycling through these different intents to see which of them performs the best. You wake up one day and it’s repair manuals. A month or so later they essentially disappear.

Now, obviously this isn’t done manually. It’s not even done in a traditional algorithmic sense. Instead it’s done through neural embeddings and machine learning.

Neural Embeddings

Let me first start out by saying that I found a lot more here than I expected as I did my due diligence. Previously, I had done enough reading and research to get a sense of what was happening to help inform and explain algorithmic changes.

And while I wasn’t wrong, I found I was way behind on just how much had been taking place over the last few years in the realm of Natural Language Understanding.

Oddly, one of the better places to start is at the end. Very recently, Google open-sourced something called BERT.

Bert

BERT stands for Bidirectional Encoder Representations from Transformers and is a new technique for NLP pre-training. Yeah, it gets dense quickly. But the following excerpt helped put things into perspective.

Pre-trained representations can either be context-free or contextual, and contextual representations can further be unidirectional or bidirectional. Context-free models such as word2vec or GloVe generate a single word embedding representation for each word in the vocabulary. For example, the word “bank” would have the same context-free representation in “bank account” and “bank of the river.” Contextual models instead generate a representation of each word that is based on the other words in the sentence. For example, in the sentence “I accessed the bank account,” a unidirectional contextual model would represent “bank” based on “I accessed the” but not “account.” However, BERT represents “bank” using both its previous and next context — “I accessed the … account” — starting from the very bottom of a deep neural network, making it deeply bidirectional.

I was pretty well-versed in how word2vec worked but I struggled to understand how intent might be represented. In short, how would Google be able to change the relevant content delivered on ‘Toyota Camry Repair’ algorithmically?  The answer is, in some ways, contextual word embedding models.
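To see what that means in practice, here’s a minimal sketch (my illustration, not anything Google has said about production search) using the Hugging Face transformers wrapper around the open-sourced BERT model to pull out the two different vectors for “bank”:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def bank_vector(sentence: str) -> torch.Tensor:
    """Return the contextual embedding of the token 'bank' in a sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return outputs.last_hidden_state[0, tokens.index("bank")]

v1 = bank_vector("I accessed the bank account.")
v2 = bank_vector("We sat on the bank of the river.")

# Same word, two different vectors; a context-free model like word2vec
# would hand back the identical vector for both sentences.
print(torch.cosine_similarity(v1, v2, dim=0).item())
```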

Vectors

None of this may make sense if you don’t understand vectors. I believe many, unfortunately, run for the hills when the conversation turns to vectors. I’ve always referred to vectors as ways to represent words (or sentences or documents) via numbers and math.

I think these two slides from a 2015 Yoav Goldberg presentation on Demystifying Neural Word Embeddings do a better job of describing this relationship.

Words as Vectors

So you don’t have to fully understand the verbiage of “sparse, high dimensional” or the math behind cosine distance to grok how vectors work and how they can reflect similarity.

You shall know a word by the company it keeps.

That’s a famous quote from John Rupert Firth, a prominent linguist, and it’s the general idea we’re getting at with vectors.
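If it helps, here’s the whole idea in a few lines of Python; the three-dimensional “embeddings” are made-up toy numbers, purely for illustration:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions).
dog = np.array([0.8, 0.3, 0.1])
puppy = np.array([0.7, 0.4, 0.1])
car = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(dog, puppy))  # high: these words keep similar company
print(cosine_similarity(dog, car))    # low: they appear in different contexts
```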

word2vec

In 2013, Google open-sourced word2vec, which was a real turning point in Natural Language Understanding. I think many in the SEO community saw this initial graph.

Country to Capital Relationships

Cool right? In addition there was some awe around vector arithmetic where the model could predict that [King] – [Man] + [Woman] = [Queen]. It was a revelation of sorts that semantic and syntactic structures were preserved.

Or in other words, vector math really reflected natural language!
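You can reproduce that arithmetic yourself. Here’s a sketch using gensim’s downloader, one convenient way to load the released Google News model:

```python
import gensim.downloader as api

# Pre-trained word2vec vectors trained on the Google News corpus.
model = api.load("word2vec-google-news-300")

# [King] - [Man] + [Woman] ≈ [Queen]
print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# Words that keep similar company sit close together.
print(model.similarity("repair", "fix"))     # high
print(model.similarity("repair", "banana"))  # low
```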

What I lost track of was how the NLU community began to unpack word2vec to better understand how it worked and how it might be fine-tuned. A lot has happened since 2013 and I’d be thunderstruck if much of it hadn’t worked its way into search.

Context

These 2014 slides about Dependency Based Word Embeddings really drive the point home. I think the whole deck is great but I’ll cherry-pick to help connect the dots and along the way try to explain some terminology.

The example used is looking at how you might represent the word ‘discovers’. Using a bag of words (BoW) context with a window of 2 you only capture the two words before and after the target word. The window is the number of words around the target that will be used to represent the embedding.

Word Embeddings using BoW Context

So here, telescope would not be part of the representation. But you don’t have to use a simple BoW context. What if you used another method to create the context or relationship between words? Instead of simple words-before and words-after, what if you used syntactic dependency, a type of representation of grammar?

Embedding based on Syntactic Dependency

Suddenly telescope is part of the embedding. So you could use either method and you’d get very different results.

Embeddings Using Different Contexts

Syntactic dependency embeddings induce functional similarity. BoW embeddings induce topical similarity. While this specific case is interesting, the bigger epiphany is that embeddings can change based on how they are generated.
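Here’s a small sketch of those two context definitions, with spaCy’s dependency parser standing in for the parser used in the slides (exact labels depend on the model):

```python
import spacy  # assumes the en_core_web_sm model has been downloaded

nlp = spacy.load("en_core_web_sm")
doc = nlp("Australian scientist discovers star with telescope")
target = next(t for t in doc if t.text == "discovers")

# Bag-of-words context with a window of 2: neighbors by position only.
window = 2
bow_context = [t.text for t in doc
               if t.i != target.i and abs(t.i - target.i) <= window]
print(bow_context)  # ['Australian', 'scientist', 'star', 'with'] -- no telescope

# Syntactic dependency context: words grammatically attached to the target,
# collapsing prepositions the way the slides do.
dep_context = []
for child in target.children:
    if child.dep_ == "prep":
        dep_context += [(gc.text, "prep_" + child.text) for gc in child.children]
    else:
        dep_context.append((child.text, child.dep_))
print(dep_context)  # telescope now shows up, e.g. ('telescope', 'prep_with')
```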

Google’s understanding of the meaning of words can change.

The context definition is one way, the size of the window is another, and the type of text you use for training or the amount of text you train on are others; all of these might influence the embeddings. And I’m certain there are other ways that I’m not mentioning here.

Beyond Words

Words are building blocks for sentences. Sentences building blocks for paragraphs. Paragraphs building blocks for documents.

Sentence vectors are a hot topic as you can see from Skip Thought Vectors in 2015 to An Efficient Framework for Learning Sentence Representations, Universal Sentence Encoder and Learning Semantic Textual Similarity from Conversations in 2018.

Universal Sentence Encoders

Google (Tomas Mikolov in particular before he headed over to Facebook) has also done research in paragraph vectors. As you might expect, paragraph vectors are in many ways a combination of word vectors.

In our Paragraph Vector framework (see Figure 2), every paragraph is mapped to a unique vector, represented by a column in matrix D and every word is also mapped to a unique vector, represented by a column in matrix W. The paragraph vector and word vectors are averaged or concatenated to predict the next word in a context. In the experiments, we use concatenation as the method to combine the vectors.

The paragraph token can be thought of as another word. It acts as a memory that remembers what is missing from the current context – or the topic of the paragraph. For this reason, we often call this model the Distributed Memory Model of Paragraph Vectors (PV-DM).

The knowledge that you can create vectors to represent sentences, paragraphs and documents is important. But it’s more important if you think about the prior example of how those embeddings can change. If the word vectors change then the paragraph vectors would change as well.

And that’s not even taking into account the different ways you might create vectors for variable-length text (aka sentences, paragraphs and documents).
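gensim ships an implementation of this paper too; here’s a toy sketch of the PV-DM model on made-up documents:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [
    "toyota camry repair manual download",
    "find a toyota camry repair shop near you",
    "replace the brake pads on a toyota camry yourself",
]
tagged = [TaggedDocument(words=d.split(), tags=[i]) for i, d in enumerate(docs)]

# dm=1 selects the Distributed Memory model (PV-DM) from the paper.
model = Doc2Vec(tagged, dm=1, vector_size=50, min_count=1, epochs=100)

# Every document now has a vector; similar intents land closer together.
print(model.dv.similarity(0, 1))
print(model.dv.similarity(0, 2))
```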

Neural embeddings will change relevance no matter what level Google is using to understand documents.

Questions

But Why?

You might wonder why there’s such a flurry of work on sentences. Thing is, many of those sentences are questions. And the amount of research around question answering is at an all-time high.

This is, in part, because the data sets around Q&A are robust. In other words, it’s really easy to train and evaluate models. But it’s also clearly because Google sees the future of search in conversational search platforms such as voice and assistant search.

Apart from the research, or the increasing prevalence of featured snippets, just look at the title Ben Gomes holds: vice president of search, assistant and news. Search and assistant are being managed by the same individual.

Understanding Google’s structure and current priorities should help future-proof your SEO efforts.

Relevance Matching and Ranking

Obviously you’re wondering if any of this is actually showing up in search. Now, even without finding research that supports this theory, I think the answer is clear given the amount of time since word2vec was released (5 years), the focus on this area of research (Google Brain has an area of focus on NLU) and advances in technology to support and productize this type of work (TensorFlow, Transformer and TPUs).

But there is plenty of research that shows how this work is being integrated into search. Perhaps the easiest is one others have mentioned in relation to Neural Matching.

DRMM with Context Sensitive Embeddings

The highlighted part makes it clear that this model for matching queries and documents moves beyond context-insensitive encodings to rich context-sensitive encodings. (Remember that BERT relies on context-sensitive encodings.)

Think for a moment about how the matching model might change if you swapped the BoW context for the Syntactic Dependency context in the example above.
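To make that concrete, here’s a crude sketch of embedding-based matching: nothing like the actual DRMM model, just averaged GloVe vectors for illustration. The point is that a document can match a query even when the exact words don’t overlap:

```python
import numpy as np
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-100")  # small pre-trained vectors

def text_vector(text: str) -> np.ndarray:
    """Crude text embedding: the mean of the word vectors."""
    words = [w for w in text.lower().split() if w in model]
    return np.mean([model[w] for w in words], axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = text_vector("why does my tv look strange")
doc_a = text_vector("motion smoothing on modern televisions causes the soap opera effect")
doc_b = text_vector("best pizza restaurants in philadelphia")

print(cosine(query, doc_a))  # higher, despite little exact word overlap
print(cosine(query, doc_b))  # lower
```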

Frankly, there’s a ton of research around relevance matching that I need to catch up on. But my head is starting to hurt and it’s time to bring this back down from the theoretical to the observable.

Syntax Changes

I became interested in this topic when I saw certain patterns emerge during algorithm changes. A client might see a decline in a page type but within that page type some increased while others decreased.

The disparity there alone was enough to make me take a closer look. And when I did I noticed that many of those pages that saw a decline didn’t see a decline in all keywords for that page.

Instead, I found that a page might lose traffic for one query phrase but then gain back part of that traffic on a very similar query phrase. The difference between the two queries was sometimes small but clearly enough that Google’s relevance matching had changed.

Pages suddenly ranked for one type of syntax and not another.

Here’s one of the examples that sparked my interest in August of 2017.

Query Syntax Changes During Algorithm Updates

This page saw both losers and winners from a query perspective. We’re not talking small disparities either. They lost a lot on some but saw a large gain in others. I was particularly interested in the queries where they gained traffic.

Identifying Syntax Winners

The queries with the biggest percentage gains had modifiers of ‘coming soon’ and ‘approaching’. I considered those synonyms of sorts and came to the conclusion that this page (document) was now a better match for these types of queries. Even the gains in terms with the word ‘before’ might match those other modifiers from a loose syntactic perspective.

Did Google change the context of their embeddings? Or change the window? I’m not sure but it’s clear that the page is still relevant to a constellation of topical queries but that some are more relevant and some less based on Google’s understanding of language.

Most recent algorithm updates seem to be changes in the embeddings used to inform the relevance matching algorithms.

Language Understanding Updates

If you believe that Google is rolling out language understanding updates then the rate of algorithm changes makes more sense. As I mentioned above there could be numerous ways that Google tweaks the embeddings or the relevance matching algorithm itself.

Not only that but all of this is being done with machine learning. The update is rolled out and then there’s a measurement of success based on time to long click or how quickly a search result satisfies intent. The feedback or reinforcement learning helps Google understand if that update was positive or negative.

One of my recent vague Tweets was about this observation.

Or the dataset that feeds an embedding pipeline might update and the new training model is then fed into the system. This could also be vertical specific, since Google might utilize vertical specific embeddings.

August 1 Error

Based on that last statement you might think that I thought the ‘medic update’ was aptly named. But you’d be wrong. I saw nothing in my analysis that led me to believe that this update was utilizing a vertical specific embedding for health.

The first thing I do after an update is look at the SERPs. What changed? What is now ranking that wasn’t before? This is the first way I can start to pick up the ‘scent’ of the change.

There are times when you look at the newly ranked pages and, while you may not like it, you can understand why they’re ranking. That may suck for your client but I try to be objective. But there are times you look and the results just look bad.

Misheard Lyrics

The new content ranking didn’t match the intent of the queries.

I had three clients who were impacted by the change and I simply didn’t see how the newly ranked pages would effectively translate into better time to long click metrics. By my way of thinking, something had gone wrong during this language update.

So I wasn’t keen on running around making changes for no good reason. I’m not going to optimize for a misheard lyric. I figured the machine would eventually learn that this language update was sub-optimal.

It took longer than I’d have liked but sure enough on October 5th things reverted back to normal.

August 1 Updates

Where's Waldo

However, there were two things included in the August 1 update that didn’t revert. The first was the YouTube carousel. I’d call it the Video carousel but it’s overwhelmingly YouTube so let’s just call a spade a spade.

Google seems to think that the intent of many queries can be met by video content. To me, this is an overreach. I think the idea behind this unit is the old “you’ve got chocolate in my peanut butter” philosophy but instead it’s more like chocolate in mustard. When people want video content they … go search on YouTube.

The YouTube carousel is still present but its footprint is diminishing. That said, it’ll suck a lot of clicks away from a SERP.

The other change was far more important and is still relevant today. Google chose to match question queries with documents that matched more precisely. In other words, longer documents receiving questions lost out to shorter documents that matched that query.

This did not come as a surprise to me since the user experience is abysmal for questions matching long documents. If the answer to your question is in the 8th paragraph of a piece of content you’re going to be really frustrated. Google isn’t going to anchor you to that section of the content. Instead you’ll have to scroll and search for it.

Playing hide and go seek for your answer won’t satisfy intent.

This would certainly show up in engagement and time to long click metrics. However, my guess is that this was a larger refinement where documents that produced multiple vector matches for a query were scored lower than those with fewer, tighter matches. Essentially, content that was more focused would score better.
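Purely to make that hypothesis concrete (this is my sketch of the idea, not anything Google has published), you could imagine scoring ‘focus’ as the share of a document’s passages that match the query:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def focus_score(query_vec: np.ndarray, passage_vecs: list, threshold: float = 0.6) -> float:
    """Fraction of a document's passages that match the query.

    A short answer page with one strongly matching passage scores 1.0;
    a long document where the answer sits in passage 8 of 20 scores low.
    """
    matches = [p for p in passage_vecs if cosine(query_vec, p) >= threshold]
    return len(matches) / len(passage_vecs)
```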

Am I right? I’m not sure. Either way, it’s important to think about how these things might be accomplished algorithmically. More important in this instance is how you optimize based on this knowledge.

Do You Even Optimize?

So what do you do if you begin to embrace this new world of language understanding updates? How can you, as an SEO, react to these changes?

Traffic and Syntax Analysis

The first thing you can do is analyze updates more rationally. Time is a precious resource so spend it looking at the syntax of terms that gained and lost traffic.

Unfortunately, many of the changes happen on queries with multiple words. This would make sense since understanding and matching those long-tail queries would change more based on the understanding of language. Because of this, many of the updates result in material ‘hidden’ traffic changes.

All those queries that Google hides because they’re personally identifiable are ripe for change.

That’s why I spent so much time investigating hidden traffic. With that metric, I could better see when a site or page had taken a hit on long-tail queries. Sometimes you could make predictions on what type of long-tail queries were lost based on the losses seen in visible queries. Other times, not so much.

Either way, you should be looking at the SERPs, tracking changes to keyword syntax, checking on hidden traffic and doing so through the lens of query classes if at all possible.
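In practice that analysis can be as simple as diffing Search Console exports from before and after an update. A sketch, with hypothetical file names and columns:

```python
import pandas as pd

# Hypothetical Search Console query exports: query, page, clicks.
before = pd.read_csv("gsc_queries_before.csv")
after = pd.read_csv("gsc_queries_after.csv")

# Per-query click deltas: which syntax won and which lost?
delta = (after.groupby("query")["clicks"].sum()
         .subtract(before.groupby("query")["clicks"].sum(), fill_value=0)
         .sort_values())
print(delta.head(10))  # biggest losers
print(delta.tail(10))  # biggest winners

# 'Hidden' traffic proxy: page-level clicks minus the clicks attributed
# to visible queries for that page.
pages = pd.read_csv("gsc_pages_after.csv")  # hypothetical page export: page, clicks
visible = after.groupby("page")["clicks"].sum()
hidden = pages.set_index("page")["clicks"].subtract(visible, fill_value=0)
print(hidden.sort_values(ascending=False).head(10))
```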

Content Optimization

This post is quite long and Justin Briggs has already done a great job of describing how to do this type of optimization in his On-page SEO for NLP post. How you write is really, really important.

My philosophy of SEO has always been to make it as easy as possible for Google to understand content. A lot of that is technical but it’s also about how content is written, formatted and structured. Sloppy writing will lead to sloppy embedding matches.

Look at how your content is written and tighten it up. Make it easier for Google (and your users) to understand.

Intent Optimization

Generally you can look at a SERP and begin to classify each result in terms of what intent it might meet or what type of content is being presented. Sometimes it’s as easy as informational versus commercial. Other times there are different types of informational content.

Certain query modifiers may match a specific intent. In its simplest form, a query with ‘best’ likely requires a list format with multiple options. But it could also be the knowledge that the mix of content on a SERP changed, which would point to changes in what intent Google felt was more relevant for that query.

If you follow the arc of this story, that type of change is possible if something like BERT is used with context-sensitive embeddings that are receiving reinforcement learning from SERPs.

I’d also look to see if you’re aggregating intent. Satisfy active and passive intent and you’re more likely to win. At the end of the day it’s as simple as ‘target the keyword, optimize the intent’. Easier said than done I know. But that’s why some rank well and others don’t.

This is also the time to use the rater guidelines (see, I’m not saying to write them off completely) to make sure you’re meeting the expectations of what ‘good content’ looks like. If your main content is buried under a whole bunch of cruft you might have a problem.

Much of what I see in the rater guidelines is about capturing attention as quickly as possible and, once captured, optimizing that attention. You want to mirror what the user searched for so they instantly know they got to the right place. Then you have to convince them that it’s the ‘right’ answer to their query.

Engagement Optimization

How do you know if you’re optimizing intent? That’s really the $25,000 question. It’s not enough to think you’re satisfying intent. You need some way to measure that.

Conversion rate can be one proxy. So too can bounce rate, to some degree. But there are plenty of one page sessions that satisfy intent. The bounce rate on a site like StackOverflow is super high. But that’s because of the nature of the queries and the exactness of the content. I still think measuring adjusted bounce rate over a long period of time can be an interesting data point.

I’m far more interested in user interactions. Did they scroll? Did they get to the bottom of the page? Did they interact with something on the page? These can all be tracked in Google Analytics as events and the total number of interactions can then be measured over time.

I like this in theory but it’s much harder to do in practice. First, each site is going to have different types of interactions so it’s never an out of the box type of solution. Second, sometimes having more interactions is a sign of bad user experience. Mind you, if interactions are up and so too is conversion then you’re probably okay.

Yet, not everyone has a clean conversion mechanism to validate interaction changes. So it comes down to interpretation. I personally love this part of the job since it’s about getting to know the user and defining a mental model. But very few organizations embrace data that can’t be validated with a p-value.

Those who are willing to optimize engagement will inherit the SERP.

There are just too many examples where engagement is clearly a factor in ranking. Whether it be a site ranking for a competitive query with just 14 words or a root term where low engagement has produced a SERP geared for a highly engaging modifier term instead.

Those bound by fears around ‘thin content’ as it relates to word count are missing out, particularly when it comes to Q&A.

TL;DR

Recent Google algorithm updates are changes to their understanding of language. Instead of focusing on E-A-T, which is not a set of algorithmic factors, I urge you to look at the SERPs and analyze your traffic, including the syntax of the queries.


Comments About Algorithm Analysis In The Age of Embeddings

// 50 comments so far.

  1. Dylan // November 19th 2018

    Great post. Need to continue my research on NLP. Thanks.

  2. AJ Kohn // November 19th 2018

    Thanks Dylan. And I too need to keep up on the research. I’m also hoping to leverage some of the pre-trained embeddings for analysis. It’s a steep learning curve but I think it’s important to gain additional understanding of how this all might be working. It’s only going to get more complex.

  3. john andrews // November 19th 2018

    Nice. Now do how Brand skews all of this, where Google decides it matters. IMHO like a wrench thrown into a grinder.

    Theory is awesome. But also super frustrating.

  4. AJ Kohn // November 19th 2018

    Brand affinity is certainly important and can show up in users selecting branded content (content from that brand) over others in a SERP. That type of reinforcement learning winds up helping that content improve their position.

    There’s also a good deal Google can do in determining how often certain query syntax is used with a brand. Essentially, if your brand is mentioned frequently with a certain query then it’s more relevant for that query.

    You could conceivably create a neural embedding using just the corpus of search queries (perhaps for a year) and then measure how similar certain brands were to certain queries using cosine distance.

  5. Tom Dehnel // November 19th 2018

    Loved reading this, thanks for putting in the work. Thing about SEO is: as it gets more complex, it also gets simpler. Make things people want to read (or listen to, or watch, etc.)

  6. AJ Kohn // November 19th 2018

    Yes and no. I think saying that is easy but when people sit down to create something they botch it – and badly.

    You tell someone to write for the user and you get someone trying to write Hemingway in article form. It’s awful. Writing for the web is a different skill and few seem to do it well.

    An easy analogy – if it was easy to write a blockbuster movie then why isn’t every movie a blockbuster?

  7. John Locke // November 19th 2018

    Thanks for this research, AJ. Understanding search intent will continue to become more important, especially as the algorithm understands how different factors fit together better. (word relations, content, brand affinity, etc).

    It’s going to become increasingly harder to make unsatisfying content rank.

  8. AJ Kohn // November 20th 2018

    Thank you for the kind words. And to a large degree I think you’re right. Content that doesn’t truly satisfy intent is going to get harder and harder to rank.

    My only concern is that Google’s window for satisfaction currently seems limited to one session. So ‘rage-quit’ abandonments might be seen as satisfaction right now. But move to a multi-session view and you might find users re-attempting that query.

    In addition, you might find that the first satisfied result is … incorrect. By that I mean, you may think you’ve got the answer, attempt it in the real world, find it doesn’t work and thus query again.

  9. Emma Russell // November 20th 2018

    This was a great piece of research, thank you for taking the time to write it all down and share it.

    I’m wondering now about Google’s understanding of language and their increasing focus on entities. Do you think that stipulating entities (e.g. within schema) would have much bearing on how Google understands and ranks a page or do you think that NLP would be the primary factor being taken into consideration?

  10. AJ Kohn // November 20th 2018

    Thanks for the comment Emma. Entities are still a large part of the picture for Google and can be integrated into different embeddings or used in combination with other techniques.

    I still think that entity extraction (from the content of the page) is likely more important given the sparse coverage of sites using schema. But why not give Google a cheat sheet and make it easier for them!

  11. Neil Dickson SEO // November 20th 2018

    Great post as always – I’ve experienced the same- drop followed by recovery – in the health sector.

  12. AJ Kohn // November 20th 2018

    Thanks Neil and glad you saw the same recovery.

  13. David Amerland // November 20th 2018

    Brilliant! There is an ongoing struggle between what Google understands and what website owners do (and the SEOs who help them) and it’s never going to be resolved permanently. But by understanding the direction of Google search the SEO efforts can be funneled into something that delivers the best match possible between a query and a searcher. The importance of relevance has never been made more explicit.

  14. AJ Kohn // November 20th 2018

    Thanks David. And yes, Google’s aim is to emulate the human evaluation of content and that’s just dreadfully difficult to accomplish. But they’re getting far better at it than even I realized. While I think there’ll be bumps in the road (misheard lyrics from time to time) Google seems to have the right mechanism to continually deepen their knowledge of relevance.

  15. Rowan Collins // November 20th 2018

    Heya,

    An excellent article, the tone started a bit cocky but ended up being full of great insight.

    I mainly switched on when you called out the people ignoring John Mueller. At this point, it’s so cliché to ignore him that he could tell everyone exactly how the algorithm works and they still wouldn’t believe him.

    I’ve found in recent times that SERP analysis and competitor analysis is one of the best ways to rank websites. More often than not I’m having to remove irrelevant content rather than add generic content. I suspect you find similar results?

    I’m always interested in this question:

    How much would you break it down to on-site versus link building out of 100?

    But for you, I would also love to hear what your thoughts are about how much content contributes to the on-site evaluation, as opposed to other metrics such as perceptual loading and crawl/render optimisations.

    Would be great to chat some time.

    Kind Regards

  16. AJ Kohn // November 20th 2018

    Rowan,

    Cocky is an adjective I’m not unfamiliar with having ascribed to me. But I try to back it up with logic to support why I feel so strongly.

    John is a great resource and you can learn a lot by really listening to what he’s saying. Now, sometimes I think the theoretical doesn’t match the practical, so not all advice winds up being ‘true’. But it all comes from a good place.

    In terms of irrelevant content. Yes. Removing that from a single piece or pages from a site can be beneficial. Much of my work in the last few years has been around corpus control and reducing the number of pages. That can be content driven but also has benefits from a crawl and internal link perspective.

    As for the other questions. Links will get you to the first page but relevance, intent optimization and engagement will get you to the top of page one. And while speed is important (particularly to conversion and engagement) content is vastly more important. If speed were truly a large factor we’d see a very different SERP.

  17. Lyndon NA (Autocrat) // November 20th 2018

    More than worth the time to read 😀

    Love the “is it A or B, Yes” line – more than true!

    People have been told for Years that G is looking to try to understand and comprehend. The problem has always been the balance of cost vs gain. With advances in resources and approaches that can utilise them, breaking away from traditional NLP methods (that are, in all honesty, out of date by decades), and pushing deeper into statistical analysis, G have made massive headway.

    No more need for man-made ontologies or synsets – the machine(s) do the associations and the cullings. The squishes just approve the edge-cases 😀

    As for the whole GQRGs – people really do seem to be, well, stupid.
    G has been trying to hide the secret sauce for years.
    Do people really think they would have spelt it out in a flaming handbook?
    I think it was initially a performance-audit guide, and later became additional feed for reiterative improvements.
    What the SEO community seems to have missed is the level(s) of segregation G has for types of content/topics.
    Not all pages are treated the same.

    Now let’s see how long it takes the SEO community to mangle NLP/NLU.
    Based on the sheer number of them that struggle to write a comprehensive sentence … I can only see good coming from it 😀
    (It’s hard to learn NLP without getting better at spelling, grammar and syntax :D)

    +

    Most important of all … good to see you and your content again 😀

  18. AJ Kohn // November 20th 2018

    This comment literally made me grin as I read it Lyndon. Not only do I think you’re spot on with your insight into how far Google has progressed, you were able to say things more bluntly than I.

    In many ways I view the GQRG as a public relations vehicle. “But look, people are reviewing the algorithm!” It makes the masses feel better about the results than knowing it’s all just cold machine learning.

    Hopefully it won’t be another ten months until my next blog post.

  19. James Dooley // November 20th 2018

    Best article I have read for a long time. We have created a few gap analysis tools for keyword research using the neural cluster methods and it’s worked a treat

  20. AJ Kohn // November 20th 2018

    Thanks James and that tool sounds pretty interesting. Glad folks are exploring with the technology. I know I need to do more there myself – going from academic to practical application.

  21. Victor Pan // November 20th 2018

    For head terms that have signs of fractured intent, I feel like there are just two playbooks an internet marketer can take:

    1. Be like Apple and tell them no, you don’t need a tablet. You need an iPad. Heck there’s no mention of tablets anywhere on your page, but you keep marketing your iPads until folks search for iPads > tablets. Sadly I’ve been trying to get my kids to call their Kindle Fire ‘tablets’ but they’re still iPads to them (thanks Mom).

    2. Empathize with the searcher deeply. Dig into the SERPs, the content out there, the obscure web communities, and find the root problems in a given head query. Once that’s done, provide the best format that answers the problem, ideally in a way where a competitor can’t replicate or Google take over via a vertical search feature (think hotels, flights, jobs).

    Is there a third (fourth/fifth/sixth…) approach I’m missing here?

  22. AJ Kohn // November 20th 2018

    Thanks for the comment Victor and for talking about a favorite subject: fractured intent.

    Option 2 can also be a decision tree of sorts where you immediately identify the types of intent and create pathways for each. So maybe that’s Option 2 or perhaps 2A.

    Option 2B would be to do some of that but do it with user behavior (heatmaps etc.) and get really good at optimizing engagement.

    Option 3 is to essentially abandon the root term, do all that research and develop specific content to capture all of the specific intent.

    Option 4 is to simply optimize to be the first results for that intent/content match in a SERP. So if it’s a local query and you have a national directory you want to be the first national directory on the page.

  23. James Svoboda // November 20th 2018

    Great article AJ. This piece said a lot. I’ve seen too many high ranking long-form pieces of content in the last year or so that answer the query, but have a horrible user experience for the query.

    “The other change was far more important and is still relevant today. Google chose to match question queries with documents that matched more precisely. In other words, longer documents receiving questions lost out to shorter documents that matched that query.

    This did not come as a surprise to me since the user experience is abysmal for questions matching long documents. If the answer to your question is in the 8th paragraph of a piece of content you’re going to be really frustrated. Google isn’t going to anchor you to that section of the content. Instead you’ll have to scroll and search for it.”

  24. AJ Kohn // November 20th 2018

    Thanks James. And … I think some of the reason the long-form content may still be out there is SEOs are terrified of ‘thin content’. A well executed Q&A strategy (using leaf pages) can produce massive results right now.

  25. Abby R. // November 20th 2018

    This piece has both unsettled and excited me. I’ve been investing plenty of time in E-A-T optimization over the last few weeks, and I have a couple questions for you.

    1. Are you saying that all E-A-T optimization is not worthwhile? We had a health client that got massively hit and saw without a doubt from the data that certain terms were the culprit e.g. “fat”, “weight loss”, etc. Things that would be harmful if someone other than a doctor recommended them. The site that gained all of these rankings was Healthline, and all of their content comes from authoritative sources. This, to me, says that authority is taken into account in some way in the algorithm, not just in the QRG.

    2. If you don’t believe that the update is accurately referred to as “Medic”, how do you explain why the health industry (or YMYL sites in general) were shown by data to be the most impacted?

  26. AJ Kohn // November 20th 2018

    Thanks for the comment Abby. I don’t provide recommendations around E-A-T to clients. Instead I talk about content, intent and engagement.

    As I mention in the piece, if you’re burying main content below a ton of horrible ad units and the GQRG helps you realize that’s bad then … great. But each vertical, heck each query class, demands different content, both in substance and presentation.

    I know a bit more about Healthline and know they have a very specific strategy in place, which has been wildly effective. I’m unable to go into details due to confidentiality but it’s more aligned with syntax matching than anything else.

    As for an explanation of the data around “medic”, the key point here is that we don’t have enough data to come to that ‘most impacted’ conclusion. We don’t have data by vertical. We have scores or maybe hundreds of sites that self-reported. There’s a lot of bias there. Are health sites more likely to self-report? Have more SEO attention?

    There are many instances of this update impacting non-health sites. Was health hit, sure. But labeling it “medic” leads many to the wrong conclusion IMO.

  27. Michael Rebueno // November 20th 2018

    Awesome insight, will take a closer look. E-A-T is hot this month. Noticed some movement last Friday as well. Thanks!

  28. Chris Labbate // November 20th 2018

    Thanks for the article AJ, truly enjoyed every word. I was doing some research on algo updates / the myth of LSI keywords and the roadmap went something like this…

    -Ahrefs recent blog post on On-page
    -SEO by the SEA (Does Google Use Latent Semantic Indexing?)
    -25 Facebook SEO Group Posts (SEO Signal Labs was the one to mention this)
    -Your Post 🙂

    I have now, given up my day job and seo agency to read the rest of your website.

    All my best in the Future,
    Chris

  29. AJ Kohn // November 21st 2018

    Thanks for the comment Chris. I’ve been thinking about collecting some of the more important pieces for folks who are new readers. Let me know if you think that would be helpful after reading more.

  30. Emma Russell // November 21st 2018

    One other comment, just spitballing here, you mention that Google can’t measure authority algorithmically, but using the logic above, would Google not be able to match people to words commonly associated with them and in doing so associate that person with a topic? So e.g. a medical practitioner who writes articles with a bio would commonly be associated with various qualifications etc. And if that person were also to have a unique entity then I’m sure Google would be able to process the ability of that person to write authoritatively on that topic.

    This would possibly reconcile the fact that once people “optimised for EAT”, they saw better results.

    Could be way off, let me know what you think!

  31. AJ Kohn // November 21st 2018

    The ability to assign authority to people is difficult and one that Google has approached at least once if not twice. You might remember their experiment around rel=author, which if it had gained wide adoption might have worked. Alas, it did not and it failed.

    Beyond that, there’s a huge issue around disambiguation, sentiment and popularity. Is Dr. Oz an expert medical practitioner? Some might think so though I think he’s a quack. If I gain popularity for the volume of my work am I more or less an authority than someone who doesn’t write frequently?

    I think it’s pretty clear that Google can extract entities (i.e. – people) from text. They can even categorize them by field. But understanding their authority is … highly nuanced.

    Instead, it’s usually easier for Google to understand those dimensions through an analysis of the text. Do they refer to it as a nose job or rhinoplasty? Or there’s even me, I might use some of the terms related to Information Retrieval but the real IR folks have a much different syntax than I and that’s how Google could likely tell who is more authoritative on that topic.

    Hopefully that makes sense.

  32. Eric // November 21st 2018

    Awesome analysis, AJ. I kind of understand what you’re trying to say. I’ll probably re-read the article over the weekend to fully absorb it.

    How would you explain sites that have lost massive amounts of traffic since recent updates, like 90% of traffic? If the recent algo changes are purely limited to query syntax, that would imply these sites do an extremely piss poor job at query syntax matching.

    Also, could you please share some insights about your example above: why do you think Google chose to de-rank for “signs of death” but rank for “signs of death coming soon”? Without knowing the actual article, it looks like the two queries might have the same user intent on the face of it.

    Thanks for the great article!

  33. AJ Kohn // November 21st 2018

    Thanks Eric. It’s hard for me to know why a site would lose that amount of traffic without looking at it. Syntax not matching intent, engagement issues, changes to the link graph and a handful of others could be the culprits.

    As to why Google made the change with the ‘signs of death’ content – the reasons weren’t cut and dry. There were some specific areas in the piece that dealt with the coming soon topic and the formatting of the content might have reduced engagement for the head term. But if I were certain, I’d be on top of every SERP.

  34. Marianne // November 21st 2018

    Virtual high five, such a great article. Thank you and the images are great too.

  35. AJ Kohn // November 21st 2018

    Thanks Marianne, particularly about the images since I spend quite a bit of time getting the ‘right’ ones.

  36. UCK // November 22nd 2018

    This makes perfect sense.

    As you said, “it’s important to think about how these things might be accomplished algorithmically” … and, I add, what cannot be performed algorithmically.

  37. AJ Kohn // November 22nd 2018

    Thanks UCK. And you’re absolutely right. My hope is more folks think about what truly can be done algorithmically and what can’t.

    Too often people make changes that won’t impact the algorithm or inadvertently make some that impact a ranking factor. Then when things change they attribute specious changes to the algorithmic change.

    To me it’s a lot about curiosity. How would Google change the type of content served on a SERP over the course of time? Why would a page gain and lose traffic for variations of the same topic but with different syntax?

    The answer isn’t going to be simple but that’s what makes this business fun.

  38. Niall // November 22nd 2018

    You are quite literally the only SEO/digital marketer I’ve come across who says not to pay so much attention to the whole E.A.T concept, and now I can see why.

    The Tweet by Gianluca also really hit home with me – far too much time is spent looking at tools that do their best to interpret data…instead of just looking at the raw search data (Google) and doing a bit of pattern matching.

    Because, try as they might, I’ve yet to find any “tool” that can figure out search intent.

    Great article AJ.

    I won’t lie here, I’m going to have to read this 3 – 4 times to let it all sink in 🙂

  39. AJ Kohn // November 28th 2018

    Thanks so much for your comment Niall. Too few seem to understand the value of pattern matching. It comes naturally to me but I know from repeated interactions that it doesn’t for others. But those that work at it and know that intent is the lifeblood of successful search will be rewarded.

  40. Hayden // November 26th 2018

    Excellent article and research. I will need to sit down and dissect this more to understand each point – this is all very fascinating, and definitely aligns with my personal experience on SERPs of late. Thanks again!

  41. Ilan // November 27th 2018

    Love your content AJ! So you completely disagree with this: https://www.mariehaynes.com/eat/?

  42. AJ Kohn // November 28th 2018

    Thanks Ilan and the answer to your question is largely yes but there’s nuance there.

    If you dig into what is recommended you find that links from reputable sites and forums are major themes. At that point you’re talking about the link graph and how citations flow and whether topical sites are more valuable (they are).

    There are also recommendations to use more scientific or medical references, which might include syntax that would get Google’s attention. Referring to a nose job as rhinoplasty may shed light on whether that text is more valuable or not. This is something embeddings can uncover. Even if we’re just talking external links, that’s still attaching yourself to the link graph in a different way.

    In addition, general themes around reducing ads or using social proof are ways to increase engagement and the opportunity for long clicks.

    So, a lot of what is recommended would actually match up to real ranking factors. Mind you, I think there are a bunch of specious recommendations too and other optimizations I’d prioritize over many of these.

    For many, following E-A-T optimization may work but I think it will be because they sort of stumble into doing certain things that hit on true algorithm factors. So it fills a niche.

    But my personal view is that there’s more long-term gain (and greater efficiency) looking at query syntax, content, intent and engagement.

  43. Jill // November 28th 2018

    Great article AJ! I have been “skeptical” of the whole EAT thing only because I have sites ranking very well that have no real “expert” signals – no great backlinks, no press mentions – NADA.

    I have been in the SEO world for a long time – and how I do SEO has not changed significantly over the years — good content has always been the key — and I believe will continue to be the key to winning in Google.

    Great keyword research and reader focused content will win every time over jumping at every update or every word Mueller says on Twitter!

  44. AJ Kohn // November 28th 2018

    Thanks Jill. It’s funny you mention the no ‘expert’ signals bit. I recently launched a site with a colleague in the health vertical. There are no expert signals. No byline. No real author bio. But it’s good content and it meets query syntax and satisfies intent.

    Within 2 days of launch, without a single link to it, Google started ranking it and it started to get clicks. And it’s steadily doing better. Not only that, but in my mind we could have done much better with making the content more focused. So the sky is the limit.

    So I’m right there with you Jill.

  45. Alex // December 11th 2018

    Oh, not usually one for reading your posts late AJ!

    Super interesting article! Have to say that while I think you’re right about EAT being invalid as an approach to site optimisation, it has been a very useful tool for getting clients over the line on spending time and resource on the things that generate EAT – namely good research, great content, unique ideas and doing some good old fashioned brand marketing.

    Sometimes it’s just easier to say Google says you should do this. 🙂

  46. AJ Kohn // December 11th 2018

    Thanks Alex. And you’re not wrong. The EAT framework does help some folks sell clients on the work that is necessary to be successful. I choose not to work with those clients and … that’s a luxury.

    Many want simple answers, even if they’re wrong and not complex answers that happen to be right. But how long can you thread the needle and not have clients begin sending you inane suggestions for improvement?

    I wrote this, in part, because at some point many SEOs seemed to conflate the EAT framework with real ranking factors. Whether they were always confused or they started to believe the lies, I can’t tell.

    So, yes, selling EAT can actually help to a certain extent. But it’ll wind up being less and less effective and is certainly less efficient. These things matter in the long run in my view. YMMV.

  47. Claudio Miguel // January 29th 2019

    This is an awesome article. Since algo changes happen frequently and not just in Google, it’s probably a better long-term strategy for us marketers to focus our efforts on understanding the long-term goals of the platform we’re publishing on, be it Google/Facebook/etc., rather than trying to crack the current state of the algo.

  48. AJ Kohn // January 30th 2019

    Thanks for the kind words and comment Claudio. And I couldn’t agree more. Understanding the mindset of Google or Facebook and what they’re trying to accomplish and seeing how your site/clients can fit into that vision is a far more effective strategy.

  49. Kelechi Ibe // February 18th 2019

    This post is filled with all sorts of gems I don’t even know where to begin my compliments! All I can say is that your point on “Intent Optimization” (targeting the keyword, optimizing the intent) and satisfying active and passive intent has been of massive benefit to me!

    If you Google search the phrase “data driven marketing case studies” right now (without the quotes), you will see an article I wrote sitting at the top #1 position (HINT: it featured Progressive and Macy’s). I literally wrote that article after I finished listening to your podcast with Dan Shure from Evolving SEO and implemented active and passive intent.

    I remember your example of someone searching for a vacuum cleaner manual and how offering an actual manual first to satisfy active intent was key but then going in-depth into the “unconscious passive intent” is how you win the SERP.

    My approach: I immediately satisfied the intent of someone searching for a “case study” on data-driven marketing. In the first few paragraphs, I included PDF research on the topic (satisfied active intent – check), then dove deeper into satisfying passive intent by offering real-life case studies featuring well-known companies with examples, video, screenshots (satisfied passive intent – check).

    The page dominated! (And continues to send quality traffic daily)

    All I can say is that your stuff works and I’m listening intently!

    Great work, AJ!

  50. Swati Sinha // April 22nd 2019

    Great article, really nice images to go with it and what a trail of discussions. I am a novice in this field but have got a good dose of knowledge and insights by going through it all.
    Thanks for sharing AJ.
