What I Learned In 2020

February 08 2021 // Career + Life // 7 Comments

(This is a personal post so if that isn’t your thing then you should move on.)

This is the ninth year that I’ve done a year in review piece. I’m at a different stage of my journey now, so you might benefit from and find yourself in prior year versions. Here are easy links to 2011, 2012, 2013, 2014, 2016, 2017, 2018 and 2019.

In 2020, the world shifted beneath our feet but my business kept growing. The anxiety I had about certain facets of my business disappeared, replaced by anxiety about COVID-19 and our political landscape.

I found sanctuary in simple habits that I’ll take forward into 2021.

COVID-19

Groundhog Day Phil Driving

It was 2020, so you can’t avoid talking about COVID-19. The strange thing is, social distancing didn’t change the way I do business. For the most part, I do all my work remotely. From time to time I might go into the city (we really don’t call it San Fran) and meet with a client.

I couldn’t do that in 2015 because I was going through chemotherapy. I found that not meeting in person worked just fine. Clients agreed.

So that’s how it’s been for the last five years. Once in a blue moon I might go in for a meeting but I largely stay at home. Heck, I have two clients in Australia, four in New York and two in Seattle so it’s not like I’m going to meet with them in person much anyway.

So the quarantine and lockdown orders were not a big change for me. But it was still different.

The day was no longer split up by trips to drop my daughter at school, going out to get groceries or stepping out with my wife for mojitos at a favorite restaurant. My daughter was doing distance learning, we quickly moved to delivery services for our groceries and I upped my bartending skills.

The days were definitely monolithic and homogeneous, leading to a general Groundhog Day malaise.

Motivation

X-Meh

Available at Woot

The biggest problem I had in 2020 was motivation. Early in the year I bought out a partner, found a developer and began working on side projects. It was the outcome of my 2019 epiphany. Let’s do this!

But even before COVID-19 hit in earnest, I was slowing down. And once COVID-19 was upon us my motivation evaporated. The reasons for this were two-fold and interrelated.

Money is not a problem. The business grew, yet again, another 13% in 2020. Some people get a buzz from making more and more money but I’m not one of them.

I’m not saying I don’t still strive for that to some degree. There are some goals I have in mind. Things I’d do with that money. But it’s no longer a primal motivation to do this or that thing so I can make money to get out of debt or to pay the mortgage or go on that vacation.

Things were going amazingly well in my life. So my passions turned outwards, toward dealing with COVID-19 and the 2020 election cycle. Why should I spend time in pursuit of even more money when so many things were going wrong?

Digressions

Office Space Movie Scene

Some of my efforts were productive. I was engaged with my local school district, helping to oust an ineffective Superintendent and then pushing for the right course of action with the new one. I gave to specific political candidates and made sure every DonorsChoose project at my daughter’s school was fully funded.

Some of my efforts were less productive, doom-scrolling and ranting about the lack of logic and empathy I saw in our country.

To me, the division seems less about Republicans vs Democrats and more about a difference between a philosophy of ‘me’ vs ‘we’. (My politics are far-left and while I rarely engage in public I make no apologies for that perspective either.)

I also agonized over the George Floyd murder and the systemic racism it exposed. How could I be an ally? While I have a very good imagination, I can’t understand how it must feel to be black in America.

I caught glimpses in Lovecraft Country by Matt Ruff. But ultimately, I felt powerless. Perhaps that’s apt.

Habits

New York Times Crossword Puzzle Statistics

There was a silver lining to lockdown orders and the deluge of bad news. I took up some good habits. I’m a big believer that you reach certain goals by instilling good habits. Yet, I often found I failed at making those habits a reality.

In 2020, I made good habits stick. I’m not sure exactly why and, to be honest, I’m not particularly interested in finding out. I’m just happy I got there.

Each year I say I want to read more. The habit I changed? As an early riser I would wake up and watch TV. I stopped doing that and read instead.

As a result, I read 16 books in approximately 9 months, including books by William Gibson, Neal Stephenson, N.K. Jemisin, Jonathan Franzen and Emily St. John Mandel.

I have a morning routine of sorts. I’m up early. Like 5am early. I do the New York Times Crossword. I make some coffee. I read.

But I also learned that you don’t have to be dogmatic about it. During the Tour de France, I woke up and eagerly watched each stage, enjoying both the race and scenery. And I’d watch biathlon too, becoming a huge fan despite my distaste for both skiing and guns.

Hiccups

Tower Bridge Jigsaw Puzzle

One of the other things we did as a family was jigsaw puzzles. Not just a few but 28 and counting, with all but two of those being 2,000 piece puzzles.

Jigsaw puzzles and the crossword reinforced essential truths. Both can only be completed if you take them step by step. You don’t just fit all the pieces together in a half-hour. You don’t get every clue one after the other in the crossword. (You might get close on Monday!)

There are parts of each that are relatively easy. You sort and pick out the edges and get the outline of the puzzle done first. In the crossword you go through and pick off the ones that come to you right away. You also put in the obvious plural (s), past tense (ed), comparative (ier) and superlative (est) suffixes when spotted.

But then you find there are rough patches. It can be slow going as you work on the gradient of the sky. And sometimes you get stuck on the crossword. A fleeting thought that this is the one that you can’t crack.

The trick is to keep going.

The next time you sit down at the table, the light is a bit better and you see the subtle difference in the sky and the pieces are now going in one after the other. Or suddenly you get one of the long answers that reinforces the theme to that crossword. Things click and you’re getting the double meaning of the clues again.

Incrementalism

Brick by Brick Lego Tower Bridge Project

I did not get to where I am today by accident. Nor could I have gotten here 8 years earlier. You have to build, brick by brick like a massive Lego project, to reach your goal.

It reminds me of the song Wake Up, Stop Dreaming by Wang Chung.

Wake up stop dreaming
There’s more than just two steps to heaven
I’m saying if you wanna get to heaven
You’d better wake up
Wake up, stop dreaming

I’m not religious yet the lyrics inspire me to not simply dream, but to do. Wake up! Take those next steps.

What 2020 did more than anything is confirm that every step is important even if they aren’t of equal value. I might put in a handful of pieces when I sit down at the table for 45 minutes. The next time I may put in a flurry of 150 pieces in the span of 10 minutes.

The puzzle doesn’t get completed without each of those steps.

This is where my personal and business lives intersect. Because one of my mantras about SEO is that the sum is greater than the parts. You can see some of this play out in my recent piece about SEO A/B testing. I’ll be writing a follow-up piece in the near future.

But as a preview, not every step is a step change. But you can’t get to the top of the mountain without taking all the steps.

Measurement

A New Chart of History

Measurement is clearly a large part of my business. I wouldn’t have it any other way. But the time scale of measurement matters. While it often sounds self-serving, SEO takes time. Patience may be the most underrated skill in our industry.

I’ve battled weight issues for a number of years. I know how to lose the weight and have a number of times, only to put it back on again. Part of the reason for this is that the habits I used to lose that weight were very rigid.

I’d go for the lowest calorie intake possible, denying myself, so I could see results quickly. It wasn’t just about speed but about keeping momentum. When you weigh yourself every day, it gets hard to keep going when the number goes up and not down.

This time around I’m not going for the lowest calorie intake. I’m looking to lose the weight slowly. There will be days when I have a couple of mojitos and blow past my calorie limit.

The funny thing is that those ‘cheat’ days and the numbers on the scale don’t line up. A day after indulging my weight often goes down. (Yes, yes, it might be that I’m dehydrated.) Other times, after a particularly good day – or even a stretch of two – my weight stays the same or creeps up.

But over time it all translates into consistent weight loss. I’ve lost a little over 10 pounds this year, averaging about 1.5 pounds each week. It’s not about living and dying by what the scale says every day. It’s about knowing that I’ll get the results I want if I keep taking those steps.

It dawned on me that I provide this service to clients. I help them move beyond the panic of a week that was a little soft. I encourage them not to spend hours in analysis but instead to execute on their roadmap. Do that next thing.

Putting It All Together

It was the best of times, it was the worst of times

I enter 2021 feeling like I can combine the things I’ve learned over the course of the last few years. I will continue to take risks and be unafraid to fail. I can shake off the guilt of not returning some emails promptly or missing a few deadlines. I’ll rely on the relentless power of habits.

Even if it doesn’t come together as planned, it’s the next step and I’m eager to take it.

 

SEO A/B Testing

February 03 2021 // Analytics + SEO // 21 Comments

SEO A/B testing is limiting your search growth.

Among Us Kinda Sus

I know, that statement sounds backward and wrong. Shouldn’t A/B testing help SEO programs identify what does and doesn’t work? Shouldn’t SEO A/B testing allow sites to optimize based on statistical fact? You’d think so. But it often does the opposite.

That’s not to say that SEO A/B testing doesn’t work in some cases or can’t be used effectively. It can. But that’s rare. In my experience, SEO A/B testing is usually both applied and interpreted incorrectly, leading to stagnant, status quo optimization efforts.

SEO A/B Testing

The premise of SEO A/B testing is simple. Split pages into two cohorts, apply your changes to the test group, leave the control group untouched and measure the difference between the two cohorts. It’s a simple champion versus challenger test.

So where does it go wrong?

The Sum is Less Than The Parts

I’ve been privileged to work with some very savvy teams implementing SEO A/B testing. At first it seemed … amazing! The precision with which you could make decisions was unparalleled.

However, within a year I realized there was a very big disconnect between the SEO A/B tests and overall SEO growth. In essence, if you totaled up all of the SEO A/B testing gains that were rolled out, the sum was far more than actual SEO growth.

I’m not talking about the difference between 50% growth and 30% growth. I’m talking 250% growth versus 30% growth. Obviously something was not quite right. Some clients wave off this discrepancy. Growth is growth right?

Yet, wasn’t the goal of many of these tests to measure exactly what SEO change was responsible for that growth? If that’s the case, how can we blithely dismiss the obvious fact that actual growth figures invalidate that central tenet?

Confounding Factors

So what is going on with the disconnect between SEO A/B tests and actual SEO growth? There are quite a few reasons why this might be the case.

Some are mathematical in nature such as the winner’s curse. Some are problems with test size and structure. More often I find that the test may not produce causative changes in the time period measured.

A/A Testing

Many sophisticated SEO A/B testing solutions come with A/A testing. That’s good! But many internal testing frameworks don’t, which can lead to errors. While there are more robust explanations, A/A testing reveals whether your control group is valid by testing the control against itself.

If there is no difference between two cohorts of your control group then the A/B test gains confidence. But if there is a large difference between the two cohorts of your control group then the A/B test loses confidence.

More directly, if you had a 5% A/B test gain but your A/A test showed a 10% difference then you have very little confidence that you were seeing anything but random test results.

In short, your control group is borked.
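That A/A sanity check takes only a few lines. The click counts below are made up for illustration: split the control group into two random halves and measure the “lift” between them. If that null difference rivals your A/B lift, the A/B result is probably noise.

```javascript
// Compare the relative lift between two halves of the SAME control group.
function lift(groupA, groupB) {
  const mean = xs => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  return (mean(groupB) - mean(groupA)) / mean(groupA);
}

// Clicks per page for two halves of the control group (invented numbers).
const controlHalfA = [100, 120, 95, 110];
const controlHalfB = [130, 100, 120, 118];

const aaDelta = Math.abs(lift(controlHalfA, controlHalfB)); // ~10%
const abLift = 0.05; // the 5% "win" the A/B test reported

if (aaDelta >= abLift) {
  console.log("A/A difference rivals the A/B lift: the win is likely noise");
}
```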

Lots of Bork

Swedish Chef Bork Bork Bork

There are a number of other ways in which your cohorts can get borked. Google refuses to pass a referrer for image search traffic. So you don’t really know if you’re getting the correct sampling in each cohort. If the test group gets 20% of its traffic from image search but the control group gets 35%, then how would you interpret the results?

Some wave away this issue saying that you assume the same distribution of traffic in each cohort. I find it interesting how many slip from statistical precision to assumption so quickly.

Do you also know the percentage of pages in each cohort that are currently not indexed by Google? Maybe you’re doing that work, but I find most are not. Again, the assumption is that those metrics are the same across cohorts. If one cohort has a materially different percentage of pages out of the index, then you’re not making a fact-based decision.
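A cohort-composition sanity check is straightforward to sketch. The shares below are invented: the idea is to compare dimensions the test doesn’t control, like image-search share and indexed share, and flag big gaps before trusting the test.

```javascript
// Flag metrics where the two cohorts differ materially. (Illustrative
// numbers; a real check would pull these from logs and index coverage.)
const control = { imageSearchShare: 0.35, indexedShare: 0.92 };
const test = { imageSearchShare: 0.20, indexedShare: 0.91 };

const MAX_GAP = 0.05; // the tolerance is a judgment call, not a standard

for (const metric of Object.keys(control)) {
  const gap = Math.abs(control[metric] - test[metric]);
  if (gap > MAX_GAP) {
    console.log(`Cohorts differ materially on ${metric} (gap: ${gap.toFixed(2)})`);
  }
}
```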

Many of these potential errors can be reduced by increasing the sample size of the cohorts. But that means very few sites can reliably run SEO A/B tests given the sample size requirements.

But Wait …

Side Eye Monkey Puppet

Maybe you’re starting to think about the other differences in each cohort. How many in each cohort have a featured snippet? What happens if the featured snippets change during the test? Do they change because of the test or are they a confounding factor?

Is the configuration of SERP features in each cohort the same? We know how radically different the click yield can be based on what features are present on a SERP. So how many Knowledge Panels are in each? How many have People Also Ask boxes? How many have image carousels? Or video carousels? Or local packs?

Again, you have to hope that these are materially the same across each cohort and that they remain stable across those cohorts for the time the test is being run. I dunno, how many fingers and toes can you cross at one time?

Exposure

Stop Making Sense

Sometimes you begin an SEO A/B test and you start seeing a difference on day one. Does that make sense?

It really shouldn’t. Because an SEO A/B test should only begin when you know that a material amount of both the test and control group have been crawled.

Google can’t have reacted to something that it hasn’t even “seen” yet. So more sophisticated SEO A/B frameworks will include a true start date by measuring when a material number of pages in the test have been crawled.
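Deriving that true start date is simple once you have crawl data per page. The data shape here is my own assumption: the test clock starts only once a material share of the cohort, say 80%, has actually been crawled.

```javascript
// Return the date by which `threshold` of the cohort's pages had their
// first post-launch crawl. (Sketch; assumes ISO date strings per page.)
function trueStartDate(firstCrawlDates, threshold = 0.8) {
  const sorted = [...firstCrawlDates].sort();
  const idx = Math.ceil(sorted.length * threshold) - 1;
  return sorted[idx];
}

// First post-launch crawl date for each page in the test cohort.
const crawls = ["2021-01-03", "2021-01-04", "2021-01-04", "2021-01-07", "2021-01-12"];
console.log(trueStartDate(crawls)); // "2021-01-07"
```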

Digestion

Captain Marvel Flerken Tentacles

What can’t be known is when Google actually “digests” these changes. Sure they might crawl it but when is Google actually taking that version of the crawl and updating that document as a result? If it identifies a change do you know how long it takes for them to, say, reprocess the language vectors for that document?

That’s all a fancy way of saying that we have no real idea of how long it takes for Google to react to document level changes. Mind you, we have a much better idea when it comes to Title tags. We can see them change. And we can often see that when they change they do produce different rankings.

I don’t mind SEO A/B tests when it comes to Title tags. But it becomes harder to be sure when it comes to content changes and a fool’s errand when it comes to links.

The Ultimate SEO A/B Test

Google Algorithm Updates

In many ways, true A/B SEO tests are core algorithm updates. I know it’s not a perfect analogy because it’s a pre versus post analysis. But I think it helps many clients to understand that SEO is not about any one thing but a combination of things.

More to the point, if you lose or win during a core algorithm update how do you match that up with your SEO A/B tests? If you lose 30% of your traffic during an update how do you interpret the SEO A/B “wins” you rolled out in the months prior to that update?

What we measure in SEO A/B tests may not be fully baked. We may be seeing half of the signals being processed or Google promoting the page to gather data before making a decision.

I get that the latter might be controversial. But it becomes hard to ignore when you repeatedly see changes produce ranking gains only to erode over the course of a few weeks or months.

Mindset Matters

The core problem with SEO A/B testing is actually not, despite all of the above, in the configuration of the tests. It’s in how we use the SEO A/B testing results.

Too often I find that sites slavishly follow the SEO A/B testing result. If the test produced a -1% decline in traffic that change never sees the light of day. If the result was neutral or even slightly positive it might not even be launched because it “wasn’t impactful”.

They see each test as being independent from all other potential changes and rely solely on the SEO A/B test measurement to validate success or failure.

When I run into this mindset I either fire that client or try to change the culture. The first thing I do is send them this piece on Hacker Noon about the difference between being data informed and data driven.

Among Us Emergency Meeting

Because it is exhausting trying to convince people that the SEO A/B test that saw a 1% gain is worth pushing out to the rest of the site. And it’s nearly impossible in some environments to convince people that a -4% result should also go live.

In my experience SEO A/B test results that are between +/- 10% generally wind up being neutral. So if you have an experienced team optimizing a site you’re really using A/B testing as a way to identify big winners and big losers.

Don’t substitute SEO A/B testing results for SEO experience and expertise.

I get it. It’s often hard to gain the trust of clients or stakeholders when it comes to SEO. But SEO A/B testing shouldn’t be relied upon to convince people that your expert recommendations are valid.

The Sum is Greater Than The Parts

Because the secret of SEO is the opposite of death by a thousand cuts. I’m willing to tell you this secret because you made it down this far. Congrats!

Slack Channel SEO Success

Clients often want to force rank SEO recommendations. How much lift will better alt text on images drive? I don’t know. Do I know it’ll help? Sure do! I can certainly tell you which recommendations I’d implement first. But in the end you need to implement all of them.

By obsessively measuring each individual SEO change and requiring it to obtain a material lift you miss out on greater SEO gains through the combination of efforts.

In a follow-up post I’ll explore different ways to measure SEO health and progress.

TL;DR

SEO A/B tests provide a comforting mirage of success. But issues with how SEO A/B tests are structured, what they truly measure and the mindset they usually create limit search growth.

Rich Results Test Bookmarklets

July 12 2020 // SEO + Technology // 6 Comments

Last week Google announced that it was going to deprecate the Structured Data Testing Tool in favor of the newer Rich Results Test.

Structured Data Testing Tool Shut Down Notice

I use the Structured Data Testing Tool daily to validate structured data on client sites and frequently play with a blob of JSON-LD until I get just the right nesting.

Because of that I long ago developed a Structured Data Testing Tool bookmarklet. I mean, who has time to copy a URL, go to another tab and paste that URL into the tool and hit enter?

No Time For That

With the bookmarklet all I have to do is click the bookmark and it launches the tool in a separate tab for the page I’m currently viewing. I know it seems like a small thing. But in my experience, small things add up quickly. Or you can just listen to Martin Gore.

Rich Results Test Bookmarklets

So the other day I dusted off my limited JavaScript skills and created two new bookmarklets that do the same thing but for the Rich Results Test for Googlebot Smartphone and Googlebot Desktop.

Rich Results Test – Mobile

Rich Results Test – Desktop

Drag the highlighted links above to your bookmarks bar. Then click the bookmark whenever you want to test a specific page. It will create a new tab with the Rich Results Test … results.

So if I’m on this page and I click the Rich Results Test – Mobile bookmark it opens a tab and performs the Rich Results Test for that page.

Rich Results Test Results Example
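If you’d rather roll your own, the core is just a few lines of JavaScript. This is a simplified sketch that builds the test URL from the page you’re viewing using the tool’s ?url= query parameter; it omits the mobile/desktop user agent toggle my two versions include.

```javascript
// Build the Rich Results Test URL for a given page. Wrapped in a
// javascript: URL and saved as a bookmark, it becomes a bookmarklet.
function buildRichResultsUrl(pageUrl) {
  return (
    "https://search.google.com/test/rich-results?url=" +
    encodeURIComponent(pageUrl)
  );
}

// Bookmarklet form (save this one-liner as a bookmark's URL):
// javascript:window.open('https://search.google.com/test/rich-results?url='+encodeURIComponent(location.href))

console.log(buildRichResultsUrl("https://example.com/page"));
```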

I’m guessing there are a number of these bookmarklets floating around out there. But if you don’t have one yet, these can help streamline your structured data validation work.

I hope you find this helpful. Please report any incompatibility issues or bugs you might find with my bookmarklet code.

What I Learned in 2019

January 27 2020 // Career + Life + SEO // 23 Comments

(This is a personal post so if that isn’t your thing then you should move on.)

This is the eighth year that I’ve done a year in review piece. If this is your first time reading one you may need the context of prior years. I’ve dealt with a variety of issues leading up to this point. Here are easy links to 2011, 2012, 2013, 2014, 2016, 2017 and 2018.

2019 was a successful year in one way but not in many others. As I closed out the year I realized that I’d taken the wrong learnings from 2018. I’d let the business come to me, devalued my expertise and lost confidence.

Business Booms

Shut Up and Take My Money

The business grew another 38% in 2019. I remain a bit stunned at the numbers.

I moved all legacy clients to expertise retainers and these new arrangements allowed me to carry more clients than I had in the past.

I was concerned that the relatively new expertise retainers might not translate into the same sort of success for clients, which would likely mean more client churn. But that didn’t happen. Not at all.

The problem was not with the expertise retainers but my own fear that they weren’t delivering enough value.

Confidence

You Can Not Be Serious

I have often been accused of being cocky. I get it. From the outside I argue pretty passionately and am very willing to take a stand for what I believe to be true. I hope I do so in as civil a way as possible but that might not always be the case.

When I think about myself I’d certainly say I’m confident. It’s not something I lack. But for some reason there were areas last year where confidence seemed lacking. It was, frankly, a bit of a shock to make this discovery.

I was not confident that my expertise was enough to support my retainers. Yet that went against all logic when I looked at the results I was driving for these clients.

I was not confident that I could add enough value to outside projects or build new projects on my own. Yet the one outside project I worked on is driving nearly 30,000 visits a day on my strategy and my content.

So where was this drain in confidence coming from?

I believe strongly in my expertise about certain topics but did not believe strongly enough in the value of all that expertise combined. It’s a subtle thing but incredibly important.

The analogy I’d make is a tennis player who is confident in their serve, in their footwork, in their forehand and backhand, in their net play but, oddly, not confident in their game.

Confidence is such an important part of any endeavor. Because at some point something is going to go sideways. In tennis your first serve might break down. Or you just have a few games where your backhand isn’t working.

If you only have confidence in the components you’re unlikely to find lasting success. Instead, you have to have confidence in yourself. You’ll find a way to fix that backhand. You’ll figure out a way to win.

I’m reminded of something Jon Henshaw said to me a number of years ago. “If the Internet went away tomorrow you’d find another way to be successful.” It was damn flattering and the words stick with me to this day.

Instigator

Elliot from Mr. Robot

That lack of confidence led to being less aggressive about opportunities. I wasn’t taking as much initiative as I had been previously.

Part of this was taking the wrong learnings from 2018. I’d ended that year with a bit of schmaltz around needing other people to succeed. There’s a popular quote about this floating around.

“If you want to go fast, go alone. If you want to go far, go together.”

I’m actually not arguing against this philosophy. I think it’s true. But here’s the thing. There are a whole bunch of people who don’t go anywhere. When I look back at where I’ve been most successful in life over the last few years it’s because I’ve been the instigator.

I may start out alone but I find people along the way.

The point is, I don’t think a lot of things would have come to fruition if I had not been the instigator. I lost that to a large degree in 2019. I was waiting for others to help get things started. Or I thought that partnership was critical to success.

In last year’s piece, I’d asked if anyone wanted to help launch a new politically slanted site. Nobody raised their hand to help and as a result nothing ever happened. That won’t happen this year.

I’ll fumble around and figure out how to get it done.

Failure

Brave Enough to Be Bad Quote

One of the reasons I didn’t do more was a fear of failure. When you’re comfortable and accustomed to success in one area I think it becomes more difficult to think of failing in another.

There’s a strange dark synergy with confidence here. If you don’t believe in you but just the things you do then having some of those things fail becomes pretty crippling.

Strangely, this isn’t about how others perceive me. I haven’t defined myself by how others view me since … high school. I’m the critic holding myself back, which is strange because I’m so good at framing suboptimal situations.

I won’t hold myself back in 2020.

This is a lot easier for me now. The reason why? Money. It sounds crass but it’s not a big deal if I lose $5,000 on a new project. Even turning away paying clients to focus on something I think will pay off down the line is okay.

That voice in my head can’t scare me with visions of missed mortgage payments and an inability to feed my family. So it’s a lot easier to take risks and drown out that inner voice by shouting ‘cowabunga!’ as I dive in head first.

Disconnected

No More Wood Chips Please

I wrote four blog posts in 2019 and one of those was the year in review piece. That’s not a lot. Certainly less than I had planned.

Part of this was clearly about time management and simply not putting as much value on sharing my expertise. But the other part was because I felt disconnected from the industry.

I don’t see a lot of what I do or how I think about search showing up in industry pieces. That’s okay. There are a number of ways to achieve search success and plenty of demand for all of us.

Yet, the gulf has widened to such a degree that it becomes hard to understand how I’d fit into the landscape.

Many of my views are contrary to mainstream thought. I never talk about E-A-T. I advocate for less A/B testing. I find third-party tools often obscure real insight. I think many are far too obsessed with site speed.

I don’t mind publishing contrarian views if I believe enough people are listening. I’m just not sure that’s the case these days.

In the past I could spend a fair amount of time to defend and debate my views. I still could but I find it hard to come up with a good reason why I should.

Audience

New Yorker Cartoon "Read The Room"

The problem I have right now is audience. My primary target market is executives at large scale business-to-consumer start-ups. Thing is, they don’t trust the talking heads in search. Not in the slightest.

Instead, they ask other executives and friends. They reach out to see if their venture capital backers have leads on skilled search professionals that have helped other portfolio companies.

A few posts to maintain a certain degree of visibility are necessary but referrals based on working relationships are how I secure all new work. I think this is true for a handful of other folks in the industry as well.

I admit this is really only true if you’re a solo consultant or very small shop. Agency and tool representatives still need to be out there because the margins on those businesses are thinner.

So I’m not showing up at conferences or lobbing grenades into mainstream SEO thought because it doesn’t really help me anymore. I miss it. But I’m finding it hard in the cold light of logic to defend the time and energy it takes.

It makes me wonder if the direction of the industry has changed because of a mix shift issue with contributors.

Life

Life Is Like A Box of Chocolates

Remember last year when I said that I was going to accomplish some important personal goals by adhering to certain habits? Yeah … that didn’t happen.

I’ve never been heavier and I read a total of three books all year.

I simply lost focus. I was handicapping failure. I took on more than I should have because I lacked confidence in my new expertise retainer strategy. I spent way too much time on the business and less on myself. I decided other things were more important than my physical and mental health.

It wasn’t all about work. The one thing that hasn’t wavered throughout has been a dedication to family. I have only missed one of my daughter’s events … ever. And that was because I was in the hospital. I regularly cancel or move meetings to be there for her activities. Lacrosse season is just around the corner!

Last year I also became the Northgate Girls Tennis Team Booster Representative, which turns out to be a fairly large commitment. So I have to cut myself some slack there. I did stuff.

And after talking about it for a decade I made sure my wife was able to follow-through on a family reunion. While I’m not eager to go back to Florida (no offense folks) I’m very thankful we were able to pull it off and create a bunch of memories.

Points

AJ Kohn Interviews Gary Illyes

Taking a note from prior year’s learnings I can acknowledge that I wasn’t a total slacker this year.

I continued to contribute to Bay Area Search and was able to coordinate and conduct an interview with Gary Illyes. Unfortunately, the video still isn’t available. I’m going to work on that but until then you can read this great write-up from Kevin Indig.

I was also a vocal advocate for Genius as they went public with their allegations of theft by Google and their proxies.

While not my intention, that probably did more for my personal brand than any of my other activities in 2019, particularly when you think about my target market.

That’s not why I did it. I was, and still am, pissed. But that doesn’t make me a Google hater. Far from it. I simply call them as I see them.

Next

Is This A Pigeon?

I don’t know what comes next. I don’t have a formula that will help me better balance work and life. But that’s okay. I don’t need to figure that out here in this post. Or even tomorrow. (And while well intentioned, please don’t send life hacks and productivity book suggestions.)

What I need to do is remain confident that I will.

Will I fail again? Maybe. Or maybe I’ll catch fire like Will Scott. (I mean, talk about a lasting transformation and true inspiration.)

Here’s what I am doing. I’m being an instigator again.

I reached out to a potential partner and in the span of a week was able to have a dialog that let me cross that idea off the list of side projects.

I parted ways with one client where I no longer felt like I was able to deliver value. To me, their roadmap was geared toward a version of Google that last existed two years ago.

I did a quick thread on the new Popular products unit Google launched. Danny wound up replying and was helpful later when I pinged him on another issue. I appreciate this because I was pretty hard on Danny last year.

I contacted comScore about getting historical qSearch data so I can fill in and update my US desktop search volume graph. They didn’t get back to me other than to add my email to their marketing list (not cool). That won’t stop me from getting some sort of data to inform a theory I have regarding search trends.

I hopped down the street to get the slow leak in my tire fixed and thoroughly cleaned the ice maker. Now I no longer worry about getting a flat and we again have crushed ice. These small things sound stupid but let me tell you dealing with them brings such relief and satisfaction.

In all, I’m taking what I learned in the last few years and am doing those things more often and faster. It’s up to me to get things started.

The Problem With Image Search Traffic

November 14 2019 // Analytics + Rant + SEO // 11 Comments

Where To Track Image Search Traffic

Google makes it easy for marketers to make bad decisions by hiding the performance of image search traffic.

Marketers have grown accustomed to not seeing image search traffic broken out in analytics packages. And Google persists in telling marketers to use Google Search Console to track image search traffic.

The problem? Google Search Console doesn’t tell marketers how image search traffic performs.

Here’s why Google’s decision to hide image search traffic performance is hurting websites.

Image Search History

Google Analytics doesn’t track image search as a separate source of traffic. This never made any sense to me.

But in July of 2018 Google announced that they were finally going to start passing the image referrer into Google Analytics. I was, in all honesty, elated that we’d finally have image search split out.

So I waited. And waited. And waited. And waited. And waited. And then, very quietly, Google updated that post.

Google Decides To Give Us Bad Data

WTF! “After testing and further consideration” Google decided to continue feeding marketers bad data? I cursed like a sailor. Multiple times.

Even worse? They pointed marketers to the Search Console Performance Report. Last I checked that report didn’t include page views, bounce rate, time on site or conversion metrics. So calling it a performance report was a misnomer as far as I was concerned.

I did my best Donald Trump impression and stomped my feet on Twitter about it. Nothing came of it. No one seemed to care. Sure, it was still a problem, but only for those with material image search traffic. I knew what to look for and … I was busy.

So what changed? Two things happened that made me write this piece.

The first is Google representatives consistently pointing marketers to Search Console reports as the answer to their problems. This triggers me every time. Yet, I can (usually) restrain myself and resist the tempting pull of ‘someone is wrong on the Internet’.

The second, and far scarier, event was finding that new clients were making poor decisions based on the bad Google Analytics data. Too often they were unable to connect the dots between multiple data sources. The fate of projects, priorities and resources was at stake.

Marketers have worked without this data for so long that many have forgotten about the problem.

Let me remind you.

Image Search Tracking

Google Analytics Y U No Track Image Search

Out of frustration I figured out a way to track image search in Google Analytics. That was in 2013. Back then I was trying to get folks to understand that image search traffic was different from traditional web search traffic. And I could prove it with those Google Analytics advanced filters.

Image Search by Browser

Unfortunately, soon after that post in 2013 we began to lose visibility as more and more browsers failed to capture the image search referrer.

Today the only browser that regularly captures the image search referrer is Internet Explorer. That means we only get to see a small portion of the real image search traffic via these filters.

Clearly that introduces a fair amount of bias into the mix. Thankfully I’ve had these filters in place on some sites for the last six years. Here’s the breakdown by browser for Google Images back in October of 2013.

Image Search by Browser October 2013

There’s a nice distribution of browsers. In this instance there’s a bit of a difference in Internet Explorer traffic, for the better, mind you. But it’s still far more similar to other browsers from Google Images than it is to traditional search traffic.

Now here’s the breakdown by browser for Google Images from October of 2019 (from the same site).

Image Search by Browser October 2019

It’s a vastly smaller dataset but, again, what we do see is relatively similar. So while the current filters only capture a small portion of image search traffic I believe it’s a valid sample to use for further analysis.

Image Search Performance

Once you have those filters in place you instantly see the difference. Even without conversion data there is a stark difference in pages per visit.

Image Search Performance Comparison

That’s a look at October 2019 data from a different site. Why am I using a different site? It has more data.

Think I’m hiding something? Fine. Here’s the same data from the first site I referenced above.

Image Search Pages Per Session Difference

The behavior of image search traffic is very different from that of web search traffic.

Think about how you use image search! Is it anything like how you use web search? The intent of image search users differs from that of web search users.

Why does Google think we should treat these different intents the same?

Image Search Conversion

Things get more interesting (in a Stephen King kind of way) when you start looking at conversion.

eCommerce Image Search Conversion Rate

This is a large set of data from an eCommerce client that shows that image search traffic does not convert well. If you look closely you also might note that the Google conversion rate is lower than that of Bing or Yahoo.

For those squinting, the conversion for Google is 1.38% while Bing and Yahoo are at 1.98% and 1.94% respectively. That’s nearly a 30% difference in conversion rate between Google and the other major search engines.

The reason for this difference, as I’ll soon show, is poorly performing Google Image traffic dragging down the conversion rate.

Here’s another eCommerce site with a unique conversion model (which I can’t reveal).

Image Search Conversion Rates

In this instance, Google Images performs 64% worse (.17%) than Google (.47%). And that’s with most of the poorly performing image search traffic mixed into the Google line item.

Over the last 28 days Google Search Console tells me that 33.5% of Google traffic is via image search. The distribution above shows that 5.8% comes from image search. So the remaining 27.7% of the Google traffic above is actually image search.

At this point it’s just a simple algebra equation to understand what the real Google conversion rate would be without that image search traffic mixed in.

Image Search Conversion Math

Confused Math Lady

Don’t be scared away by the math here. It’s really not that hard.

First I like to say it as a sentence. If total traffic of 88,229,184 has a conversion rate of 0.47%, but 27.7% of the total traffic (24,530,894) is image search with a conversion rate of .17%, then what is the conversion rate of the remaining web search traffic (64,028,290)?

Then it becomes easier to write the equation.

24,530,894*0.17 + 64,028,290 * X  = 88,229,184 * 0.47

At that point you solve for X.

4,170,252 + 64,028,290X = 41,622,816

64,028,290X = 41,622,816 – 4,170,252

64,028,290X = 37,452,565

X = 37,452,565/64,028,290

X = 0.58

That means the true difference in conversion performance is .17% versus .58% or nearly 71% worse.
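The algebra above generalizes to any blended traffic source and is easy to check in code. Here is a minimal Python sketch using the rounded figures from this example (the post’s intermediate numbers come from unrounded data, so the result agrees to two decimal places):

```python
def residual_rate(total_visits, total_rate, seg_visits, seg_rate, rest_visits):
    """Solve total_visits * total_rate = seg_visits * seg_rate + rest_visits * x
    for x, the conversion rate of the remaining (web search) traffic."""
    return (total_visits * total_rate - seg_visits * seg_rate) / rest_visits

# Rounded inputs from the example above (rates in percent).
web_rate = residual_rate(88_229_184, 0.47, 24_530_894, 0.17, 64_028_290)
print(round(web_rate, 2))  # ≈ 0.58, i.e. 0.58% for web search vs 0.17% for image search
```

The same function works for any segment you can size (mobile versus desktop, US versus non-US) as long as you know the blended rate and the segment’s rate.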

Organic Search Conversion Deflation

Including image search traffic into organic search decreases the overall conversion rate. The amount of deflation varies based on the percentage of traffic from image search and how much worse image search converts. Your mileage may vary.

Here’s another example of how this might play out: the conversion rate trend for an eCommerce client.

Conversion Rate Trend

They’ve been concerned about the continuing decline in conversion rate, despite material growth (60%+) in traffic. The drop in conversion rate between July 2018 and October of 2019 is 38%.

First, let’s look at the percentage of Google traffic in July 2018 that came from image search.

Image Search Share of Traffic July 2018

I don’t have a whole month but the ratio should hold about right. In July 2018 the share of Google traffic from image search was 30.2%.

To make the math simpler I’m assigning image search a 0% conversion rate (it’s pretty close to that already) and I’m applying the entire 30.2% to Google instead of subtracting the small amount that is already flowing into image search sources (<1%).

Adjusted Conversion Rate July 2018

When you do the math Google suddenly has a 2.19% conversion rate, which puts it in line with Bing and Yahoo. Funny how that works huh? Actually it’s not funny at all.
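With image search assumed to convert at 0%, the adjustment is just division: the web-only rate is the blended rate divided by the share of traffic that isn’t image search. A quick sketch with the rounded published inputs (the chart’s 2.19% comes from unrounded data, so this lands slightly lower):

```python
def adjusted_rate(blended_rate, image_share):
    """Back out the web-only conversion rate, assuming image search converts at 0%.
    blended = image_share * 0 + (1 - image_share) * web
    =>  web = blended / (1 - image_share)"""
    return blended_rate / (1 - image_share)

# 1.51% blended Google rate, 30.2% of Google traffic from image search.
print(round(adjusted_rate(1.51, 0.302), 2))  # ≈ 2.16, in line with Bing and Yahoo
```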

Seriously folks, I want you to fully digest this finding. Before I removed the Google Image traffic the conversion rate of the three search engines is:

Google: 1.51%

Bing: 2.21%

Yahoo: 2.23%

But when I remove Google Image search traffic the conversion rate of the three search engines is:

Google: 2.19%

Bing: 2.21%

Yahoo: 2.23%

When image search traffic is removed the conversion data makes sense. 

You know what else happens? Paid Search doesn’t look nearly as dominant as a conversion channel.

Paid Search Conversion July 2018

So instead of organic search being nearly half as effective (1.55% vs 2.97%) it’s approximately 75% as effective (2.19% vs 2.97%).

But look at what happens when we analyze October of 2019. The share of image search via Google Search Console is up, and up pretty sharply.

Image Search Share of Traffic October 2019

Now, 44.8% of the Google traffic to this site is from image search. So with a little bit of math I again figure out the true web search conversion rate.

Adjusted Conversion Rate October 2019

Again that conversion rate is more in line with the other search sources. (Though, note to self, investigate Bing conversion drop.)

Paid search conversion also dropped to 2.25% in October of 2019. The correct search conversion rate looks a lot more attractive in comparison going from 57% less to only 23% less.

Let me restate that.

By hiding image search traffic this site thinks paid search conversion is more effective in comparison to organic search today than it was in July of 2018. The reality is the opposite. In comparison to paid search, organic search conversion improved slightly.

Mix Shift Issues

Sir Mix-A-Lot

If we go back to that trend at the beginning of the prior section, the drop in conversion from July 2018 to October 2019 is no longer 38% but is approximately 21% instead. That’s still a material drop but it’s not 38%!

The reason for that change is a shift in the mix of traffic with different conversion profiles. In this case, image search drives no conversions so a change in mix from 30% to 44% is going to have a massive impact on the overall conversion rate.
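To put a number on that, hold both underlying rates constant and only move the mix. A sketch using this site’s approximate figures (image search converting at roughly 0%, web search at the adjusted 2.19%; all inputs rounded):

```python
def blended(web_rate, image_rate, image_share):
    """Blended conversion rate for a given image-search share of traffic."""
    return image_share * image_rate + (1 - image_share) * web_rate

july = blended(2.19, 0.0, 0.302)     # ~1.53% blended with a 30.2% image mix
october = blended(2.19, 0.0, 0.448)  # ~1.21% blended with a 44.8% image mix
print(round(1 - october / july, 2))  # ≈ 0.21
```

Mix shift alone manufactures an apparent drop of about a fifth, with neither channel actually converting any worse.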

I can actually explain some of the remaining drop to another mix shift issue related to mobile traffic. Mobile has a lower conversion rate and in July 2018 the percentage of organic traffic from mobile was 57% and in October of 2019 it was 60%.

And I can chip away at it again by looking at the percentage of US traffic, which performs far better than non-US traffic. In July 2018, US traffic comprised 53% of Google search traffic. In October 2019, US traffic comprised 48% of Google search traffic.

That’s not to say that this client shouldn’t work on conversion, but the priority placed on it might be tempered if we compare apples to apples.

And that’s what this is really about. Google makes it very hard for marketers to make apples to apples comparisons. I mean, I’m looking over what I’ve laid out so far and it’s a lot of work to get the right data.

Alternate Image Search Tracking

Walternate from Fringe

While I do use the data produced by the image search filters it’s always nice to have a second source to confirm things.

Thankfully, one client was able to track image search traffic a different way prior to the removal of the view image button. What did they find? The image search conversion rate was 0.24% while the web search conversion rate was 2.0%.

Yup. Image search performed 88% worse than web search.

This matters for this particular client. Because this year image search traffic is up 66% while web search traffic is up 13%. How do you think that translates into orders? They’re up 14%.

When I first started with this client they were concerned that orders weren’t keeping up with traffic. Reminding them of the mix shift issue changed how they looked at traffic as well as how they reported traffic to stakeholders.

Institutional knowledge about traffic idiosyncrasies is hard to maintain when the reports you look at every day tell you something different.

Bad Data = Bad Decisions

No Regerts Tattoo

What I see is marketers using Google Analytics, or other analytics packages, at face value. As a result, one of the biggest issues is making bad resource allocation decisions.

Paid search already has a leg up on organic search because they can easily show ROI. You spend X and you get back Y. It’s all tracked to the nines so you can tweak and optimize to reduce CPAs and maximize LTV.

Organic search? Sure we drive a ton of traffic. Probably a lot more than paid search. But it’s hard to predict growth based on additional resources. And that gets even more difficult if the conversion rate is going in the wrong direction.

So management might decide it’s time to work on conversion. (I swear I can hear many heads nodding ruefully in agreement.) Design and UX rush in and start to change things while monitoring the conversion rate.

But what are they monitoring exactly? The odds that image search traffic responds to changes the same way web search traffic does are extremely low. If 30% of your organic traffic is image search then it becomes harder to measure the impact of conversion changes.

Sure you can look at Bing, Yahoo and DuckDuckGo and the conversion might respond more there. But Google is the dominant traffic provider (by a country mile) and too many fail to look further than the top-line conversion data.

A/B Testing?

Villanelle Wants You To Be Quiet

Oh, and here’s a brainteaser for you. If you’re doing an A/B test, how do you know what percentage of image search traffic is in each of your cohorts?

Yeah, you don’t know.

Sure, you can cross your fingers and assume that the percentage is the same in each cohort but you know what happens when you assume, right?

Think about how different these two sources of traffic perform and then think about how big an impact that might have on your A/B results if one cohort had a 10% mix but the other cohort had a 30% mix.

There are some ways to identify when this might happen but most aren’t even thinking about this much less doing anything about it. Many of those fact-based decisions are based on what amounts to a lie.
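The distortion is easy to quantify. Suppose both cohorts have identical underlying rates (hypothetical numbers: web converts at 2.0%, image at 0.2%) but the image-search mix differs, 10% versus 30%:

```python
def cohort_rate(image_share, image_rate=0.2, web_rate=2.0):
    """Blended conversion rate (%) for a cohort with the given image-search mix.
    The 0.2% and 2.0% rates are hypothetical, chosen only for illustration."""
    return image_share * image_rate + (1 - image_share) * web_rate

a = cohort_rate(0.10)  # 1.82% blended
b = cohort_rate(0.30)  # 1.46% blended
print(round(1 - b / a, 2))  # ≈ 0.2: variant B looks ~20% worse despite identical rates
```

Nothing about the test variants changed; only the traffic mix did.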

Revenue Optimization

This isn’t just about eCommerce sites either. If you’re an advertising based site you’re looking for page views, right?

Image Search Traffic Publishers View

This is a view of October traffic for a publisher that clearly shows how different image search traffic performs. Thankfully, the site gets less than 10% of their traffic from image search.

Image Search Share for Publisher

Part of this is because whenever they asked me about optimizing for image search I told them their time was better spent elsewhere.

Pinterest for Publishers

Far better to invest in getting more traffic from a source, like Pinterest, that better matches intent and therefore supports the advertising business.

Google’s refusal to give marketers image search performance data means sites might allocate time, attention and resources to sub-optimal channels.

Pinterest

Elephant with Pinterest Logo

The elephant in the room is Pinterest. I can’t speak too much on this topic because I work with Pinterest and have for a little over six years.

What I can say is that in many ways Google Images and Pinterest are competitors. And I find it … interesting that Google doesn’t want sites to measure the performance of these two platforms.

Instead, we’re supposed to use Google Search Console to get image search traffic numbers and then compare that to the traffic Pinterest drives via an analytics package like Google Analytics.

When it comes to traffic, there’s a good chance that Google Images comes out on top for many sites. But that’s not the right way to evaluate these two sources of traffic. How do those two sources of traffic perform? How do they both help the business?

Why Google? Why?

Rick Sanchez

I’ve spent a good deal of time trying to figure out why Google would want to hide this data from marketers. I try hard to adhere to Hanlon’s Razor.

“Never attribute to malice that which can be adequately explained by stupidity.”

But it’s hard for me to think Google is this stupid or incompetent. Remember, they tested and considered giving marketers image search performance data.

Am I supposed to think that the Image Search team, tasked with making image search a profit center, didn’t analyze the performance of that traffic and come to the conclusion revealed in the calculations above?

I’m open to other explanations. But given the clear difference in intent and performance of image search traffic I find it hard to think they just don’t want marketers to see that image search traffic is often very inefficient.

I could go further along this line of thinking and go full conspiracy theory, positing that making organic search look inefficient means more resources and budget are allocated to paid search.

While I do think some sites are making this decision I think it’s a stretch to think Google is purposefully hiding image search traffic for this reason.

Is Image Search Useless?

Please Close Gate

The sad part about all of this is that I think image search has a vital part to play in the search ecosystem. I believe it most often represents top of funnel queries. Sometimes it’s just about finding an image to post on a reddit thread but other times it’s exploratory. And either way I don’t mind the brand exposure.

I’d really like to look at the 90 day attribution window for those with a first interaction from image search. Do they come back through another channel later and convert? That might change the priority for image search optimization.

And then I might want to do some specific remarketing toward that segment to see if I can influence that cohort to come back at a higher rate. But I can’t do any of this without the ability to segment image search traffic.

Homework

Homework

If you’ve made it this far I’d really like you to do this math for your site. Here’s a crib sheet for how to perform this analysis.

Take a month of organic search data from Google Analytics.

Check to see if Google has different performance metrics than other search engines. That’s a strong clue the mix of traffic could be causing an issue.

Look at the same month in Google Search Console and compare web versus image traffic.

Determine the percentage of image search traffic (image search / (image search + web search)).

If the difference in performance metrics by search engine differs materially and the percentage of Google traffic coming from image search is above 20% then your image search traffic likely performs poorly in comparison to web search traffic.

Do the math.

Here’s where it gets tricky. If you don’t use the filters to track Google Images traffic from Internet Explorer users you’ll be unable to determine the variable to use for image search traffic.

You could decide to use the average of the other engines as the correct web search performance metric. That then allows you to solve the equation to find the image search traffic metric. But that’s a bit deterministic.
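The crib sheet above can be sketched in a few lines. The inputs here are hypothetical placeholders, and the sketch uses the fallback just described: treating the average of the other engines as the true web search rate and solving for the implied image search rate.

```python
def implied_image_rate(blended_rate, image_share, assumed_web_rate):
    """Given the blended Google rate from GA, the image share from GSC, and an
    assumed web-only rate (e.g. the average of Bing/Yahoo/DuckDuckGo), solve
    blended = share * image + (1 - share) * web for the image rate."""
    return (blended_rate - (1 - image_share) * assumed_web_rate) / image_share

# Hypothetical inputs: GA shows Google converting at 1.5%, GSC reports 30,000
# image clicks and 70,000 web clicks, and the other engines average 2.0%.
image_share = 30_000 / (30_000 + 70_000)  # image / (image + web)
print(round(implied_image_rate(1.5, image_share, 2.0), 2))  # ≈ 0.33
```

If the implied image rate comes out far below the assumed web rate, the mix of traffic is likely deflating your top-line organic conversion rate.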

Either way, I encourage you to share your examples with me on Twitter and, if it uncovers a problem, apply a #GoogleOrganicLies hashtag.

TL;DR

The decision to hide image search performance may cause sites to allocate resources incorrectly and even make bad decisions about product and design. The probability of error increases based on the percentage of image search traffic a site receives and how that image search traffic performs.

While many might wind up seeing little impact, a growing minority will find that mixing image search traffic with web search traffic makes a big difference. I encourage you to do the math and find out whether you’ve got a problem. (This feels oddly like a ‘get tested’ health message.)

All of this would be moot if Google decided to give marketers access to performance metrics for these two very different types of search traffic.

The Invisible Attribution Model of Link Acquisition

August 30 2019 // Advertising + Marketing + SEO // 11 Comments

Links are still an important part of ranking well in search. While I believe engagement signals are what ultimately get you to the top of a search result, links are usually necessary to get on the first page.

In the rush to measure everything, I find many are inadvertently limiting their opportunities. They fail to grasp the invisible attribution model of link acquisition, which is both asymmetrical and asynchronous.

The result? Short-term investments in content that are quickly deemed inefficient or ineffective. Meanwhile savvy marketers are drinking your milkshake.

Link Building vs Link Acquisition

Nick Young Question Marks

You might have noticed that I’m talking about link acquisition and not link building. That’s because I think of them as two different efforts.

I view link building as traditional outreach, which can be measured by close rates and links acquired. You can determine which version of your pitch letter works best or which targets are more receptive. Measurement is crystal clear.

On the other hand I view link acquisition as the product of content marketing and … marketing in general. It’s here that I think measurement becomes difficult.

Shares and Links 

Simple and wrong or complex and right

Of course there are some very well known studies (that I won’t link to) that “prove” that content that gets shared doesn’t produce a lot of links.

I guess that’s it folks. End of post, right?

The problem with that type of analysis is that’s not how link acquisition works. Not in the slightest.

Asymmetrical

Asymmetrical Millennium Falcon

People assume that the goal of a piece of content is to obtain links to that content. Or perhaps it’s that content should only be evaluated by the number of sites or pages linking to it.

Clearly that’s an easy metric. It feels right. It’s easy to report on and explain to management. But I think it misses the point. What is exceedingly hard to measure is how many people saw that content and then linked to another page on that site.

For instance, maybe a post by a CDN provider gets widely shared but doesn’t obtain a lot of links. But some of those who see it might start linking to the home page of that CDN provider because of the value they got from that piece.

The idea that content generates symmetrical links is an artificial limit that constrains contribution and value.

Asynchronous

Asynchronous Comeback

Links are not acquired right after content is published. Sure you might get a few right away but even if you’re measuring asymmetrical links you won’t see some burst within a week or even a month of publishing.

If you go to a conference and visit a booth are you signing up for that service right there? Probably not. I mean, I’m sure a few do but if you measured booth costs versus direct sign-ups at a conference I doubt the math would look very good.

Does that mean it’s a bad strategy? No. That booth interaction contributes to a sale down the road. The booth interaction and resulting sale are asynchronous.

Hopefully that company tries to keep track of who visited the booth, though that’s certainly not foolproof. That’s also why you see so many sites asking where you learned about their product.

They’re trying to fill in the invisible parts of an attribution model.

Saturation Marketing 

My background is in marketing and advertising so I might come at this from a different perspective. I am a big believer in saturation marketing overall and see it as a powerful SEO tactic.

Here’s an example. I go to a Sharks game and the boards are covered in logos.

Sharks Playoff Game 2019

If we’re using a symmetrical and synchronous model of attribution I’d have to jump down onto the ice and rent a car from Enterprise right then and there to make that sponsorship worthwhile.

That’s ludicrous, right? But why do we hold our content to that standard?

Story Time

Gatorade NASCAR Car

Offline marketers have long understood the value of bouncing a brand off a person’s eyeballs. I didn’t fully appreciate this until I was in my first job out of college.

I worked at an advertising agency outside of Washington, D.C. Our big client was The Army National Guard. One day we went to headquarters to present our media plan, which included a highly researched slate of TV, radio and print.

Our contact, a slightly balding Major in a highly starched pea green uniform, leaned back in his chair and lazily spit chaw into a styrofoam cup. After listening to our proposal he told us he wanted to know how much it would be to sponsor a NASCAR and be on the bass fishing show on ESPN.

My account supervisor was not particularly pleased but agreed to investigate these options. That task fell to me. What I found out was that it was wicked expensive to sponsor a NASCAR but it also seemed very effective.

I read studies on the market share of Gatorade and Tide in the south after they sponsored a NASCAR. We’re talking 400% growth. Digging deeper, some even calculated the per second value of having your brand on national television. I was fascinated.

Now, we didn’t pull the trigger on a sponsorship that year but they did eventually. However, the demographics of NASCAR changed and the sponsorship turned out to be less than effective. (Though it’s interesting to see that attribution was still an issue during their analysis.)

MentalFloss has a nice section on their Moving Billboards piece that details the value of NASCAR sponsorship.

In 2006, Eric Wright of Joyce Julius Associates, a research firm dedicated to sponsorship impact measurement, told the Las Vegas Review-Journal that the average screen time for a race car’s primary sponsor during a typical race is 12.5 minutes and the average number of times the announcers mention the sponsor is 2.6 times per race. The comparable value to the sponsor for the time on screen, according to Wright, is $1.7 million. A sponsor’s exposure goes up if its driver takes the checkered flag or is involved in a wreck, especially if the wreck occurs in the later stages of the race and the company name is still visible when the car comes to a stop. “If you crash, crash fabulously, and make sure your logo is not wrinkled up,” Dave Hart of Richard Childress Racing once told a reporter.

The emphasis is mine. And clearly you might quibble with their calculations. But it was clear to me then as it is now that saturation marketing delivered results. Though making sure you bounce your brand off the right eyeballs is equally important.

Branded Search

Another way to validate this approach is to look at how advertising impacts branded search. One of my clients is a David in a vertical with a Goliath. They don’t have a big advertising budget. So they’re doing a test in one market. Here’s the branded search for each according to Google Trends.

Impact of Marketing on Branded Search

It’s pretty easy to spot where my client is doing their advertising test!

Now, I’ve shown this a few times recently. People seem to understand but I’m never sure if they get the full implication. You might even be asking what this has to do with link acquisition.

This is a clear indication that advertising and marketing influence online behavior.

By the power of Grayskull we have the power! Now, in this case it’s offline advertising. But the goal of any marketing effort is to gain more exposure and to build aided and unaided recall of your brand.

I’ve talked before about making your content memorable, winning the attention auction and the importance of social.

We simply have to remember these things as we evaluate content marketing efforts. And far too many aren’t. Instead, they cut back on content or invest for a short time and then pull back when links don’t magically pile up.

Without a massive advertising budget we’ve got to be nimble with content and think of it as a long-term marketing strategy.

Attribution Models

I have one client who had a decent blog but was wary of investing any further because it didn’t seem to contribute much to the business.

A funny thing happened though. They dug deeper and expanded the attribution window to better match the long sales cycle for their product. At the same time they embraced an SEO-centric editorial calendar and funded it for an entire year.

The result? Today that blog generates seven figures worth of business. Very little of that is attributed on a last click basis. People don’t read a blog post and then buy. But they do come back later and convert through other channels.

Those sales are asymmetrical and asynchronous.

Unfortunately, I find that very few do attribution well if at all. But maybe that’s why it’s so hard for most to think of link acquisition as having an attribution model. Adding to the problem, many of the touch points are invisible.

You don’t know who saw a Tweet that led to a view of a piece of content. Nor whether they later saw an ad on Facebook. Nor whether they dropped by your booth at a trade show. Nor whether they had a conversation with a colleague at a local event. Nor whether they visited the site and read a secondary piece of content.

You see, links don’t suddenly materialize. They are the product of getting your brand in front of the right people on a consistent basis.

Proof?

Proof is in the Pudding

That blog I talked about above? Here’s what referring domains for the site look like over the past year.

Referring Domains Graph

Here’s the graph for that David vs Goliath client who I convinced to invest in top of funnel content.

Referring Domains Graph All Time

Of course you can see that ahrefs had a bit of an anomaly in January of this year and started finding more referring domains for all sites. But the rate of acquisition for these two sites was higher than for the average site I’ve analyzed.

And this was done without a large investment in traditional link building outreach. In one case, there was essentially no traditional link building.

Links equal Recommendations

I think we forget about why and how people wind up linking. Remember that links are essentially a citation or an endorsement. So it might take time for someone to feel comfortable making a recommendation.

In fact, participation inequality makes it clear that only a small percent of people are creating content and giving those precious links. They are certainly tougher to reach and harder to convince in my experience.

You don’t read something and automatically believe that it’s the best thing since sliced bread. (Or at least you shouldn’t.) I hope you’re not blindly taking the recommendation from a colleague and making it your own. Think about how you give recommendations to others offline. Seriously, think about why you made your last recommendation.

Recommendations are won over time.

Action Items

Finding Nemo Now What Scene

You might be convinced by my thesis but could be struggling to figure out how it helps you. Here’s what I’d offer up as concrete takeaways.

Stop measuring content solely on links acquired

I’m not saying you shouldn’t measure links to content. You should. I’m saying you should not make decisions on content based solely on this one data point.

Start measuring your activity

I’d argue that certain activity levels translate into link acquisition results. How many pieces of content are you producing each month? How much time are you dedicating to the marketing of that content? My rule of thumb is at least as much time as you took producing it. I’ve seen others argue for three times the time it took to produce it.

Want to get more detailed? Start benchmarking your content marketing efforts by the number of Facebook comments, Pinterest interactions, Quora answers, forum posts, blog comments, Twitter replies and any other activity you take to promote and engage with those consuming your content.

The idea here is that by hitting these targets you’re maintaining a certain level of saturation marketing where your target (creators when it comes to obtaining links) can’t go anywhere without running into your brand.

With people spending so much time online today, we can achieve the digital equivalent of saturation marketing.
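To make the benchmarking idea concrete, here’s a rough sketch of how you might track activity against monthly targets. The channel names and numbers are made up purely for illustration.

```python
# Hypothetical monthly promotion targets and actuals (illustrative only).
targets = {"twitter_replies": 40, "blog_comments": 20, "forum_posts": 10}
actual = {"twitter_replies": 25, "blog_comments": 22, "forum_posts": 4}

def benchmark(targets, actual):
    """Return each channel's activity as a percent of its target."""
    return {ch: 100 * actual.get(ch, 0) / goal for ch, goal in targets.items()}

report = benchmark(targets, actual)
for channel, pct in report.items():
    print(f"{channel}: {pct:.0f}% of target", "(behind)" if pct < 100 else "(on track)")
```

The point isn’t the code, it’s the discipline. If a channel is chronically behind target, your saturation level drops and link acquisition usually follows.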

Use an attribution model

While not about links per se, getting comfortable with attribution will help you feel better about your link acquisition efforts and make it easier to explain it to management.

Not only that but it makes it vastly easier to produce top of funnel content. I’m having conversations where clients purposefully avoid top of funnel query classes because they don’t look good on a last click attribution basis.

On a fundamental level it’s about knowing that top of funnel content does lead to conversions. And that happens not just for sales but for links too.
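If you want to see just how much last click undervalues top of funnel content, a quick sketch comparing last click to a simple linear (multi-touch) model makes it obvious. The conversion path and dollar value here are hypothetical.

```python
# One conversion path ending in a sale (channels are made up).
path = ["blog post", "facebook ad", "email", "direct visit"]
value = 1000.0  # conversion value in dollars

# Last-click attribution: the final touch gets all the credit.
last_click = {ch: 0.0 for ch in path}
last_click[path[-1]] = value

# Linear attribution: every touch shares credit equally.
linear = {ch: value / len(path) for ch in path}

print(last_click)  # the blog post gets $0 of credit
print(linear)      # the blog post gets $250 of credit
```

Neither model is “right,” but the gap between them is exactly the gap between how top of funnel content looks in your reports and what it actually does.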

TL;DR

Content plays an important role in securing links. Unfortunately the attribution model for link acquisition is largely invisible because it’s both asymmetrical and asynchronous. That means your content can’t be measured by a myopic links-earned metric.

Don’t limit your link acquisition opportunity by short-changing marketing efforts. Link acquisition is about the sum being greater than the parts. Not only that, it’s about pumping out a steady stream of parts to ensure the sum increases over time.

Query Syntax

February 11 2019 // SEO // 16 Comments

Understanding query syntax may be the most important part of a successful search strategy. What words do people use when searching? What type of intent do those words describe? This is much more than simple keyword research.

I think about query syntax a lot. Like, a lot a lot. Some might say I’m obsessed. But it’s totally healthy. Really, it is.

Query Syntax

Syntax is defined as follows:

The study of the patterns or formation of sentences and phrases from words

So query syntax is essentially looking at the patterns of words that make up queries.

One of my favorite examples of query syntax is the difference between the queries ‘california state parks’ and ‘state parks in california’. These two queries seem relatively similar right?

But there’s a subtle difference between the two and the results Google provides for each makes this crystal clear.

Result for California State Parks

Results for State Parks in California Query

The result for ‘california state parks’ has fractured intent (what Google refers to as multi-intent) so Google provides informational results about that entity as well as local results.

The result for ‘state parks in california’ triggers an informational list-based result. If you think about it for a moment or two it makes sense right?

The order of those words and the use of a preposition change the intent of that query.

Query Intent

It’s our job as search marketers to determine intent based on an analysis of query syntax. The old grouping of intent as informational, navigational or transactional is still kinda sorta valid but overly simplistic given Google’s advances in this area.

Knowing that a term is informational only gets you so far. If you miss that the content desired by that query demands a list you could be creating long-form content that won’t satisfy intent and, therefore, is unlikely to rank well.

Query syntax describes intent that drives content composition and format.

Now think about what happens if you use the modifier ‘best’ in a query. That query likely demands a list as well. But not just any list, an ordered or ranked list of results.

For kicks why don’t we see how that changes both of the queries above.

Query Results for Best California State Parks

Query Results for Best State Parks in California

Both queries retain a semblance of their original footprint with ‘best california state parks’ triggering a local result and ‘best state parks in california’ triggering a list carousel.

However, in both instances the main results for each are all ordered or ranked list content. So I’d say that these two terms are far more similar in intent when using the ‘best’ modifier. I find this hierarchy of intent based on words to be fascinating.

The intent models Google uses are likely more in line with classic information retrieval theory. I don’t subscribe to the exact details of the model(s) described but I think it shows how to think about intent and makes clear that intent can be nuanced and complex.

Query Classes

IQ Test Pattern

Understanding what queries trigger what type of content isn’t just an academic endeavor. I don’t seek to understand query syntax on a one off basis. I’m looking to understand the query syntax and intent of an entire query class.

Query classes are repeatable patterns of root terms and modifiers. In this example the query classes would be ‘[state] state parks’ and ‘state parks in [state]’. These are very small query classes since you’ll have a defined set of 50 to track.
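Because query classes are repeatable patterns, they’re easy to generate programmatically. A minimal sketch, with a truncated state list for brevity:

```python
# Expand a query class template into the individual queries to track.
US_STATES = ["california", "utah", "texas"]  # all 50 in practice

def expand_query_class(template, values):
    """Fill the [state] placeholder in a template with each value."""
    return [template.replace("[state]", v) for v in values]

queries = expand_query_class("state parks in [state]", US_STATES)
print(queries)  # ['state parks in california', 'state parks in utah', ...]
```

Feed the output into your rank tracker and you’ve got a query index with consistent syntax across every state.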

What about the ‘best’ versions? What syntax would I use and track? It’s not an easy decision. Both SERPs have infrastructure issues (Google units such as the map pack, list carousel or knowledge panel) that could depress clickthrough rate.

In this case I’d likely go with the syntax used most often by users. Even this isn’t easy to ferret out since Google’s Keyword Planner aggregates these terms while other third-party tools such as ahrefs show a slight advantage to one over the other.

I’d go with the syntax that wins with the third-party tools but then verify using the impression and click data once launched.

Each of these query classes demands a certain type of content based on its intent. Intent may be fractured and pages that aggregate intent and satisfy both active and passive intent have a far better chance of success.

Query Indices

Devil Is In The Details

I wrote about query indices or rank indices back in 2013 and still rely on them heavily today. In the last couple of years many new clients have a version of these in their dashboard reports.

Unfortunately, the devil is in the details. Too often I find that folks will create an index that contains a variety of query syntax. You might find ‘utah bike trails’, ‘bike trails utah’ and ‘bike trails ut’ all in the same index. Not only that but the same variants aren’t present for each state.

There are two reasons why mixing different query syntax in this way is a bad idea. The first is that, as we’ve seen, different types of query syntax might describe different intent. Trust me, you’ll want to understand how your content is performing against each type of intent. It can be … illuminating.

The second reason is that the average rank in that index starts to lose definition if you don’t have equal coverage for each variant. If one state in the example performs well but only includes one variant while another state does poorly but has three variants then you’re not measuring true performance in that query class.

Query indices need to be laser focused and use the dominant query syntax you’re targeting for that query class. Otherwise you’re not measuring performance correctly and could be making decisions based on bad data.
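Here’s a toy example of the math behind that second reason. The ranks are invented, but they show how uneven variant coverage drags an index around:

```python
def average_rank(ranks):
    """Average position across the queries in an index."""
    return sum(ranks) / len(ranks)

# State A performs well but contributes one variant to the index;
# state B performs poorly and contributes three variants.
mixed_index = [2, 18, 20, 22]
print(average_rank(mixed_index))  # 15.5 -- dominated by state B's extra variants

# One dominant variant per state reflects true performance in the class.
clean_index = [2, 18]
print(average_rank(clean_index))  # 10.0
```

Same underlying performance, two very different index values. That’s the definition loss I’m talking about.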

Featured Snippets

Query syntax is also crucial to securing the almighty featured snippet – that gorgeous box that sits on top of the normal ten blue links.

There has been plenty of research in this area about what words trigger what type of featured snippet content. But it goes beyond the idea that certain words trigger certain featured snippet presentations.

To secure featured snippets you’re looking to mirror the dominant query syntax that Google is seeking for that query. Make it easy for Google to elevate your content by matching that pattern exactly.

Good things happen when you do. As an example, here’s one of the rank indices I track for a client.

Featured Snippet Dominance

At present this client owns 98% of the top spots for this query class. I’d show you that they’re featured snippets but … that probably wouldn’t be a good idea since it’s a pretty competitive vertical. But the trick here was in understanding exactly what syntax Google (and users) were seeking and matching it. Word. For. Word.

The history of this particular query class is also a good example of why search marketers are so valuable. I identified this query class and then pitched the client on creating a page type to match those queries.

As a result, this query class (and the associated page type) went from contributing nothing to 25% of total search traffic to the site. Even better, it’s some of the best performing traffic from a conversion perspective.

Title Tags

Homer Searching For The Any Key

The same mirroring tactic used for featured snippets is also crazy valuable when it comes to Title tags. In general, users seek out cognitive ease, which means that when they type in a query they want to see those words when they scan the results.

I can’t tell you how many times I’ve simply changed the Title tags for a page type to target the dominant query syntax and seen traffic jump as a result. The increase is generally a combination, over time, of both rank and clickthrough rate improvements.

We know that this is something that Google understands because they bold the query words in the meta description on search results. If you’re an old dog like me you also remember that they used to bold the query words in the Title as well.

Why doesn’t Google bold the Title query words anymore? It created too much click bias in search results. Think about that for a second!

What this means is that having the right words in the Title bolded created a bias too great for Google’s algorithms. It inflated the perceived relevance. I’ll take some of that, thank you very much.

There’s another fun logical argument you can make as a result of this knowledge but that’s a post for a different day.

At the end of the day, the user only allocates a certain amount of attention to those search results. You win when you reduce cognitive strain and make it easier for them to zero in on your content.

Content Overlap Scores

Venn Diagram Example

I’ve covered how the query syntax can describe specific intent that demands a certain type of content. If you want more like that check out this super useful presentation by Stephanie Briggs.

Now, hopefully you noticed that the results for two of the queries above generated a very similar SERP.

The results for ‘best california state parks’ and ‘best state parks in california’ both contain 7 of the same results. The position of those 7 shifts a bit between the queries, which means there is a 70% overlap in content between these two results.

The amount of content overlap between two queries shows how similar they are and whether a secondary piece of content is required.

I’m sure those of you with PTPD (Post Traumatic Panda Disorder) are cringing at the idea of creating content that seems too similar. Visions of eHow’s decline parade around your head like pink elephants.

But the idea here is that the difference in syntax could be describing different intent that demands different content.

Now, I would never recommend a new piece of content with a content overlap score of 70%. That score is a non-starter. In general, any score equal to 50% or above tells me the query intent is likely too similar to support a secondary piece of content.

A score of 0% is a green light to create new content. The next task is to then determine the type of content demanded by the secondary syntax. (Hint: a lot of the time it takes the form of a question.)

A score between 10% and 40% is the grey area. I usually find that new content can be useful between 10% and 20%, though you have to be careful with queries that have fractured intent. Because sometimes Google is only allocating three results for, say, informational content. If two of those three are the same then that’s actually a 66% content overlap score.

You have to be even more careful with a content overlap score between 20% and 30%. Not only are you looking at potential fractured intent but also whether the overlap is at the top or interspersed throughout the SERP. The former often points to a term that you might be able to secure by augmenting the primary piece of content. The latter may indicate a new piece of content is necessary.

It would be nice to have a tool that provided content overlap scores for two terms. I wouldn’t rely on it exclusively. I still think eyeballing the SERP is valuable. But it would reduce the number of times I needed to make that human decision.
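In the meantime, the calculation itself is trivial to sketch. The URLs below are fabricated; all that matters is that 7 of the 10 results are shared:

```python
# Content overlap score: the share of one SERP's results that also
# appear in the other SERP.
def content_overlap(serp_a, serp_b):
    """Percent of results in serp_a that also appear in serp_b."""
    shared = set(serp_a) & set(serp_b)
    return 100 * len(shared) / len(serp_a)

serp_a = [f"site{i}.com" for i in range(10)]           # top 10 for query A
serp_b = serp_a[3:] + ["o1.com", "o2.com", "o3.com"]   # 7 shared, 3 unique
print(content_overlap(serp_a, serp_b))  # 70.0
```

A score like that is a non-starter for a secondary piece of content; 0% is a green light; the grey area in between is where you still need to eyeball the SERP.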

Query Evolution

When you look at and think about query syntax as much as I do you get a sense for when Google gets it wrong. That’s what happened in August of 2018 when an algorithm change shifted results in odd ways.

It felt like Google misunderstood the query syntax or, at least, didn’t understand the intent the query was describing. My guess is that neural embeddings are being used to better understand the intent behind query syntax and in this instance the new logic didn’t work.

See, Google’s trying to figure this out too. They just have a lot more horsepower to test and iterate.

The thing is, you won’t even notice these changes unless you’re watching these query classes closely. So there’s tremendous value in embracing and monitoring query syntax. You gain insight into why rank might be changing for a query class.

Changes in the rank of a query class could mean a shift in Google’s view of intent for those queries. In other words, Google’s assigning a different meaning to that query syntax and sucking in content that is relevant to this new meaning. I’ve seen this happen to a number of different query classes.

Remember this when you hear a Googler talk about an algorithm change improving relevancy.

Other times it could be that the mix of content types changes. A term may suddenly have a different mix of content types, which may mean that Google has determined that the query has a different distribution of fractured intent. Think about how Google might decide that more commerce related results should be served between Black Friday and Christmas.

Once again, it would be interesting to have a tool that alerted you to when the distribution of content types changed.

Finally, sometimes the way users search changes over time. An easy example is the rise and slow ebb of the ‘near me’ modifier. But it can be more subtle too.

Over a number of years I saw the dominant query syntax change from ‘[something] in [city]’ to ‘[city] [something]’. This wasn’t just looking at third-party query volume data but real impression and click data from that site. So it pays to revisit assumptions about query syntax on a periodic basis.
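If you have impression data exported, spotting that kind of shift can be as simple as bucketing queries by pattern. Everything here — queries, impression counts, regexes — is a made-up illustration, not data from the site in question:

```python
import re

# Hypothetical rows exported from impression data (e.g. Search Console).
rows = [
    {"query": "restaurants in boston", "impressions": 1200},
    {"query": "boston restaurants", "impressions": 3400},
    {"query": "restaurants in denver", "impressions": 900},
    {"query": "denver restaurants", "impressions": 2100},
]

# Simple patterns for the two syntax variants being compared.
patterns = {
    "[something] in [city]": re.compile(r"^\w+ in \w+$"),
    "[city] [something]": re.compile(r"^\w+ \w+$"),
}

totals = {name: 0 for name in patterns}
for row in rows:
    for name, rx in patterns.items():
        if rx.match(row["query"]):
            totals[name] += row["impressions"]
            break  # count each query against its first matching pattern

print(totals)
```

Run that periodically and a shift in the dominant syntax shows up as a shift in the impression totals, long before it shows up in your traffic.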

TL;DR

Query syntax is looking at the patterns of words that make up queries. Our job as search marketers is to determine intent and deliver the right content, both subject and format, based on an analysis of query syntax.

By focusing on query syntax you can uncover query classes, capture featured snippets, improve titles, find content gaps and better understand algorithm changes.

TL;DC

(This is a new section I’m trying out for the related content I’ve linked to within this post. Not every link reference will wind up here. Only the ones I believe to be most useful.)

Query Classes

Aggregating Intent

Creating Rank Indices

Neural Embeddings

Hacking Attention

A Language for Search and Discovery

Search Driven Content Strategy

 

The end. Seriously. Go back to what you were doing. Nothing more to see here. This isn’t a Marvel movie.

 

What I Learned In 2018

January 29 2019 // Career + Life + SEO // 12 Comments

(This is a personal post so if that isn’t your thing then you should move on.)

2018 was a satisfying year because many of the issues that I surfaced last year and in prior years were resolved. I moved my business to expertise retainers, was more comfortable with success and stopped beating myself up (and others) for not being super human.

I had a lot less angst, guilt and was generally a lot happier.

Expertise Retainers

A Very Particular Set of Skills

One of the biggest and most successful changes in the business was moving away from any hourly rates or guarantees. In 2017 I had grown weary of the conversations about how many hours I’d worked and whether that was enough to satisfy the retainer.

Now, to be honest, there weren’t a lot of those conversations but there were enough that it bugged me. So I upped my retainer rates and moved to a pure value-based arrangement.

It was no longer about how many hours I put in but how much value I could deliver. It didn’t matter if that value was delivered in 10 minutes if it meant a 30% increase in traffic. I get paid based on my expertise or … my very particular set of skills.

What this also seems to do is match me with like-minded clients. Many instantly understood that time spent wasn’t the right metric to measure. So it came down to whether they trusted that I had the expertise.

The result is more productivity. Not so much because I’m more productive but that there’s less time spent convincing and more time spent implementing.

Thinking

the-thinker-rodin

I regularly chat with Zeph Snapp to discuss business and life. One of the things he said years ago was that my goal should be to get paid to think. I really liked the sound of that.

Expertise retainers get close to realizing that goal. Because part of my expertise is the way I think about things. I have a natural ability to see patterns and to take disparate pieces of information and come to a conclusion.

I used to think this was no big deal. Doesn’t everyone see what I see? The answer to that is no. I’m not saying I’m some mentalist or massive smarty pants. I’m just adept at identifying patterns of all sorts, which happens to be meaningful in this line of work.

More importantly, I’m able to communicate my thinking in a way that people seem to understand. Most of the time this takes the form of analogies. But sometimes it’s just describing, step by step, how I figured something out.

The value isn’t just what I do, but how I do it.

Everything Takes Longer

Sloth Crossing Street

Last year my goal was to launch two sites and a tool in collaboration with others. That didn’t happen. Instead, I was able to launch one site in the fourth quarter of 2018.

The truth of the matter is that everything takes longer than you think it will. That report you think is going to take you 15 minutes to crank out takes 30 minutes instead. Now, that might not seem like a lot individually. But it adds up quickly.

It extends even longer when you’re counting on others to realize your vision. As you’ll see later on, I’m not blaming anyone here. But you can’t move the ball forward when one of your collaborators goes dark.

No matter how many times I internalize the ‘everything takes longer than expected’ truth I am still surprised when it surfaces like a shark fin slicing through calm water. I don’t know if that’s a shortcoming or if I’m just perpetually optimistic.

Time is Bendy

This might sound like a Doctor Who quote but that’s not where this is going. While everything seems to take longer than you expect, in retrospect it also seems like you’ve done quite a lot in a short amount of time.

Time is a strange beast.

When those 1099-MISCs start rolling in I realize just how many clients I worked with in a given year. Then I might go through the litany of different projects that I took on that year. It turns out I was very busy and very productive.

So while it never feels like you’re making huge strides while you’re in the thick of things you can look back and see just how far you’ve come. This is the same feeling I get when hiking or cycling.

A view from the top

It doesn’t seem like you’re climbing that much but then you turn around and see how far you’ve gone and can admire the stunning view.

Response Times

One of the things I’ve battled for ages is the speed with which I reply to email. Worse, the emails I don’t respond to at all are from people I’d like to help. People I don’t want to say no to but … should. I just don’t have the time.

So I’ll take that initial call and I’ll promise a proposal. I have the best intentions. But in the end I am deep into working and when I think about sending that proposal I can only think about how I’ll fit that work in if they say yes. So I put it off.

Those emails just sit there. Potential work and, more importantly, the promise of help are left dangling. I generally keep those threads as unread mail. Today I have four unread items in my inbox. They are all folks I just … ghosted.

Ghosted

I keep those threads as unread to remind me. Not so much to beat myself up but to ensure that I don’t get into those spots in the future. I can only do so much and while I’d like to do more I know I simply can’t.

If you are one of those four, I apologize. I still think about your projects. I’m happy when I see you mentioned in a mainstream article. I sincerely wish you the best.

Think It, Do It

Just Do It

The good news is that I’m vastly better at responding to most other email. I often got into the habit of thinking about what I have to do. Or thinking about how I’m going to respond, essentially typing up the response in my head.

I’ve gotten much better at identifying when I’m doing this and instead actually do it. This has been really transformative. Because I find that it’s often the little things that build up and start weighing me down.

I know many would say that I should focus on the most impactful project first. But that hasn’t worked for me. It makes me less productive if I know there are six other things I need to get to. They all might be smaller tasks but my brain is crunching away on that stuff in the background.

It’s like firing up the Activity Monitor on your computer and seeing all those rogue processes spinning away drawing down the computing power. I need to close those out so I can get more computing power back.

I feel better when I get those small things done. It’s a mini victory of sorts. I can take that momentum and roll it into the larger projects I need to tackle.

Framing

Framing

I realized that I’m incredibly good at framing. Not the artistic kind but the psychological kind.

For instance, I often tell people that I won the cancer lottery. If you’re going to get cancer, follicular lymphoma is the three cherries variety. I’ll die of something else long before this type of cancer takes me down.

I do this all the time. It’s not that I don’t acknowledge that something is tough or troubling. But how you frame it makes a huge difference in how you handle that situation.

Framing is marketing to yourself.

Framing doesn’t change the facts but it does change … how you perceive reality. I acknowledge that it’s a hell of a lot easier to do this when you’re white and financially secure. But I’ve done it my entire life. (Granted, I’ve always been white but not always financially secure.)

I moved out to San Diego with my now wife and we spent a year without a couch. We didn’t have enough money to go to full price movies. But we were together in beautiful San Diego.

I framed the move from Washington, D.C. to San Diego as an adventure. I framed it as doing something the vast majority don’t. So even if things didn’t work out, the attempt was worth it. The way I framed it, even failure was a success! It seems laughable. I mean, seriously, I’m chuckling to myself right now.

But by framing it that way I was able to enjoy that time so much more. I was able to be less stressed about the eventual outcome and instead just be present in the moment.

Juggling

Feeling Overwhelmed

I finally overcame my guilt of dropping the communications ball. The fact of the matter is that most of us are juggling a lot. And there are plenty of times when I’m on the receiving end of not getting a response.

A friend will put me in touch with someone and I’ll respond with some meeting times. Then I don’t hear from them for a month or more. Eventually they surface and apologize for the delay.

I’ll wave off the apology. “No worries, I totally understand.” Then we pick up where we left off and see where things go.

I guess I’ve realized that people are far more forgiving about these things. I don’t think anyone intentionally decides they’re going to drop that email thread. Things just … happen.

Because, everything takes more time than you think it will. (See what I did there.)

Success

Soup Dragon's Video Screencap

The business, which was already crazy good, continued to grow.

For a long time part of me figured that people resented my success. Why him and not me? And you know what, those people might be out there. But I no longer think that’s the majority.

In part, this is a realization that my success does not mean that others won’t find their own. This isn’t a zero sum game of people at the top and others at the bottom. I found a niche and others will and have found their own.

There are multiple pathways to success, even within our own small industry. And I’m more than happy to chat with other consultants and give them advice and document templates. There’s more than enough business out there.

Does the income disparity between myself and the average American still make me uneasy? Hell yeah. But me feeling guilty about spending the money I earn doesn’t do much about that except make me less happy.

Guilt is not a good form of activism.

I’m not a big consumer anyway. I don’t rush out to get the new phone or the new TV or the coolest clothes. I eat out a bit more. I travel. I donate more too. That doesn’t earn me gold stars, it’s just what it is.

What I did instead was register marginaltaxratesexplained.com the other week. So please get in touch if you’re a developer or designer who has any interest in educating folks on this topic. Because most people don’t get it.

SEO Success

Last year I managed to launch one out of three ventures. It might sound like I was disappointed but in reality I think one out of three is pretty damn good. (Framing in action folks.)

The one I did manage to launch got traffic right off the bat. And each week it gets more. All this with less than 50 pages of content! It was really a proof of concept for a much larger idea. So 2019 will be about scaling.

I’m super excited about this site. But what it really did was confirm just how effective SEO can be when you approach it correctly. There’s so much opportunity!

There’s a whisper campaign out there about how difficult SEO is getting. The SERPs are getting crowded out by ads and Google is taking away more clicks. It’s even worse on mobile where there’s less screen real estate right?

Sorry, but the sky is not falling. I’m not saying there aren’t challenges. I’m not saying things haven’t changed. It just means we need to change and adapt. Too many are still conducting business using Panda and Penguin as their guardrails.

SEO is easy when you understand how and why people are searching and work to satisfy their intent. That’s a bit of a simplification but … not by much. Target the keyword, optimize the intent. It’s been my mantra for years.

It’s great when you use this approach with a client, make a big bet, and see it pay off.

Rank Index Success Example

The graph above is the result of launching a geographic directory on a client site. Not only has the average rank for this important query class moved from the low teens to approximately four but the conversion rate increased by 30% or more for these queries.

More traffic. Better traffic.

What shouldn’t be downplayed here is that the requirements for the new page type were built around what users searching would expect to see when they landed. SEO was the driving force for product requirements.

SEO isn’t just about keyword research but about knowing what users expect after typing in those words.

Habits

Going into 2019 I’m focusing more on habits. In the past I’ve had explicit goals with varying degrees of success in achieving them.

I have 2019 goals but I also list the habit or habits that will help me reach each goal. I wound up putting on a lot of the weight I lost in 2017. So this year I’m going to lose 32 pounds and hit my target weight of 160.

To do that I’m going to journal my food and weigh myself every day. When I do those things, I know I have a much better chance of reaching that goal and maintaining it. Frankly, getting there is usually easy. I’m already down 12 pounds. Maintenance is more difficult.

Another example is my desire to read more. This is something I want to do but have fallen short of in recent years. But this time I decided the habit to change was to read before bed instead of falling asleep to the TV.

I already use this methodology with a number of clients, whether it be in maintaining corpus control or in developing asynchronous link-building campaigns. So what’s good for the goose should be good for the gander, right?

Adapting

Adapt or Die

If you read through my ‘What I Learned’ series I think you’ll see that I am good at adapting to situations. In 2018, that was once again put to the test.

I took a nearly month long vacation in Europe. We went to London, Paris, Venice and the South of France. (As an aside, this was a no-work vacation and as such I did not bill clients for that month off. So it’s amazing that the business grew by 20% while I only billed 11 months of work.)

As a family we had a vision of what our vacation would be like. My wife had various ‘walking guides’ to the cities we’d be visiting. We couldn’t wait to go and imagined ourselves trekking around and exploring the rich history of each city.

But a few weeks before we were set to leave my daughter dislocated her kneecap. We were at a court warming up between tournament matches when she suddenly crumpled to the ground, howling in pain.

She had this same injury twice before so we knew the time to recover would extend well into our trip. She wouldn’t be able to walk for any long period of time. But here’s the thing. Instead of thinking about how awful it was going to be, we simply figured out a way to make it work.

I bought foldable canes and we rented a wheelchair when we were in London. It wasn’t what we planned but it worked out amazingly well. I pushed her around London in the wheelchair and you’d be amazed at how many lines you can cut when your child is in that chair or has on a brace and limps around using a cane.

I kid you not, when we went to Versailles, the line to get in was horrendous. Hours for sure. I got in line while my wife and daughter (limping with her cane) went to the front to ask if there was a wheelchair available. The result? We jumped that line and got to see some of the back rooms of Versailles as we secured her wheelchair.

Here’s the back room entrance to the Palace of Versailles.

Back Room Entrance at Palace of Versailles

And here’s the crazy ass key that still opens that door.

Key to Versailles

The point here is that you have to deal with the reality that is right in front of you and not what you hoped it might be. When you embrace the here and now it can turn out to be pretty awesome.

If you take anything away from this post I hope it is this. Because nothing good comes from trying to navigate life when you’re constantly thinking it should have been different.

But that wasn’t what really pushed our ability to adapt. Instead, it was what happened the first night we were in our villa in the South of France.

The Long Story

(Seriously, this is a long story so if you want to bail now that’s cool. I’m going to try to keep it short but it’s still going to be long. I think it ties things together but you might disagree. So … you’ve been warned.)

We rented a gorgeous villa in Saint-Raphaël with a pool and a gorgeous view. It was going to be the relaxing part of a very busy vacation.

I was asleep on the couch downstairs (because I snore) when my wife woke me up by yelling, “AJ, there’s someone in the house!” Heart pounding, I bounded upstairs and saw the briefest of motion to my right and ran to where the sliding glass door was open. I guess I was chasing the burglar out?

I didn’t see much so I ran back inside and checked on my wife (who was fine and, incidentally, a badass) and then immediately went back downstairs to check on my daughter who was in an entirely different room. She was fine and still asleep.

We composed ourselves and took inventory. The burglar had stolen some jewelry, our phones, my wallet and my backpack, which had … our passports. Ugh! They’d pulled my wife’s suitcase out of her room and had rummaged through it and were going back to do the same with mine when my wife woke up and scared him off.

In short, someone had broken into our villa while we slept and robbed us. It was scary as fuck. But it all could have been a whole lot worse. No one was hurt. You can always get ‘things’ back.

And we did almost instantly. The guy must have been so freaked at being chased that he’d dropped my wife’s purse as he fled. I found it just outside on the balcony. Inside? Her wallet and brand new camera! Losing the wallet would have been one thing but the thought of losing a whole trip worth of photos would have been a real blow.

We started making calls, struggling through the international dialing codes while adrenaline continued to course through our veins. We called the property manager, our travel insurance provider and my credit card companies.

It was 3 in the morning so the first few hours weren't that productive but it allowed us to calm down and come up with a plan of action. By 7 am we were starting to hear from everyone and the wheels were put into motion.

Our contact for the rental was Brent Tyler, a Brit who was quite the character. He was always ‘on’ and had a witty response for damn near everything. He’d even written a book about moving from Cookham to Cannes. But what mattered that day was that he spoke fluent French, which was going to be instrumental in helping deal with the local police.

Because that’s what we had to do. The local police came by and then they sent the CSI team later on to take prints and DNA evidence.

French CSI

Dusting for Prints

Then we had to go to Fréjus to file a police report.

It was a small station fortified by two massive lucite-looking doors where you had to be buzzed in. The police officer was a French female version of a stereotypical lazy sheriff. She wasn't keen to do much for tourists.

But that all changed when she met Brent.

Oh, she had a thing for him! So here I am watching these two flirt as they go through the list of items that were stolen. His French is good but not perfect and she finds that endearing. She’s asking what something means and he’s trying to find the right words to describe it.

I know the French word for yes is ‘oui’ but quickly learn that ‘yeah’ is ‘ouais’ which sounds like ‘whey’. Because this is how Brent responds when he and this police officer settle on something. “Whey, whey, whey, whey” Brent nods as the police officer grins.

It is an odd thing to be in such an awful situation but see these ebullient interactions. I didn’t know whether to be annoyed or happy for the distraction.

Either way we were able to get the report filed, which was particularly good for insurance purposes. Check that off our list and move on. We were feeling good about things.

That's saying a lot too because Brent never told us to keep all the steel shutters down at night. Hell, we didn't even know the place came with steel shutters! If we'd been told, no one could have broken in. So we had to rely on someone who we were a bit angry with at the time. I think we all figured out a way to make it work and that's sort of the point.

On the way back to the villa we stopped to get passport photos. Because the next day we had to drive to the U.S. Consulate in Marseille to get new passports. Here’s what I looked like in those photos.

French Passport Photos

They tell you not to smile so I look both tired and pissed off. It’s a nice Andy Warhol type effect though and looking at it now actually makes me smile.

Later that day, someone buzzed at the front gate of the villa and asked if I was there. Who the hell was asking for me here? But it soon became clear that this gentleman had found my driver’s license.

I let him in and learned that he too had been burgled last night along with two others in the neighborhood. They’d taken his glasses and some expensive photography equipment. He was from the Netherlands and said his son found my license out by their trash cans in the morning.

I thanked him profusely and once he left went out to see if I could locate any other items. I trekked up and down those windy roads. I didn’t find anything, though I did meet some very friendly goats.

Friendly French Goats

The next day we drove to Marseille, which was over two hours away. It was a stressful trip.

Things are just different enough to make things difficult. What button do I press and how much do I have to pay at this toll? Why isn’t it working!? What am I doing wrong?! There are cars behind us!

Maybe it was our mood or perhaps it was the area of town but … Marseille was not my jam. It all felt a bit sketch. But again, perhaps my paranoia was just at a high point that day.

We had an appointment at the U.S. Consulate but even then it was like entering some nuclear bunker. The guardhouse had a “sniper map” with a picture of their view of the street in grid format. So if there’s a threat approaching they could simply call in a grid code and, well, I’m not sure what happens but I figure it would be like something out of Sicario.

Past the guardhouse we were led into an interior room where you can't take anything electronic. At this point it doesn't feel like those movies where you run to the embassy for assistance and they say "you're okay, now you're on American soil." No, it was the TSA on steroids instead.

Once inside it turned out to be a pretty mundane room that, apparently, hadn’t been updated since the late 80s. A state department worker tells us that we can start the process of getting new passports by filling out the forms online. Oh, and those passport photos we got aren’t going to work. It’s a total scam. They’ll take our photos here instead.

My wife and I start filling out the forms online and just as we’re about to move on to my daughter’s passport the state department woman barges out and tells us to stop. It’s … dramatic. She’s just received a call that someone, a neighbor, has found our passports!

Yes, while we are there applying for new passports, someone called to tell us they found our stolen passports. This neighbor called the police in Fréjus who said they had no information on lost passports. (Yeah, not true!) So he took the next step and called the U.S. Embassy in Paris, who then put him through to our contact in Marseille.

I am in awe that this stranger went to these lengths and at the incredible timing of his call. The state department contact tells us that this is only the second time in ten years that this has happened.

She goes on to tell us that these break-ins are a huge problem in the area and have been getting worse over the past few years. They come in through the forest to avoid the gates that bar entrance to the community on the road. She describes a pile of found credit cards and passports at the end of every season.

She checks to make sure that our new passport requests haven’t gone through and we arrange to meet with our neighbor later that day when we return. Things are looking up so we take the scenic way home and spend a few hours at the beach in La Ciotat.

Once home we meet up with our neighbors who tell us my passport case was hidden in his wheel well. Not only are the passports there but they missed the cash I’d stuffed into one of the interior pockets. Bonus!

Our neighbors are very funny and kind. They also tell us that they too were burgled many years ago and that’s why they had steel shutters installed. Ah, if only we’d known.

Sleeping in the villa is still difficult but … we make it work and try to have as much fun as we can. Not having our phones is a pain but my daughter’s phone and the iPad were left untouched so we’re still digitally functional.

But it’s not quite over.

On Monday we get an email confirming that our passports have been cancelled. What the hell! It turns out the online forms we’d filled out were, in fact, submitted. So the next few days are spent talking and emailing with our state department contact.

She is clearly embarrassed that she sent us home only to get this notice a few days later. She reaches out to DHS and asks them to undo the cancellation. Our contact even sends me a snippet of her Skype conversation where the DHS says that they’re not supposed to do that anymore but … they’ll make an exception.

So it seems like we’re in the clear. The problem is she isn’t quite sure if the new status will propagate through the entire border control database before we depart. There’s a chance we go to leave via Charles de Gaulle and are suddenly being swarmed by folks with guns wearing body armor.

The odds are that won’t happen but it’s still hard not to think about that potential outcome. At some point I just figured that if the worst did happen it would mean another week at a hotel and a few more days in Paris. It might be inconvenient and expensive but things would work out.

Of course, nothing of the sort happened. We handed a stone-faced man our passports and he stamped them, and with a sigh of relief we went to get something to eat before we boarded the plane.

The Take Aways

See, I told you it was a long story. But here’s the thing. I still think of that vacation as being … great. I could certainly frame it differently. I could frame it as how our grand vacation was ruined by this awful event. But I don’t. What does that accomplish?

I am not saying everything happens for a reason. I hate that saying. Instead, I’d simply say that chaos is the general thread of all life. How you handle it is what matters.

I also think of all the people that helped us. Sure there was the dirtbag who broke in and stole our stuff but there were more people who chipped in to get us back on our feet. Even the dirtbag didn’t hurt anyone and actually left our passports in a place where they were likely to be found. I’d like to believe that was on purpose.

I was also able to see that my anger at Brent wasn’t useful. I could tell he felt like shit and was willing to do what he could to assist us as a result. Even the French police officer who didn’t seem to care … came through in her own way.

Now, I don’t think these things happen just by accident. I don’t think we would have received as much help as we did if we weren’t hustling to help ourselves, to be our own advocate and to ask for what we needed. Like I said, the thread of every life is chaos. It’s not if something is going to happen it’s when.

So it’s up to you to do as much as you can. When others see that you’re willing to try, they try too. Can it be that simple? I don’t know.

Conversely, it also struck me that this incident was both a big deal and meaningless at the same time. At the end of the day, it does turn into a story. It’s fodder for a blog post. Lives go on. Business continues. No one truly cares. I mean, people care but … it’s not a huge deal.

There were three other families who had the same experience. What I went through was not unique. That is oddly comforting. Just as it is when I think about my business issues. They are not unique. They’re still important but I try not to take them too seriously.

I took two other things away from this experience that might not be apparent from my narrative. The first is that exceptions can be made so everyone doesn’t get the same treatment.

While there’s no guarantee that you’ll be the exception to that rule, you never know unless you ask. Ask nicely but never settle. Never stop pushing because you’re not bumping up against something like gravity or the first law of motion. These are not immutable laws. They are rules made by imperfect humans. Sometimes they can change or be bent.

The second take away was that you need the help of others to reach your goals. I am perpetually grateful to the many folks who helped me get to where I am and continue to help me to this day. But it goes beyond that. Historically, I am very bad at letting go of things. I like doing things myself. I get fed up easily and feel like many are simply allergic to work.

But I was put in a situation where I needed the guy who spoke French and the woman fighting to un-cancel our passports. I couldn’t do those things. So it’s one thing to know that others help you achieve your goals but it’s quite another to experience it first hand.

As a result I've been able to take my hands off the reins a lot more and let others do what they're good at, leaving me more time to … think.

Algorithm Analysis In The Age of Embeddings

November 19 2018 // Analytics + SEO // 55 Comments

On August 1st, 2018 an algorithm update took 50% of traffic from a client site in the automotive vertical. An analysis of the update made me certain that the best course of action was … to do nothing. So what happened?

Algorithm Changes Google Analytics

Sure enough, on October 5th, that site regained all of its traffic. Here’s why I was sure doing nothing was the right thing to do and why I dismissed any E-A-T chatter.

E-A-T My Shorts

Eat Pant

I find the obsession with the Google Rating Guidelines to be unhealthy for the SEO community. If you’re unfamiliar with this acronym it stands for Expertise, Authoritativeness and Trustworthiness. It’s central to the published Google Rating Guidelines.

The problem is those guidelines and E-A-T are not algorithm signals. Don’t believe me? Believe Ben Gomes, long-time search quality engineer and new head of search at Google.

“You can view the rater guidelines as where we want the search algorithm to go,” Ben Gomes, Google’s vice president of search, assistant and news, told CNBC. “They don’t tell you how the algorithm is ranking results, but they fundamentally show what the algorithm should do.”

So I am triggered when I hear someone say they “turned up the weight of expertise” in a recent algorithm update. Even if the premise were true, you have to connect that to how the algorithm would reflect that change. How would Google make changes algorithmically to reflect higher expertise?

Google doesn’t have three big knobs in a dark office protected by biometric scanners that allows them to change E-A-T at will.

Tracking Google Ratings

Before I move on I’ll do a deeper dive into quality ratings. I poked around to see if there are material patterns to Google ratings and algorithmic changes. It’s pretty easy to look at referring traffic from the sites that perform ratings.

Tracking Google Ratings in Analytics

The four sites I've identified are raterlabs.com, raterhub.com, leapforceathome.com and appen.com. At present there are really only variants of appen.com, which rebranded in the last few months. Either way, create an advanced segment and you can start to see when raters have visited your site.
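If you'd rather work with a raw referrer export than the analytics UI, the same segment can be expressed as a small filter. A minimal sketch, assuming you have a list of referring domains pulled from your analytics tool (the domain list comes from the post; the export format is an assumption):

```python
import re

# Regex covering the rater domains above, plus any subdomains of them.
RATER_PATTERN = re.compile(
    r"(^|\.)(raterlabs|raterhub|leapforceathome|appen)\.com$"
)

def is_rater_referral(referrer_domain: str) -> bool:
    """Return True if a referring domain belongs to the quality rater program."""
    return bool(RATER_PATTERN.search(referrer_domain.lower()))

# Example: filter referring domains from an analytics export.
referrers = ["raterhub.com", "www.appen.com", "news.ycombinator.com"]
rater_hits = [d for d in referrers if is_rater_referral(d)]
```

The same regex, dropped into an advanced segment's referrer condition, does the job inside the analytics UI.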

And yes, these are ratings. A quick look at the referral path makes it clear.

Raters Program Referral Path

The /qrp/ stands for quality rating program and the needs_met_simulator seems pretty self-explanatory.

It can be interesting to then look at the downstream traffic for these domains.

SEMRush Downstream Traffic for Raterhub.com

Go the extra distance and you can determine what page(s) the raters are accessing on your site. Oddly, they generally seem to focus on one or two pages, using them as a representative for quality.

Beyond that, the patterns are hard to tease out, particularly since I’m unsure what tasks are truly being performed. A much larger set of this data across hundreds (perhaps thousands) of domains might produce some insight but for now it seems a lot like reading tea leaves.

Acceptance and Training

The quality rating program has been described in many ways so I’ve always been hesitant to label it one thing or another. Is it a way for Google to see if their recent algorithm changes were effective or is it a way for Google to gather training data to inform algorithm changes?

The answer seems to be yes.

Appen Home Page Messaging

Appen is the company that recruits quality raters. And their pitch makes it pretty clear that they feel their mission is to provide training data for machine learning via human interactions. Essentially, they crowdsource labeled data, which is highly sought after in machine learning.

The question then becomes how much Google relies on and uses this set of data for their machine learning algorithms.

“Reading” The Quality Rating Guidelines

Invisible Ink

To understand how much Google relies on this data, I think it’s instructive to look at the guidelines again. But for me it’s more about what the guidelines don’t mention than what they do mention.

What query classes and verticals does Google seem to focus on in the rating guidelines and which ones are essentially invisible? Sure, the guidelines can be applied broadly, but one has to think about why there’s a larger focus on … say, recipes and lyrics, right?

Beyond that, do you think Google could rely on ratings that cover a microscopic percentage of total queries? Seriously. Think about that. The query universe is massive! Even the query class universe is huge.

And Google doesn’t seem to be adding resources here. Instead, in 2017 they actually cut resources for raters. Now perhaps that’s changed but … I still can’t see this being a comprehensive way to inform the algorithm.

The raters clearly function as a broad acceptance check on algorithm changes (though I’d guess these qualitative measures wouldn’t outweigh the quantitative measures of success) but also seem to be deployed more tactically when Google needs specific feedback or training data for a problem.

Most recently that was the case with the fake news problem. And at the beginning of the quality rater program I’m guessing they were struggling with … lyrics and recipes.

So if we think back to what Ben Gomes says, the way we should be reading the guidelines is about what areas of focus Google is most interested in tackling algorithmically. As such I’m vastly more interested in what they say about queries with multiple meanings and understanding user intent.

At the end of the day, while the rating guidelines are interesting and provide excellent context, I’m looking elsewhere when analyzing algorithm changes.

Look At The SERP

This Tweet by Gianluca resonated strongly with me. There’s so much to be learned after an algorithm update by actually looking at search results, particularly if you’re tracking traffic by query class. Doing so I came to a simple conclusion.

For the last 18 months or so most algorithm updates have been what I refer to as language understanding updates.

This is part of a larger effort by Google around Natural Language Understanding (NLU), sort of a next generation of Natural Language Processing (NLP). Language understanding updates have a profound impact on what type of content is more relevant for a given query.

For those that hang on John Mueller’s every word, you’ll recognize that many times he’ll say that it’s simply about content being more relevant. He’s right. I just don’t think many are listening. They’re hearing him say that, but they’re not listening to what it means.

Neural Matching

The big news in late September 2018 was around neural matching.

But we’ve now reached the point where neural networks can help us take a major leap forward from understanding words to understanding concepts. Neural embeddings, an approach developed in the field of neural networks, allow us to transform words to fuzzier representations of the underlying concepts, and then match the concepts in the query with the concepts in the document. We call this technique neural matching. This can enable us to address queries like: “why does my TV look strange?” to surface the most relevant results for that question, even if the exact words aren’t contained in the page. (By the way, it turns out the reason is called the soap opera effect).

Danny Sullivan went on to refer to them as super synonyms and a number of blog posts sought to cover this new topic. And while neural matching is interesting, I think the underlying field of neural embeddings is far more important.

Watching search results and analyzing keyword trends you can see how the content Google chooses to surface for certain queries changes over time. Seriously folks, there’s so much value in looking at how the mix of content changes on a SERP.

For instance, the query ‘Toyota Camry Repair’ is part of a query class that has fractured intent. What is it that people are looking for when they search this term? Are they looking for repair manuals? For repair shops? For do-it-yourself content on repairing that specific make and model?

Google doesn't know. So it's been cycling through these different intents to see which of them performs the best. You wake up one day and it's repair manuals. A month or so later they essentially disappear.

Now, obviously this isn’t done manually. It’s not even done in a traditional algorithmic sense. Instead it’s done through neural embeddings and machine learning.

Neural Embeddings

Let me first start out by saying that I found a lot more here than I expected as I did my due diligence. Previously, I had done enough reading and research to get a sense of what was happening to help inform and explain algorithmic changes.

And while I wasn’t wrong, I found I was way behind on just how much had been taking place over the last few years in the realm of Natural Language Understanding.

Oddly, one of the better places to start is at the end. Very recently, Google open-sourced something called BERT.

Bert

BERT stands for Bidirectional Encoder Representations from Transformers and is a new technique for NLP pre-training. Yeah, it gets dense quickly. But the following excerpt helped put things into perspective.

Pre-trained representations can either be context-free or contextual, and contextual representations can further be unidirectional or bidirectional. Context-free models such as word2vec or GloVe generate a single word embedding representation for each word in the vocabulary. For example, the word “bank” would have the same context-free representation in “bank account” and “bank of the river.” Contextual models instead generate a representation of each word that is based on the other words in the sentence. For example, in the sentence “I accessed the bank account,” a unidirectional contextual model would represent “bank” based on “I accessed the” but not “account.” However, BERT represents “bank” using both its previous and next context — “I accessed the … account” — starting from the very bottom of a deep neural network, making it deeply bidirectional.

I was pretty well-versed in how word2vec worked but I struggled to understand how intent might be represented. In short, how would Google be able to change the relevant content delivered on ‘Toyota Camry Repair’ algorithmically?  The answer is, in some ways, contextual word embedding models.
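You can see the context-free versus contextual distinction in toy form. This is a deliberately simplified sketch (the vectors are made up and the "contextual encoder" is just a neighbor average, nothing like BERT's deep network), but it shows why the same surface word can end up with different representations:

```python
import numpy as np

# Made-up 2-d vectors for illustration only.
context_free = {
    "bank":    np.array([1.0, 0.0]),
    "account": np.array([0.8, 0.2]),
    "river":   np.array([0.1, 0.9]),
}

def contextual_vector(word, sentence, table, alpha=0.5):
    """Blend a word's static vector with the average of its sentence neighbors."""
    neighbors = [table[w] for w in sentence if w != word and w in table]
    context = np.mean(neighbors, axis=0)
    return (1 - alpha) * table[word] + alpha * context

# A context-free lookup returns the same vector for "bank" in both sentences.
# The contextual version does not.
v1 = contextual_vector("bank", ["bank", "account"], context_free)
v2 = contextual_vector("bank", ["bank", "river"], context_free)
```

Two different sentences, two different vectors for "bank" — which is exactly the behavior the BERT excerpt describes, just at a cartoon scale.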

Vectors

None of this may make sense if you don’t understand vectors. I believe many, unfortunately, run for the hills when the conversation turns to vectors. I’ve always referred to vectors as ways to represent words (or sentences or documents) via numbers and math.

I think these two slides from a 2015 Yoav Goldberg presentation on Demystifying Neural Word Embeddings do a better job of describing this relationship.

Words as Vectors

So you don’t have to fully understand the verbiage of “sparse, high dimensional” or the math behind cosine distance to grok how vectors work and can reflect similarity.

You shall know a word by the company it keeps.

That’s a famous quote from John Rupert Firth, a prominent linguist and the general idea we’re getting at with vectors.

word2vec

In 2013, Google open-sourced word2vec, which was a real turning point in Natural Language Understanding. I think many in the SEO community saw this initial graph.

Country to Capital Relationships

Cool right? In addition there was some awe around vector arithmetic where the model could predict that [King] – [Man] + [Woman] = [Queen]. It was a revelation of sorts that semantic and syntactic structures were preserved.

Or in other words, vector math really reflected natural language!
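The analogy arithmetic is easy to reproduce with toy vectors. These are hand-built 2-d vectors (dimension 0 loosely "royalty", dimension 1 loosely "gender") rather than trained embeddings, so this is a sketch of the idea, not word2vec itself:

```python
import numpy as np

# Hand-built toy vectors for illustration only.
words = {
    "king":  np.array([0.9,  0.9]),
    "queen": np.array([0.9, -0.9]),
    "man":   np.array([0.1,  0.9]),
    "woman": np.array([0.1, -0.9]),
}

def nearest(vec, table, exclude=()):
    """Return the word whose vector is most cosine-similar to vec."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in table if w not in exclude), key=lambda w: cos(vec, table[w]))

# [King] - [Man] + [Woman], excluding the input words as word2vec tooling does.
target = words["king"] - words["man"] + words["woman"]
result = nearest(target, words, exclude=("king", "man", "woman"))
```

With real trained embeddings (gensim's `most_similar`, for instance) the same arithmetic surfaces "queen" for exactly the reason shown here: the gender offset between king and queen mirrors the one between man and woman.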

What I lost track of was how the NLU community began to unpack word2vec to better understand how it worked and how it might be fine tuned. A lot has happened since 2013 and I’d be thunderstruck if much of it hadn’t worked its way into search.

Context

These 2014 slides about Dependency Based Word Embeddings really drive the point home. I think the whole deck is great but I'll cherry pick to help connect the dots and along the way try to explain some terminology.

The example used is looking at how you might represent the word ‘discovers’. Using a bag of words (BoW) context with a window of 2 you only capture the two words before and after the target word. The window is the number of words around the target that will be used to represent the embedding.

Word Embeddings using BoW Context

So here, telescope would not be part of the representation. But you don't have to use a simple BoW context. What if you used another method to create the context or relationship between words? Instead of simple words-before and words-after, what if you used syntactic dependency, a type of representation of grammar?

Embedding based on Syntactic Dependency

Suddenly telescope is part of the embedding. So you could use either method and you’d get very different results.

Embeddings Using Different Contexts

Syntactic dependency embeddings induce functional similarity. BoW embeddings induce topical similarity. While this specific case is interesting the bigger epiphany is that embeddings can change based on how they are generated.
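The two context types can be sketched in a few lines. The BoW window is computed mechanically; the dependency contexts are hand-labeled here because a real pipeline would get them from a syntactic parser, so treat those triples as illustrative:

```python
sentence = ["australian", "scientist", "discovers", "star", "with", "telescope"]

def bow_context(tokens, target, window=2):
    """Words within `window` positions of the target (bag-of-words context)."""
    i = tokens.index(target)
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]

# Hand-labeled (head, relation, dependent) triples; parser output looks similar.
dependencies = [
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep_with", "telescope"),
]

def dependency_context(deps, target):
    """Dependents of the target word, regardless of linear distance."""
    return [d for head, rel, d in deps if head == target]

bow = bow_context(sentence, "discovers")             # misses "telescope"
dep = dependency_context(dependencies, "discovers")  # includes "telescope"
```

Same sentence, same target word, two different sets of context words — and therefore two different embeddings once those contexts are fed into training.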

Google’s understanding of the meaning of words can change.

Context is one way, the size of the window is another, the type of text you use to train it or the amount of text it’s using are all ways that might influence the embeddings. And I’m certain there are other ways that I’m not mentioning here.

Beyond Words

Words are building blocks for sentences. Sentences building blocks for paragraphs. Paragraphs building blocks for documents.

Sentence vectors are a hot topic as you can see from Skip Thought Vectors in 2015 to An Efficient Framework for Learning Sentence Representations, Universal Sentence Encoder and Learning Semantic Textual Similarity from Conversations in 2018.

Universal Sentence Encoders

Google (Tomas Mikolov in particular before he headed over to Facebook) has also done research in paragraph vectors. As you might expect, paragraph vectors are in many ways a combination of word vectors.

In our Paragraph Vector framework (see Figure 2), every paragraph is mapped to a unique vector, represented by a column in matrix D and every word is also mapped to a unique vector, represented by a column in matrix W. The paragraph vector and word vectors are averaged or concatenated to predict the next word in a context. In the experiments, we use concatenation as the method to combine the vectors.

The paragraph token can be thought of as another word. It acts as a memory that remembers what is missing from the current context – or the topic of the paragraph. For this reason, we often call this model the Distributed Memory Model of Paragraph Vectors (PV-DM).

The knowledge that you can create vectors to represent sentences, paragraphs and documents is important. But it’s more important if you think about the prior example of how those embeddings can change. If the word vectors change then the paragraph vectors would change as well.

And that’s not even taking into account the different ways you might create vectors for variable-length text (aka sentences, paragraphs and documents).
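The simplest of those ways is just averaging: pool the word vectors of a document into one fixed-length vector. This is the baseline idea, not PV-DM from the quote above (which learns a dedicated paragraph vector), and the vectors and tokens are made up for illustration:

```python
import numpy as np

# Made-up word vectors; real ones would come from a trained model.
word_vectors = {
    "toyota": np.array([0.9, 0.1]),
    "camry":  np.array([0.8, 0.2]),
    "repair": np.array([0.2, 0.9]),
    "manual": np.array([0.1, 0.8]),
}

def document_vector(tokens, table):
    """Average the vectors of known tokens into one document-level vector."""
    known = [table[t] for t in tokens if t in table]
    return np.mean(known, axis=0)

doc = document_vector(["toyota", "camry", "repair", "manual"], word_vectors)
```

Notice the dependency: if Google changes how the word vectors are generated, every document vector built on top of them shifts too, which is the point of this section.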

Neural embeddings will change relevance no matter what level Google is using to understand documents.

Questions

But Why?

You might wonder why there’s such a flurry of work on sentences. Thing is, many of those sentences are questions. And the amount of research around question and answering is at an all-time high.

This is, in part, because the data sets around Q&A are robust. In other words, it’s really easy to train and evaluate models. But it’s also clearly because Google sees the future of search in conversational search platforms such as voice and assistant search.

Apart from the research, or the increasing prevalence of featured snippets, just look at the title Ben Gomes holds: vice president of search, assistant and news. Search and assistant are being managed by the same individual.

Understanding Google’s structure and current priorities should help future proof your SEO efforts.

Relevance Matching and Ranking

Obviously you’re wondering if any of this is actually showing up in search. Now, even without finding research that supports this theory, I think the answer is clear given the amount of time since word2vec was released (5 years), the focus on this area of research (Google Brain has an area of focus on NLU) and advances in technology to support and productize this type of work (TensorFlow, Transformer and TPUs).

But there is plenty of research that shows how this work is being integrated into search. Perhaps the easiest is one others have mentioned in relation to Neural Matching.

DRMM with Context Sensitive Embeddings

The highlighted part makes it clear that this model for matching queries and documents moves beyond context-insensitive encodings to rich context-sensitive encodings. (Remember that BERT relies on context-sensitive encodings.)

Think for a moment about how the matching model might change if you swapped the BoW context for the Syntactic Dependency context in the example above.

Frankly, there’s a ton of research around relevance matching that I need to catch up on. But my head is starting to hurt and it’s time to bring this back down from the theoretical to the observable.

Syntax Changes

I became interested in this topic when I saw certain patterns emerge during algorithm changes. A client might see a decline in a page type but, within that page type, some pages increased while others decreased.

The disparity there alone was enough to make me take a closer look. And when I did I noticed that many of those pages that saw a decline didn’t see a decline in all keywords for that page.

Instead, I found that a page might lose traffic for one query phrase but then gain back part of that traffic on a very similar query phrase. The difference between the two queries was sometimes small but clearly enough that Google’s relevance matching had changed.

Pages suddenly ranked for one type of syntax and not another.

Here’s one of the examples that sparked my interest in August of 2017.

Query Syntax Changes During Algorithm Updates

This page saw both losers and winners from a query perspective. We’re not talking small disparities either. They lost a lot on some but saw a large gain in others. I was particularly interested in the queries where they gained traffic.

Identifying Syntax Winners

The queries with the biggest percentage gains were with modifiers of ‘coming soon’ and ‘approaching’. I considered those synonyms of sorts and came to the conclusion that this page (document) was now better matching for these types of queries. Even the gains in terms with the word ‘before’ might match those other modifiers from a loose syntactic perspective.

Did Google change the context of their embeddings? Or change the window? I’m not sure but it’s clear that the page is still relevant to a constellation of topical queries but that some are more relevant and some less based on Google’s understanding of language.

Most recent algorithm updates seem to be changes in the embeddings used to inform the relevance matching algorithms.

Language Understanding Updates

If you believe that Google is rolling out language understanding updates then the rate of algorithm changes makes more sense. As I mentioned above there could be numerous ways that Google tweaks the embeddings or the relevance matching algorithm itself.

Not only that but all of this is being done with machine learning. The update is rolled out and then there’s a measurement of success based on time to long click or how quickly a search result satisfies intent. The feedback or reinforcement learning helps Google understand if that update was positive or negative.

One of my recent vague Tweets was about this observation.

Or the dataset that feeds an embedding pipeline might update and the new training model is then fed into the system. This could also be vertical specific, since Google might utilize vertical-specific embeddings.

August 1 Error

Based on that last statement you might think that I thought the ‘medic update’ was aptly named. But you’d be wrong. I saw nothing in my analysis that led me to believe that this update was utilizing a vertical specific embedding for health.

The first thing I do after an update is look at the SERPs. What changed? What is now ranking that wasn’t before? This is the first way I can start to pick up the ‘scent’ of the change.

There are times when you look at the newly ranked pages and, while you may not like it, you can understand why they’re ranking. That may suck for your client but I try to be objective. But there are times you look and the results just look bad.

Misheard Lyrics

The new content ranking didn’t match the intent of the queries.

I had three clients who were impacted by the change and I simply didn’t see how the newly ranked pages would effectively translate into better time to long click metrics. By my way of thinking, something had gone wrong during this language update.

So I wasn’t keen on running around making changes for no good reason. I’m not going to optimize for a misheard lyric. I figured the machine would eventually learn that this language update was sub-optimal.

It took longer than I’d have liked but sure enough on October 5th things reverted back to normal.

August 1 Updates

Where's Waldo

However, there were two things included in the August 1 update that didn’t revert. The first was the YouTube carousel. I’d call it the Video carousel but it’s overwhelmingly YouTube so let’s just call a spade a spade.

Google seems to think that the intent of many queries can be met by video content. To me, this is an over-reach. I think the idea behind this unit is the old “you’ve got chocolate in my peanut butter” philosophy but instead it’s more like chocolate in mustard. When people want video content they … go search on YouTube.

The YouTube carousel is still present but its footprint is diminishing. That said, it’ll suck a lot of clicks away from a SERP.

The other change was far more important and is still relevant today. Google chose to match question queries with documents that matched more precisely. In other words, longer documents receiving questions lost out to shorter documents that matched that query.

This did not come as a surprise to me since the user experience is abysmal for questions matching long documents. If the answer to your question is in the 8th paragraph of a piece of content you’re going to be really frustrated. Google isn’t going to anchor you to that section of the content. Instead you’ll have to scroll and search for it.

Playing hide and go seek for your answer won’t satisfy intent.

This would certainly show up in engagement and time to long click metrics. However, my guess is that this was a larger refinement: documents that matched a query across multiple vectors were scored lower than those with fewer matches. Essentially, content that was more focused would score better.

Am I right? I’m not sure. Either way, it’s important to think about how these things might be accomplished algorithmically. More important in this instance is how you optimize based on this knowledge.

Do You Even Optimize?

So what do you do if you begin to embrace this new world of language understanding updates? How can you, as an SEO, react to these changes?

Traffic and Syntax Analysis

The first thing you can do is analyze updates more rationally. Time is a precious resource so spend it looking at the syntax of terms that gained and lost traffic.

Unfortunately, many of the changes happen on queries with multiple words. This would make sense since understanding and matching those long-tail queries would change more based on the understanding of language. Because of this, many of the updates result in material ‘hidden’ traffic changes.

All those queries that Google hides because they’re personally identifiable are ripe for change.

That’s why I spent so much time investigating hidden traffic. With that metric, I could better see when a site or page had taken a hit on long-tail queries. Sometimes you could make predictions on what type of long-tail queries were lost based on the losses seen in visible queries. Other times, not so much.

Either way, you should be looking at the SERPs, tracking changes to keyword syntax, checking on hidden traffic and doing so through the lens of query classes if at all possible.

Content Optimization

This post is quite long and Justin Briggs has already done a great job of describing how to do this type of optimization in his On-page SEO for NLP post. How you write is really, really important.

My philosophy of SEO has always been to make it as easy as possible for Google to understand content. A lot of that is technical but it’s also about how content is written, formatted and structured. Sloppy writing will lead to sloppy embedding matches.

Look at how your content is written and tighten it up. Make it easier for Google (and your users) to understand.

Intent Optimization

Generally you can look at a SERP and begin to classify each result in terms of what intent it might meet or what type of content is being presented. Sometimes it’s as easy as informational versus commercial. Other times there are different types of informational content.

Certain query modifiers may match a specific intent. In its simplest form, a query with ‘best’ likely requires a list format with multiple options. But it could also be the knowledge that the mix of content on a SERP changed, which would point to changes in what intent Google felt was more relevant for that query.

If you follow the arc of this story, that type of change is possible if something like BERT is used with context sensitive embeddings that are receiving reinforcement learning from SERPs.

I’d also look to see if you’re aggregating intent. Satisfy active and passive intent and you’re more likely to win. At the end of the day it’s as simple as ‘target the keyword, optimize the intent’. Easier said than done I know. But that’s why some rank well and others don’t.

This is also the time to use the rater guidelines (see I’m not saying you write them off completely) to make sure you’re meeting the expectations of what ‘good content’ looks like. If your main content is buried under a whole bunch of cruft you might have a problem.

Much of what I see in the rater guidelines is about capturing attention as quickly as possible and, once captured, optimizing that attention. You want to mirror what the user searched for so they instantly know they got to the right place. Then you have to convince them that it’s the ‘right’ answer to their query.

Engagement Optimization

How do you know if you’re optimizing intent? That’s really the $25,000 question. It’s not enough to think you’re satisfying intent. You need some way to measure that.

Conversion rate can be one proxy. So too can bounce rate to some degree. But there are plenty of one-page sessions that satisfy intent. The bounce rate on a site like StackOverflow is super high. But that’s because of the nature of the queries and the exactness of the content. I still think measuring adjusted bounce rate over a long period of time can be an interesting data point.

I’m far more interested in user interactions. Did they scroll? Did they get to the bottom of the page? Did they interact with something on the page? These can all be tracked in Google Analytics as events and the total number of interactions can then be measured over time.

I like this in theory but it’s much harder to do in practice. First, each site is going to have different types of interactions so it’s never an out of the box type of solution. Second, sometimes having more interactions is a sign of bad user experience. Mind you, if interactions are up and so too is conversion then you’re probably okay.

Yet, not everyone has a clean conversion mechanism to validate interaction changes. So it comes down to interpretation. I personally love this part of the job since it’s about getting to know the user and defining a mental model. But very few organizations embrace data that can’t be validated with a p-value.

Those who are willing to optimize engagement will inherit the SERP.

There are just too many examples where engagement is clearly a factor in ranking. Whether it be a site ranking for a competitive query with just 14 words or a root term where low engagement has produced a SERP geared for a highly engaging modifier term instead.

Those bound by fears around ‘thin content’ as it relates to word count are missing out, particularly when it comes to Q&A.

TL;DR

Recent Google algorithm updates are changes to their understanding of language. Instead of focusing on E-A-T, which are not algorithmic factors, I urge you to look at the SERPs and analyze your traffic including the syntax of the queries.

Tracking Hidden Long-Tail Search Traffic

January 25 2018 // Analytics + SEO // 11 Comments

A lot of my work is on large consumer facing sites. As such, they get a tremendous amount of long-tail traffic. That’s right, long-tail search isn’t dead. But you might think so when you look at Google Search Console.

Hidden Search Traffic

I’ve found there’s more data in Google Search Console than you might believe. Here’s what I’m doing to track hidden long-tail search traffic.

Traffic Hazards

The first step in understanding how to track long-tail search is to make sure you’re not making mistakes in interpreting Google Search Console data.

Last year I wrote about the dangers of using the position metric. You can only use it reliably when looking at it on the query level and not the page level.

Today, I’m going the other direction. I’m looking at traffic by page but will be doing so to uncover a new type of metric – hidden traffic.

Page Level Traffic

The traffic for a single page in Google Search Console is comprehensive. That’s all the traffic to a specific page in that time frame.

Page Level Metrics from Google Search Console

But a funny thing happens when you look at the query level data below this page level data.

Query Level Data for a Page in Google Search Console

The numbers by query do not add up to the page level total. I know the first reaction many have is to curse Google and write off the data as being bad. But that would actually be a bad idea.

The difference between these two numbers is the queries that Google is suppressing because they are either too small and/or personally identifiable. The difference between the page total and visible total is your hidden long-tail traffic.

Calculating Hidden Traffic

Finding the amount of hidden long-tail traffic turns out to be relatively easy. First, download the query level data for that page. You’ll need to make sure that you don’t have more than 1,000 rows or else you won’t be able to properly count the visible portion of your traffic.

Once downloaded you calculate the visible total for those queries.

Visible Total for Page Level Queries

So you’ll have a sum of clicks, a sum of impressions, a calculated clickthrough rate and a weighted average for position. The latter is what seems to trip a lot of folks up so here’s that calculation in detail.

=SUMPRODUCT(Ex:Ex,Cx:Cx)/SUM(Cx:Cx)

What this means is you’re getting the sum product of impressions and rank and then dividing that by the sum of impressions.
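If spreadsheets aren’t your thing, the same weighted average is a few lines of Python. This is just a sketch; the (impressions, position) pairs are hypothetical stand-ins for your query-level export.

```python
def weighted_position(rows):
    """Impression-weighted average position for a set of queries.

    rows: (impressions, position) pairs from the query-level export.
    Mirrors =SUMPRODUCT(position, impressions) / SUM(impressions).
    """
    total_impressions = sum(imp for imp, _ in rows)
    if total_impressions == 0:
        return 0.0
    return sum(imp * pos for imp, pos in rows) / total_impressions

# Hypothetical visible queries for one page.
visible = [(1000, 2.0), (500, 5.0), (100, 8.0)]
avg = weighted_position(visible)
```

The weighting matters: a query at position 8 with only 100 impressions barely moves the average, which is exactly what you want.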

Next you manually put in the page total data we’ve been provided. Remember, we know this represents all of the data.

The clicks are easy. The impressions are rounded in the new Search Console. I don’t like that and I hope it changes. For now you could revert to the old version of search console if you’re only looking at data in the last 90 days.

(Important! The current last 7 days option in Search Console Beta is actually representative of only 6 days of data. WTF!)

From there I calculate and validate the CTR. Last is the average position.

To find the hidden long-tail traffic all you have to do is subtract the visible total from the page total. You only do that for clicks and impressions. Do not do that for CTR, folks. You do the CTR calculation on the hidden click and impression numbers.

Finally, you calculate the weighted position for the hidden traffic. The latter is just a bit of algebra at the end of the day. Here’s the equation.

=((C110*E110)-(C109*E109))/C111

What this is doing is taking (page total impressions * page total rank) minus (visible total impressions * visible total rank) and dividing that by the hidden total impressions to arrive at the hidden total rank.
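Putting the whole calculation together, here’s a sketch of the hidden-traffic math in Python. The input numbers are hypothetical; in practice they come from the page total row and your summed visible rows.

```python
def hidden_metrics(total_clicks, total_imp, total_pos,
                   vis_clicks, vis_imp, vis_pos):
    """Back out hidden clicks, impressions, CTR and weighted position."""
    hidden_clicks = total_clicks - vis_clicks
    hidden_imp = total_imp - vis_imp
    if hidden_imp <= 0:
        return hidden_clicks, hidden_imp, 0.0, 0.0
    # CTR is recomputed from clicks and impressions, never subtracted.
    hidden_ctr = hidden_clicks / hidden_imp
    # The bit of algebra above: solve the total weighted position
    # for the hidden portion.
    hidden_pos = ((total_imp * total_pos) - (vis_imp * vis_pos)) / hidden_imp
    return hidden_clicks, hidden_imp, hidden_ctr, hidden_pos

# Hypothetical page: 200 total clicks on 10,000 impressions at position 4.0;
# visible queries account for 150 clicks on 8,000 impressions at position 3.5.
clicks, imp, ctr, pos = hidden_metrics(200, 10000, 4.0, 150, 8000, 3.5)
```

With these made-up numbers the hidden long tail is 50 clicks on 2,000 impressions at a noticeably worse position, which is the typical pattern.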

The last thing I’ve done here is determine the percentage of clicks and impressions that are hidden for this page.

Hidden Traffic Total for Page Level Traffic

In this instance you can see that 26% of the traffic is hidden and … it doesn’t perform particularly well.

Using The Hidden Traffic Metric

This data alone is interesting and may lead you to investigate whether you can increase your long-tail traffic in raw numbers and as a percentage of total traffic. It can be good to know what pages are reliant on the more narrow visible queries and what pages draw from a larger number of hidden queries.

In fact, when we had full keyword visibility there was a very predictable metric around number of keywords per page that mapped to increases in authority. It still happens today, we just can’t easily see when it happens.

But one of the more interesting applications is in monitoring these percentages over time.

Comparing Visible and Hidden Traffic Over Time

What happens to these metrics when a page loses traffic? I took two time periods (of equal length) and then determined the percentage loss for visible, total and hidden.

In this instance the loss was almost exclusively in visible traffic. The aggregate position number (dangerous to rely on for specificity but good for finding the scent of a problem) leads me to believe it’s a ranking problem for visible keywords. So my job is to look at specific keywords to find which ones dropped in rank.
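This period-over-period comparison is easy to script. A sketch with hypothetical click counts for two equal-length periods:

```python
def pct_change(before, after):
    """Percentage change between two periods (negative = loss)."""
    return (after - before) / before * 100 if before else 0.0

# Hypothetical clicks for two equal-length periods.
period_1 = {"visible": 4000, "hidden": 1400}
period_2 = {"visible": 2800, "hidden": 1350}

changes = {b: pct_change(period_1[b], period_2[b]) for b in period_1}
changes["total"] = pct_change(sum(period_1.values()), sum(period_2.values()))
```

In this made-up case nearly all of the loss sits in visible traffic, which would point at rankings for visible keywords rather than a long-tail shift.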

What really got me curious was when the opposite happens.

Hidden Traffic Loss

Here the page suffered a 29% traffic loss but nearly all of it was in hidden traffic. My job at that point is to figure out what type of long-tail queries suddenly evaporated. This isn’t particularly easy but there are clues in the visible traffic.

When I figured it out things got very interesting. I spent the better part of the last three months doing additional analysis along with a lot of technical reading.

I’ll cover the implications of changes to hidden traffic in my next post.

Caveats and Traps

Slow Your Roll

This type of analysis is not particularly easy and it does come with a fair number of caveats and traps. The first is the assumption that the page level data we get from Google Search Console is accurate and comprehensive. I’ve been told it is and it seems to line up to Google Analytics data. #ymmv

The second is that the data provided at the query level is consistent. In fact, we know it isn’t since Google made an update to the data collection and presentation in July of 2017.

Google Search Analytics Data Changes

Mind you, there were some other things that happened during that time and if you were doing this type of analysis then (which is when I started in earnest) you learned quite a bit.

You also must select a time period for that page that doesn’t have more than 1,000 visible queries. Without the full visible total you can’t calculate your hidden total. Finding the right timeframe can sometimes be difficult when looking at high volume pages.

One of the traps you might fall into is assuming that the queries in each bucket remain stable. That’s not always the case. Sometimes the composition of visible queries changes. And it’s hard to know whether hidden queries were promoted to visible or vice versa.

There are ways to control for some of this in terms of the total number of visible terms along with looking at not just the raw change in these cohorts but the percentage changes. But it can get messy sometimes.

In those situations it’s down to interpretation. Use that brain of yours to figure out what’s going on.

Next Steps and Requests

Shia Labeouf Just Do It

I’ve been playing with this metric for a while now but I have yet to automate the process. Adjacent to automation is the 1,000 visible query limit, which can be eliminated by using the API or tools like Supermetrics and/or Data Studio.

While performing this analysis on a larger set of pages would be interesting, I’ve found enough through this manual approach to keep me busy. I’m hopeful that someone will be excited to do the work to automate these calculations now that we have access to a larger data set in Google Search Console.

Of course, none of that would be necessary if Google simply provided this data. I’m not talking about the specific hidden queries. We know we’re never getting that.

Just give us a simple row at the end of the visible query rows that provides the hidden traffic aggregate metrics. An extra bonus would be to tell us the number of keywords that compose that hidden traffic.

After publishing this, John Mueller reminded me that this type of presentation is already integrated into Google Analytics if you have the Search Console integration.

The presentation does most of what is on my wishlist.

Other term in Google Analytics Search Console Integration

Pretty cool right? But it would be nice if (other) instead said (167 other search queries). The real problem with this is the data. It’s not comprehensive. Here’s the downloaded data for the page above including the (other) row.

GA Search Console Data Incomplete

It’s an interesting sub-set of the hidden queries but it’s incomplete. So fix the data discrepancy or port the presentation over into search console and we’re good. :-)

TL;DR

You can track hidden long-tail search traffic using Google Search Console data with some straightforward math. Understanding and monitoring hidden traffic can help diagnose ranking issues and other algorithmic shifts.