It’s been a frustrating few weeks of discussion about Authorship and Author Rank.
Here I will present a few things that may give you some more context overall and, in particular, my point of view on things.
Social Computing Research
Just the other day Google revealed that it gave $1.2 million in awards to those undertaking social computing research.
We know that interactions on the Web are diverse and people-centered. Google now enables social interactions to occur across many of our products, from Google+ to Search to YouTube. To understand the future of this socially connected web, we need to investigate fundamental patterns, design principles, and laws that shape and govern these social interactions.
We envision research at the intersection of disciplines including Computer Science, Human-Computer Interaction (HCI), Social Science, Social Psychology, Machine Learning, Big Data Analytics, Statistics and Economics. These fields are central to the study of how social interactions work, particularly driven by new sources of data, for example, open data sets from Web2.0 and social media sites, government databases, crowdsourcing, new survey techniques, and crisis management data collections. New techniques from network science and computational modeling, social network and sentiment analysis, application of statistical and machine learning, as well as theories from evolutionary theory, physics, and information theory, are actively being used in social interaction research.
We’re pleased to announce that Google has awarded over $1.2 million dollars to support the Social Interactions Research Awards, which are given to university research groups doing work in social computing and interactions. Research topics range from crowdsourcing, social annotations, a social media behavioral study, social learning, conversation curation, and scientific studies of how to start online communities.
What this says to me is that Google is intensely interested in understanding how to use social interaction data. But they’re not there yet. And why should they be? They’ve been working on link-based signals and refinement for over 10 years but haven’t delved into social data until the last few.
This is a discipline that they are far from fully understanding. I can’t help but pick out words like ‘investigate’, ‘envision’ and ‘new’. This is a post about the exploration of the effects of social interaction on a host of fields. These are not papers as to their conclusions.
But we do have a few of those papers, areas where Google has begun to learn about how social interactions or signals might impact search. Let’s take social annotations as an example.
Social Annotations in Web Search
This research was presented at the 2012 Conference on Human Factors in Computing Systems.
Remember when our SERPs had a whole bunch of smaller faces in them and other various social gestures? Well, Google found that those didn’t work. We hardly noticed them and when we did we didn’t always believe they added value.
In fact, the only thing that really did was the Authorship snippet. It’s a very interesting read if you’re interested in design and authority. The way we see results today is clearly influenced by this research and you can see Google learning more about how social connections and expertise work within search.
This study revealed a counter-intuitive result. Despite having the names and faces of familiar people, and despite being intended to be noticeable to searchers, subjects for the most part did not pay attention to the social annotations.
Our questions about contact closeness, expertise, and topic were answered by the reactions captured during the retrospective interviews. These interviews revealed the importance of contact expertise and closeness, and the importance of the search topics in determining whether social signals are useful, thus echoing past findings on the role of expertise in social search.
I walk away thinking that all of this is much tougher than we believe peering in from the outside. That and Google is at the start of this research, not the end.
Knowing that they aspire to understand these dynamics also makes the closure of Google Reader odd since there is a substantial amount of data that could be mined there, all tied back to identity and, by extension, topical expertise.
Whisper Down The Lane
Overall, authorship and the potential for Author Rank were a hot topic that spilled out into multiple other sessions. Both Matt Cutts and Duane Forrester were asked about link-based signals versus social signals. You could tell they are both tired of this question. Paraphrasing, they essentially said that while social signals are intriguing they’re not nearly as far along as we in the industry might believe (or want).
When prodded about the collapse of the link graph they noted that the link graph was just fine thank you very much. Link manipulation, the intent behind linking that we feel is so perverted, is not nearly as rampant as we assume. The mainstream blogger or site owner is linking for the right reasons. In short, the link graph is still valuable and with lower friction to producing digital content it may actually improve as more laypeople become content producers.
That’s not to say that social signals aren’t important but it will be a complement to or a refinement of the link graph, not a replacement. This is something I discussed in my original Author Rank post.
If we believe that search engines still view the link graph as viable there may be ways to simply use Authorship to make the link graph more accurate. Think of Authorship as meta information passed on every link. When looking for information on cancer, a link to an article from an established oncologist at a world-renowned hospital would likely confer more value than a link to an article from ‘screwcancer888’ at a Q&A site.
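To make the idea of Authorship as metadata on a link a little more concrete, here is a toy sketch. Everything in it is my own assumption for illustration: the `Link` type, the `AUTHOR_EXPERTISE` table, the author names and the 0.5x to 1.5x scaling are hypothetical and say nothing about how a search engine actually weights links.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    target: str                    # URL of the linked article
    base_value: float              # value the link would carry without Authorship
    author: Optional[str] = None   # verified author of the linked article, if any


# Hypothetical per-topic expertise scores between 0 and 1.
AUTHOR_EXPERTISE = {
    ("established_oncologist", "cancer"): 0.9,
    ("screwcancer888", "cancer"): 0.1,
}


def link_value(link: Link, topic: str, default_expertise: float = 0.5) -> float:
    """Scale a link's base value by the author's expertise on the topic.

    Unknown or missing authors get the neutral default, so a link without
    Authorship is left unchanged rather than penalized.
    """
    expertise = AUTHOR_EXPERTISE.get((link.author, topic), default_expertise)
    # Assumed scaling: 0.5x for no expertise up to 1.5x for full expertise.
    return link.base_value * (0.5 + expertise)


expert = Link("https://example.com/oncology-article", 1.0, "established_oncologist")
anonymous = Link("https://example.com/qa-answer", 1.0, "screwcancer888")
unsigned = Link("https://example.com/unsigned-post", 1.0)

for link in (expert, anonymous, unsigned):
    print(link.target, round(link_value(link, "cancer"), 2))
```

The point of the sketch is that the link graph stays the primary structure. Authorship only adjusts how much each edge counts, which is what I mean by a refinement rather than a replacement.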
In some ways this reminds me of delegating authority which Bill Slawski (always insightful) wrote about back in late 2010. What we’re really talking about is identifying expertise and allowing those experts to help curate our view of those topics where it matters – in search results.
It’s enticing to pick apart responses and statements by Googlers when they are asked to comment on Author Rank. The fact is that they’re not going to divulge much or commit one way or the other (at least publicly). They’ve been burned before by saying something that is true but interpreted in different ways.
So when asked, of course they’re going to reply that it’s something they’re experimenting with (because they do aspire to use the data) but that it is currently not a direct ranking signal and nothing to worry about now.
Of course that leads everyone to look for the experiments, to look for indirect ranking signals and to take the ‘now’ as a declaration of sorts for future implementation.
Authorship could be an indirect signal if you believe (like I do) that the click through rate (CTR) on a result can provide a positive feedback signal. And we know the CTR on authored results disrupts the normal click distribution of a SERP. Of course Google could take into account the Authorship snippet and normalize the CTR impact. So perhaps it isn’t having that indirect impact. See how confusing it can get?
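Here is a rough sketch of the two scenarios, just to show why the answer depends on whether the snippet effect is normalized away. The baseline CTR values, the `SNIPPET_UPLIFT` multiplier and the function itself are all assumptions I made up for the example. None of them are measured numbers or a description of what Google does.

```python
# Hypothetical baseline click-through rate by result position.
EXPECTED_CTR = {1: 0.30, 2: 0.15, 3: 0.10}

# Assumed multiplier for how much an Authorship snippet lifts CTR on average.
SNIPPET_UPLIFT = 1.5


def ctr_signal(position: int, observed_ctr: float,
               has_author_snippet: bool, normalize: bool = True) -> float:
    """Return observed CTR relative to what is expected for the position.

    A value above 1.0 means the result is clicked more than expected.
    With normalize=True the expected CTR is raised for results that show
    an Authorship snippet, so the snippet alone earns no extra credit.
    """
    expected = EXPECTED_CTR[position]
    if normalize and has_author_snippet:
        expected *= SNIPPET_UPLIFT
    return observed_ctr / expected


# A result in position 3 with a snippet, clicked 15% of the time.
raw = ctr_signal(3, 0.15, has_author_snippet=True, normalize=False)
adjusted = ctr_signal(3, 0.15, has_author_snippet=True, normalize=True)
print(round(raw, 2), round(adjusted, 2))
```

Without normalization the authored result looks like it is outperforming its position. With normalization the same clicks are simply what you would expect from a result with a face next to it, and the indirect signal disappears.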
Just for fun, let us think what would transpire if a Googler simply said there is no such thing as Author Rank without any hedging or caveats. People would start to conflate that with Authorship, potentially reducing the adoption rate. Many would interpret it to mean that Google had abandoned author-based weighting completely. Thus, when Google did figure it out and apply it the industry would point to the statement and shout ‘liar’ at the top of their lungs.
We’ve trained Google to provide us with these elliptical statements. I choose to view them through this lens.
What To Look For?
That’s not to say that we shouldn’t be interested in the topic. I like the testing Terry Simmonds is doing on the mechanics of Authorship because it documents how Google is trying to extend the mark-up to more of the content on the web. And adoption is the constraint as far as I can tell right now. Conversations about the inability to roll out updates because of low adoption are not uncommon.
You can’t begin to rank results based on topical expertise if many of the experts aren’t included in the selection criteria. The participation rate in Authorship has to be such that using it would provide a materially better ranking of content. Reports have Authorship coverage as low as 9% and as high as 17%. That’s not a lot really and both studies are limited based on the relatively small data sets analyzed.
The problem? If you were to want information on astrophysics you’d probably want to include Neil deGrasse Tyson in those results. Yet, he’s not on Google+ (as far as I can tell) and isn’t part of the Authorship program.
Looking at how Google is trying to assign Authorship is important.
The mechanics and the indirect Authorship Google often grants are particularly intriguing. I noted that Jonathon Colman was receiving a bounce back Authorship link on a SlideShare URL for which no direct Authorship mark-up was present.
I recall seeing this in the past on URLs from Quora, FriendFeed and Flickr. I swear some of these used to show up in Author Stats but I haven’t seen them lately (except for FriendFeed which I see at the tail end of my list.)
In fact, the bug that took Author Stats down might have been the exposure of indirect Authorship based on high confidence in matching public social graph data to Google+ profiles. Rapleaf got the brunt of the ire for crawling the public social graph but Google clearly has used and continues to use this information even though the social circles feature has been retired.
Looking today I see another interesting URL showing up in Author Stats – Twitter.
There’s quite a lot of evidence that Twitter is a fairly well trusted source of indirect Authorship, but that’s a post for another day. However, we can also look at the verbiage in the Structured Data Testing Tool, which has changed within the last few weeks.
The points of interest here are the ‘(direct or indirect)’ verbiage as well as the fact that the tool only checks the first rel=author link listed on a webpage. The former certainly makes me believe that assigning Authorship based on indirect links is important to Google.
The latter tells me two things. First that the tool should not be trusted as the final arbiter of whether the correct Authorship is or will be applied. Second that Google obviously sees multiple authors or entities (or agents) on the page.
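If you want to see what a first-link-only check might be missing on your own pages, a few lines of Python will list every rel=author link in the HTML. This is a minimal sketch using the standard library `html.parser` module. The `RelAuthorParser` class and `find_author_links` function are names I made up, the sample HTML and profile URLs are placeholders, and it only looks at the `rel` attribute on `a` and `link` tags, so it says nothing about how Google itself resolves Authorship.

```python
from html.parser import HTMLParser


class RelAuthorParser(HTMLParser):
    """Collect the href of every <a> or <link> tag whose rel includes 'author'."""

    def __init__(self):
        super().__init__()
        self.author_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens, e.g. rel="author nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "author" in rel_tokens and href:
            self.author_links.append(href)


def find_author_links(html: str) -> list:
    """Return rel=author hrefs in document order."""
    parser = RelAuthorParser()
    parser.feed(html)
    return parser.author_links


sample = """
<html><body>
  <a rel="author" href="https://plus.google.com/111">First author</a>
  <a href="https://example.com/about">About</a>
  <a rel="author nofollow" href="https://plus.google.com/222">Second author</a>
</body></html>
"""

links = find_author_links(sample)
print("First rel=author link:", links[0])
print("All rel=author links:", links)
```

If the list has more than one entry, a tool that reads only the first one is reporting on just part of what is on the page.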
Let’s go a step further. Google’s new Social Sign-In can be construed as a portable digital signature which might allow Google to rely on comments and other content produced outside of Google+. So it will be interesting to track how this is rolled out and whether the reviews that now flow under your profile are also granted Authorship.
I’ve been eager to see Author Rank implemented since I first saw Matt Cutts interview Steven Levy.
This actually predates Authorship and the follow-up question by Matt (along with a bit of body language) makes it clear that Google was thinking about this seriously. While I absolutely do look for connections and patterns that might paint a picture of the future I’m not looking for it around every corner and trying to fit Author Rank into each and every odd result or anecdote.
I prefer to talk about how people might build authority rather than how they would build Author Rank. Just as links are the result and not the goal, Author Rank will be the result and not the goal of your efforts.
Discussions about what makes someone an authority and how Google might want to translate that into math are fascinating. What makes someone authoritative versus popular? Is there a difference? If so, how would you go about separating the two?
How do you map the decline of authority? Of someone who is no longer really an expert and just mailing it in? Can you identify this even if they remain popular? How can you tell if someone is endorsing content based on merit or friendship? Is it what you know or who you know?
Furthermore, you could find that one was popular for the wrong reasons. Would you want to rank someone highly who simply fanned the flames of dissent and created controversy? The tone and type of interaction will be important so sentiment analysis and other processes will need to determine how to use social interaction as a reliable signal.
And how does influence fit into this equation? One can be influential without being popular, but clearly being popular gives you a better chance of being influential just by sheer reach. Can you be influential without being an authority? I think so. Just look at Jenny McCarthy and her influence within the anti-vaccine movement.
The latter clearly strays into the subjective nature of quality, relevance and authority that I touched on after the Panda update. Personalization helps to ensure that your subjective view of authority is reflected back to you. That’s why search results are changed based on who you follow on Google+. And personalization of search results is the most important thing about Google+ in my view.
But in discussing how Google might identify authority and expertise, we’re dealing with the aggregate. So the question isn’t really about your personal view (which is reflected back in Search+ results) but how the aggregate views different figures and authorities.
Of course, being likable is part of the way you can obtain authority. And it is often not what you say, but how you say it (or present it) that gets you noticed. So part of building authority is in ensuring that you can communicate in a way that conveys that expertise but also makes it accessible and … memorable.
Yes, I see all of this as being related because the same content presented in Comic Sans without any images or paragraph breaks wouldn’t have nearly the same impact and would not, ultimately, convey authority. Even though the actual words are the same!
I had a similar conversation with Dan Shure where he wondered about the impact of publishing content from Rand Fishkin under somebody else’s name. Would it get as much ‘play’ and be received as well? I doubt it. So what does that say about the connection of authority, popularity and quality assessment?
These are just a few of the things that make this topic so incredible.
I believe Google wants to use Author Rank but I also believe that it’s far more difficult than we think. Focusing solely on Author Rank may blind us to tracking Google’s progress and building what is truly important. Authority.