
Mozilla Search Showdown

November 15 2011 // SEO + Technology // 5 Comments

Mozilla's search partnership with Google expires at the end of November. What happens next could change search engine and browser market share as well as the future of Mozilla.

The Mozilla Google Search Partnership

Originally entered into in November 2004 and renewed in 2006 (for two years) and 2008 (for three years), the search partnership delivers the bulk of Mozilla's revenue. In fact, in 2010, 98% of Mozilla's $121 million in revenue came from search-related activity.

The majority of Mozilla's revenue is generated from search functionality included in our Firefox product through all major search partners including Google, Bing, Yahoo, Yandex, Amazon, Ebay and others.

Most of that search revenue comes specifically from Google. The 'Concentrations of Risk' section in Mozilla's 2009 (pdf) and 2010 (pdf) consolidated financial statements put Google's contribution to revenue at 91% in 2008, 86% in 2009 and 84% in 2010.

Using the 2010 numbers, Mozilla stands to 'lose' $3.22 per second if the partnership expires. Mozilla is highly dependent on search and Google in particular. There's just no way around that.
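That per-second figure follows directly from the numbers above; a quick back-of-the-envelope check:

```python
# 2010 figures cited above: $121M total revenue, 84% attributable to Google.
total_revenue = 121_000_000          # Mozilla's 2010 revenue (USD)
google_share = 0.84                  # Google's share of that revenue

google_revenue = total_revenue * google_share   # ~$101.6M
seconds_per_year = 365 * 24 * 60 * 60           # 31,536,000

per_second = google_revenue / seconds_per_year
print(f"${per_second:.2f} per second")          # → $3.22 per second
```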

What does Google get for this staggering amount of money?

Firefox Start Page

Google is the default search engine in the Firefox search bar as well as the default home page. This means Firefox drives search after search to Google instead of to its competitors.

Browser Share

Clearly browsers are an important part of the search landscape since they can influence search behavior based on default settings. As Mozilla points out, in 2002 over 90% of the browser market was controlled by Internet Explorer. At the time it made perfect sense for Google to help Mozilla break the browser monopoly.

The rise of Firefox helped Google to solidify search dominance and Mozilla was paid handsomely for this assistance.

However, it doesn't look like Google was comfortable with this lack of control. Soon after the announced renewal of the search partnership in 2008, Google launched its own browser. At the time, I wrote that Chrome was about search and taking share from Internet Explorer.

Browser Market Share 2011

I still think Chrome is about search and the trend seems to indicate that Chrome is taking share (primarily) away from Internet Explorer. In short, Google sought to control its own destiny and speed the demise of Internet Explorer.

Mission accomplished.

Chrome is now poised to overtake Firefox as the number two browser. That's important because three years ago Google had no other way to protect their search share. Chrome's success changes this critical fact.

Toolbars

Toolbars were the first attempt by search engines to break the grip of Internet Explorer. Both Google and Yahoo! used toolbars as a way to direct traffic to their own search engines.

What happened along the way was an amazing amount of user confusion. Which box were you supposed to search in? The location (or address) bar, the search box or the toolbar?

This confusion created searches in the location bar and URL entries in the search bar. Savvy users understood but it never made much sense to most.

Location Bar Search

The result? For those who figured it out, there is evidence that people actually enjoyed searching via the location bar.

How many searches are conducted per month via the address bar? MSN wouldn't release those figures, but it did say that about 10 to 15 percent of MSN Search's overall traffic comes from address bar queries.

The company has analyzed the traffic from users who search via the address bar and discovered both that the searches appear intentional in nature, rather than accidental, and that those making use of address bar searching do so frequently.

This data from 2002 indicates that the location bar default might be very valuable. Sure enough, the location bar default is part of the search partnership Mozilla has with Google.

Firefox Location Bar Search Default

This also happens to be the most difficult setting to change. You can change the search bar preference with a click and the home page with two clicks, but the location bar is a different (and convoluted) story.

Firefox About:Config Warning

Most mainstream users aren't going to attempt entering about:config into their location bar, but if they do this first screen will likely scare them off.
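For the determined few, the setting being guarded here is (in the Firefox versions current at this writing) the keyword.URL preference, which is also what add-ons quietly rewrite. In about:config, or in a user.js file, it amounts to a single line; the URL shown is just an example value:

```js
// user.js -- keyword.URL controls which engine handles searches
// typed into the location bar; about:config edits this same preference.
user_pref("keyword.URL", "https://www.google.com/search?q=");
```

Two clicks change the home page; this buried preference is what keeps the location bar default sticky.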

I recently had to revisit the location bar default because I took Firefox for Bing for a spin. This add-on, among other things, changes the location bar default to Bing and it remains that way even after the add-on is removed. That's a serious dark pattern.

All of this makes me believe that the location bar might be the most valuable piece of real estate.

Omnibox

Having helped create this confusion with its toolbar (which no longer supports Firefox 5+) and having seen the value of location bar searches, Google gave Chrome the omnibox, a combined location and search bar. The omnibox reduced confusion by asking users to simply type an address or search into one bar. Google would do the rest. Of course, the default for those searches is Google.

The omnibox seems to be a popular feature, and why wouldn't it be? Users don't care what field they're typing in; they just want it to work. You know who else thinks this is a good idea? The Firefox UX Team.

Firefox Omnibar

While these mockups are for discussion purposes only, it's pretty clear what the discussion is about. According to CNET, a combined Firefox search-and-location bar is being held up by privacy issues. That was in March and the latest release of Firefox (just last week) still didn't have this functionality.

Back in late 2009 Asa Dotzler had a lot to say about the independence of Firefox and how they serve the user.

Mozilla’s decisions around defaults are driven by what’s best for the largest number of users and not what’s best for revenue.

It’s not about the money. The money’s there and Mozilla isn’t going to turn it down, but it’s not about the money. It’s about providing users with the best possible experience.

Great words but have they been backed up with action? Both users and the Firefox UX Team are lobbying for an omnibox, the Firefox for Bing add-on is a clear dark pattern and the ability to change the default location bar search engine is still overly complicated.

Is this really what's best for users?

Don't Count On Inertia

If Mozilla were to switch horses and cut a search deal with Bing, they'd be counting on inertia to retain users and their current search behavior. The problem is that Firefox was marketed as the solution to browser inertia.

Before Firefox many users didn't even understand they could browse the Internet with anything but Internet Explorer. Those same users are now more likely to switch.

It's sort of like being the other woman, right? If he cheats with you, he's also liable to cheat on you.

With a search bar still in place users can easily change that default. Firefox would be counting on location bar searches and the difficulty in changing this default to drive revenue. You might get some traction here but I'm guessing you'd see browser defection, increased search bar usage and more direct traffic to the Google home page.

With an omnibar in place Firefox would be running a very risky proposition. Many mainstream users would likely migrate to another browser (probably Chrome). More advanced Firefox users would simply change the defaults.

You could move to an omnibar and make the default easy to change, but both Firefox and users have made it abundantly clear that they prefer Google. So how much would a Bing search partnership really be worth at that point?

Can Bing Afford It?

Bing is losing money hand over fist, so it's unclear whether Bing can actually pony up this kind of money anyway. If it did, it could cause browser defection and other behavior that would rob the search partnership of any real value and put Firefox at risk.

Even if Bing pirated half of the searches coming from Firefox, that's not going to translate into a real game changer from a search engine market share perspective.

Mozilla could partner with Bing but I don't think either of them would like the results.

Mozilla in a Pickle

Mozilla In a Pickle

If Google is the choice of users (as Firefox claims) installing a competing default search engine may hasten the conversion to Chrome. This time around Mozilla needs Google far more than Google needs Mozilla. I'm not saying that Google doesn't want the search partnership to continue, but I'm betting they're driving a very hard bargain.

Google no longer has a compelling need to overpay for a search default on a competing browser. I have to believe Mozilla is being offered a substantially lower dollar amount for the search partnership.

I don't pretend to know exactly how the partnership is structured and whether it's volume or performance based, but it really doesn't matter. Google paid Barry Zito-like prices back in 2008 at the height of the economic bubble, but times have changed and Google's got Tim Lincecum (Chrome) mowing down the competition.

Mozilla and Google are playing a high stakes game of chicken. The last renewal took place three months prior to the expiration. We're down to two weeks now.

This time the money might not be there.

TL;DR

The search partnership between Mozilla and Google expires at the end of November. The success of Chrome gives Google little incentive to overpay for a search default on Firefox. This puts Mozilla, which receives more than 80% of its revenue through the Google search partnership, in a poor position with few options.

Cut Up Learning

October 03 2011 // Life + Technology // 6 Comments

Is information overload a problem our new digital society must solve or are we changing how we learn?

Information Overload

We've gone from a handful of TV channels to more than 500; from a few radio stations to streaming music on demand; from reading the local newspaper to reading publications from around the world.

The Extracting Value from Chaos report from IDC iView provides a staggering overview of our digital footprint.

In 2011, the amount of information created and replicated will surpass 1.8 zettabytes (1.8 trillion gigabytes) - growing by a factor of 9 in just five years.

It's not just digital either. We see this trend in the publishing industry where print-on-demand and self-published books have skyrocketed (pdf).

Book Publishing Statistics Graph

This does not include audiobooks or e-books.

Of course, we're also sharing all of this information at an accelerated rate.

Facebook Law of Sharing Graph

Zuckerberg's Law of Sharing states that sharing activity will double each year.

You know that information is increasing, but you might not realize just how much and how fast it is increasing.
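To put rough numbers on "how fast," here's what the two growth claims above imply (just compounding the figures already cited, not new data):

```python
# IDC: information grows by a factor of 9 in five years.
annual_growth = 9 ** (1 / 5)                 # compound annual growth rate
print(f"{annual_growth:.2f}x per year")      # ~1.55x, i.e. ~55% growth a year

# Zuckerberg's Law: sharing doubles every year, so after n years
# sharing activity is 2**n times today's level.
for years in (1, 5, 10):
    print(years, "->", 2 ** years, "x")      # 2x ... 32x ... 1024x
```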

Curation

As the amount of information increases many have looked at ways to sift through and make sense of it all. The goal is to find signal amid the noise. Plenty of folks are trying to apply different techniques and algorithms to winnow things down to only the most interesting and relevant.

KnowAboutIt, XYDO, My6Sense, Trunk.ly and Summify among others are all trying to cull the web and deliver the 'right' information to your inbox.

Aggregated social curation sounds logical, but I haven't found it very valuable. I find the stuff I already read (or would have found) anyway. Maybe it works if you're not drinking from the information hose, but most of us are doing more of that in one way or another. I can't imagine relying on just these services for my information.

Many believe that serendipity is an important part of information consumption, but most of these services give it lip service at best. They're doing more of what a good brand marketer would do: cranking out extensions to a known product. In this case that product is the type of content that you and your network of 'friends' are reading. I think you quickly reach a local maximum where you're not finding new things or making new connections.

Today's curation seems more like an echo chamber.

Distraction

Info Freako by Jesus Jones

Nicholas Carr thinks the Internet is doing evil things to us and Google might be making us stupid. NPR books summarizes Carr's thesis as follows.

Carr believes that the Internet is a medium based on interruption — and it's changing the way people read and process information. We've come to associate the acquisition of wisdom with deep reading and solitary concentration, and he says there's not much of that to be found online.

Carr might be right about the distraction of the Internet. But that is only one of the ways we acquire information. I watch two-hour movies straight through, can read a book for hours at a stretch and still conduct lengthy phone calls. The idea that we can only process information in one way seems like an odd conclusion. It would be like saying that because we possess the ability to drive, our athletic prowess will decline.

Taking it a step further, there is an assumption that we process information uniformly. Here's where fiction helps reveal a greater truth. The Ghost In Love by Jonathan Carroll explores the division of personality. We are different people throughout our lives, day-by-day and even different people at the same time.

How can you be kind when you were so mean to that stranger the other day? How can you be intelligent when you made such a stupid mistake the other week? Many people struggle with this seeming paradox. But we're not robots! We're not just one monolithic entity that does things the same way every day. Not only do we evolve over time (just think about your musical tastes) but we'll react to information in different ways on an hourly basis. Much of this has to do with context, but I think there are more complex factors at work.

So why do we persist in the notion that we can only comprehend information in one way? It's patently untrue.

Cut-Up Learning

The cut-up technique was made popular by William Burroughs and is performed by cutting up content and putting it back together in a different order. By doing so, it reveals new words, new insight and new meaning. It's a type of non-linear learning.
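As a toy sketch of the mechanics (the fragment size and shuffle here are arbitrary choices for illustration, not Burroughs' actual method):

```python
import random

def cut_up(text, size=3, seed=7):
    """Slice text into `size`-word fragments and reassemble them in a
    shuffled order -- a crude digital version of the cut-up technique."""
    words = text.split()
    chunks = [words[i:i + size] for i in range(0, len(words), size)]
    random.Random(seed).shuffle(chunks)
    return " ".join(word for chunk in chunks for word in chunk)

print(cut_up("cut up the information to unlock trends and insight"))
```

Nothing is lost, only reordered; the new juxtapositions are where the new meaning comes from.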

I believe the Internet, the great distractor, is a digital version of the cut-up technique. It is actually more powerful because we can cut-up more information from a wide variety of topics and mediums.

We're so consumed with capturing just the right thing, those few articles that will provide insight, that we miss the opportunity to piece together and make connections to a larger puzzle.

The goal isn't to curate and aggregate the content into neat little packages but to cut up the information to unlock trends and insight.

Skimming

I read a large number of RSS feeds, a diverse blend of literature, photography, analytics, SEO, technology, life hacking, science, local, marketing, design, UX, humor and start-up related blogs. I also let the river of information flow through platforms like FriendFeed, Google+, Pinterest and Twitter.

Do I read every post word for word? No. I'm skimming a lot of the time, both in terms of the type of content that is being generated (the theme and pulse of activity) and the actual content itself. Skimming doesn't mean I'm not getting value from that content. By skimming through a variety of pieces, topics and media I create a very different view of the data that is swirling around me.

That also doesn't prevent me from taking a deep dive on any given piece I find. In fact, I'd hazard that I locate more of these pieces through the act of skimming.

Cut-Up Example

Live Long and Prosper by Han Solo with Malcolm Reynolds Image

So let's go from theory to practice. I believe Google is extremely interested in creating some sort of AuthorRank based on the quality of, and engagement with, the content an author produces. Here's the cut-up that leads me to this conclusion.

I watch Steven Levy interviewed by Matt Cutts, find Levy's mention of being outranked by Huffington Post interesting, and note the look Cutts gives someone in the audience directly after that remark. I watch this video after authorship is rolled out by Google at SMX Advanced. This is an example of how the cut-up technique doesn't need to be linear.

I keep track of the debate around identity on Google+ and see how their inflexibility on the issue is rooted in ensuring confidence in authorship. I watch the high rate of iteration in the rel="author" program and note who is leading those efforts. I look at which Google Webmaster Central videos in this latest series are released first. Because they're recorded in large chunks, the authorship video getting to the head of the line signals a sort of priority.

I read about the acquisitions of companies that measure engagement with content. I ask questions about what Google is doing with PostRank and (repeatedly) get no response. Silence can be a very loud signal.

Those are all signals within the actual topic, though they might be in different media. But I also pay close attention to how Facebook is implementing EdgeRank and note the direction LinkedIn is going as well. Again, those are closely related to authorship and identity so it's not going too far afield.

But there are other vectors that might seem unrelated. I listen to artists who are irate at how their work is taken and used without credit. I key in on articles that highlight the music that is most often sampled by new artists. I listen to the Rick Astley and Nirvana mashup. I laugh at the misattributed quote meme but also think about what it represents. I uncover distasteful social proof manipulation and dive into the argument about influence and whether Klout is accurate.

Alone, each of these things is of passing interest, but with access to so much information I find greater context and meaning.

Mind Hacking

The digital age allows us to peer over the shoulders of more people. A lot of them may provide little to no value but some will be intelligent and provide thoughtful commentary and links. I've become adept at quickly recognizing the difference. It's reminiscent of what Gladwell talks about in Blink.

Maybe I am an outlier and my information consumption behavior is non-traditional, but given the rate at which information is accelerating I believe more and more people will adopt (or be forced into) this type of cut-up learning.

I used to scoff at the number of people Robert Scoble followed, invoking Dunbar's Number as my defense. What I've realized is that there is a vast difference in social relationships versus information discovery.

I still believe in Dunbar's Number as it pertains to relationships, but not when it comes to information discovery. I highly doubt that Robert is truly friends with the 32,000 people he follows on Twitter. But he is adept at taking the stream of information those people create and gaining value from it.

Tools

Certain tools can help make cut-up learning easier, in part by simply letting you organize what you'll skim. Google Reader is an absolutely stellar resource. And no one has beaten the original FriendFeed friend-of-a-friend functionality in delivering new and random things to my worldview. G+ is slowly getting better, since I do find a diverse blend of technology, science, art and business that I can peruse.

The curation services? I'll use them. But they're more like an information safety net. My interaction with them is limited to no more than a 10 second skim and scroll of the content for confirmation.

But in the end, the biggest tool we'll have is our mind and our own ability to collect and process all that information. Maybe our brains are being rewired but who's to say that's a bad thing?

TL;DR

I found an article the other day that opined that the way to succeed in business was to know where the customer was going, not where they were now. That's a good proxy for how I feel about the difference between curation services and cut-up learning. Curation can tell you where things are now, while cut-up learning can tell you where things are going.

Information overload may not be a problem we have to solve; instead, it could lead to a new way of learning. Skimming does not make us shallow. It may actually make us rich.

Comment Censorship

August 07 2011 // Rant + Social Media + Technology // 14 Comments

In the past month I've left a number of comments on blogs only to find they were never published.

Fry Meme Spam or Blog Censorship

I'd like to believe that the blog owners simply didn't see my comment. That it fell into their spam queue, which they rarely, if ever, look at. Because the alternative is that they saw the comment and decided to suppress it. Now, it's their blog - their little corner of the Internet - but this type of censorship is troubling.

Comments Content

What about the content of my comments? To be fair, in some instances I was disagreeing with some or all of the content in that post. But I did so in a constructive manner, using links to my own thoughts on the topic or to other material to help round out my argument.

I regularly publish comments on this blog that are contrary to my own opinion. One only has to look at the comments on my Stop Writing For People post for examples. I'm free to respond and defend myself, but having the debate in the open is important. It builds trust, much like having bad reviews on a product is actually a good thing.

Comments are incredibly valuable because they provide additional information on the content. They make your content better through clarification, confirmation, addition and debate.

Comments = Content.

Comments are a rich source of meta information that deliver value to both readers and search engines. This extends to links as well! Relevant links in comments help create a web of information that users now and in the future will find useful.

Yet it is those links that may be at the root of the problem.

Comment Spam

It's like the Internet version of a plague of locusts. One of the most popular ways to combat comment spam is to screen comments that contain links. This is one of the default settings in Akismet.

It makes sense since many spammers will drop a link or links in comments. But links are not the problem. Spammers are the problem.

What's wrong with contextual links to relevant content? This is not behavior that should be penalized. In fact, it should be encouraged. In many ways, the comment spam problem threatens the link graph.

ratio of comment spam to real comments

Not only that but, anecdotally, it seems that comment spam sometimes pushes people to disable comments altogether. When the ratio of comment spam to real comments is too high, many simply give up. I understand the decision but it's depressing that it gets to that point.

Outsourcing

Fed up with comment spam and general comment management, have we decided to outsource engagement to social networks? Twitter, Facebook, LinkedIn, and Google+ are all happy to provide venues in which comments can flourish. Make no mistake, these venues understand the value of comments.

Is our obsession with amplification and generating social proof robbing us of the real value of comments and conversation? Certainly there is some hope that it's like a rubber band. The content goes out, but then snaps back, drawing more comments to your content. It works to a certain extent, but by how much and at what cost is an interesting debate.

The Filter Bubble

Of course, these bloggers may have seen my comment and simply decided not to publish it. Eli Pariser argues that personalization and 'invisible algorithmic editing' are a real danger, but I think comment censorship (whether intentional or accidental) is the true menace.

I believe much of the hype around the filter bubble is FUD. Personalization is rather minimal in most cases though I do agree with Gabriel Weinberg's view of how to deal with personalization.

Personalization is not a black and white feature. It doesn't have to be on or off. It isn't even one-dimensional. At a minimum users should know which factors are being used and at best they should be able to choose which factors are being used, to what degree and in what contexts.

Personalization deals with the fact that some content isn't being made readily visible. Comment censorship excises content from the Internet altogether.

Identity

So what could help get us out of this morass? How can we ensure comments are once again a vital part of the content ecosystem? Identity.

Identity

The reason many embraced Facebook comments is that comments are attached to an identity. Not only that, but an identity that people care about. This obviates the need for aggressive moderation. You might run into a troll, but it'll be a troll you can clearly identify and block.

Identity essentially stops comment spam because you can't post as Best Miami Attorneys. Comment moderation is suddenly manageable again.

Censorship

A commenting system that uses identity removes most of the uncertainty around comment censorship. If my comment isn't published, it's likely because that blogger made an active decision to toss it into the HTML version of The Bermuda Triangle.

Cat Censors Blog Comments

If the filter bubble can be managed through making personalization transparent, so too can comment censorship. A third-party, identity-backed comment system could track the number of comments censored on each blog. A grade or score could then be shown to let users know how much of the conversation was being censored. In some ways it would be like Charity Navigator but for blogs.

So perhaps the blogger who touts the benefits of community actually censors 32% of blog comments. That might be an interesting thing to know.

Could this get messy? Sure. But you can build a system of checks and balances.

Reputation

Bad Reputation by Joan Jett

Joan Jett might not care about her bad reputation but you should. Whether it's a thumbs-up, thumbs-down, number of Likes, sentiment analysis, length of comments, spelling and grammar or other metrics, a savvy comment system could begin to assign reputation to each user.

So the censorship percentage wouldn't be flat in nature. If you blocked a known troll, no worries. If you censored someone who had a history of abusive comments full of foul language, no problem.

On the other hand, it would be disturbing if you censor someone who consistently adds value to conversations. The reputation of those you censor would matter.
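No such system exists today, but a sketch makes the idea concrete. Everything below (the function, the reputation scale, the numbers) is invented purely for illustration:

```python
def censorship_score(decisions):
    """decisions: list of (censored, commenter_reputation) pairs, with
    reputation on a 0.0 (known troll) to 1.0 (model citizen) scale.

    Censoring low-reputation commenters barely moves the score;
    censoring high-reputation commenters counts heavily.
    """
    total_weight = sum(rep for _, rep in decisions)
    if not total_weight:
        return 0.0
    censored_weight = sum(rep for censored, rep in decisions if censored)
    return censored_weight / total_weight

# Blocking two known trolls while publishing two solid commenters:
history = [(True, 0.05), (True, 0.05), (False, 0.9), (False, 0.8)]
print(f"{censorship_score(history):.0%}")  # → 6%
```

A flat count would report this blog as censoring half of its comments; weighting by reputation correctly reads it as barely censoring at all.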

Confidence

I'd like to be confident that I'm not missing good comments that wind up going into spam.

I'd like to be confident that if I take the time and effort to comment on a blog that it will be published and, hopefully, spark further comment and conversation.

I'd like to be confident that the comments I read are not biased and simply a form of self-curated cheerleading.

"Confidence is contagious. So is lack of confidence." - Vince Lombardi

The Internet desperately needs more confidence.

Google+ Review

July 07 2011 // Social Media + Technology // 19 Comments

(This post is an experiment of sorts since I'm publishing it before my usual hard core editing. I'll be going back later to edit and reorganize so that it's a bit less Jack Kerouac in style. I wanted to publish this version now so I could get some feedback and get back to my client work. You've been warned.)

I've been on Google+ for one week now and have collected some thoughts on the service. This won't be a tips and tricks style post since I believe G+ (that's the cool way to reference it now) will evolve quickly and what we're currently seeing is a minimum viable product (MVP).

In fact, while I have enjoyed the responsiveness that the G+ team has shown, it echoes what I heard during Buzz. One of my complaints about Buzz was that they didn't iterate fast enough. So G+, please go ahead and break things in the name of speed. Ignore the howling in the interim.

Circles

Circles is clearly the big selling point for G+. I was a big fan of the presentation Paul Adams put together last year that clearly serves as the foundation to Circles. The core concept was that the way you share offline should be mirrored online. My family and high school friends probably don't want to be overwhelmed with all the SEO related content I share. And if you want to share a personal or intimate update, you might want to only share that with family or friends.

It made perfect sense ... in theory.

I'm not sure Circles works in practice, or at least not the way many thought it would. The flexibility of Circles could be its Achilles heel. I have watched people create a massive ordered list of Circles for every discrete set of people. Conversely, I've seen others just lump everyone into one big Circle. Those in the latter camp seem unsettled, thinking that they're doing something wrong by not creating more Circles.

Of course there is no right or wrong way to use Circles.

But I believe there are two forces at work here that influence the value of Circles. First is the idea of configuration. I don't think many people want to invest time in building Circles. These Circles are essentially lists, which have been tried on both Facebook and Twitter. Yet both of these social giants have relegated lists to the margins of their user interfaces. Was this because people didn't set them up? Or because once they set them up they didn't use them?

I sense that Facebook and Twitter may have realized that the stated need for lists or Circles simply didn't show up in real life usage. This is one of those problems with qualitative research. Sometimes people say one thing and do another.

As an aside, I think most people would say that more is better. That's why lists sound so attractive. Suddenly you can really organize and you'll have all these lists and you'll feel ... better. But there is compelling research that shows that more choice leads to less satisfaction. Barry Schwartz dubbed it The Paradox of Choice.

The Paradox of Choice has been demonstrated with jam, where sales were higher when consumers had three choices instead of thirty. It has also been demonstrated with 401(k) participation: the more mutual fund choices available, the lower the participation in the 401(k) program.

Overwhelmed with options, we often simply opt out of the decision and walk away. And even when we do decide, we are often less satisfied, since we're unsure we've made the right selection. Those who scramble to create a lot of lists could fall prey to the Paradox of Choice. That's not the type of user experience you want.

The second thing at work here is the notion that people want to share online as they do offline. Is that a valid assumption? Clearly, if you're into cycling (like I am) you probably only want to share your Tour de France thoughts with other cyclists. But the sharing dynamic may have changed. I wrote before that Google has a Heisenberg problem in relation to measuring the link graph. That by the act of measuring the link graph they have forever changed it.

I think we may have the same problem in relation to online sharing. By sharing online we've forever changed the way we share.

If I'm interpreting correctly what FriendFeed (which is the DNA for everything you're seeing right now), and particularly Paul Buchheit, envisioned, it was that people should share more openly. That by sharing more, you could shine light on the dark corners of life. People could stop feeling like they were strange, alone or embarrassed. Facebook, too, seems to have this same ethos, though perhaps for different reasons - or not. And I think many of us have adopted this new way of sharing. Whether it was done intentionally at first or not is moot.

So G+ is, in some ways, rooted in the past, of the way we used to share.

Even if you don't believe that people are now more willing to share more broadly, I think there are a great many differences in how we share offline versus how we share online. First, the type and availability of content is far greater online. Tumblr quotes, LOLcats, photos and a host of other types of media are quickly disseminated. The Internet has seen an explosion of digital content that runs through a newly built social infrastructure. In the past, you might share some of the things you'd seen recently at a BBQ or the next time you saw your book group. Not anymore.

Also, the bar for sharing content online is far lower than it is offline. The ease with which you can share online means you share more. The share buttons are everywhere and social proof is a powerful mechanism.

You also can't touch and feel any of this stuff. For instance, think about the traditional way you sell offline. The goal is to get the customer to hold the product, because that greatly increases the odds they'll purchase. But that's an impossibility online.

Finally, you probably share with more people. The social infrastructure built over the last five years has allowed us to reconnect with people from the past. We continue to share with weak ties. I'm concerned about this since I believe holding onto the past may prevent us from growing. I'm a firm believer in Dunbar's number, so the extra people we choose to share with wind up being noise. Social entropy must be allowed to take place.

Now Circles might support that since you can drop people into a 'people I don't care about' Circle that is never used. (I don't have this Circle, I'm just saying you could!) But then you simply wind up with a couple of Circles that you use on a frequent basis. In addition, the asynchronous model encourages people to connect with more people which flies in the face of this hardwired number of social connections we can maintain.

Lists and circles also rarely work for digesting content. Circles is clearly a nice way to segment and share your content with the 'right' people. But I don't think Circles are very good as a content viewing device.

You might make a Circle for your family. Makes perfect sense. And you might then share important and potentially sensitive information using this Circle. But when you look at the content feed from that Circle, what do you get? It would not just be sensitive family information.

If your brother is Robert Scoble you'd see a boatload of stuff there. That's an extreme example, but let's bring it down to the more mundane example of, say, someone who is a diehard sports fan. Maybe that family member would share only with his sports buddies, but a lot of folks are just going to broadcast publicly and so you get everything from that person.

To put it more bluntly, people are not one-dimensional.

I love bicycling. I also have a passion for search and SEO. I also enjoy books, UX, LOLcats and am a huge Kasabian fan. If you put me in an SEO Circle, there's a good chance you'll get LOLcats and Kasabian lyrics mixed in with my SEO stuff. In fact, most of my stuff is on Public, so you'll get a fire hose of my material right now.

Circles is good for providing a more relevant sharing mechanism, but I think it's a bit of a square peg in a round hole when it comes to digesting content. That's further exacerbated by the fact that the filtering capabilities for content are essentially on and off (mute) right now.

Sure, you could segment your Circles ever more finely until you found the people who were just talking about the topic you were interested in. But that would probably be a small group, and if you have more than one interest (which is, well, pretty much everyone) you'll need lots of Circles. And with lots of Circles you run into the Paradox of Choice.

Conversation

I've never been a fan of using Twitter to hold conversations. The clipped and asynchronous style of banter just doesn't do it for me. FriendFeed was (is?) the place where you could hold real debate and discussion. It provided long-form commenting ability.

G+ does a good job fostering conversation, but the content currently being shared and some of the feature limitations may be crushing long-form discussions and instead encouraging 'reactions'.

I don't want a stream of six-word YouTube-style comments. That doesn't add value. I'm purposefully using this terminology because I think delivering value is important to Google. Comments should add value and there is a difference in comment quality. And yes, you can influence the quality of comments.

Because if the comments and discussion are engaging you will win my attention. And that is what I believe is most important in the social arms race we're about to witness.

Attention

There is a war for your attention and Facebook has been winning. G+ must fracture that attention before Facebook really begins to leverage the Open Graph and provide search and discovery features. As it stands Facebook is a search engine. The News Feed is simply a passive search experience based on your social connections and preferences. Google's talked a lot about being psychic and knowing what you want before you do. Facebook is well on its way there.

User Interface

If there's one thing Google got right, it's the Red Number user interface. It is by far the most impressive part of the experience, feeding your G+ addiction and retaining your attention.

The Red Number sits at the top of the page on G+, Google Reader, Google Search and various other Google products. It is nearly omnipresent in my own existence. (Thank goodness it's not on Google Analytics or I really wouldn't get any work done.) The red number indicator is a notifier, navigation and engagement feature all in one. It is epic.

It is almost scary though, since you can't help but want to check what's going on when that number lights up and begins to increment. It's Pavlovian in nature. It inspired me to put together a quick LOLcat mashup.

OMG WTF Red Number!

It draws you in (again and again) and keeps you engaged. It's a very slick user interface and Google is smart to integrate this across as many properties as possible. This one user interface may be the way that G+ wins in the long run since they'll have time to work out the kinks while training us to respond to that red number. The only way it fails is if that red number never lights up.

I'll give G+ credit for reducing a lot of the friction around posting and commenting. The interactions are intuitive but are hamstrung by Circles as well as the display and ordering of content.

Content

There is no easy way to add content to G+ right now. In my opinion, this is hugely important because content is the kindling to conversation. Good content begets good conversation. Sure we could all resort to creating content on G+ through posting directly, but that's going to get old quickly. And Sparks as it now stands is not effective in the slightest. Sorry, but this is one feature that seems half-done (and that's being generous). Right now the content through Sparks is akin to a very unfocused Google Alert.

I may be in the minority in thinking that social interactions happen around content, topics and ideas far more often than they do around people. I might interact with people I'm close to on a more personal level, responding to check-ins and status updates but for the most part I believe it's about the content we're all seeing and sharing.

I really don't care if you updated your profile photo. (Again, I should be able to not see these by default if I don't want to.)

Good content will drive conversation and engagement. The easiest way to effect that is by aggregating the streams of content we already produce. This blog, my YouTube favorites, my Delicious bookmarks, my Google Reader favorites, my Last.fm favorites and on and on and on. Yes, this is exactly what FriendFeed did and it has, in many ways, failed. As much as I love the service, it never caught on with the mainstream.

I think some of this had to do with configuration. You had to configure the content streams, and those content streams didn't necessarily have to be yours. But we've moved on quite a bit since FriendFeed was introduced and Google is adhering to the Quora model, requiring people to use their real names on their profiles.

Google is seeking to create a better form of identity, a unified form of identity it can then leverage for a type of PeopleRank signal that can inform trust and authority in search and elsewhere. But identity on the web is fairly transparent, as we all have learned from Rapleaf and others who still map social profiles across the web. Google could quite easily find those outposts and prompt you to confirm and add them to your Google profile.

Again, we've all become far more public and even if email is not the primary key, the name and even username can be used with a fairly high degree of confidence. Long story short, Google can short-circuit the configuration problem around content feeds and greatly reduce the friction of contributing valuable content to G+.

By flowing content into G+, you would also increase the odds of that red number lighting up. So even if I haven't visited G+ in a day (heck I can't go an hour right now unless I'm sleeping) you might get drawn back in because someone gave your Last.fm favorite a +1. Suddenly you want to know who likes the same type of music you do and you're hooked again.

Display

What we're talking about here is aggregation, which has turned into a type of dirty word lately. And right now Google isn't prepared for these types of content feeds. They haven't fixed duplication detection so I see the same posts over and over again. And there are some other factors in play here that I think need to be fixed prior to bringing in more content.

People don't quite understand Circles and seem compelled to share content with their own Circles. The +1 button should really do this, but then you might have to make the +1 button conditional based on your Circles (e.g. - I want to +1 this bicycling post to my TDF Circle.) That level of complexity isn't going to work.

At a minimum they'll need to collapse all of the shares into one 'story'. The dominant story should be the one you've interacted with or, barring prior interaction, the one that comes from someone in your Circles. If more than one comes from your Circles, use the most recent or first from that group.
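That selection rule can be sketched in a few lines. This is purely illustrative - the share structure and names here are assumptions, not any real G+ API:

```python
def pick_dominant(shares, interacted_ids, circle_members):
    """Collapse duplicate shares of one story and pick the dominant one.

    Priority:
      1. a share the viewer has already interacted with
      2. failing that, a share from someone in the viewer's Circles
         (most recent wins if there are several)
      3. failing that, the most recent share overall

    Each share is a dict: {'id', 'author', 'posted_at'} where
    'posted_at' is any sortable timestamp.
    """
    def most_recent(candidates):
        return max(candidates, key=lambda s: s['posted_at'])

    interacted = [s for s in shares if s['id'] in interacted_ids]
    if interacted:
        return most_recent(interacted)
    from_circles = [s for s in shares if s['author'] in circle_members]
    if from_circles:
        return most_recent(from_circles)
    return most_recent(shares)
```

The nice property of a priority list like this is that it degrades gracefully: even with no interaction history and no Circle overlap, the viewer still sees exactly one copy of the story.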

In addition, while the red number interface does deliver the active discussions to me, I think the order of content in the feed will need to change. Once I interact on an item it should be given more weight and float to the top more often, particularly if someone I have in my Circles is contributing to the discussion there.

Long-term it would also be nice to pin certain posts to the top of a feed if I'm interested in following the actual conversation as it unfolds.

The display of content needs to get better before G+ can confidently aggregate more content sources.

Privacy

One of the big issues, purportedly, is privacy. I frankly believe that the privacy issue is way overblown. (Throw your stones now.) As an old school direct marketer I know I can get a tremendous amount of information about a person, all from their offline transactions and interactions.

Even without that knowledge, it's clear that people might talk about privacy but they don't do much about it. If people truly valued privacy and thought Facebook was violating that privacy you'd see people shuttering their accounts. And not just the few Internati out there who do so to prove a point but everyday people. But that's just not happening.

People say one thing, but do another. They say they value privacy but then they'll give it away for a chance to win the new car sitting in the local mall.

Also, it's very clear that people do have a filter for what they share on social networks. The incidents where this doesn't happen make great headlines, but the behavioral survey work showing a hesitance to share certain topics on Facebook make it clear we're not in full broadcast mode.

But for the moment let's say that privacy is one of the selling points of G+. The problem is that the asymmetric sharing model exposes a lot more than you might think. Early on, I quipped that the best use of G+ was to stalk Google employees. I think a few people took this the wrong way, and I understand that.

But my point was that it was very easy to find people on G+. In fact, it is amazingly simple to skim the social graph. In particular, by looking at who someone has in their Circles and who has that person in their Circles.

So, why wouldn't I be interested in following folks at Google? In general, they're a very intelligent, helpful and amiable bunch. My Google circle grew. It grew to 300 rather quickly by simply skimming the Circles for some prominent Googlers.

Over the next few days I did this every once in a while, without putting much effort into it. The interface for finding and adding people is quite good - very fluid. So I got to about 700 in three or four days. And during that time the suggested users feature began to help out, providing me with a never-ending string of Googlers to add.

But you know what else happened? It suggested people who were clearly Googlers but were not broadcasting that fact. How do I know that? Well, if 80% of your Circle are Googlers, and 80% of the people who have you in Circles are Googlers, there's a good chance you're a Googler. Being a bit OCD I didn't automatically add these folks to my Google Circle but their social graph led me to others (bonus!) and if I could verify through other means - their posts or activity elsewhere on the Internet - then I'd add them.

How many people do I have in my Google circle today?

Google Employees G+ Circle

Now, perhaps people are okay with this. In fact, I'm okay with it. But if privacy is a G+ benefit, I don't think it succeeds. Too many people will be upset by this level of transparency. Does the very private Google really want someone to be parsing the daily output of its employees? I'm harmless but others might be trolling for something more.

G+ creates this friction because of the asymmetric sharing model and the notion that you only have to share with the people in your circles. Circles ensures your content is compartmentalized and safe. But it exposes your social graph in a way that people might not expect or want.

Yes, I know there are ways to manage this exposure, but configuration of your privacy isn't very effective. Haven't we learned this yet?

Simplicity

Circles also has an issue with simplicity. Creating Circles is very straightforward, but how content in those Circles is transmitted is a bit of a mystery to many. So much so that there are diagrams showing how and who will see your content based on the Circle permutations. While people might make diagrams just for the fun of it, I think these diagrams are an indication that the underlying information architecture might be too complex for mainstream users. Or maybe they won't care. But if sharing with the 'right' people is the main selling point, this will muddy the waters.

At present there are a lot of early adopters on G+ and many are hell bent on kissing up to the Google team at every turn. Don't get me wrong, I am rooting for G+. I like Google and the people that work there and I've never been a Facebook fan. But my marketing background kicks in hard. I know I'm not the target market. In fact, most of the people I know aren't the target market. I wonder if G+ really understands this or not.

Because while my feed was filled with people laughing at Mark Zuckerberg and his 'awesome' announcement, I think they missed something, something very fundamental.

Keep it Simple Stupid

Yes, hangouts (video chat) with 10 people are interesting and sort of fun. But is that the primary use case for video chat? No, it's not. This idea that 1 to 1 video chat is so dreadful and small-minded is simply misguided. Because what Facebook said was that they worked on making that video chat experience super easy to use. It's not about the Internati using video chat, it's about your grandparents using video chat.

Mark deftly avoided the G+ question but then, he couldn't help himself. He brought up the background behind Groups. I'm paraphrasing here, but Zuckerberg essentially said that Groups flourished because everyone knew each other (that's an eye poke at the asymmetric sharing model) and that ad hoc Groups were vitally important since people didn't want to spend time configuring lists. Again, this is - in my opinion - a swipe at Circles. In many ways, Zuck is saying that lists fail and that content sharing permissions are done on an ad hoc basis.

Instead of asking people to configure Circles and manage and maintain them Facebook is making it easier to just assemble them on the fly through Groups. And the EdgeRank algorithm that separates your Top News from Recent News is their way of delivering the right content to you based on your preferences and interactions. I believe their goal is to automagically make the feed relevant to you instead of forcing the user to create that relevance.

Sure there's a filter bubble argument to be made, but give Facebook credit for having the Recent News tab prominently displayed in the interface.

But G+ could do something like this. In fact, they're better placed than Facebook to deliver a feed of relevant information based on the tie-ins to other products. Right now there is essentially no tie-in at all, which is frustrating. A +1 on a website does not behave as a Like. It does not send that page or site to my Public G+ feed. Nor does Google seem to be using Google Reader or Gmail as ways to determine what might be more interesting to me and who I'm really interacting with.

G+

I'm addicted to G+ so they're doing something right. But remember, I'm not the target market.

I see a lot of potential with G+ (and I desperately want it to succeed) but I worry that social might not be in their DNA, that they might be chasing a mirage that others have already dismissed and that they might be too analytical for their own good.

Google Scribe SEO Hints

June 05 2011 // SEO + Technology // 6 Comments

Lost in the Google +1 button launch and Schema.org announcement was the release of a new version of Google Scribe. In typical Google fashion, this unassuming product may be more important than both Google +1 and Schema.org.

Google Scribe

What is Google Scribe?

Google Scribe is one of a number of Google Labs experiments.

Google Scribe helps you write better documents. Using information from what you have already typed in a document, Google Scribe's text completion service provides related word or phrase completion suggestions. It also checks your documents for incorrect phrases, punctuations, and other errors like misspellings. In addition to saving keystrokes, Google Scribe's suggestions indicate correct or popular phrases to use.

Think of it as an intelligent version of Google Docs.

Language Research Lab

But what is Google Scribe really about? Look no further than the engineer working on the project.

Google Scribe Engineer

That's right, Google Scribe is about language models, something at the core of how Google interprets and evaluates web content.

Since Google Scribe's first release on Google Labs last year, we have been poring over your feedback and busy adding the top features you asked for. Today, we're excited to announce a new version of Google Scribe that brings more features to word processing.

Poring over your feedback might seem like they're reading comments and suggestions submitted by users, but in actuality I'm guessing it's the complex analysis of usage. Google Scribe is about language research. The kind of research helping Google refine algorithmic signals.

Every time you use Google Scribe you're helping to refine the language model by choosing from one of many text completion suggestions. Google is getting smarter about language.

Semantic Proofreading

One of the new features seems to be a direct result of this analysis: semantic proofreading.

Semantic Proofreading Example

Normal spell check would not catch the words in this example because both words are correctly spelled. Yet, the language model has learned that the word awesome is rarely ever preceded by the word quiet.

That's quite awesome.
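A toy sketch shows how a bigram language model could flag exactly this kind of error. The counts below are invented for illustration; a real model would be trained on an enormous corpus:

```python
# Invented bigram counts for illustration only.
bigram_counts = {
    ('quite', 'awesome'): 9500,
    ('quiet', 'awesome'): 3,
    ('quiet', 'room'): 4200,
}
unigram_counts = {'quite': 12000, 'quiet': 8000}

def bigram_probability(w1, w2):
    """Estimate P(w2 | w1) from the toy counts."""
    return bigram_counts.get((w1, w2), 0) / unigram_counts[w1]

def flag_unlikely(w1, w2, threshold=0.01):
    """Flag a correctly spelled but statistically improbable pair.

    'quiet awesome' passes any spell check, but its conditional
    probability is tiny, so it gets flagged; 'quite awesome' does not.
    """
    return bigram_probability(w1, w2) < threshold
```

The key point is that nothing here looks at spelling at all - the signal comes entirely from how often words actually co-occur.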

Good Writing Matters

Unless you've been living under a rock you probably know that Google is using spelling and grammar as a way to determine content quality. Any analysis of Amit Singhal's Panda questions would indicate that grammar and spelling are gaining in algorithmic importance.

I'd recently discussed Google's potential use of spelling and grammar on reviews with Bill Slawski. I wasn't convinced it was a good idea.

But then Barry Schwartz reported on a Google Webmaster Forum response by Google employee John Mueller regarding spelling in user generated content.

This was noteworthy enough to prompt an official Google tweet.

Google Good Spelling Tweet

Is that clear enough for you?

Anchor Text Suggestions

This new version of Google Scribe creates automatic anchor text for a URL. That in itself is pretty interesting, but Google Scribe also gives alternate anchor text suggestions and the ability for the user to create their own.

Here are two examples using my fellow Sphinn editors: Michael Gray and Jill Whalen.

Google Scribe Link Suggestions for Wolf Howl

Google Scribe Link Suggestions for High Rankings

Clearly Google Scribe is already seeing and using back link profiles. But Google will learn about the validity of the anchor text every time someone changes the anchor text from the automated, or primary suggestion, to one of the other suggestions or creates something entirely new.

What happens when Google Scribe determines that the primary suggestion for a URL is rarely used? The implication is that link suggestions could provide a feedback mechanism on overly optimized or 'off-topic' anchor text.

In other words, a paid link signal.

High Quality Documents

I'm convinced Google Scribe is helping to improve Google's ability to interpret and analyze language. But there are indications that Google could be thinking even bigger.

Google Scribe Labs Description

Sure enough the description of Google Scribe starts with that succinct elevator pitch. "Write high-quality documents quickly." The last word tells me it's meant to support the new digital content workflow.

Scribe Bookmarklet and Extension

You can take Google Scribe on the go using the bookmarklet or Chrome extension. I'm using the bookmarklet right now as I'm writing this post.

Google Scribe WordPress Integration

It's a bit clunky from a UX perspective but I see a lot of potential. A more refined product might help sites ensure their users are producing well written user generated content.

Flipping The Funnel

Why limit yourself to the output of content when you can influence the input?

The explosion of digital content has been made possible, in large part, by blogging platforms. Yet, the quality of the content has been uneven, and that's probably being generous. So why not attack the problem at the top of the funnel? Help people write better content.

I like the idea. In fact, I like it so much I'm exploring a side project that does the same thing in a different yet complementary way.

Google Scribe and SEO

Data and Spot Star Trek LOLcat

Like it or not, Google is using spelling and grammar to determine content quality. Google Scribe is one method being used by Google to better understand and evaluate language and anchor text. It's not about the actual product (right now) but about the data (feedback) Google Scribe is producing.

Instead of obsessing about the specifics of the Panda update the SEO community can look to Google Scribe and take the hint. It's not just what you say, it's also how you say it.

So if you're responsible for content, take a few more minutes and proofread your work. Google will.

Mechanical Turk Tips

June 03 2011 // SEO + Technology // 7 Comments

Amazon Mechanical Turk is a great way to do a wide variety of tasks, from content creation to image tagging to usability. Here are 15 tips to get the most out of Mechanical Turk.

Mechanical Turk Logo

Learn The Lingo

What's a HIT? Mechanical Turk can be a bit confusing at first glance. In particular, you'll need to understand this one important acronym.

A HIT is a Human Intelligence Task and is the work you're asking workers to perform. A HIT can refer to the specific task you're asking them to perform but also doubles as the name for the actual job you post in the community.

Select A+ Workers

95 percent or more approval rate for HITs

The long and the short of it is that reputation matters and past performance is a good indicator of future performance. Limit your HITs to those with at least a 95% approval rate.

It may shrink your pool of workers and could increase the time to completion but you make up for it in QA savings.
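For illustration, here's roughly how that 95% floor is expressed when creating a HIT through the current MTurk API. The QualificationTypeId is Amazon's documented system qualification for approval rate; the title, reward and durations are placeholders:

```python
# Amazon's system qualification ID for "percent assignments approved".
APPROVAL_RATE_QUAL = '000000000000000000L0'

def build_hit_params(title, reward, min_approval=95):
    """Assemble CreateHIT-style parameters that restrict the HIT
    to workers with at least `min_approval`% approval rate."""
    return {
        'Title': title,
        'Reward': str(reward),
        'AssignmentDurationInSeconds': 3600,          # give more time than needed
        'AutoApprovalDelayInSeconds': 3 * 24 * 3600,  # auto-approve within ~3 days
        'QualificationRequirements': [{
            'QualificationTypeId': APPROVAL_RATE_QUAL,
            'Comparator': 'GreaterThanOrEqualTo',
            'IntegerValues': [min_approval],
        }],
    }

# Illustrative usage with boto3 (assumes configured AWS credentials):
# boto3.client('mturk').create_hit(
#     MaxAssignments=1, **build_hit_params('Tag 10 photos', 0.25))
```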

Segment Your Workers

Match the right workers to the right task. In my experience, you get better results from US based workers when you're doing anything that requires writing or transcription. Conversely, international workers often excel in tasks such as data validation and duplicate detection.

Give Workers More Time Than They Need

The time you give is the time workers have before the HIT disappears. Imagine starting a job and when you come back to turn in your work and collect payment the shop has closed and left town. This can really frustrate workers.

Mechanical Turk Reward Tip

I think Amazon creates this problem with the messaging around the hourly rate calculation. My advice: don't get too hung up on the hourly rate and err on the side of providing more time for your HITs.

Provide Specific Directions

Remember that you are communicating work at a distance to an unknown person. There's no back-and-forth dialog to clarify.

In addition, workers are looking to complete work quickly and to ensure they fulfill the HIT so their approval rate remains high. The latter, in particular, makes specificity very important.

Tell workers exactly what to do and what type of work output is expected.

Make It Look Easy

While the directions should be specific you don't want a 500 word paragraph of text to scare folks off. Make sure your HIT looks easy from a visual perspective. This means it's easily scanned and understood.

Take advantage of the HTML editor and build in a proper font hierarchy, appropriate input fields and use a pop of color when you really want to draw attention to something important.

Give Your HIT a Good Title

Make sure your HIT title is the appropriate length (not too short or long) and that it's descriptive and appealing.

Mechanical Turk HIT Title examples

A good title is a mixture of SEO and marketing principles. It should be relevant and descriptive but also interesting and alluring.

Bundle The Work

If you can do it, bundle a bunch of small tasks into one HIT. For instance, have them tag 10 photos at a time.

This helps because you can set a higher price for your HIT. You'll attract a larger pool of workers since many don't seek out 'penny' HITs.

Mind Your Email

Workers will email you - frequently. Do not ignore them.

You are joining a community. Just take a peek at Turker Nation. As with any community, you get and build a reputation. Don't make it a bad one. Respond to your email, even if the response isn't what workers want to hear.

In addition, you learn how to tweak your HIT by listening to and interacting with the workers.

Pay Fast

A lot of the email you may receive is around a familiar refrain: "When will you pay?" This gets tedious, so I generally recommend paying quickly. It reduces the amount of unproductive email and gives you a good reputation within the community.

Pay Mechanical Turk HITs Fast

That means setting your automatic approval for something like 2 or 3 days.

Develop a QA System

To pay fast you need a good QA system. You can either do this yourself or, alternatively, put the work out as a separate HIT. That's right, you can use Mechanical Turk to QA your Mechanical Turk work. Insert your Inception or Yo Dawg joke here.

Bonus Good Work

10 Dollar Bill

Give a bonus when you find workers who have done an excellent job on a number of HITs. It doesn't have to be a huge amount, but take the top performers and give them a bonus.

Not only is this the right thing to do, it'll go a long way to establishing yourself in the community and developing a loyal pool of quality workers.

Build a Workforce

Once you find and bonus good workers, continue to give them HITs. You can do this by creating a list of those workers and limiting HITs to just them.

If you do this you probably want to keep the 'Required for preview' box checked so workers not on that list aren't frustrated by previewing a HIT they don't have any chance of working on.

Download the worker history (under Manage > Workers) and use Excel to find high volume and high quality workers. Then create your list (under Manage > Qualification Types) so you can use it in your HIT.
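If you'd rather script it than use Excel, a few lines of Python over the downloaded report will do. The column names below are assumptions - adjust them to match the actual export:

```python
import csv

def top_workers(path, min_hits=20, min_rate=0.98):
    """Return IDs of high-volume, high-quality workers from the
    downloaded worker report (column names are assumptions)."""
    keep = []
    with open(path, newline='') as f:
        for row in csv.DictReader(f):
            approved = int(row['HITs Approved'])
            rejected = int(row['HITs Rejected'])
            total = approved + rejected
            # Require both meaningful volume and a high approval rate.
            if total >= min_hits and approved / total >= min_rate:
                keep.append(row['Worker ID'])
    return keep
```

The resulting IDs are what you'd paste into a qualification type to restrict future HITs to that workforce.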

Block Bad Apples

Just as you build a list of good workers, you also need to block a few of the bad ones. They might have dynamite approval ratings, but for different types of tasks. Some people are good at some things and ... not so good at others.

Coaching workers is time consuming and costly, so it's probably better for you and the worker to simply part ways. You ensure the approval rate on your HITs remains high and the worker won't put their approval rate in jeopardy.

Understand Assignments

Finally, understand and use assignments wisely. Each HIT can be assigned to a certain number of workers.

Warning on Assignments per HIT

So if your HIT is about getting feedback on your new homepage design, you might assign 500 workers to that HIT. That means you'll get 500 reactions to your new homepage. It's one general task that requires multiple responses.

But if your HIT is about validating phone numbers for 500 businesses, you will assign 1 worker to each HIT. That means you'll get one validation per phone number. Do not assign 500 workers or you'll get 500 validations per phone number. That's wasteful and likely to irk those businesses too.

Mechanical Turk Tips

These tips are the product of experience (both mine and the talented Drew Ashlock), of trial and error, of stubbing toes during the process.

I hope this helps you avoid some of those pitfalls and allows you to get the most out of a truly innovative and valuable service.

Yahoo Email Hacked

May 23 2011 // Rant + Technology // 409 Comments

(IMPORTANT: Before I get to my story, if your Yahoo! email has been hacked I recommend that you immediately change your password, update your security questions and ensure your Yahoo! Mobile and Y! Messenger are both up-to-date. You should also visit Yahoo! Email Abuse Help and use this process if you are unable to login to your Yahoo! account. Also, make sure to read the comments on this post since there is a tremendous amount of good information there as well.)

(UPDATE 12/13/11: Yahoo has introduced second sign-in verification as an added security measure. It will require that you add a mobile phone number and verify it via a text message. Here's the direct link to start using second sign-in verification.)

It happened just before we arrived at the San Francisco Zoo. We were at a red light on Sloat Boulevard when my phone started to vibrate.

Buzz. Buzz. Buzz. Buzz. Buzz. Buzz. Buzz. Buzz. Buzz. Buzz. Buzz. Buzz. Buzz.

Had the rapture come a day late? No. I was getting undeliverable messages. Lots of them. My Yahoo email had been hacked!

admiral akbar star wars its a trap spoof

Here are the two important lessons I learned as a result.

I Have Good Friends

I didn't want our day at the Zoo ruined, me staring into my phone resetting passwords and figuring out what happened. So I put the problem on the back burner and proceeded to have a fun family day.

But I did take time to quickly tap out a response to people who replied to the spam coming from my hijacked account. Why? Because they took the time and effort to give me a heads up that I had a problem. These were good people. Good friends.

The thing is, I'd gotten a number of these same emails lately from other hacked Yahoo accounts. I figured these people knew they'd been compromised and I didn't need to respond. With the shoe on the other foot, I realized those emails were comforting even though I was well aware of the problem.

I'll shoot off an email the next time I get a hacked email from someone.

Yahoo Email Security Failed

The odds are that I will get another one of those emails because I learned just how easy Yahoo makes it for hackers.

Upon getting home I went about securing my account. On a lark, I checked Yahoo's 'View your recent login activity' link.

yahoo recent login activity

Sure enough at 10:03 AM my account was accessed from Romania. This obvious login anomaly didn't set off any alarms? Shouldn't my security questions have been presented in this scenario? I have never logged in from Romania before.

I've never logged in from outside the US. Yahoo knows this. In fact, Yahoo knows quite a bit about my location.

yahoo location history

My locations put me in three states: California, New York and Pennsylvania. I also have location history turned on, so it's not just my own manually saved locations (some of which are ancient), but Yahoo's automated location technology keeping track of me.

Do you see Romania in this list? I don't.

Why is Yahoo making it this easy for spammers to hijack accounts? Make them work a little bit! At a minimum, make them spoof their location.

Yahoo should have noted this anomaly and used my security questions to validate identity. I still would have had to change my password (which wasn't that bad) but I would have avoided those embarrassing emails.

A simple rule set could have been applied here where users are asked to validate identity if the login (even a successful one) is outside of a 500 mile radius of any prior location.
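To illustrate, here's a minimal sketch of what such a rule could look like. The haversine distance check and the exact threshold are my own assumptions for illustration, not Yahoo's actual logic:

```javascript
// Great-circle distance between two points in miles (haversine formula).
function distanceMiles(lat1, lon1, lat2, lon2) {
  const toRad = (d) => (d * Math.PI) / 180;
  const R = 3959; // Earth's mean radius in miles
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Flag a login as anomalous if it's more than 500 miles
// from every location the user has previously logged in from.
function isAnomalousLogin(login, priorLocations, radiusMiles = 500) {
  return priorLocations.every(
    (loc) => distanceMiles(login.lat, login.lon, loc.lat, loc.lon) > radiusMiles
  );
}

// Example: prior logins in San Francisco and New York.
const prior = [
  { lat: 37.77, lon: -122.42 }, // San Francisco
  { lat: 40.71, lon: -74.01 },  // New York
];

// A login from Bucharest trips the rule; one from San Jose doesn't.
console.log(isAnomalousLogin({ lat: 44.43, lon: 26.1 }, prior));   // true
console.log(isAnomalousLogin({ lat: 37.34, lon: -121.89 }, prior)); // false
```

An anomalous login wouldn't have to be blocked outright; it could simply trigger the security questions before letting the session proceed.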

I've had a Yahoo account for over 10 years without a problem, even as I moved my business accounts over to Gmail.

Yesterday I thanked those friends who had my back. Unfortunately, Yahoo wasn't one of them.

WordPress Duplicate Content

April 27 2011 // Rant + SEO + Technology // 21 Comments

In February Aaron Bradley sent me an email to let me know that I had a duplicate content problem on this blog. He had just uncovered and rectified this issue on his own blog and was kind enough to give me a heads up.

Comment Pagination

The problem comes in the way that WordPress handles comment pagination. The default setting essentially creates a duplicate comment page.

Here's what it looks like in the wild. Two pages with the same exact content.

http://blog.wolframalpha.com/2011/04/18/new-age-pyramids-enhance-population-data/comment-page-1/

http://blog.wolframalpha.com/2011/04/18/new-age-pyramids-enhance-population-data

That's not good. Not good at all.

Comment-Page-1 Problem

The comment-page-1 issue offends my own SEO sensibilities, but how big of a problem is it really?

WordPress Spams Google

There are 28 million inurl results for comment-page-1. 28 million!

Do the same inurl search for comment-page-2 and you get about 5 million results. This means that only 5 million of these posts attracted enough comments to create a second paginated comment page. Subtract one from the other and you wind up with 23 million duplicate pages.

The Internet is a huge place so this is probably not a large percentage of total pages but ... it's material in my opinion.

Change Your Discussion Settings

If you're running a WordPress blog I implore you to do the following.

Go to your WordPress Dashboard and select Settings --> Discussions.

How To Fix Comment-Page-1 Problem

If you regularly get a lot of comments (more than 50 in this default scenario) you might want to investigate SEO friendly commenting systems like Disqus, IntenseDebate or LiveFyre.

Unchecking the 'break comments into pages' setting will ensure you're not creating duplicate comment pages moving forward. Prior comment-page-1 URLs did redirect, but seemed to be doing so using a 302 (yuck). Not satisfied I sought out a more permanent solution.

Implement an .htaccess RewriteRule

It turns out that this has been a known issue for some time and there's a nice solution to the comment-page-1 problem in the WordPress Forum courtesy of Douglas Karr. Simply add the following rewrite rule to your .htaccess file.

RewriteRule ^(.*)/comment-page-1/ $1/ [R=301,L]

This puts 301s in place for any comment-page-1 URL. You could probably use this and keep the 'break comments into pages' setting on, which would remove duplicate comment-page-1 URLs but preserve comment-page-2 and above.

Personally, I'd rather have the comments all on one page or move to a commenting platform. So I turned the 'break comments into pages' setting off and went a step further in my rewrite rule.

RewriteRule ^(.*)/comment-page-.* $1/ [R=301,L]

This puts 301s in place for any comment-page-#. Better safe than sorry.

Don't Rely on rel=canonical

Many of the comment-page-1 URLs have a rel=canonical in place. However, sometimes it is set up improperly.

Improper Rel=Canonical

Here the rel=canonical actually reinforces the duplicate comment-page-1 URL. I'm not sure if this is a problem with the Meta SEO Pack or simple user error in using that plugin.

Many times the rel=canonical is set up just fine.

Canonical URL from All-In-One SEO Pack

The All in One SEO Pack does have a Canonical URL option. I don't use that option but I'm guessing it probably addresses this issue. The problem is that rel=canonical doesn't stick nearly as well as a 301.

Comment-Page-1 in SERP

So even though this post from over three months ago has a rel=canonical, the comment-page-1 URL is still being returned. In fact, there are approximately 110 instances of this on this domain alone.

Comment Page 1 Site Results

Stop Comment-Page-1 Spam

23 million pages and counting. Sure, it would be nice if WordPress would fix this issue, but short of that it's up to us to stop this. Fix your own blog and tell a friend.

Friends don't let friends publish duplicate content.

Open Graph Business Intelligence

April 06 2011 // Social Media + Technology // 1 Comment

Facebook's Open Graph can be used as a valuable business intelligence tool.

Here's how easy it can be to find out more about the people powering social media on your favorite sites.

How It Works

The Open Graph is populated with meta tags. One of these tags is fb:admins which is a list of Facebook user IDs.
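For reference, the markup looks something like this in the page head (the user IDs below are made up for illustration):

```html
<!-- fb:admins takes a comma-separated list of Facebook user IDs -->
<meta property="fb:admins" content="12345678,87654321" />
```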

fb:admins open graph tag

Here we are on a Time article that is clearly using the Open Graph.

Sample Time.com Article

The fb:admins tag is generally found on the home page (or root) of a site because that's one of the ways you grant people access to Insights for Websites.

Lint Bookmarklet

You could open up a new tab and go to the Facebook Linter Tool to enter the domain or you can use my handy Bookmarklet that gives you one-click access to Lint that site.

Get Lint Info

Drag the link above to your bookmark bar and then click on it anytime you want to get information about the Open Graph mark-up from that site's home page.
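If you'd rather build your own, a bookmarklet is just a `javascript:` URL saved as a bookmark. Here's a sketch of how mine constructs the linter link; the tool's endpoint and its `url` parameter are assumptions based on the current Facebook Linter and may change:

```javascript
// Build the Facebook Linter URL for a given site's home page.
// (The endpoint and ?url= parameter are assumptions, not a documented contract.)
function lintUrl(hostname, protocol = "http:") {
  return (
    "http://developers.facebook.com/tools/lint/?url=" +
    encodeURIComponent(protocol + "//" + hostname)
  );
}

// The drag-to-bookmark form wraps the same logic in a javascript: URL:
// javascript:window.open('http://developers.facebook.com/tools/lint/?url='+encodeURIComponent(location.protocol+'//'+location.hostname))

console.log(lintUrl("www.time.com"));
// → http://developers.facebook.com/tools/lint/?url=http%3A%2F%2Fwww.time.com
```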

Linter Results

The results will often include a list of Facebook IDs. In this instance there are 8 administrators on the Time domain.

Facebook Lint for Time

Click on each ID to learn as much as that person's privacy settings will allow. You can find out quite a bit when you do this.

In this instance I've identified Time's Technical Lead, a Senior Program Manager (with a balloon decorating company on the side), a bogus test account (against Facebook rules) and the Program Manager, Developer Relations for ... Facebook.

I guess it makes sense that Time would get some special attention from Facebook. Still, it raised my eyebrows to see a Facebook staffer as a Time administrator.

Cat Lee

Cat actually snagged 'cat' as her Facebook name (nicely done!) and says her favorite football team is the Eagles. I might be able to strike up a conversation with her about that. Go Eagles!

I'd probably also ask her why a fake test account is being used by Time.

Tester Time on Facebook

That is unless Time really does have a satanic handball enthusiast on staff.

Dig Deeper

Sometimes a site won't use fb:admins but will authenticate using fb:app_id instead. But that doesn't mean your sleuthing has come to an end. Click on the App ID number and you'll usually go to that application.

Time Facebook Application Developer Information

By clicking on Info I'm able to view a list of Developers. Some of these I've already seen via fb:admins but two of them are actually new, providing a more robust picture of Time's social media efforts and resources.

You'll only be stymied if the site is using fb:page_id to authenticate. That's generally a dead end for business intelligence.

Open Graph Business Intelligence

I imagine this type of information might be of interest to a wide variety of people, from recruiters to journalists to sales and business development professionals. You could use this technique on its own or collect the names and use LinkedIn and Google to create a more accurate picture of those individuals.

How would you use this information?

Google Personalized Search

March 21 2011 // SEO + Social Media + Technology // Comment

Google recently launched a new feature that allows users to personalize their search results by blocking certain domains. What impact will this have and what does it mean for the future of search?

The Smiths

Artificial Intelligence

A recent New York Post article by Peter Norvig discussed advances in artificial intelligence. Instead of creating HAL, the current philosophy is to allow both human and computer to concentrate on what they do best.

A good example is the web search engine, which uses A.I. (and other technology) to sort through billions of web pages to give you the most relevant pages for your query. It does this far better and faster than any human could manage. But the search engine still relies on the human to make the final judgment: which link to click on, and how to interpret the resulting page.

The partnership between human and machine is stronger than either one alone. As Wernher von Braun said when he was asked what sort of computer should be put onboard in future space missions, “Man is the best computer we can put aboard a spacecraft, and the only one that can be mass produced with unskilled labor.” There is no need to replace humans; rather, we should think of what tools will make them more productive.

I like where this might be leading and absolutely love the idea of personalized results. Let me shape my own search results!

Human Computer Information Retrieval

I've been reading a lot about HCIR lately. It's a fascinating area of research that could truly change how we search. Implemented the right way, search would become very personal and very powerful.

The challenge seems to be creating effective human computer refinement interfaces. Or, more specifically, interfaces that produce active refinement, not passive refinement.

At present, Google uses a lot of passive refinement to personalize results. They look at an individual's search and web history, track click-through rate and pogosticking on SERPs and add a layer of geolocation.

Getting users to actively participate has been a problem for Google.

Jerry Maguire

A Brief History of Google Personalization

Google launched personalized search in June of 2005 and expanded their efforts in February of 2007. But the first major foray into soliciting active refinement was in November of 2008 with the launch of SearchWiki.

This new feature is an example of how search is becoming increasingly dynamic, giving people tools that make search even more useful to them in their daily lives.

The problem was that no one really used SearchWiki. In the end it was simply too complicated and couldn't compete with other elements on the page, including the rising prominence of universal search results and additional Onebox presentations.

In December of 2009 Google expanded the reach of personalized search.

What we're doing today is expanding Personalized Search so that we can provide it to signed-out users as well. This addition enables us to customize search results for you based upon 180 days of search activity linked to an anonymous cookie in your browser.

This didn't go down so well with a number of privacy folks. However, I believe it showed that Google felt personalized search did benefit users. They also probably wanted to expand their data set.

In March of 2010 SearchWiki was retired with the launch of Stars.

With stars, we've created a lightweight and flexible way for people to mark and rediscover web content.

Stars wasn't really about personalizing results. It presented relevant bookmarks at the top of your search results. Google clearly learned that the interaction design for SearchWiki wasn't working. The Stars interaction design was far easier, but the feature benefits weren't compelling enough.

A year later, Stars was replaced with blocked sites.

We’re adding this feature because we believe giving you control over the results you find will provide an even more personalized and enjoyable experience on Google.

Actually, I'm not sure what this feature is called. Are we blocking sites or hiding sites? The lack of product marketing surrounding this feature makes me think it was rushed into production.

In addition, the interaction design of the feature is essentially the same as FriendFeed's hide functionality. Perhaps that's why the messaging is so confused.

Cribbing the FriendFeed hide feature isn't a bad thing - it's simple, elegant and powerful. In fact, I hope Google adopts the extended feature set and allows results from a blocked site to be surfaced if it is recommended by someone in my social graph.

Can Google Engage Users?

I wish Google would have launched the block feature more aggressively and before any large scale algorithmic changes. The staging of these developments points to a lack of confidence in engaging users to refine search results.

Google hasn't solved the active engagement problem. Other Google products that rely on active engagement have also failed to dazzle, including Google Wave and Google Buzz.

I worry that this shortcoming may cause Google to focus on leveraging engagement rather than working on ways to increase the breadth and depth of engagement.

In addition, while we’re not currently using the domains people block as a signal in ranking, we’ll look at the data and see whether it would be useful as we continue to evaluate and improve our search results in the future.

This may simply be a way to reserve the right to use the data in the future. And, in general, I don't have a problem with using the data as long as it's used in moderation.

Curated data can help augment the algorithm. Yet, it is a slippery slope. The influence of others shouldn't have a dramatic effect on my search results and certainly should not lead to sites being removed from results altogether.

That's not personalization, that's censorship.

SERPs are not Snowflakes

All of Google's search personalization has been relatively subtle and innocuous. Rank is still meaningful despite claims by Chicken Little SEOs. I'm not sure what reports they're looking at, but the variation in rank on terms due to personalization is still low.

SERPs are not Snowflakes

Even when personalization is applied, it is rarely a game changer. You'll see small movement within the rankings, but not wild changes. I can still track and trend average rank, even with personalization becoming more commonplace. Given the amount of bucket testing Google is doing I can't even say that the observed differences can be attributed solely to personalization.

I don't use rankings as a way to steer my SEO efforts, but to think rank is no longer useful as a measurement device is wrong. Yet, personalization still has the potential to be disruptive.

The Future of Search Personalization

Google needs to increase the level of active human interaction with search results. They need our help to take search to the next level. Yet, most of what I hear lately is about Google trying to predict search behavior. Have they given up on us? I hope not.

Gary Marchionini, a leader in the HCIR field, puts forth a number of goals for HCIR systems. Among them are a few that I think bear repeating.

Systems should increase user responsibility as well as control; that is, information systems require human intellectual effort, and good effort is rewarded.

Systems should be engaging and fun to use.

The idea that the process should be engaging, fun to use and that good effort is rewarded sounds a lot like game mechanics. Imagine if Google could get people to engage search results on the same level as they engage with World of Warcraft!

World of Google

Might a percentage-complete device, popularized by LinkedIn, increase engagement? Maybe, like StackOverflow, certain search features are only available (or unlocked) once a user has invested time and effort? Game mechanics not only increase engagement but help introduce, educate and train users on that product or system.

Gamification of search is just one way you could try to tackle the active engagement problem. There are plenty of other avenues available.

Personalization and SEO

I used the cover artwork from the Smiths' last studio album at the beginning of this post. I thought 'Strangeways, Here We Come' was an apt description for the potential future of personalized search. However, a popular track from this album may be more meaningful.

Stop me if you think you've heard this one before.

SEO is not dead, nor will it die as a result of personalization. The industry will continue to evolve and grow. Personalization will only hasten the integration of numerous other related fields (UX and CRO among others) into SEO.

The block site feature is a step in the right direction because it allows control and refinement of the search experience transparently without impacting others. It could be the start of a revolution in search. Yet ... I have heard this one before.

Let's hope Google has another album left in them.