About Rand Fishkin
Rand Fishkin uses the ludicrous title, Wizard of Moz. He’s founder and former CEO of Moz, co-author of a pair of books on SEO, and co-founder of Inbound.org. Rand’s an unsaveable addict of all things content, search, & social on the web, from his multiple blogs to Twitter, Google+, Facebook, LinkedIn, and a shared Instagram account. In his minuscule spare time, he likes to gallivant around the world with Geraldine and then read about it on her superbly enjoyable travel blog.
In his Learn Inbound talk, Rand covers Moz’s recently released search ranking factors correlation data, opinions of the SEO field collected via survey, and the results of some recent tests performed on Google. Together, these will help illustrate how the search giant is ranking things today, and what tactics are working in SEO.
- Social shares don’t factor into rankings, or at least they factor minimally. There is a small correlation between heavy social sharing and higher rankings, but it usually doesn’t have a knock-on effect.
- However, social engagement on the page is of some importance. Onsite engagement does factor into the rankings.
- Social shares on their own aren’t enough; amplification and outreach are vital to earning links. Don’t be afraid to pay/work for links from trusted sources.
- While the algorithm may be flattening and the SERPs are growing increasingly smart, link-building and outreach are still the missing pieces in an already very complex jigsaw.
All right, let's dive in. So we surveyed about 150 SEO professionals, people we've identified as having done very impressive work and having good opinions about how Google works, and this is the 2013 view. You can see something interesting. So what I like to look at, and we'll go through this in a sec, is how SEO professionals' opinions have changed over time. To me, that's fascinating, right? Because I actually think this aggregated opinion data is pretty solid. It does a reasonably good job of describing what we are feeling in the search results and from Google as well. So there's 2009, 2011, 2013, and 2015. I'll show you all of these together, and you'll get a real sense, I think, just visually, assuming the color changes work. '09, 2011, 2013, 2015.
Do you guys see what's happening? It's a little tough, but basically, from these big Pac-Man-like chunks, we're getting smaller and smaller, like everything is flattening out. Everything is getting like, "Oh, it's both more complex and more nuanced." Right? There's no one dominating factor like there used to be.
So I'll show you this. I'll illustrate this in a different way. Domain-level link features, right? Basically, people saying domain authority, how important is domain authority, the power, the ranking power of an individual domain to rank content in Google. 2009, right? Twenty four percent and then down to 14 and a half percent. And in 2009, page-level link features at 43% drifting down to also 14 and a half percent. And what about keyword usage? Well, looks like keyword usage kind of stayed the same. So you get that nuanced, flattening view of what Google's algorithm is doing.
One of the things, by the way, that we couldn't include because we hadn't had it in previous iterations is around user and usage data and engagement data, which this time for the first time we included and is also in this range. I find that fascinating. I think user and usage data is sort of the missing key. When you look at a set of Google search results and you see, "Hey, it's not that powerful a domain. It doesn't have that many links pointing to the page itself. The keywords don't seem all that optimized." What is that missing element?
Will pointed out earlier this past week at SearchLove that he saw some content — that Pinterest one he mentioned, from the Moz Top 10 email. Anybody subscribed to the Moz Top 10 email? A few folks. So with the Moz Top 10 email, when you click on those links, many of you, most of you in Chrome, it looks to us like Google is actually seeing that data, and that those clicks are translating in aggregate, in large quantities with high engagement, etc., etc., into rankings. And so we've been tracking the rankings for stuff that we put in the Moz Top 10, and they do indeed rise. So engagement data, I think, is that big missing piece.
So some big takeaways, right? I think professional SEOs are seeing this flattening of the algorithm. We're seeing that after years of kind of dominating how search results work and how Google ranks pages that links are becoming not the only thing. They're still a powerful thing, and we have tests showing that as well, but they are not the dominant force they used to be. And engagement data, very clearly on the rise.
So I'm also going to talk a little bit about correlation data, because we do two things with the ranking factors each year. We collect opinion data, and we also collect data about the mathematical correlations between individual factors and rankings — whether higher-ranking pages tend to have them and lower-ranking pages don't. So, yeah, correlation doesn't imply causation, but to finish that quote in its entirety, it sure is a strong hint. Correlation tells us something else of great value. It's not saying why these pages rank highly, or why the pages at the top rank above the pages below.
What it does say is in aggregate, what features do those pages have that these lower-ranking pages on average don't have? And I don't know about you, but I care deeply about that. I almost don't care whether or not it's causal. I am deeply interested in the raw correlation, just like I'm deeply interested not in whether something got a bunch of retweets because it had a word in it, but which words did successful tweets have in them? Correlation data is fascinating and interesting to me by itself.
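To make concrete how a rank-correlation study like this works, here's a minimal Python sketch. All of the numbers below are invented for illustration — the real study pulled thousands of SERPs — and linking root domains is just one example feature.

```python
# Minimal sketch of a rank-correlation study, with invented data.
# Spearman correlation is rank-based, which suits ordinal SERP positions.

def rank(values):
    # Simple ranking; this toy data has no ties.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman(x, y):
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# One hypothetical SERP: positions 1-10 and each result's linking root domains.
positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
linking_root_domains = [120, 95, 60, 80, 30, 25, 40, 10, 8, 5]

# Negate positions so that position 1 ("higher" rank) counts as larger.
rho = spearman([-p for p in positions], linking_root_domains)
print(round(rho, 2))  # strongly positive in this toy data
```

A real study would average this per-SERP coefficient across many thousands of keywords, which is why the reported values land in the modest 0.1–0.4 range rather than near 1.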
It also lets us do some cool things. So three big applications. I'm going to walk through examples of these. Number one, correlation data lets us debunk statements that are not causal because something that is negatively correlated or not correlated cannot be a strong influencer.
Second, it lets us show relative potential influence, right? And third, of course, it lets us identify factors that we might want to do more testing, more investigation around.
So there's this one asshole that we have in my country. I'm sure you have assholes here, although probably not this big. We grow them real big in America. So this person that I'm ashamed is also from America is going to make some statements, some ridiculous statements like, "Google are losers. The more ads you buy, the higher you rank." And of course, we have some correlation data that can say, "Well, Mr. The Donald, in fact, ads have a negative correlation with ranking highly," suggesting that actually the fewer ads you have on average...it's a very small negative correlation, right? But the better you'll rank. So that can disprove this kind of conspiracy theory crap.
There are also things like, "Oh, SEOs who use multiple keyword repetitions, they're going to do way better than folks who are following some fancy LDA topic modeling system." Well, actually, it turns out also not true at all. In fact, and I'll show this a little bit later, but the more sophisticated we get with topic modeling algorithms, the better we tend to see the correlation, suggesting that Google is indeed using fairly sophisticated word and language in topic models, which I don't think is a surprise to anyone.
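As a toy contrast between keyword repetition and topical breadth — my own invented example, not anything from the study — compare a raw keyword count with a crude coverage score. Real topic models (LDA, embeddings) are far more sophisticated, and the related-terms list here is made up.

```python
# Crude contrast between keyword stuffing and topical breadth.
# The related-terms vocabulary is invented for illustration.
related_terms = {"coffee", "espresso", "roast", "beans", "brew", "grinder"}

def keyword_count(text, keyword):
    # Raw repetitions of the target keyword.
    return text.lower().split().count(keyword)

def topic_coverage(text, terms):
    # Fraction of the related-term vocabulary the page touches.
    words = set(text.lower().split())
    return len(words & terms) / len(terms)

stuffed = "coffee coffee coffee buy coffee best coffee deals coffee"
topical = "how to brew espresso from fresh roast beans with a burr grinder"

print(keyword_count(stuffed, "coffee"), topic_coverage(stuffed, related_terms))
print(keyword_count(topical, "coffee"), topic_coverage(topical, related_terms))
```

The stuffed page wins on repetitions but touches almost none of the related vocabulary; the topical page never repeats the keyword yet covers most of it — the kind of distinction a topic-modeling score captures and a keyword counter misses.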
Correlation can also lead us to interesting things that we can validate through other means, right? So we might look at something like partial match anchor text, for the first time ever, has a higher correlation with ranking than exact match anchor text. In all of the previous years in previous tests, that was not the case. And so we might go, "Hey, you know what? Maybe partial match is becoming more influential, more powerful than it has been in the past. Let's go run some experiments and see if this is true." I didn't include it here, but yes, we actually did run these experiments. FYI, long story short, no. Exact match anchor text still more powerful than partial. I think the reason that we see that correlation happening is because there's so many people who abuse exact match anchor text that you're going to see essentially a lot of spammy, manipulative stuff at the lower end of the rankings versus higher.
It is true, by the way, that Google... Well, I believe Google when they say that they have hundreds, many hundreds of elements in their ranking algorithm. And if that is true, then we should not expect that we would see things in the super high positive correlation. If you saw a factor, a single ranking factor in Google that was 0.8 or above, that would suggest to us that that one factor overwhelms everything else, that it really doesn't matter anything else that you do, as long as you do that one thing.
So as you might imagine, the overwhelming majority, in fact, every ranking factor we've ever been able to look at historically is somewhere between a -0.4 and a +0.4, +0.45, which I think is expected behavior. That's not surprising.
Alright, so let's look at a few of these. We collected about 16,500 search results from google.com. This is U.S. earlier this year. And you'll actually see some familiar stuff in here. So these are link metrics' correlations with rankings which haven't changed much in the last six years. Despite the fact that SEOs feel like links are less powerful, the correlation with ranking hasn't shifted really at all, which surprises me. I keep expecting when we run these one of these years, it's going to start dropping, and it just doesn't.
Another interesting thing we were able to do this year. And I want to thank the folks at Ahrefs. They're sort of competitors, but in our field, competitors are also friends, which I love about the SEO world. And you can see Moz and Ahrefs are really, really similar — like super, super similar when it comes to raw correlations.
Social shares. This one is kind of interesting to me. They're down from their high. The first time we looked at them was I think 2009. They rose from '09 to 2011. They rose again in 2013. And now they've actually dropped back down a little bit. So social shares, not quite as well-correlated as they were last year, if you remember, or two years ago. Two years ago, they were nearing what links were. In fact, Google+ shares were like weirdly high two years ago, and we were sort of going, "Wait. Why is that? What could be happening there?"
Traffic and engagement. For the first time, we managed to get this. So traffic does look strongly correlated, but that's a little bit chicken-and-egg to me. This is data from SimilarWeb, which has a panel of about 30 million users, so fairly comprehensive and, I think, reasonably trustworthy, at least for well-trafficked sites. And with this one, what I was hoping I would see is that bounce rate and time on site would actually be well-correlated. When I didn't see that, my next step was, "Okay, I want to look exclusively at search traffic for bounce rate and engagement," because this is all bounce rate and engagement. And I don't think Google cares at all if a page does really well on Reddit or Facebook, and then people visit it from those places and bounce right back. Google doesn't go, "Oh, well, that's clearly a terrible page because it did really well on Facebook and got a lot of traffic there, but the bounce rate was high." What I think Google does care about, and we can see this in some of the experiments we've run recently, is pogo-sticking, which is essentially: I perform a search, I click on a result, and then I go, "This is terrible. I don't like this. This sucks. I'm going right back to the search results and clicking on someone else's result." That, they really don't like. And that's not going to be reflected in these numbers, so we need to revisit that.
Keyword use and on-page optimization. Again, like I talked about, that topic modeling stuff, I think to my mind, this suggests a huge hole, a huge gap in our current SEO tools landscape. I think we need much more sophisticated topic modeling algorithms in the SEO world that better replicate whatever Google is doing internally. And if we do, whoever does that, knock on wood that maybe Moz can do it, but whoever does that I think is going to have a wildly successful product. And those of us who pay attention to that are going to see that our rankings can improve just from the content, like the words and phrases and topics that we can include on our pages.
For the first time, we actually did something cool. We were able to break down by category of keywords and search results. So I think this is super slick and I was surprised at the delta between different categories. So this is essentially categories of keywords. Here's health at the top, right? And then like occasions and gifts, dining and nightlife, retailers and general merchandise. And look at those, right? That's pretty substantial.
This is linking out on a page, right? So essentially, should you link out from your website? Not should you — do pages that link out more, on average, tend to rank higher than pages that don't link out? In aggregate, 0.09. But look at the swing. If you're a restaurant website, Google doesn't give two shits if you linked to anybody else, right? Which I think is fine. And I'm not saying it's an actual Google ranking factor. It may not be, right? It might be that restaurant websites don't earn more links or shares or engagement from linking out. It could be something else, right? But clearly, the folks in the health category who link out tend to significantly outperform their peers who don't, versus the folks in dining and nightlife, where that's not the case.
This is external exact match anchor text links coming in. And again, you can see home and garden, apparel, beauty and personal care significantly lower than jobs and education, news media and publications, travel and tourism, where those factors seem to correlate much better with rankings. This is document length.
You have almost certainly heard if you've paid attention to kind of what's been going on in the social sphere, like Buzzfeed and Upworthy talking about how lengthier articles always perform so much better for them, right? Well, you know what? In arts and entertainment, and news media, and family and community, this is true in Google, too. Lengthier documents tend to perform much better. But you know what? That tiny little restaurant website that you've got, or a retail website, that is not the case. It's not the case that a retailer's website tends to perform much, much better when the content is dramatically longer.
So Amazon aside, and I bet Amazon is probably 0.06 of that 0.07 because they have such long product pages, that's not necessarily at least helping those folks rank like it is some of these other ones.
This is social shares. Twitter is at the top here. I know this is hard to read, but these are Twitter. These are Facebook. So Twitter, Facebook, Twitter, Facebook, Twitter, Facebook. And correlations are relatively similar across this. Interestingly enough, the share activity ratio for Twitter and Facebook is pretty similar to the correlation ratio between them, which to me suggests Google is doing exactly what they say they're doing and not paying any attention to social shares. I don't think they use social shares in their algorithm at all. I believe them when they say that. I think they're using the engagement that often comes, accompanies social sharing and social amplification in those algorithms, just like I talked about with the Moz Top 10 in email.
So I think, look, correlations with links maybe suggests to me that links, in our experiments, too, suggests to me that while it feels like Google's algorithm is flat and while it feels like there's other things going on, links can still move the needle. They still have that ability. And in many competitive niches, it is hard to move that needle unless you have links. I think a lot of professional SEOs feel that way already, but that's reinforcing that.
Second, I mentioned, I think we desperately need these more sophisticated tools when it comes to on-page. I think there's huge amounts of room for optimization there. I hope some really smart folks from the world of computer science and natural language processing consider putting their efforts towards our problem. I think they'll make a lot of money doing it if they do.
And third, correlation, I think, in my opinion, clearly is way more useful the more granular we get. Imagine this. Imagine just the 500 keywords or 1000 keywords that you, your company, your organization care about. Imagine being able to see the correlations for various features for your keywords across your entire industry and across a broad group of all keywords. So now you're like, "Oh, look, document length is moderately well-correlated broadly. It's very well-correlated in my industry. And for my particular search results, it has a very high correlation. Clearly, that's something I should be paying attention to much more so than the average website." I think that is a killer feature. I think that is the future of what we should be doing when we aggregate rankings data. To my mind, that's just totally clear.
Okay. So some examples of what we can do with this stuff. We can help validate things that Google says. I mentioned that at some point in the near future...wait. Is somebody from Google Ireland here? What? My facial expression is totally open, and I hope that you are here and that you say nice things about me when you go home.
Okay. I have been thinking. I don't know. Barry, you'll back me up on this. I'm thinking like maybe next year, I might make a whole presentation that's just things Google said and then we prove them wrong, right? Like, "Oh, yeah, 302s are just like 301s." Wait. Are you sure? Okay.
So we can verify theories about what's in Google's algorithm, right? So we can take a look and say, "Oh, we're pretty sure. We think this thing is here. Let's go run some tests and actually validate whether that's causal or purely correlation." We can do that. And I think we can find some better tactical approaches. I'll show you all three.
One, let's talk about secure sites, right? So SSL. Google comes out and says, "HTTPS is a ranking signal. We're going to be boosting the rankings of sites that are secure." And look, there's a million good reasons to make your sites secure. I don't dispute this fact. But Moz did it, right? We were like, "Okay, look, we're supposedly thought leaders in the SEO world. Moz clearly needs to do this." You can dive into the data yourself, but I've screenshotted our Google Analytics, and it shows different points along the line. Basically, what happened when we switched to SSL was exactly what happened when we switched domains, right? When we moved from seomoz.org to moz.com in 2013, we took a hit for around three-ish months. And then, as Google indexed and got everything back in there, we came back and were stronger than ever going forward. And that's exactly what happened with SSL. So, I mean, look, here's HTTPS. That is a 0.04 correlation with higher rankings — lower in some industries, higher in others. But there are a lot of factors with a way higher correlation — not causal, but correlation — with higher rankings that Google says are not factors at all. So when they say it's a ranking factor, I wonder if they mean, "We will give you the tiniest smidgen of a nudge forward." Probably not enough to move anything, but that's not what the blog post says, right? The blog post is just like, "You should go SSL."
Then Google is like, "Well, if you're an SEO and you're not recommending that people go HTTPS, you are wrong and you should feel bad." So I'm kind of a dick, so I'm like, "Well, if you're Google and you're not transparent about the fact that it takes time to recover from this change, you should feel bad, too." And credit to Englishman Dave Naylor who was like, "Yes, I indeed felt bad when I switched to HTTPS and lost 20% of my traffic for three months. That did feel bad." So yeah.
We talked about the 302 thing, so I won't go into that other than to say I suspect...I have this weird feeling. Have you guys heard a rumor that there's like a troll factory somewhere in Zurich? I think there might be. I can't prove it, but it might...okay.
Raw URL mentions. So for a long time, lots of people in the SEO world have thought, "You know what I think Google might be doing? They might be looking at unlinked mentions of URLs." Right? Someone writes out your site, http://www.moz.com/blog/whatever, but it's not linked, right? It's just a mention somewhere. Like maybe Google is using that in some ways. Is that possible? Like it could be a thing?
Well, so from Fresh Web Explorer, we can take a look at all the pages, newly published pages and we can find URL mentions because it's got all the full content in there. And then we looked across correlations and like, "Yeah, you know? Those correlation values are pretty high, like way higher than HTTPS. So maybe something is going on there." I mean, it could be a chicken and egg thing, right? It could be that maybe the URL got popular and people were pasting it all over the place and it got the mentions. And that happened after it was popular, not that's what drove it.
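For anyone curious how you'd even detect an unlinked mention, here's a rough sketch — my own illustration, not Fresh Web Explorer's actual method. Regex-based HTML handling is fragile; a real crawler would use a proper HTML parser, but the idea is simply "URLs in the text minus URLs in href attributes."

```python
import re

html = '''
<p>Check out <a href="https://moz.com/blog">this post</a>.</p>
<p>Also see https://moz.com/tools (not linked).</p>
'''

# URLs that appear inside href attributes, i.e. real links.
linked = set(re.findall(r'href="(https?://[^"]+)"', html))

# Every URL-looking token anywhere in the document.
all_urls = set(re.findall(r'https?://[^\s"<)]+', html))

# Raw mentions: URLs present in the text but never linked.
unlinked_mentions = all_urls - linked
print(unlinked_mentions)
```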
So I've been working with Eric Enge and Mark Traphagen and Cyrus and some folks on this project called IMEC Labs. We do a bunch of experiments to try and validate what's going on in SEO. And because we saw this interesting correlation data and we had this theory, we then went down the path of running an experiment around it. We put a bunch of URL mentions on a bunch of websites, watched their rankings, then changed them to links, and the data is pretty conclusive. We repeated this experiment a number of times. You know what? It turns out URL mentions, no. They did nothing when we put them up multiple times. They did nothing. As soon as we changed them to real links, the thing that we linked to rose in the rankings, suggesting URL mentions probably correlated but non-causal. Okay. So not the most exciting test in the world, but definitely a validation of something.
Links and shares. This is the last thing that I'll walk you through, but I think many of us, myself included, have been not only holding a bad opinion or an opinion that we can now prove wrong, but also amplifying that opinion to many other folks in the field, and I feel pretty bad about that.
So we all know that links can overwhelm other signals. Did you guys catch this experiment? This was like one of my favorite gray hat, black hat...probably gray hat. I think gray hat type of experiment. So this is refugeeks.com, and this is Matt Cutts, www.mattcutts.com. He's on sabbatical. I don't know. I assume building hover boards or something. But "refugeeks.com is world's best SEO website - Matt Cutts." Wow. That is a hell of a sabbatical. That's like what is going on there? But funny story, it turns out that this URL is blocked by robots.txt. What does that mean? It means Google can't crawl and see what's actually in the content there. So if you point a bunch of anchor texts at it saying this, Google is like, "Well, let's just take their word for it. We can't actually see what's going on at that URL. We're not allowed to crawl it."
So this, by the way, shows off the power of exact match anchor text links, shows off the power of domain authority, and shows off some nefarious things that you could do with things that are blocked by robots.txt on powerful sites if you felt like it, but you probably shouldn't. But it's probably not that bad to do, so go for it.
So, okay, there are lots of things about a link. I don't need to walk through them all, right? But there's lots of things that we know about a link, lots and lots of things that we know about a link that can enhance or reduce their ability to move their rankings. And that stuff mattered a ton when we used to do manual link building, right? Like when we were going out and trying to get this link, trying to get that link, trying to get that link, this mattered a lot. We needed to know which elements are moving the needle.
But nowadays, content, like content marketing, that's building a lot of our links for us, right? We make great content and people just seem to find it and link to it. We kind of use social to get our shares, right? This is like modern day SEO.
I'm so sorry. Moz and BuzzSumo joined forces, not BuzzFeed. I should edit it now, but I'm not going to. Moz and BuzzSumo joined forces. Thank you, Steve. I know you're somewhere. Yes, you rock. You should go up and thank Steve afterwards. This study, I thought, was remarkable. So BuzzSumo looks at a million pieces of content along with us. We pull a bunch of data on it and find some very depressing news. Like very depressing. Remember, BuzzSumo is collecting content that has social shares, many of them quite significant social shares. What is the median number of linking root domains of content in this million-sample set of things that have gotten shared on the web? Well, it's one. One. That sucks. That's terrible. Great. You've got a bunch of tweets and one link. That's exciting.
This is actually a distribution of the total shares and the number of articles receiving the shares. And you can see a power law distribution. It's basically like income in the United States, like Trump gets that and then the rest of us. It sucks. It's just shitty. The reality of social amplification and links is that the correlation is not there. That's the correlation between tweets, and all of the other networks, and links, 0.0281.
Wait a minute. Wait a minute. I thought sharing was how we got links. If there's no correlation between getting shared and getting linked, how are we going to get links? What's going on here? So we tried segmenting the samples, right? We tried breaking it out into only the things that had been very heavily shared. So here are posts with over 10,000 total shares — 10,000 shares across Facebook and Twitter and Pinterest and Google+ and LinkedIn, etc. For the most heavily shared content, the correlation is 0.101. That is also really low, like shockingly low. And the median number of linking root domains? Two. At least we doubled it, but 10,000 shares and two links? I think this kind of breaks a paradigm that I've been holding on to for years now, right?
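To see how a "median linking root domains" figure like that gets computed, here's a toy version. The post data below is entirely invented to mirror the shape of the findings — heavy shares, hardly any links — not the study's real numbers.

```python
from statistics import median

# (total_social_shares, linking_root_domains) for ten hypothetical posts.
posts = [
    (12000, 2), (150, 0), (80, 1), (9500, 1), (40, 1),
    (300, 0), (25, 3), (600, 1), (18000, 2), (75, 1),
]

shares = [s for s, _ in posts]
links = [l for _, l in posts]

# Even with a few heavily shared posts in the sample,
# the median link count stays tiny.
print("median links:", median(links))
print("max shares:", max(shares))
```

The median is the right summary here precisely because link counts follow a power-law shape: a handful of outliers would drag a mean upward, while the median reflects the typical post.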
First off, I can't endorse either of these common SEO maxims. If you hear someone say them, be like, "Yeah, guess what? I saw that get flogged on stage until it was dead." "If you create good, unique content, Google will figure out the rest." They will not. No. "The best way to earn links is to create great content." Big fat no. Neither of those is true. I mean, just categorically false. The best way to earn links isn't even to create great content and share it on social networks. That alone won't work either, right? We know we have to do something different.
So I've presented this concept a bunch of times, and I apologize if you've seen me do it or watched me do it. I feel guilty. It was wrong. I shouldn't have. I publish something. I amplify it on social. I grow my social networks so that I can reach more people, right? And then I get links through those shares. I grow my authority. Now I can rank for slightly more competitive phrases, and I earn search traffic. And I go back and I do it all again. And this is wrong. That works for a tiny, tiny sliver of a percent of us who often, unfortunately, happen to be like Moz and Kissmetrics and BuzzSumo, like our little world. And so we think, "Oh, well, it works for me. Must work for everyone else." But the data says it doesn't work for everyone else. That's not how it goes. That part doesn't just happen. That get links, that grow authority, it doesn't just happen.
I think we have to recognize something that I think a lot of SEOs have told me over the years that I have not paid enough attention to, and that is, "You still got to do outreach, Rand." You still got to do outreach. Link building is still a thing. It's a process. It doesn't just happen because your content was great and got lots of shares. Social shares by themselves almost never directly lead to links. Moz is an outlier here. I'm guilty of using our outlier status and saying like, "Oh, this must be how it works for everybody." That's not true.
And second, I think content that performs extremely well on social networks and ranks well may not be ranking exclusively or may be ranking in spite of the fact that it doesn't have links. And some of that ranking probably comes from the engagement. Again, the engagement that we talked about at the beginning that people have been seeing from that.
So you can find all of the ranking factors here at bit.ly/rankingfactors2015, and you can get this slide deck online at bit.ly/rankslides2015. And with that, I'm looking forward to some great Q&A. Thank you.