Tuesday, September 8, 2015

Content, Shares, and Links: Insights from Analyzing 1 Million Articles

Posted by Steve_Rayson

This summer BuzzSumo teamed up with Moz to analyze the shares and links of over 1m articles. We wanted to look at the correlation of shares and links, to understand the content that gets both shares and links, and to identify the formats that get relatively more shares or links.

What we found is that the majority of content published on the internet is simply ignored when it comes to shares and links. The data suggests most content is simply not worthy of sharing or linking, and also that people are very poor at amplifying content. It may sound harsh but it seems most people are wasting their time either producing poor content or failing to amplify it.

On a more positive note we also found some great examples of content that people love to both share and link to. It was not a surprise to find content gets far more shares than links. Shares are much easier to acquire. Everyone can share content easily and it is almost frictionless in some cases. Content has to work much harder to acquire links. Our research uncovered:

  • The sweet spot content that achieves both shares and links
  • The content that achieves higher than average referring domain links
  • The impact of content formats and content length on shares and links

Our summary findings are as follows:

  1. The majority of posts receive few shares and even fewer links. In a randomly selected sample of 100,000 posts, over 50% had 2 or fewer Facebook interactions (shares, likes or comments) and over 75% had zero external links. This suggests there is a lot of very poor content out there and also that people are very poor at amplifying their content.
  2. When we looked at a bigger sample of 750,000 well shared posts we found over 50% of these posts still had zero external links. This suggests that while many posts acquire shares, and in some cases large numbers of shares, they find it far harder to acquire links.
  3. Shares and links are not normally distributed around an average. There are high performing outlier posts that get a lot of shares and links but most content is grouped at the low end, with close to zero shares and links. For example, over 75% of articles from our random sample of 100,000 posts had zero external links and no more than 1 referring domain link.
  4. Across our total sample of 1m posts there was NO overall correlation of shares and links, implying people share and link for different reasons. The correlation of total shares and referring domain links across 750,000 articles was just 0.021.
  5. There are, however, specific content types that do have a strong positive correlation of shares and links. This includes research backed content and opinion forming journalism. We found these content formats achieve both higher shares and significantly more links.
  6. 85% of content published (excluding videos and quizzes) is less than 1,000 words long. However, long form content of over 1,000 words consistently receives more shares and links than shorter form content. Either people ignore the data or it is simply too hard for them to write quality long form content.
  7. Content formats matter. Formats such as entertainment videos and quizzes are far more likely to be shared than linked to. Some quizzes and videos get hundreds of thousands of shares but no links.
  8. List posts and videos achieve much higher shares on average than other content formats. However, in terms of achieving links, list posts and why posts achieve a higher number of referring domain links than other content formats on average. While we may love to hate them, list posts remain a powerful content format.

We have outlined the findings in more detail below. You can download the full 30 page research report from the BuzzSumo site:

Download the full 30-page research report

The majority of posts receive few shares and even fewer links

We pulled an initial sample of 757,000 posts from the BuzzSumo database. 100,000 of these posts were pulled at random and acted as a control group. As we wanted to investigate certain content formats, the other 657,000 were well shared ‘how to’ posts, list posts, quizzes, infographics, why posts and videos. The overall sample therefore had a specific bias to well shared posts and specific content formats. However, despite this bias towards well shared articles, 50% of our 757,000 articles still had 11 or fewer Twitter shares and 50% of the posts had zero external links.

By comparison 50% of the 100,000 randomly selected posts had 2 or fewer Twitter shares, 2 or fewer Facebook interactions, 1 or fewer Google+ shares and zero LinkedIn shares. 75% of the posts had zero external links and no more than 1 referring domain link.

75% of randomly selected articles had zero external links
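
Figures such as "50% had 2 or fewer" and "75% had zero" are simply the proportion of posts at or below a threshold. A minimal sketch of that calculation in Python follows; the link counts are invented for illustration and are not taken from the study, and the function name is hypothetical.

```python
def share_at_or_below(values, threshold):
    """Return the fraction of posts whose count is at or below the threshold."""
    if not values:
        raise ValueError("need at least one value")
    return sum(1 for v in values if v <= threshold) / len(values)

# Hypothetical external-link counts for ten posts (illustrative only).
external_links = [0, 0, 0, 0, 0, 0, 0, 0, 3, 41]

zero_link_share = share_at_or_below(external_links, 0)
one_or_fewer_share = share_at_or_below(external_links, 1)

print(f"Posts with zero external links: {zero_link_share:.0%}")
print(f"Posts with 1 or fewer external links: {one_or_fewer_share:.0%}")
```

Reporting proportions like this, rather than an average, keeps a handful of heavily linked posts from hiding how many posts earn nothing at all.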

Shares and links are not normally distributed

Shares and links are not distributed normally around an average. Some posts go viral and get a very high number of shares and links, which distorts the average. The vast majority of posts receive very few shares or links and sit at the bottom of a very skewed distribution curve, as shown below.

This chart is cut off on the right at 1,000 shares. In fact, the long thin tail would extend a very long way, as a number of articles received over 1m shares and one received 5.7m shares.

This long tail distribution is the same for shares and links across all the domains we analyzed. The skewed nature of the distribution means that averages can be misleading due to the long tail of highly shared or linked content. In the example below we show the distribution of shares for a domain. In this example the average is the blue line but 50% of all posts lie to the left of the red line, the median.
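
To make the gap between average and median concrete, here is a small Python sketch. The share counts are hypothetical and chosen only to mimic a long tail with one viral post; they are not figures from our sample.

```python
from statistics import mean, median

# Hypothetical share counts for ten posts on one domain (illustrative only).
# Nine posts get a handful of shares and one goes viral.
shares = [0, 1, 2, 2, 3, 5, 8, 12, 40, 250_000]

average_shares = mean(shares)
median_shares = median(shares)

# Fraction of posts that sit below the average.
below_average = sum(1 for s in shares if s < average_shares) / len(shares)

print(f"average = {average_shares:,.1f}")
print(f"median  = {median_shares}")
print(f"posts below the average: {below_average:.0%}")
```

In a sample like this a single outlier drags the average far above what a typical post achieves, which is why the median is the safer summary for shares and links.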

There is NO correlation of shares and links

We used the Pearson correlation coefficient, a measure of the linear correlation between two variables. The result can range from 1 (a total positive correlation) through 0 (no correlation) to −1 (a total negative correlation).

The overall correlations for our sample were:

  • Total shares and referring domain links 0.021
  • Total shares and sub-domain links 0.020
  • Total shares and external links 0.011

The results suggest that people share and link to content for different reasons.
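
For readers who want to see how a Pearson coefficient is computed, the sketch below implements the standard formula in plain Python. It is not the tooling we used for the study, and the share and link counts are invented to show the three reference cases (1, −1 and 0).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-article figures (illustrative only).
total_shares = [10, 20, 30, 40, 50]
links_rising = [1, 2, 3, 4, 5]      # links rise in step with shares
links_falling = [5, 4, 3, 2, 1]     # links fall as shares rise
links_unrelated = [2, 1, 3, 1, 2]   # no linear relationship with shares

print(round(pearson(total_shares, links_rising), 3))
print(round(pearson(total_shares, links_falling), 3))
print(round(pearson(total_shares, links_unrelated), 3))
```

Note that the coefficient only captures linear association, so a value near zero says that shares do not predict links in a straight-line sense, not that no individual post earns both.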

We also looked at different social networks to see if there were more positive correlations for specific networks. We found no strong positive correlation of shares to referring domain links across the different networks as shown below.

  • Facebook total interactions 0.0221
  • Twitter 0.0281
  • LinkedIn 0.0216
  • Pinterest 0.0065
  • Google+ 0.0058

Whilst there is no correlation by social network there is some evidence that very highly shared posts have a higher correlation of shares and links. This can be seen below.

Content sample                                Average total shares  Median shares  Average referring domain links  Median referring domain links  Correlation (total shares to referring domains)
Full sample of posts (757,317)                4,393                 202            3.77                            1                              0.021
Posts with over 10,000 total shares (69,114)  35,080                18,098         7.06                            2                              0.101

The increase in correlation is relatively small. However, it does indicate that very popular sites, other things being equal, would have slightly higher correlations of shares and links.

Our finding that there is no overall correlation contradicts previous studies that have suggested there is a positive correlation of shares and links. We believe the previous findings may have been due to inadequate sampling as we will discuss below.

The content sweet spot: content with a positive correlation of shares and links

Our research found there are specific content types that have a high correlation of shares and links. This content attracts both shares and links, and as shares increase so do referring domain links. Thus whilst content is generally shared and linked to for different reasons, there appears to be an overlap where some content meets the criteria for both sharing and linking.


The content that falls into this overlap area, our sweet spot, includes content from popular domains such as major publishers. In our sample the content also included authoritative, research backed content, opinion forming journalism and major news sites.

In our sample of 757,000 well shared posts the following were examples of domains that had a high correlation of shares and links.

Site                      Number of articles in sample  Correlation (referring domain links to total shares)
The Breast Cancer Site    17                            0.90
New York Review of Books  11                            0.95
Pew Research              25                            0.86
The Economist             129                           0.73

We were very cautious about drawing conclusions from this data as the individual sample sizes were very small. We therefore undertook a second, separate sampling exercise for domains with high correlations. This analysis is outlined in the next section below.

Our belief is that previous studies may have sampled content disproportionately from popular sites within the area of overlap. This would explain a positive correlation of shares and links. However, the data shows that the domains in the area of overlap are actually outliers when it comes to shares and links.

Sweet-spot content: opinion-forming journalism and research-backed content

In order to explore further the nature of content on sites with high correlations we looked at a further 250,000 random articles from those domains.

For example, we looked at 49,952 articles from the New York Times and 46,128 from the Guardian. These larger samples had a lower correlation of links and shares, as we would expect due to the samples having a lower level of shares overall. The figures were as follows:

Domain               Number of articles in sample  Average total shares  Average referring domain links  Correlation of total shares to domain links
Nytimes.com          49,952                        918                   3.26                            0.381
Theguardian.com      46,128                        797                   10.18                           0.287

We then subsetted various content types to see if particular types of content had higher correlations. During this analysis we found that opinion content from these sites, such as editorials and columnists, had significantly higher average shares and links, and a higher correlation. For example:

Opinion content      Number of articles in sample  Average total shares  Average referring domain links  Correlation of total shares to domain links
Nytimes.com          4,143                         3,990                 9.2                             0.498
Theguardian.com      19,606                        1,777                 12.54                           0.433

The higher shares and links may be because opinion content tends to be focused on current trending areas of interest and because the authors take a particular slant or viewpoint that can be controversial and engaging.

We decided to look in more detail at opinion forming journalism. For example, we looked at over 20,000 articles from The Atlantic and New Republic. In both cases we saw a high correlation of shares and links combined with a high number of referring domain links as shown below.

Domain               Number of articles in sample  Average total shares  Average referring domain links  Correlation of total shares to domain links
TheAtlantic.com      16,734                        2,786                 18.82                           0.586
NewRepublic.com      6,244                         997                   12.8                            0.529

This data appears to support the hypothesis that authoritative, opinion shaping journalism sits within the content sweet spot. It particularly attracts more referring domain links.

The other content type that had a high correlation of shares and links in our original sample was research backed content. We therefore sampled more data from sites that publish a lot of well researched and evidenced content. We found content on these sites had a significantly higher number of referring domain links. The content also had a higher correlation of links and shares as shown below.

Domain               Number of articles in sample  Average total shares  Average referring domain links  Correlation of total shares to domain links
FiveThirtyEight.com  1,977                         1,783                 18.5                            0.55
Priceonomics.com     541                           1,797                 11.49                           0.629
PewResearch.com      892                           751                   25.7                            0.4

Thus whilst overall there is no correlation of shares and links, there are specific types of content that do have a high correlation of shares and links. This content appears to sit close to the center of the overlap of shares and links, our content sweet spot.

The higher correlation appears to be caused by the content achieving a higher level of referring domain links. Shares are generally much easier to achieve than referring domain links. You have to work much harder to get such links and it appears that research backed content and authoritative opinion shaping journalism is better at achieving referring domain links.

Want shares and links? Create deep research or opinion-forming content

Our conclusion is that if you want to create content that achieves a high level of both shares and links then you should concentrate on opinion forming, authoritative content on current topics or well researched and evidenced content. This post falls very clearly into the latter category, so we shall see if this proves to be the case here.

The impact of content format on shares and links

We specifically looked at the issue of content formats. Previous research had suggested that some content formats may have a higher correlation of shares and links. Below are the details of shares and links by content format from our sample of 757,317 posts.

Content type  Number in sample  Average total shares  Average referring domain links  Correlation of total shares and referring domain links
List post     99,935            10,734                6.19                            0.092
Quiz          69,757            1,374                 1.6                             0.048
Why post      99,876            1,443                 5.66                            0.125
How to post   99,937            1,782                 4.41                            0.025
Infographic   98,912            268                   3.67                            0.017
Video         99,520            8,572                 4.13                            0.091

What stands out is the high level of shares for list posts and videos.

By contrast the average level of shares for infographics is very low. Whilst the top infographics did well (there were 343 infographics with more than 10,000 shares) the majority of infographics in our sample performed poorly. Over 50% of infographics (53,000 in our sample) had zero external links and 25% had fewer than 10 shares in total across all networks. This may reflect a recent trend to turn everything into an infographic, leading to many poor pieces of content.

What also stands out is the relatively low number of referring domain links for quizzes. People may love to share quizzes but they are less likely to link to them.

In terms of the correlation of total shares and referring domain links, why posts had the highest correlation of all content types at 0.125. List posts and videos also have a higher correlation than the overall sample correlation, which was 0.021.

List posts appear to perform consistently well as a content format in terms of both shares and links.

Some content types are more likely to be shared than linked to

Surprising, unexpected and entertaining images, quizzes and videos have the potential to go viral with high shares. However, this form of content is far less likely to achieve links.

Entertaining content such as Vine videos and quizzes often had zero links despite very high levels of shares. Here are some examples.

Content          URL                                                    Total shares  External links  Referring domain links
Vine video       https://vine.co/v/O0VvMWL5F2d                          347,823       0               0
Vine video       https://vine.co/v/O071IWJYEUi                          253,041       0               1
Disney Dog Quiz  http://blogs.disney.com/oh-my-disney/2014/06/30/qu...  259,000       0               1
Brainfall Quiz   http://www.brainfall.com/quizzes/how-bitchy-are-yo...  282,058       0               0


Long form content consistently receives more shares and links than shorter-form content

We removed videos and quizzes from our initial sample to analyze the impact of content length. This gave us a sample of 489,128 text based articles which broke down by content length as follows:

Length (words)  Number in sample  Percent
<1,000          418,167           85.5
1-2,000         58,642            12
2-3,000         8,172             1.7
3,000-10,000    3,909             0.8

Over 85% of articles had less than 1,000 words.

We looked at the impact of content length on total shares and domain links.

Length (words)  Average total shares  Average referring domain links
<1,000          2,823                 3.47
1-2,000         3,456                 6.92
2-3,000         4,254                 8.81
3,000-10,000    5,883                 11.07

We can see that long form content consistently gets higher average shares and significantly higher average links. This supports our previous research findings, although there are exceptions, particularly with regard to shares. One such exception we identified is IFL Science, which publishes short form content shared by its 21m Facebook fans. The site curates images and videos to explain scientific research and findings. This article examines how they create their short form viral content. However, IFL Science is very much an exception. On average long form content performs better, particularly when it comes to links.
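
The length breakdown above amounts to assigning each article to a word-count band and averaging within each band. A rough Python sketch is shown below. The articles are invented, the function names are hypothetical, and the way boundary values are handled (an article of exactly 1,000 words falls in the 1-2,000 band) is an assumption, since the tables do not say.

```python
from statistics import mean

# Word-count bands matching the tables above. Lower bound inclusive,
# upper bound exclusive: this boundary handling is an assumption.
BANDS = [
    (0, 1_000, "<1,000"),
    (1_000, 2_000, "1-2,000"),
    (2_000, 3_000, "2-3,000"),
    (3_000, 10_001, "3,000-10,000"),
]

def band_for(word_count):
    """Return the band label for a word count, or None if out of range."""
    for low, high, label in BANDS:
        if low <= word_count < high:
            return label
    return None

def averages_by_length(rows):
    """Map each band to (average total shares, average referring domain links)."""
    grouped = {}
    for words, shares, links in rows:
        label = band_for(words)
        if label is None:
            continue  # skip articles outside the bands
        grouped.setdefault(label, []).append((shares, links))
    return {
        label: (mean([s for s, _ in pairs]), mean([l for _, l in pairs]))
        for label, pairs in grouped.items()
    }

# Hypothetical articles: (word_count, total_shares, referring_domain_links).
articles = [
    (450, 120, 1), (800, 60, 0),
    (1_500, 400, 5), (1_200, 200, 3),
    (2_600, 900, 8), (5_200, 1_500, 12),
]

for label, (avg_shares, avg_links) in averages_by_length(articles).items():
    print(f"{label}: avg shares {avg_shares}, avg links {avg_links}")
```

Because shares are so skewed, it would be reasonable to report a median per band alongside the average; the tables here report averages only.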

When we looked at the impact of content length on the correlation of shares and links, we found that content of over 1,000 words had a higher correlation but the correlation did not increase further beyond 2,000 words.

Length (words)  Correlation of shares and links
<1,000          0.024
1-2,000         0.113
2-3,000         0.094
3,000+          0.072

The impact of combined factors

We have not undertaken any detailed linear regression modelling or built any predictive models but it does appear that a combination of factors can increase shares, links and the correlation. For example, when we subsetted List posts to look at those over 1,000 words in length, the average number of referring domain links increased from 6.19 to 9.53. Similarly in our original sample there were 1,332 articles from the New York Times. The average number of referring domain links for the sample was 7.2. When we subsetted out just the posts over 1,000 words the average number of referring domain links increased to 15.82. When we subsetted out just the List posts the average number of referring domain links increased further to 18.5.

The combined impact of factors such as overall site popularity, content format, content type and content length is an area for further investigation. However, the initial findings do indicate that shares and/or links can be increased when some of these factors are combined.


Steve will be discussing the findings at a Mozinar on September 22, at 10:30am Pacific Time. You can register and save your place here: https://attendee.gotowebinar.com/register/41189941...

