Link Building Has Changed »

Posted by randfish

When I first started in SEO, link acquisition was almost always a manual process. I’d search the engines for links that pointed to the competition, find relevant directories and link lists, email relevant sites and beg, borrow or bribe (aka buy advertising) to get a link. I tried reciprocal link building (and did some pretty dumb stuff). Then, as I got more intertwined in the SEO community, I found vendors who built large networks of sites, spammed blogs/forums/guestbooks and ran text link sales operations. I leveraged these services to help clients rank better, almost always with great success. Then I met Matt Cutts, found out more about Google’s webspam team, saw penalties and their impact (remember Florida?) and even found some sites we worked on in the Sandbox.

Over time, I got smarter. I read papers about Hilltop, Trustrank, Anti-Trustrank and many more. I saw sites escaping the sandbox once they’d earned greater quantities of trusted links. I started understanding that Google’s search quality team was only going to get better at recognizing and counting legitimate links (and tossing out the junk), so I focused exclusively on more "white hat" kinds of links. That’s when I discovered linkbaiting and the power of Digg, Reddit & StumbleUpon to drive traffic that would naturally link. We had success with quizzes (and after Matt left SEOmoz, he had a little too much success) and viral content that earned thousands of links overnight, and we started offering it as a service.

As our clientele and foci changed, we changed again. Linkbait gave way to broader viral marketing efforts. Social media marketing arose as a practical and high quality way to earn links. Our clients became larger brands and organizations and one-off link projects weren’t scalable, so we consulted on tactics like content and technology licensing, training editorial staff to earn links & participate in the social media world themselves, and incentivizing user-generated content, which in turn brought links from those users. We found ways to drive natural links to deep pages on huge sites targeting the long tail, how to combine embeddable content and user-adopted brand affinity to drive link growth. And we stopped buying links entirely.

I figured a visual history might make for a compelling view:

A History of Link Building Tactics

Now, link building is changing again. I’m of the distinct impression that the engines (nowadays referring to Bing & Google, since the others are all but out of the picture) are evolving to keep up with the web’s breakneck speed, and that new forms of data, along with new ways of analyzing links, are making themselves felt in the SERPs. My guesses/observations would include:

  • Twitter really is cannibalizing the web’s link graph, or at least the blogosphere’s, and Google seems to be using Tweet counts in some way (though possibly only in the QDF algo).
  • The acceleration rate of link acquisition and the freshness of new links is having a more dramatic impact than before, and the "old crusty links" paradigm may be fading a bit.
  • Brand mentions and keyword associations with brand names are influencing the rankings more and more.
  • Untrustworthy link patterns are triggering more filters and penalties than ever before.
  • QDD is as strong as ever, and vertical results are more prominent than at any time in the engines’ histories.
  • Google and Microsoft both know more about traffic and surfing habits than ever before, and this data is likely being used, at the least, as quality control for potential algorithmic misses.
  • Ad blindness is worse than ever (16% of Internet users are responsible for 85% of all ad clicks on the web), forcing the engines to make ads more relevant and more obvious to continue earning revenue.
  • Paid inclusion is going away, and talk of potentially paying sites to be in the indices (the reverse model) is in the air (or maybe not).
  • Billions of non-linked "references" flow out across the web through social media messages, emails, tweets and IMs. Someone, at some search engine, is undoubtedly mining this data to see how they can derive value and relevancy from it.

As marketers, we have to evolve or be left behind by those who can better adapt. It’s hard to see the forest for the trees right now, but I think we’re closing in on a time when real-time, social and traditional web references are all a part of the rankings equation. The future may be less about links and more about brand building and brand participation. I don’t want to be the most-linked-to site in my niche; I want to be the site that’s synonymous with my niche.

Now we just have to figure out the tactics…


Charting ‘Unique Keyphrases’ Using Advanced Segments »

Posted by RobOusbey

A useful indicator of SEO success is the number of unique keyphrases that send traffic to a website. An increase in this number is a reflection of increased trust in the site by search-engines.

Google Analytics can show you the total number of unique organic keyphrases at a glance, on the Traffic Sources ⇒ Keywords page. (Make sure you select ‘non-paid’ to exclude any CPC campaigns.)

This post will show you how to break that down to a more useful level of granularity and help you to create a table such as the following:

We’ll aim to categorise traffic into three buckets: ‘branded’, ‘head terms’ and ‘mid-long tail terms’. (In reality, we’ll actually calculate the first two, and the third one will be ‘everything that is left’.)

As we often can’t export enough keywords from Google Analytics to do the analysis offline, we will have to use ‘Advanced Segments’ to do this. This means that we can only group together ‘branded terms’ and ‘head terms’ in ways that we can explain through AND and OR statements.

The process for doing this goes like this:

  1. Plan to create advanced segments that define each group of keywords you want to track
  2. Define rules using ‘AND’ & ‘OR’ statements that describe which keywords should be in each group
  3. Apply these groups each month, one at a time, to the previous month’s data, in order to reveal the number of unique keywords.

Since this ‘rule defining’ will take place in Google Analytics’ Advanced Segments feature, we’ll be using ‘regular expressions’ - a clever but pretty technical method of defining which items in a set should be included in a particular subset. (More details about them at this site.)

The next sections may have particular appeal to the more ‘techie’ readers (or just those people feeling brave) - so do feel free to just skip down to the end to see screen-shots of these segments applied to the keywords report, if the nitty-gritty isn’t your cup of tea.

Creating the ‘Branded Terms’ Segment

If you’ve not really implemented Advanced Segments before, I suggest starting with Google Analytics’ help pages on the topic, but also having a play with the feature, to see how it works. (Really, do have a play. I’m going to assume you at least have understood what most of the main buttons do, and that’s a great way to find out.)

Planning the Segment

Let’s use a fictional company, TechNet, who make a product called the Vox9000. Their segment for ‘branded terms’ will include anything that mentions these terms.

Define the Rules, Create the Segment

To create the segment for branded terms, begin by clicking ‘Advanced Segments’ ⇒ ‘Create new custom segment’.

In the first ‘dimension or metric’ space, add a ‘Medium’ block (found under ‘Dimensions’) and set Condition to ‘Matches exactly’ and Value to ‘organic’. Then hit ‘and‘ to add another section. Place a ‘Keywords’ block here, with Condition as ‘Matches regular expression’ and a value that is all your branded terms, separated by the pipe character: |

(NB: the pipe acts as an ‘OR’ in these regular expressions.)

As an example, TechNet (which people often search for with a space, as ‘Tech Net’), which makes a product called ‘Vox9000’ (sometimes searched for as ‘Vox 9000’), would use the following string here: technet|tech net|vox9000|vox 9000
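To get a feel for how the pipe-separated expression behaves before saving the segment, here is a minimal Python sketch. It assumes that ‘Matches regular expression’ in Google Analytics behaves like a case-insensitive partial match (approximated here with `re.search` and `re.IGNORECASE`); the sample keywords are invented for illustration and are not real data.

```python
import re

# Branded-terms expression from the TechNet example; the pipe acts as OR.
BRANDED = re.compile(r"technet|tech net|vox9000|vox 9000", re.IGNORECASE)

def is_branded(keyword):
    """Return True if any branded term appears anywhere in the keyword."""
    # re.search finds a partial match (assumed to mirror GA's behaviour).
    return BRANDED.search(keyword) is not None

# Hypothetical sample keywords, made up for this sketch.
samples = ["technet laptops", "buy vox 9000 online", "laptops in philadelphia"]
for kw in samples:
    print(kw, "->", is_branded(kw))
```

If the segment in Google Analytics catches different keywords than this sketch does, the matching assumption above is the first thing to check.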

Give the segment a name, and save it.

Creating the ‘Head Terms’ Segment

Planning the Segment

The next segment - the head terms - is a bit more complicated, and you’ll see why it’s important for us to specify rules that will define the head keyphrases.

Let’s imagine that TechNet sells laptops and notebooks in Philadelphia and Baltimore. (Therefore head terms will be those such as ‘notebooks’ or ‘laptops in philadelphia’)

In this example, the rules to define head terms might be:

  • the phrase can’t mention any branded terms
  • it must mention one of their product groups (laptop, notebook)
  • it can have at most two words of 3+ characters (this allows for some short linking words, such as a, in, at, etcetera)
  • it can have a maximum of four words in total.

Define the Rules, Create the Segment

The last two rules can be the trickiest to implement, so we’ll look at these first. Two insights help us solve these requirements:

Insight 1: Combining the two rules, and using S and L to indicate short words (1 or 2 characters) and long words (3+ characters) we see that the only twenty possible structures for keyphrases are: L, LS, SL, LL, LSS, SLS, SSL, LLS, LSL, SLL, LSSS, SLSS, SSLS, SSSL, LLSS, LSLS, LSSL, SLLS, SLSL, SSLL
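The list of twenty structures can be double-checked mechanically. The sketch below is one way to enumerate them in Python; it assumes that every phrase needs at least one long word, since the list above contains no all-short structures such as S or SS. The function and constant names are made up for this example.

```python
from itertools import product

MAX_WORDS = 4   # rule: at most four words in total
MAX_LONG = 2    # rule: at most two words of 3+ characters

def keyphrase_structures():
    """List every S/L pattern allowed by the two rules."""
    structures = []
    for n_words in range(1, MAX_WORDS + 1):
        for combo in product("SL", repeat=n_words):
            pattern = "".join(combo)
            # Assumption: at least one long word is required.
            if 1 <= pattern.count("L") <= MAX_LONG:
                structures.append(pattern)
    return structures

structures = keyphrase_structures()
print(len(structures))
print(structures)
```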

Insight 2: The regular expression: \b[^ ]{3,50}\b matches a word of between 3 & 50 characters. (Inside the square brackets, [^ ] means ‘any character except a space’, and \b marks a word boundary.) It’s also necessary to know that, outside of brackets, ^ matches at the beginning of an expression, and $ matches at the end. (Seriously, they do. Start by going through the examples at this site if you want to know why that’s the case.)

We’re now in a position to take the list of combinations from ‘Insight 1’ and replace ‘S’ with \b[^ ]{1,2}\b (matching words with 1 or 2 characters) and ‘L’ with \b[^ ]{3,50}\b, putting spaces in-between, wrapping in parentheses, and matching at beginning and end. Missed that? OK, here are examples of some of the resulting statements:

L becomes ^(\b[^ ]{3,50}\b)$
SL becomes ^(\b[^ ]{1,2}\b \b[^ ]{3,50}\b)$
LSL becomes ^(\b[^ ]{3,50}\b \b[^ ]{1,2}\b \b[^ ]{3,50}\b)$
etc.

You should join the twenty created expressions together using a pipe character, to create the resulting, massive, expression. To save space, I won’t post the whole expression here.

NB: There seems to be a limit to the number of parts to an expression that you can put into Google Analytics, so I tend to break this up into two parts - say, those matching on three or fewer words, and those matching four - and put them as ‘OR’ alternatives in one section. I’ve done that below to demonstrate.
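Rather than typing out twenty expressions by hand, the substitution and joining steps can be scripted. The following Python sketch builds each anchored expression from its S/L structure and then splits the result into the two parts described above. The helper names are hypothetical, and the ‘at least one long word’ condition is an assumption drawn from the list of twenty structures.

```python
from itertools import product

SHORT = r"\b[^ ]{1,2}\b"   # 'S': a word of 1 or 2 characters
LONG = r"\b[^ ]{3,50}\b"   # 'L': a word of 3 to 50 characters

def structure_to_regex(structure):
    """Turn a pattern such as 'SL' into an anchored expression."""
    words = [LONG if letter == "L" else SHORT for letter in structure]
    return "^(" + " ".join(words) + ")$"

def allowed_structures():
    """All S/L patterns of 1-4 words containing one or two long words."""
    found = []
    for n_words in range(1, 5):
        for combo in product("SL", repeat=n_words):
            if 1 <= combo.count("L") <= 2:
                found.append("".join(combo))
    return found

structures = allowed_structures()

# Split into two parts, as suggested, to stay under GA's apparent limit.
three_or_fewer = "|".join(
    structure_to_regex(s) for s in structures if len(s) <= 3
)
four_words = "|".join(
    structure_to_regex(s) for s in structures if len(s) == 4
)

print(structure_to_regex("SL"))
print(len(three_or_fewer), len(four_words))
```

Each of the two resulting strings can then be pasted into its own ‘Matches regular expression’ condition.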

The resultant segment rules for ‘Head Keyphrases’ look like this:

The image shown above reads:

  • Dimension: Medium, Condition: Matches exactly, Value: organic
  • AND
    • Dimension: Keyword, Condition: Does not match regular expression, Value: technet|tech net|vox9000|vox 9000
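  • AND
    • Dimension: Keyword, Condition: Matches regular expression, Value: [the joined structure expressions for three or fewer words]
    • OR
    • Dimension: Keyword, Condition: Matches regular expression, Value: [the joined structure expressions for four words]
  • AND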
    • Dimension: Keyword, Condition: Matches regular expression, Value: laptop|notebook
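As a sanity check outside Google Analytics, the combined rules can be approximated in Python. This is only a sketch: it assumes partial, case-sensitive regex matching on lower-case keywords, the `bucket` function and its labels are invented for this example, and the test keywords are made up.

```python
import re
from itertools import product

BRANDED = re.compile(r"technet|tech net|vox9000|vox 9000")
PRODUCTS = re.compile(r"laptop|notebook")
SHORT, LONG = r"\b[^ ]{1,2}\b", r"\b[^ ]{3,50}\b"

def structure_regex():
    """Join the anchored expressions for all allowed word structures."""
    parts = []
    for n_words in range(1, 5):
        for combo in product("SL", repeat=n_words):
            # Assumption: one or two long words, as in the list of twenty.
            if 1 <= combo.count("L") <= 2:
                words = " ".join(LONG if c == "L" else SHORT for c in combo)
                parts.append("^(" + words + ")$")
    return re.compile("|".join(parts))

STRUCTURE = structure_regex()

def bucket(keyword):
    """Classify a keyword as 'branded', 'head' or 'mid-long tail'."""
    if BRANDED.search(keyword):
        return "branded"
    if PRODUCTS.search(keyword) and STRUCTURE.search(keyword):
        return "head"
    return "mid-long tail"

# Hypothetical keywords for illustration only.
for kw in ["notebooks", "laptops in philadelphia",
           "technet laptops", "best cheap laptops"]:
    print(kw, "->", bucket(kw))
```

Anything that is neither branded nor head falls into the third bucket, mirroring the ‘everything that is left’ approach described earlier.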
Collecting the numbers

    With our two Advanced Segments defined, we can head back to the ‘keywords’ page and set the date range to the last month.

    We can apply each custom segment in turn, in order to collect the following numbers for September:

    • Total keyphrases: 64,278
    • Branded keyphrases: 393
    • Head keyphrases: 2,835
    • Other keyphrases: 61,050 (calculated from the previous three numbers)

    You can now put these numbers in a spreadsheet in order to chart the change in number of unique keyphrases as months go by.
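The ‘other’ figure is simply the total minus the two segments, which works because the head segment excludes branded terms and so the two cannot overlap. A small Python sketch of that arithmetic, using the September numbers above, might look like this; the column names and the CSV layout are only illustrative.

```python
import csv
import sys

def other_keyphrases(total, branded, head):
    """'Other' is whatever remains after removing branded and head terms."""
    return total - branded - head

# September figures quoted in the post.
total, branded, head = 64278, 393, 2835
other = other_keyphrases(total, branded, head)

# One row per month, ready to paste into a spreadsheet.
writer = csv.writer(sys.stdout)
writer.writerow(["month", "total", "branded", "head", "other"])
writer.writerow(["September", total, branded, head, other])
```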

    You can use these basic techniques to create and report on even better-defined segments of keyphrases (for example: you could group keyphrases by competitiveness, department, intent, etc.). If there are particular steps here that require more explanation, or you’re looking for more ideas about how to apply this to your SEO reporting structure, drop a comment below.


    Whiteboard Friday - How SEOs Know SEO »

    Posted by great scott!

    So what’s the trick?  How do these folks who run around calling themselves SEOs actually know SEO?  Do they just make it up? Is there a class you take somewhere?  This week Rand looks at exactly this question: where do these guys (and gals) learn the stuff they know and how do they stay on top of the ever-changing search landscape to make sure they’re putting forth best practices for their clients and projects?

    Watch this week’s Whiteboard Friday to learn where you should focus your efforts if you want to learn SEO. You’ll find it’s not as complicated as you may think. In fact, it’s pretty simple, but not necessarily easy, especially when you start talking about IR and patent analysis, conducting research, collecting and analyzing correlation data, building ranking models, and other fancy strategies. But, as SEO extraordinaire and all-around awesome dude, Dave Snyder, adroitly demonstrated in his recent post about how he got started in internet marketing, hard work, talent, and a little luck are the backbone of success in this industry.

    SEOmoz Whiteboard Friday - How SEOs Know SEO from Scott Willoughby on Vimeo.

    p.s. Here’s the original post on Ben’s Ranking Models from the SEOmoz/Distilled London Training Seminar


    SEOmoz 2009 Search Spam PubCon Party »

    Posted by jennita

    This post really doesn’t need much of an introduction, so I’ll get right down to it. Pubcon is coming! Pubcon is cooommiiinnnnggggg! It seems like the whole industry might just shut down for a week while we take over Las Vegas (I hope they’re ready for us). This would probably be a great time for spammers to come in and take over our SERPs since we’ll be busy in sessions, going to parties, meeting new people… and gambling (DUDE! It’s Vegas).

    SEOmoz will be representing in full force this year. Although we don’t have a booth, you’ll find us lurking in all corners of the event. Here’s a quick lowdown on who will be attending from the moz crew:

    Rand's yellow shoes
    Photo Courtesy of Dana Lookadoo
    • Danny - Say "Danny Dover" ten times fast. What?! It’s funny. Really. (ok I’m tired)
    • Scott - He’s coming out from behind the camera!
    • Adam - Ping him if you’re interested in user testing some of our new products!
    • Jen - Holla!
    • Arden - You’ll recognize him by being the friendly one (unlike the rest of us meanies)
    • Rand - You know, that guy who always wears those funny yellow shoes
    • Gillian - She arrives just in time from her worldwide SEO tour
    • Pete - As in Dr. Pete, apparently people only know him by that name. :)

    Speaking of Dr. Pete, don’t forget to check out his post 7 Tips for Surviving PubCon to help you make it through the week.

    Party Party Party!

    I know I know, quit blabbing and get to the good stuff. SEOmoz will be hosting the 3rd Annual Search Spam / Werewolf party on Tuesday night. Tickets are unfortunately limited to 200 people and are for SEOmoz Pro members plus guests, so be sure to RSVP right away before they’re gone!

    Here are the details:

    Date: Tuesday, November 10, 2009
    Time: 7-9pm
    Location: Wynn Hotel - Chambertin Room
    Drinks: 1 free drink ticket per person, cash bar after that

    RSVP for SEOmoz party

    This is a great way to meet other people from the community in a fun, laid back environment. There’s nothing better than meeting Matt Cutts for the first time while sitting at a table, cursing him, during a vocal game of Werewolf.

    Remember that if you attend you get your own deck of Werewolf cards with 25 well known Search Marketing peeps. Oh! And check this out, this year we have an ALL NEW deck of Search Spam cards. That’s right people, those old cards are now collectors items and you can probably sell them on eBay for millions of dollars. Heh… ok probably not, but if you do I’d like a percentage of the profit. :D

    New Search Spam Cards for Pubcon
    Cindy Krum, Todd Friesen and Chris Winfield are on the deck this year. There’s also a mystery coupon!

    Who knows, you could be the new Gracious Granter of Re-Inclusion or one of the dubious Black Hats. Perhaps you’re more on the white side of things? Hmmmm could you be in the deck? The only way to find out is to actually come to the party and get your own deck! If you’re not in the deck, you could always have fun with it and try to get people’s signatures on their cards. I actually did that last year and found it to be a good way to find a reason to talk to the "celebrities" :) (ya do what you gotta do).

    Werewolf Game

    So what IS this Werewolf game I’m talking about? Well you can find the description & rules here, plus I found this great quote from Ian Kennedy about the game back in 2007:

    Werewolf (also known as Mafia) is a great parlor game in which players try and figure out the good guys from the bad guys relying on your ability to read the body language of other players to determine who is telling the truth and who is lying while keeping your role and identity hidden from others. Because the game inspires psychological tactics and gaming, it’s the perfect way for a room full of SEO experts and search engine engineers to unwind after a full day of conference sessions here at Webmaster World in Las Vegas.

    - Ian Kennedy (everwas.com)

    Last year I played the game for the first time. It took me a while to warm up to playing, but once I did I had a great time! I met a bunch of new people, and who knows maybe it even helped me to get this job! (I played Matt Cutts QUITE well I should add). I can say from experience that I was glad I didn’t miss this party, and I can’t wait to play again this year. Be sure to sign up soon as the space is limited! We don’t want you to miss out and not get to see who else is in the deck. It could be YOU! (Yep, I’m in the deck and it says "Jenny from the C-Block" heh)

    The moz party is happening before the PubCon Palazzo Lavo Nightclub Party, so be sure to RSVP for that as well. Don’t forget to check out the PubCon blog to get information on all the PubCon parties going on.

    Please remember to say hello if you see any of us! But whatever you do, RSVP for the Search Spam party ASAP.


    What Makes a Link Worthy Post - Part 2 »

    Posted by chenry

    This post was originally in YOUmoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

    What really makes a blog post worth linking to?  In my last post, What Makes a Link Worthy Post - Part 1, I took a look at the 3,800 blog posts on SEOmoz and did some analysis on a few different aspects of the posts and their effect on the number of in-linking domains (ILDs).  Some of the results were very interesting to me and made me want to push it further. 

    I created a list of 40 SEO/SEM blogs that I read and feel are important to people in the industry and set those as my sample population.  I first crawled each website and collected a list of over 72,330 different blog posts from the 40 different websites.  Then over the course of the next few days, I crawled each post and collected the following information in my database:

    • Blog Post Title
    • Original URL
    • # of Links from Root Domains (Via Linkscape API)
    • # of ILDs (Via Linkscape API)
    • If The Post Had Images, Lists, Or Videos
    • Content of Post (No Comments or Other Text on Site)
    • # of Words in Post

    POSTS TITLE EFFECT ON ILDs

    Does the length of the post’s title affect how many domains will link to it?  The data suggests that posts with a title length between 10 and 18 words are on average more linked to than those with less or more.  The data also suggests there may be a “sweet” spot around 14 to 16 words in length.  The chart below was created without removing stop words. 
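For readers who want to try a similar comparison on their own data, here is a minimal Python sketch of grouping posts by title word count and averaging their ILDs. It is not the code used for this study; the function name is hypothetical and the ILD figures in the sample rows are invented purely to show the mechanics.

```python
from collections import defaultdict

def average_ilds_by_title_length(posts):
    """Map title word count -> mean ILDs for posts with that count."""
    totals = defaultdict(lambda: [0, 0])  # word count -> [ILD sum, posts]
    for title, ilds in posts:
        n_words = len(title.split())  # stop words are not removed
        totals[n_words][0] += ilds
        totals[n_words][1] += 1
    return {n: s / c for n, (s, c) in sorted(totals.items())}

# Made-up (title, ILDs) rows, for illustration only.
posts = [
    ("Link Building Has Changed", 40),
    ("How SEOs Know SEO", 10),
    ("Charting Unique Keyphrases Using Advanced Segments", 12),
]
print(average_ilds_by_title_length(posts))
```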

    This data proves to me that a descriptive title is what the linkerati is looking for.  Going overboard on the length of the title can prove to be a bad move also. 

    EXAMPLES OF HIGHLY LINKED TO POSTS WITH TITLE LENGTH IN THE “SWEET SPOT”

    POSTS LENGTH EFFECT ON ILDs

    Post length is a long-debated topic out there in the blogosphere.  Most bloggers will tell you that you should keep your posts around 500 to 900 words, and that might be stretching it.  When it comes to SEO/SEM blogs, longer, more content-rich posts are more linked to than those with a limited amount of content. 

    From the chart below you can see there is a word range that seems to collect more ILDs than other word ranges.  Based on the data, the ideal length of your posts should be around 2328 to 2618 words.  In my previous post, the ideal length for SEOmoz’s posts alone was between 1800 and 3000 words. 

    The chart above shows posts only up to 2812 words, but accounts for over 99.55% of all the posts. Posts that were greater than 2812 words really had a low number of ILDs.  For this reason and for the display of the chart, they were removed.

    EXAMPLES OF HIGHLY LINKED TO POSTS (BETWEEN 2328 AND 2618 WORDS)

    DEPTH OF POSTS EFFECT ON ILDs

    SEOs know that you want to keep your key content in as few subfolders as possible, but does this affect the number of ILDs you receive?  The data suggests that the depth of your post doesn’t affect the number of ILDs.  The graph below shows that just about half of the blogs out there place their content two subfolders deep, such as seomoz.org/blog/POST-TITLE. 

    MEDIA’S EFFECT ON ILDs

    What role does placing lists, images, and/or videos in a post play in the number of ILDs?  The data shows that putting any one of these media types in your post will increase the number of ILDs you receive.  Putting a list in your plain text post could double the number of ILDs you receive.  The results are even more outstanding when all three types of media are used.

    Do I really believe that you can take any post, slap a picture in it and automatically receive more links?  No, but if you have decent content and media to support your post, it will appeal to more users and in turn increase the number of potential links.  I find it amazing that just adding images and lists to your post could increase the number of ILDs by a large percent.  Images and lists are among the easiest things to create and anyone can do it, so why aren’t they?  See the chart below for the full specs on adding media to your post.

    TOP MEDIA POST EXAMPLES

    So I’m sure you are all wondering what some good examples are of the different types of posts along with the media.  Below are some links to some great posts that contain different types of media and have been successful.  Some of these posts should be your guide when creating new content for your site.

    ALL 3 MEDIA TYPES

    ONLY LISTS & VIDEOS

    ONLY LISTS & IMAGES

    ONLY IMAGES & VIDEOS

    ONLY LISTS

    ONLY VIDEOS

    ONLY IMAGES

    NONE

    TOP DOMAINS FOR MEDIA TYPE

    The data shows that there were certain domains that tended to use certain types of media in their posts.  Below I’ve put together two sites for each category so if you enjoy posts of a certain type you can visit their blog.

    ALL 3 MEDIA TYPES

    ONLY LISTS & VIDEOS

    ONLY LISTS & IMAGES

    ONLY IMAGES & VIDEOS

    ONLY LISTS

    ONLY VIDEOS

    ONLY IMAGES

    NONE

    AUTHORITIES EFFECT ON ILDs

    What role does a blog’s authority play in the number of ILDs?  Seems like a simple question, and the data seems to show that if you’re an authority in your niche, you will generate many more ILDs than someone who is not.  Look at the chart below and you can see that Matt Cutts’s blog generates almost twice as many as its closest competitor, sugarrae.com!

    TOP TOPICS THAT ATTRACT LINKS

    Unlike SEOmoz, not every blog places its posts into nice categories, and if they do, those categories will not match across all the sites.  So how do we determine what topics are attracting the most links and are good topics to create posts about?  We crawl 72,330 posts, determine the ILDs, and then extract the most used words from those posts to create a “super” group of keywords that result in link worthy blog posts.

    The first thing I wanted to do was extract all the text and find the words that are most used across all blog posts. I was curious, aren’t you?  After pulling out 27,658,728 words and sorting them, five words came out on top: Search, Google, Yahoo, Site, and SEO.  Was I surprised? No, but it’s interesting to know and a good starting point. 
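Counting the most used words is straightforward to reproduce on a small scale. The Python sketch below shows one possible approach with `collections.Counter`; it is not the script behind the numbers above, and the tiny stop-word set and sample texts are assumptions made up for the example.

```python
import re
from collections import Counter

# A tiny illustrative stop-word set; a real analysis would use a fuller one.
STOP_WORDS = {"the", "and", "a", "of", "to", "in"}

def top_words(posts, n=5):
    """Return the n most frequent non-stop words across all post texts."""
    counts = Counter()
    for text in posts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample texts standing in for crawled post content.
posts = ["Google search and the Google index", "Search the site"]
print(top_words(posts, 2))
```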

    Taking a look at the top 1% of all 72,330 posts, I found that the words did change a little bit.  Some of the top words used were:  Search, Google, Site, Links, SEO, Content, People, and Social.   This data seems very similar to what was found in part one of this study, with the SEOmoz data.  Posts that are about link building are very popular, but now we can conclude that they are attracting links.  When we look at a much smaller percentage, say only the top 50 posts, we find very similar words, such as: Google, Search, Blog, Link, Pagerank, and Site.

    So what can you really take away from the content of the top 50 blog posts?  Stick with the major engines: Google, Yahoo, and maybe even Bing, on a good day.  The linkerati likes topics including Link Building, Pagerank, and Social Media.  As my disclaimer stated above, these are not the rules but just observations from a small sampling of the blogosphere.  If I knew the exact topic that the linkerati loves, I wouldn’t be writing here, I would be out making millions writing all day. 

     

    BIGGEST TAKEAWAYS

    • The data suggests that posts with title between 14 and 16 words attract more ILDs than those with longer or shorter titles.
    • Contrary to popular belief, the data suggests that posts with more than 900 words are attracting more links than those with 900 words or fewer.  Shoot for posts between 2328 and 2618 words.
    • The data suggests the location/depth of your blog post doesn’t seem to have an effect on the number of ILDs you will receive but may affect your SEO work, so be cautious.
    • If you’re interested in the top post with a certain type of media, check above.  Also if you’re interested in the blogs that tailor to a certain type of media, check above.
    • Authority plays a major role in the number of ILDs that you will receive on your post.  Matt Cutts’s blog receives almost twice as many ILDs as the next closest blog.
    • Hot topics that attract links include: Google, Search, Blogs, Link Building, Pagerank, SEO, and Social Media.

    SUMMARY

    In summary, the takeaways above are generalizations about a small group of posts from the blogosphere and should not be taken as rules but merely as a guide to help you create content that has the potential to generate links.  Work on your authority in your niche and become the place people come to for great advice.  While you’re waiting for authority to grow, make sure that your posts include visual aids to help readers get the takeaways quickly.

    SPECIAL THANKS

    Special thanks to the SEOmoz team for the access to the Linkscape API.  Without the use of the API this post would have never been possible.


    24 Hours Without Privacy »

    Posted by Danny Dover

     Warning: This post has very little to do with SEO. It will apply to you and your work but not in the way that you are familiar with. It is an experiment for both of us.


    He wakes up at 7:31 AM just like the day before. With ignorance as his shield, he ventures out into a world without privacy.

    The young man (a title given to him by his mother) sits submissively in his bed and idly scans e-mail on his phone. Another notice from his bank, a forwarded e-mail from his friend, five new e-mails from his co-workers, and an intimate message from a girl he met last weekend at a party. He suppresses a smirk, she is toying with him and he knows it.

    Unbeknownst to him, four copies of his private e-mail are stored in locations around the world. The first is stored locally on his phone, a second on a search engine giant’s servers, a third on the servers of a consumer electronics company that he forwards his e-mail through and a fourth on the massive social network’s servers where the flirty message originated. Each of these copies is duplicated across servers for the safety of redundancy. Four separate corporations, run by people he will never meet, store his most private messages. A fifth corporation, a telecommunications conglomerate, logs the entire process and associates it with his account.

    After checking his e-mail, he pulls himself out of bed, takes a shower, and eats a homemade breakfast burrito. Just wearing a towel, he glances over to see if his blinds are closed. He is most conscious of his privacy when he is alone.

    He pulls his favorite shirt out of his closet (a medium sized black unlabeled t-shirt) and pulls it over his head. He leaves his apartment after double checking that he locked the deadbolt on his front door.

    Unknown to him, he is video recorded as he leaves his home by his apartment’s security system. The resulting tape is scheduled to be stored in a permanent archive.

    He walks the two blocks to his bus stop just in time to catch the 43. He sits down on his normal seat on the bus (between the middle door and the support pole) and puts in his headphones. For the entire 30 minute duration of the ride, none of the 23 people on the bus make eye contact with each other. The bus driver lazily and incoherently babbles into the onboard PA.

    When he first got on the bus, he scanned his bus pass which logged his account identifier into the metro system. After doing so, he walked by an on-board microphone that was set to record the entire bus. At the same time he was recorded by two different video cameras on the ceiling of the bus. As he travels down the road, the GPS in the bus sends its coordinates back to the bus station.

    Meanwhile 23 other GPS enabled devices sit in the pockets of the passengers. If any of those cell phones ring, they will enable microphones that would pick up and transmit the man’s voice. Even if the GPS features are disabled, the cell phones can still be triangulated via cell towers. In the off chance that both the GPS features and cell networks were disabled, the Internet enabled phones could still be geo-located when they accessed public WiFi routers. Together there are 70 different ways that the man sitting in his favorite seat on the bus could be tracked.

    He sits calmly and listens to the Garden State Soundtrack for the 46th time.

    Eventually he reaches his stop and steps off the bus and into the city. Adjusting his earphone, he starts the short hike to work. He nods at the police officer who is stopped at a red light. The cop doesn’t nod back.

    As he walks the half mile from his bus stop to his office, he is recorded by 4 different video cameras. The first is in the police car, the second and third are intersection cameras and the final camera is in the ATM he walks by.

    He eventually makes it to his office building and continues up the staircase. He says “hi” to the friendly man at front desk, grabs a glass of water and unpacks his laptop. He greets his co-workers and eventually settles at his desk.

    As he entered the office building, he swiped his card and was identified as having opened the door. This action was stored in a remote database. He then walked into the office where he entered a room with 25 other cell phones. There are also 26 computers. Together there are over 50 different devices that can access the internet. Each of them identifies its location and general details every time it connects to a server.

    The privacy conflicts that the young man encountered in the offline world are nothing compared to those he will encounter later today in the online world.

    He sits down at his desk and loads up all of his normal browser bookmarks to quickly skim the day’s news. He checks his e-mail again.

    One of his friends sent him a link to an online video of a little kid singing a Beatles song. Having already seen it, he responds with a “lol” and switches tabs so he can skim the most popular articles on his favorite social news site. He clicks several of the corresponding links and is taken to some websites with various clever images and lists. The content is designed to be consumed quickly. He digests it like watching a flip book.

    Around the office he hears erratic bursts of laughter as his coworkers click the same links. Noticing this "popcorn effect" for the first time, he shares his thought on a micro blogging service.

    As soon as the young man opened his Internet browser, his operating system immediately began logging each website he visited. When his homepage loaded, the action was logged by the search giant. This record is owned by the same company that owns the second website he visited to watch the music video. Both actions are stored somewhere on an unknown server. The third site he visits, his favorite social network, logs every action he takes and associates it directly with his account. The same is done for all of his friends.

    After leaving that closed garden, he shares his popcorn thought with his acquaintances on the micro blogging service. His message is dispersed through a massive API where an untold number of services permanently store it and associate it with his account.

    He has a similar experience when he browses his favorite social news website. All of his actions on the domain are stored and associated with his account. Even after he leaves this domain to view the day’s funny pictures and top ten lists, his actions are tracked either by the search engine giant or a large Redmond based software company.

    At one point he isn’t on a page that displays ads by these major ad networks, but is instead tracked by a specific online retailer who has its own ad network. The online retailer associates his page views with his shopping account. All of this data is stored permanently and mined to ensure that the online retailer is able to maximize profits.

    He finishes his morning internet round by buying tickets on a movie ticket site. The popular site stores his zip code and credit card details.

    Later, after finishing a long day at work, he stops by his local grocery store to pick up a six pack of beer. He goes straight to the back of the store and brings the drink back to the register. Despite his facial hair, the clerk requests to see his ID. He complies and pulls it out of his wallet. The young man keys in his phone number in absence of his grocery store loyalty card so that he can save $0.50. The cash register prints out a receipt and the cashier shoves it in a plastic bag along with the purchase. The man thanks the grocer and continues on his way home.

    As soon as he stepped into the grocery store he was picked up by one of about 20 video cameras that continually record shoppers. As he approached the checkout stand he started a three tiered identification process that rivals that of getting a passport.

    The first method was via government ID and was paradoxically the least useful to the grocery store. The cashier ignored his picture and instead focused on typing his birthdate into the register computer as speedily as possible.

    The second form of identification was via his phone number that was tied to his grocery card number. This allows the grocery store to log all of the young man’s purchases. This data is later analyzed to determine spending patterns of different demographics and to identify sale combinations that maximize profits.

    The final form of identification underwent the most analysis. The man’s credit card required an autograph. This transaction was stored by four different companies. The first was stored locally at the point of sale by the grocery store. The second was stored on his credit card’s servers. The third was on his bank’s servers and the final copy was stored on the servers of his favorite personal finance website.

    As he finally gets back to his apartment he grabs the package from his doormat and settles down for the evening. He orders in a pizza because he is too tired to cook for himself. The young man finishes his day by splitting the beers with his roommate and watching a movie.

    The package on his doorstep hints at only the tip of two data icebergs. The first one is owned by the online retailer and the second one is owned by a worldwide shipping company. His address and name are stored in massive databases owned by the two corporations. He will never have access to his information stored in the shipping database. Records of the contents of the package, along with his credit card information, are stored by the online retailer so that it can show relevant suggestions when he visits the website.

    His address, phone number, name and pizza preferences are stored by the pizza company when he places his order. This transaction is also stored by the pizza place, his bank, the credit card company and his personal finance website.

    His movie selection is stored on the servers of the consumer electronics manufacturer from earlier. Again his financial transaction is recorded by four corporations.

    Back in his bed, he looks at the calendar. The date is November 3rd, 2009. This is not some day off in the distant future; it is today. This is not some made up character; this is my life and it extends directly into yours.

    The balance between privacy and convenience is fickle. It is unproven and the rules are uncertain. Just because something is free online does not mean you are not making a sacrifice when you use it.

    That said, I do not believe that sharing personal data for convenience’s sake is evil. Many times it is economical and even beneficial. The fact that your pharmacy has access to all of your prescriptions gives them the ability to check for potentially lethal interactions. This is fundamentally good. Policies like these save lives. Information is not a scarce resource like oil or coal; it is bountiful and truly limitless. I believe it should be up to the individual to decide what they want to do with their personal data. After all, a few of the corporations above were opt in.

    At some point in the future, I plan to completely open source my life. By this I mean putting all of my personal data online. This will include everything from my browsing history to my digitized DNA. I have been conducting research on the ramifications of this for a little over a year and this article unveils some of the realizations I have had.

    As you can see, the leap to throwing away my privacy is actually much smaller than I originally thought. In fact, both you and I are already almost all of the way there.


    The references to major corporations made above were all real. The actual corporations are listed below with some relevant facts.

    "search giant" - Google Inc.:
    In addition to the data points listed above, Google is storing hundreds of other metrics about its users.

    "consumer electronics" - Apple Inc.:
    Apple is rumored to be building a $1 billion data center so it is likely that more data like that mentioned in this post will be stored in the future.

    "massive social network" - Facebook Inc.:
    In total, 25 Terabytes of user activity data is stored daily by the online social networking service.

    "telecommunications conglomerate" - AT&T Inc.:
    AT&T reportedly has only 20,268 servers. This is minuscule compared to Google’s estimated 1,000,000 servers.

    "micro blogging service" - Twitter Inc.:
    Twitter recently peaked at 5,000 messages a second following Michael Jackson’s death. Odds are one of them was yours.

    "social news site" - Digg Inc.:
    Digg says that only about half of its server load is from visitors to its website. The other half is a mix of Digg buttons and API calls. This means a non-trivial amount of information that Digg collects is from people who are not even on the Digg domain.

    "video provider" - YouTube Inc:
    YouTube serves over 1,000,000,000 (billion) views a day. Odds are you are one of them.

    "credit card company" - Visa Inc.:
    The major credit card companies are now hiring psychologists and statisticians to mine your buying data and figure out who is a liability.

    "a large Redmond based software company" - Microsoft Corporation:
    Microsoft is in the process of finishing one of the world’s biggest data centers in anticipation of creating the world’s first mainstream cloud-based operating system, Microsoft Azure. In Microsoft’s eyes, the future is in your data. (Google and I agree)

    "online retailer" - Amazon.com, Inc.:
    Amazon has more than 55 million active customer accounts.

    "favorite personal finance website" - Mint Software, Inc:
    Mint Software was recently bought by Intuit Inc. making its combined collection of personal finance information one of the biggest in the world.

    "worldwide shipping company" - United Parcel Service, Inc.:
    The United Parcel Service (UPS) can reach more than 4 billion of the earth’s 6.3 billion people and delivers more than 13.3 million packages each day.

    "the pizza company" - Domino’s Pizza, Inc
    Each of the 40,000 systems in the company’s franchises is connected to their global network. Your pizza order is not alone.


    Danny Dover Twitter

    30 SEO Problems & the Tools to Solve Them (Part 1 of 2) »

    Posted by randfish

    Every day, SEOs are challenged in their jobs to solve problems big and small - some are technically complex, others are merely time consuming, repetitive and tedious. At SEOmoz, we love to build, use and recommend tools to help solve these issues. Tools and automation aren’t always the right answer, but for many of the challenges we face, they’re a welcome ally in the battle for effectiveness and efficiency.

    In this first of the two part series on the subject, I’ll be covering tools on SEOmoz. My next segment will focus on tools across the web.

    #1: View Source Sucks

    We’ve all had the experience of loading a web page, viewing the source code and sorting through trying to determine whether the H1 tag was implemented properly or if the <head> contains a rel="canonical" tag or, worst of all, counting internal/external links manually. These are essential elements of the SEO process, but they’re a "Royal Pain In The Butt" (RPITB from here on out).

    Solution: Analyze Page in the mozBar

    Thankfully, the mozBar has this spiffy "Analyze Page" button that opens a visual overlay with critical stats like meta data, link counts, rel="canonical," Hx tags, and even counts of characters in content areas. I’ve used this personally in a lot of live-review and client meeting scenarios and people are consistently impressed (and it makes us look like Pros).

    mozbar page analysis overlay
    See tags and character counts without having to view source (example from Oyster.com Hotel Reviews)

    mozbar link counts in the analysis view
    Link counts and page attributes are visible at the bottom of the "analyze page" overlay
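
    For anyone curious what that kind of overlay has to compute, here is a rough sketch of the checks described above (H1 text, the rel="canonical" tag and internal vs. external link counts) using only Python’s standard library. This is not how the mozBar works internally; the names PageStats and analyze_page are made up for illustration, and "internal" is assumed to mean "same host as the page".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageStats(HTMLParser):
    """Collects H1 text, the rel="canonical" URL and internal/external link counts."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).netloc.lower()
        self.h1_texts = []
        self.canonical = None
        self.internal_links = 0
        self.external_links = 0
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1":
            self._in_h1 = True
            self.h1_texts.append("")
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "a" and attrs.get("href"):
            # Resolve relative hrefs against the page URL before comparing hosts.
            target = urlparse(urljoin(self.page_url, attrs["href"]))
            if target.scheme not in ("http", "https"):
                return  # skip mailto:, javascript: and similar
            # Assumption: "internal" means the link stays on the same host.
            if target.netloc.lower() == self.page_host:
                self.internal_links += 1
            else:
                self.external_links += 1

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.h1_texts[-1] += data


def analyze_page(html, page_url):
    """Return the on-page facts for one HTML document as a plain dict."""
    parser = PageStats(page_url)
    parser.feed(html)
    parser.close()
    return {
        "h1": [text.strip() for text in parser.h1_texts],
        "canonical": parser.canonical,
        "internal_links": parser.internal_links,
        "external_links": parser.external_links,
    }
```

    Feed it the HTML you would otherwise be reading in View Source, plus the page’s own URL so relative links resolve correctly.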

    Getting the data fast is awesome - looking professional and raising eyebrows while we do it is another thing. I love tools that make SEOs look good - honestly, I’m focused on making more of SEOmoz’s products in this vein. I wish we’d built more of our tools historically with the mindset of ease-of-use and simple, obvious value (I sometimes worry that we’ve gone overly advanced in past tools that I’ve designed - hopefully Adam & the product team will keep us better focused).

    #2: Determining a PageRank Penalty

    Sometimes it’s hard to know whether a drop in PageRank (or the PageRank score on a page you haven’t visited before) is due to natural factors or modification by Google’s webspam team. Whether it’s a review of a potential client’s website, a look at a potential link partner or an analysis of your own site, knowing what’s happened with the PageRank score is an advanced, but sometimes essential piece of the SEO process.

    Solution: Historical PageRank + PageRank vs. mozRank

    Thankfully, there’s a very good system for solving this problem (or at least getting closer to the answer). First up is a free tool we’ve had for a long time - the Historical PageRank Checker:

    Historical PageRank Checker

    When PageRank has been lowered more than one point, particularly in a timeframe that doesn’t correlate with a standard PR update, you can feel relatively confident that some sort of PR penalty was incurred.

    Next are the metrics mozRank & mozTrust from Linkscape. Since mozRank in particular is both highly correlated with PageRank (on average ~0.55 off from toolbar PR) and calculated independently, you can use the comparison between these metrics to help identify disparities. When PR is significantly lower than mozRank, particularly on the homepage of a website, there’s a potential that a PR penalty may exist (though it’s also possible that PR simply hasn’t updated - Linkscape recalculates metrics every month, while Google updates PageRank on a fairly random schedule every 3-9 months). 
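
    If you’re checking a lot of pages, the two rules of thumb above are easy to script. The sketch below is only an illustration: the function name and the gap_threshold default of 1.5 are my own assumptions (the only number from above is the ~0.55 average difference), and the scores are values you’d supply yourself rather than anything fetched from Google or Linkscape.

```python
def pr_penalty_signals(toolbar_pr, mozrank, pr_history=None, gap_threshold=1.5):
    """Return a list of reasons a page might carry a PageRank penalty.

    toolbar_pr    -- current toolbar PageRank (0-10)
    mozrank       -- current mozRank for the same page
    pr_history    -- optional list of past toolbar PR values, oldest first
    gap_threshold -- assumed cutoff for "significantly lower"; tune to taste
    """
    signals = []

    # Rule 1: toolbar PR sits well below mozRank.
    if mozrank - toolbar_pr >= gap_threshold:
        signals.append("toolbar PR well below mozRank")

    # Rule 2: PR fell by more than one point between consecutive readings.
    if pr_history:
        for earlier, later in zip(pr_history, pr_history[1:]):
            if earlier - later > 1:
                signals.append(f"PR dropped from {earlier} to {later}")

    return signals
```

    An empty list doesn’t prove a page is clean, and a non-empty one doesn’t prove a penalty (PR may simply not have updated yet); treat the output as a prompt for a closer look.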

    The metrics from Linkscape aren’t perfect, nor are they a sure identifier, but they do provide an alternate source for comparison and contrast. You can get mozRank via Linkscape itself, or use the free API if you’d like to employ it on tools or in a more scalable fashion.

    #3: Valuing a Potential Link

    It’s hard to compare the value of links from potential pages, and yet this is an essential task in the SEO world. Managers need to know whether link acquisition is going well or poorly. Link builders need to be able to judge the quality of the sites and pages they’re targeting. SEO consultants and analysts need to determine where good links are coming from, where competitors have earned great links and what links might be spammy/low quality.

    Historically, we’ve had a very limited number of metrics - things like link counts from Yahoo! Site Explorer, PageRank of the site’s homepage and others have low correlation with rankings (we explored this on the blog in Ben’s Ranking Models post) and data accuracy issues, too (PageRank’s update cycles and lack of granularity - one point of PageRank is a huge amount of variance).

    Solution: Linkscape Metrics

    Linkscape has a lot of depth when it comes to metrics (sometimes too much, actually!). You can see data about numbers of links, linking root domains, scores around raw link popularity (mozRank) and trustworthiness (mozTrust). The metrics run on both a domain and an individual page, so you can get a sense of the importance of an individual URL and the domain it’s on. You can also feel confident that the metrics are provided with a greater eye to providing specific value to SEOs. The folks behind Linkscape are uniquely focused on providing metrics that prove valuable, predictive and accurate.

    Linkscape Metrics via the SEOmoz Toolbar

    One of my favorite places to get the metrics quickly is via the mozBar, which shows them at the top of the analyze page overlay. For even more depth, you can use the data detail tab (e.g. for Raveable.com) on the Linkscape basic report - and for large amounts of data, you can view (or export to CSV) the top 3,000 links to a page or site via the advanced reports.

    #4: Watching Rankings Over Time

    Watching rankings is a pain and manual systems aren’t scalable or a good use of anyone’s time. It’s also tough (perhaps even a RPITB) to track rankings across multiple engines and TLDs (.co.uk, .com.au, .co.nz, etc.) and keep track of the data in a format that can be exported intelligently.

    Solution: Rank Tracker

    Thankfully, there’s the Rank Tracker, a serious upgrade from our previous Rank Checker tool. You can watch rankings across multiple engines and geographies, and the interface is simple + easy to use.

    Rank Tracker Selection Interface

    Choosing terms to track is straightforward, and the system automatically pings every week and stores the historical data, which you can download in CSV. Lately, I’ve been impressed with accuracy - despite the personalization and geographic modification, the team’s been making great strides to ensure that the rankings are a good estimate of what a "normal" (non-logged in, geographically agnostic) user would receive.

    Rank Tracker Results

    BTW - I’ve also heard good things about Advanced Web Ranking (and always like to recommend good competitors - definitely more of that coming in the next post in this series).

    #5: Quickly Comparing Two Pages’ Metrics

    Answering the question "why does that page outrank me?" has plagued SEOs since time immemorial. There are so many things that go into the ranking equation that it can be tough to determine what’s critical to the process vs. unimportant. It’s particularly challenging to understand the difference in link metrics - is one on a more important domain? Does one have more links, but they’re mostly nofollowed?

    Solution: Visualization & Comparison Tool

    The Linkscape Visualization Tool is a great way to "see" into the rankings when comparing two pages.

    Linkscape Visualization & Comparison Tool

    The visual shapes represent the degree to which the page is meeting that metric’s potential, and somewhat amazingly, we see that the bigger area nearly always outranks the smaller one. It’s a great way to show clients, prospects and managers the gap between your site and a competitor’s and explain how far you have to go and in what direction. The tool doesn’t show all of the metrics in Linkscape, but it’s a good representative set and in future iterations, we plan to have more refinement and options available.

    #6: Finding Competitors’ Links

    Who’s linking to my competitors but not linking to me? It seems like a simple, straightforward question, but, as usual, the devil’s in the details. Most of the existing toolsets on the web (I mentioned several in this post) use the Yahoo! link query - linkdomain:site1.com linkdomain:site2.com -linkdomain:mysite.com (for example, see this search for pages linking to hotels.com and kayak.com but not Oyster.com). The problem is you have no good way to determine whether the list returned includes nofollow links, whether you’re getting the most valuable, important pages/sites listed first and whether the list filters out some potentially great stuff.
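
    If you do want to run that query by hand for different sets of sites, a tiny helper keeps the syntax straight. This just assembles the query string in the format quoted above; the function name is made up, and it doesn’t talk to Yahoo! or any other service.

```python
def linkdomain_query(competitors, my_site):
    """Build a query for pages linking to every competitor but not to my_site."""
    include = [f"linkdomain:{domain}" for domain in competitors]
    exclude = [f"-linkdomain:{my_site}"]
    return " ".join(include + exclude)
```

    For example, linkdomain_query(["hotels.com", "kayak.com"], "oyster.com") produces the same kind of query as the example search mentioned above.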

    Solution: Competitive Link Research Tool

    Enter the Link Intersect Tool in Labs. Just enter your site plus at least two competitors:

    Competitive Link Finder Tool

    The tool results will show you a list of domains that contain links to pages from your competitors but don’t point to you:

    Link Intersect Tool Results

    At SEOmoz, we’ve been calling this "cheating" for link building. The results are so useful and instantly actionable (and the data’s quite excellent, particularly when sorted in DmR order) that it just doesn’t make sense not to use it.
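
    The logic behind a report like this boils down to set arithmetic. The sketch below is not the tool’s actual implementation, just a hedged illustration: it assumes you already have, for each competitor, the set of domains linking to them, and it assumes a domain only counts if it links to every competitor you listed. The domain_scores argument is a hypothetical stand-in for a metric like DmR.

```python
def link_intersect(competitor_links, my_links, domain_scores=None):
    """Domains linking to every competitor but not to me, best score first.

    competitor_links -- dict: competitor domain -> iterable of linking domains
    my_links         -- iterable of domains that already link to my site
    domain_scores    -- optional dict: linking domain -> importance score
    """
    linking_sets = [set(domains) for domains in competitor_links.values()]
    if not linking_sets:
        return []

    # Domains that link to all of the competitors...
    shared = set.intersection(*linking_sets)
    # ...minus the ones that already link to me.
    candidates = shared - set(my_links)

    # Highest score first; ties (and unscored domains) fall back to name order.
    scores = domain_scores or {}
    return sorted(candidates, key=lambda domain: (-scores.get(domain, 0.0), domain))
```

    Swap the intersection for a union if you’d rather see domains that link to any competitor; the list gets longer and noisier, but you miss less.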

    #7: Tracking Links & Mentions in the "Fresh Web"

    Watching what’s happening around a blog post, website or brand name is a challenge. Lots of blog search engines and some of the emerging real time search engines can give you data points here, and some are actually quite good for their niche (I’ll definitely cover a few in Part 2), but sometimes, you just want a graph of what’s been happening in the blogosphere/twitosphere with a list of URLs where the action’s taking place.

    Solution: Blogscape

    We don’t talk a tremendous amount about Blogscape, but it’s getting to be a very good tool (and more upgrades are on the way). The dataset currently comprises 10 million feed sources that we found significant links to via Linkscape. These include news feeds, blogs and, yes, Twitter accounts, too. The threshold was a number of unique linking root domains, so while this source doesn’t contain everything, it’s also not bogged down by a ton of noise, helping to make the signal rise to the top.

    Blogscape Graph

    Don’t miss the query operators page, which shows the extent of the search parameters and advanced data you can get from the index.

    #8: Fast Access to Links & Anchor Text

    Sometimes, you just need to see a list of links fast. Yahoo! Site Explorer has historically been the "go-to" source for this, but over time, not being able to filter nofollow’d links, nor see metrics, nor have any idea about the sort order used has made it a frustrating tool.

    Solution: Backlink Analysis Tool

    Labs’ Backlink Analysis Tool is terrifically useful for this scenario. Not only do you get a list of links ordered by relative importance in just a few seconds (slightly longer if the URL/domain has many thousands of links), you also retrieve an ordered list of anchor text distribution pointing to the page, subdomain or root domain.

    Backlink Analysis Tool

    It’s not pretty, but it is simple to use.

    Anchor Text Breakdown

    The fast anchor text breakdown is terrific for making short work of comparing multiple sites’ link profiles.

    Link List

    The link list itself is ordered by mozRank passed, a metric helping to show where the most "juice" is originating (though not necessarily the most important domains/pages). You can get more advanced in full Linkscape reports, but for quick, dirty link lists and anchor text at the touch of a button, this is hard to beat.

    #9: Quickly Comparing Metrics from Numerous Sources

    There are times when client reports or c-level execs need a long list of metrics from a variety of sources - Compete, Alexa, Google PageRank, Yahoo! Link Counts, Google News mentions, etc. Going to each of the individual tools, running reports and gathering the metrics can be an especially tedious RPITB, particularly if you need to gather this data for multiple domains/pages.

    Solution: Trifecta Tool

    Trifecta isn’t always perfect - it’s pulling data from a lot of sources, some of which don’t have great uptime and can be squirrely about the ways they return information. However, it can be a much needed ally in the fight against the laborious process of manually collecting the numbers.

    Trifecta Tool Results

    The comparison feature is also a neat way to see and collect data from multiple sources at once:

    Trifecta Comparison Report

    #10: Finding Competitors’ Most Successful Linkbait

    How is it that my competitor earned all their links? What content did they put out there that was so successful? How can I figure out their strategy? Unless you’re willing to do a lot of surfing, this is a tough problem to solve.

    Solution: Top Pages Tool

    Thankfully, through Linkscape, we can collect data about which pages on a given subdomain or root domain have earned the most links. To give greater accuracy in the data, we use # of linking root domains. It’s our sense that seeing pages that have earned large numbers of links from different sites will give the best idea of where and how links are flowing to a site and how they’ve been acquired. The Top Pages Tool in Labs shows this data:

    Top Pages on SEOmoz

    Now that you can see them, you can go visit those pages, learn how they got the links, and reverse engineer those crafty strategic moves (also great for ID’ing spam that’s been created on your site).
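
    If you have your own export of links, the same "count unique linking root domains per page" idea is simple to reproduce. This is a rough sketch rather than the Top Pages Tool itself; it assumes each row is a (linking root domain, target URL) pair that you’ve already prepared, and the function name is made up.

```python
from collections import defaultdict


def top_pages(links, limit=10):
    """Rank target pages by how many distinct root domains link to them.

    links -- iterable of (linking_root_domain, target_url) pairs
    limit -- how many pages to return
    """
    domains_by_page = defaultdict(set)
    for root_domain, target_url in links:
        # A set means ten links from one site still count once.
        domains_by_page[target_url].add(root_domain.lower())

    ranked = sorted(
        domains_by_page.items(),
        key=lambda item: (-len(item[1]), item[0]),  # most domains first, then URL
    )
    return [(url, len(domains)) for url, domains in ranked[:limit]]
```

    Counting domains instead of raw links is the design choice that matters here: a sitewide footer link from one site won’t push a page above one that earned links from dozens of different sites.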

    #11: Identifying Pages that Can Flow Link Juice Internally

    How do I know which pages on my site have the most link juice to share? It’s a common query as the right internal links can help to make the difference with both competitive SERPs and indexing problems.

    Solution: Top Pages Tool

    Once again, it’s Top Pages to the rescue. Not only can we see which pages have earned link juice, but we can also identify potential problems (302s and blocking w/ robots.txt being two of the big ones):

    Top Pages on Netflix

    I’m guessing someone at Netflix should really look into this… FYI - the "0" usually doesn’t indicate a problem; it’s just a marker (we’ll work on fixing that up as I know some of you have asked about it).

    #12: Get Social Media Monitoring Data

    While Blogscape is a good search tool on our fresh web index, there’s a lot of demand for a more functional monitoring tool. Robust solutions from companies like Visible Technologies here in Seattle are quite pricey (worthwhile if you’re a big brand making a serious investment, but not geared to SMBs or most consultants).

    Solution: Social Media Monitoring Prototype

    The Social Media Monitoring Prototype is one of our latest releases and while it’s still very much in early alpha, the data is quite compelling and usable.

    Social Media Monitoring Prototype in Labs

    The counts (links, mentions and tweets) can be used to help determine the value of blogosphere, Twitter and linkbait campaigns over time. Just be aware that because of how Blogscape’s index and retrieval of sources functions, data from the last 48 hours is less stable and complete than older material. It’s a good tool to use after the fact, not necessarily in the heat of the campaign.

    #13: Streamline Common Link Search Queries

    If you’ve ever been tasked with manual link acquisition and told to use "all the common link queries" to find potential sources, you know how incredibly frustrating the process can be. No one likes searching for the same combination of phrases dozens of times over and over to retrieve the one or two credible sources that result in the SERPs.

    Solution: Labs’ Link Acquisition Assistant

    Danny released a spiffy tool earlier this year - the Link Acquisition Assistant - that’s a big time-saver on this front. Enter a few pieces of data about your site and the link campaign you’re running and it will spit back links to tons of relevant search queries and link lists. While it doesn’t automate everything, it can also be a huge boost in exposing ways to find and earn links you might not have considered.

    Link Acquisition Assistant

    #14: Determine a Keyword’s Relative SEO Competitiveness

    How hard would it be to rank for a particular keyword? Which keyword would be easier to rank for today? These questions are tough to answer unless you’re willing to dig deep into data on the top results - and that’s horribly time consuming (and a RPITB).

    Solution: Keyword Difficulty Tool

    The Keyword Difficulty tool provides a quick view into metrics that have historically helped SEOs determine potential competitiveness, as well as a percentage score that gives a sense of relative competition level.

    Keyword Difficulty Tool Results

    Like Trifecta, the data isn’t always perfect, and a new version of this tool is actually on its way (employing lots of the ranking models stuff we’ve been building with Linkscape to help actually analyze a page/site of your choice and tell you if you’ve "got a shot"). However, it’s still quite a good tool for getting a robust dataset automatically and

    #15: Getting On-Page Optimization Right

    Have I targeted my keywords in all the right tags? Did I misplace or mis-code anything? Am I as "on-page optimized" as I can/should be? Sure, you can dig through the source code manually and check, but that’s a (last time, I promise) RPITB.

    Solution: Term Target

    With the Term Target, just plug in the keyword you’re targeting and the page you want to rank and it sends back an analysis of the keyword usage, along with recommendations for where and how to employ the query term.

    Term Target SEO Tool

    There’s nothing particularly complex here (though, eventually, we’ll be switching to recommendations based on our correlation and ranking models data), but the usefulness is easy to see. We have members that I know just run the report on lists of pages, send the results to clients and get the changes implemented.


    Next week, I’ll look to cover many of the hairy SEO quandaries that tools outside SEOmoz can help to solve. If you’ve got other ideas, tools or requests around any of these, please do leave them in the comments!

    Web Analytics and Segmentation for Better Conversion Optimization »

    Posted by philou2803

    This post was originally in YOUmoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

    I noticed some months ago that SEOmoz was having more and more web analytics/conversion rate related posts, so I decided to give it a shot and talk about segmenting your traffic data in order to have a better understanding of your traffic. For this post I only use the Advanced Segments Tool from Google Analytics.

    Segmenting by location:

    I used to work for a website that sold services exclusively to UK customers. When I started there, they were driving exclusively organic traffic, and the overall landing page conversion rate was not as good as expected. So I started to segment the data by location and checked the UK conversion rate against the overall conversion rate:

    google analytics tool   

    That was quite insightful. Look at the difference in conversion rate:

    Segmenting by Location   

    In fact, the conversion rate of our target country was more than decent, and the reason our overall conversion rate seemed low was the mass of traffic coming from other locations (in this case, the US). Unfortunately, we had nothing to offer them :( The solution was to drive more UK traffic and sort out some SEO problems. (For the record, Google did not understand we were a UK site: .com TLD, server in California, and Google Webmaster Tools not yet implemented.) Without looking at the UK segment exclusively, we would have made the wrong business decision.
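
    To make the arithmetic concrete, here is a small sketch of the same comparison done outside Google Analytics. It is only an illustration: the visit records, the field names "country" and "converted", and both function names are assumptions, not anything exported by GA.

```python
def conversion_rate(visits):
    """Share of visits that converted (0.0 when there are no visits)."""
    visits = list(visits)
    if not visits:
        return 0.0
    return sum(1 for visit in visits if visit["converted"]) / len(visits)


def conversion_by_country(visits):
    """Conversion rate for each country found in the visit records."""
    by_country = {}
    for visit in visits:
        by_country.setdefault(visit["country"], []).append(visit)
    return {country: conversion_rate(group) for country, group in by_country.items()}
```

    With a handful of converting UK visits buried under a pile of non-converting US visits, conversion_rate on everything looks poor while the UK entry in conversion_by_country looks healthy, which is exactly the trap described above.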

    Segmenting by keywords:

    This one is quite obvious for most of the SEOmoz community, I guess. AdWords is probably the most transparent ad network, and you can easily see which keywords convert and which do not. The ROI is quite straightforward to measure.

    Internal Search is quite interesting to segment as well: It is very important to understand which visitors use internal Search, and if those people convert more than average! Very easy to set up on Google Analytics:

     GA Graph

    Quite insightful, isn’t it? On the site above, the conversion rate is 5-6 times higher for visits with Site Search than visits without Site Search!   

    Content Segmentation (very useful for Linkbait analysis)

    Let’s say you made very good linkbait content which went viral and gave you a lot of inbound links. As everyone knows, the SEO benefit is amazing, but did the post on its own lead to conversions? Did visitors sign up to your blog or newsletter after finding out about your site? How could we measure this? Well, nothing could be easier. You can segment your data to show only visits and visitors which had your linkbait page as an entry page:

    Entry Page Segment

    That way you can quickly appreciate the value of your linkbait (excluding the SEO benefit). I have a small online marketing blog where I wrote a post on how to track spiders with Google Analytics some time ago, and I can see very easily with that kind of segment what the visitors did after reading that, and where they were coming from.

    traffic source

    Segmenting by Visitor behaviour: 

    I love this one as it gives you very good insights and reveals a lot about how your site performs. For example, I have an e-commerce website which has a conversion funnel of 5 pages. In that case, we can consider that a visit with 5 page views is an “engaged visit”. It is very important to see how those visits behave on your site against the overall traffic. Here is the formula:

    Behavior analysis

    How about we compare that with the overall traffic?

    engagement visits

    The number of page views depends on what you want to measure and which industry you’re in. Segments of that sort are very good indicators of engagement on your site. In the example above, we can clearly see that there is a lot of room for improvement.

    I only described 4 ways here to segment your data (which I hope you enjoyed). There are so many other ways to get insights with segmentation. You can, for example, have more than 1 dimension in your segment. Example: I want to see only visits coming from the UK, with more than 15 page views, which stayed on site less than 6 minutes (tricky, isn’t it?).
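
    A multi-dimension segment is really just several conditions that all have to be true for a visit. The sketch below shows that idea on plain visit records; it is not the Advanced Segments feature itself, and the field names ("country", "pageviews", "seconds_on_site") and function names are hypothetical.

```python
def segment(visits, *conditions):
    """Keep only the visits for which every condition returns True."""
    return [visit for visit in visits if all(cond(visit) for cond in conditions)]


def uk_heavy_but_quick(visits):
    """The example above: UK visits, more than 15 page views, under 6 minutes."""
    return segment(
        visits,
        lambda v: v["country"] == "UK",
        lambda v: v["pageviews"] > 15,
        lambda v: v["seconds_on_site"] < 6 * 60,
    )


def engaged(visits, min_pageviews=5):
    """The "engaged visit" segment: at least min_pageviews page views."""
    return segment(visits, lambda v: v["pageviews"] >= min_pageviews)
```

    Because each dimension is its own small condition, you can mix and match them without writing a new filter for every combination.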

    Segmentation is a wonderful topic which leads to another huge subject in web analytics: Key Performance Indicators. That will be my next post if this one is published (I’m thinking of a KPI Cheat Sheet). In the meantime, I would love to have your feedback and hear about your experiences with segmentation.

    Online Marketing News - 2009 - Creative Commons 3.0