Posted by randfish
What a year! From traveling to software development, saying goodbye to old friends and growing the team with new ones, we’ve had a tremendously exciting 12 months at SEOmoz. To celebrate, next week, on Wednesday, January 6th 2010, we’ll be hosting an informal meetup at the Elysian Brewery on Capitol Hill in Seattle, WA. Everyone from the Seattle technology, startup and SEO community is welcome to attend, and we’ll be hosting a special guest, Distilled’s Will Critchlow (who’s chosen the worst possible time, weather-wise, to visit our fair city). Please RSVP via the Google form below!
In addition to the meetup, I thought it would be appropriate (and fun) to celebrate the year with a look back in pictures. Enjoy!

SEOmoz’s Mel Gray, Matt Heilman, Gillian Muessig, Nick Gerner, Sarah Bird & Mike Thompson at Seattle’s Big Climb Event, raising money for the Leukemia & Lymphoma Society

The SEOmoz December holiday party at Olive8 - (Left to Right) Arden, Jimmy, Christine, Sarah, Ben Huff, Timmy, Gillian, Adam, Sam, Jen, Rand, Chas, Kate, Darren, Danny & Nick. Why did we all stuff into the dual showers? Umm… I don’t know. It seemed like a good idea at the time. You can watch our holiday video greeting and more holiday party photos on Facebook.

SEOmoz’s Chas Williams and Sarah Bird won most festive attire at our holiday event.

Tony Adam (BillShrink), smiling next to his SEOmoz Werewolf/Search Spam card at Pubcon Las Vegas in November

Kristy Bolsinger (blog), Kate Morris (blog) & Matt Cutts (blog) at the SEOmoz Werewolf Party at Pubcon Las Vegas

Ben Hendrickson and Jen Lopez, attired in full moz regalia, carrying "link juice" by the SEOmoz booth at SMX Advanced in Seattle

Ben, Danny (with a mustache! - he’s hidden so look real close), Chas, Scott & Timmy at lunch downstairs from SEOmoz’s offices at the Elysian Brewery on Capitol Hill

Sarah Bird hard at work in our cramped conference room

Sometimes, when we have tough decisions to make and could go either way, we Roshambo. I lost this round, and we ended up spending $5K on some professional services in our search for a new VP of Engineering.

Aimclear’s Marty Weintraub sent us a singing gorilla for the holidays. Tragically, I was out of town, but got to watch the video on Facebook :-)

At the beginning of the year, we had some construction work done on the office to help accommodate new arrivals

Mozzers hard at work in the conference room (and apparently freezing cold, too).

Ben Hendrickson explains ranking models and how we can "prove" H1 tags don’t really matter for SEO

Rand, Sarah, and SEOmoz board member & investor, Michelle Goldberg at The Naked Truth (a startup event in Seattle). Leaning on my shoulder is Mystery Guest, who tragically forgot sunglasses (why didn’t I give her mine?!)

The Conversion Rate Experts squirrel (yes, they have a mascot) at the SEOmoz/Distilled London PRO Training Seminar in October. Must check on progress of the SEOmoz Ring-tailed Lemur mascot costume.

Jon Kelly (Quinstreet), Tony Adam (Billshrink), Andy Liu (BuddyTV) and Neil Patel (Quicksprout) at SEOmoz’s annual party after SMX Advanced in Seattle at the Garage (photo-bombing courtesy of Matt Cutts)

Rand on Hubspot TV with Mike Volpe in Hubspot’s Boston offices (Rand: "My grandparents asked what channel I was going to be on.")

Rand is subsumed by Kristjan Mar Hauksson’s (of Nordic eMarketing) gigantic Viking hands in an Icelandic ice bar in the capital, Reykjavik, following RIMC 2009

Dixon Jones (Receptional), Adam Lasnik (Google) & Rand go glacier hiking in Iceland

Rand at Searchfest Portland with Anne Kennedy (BeyondInk) and Adam Audette (Audette Media) speaking about SEOmoz’s history & future (apparently I was a bit more animated than most other folks) :-)

On a panel at SES London chaired by Mike Grehan (SES), Rand pictured with Brett Tabke (WebmasterWorld), Chris Sherman (Third Door Media), Jill Whalen (HighRankings) and Kevin Ryan (WebVisible)

Outside the Chicago Hilton for SES Chicago with Richard Zwicky (Enquisite), Bill Leake (Apogee), Aaron Kahlow (OMS)

Jane Copland (Ayima), Danny Dover, Rand & Richard Baxter (SEO Gadget) in London following the Distilled/SEOmoz PRO Training Seminar

Mystery Guest gives Rob Kerry (Ayima) a gift in London on our way back from lunch near the Ayima offices. ("Why is my love always a source of linkbait?" - MG)

Rand & Will Critchlow (Distilled), standing under their respective time zone clocks in Distilled’s London offices.
Alexander Holl (blog), Rand, Sandra & Matthew Finlay (Rising Media), Marcus Tandler (Mediadonis) at an SMX Munich party

Rand with Vanessa Fox (NineByBlue) & Mystery Guest in Bled, Slovenia for a day trip following SMX Munich

Rand & Mystery Guest join Nirav Tolia (Fanbase) for lunch in San Francisco during one of Rand’s VC fundraising expeditions to the valley

Bob Rains (blog), Lawrence Coburn (Rateitall), Lauren Vaccarello (Salesforce), Todd Malicoat (Stuntdubl) and Donna Rains in a limo during a (loosely SEO related) wine tasting trip in Monterey, CA

Laura Lippay (blog), Mystery Guest, Vanessa Fox (NinebyBlue), Lauren Vaccarello (Salesforce) & Jessica Bowman (SEM in House) in San Francisco following the Jane & Robot conference

Rand, Tom Critchlow (Distilled), Ken Jurina (Epiar), Dharmesh Shah (OnStartups & Hubspot), David Mihm (blog), Matt Brown (Define Search Strategy), Danny Dover & Nick Gerner at the SEOmoz PRO Training Seattle

Mystery Guest made homemade retro Star Trek outfits for Halloween this year (and got a wig + Vulcan ears to complete her ensemble)

Rand with his grandparents, Si & Pauline Fishkin at a Broadway musical following SMX East in New York City

Rand & Cindy Krum (Rank Mobile) tour Soho during SMX East in New York City

Left to Right: Rand, Greg Boser (3Dog Media), Barry Smyth (BSocial), Stephen Pavlovich (Conversion Rate Experts), Rob Kerry (Ayima), Aidan Beanland (Yahoo!7), Michael Motherwell (MMIT Search Australia), Bruce Clay (Bruce Clay, Inc), Greg Grothaus (Google)

The SEOmoz whiteboards in our conference room, showing off early concepts of new software (codenamed "Turbomoz") we’re hoping to launch this coming June

Ciaran Norris (Mindshare) was interviewed by Channel 4 in the UK on social media, search & Rupert Murdoch’s threats to shut off Google traffic. Tragically, he appeared garbed in naught save rags, and couldn’t be bothered to properly attire with a cravatte. Credit to Jane Copland for the image capture.
Rand, with a traditional Colombian hat, a gift from Gustavo Parra (at right) pictured at SMX Advanced Seattle

The SEOmoz crew outside the Garage following our party at SMX Advanced

David Temple (SEM Scholar), Gillian Muessig and Barry Smyth (BSocial) at SMX Singapore

Jen Lopez at SMX Advanced with Michael Gray (Wolf Howl)
Oh, and just FYI, the photos above are in no particular chronological order.
Posted by randfish
Historically, I’ve been fairly narrow in what I read in the blogosphere and tech arena (almost all SEO-centric stuff). You can see my Firefox sidebar list here, which hasn’t changed much since 2008 with the exception of the blogs and news sections. But, over the past 6 months, I’ve been broadening out considerably and found that it adds a great deal to the conversations I’m able to participate in and contribute to, especially as SEOmoz itself has expanded from the SEO world to the larger technology and startup world. For the New Year, I thought I’d share some of the sources that have contributed most on this front and some of my favorite posts/contributions from those sources.
I find more good stuff here than anywhere else, and the diversity is impressive, too. Tragically, Hacker News is also a place for lots of misinformation, fear, and loathing around SEO, but it’s good to get a sense for how the rest of the technology world still views our niche. The signal to noise ratio is higher than on places like delicious/popular, the tech subreddit or Digg (which has become largely useless to tech professionals as it’s moved away from its roots).
A few items I’ve found via Hacker News include:
Fred writes compelling pieces consistently, almost never gets preachy, is self-promotional in a highly credible and useful way and brings up topics I wouldn’t have thought about without him. Most of us can’t have Fred on our boards or as an investor, but we can get into his head via his blog, and participating more in the comments there has been a priority of mine for a while (he’s built a remarkable community in the comments).
Some favorite posts:
Chris, like Fred, delivers crystal clear value propositions with his posts. And IMO, he’s even higher signal to noise than Fred. I don’t always agree with him on everything, but I like the way he thinks about problems, I like the ones he brings up and I think he has his finger intensely on the pulse of what startups and technologists (and technical marketers like SEOs) are thinking about and dealing with. It’s a pleasure to see a new post from Chris - here’s to hoping he makes many more in 2010.
Some favorites include:
Techmeme is an obvious choice, but it’s also critical to the list. If it weren’t for Techmeme, I’d have to wade through ReadWriteWeb, Mashable and Techcrunch post-by-post, every day. Don’t ever leave us, Gabe.
No specific posts here - there’s far too many to name, and the site updates much too quickly for me to even recall all the great stuff I’ve found here. However, I will say that I highly recommend m.techmeme.com for mobile browsing. It’s been a joy to scroll through every time my wife takes extra-long in the dressing room at Anthropologie.
(http://answers.onstartups.com)
Launched just this past October, Answers On Startups has become a haven for learning more about the challenges, issues and questions entrepreneurs face in the technology world. I’ve recommended it before, and early on participated heavily (and I’d like to do more of that in the future), but if you’re seeking answers from highly authoritative folks in a scalable fashion, this is the spot. I’m really impressed by the quality of many contributions there - the signal to noise is pretty exceptional.
Some of the best include:
In my ideal world, 5 years from now, when I’ve been put out to pasture by someone smarter and more capable, or bought out :-) I’d have a blog like this. Some entries are just links, some are lengthy and thoughtful and all are interesting and worth reading. Author John Gruber also brings a remarkably diverse range of topics to the site and yet somehow, signal to noise remains high.
A few recent picks:
A few of Steve’s posts are not only relevant, but serve to actually change direction in the executive ranks here at SEOmoz. That’s high praise, but if you read the blog, you’ll see what I mean. Steve’s been there, and his experiences run in shocking parallel to the issues we face or worry about on a regular basis. Even when I disagree with points, the logic and thought he puts into the post makes for a great read and a hard think.
Some of his best:
(http://www.nytimes.com/gst/mostemailed.html)
Despite the financial and institutional problems they face, the NYTimes still puts out absolutely phenomenal content on nearly every area of life. From cooking to politics, travel to health, there is amazing material to be found in the Grey Lady, and the Most Emailed list is the place to find the best of the best.
Some favorites:
When I was out trying to raise a second round of VC this summer (big mistake - more on that in a future post), Venturehacks’ historic content was invaluable. However, visiting the site made me realize how much good stuff there is that doesn’t apply only to those currently raising money. They’ve got some seriously great writers/contributors and invaluable interviews, and they tackle tough subjects.
My personal favorites recently included:
Since they don’t publish archives (the most frustrating feature), I’m unable to show off just how cool this site is and has been over the last few months, but just try visiting a couple times a day for the next few weeks and you’ll see. It’s remarkable how much good stuff gets re-tweeted (and how much junk - signal to noise is about 15%, which is still decent since it’s easy to skim and consume at will). You can also get a sense for how important Twitter’s link graph is to the engines through Twittersphere - a lot of pages that have 0 links will have thousands of tweets pretty fast.
Posted by Danny Dover
Update: Based on some excellent feedback in the comments (Seriously, thank you everyone!) I have updated the post with some clarifications and more added data. Specifically, I added a diagram of the page setup and removed a confusing comment I made about Javascript links.
As a result of this shift, we have been running more tests and analyzing more data. Before I get into the topic of our latest test results, let me provide some important points to establish context.
The Experiment
We chose the following five PageRank sculpting methods to test:
Rel='nofollow' - The standard mechanism for nofollowing a link. <a href="http://www.example.com" rel="nofollow">example</a>
Link Consolidation - Consolidating low priority pages. You can read more about link consolidation here.
Iframe - Include a standard link in an iframe that is blocked via robots.txt or meta robots so engines can’t follow it.
Javascript - An external Javascript file (blocked from robots) that inserts links into divs when the page renders.
Control Case - Null test with standard links.
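The iframe and Javascript methods both depend on the framed page or script file actually being blocked from crawlers. As a rough sketch, Python's standard `urllib.robotparser` can confirm that a given robots.txt blocks those resources. The robots.txt rules, paths and function name below are hypothetical examples, not the ones used in the experiment.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks the iframe source directory and the
# external script that writes the test links into the page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /frames/
Disallow: /js/sculpted-links.js
"""


def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/frames/links.html", "/js/sculpted-links.js", "/index.html"):
        url = "http://www.example.com" + path
        print(path, "blocked" if is_blocked(ROBOTS_TXT, url) else "crawlable")
```

This only checks the robots.txt rules themselves; it says nothing about how any particular engine treats links inside blocked resources.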
Page Setup
We then built five standardized websites that used these different methods (one used iframes for its test links, another one used Javascript for its test links, etc.) and included one normal link with the anchor text of a phrase that was completely unique on the Internet.
Each website in the experiment used the same template. Each keyword phrase was targeted in the same place on each page and each page had the same number of images, text and links.
The standardized website layout contained:
[Diagram of the page setup]
Please note that the above example was NOT actually used. I provided a fake example to maintain the integrity of the testing platform for future tests.
The experiment variables were:
We then did everything we could to make sure that all of these pages received the same amount of link juice from external sources.
The null result would be a random assortment of experiment types ranking in the SERPs.
The alt result would be one experiment type outranking all of the others.
Redundancy
We then duplicated this experiment eight times in parallel. This meant 40 different domains, 40 different IP addresses, 8 different WHOIS records, 8 different hosting providers and 8 different payment methods. (We then went outside and drank)
We ran this test for 2 months.
The Results
| PageRank Sculpting Method | Average Rank in Google |
|---|---|
| Nofollow | 2.4 |
| Link Consolidation | 3.0 |
| Iframe | 3.1 |
| Javascript | 3.2 |
| Control Case | 3.2 |

| Rank | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 | Test 6 | Test 7 | Test 8 |
|---|---|---|---|---|---|---|---|---|
| 1. | nofollow | nofollow | control | nofollow | consolidation | iframe | nofollow | control |
| 2. | javascript | iframe | javascript | consolidation | iframe | consolidation | consolidation | iframe |
| 3. | consolidation | javascript | nofollow | iframe | nofollow | control | control | javascript |
| 4. | control | control | consolidation | javascript | javascript | javascript | javascript | nofollow |
| 5. | iframe | consolidation | iframe | control | control | nofollow | iframe | consolidation |
As you can see, the nofollow method ranked an average of about 0.8 places higher in the SERPs than the control result (2.4 vs. 3.2). This is significant when you realize the total is out of 5.
It appears that the iframe method and link consolidation were slightly effective but the margin was so small that they could be attributed to error.
The Javascript method did not work at all.
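The average ranks can be reproduced from the per-test rankings. Here is a minimal Python sketch, with the rankings transcribed by hand from the table of tests above; the dictionary layout and function name are only for illustration.

```python
from statistics import mean

# Finishing order in each test, position 1 first, copied from the results table.
RESULTS = {
    "Test 1": ["nofollow", "javascript", "consolidation", "control", "iframe"],
    "Test 2": ["nofollow", "iframe", "javascript", "control", "consolidation"],
    "Test 3": ["control", "javascript", "nofollow", "consolidation", "iframe"],
    "Test 4": ["nofollow", "consolidation", "iframe", "javascript", "control"],
    "Test 5": ["consolidation", "iframe", "nofollow", "javascript", "control"],
    "Test 6": ["iframe", "consolidation", "control", "javascript", "nofollow"],
    "Test 7": ["nofollow", "consolidation", "control", "javascript", "iframe"],
    "Test 8": ["control", "iframe", "javascript", "nofollow", "consolidation"],
}


def average_ranks(results):
    """Map each method to its mean finishing position across all tests."""
    positions = {}
    for order in results.values():
        for rank, method in enumerate(order, start=1):
            positions.setdefault(method, []).append(rank)
    return {method: mean(ranks) for method, ranks in positions.items()}


if __name__ == "__main__":
    averages = average_ranks(RESULTS)
    for method, avg in sorted(averages.items(), key=lambda item: item[1]):
        print(f"{method:<14}{avg:.3f}")
```

Before rounding, nofollow averages 2.375 and the control 3.25. If the ordering were purely random, each method would be expected to average 3.0, the midpoint of positions 1 through 5.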
The Bottom Line
Despite what the search engine representatives say, nofollow is still an effective way to sculpt PageRank. If you have nofollow sculpting already installed, don’t remove it. If you don’t have it installed, implementing it probably won’t make a drastic change but we encourage you to test this when it is responsible to do so.
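Before testing or changing anything, it helps to know which links on a page already carry nofollow. A minimal sketch using Python's standard `html.parser` follows; the class name, function name and sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Sort a page's <a href> links into followed and nofollowed lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return  # named anchors and similar have no link target
        # rel can hold several space-separated tokens, e.g. "external nofollow"
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def audit_links(html):
    """Return (followed, nofollowed) href lists for an HTML string."""
    auditor = LinkAuditor()
    auditor.feed(html)
    return auditor.followed, auditor.nofollowed


if __name__ == "__main__":
    sample = (
        '<a href="/products">Products</a>'
        '<a href="/privacy" rel="nofollow">Privacy</a>'
        '<a href="/login" rel="external NoFollow">Log in</a>'
    )
    followed, nofollowed = audit_links(sample)
    print("followed:", followed)
    print("nofollowed:", nofollowed)
```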
Posted by great scott!
Welcome back to our second installment of this very special WhiteBEARD Friday! Last week Rand Fishclause discussed how the new school way to get links is to give back to webmasters. That’s right, you’ve gotta give a little to get a little. This week, in the spirit of Searchmas©, we’re giving you 12 examples of sites that exemplify this new model.
From video hosting, to awards, to social profiles, and many more, we hope you’ll come away with some great ideas about what you can do to provide outward value to the linkerati and get a whole lotta link love back in return.
SEOmoz Whitebeard Friday - 12 Link Strategies of Searchmas from Scott Willoughby on Vimeo.
From all of us here at SEOmoz, thanks for joining us every week for our 2009 season of Whiteboard Friday, and for being part of one of the most vibrant, fun, and talented communities on the web. Your participation and readership really means the world to us, and we can’t wait to share 2010 with you. Until then, happy holidays :)
Posted by randfish
Earlier this month, Google launched personalized results by default for all users. SEOs should have already read Danny Sullivan’s analysis of the shift (which is quite excellent) and I also suggest checking out David Harry’s Guide on the topic. Sadly, despite some good advice, it appears that a lot of folks are still worried that this is somehow the "end of SEO" or demands a "completely new look at SEO practices." Let’s do a brief analysis:
The big takeaway here is that these action items aren’t particularly groundbreaking. We should have been doing all of these as responsible, effective Internet marketers anyway.
No. I’m maintaining my previous stance that unless a shift from Google fundamentally changes the classic SEO process:
It doesn’t qualify as a "tectonic" or "massive" or "fundamental" change in SEO. The best practices we’ve been recommending to clients, developers and content creators for the last half-decade are actually less impacted by this change than by some of the other items we’ve encountered recently (Bing + Yahoo! combining, real-time results at the top of query results, more vertical results in the SERPs, etc.). These latter examples call for much more active changes, learnings and direct action on the part of SEOs vs. personalization, which by-and-large just strengthens the reasons for best practices we’ve long known to exist.
p.s. Tomorrow evening at 6pm (Tuesday Dec. 22nd), I’ll be attending an informal SEO meetup in San Diego, CA at the Gordon Biersch Brewery in Mission Valley - 5010 Mission Center Road San Diego, CA 92108. Hope to see some of you there before the holidays!
Posted by RobOusbey
Amongst the add-ons I add to any new install of Firefox is the Web Developer Toolbar by Chris Pederick. (Find the install links at the bottom of this post.)
Obviously, this add-on is chock-full of features that are useful for web developers, but it really does make diagnosing various SEO issues much easier. This list gives the top seven tasks that I find easier when the toolbar is installed.
By turning off JavaScript and Cookies, you can browse the web as it’s seen by ‘bots (which in most cases can’t accept cookies or execute JavaScript). This basic change can help you recognise site architecture issues pretty quickly, such as when a main navigation bar is displayed using JavaScript or when visitors who can’t accept cookies always get redirected to the front page. (Yes, I’ve seen both of these in the wild.)
For a more hardcore spider-emulation experience, use the Toolbar to turn off styles and images. The sudden appearance of previously cloaked text or seeing that the ‘main heading’ is actually an H4 item and sat 75% of the way through the content might suggest why a particular page is having issues.
Although different spiders treat meta redirects in different ways, it can often be easier to diagnose some on-site issues if you disable them altogether via ‘Disable → Meta Redirects‘. To see what the site serves up to different user agents (such as mobile devices, GoogleBot, etc) you’ll want to get the author’s other successful add-on, the user-agent switcher.
Talking of page structure, you can press ‘Information → View Document Outline‘ to see the structure of a page, or simply ‘Outline → Outline Headings‘ to see the hierarchy of headings within the page.
The toolbar gives quick access to code validation tools (such as the HTML, CSS and RSS validation from W3C). There are also options to highlight links without title attributes, or images with missing (or blank) alt attributes.
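The heading outline and missing-alt checks can also be approximated in a script when you need to run them across many pages. This is only a rough sketch using Python's standard `html.parser`, not how the toolbar itself works; the class and function names are illustrative.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PageOutline(HTMLParser):
    """Collect headings (level, text) and images with missing or blank alt."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.images_missing_alt = []
        self._level = None  # heading level currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []
        elif tag == "img":
            attributes = dict(attrs)
            if not (attributes.get("alt") or "").strip():
                self.images_missing_alt.append(attributes.get("src"))

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline(html):
    """Return (headings, images_missing_alt) for an HTML string."""
    parser = PageOutline()
    parser.feed(html)
    return parser.headings, parser.images_missing_alt


if __name__ == "__main__":
    page = '<h1>Widgets</h1><img src="a.png"><h4>Main <em>heading</em></h4>'
    headings, missing = outline(page)
    for level, text in headings:
        print("  " * (level - 1) + f"h{level}: {text}")
    print("images without alt text:", missing)
```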
Those of us with our massive screens (by the way, did you see this guy?) might not always appreciate how people view our pages. However, a quick click on the ‘resize’ button lets you see the site through the viewport of an older monitor or a netbook.

A change we’ve tried to make at Distilled recently is to include more illustrative images in our client reports. A fiddly task that comes up from time to time is creating a screen shot of a web page, but without it being obvious which links you’ve already clicked on. A quick click on ‘Miscellaneous → Visited Links → Mark All Links Unvisited‘ removes the ‘visited’ styles from any links on the page.
A year ago, I posted about how to hide your referrer string when browsing, as a handy way to prevent people seeing that you’re probing their site. It’s much easier to do with the Web Developer Toolbar, by simply clicking ‘Disable → Disable Referrers‘.
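Several of the checks above (no JavaScript, no cookies, a different user agent, no referrer) can be roughly reproduced outside the browser. Here is a minimal sketch with Python's standard `urllib`, which by default executes no scripts, keeps no cookies and sends no Referer header. The function names are illustrative, and sending a crawler's user-agent string only shows what a site serves to that string; it does not make the request come from Google.

```python
from urllib.request import Request, build_opener

# Googlebot's published user-agent string.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def crawler_style_request(url, user_agent=GOOGLEBOT_UA):
    """Build a request with a chosen user agent and no Referer or Cookie."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Fetch url and return (status, final_url, body_bytes).

    The default opener has no cookie handler, so cookies set by the site
    are never sent back. Redirects are followed, and final_url shows
    where the request ended up.
    """
    opener = build_opener()
    with opener.open(crawler_style_request(url, user_agent), timeout=timeout) as resp:
        return resp.status, resp.geturl(), resp.read()


if __name__ == "__main__":
    status, final_url, body = fetch("http://www.example.com/")
    print(status, final_url, len(body), "bytes")
```

Comparing `final_url` with the address you requested is a quick way to spot the "visitors without cookies get redirected to the front page" problem mentioned earlier.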
You can read more about the Web Developer Tool Add-On, or if you’re running Firefox, simply install it now.
If you’re already a convert to this add-on, do let us know in the comments of any other features you use regularly.
Posted by great scott!
Ho-ho-ho! Merry Winter to you! In a very special Whiteboard Friday we’ll look at the new model for attracting lots of inbound links: giving back to webmasters. Nowadays it’s not always enough just to have great content. You’ve got to give the linkerati value–something that will incentivize them to link to your site. Rand Fishclause discusses how this new model works and then, next week, we’ll give you 12 link strategies of Christmas just in time for you to open them under your tree and put into action for the New Year.
SEOmoz Whitebeard Friday - Give and Ye Shall Receive from Scott Willoughby on Vimeo.

Just a quick reminder that today is the final day to get the new Advanced SEO Training Series: Tips, Tricks & Tactics at the special launch pricing of 20% off + free shipping!
Posted by randfish
First off, apologies for my absence from the blog these past few days. It’s been an incredibly busy time, trying to wrap things up before I leave for San Diego over the holidays. So much for a December lull… In this post, I’m going to try tackling a lot of the recent trends we’ve been observing from the engines and talk about my personal perception of what’s to come over the next 12 months.
Microsoft initially beat Google to the punch in announcing their integration with Twitter data in their SERPs. And in response, last Monday, Google released what is, in my opinion, an early test version of Twitter integration that’s nowhere near ready for prime-time. Google has a history of jumping the gun to prevent other companies from stealing the press narrative, but in this case, I think it’s seriously damaging their usability and relevance (and nearly everyone, consumer or search enthusiast, agrees).

As Danny Sullivan notes, it’s like we’re back to Infoseek in 1997. If you want to rank #1, don’t worry about quality content, relevance or popularity, just be the last person to Tweet about a topic and you’ll come out on top (at least, for a few seconds).
This is, in my estimation (and many others’), the worst implementation of new results Google’s ever launched. I imagine the clickthrough and abandonment stats have their usability folks up in arms already, and it’s only to preserve face from a PR perspective (as well as an increasingly prideful attitude of "Don’t like it? So what are you gonna do about it?" that Aaron Wall describes in a gutting fashion here) that this has stayed in place as long as it has (1.5 weeks).
In 2010, I think this fades away. Perhaps not entirely, but we won’t be seeing it for nearly as many queries with the prevalence we do today. Google may love real time, and it’s certainly gotten them a lot of press (though very little of it is entirely positive), but they can’t continue sacrificing quality for PR in this fashion. I think the engineers still run things over there, and the stats data is already making them balk. Although I don’t have numbers, my impression is that we’re already way down in the quantity of queries showing real time results compared to last week.
All that real-time integration bashing aside, I’m a firm believer in my original hypothesis that Twitter is cannibalizing the web’s link graph. In fact, I think a rough history of "recommendation sources" looks something like:

Google has always strived to keep up with the latest ways that content is being recommended and suggested. It’s how they determined popularity and relevance with PageRank and I think Twitter’s data is merely the next evolution. Just yesterday, they launched their own URL shortening service (I think this was more to get data, but it’s also possible it was a pre-emptive PR strike against bit.ly, who launched their PRO service just a day later).
Google’s not going to just take raw number of tweets or re-tweets. I think we’re already seeing the relevance and reputation calculations in their decisions of which tweets and sources to show in the real-time results, and I expect that algorithms/metrics like PageRank, TrustRank, etc. will find their way into how Google uses the real-time data. Today, SEOs want to turn tweets into links so they can get SEO benefit. My feeling is that tweets are going to carry their own weight in helping pages rank in the not-too-distant future.
Unlike real-time’s temporal nature in the results, I think personalized search is here for the long haul. Google released their "permanent" personalization of results last week, and Bing released their own just this week. As usual, SearchEngineLand’s coverage is impeccable, though one big question remains in my mind:
What metrics impact personalization?
Is it merely clickthroughs from the organic results? Does visit history play a role? Or clicks from other vertical search services Google offers? What about clicks from paid search ads - either in the SERPs or from AdSense/DoubleClick?
I’d love to see experimentation done on this front so marketers have a better idea what they’re dealing with. If it’s proven that you can get organic benefits by attracting PPC clickthrough, this may be the new "paid inclusion" for 2010, and could drive bid prices up massively as companies compete not only for paid listing clicks, but for the chance to earn "organic" positioning as well.
Personalization means a few things for SEOs, but it doesn’t fundamentally change the game, IMO:
Whenever we encounter these "paradigm changing" events in the SEO world, I like to go back to my philosophy about SEO fundamentals. From what I can see, it looks like things haven’t changed enough yet to warrant panic. It’s been a massively dynamic 3 months, but we’re not on the precipice of anything that’s going to shift SEO in the ways some previous "game-changers" have.
The latest figures suggest that Google continues to slowly gain market share in the US, while Bing & Yahoo! compete for share that will eventually belong to them both (once the regulatory hurdles clear, which I think they will). I believe that a year from now, most webmasters will be looking at a scenario where Comscore/Hitwise reports Binghoo! has ~25-28% market share, but those engines combine to send a little under 20% of all search traffic (remember that they count searches on all Microsoft and Yahoo! properties - even internal searches - while Google tends to send the vast majority of their search traffic externally to other sites).
Tragically, everything I hear out of Yahoo! and Bing is that Site Explorer is off to the great beyond. The expense of maintaining a web index isn’t something Yahoo!’s willing to invest in once they don’t have to, and Bing’s given no indication that they’re going to re-open the portal to link information. The best we can hope for is an acceleration in the functionality offered by Bing Webmaster Tools, but even that’s unlikely to offer competitive link intelligence.

I’m guessing other services will rise up to try to take Site Explorer’s place, as the service had millions of monthly queries run against it.
Forrester put out a great report on US Interactive Marketing Spend (a little pricey at $1749, but interesting). Two graphics struck me as particularly compelling:

SEO trails only social media and online video as places where marketers (not just search marketers, but ALL marketers) will be shifting dollars.

Meanwhile, SEO continues to outpace PPC in terms of CAGR. We’ve still got a long way to go before balance is established between the share of clicks SEO commands and the fraction of spend it receives, but the gap is slowly closing.
If I were doing another startup today, it would focus on software for conversion rate optimization. I think this is still one of the most under-utilized and highest-ROI activities in the marketing department, but more awareness is on its way. CRO isn’t just about testing; it’s about building a process for improving conversion over time. Online businesses can generate so much revenue from this, yet few invest. I think 2010 is the year, simply because it’s an inflection point for companies to assess their spend and where they derive value. These guys are likely in for a blockbuster year; I wish I could invest :-)

This graphic comes via my post on choosing which Internet Marketing Channel to Pursue.
Google & Bing are both doing more to make their visitors stickier and get their queries answered without ever having to leave the engine. This is a good product practice for both companies, and I’m surprised Google’s taken so long to move away from their "get people off Google" point-of-view, but it’s definitely happening. Check out some recent examples:

Everything I need to know is right there - the last game score, the record, the opponent, their next match day and time. The only thing missing? What channel it’s playing on in my area.

I don’t even have to complete my query! Google’s got that weather report sitting in the suggest box. They wrote about this feature, which launched last week, here. Google O/S had another good post on the topic.

Thankfully, I’m not actually headed to Kodiak, but those results are pretty spiffy, and are likely to prevent me from needing to visit Alaskaair.com and get that flight info.

The customer service number is something Bing’s started to provide more and more (though there’s one company even they don’t have that data on). With Fedex, you don’t even need to leave Bing to track a package (Google also offers similar functionality).
My perception is that the more the engines can apply "instant answers" to search queries, the more they will, and the less any other sites will see traffic from those queries. It’s a better user experience this way, and I’m certain it’s one of the biggest things that engenders loyalty and return queries - something both engines are desperately competing for.
This post isn’t intended to be one-sided, and I’d love to hear from you - do you agree? Disagree? Think I’m out of my head? Let everyone know :-)
Posted by Tom_C
This past week saw the launch of Google’s real-time search and quite frankly everyone flipped out. And justifiably so: it’s not often that our SERPs get torn up so much in a new way like this.
Questions I’d love to see the answer to are things like:
Unfortunately, I think it’s a bit early to have answers to questions like this, so rather than tackle these questions I’m just going to talk a little bit about how you can go about tracking the impacts of real-time search results on your industry.
Does real-time search affect my industry?
The answer is probably yes. For search terms that have hardly any tweet-volume I’ve already seen examples where literally one or two tweets can generate a real-time one-box. Sometimes even for the brand name term. That means that more or less any breaking news in your industry will generate some level of real-time results.
But what about other industries? After all many of us will be working on sites that target keyphrases that people DO tweet about. For us, the focus is on trending search terms. The key thing is to identify the types of keyphrase that might feature real-time search results. The most useful way of doing this that I’ve found is to monitor twitter volume and in particular monitor peaks and troughs in volume. Trendistic will do this nicely for you. The first neat thing from Trendistic is that you can see a long list of hot topics by day in the archive:
How Do I Track Real-Time Traffic?
The second nice thing from Trendistic is the ability to query individual terms and see when peaks and troughs occurred over time, for example here’s a snapshot of the [eagles] term (nice win Eagles!):
By using a service like this you can query the historic tweet volume and take an educated guess at when real-time search might have been triggered. By doing this for your main search terms you can start to understand things like strange traffic drops or spikes that might have been caused by real-time one boxes hanging out in your SERPs.
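If you copy daily tweet counts for a term into a spreadsheet or script, a rough way to flag the peaks is to compare each day against the median. This is only a sketch: the dates and counts below are made up, the 3x threshold is an arbitrary assumption, and `find_spikes` is a hypothetical helper rather than anything Trendistic provides.

```python
from statistics import median

def find_spikes(daily_counts, factor=3.0):
    """Return the days whose count is more than `factor` times the median."""
    if not daily_counts:
        return []
    baseline = median(daily_counts.values())
    return [day for day, count in daily_counts.items() if count > factor * baseline]

# Made-up daily tweet counts for one term (illustrative only).
counts = {
    "2009-12-07": 10,
    "2009-12-08": 12,
    "2009-12-09": 11,
    "2009-12-10": 60,
    "2009-12-11": 9,
}
print(find_spikes(counts))
```

Days flagged this way are the ones worth lining up against odd drops or jumps in your search traffic.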
What about if you’re actively engaging in twitter though? If you feel like you might have gained a portion of your search traffic from tweets that were appearing in real-time search results then you should think about tracking those clicks.
Tracking real-time search volume and one-box traffic is a difficult problem, however, and one that isn’t completely solved. That said, here are a few things that might be of use. Firstly, for anyone seeing #-based Google URLs you can actually track clicks from different parts of the page. Looking at the following real-time search for [nexus one]:
I clicked on two different results, the first one was a ‘real’ result that appeared in the real-time box, that is a page that’s been crawled recently and shows up via Google rather than showing up because Google found the result on Facebook or Twitter etc. With the # URLs at Google in action I saw the following full referral path:
http://www.google.co.uk/url?sa=t&source=web&oi=blog_result&ct=result&cd=11&ved=0CBcQmAEwCg&url=http%3A%2F%2Fwww.ccortez.com%2Fhtc-nexus-one-blessed-by-the-fcc-updated%2F&rct=j&q=nexus+one&ei=gComS7LCDZehjAeDwdTOBw&usg=AFQjCNF2939x_yuKVTzL9UlN6m23cw0Kog
Note the "&oi=blog_result" in the referring URL (bold added, obviously). This lets you see any real-time traffic that has come via a crawled blog post. After that I clicked on a twittered URL and got the following:
http://www.google.co.uk/url?url=http://bit.ly/7315xj&rct=j&ei=2yomS4y7NYvNjAfQ3qXfBw&sa=X&oi=microblog_result&resnum=9&ct=result&cd=1&ved=0CD8QoAQoADAI&q=nexus+one&usg=AFQjCNGWb9DkQaPZd2NGuOg6Th7lWd9hsg
Note both the url=http://bit.ly/7315xj and &oi=microblog_result (again, bolded). This allows you to see both where the click came from (a real-time microblog result, i.e. from a site like twitter or facebook) but also the URL that was twittered (in this case the bit.ly link).
These referring URLs will show up in your server logs but unfortunately won’t show up in Google Analytics (since Google treats these all as search queries and so will just dump them in the same place and only let you see the keyword searched for). To get them to show up in Google Analytics you need to set up a profile to show the full referring URL, such as the filter detailed in part 2 of this post.
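If you have the full referrers in your server logs, pulling these parameters out is straightforward. The sketch below uses Python's standard urllib.parse on the two referring URLs quoted above. The function name and the labels are my own, and it assumes the only oi values of interest are the two seen in this post.

```python
from urllib.parse import urlparse, parse_qs

# The two oi values come from the referrers above; the labels are illustrative.
REALTIME_OI_LABELS = {
    "blog_result": "real-time (crawled blog post)",
    "microblog_result": "real-time (microblog)",
}

def classify_google_referrer(referrer):
    """Return (label, clicked_url, query) for a www.google.*/url referrer, else None."""
    parsed = urlparse(referrer)
    if not parsed.netloc.lower().startswith("www.google.") or parsed.path != "/url":
        return None
    params = parse_qs(parsed.query)
    oi = params.get("oi", [""])[0]
    label = REALTIME_OI_LABELS.get(oi, "other")
    clicked_url = params.get("url", [""])[0]
    query = params.get("q", [""])[0]
    return label, clicked_url, query

blog_ref = (
    "http://www.google.co.uk/url?sa=t&source=web&oi=blog_result&ct=result&cd=11"
    "&ved=0CBcQmAEwCg&url=http%3A%2F%2Fwww.ccortez.com%2Fhtc-nexus-one-blessed-by-the-fcc-updated%2F"
    "&rct=j&q=nexus+one&ei=gComS7LCDZehjAeDwdTOBw&usg=AFQjCNF2939x_yuKVTzL9UlN6m23cw0Kog"
)
microblog_ref = (
    "http://www.google.co.uk/url?url=http://bit.ly/7315xj&rct=j&ei=2yomS4y7NYvNjAfQ3qXfBw"
    "&sa=X&oi=microblog_result&resnum=9&ct=result&cd=1&ved=0CD8QoAQoADAI&q=nexus+one"
    "&usg=AFQjCNGWb9DkQaPZd2NGuOg6Th7lWd9hsg"
)
print(classify_google_referrer(blog_ref))
print(classify_google_referrer(microblog_ref))
```

Run over a log file, this gives you a count of clicks per result type along with the keyword and the URL that was clicked.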
Not all users see these # Google URLs, however; most are still seeing the old style search?q= Google URLs. From looking at the traffic for sites where we have the appropriate filter set up I’d say somewhere between 5 and 10% of users are seeing these URLs. This means that you can get this kind of data for a small proportion of your traffic and extrapolate for the other 90% of users. (Btw, does anyone have any more accurate stats on the % of users seeing which search result type? I’ve not seen anything concrete anywhere…)
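The extrapolation itself is simple arithmetic. As a rough sketch, assuming the users who see the # URLs click like everyone else (a big assumption) and taking the 5 to 10% range above at face value:

```python
def extrapolate_clicks(observed_clicks, share_low=0.05, share_high=0.10):
    """Scale clicks seen through the #-style URLs up to an estimated total.

    Returns (low_estimate, high_estimate). A larger assumed share of users
    gives a smaller total, so share_high produces the low estimate.
    """
    if observed_clicks < 0:
        raise ValueError("observed_clicks must not be negative")
    if not 0 < share_low <= share_high <= 1:
        raise ValueError("shares must satisfy 0 < share_low <= share_high <= 1")
    return observed_clicks / share_high, observed_clicks / share_low

# 40 tracked real-time clicks is a made-up figure for illustration.
low, high = extrapolate_clicks(40)
print(f"Estimated real-time clicks: roughly {low:.0f} to {high:.0f}")
```

So 40 tracked clicks would suggest somewhere in the region of 400 to 800 real-time clicks overall, with all the uncertainty that a 5 to 10% sample implies.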
Of course, looking at the example above we see that a fair amount of traffic from micro blogging services actually goes through URL shorteners such as bit.ly. In that case there’s another method you can use to track your traffic. Take a look at the following referral list for this bit.ly URL:
I’m sure over the coming weeks more and more will get said about real-time search but hopefully this has been food for thought!
If you haven’t yet grabbed your copy of our new Advanced SEO Training Series: Tips, Tricks & Tactics DVD series, there’s good news! SEOmoz extended the special launch pricing of 20% off plus free shipping until December 18th. Order your copy now before the offer is gone!
Posted by great scott!
They scrape you, they copy you, you license your content, you need geo-targeted versions of your pages…whatever the reason, duplicate content happens. In this week’s Whiteboard Friday we’ll look at how to deal with duplicate content in ways that will help you make sure you’re the one who ranks for your material (as you should) and what traps to avoid.
SEOmoz Whiteboard Friday - Dealing with Duplicate Content from Scott Willoughby on Vimeo.
If you haven’t yet grabbed your copy of our new Advanced SEO Training Series: Tips, Tricks & Tactics DVD series, I’ve got good news! We’ve extended our special launch pricing of 20% off plus free shipping for another week. This sale price will only be available until December 18th, and then it’s gone for good, so order your copy soon!
Posted by Dr. Pete
You’d have a hard time telling by my posts (let alone my Twitter stream), but I’m supposedly a psychologist or something, so I thought it was time I did a little psychologizing here on the Moz blog. One thing I like to think I’ve learned over the years is the subtle art of persuasion – not the manipulative, why-won’t-my-clients-be-reasonable variety, but the art of communicating in a way that helps promote win-win situations with clients, prospects, and partners.
This post is the first in what could be a series (if you like it) about the art of professional persuasion. Whether it’s your boss, client, prospect, co-worker, or website visitor, your success often hinges on the ability to communicate persuasively.
Every web designer has a version of this story – you work your little fingers to the bone to come up with the perfect design, research your client’s color preferences, industry competitors, and TiVo playlist, finally present your masterpiece to them, and then gasp in horror as they rip your baby to shreds like a pack of wolves on tainted Slim Fast. What happened? Whether you realize it or not, you forced your client against a wall by asking them a Yes/No question:

On the one hand, you have your design, and on the other hand, nothing. Your client can only approve or disapprove. If they approve, great; if they don’t, then they start to do what all people do: rationalize their decisions. On a gut level, there’s something about your design they don’t like, so they look for things to pick apart. You (naturally) get defensive, and it’s all downhill from there.
So, what happens if you give your client two options? You’ve turned a Yes/No question into an A/B question. Instead of "Do you like it?", you’ve made the shift to "Which one do you like?":

Not to over-illustrate what may be obvious by now, but you’ve just asked a Yes/Yes question, and the answer to a Yes/Yes question is almost always "Yes".
I know what you’re thinking, because I thought it for years: isn’t creating two designs a lot of work? Pardon a tangent, but I should say that design is just one example – you can apply this principle to proposals of just about any kind (except maybe the marriage kind – "Will you marry me? How about Chad?").
A designer friend finally turned me on to the secret – take the original proposal and make some modifications you can live with. At first, I have to admit that this seemed like cheating. If you just tweak a couple of colors and fonts and act like it’s a whole new proposal, isn’t that a bit shady? Well, no, and here’s why. First, what amounts to "just tweaking" for you only seems easy because you’re a professional. Second, every one of us, in the process of creating anything, inevitably makes choices along the way. Many times, we make a decision because we have to, but we could’ve gone more than one direction. Revisit those decision points, and use them to generate a second proposal. Ultimately, you’ll be able to present people with options that aren’t too difficult to create and still maintain your integrity.
There’s another worry people have with this approach, and it is justified in some cases, if a bit overblown. What if you present two options, and your target audience mixes and matches in ways you can’t live with? This could be true for designs as well as sales proposals. The complicated answer is that you eventually learn to engineer your choices in a way that makes mixing-and-matching a bit more palatable.
The short answer is: So what? Would you rather have a discussion about how Element B doesn’t fit Site A and have to get creative or have your client tell you why Site A sucks and they don’t want to pay you? If you can get your client to mix-and-match, then at least they’re telling you what they like. Hearing a laundry list of what someone doesn’t like is useless – hearing what they do like gives you options.
So, by my own logic, if two choices are good, how about three or more? More is always better, right?

Sorry, got carried away for a minute there. Unfortunately, more choices won’t necessarily yield more excitement for your target audience. Recent research certainly suggests that there’s such a thing as too many choices. In most cases, 2 options will be sufficient – in some situations, especially where a lot of money is involved or the risk of a bad decision is high, 3 or more choices may be required.
Let your own decision path be your guide. If you naturally encounter points along the creative path where you can’t decide which of two options is better, that may be a good place to diverge and create a second version of whatever you’re working on. If this happens frequently, then 3-4 versions may be natural. Just don’t invent versions for the sake of bombarding your audience with options – the goal is to give people a choice, not overwhelm them to the point of decision paralysis.
I’ve used the website design example to illustrate this concept, but there are many more cases where I think Yes/Yes questions can help you persuade someone in a win/win way:
Of course, never present an option you can’t live with. The whole point is to create a choice that helps you get an end result that’s positive for both you and the client/boss/etc. Get creative, and you’ll be amazed how often a little extra work up front can save you hours of headaches down the road.
Speaking of persuasion, this is where I try to persuade you to check out SEOmoz’s 6-DVD Advanced SEO Training series. The introductory price (20% off + free shipping) has been extended until December 12.
Posted by randfish
I’ve been a big fan of Chris Dixon’s excellent blog for a while now, so you can imagine that I was really excited to see him writing about SEO in a post last week. Chris kindly called out SEOmoz, which humbled me, but he also espoused some thinking in the comments that made me a bit concerned and was the catalyst for this post. Here’s how it went:
RAND: Chris - I think the biggest thing you’ve forgotten to mention is that 70%+ of the weighting/ranking used by all of the engines depends on links. If you’re not thinking about how your content and pages will incent users/bloggers/writers/media/other sites to link to your work, you’ll lose out to someone who does.
A while back I got riled up about the lack of SEO in startup marketing and wrote about it - http://j.mp/4q9zkh - might be relevant/useful, though I did write with a bit more anger than was likely deserved.
CHRIS: Rand - totally agree re links. But isn’t getting links primarily about creating great content?
Read the article you link to btw and am in complete agreement.
RAND: Tragically, at least in my experience, the answer is a resounding no. Great content is easily missed by the web’s link-heavy audience, while some pretty crummy content that’s been marketed well (or made the right connections or comes from the right sources) will tend to overperform.
The web’s link graph isn’t a meritocracy - like everything else in life, it’s a popularity contest. Those who find the best ways to distribute, promote and market their works to the audience most likely to link to it are going to succeed much more so than just the "great content" producers.
Just think of it like politics. The best, most rational, reasoned, intelligent arguments are the exception, not the rule. Instead, the conversation and media attention (and thus, public awareness) is focused on concepts that are easy to grasp, virally distributable (which often puts rumor and innuendo above fact) and fit a compelling narrative (rather than add complexity).
A post on this topic - http://j.mp/4tYThK
I would love to tell Chris that he’s right, that the better the content, the better, higher quality and greater quantity of links that content earns. But, perhaps sadly, that’s not the case. What those in the content world would call "better" does not always (nor even mostly) garner the links and rankings. Instead, those who have "better optimized" for attracting links tend to far outshine their peers with rankings and traffic.
This may seem like a tragedy, or even a travesty of the democratic structure the web is supposed to represent, but in fact, it’s the way all marketing has worked for generations. The "best" restaurants are often family-owned, hole-in-the-wall, never-marketed-themselves joints whose fabulous epicurean creations are a secret to all but the most diligent culinary Clouseaus. Meanwhile, the affront to humanity and cooking that is Olive Garden advertises relentlessly, conducts impeccable market research and appeals to the lowest common denominator in town after town to achieve geographic and market-penetration ubiquity (BTW - my wife is Italian and thus recoils at the very mention of this establishment and the tarnish it’s brought to her beloved countrymen’s kitchens).
Like many parts of life - it’s not about the quality, diligence or aptitude you bring to your field, but your ability to market it successfully. As SEOs, our responsibility is to help the best of the best become the most noticed, most beloved and most linked-to in their field. It’s a strange, almost paradoxical leap of logic, but once you internalize this principle, it gets easier to accept and to spread to your clients and managers.
p.s. I’m also a fan of Chris Dixon’s startup, Hunch - I’d urge you to check it out and try answering a few dozen questions. The results are quite fascinating.
Posted by willcritchlow
Bing recently came out of beta in the UK and we are seeing the beginnings of the advertising campaign to promote it.
For SEOs, however, there is a more immediate opportunity with Bing than hoping it gathers some market share from Google(*). Linkfromdomain is a search operator that is unique to Bing. It returns the pages that are linked-to from a domain. There are obviously other ways of getting this information in raw form (maybe including Linkscape one day, but certainly including Xenu for mid-sized sites), but for large sites especially, it can be really hard to gather it in any kind of usable form.
The usage of linkfromdomain is to search on Bing for something like:
The set of results is generally returned in a similar ordering to a regular search query - with a combination of highly relevant and more powerful results first. Unfortunately, linkfromdomain does not support searches for sub-domains (even www.), so you have to search for linkfromdomain:exampledomain.com.
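If you're running a lot of these searches, a few lines of script can build the query for you. This is a hedged sketch: the helper names are mine, putting a keyword after the operator is my assumption about how you'd narrow the results, and it only strips a leading www. (reducing any other sub-domain to its root domain is left to you).

```python
from urllib.parse import urlparse, quote_plus

def linkfromdomain_query(site, keywords=""):
    """Build a linkfromdomain: query, dropping any scheme, path and leading www."""
    host = urlparse(site if "//" in site else "//" + site).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    query = f"linkfromdomain:{host}"
    if keywords:
        query += " " + keywords
    return query

def bing_search_url(query):
    """Turn a query string into a Bing search URL."""
    return "https://www.bing.com/search?q=" + quote_plus(query)

query = linkfromdomain_query("http://www.exampledomain.com/some/page", "wimax")
print(query)
print(bing_search_url(query))
```

Pasting the printed URL into a browser runs the search; swap in whichever domain and topic you're researching.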
How do you use this for SEO?
This is a linkbuilding tip post - the idea being two-fold:
The information contained in the second approach is typically findable through other means (or the targets are likely to appear on your radar in other ways) and there is a lot of searching through chaff to find wheat. I wanted to run through a worked example today to show you how powerful method #1 can be:
Worked example
I had to pick a niche and a target for my worked example. I decided to imagine I was linkbuilding for a technical but not-specifically-web-related company. I’m trying to get links from trusted authoritative domains so I start with big educational institutions.
As some of you may know, I studied at the University of Cambridge (ending with a year at the Statslab). I don’t want them getting link requests from all you lot, so I picked Oxford (**).
I’m pretending my imaginary client works in some area of telecoms and has resources and technical papers on subjects like wimax and spectrum usage.
First up, wimax:
It turns out that conted.ox.ac.uk is a goldmine for linkbuilders. It’s the Continuing Education section of the Oxford University site and seems to be very generous with linking out. I might suggest that my client gives a talk or writes a resource for a CPD course. At the very least, it might be worth creating some content to target this kind of page.
Tip: I find it best to look for links to pages that aren’t homepages because it’s typically easier to find where the link originates from. Bing doesn’t have an effective link: operator meaning that we have to use Yahoo, Linkscape or similar. Because we are then not using the same index, it can be tricky to track down the link found by linkfromdomain.
Another example starting with spectrum auctions - sometimes it’s funny where this kind of research can take you:
(Incidentally, I found a very similar opportunity on the Cambridge site, but no, I’m not going to tell you about it.)
In an unexpected turn of events, I also found some pretty active blogs writing about my target subject matter on ox.ac.uk URLs. Even I’m not mean enough to fill up those guys’ inboxes with outreach from you lot just because they picked the wrong university.
(**) seriously, we don’t get on (US folks, think of the relationship between Duke and UNC) but I’m not encouraging anyone to spam Oxford University. Really. I’m not. Even though the varsity match is this week.
There are some other great resources on linkfromdomain - I really liked PPC blog’s tip about expired and for sale domains.
Rand has also written about the uses of linkfromdomain for finding spam you are linking to as well as teasing you with the fact that he "gave up" a similar tip to my worked example above at SMX Advanced.
Posted by inflatemouse
This post was originally in YOUmoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of SEOmoz, Inc.
In late October Forum One Networks put out a white paper titled "Online Community and Social Media Compensation." I applaud their efforts, but, I think they create an unrealistic view of the job space in online media.
One, I think the surveyed companies over-represent corporate jobs.
Answers Corp., Autodesk, Avid, Best Buy, Cartoon Network (Turner), Consumer Reports, Electronic Arts, hi5, IBM, KaBOOM!, Nokia, Quest Software, Sage Software, Seesmic, Sony Online Entertainment, The Knot, and Yahoo!
Their average respondent is in a department of nine people and has at least one subordinate. I suspect the truth of the job landscape is that there are far more web jobs (social media, web design/development, SEO, PPC and web analytics) in small business than in large corporations.
Two, they fail to address critical issues like education, work experience and job duties.
Forum One claims that in social media the average woman makes $75,624 and the average man makes $86,644. I feel simply looking at the averages is too shallow to make a good argument about compensation.
So, I am doing something about it. I think the SEOmoz community has a wide range of people and will contribute a broader, and more realistic, perspective on what jobs on the web really look like. I put together an 18-question anonymous survey (it will take less than 5 minutes to complete) to create a better look at salary and compensation on the web.
Once I collect the data we will make all of the findings transparent, free to download and creative-commons so you can use the data freely. Help the community by creating better data resources.
Posted by great scott!
Not unlike investing, when it comes to link acquisition diversity is key. Evidence points to a strong preference by the engines for a diverse link profile rather than a homogeneous one, even if the links in a narrow profile are from strong sites. In this week’s WBF, we’ll look at why a wide variety of linking domains is better than repeated links, even from very strong domains: it’s all about trust.
SEOmoz Whiteboard Friday - Link Diversity from Scott Willoughby on Vimeo.
Posted by jennita
Disclaimer: This article consists of our favorite articles of the past year and does not have actionable SEO techniques. Please read on if you’re interested in knowing more about us, and what we like!
This week I’ve been personally invested in Gwen Bell’s The Best of 2009 Blog Challenge aka #best09. The idea is that each day in December you reflect on the past year and write about a different topic each day. Obviously you can write every day, or pick and choose which topics you want to cover. It’s only been a few days but I’ve enjoyed reading through some of the blogs and tweets from people participating. Today the topic is:
December 3 Article. What’s an article that you read that blew you away? That you shared with all your friends. That you Delicious’d and reference throughout the year.
Since the topic is right up our alley, the SEOmoz crew decided to put together a list of our favorite articles from 2009. Some of these are search related, but many of them are not. Take a peek into our minds and I think you’ll find it interesting the types of articles we love.

Not sure if it "qualifies" since it’s from last year, but I shared this article, about what it really means to be a billionaire, with a ton of people. It’s absolutely fascinating, especially if you’re someone (like me) who fantasizes about how you would potentially spend great sums of cash :)
On the flip-side of the equation is this excellent article from the Washington Post illuminating the incredibly high cost of being poor. Fascinating and eye-opening.
Together they pack a one-two punch that sheds a ton of light on just how drastic wealth and class disparity can be, even in the U.S.

I’m a big fan of this GapingVoid post from October: The moment
From an SEO standpoint, I’ve been getting a lot of mileage from Eric Enge’s interview with Google Image search engineer Peter Linsley. It’s a topic that doesn’t get covered often, and the information in the article is incredibly useful.
This Smashing Mag post is Usability-oriented, but great stuff for any web person. Unlike many of these kinds of articles, almost every point in this one is directly actionable:
Of course, I also think this post was pretty good - the author is clearly a genius ;)

Life lesson: There is no speed limit - talks about how education is designed to get everyone through and how many people take this slow pace with them throughout their life.
We Have Been De-googled! - One blog talks about the impact of being kicked out of Google for seemingly no reason.

The article that made the biggest impact on my life this year was this one from SEOmoz. It is Lindsay’s first post and it was an announcement of the job opening I ended up getting. :)
Personally this short post helped me get my personal goals organized.

Rand’s favorites from the past few months:
http://www.contrast.ie/blog/youre-just-getting-started/
http://www.zeldman.com/2009/11/24/on-self-promotion/
http://000fff.org/getting-to-the-customer-why-everything-you-think-about-user-centred-design-is-wrong/
http://www.smashingmagazine.com/the-death-of-the-blog-post/
http://www.everywhereist.com/borough-market-a-place-for-love-but-not-vegetarians/
http://www.nytimes.com/interactive/2009/11/06/business/economy/unemployment-lines.html?hp
http://www.inc.com/magazine/20091101/does-slow-growth-equal-slow-death.html?partner=fogcreek
http://cdixon.org/?p=1391
By the way, there’s still time to get your FREE SES Chicago Pass by purchasing a year of PRO! We’ve only got a few passes left, so you should probably hurry. SES just raised their prices to $1995 for a pass, so $799 for an entire year of PRO and a full-access SES Pass is an awesome deal (and if Chicago’s not your thing, SES will let you exchange the pass for any SES Event in 2010).
Posted by Nick Gerner
As we rapidly approach the end of 2009 and opening of 2010, we’ve got a much anticipated index update ready to roll out gang. Say it with me "twenty-ten". Oh yeah, I’m so gonna get a flying car and a cyberpunk android :) …Ahem. I thought this would be a great time to take a look back at the year and ask, "where did all those pages go?" Being a data-driven kind of guy, I want to take a look at some numbers about churn, freshness and what it means for the size of the web and web indexes over the last year, and the hundreds of billions, indeed trillion plus urls we’ve gotten our hands on.
This index update has a lot going on, so I’ve broken things out section by section:
Not too long ago, at SMX East, I heard Joachim Kupke (senior software engineer on Google’s indexing team) say that "a majority of the web is duplicate content". I made great use of that point at a Jane and Robot meet up shortly after. Now, I’d like to add my own corollary to that statement: "most of the web is short-lived".

After just a single month, a full 25% of the URLs are what we call "unverifiable". By that I mean that the content was either duplicate, included session parameters, or for some reason could not be retrieved (verified) again (404s, 500s, etc.). Six months later, 75% of the tens of billions of URLs we’ve seen are "unverifiable" and a year later, only 20% qualify for "verified" status. As Rand noted earlier this week, Google’s doing a lot of verifying themselves.
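To make those percentages concrete, here's a tiny back-of-the-envelope sketch. It simply applies the three figures above to a hypothetical batch of URLs; the shares are web-wide averages, so treat the output as a rough expectation and not a prediction for any particular set of links.

```python
# Share of URLs still "verified" at each age in months, from the figures above:
# 25% unverifiable at 1 month, 75% at 6 months, 20% verified at 12 months.
VERIFIED_SHARE = {1: 0.75, 6: 0.25, 12: 0.20}

def expected_verified(urls_seen, months):
    """Estimate how many of `urls_seen` URLs would still be verified after `months`."""
    if months not in VERIFIED_SHARE:
        raise ValueError("only 1, 6 and 12 months are covered by the figures")
    return round(urls_seen * VERIFIED_SHARE[months])

# 1,000 is a made-up batch size for illustration.
for age in (1, 6, 12):
    print(age, "months:", expected_verified(1000, age))
```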
To visualize this dramatic churn, imagine the web six months ago…
Using Joachim’s point, plus what we’ve observed, that six-month old content today looks something like this:

What this means for you as a marketer is that some of the links you build and content you share across the web are not permanent. If you engage heavily with high-churn portions of the web, the statistics you monitor over time can vary pretty wildly. It’s important to understand the difference between getting links (and republishing content) in places that will make a splash now, but fade away, versus engaging in lasting ways. Of course, both are important (as high-churn areas may drive traffic that turns into more permanent value), but the distinction shouldn’t be overlooked.
Regarding Linkscape’s indices, we capture both of these cases:
To put it another way, consider the quality of most of the pages on the web, as measured, for instance, by mozRank:
I think the graph speaks for itself. The vast majority of pages have very little "importance" as defined by a measure of link juice. So it doesn’t surprise me (now at least) that most of these junk pages are disappearing after not too long. Of course, there are still plenty of really important pages that do stick around.
But what does this say about the pages we’re keeping? First off, let’s take out any discussion of the pages that we saw over a year ago (as we’ve seen above, there’s likely less than 1/5th of them remaining on the web). In just the past 12 months, we’ve seen between 500 billion and well over 1 trillion pages depending on how you count it (via Danny at Search Engine Land).
So in just a year we’ve provided 500 billion unique urls through Linkscape and the Linkscape powered tools (Competitive Link Finder, Visualization, Backlink Analysis, etc.). And what’s more, this represents less than half of the URLs we’ve seen in total, as the "scrubbing" we do for each index cuts approx. 50% of the "junk" (including canonicalization, de-duping, and straight tossing for spam and other reasons). There’s likely many trillions of URLs out there, but the engines (and Linkscape) certainly don’t want anything close to all of these in an index.
From this latest index (compiled over approx. the last 30 days) we’ve included:
We’ve checked that all of these URLs and links existed within the last month or so. And I call out this notion of "verified" because we believe that’s what matters for a lot of reasons:
I hope you’ll agree. Or, at least, share your thoughts :)
I also want to give a shout-out to Sarah, who’s been hard at work on repackaging our site intelligence API suite. She’s got all kinds of great stuff planned for early in the coming year, including tons of data in our free APIs. Plus she’s dropped the prices on our paid suite by nearly 90%.
Both of these items are great news to some of our many partners, including:
Thanks to these partners we’ve doubled the traffic to our APIs to over 4 million hits per day, more than half of which are from external partners! We’re really excited to be working with so many of you.
Posted by randfish
Yesterday night I stayed up way too late authoring a post on Google’s Indexation Cap. Today, despite getting up way too early, I wanted to follow up and answer some of the questions from the comments, Twitter and my email. I think SEOs who read the post rightly asked for more direction in solving this problem - a fair request. Below, I’ve done my best to tackle these problems visually, as I believe we all think about site architecture and crawling issues in a visual structure.
First off, here’s a sample site hierarchy to set down the concept and give the colors I’m using in the following diagrams more context:

Next, I’ve illustrated, in a more representative fashion, how those hierarchies might look on a website, and noted the external link potential of each:

In this next piece, I’m trying to explain a very important concept and something that’s frequently misunderstood by SEOs. Once upon a time, search spiders would crawl the web largely recursively - hitting a homepage that had been submitted to their index (remember way back when search engines had submission?!), then crawling in an outward fashion based on the links they discovered there. That hasn’t been the case for a long time, and as we all see with crawl paths (if you’re looking at the requests Google/Yahoo!/Bing make to your domain), multiple entry points are nearly universal and crawling pushes "outward" from those priority URLs. It looks a bit like Minesweeper, right? :-)
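For the more code-minded, the difference between a single entry point and multiple entry points can be sketched as a breadth-first walk over a toy link graph. Everything here is hypothetical: the site structure, the function name, and especially max_depth, which is just a stand-in for however far a crawler is willing to push outward. It is not a model of how any engine actually crawls.

```python
from collections import deque

def crawl_depths(links, entry_points, max_depth):
    """Breadth-first walk from several entry points at once; returns {url: depth}."""
    depths = {url: 0 for url in entry_points}
    queue = deque(entry_points)
    while queue:
        page = queue.popleft()
        if depths[page] >= max_depth:
            continue  # do not follow links out of pages already at the limit
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Toy site: homepage -> categories -> sub-pages -> detail pages.
site = {
    "/": ["/cat-a", "/cat-b"],
    "/cat-a": ["/a/1", "/a/2"],
    "/cat-b": ["/b/1"],
    "/a/1": ["/a/1/detail"],
    "/b/1": ["/b/1/detail"],
}

homepage_only = crawl_depths(site, ["/"], max_depth=2)
with_deep_entry = crawl_depths(site, ["/", "/b/1"], max_depth=2)
print(sorted(homepage_only))
print(sorted(with_deep_entry))
```

Starting only from the homepage, neither detail page is reached within two hops. Add /b/1 as a second entry point (think of it as a deep page with its own external links) and /b/1/detail gets picked up, while /a/1/detail is still left out.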

Finally, I’ve got a graphic to help understand how to positively approach these problems and solve them.

There are certainly more recommendations that can be provided around these issues, and I look forward to a discussion of them in the comments.
p.s. I covered site architecture and navigation in a good bit of detail at the PRO Training this summer, but I like this image format so much, I think I might re-craft something new for next year. It feels like structuring sites properly is still a big pain point for SEOs (but possibly that’s less to do with lack of knowledge and more to do with lack of influence during the design phase?)
Posted by randfish
Thanks so much for all your votes and feedback on our PRO Webinar Series over the holiday weekend. We received 285 responses and we’re taking your suggestions very seriously and conducting the webinar as you’ve requested :-)
Here are the stats from the questionnaire/form (you can still fill it out if you’d like to give more input):
Will you be able to attend the PRO Webinar on Dec. 10th at 11am Pacific (2pm Eastern, 7pm London)?
What topics most interest you for the webinar (check all that apply)?
What webinar format would you prefer?
Based on this, we’re going to be running a 90 minute webinar, with a 45 minute slide deck presentation (and possibly video as well, though it will likely just be of me on the webcam) from 11am - 12:30pm Pacific (2pm - 3:30pm Eastern, 7pm-8:30pm London) on Thursday December 10th. The webinar will cover the following rough outline (obviously, in more detail):
I’m certainly open to feedback about what you’d like to see in there, and happy to make some inclusions where possible. All PRO members will receive an invite via email in the next 2-3 days with a link to register. You’ll be able to dial in or hear the webinar via your computer speakers/headphones and ask questions via a chat interface. You can see an example of a past presentation I’ve made below:
This lengthy one came from my HostingCon keynote and serves as a fun introduction to SEO (BTW - let me strongly recommend against creating slide decks using photos of a whiteboard; it’s fun and the audience likes it, but it took about 12 solid hours of surprisingly intensive whiteboard drawing and erasing, never mind the editing, cropping and pasting):
I’m very much looking forward to spending the morning with our PRO members next week! If you’re not yet PRO, Scott’s got some pretty sweet offers still available including the SES Chicago ticket + 1 year of PRO for $799 (and you can trade in the Chicago pass for any SES event in 2010) and the Advanced Training DVD for PRO members at $199.
Note that the other topics that received lots of votes (SEO Metrics & KPIs, Social Media Marketing, etc.) will likely be the topics for webinars in January, February and March.