One of the most important things you can do on your site for SEO is to have a well-defined and optimized site structure.  Can you find everything on your site without searching?  Try getting to everything on your site using only a mouse.  Remember that search bots don’t do searches.  This is a problem that many large sites have (including WorldVitalRecords.com; FamilyLink.com is better).  You want to build your site horizontally, not vertically.  Avoid content silos and use content themes to organize your site.  One great tip is to use a different sitemap for each section of your site.  Sitemaps greatly help the crawlability of any site.

Scott Polk is in charge of SEO at Edmunds.com.  They do very well in the search engines and they have reaped many benefits from organic search.  It was very interesting that EVERY employee who works on the site gets SEO training.  Plus every project on the site gets reviewed by the SEO team.  They also have this idea of attaching personal revenue to each employee and organic search is part of that.  I LOVE this idea.  I have never seen a company so committed to SEO.

Another great SEO feature is the “Share/tag this page” link that HP.com uses on some of their pages.  These links pop up a box with many social media tagging options like in this image.

[Image: tag this]

This is a great way to get incoming links to many pages of your site. I could also see the email this to a friend option and print options tied into a page like this.

Many large sites have a problem with creating XML sitemaps.  I have tried to build mine using data right from the database but don’t feel like this is really getting all the pages (plus some of the pages are behind the member login).  But large sites are too hard to crawl with an external crawler.  I asked this question to Mr. Polk and he told me that Edmunds.com uses XML sitemaps and has about 160 of them daisy-chained together.  But Tanyan Vaugh, the SEO director at HP, said they do not even try to build sitemap files for their site.  Bruce Clay said that he uses XML sitemaps whenever he can.  But if you can’t build a complete one, he suggested at least getting all your major pages in the file.  He also suggested putting pages that are having trouble getting indexed in the sitemap file.  Sounds like practical, sound advice.  I like Bruce (btw, check out his coverage of PubCon on his blog.  He must be recording it and getting it transcribed because almost every word is there.  Amazing!).
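For anyone wondering what chaining sitemaps together might look like when you build them from the database, here is a minimal Python sketch. It assumes "daisy chained" means one sitemap index file pointing at many smaller sitemap files, which is how the sitemaps.org protocol handles large sites. The function names, file names, and example.com URLs are my own placeholders, not anything Edmunds.com described.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50000  # per-file limit in the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return one <urlset> sitemap document for a list of page URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<urlset xmlns="{SITEMAP_NS}">\n'
            f"{entries}</urlset>\n")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document that points at each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
            f"{entries}</sitemapindex>\n")


def split_into_sitemaps(urls, base_url, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split urls into chunks; return (index_xml, {filename: sitemap_xml})."""
    files = {}
    for n, start in enumerate(range(0, len(urls), chunk_size), start=1):
        # Hypothetical naming scheme: sitemap-1.xml, sitemap-2.xml, ...
        files[f"sitemap-{n}.xml"] = build_sitemap(urls[start:start + chunk_size])
    index_xml = build_sitemap_index([f"{base_url}/{name}" for name in files])
    return index_xml, files
```

Following Bruce Clay's advice, the list you pass in does not have to be complete. Your major pages plus the ones that are struggling to get indexed would be a reasonable start.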

At PubCon there was a session on duplicate content.  The speakers were from Ask.com, Google, and Yahoo.  All the search engines agree that sites do not get penalized for duplicate content.  It is one of the largest SEO myths on the web.  As you can imagine, almost every large site on the web has some duplicate content issues.  But duplicate content can still hurt your site.  The largest problem is that when the bots start finding duplicate content they may stop crawling your site.  So duplicate content can keep your site from getting completely crawled and indexed.  Also when the search engines find duplicate content they must decide which is the original or most important.  This can go badly if, for example, the engine picks a print page with no navigation as the version to show.  Anyway, it is still a large problem and must be addressed to fully achieve your best search rankings.

The most common reasons for duplicate content are:

  • Having multiple URLs pointing to the same pages.
  • Print pages
  • Having the same version of content on different country sites
  • Problems with dynamic sites like session IDs
  • Syndication of content on other sites
  • Mirrors

It also seems that many people have problems with duplicate content due to 3rd parties scraping their sites.  To discover this, simply search for a sentence of text from your page (without punctuation) in Google and see if any other pages come up.  If any do, the only thing you can do is ask them nicely to take it down and then involve the lawyers.
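If you want to check many pages this way, preparing the search phrase can be scripted. The Python sketch below only builds the query string. Wrapping it in double quotes for an exact-phrase match is my own assumption, and the function name is made up.

```python
import string


def scrape_check_query(sentence):
    """Strip ASCII punctuation and wrap the sentence in quotes for searching."""
    cleaned = sentence.translate(str.maketrans("", "", string.punctuation))
    # Collapse any double spaces left behind by removed punctuation.
    return '"' + " ".join(cleaned.split()) + '"'
```

Paste the result into the search box and look for any domain that is not yours.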

Here are some tips to remove duplicate content issues:

  1. Act on what you have control over!  Use robots.txt and noindex tags.  Remove session IDs, affiliate IDs, and tracking IDs from the URLs.  Another problem is that a list with multiple sort options often creates a new URL for each sort order.  One slick solution to this is to put the sort option in a cookie instead of the URL.  This would also provide a better user experience because it would remember the sort order the user last used.
  2. Use distinct TLDs (Top Level Domains) when localizing your site and make the content unique whenever you can.
  3. Make it hard for scrapers. First use copyright/Creative Commons notices on your site and make it clear you don’t allow others to use your content.  The use of absolute links often messes up scrapers and hosting images locally also helps. Don’t be afraid to use legal action if you need to.
  4. If you have problems don’t hesitate to contact the search engines and let them know.  All the search engine reps seemed very interested in helping solve duplicate content issues.
  5. Use the webmaster tools that the search engines provide.  Google webmaster tools for example lets you easily remove duplicate content links in their interface.  Yahoo also has a nocontent tag you can use in your divs to tell the search engines not to crawl parts of a page.  They also have a great new feature called dynamic URL rewriting in their Site Explorer.
  6. It was thought that www.site.com and site.com would cause duplicate content issues.  The search engines are smart enough to make this a non-issue nowadays.  But everyone still said it was good coding practice to make the non-www version 301 redirect to the www version of your site.
  7. Make RSS versions uncrawlable.
  8. Regularly review URL requests at a server level to see what pages are getting crawled.  This is a great way to find duplicate content issues.
  9. Know your URL parameters.  Ideally you would not have parameters in the URLs.  But if you do have them make sure you know what they all do.  Make sure you also put them in the same order each time.  If you have the parameters in a different order it will appear as a different URL.
  10. Get a regular crawl report from your analytics program or from your log files.
  11. Use 301 redirects whenever a page or content moves.
  12. If you use syndicated content on your site you need to set your expectations.  Syndicated content will never rank as well as the original.
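
Several of the tips above (dropping session and tracking IDs, keeping parameters in one order, sending non-www to www) come down to computing one canonical URL per page. Here is a rough Python sketch of that idea. The parameter names in IGNORED_PARAMS and the www.example.com host are placeholders I made up, so swap in whatever your site really uses.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names that never change the page content.
IGNORED_PARAMS = {"sessionid", "sid", "affid", "trackid", "sort"}


def canonical_url(url, preferred_host="www.example.com"):
    """Return the single URL form we want the search engines to see."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare_host = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host == bare_host:
        host = preferred_host  # non-www version maps to the www version
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    params.sort()  # same parameter order every time
    return urlunsplit((parts.scheme, host, parts.path, urlencode(params), ""))


def redirect_target(url):
    """Return the URL to 301 redirect to, or None if url is already canonical."""
    target = canonical_url(url)
    return None if target == url else target
```

For example, http://example.com/list?b=2&sid=abc&a=1 would redirect to http://www.example.com/list?a=1&b=2. The sort order that was dropped from the URL would have to live in a cookie, as suggested in tip 1.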

There are many of these things I still need to do on my sites.  I have removed many duplicate content issues in the last few months (since SES) but I still have a ton to do.

Here are some tips and ideas on getting dynamic web sites indexed by the search engines.

1) Watch out for spider traps. Many dynamic web sites have multiple ways to access the exact same data. Also it is very easy to build infinite loops where the spiders get lost and give up. The search engines don’t ban you or anything because of this. Often it just means your best pages never get indexed. Monitor the spider activity using your analytics software or by looking at your log files. Also use tools such as Yahoo Site Explorer to see what is (and isn’t) indexed.
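
If you go the log file route, a small script can show where the spiders are spending their time. This Python sketch assumes your server writes the common "combined" log format, and the user-agent substrings in BOT_MARKERS are my own shortlist. Adjust both to match your setup.

```python
import re
from collections import Counter

# Assumed user-agent substrings for the major spiders.
BOT_MARKERS = ("Googlebot", "Slurp", "msnbot", "Teoma")

# Matches the request, status, size, referrer and user agent of a combined-format line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def spider_hits(log_lines):
    """Count how many times known spiders requested each path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and any(bot in match.group("agent") for bot in BOT_MARKERS):
            hits[match.group("path")] += 1
    return hits
```

Calling spider_hits(open("access.log")).most_common(20) on your own log would list the twenty most-crawled paths. A path with thousands of hits and endless parameter variations is a likely spider trap.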

One way to fix this (or ensure you don’t have problems with this) is to use 301 redirects. Use your server to keep the spiders on the right path. Also use the robots.txt file to keep the spiders out of troubled areas. There was one example given where they blocked the spider from problem areas and within days their other rankings improved.
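
Before you upload a new robots.txt it is worth checking that it blocks what you think it blocks. Python's standard urllib.robotparser module can do that offline. The rules and URLs below are invented examples, not the ones from the session.

```python
from urllib.robotparser import RobotFileParser

# Example rules only: keep spiders out of two hypothetical trouble areas.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /calendar/
"""


def blocked_urls(robots_txt, urls, agent="*"):
    """Return the URLs that the given robots.txt text blocks for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]
```

Run it against a handful of URLs you want crawled and a handful you don't, and make sure each lands on the right side.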


2) Watch out for form based navigation.
This is where you select an option in a dropdown and must click a button to get there. Bots don’t submit forms and would never find your content if you use a form as your main type of navigation.

3) URL parameters are OK. For years it was thought that the spiders simply stopped at the “?” in a URL, which is where the parameters start. This is not true. Nowadays there is so much dynamic content that the spiders are smart enough to figure out most parameter strings. But this holds only to a point. Really long URLs (more than 200 characters) will not be crawled. Never go over 10 variables in your URL string! So keep them as short as possible. Also make sure you are not passing session IDs or user variables in the URL strings. These show up as different pages to the search engines and will give you duplicate content problems.
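
These limits are easy to check in bulk. The Python sketch below flags URLs that break the rules of thumb above (200 characters, 10 variables, session IDs). The session parameter names are guesses at common ones, so add your own.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_URL_LENGTH = 200  # rule of thumb quoted in the session
MAX_PARAMS = 10       # rule of thumb quoted in the session
# Assumed session parameter names; extend for your platform.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}


def url_problems(url):
    """Return a list of crawlability warnings for one URL (empty if none)."""
    problems = []
    if len(url) > MAX_URL_LENGTH:
        problems.append("too long")
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    if len(params) > MAX_PARAMS:
        problems.append("too many parameters")
    if any(key.lower() in SESSION_PARAMS for key, _ in params):
        problems.append("session id in URL")
    return problems
```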

One interesting idea is to use a one-parameter URL schema. This is where you basically use mod_rewrite rules to map one simple URL parameter to a much longer URL string. This masks your complicated URLs with simple ones. I love this idea.
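
The rewrite rules themselves live in your web server config, but the lookup behind a one-parameter schema can be sketched in a few lines of Python. Everything here (the short key, the /search.php script, the parameter names) is a made-up example of the mapping, not a real site's setup.

```python
from urllib.parse import urlencode

# Hypothetical table: one short public key per long internal parameter set.
PAGE_MAP = {
    "utah-births": {"db": "births", "state": "UT", "view": "list", "page": "1"},
}


def expand(short_key):
    """Return the long internal URL for a short key, or None if unknown."""
    params = PAGE_MAP.get(short_key)
    if params is None:
        return None
    return "/search.php?" + urlencode(params)
```

The spider only ever sees something like /page?p=utah-births, and the server expands it internally.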

4) Remove the junk. Often dynamically generated pages have much more “junk” code in them than hand-coded pages: JavaScript, comments in the code, bloated CSS and even bad HTML. Make sure the output source code of your pages is clean and optimized.

5) Make sure your pages are optimized. Dynamic pages can be some of the easiest and most difficult to optimize for the search engines. Make sure each “page” on your site has a unique title tag and meta tags. This is a problem we are currently fixing on the WVR website. Also make sure you have good headlines, alt tags, and cross linking on your pages. If you do it correctly, you can easily improve thousands of pages (or millions in our case) with a few simple lines of code with page variables inserted.
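
To show what "a few simple lines of code with page variables inserted" can mean, here is a minimal Python sketch that builds a unique title and meta description from two page variables. The variable names and the wording of the template are placeholders, not what we actually use on WVR.

```python
from html import escape


def page_head(record_type, place):
    """Build a unique <title> and meta description from page variables."""
    # Hypothetical template text; "Example Genealogy Site" is a placeholder.
    title = f"{place} {record_type} Records | Example Genealogy Site"
    description = f"Search {record_type.lower()} records from {place}."
    return (
        f"<title>{escape(title)}</title>\n"
        f'<meta name="description" content="{escape(description)}">'
    )
```

Because the variables differ on every page, every page gets its own title and description from the same template.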

6) One tip from Laura Thieme was to watch the results in MSN after you make site changes. She says that MSN will pick up the changes first and often Google will rank the pages very similarly after that.

7) Use a “test” spider to find problems. There is a great spider you can get and run against your site that will find problems. Check out the Xenu spider to do this.

8) And last and most important is to use XML data feeds. All the major search engines will accept data feeds where you can tell the spider what pages you want them to crawl. Make sure you are maintaining your sitemap files. Also both the Google and Yahoo systems will tell you if you have errors in your sitemap file and when your site was last crawled.

One big theme of SES was getting user feedback on your site and doing usability studies. Before I get into optimizing landing pages I just want to note that it should actually be your site visitors that are designing your pages. The best page is not one the designer, marketing department or CEO likes best. It is the one that your site visitors like best. And they vote by becoming your customers.

First here are some mistakes to avoid when designing your landing pages.

  1. Never ignore your baseline. You need to measure progress against something to know if you are making improvement. I have fallen into this trap before because I think I can remember when I make changes to the site and when I did it. But a week later I can’t remember where I started (or even what I am testing). So you need to keep some type of record. If you are using Google optimizer it will keep track of your baseline for you.
  2. Not collecting enough data. Make sure you have a valid sample size to draw valid conclusions.
  3. Forgetting about interactions. Interactions are how different elements of your page affect each other. One headline may match up well with a particular image, for example, and you need both to see the improvement. So don’t do multiple independent A/B tests on the same page. If you have more than one thing on a page to test, do a multivariate test. This will test every combination and will tell you which is best, taking into account all possible interactions.
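
To get a feel for how quickly "every combination" grows in a multivariate test, here is a small Python sketch that lists them. The page elements and their variations are invented for the example.

```python
from itertools import product


def all_variations(elements):
    """Return every combination of one variation per page element."""
    names = list(elements)
    return [
        dict(zip(names, combo))
        for combo in product(*(elements[name] for name in names))
    ]


# Hypothetical test plan: 2 headlines x 2 images x 3 buttons.
TEST_PLAN = {
    "headline": ["Find your ancestors", "Search billions of records"],
    "image": ["family-photo", "old-document"],
    "button": ["Search", "Start free trial", "Sign up"],
}
```

Here all_variations(TEST_PLAN) has 12 entries, and each one needs enough traffic to reach a valid sample size. That is why mistake 2 and mistake 3 pull against each other.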

Focus your optimization efforts above the fold. This is a mistake I often fall into because I love long sales pages. Long pages work well for single product-single page sites. But most commercial web sites have a much more complex decision making process. Conversions mirror eye tracking studies. So place your most important items where people’s eyes are naturally drawn.

You have to remember that most people on the web have a “blink of the eye” mentality. You only have a precious few seconds to convey your message and draw the user into taking the next step. Leave them a “scent” they can’t resist. Before you can convert a user you must first get their attention.

Jamie Roche, the president of Offermatica, gave some great points on using personalization to improve your conversion rate. I touched on this in my previous post on personalization. But if you know what a user searched for in the search engine, then reinforce that on your landing page when they get to your site. “You searched for X and we have it in stock!” Here are the easiest and best ways that Mr. Roche suggests you use to get started with personalization:

  • Give new visitors a different message than return visitors. This is easy to do with a site cookie.
  • Affinity – If they enter your site in a specific category then show them information on that same topic the next time they visit.
  • Time/Day targeting. Visitors to your site who come during business hours may be very different than visitors coming to your site during the evening or on weekends. Tweak your site experience to best capture these different users.
  • Geo targeting. This can be an easy way of providing a personalized message. For example on World Vital Records we should highlight data that is close to where the user is from.
  • Paid vs direct. If your user came from a paid source they may need different messaging than if they came from organic search listing.
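
A starting point for a few of these can be very small. The Python sketch below labels a visit as new or returning, business hours or off hours, and paid or unpaid. The 9-to-5 weekday window and the label names are my assumptions, and how you detect the cookie or the paid click depends on your own site.

```python
from datetime import datetime


def visitor_segments(has_cookie, visit_time, referrer_is_paid):
    """Return simple segment labels for one visit."""
    is_weekday = visit_time.weekday() < 5        # Monday=0 ... Friday=4
    in_office_hours = 9 <= visit_time.hour < 17  # assumed 9:00 to 17:00 window
    return {
        "visitor": "returning" if has_cookie else "new",
        "daypart": "business_hours" if is_weekday and in_office_hours else "off_hours",
        "source": "paid" if referrer_is_paid else "unpaid",
    }
```

Each label can then pick a different headline or offer, which also gives you something concrete to test.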

Mr. Roche really stressed that you just need to pick a starting point. Don’t let it overwhelm you. Start small and work into a fully personalized site experience later. But even though you may start small you still need to think big. Have a big personalization strategy and work toward it.

There are several approaches you can take to optimizing your landing pages and your site messaging in general. You can do it in an evolutionary manner where you start with what you have now and make small changes over time. Or you can take a revolutionary approach and test something very different. This method has the largest chance to greatly improve your conversion rates but also has more risk that your test may tank hard. He likes the revolutionary method.

The overall point is that you need to optimize your landing pages with the user in mind. An optimized page means that your site visitors find it more compelling and more useful. The user experience is better and thus the conversion rate goes up. So don’t get lost in the numbers and in individual tests. Think about the user and making improvements for them. Show them what they want and conversion rates will follow.

It seems that one of the big BUZZ topics is personalization. Meaning that the search engines are starting to personalize results based on each user. If you think about how much data each of the search engines has on us they can easily do this. Google is doing this extensively already. One example was of the same search done from 5-6 different locations using the same computer. Each was different and provided different results. Not all of them seemed right or to make sense. But you can easily see the direction that Google is going with personalization.

Personalization is based on several items.

  1. Current tasks. What the user is actually doing.
  2. The user’s search history.
  3. The user’s web history in general. What services do they use and what sites do they frequent?
  4. Social patterns. This is the idea of matching users who do the same things online and then providing custom search results based on the social group you fall into. Common memberships, bookmarks, and search behavior.

Gord Hotchkiss from enquiro.com had a very interesting “heat map” of where a user’s eyes track on a page with personalization compared to a standard page. He used Google in his example and it was amazing. The personalized page had many more “hot spots” on the page and it totally changed the way the user viewed the page. I will try and get these screen shots and post them to my blog. But the idea is that a better user experience is better for both the user and the advertiser. There is some talk about how better organic search or personalized data means fewer paid clicks. But I think a better user experience can only mean more searches and more eyeballs.

There is also an idea that Google has come up with a “personal quality rank”. Not only does Google want to rank pages but they want to rank Internet users. Then they will use the high quality users to affect the results of the lower quality users. So we need to design our sites to attract high quality visitors. This will then improve our search engine rankings. This is a strange idea and it will be interesting to see how this turns out. Google has a patent on this idea.

Jonathan Mendez had some great ideas. (He is very impressive and knows his stuff.) One was how you can use the parameters from the organic search engine referrals. Google passes many parameters such as language, country, keywords and more. So the idea is if someone comes into your site after doing a search in Spanish, for example, you can then serve up Spanish content. That is a powerful idea!
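
Reading those parameters is simple once you have the referrer URL. This Python sketch pulls the keywords and interface language out of a Google referrer. It assumes the referrer actually carries the query string, with q holding the keywords and hl the language, and the function name is my own.

```python
from urllib.parse import urlsplit, parse_qs


def search_referral(referrer):
    """Extract the engine host, keywords and language from a search referrer URL."""
    parts = urlsplit(referrer)
    params = parse_qs(parts.query)
    return {
        "engine": parts.netloc,
        "keywords": params.get("q", [""])[0],   # assumed keyword parameter
        "language": params.get("hl", [""])[0],  # assumed language parameter
    }
```

A referrer like http://www.google.com/search?hl=es&q=registros+vitales comes back with language "es" and keywords "registros vitales". That is enough to choose Spanish content and echo the search phrase on the landing page.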

Many searches are very subjective. Searches like “Cool Furniture” or “What doctor should I use”. The search engines currently have no way to provide relevant results for these types of searches. They need to personalize their searches to give the users what they want.

What all this means to the search engine optimizer is that it is getting harder and harder. Everyone will start getting different results based on the user instead of on page elements or links. Everything is getting more complex!

Bill Tancer from Hitwise gave a great presentation on some important changes in search. They have a blog at www.ilovedata.com and track traffic from over 1600 search engines. First of all Google is BIG and only getting bigger! They currently make up 4.8% of ALL internet usage. This is an amazing number if you think about it. Hitwise has a breakdown of Google’s traffic by their various sites. Search is #1 and makes up 70% of their traffic. But YouTube is now their #2 traffic generator at 10% of all Google traffic. After that is Google Images. A few months ago Google implemented the “Universal Search” results in which they mix in maps, videos, images, news and other sources into their global search results. This has greatly impacted their traffic mix. Google Maps is seeing 20% more traffic over the last 30 days due to this.

Human search has been a hot topic as of late. But the first human search shows up at spot 400 in the search engine list. That is ChaCha.com. So in other words “nothing”! The top 4 search engines (Google, Yahoo, MSN/Live, Ask) make up 98% of all searches. So the others are hardly worth the effort.

Social networks now make up a large amount of traffic. The interesting thing is that the increase in social networking traffic has not affected the search engines at all. Social networks are unique in that they do not rely on search engines. Hitwise now has numbers that show that many people participate in multiple social networks. The increase in Facebook traffic has not affected MySpace traffic, for example.

Facebook is touted as the next Google. There has been a ton written about this and there is lots of hype around Facebook right now. My boss Paul Allen has a great Facebook post. Facebook has had 2 major turning points in their traffic. The first is when they opened their site up to corporate networks. The second was when they opened their site to all users. Facebook now accounts for almost 2% of all web traffic! It is amazing to think about.

It is also interesting to think about how users start their web experience each day. In the beginning there were large portals like Yahoo and Netscape. This is what people used as their start page. Then people moved to using a search engine as their starting point. Now users are moving to social networks as their starting point. This is one reason they are growing so quickly.

There are some major differences in the demographics of each of the major search engines. Google has more males and they tend to be technical. One interesting stat is that extreme liberals and extreme conservatives tend to use Google more (people with political views in the middle don’t use Google as much). MSN and Ask have more women searchers. Also seniors use MSN/Live and Ask way more than they do Google.

There are 60 billion searches per month worldwide and only 13 billion of them happen in the US.

I am happy to say that I am at SES this week. It will give me a great opportunity to immerse myself in search and to catch up on what is happening in the search industry. There have been a ton of changes in the last 6 months. So I will be blogging about each session I attend. I am doing this for multiple audiences. First because my boss Paul Allen told me to provide a report. This is my way to do it. So everyone at WVR and FamilyLink.com can learn more about the search engines. Also I will be informing my affiliates about the posts so they can learn what I learn. Lastly to all the rest of the readers of my blog. So check back often this week for many updates to the blog.

Today I was doing some research on what words are pulling in the organic search engine listings. I found that our site World Vital Records is pulling for the words “world records”. Now obviously we have nothing on our site about world records and those words never appear without the word “vital” in the middle, which changes everything. So according to my Omniture reports both MSN and Live.com (same system) think our site is related to world records. Search for “world records” on msn.com and we are listed as #2. But it is interesting that we are nowhere to be seen for this keyword on Google and Yahoo or even any of the smaller engines. Congratulations Microsoft, you get an “F” for relevancy on this search!

Today I attended the class “A Crash Course in Internet Marketing” by my boss Paul Allen. It was the first class of a 12 week course. The main reason I wanted to attend was to learn more about blogging. I have never heard Paul’s blogging pitch before. I have found blogging more difficult than I thought it would be: what to write about, how to make it interesting, and how to still seem somewhat professional. Paul had some great insight and suggestions for blogging that I found very useful. If I get the lesson notes from the class today I will post the specifics. But one of the main reasons he listed for blogging was to simply force yourself to write down your ideas and think things out. Paul gave me some good ideas and helped me want to energize my blog. I need to post more often. Simple as that. Whether people actually read anything here does not really matter. My goal is to post at least 3 times a week. I have a few things in the works already.

One thing I just noticed is that it is good to have many categories to place your posts in. And make sure you have good keywords in your category names. I was searching for “internet subscription models” in Google and what is in the 3rd spot? PaulAllen.net! It just so happens that Paul has a category with this exact set of keywords. Paul does a great job of assigning each posting to multiple categories. Great SEO!