Designing and Maintaining Scalable Information Architecture
There is a faction in the industry pushing SEOs toward developing websites with search engine rankings as the main emphasis. Sorry, last I looked search engines don't buy things; people do! You lose your target audience when you focus primarily on search engines. It seems to me that SEOs have become search-engine-centric, especially where links are concerned.
I was discussing a great site I'd found with some SEOs. One piped up and said, "Site is useless, it's nofollow." Yet the site is a leading referrer and a top converter. The lesson? Nothing is cut and dried, and when you make decisions based only on the benefits to SEO you leave cash on the table. (In the section on future techniques I'll return to this theme.)
The one constant in SEO is that you often have to choose the lesser of two evils, assessing the hit to usability, maintainability and SEO. Personally I've never seen the need to put search engines ahead of people, because search engines give me enough elements to get the job done.
Keyword research is the most important part of any SEO project, so in this post we will assume we have our research and are beginning to implement it. Information architecture, or IA, can provide many opportunities to enhance the chances that keyword-rich text is assigned to a link. Google in particular has always analyzed the URL, so keyword-rich, relevant folder and file names should be implemented for complete optimization.
Information architecture is primarily the designing, labeling and positioning of the navigation and call-to-action components of the page. Today I will discuss optimizing file and folder names and URL structure, weighing the three main decision elements:
Usability
Maintainability
SEO (indexing and optimization)
There are three groups to consider for the usability element of any IA design: the user, the webmaster or site maintainer, and search engines. Each has differing needs and requirements. Users need short, understandable URLs. A user should also be able to easily perceive the navigation, which makes breadcrumb navigation based on naming conventions and a consistent structure very important.
For instance, if you maintain a convention of category, subcategory, brand, and SKU throughout a site, it becomes easy for users to perceive this and type into the browser's address bar to quickly get around the site. Personally I often use this kind of browsing on a large website. I always use "-" rather than "_" because if the URL is displayed as the link text with "decoration" (an underline), the "_" disappears into it and looks like a space to a user. I also like to keep the "-" to a bare minimum in folder names for aesthetic reasons and to make the URL easier to remember.
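Here is a minimal sketch of how such a convention could be enforced in code, so nobody builds a URL by hand. The function names and the sample product are my own inventions for illustration, and the rule of always ending with a trailing slash is an assumption, not a requirement.

```python
import re


def slugify(label: str) -> str:
    """Lowercase a label and join its words with hyphens (never underscores)."""
    words = re.findall(r"[a-z0-9]+", label.lower())
    return "-".join(words)


def product_url(category: str, subcategory: str, brand: str, sku: str) -> str:
    """Build a path that always follows category/subcategory/brand/sku."""
    parts = [slugify(p) for p in (category, subcategory, brand, sku)]
    if not all(parts):
        raise ValueError("every URL segment needs at least one letter or digit")
    return "/" + "/".join(parts) + "/"


# Hypothetical product; note the underscore in the SKU becomes a hyphen.
print(product_url("Cameras", "Digital SLR", "Acme", "AC_1200"))
# /cameras/digital-slr/acme/ac-1200/
```

Because every page goes through the same function, a user who chops the SKU off the end of the URL lands on the brand listing, and so on up the tree.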
The above naming convention also makes it very easy for the webmaster to maintain and locate files. One of my pet peeves is agencies putting all files in the root. This technique makes setting security attributes tricky and finding files a hair-pulling experience, and it's actually a poor optimization technique. Funny part is the reason for this is often optimization.
In the very early days, search engines had trouble deep indexing sites. In most cases crawlers only went three folders deep. So some bright guy decided it was best to put everything in the root. Modern-day crawlers traverse sites based on following links, so the number of folders has no effect on a crawler's ability to index the full site.
The structure is quite simple. In the category folder there is an index file; this should link to every subcategory and brand, and these in turn link to every product page, etc. This hub and "indexing file" structure ensures that there are no orphan pages and users can reach their information in no more than three clicks from entering.
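If you want to check that a site actually lives up to this, something like the following rough sketch would do it. It assumes you already have a map of which page links to which (from a crawl or from your CMS); the sample site, the function names and the three-click default are all hypothetical.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], entry: str) -> dict[str, int]:
    """Breadth-first search: fewest clicks needed to reach each page from entry."""
    depths = {entry: 0}
    queue = deque([entry])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit(links: dict[str, list[str]], entry: str, max_clicks: int = 3):
    """Return (orphans, too_deep): unreachable pages and pages past max_clicks."""
    depths = click_depths(links, entry)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    orphans = sorted(all_pages - set(depths))
    too_deep = sorted(p for p, d in depths.items() if d > max_clicks)
    return orphans, too_deep


# Hypothetical site: the category index links to a subcategory and a brand,
# and both of those link to the same product page.
site = {
    "/cameras/": ["/cameras/digital-slr/", "/cameras/acme/"],
    "/cameras/digital-slr/": ["/cameras/digital-slr/acme/ac-1200/"],
    "/cameras/acme/": ["/cameras/digital-slr/acme/ac-1200/"],
    "/cameras/digital-slr/acme/ac-1200/": [],
    "/cameras/old-promo/": [],  # nothing links here
}

orphans, too_deep = audit(site, "/cameras/")
print(orphans)   # ['/cameras/old-promo/']
print(too_deep)  # []
```

The depth here counts links followed, not folders, which is the same way a modern crawler sees the site.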
Generally, for most sites the primary terms would be categories, subcategories and brands. Using these techniques, your site structure is naturally optimized for primary terms. The hub and indexing file structure also results in breadcrumb navigation; therefore, full indexing is more likely and should happen quickly.
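When the folder names follow the convention, the breadcrumb trail can fall straight out of the URL. This is only a sketch under that assumption: the function name is made up, and turning a folder name into a label by swapping hyphens for spaces is a simplification you would likely replace with real category names.

```python
def breadcrumbs(path: str) -> list[tuple[str, str]]:
    """Turn '/a/b-c/' into [(label, url), ...], one crumb per folder."""
    crumbs = []
    url = "/"
    for segment in path.strip("/").split("/"):
        if not segment:
            continue  # the root path has no folders
        url += segment + "/"
        label = segment.replace("-", " ").title()
        crumbs.append((label, url))
    return crumbs


for label, url in breadcrumbs("/cameras/digital-slr/acme/"):
    print(label, url)
# Cameras /cameras/
# Digital Slr /cameras/digital-slr/
# Acme /cameras/digital-slr/acme/
```

Each crumb points at the index file of a folder, so the trail gives both users and crawlers a link back up to every hub.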
The one gotcha with the hub and indexing file style is that the brand and subcategory pages both link to the same product pages, so take care that no duplicate-content filters are tripped. To avoid this, use canonical tags on the product pages if you are using parameters in the URLs. I also suggest ordering the results differently (a different ORDER BY) in the SQL statements behind the subcategory and brand pages, so the two listings are not identical.
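For the canonical tag, a simple sketch looks like this. It assumes the parameters only record how the visitor arrived (from the brand page, from the subcategory page, a sort order) and never identify the product itself; if your parameters do select the product, this approach is wrong for you. The function names and the example.com URL are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so every variant maps to one URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url: str) -> str:
    """Render the <link> element to place in the product page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


# The same product reached from the brand page and from the subcategory page.
from_brand = "https://www.example.com/cameras/digital-slr/acme/ac-1200/?from=brand&sort=price"
from_subcat = "https://www.example.com/cameras/digital-slr/acme/ac-1200/?from=subcategory"

print(canonical_tag(from_brand))
# <link rel="canonical" href="https://www.example.com/cameras/digital-slr/acme/ac-1200/">
print(canonical_url(from_brand) == canonical_url(from_subcat))
# True
```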
Future Information Architecture Techniques
Microformats and RDFa are machine-readable markup formats that look to be on the rise. Whether they are a ranking factor has no bearing on whether these tags are useful for SEO; if search engines "understand" them, they are another signal for determining a geo-target or some other factor, for instance qualifying for rich snippets. Google has over 200 signals, so the advantage from anything that is not heavily weighted (the way the title is, for example) is difficult to determine without an extensive test. I don't think Michael Gray’s test should be given any more than a passing thought, as in, so what?
Google has said that rich snippets use microformats to some degree. Personally I think rich snippets could be very useful for e-commerce sites, since the snippets are tied to queries; in the end a snippet based on the query could provide a better description for the user. There are examples of both formats (microformats and RDFa) being used to provide rich snippets in results on the Google webmaster blog. My dojo friend Hugo Gill turned me on to the microformats code tools on Microformats.org. Understanding Google's angle is good, but knowing the standard is always best from a development standpoint.
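To give a feel for what microformat markup looks like, here is a small sketch that renders a business listing as an hCard. The class names (vcard, fn, org, adr, locality, region, tel) come from the hCard microformat; the function name and the business details are invented, and nothing here guarantees that a search engine will show a rich snippet for the page.

```python
from html import escape


def hcard(name: str, locality: str, region: str, tel: str) -> str:
    """Render an organization as hCard markup; all values are HTML-escaped."""
    n, l, r, t = (escape(v) for v in (name, locality, region, tel))
    return (
        '<div class="vcard">'
        f'<span class="fn org">{n}</span>, '
        f'<span class="adr"><span class="locality">{l}</span>, '
        f'<span class="region">{r}</span></span>, '
        f'<span class="tel">{t}</span>'
        "</div>"
    )


# Hypothetical business details, for illustration only.
print(hcard("Smith & Sons Cameras", "Toronto", "ON", "555-0100"))
```

The visitor sees an ordinary line of text, while a parser that knows the format can pick out the business name, city and phone number from the class attributes.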
Values such as rel="friend" and rel="me" are already being recommended as a "best practice" for Google Social Search. Michael Gray’s post on microformats was very SEO-centric and missed microformats' value, since he was trying to prove they are a ranking “factor” rather than a search “signal.” He determined microformats weren't much use for SEO. First, that is a current or backward-looking statement, because there are plenty of reasons for search engines to use them as a signal. In particular, these formats are perfect signals for local and social search. Second, as noted above, there are too many other "factors" to test this and get definitive answers.
Also, SEOs and webmasters should be cognizant of the value of microformats from a website development standpoint. Since these formats are machine-readable, they can be used to remove calls to databases, which are major hits to website performance. Perhaps this is another reason to be wary of SEO-centric website development.
Terry Van Horne is the founder of SeoPros and a 15-year veteran of Web development, currently working out of his consulting and development firm International Website Builders. Terry's interests are primarily the socialization of search and analysis of social Web traffic and applications like Twitter.