Big Daddy Update

I’m excited (but holding my breath) because I went from 10 indexed pages up to 35 overnight, including about 5 supplemental. The index has been jumping around a lot, so here today, gone tomorrow is highly possible. I've seen reports that within the last day or so, and particularly overnight, a fair number of new pages were indexed (and supplemental pages transitioned back to regular pages), so maybe the growth period is finally starting.

Personal observations

I've seen a fair amount of spidering activity by Google here in the last 2 weeks, more than in the previous 2 months. I had been holding steady at 19 indexed pages for a month and then slipped to 15 and then to 10 within the last week. The 10 pages left were the home page and pages that had been clicked on frequently in search results for the last 6+ months. I have no idea how Google is tracking that, but the Google Sitemaps Administration section seems to indicate that they can track which keywords are getting you clicks. It's an obvious conclusion that they know which page is associated with each click. Yahoo does a URL redirect from its search results. Google doesn’t seem to do anything that obvious.

I remember hearing in the past that Google had to touch a page about 3 times before it would show up in the index. I'm not 100% sure if that's accurate or not, but it seems about right. There could definitely be other factors weighed in, but for a site like mine the observation seems to fit.

The majority of the 35 pages that are indexed are ones that are linked from just about every page on my site (in the top navigation and a bottom content section). It really wasn't done from a Search Engine Optimization perspective. I linked a careful selection of the major pages in a bottom content section of my site to encourage people to browse the site, since traffic patterns seem to indicate people generally hit one page and then disappear. Since it's a blog, I included links to the posts that get the most traffic, a few of the major associated category indexes, and the yearly archives. I'm definitely not suggesting putting links to every page from every page, but for the major ones it may prove useful.

I had read a hypothesis that Google was only traveling down so many levels on a website, depending on (quality) backlinking to the site. If that were 100% true, I would have thought that the new entries linked from the homepage would have shown up by now, but they haven't. I have to go back and analyze my raw log files (by hand) to see what pages Google has pulled in the last 2 weeks and see what the numbers look like. At a quick glance, all the pages that showed up as newly indexed overnight were pulled in the early hours of the morning.
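For anyone who would rather not go through the logs by hand, here is a rough sketch of how that tally could be scripted. It assumes the server writes Apache's Combined Log Format and that Google's spider can be spotted by "Googlebot" in the user agent string (which can be faked, so treat the numbers as approximate). The function name and the sample lines are made up for illustration; they are not from my real logs.

```python
import re
from collections import Counter

# Assumes Apache Combined Log Format; adjust the pattern for other servers.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count Googlebot requests per page and per hour of the day."""
    pages = Counter()
    hours = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip unparseable lines and non-Googlebot visitors
        pages[match.group("path")] += 1
        # Timestamp looks like 16/Jun/2006:03:12:45 -0400; the hour
        # is the field right after the first colon.
        hours[match.group("time").split(":")[1]] += 1
    return pages, hours

# Hypothetical sample lines, invented for this example.
SAMPLE_LOG = [
    '66.249.0.1 - - [16/Jun/2006:03:12:45 -0400] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.0.1 - - [16/Jun/2006:03:14:02 -0400] "GET /archives/2006/ HTTP/1.1" 200 8044 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.7 - - [16/Jun/2006:09:30:11 -0400] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '66.249.0.1 - - [16/Jun/2006:04:01:59 -0400] "GET / HTTP/1.1" 304 - '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    pages, hours = googlebot_hits(SAMPLE_LOG)
    for path, count in pages.most_common():
        print(path, count)
    for hour, count in sorted(hours.items()):
        print(hour + ":00", count)
```

To run it against a real log, pass an open file instead of the sample list, for example `googlebot_hits(open("access.log"))`, where the file name is whatever your host uses.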

I'm thinking this potential backlinking filter is looking at the number of internal pages that link to a page to determine whether it should be indexed. The more overall incoming backlinks a site has, the less strict the filter. That would explain why some sites have their primary pages indexed but not lower-level content pages. Those primary pages are probably linked from every page on the site through the top navigation. I'm also thinking the filter is taking into account pages that were popular in the previous index version, because of the 19 pages that were left originally, a few were frequently clicked on from certain search results. A few of my more popular pages didn't make it into the original 19, but the most popular ones did.
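To make that guess concrete, here is a small sketch that counts how many distinct internal pages link to each page on a site. This only illustrates my hypothesis; it is not a description of how Google actually works. The site map below is entirely hypothetical, and the function name is my own.

```python
from collections import defaultdict

def inbound_link_counts(site_links):
    """Map each page to the number of distinct other pages linking to it.

    site_links maps a page URL to the list of internal URLs it links to.
    """
    linked_from = defaultdict(set)
    for source, targets in site_links.items():
        linked_from[source]  # make sure pages with no inbound links still appear
        for target in targets:
            if target != source:  # ignore a page linking to itself
                linked_from[target].add(source)
    return {page: len(sources) for page, sources in linked_from.items()}

# Hypothetical site: navigation pages are linked from everywhere,
# one post is linked only from the homepage, and one post is orphaned.
SITE = {
    "/": ["/about", "/archives/2006", "/post-a"],
    "/about": ["/", "/archives/2006"],
    "/post-a": ["/", "/about", "/archives/2006"],
    "/post-b": ["/", "/about", "/archives/2006"],
}

if __name__ == "__main__":
    counts = inbound_link_counts(SITE)
    for page, count in sorted(counts.items(), key=lambda item: -item[1]):
        print(count, page)
```

If the filter idea holds, the pages at the top of that list would be the ones most likely to stay indexed, which is roughly what I'm seeing with my navigation and bottom content links.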

Watching Google's new indexing architecture is proving to be an interesting experience, trying to figure out how the big black box works.

Posted: June 16, 2006

about caradotcom

The personal website and blog of a 20-something web designer who works in a city by day and freelances by night (without a desk - long story).
