How Search Engines Index Websites
It's important to understand how search engines discover new content on
the web, as well as how they interpret the locations of these pages. One way
that search engines identify new content is by following links. Much like you
and I click through links to go from one page to the next, search engines
do much the same thing to find and index content, except they follow every
link they can find. If you want to make sure that search engines pick up your
new content, an easy thing you can do is make sure you have links pointing
to it. Another way for search engines to discover content is through an XML sitemap.
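As a rough illustration, an XML sitemap is just a file that lists the URLs you'd like crawled; a minimal one (the example.com addresses here are only placeholders) might look something like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2024-01-15</lastmod>
      </url>
      <url>
        <loc>https://www.example.com/about</loc>
      </url>
    </urlset>

Most search engines let you submit a sitemap like this through their webmaster tools, or you can point to it from your robots.txt file.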
And while this is generally a good thing, there are plenty of times that
you might have pages up that you don't want search engines to find. Think of
test pages, or members-only areas of your site that you don't want showing up
on the search engine results pages. To control how search engines crawl through
your website, you can set rules in what's called a robots.txt file. This is a file
that you or your webmaster can create in the root folder of your site, and when
search engines see it, they'll read it and follow the rules that you've set.
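To give a rough idea, a very simple robots.txt might look like the following; the /test/ and /members/ paths are made-up examples of areas you might want to keep crawlers out of:

    User-agent: *
    Disallow: /test/
    Disallow: /members/

    Sitemap: https://www.example.com/sitemap.xml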
You can set rules that are specific to different search engine crawlers (user agents), and you can specify which areas of your website they can and can't see. This can get a bit technical, and you can learn more about writing robots.txt rules by visiting robotstxt.org.

Again, once search engines discover your content, they'll index it by URL. A URL is basically the address of a web page on the Internet. It's important that each page on your site has a single, unique URL, so that search engines can differentiate that page from all the others.
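As a quick illustration, the pages of a hypothetical store (the example.com addresses are just placeholders) might each get their own distinct address, like this:

    https://www.example.com/shoes/
    https://www.example.com/shoes/running/
    https://www.example.com/shoes/running/blue-trail-runner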
The structure of these URLs can also help them understand the structure
of your entire website.

There are lots of ways that search engines can find
your pages, and while you can't control how the crawlers actually do their job,
by creating links and unique, structured URLs for them to follow, sitemaps
for them to read, and robots.txt files to guide them, you'll be doing
everything you can to get your pages in the index as fast as possible.