Harnessing the Power of Robots

By: artie frances

Submitted 2011-01-31 22:45:56

Once we have a website up and running, we need to make sure that all visiting search engines can access all the pages we want them to look at.

Sometimes, we may want search engines not to index certain parts of the site, or even to ban particular search engines from the site altogether.

This is where a simple little text file called robots.txt comes in. In its most basic form it is just two lines long.

Robots.txt resides in your website's main directory (on many Linux hosting setups this is the /public_html/ directory), and looks something like the following:

User-agent: *
Disallow:

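The first line names the "bot" the rule applies to (an asterisk matches every bot). The second line controls whether that bot is allowed in, or which parts of the site it is not allowed to visit. An empty Disallow: line blocks nothing, so the file above lets every bot visit every page.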
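
To see how a bot might read these two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The /private/ directory, the bot name "examplebot" and the www.example.com URLs are assumptions made up for illustration; they are not part of the file shown above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every bot is welcome, except under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in ("http://www.example.com/index.html",
                 "http://www.example.com/private/notes.html"):
        print(page, is_allowed(ROBOTS_TXT, "examplebot", page))
```

Swapping ROBOTS_TXT for the two-line file above (with its empty Disallow: line) should make every page come back as allowed.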

If you want to handle multiple "bots", then simply repeat the above lines for each one.
For example:

User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /

This will allow Google (user-agent name GoogleBot) to visit every page and directory, while at the same time banning Ask Jeeves from the site completely.
To find a "reasonably" up-to-date list of robot user-agent names, visit http://www.robotstxt.org/wc/active/html/index.html
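
The two-bot file can be checked the same way before it goes live. The sketch below again uses Python's standard urllib.robotparser; the page URL and the extra bot name "examplebot" are assumptions added for illustration, and real search engines may match user-agent names slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The two-bot robots.txt from the example above.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /
"""

# Hypothetical page used for the check.
PAGE = "http://www.example.com/articles/index.html"


def build_parser(robots_txt):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def report(parser, user_agents, url):
    """Map each user-agent name to whether it may fetch url."""
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for agent, allowed in report(parser, ["googlebot", "askjeeves", "examplebot"], PAGE).items():
        print(agent, "allowed" if allowed else "blocked")
```

Note that a bot named in neither block (here "examplebot") is not restricted at all by this file, since there is no "User-agent: *" block to catch it.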

Even if you want to allow every robot to index every page of your site, it's still very advisable to put a robots.txt file on your site. It will stop your error logs filling up with "not found" (404) entries from search engines requesting a robots.txt file that doesn't exist.
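
If you would rather script this step, a minimal sketch in Python follows. The function name write_allow_all_robots and the /home/example/public_html path are assumptions for illustration; use whatever your site's main directory really is. The sketch leaves an existing robots.txt untouched rather than overwriting it.

```python
from pathlib import Path

# The "allow everything" file described in this article.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def write_allow_all_robots(site_root):
    """Create robots.txt in site_root unless one is already there.

    Returns the path of the robots.txt file.
    """
    path = Path(site_root) / "robots.txt"
    if not path.exists():
        path.write_text(ALLOW_ALL, encoding="ascii")
    return path


if __name__ == "__main__":
    # Hypothetical site directory; change this to your own.
    print(write_allow_all_robots("/home/example/public_html"))
```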
