What is in your WBB robots.txt file?

  • What's in your WBB robots.txt file? I've had them for other scripts but I haven't included a robots.txt file for any of my WBB forums yet, what should be in it?

  • That depends on what you want to achieve. ;)

    LOL, I want A LOT of traffic... just like everyone else.


    On one forum I have the forum, Filebase, Lexicon and EasyLink, and I don't mind if search engines crawl my members' profiles.


    What are some basic things I should have in the robots.txt file?

  • If you don’t want to restrict anything on your site (as I understood your last post), you don’t need to add anything to the robots.txt.
    This file can only restrict things; it cannot force search engine robots to rate your content higher or anything like that…

  • This file can only restrict things; it cannot force search engine robots to rate your content higher or anything like that…

    I didn't think so. But I didn't know that it should just be used to restrict certain things from being crawled. What things should be disallowed with WBB?

  • There is no need to use a robots.txt if you don’t want to restrict anything. All directories that should not be directly viewed are already protected by an .htaccess setting.


    You don't think what? Do you think that search engines will rate you higher if you have a robots.txt? Then you should check what the purpose of a robots.txt actually is.

  • You don't think what?

    I didn't think having a robots.txt file would increase search engine ratings.


    What I'm trying to find out is what things are in my WBB/WCF folders that I should keep from being crawled. You're not saying search engines will crawl everything in my root folder unless I stop them from doing it through a robots.txt or .htaccess file are you?

  • I don't know why I bother using one really; all I have in mine is this. But I still like to have one present.


    User-agent: *
    Disallow: /wcf/config.inc


    You can block anything you feel robots don't need to spend time browsing; they don't get anything from crawling it anyway.


    I used to have a crawl-delay in mine as well, to limit bots that obey it (like Yahoo Slurp! and Google bots) from hitting the site in huge numbers, seeing as I'm on shared hosting... but I've since removed that.
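For reference, a minimal robots.txt along those lines might look like the sketch below (the Crawl-delay value of 5 seconds is purely illustrative, and not every crawler honours that directive):

    User-agent: *
    Disallow: /wcf/config.inc
    Crawl-delay: 5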

  • As I already said: all directories that should not be directly viewed are already protected by an .htaccess setting.


    So in short: You don’t need to do anything else here.


    @GTB: The config file does not need to be “protected” by a robots.txt because its content is never readable without direct access to the file system. There is no content generated by opening the file in the browser.


    Additionally: the robots.txt doesn’t really protect anything, because it is only respected by well-behaved bots. Bad bots simply ignore it.
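The "only good bots respect it" point can be demonstrated with Python's standard-library robots.txt parser: a compliant crawler parses the file and voluntarily skips disallowed paths, but nothing stops a client that never consults it. (The rules below mirror the example robots.txt quoted earlier in the thread, plus an illustrative Crawl-delay.)

```python
import urllib.robotparser

# Rules as a well-behaved crawler would fetch them from /robots.txt
rules = [
    "User-agent: *",
    "Disallow: /wcf/config.inc",
    "Crawl-delay: 5",
]

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# A compliant bot checks before fetching...
print(rp.can_fetch("*", "/wcf/config.inc"))  # False: path is disallowed
print(rp.can_fetch("*", "/index.php"))       # True: not restricted
print(rp.crawl_delay("*"))                   # 5: seconds to wait between requests

# ...but the check is voluntary: a bad bot simply never calls can_fetch().
```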

  • As I already said: all directories that should not be directly viewed are already protected by an .htaccess setting.

    Can you list what those directories are, for those running nginx instead of Apache?

  • I have the following configuration in nginx:

    Code
    location ~ ^(/cache|/wcf(/acp/be\.bastelstu\.wcf\.nodePush/|/templates|/attachments|/language|tmp|cache|log)) {
        return 403;
    }

    location ~ ^/([a-z]+/)?(templates|lib) {
        return 403;
    }
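For comparison, on Apache the same effect is achieved per directory with an .htaccess file. A minimal sketch (Apache 2.2 syntax; placing it in, say, /wcf/templates/ is just an illustrative choice):

    Code
    # Example /wcf/templates/.htaccess: deny direct web access to this directory
    Order deny,allow
    Deny from all

On Apache 2.4 the equivalent single directive is "Require all denied".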
