    jake02
    Joined: Nov 26, 2005
    # Posts: 2


    Posted: 2005-Nov-26 11:39

    i use htaccess to rewrite from /product_info/product_id=00 to:
    Product+Category/Product_Brand.html

    the problem:
    yahoo and google are indexing pages THAT DO NOT EXIST and never have.

    they're indexing VALID category names, but they are also joining them with other categories. for instance:

    Old+Antiques/1920s.html (valid)
    the spiders find this no problem.

    but for the past month, they have been finding URLs like this:
    Old+Antiques/1920s__1930s.html
    what they are doing is merging one category with another, and this is producing a 200/OK response.

    i can't figure out why.
    could it be in my htaccess or my code?

    if any part of the code needs to be posted, please let me know.

    this is from an oscommerce-based website, but i have already hit the osc forums and it has been determined to not be an oscommerce issue.

    the htaccess and rewrite script i use are not stock oscommerce.
    i have sent a bot to pull every page linked on my site, and none of these URLs show up anywhere.

    for reference, here are my htaccess rules:
    RewriteEngine on
    RewriteBase /
    RewriteRule ^([^/]*).html$ $1.php?%{QUERY_STRING} [NC]
    RewriteRule ^/?(category)/([^/]*).html$ index.php?cPath=$2&%{QUERY_STRING} [NC]

    any suggestions as to what i should look for??
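One thing worth checking against the rules as posted: the `([^/]*)` group accepts any run of characters that contains no slash, so a merged name gets through the rewrite exactly as a real one does. The sketch below is only a Python translation of the second pattern for illustration, not the Apache engine itself. It assumes `category` stands in for the real directory name, and `cpath_for` is a hypothetical helper name.

```python
import re

# Python translation of the second RewriteRule pattern from the post.
# re.IGNORECASE mirrors the [NC] flag. Assumption: "category" is a
# placeholder for the real directory name used on the site.
CATEGORY_RULE = re.compile(r"^/?(category)/([^/]*).html$", re.IGNORECASE)

def cpath_for(path):
    """Return the value the rule would pass as cPath, or None if no match."""
    match = CATEGORY_RULE.match(path)
    return match.group(2) if match else None

print(cpath_for("category/1920s.html"))         # 1920s
print(cpath_for("category/1920s__1930s.html"))  # 1920s__1930s
# The dot before "html" is unescaped, so it matches any character:
print(cpath_for("category/1920sXhtml"))         # 1920s
```

If this reading is right, the rewrite layer never rejects a name. Whatever arrives as `cPath` is handed to index.php, and only the script can decide whether the category exists.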



    g1smd
    Staff
    Joined: Jul 28, 2002
    # Posts: 10440


    Posted: 2005-Nov-26 18:47

    Use Xenu LinkSleuth to generate a link report for your site, then look at it very carefully.

    Use WebBug to try URLs that should not exist and see what response you get. Use a browser to see what content is returned.

    You might need to modify something to either force an HTTP 404 response, or to add a <meta name="robots" content="noindex"> tag to the head of such pages.
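If a header-checking tool is not to hand, the same test can be scripted. This is a minimal sketch using only the Python standard library. The helper name `http_status` is made up for illustration, and any URL you pass to it is your own.

```python
import urllib.error
import urllib.request

def http_status(url, timeout=10):
    """Fetch url with a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # urlopen follows redirects, so this is the status of the
            # final page, not of any 301/302 hop along the way.
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is on it.
        return err.code
```

Calling `http_status` on a URL that should not exist, such as one of the merged category names, shows whether the server answers 200 or 404. It does not show the page content, so a look in a browser is still worthwhile.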



    jake02
    Joined: Nov 26, 2005
    # Posts: 2


    Posted: 2005-Nov-26 23:53

    i will try xenu linksleuth out shortly.

    "Use WebBug to try URLs that should not exist and see what response you get. Use a browser to see what content is returned."

    i've been using wannabrowser, which is essentially the same - isn't it?

    "You might need to modify something to either force an HTTP 404 response, or to add a <meta name="robots" content="noindex"> tag to the head of such pages."

    all of my pages are dynamically generated from an sql database, the htaccess rewrites them to be static.

    how would i go about forcing a 404 for these types of pages?
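Since every page is generated by the script, the 404 has to come from the script too. The sketch below shows only the decision logic, in Python for illustration rather than in the site's own code. `status_for` and the category names are hypothetical, and on the real site the lookup would presumably be a query against the category table.

```python
def status_for(cpath, known_categories):
    """Return 200 only when cpath names an existing category exactly."""
    # cpath is the value the rewrite rule hands to the script.
    # Merged names such as "1920s__1930s" are not in the set,
    # so they fall through to 404.
    if cpath in known_categories:
        return 200
    return 404

known = {"1920s", "1930s"}  # hypothetical category names
print(status_for("1920s", known))         # 200
print(status_for("1920s__1930s", known))  # 404
```

The script would have to send the 404 status before it writes any page output, otherwise the server has already committed to a 200.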



    dirty_shame
    Joined: Aug 28, 2005
    # Posts: 191


    Posted: 2005-Nov-28 16:23

    Just a suggestion: instead of 404ing them, you might want to permanently redirect the bad pages using mod_rewrite or mod_alias in your .htaccess file. It will accomplish the same thing, maybe more elegantly.

    You'd have to add the specific conditions for your "bad pages" (the ones the engines are indexing) to the following base code:

    mod_rewrite would be something like:
    RewriteRule ^badpage.*$ YourURLtotheGoodPages [L,R=301]

    mod_alias would be like:
    RedirectMatch 301 /badpage\.php(.*) YourURLtotheGoodPages

    Eventually the engines will drop the bad page links, and at least they will create no new ones in the meantime. I can see that preserving the "good" query strings and redirecting the "bad" ones in your case might be challenging. I hate to work with oscommerce stuff myself, but I hope this helps you.

    Sorry if the code above looks screwed-up. I never have known how to post code cleanly in these forums. It truncates just about everything into drivel.

    [ Message was edited by: dirty_shame 11/28/2005 08:41 am ]




    © 1995  ·  iWeb, Inc  ·  DBA JimWorld Productions