I have a few links on Google that look like domain.com/results.php?name=a&address=b
The results page and its parameters have now been renamed, and I need to remove the existing links from Google etc.
I tried
User-agent: *
Disallow: /results.php
in robots.txt, and then in Google Webmaster Tools I added the URL to be removed:
domain.com/results.php
It says it was removed successfully; however, when I search Google for domain.com, the existing URLs with parameters are all still there.
What am I doing wrong? There are quite a few links so I need a way to deal with all of them at once instead of one by one.
Thanks
You could put a page at results.php and have it return a 301 redirect back to your home page.
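<?php
// Tell crawlers the old URL has moved permanently.
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.Your-Website.com");
exit; // stop here so nothing else is sent after the redirect
?>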
As Google recrawls your site, the old pages will disappear. You may find this works faster than just having removed the old page.
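One caveat, assuming the Disallow rule from the question is still in robots.txt: Disallow is a prefix match, so it also covers every results.php?name=... URL, and a crawler that honors it will not fetch those URLs and therefore never sees the 301. A rough way to check this is Python's standard urllib.robotparser (domain.com is the placeholder from the question, and the variable names are only illustrative):

```python
from urllib.robotparser import RobotFileParser

# robots.txt rules exactly as posted in the question
ROBOTS_TXT = """\
User-agent: *
Disallow: /results.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

old_url = "http://domain.com/results.php?name=a&address=b"
home_url = "http://domain.com/"

# can_fetch() returns False when a rule-abiding crawler may not request the URL
blocked_old = not parser.can_fetch("*", old_url)
blocked_home = not parser.can_fetch("*", home_url)
print(blocked_old, blocked_home)
```

If blocked_old comes out true, it is probably worth removing the Disallow line once the redirect is in place, so that the old URLs can actually be recrawled.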
I am trying to get the Next.js Link component to redirect correctly to websites whose URL has no http:// or https:// in it. I store links in a database based on what the user writes, and some of them don't include http or https. For some reason the Link component appends these links to my current website URL, for example:
google.com will redirect to mywebsite.com/google.com
Is there any way to resolve this issue?
I have tried searching online with no luck.
If I am on a URL such as:
http://example/
And I write a link like this:
<a href="foo">hi</a>
Then clicking the link will go to:
http://example/foo
So if your link looks like:
<a href="google.com">hi</a>
Then the expected behavior is for that link to go to:
http://example/google.com
So this is all exactly as expected. If you have links stored in the database that were supposed to include https://, then you need to add that before writing your links.
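The resolution rule described above is not specific to Next.js; any URL library applies it. Here is a minimal sketch in Python (not Next.js code) that reproduces the behavior and shows one way to add a missing scheme before a stored link is rendered. The helper name ensure_scheme and the choice of https as the default are assumptions made for illustration only:

```python
import re
from urllib.parse import urljoin

# Matches a leading scheme such as "http://" or "https://"
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def ensure_scheme(link: str, default_scheme: str = "https") -> str:
    """Return the link unchanged if it already has a scheme, else prepend one."""
    link = link.strip()
    if SCHEME_RE.match(link):
        return link
    if link.startswith("//"):
        # Protocol-relative link: only the scheme itself is missing
        return f"{default_scheme}:{link}"
    return f"{default_scheme}://{link}"


page = "http://example/"

# Without a scheme, the link is treated as a path relative to the current page
print(urljoin(page, "google.com"))

# With a scheme, the link is absolute and replaces the current page's URL
print(urljoin(page, ensure_scheme("google.com")))
```

The same check could be applied once when the user submits the link, so that the database only ever holds absolute URLs.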
I want to remove "index.html" from the homepage URL of a Weebly site without an .htaccess file. Please help me resolve this problem.
Weebly does not currently provide an option to redirect /index.html to the root domain URL, nor does it give you the access needed to make that change yourself. However, links to your home page (at least on your own website) should point to domain-root.com rather than /index.html, so you should be OK there.
Keep in mind that index.html is a file that exists as the home page for the folder your website pages live in, and you can't remove it (at least on Weebly).
So, the thing to do would be to submit it to Weebly as a Feature Request and request that they make the necessary changes on their end, for the sake of ALL Weebly users! ;)
https://community.weebly.com/t5/Vote-on-Features/idb-p/IdeaExchange
There is something very weird happening on a website I'm working with. When I search Google for a product from the website, the result is a page that doesn't exist, yet it doesn't return a 404 error.
The "page.html" does exist, but it is not on the path that Google found, the URL is somehow being created out of nowhere, and since it is inside Joomla, it is causing some visual errors.
I've done some research, and a lot of people have this kind of error, but normally the page is a 404 or a duplicated version. In my case, the page opens normally; the only thing is that "page.html" is somehow being put on a path that is not correct. I've already checked inside the folders and there is no duplicated page inside them.
What could be happening?
If this is a website you host, you can mark the page as noindex (nofollow only tells crawlers not to follow the links on a page; it does not keep the page itself out of the index). If not, Google Search has a beta feature called "About this Result" in which you can leave feedback telling Google the page does not exist. Also, if the URL returns a 404, web crawlers will eventually notice on their own.
I am trying to get robots.txt to work so that search engines start indexing my website and show meta info like descriptions etc.
However, I get this message:
A description for this result is not available because of this site's robots.txt – learn more.
Here is what my robots.txt looks like.
# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
#
# To ban all spiders from the entire site uncomment the next two lines:
User-agent: *
Disallow: /tags/*
Disallow: /users/*
What do I need to change?
This is a Rails 4 application hosted on Heroku, and robots.txt is in the public directory of the Rails repository.
First of all, it is not compulsory to use a robots.txt file! You only need one if you don't want search engines to crawl specific pages or directories of your website.
In your case, you are blocking search engines from crawling the /tags and /users directories at the root of the site. Any page inside those directories will show this message.
I also recommend verifying your website in Google Webmaster Tools. You can test your robots.txt file from there.
Try removing some asterisks:
User-agent: *
Disallow: /tags/
Disallow: /users/
Meanwhile, providing a location to your site map might be helpful too:
Sitemap: https://www.yoursite.com/sitemap.xml
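To see which URLs the asterisk-free rules actually cover, one option is a quick local check with Python's standard urllib.robotparser. This is only a sketch: the sample paths (/tags/rails, /users/1, /posts/1) are made up, www.yoursite.com is the placeholder used above, and Python's parser does plain prefix matching, so it says nothing about how a particular search engine treats the * form.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above, without the trailing asterisks
RULES = """\
User-agent: *
Disallow: /tags/
Disallow: /users/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str) -> bool:
    """True if a crawler that honors the rules may fetch this path."""
    return parser.can_fetch("*", "http://www.yoursite.com" + path)


# Hypothetical paths, chosen only to exercise the rules
for path in ["/", "/posts/1", "/tags/rails", "/users/1"]:
    print(path, allowed(path))
```

Pages that come out as not allowed are the ones likely to show the "description is not available" message in search results.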
On my Joomla 1.5.18 site I enabled login. When I click login, the page I get sent to is NOT styled with CSS. If I log in, it redirects to the home page, which is no longer styled either.
It looks like it is recursively appending stuff to the URL incorrectly.
http://www.myjoomlasite.org/index.php/index.php/login
If I click on the home page or login links, it keeps putting more and more index.php entries in the URL, and sometimes on the end. The following is what I get when I try to go to a JEvents menu item.
http://www.myjoomlasite.org/index.php/index.php/index.php/index.php/upcomingevents/month.calendar/2010/06/09/index.php
Anyone have any idea why this is happening? I don't know what to search for on Google apparently, and none of the Joomla! books I have address this.
I figured it out: I had turned on Search Engine Friendly URLs in SEO Settings under Global Configuration. Turning this back off fixed the problem. Now I guess the next question will be how to get the Search Engine Friendly URLs to work again.
Make sure you link to stylesheets and images using a link that starts with a leading slash and therefore counts from the root.
It is the browser that evaluates the URL for those resources, based on the URL of the currently viewed HTML page. Never use relative links for these resources.
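The effect can be reproduced outside the browser. The sketch below uses Python's urljoin, which follows the same resolution rules, with the login URL from the question as the current page. The stylesheet path templates/mytemplate/css/template.css is hypothetical and only stands in for whatever the template actually links to:

```python
from urllib.parse import urljoin

# URL of the page being viewed, taken from the question
page = "http://www.myjoomlasite.org/index.php/index.php/login"

# Hypothetical stylesheet path, for illustration only
css_path = "templates/mytemplate/css/template.css"

# Relative link: resolved against the "directory" of the current page,
# so the extra index.php segments end up in the stylesheet URL
relative = urljoin(page, css_path)

# Leading slash: resolved from the site root, whatever the page URL is
rooted = urljoin(page, "/" + css_path)

print(relative)
print(rooted)
```

With Search Engine Friendly URLs turned on, the page URL gains path segments, which is why only the root-relative form keeps pointing at the real file.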