I wonder what the point is of masking an affiliate link, e.g. by turning it into a full JavaScript link?
You can disallow those links in robots.txt + make them nofollow.
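For example, if all the masked links live under a common path (the /go/ path here is just a placeholder), a robots.txt rule plus rel="nofollow" on the anchor covers both crawling and link equity:

User-agent: *
Disallow: /go/

<a href="/go/some-offer" rel="nofollow">Some offer</a>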
Why do some sites avoid using anchor links and instead create links on buttons/divs/etc. with JavaScript?
A month ago I relaunched a website with the TYPO3 CMS. Before that, the site ran on the Joomla CMS.
In the Joomla configuration, SEF (search-engine-friendly) URLs were disabled, so Google indexed the page URLs like this:
www.domain.de/index.php?com_component&itemid=123....
for example.
Now, a month after the TYPO3 relaunch, these links are still visible in Google because the URLs don't return a 404 error. That's because index.php also exists in TYPO3, and TYPO3 doesn't care about the additional query string/variables: it returns a 200 status code and shows the front page.
In Google Webmaster Tools it's possible to remove single URLs from the Google index, but that way I'd have to delete about 10,000 URLs manually...
My question is: is there a way to remove these old URLs from the Google index?
With this number of URLs there is only one sensible solution: implement proper 404 handling in your TYPO3 site, or better yet, redirect the old URLs to the same content placed in TYPO3.
You can use TYPO3's built-in handler, pageNotFound_handling (search for it in Install Tool > All configuration). It supports options like REDIRECT for redirecting to some page, or even USER_FUNCTION, which lets you run your own PHP script; check the description in the Install Tool.
You can also write a simple condition in TypoScript that checks whether typical Joomla parameters exist in the URL; that way you can easily return a custom 404 page. If you need a more sophisticated condition (for example, to redirect links that previously pointed to a gallery in Joomla to the new gallery in TYPO3), you can use a userFunc condition, which is probably the best option for SEO.
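As a minimal TypoScript sketch (assuming the old Joomla URLs can be recognized by an option=com_... parameter; the exact header syntax may differ between TYPO3 versions):

# If the request carries a Joomla-style "option" parameter...
[globalString = GP:option = com_*]
  # ...answer with a 404 status instead of rendering the front page
  page.config.additionalHeaders = HTTP/1.0 404 Not Found
[global]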
If these URLs share enough common indicators, you could match them with a rule in your virtual host or .htaccess so that Google runs into the correct error status.
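For example, if every old URL can be spotted by its Joomla-style query string, a mod_rewrite sketch like this (adjust the pattern to whatever your old URLs actually contain) would answer them with 410 Gone:

RewriteEngine On
# Match index.php requests whose query string looks like an old Joomla URL
RewriteCond %{QUERY_STRING} (^|&)(option=)?com_[a-z]+ [NC]
RewriteRule ^index\.php$ - [G,L]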
I wrote a Google Chrome extension to remove URLs in bulk in Google Webmaster Tools. Check it out here: https://github.com/noitcudni/google-webmaster-tools-bulk-url-removal.
Basically, it's a glorified for loop. You put all the URLs in a text file, one per line. For example:
http://your-domain/link-1
http://your-domain/link-2
Having installed the extension as described in the README, you'll find a new "choose a file" button.
Select the file you just created. The extension reads it in, loops through all the URLs, and submits them for removal.
I've converted all my URLs to SEF (search-engine-friendly) URLs.
But now I want to block access to the non-SEF versions of those URLs.
For example, you can currently reach www.example.com/article-1 via http://www.example.com/index.php?option=com_content&view=article&id=76&Itemid=113. I don't want that; I only want the page to be reachable at http://www.example.com/article-1.
I hope I've explained clearly what I need.
I don't think it's possible for the simple reason that Joomla always uses the non-SEF links internally. That's why they always work.
There are also links that are never converted to SEF links, because the user will not see them and Google will not index them, such as links used by AJAX scripts and similar things.
If you block non-SEF URLs in your .htaccess file, expect your site to break sooner rather than later. Don't blame the extension developer then :-)
I'm curious as to whether or not link shorteners like tinyurl, bitly, etc., affect backlinking in any way.
For instance, the gplus.to link shortener shortens your Google+ page URL. So that will change your Google+ link from:
https://plus.google.com/u/0/b/73418324312440134122432433432/posts
to
https://gplus.to/mycompanyname
Is this necessarily a bad thing for SEO backlinking purposes? Will Googlebot fail to "recognize" the shortened link, so that it doesn't aid my SERPs?
I suppose the same goes for everything: if I shorten a longer link to my company's website using something like bitly.com, does that effectively make it "not a backlink", as opposed to posting the entire link (http://www.company.com/products/metals/default.aspx)? Any guidance in this regard would be greatly appreciated!
It depends on the URL shortening service: if it uses an HTTP 301/302 redirect, Google will handle it fine. However, some search engines also use the words in the URL itself, so when
www.demo.com/how-to-search-better
changes to
www.bit.ly/12345
the URL no longer means anything to search engines, or to the user either.
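If you want to check what a given shortener actually does, a quick PHP probe (the short URL here is just the illustration from above) shows the status line and redirect target:

<?php
// Fetch only the response headers of the short link
$headers = get_headers('http://www.bit.ly/12345', 1);
echo $headers[0];              // e.g. "HTTP/1.1 301 Moved Permanently"
print_r($headers['Location']); // the long target URL (an array if there are several hops)
?>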
It would be great if you guys could shed some light on this; it has baffled me:
I was asked by a client if I could try to make the search term for his comedy night, "sketchercise", put his website at the top of the Google ranking. I simply changed the title tag in the header for the whole site from "Allnutt and Simpson" to "Allnutt and Simpson - Sketchercise # Ginglik - Sketch Duo". It did the trick, and the site now comes up top of the Google listing when you type in "sketchercise". However, it shows this very strange link:
http://www.allnuttandsimpson.com/index.php/videos/
This is the link to the Google search result too:
http://www.google.co.uk/search?sourceid=chrome&ie=UTF-8&q=sketchercise
That link is invalid; it doesn't make any sense. I guess it has something to do with the hash fragments and the AJAX-driven site, but before I changed the title tag, Google linked to the site fine using the # fragments. What is the deal with this slash?
The strangest part is that the valid URL for the videos page on that site is /index.php#vidspics; I have never used the word "videos" in a URL!
If anyone can explain the cause of this, or just help me stop it from happening, I'd be very grateful. I realise that this is an SEO question and I generally hate that stuff, but I hope you can see this is a bit of a strange case!
Just to compare: if you google "allnutt and simpson", it works just fine and links to the site and all of its pages as .php pages (my JS then converts them to hash fragments to keep things clean).
It may be because there is a folder called 'videos' among your hosted files; use an FTP client and check.
Google crawls every folder and file unless you tell it not to; read up on robots.txt to learn how to prevent indexing.
Also, ask Google to remove that result once you've solved this.
Finally, this behaviour is not related to the hash fragments; those are just references used by JavaScript to display the appropriate content on your page.
Not sure why it's posted like this, but the only way to stop that page from appearing is to use a Google Webmaster account for this website and make sure the crawlers can't find that link anymore. The alternative is to have the site admin output the tag <meta name="robots" content="noindex, nofollow"> in the header when isset($_REQUEST['videos']) is true.
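In PHP that check amounts to something like this (a sketch; "videos" is the request parameter guessed at above):

<?php
// Emit a noindex tag only for the unwanted "videos" variant of the page
if (isset($_REQUEST['videos'])) {
    echo '<meta name="robots" content="noindex, nofollow">';
}
?>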
The slash in the address is the rewritten form of www.allnuttandsimpson.com/index.php?=videos. You can have the web server turn the PHP parameters into slashes to make the links look pretty.
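With Apache's mod_rewrite, that mapping could look like this (a sketch only; the "page" parameter name is hypothetical):

RewriteEngine On
# Internally rewrite the pretty path back to the PHP parameter form
RewriteRule ^videos/?$ index.php?page=videos [L,QSA]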
The best option for correct results is to create a sitemap for that site and submit it at https://www.google.com/webmasters/tools/. You will need verified access.
Oh, I forgot: the sitemap tells Google about all the pages you want it to list, so use it for the major pages, like those in the main menu. Removing links you don't want requires a robots.txt in the root directory of the site.
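A minimal sitemap.xml looks like this (these URLs are placeholders for your own main-menu pages):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.php</loc></url>
</urlset>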
How do I implement hash-fragment URLs so that search engines still crawl my pages? Here's an example from Twitter:
In search results it's:
http://twitter.com/username
When I click on it, it redirects me to
http://twitter.com/#!/username
How does Twitter know when to redirect? Relying on the User-Agent header doesn't seem like such a good idea.
Twitter isn't optimizing its site for SEO; it has a special arrangement with Google, so I wouldn't use it as an example. Google does support hash-bang URLs, which you can read about here: https://developers.google.com/webmasters/ajax-crawling/docs/specification.
The main idea is that crawlers convert a URL like http://www.example.org/#!/my-url into http://www.example.org/?_escaped_fragment_=/my-url. When Google encounters such a URL, it makes a GET request to that alternative URL and uses the returned content to index the original one.
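On the server side, this means answering the _escaped_fragment_ request with a pre-rendered HTML snapshot. A minimal PHP sketch (the snapshots/ directory is a hypothetical store of pre-rendered pages):

<?php
// Serve a static HTML snapshot when the crawler asks for the
// _escaped_fragment_ form of a #! URL
if (isset($_GET['_escaped_fragment_'])) {
    $fragment = $_GET['_escaped_fragment_']; // e.g. "/my-url"
    // basename() keeps the lookup inside the snapshots directory
    readfile('snapshots/' . basename($fragment) . '.html');
    exit;
}
?>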