I've converted all my URLs to SEO-friendly URLs.
But I want to block access through the non-SEO-friendly URLs.
For example, www.example.com/article-1 can also be reached via http://www.example.com/index.php?option=com_content&view=article&id=76&Itemid=113. I don't want that. I want the page to be reachable only at http://www.example.com/article-1
I hope I've explained clearly what I need.
I don't think it's possible for the simple reason that Joomla always uses the non-SEF links internally. That's why they always work.
Also, there are links which are not converted to SEF links because the user will not see them and Google will not index them, such as links used by AJAX scripts.
If you block non-SEF URLs in your .htaccess file, expect your page to break sooner rather than later. Don't blame the extension developer then :-)
I'm developing a SPA web app that will support several languages. It is built with AngularJS, and I am using angular-translate to provide i18n.
But I am struggling a little with how the URL structure should look. I do not plan on using either gTLDs or ccTLDs, so that leaves me with three options.
Use query params: ?locale=en-us
Use url paths: /en-us/page
Store the chosen locale in localStorage or a cookie
The first option is a no-go according to Google's guidelines for web apps SEO. So that leaves me with the last two options.
I have a hard time deciding which is more beneficial, though I am inclined to believe that using url paths would probably be more crawler friendly.
P.S: Not sure if this is the best place to ask such a question either.
The second option is your safest bet: according to https://webmasters.stackexchange.com/questions/59652/what-happens-if-i-try-to-set-a-cookie-on-a-bot, cookies are ignored. You can test this yourself by going to the Google Console and fetching your website.
As of now most crawlers ignore cookies and DO NOT execute JavaScript. This means that they usually just download the html and make their judgements from there.
Some developers get around the no javascript problem by pre-rendering parts of their content. I haven't done it personally but you might want to check out https://prerender.io/
Edit
As rolandjitsu mentioned, Google crawls and executes JavaScript content.
You should go with the second option: provide the language tag (and, optionally, region subtags) in the URL path as the first segment.
For the simple reason that it allows you, visitors, and bots to link to specific translations.
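A minimal sketch of the path-prefix approach, written in Python only to illustrate the idea (the app in the question is AngularJS). The locale list, the default locale, and the function name split_locale are assumptions made up for this example.

```python
from typing import Tuple

# Assumed, illustrative values; a real app would load these from its config.
SUPPORTED_LOCALES = {"en-us", "en-gb", "de-de"}
DEFAULT_LOCALE = "en-us"


def split_locale(path: str) -> Tuple[str, str]:
    """Split '/en-us/page' into ('en-us', '/page').

    If the first path segment is not a supported locale, fall back to the
    default locale and leave the path unchanged.
    """
    segments = path.lstrip("/").split("/", 1)
    first = segments[0].lower()
    if first in SUPPORTED_LOCALES:
        rest = "/" + (segments[1] if len(segments) > 1 else "")
        return first, rest
    return DEFAULT_LOCALE, "/" + path.lstrip("/")
```

Because the locale lives in the path, each translation has its own linkable address, which is the point made above.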
So, I've set up a site with Search Engine Friendly URLs set to YES and I've set up page aliases. My main URLs are fine, but for some reason those pages can also be accessed through some weird links like mysite.com/component/content/article/17-category/61-article-name.html instead of just mysite.com/category/article-name.html, which is how I want it and how I have it in my sitemap.
Why is Joomla generating these redundant URLs, and how do I get rid of them (so that when somebody clicks on them in Google they get a 404)?
Thanks
PS. answer on question How to clean up Joomla! URLs? does not help me.
As per http://magazine.joomla.org/issues/issue-june-2013/item/1054-duplicate-pages-joomla-causes-errors-solutions,
I used a 301 redirect in the .htaccess file to redirect away from the duplicated URLs.
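The mapping behind such a redirect could look roughly like the sketch below. It is written in Python rather than as an .htaccess rule, purely to show the pattern, and it assumes every duplicate follows the /component/content/article/<id>-<category>/<id>-<alias> shape from the question; the function name canonical_target is invented for this example.

```python
import re
from typing import Optional

# Assumed shape of the duplicate URLs, taken from the example in the question.
_DUPLICATE = re.compile(
    r"^/component/content/article/\d+-(?P<category>[^/]+)/\d+-(?P<alias>[^/]+)$"
)


def canonical_target(path: str) -> Optional[str]:
    """Return the clean path a duplicate should 301 to, or None if no match."""
    match = _DUPLICATE.match(path)
    if match is None:
        return None
    return "/{}/{}".format(match.group("category"), match.group("alias"))
```

A request for /component/content/article/17-category/61-article-name.html would then be redirected to /category/article-name.html, and any other path is left alone.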
A month ago I relaunched a website on the TYPO3 CMS. Before that, the site ran on the Joomla CMS.
In the Joomla config, SEO links were disabled, so Google indexed the page URLs like this:
www.domain.de/index.php?com_component&itemid=123....
for example.
Now, a month later (after the TYPO3 relaunch), these links are still visible in Google because the URLs don't return a 404 error. That's because "index.php" also exists in TYPO3, and TYPO3 doesn't care about the additional query string variables: it returns a 200 status code and shows the front page.
In Google Webmaster Tools it's possible to delete single URLs from the Google index, but that way I would have to delete about 10000 URLs manually...
My Question is: Is there a way to remove these old Urls from the Google Index?
Greetings
With this number of URLs there is only one sensible solution: implement proper 404 handling in your TYPO3 or, even better, redirects to the same content placed in TYPO3.
You can use TYPO3's handler, called pageNotFound_handling (search for it in Install Tool > All configuration). It supports options like REDIRECT for redirecting to some page, or even USER_FUNCTION, which lets you use your own PHP script; check the description in the Install Tool.
You can also write a simple condition in TypoScript that checks whether typical Joomla parameters exist in the URL; that is an easy way to return a custom 404 page. If you need a more sophisticated condition (for example, you want to redirect links which previously pointed to some gallery in Joomla to the new gallery in TYPO3), you can use a userFunc condition, which would probably be the best option for SEO.
If these URLs share an acceptable number of common indicators, you could redirect them with a rule in your virtual host or .htaccess so that Google runs into the correct error message.
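To make "common indicators" concrete, here is a rough Python sketch of the check such a rule or condition would perform. It is not TypoScript or .htaccess syntax; the indicators (an Itemid parameter, an option=com_... value, or a bare com_... key as in the question's example) and the function names are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def is_legacy_joomla_url(url: str) -> bool:
    """Guess whether a URL is an old non-SEF Joomla link."""
    parts = urlsplit(url)
    if not parts.path.endswith("index.php"):
        return False
    # keep_blank_values keeps value-less keys such as 'com_component'.
    params = parse_qs(parts.query, keep_blank_values=True)
    keys = {key.lower() for key in params}
    if "itemid" in keys:
        return True
    if any(value.startswith("com_") for value in params.get("option", [])):
        return True
    return any(key.startswith("com_") for key in keys)


def status_for(url: str) -> int:
    """Answer legacy links with 404 instead of a 200 front page."""
    return 404 if is_legacy_joomla_url(url) else 200
```

A plain index.php?id=5 request carries none of these indicators, so it would still be served normally.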
I wrote a Google Chrome extension to remove URLs in bulk in Google Webmaster Tools. Check it out here: https://github.com/noitcudni/google-webmaster-tools-bulk-url-removal.
Basically, it's a glorified for loop. You put all the urls in a text file. For example,
http://your-domain/link-1
http://your-domain/link-2
Having installed the extension as described in the README, you'll find a new "choose a file" button.
Select the file you just created. The extension reads it in, loops through all the URLs, and submits them for removal.
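The "glorified for loop" could be pictured like this. The Python sketch below is not the extension's code and does not talk to Google; the submit callback is a hypothetical placeholder for whatever actually files the removal request.

```python
from typing import Callable, Iterable, List


def read_urls(lines: Iterable[str]) -> List[str]:
    """Return the non-empty lines, stripped of surrounding whitespace."""
    return [line.strip() for line in lines if line.strip()]


def submit_all(urls: Iterable[str], submit: Callable[[str], None]) -> int:
    """Call the (hypothetical) submit callback once per URL; return the count."""
    count = 0
    for url in urls:
        submit(url)
        count += 1
    return count
```

With a text file in the format above, read_urls(open("urls.txt")) would yield the list to loop over, where urls.txt is an assumed file name.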
I would like to hide the webpage name in the url and only display either the domain name or parts of it.
For example:
I have a website called "MyWebSite". The url is: localhost:8080/mywebsite/welcome.xhtml. I would like to display only the "localhost:8080/mywebsite/".
However if the page is at, for example, localhost:8080/mywebsite/restricted/restricted.xhtml then I would like to display localhost:8080/mywebsite/restricted/.
I believe this can be done in the web.xml file.
I believe you want URL rewriting. Check out this link: http://en.wikipedia.org/wiki/Rewrite_engine. There are many approaches to URL rewriting, and you need to decide which is appropriate for you. Some of the approaches do make use of the web.config file.
You can do this in several ways. The one I see most is to have a "front door" called a rewrite engine that parses the URL dynamically to internally redirect the request, without exposing details about how that might happen as you would see if you used simple query strings, etc. This allows the URL you specify to be digested into a request for a master page with specific content, instead of just looking up a physical page at that location to serve.
The StackExchange sites do this so that you can link to a question in a semi-permanent fashion (and thus can use search engines with crawlers that log these URLs) without them having to have a real page in the file system for every question that's ever been asked (we're up to 9,387,788 questions as of this one).
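As a toy illustration of the "front door" idea, the Python sketch below maps the public URLs from the question to the pages that would be served internally. A real setup would do this in the server or a rewrite filter, not in application code like this; the table contents come from the question, while the names REWRITE_RULES and resolve are invented here.

```python
# Public path -> internal page, using the examples from the question.
REWRITE_RULES = {
    "/mywebsite/": "/mywebsite/welcome.xhtml",
    "/mywebsite/restricted/": "/mywebsite/restricted/restricted.xhtml",
}


def resolve(public_path: str) -> str:
    """Return the internal page for a public path.

    The visitor's address bar keeps showing public_path; only the server
    knows which page was actually rendered. Unknown paths pass through.
    """
    return REWRITE_RULES.get(public_path, public_path)
```

The key property is that the forward happens internally, so the page name never appears in the browser.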
I am trying to implement Twitter's OAuth in my CodeIgniter web application. The callback URL is /auth/, so once you have authenticated with Twitter you are taken to /auth/?oauth_token=SOME-TOKEN.
I want to keep the nice clean URLs the framework provides, using the /controller/method/ style, but I want to enable query strings as well. The only parameter will ever be oauth_token, so it's OK if it has to be hard-coded.
Any ideas?
I have tried tons of the things people are saying to do, but none work :(
PS: I'm using the .htaccess method of URL rewriting.
There are several ways to handle this.
Most people, and Elliot Haughin's Twitter lib, extend the CI_Input library with a MY_Input library that sets allow_query_strings to true.
You will also need to add ? to the allowed characters in config/config.php and set $config['uri_protocol'] to PATH_INFO.
see here: Enable GET in CodeIgniter
Codeigniter Reactor lets you access $_GET directly or via $this->input->get(). You don't need to use MY_Input or even change your config.php. This method leaves the query string in the URL, however.
I used a hacked index.php to recognise users coming back from Twitter, check for valid and safe values, then redirect them to a CodeIgniter-friendly URL.
It may not be to everyone's taste, but I preferred it over allowing query strings throughout the entire application for the sake of one particular circumstance.
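The logic of that index.php hack can be sketched as follows. This is Python, not the PHP that was actually used, and the allowed token characters and the /auth/callback/ target segment are assumptions chosen for the example.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumed whitelist: letters, digits, underscore and hyphen only.
_SAFE_TOKEN = re.compile(r"[A-Za-z0-9_-]+")


def friendly_redirect(request_uri: str) -> Optional[str]:
    """Turn '/auth/?oauth_token=X' into a segment-style URL, or return None.

    None means "not a Twitter callback, or the value looks unsafe", in
    which case the request would continue through the framework untouched.
    """
    parts = urlsplit(request_uri)
    if parts.path.rstrip("/") != "/auth":
        return None
    tokens = parse_qs(parts.query).get("oauth_token", [])
    if len(tokens) != 1 or _SAFE_TOKEN.fullmatch(tokens[0]) is None:
        return None
    # '/auth/callback/' is a hypothetical controller/method path.
    return "/auth/callback/" + tokens[0]
```

Only the one callback path is inspected, so query strings stay disabled everywhere else in the application.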