SEO URLs to dynamic URLs downgrade

I am using a phpBB forum with an SEO plugin, which turned all my dynamic URLs such as "viewtopic.php?t=1234" into SEO URLs such as "/super-jackpot-t821.html". I was happy with it.
But now the problem is that I have moved hosts, moved phpBB to a subfolder and upgraded to the latest phpBB. The plugin has stopped working, and all the URLs are already indexed by Google, Yahoo etc.
So I was wondering: is it possible to 301 redirect the SEO URLs back to the normal URLs? Maybe by picking the trailing number (821) out of the SEO URL using .htaccess and turning it back into viewtopic.php?t=821?
Thanks.

Here's an .htaccess guide I found:
http://www.garnetchaney.com/htaccess_tips_and_tricks.shtml
To match 0 to 9999, the regex should be ^[0-9]{1,4}$
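The mapping asked about in the question can be sketched outside the web server to check the pattern before putting it into a rewrite rule. This is a minimal Python sketch, not the plugin's own logic: it assumes every SEO URL ends in "-t" followed by the topic number and ".html", as in the example above. The function name and the forum_base parameter (standing in for the new subfolder, whose name the question does not give) are made up for illustration.

```python
import re

# Topic id in SEO URLs of the assumed form "/<slug>-t<digits>.html".
SEO_TOPIC = re.compile(r"-t([0-9]+)\.html$")


def seo_to_dynamic(path, forum_base=""):
    """Return the viewtopic.php URL for an SEO path, or None if it does not match.

    forum_base is a hypothetical prefix for the new subfolder, e.g. "/forum".
    """
    match = SEO_TOPIC.search(path)
    if match is None:
        return None
    return f"{forum_base}/viewtopic.php?t={match.group(1)}"
```

Calling seo_to_dynamic("/super-jackpot-t821.html", "/forum") gives the target that a 301 redirect would point to. Unlike ^[0-9]{1,4}$, the pattern here is not limited to four digits, on the assumption that topic ids can grow past 9999.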

Related

Is PHP Routing with similar_text bad for search engines?

My website should only have one index.php that checks the requested URL and displays the right content via include(...).
Right now I am using similar_text for URLs that don't exist on the website. Then the most similar path should be chosen.
But I heard that similar URLs that give the same content aren't good for search engines.
So does it have a bad effect on search engines like Google?
All URLs with the same content are counted as duplicates.
You should return a 404 Not Found header for all URLs which do not match yours exactly.
header('HTTP/1.0 404 Not Found');
echo "<h1>Error 404 Not Found</h1>";
For more details about Google's attitude, see the documentation.
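The exact-match approach from this answer can be sketched as a small lookup. This is an illustrative Python version rather than the asker's PHP; the PAGES table and the resolve function are made-up names, and a real index.php would include a file instead of returning a string.

```python
# Hypothetical page table: exact request path -> content to display.
PAGES = {
    "/": "Home",
    "/about": "About us",
}


def resolve(path):
    """Return (status, body); only an exact match gets a 200."""
    if path in PAGES:
        return 200, PAGES[path]
    # No fuzzy fallback: a near miss such as "/abuot" is reported as missing.
    return 404, "<h1>Error 404 Not Found</h1>"
```

With this, a misspelled path gets a 404 instead of a second URL that serves the same content.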

Switched to HTTPS, should we use URL Removal in Webmaster Tools

We recently changed protocol to HTTPS and our Google search impressions have plummeted. The old site, with the URL beginning with HTTP, is still appearing in Google search results even though we have set up our redirects correctly.
Do we go ahead and request URL Removal in Google Webmaster Tools, or leave Google to do its thing? We are worried that Google might be seeing our old site and penalising our new site for duplicate content.
No. If you do that, the HTTPS version will be removed too. Ensure you have set up 301 redirects and be patient.

How do some websites not show the page extension in the address bar?

It's just a question out of curiosity.
I have seen a lot of websites that don't show the page type/extension in the address bar. For example, Stack Overflow's Ask Question page has the address stackoverflow.com/questions/ask instead of something like stackoverflow.com/questions/ask.php.
Do they use something to hide the page extension? Or why do I not see the page extension?
I think it's a nice thing for page security.
You can do that using an .htaccess file.
Something similar is here: Remove .php extension with .htaccess
All the .htaccess answers that you have seen apply to traditional PHP applications, because they are all uploaded as normal files to the document root of a web server. This means that each PHP file is "browsable" directly, assuming you haven't prevented this in your web server configuration.
StackOverflow (which is a .NET application) and other modern applications use a URL mapping paradigm. This helps with "clean" URLs, and it also keeps the URLs stable, because cool URIs don't change. It really doesn't have anything to do with security.
So it is most likely that each URL is mapped to a function, and this function returns a response that is sent to the browser.
PHP frameworks offer the same: Laravel routing, Symfony routing and Zend Framework routing are all examples of this mapping paradigm.
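To illustrate the mapping idea, here is a minimal Python sketch of a route table in which each URL is registered against a function. It is not how Stack Overflow or any of the named frameworks implement routing; ROUTES, route, dispatch and the handler are hypothetical names chosen for the example.

```python
# Hypothetical registry: URL path -> function that builds the response.
ROUTES = {}


def route(path):
    """Decorator that registers a handler function for one URL path."""
    def register(func):
        ROUTES[path] = func
        return func
    return register


@route("/questions/ask")
def ask_question():
    return "Ask Question form"


def dispatch(path):
    """Look up the handler for a path and return (status, body)."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, "Not Found"
    return 200, handler()
```

No file called ask.php (or ask.anything) has to exist: the path is only a key in the table, which is why no extension shows up in the address bar.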
A .htaccess (hypertext access) file is a directory-level configuration file supported by several web servers, that allows for decentralized management of web server configuration. They are placed inside the web tree, and are able to override a subset of the server's global configuration for the directory that they are in, and all sub-directories.
htaccess file
Rewrite Guides
More htaccess tips and tricks
Rewrite Url
Servers often use .htaccess to rewrite long, overly comprehensive URLs to shorter and more memorable ones.
Authorization, authentication
A .htaccess file is often used to specify security restrictions for a directory, hence the filename "access". The .htaccess file is often accompanied by a .htpasswd file which stores valid usernames and their passwords.
The three links given above explain this in more detail.
This is done by using the .htaccess file to configure the details of a website.
Example:
RewriteEngine on
RewriteBase /
# Send a letters-only path such as "home" to index.php as the menu parameter
RewriteRule ^([a-z]+)/?$ index.php?menu=$1 [NC,L]
This example rewrites a URL that looks like www.mydomain.com/home into www.mydomain.com/index.php?menu=home
For more details, please search Stack Overflow or Google.
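What such a rule does to a request path can be tried out with an ordinary regular expression. The Python sketch below mimics only the pattern matching and substitution, assuming the pattern is anchored at the start with ^ and is applied to the path without its leading slash (as in a per-directory .htaccess); it does not model the rest of mod_rewrite. The rewrite function is a made-up helper.

```python
import re

# Anchored pattern; re.IGNORECASE plays the role of the [NC] flag.
RULE = re.compile(r"^([a-z]+)/?$", re.IGNORECASE)


def rewrite(path):
    """Return the internal target for a request path, or the path unchanged."""
    match = RULE.match(path)
    if match is None:
        return path
    return f"index.php?menu={match.group(1)}"
```

Because of the anchor, "index.php" itself does not match (the dot is not a letter), so the rewritten request is not rewritten a second time.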

Google indexed my domain anyway?

I have a robots.txt like the one below, but Google has still indexed my domain. Basically they've indexed mydomain.com but not mydomain.com/any_page
User-agent: *
Disallow: /
I mean, how can I go back further than /, which I thought was the root of the domain?
Note that this domain is a work in progress, hence I don't want Google or any other search engine seeing it for a minute.
If you don't have one already, get a Google Webmaster Tools account. It includes a URL removal tool that may work for you.
This doesn't address the problem of search engines possibly ignoring or misinterpreting your robots.txt file, of course.
If you REALLY want your site to be off the air until it's launched, your best bet is to actually take it off the air. Make the site inaccessible except by password. If you put HTTP Basic authentication on your document root, then no search engine will be able to index anything, but you'll have full access with a password.
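Whether a robots.txt actually covers the root URL can be checked locally with Python's standard urllib.robotparser. The sketch below feeds it the two rules from the question, written with the standard "User-agent" spelling; is_blocked is a made-up helper name. It only models which URLs a compliant crawler may fetch and says nothing about whether a search engine still lists the bare URL.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question, with the standard field name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text forbids agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)
```

Both is_blocked(ROBOTS_TXT, "http://mydomain.com/") and the same call for "http://mydomain.com/any_page" report the URL as blocked, so "/" already is as far back as the rule can go.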

Resource for Recognizing Framework/CMS From URL or Other Clues?

I'm curious to know which web framework or content management system a website is using based upon clues from the URL, headers, content. Does anyone know of a resource on the web that would provide this? For example:
.html -> maybe a flat-file
.php -> something built using PHP, perhaps.
.jsp -> something using Java Server Pages
.asp -> Active Server Pages
0,2097,1-1-1928,00 -> Vignette CMS
.do -> ??
Thanks.
Finding the CMS by URL:
http://2ip.ru/cms/
Enter the URL in the center input field and click the big blue button below it.
Results in black mean not found; results in red mean found.
NOTE:
You may want to play around with the URL: with http://, with or without the www. part. The results may differ.
If you're not restricted to just the query string, then there are a few other options. For example, to identify a Rails app:
Script, stylesheet and image tags tend to have a 10-digit number appended (this allows you to cache the file and still change it):
<script src="/javascripts/all.js?1236037318" type="text/javascript"></script>
You can also sometimes tell from the cookies what the framework is. For example, Rails apps tend to have a session cookie called _appName_session, and often you can find a flash contained in it.
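The two Rails clues above can be turned into a rough check. This Python sketch is a heuristic only, under the assumption that the asset suffix is exactly ten digits and the session cookie has the _appName_session shape described here; looks_like_rails and both patterns are made up for illustration and will miss or misjudge many sites.

```python
import re

# A src/href attribute whose URL ends in "?<10 digits>".
ASSET_TIMESTAMP = re.compile(r'(?:src|href)="[^"?]+\?(\d{10})"')
# A cookie name of the form _<appName>_session.
SESSION_COOKIE = re.compile(r"^_\w+_session$")


def looks_like_rails(html, cookie_names):
    """Return True if the page source or cookie names show either clue."""
    has_timestamp = ASSET_TIMESTAMP.search(html) is not None
    has_session = any(SESSION_COOKIE.match(name) for name in cookie_names)
    return has_timestamp or has_session
```

A page containing the script tag shown above, or a cookie named _myapp_session, would be flagged; a PHPSESSID cookie on its own would not.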
You're on the right track with your list there. If all you want to know is the stack (LAMP, IIS, Java) then that's all you really need.
If querying the URL in question is an option, then you can usually pull the webserver make/version out of the HTTP response header as well.
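As a rough illustration of combining the extension list from the question with the response header, here is a Python sketch. The hint table repeats only the guesses given in the question; guess_stack is a made-up name, and the headers argument is assumed to be a plain dict holding a "Server" entry taken from a response you fetched yourself.

```python
import posixpath
from urllib.parse import urlparse

# Extension -> guess, copied from the list in the question.
EXTENSION_HINTS = {
    ".html": "maybe a flat file",
    ".php": "PHP",
    ".jsp": "Java Server Pages",
    ".asp": "Active Server Pages",
}


def guess_stack(url, headers=None):
    """Return a list of hints drawn from the URL extension and the Server header."""
    hints = []
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    if ext in EXTENSION_HINTS:
        hints.append(EXTENSION_HINTS[ext])
    server = (headers or {}).get("Server")
    if server:
        hints.append(f"server: {server}")
    return hints
```

A URL without an extension, such as one served through URL mapping, yields no extension hint at all, which is where the header and the other clues become more useful.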
There is a nifty Chrome extension called Wappalyzer:
Wappalyzer is a browser extension that uncovers the technologies used on websites. It detects content management systems, eCommerce platforms, web servers, JavaScript frameworks, analytics tools and many more.

Resources