This came to mind while I was browsing some websites that use a combination of upper and lower case in their URLs, something like http://www.domain.com/Home/Article
As far as I know we should always use lowercase in URLs, but I have no idea of the technical reason. I would like to learn from the experts here why lowercase should be used in URLs. What are the advantages and disadvantages of uppercase URLs?
The domain part is not case sensitive. GoOgLe.CoM works. You can add uppercase as you like, but normally there's not a reason to do so and, as stated in the comments below, may hurt your SEO ranking.
The path part may or may not be case sensitive, depending on the server software and the operating system it runs on. Typically Windows machines are case insensitive, while Linux machines are case sensitive. This means that you should stick to lowercase or you risk introducing a bug that's really hard to hunt down (mismatched case that doesn't matter on the dev server but breaks in production).
The query string part is available to the server as it is. You can readily use mixed case as you like, or discard the case (toLowerCase(...)). This also means that using base64-encoded keys will work. You can't expect users to type those correctly, though.
The hash part (called "fragment identifier") is only available to the client code, not to the server. Javascript may distinguish between the cases as it likes, and so does the browser. url#a will scroll to the element with the ID a, but url#A won't.
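To make the split between case-insensitive and case-sensitive parts concrete, here is a minimal Python sketch. The function name normalize_url is made up for illustration, and it assumes the URL carries no user:password@ prefix (which would also get lowercased). It only lowercases the scheme and host and leaves path, query and fragment untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase only the parts that are safe to lowercase (scheme and host)."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),   # scheme is case-insensitive
        parts.netloc.lower(),   # host is case-insensitive (assumes no userinfo)
        parts.path,             # may be case-sensitive, depends on the server
        parts.query,            # passed to the server as-is
        parts.fragment,         # only seen by the client
    ))


if __name__ == "__main__":
    print(normalize_url("HTTP://GoOgLe.CoM/Home/Article?Key=aGVsbG8=#Top"))
```

Lowercasing the path as well would only be safe if you know the server treats it case-insensitively.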
I'm going to have to disagree with all established wisdom on this, so I'll probably get downvoted, but:
If you redirect all mixed case urls to your properly cased url, it solves all the problems mentioned. Therefore it seems this argument is coming from tradition and preference. The point of a URL is to have a user-friendly representation of a page, and if your url is friendlier with upper case, why not use it? Compare:
moviesforyoutowatch.com/batman-vii-the-dark-knight-whatevers
MoviesForYouToWatch.com/Batman-VII-The-Dark-Knight-Whatevers
I find the mixed case version superior for the purpose. If there's a technical reason that can't be solved with a lower-case compare and redirect, please share it.
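A rough sketch of the "lower-case compare and redirect" idea, in Python. The path list and the resolve function are hypothetical and not tied to any particular web framework. It looks up the requested path case-insensitively and answers with a 301 to the properly cased URL when the case differs.

```python
from typing import Optional, Tuple

# Hypothetical canonical paths, stored in the casing we want users to see.
CANONICAL_PATHS = [
    "/Batman-VII-The-Dark-Knight-Whatevers",
    "/Home/Article",
]

# Lowercased path -> canonical path, built once at startup.
_LOOKUP = {path.lower(): path for path in CANONICAL_PATHS}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, location): 200 for an exact match, 301 for a case mismatch, 404 otherwise."""
    canonical = _LOOKUP.get(path.lower())
    if canonical is None:
        return 404, None
    if canonical == path:
        return 200, canonical
    return 301, canonical
```

This assumes no two canonical paths differ only by case; if they did, the lookup table would silently keep just one of them.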
I know you asked for technical reasons but it's also worth considering this from a UX perspective.
Say you have a URL with upper case characters and, for argument's sake, it has been distributed in printed media. When users come to enter that URL into their browser they may well feel compelled to match the case (or be forced to, if your web server is case sensitive). Ultimately you are giving them more work to do, as they have to consider case as well. After all, they don't know whether your server is case sensitive, and they may have experienced 404s from case sensitive web servers in the past.
If your server is case sensitive and you are using mixed case URLs, you are giving the user more scope to mistype the URL. Furthermore, say you have the URL www.example.com/Contact. It's easy to confuse an upper and lower case "c" (especially if it is copied in handwriting). If the user overlooks this and uses the wrong case, they may never reach your content.
With all this in mind, consider www.example.com/News/Articles/FreeIceCreamForAll. On a keyboard that's not too difficult, but on a mobile device it would be very fiddly to input.
The reverse is also true should a user want to write down a URL from the address bar. They may feel they need to match the case, ultimately giving them more work to do and increasing the likelihood of errors.
To conclude: keep URLs lower case.
REGARDING SECURITY ASPECTS OF THIS ISSUE:
There is actually a good security reason to use a mix of uppercase and lowercase.
It has the effect of confusing and blocking attackers!
In human conversation humans get easily confused with uppercase and lowercase use.
Humans can't "speak" the word of the "identifiers or passwords or url's" with clarity if they contain uppercase and lowercase.
This helps with security on data or passwords on site sub-parts that are provided as part of a locked-in or secure sub-part of an "automated access" part of sites or their data.
It's similar to NOT USING JSON.
JSON is "human-readable text" and so JSON is simply giving all the attackers (Including Governments, Google .. who steal your ideas and data) ... almost everything they need to know about the data ... it's much more secure to confuse them by using private bespoke very-fast "binary protocols" - that use your own "unknowable data structures" ... but just watch out, because it is actually possible to confuse yourself or your own development team.
All your security layers and protocols have to be "well managed" to avoid confusion.
There is therefore an extra level of site and data security from human attackers (and some robots) to be had by simply using totally unconventional systems (i.e. why on earth would anybody want to use a "standard security protocol" when by some simple heavyweight prior computing they can all be easily broken).
Just "salt and hash" everything, plus add some extra bespoke security of your own - it's just common sense!
Conclusion: All the above answers are very clear and correct - but you can also happily leverage that very same knowledge to confuse potential attackers.
TL;DR: I need a source for as many different output formats from a whois query as possible.
Background:
I am looking for a single reference that can provide as many (if not all) unique whois query output formats as possible.
I don't believe this exists but hope to be proven wrong.
This appears to be an age old problem
This stackoverflow post from 2015 references the challenge of handling the "~40 formats" that the author was aware of.
The author never detailed any of these formats.
The RFC for whois is... depressing
The IETF ran an analysis in 2015 that examined the components of whois per each RIR at the time
In my own research I see that registries like JPNIC do not appear to comply with the APNIC standards
I am aware of existing tools that do a bang-up job parsing whois (python-whois for example) however I'd like to hedge my bets against outliers with odd formats. I'm also open to possible approaches to gather this information, however that would likely be too broad to fit this question.
Hoping there is a simple "go here and download this" answer. Hoping...
"TL;DR: I need a source for as many different output formats from a whois query as possible."
There isn't, except if you use any kind of provider that does this for you, with whatever caveats.
Or more precisely, there isn't anything public, maintained and exhaustive. You can find various libraries that try to do this, in various languages, but none is complete, as this is basically an impossible task, especially if you want to include any TLD, such as ccTLDs. (You are not framing your constraints in much detail, nor in fact saying whether you are asking about domain name data in whois or about IP address/ASN data.)
Some providers of course try to do that and offer you an abstract uniform API. But why would anyone share their internal secret sauce, that is, their list of parsers and so on? There is no business incentive to do that.
As for opensource library authors (I was one at some point), it is just tedious and absolutely not rewarding at all to just update it forever with all new formats and tweaks per registry (battle scar example: one registrar in the past changed its output format at each query! one query gave you somefield: somevalue while next time it was somefield:somevalue or somefield somevalue, etc. of course that is only a simple example).
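To give a feel for the kind of tolerance a parser needs, here is a small Python sketch that accepts the three variants from the battle scar above. The function name parse_line is made up, and treating lines starting with % or # as comments is an assumption that holds for some registries but certainly not all. Real outputs have many more quirks than this handles.

```python
from typing import Optional, Tuple


def parse_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one whois output line into (field, value), or return None if it has no pair."""
    line = line.strip()
    # Assumption: blank lines and lines starting with % or # carry no data.
    if not line or line.startswith(("%", "#")):
        return None
    if ":" in line:
        # Covers "somefield: somevalue" and "somefield:somevalue".
        key, value = line.split(":", 1)
    else:
        # Covers "somefield somevalue" (first run of whitespace is the separator).
        pieces = line.split(None, 1)
        if len(pieces) != 2:
            return None
        key, value = pieces
    return key.strip().lower(), value.strip()
```

Even this tiny example has to guess: a value-less field, a multi-word field name without a colon, or a wrapped line would all be misread, which is exactly why the per-registry tweaks never end.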
RFC 3912 specified just the transport part, not the content, hence a lot of cases appeared. Specifically in the ccTLD world, each registry is king in its kingdom and it is free to implement whatever it wants the way it wants. Also the protocol had some serious limitations (ex: internationalization, what is the "charset" used for the underlying data) that were circumvented in different ways (like passing "options" in your query... of course none of them are standardized in any way)
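For illustration, the whole transport that RFC 3912 does specify fits in a few lines of Python: open a TCP connection to port 43, send the query followed by CRLF, and read until the server closes the connection. The function names are made up, and decoding the reply as UTF-8 is purely a guess, since the RFC says nothing about the charset.

```python
import socket


def build_query(name: str) -> bytes:
    """A whois request is just the query text terminated by CRLF."""
    return name.encode("ascii") + b"\r\n"


def whois_query(server: str, name: str, timeout: float = 10.0) -> str:
    """Send one query to a whois server on TCP port 43 and return the raw reply."""
    chunks = []
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(build_query(name))
        while True:
            data = sock.recv(4096)
            if not data:  # the server signals the end of the reply by closing
                break
            chunks.append(data)
    # Assumption: UTF-8 with replacement; the protocol does not define an encoding.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Something like whois_query("whois.iana.org", "example.com") would exercise it against a live server. Everything after that point, that is, making sense of the text that comes back, is where the standard stops.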
At the very least, gTLDs whois format is specified there:
https://www.icann.org/resources/pages/approved-with-specs-2013-09-17-en#whois
Note however that due to GDPR there were changes (see https://www.icann.org/resources/pages/gtld-registration-data-specs-en/#temp-spec) and will be other changes in the future.
However, you are strongly encouraged to look at RDAP instead of whois.
RDAP is now a requirement for all gTLD registries and registrars. As it is JSON, it immediately solves the problem of format.
Its core specifications are:
RFC 7480 HTTP Usage in the Registration Data Access Protocol (RDAP)
RFC 7481 Security Services for the Registration Data Access Protocol (RDAP)
RFC 7482 Registration Data Access Protocol (RDAP) Query Format
RFC 7483 JSON Responses for the Registration Data Access Protocol (RDAP)
RFC 7484 Finding the Authoritative Registration Data (RDAP) Service
You can find various libraries doing RDAP for you (see below for links), but at its core it is JSON over HTTPS so you can emulate simple cases with any kind of HTTP client library.
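As a rough sketch of how little is needed on the parsing side, the Python below builds a domain lookup URL and pulls a few fields out of a response. The base URL https://rdap.example/ and the SAMPLE document are hand-written placeholders, not real data. The member names (ldhName, status, events, eventAction, eventDate) are the ones defined for domain objects in RFC 7483, but real servers may omit any of them.

```python
import json
from urllib.parse import quote


def rdap_domain_url(base_url: str, domain: str) -> str:
    """Build a domain lookup URL: <base>/domain/<name>."""
    return base_url.rstrip("/") + "/domain/" + quote(domain)


def summarize(response_text: str) -> dict:
    """Extract a few commonly present members from an RDAP domain response."""
    data = json.loads(response_text)
    events = {e["eventAction"]: e["eventDate"] for e in data.get("events", [])}
    return {
        "name": data.get("ldhName"),
        "status": data.get("status", []),
        "registered": events.get("registration"),
        "expires": events.get("expiration"),
    }


# Hand-written sample for illustration only, not an actual server reply.
SAMPLE = json.dumps({
    "objectClassName": "domain",
    "ldhName": "example.com",
    "status": ["active"],
    "events": [
        {"eventAction": "registration", "eventDate": "2001-01-01T00:00:00Z"},
        {"eventAction": "expiration", "eventDate": "2031-01-01T00:00:00Z"},
    ],
})

if __name__ == "__main__":
    print(rdap_domain_url("https://rdap.example/", "example.com"))
    print(summarize(SAMPLE))
```

Fetching the URL is left out on purpose; any HTTP client will do, and finding the right base URL for a given TLD is what RFC 7484 covers.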
Work is underway to fix some missing/not precise enough details on RFC 7482 and 7483.
You need also to take into account ICANN specifications (again, only for gTLDs of course):
https://www.icann.org/en/system/files/files/rdap-technical-implementation-guide-15feb19-en.pdf
https://www.icann.org/en/system/files/files/rdap-response-profile-15feb19-en.pdf
Note that, right now, even if it is an ICANN requirement, you will find a lot of missing or broken gTLD registry or registrar RDAP servers. You will also find a lot of "deviations" in replies from what would be expected per the specification.
I gave full details in various other questions here, so maybe have a look:
https://stackoverflow.com/a/61877920/6368697
https://stackoverflow.com/a/48066735/6368697
https://webmasters.stackexchange.com/a/115605/75842
https://security.stackexchange.com/a/213854/137710
https://serverfault.com/a/999095/396475
PS: a philosophical question on "Hoping there is a simple "go here and download this" answer. Hoping...", because a lot of people have hoped for that in the past (see the remark at the beginning). Let us imagine you go forward and build this magnificent resource with all the exhaustive details. Would you be inclined to just share it with anyone, for free? The answer is probably no, for obvious reasons. The same happened to others who went down the same path before you, hence today's situation: various providers offer you more or less this service (you would need to find out which formats are parsed, the rate limits, the prices, etc.), but nothing is freely available.
Now you can just dream/hope that every registry and registrar switches to RDAP AND implements it properly. Then the problem of format is solved once and for all. However, the above requirements ("every" + "properly") are not small, and may not happen "soon". Specifically in ccTLDs, where registries are in no way mandated by any external force (except market pressure?) to implement RDAP at all.
Can anyone explain the difference between the two kinds of domain (with and without a hyphen) as far as search engines are concerned, and what effect it has? Although the domain contains two different words, most people prefer the domain without "-". As far as I know, "-" is treated as a space in a URL while "_" joins the parts into one word, but both symbols are rarely used in domain names. Can anyone explain the difference between the two?
One should first give priority to the domain name without '-' because it is hard to pronounce when telling someone your domain name, as well as chances are high that people will often forget '-' in your domain name when they are typing, at least the first few times. Of course this will impact your business negatively.
Also, a domain with a hyphen doesn't make a very good impression on customers. I agree with what #chimpsarehungry said in the earlier answer.
Other than that, I guess it doesn't matter much for SEO. It may even have a good effect in some cases, such as long URLs, e.g. WordPress posts. URLs with '-' are search engine friendly.
Take a look for yourself, based on 2011 data gathered by SEOmoz:
http://www.seomoz.org/article/search-ranking-factors#metrics
Not looking so good for dashes. Some of that is from correlation of spammers using such domains, but definitely not all of it. I apologize I don't have a reference to back this up, but there was a Matt Cutts QA where he said multiple dashes is indicative of spam and does indeed get a negative hit in overall rank score. I believe it was part of a big keynote speech so it'd be hard to find. You'll just have to take my word for it.
I don't think this will matter at all. But as a search engine user the sites with dashes in between them look spam-like to me. Name one popular website with a dash.
For my current application I use a very simple scheme to register new users. When a new user registers an email is sent with a key. To check whether this key is correct a kind of checksum is computed (3-7-11 digit check) which is added as the last 2 digits of the key. There is no check on any further validity of the key. The application does not check whether the key got invalidated.
It is a simple scheme and someone took the time to crack it by disassembling the code. I want to use another scheme for my new application but I am not sure what is the best way to do this.
Is there a Delphi library I could use?
Is it advisable to use some user supplied info in the key, like his name?
Is there a best practice way of registering users?
Anything else I have forgotten?
Some registration schemes require an application to check each time at a webserver whether the key is still valid. I'd rather not go that far because this requires a lot of effort on the server side.
Any suggestion or link for a robust way to register new users is very welcome.
A better registration scheme is based on asymmetric cryptography (usually RSA algorithm). The idea is that only you can generate a valid key, while everybody can check that a key is valid (asymmetric cryptography allows this trick). So when you see your program with a valid key on torrents you just cancel support for a customer who was given this key.
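To illustrate the "only you can generate, everybody can check" trick, here is a deliberately toy Python sketch using textbook RSA. Everything in it is an assumption made for demonstration: the two Mersenne primes are far too small and too well known to be secure, there is no padding, and binding the key to a user name is just one possible design. A real product should use a vetted cryptography library and proper key sizes.

```python
import hashlib

# Toy parameters, for illustration only. NOT secure.
P = 2**61 - 1            # a known Mersenne prime
Q = 2**89 - 1            # another known Mersenne prime
N = P * Q                # public modulus, shipped inside the application
E = 65537                # public exponent, shipped inside the application
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, stays with the vendor


def _digest(name: str) -> int:
    """Map the user name to a number below N."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big") % N


def make_key(name: str) -> str:
    """Vendor side: needs the private exponent D."""
    return format(pow(_digest(name), D, N), "x")


def check_key(name: str, key: str) -> bool:
    """Application side: needs only the public values N and E."""
    try:
        signature = int(key, 16)
    except ValueError:
        return False
    return pow(signature, E, N) == _digest(name)
```

The application only ever contains N and E, so disassembling it reveals how to check a key but not how to produce one. It does nothing against someone patching out the call to check_key.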
There are Delphi and non-Delphi libraries (e.g. Protexis) available to protect your software; remember that almost anything that works with C can work with Delphi as well. But a sound copy protection scheme may be hard to achieve. A simple key may not be enough; usually it is used together with a machine fingerprint so that it works on a given system only.
A good key generator algorithm should generate keys that are not easily predictable, yet can be checked for validity. There are different ones around and there is no "generic" one; it depends on your needs. Some may also encode which features to activate or expiry information. Some keys are strings, others are whole license files (like those used by Delphi itself). In any case, code can be disassembled to try to guess the algorithm, though some obfuscation techniques can make it harder to understand.
Also, one simple key check is not enough because it can be easily bypassed by patching the executable. If you really need copy protection, you should scatter checks all around the code, maybe encrypting and then decrypting data or code sections using the key. This won't protect you against a keygen, though, and it will require more code changes; it's not as simple as calling one function at startup.
The level of protection is up to you. If you need just a simple registration mechanism and you don't mind much about your software being cracked you can use a simple one. If you need a more secure one, there are more sophisticated ones.
If your goal is to force people to download a cracked EXE from the Internet instead of a key generator from the Internet, then asymmetric cryptography is your answer.
If your goal is to be able to void serial numbers that have been released to the wild, restrict the number of installations, or force the user to have a real "paid for" serial number, then activation is your answer. Still, if they crack your EXE, they can get around this.
You only have control up to the point that someone cracks your EXE. We have to accept this and move on. We must figure out other ways to reach out to our customers, such as more affordable versions, value added support options, web services, and other ways that convince the user that the price of our software is fair, and there is a benefit in paying.
On my latest release, I use activation, so the serial numbers are randomly generated, though checked for uniqueness, and associated with an email address.
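A minimal sketch of what such serial generation could look like, in Python. The class name, the serial format (four groups of four characters) and the in-memory dictionary are all assumptions for illustration; the setup described above presumably keeps this in a database on the activation server.

```python
import secrets
from typing import Dict, Optional

# Letters and digits that are hard to confuse when typed (no 0/O or 1/I).
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


class SerialRegistry:
    """Issues random serial numbers and remembers which email each belongs to."""

    def __init__(self) -> None:
        self._email_by_serial: Dict[str, str] = {}

    def issue(self, email: str) -> str:
        while True:
            raw = "".join(secrets.choice(ALPHABET) for _ in range(16))
            serial = "-".join(raw[i:i + 4] for i in range(0, 16, 4))
            # Random collisions are very unlikely, but check for uniqueness anyway.
            if serial not in self._email_by_serial:
                self._email_by_serial[serial] = email
                return serial

    def owner(self, serial: str) -> Optional[str]:
        return self._email_by_serial.get(serial)
```

Because the serials carry no internal structure, there is nothing for a key generator to reproduce; validity exists only as a record on the server side.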
After all of this, the application is just $4.99, but with no individual support. The goal is to make it so affordable that if they want to use it, even just once, it's a good value.
We've been using Oreans' WinLicense for two years and are quite happy with it. They handle key generation (with the user name embedded), trial versions that time-out, hardware keys (where the key you send them is unique for their computer) and VM detection. They also use a variety of other techniques to make it harder for your code to be disassembled, including wrapping code of your choice in an encrypted VM they provide.
You can also disable specific keys if you determine that they are "stolen." Having done this, future updates you supply will no longer run with those keys.
We also have our software "phone home" at certain times to see if their key is stolen.
Any protection scheme can be broken by someone who is determined and skilled enough. But, we've been happy with the degree of security we believe that WinLicense gives us. Their support is also excellent. The library is callable from Delphi.
Update: hooray! So it is a journey of practice and understanding. ;) Now I no longer feel so dumb.
I have read many articles on REST, and coded up several Rails apps that make use of RESTful resources. However, I never really felt like I fully understood what it is, and what the difference is between RESTful and not RESTful. I also have a hard time explaining to people why/when they should use it.
If someone has found a very clear explanation of REST and of when/why/where to use it (and when not to), it would benefit the world if you could put it up, thanks! =)
REST is usually learned like this:
1. You hear that REST is about using HTTP the way it was meant to be used, and from that you shun SOAP Web Services' envelopes, since most of what many SOAP standards are needed for is handled by HTTP in a simple, no-nonsense way. You also quickly learn that you need to use the right method for the right operation.
2. Later, perhaps years later, you hear that REST is more than that. REST is in fact also the concept of linking between resources. This often takes a while to fully grasp, but when you learn this, you start introducing hyperlinks into your responses so that clients can navigate your system without being coupled to how the server wants to name its resources (i.e. the URIs).
3. Even later, you learn that you still haven't understood REST! And this is because you find out that media types are important. You start making media types called application/vnd.example.foo+json and put hyperlinks in them, since that's already your understanding of REST.
4. Years pass, and you re-read Fielding's thesis for the umpteenth time, to see if there's anything you missed, and it suddenly dawns upon you what the HATEOAS constraint really is: it's about the client not having any notion of how the server's resources are structured, but discovering these relationships at runtime. It also means that the screen in front of the user is driven completely by what is passed over the wire, so in fact, if a server passes an image/jpeg then that's what you're supposed to show to the user, not an error message saying "AtomProcessor can't handle image/jpeg".
I'm just coming to terms with #4 and I'm hoping the ladder isn't much longer! It's taken me seven years.
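The link-following idea from the later steps can be sketched in a few lines of Python. The two "servers" below are hypothetical in-memory dictionaries standing in for HTTP responses, and the link relation names (orders, latest) are invented. The point is only that the client knows an entry URI and relation names, never the other URIs.

```python
from typing import Callable, Dict, List

# Two hypothetical servers exposing the same relations under different URIs.
SERVER_V1: Dict[str, dict] = {
    "/": {"links": {"orders": "/orders"}},
    "/orders": {"links": {"latest": "/orders/17"}},
    "/orders/17": {"total": 42, "links": {}},
}
SERVER_V2: Dict[str, dict] = {
    "/": {"links": {"orders": "/api/o"}},
    "/api/o": {"links": {"latest": "/api/o/by-id/17"}},
    "/api/o/by-id/17": {"total": 42, "links": {}},
}


def follow(fetch: Callable[[str], dict], entry: str, rels: List[str]) -> dict:
    """Start at the entry point and follow one link relation per step."""
    document = fetch(entry)
    for rel in rels:
        document = fetch(document["links"][rel])
    return document


if __name__ == "__main__":
    for server in (SERVER_V1, SERVER_V2):
        print(follow(server.__getitem__, "/", ["orders", "latest"])["total"])
```

The same client code works against both servers because it never hard-codes a URI beyond the entry point; it still has to know which relations to follow, which is out-of-band knowledge of a different kind.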
This article does a good job classifying the differences in several http application styles from WS-* to RESTian purity. What I like about this post is it reminds you that most of what we call REST really is something only partly in line with Roy Fielding's original definition.
InfoQ has a whole section addressing more of the "what is REST" angle as well.
In terms of REST vs. SOAP, this question seems to have a number of good responses, particularly the selected answer.
I would imagine YMMV, but I found it very easy to start understanding the details of REST after I realised how REST essentially was a continuation of the static WWW concepts into the web application design space. I had written (a rather longish) post on the same : Why REST?
Scalability is an obvious benefit of REST (stateless, caching).
But also - and this is probably the main benefit of hypertext - REST is ideal for when you have lots of clients to your service. Following REST and the hypertext constraint drastically reduces the coupling between all those clients and your server, which means you have more freedom when evolving/developing your service over time - you are not tied down by the risk of breaking would-be-coupled clients.
On a practical note, if you're working with rails - then restfulie is a great little framework for tackling hypertext on the client and server. Server side is a rails extension, and client is a DSL for handling state changes. Interesting stuff, check it out here: http://restfulie.caelum.com.br/ - I highly recommend the tutorial/demo vids they have up on vimeo :)
Content-Type: text/x-flamebait
I've been asking the same question lately, and my supposition is that
half the problem with explaining why full-on REST is a good thing when
defining an interface for machine-consumed data is that much of the
time it isn't. OK, you'd need a really good reason to ignore the
commonsense bits (URLs define resources, HTTP verbs define actions,
etc etc) - I'm in no way suggesting we go back to the abomination that
was SOAP. But doing HATEOAS in a way that is both Fielding-approved
(no non-standard media types) and machine-friendly seems to offer
diminishing returns: it's all very well using a standard media type to
describe the valid transitions (if such a media type exists) but where
the application is at all complicated your consumer's agent still
needs to know which are the right transitions to make to achieve the
desired goal (a ticket purchase, or whatever), and it can't do that
unless your consumer (a human) tells it. And if he's required to
build into his program the out-of-band knowledge that the path with
linkrels create_order => add_line => add_payment_info => confirm is
the correct one, and reset_order is not the right path, then I don't
see that it's so much more grievous a sin to make him teach his XML
parser what to do with application/x-vnd.yourname.order.
I mean, obviously yes it's less work all round if there's a suitable
standard format with libraries and whatnot that can be reused, but in
the (probably more common) case that there isn't, your options
according to Fielding-REST are (a) create a standard, or (b) to
augment the client by downloading code to it. If you're merely
looking to get the job done and not to change the world, option (c)
"just make something up" probably looks quite tempting and I for one wouldn't
blame you for taking it.
For example: http://stackoverflow.com/questions/396164/exposing-database-ids-security-risk and http://stackoverflow.com/questions/396164/blah-blah loads the same question.
(I guess this is DB id of Questions table? Is this standard in ASP.NET?)
What are the pros and cons of using this type of scheme in your web app?
Well, for one, simple id's are usually sequential, so it's quite easy to guess at and retrieve other data from your application.
Load JSON at runtime rather than dynamically via AJAX
https://stackoverflow.com/questions/395858/doesnt-matter-what-I-type-here
Now, having said that, that might also be seen as a bonus, because nobody in their right mind would make their whole security hinge on the fact that you have to click on a link to get to your secure data, and thus easy discoverability of the data might be good.
However, one point is that you're at some point going to reindex your database, and anything that makes the old URLs invalid would be bad, if for no other reason than that search engines would still have the old links.
Also, here on SO it's quite normal to use links like this to other questions, so if they at some point want to reindex and thus renumber things (or move to guid's), they will still have to keep the old structure and id's.
Now, is this likely to ever happen or be needed? Probably no.
I wouldn't worry too much about it, just build your security as though every entrypoint to your application is known and there should be no problems.
The database ID is used to look up the question in the database. It's numerical, which means: fast. If you left it out you would have to look up the title, which is a lot slower.
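A small Python sketch of that scheme: only the numeric ID is extracted from the path and the trailing title is ignored, which is why both example URLs from the question load the same page. The regular expression and function name are assumptions about how such routing might be written, not how the site actually implements it.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# /questions/<numeric id>/<optional title, ignored for the lookup>
_QUESTION_PATH = re.compile(r"^/questions/(\d+)(?:/[^/]*)?/?$")


def question_id(url: str) -> Optional[int]:
    """Return the database ID embedded in a question URL, or None if there is none."""
    match = _QUESTION_PATH.match(urlsplit(url).path)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(question_id("http://stackoverflow.com/questions/396164/exposing-database-ids-security-risk"))
    print(question_id("http://stackoverflow.com/questions/396164/blah-blah"))
```

The ID is what gets passed to the indexed primary-key lookup; the title is there for people and search engines.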
The question itself is part of the url to make it "search engine friendly". It'll be higher ranked by g**gle etc.
Pro:
Super easy to retrieve the page information. Take the ID, call the database, voilà. Your table will (should) be indexed to make this lookup super fast.
Guaranteed unique URL.
Con:
IDs in your system are being publicly displayed. Not a problem in a publicly available system like SO. However, proper security measures on the back end can make this not a problem even on sensitive systems.
Ugly URLs. 6+ digit numbers are just hard to remember, and make it more difficult to distinguish pages if the number is all that identifies them. This can also have SEO consequences, as URLs with more relevant and well structured information are generally ranked better. SO compensates by providing the post name in the URL as well. While I still can't rattle off a particular post to my buddy at lunch, I can still find it more easily in the browser history.
Slower lookups. Doing text searches on a database is generally slower.
But remember that in a community like this there is a higher (although still minimal) chance of the same question title being posted at the same time, which would break things. Some kind of unique identification therefore needs to be applied, and IDs are probably quite logical in the context in which this particular web application was developed.
I don't think it's bad practice, and it's fairly common, to do it in ASP.NET and other frameworks. As #lassevk said, if your security depends on it, then you need some more checks in there (can user X get to record Y), but it comes down more to the SEO-friendliness of the URLs for public sites.
For example, SO's URLs are fairly friendly:
Pros and cons of using DB id in the URL?
Google rates information at the START of the URL higher than at the end, so having it look like:
https://stackoverflow.com/pros-and-cons-of-using-db-id-in-the-url/q/407120
should get a higher ranking for "pros and cons of using db id in the url". It's not the only factor, but it is quite a major one - look at Amazon's format, they do it for a very good reason:
http://www.amazon.com/Maverick-Ricardo-Semler/dp/0712678867
http://server/book-name/dp/book-id
Wordpress does it like this:
http://server/yyyy/mm/dd/name-of-the-post
however, if you post two posts on the same day called "foo", you get:
http://server/yyyy/mm/dd/foo
http://server/yyyy/mm/dd/foo2
the slug (foo/foo2) isn't a PK, but it IS maintained as unique over the posts table.
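The "unique but not a primary key" behaviour can be sketched like this in Python. The function names and the set standing in for the posts table are assumptions, and the numeric suffix simply mirrors the foo/foo2 pattern from the example above; an actual blog engine may format the suffix differently.

```python
import re
from typing import Set


def slugify(title: str) -> str:
    """Lowercase the title and replace every run of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def unique_slug(title: str, taken: Set[str]) -> str:
    """Return a slug not yet in `taken`, appending 2, 3, ... on collision, and record it."""
    base = slugify(title)
    slug, counter = base, 1
    while slug in taken:
        counter += 1
        slug = f"{base}{counter}"
    taken.add(slug)
    return slug
```

In a real database the uniqueness would be enforced with a unique index on the slug column, with the check-and-insert done in one transaction so that two simultaneous posts cannot both claim "foo".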
I think putting the ID in the URL isn't a problem, unless your URL is a GUID! Way too long, and hard to type. If it's an int, or some kind of short guid (eg 6-8 chars), then it shouldn't be a problem.