Base64 encode: Three different outputs from different tools? - oauth

I am trying to verify an OAuth signature generated in code against a "known reputable source". All my steps are verified correct except the last, wherein a 'base signature string' is HMAC-SHA1 hashed against a secret key and then base64 encoded.
I have confirmed that my hash value matches the one the algorithm is expected to produce. I then found that my base64 encoding does not match the expected one. To determine why my encoding failed, I wanted to check the encoder I was using.
Here is the (hash) string being base64 encoded:
203ebb13a65cccaae5cb1b9d5af51fe41f534357
Here is the base64 encode that results in my code:
MjAzZWJiMTNhNjVjY2NhYWU1Y2IxYjlkNWFmNTFmZTQxZjUzNDM1Nw==
According to http://www.motobit.com/util/base64-decoder-encoder.asp, that is the correct result:
But, according to http://www.online-convert.com/result/096d7b00138f3726daee5f6ddb107a62 (provided with the secret and base string, not the hash), a different base64 should have been output. Note that the hash output is my correct hash despite the difference in base64:
Finally, the "official" tester (http://hueniverse.com/oauth/guide/authentication/) outputs a third different base64 from the same hash:
I have no idea what I'm doing wrong, and the fact that these tools are outputting different results makes me wonder if there is in fact such a thing as base64 encoding or if they are actually using different algorithms? Perhaps the fact that it's for OAuth would help you help me identify the answer.
Thanks for any leads from the wise.

OK, in this case the first website was making the same "mistake" I was (in my case it was a mistake, the first website may just be making an unstated assumption).
The mistake lies in whether the hash is interpreted as a text string (whose 40 ASCII characters get base64 encoded) or as the 20 raw bytes that those hexadecimal digits represent. In the former case the resulting encoding is longer than the original hex string, while in the latter it is shorter. This is not only empirically true; it follows from how base64 works, since every 3 input bytes become 4 output characters.
The second website, working from (as stated) "hex" data, got the correct answer.
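The difference can be reproduced with a short Python sketch. The hex value is the hash from the question; the `sign` helper and its parameter names are illustrative assumptions and not part of any OAuth library.

```python
import base64
import hashlib
import hmac

hex_digest = "203ebb13a65cccaae5cb1b9d5af51fe41f534357"  # hash from the question

# The "mistake": base64 of the 40 ASCII characters of the hex text.
from_text = base64.b64encode(hex_digest.encode("ascii")).decode("ascii")

# What OAuth expects: base64 of the 20 raw bytes the hex text represents.
from_bytes = base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")


def sign(base_string: str, key: str) -> str:
    """Hypothetical helper: HMAC-SHA1, then base64 of the raw digest."""
    raw = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"),
                   hashlib.sha1).digest()  # .digest(), not .hexdigest()
    return base64.b64encode(raw).decode("ascii")


print(len(from_text), from_text)
print(len(from_bytes), from_bytes)
```

The first line printed is the 56-character value from the question; the second is the 28-character form that results from encoding the raw bytes.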

Try to check via https://base64-encode.org
On this website you can convert all types of images to Base64 string.

Related

Get hashed value from HMAC SHA256 in Swift [duplicate]

I have a string that was salted, hashed with SHA-256, then base64 encoded. Is there a way to decode this string back to its original value?
SHA-256 is a cryptographic (one-way) hash function, so there is no direct way to decode it. The entire purpose of a cryptographic hash function is that you can't undo it.
One thing you can do is a brute-force strategy, where you guess what was hashed, then hash it with the same function and see if it matches. Unless the hashed data is very easy to guess, it could take a long time though.
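A minimal sketch of that guess-and-check strategy follows. It assumes the stored value is base64(SHA-256(salt + password)); the real layout of salt and password may differ, and the function names are made up for illustration.

```python
import base64
import hashlib


def salted_hash(salt: bytes, password: str) -> str:
    # Assumed scheme: base64 of SHA-256 over salt followed by the password.
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def guess_password(target: str, salt: bytes, candidates):
    """Hash every candidate the same way and return the one that matches."""
    for candidate in candidates:
        if salted_hash(salt, candidate) == target:
            return candidate
    return None  # nothing in the candidate list produced the target hash
```

Nothing is decoded here; the original is only found if it happens to be among the guesses.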
You may find the question "Difference between hashing a password and encrypting it" interesting.
It should be noted that SHA-256 does not encrypt the content of your string; it instead generates a fixed-size hash from your input string.
This being the case, I could feed in the content of an encyclopedia, which would easily be 100 MB of text, and the resulting hash would still be 256 bits in size.
It's impossible to reverse the hash and get that 100 MB of data back out of the fixed-size hash. The best you can do is guess the input data, hash it, and see if the result matches the hash you're trying to break.
If you could reverse the hash, you would have the greatest form of compression to date.
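The fixed output size is easy to check with Python's standard hashlib; the input sizes below are arbitrary choices for the demonstration.

```python
import hashlib

small = hashlib.sha256(b"a").digest()
large = hashlib.sha256(b"a" * 1_000_000).digest()  # about 1 MB of input

# Both digests are 32 bytes (256 bits), regardless of input size.
print(len(small), len(large))
```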
SHA* is a hash function. It creates a representation (hash) of the original data. This hash is never intended to be used to recreate the original data. Thus it's not encryption. Rather the same hash function can be used at 2 different locations on the same original data to see if the same hash is produced. This method is commonly used for password verification.
You've done the correct thing by using a salt (salted SHA is also known as SSHA).
SHA-1 and SHA-2 (including SHA-256) used on their own, without a salt, are NOT considered secure for storing passwords anymore! Salting a SHA hash is called Salted SHA or SSHA.
Below is a simple example of how easy it is to "de-hash" an unsalted SHA-1 hash of a common password. The same can be done for SHA-2 without much effort as well.
Enter a password into this URL:
http://www.xorbin.com/tools/sha1-hash-calculator
Copy paste the hash into this URL:
https://hashes.com/en/decrypt/hash
Here's a page which de-hashes SHA-2. The way this page works is that somebody must have hashed your password before; otherwise it won't find it:
md5hashing dot net/hashing/sha256
Here's a page that claims to have complete SHA-2 tables available for download for a "donation" (I haven't tried it yet):
crackstation dot net/buy-crackstation-wordlist-password-cracking-dictionary.htm
Here's a good article that explains why you have to use SSHA over SHA:
crackstation dot net/hashing-security.htm
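Why a salt defeats such lookup pages can be sketched as follows. The `hash_password` function and the salt-then-password layout are assumptions for illustration, not the format of any particular SSHA implementation.

```python
import hashlib
import os


def hash_password(password: str, salt: bytes = None):
    """Return (salt, hex digest) for SHA-256 over salt followed by password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per stored password
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


salt_a, digest_a = hash_password("password123")
salt_b, digest_b = hash_password("password123")

# Same password, different salts: the stored digests differ, so a table of
# precomputed hashes of common passwords no longer matches.
print(digest_a != digest_b)
```

For real password storage, a deliberately slow function such as `hashlib.pbkdf2_hmac` is generally preferable to a single fast hash.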

Best compression algorithm for Url query string

I have to pass a large url query string, so when this string size exceeds a certain number of characters, it creates problem when passed in the url.
Currently I have tried deflation + base64 encoding, which is giving me around 30-35% compression.
So if my query string becomes too large, say 4400 characters, it will be compressed to approximately 2650 characters, which still won't fit in my URL.
I need a solution that gives better results than this one.
I searched a lot, but was not able to find a better solution.
Any suggestions on what else could be done will be appreciated. Thanks.
Example of my query string:
3d7821d1-e324-4cea-9bd7-763c0b62cdc2|94db7bdb-5e16-4700-a1f9-408ba7f7bee1|63360a17-0807-45a0-a798-31eb2614b0f7|9b37f302-2757-40e5-b9b4-390e5b786010|46ef6bce-c7e9-47d6-90d8-bc7c2b5784c0|e5f450a5-724b-42a0-aff9-34be2d50f59b|33db4e6b-bc53-4774-8267-759167a8dba9|30a8c7a9-0a3b-4df3-ab01-5e9b262d1902|d31086bb-98e8-41d0-a6cf-0bd48986bce7|30f27de5-1536-483a-85aa-6eb5000ba67b|41498746-3f45-4c16-9152-a6ca8355d502|6b5c643b-03f6-4390-9d54-79bf978f8e15|4537e3ba-09ed-465a-aad8-1c842084c3af|ad1161ab-0393-4a66-a538-6dda0c7b892a.....
Currently the deflate + base64 solution does not completely solve my issue, but it improves the situation, so I integrated it into my code.
For future work, I am thinking about:
Converting the request to POST
OR
Using sequential IDs (1, 2, 3...) instead of UUIDs (the example query string shows that it is a concatenation of UUIDs), concatenating them, and passing them in the GET request.
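For comparison, here is a hedged Python sketch of the deflate + base64 approach next to one alternative that is not from the question: since each UUID is 16 bytes of data written as 36 characters, the raw bytes can be packed and base64url-encoded directly, which takes roughly 21 to 22 characters per UUID. The function names are illustrative.

```python
import base64
import uuid
import zlib


def deflate_b64(query: str) -> str:
    """Approach from the question: deflate, then URL-safe base64."""
    packed = zlib.compress(query.encode("ascii"), 9)
    return base64.urlsafe_b64encode(packed).decode("ascii")


def pack_uuids(query: str) -> str:
    """Alternative: store each UUID as its 16 raw bytes, then base64url."""
    raw = b"".join(uuid.UUID(part).bytes for part in query.split("|"))
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def unpack_uuids(token: str) -> str:
    """Inverse of pack_uuids: restore the pipe-separated UUID text."""
    raw = base64.urlsafe_b64decode(token + "=" * (-len(token) % 4))
    parts = (uuid.UUID(bytes=raw[i:i + 16]) for i in range(0, len(raw), 16))
    return "|".join(str(p) for p in parts)


sample = "|".join([
    "3d7821d1-e324-4cea-9bd7-763c0b62cdc2",
    "94db7bdb-5e16-4700-a1f9-408ba7f7bee1",
    "63360a17-0807-45a0-a798-31eb2614b0f7",
])
print(len(sample), len(deflate_b64(sample)), len(pack_uuids(sample)))
```

Random UUIDs contain little redundancy beyond their hex formatting, so neither variant can shrink the string dramatically; for very long lists, POST or shorter IDs remain the more robust options.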

What is the usefulness of mb_http_output() given that the output encoding is typically fixed by other means?

All over the Internet, including on Stack Overflow, it is suggested to use mb_http_input('utf-8') to have PHP work in the UTF-8 encoding. For example, see PHP/MySQL encoding problems. � instead of certain characters. On the other hand, the PHP manual says that we cannot fix the input encoding within the PHP script and that mb_http_input is only a way to query what it is, not a way to set it. See http://www.php.net/manual/en/mbstring.http.php and http://php.net/manual/en/function.mb-httpetinput.php . OK, this was just a clarification of the context before the question.
It seems to me that there are a lot of redundant commands in Apache + PHP + HTML to control the conversion from the input encoding to the internal encoding and finally to the output encoding. I don't understand the usefulness of this. For example, if the original input encoding from some external HTTP client is EUC-JP and I set the internal encoding to UTF-8, then PHP would have to make the conversion. Am I right? If I am right, why would I set an input encoding in php.ini (instead of just passing the original one), given that it would immediately be converted to the UTF-8 internal encoding anyway?
A similar question holds for the output. In all my HTML files, I use a meta tag with charset=utf-8. So the output HTTP encoding is fixed. Moreover, in php.ini, I can set the default_charset that will appear in the HTTP header to utf-8. Why would I bother to use mb_http_output('utf-8') when the final output encoding is already fixed?
To sum up, can someone give me a practical, concrete example where mb_http_output('utf-8') is clearly necessary and cannot be replaced by the more usual settings that are often inserted by default in editors such as Dreamweaver?
These two options are just about the worst idea the PHP designers ever had, and they had plenty of bad ideas when it comes to encodings.
To convert strings to a specific encoding, one has to know what encoding one is converting from. Incoming data is often in an undeclared encoding; the server just receives some binary data, it doesn't know what encoding it represents. You should declare what encoding you expect the browser to send by setting the accept-charset attribute on forms; doing that is no guarantee that the browser will do so and it doesn't make PHP know what encoding to expect though.
The same goes for output; PHP strings are just byte arrays, they do not have an associated encoding. I have no idea how PHP thinks it knows how to convert arbitrary strings to a specific encoding upon input or output.
You should handle this manually, and it's really easy to do anyway: declare to clients what encoding you expect, check whether input is in the correct encoding using mb_check_encoding (not _detect encoding or some such, just check), reject invalid input, take care to keep everything in the same encoding within the whole application flow. I.e., ideally you have no conversion whatsoever in your app.
If you do need to convert at any point, make it a Unicode sandwich: convert input from the expected encoding to UTF-8 or another Unicode encoding on input, convert it back to desired output encoding upon output. Whenever you need to convert, make sure you know what you're converting from. You cannot magically "make all strings UTF-8" with one declaration.
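The check-then-convert flow described above can be sketched in Python rather than PHP (a language-neutral illustration; the function names are made up, and in PHP the check would use mb_check_encoding):

```python
def is_valid(raw: bytes, encoding: str) -> bool:
    """Check that raw bytes are valid in the expected encoding; no guessing."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def to_internal(raw: bytes, expected_encoding: str) -> str:
    """Input side of the sandwich: decode strictly from the declared encoding."""
    return raw.decode(expected_encoding, errors="strict")


def to_output(text: str, output_encoding: str) -> bytes:
    """Output side of the sandwich: encode to the desired output encoding."""
    return text.encode(output_encoding, errors="strict")


incoming = "日本語".encode("euc_jp")  # pretend a client sent EUC-JP bytes
if is_valid(incoming, "euc_jp"):
    body = to_output(to_internal(incoming, "euc_jp"), "utf-8")
```

The important point is that the source encoding is always passed in explicitly; nothing is detected or assumed.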

Why are URLs encoded in Base32?

This is a really short question I think but I'm not sure I understand the point of it.
Why are URLs encoded in Base32? What are the benefits of it and what are the drawbacks of it?
Sometimes URL data needs to be encoded to encapsulate things that aren't easily typeable, such as "ÓĆ", or even binary data that has no text representation at all. Putting that inside a query string is problematic. Some servers don't understand Unicode text in a query string, though that situation is certainly getting better.
So the data needs to be encoded in a way that the server can interpret correctly and the application knows how to use. Base32 is commonly used for that. It encodes any binary data into an ASCII text representation of that data. When the original data is needed, it is decoded.
So why not Base64? Base64 will almost always have a shorter encoding length. Base64's weakness is that it uses both upper- and lower-case letters for encoding, so there is a distinction between A and a. Base32 uses only one letter case, so it can be case insensitive. Generally (but not always), URLs are treated as case insensitive, and using Base32 keeps that notion alive. This property is also useful when the encoded data is meant to be typed, read aloud, etc.
The drawback to Base32 is that the resulting encoding is almost always longer due to a much smaller character set.
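Both properties, the longer output and the case insensitivity, can be seen with Python's standard base64 module; the sample bytes are arbitrary.

```python
import base64

data = b"any binary \x00\xff data"

b32 = base64.b32encode(data).decode("ascii")
b64 = base64.b64encode(data).decode("ascii")

# Base32 output is longer than Base64 for the same input.
print(len(b32), len(b64))

# Base32 survives a change of letter case; casefold=True accepts lowercase.
restored = base64.b32decode(b32.lower(), casefold=True)
print(restored == data)
```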

Convert SHA1 back to string

I have a user model in my app, and my password field uses SHA-1. What I want, when I get the SHA-1 from the DB, is to make it a string again. How do I do that?
You can't: SHA-1 is a one-way hash. Given the output of SHA1(X), it is not possible to retrieve X (at least, not without a brute-force search or a dictionary/rainbow table scan).
A very simple way of thinking about this is to imagine I give you a set of three-digit numbers to add up, and you tell me the final two digits of that sum. It's not possible from those two digits for me to work out exactly which numbers you started out with.
See also
Is it possible to reverse a sha1?
Decode sha1 string to normal string
Though they relate to MD5, these other questions may also enlighten you:
Reversing an MD5 Hash
How can it be impossible to “decrypt” an MD5 hash?
You can't; that's the point of SHA-1, MD5, etc. Most of those are one-way hashes for security. If they could be reversed, then anyone who gained access to your database could get all of the passwords. That would be bad.
Instead of de-hashing the values in your database, hash the password attempt and compare that to the hashed value in the database.
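A minimal sketch of that check, assuming the database column holds the hex SHA-1 of the password as in the question (function names are illustrative, and unsalted SHA-1 is used only because that is what the question describes):

```python
import hashlib
import hmac


def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def check_password(attempt: str, stored_hex: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sha1_hex(attempt), stored_hex)


stored = sha1_hex("correct horse")  # what would sit in the DB
print(check_password("correct horse", stored))
print(check_password("wrong guess", stored))
```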
If you're talking about this from a practical viewpoint, just give up now and consider it impossible. Finding the original string is impossible (except by accident). Most of the point of a cryptographically secure hash is to ensure you can't find any other string that produces the same hash either.
If you're interested in research into secure hash algorithms: finding a string that will produce a given hash is called finding a "preimage". If you can manage to do so (with reasonable computational complexity) for SHA-1, you'll probably become reasonably famous among cryptanalysis researchers. The best "break" against SHA-1 that's currently known is a way to find two input strings that produce the same hash, but 1) it's computationally quite expensive (think in terms of a number of machines running 24/7 for months at a time to find one such pair), and 2) it does not work for an arbitrary hash value; it finds one of a special class of input strings for which a matching pair is (relatively) easy to find.
SHA is a hashing algorithm. You can compare the hash of a user-supplied input with the stored hash, but you can't easily reverse the process (rebuild the original string from the stored hash), unless you choose to brute-force it or use rainbow tables (both extremely slow when the original input is sufficiently long).
You can't do that with SHA-1. But, given what you need to do, you can try using AES instead. AES allows encryption and decryption.
