Why the strength of BCryptPasswordEncoder is in between 4 and 31?

Why the strength of BCryptPasswordEncoder is in between 4 and 31? - spring-security

According to the docs :
Implementation of PasswordEncoder that uses the BCrypt strong hashing function. Clients can optionally supply a "strength" (a.k.a. log rounds in BCrypt) and a SecureRandom instance. The larger the strength parameter the more work will have to be done (exponentially) to hash the passwords. The default value is 10.

The strength is translated to iterations. For strength x there will be 2x iterations. Implementations are assumed to use unsigned 32-bit integer, where the maximum value is 4294967295. If x is larger than 31, 2x is bigger than this maximum value and an overflow would occur.
In practice, the Java implementation in Spring Security actually uses a 64-bit long since integers are signed in Java (maximum of int is 231-1).
A strength of 31 or close thereof is very slow and not usable anyway.

Related

Which one is the better CRC scheme?

Say I have to error-check a message of some 120-bits long.I have two alternative for checksum schemes:
Split message to 5 24-bit strings and append each with a CRC8 field
Append the whole message with a CRC32 field
Which scheme has a higher error detection probability, and why? Let's assume no prior knowledge about the error patterns distribution.
UPDATE:
What if the system has a natural mode of failure which is a received cleared bit instead of a set bit (i.e., "1" was Tx-ed but "0" was Rx-ed), and the opposite does not happen?
In this case, the probability of long bursts of error bits is much smaller, assuming that the valid data has a uniform distribution of "0"s and "1"s, so the longest burst will be bound by the longest string of "1"s in the message.

You have to make some assumption about the error patterns. If you have a uniform distribution over all possible errors, then five 8-bit CRCs will detect more of the errors than one 32-bit CRC, simply because the former has 40 bits of redundancy.
However, I can construct many 24-bit error patterns that fool an 8-bit CRC, and use any combination of five of those to get no errors over all of the 8-bit CRCs. Yet almost all of those will be caught by the 32-bit CRC.

A good paper by Philip Koopman goes through evaluation of several CRCs, mostly focusing on their Hamming Distance. Like Mark Adler pointed out, the error distribution plays an important role in CRC selection (e.g. burst errors detection is one of the variable properties of CRC), as is the length of the CRC'ed data.
The Hamming Distance of a CRC indicates the maximum number of errors in the data which are 100% detectable.
Ref:
Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks:
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.5.5027&rep=rep1&type=pdf
8-bit vs 32-bit CRC
For exemple, the 0x97 8-bit CRC polynomial has HD=4 up to 119 bits data words (which is more than your required 24-bit word), which means it detects 100% of 4 bits (or less) errors for data of length 119 bits or less.
On the 32-bit side, the 32-bit CRC 0x9d7f97d6 offer HD=9 up to 223 bits (greater than 5*24=120bits) data words. This means that it will detect 100% of the 9 bits (or less) errors for data composed of 223 bits or less.
Theoretically, 5x 8-bit CRCs would be able to 100% detect 4*4=16 evenly distributed bit flips across your 5 chunks (4 errors per 24-bit chunk). On the other end, the 32-bit CRC would only be able to 100% detect 9 bit flips per 120-bit chunk.
Error Distribution
Knowing all that, the only missing piece is the error distribution pattern. With it in hand, you'll be able to make an informed decision on the best CRC method to use. You seem to say that long burst of errors are not possible, but do not mention the exact maximum length. If that length does go up to 9 bits, you might be better off with the CRC32. If you expect occasional, <4-bit errors, both would do, though the 5x8-bit will consume more bandwidth (40 bits instead of 32 bits). If this is the case, a 32-bit CRC might even be overkill, a smaller CRC16 or even CRC9 could provide enough detection capabilities.
Beyond the hamming window, the CRC will not be able to catch every possible errors. The bigger the data length, the worse the CRC performances.

The CRC32 of course. It will detect ordering errors as between the five segments, as well as giving you 224 as much error detection.

Hardware implementation for integer data processing

I am currently trying to implement a data path which processes an image data expressed in gray scale between unsigned integer 0 - 255. (Just for your information, my goal is to implement a Discrete Wavelet Transform in FPGA)
During the data processing, intermediate values will have negative numbers as well. As an example process, one of the calculation is
result = 48 - floor((66+39)/2)
The floor function is used to guarantee the integer data processing. For the above case, the result is -4, which is a number out of range between 0~255.
Having mentioned above case, I have a series of basic questions.
To deal with the negative intermediate numbers, do I need to represent all the data as 'equivalent unsigned number' in 2's complement for the hardware design? e.g. -4 d = 1111 1100 b.
If I represent the data as 2's complement for the signed numbers, will I need 9 bits opposed to 8 bits? Or, how many bits will I need to process the data properly? (With 8 bits, I cannot represent any number above 128 in 2's complement.)
How does the negative number division works if I use bit wise shifting? If I want to divide the result, -4, with 4, by shifting it to right by 2 bits, the result becomes 63 in decimal, 0011 1111 in binary, instead of -1. How can I resolve this problem?
Any help would be appreciated!

If you can choose to use VHDL, then you can use the fixed point library to represent your numbers and choose your rounding mode, as well as allowing bit extensions etc.
In Verilog, well, I'd think twice. I'm not a Verilogger, but the arithmetic rules for mixing signed and unsigned datatypes seem fraught with foot-shooting opportunities.
Another option to consider might be MyHDL as that gives you a very powerful verification environment and allows you to spit out VHDL or Verilog at the back end as you choose.

Input of a fixed point DSP

i'm new to working with dsps and fixed point and i really need to know:
1. Is it the fixed point dsp that converts the float number to Q format or a device does that before feeding the Dsp?
2. Who specifies the Q format to be used. Does each DSP come with a specified Q_format or the programmer does that in his codes.
3. Can i have an idea of how to perform a simple say 4 by 4 fixed point matrix multiplication in c++?
Thanks in anticipation

The format is usually fixed for a given DSP, e.g. Motorola DSP 56k family uses a 24 bit signed fractional format (Q23).
Fixed point is really just the same as an ordinary integer but there's an implicit scale factor. For most operations this makes no difference, e.g. load/store/add/subtract all work the same way regardless of whether the data is integer or fixed point.
When it comes to multiplication or division however the implicit scaling factor needs to be taken into account - typically there will be a shift after the operation to correct for this. DSP instructions take care of this automatically, whereas normal CPUs have to do this explicitly.
When you're doing e.g. a 4x4 matrix multiply you just use the DSP's native fixed point arithmetic instructions and the scaling is all taken care of automatically.

Block dimensions in CUDA

I have a NVIDIA GTX 570 compute capability 2.0 running cuda-4.0.
The deviceQuery executable in the CUDA SDK gives me information on my CUDA device and its various properties. Two of the lines in the output are
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Why is the 3rd dimension of the block restricted to be upto 64 threads only wheras the X and the Y dimension can vary upto 1024 threads?

EDIT2: ALso, please take this with a grain of salt; this is a purely hypothetical answer, or a guess. There may indeed be a clear hardware-based reason why 64 is the maximum. Frankly I don't know, and my answer is based on an assumption that there is no such hardware limit, per se.
It's probably a combination of three things: first, there is a limit to the number of threads which can be resident inside a block; second, block dimensions are typically in multiples of 32, and even more often in powers of 2 greater than 32; third, coordinate systems used in the solution of multi-dimensional problems are most often oriented so that you're looking at the scene directly (i.e., with the important bits more distributed in X and Y than in Z).
CUDA naturally has to support 1D access, as this is an immensely common and efficient access pattern when it is applicable. TO support this, the X dimension must be allowed to vary over the entire range of 1024 threads.
To support 2D access, which is less common, CUDA should minimally support up to 512 in the X dimension (using the convention that the X dimension should be oriented in the coordinate system so that it measures the biggest spread) and 32 in the Y dimension. It must support up to 1024 in the X dimension, and I suppose they relax the requirement that the X dimension be no smaller than the Y dimension and allow the full 1024 range of Y values. However, in my understanding, 32 would have been plenty big for the Y dimension maximum.
To support 3D access, maintaining X, Y >= Z and trying to reach 1024, it seems to be that in the best case X=Y=Z=10; so there's no real argument for allowing Z to be greater than 10, given my assumptions
In summary, I don't see why they couldn't have made the maximums (1024, 32, 10). My question is why make them (1024, 1024, 64)? The only answer I keep coming back to is to allow some flexibility to programmers to violate the X>=Y>=Z coordinate system convention.
Edit: given my summary and hypothetical answer, the real answer to your question is this: it's an arbitary decision.

My wild guess is that because threadIdx.x, threadIdx.y and threadIdx.z are kept in a special single 32-bit register, possibly with even some other additional data. Maybe warp id? Or maybe multiprocessor-block id to identify which block given thread handles, if given multiprocessor runs more than one?
This is purely speculative, I have no data to support it, but I would imagine that they want to have as few special registers as possible.

Floating point accuracy in F# (and .NET)

In "F# for Scientists" Jon Harrop says:
Roughly speaking, values of type int approximate real
numbers between min-int and max-int with a constant absolute error of +- 1/2
whereas values of the type float have an approximately-constant relative error that
is a tiny fraction of a percent.
Now, what does it mean? Int type is inaccurate?
Why C# for (1 - 0.9) returns 0.1 but F# returns 0.099999999999978 ? Is C# more accurate and suitable for scientific calculations?
Should we use decimal values instead of double/float for scientific calculations?

For an arbitrary real number, either an integral type or a floating point type is only going to provide an approximation. The integral approximation will never be off by more than 0.5 in one direction or the other (assuming that the real number fits within the range of that integral type). The floating point approximation will never be off by more than a small percentage (again, assuming that the real is within the range of values supported by that floating point type). This means that for smaller values, floating point types will provide closer approximations (e.g. storing an approximation to PI in a float is going to be much more accurate than the int approximation 3). However, for very large values, the integral type's approximation will actually be better than the floating point type's (e.g. consider the value 9223372036854775806.7, which is only off by 0.3 when represented as 9223372036854775807 as a long, but which is represented by 9223372036854780000.000000 when stored as a float).
This is just an artifact of how you're printing the values out. 9/10 and 1/10 cannot be exactly represented as floating point values (because the denominator isn't a power of two), just as 1/3 can't be exactly written as a decimal (you get 0.333... where the 3's repeat forever). Regardless of the .NET language you use, the internal representation of this value is going to be the same, but different ways of printing the value may display it differently. Note that if you evaluate 1.0 - 0.9 in FSI, the result is displayed as 0.1 (at least on my computer).
What type you use in scientific calculations will depend on exactly what you're trying to achieve. Your answer is generally only going to be approximately accurate. How accurate do you need it to be? What are your performance requirements? I believe that the decimal type is actually a fixed point number, which may make it inappropriate for calculations involving very small or very large values. Note also that F# includes arbitrary precision rational numbers (with the BigNum type), which may also be appropriate depending on your input.

No, F# and C# uses the same double type. Floating point is almost always inexact. Integers are exact though.
UPDATE:
The reason why you are seeing a difference is due to the printing of the number, not the actual representation.

For the first point, I'd say it says that int can be used to represent any real number in the intger's range, with a constant maximum error in [-0,5, 0.5]. This makes sense. For instance, pi could be represented by the integer value 3, with an error smaller than 0.15.
Floating point numbers don't share this property; their maximum absolute error is not independent of the value you're trying to represent.

3 - This depends on calculations: sometimes float is a good choice, sometimes you can use int. But there are tasks when you lack of precision for any of float and decimal.
The reason against using int:
> 1/2;;
val it : int = 0
The reason against using float (also known as double in C#):
> (1E-10 + 1E+10) - 1E+10;;
val it : float = 0.0
The reason against BCL decimal:
> decimal 1E-100;;
val it : decimal = 0M
Every listed type has it's own drawbacks.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart