How to represent a message for EC ElGamal?

I am working on a project that uses elliptic-curve ElGamal.
I know that EC ElGamal encrypts with the following steps:
1. Represent the message m as a point M in E(Fp).
2. Select k ∈R [1, n−1].
3. Compute C1 = kP.
4. Compute C2 = M + kQ.
5. Return (C1, C2).
Here Q is the intended recipient's public key and P is the base point.
My question is about step 1: how do I represent m as a point? Does one point represent a single character or a group of characters?

There's no obvious way to attach m to points in E(Fp). However, you can use a variant of ElGamal such as the Menezes-Vanstone elliptic curve cryptosystem, which masks the message as field elements instead of requiring it to be a point on the curve;
a good reference is here (p. 31).
As for Java code, I suggest you do some work yourself and post another question on SO when you hit a problem you really can't solve on your own.
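On the "one character or a group of characters" part of the question: one commonly described approach (a Koblitz-style probabilistic embedding, which is not taken from the reference above) packs a block of bytes into the x-coordinate and tries a few candidates until one lies on the curve. The sketch below is in Python rather than Java, and it assumes the secp256k1 field and curve purely for illustration. The names `encode_to_point`, `decode_from_point`, and the padding factor `K` are hypothetical.

```python
# Field prime and curve y^2 = x^3 + 7 of secp256k1, used here only as an example.
P = 2**256 - 2**32 - 977   # P % 4 == 3, so square roots are a single pow()
A, B = 0, 7
K = 256                    # number of x candidates reserved per message block


def encode_to_point(msg: bytes):
    """Embed a short byte string in the x-coordinate of a curve point."""
    m = int.from_bytes(msg, "big")
    if (m + 1) * K > P:
        raise ValueError("message block too long for this field")
    for j in range(K):
        x = m * K + j
        rhs = (pow(x, 3, P) + A * x + B) % P
        y = pow(rhs, (P + 1) // 4, P)      # candidate square root of rhs
        if y * y % P == rhs:               # rhs was a quadratic residue
            return x, y
    raise ValueError("no valid point found for this block")


def decode_from_point(point, length: int) -> bytes:
    """Recover the byte string; the length is passed to restore leading zeros."""
    x, _ = point
    return (x // K).to_bytes(length, "big")


point = encode_to_point(b"hello")
recovered = decode_from_point(point, 5)
```

Under these assumptions a single point carries a block of characters (about 30 bytes with a 256-bit field), and a longer message would be split into several blocks, one point each.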


The tensor product ti() in GAM package gives incorrect results

I was surprised to find that it is somewhat difficult to obtain a correct fit of an interaction function from gam().
To be more specific, I want to estimate an additive function:
y = m_1(x) + m_2(z) + m_{12}(x,z) + u,
where m_1(x) = x^2, m_2(z) = z^2, m_{12}(x,z) = xz. The following code generates this model:
test1 <- function(x, z, sx = 1, sz = 1) {
  #-- m1(x) function
  m.x <- x^2
  m.x <- m.x - mean(m.x)
  #-- m2(z) function
  m.z <- z^2
  m.z <- m.z - mean(m.z)
  #-- m12(x,z) function
  m.xz <- x * z
  m.xz <- m.xz - mean(m.xz)
  m <- m.x + m.z + m.xz
  return(list(m = m, m.x = m.x, m.z = m.z, m.xz = m.xz))
}
n <- 1000
a=0
b=2
x <- runif(n,a,b)/20
z <- runif(n,a,b)
u <- rnorm(n,0,0.5)
model<-test1(x,z)
y <- model$m + u
I then fit the model with gam() from the mgcv package:
library(mgcv)
b3 <- gam(y ~ ti(x) + ti(z) + ti(x, z))
vis.gam(b3);title("tensor anova")
#---extracting basis matrix
B.f3<-model.matrix.gam(b3)
#---extracting series estimator
b3.hat<-b3$coefficients
Question: when I plot the functions estimated by gam() above against the true functions, I get the following:
par(mfrow=c(1,3))
#---m1(x)
B.x<-B.f3[,c(2:5)]
b.x.hat<-b3.hat[c(2:5)]
plot(x,B.x%*%b.x.hat)
points(x,model$m.x,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
#---m2(z)
B.z<-B.f3[,c(6:9)]
b.z.hat<-b3.hat[c(6:9)]
plot(z,B.z%*%b.z.hat)
points(z,model$m.z,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
#---m12(x,z)
B.xz<-B.f3[,-c(1:9)]
b.xz.hat<-b3.hat[-c(1:9)]
plot(x,B.xz%*%b.xz.hat)
points(x,model$m.xz,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
However, the estimate of m_1(x) is very different from x^2, and the estimate of the interaction m_{12}(x,z) is also very different from the xz defined in test1 above. The results are the same if I use predict(b3).
I really can't figure it out. Can anybody explain why the results come out like this? I would greatly appreciate it!
First, the issue above is of course not caused by the package. It is closely tied to the identification conditions of the smooth functions. One common practice is to assume that E(m_j(.)) = 0 for every individual function j = 1,...,d, and E(m_ij(x_i,x_j)|x_i) = E(m_ij(x_i,x_j)|x_j) = 0 for i not equal to j. Those conditions require centered basis functions in the series estimator, which the GAM package already uses. However, in my case above, the function m(x,z) = x*z defined in test1 does not satisfy these identification assumptions, since the integral of x*z with respect to either x or z is not zero when x and z are both nonnegative, as they are here.
Furthermore, a series estimator allows the individual and interaction functions to be identified if one imposes m(0) = 0 or m(0,x_j) = m(x_i,0) = 0. This can be readily achieved by centering the basis functions around zero. I have tried both cases, and they work well whenever the DGP satisfies the identification conditions.
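To make the identification point concrete, here is a small sketch (in Python rather than R, and not part of the original code) of how x*z splits into a constant, two main effects, and a pure interaction whose conditional means vanish when x and z are independent. The function name `split_xz` is hypothetical. The means 0.05 and 1 follow from x = runif(n,0,2)/20 and z = runif(n,0,2) in the question.

```python
def split_xz(x, z, mean_x, mean_z):
    """Split x*z into constant + main effect in x + main effect in z + interaction.

    The identity x*z = const + main_x + main_z + inter holds for any numbers;
    the interaction term has zero conditional mean only if x and z are independent.
    """
    const = mean_x * mean_z
    main_x = mean_z * (x - mean_x)           # absorbed by the smooth in x
    main_z = mean_x * (z - mean_z)           # absorbed by the smooth in z
    inter = (x - mean_x) * (z - mean_z)      # what a centered interaction can fit
    return const, main_x, main_z, inter


# One sample point, using the population means implied by the question's design.
parts = split_xz(0.08, 1.5, mean_x=0.05, mean_z=1.0)
total = sum(parts)
```

With these means the main effect in x picks up an extra linear term 1.0*(x - 0.05). On [0, 0.1] that term is larger in magnitude than x^2 itself, which would be consistent with the fitted ti(x) not looking like x^2.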

How do I tell Maxima about valid approximations of subexpressions of a large expression?

I have a fairly large expression that involves a lot of subexpressions of the form (100*A^3 + 200*A^2 + 100*A)*x or (-A^2 - A)*y or (100*A^2 + 100*A)*z.
I know that in this case it is valid to make the approximation A+1 ~ A, which effectively removes everything but the highest power of A in each coefficient, but I don't know how to tell Maxima this.
I'm now looking for functions, tools, or methods that I can use to guide Maxima in dropping the terms that aren't important.
I tried subst, but that requires me to specify each and every factor separately, because:
subst([A+1=B], (A+2)*(A+1)*2);
subst([A+1=B], (A+2)*(A*2+2));
(%o1) 2*(A+2)*B
(%o2) (A+2)*(2*A+2)
(that is, I need to add one substitution for each slightly different variant)
I tried ratsubst, but that's too eager to change every occurrence:
ratsubst(B, A+1, A*(A+1)*2);
ratsubst(B, A+1, A*(A*2+2));
(%o3) 2*B^2-2*B
(%o4) 2*B^2-2*B
which isn't actually simpler, as I would have preferred the answer to have been given as 2*B^2.
In another answer (https://stackoverflow.com/a/22695050/5999883), the functions let and letsimp were suggested for substituting values, but I can't get them to do anything useful:
x:(A+1)*A;
let ( A+1, B );
letsimp(x);
(x) A*(A+1)
(%o6) A+1 --> B
(%o7) A^2+A
Again, I'd like to approximate this expression to A^2 (B^2, whatever it's called).
I understand that this is, in general, a hard problem (is e.g. A^2 + 10^8*A still okay to approximate as A^2?), but I think that what I'm looking for is a function or method of calculation that is a little smarter than subst and can recognize that the same substitution applies in the expression A^2+A as in 100*A^2+100*A or -A^2-A, instead of making me create a list of three (or twenty) individual substitutions when calling subst. The "nice" part of the full expression that I'm working on is that each of these A factors is of the form k*A^n*(A+1)^m for various small integers n, m, so I never actually end up with the degenerate case mentioned above.
(I briefly thought of re-expressing my expression as a polynomial in A, but this will not work, as the only valid approximation of the expression (A^3+A^2+A)*x + y is A^3*x + y: I know nothing about the relative sizes of x and y.)
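To pin down the operation being asked for, here is a minimal sketch in Python (not Maxima, and not a Maxima answer) that keeps only the highest power of A separately inside each coefficient. It assumes the expression has already been collected by x, y, z, with each coefficient stored as a mapping from exponent of A to numeric coefficient. The names `leading_term` and `approximate` are hypothetical.

```python
def leading_term(poly):
    """poly maps exponent of A -> coefficient; keep only the highest-power term."""
    nonzero = {exp: coef for exp, coef in poly.items() if coef != 0}
    if not nonzero:
        return {}
    top = max(nonzero)
    return {top: nonzero[top]}


def approximate(expr):
    """expr maps a symbol name (e.g. 'x') -> its coefficient polynomial in A.

    Each coefficient is truncated on its own, so nothing is assumed about the
    relative sizes of x, y and z.
    """
    return {symbol: leading_term(poly) for symbol, poly in expr.items()}


# (100*A^3 + 200*A^2 + 100*A)*x + (-A^2 - A)*y + (100*A^2 + 100*A)*z
expr = {
    "x": {3: 100, 2: 200, 1: 100},
    "y": {2: -1, 1: -1},
    "z": {2: 100, 1: 100},
}
approx = approximate(expr)
```

Here `approx` corresponds to 100*A^3*x - A^2*y + 100*A^2*z, which is the result the approximation A+1 ~ A is meant to give.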

Parameter "triangulatedPoints" of recoverPose and math behind it

cv::recoverPose has a parameter "triangulatedPoints", as seen in the documentation, but the math behind it is not documented, even in the sources (relevant commit on GitHub).
When I use it, I get a matrix of the following form:
[0.06596200907402348, 0.1074107606919504, 0.08120752154556411,
0.07162400555712592, 0.1112415181779849, 0.06479560707001968,
0.06812069103377787, 0.07274771866295617, 0.1036230973846902,
0.07643884790206311, 0.09753859499789987, 0.1050111597547035,
0.08431322508162108, 0.08653721971228882, 0.06607013741719928,
0.1088621999959361, 0.1079215237863785, 0.07874160849424018,
0.07888037486261903, 0.07311940086190356;
-0.3474319603010109, -0.3492386196164926, -0.3592673043398864,
-0.3301695131649525, -0.3398606744869519, -0.3240186574427479,
-0.3302508442361889, -0.3534091474425142, -0.3134288005980755,
-0.3456284001726975, -0.3372514921152191, -0.3229005408417835,
-0.3156005118578394, -0.3545418178651592, -0.3427899760859008,
-0.3552801904337188, -0.3368860879000375, -0.3268499974874541,
-0.3221050630233929, -0.3395139819250934;
-0.9334091581425227, -0.9288726274060354, -0.9277125424980246,
-0.9392374374147775, -0.9318967835907961, -0.941870018271934,
-0.9394698966781299, -0.9306592884695234, -0.9419749503870455,
-0.9332801148509925, -0.9343740431697417, -0.9386198310107222,
-0.9431781968459053, -0.9290466865633286, -0.9351167772249444,
-0.9264105322194914, -0.933362882155191, -0.9398254944757025,
-0.9414486961893244, -0.935785675955617;
-0.0607238817598344, -0.0607532477465341, -0.06067768097603395,
-0.06075467523485482, -0.06073245675798231, -0.06078081616640227,
-0.06074754785132623, -0.0606879948481664, -0.06089198212719162,
-0.06071522666667255, -0.06076842109618678, -0.06083346023742937,
-0.06084805655000008, -0.0606931888685702, -0.06071558440082779,
-0.06073329803512636, -0.06078189449161094, -0.06080195858434526,
-0.06083228813425822, -0.06073695721101467]
i.e. a 4x20 matrix (in this case there were 20 points). I want to convert this data to a std::vector in order to use it in solvePnP. How do I do that, and what is the math here? Thanks!
OpenCV offers a triangulatePoints function, which has the same output:
points4D 4xN array of reconstructed points in homogeneous coordinates.
This indicates that each column is a 3D point in homogeneous coordinates. However, your points do not look quite as I would expect. For instance, your first point is:
[0.06596200907402348, -0.3474319603010109, -0.9334091581425227, -0.0607238817598344]
But I would expect the last component to be 1.0 already. You should double-check whether something is wrong here. You can always remove the scaling of a point by dividing each component by the last one:
[x, y, z, w] = w [x/w, y/w, z/w, 1]
Then use the first three components for your PnP solution.
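The division can be sketched as follows. This is plain Python on nested lists rather than C++ with cv::Mat, and `dehomogenize` is a hypothetical helper name. It assumes the 4xN layout shown in the question: rows x, y, z, w, with one column per point.

```python
def dehomogenize(points4d):
    """Convert a 4xN homogeneous array (rows x, y, z, w) to a list of (X, Y, Z)."""
    xs, ys, zs, ws = points4d
    points3d = []
    for x, y, z, w in zip(xs, ys, zs, ws):
        if w == 0:
            # w == 0 is a point at infinity and has no Euclidean counterpart
            raise ValueError("cannot dehomogenize a point with w == 0")
        points3d.append((x / w, y / w, z / w))
    return points3d


# First column of the matrix from the question.
first = dehomogenize([
    [0.06596200907402348],
    [-0.3474319603010109],
    [-0.9334091581425227],
    [-0.0607238817598344],
])
```

The same loop in C++ would fill a std::vector of 3D points, one per column, which is the shape solvePnP takes for its object points.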
I hope this helps

What's a good practice for programming with dynamic inputs in dplyr 0.3?

My original intention is to integrate dplyr with Shiny.
Prior to 0.3 I used the eval(parse(text=....)) and do.call() approaches.
In 0.3, I saw two more options, for example:
var <- c('disp','hp')
select_(mtcars,.dots = as.lazy_dots(var))
select(mtcars,one_of(var))
Which one is better? I intend to pass the selectInput values from a Shiny app to drive data transformations through dplyr.
Another question: what is the right way to join two different datasets on dynamic but different key columns? Is there anything I can leverage in 0.3?
For example:
col_a and col_b are the key variables for joining datasets a and b:
left_join(dataset_a,dataset_b, by=c(col_a=col_b))
Thanks.
After a few attempts, here is my solution to the second question: use a function to create a named vector, then pass it to left_join.
joinCol_a = xxx
joinCol_b = xxx
# Build a named character vector: the name is the key column in dataset_a,
# the value is the key column in dataset_b.
f <- function(a, b) {
  vec <- c(b)
  names(vec) <- a
  return(vec)
}
left_join(dataset_a, dataset_b, by = f(joinCol_a, joinCol_b))
I know it's not the best solution, but it is what I can think of so far.

Can a SHA-1 hash be all-zeroes?

Is there any input for which SHA-1 will compute a hex value of forty zeros, i.e. "0000000000000000000000000000000000000000"?
Yes, it's just incredibly unlikely. I.e. one in 2^160, or 0.00000000000000000000000000000000000000000000006842277657836021%.
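For reference, that figure can be reproduced with a couple of lines of Python. This is only the arithmetic behind the estimate, under the usual assumption that SHA-1 outputs behave like uniformly random 160-bit values; it says nothing about whether such an input actually exists.

```python
# Chance that one given input hits one specific 160-bit digest,
# assuming the hash behaves like a uniformly random function.
probability = 2.0 ** -160
percent = probability * 100

print(f"{probability:.3e}")   # as a probability
print(f"{percent:.3e} %")     # as a percentage
```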
Also, because SHA1 is cryptographically strong, it would be computationally infeasible (at least with current computer technology; all bets are off for emergent technologies such as quantum computing) to find out what data would result in an all-zero hash until it occurred in practice. If you really must use the "0" hash as a sentinel, be sure to include an appropriate assertion (that you did not just hash input data to your "zero" hash sentinel) that survives into production. It is a failure condition your code will permanently need to check for. WARNING: your code will be permanently broken if that ever happens.
Depending on your situation (if your logic can cope with handling the empty string as a special case in order to forbid it from input) you could use the SHA1 hash ('da39a3ee5e6b4b0d3255bfef95601890afd80709') of the empty string. Also possible is using the hash for any string not in your input domain such as sha1('a') if your input has numeric-only as an invariant. If the input is preprocessed to add any regular decoration then a hash of something without the decoration would work as well (eg: sha1('abc') if your inputs like 'foo' are decorated with quotes to something like '"foo"').
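A minimal sketch of both ideas, using Python's standard hashlib. The names `ZERO_HASH`, `EMPTY_HASH`, and `checked_sha1` are hypothetical, and the check uses an explicit raise rather than `assert` so that it is not stripped when Python runs with optimizations enabled.

```python
import hashlib

ZERO_HASH = "0" * 40                         # all-zero sentinel from the question
EMPTY_HASH = hashlib.sha1(b"").hexdigest()   # alternative sentinel: hash of ""


def checked_sha1(data: bytes, sentinel: str = ZERO_HASH) -> str:
    """Hash data, but fail loudly if the digest collides with the sentinel."""
    digest = hashlib.sha1(data).hexdigest()
    if digest == sentinel:
        raise RuntimeError("input hashed to the reserved sentinel value")
    return digest


digest = checked_sha1(b"foo")
```

If `EMPTY_HASH` is used as the sentinel instead, the empty input has to be rejected before hashing, which is exactly the special case described above.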
I don't think so.
There is no easy way to show why it's not possible. If there were, then this would itself be the basis of an algorithm to find collisions.
Longer analysis:
The preprocessing makes sure that there is always at least one 1 bit in the input.
The loop over w[i] will leave the original stream alone, so there is at least one 1 bit in the input (words 0 to 15). Even with clever design of the bit patterns, at least some of the values from 0 to 15 must be non-zero since the loop doesn't affect them.
Note: leftrotate is circular, so no 1 bits will get lost.
In the main loop, it's easy to see that the factor k is never zero, so temp can't be zero for the reason that all operands on the right hand side are zero (k never is).
This leaves us with the question whether you can create a bit pattern for which (a leftrotate 5) + f + e + k + w[i] returns 0 by overflowing the sum. For this, we need to find values for w[i] such that w[i] = 0 - ((a leftrotate 5) + f + e + k)
This is possible for the first 16 values of w[i] since you have full control over them. But the words 16 to 79 are again created by xoring the first 16 values.
So the next step could be to unroll the loops and create a system of linear equations. I'll leave that as an exercise to the reader ;-) The system is interesting since we have a loop that creates additional equations until we end up with a stable result.
Basically, the algorithm was chosen in such a way that you can create individual 0 words by selecting input patterns but these effects are countered by xoring the input patterns to create the 64 other inputs.
Just an example: to make temp 0 in the first round, we have
a = h0 = 0x67452301
f = (b and c) or ((not b) and d)
= (h1 and h2) or ((not h1) and h3)
= (0xEFCDAB89 & 0x98BADCFE) | (~0xEFCDAB89 & 0x10325476)
= 0x98badcfe
e = 0xC3D2E1F0
k = 0x5A827999
so (a leftrotate 5) + f + e + k = 0x9fb498b3 (mod 2^32), which gives us w[0] = 0 - 0x9fb498b3 = 0x604b674d, etc. This value is then used in the words 16, 19, 22, 24-25, 27-28, 30-79.
Word 1, similarly, is used in words 1, 17, 20, 23, 25-26, 28-29, 31-79.
As you can see, there is a lot of overlap. If you calculate the input value that would give you a 0 result, that value influences at least 32 other input values.
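The first-round arithmetic in this example can be checked with a short Python sketch. It uses the standard SHA-1 initial values and first-round constant quoted above; `rotl` is a hypothetical helper name, and everything is reduced mod 2^32.

```python
MASK = 0xFFFFFFFF  # keep every value in 32 bits


def rotl(value, bits):
    """Circular left rotation of a 32-bit word."""
    return ((value << bits) | (value >> (32 - bits))) & MASK


# SHA-1 initial chaining values and the round constant for rounds 0-19.
h0, h1, h2, h3, h4 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0
k = 0x5A827999

a, b, c, d, e = h0, h1, h2, h3, h4
f = (b & c) | (~b & MASK & d)                  # round function for rounds 0-19
partial = (rotl(a, 5) + f + e + k) & MASK      # everything in temp except w[0]
w0 = (-partial) & MASK                         # the w[0] that cancels it mod 2^32
temp = (partial + w0) & MASK                   # should now be zero

print(hex(f), hex(partial), hex(w0), temp)
```

This only zeroes temp in round 0. As the answer says, the remaining rounds reuse w[0] through the message schedule, so the other 79 rounds cannot be tuned independently.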
The post by Aaron is incorrect. It is getting hung up on the internals of the SHA1 computation while ignoring what happens at the end of the round function.
Specifically, see the pseudo-code from Wikipedia. At the end of the round, the following computation is done:
h0 = h0 + a
h1 = h1 + b
h2 = h2 + c
h3 = h3 + d
h4 = h4 + e
So an all 0 output can happen if h0 == -a, h1 == -b, h2 == -c, h3 == -d, and h4 == -e going into this last section, where the computations are mod 2^32.
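As a sketch of that condition (Python, single-block case, with hypothetical helper names `required_working_vars` and `finalize`): for a one-block message the chaining values going in are the standard initial constants, so the working variables a..e that would be needed after round 79 are simply their negations mod 2^32.

```python
MASK = 0xFFFFFFFF

# Standard SHA-1 initial chaining values h0..h4.
H_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def required_working_vars(h):
    """Values of (a, b, c, d, e) after the last round that make h + vars == 0."""
    return tuple((-word) & MASK for word in h)


def finalize(h, working):
    """The final addition step of the compression function, mod 2^32."""
    return tuple((hw + w) & MASK for hw, w in zip(h, working))


target = required_working_vars(H_INIT)
digest_words = finalize(H_INIT, target)   # all five words come out as zero
```

Whether any message actually drives the 80 rounds to `target` is the open part; the sketch only shows that the final addition does not rule it out.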
To answer your question: nobody knows whether there exists an input that produces an all-zero output, but cryptographers expect that there is one, based on the simple argument provided by daf.
Without any knowledge of SHA-1 internals, I don't see why any particular value should be impossible (unless explicitly stated in the description of the algorithm). An all-zero value is no more or less probable than any other specific value.
Contrary to all of the current answers here, nobody knows that. There's a big difference between a probability estimation and a proof.
But you can safely assume it won't happen. In fact, you can safely assume that just about ANY value won't be the result (assuming it wasn't obtained through some SHA-1-like procedures). You can assume this as long as SHA-1 is secure (it actually isn't anymore, at least theoretically).
People don't seem to realize just how improbable it is (if all of humanity focused all of its current resources on finding a zero hash by brute force, it would take about xxx... ages of the current universe to crack it).
If you know the function is safe, it's not wrong to assume it won't happen. That may change in the future, so assume some malicious inputs could give that value (e.g. don't erase user's HDD if you find a zero hash).
If anyone still thinks it's not "clean" or something, I can tell you that nothing is guaranteed in the real world, because of quantum mechanics. You assume you can't walk through a solid wall just because of an insanely low probability.
[I'm done with this site... My first answer here, I tried to write a nice answer, but all I see is a bunch of downvoting morons who are wrong and can't even tell the reason why are they doing it. Your community really disappointed me. I'll still use this site, but only passively]
Contrary to all answers here, the answer is simply No.
The hash value always contains bits set to 1.
