How to get the number of combinations for a number that contains only two different digits? - delphi

For example, a two-digit number has 4 combinations: 11, 12, 21, 22. A three-digit number has 8 combinations: 111, 112, ..., 222.
How do I get the number of combinations for a number that has 4, 5, ... 10 or more digits?
Thanks
P.S. This refers to Delphi :)

The answer is 2^N, where N is the number of digits.
This is a purely mathematical problem, and concerns very basic combinatorics. It is easy to see why 2^N is the right answer. Indeed, there are two ways to choose the first digit. For each such choice, there are two ways to choose the second digit. Hence, there are 2×2 ways to choose a two-digit number. For each such number, there are two ways to add a third digit, making 2×2×2 ways to construct a three-digit number. Hence, there are
2 × 2 × ... × 2 = 2^N
ways to construct an N-digit number.
In Delphi, you compute 2^N with Power(2, N) (uses Math). [A less naïve way, which works for N < 31, is 1 shl N.]
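If you want to sanity-check the formula, here is a small brute-force sketch (in Python rather than Delphi, purely for illustration) that enumerates every N-digit string over the two digits and counts them:

from itertools import product

# Brute-force check of the 2^N count: enumerate every N-digit string
# built from the two digits 1 and 2 and count them.
for n in range(2, 6):
    combos = [''.join(p) for p in product('12', repeat=n)]
    print(n, len(combos), 2 ** n)   # the two counts always agree

In Delphi itself, the Power(2, N) or 1 shl N one-liners above are all you need.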

Related

Finding a controversy parameter from aggregated votes

I made a survey where users could vote on a subject. They were allowed to either yay it (+1), nay it (–1), or say they don't care (0).
I only have the aggregate results in Google Sheets like
          yay  nay  dontcare
Option A:  32   14        23
Option B:  12   37        20
Option C:  40   17        12
Option D:  64    3         2
The number of votes are always the same on every option.
Now I need to find out how controversial the answers are. I thought about STDEVP, but I do not have a list of cells, just the aggregates.
How do I find the standard deviation here with Google Sheets?
Assuming you ignore the don't-cares, you can just take the prevalence p of yays and use sd = sqrt(p(1-p)).
So if yays are in column B and nays in column C, you use
=SQRT(B2/SUM(B2:C2) * (C2/SUM(B2:C2)))
Note that this is the standard deviation for a population.
If you want to include the don't-cares, you can calculate the mean in E2 with
=SUMPRODUCT(B2:D2, {1, -1, 0}) / SUM(B2:D2)
Then you can calculate variance like this in F2
=SUMPRODUCT(ArrayFormula({1, -1, 0}-E2)^2, B2:D2) / (SUM(B2:D2)-1)
which takes every 1, -1, or 0, subtracts the mean, squares this deviation, and averages the squared deviations, dividing by the number of votes minus 1 (for a sample; leave the -1 out if you assume you have the population).
The standard deviation is then
=SQRT(F2)
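If it helps to see the same arithmetic outside of Sheets, here is a small Python sketch that mirrors the formulas above, using Option A's aggregates from the question (the variable names are mine, not anything from your sheet):

from math import sqrt

yay, nay, dontcare = 32, 14, 23          # Option A from the question
n = yay + nay + dontcare

# Ignoring the don't-cares: sd = sqrt(p * (1 - p)), with p the share of yays
p = yay / (yay + nay)
sd_ignoring = sqrt(p * (1 - p))

# Including the don't-cares as 0 votes: mean, then sample variance, then sd
mean = (yay * 1 + nay * -1 + dontcare * 0) / n
variance = (yay * (1 - mean) ** 2
            + nay * (-1 - mean) ** 2
            + dontcare * (0 - mean) ** 2) / (n - 1)
sd_including = sqrt(variance)

print(sd_ignoring, sd_including)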

Compute similarity between n entities

I am trying to compute the similarity between n entities that are being described by entity_id, type_of_order, total_value.
An example of the data might look like:
NR  entity_id  type_of_order  total_value
 1          1              A           10
 2          1              B           90
 3          1              C           70
 4          2              B           20
 5          2              C           40
 6          3              A           10
 7          3              B           50
 8          3              C           20
 9          4              B           50
10          4              C           80
My question would be: what is a good way of measuring the similarity between, for example, entity_id 1 and 2 with regard to the type_of_order and the total_value for that type of order?
Would a simple KNN give satisfactory results or should I consider other algorithms?
Any suggestion would be much appreciated.
The similarity metric is a heuristic to capture a relationship between two data rows, with respect to the data semantics and the purpose of the training. We don't know your data; we don't know your usage. It would be irresponsible to suggest metrics to solve a problem when we have no idea what problem we're solving.
You have to address this question to the person you find in the mirror. You've given us three features with no idea of what they mean or how they relate. You need to quantify ...
relative distances within features: under type_of_order, what is the relationship (distance) between any two measurements? If we arbitrarily assign d(A, B) = 1, then what is d(B, C)? We have no information to help you construct this. Further, if we give that some value c, then what is d(A, C)? In various popular metrics, it could be 1+c, |1-c|, all distances could be 1, or perhaps it's something else -- even more than 1+c in some applications.
Even in the last column, we cannot assume that d(10, 20) = d(40, 50); the actual difference could be a ratio, difference of squares, etc. Again, this depends on the semantics behind these labels.
relative weights between features: How do the differences in the various columns combine to provide a similarity? For instance, how does d([A, 10], [B, 20]) compare to d([A, 10], [C, 30])? That's two letters in the left column, two steps of 10 in the right column. How about d([A, 10], [A, 20]) vs d([A, 10], [B, 10])? Are the distances linear, or do the relationships change as we slide up the alphabet or to higher numbers?
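To make those two questions concrete, here is a deliberately hypothetical Python sketch: the order-type distance table and the feature weights are placeholders standing in for the domain knowledge only you can supply, not values anyone can derive from the data shown.

# Hypothetical sketch only: TYPE_DISTANCE and the weights are placeholders
# for decisions that must come from the data's semantics.
TYPE_DISTANCE = {('A', 'B'): 1.0, ('B', 'C'): 1.0, ('A', 'C'): 2.0}   # d(A, B) = 1 chosen arbitrarily
W_TYPE, W_VALUE = 1.0, 0.01                                           # relative feature weights (guesses)

def type_distance(a, b):
    # Look up the within-feature distance for type_of_order, in either order.
    if a == b:
        return 0.0
    return TYPE_DISTANCE.get((a, b), TYPE_DISTANCE.get((b, a)))

def row_distance(row1, row2):
    # Combine the two features into one distance; the weighting is the open question.
    (t1, v1), (t2, v2) = row1, row2
    return W_TYPE * type_distance(t1, t2) + W_VALUE * abs(v1 - v2)

print(row_distance(('A', 10), ('B', 20)))   # how this should compare to the next line
print(row_distance(('A', 10), ('C', 30)))   # is exactly what only you can decide

Until you can justify every number in a sketch like that, no distance-based method (KNN included) will give you meaningful similarities.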

How to apply % to a negative number in Visual FoxPro

How does % work with a negative number in Visual FoxPro?
MOD(10,-3) = -2
MOD(-10,3) = 2
MOD(-10,-3) = -1
Why?
It is a regular modulo:
The mod function is defined as the amount by which a number exceeds
the largest integer multiple of the divisor that is not greater than
that number.
You can think of it like this:
10 % -3:
The result takes the sign of the divisor: FLOOR(10 / -3) = -4, and (-3) * (-4) = 12, so 10 % -3 = 10 - 12 = -2.
-10 % 3:
Now, why is -10 % 3 equal to 2?
The easiest way to think about it is to add a multiple of the divisor to the negative number until it becomes non-negative:
-10 + (4*3) = 2, so -10 % 3 = (-10 + 12) % 3 = 2 % 3 = 2
Here's what we said about this in The Hacker's Guide to Visual FoxPro:
MOD() and % are pretty straightforward when dealing with positive numbers, but they get interesting when one or both of the numbers is negative. The key to understanding the results is the following equation:
MOD(x,y) = x - (y * FLOOR(x/y))
Since there is no single mathematical convention for the modulo of negative numbers, it's a pleasure to see that the FoxPro definitions are mathematically consistent. However, they may be different from what you'd initially expect, so you may want to check for negative divisors or dividends.
A little testing (and the manuals) tells us that a positive divisor gives a positive result while a negative divisor gives a negative result.
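A quick way to convince yourself is to evaluate that formula directly. This small Python sketch reimplements it and reproduces the three results from the question (Python's own % happens to follow the same divisor-sign convention):

from math import floor

def vfp_mod(x, y):
    # MOD(x, y) = x - (y * FLOOR(x / y))
    return x - y * floor(x / y)

print(vfp_mod(10, -3))    # -2
print(vfp_mod(-10, 3))    #  2
print(vfp_mod(-10, -3))   # -1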

Error correction on a short decimal number

I have short, variable-length decimal numbers, like #41551, that are manually transcribed by humans. Mistyping one will cause undesirable results, so my first thought is to use the Luhn algorithm to add a checksum -- #41551-3. However, that will only detect an error, not correct it. It seems adding another check digit should be able to detect and correct a single-digit error, so given #41515-3 (a transposition error), I'd be able to recover the correct #41551.
Something like a Hamming code seems like the right place to look, but I haven't been able to figure out how to apply them to decimal, instead of binary, data. Is there an algorithm intended for this use, or can Hamming/Reed-Solomon etc be adapted to this situation?
Yes, you can use Hamming codes in combination with check equations for correction. Sum the data digits modulo 10 to find the check digits. Place the check digits at positions 1, 2, 4, 8, ...
I can only provide an algorithm with FIVE extra digits.
Note: 5 original digits is really a worst case.
With FIVE extra digits you can do ECC for up to 11 original digits.
This works like classical ECC calculations, but in decimal:
Original (decimal) 5-digit number: o0,o1,o2,o3,o4
Distribute digits to positions 0..9 in the following manner:
position: 0   1   2   3   4   5   6   7   8   9
data:                 o0      o1  o2  o3      o4
check:    c4  c0  c1      c2              c3    <- will be calculated check digits
Calculate digits at positions 1,2,4,8 like this:
c0, pos 1: (10 - (Sum positions 3,5,7,9)%10)%10
c1, pos 2: (10 - (Sum positions 3,6,7)%10)%10
c2, pos 4: (10 - (Sum positions 5,6,7)%10)%10
c3, pos 8: (10 - (Sum positions 9)%10)%10
AFTER this calculation, calculate digit at position:
c4, pos 0: (10 - (Sum positions 1..9)%10)%10
You might then reshuffle like this:
o0o1o2o3o4-c0c1c2c3c4
To check, write all digits in the following order:
position: 0   1   2   3   4   5   6   7   8   9
digit:    c4  c0  c1  o0  c2  o1  o2  o3  c3  o4
Then calculate:
c0' = (Sum positions 1,3,5,7,9)%10
c1' = (Sum positions 2,3,6,7)%10
c2' = (Sum positions 4,5,6,7)%10
c3' = (Sum positions 8,9)%10
c4' = (Sum all positions)%10
If c0',c1',c2',c3',c4' are all zero then there is no error.
If there are some c[0..3]' which are non-zero and ALL of the non-zero
c[0..3]' have the value c4', then there is an error in one digit.
You can calculate the position of the erroneous digit and correct.
(Exercise left to the reader).
If c[0..3]' are all zero and only c4' is nonzero, then you have a one-digit error in c4.
If any c[0..3]' is nonzero and differs in value from c4', then you have (at least) an uncorrectable error in two digits.
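For reference, here is a small Python sketch of the scheme as described (the layout and check equations follow the positions given above; treat it as an illustration, not a vetted implementation):

def encode(data):
    # data: the five original digits [o0..o4]; returns the 10 digits by position 0..9.
    pos = [0] * 10
    pos[3], pos[5], pos[6], pos[7], pos[9] = data
    pos[1] = (10 - (pos[3] + pos[5] + pos[7] + pos[9]) % 10) % 10   # c0
    pos[2] = (10 - (pos[3] + pos[6] + pos[7]) % 10) % 10            # c1
    pos[4] = (10 - (pos[5] + pos[6] + pos[7]) % 10) % 10            # c2
    pos[8] = (10 - pos[9] % 10) % 10                                # c3
    pos[0] = (10 - sum(pos[1:]) % 10) % 10                          # c4, calculated last
    return pos

def syndrome(pos):
    # Returns (c0', c1', c2', c3', c4'); all zeros means no error detected.
    c0 = (pos[1] + pos[3] + pos[5] + pos[7] + pos[9]) % 10
    c1 = (pos[2] + pos[3] + pos[6] + pos[7]) % 10
    c2 = (pos[4] + pos[5] + pos[6] + pos[7]) % 10
    c3 = (pos[8] + pos[9]) % 10
    c4 = sum(pos) % 10
    return c0, c1, c2, c3, c4

word = encode([4, 1, 5, 5, 1])
print(syndrome(word))             # (0, 0, 0, 0, 0): no error
word[5] = (word[5] + 3) % 10      # corrupt the digit at position 5
print(syndrome(word))             # c0' and c2' fire (1 + 4 = position 5), c4' gives the error size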
I tried to use Reed-Solomon, generating a 3-digit code that can correct up to 1 digit: https://epxx.co/artigos/edc2_en.html

Check digit weighting that will never collide

I thought of implementing a simple check digit using a weighted sum of the digits modulo 10. In addition to serving as a check digit, I want to "abuse" it to detect which of two pools (for example, Article Numbers and Customer Numbers) a number belongs to.
According to Wikipedia, it is recommended to use 1, 3, 7 and 9 as weights, so for example I could choose:
Article Numbers: Weights 1, 3, 7, 1, 3, 7, ...
Customer Numbers: Weights 7, 9, 1, 7, 9, 1, ...
Number 1234 as an Article Number (1*1+2*3+3*7+4*1 mod 10 = 2): 12342
Number 1234 as a Customer Number (1*7+2*9+3*1+4*7 mod 10 = 6): 12346
The problem is that sometimes this gives the same check digit for both weight settings:
Number 1098 as an Article Number (1*1+0*3+9*7+8*1 mod 10 = 2): 10982
Number 1098 as a Customer Number (1*7+0*9+9*1+8*7 mod 10 = 2): 10982
Can I choose the weights of the number pools in a way that for any given original number it is ensured that the check digit is never the same for both pools?
I doubt it's possible, although I'd have to run an exhaustive check to be sure.
Have you thought about using even numbers for Article Numbers and odd numbers for Customer Numbers, or something like that?
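That exhaustive check is cheap to run. Here is a Python sketch that, for a given pair of weight vectors (the ones from the question), looks for 4-digit numbers where both pools end up with the same check digit:

from itertools import product

article_weights = (1, 3, 7, 1)     # weights from the question
customer_weights = (7, 9, 1, 7)

def check_digit(digits, weights):
    return sum(d * w for d, w in zip(digits, weights)) % 10

collisions = [digits for digits in product(range(10), repeat=4)
              if check_digit(digits, article_weights) == check_digit(digits, customer_weights)]
print(len(collisions))     # > 0, so these two weightings collide
print(collisions[:3])      # e.g. (0, 0, 0, 0) collides under any pair of weightings

Swapping in other weight pairs and rerunning is the quickest way to see whether any choice avoids collisions for the number lengths you care about.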
