How do I print reverse of the intersection of 2 arrays of different lengths? - c++17

I am trying to output the reverse of the intersection of two arrays of different lengths. So far I can print the intersection, but not in reverse order. How do I modify the code below to print the intersection of the 2 arrays in reverse order? The arrays in question are not sorted.
#include <iostream>
#include <stack>
using namespace std;
int main() {
    int n1, n2, i, j;
    cin >> n1 >> n2;
    int arr2[n2];
    int arr1[n1];
    stack<int> s;
    for (i = 0; i < n1; i++) {
        cin >> arr1[i];
    }
    for (i = 0; i < n2; i++) {
        cin >> arr2[i];
    }
    for (i = 0; i < n1; i++) {
        for (j = 0; j < n2; j++) {
            if (arr1[i] == arr2[j]) {
                s.push(arr1[i]);
                cout << s.top() << endl;
            }
        }
    }
}
Sample input:
6 4
1 2 3 4 5 6
2 6 4 1
Sample output:
1
2
4
6

Your program is incorrect in that declarations like
int arr2[n2];
are variable-length arrays, which are not standard C++ and compile only because of compiler extensions. Use std::vector instead of plain arrays if you don't know the array length at compile time. Also, don't underestimate the importance of source code indentation.
Now to the main point. The proper way of achieving your goal is this:
copy the arrays to std::vectors (or skip the copy and sort the originals, if you don't mind them being reordered),
sort these vectors,
apply std::set_intersection from the <algorithm> standard header to these sorted vectors,
print out the contents of the resulting vector in reverse order.
See https://en.cppreference.com/w/cpp/algorithm/set_intersection for an example of code that uses std::set_intersection.

Dart double "bitwise not" is giving different result (~~-1 != -1)

I am running Dart on DartPad, and I tried running the following code:
import 'dart:math';
void main() {
print(~0);
print(~-1);
print(~~-1);
}
Which resulted in the following outputs
4294967295
0
4294967295
As you can see, inverting the bits of 0 results in the maximum unsigned value (I was expecting -1, as Dart uses two's complement), and inverting -1 results in 0, so inverting -1 twice does not give me -1.
Looks like it's ignoring the first bit when inverting 0, why is that?
Dart compiled for the web (which includes DartPad) uses JavaScript numbers and number operations.
One consequence is that bitwise operations (~, &, |, ^, <<, >> and >>> on int) give only 32-bit results, because that's what the corresponding JavaScript operations do.
For historical reasons, Dart chooses to give unsigned 32-bit results, not two's complement numbers. So ~-1 is 0 and ~0 is the unsigned 0xFFFFFFFF, not -1.
In short, that's just how it is.
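The behaviour described above can be emulated outside Dart. The following is only a Python sketch, not Dart code: the helper names not_u32 and to_signed32 are made up for illustration, and the masking is an assumption about how to model "32-bit result, read as unsigned".

```python
MASK32 = 0xFFFFFFFF  # low 32 bits


def not_u32(x: int) -> int:
    """Bitwise NOT truncated to 32 bits and read as an unsigned value."""
    return ~x & MASK32


def to_signed32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a two's complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


print(not_u32(0))               # 4294967295
print(not_u32(-1))              # 0
print(not_u32(not_u32(-1)))     # 4294967295
print(to_signed32(not_u32(0)))  # -1
```

The last line shows that the bits are the ones you expected; only the unsigned interpretation of the 32-bit result differs.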

Calculating CLR coded index size

I'm trying to read a MemberRef coded index (MemberRefParent). Before I do that I need to know its size. According to ECMA-335 Section II.24.2.6, if I understood it correctly, the size of the coded index is calculated like so:
my pseudo code
m = max_rows(t0..tn-1); // number of rows in whichever of the n tables has the most rows
if (m < 2^(16 - log(n))) {
    // size is 2
} else {
    // size is 4
}
When I tested the code on a CLI file, I got an error, so I must have missed something. I hope someone can help me find where I went wrong.
From section II.24.2.6, ECMA-335
If e is a coded index that points into table ti out of n possible
tables t0, …tn-1, then it is stored as e << (log n) | tag{ t0, …tn-1}[
ti] using 2 bytes if the maximum number of rows of tables t0, …tn-1,
is less than 2^(16 – (log n)), and using 4 bytes otherwise.
MemberRefParent: 3 bits to encode tag    Tag
TypeDef                                  0
TypeRef                                  1
ModuleRef                                2
MethodDef                                3
TypeSpec                                 4
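One possible source of the error is the reading of "log n". The sketch below assumes it means the number of bits needed to encode the tag, i.e. log2(n) rounded up, which matches the "3 bits to encode tag" heading for the five MemberRefParent tables. The function names and the sample row counts are hypothetical; real row counts come from the metadata tables header of the file being read.

```python
def tag_bits(table_count: int) -> int:
    """Bits needed to tell table_count tables apart: ceil(log2(table_count))."""
    return (table_count - 1).bit_length()


def coded_index_size(row_counts) -> int:
    """Size in bytes (2 or 4) of a coded index over tables with these row counts."""
    bits = tag_bits(len(row_counts))
    return 2 if max(row_counts) < 2 ** (16 - bits) else 4


# Hypothetical row counts, in tag order:
# TypeDef, TypeRef, ModuleRef, MethodDef, TypeSpec
print(tag_bits(5))                               # 3
print(coded_index_size([120, 45, 1, 900, 12]))   # 2
print(coded_index_size([120, 45, 1, 8192, 12]))  # 4
```

With 3 tag bits only 13 bits remain for the row number, so the 2-byte form is used only while every candidate table has fewer than 2^13 = 8192 rows.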

variations in huffman encoding codewords

I'm trying to solve some Huffman coding problems, but I always get different values for the codewords (values, not lengths).
For example, if the codeword of character 'c' was 100, in my solution it is 101.
Here is an example:
Character  Frequency  codeword  my solution
A          22         00        10
B          12         100       010
C          24         01        11
D          6          1010      0110
E          27         11        00
F          9          1011      0111
Both solutions have the same codeword lengths, and no codeword is a prefix of another codeword.
Does this make my solution valid? Or can there be only 2 solutions, the optimal one and the one obtained by flipping its bits?
There are 96 possible ways to assign the 0's and 1's to that set of lengths, and all would be perfectly valid, optimal, prefix codes. You have shown two of them.
There exist conventions to define "canonical" Huffman codes which resolve the ambiguity. The value of defining canonical codes is in the transmission of the code from the compressor to the decompressor. As long as both sides know and agree on how to unambiguously assign the 0's and 1's, then only the code length for each symbol needs to be transmitted -- not the codes themselves.
The deflate format starts with zero for the shortest code, and increments up. Within each code length, the codes are ordered by the symbol values, i.e. sorting by symbol. So for your code that canonical Huffman code would be:
A - 00
C - 01
E - 10
B - 110
D - 1110
F - 1111
So there the two bit codes are assigned in the symbol order A, C, E, and similarly, the four bit codes are assigned in the order D, F. Shorter codes are assigned before longer codes.
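The assignment rule just described can be sketched in a few lines. This is an illustrative Python version, not the deflate reference implementation; the function name canonical_codes is made up, and the input is assumed to be a complete set of code lengths like the one in the question.

```python
def canonical_codes(lengths):
    """Assign canonical codes: shorter codes first, ties broken by symbol order."""
    codes = {}
    code = 0
    prev_len = None
    for symbol, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        if prev_len is not None:
            # Moving to a longer code length appends zero bits to the counter.
            code <<= length - prev_len
        codes[symbol] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


lengths = {"A": 2, "B": 3, "C": 2, "D": 4, "E": 2, "F": 4}
for symbol, bits in canonical_codes(lengths).items():
    print(symbol, "-", bits)
```

For the lengths in the question this reproduces the table above, which is why only the lengths need to be transmitted.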
There is a different and interesting ambiguity that arises in finding the code lengths. Depending on the order of combination of equal frequency nodes, i.e. when you have a choice of more than two lowest frequency nodes, you can actually end up with different sets of code lengths that are exactly equally optimal. Even though the code lengths are different, when you multiply the lengths by the frequencies and add them up, you get exactly the same number of bits for the two different codes.
There again, the different codes are all optimal and equally valid. There are ways to resolve that ambiguity as well at the time the nodes to combine are chosen, where the benefit can be minimizing the depth of the tree. That can reduce the table size for table-driven Huffman decoding.
For example, consider the frequencies A: 2, B: 2, C: 1, D: 1. You first combine C and D to get 2. Then you have A, B, and C+D all with frequency 2. Now you can choose to combine either A and B, or C+D with A or B. This gives two different sets of bit lengths. If you combine A and B, you get lengths: A-2, B-2, C-2, and D-2. If you combine C+D with B, you get A-1, B-2, C-3, D-3. Both are optimal codes, since 2x2 + 2x2 + 1x2 + 1x2 = 2x1 + 2x2 + 1x3 + 1x3 = 12, so both codes use 12 bits to represent those symbols that many times.
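The arithmetic in this example is easy to check mechanically. A small Python sketch (the variable and function names are chosen here for illustration only):

```python
freq = {"A": 2, "B": 2, "C": 1, "D": 1}

# Code lengths when A and B are combined with each other.
balanced = {"A": 2, "B": 2, "C": 2, "D": 2}
# Code lengths when C+D is combined with B.
skewed = {"A": 1, "B": 2, "C": 3, "D": 3}


def total_bits(freq, lengths):
    """Total encoded size: sum of frequency times code length over all symbols."""
    return sum(freq[s] * lengths[s] for s in freq)


print(total_bits(freq, balanced))  # 12
print(total_bits(freq, skewed))    # 12
```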
The problem is that there is no problem.
Your Huffman tree is valid, and it gives exactly the same results after encoding and decoding. If you build a Huffman tree by hand, there are always multiple ways to combine items with equal (or least different) values. E.g. if you have A, B, C (each with frequency 1), you can first combine A and B and then the result with C, or first B and C and then the result with A.
So there is more than one correct way.
Edit: Even when there is only one possible way to combine the items by frequency, you can get different results, because you can assign 1 to either the left or the right branch, so you would get different (correct) results.

F# Multidimensional Array Types

What's the difference between 'a[,,] and 'a[][][]? They both represent 3-d arrays.
It makes me write array3d.[x].[y].[z] instead of array3d.[x, y, z].
Why I can't do the following?
> let array2d : int[,] = Array2D.zeroCreate 10 10;;
> let array1d = array2d.[0];;
error FS0001: This expression was expected to have type
'a []
but here has type
int [,]
The difference is that 'a[][] represents an array of arrays (of possibly different lengths), while 'a[,] represents a rectangular 2D array. The first kind is called a jagged array and the second a multidimensional array. The difference is the same as in C#, so you may want to look at the C# documentation for jagged arrays and multidimensional arrays. There is also excellent documentation in the F# WikiBook.
To demonstrate this using a picture, a value of type 'a[][] can look like this:
0 1 2 3 4
5 6
7 8 9 0 1
While a value of type 'a[,] will always be a rectangle and may look, for example, like this:
0 1 2 3
4 5 6 7
8 9 0 1
To get a single "line" of a multidimensional array, you can use the slice notation:
let row = array2d.[0,*];;
See https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/arrays#array-slicing-and-multidimensional-arrays
As of F# 3.1 (2013) things are simpler:
As of F# 3.1, you can decompose a multidimensional array into subarrays of the same or lower dimension. For example, you can obtain a vector from a matrix by specifying a single row or column.
// Get row 3 from a matrix as a vector:
matrix.[3, *]
// Get column 3 from a matrix as a vector:
matrix.[*, 3]
See https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/arrays#array-slicing-and-multidimensional-arrays

Constrained Sequence to Index Mapping

I'm puzzling over how to map a set of sequences to consecutive integers.
All the sequences follow this rule:
A_0 = 1
A_n >= 1
A_n <= max(A_0 .. A_n-1) + 1
I'm looking for a solution that can, given such a sequence, compute an integer for a lookup into a table and, given an index into the table, generate the sequence.
Example: for length 3, there are 5 valid sequences. A fast function for the following map (preferably in both directions) would be a good solution:
1,1,1 0
1,1,2 1
1,2,1 2
1,2,2 3
1,2,3 4
The point of the exercise is to get a packed table with a 1-1 mapping between valid sequences and cells.
The size of the set is bounded only by the number of unique sequences possible.
I don't know yet what the length of the sequence will be, but it will be a small (<12) constant known in advance.
I'll get to this sooner or later, but thought I'd throw it out for the community to have "fun" with in the meantime.
These are different valid sequences:
1,1,2,3,2,1,4
1,1,2,3,1,2,4
1,2,3,4,5,6,7
1,1,1,1,2,3,2
These are not:
1,2,2,4
2,
1,1,2,3,5
There is a natural sequence indexing, but it is not so easy to calculate.
Let us look at A_n for n>0, since A_0 = 1 is fixed.
Indexing is done in 2 steps.
Part 1:
Group sequences by the places where A_n = max(A_0 .. A_n-1) + 1. Call these places steps.
The values at the steps are consecutive numbers (2,3,4,5,...).
At a non-step place k we can put any number from 1 up to the current maximum, i.e. one more than the number of steps with index less than k.
Each group can be represented as a binary string where 1 marks a step and 0 a non-step. E.g. 001001010 means the group 112aa3b4c with a<=2, b<=3, c<=4. Because groups are indexed by a binary number, there is a natural indexing of groups, from 0 to 2^length - 1. Let's call the value of a group's binary representation the group order.
Part 2:
Index sequences inside a group. Since the group defines the step positions, only the numbers at non-step positions are variable, and they vary within defined ranges. That makes it easy to index a sequence inside its group, using the lexicographical order of the variable places.
It is easy to calculate the number of sequences in one group. It is a number of the form 1^i_1 * 2^i_2 * 3^i_3 * ..., where i_m is the number of non-step places whose upper bound is m.
Combining:
This gives a 2 part key: <Steps, Group>, which then needs to be mapped to the integers. To do that we have to find how many sequences are in groups whose order is less than some value. For that, let's first find how many sequences are in groups of a given length. That can be computed by passing through all groups and summing the number of sequences, or similarly with a recurrence. Let T(l, n) be the number of sequences of length l (A_0 is omitted) where the maximal value of the first element can be n+1. Then the following holds:
T(l,n) = n*T(l-1,n) + T(l-1,n+1)
T(1,n) = n
Because l + n <= sequence length + 1, there are ~sequence_length^2/2 T(l,n) values, which can be easily calculated.
Next is to calculate the number of sequences in groups of order less than or equal to a given value. That can be done by summing T(l,n) values. E.g. the number of sequences in groups with order <= 1001010 binary is equal to
T(7,1) + # for 1000000
2^2 * T(4,2) + # for 001000
2^2 * 3 * T(2,3) # for 010
Optimizations:
This will give a mapping but the direct implementation for combining the key parts is >O(1) at best. On the other hand, the Steps portion of the key is small and by computing the range of Groups for each Steps value, a lookup table can reduce this to O(1).
I'm not 100% sure about the formula above, but it should be something like it.
With these remarks and recurrence it is possible to make functions sequence -> index and index -> sequence. But not so trivial :-)
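As a concrete starting point, here is a Python sketch of a simpler variant: plain lexicographic ranking, not the group-order indexing described above. It uses the same kind of recurrence, but with its own convention, which is an assumption of this sketch: count(remaining, current_max) is the number of ways to append remaining more terms when the maximum so far is current_max, with count(0, m) = 1. The names count, rank and unrank are made up, and rank assumes its input is already a valid sequence.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count(remaining, current_max):
    """Ways to append `remaining` terms to a prefix whose maximum is current_max."""
    if remaining == 0:
        return 1
    # current_max values keep the maximum; one value (current_max + 1) raises it.
    return (current_max * count(remaining - 1, current_max)
            + count(remaining - 1, current_max + 1))


def rank(seq):
    """Lexicographic index of a valid sequence among those of the same length."""
    index, current_max = 0, 1
    for pos in range(1, len(seq)):
        remaining = len(seq) - pos - 1
        # Every smaller value at this position leaves the maximum unchanged.
        index += (seq[pos] - 1) * count(remaining, current_max)
        current_max = max(current_max, seq[pos])
    return index


def unrank(index, length):
    """Inverse of rank: the valid sequence of this length at this index."""
    seq, current_max = [1], 1
    for pos in range(1, length):
        block = count(length - pos - 1, current_max)
        # The step value current_max + 1 owns everything past the last full block.
        value = min(index // block + 1, current_max + 1)
        index -= (value - 1) * block
        seq.append(value)
        current_max = max(current_max, value)
    return seq


for i in range(count(2, 1)):
    print(unrank(i, 3), rank(unrank(i, 3)))
```

For length 3 this reproduces the mapping in the question (1,1,1 is 0 through 1,2,3 is 4). Both directions take time linear in the sequence length once the count table is cached, and count(length - 1, 1) gives the table size.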
I think a hash without sorting should do the job.
As A_0 always starts with 1, maybe we can treat the sequence as a number in base 12 and use its base-10 value as the lookup key (I'm still not sure about this).
This is a Python function which can do the job for you, assuming you have these values stored in a file and you pass the lines to the function:
def valid_lines(lines):
    # Yield every line that parses to a valid sequence: A_0 = 1 and
    # 1 <= A_n <= max(A_0 .. A_n-1) + 1 for each later term.
    for line in lines:
        seq = [int(x) for x in line.split(",") if x.strip()]
        if seq and seq[0] == 1 and all(
            1 <= seq[n] <= max(seq[:n]) + 1 for n in range(1, len(seq))
        ):
            yield seq

with open('/tmp/numbers.txt') as f:
    for valid_line in valid_lines(f):
        print(valid_line)
Given the sequence, I would sort it, then use the hash of the sorted sequence as the index of the table.
