I have tried using Dijkstra's algorithm on a cyclic weighted graph without using a priority queue (heap), and it worked.
Wikipedia states that the original implementation of this algorithm does not use a priority queue and runs in O(V^2) time.
Now, if we just removed the priority queue and used a normal queue, the runtime would be linear, i.e. O(V+E).
Can someone explain why we need the priority queue?
I had the exact same doubt and found a test case where the algorithm without a priority_queue would not work.
Let's say I have a Graph object g and a method addEdge(a, b, w), which adds an edge from vertex a to vertex b with weight w.
Now, let me define the following graph:
Graph g;
g.addEdge(0,1,5);
g.addEdge(1,3,1);
g.addEdge(0,2,2);
g.addEdge(2,1,1);
g.addEdge(2,3,7);
Now, say our queue contains the nodes in the order {0, 1, 2, 3}.
So node 0 is visited first, then node 1.
At this point the distance between 0 and 3 is computed as 6, using the path 0->1->3, and node 1 is marked as visited.
Now node 2 is visited and the distance between 0 and 1 is updated to 3, using the path 0->2->1. But since node 1 is marked visited, you cannot change the distance between 0 and 3, which along the optimal path 0->2->1->3 is 4.
So your algorithm fails without the priority_queue.
It reports the distance between 0 and 3 as 6 while in reality it should be 4.
Now, here is the code I used to implement the algorithm:
#include <algorithm>
#include <set>
#include <vector>
using namespace std;

const int INF = 1000000000; // "infinity": larger than any real path length

class Graph
{
public:
    vector<int> nodes;
    vector<vector<pair<int,int> > > edges; // edges[v] = list of (neighbor, weight)

    void addNode()
    {
        nodes.push_back(nodes.size());
        edges.push_back(vector<pair<int,int> >());
    }

    void addEdge(int n1, int n2, int w)
    {
        edges[n1].push_back(make_pair(n2, w));
    }

    // shortest paths (Dijkstra's), with a priority queue (a std::set of (dist, node) pairs)
    pair<vector<int>, vector<int> > shortest(int source)
    {
        vector<int> dist(nodes.size(), INF);
        dist[source] = 0;
        vector<int> pred(nodes.size(), -1);
        for (int i = 0; i < (int)edges[source].size(); i++)
        {
            dist[edges[source][i].first] = edges[source][i].second;
            pred[edges[source][i].first] = source;
        }
        set<pair<int,int> > pq;
        for (int i = 0; i < (int)nodes.size(); i++)
            pq.insert(make_pair(dist[i], i));
        while (!pq.empty())
        {
            pair<int,int> item = *pq.begin();
            pq.erase(pq.begin());
            int v = item.second;
            for (int i = 0; i < (int)edges[v].size(); i++)
            {
                int to = edges[v][i].first, w = edges[v][i].second;
                if (dist[to] > dist[v] + w)
                {
                    // re-key the node: remove its old (dist, node) entry, insert the new one
                    pq.erase(make_pair(dist[to], to));
                    dist[to] = dist[v] + w;
                    pq.insert(make_pair(dist[to], to));
                    pred[to] = v;
                }
            }
        }
        return make_pair(dist, pred);
    }

    // shortest paths (Dijkstra's) without a priority queue: the "queue" is a
    // plain vector in node order, so the node popped next is not necessarily
    // the closest one; this is the broken variant discussed above
    pair<vector<int>, vector<int> > shortestwpq(int source)
    {
        vector<int> dist(nodes.size(), INF);
        dist[source] = 0;
        vector<int> pred(nodes.size(), -1);
        for (int i = 0; i < (int)edges[source].size(); i++)
        {
            dist[edges[source][i].first] = edges[source][i].second;
            pred[edges[source][i].first] = source;
        }
        vector<pair<int,int> > pq;
        for (int i = 0; i < (int)nodes.size(); i++)
            pq.push_back(make_pair(dist[i], i));
        while (!pq.empty())
        {
            pair<int,int> item = *pq.begin();
            pq.erase(pq.begin());
            int v = item.second;
            for (int i = 0; i < (int)edges[v].size(); i++)
            {
                int to = edges[v][i].first, w = edges[v][i].second;
                if (dist[to] > dist[v] + w)
                {
                    dist[to] = dist[v] + w;
                    pred[to] = v;
                }
            }
        }
        return make_pair(dist, pred);
    }
};
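For completeness, here is a small driver that reproduces the results below (this wasn't in the original post; it assumes the Graph class above, with one addNode() call per vertex):

#include <cstdio>

int main()
{
    Graph g;
    for (int i = 0; i < 4; i++)
        g.addNode();
    g.addEdge(0,1,5);
    g.addEdge(1,3,1);
    g.addEdge(0,2,2);
    g.addEdge(2,1,1);
    g.addEdge(2,3,7);

    pair<vector<int>, vector<int> > withPq = g.shortest(0);
    pair<vector<int>, vector<int> > withoutPq = g.shortestwpq(0);
    printf("With priority_queue:\n");
    for (int i = 0; i < 4; i++)
        printf("%d\n", withPq.first[i]);      // 0 3 2 4
    printf("Without priority queue:\n");
    for (int i = 0; i < 4; i++)
        printf("%d\n", withoutPq.first[i]);   // 0 3 2 6
    return 0;
}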
As expected, the results were as follows:
With priority_queue
0
3
2
4
Now, without the priority queue:
0
3
2
6
As Moataz Elmasry said, the best you can expect is O(|E| + |V| log |V|), with a Fibonacci heap. At least when it comes to big-O values.
The idea behind it is: you have already found the shortest path to whichever vertex you are currently expanding. If that vertex isn't the one with the smallest tentative distance (distance so far + edge weight), that isn't necessarily true. This is what allows you to stop the algorithm as soon as you have expanded every vertex that is reachable from your initial vertex. If you aren't expanding the vertex with the smallest tentative distance, you aren't guaranteed to be finding the shortest path, so you would have to test every single path, not just one. So instead of having to go through every edge in just one path, you go through every edge in every path.
Your estimate of O(E + V) is probably correct; the path and cost you determined, on the other hand, are incorrect. If I'm not mistaken, the path would only be the shortest if, by chance, the first edge you travel from every vertex happened to be the smallest one.
So Dijkstra's shortest path algorithm without a priority queue is just Dijkstra's path algorithm ;)
For a sparse graph, implementing it with a binary min-heap gives O(E log V) runtime; if you implement it with a Fibonacci heap, the runtime is O(V log V + E).
A heap is the best choice for this task, as it guarantees O(log n) both for adding an entry to the queue and for removing the top element. Any other priority queue implementation sacrifices performance in either insertion or removal to gain a boost somewhere else. Depending on how sparse the graph is, you might find better performance with a different priority queue implementation, but generally speaking a min-heap is best, since it balances the two.
Not the greatest source but: http://en.wikipedia.org/wiki/Heap_(data_structure)
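For reference, here is a minimal sketch of the usual std::priority_queue formulation with lazy deletion. Since std::priority_queue can't re-key an entry the way the std::set version above erases and re-inserts, it pushes a fresh (distance, node) pair on every improvement and skips stale entries when they surface. It assumes the same edges[v] = list of (neighbor, weight) adjacency layout:

#include <functional>
#include <queue>
#include <vector>
using namespace std;

const int INF = 1000000000;

// Returns dist[] for every node, given edges[v] = list of (neighbor, weight).
vector<int> dijkstraHeap(const vector<vector<pair<int,int> > > &edges, int source)
{
    vector<int> dist(edges.size(), INF);
    dist[source] = 0;
    // min-heap of (distance, node) pairs; greater<> flips the default max-heap
    priority_queue<pair<int,int>,
                   vector<pair<int,int> >,
                   greater<pair<int,int> > > pq;
    pq.push(make_pair(0, source));
    while (!pq.empty())
    {
        int d = pq.top().first, v = pq.top().second;
        pq.pop();
        if (d > dist[v])
            continue; // stale entry: v was already settled with a shorter distance
        for (int i = 0; i < (int)edges[v].size(); i++)
        {
            int to = edges[v][i].first, w = edges[v][i].second;
            if (dist[to] > d + w)
            {
                dist[to] = d + w;
                pq.push(make_pair(dist[to], to)); // push a new entry instead of re-keying
            }
        }
    }
    return dist;
}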
I am looking for an efficient AVX (AVX512) implementation of
// Given
float u[8];
float v[8];
// Compute
float a[8];
float b[8];
// Such that
for ( int i = 0; i < 8; ++i )
{
a[i] = fabs(u[i]) >= fabs(v[i]) ? u[i] : v[i];
b[i] = fabs(u[i]) < fabs(v[i]) ? u[i] : v[i];
}
I.e., I need to select element-wise into a from u and v based on mask, and into b based on !mask, where mask = (fabs(u) >= fabs(v)) element-wise.
I had this exact same problem just the other day. The solution I came up with (using AVX only) was:
// take the absolute value of u and v
__m256 sign_bit = _mm256_set1_ps(-0.0f);
__m256 u_abs = _mm256_andnot_ps(sign_bit, u);
__m256 v_abs = _mm256_andnot_ps(sign_bit, v);
// get a mask indicating the indices for which abs(u[i]) >= abs(v[i])
__m256 u_ge_v = _mm256_cmp_ps(u_abs, v_abs, _CMP_GE_OS);
// use the mask to select the appropriate elements into a and b, flipping the argument
// order for b to invert the sense of the mask
__m256 a = _mm256_blendv_ps(v, u, u_ge_v);
__m256 b = _mm256_blendv_ps(u, v, u_ge_v);
The AVX512 equivalent would be:
// take the absolute value of u and v
__m512 sign_bit = _mm512_set1_ps(-0.0f);
__m512 u_abs = _mm512_andnot_ps(sign_bit, u);
__m512 v_abs = _mm512_andnot_ps(sign_bit, v);
// get a mask indicating the indices for which abs(u[i]) >= abs(v[i])
__mmask16 u_ge_v = _mm512_cmp_ps_mask(u_abs, v_abs, _CMP_GE_OS);
// use the mask to select the appropriate elements into a and b, flipping the argument
// order for b to invert the sense of the mask
__m512 a = _mm512_mask_blend_ps(u_ge_v, v, u);
__m512 b = _mm512_mask_blend_ps(u_ge_v, u, v);
As Peter Cordes suggested in the comments above, there are other approaches as well like taking the absolute value followed by a min/max and then reinserting the sign bit, but I couldn't find anything that was shorter/lower latency than this sequence of instructions.
Actually, there is another approach using AVX512DQ's VRANGEPS via the _mm512_range_ps() intrinsic. Intel's intrinsic guide describes it as follows:
Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. imm8[1:0] specifies the operation control: 00 = min, 01 = max, 10 = absolute max, 11 = absolute min. imm8[3:2] specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
Note that there appears to be a typo in the above; actually imm8[1:0] == 10 is "absolute min" and imm8[1:0] == 11 is "absolute max" if you look at the details of the per-element operation:
CASE opCtl[1:0] OF
0: tmp[31:0] := (src1[31:0] <= src2[31:0]) ? src1[31:0] : src2[31:0]
1: tmp[31:0] := (src1[31:0] <= src2[31:0]) ? src2[31:0] : src1[31:0]
2: tmp[31:0] := (ABS(src1[31:0]) <= ABS(src2[31:0])) ? src1[31:0] : src2[31:0]
3: tmp[31:0] := (ABS(src1[31:0]) <= ABS(src2[31:0])) ? src2[31:0] : src1[31:0]
ESAC
CASE signSelCtl[1:0] OF
0: dst[31:0] := (src1[31] << 31) OR (tmp[30:0])
1: dst[31:0] := tmp[31:0]
2: dst[31:0] := (0 << 31) OR (tmp[30:0])
3: dst[31:0] := (1 << 31) OR (tmp[30:0])
ESAC
RETURN dst
So you can get the same result with just two instructions:
auto a = _mm512_range_ps(v, u, 0x7); // 0b0111 = sign from compare result, absolute max
auto b = _mm512_range_ps(v, u, 0x6); // 0b0110 = sign from compare result, absolute min
The argument order (v, u) is a bit unintuitive, but it's needed in order to get the same behavior that you described in the OP in the event that the elements have equal absolute value (namely, that the value from u is passed through to a, and v goes to b).
On Skylake and Ice Lake Xeon platforms (probably any of the Xeons that have dual FMA units?), VRANGEPS has a throughput of 2 per clock, so the two checks can issue and execute simultaneously, with a latency of 4 cycles. This is only a modest latency improvement over the original approach, but the throughput is better and it requires fewer instructions/uops/instruction-cache space.
clang does a pretty reasonable job of auto-vectorizing it with -ffast-math and the necessary __restrict qualifiers: https://godbolt.org/z/NMvN1u. It ANDs both inputs to take their absolute values, compares once, and uses vblendvps twice on the original inputs with the same mask but the sources in opposite order, to get min and max.
That's pretty much what I was thinking before checking what compilers did, and looking at their output to firm up the details I hadn't thought through yet. I don't see anything more clever than that. I don't think we can avoid taking the absolute value of both u and v separately; there's no cmpps compare predicate that compares magnitudes and ignores the sign bit.
// untested: I *might* have reversed min/max, but I think this is right.
#include <immintrin.h>
// returns min_abs
__m256 minmax_abs(__m256 u, __m256 v, __m256 *max_result) {
const __m256 signbits = _mm256_set1_ps(-0.0f);
__m256 abs_u = _mm256_andnot_ps(signbits, u);
__m256 abs_v = _mm256_andnot_ps(signbits, v); // strip the sign bit
__m256 maxabs_is_v = _mm256_cmp_ps(abs_u, abs_v, _CMP_LT_OS); // true where |u| < |v|
*max_result = _mm256_blendv_ps(u, v, maxabs_is_v);
return _mm256_blendv_ps(v, u, maxabs_is_v);
}
You'd do the same thing with AVX512 except you compare into a mask instead of another vector.
// returns min_abs
__m512 minmax_abs512(__m512 u, __m512 v, __m512 *max_result) {
const __m512 absmask = _mm512_castsi512_ps(_mm512_set1_epi32(0x7fffffff));
__m512 abs_u = _mm512_and_ps(absmask, u);
__m512 abs_v = _mm512_and_ps(absmask, v); // strip the sign bit
__mmask16 maxabs_is_v = _mm512_cmp_ps_mask(abs_u, abs_v, _CMP_LT_OS); // true where |u| < |v|
*max_result = _mm512_mask_blend_ps(maxabs_is_v, u, v);
return _mm512_mask_blend_ps(maxabs_is_v, v, u);
}
Clang compiles the return statement in an interesting way (Godbolt):
.LCPI2_0:
.long 2147483647 # 0x7fffffff
minmax_abs512(float __vector(16), float __vector(16), float __vector(16)*): # #minmax_abs512(float __vector(16), float __vector(16), float __vector(16)*)
vbroadcastss zmm2, dword ptr [rip + .LCPI2_0]
vandps zmm3, zmm0, zmm2
vandps zmm2, zmm1, zmm2
vcmpltps k1, zmm3, zmm2
vblendmps zmm2 {k1}, zmm1, zmm0
vmovaps zmmword ptr [rdi], zmm2 ## store the blend result
vmovaps zmm0 {k1}, zmm1 ## interesting choice: blend merge-masking
ret
Instead of using another vblendmps, clang notices that zmm0 already has one of the blend inputs, and uses merge-masking with a regular vector vmovaps. This has zero advantage on Skylake-AVX512 for 512-bit vblendmps (both are single-uop instructions for port 0 or 5), but if Agner Fog's instruction tables are right, vblendmps x/y/zmm only ever runs on port 0 or 5, while a masked 256-bit or 128-bit vmovaps x/ymm{k}, x/ymm can run on any of p0/p1/p5.
Both are single-uop / single-cycle-latency instructions, unlike AVX2 vblendvps based on a mask vector, which is 2 uops. (So AVX512 is an advantage even for 256-bit vectors.) Unfortunately, none of gcc, clang, or ICC turn the _mm256_cmp_ps into _mm256_cmp_ps_mask and optimize the AVX2 intrinsics to AVX512 instructions when compiling with -march=skylake-avx512.
s/512/256/ to make a version of minmax_abs512 that uses AVX512 for 256-bit vectors.
Gcc goes even further, and does the questionable "optimization" of
vmovaps zmm2, zmm1 # tmp118, v
vmovaps zmm2{k1}, zmm0 # tmp118, tmp114, tmp118, u
instead of using one blend instruction. (I keep thinking I'm seeing a store followed by a masked store, but no, neither compiler is blending that way).
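Given how easy it is to get the blend operand order backwards (note the "untested" hedge in the code above), a scalar cross-check is cheap insurance. A minimal harness, assuming the AVX2 minmax_abs() from above is in scope (compile with AVX enabled, e.g. -mavx):

#include <cmath>
#include <cstdio>
#include <immintrin.h>

int main() {
    float u[8] = { 1.0f, -2.0f, 3.0f, -4.0f, 0.5f, -0.25f, -8.0f, 7.0f };
    float v[8] = { -1.5f, 1.0f, -3.0f, 5.0f, 0.75f, -0.125f, 2.0f, -9.0f };

    __m256 max_v;
    __m256 min_v = minmax_abs(_mm256_loadu_ps(u), _mm256_loadu_ps(v), &max_v);

    float mx[8], mn[8];
    _mm256_storeu_ps(mx, max_v);
    _mm256_storeu_ps(mn, min_v);

    for (int i = 0; i < 8; i++) {
        float a = fabsf(u[i]) >= fabsf(v[i]) ? u[i] : v[i];  // scalar reference from the OP
        float b = fabsf(u[i]) <  fabsf(v[i]) ? u[i] : v[i];
        if (mx[i] != a || mn[i] != b)
            std::printf("mismatch at %d: got (%g, %g), want (%g, %g)\n",
                        i, mx[i], mn[i], a, b);
    }
    return 0;
}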
So: the Fibonacci number in O(log N), without matrices.
N(i)                              // i-th Fibonacci number
  = N(i-1) + N(i-2)               // by definition
  = (N(i-2) + N(i-3)) + N(i-2)    // unwrap N(i-1)
  = 2*N(i-2) + N(i-3)             // reduce the equation
  = 2*(N(i-3) + N(i-4)) + N(i-3)  // unwrap N(i-2)
  // and so on
  = 3*N(i-3) + 2*N(i-4)
  = 5*N(i-4) + 3*N(i-5)
  = 8*N(i-5) + 5*N(i-6)
  = N(k)*N(i-k) + N(k-1)*N(i-k-1)
Now we write a recursive function where at each step we take k ≈ i/2.
static long N(long i)
{
    if (i < 2) return 1;
    long k = i/2;
    return N(k) * N(i - k) + N(k - 1) * N(i - k - 1);
}
Where is the fault?
You get a recurrence for the effort: T(n) = 4*T(n/2) + O(1) (and that disregards the fact that the numbers get bigger, so the O(1) does not even hold). It's clear from this that T(n) is not in O(log n). Instead, the master theorem gives T(n) in O(n^2).
Btw, this is even slower than the trivial algorithm to calculate all Fibonacci numbers up to n.
The four N calls inside the function each have an argument of around i/2. So the depth of the stack of N calls is roughly log2(n), but because each call generates four more, the bottom 'layer' of calls contains about 4^(log2 n) = n^2 calls. Thus, the fault is that N calls itself four times. With only two calls it would be O(n), like the conventional iterative method. I don't know of any way to do this with only one call, which could be O(log n).
An O(n) version based on this formula would be:
static long N(long i) {
    if (i < 2) {
        return 1;
    }
    long k = i/2;
    long val1 = N(k-1);
    long val2 = N(k);
    if (i % 2 == 0) {
        return val2*val2 + val1*val1;
    }
    return val2*(val2+val1) + val1*val2;
}
which makes two N calls per invocation, making it O(n).
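In fact, the "only one call" version that the answer above wasn't aware of does exist: return the pair (N(k-1), N(k)) from a single recursive call and combine. A sketch in C++ (C++17 structured bindings; same N(0) = N(1) = 1 convention as the rest of this thread):

#include <cstdio>
#include <utility>

// Returns the pair (N(i-1), N(i)) with one recursive call per halving,
// so the whole computation is O(log i) arithmetic operations
// (ignoring the growth of the numbers themselves).
static std::pair<long, long> fibPair(long i) {
    if (i == 0) return {0, 1};                    // (N(-1), N(0)) = (0, 1)
    auto [a, b] = fibPair(i / 2);                 // a = N(k-1), b = N(k), k = i/2
    if (i % 2 == 0)
        return {a * (2 * b - a), a * a + b * b};  // (N(2k-1), N(2k))
    else
        return {a * a + b * b, b * (b + 2 * a)};  // (N(2k), N(2k+1))
}

static long N(long i) { return fibPair(i).second; }

int main() {
    for (long i = 0; i < 10; i++)
        std::printf("%ld ", N(i));                // prints: 1 1 2 3 5 8 13 21 34 55
    std::printf("\n");
    return 0;
}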
import java.util.Scanner;

public class fibonacci {
    public static int count = 0;

    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        int i = scan.nextInt();
        System.out.println("value of i = " + i);
        int result = fun(i);
        System.out.println("final result is " + result);
    }

    public static int fun(int i) {
        count++;
        System.out.println("fun is called and count is " + count);
        if (i < 2) {
            System.out.println("function returned");
            return 1;
        }
        int k = i/2;
        int part1 = fun(k);
        int part2 = fun(i-k);
        int part3 = fun(k-1);
        int part4 = fun(i-k-1);
        return ((part1*part2) + (part3*part4)); /* RESULT WILL BE SAME FOR BOTH METHODS */
        //return ((fun(k)*fun(i-k))+(fun(k-1)*fun(i-k-1)));
    }
}
I tried to code the problem you defined in Java. What I observed is that the complexity of the above code is not exactly O(N^2) but less than that. But as per conventions and standards, the worst-case complexity is O(N^2), once you include other factors like computation (division, multiplication) and comparison time.
The output of the above code shows how many times the function fun(int i) is called.
So, including the time taken for comparison, division, and multiplication operations, the worst-case time complexity is O(N^2), not O(log N).
OK, if we apply the usual analysis of recursive Fibonacci programs, we end up with a simple equation:
T(N) = 4*T(N/2) + O(1)
where O(1) is some constant time.
So let's apply the master method to this equation.
According to the master method,
T(n) = a*T(n/b) + f(n) where a >= 1 and b > 1
There are the following three cases:
If f(n) = Θ(n^c) where c < log_b(a), then T(n) = Θ(n^(log_b(a)))
If f(n) = Θ(n^c) where c = log_b(a), then T(n) = Θ(n^c * log n)
If f(n) = Θ(n^c) where c > log_b(a), then T(n) = Θ(f(n))
In our equation a = 4, b = 2 and c = 0.
Case 1 applies, since c < log_b(a), i.e. 0 < 2 (log_2(4) = 2),
hence T(n) = O(n^2).
For more information about how the master method works, please visit: Analysis of Algorithms
Your idea is correct, and it will perform in O(log n), provided you don't compute the same values over and over again. The whole point of having N(k) * N(i-k) with k = i/2 is that k and i-k (nearly) coincide, so you only have to compute one branch instead of two. But if you just call recursively, you are performing the same computations repeatedly.
What you need is called memoization. That is, store every value that you have already computed, and if it comes up again, return it in O(1).
Here's an example:
const int MAX = 10000;

// memoization array
int f[MAX] = {0};

// Return the n-th Fibonacci number, using memoization
// (convention N(0) = N(1) = 1, matching the formula above)
int fib(int n) {
    // Base cases
    if (n < 2)
        return 1;

    // If fib(n) is already computed
    if (f[n]) return f[n];

    int k = n/2;

    // Applying your formula
    f[n] = fib(k) * fib(n - k) + fib(k - 1) * fib(n - k - 1);

    return f[n];
}
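A quick sanity check of the convention (a hypothetical driver, assuming the fib() above is in scope):

#include <cstdio>

int main() {
    for (int n = 0; n < 10; n++)
        printf("%d ", fib(n)); // prints: 1 1 2 3 5 8 13 21 34 55
    printf("\n");
    return 0;
}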
I am playing around with the KMP algorithm in F#. While it works for patterns like "ATAT" (the result is [|0; 0; 1; 2|]), the first while loop enters a deadlock when the first two characters of a string are the same and the third is different, for example "AAT".
I understand why: first, i gets incremented to 1. Now the first condition of the while loop is true, and the second is also true, because "A" <> "T". So it sets i to prefixTable.[!i], which is 1 again, and here we go.
Can you guys give me a hint on how to solve this?
let kMPrefix (pattern : string) =
    let (m : int) = pattern.Length - 1
    let prefixTable = Array.create pattern.Length 0
    // i : longest proper prefix that is also a suffix
    let i = ref 0
    // j : the index of the pattern for which the prefix value will be calculated
    // starts with 1 because the first prefix value is always 0
    for j in 1 .. m do
        while !i > 0 && pattern.[!i] <> pattern.[j] do
            i := prefixTable.[!i]
        if pattern.[!i] = pattern.[j] then
            i := !i + 1
        Array.set prefixTable j !i
    prefixTable
I'm not sure how to repair the code with a small modification, since it doesn't match the KMP algorithm's lookup table contents (at least the ones I've found on Wikipedia), which are:
-1 for index 0
Otherwise, the count of consecutive elements before the current position that match the beginning (excluding the beginning itself)
Therefore, I'd expect output for "ATAT" to be [|-1; 0; 0; 1|], not [|0; 0; 1; 2;|].
This type of problem might be better to reason about in functional style. To create the KMP table, you could use a recursive function that fills the table one by one, keeping track of how many recent characters match the beginning, and start running it at the second character's index.
A possible implementation:
let buildKmpPrefixTable (pattern : string) =
    let prefixTable = Array.zeroCreate pattern.Length
    if pattern.Length > 0 then prefixTable.[0] <- -1
    // matchCount: length of the longest proper prefix of the pattern that is
    // also a suffix of pattern.[0 .. writeIndex - 1]; it is written exactly
    // once per position, on first arrival
    let rec run writeIndex matchCount =
        if writeIndex < pattern.Length then
            prefixTable.[writeIndex] <- matchCount
            extend writeIndex matchCount
    and extend index matchCount =
        if pattern.[index] = pattern.[matchCount] then
            run (index + 1) (matchCount + 1)
        elif matchCount > 0 then
            // fall back to the next shorter border and retry the same index
            extend index prefixTable.[matchCount]
        else
            run (index + 1) 0
    run 1 0
    prefixTable
This approach isn't in danger of endless loops/recursion: every code path either advances writeIndex to the next position or strictly decreases matchCount (prefixTable.[matchCount] is always smaller than matchCount), so the iteration finishes.
Note on terminology: the error you are describing in the question is an endless loop or, more generally, non-terminating iteration. Deadlock refers specifically to a situation in which a thread waits for a lock that will never be released because the thread holding it is itself waiting for a lock that will never be released for the same reason.
In one of my Java programs I am trying to read a number and then use the golden ratio (1.618034) to find the smallest Fibonacci number that exceeds it, along with its index. For example, if I enter 100000 I should get back "the smallest fibonacci number which is greater than 100000 is the 26th and its value is 121393".
The program should also calculate a Fibonacci number by index (case 1 in the code below), which I have coded so far, but I can't figure out how to solve the problem described above (case 2). I have a horrible teacher and I don't really understand what I need to do. I am not asking for the code, just kind of a step-by-step of what I should do for case 2. I cannot use recursion. Thank you for any help. I seriously suck at wrapping my head around this.
import java.util.Scanner;

public class Fibonacci {
    public static void main(String args[]) {
        Scanner scan = new Scanner(System.in);
        System.out.println("This is a Fibonacci sequence generator");
        System.out.println("Choose what you would like to do");
        System.out.println("1. Find the nth Fibonacci number");
        System.out.println("2. Find the smallest Fibonacci number that exceeds user given value");
        System.out.println("3. Find the two Fibonacci numbers whose ratio is close enough to the golden number");
        System.out.print("Enter your choice: ");
        int choice = scan.nextInt();
        int xPre = 0;
        int xCurr = 1;
        int xNew = 0;
        switch (choice)
        {
            case 1:
                System.out.print("Enter the target index to generate (>1): ");
                int index = scan.nextInt();
                for (int i = 2; i <= index; i++)
                {
                    xNew = xPre + xCurr;
                    xPre = xCurr;
                    xCurr = xNew;
                }
                System.out.println("The " + index + "th Fibonacci number is " + xNew);
                break;
            case 2:
                System.out.print("Enter the target value (>1): ");
                int value = scan.nextInt();
        }
    }
}
First, you should understand what this golden ratio story is all about. The point is, Fibonacci numbers can be computed recursively, but there's also a closed formula for the nth Fibonacci number:
F(n) = [φ^n - (-φ)^(-n)] / √5
where φ = (√5 + 1)/2 is the golden ratio (approximately 1.61803). Now, |(-φ)^(-1)| < 1, which means that you can compute F(n) as the integer closest to φ^n/√5.
So: compute √5 and φ, learn how to round a real value to the nearest integer, and then compute F(n) using the φ^n/√5 formula (or just the "main" [φ^n - (-φ)^(-n)]/√5 formula) in a loop, comparing each F(n) with the number the user entered. When F(n) exceeds the user's number, remember n and F(n).
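A minimal sketch of those steps (in C++ for brevity, but Math.sqrt, Math.pow and Math.round map one-to-one in Java; it assumes the usual F(1) = F(2) = 1 indexing, which matches the "26th is 121393" example):

#include <cmath>
#include <cstdio>

int main() {
    const double sqrt5 = std::sqrt(5.0);
    const double phi = (sqrt5 + 1.0) / 2.0;   // golden ratio, ~1.618034

    long long target = 100000;                // the example input from the question
    long long fib = 0;
    int n = 0;
    // F(n) is the integer closest to phi^n / sqrt5; scan n upward
    while (fib <= target) {
        n++;
        fib = std::llround(std::pow(phi, n) / sqrt5);
    }
    std::printf("the smallest Fibonacci number which is greater than %lld is the %dth and its value is %lld\n",
                target, n, fib);              // expected: the 26th, 121393
    return 0;
}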
I have to build a compressor based on the Huffman algorithm.
So far, I managed to create the tree with the frequencies of each character and generate a representation with a smaller number of bits for each character.
It's something like this, for the phrase "good this sugarplum":
'o' 000, ' ' (space) 001, 't' 0100, 'r' 0101, 'p' 0110, 'm' 0111, 'l' 1000, 'i' 1001, 'h' 1010, 'd' 1011, 'a' 1100, 'u' 1101, 'g' 1110, 's' 1111
The problem I'm having now is finding a way to save the tree in the compressed file, so I can rebuild it and then decompress the file.
Any suggestions?
I did some research but found it difficult to understand, so if you can explain in detail, I would appreciate it.
The code I used to read the frequencies from file is:
// in and out are global FILE pointers, opened by load_base/load_out
int main (int argc, char *argv[])
{
    int i;
    TipoSentinela *sentinela;
    TipoLista *no = NULL;
    Arv *arvore, *arvore2, *arvore3;
    int *repete = (int *) calloc (256, sizeof(int));

    if (argc == 2)
    {
        in = load_base(argv[1]);
        le_dados_arquivo (repete); // read the frequencies from the file
        sentinela = cria_lista (); // create a marker for the tree node list
        for (i = 0; i < 256; i++)
        {
            if (repete[i] > 0 && i != 0)
            {
                arvore = arv_cria (Cria_info (i, repete[i])); // create a tree node with the character i and its frequency in the file
                no = inicia_lista (arvore, no, sentinela);    // create the list of tree nodes
            }
        }
        Ordena (sentinela); // sort the tree node list by the frequencies
        for (Seta_primeiro(sentinela); Tamanho_lista(sentinela) != 1; Move_marcador(sentinela))
        {
            Seta_primeiro(sentinela); // put the marker on the first element of the list
            no = Retorna_marcador(sentinela);
            arvore2 = Retorna_arvore (no); // return the tree referenced by the list marker
            Move_marcador(sentinela);      // move the marker to the next element
            arvore3 = Retorna_arvore (Retorna_marcador (sentinela)); // return the tree referenced by the list marker
            arvore = Cria_pai (arvore2, arvore3);   // create a tree node that will contain both arvore2 and arvore3
            Insere_arvoreFinal (sentinela, arvore); // insert the node at the end of the list
            Remove_arvore (sentinela); // remove the node arvore2 from the list
            Remove_arvore (sentinela); // remove the node arvore3 from the list
            Ordena (sentinela);        // sort the list again
        }
        out = load_out(argv[1]); // open the output file
        Codificacao (arvore);    // generate the code for each node of the tree
        rewind(in);

        int c; // int, not char, so EOF can be told apart from a valid byte
        while ((c = fgetc(in)) != EOF)
        {
            arvore2 = Procura_info (arvore, c); // search for the character c in the tree
            if (arvore2 != NULL)
                imprimebit(Retorna_codigo(arvore2), out); // write its code to the file
        }

        fclose(in);
        fclose(out);
        free(repete);
        arvore = arv_libera (arvore);
        Libera_Lista(sentinela);
    }
    return 0;
}
// bit_counter and cur_byte are global variables
void write_bit (unsigned char bit, FILE *f)
{
    static int k = 0; // 0 only for the very first bit ever written
    if (k != 0)
    {
        if (++bit_counter == 8)
        {
            fwrite(&cur_byte, 1, 1, f);
            bit_counter = 0;
            cur_byte = 0;
        }
    }
    k = 1;
    cur_byte <<= 1;
    cur_byte |= ('0' != bit);
}
// aux is the code of a character in the tree
void imprimebit(char *aux, FILE *f)
{
    int i, len;
    if (aux == NULL)
        return;
    len = (int) strlen(aux);
    for (i = 0; i < len; i++)
        write_bit(aux[i], f); // write the bits of the code to the file
}
With this I can write the code of every character to the output file, but I can't see a way to store the tree too.
You don't need to send the tree. Just send the code lengths. Then establish a consistent algorithm to convert the lengths to codes on both ends. The consistency is called a "canonical" Huffman code. You sort the codes by length, and within each length, sort by symbol. Then you assign codes starting at 0. So you would end up with (_ means space):
_ 000
o 001
a 0100
d 0101
g 0110
h 0111
i 1000
l 1001
m 1010
p 1011
r 1100
s 1101
t 1110
u 1111
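Both ends can then run the same assignment from nothing but the lengths. A small sketch of that assignment (C++ for brevity; the symbol/length pairs are the ones from this example):

#include <algorithm>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Canonical-code assignment: sort (length, symbol) pairs, then hand out
// codes counting up from 0, shifting left whenever the length grows.
int main() {
    std::vector<std::pair<int,char> > lens = {
        {3,' '}, {3,'o'}, {4,'a'}, {4,'d'}, {4,'g'}, {4,'h'}, {4,'i'},
        {4,'l'}, {4,'m'}, {4,'p'}, {4,'r'}, {4,'s'}, {4,'t'}, {4,'u'}};
    std::sort(lens.begin(), lens.end());     // by length, then by symbol
    unsigned code = 0;
    int prevLen = lens[0].first;
    for (std::size_t j = 0; j < lens.size(); j++) {
        code <<= (lens[j].first - prevLen);  // lengthen the code when the bit length grows
        prevLen = lens[j].first;
        std::string bits;                    // render the code as a bit string
        for (int b = prevLen - 1; b >= 0; b--)
            bits += ((code >> b) & 1) ? '1' : '0';
        std::printf("%c %s\n", lens[j].second, bits.c_str());
        code++;                              // next code of the same length
    }
    return 0;
}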
I did find a way to store the code of each character.
For example:
I write the tree starting at the root and going down, left child first, then right.
So, if my tree were something like

     0
    / \
   0   0
  / \ / \
'a' 'b' 'c' 'd'

where 0 marks an internal node and 1 marks a leaf, the header of my file would be something like this:
001[8 bits of 'a']1[8 bits of 'b']01[8 bits of 'c']1[8 bits of 'd']
With this, I would be able to rebuild my tree.
My problem now is reading the header of the file bit by bit, to know in which direction I have to create a new node.
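For what it's worth, here is one way the rebuild side could look, as a sketch rather than the original code: Node and read_bit() are stand-ins for the Arv type and the read half of write_bit(), and it assumes the writer emits the most significant bit of each byte first:

#include <cstdio>
#include <cstdlib>

// Stand-in node type for illustration; the real code uses the Arv type.
struct Node {
    int symbol;          // meaningful only for leaves
    Node *left, *right;
};

// Mirror of write_bit: hand out the buffered byte one bit at a time,
// most significant bit first (this must match the writer's bit order).
static int read_bit(FILE *f)
{
    static int bit_pos = 8;
    static int byte = 0;
    if (bit_pos == 8) {
        byte = fgetc(f);
        if (byte == EOF)
            return -1;
        bit_pos = 0;
    }
    return (byte >> (7 - bit_pos++)) & 1;
}

// Pre-order rebuild: 0 = internal node (recurse into both children),
// 1 = leaf, followed by the symbol's 8 bits.
static Node *read_tree(FILE *f)
{
    Node *n = (Node *) malloc(sizeof(Node));
    n->left = n->right = NULL;
    if (read_bit(f) == 1) {
        int sym = 0;
        for (int i = 0; i < 8; i++)
            sym = (sym << 1) | read_bit(f);
        n->symbol = sym;
    } else {
        n->left = read_tree(f);
        n->right = read_tree(f);
    }
    return n;
}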