Histogram is not scaled to the correct length

I'm trying to write a toString method that prints a histogram showing how often each letter of the alphabet is used in a string. The most frequent letter has to be 60 #s long, with the rest of the letters scaled to match.
My issue is with the equation that scales the rest of the letters to the correct length. My current equation is (myArray[i]/max) * 60, but I'm getting really weird results.
If I pass in "hello world", l is the most common letter, seen 3 times. So l should have 60 #s in the histogram, h should have 20, o should have 40, and so on. Instead I'm getting results like:
d : 10
e : 10
h : 10
l : 360
o : 20
r : 10
w : 10
Sorry for how sloppy this is right now; I'm just trying to figure out what's going on.
public class LetterCounter {
private static int[] alphabetArray;
private static String input;
/**
* Constructor for objects of class LetterCounter
*/
public LetterCounter()
{
alphabetArray = new int[26];
}
public void countLetters(String input) {
this.input = input;
this.input.toLowerCase();
//String s= input;
//s.toLowerCase();
for ( int i = 0; i < input.length(); i++ ) {
char ch= input.charAt(i);
if (ch >= 97 && ch <= 122){
alphabetArray[ch-'a']++;
}
}
}
public void getTotalCount() {
for (int i = 0; i < alphabetArray.length; i++) {
if(alphabetArray[i]>=0){
char ch = (char) (i+97);
System.out.println(ch +" : "+alphabetArray[i]);
}
}
}
public void reset() {
for (int i =0; i<alphabetArray.length; i++) {
if(alphabetArray[i]>=0){
alphabetArray[i]=0;
char ch = (char) (i+97);
System.out.println(ch +" : "+alphabetArray[i]);
}
}
}
public String toString() {
String s = "";
int max = alphabetArray[0];
int markCounter = 0;
for(int i =0; i<alphabetArray.length; i++) {
//finds the largest number of occurences for any letter in the string
if(alphabetArray[i] > max) {
max = alphabetArray[i];
}
}
for(int i =0; i<alphabetArray.length; i++) {
//trying to scale the rest of the characters down here
if(alphabetArray[i] > 0) {
markCounter = (alphabetArray[i] / max) * 60;
char ch = (char) (i+97);
System.out.println(ch +" : "+alphabetArray[i] + markCounter);
}
}
for (int i = 0; i < alphabetArray.length; i++) {
//prints the whole alphabet, total number of occurences for all chars
if(alphabetArray[i]>=0){
char ch = (char) (i+97);
System.out.println(ch +" : "+alphabetArray[i]);
}
}
return s;
}
}

There are quite a few problems with your code, so let's go through them one by one.
First of all, your print statement is misleading. Change it to
System.out.println(ch +" : "+alphabetArray[i] + " " + markCounter);
and you will see
d : 1 0
e : 1 0
h : 1 0
l : 3 60
o : 2 0
r : 1 0
w : 1 0
As you can see, the counters are correct (1, 1, 1, 3, 2, 1, 1), but your scaling doesn't work because of integer division:
1 / 3 --> 0 ... and 0 * 60 ... is still 0
3 / 3 --> 1 ... and 1 * 60 ... is 60
You could not see this before because no space was printed between the count and the scaled value, so 1 and 0 ran together as 10, and 3 and 60 as 360.
To get correct scaling, multiply before dividing:
markCounter = alphabetArray[i] * 60 / max;
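To see the difference between the two orders of operation side by side, here is a minimal sketch. The class name ScaleDemo and both helper methods are made up for illustration; they are not part of the original code.

```java
final class ScaleDemo {
    // Hypothetical helper: multiplies first, so the integer division
    // is applied to count * width and keeps the proportion.
    static int scale(int count, int max, int width) {
        return count * width / max;
    }

    // The original order of operations, kept only for comparison:
    // count / max is truncated to 0 for every count smaller than max.
    static int scaleDivideFirst(int count, int max, int width) {
        return (count / max) * width;
    }

    public static void main(String[] args) {
        int max = 3; // 'l' occurs 3 times in "hello world"
        for (int count = 1; count <= max; count++) {
            System.out.println(count + " -> divide first: "
                    + scaleDivideFirst(count, max, 60)
                    + ", multiply first: " + scale(count, max, 60));
        }
    }
}
```

With max = 3, the multiply-first version yields 20, 40 and 60 marks for counts of 1, 2 and 3, which matches the lengths the question expects.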
Other things worth mentioning:
You are overriding toString(), so you should put @Override in front of that method.
toLowerCase() returns a new string in lower case; calling it without assigning the result back to your variable just throws the lower-cased string away.
toString() shouldn't print to the console. The whole idea is that you put all the information into the string that you return. In other words, in the end you call System.out.println(someLetterCounter.toString()).
Your code is very low-level. You don't have to iterate arrays with an index-based for (int i ...) loop; where you only need the values, you can use for (int letter : alphabetArray) instead.
You might want to read about Map. With a Map<Character, Integer>, where the key is the character and the value is that character's counter, you could throw out most of your code and come up with a solution that requires only a few lines.
(And seriously: because of all these issues, debugging your code was much harder than it needed to be.)
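Putting these remarks together, one possible shape for the class is sketched below. This is only an illustration under assumptions: the class name LetterHistogram, the WIDTH constant and the choice of a TreeMap (to keep the letters in alphabetical order) are mine, not the asker's.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class LetterHistogram {
    private static final int WIDTH = 60; // length of the longest bar

    // TreeMap keeps the letters sorted, so the output is alphabetical.
    private final Map<Character, Integer> counts = new TreeMap<>();

    void countLetters(String input) {
        // toLowerCase returns a new string, so its result is used directly.
        for (char ch : input.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch >= 'a' && ch <= 'z') {
                counts.merge(ch, 1, Integer::sum);
            }
        }
    }

    @Override
    public String toString() {
        int max = 0;
        for (int count : counts.values()) {
            max = Math.max(max, count);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Character, Integer> entry : counts.entrySet()) {
            // Multiply before dividing to avoid truncating to zero.
            int marks = entry.getValue() * WIDTH / max;
            sb.append(entry.getKey()).append(" : ");
            for (int i = 0; i < marks; i++) {
                sb.append('#');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LetterHistogram histogram = new LetterHistogram();
        histogram.countLetters("hello world");
        System.out.print(histogram); // printing happens outside toString()
    }
}
```

Nothing is printed inside toString(); the caller decides what to do with the returned string.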

countLetters has some issues. You cannot convert a String to lower case by just calling
this.input.toLowerCase();
because String is immutable in Java. You have to assign the result, like this:
this.input = input.toLowerCase();
Another problem is that you are reading from the input parameter instead of this.input, which holds the lower-case string. You can make countLetters work this way:
public void countLetters(String input) {
this.input = input.toLowerCase();
for ( int i = 0; i < this.input.length(); i++ ) {
char ch= this.input.charAt(i);
if (ch >= 97 && ch <= 122) {
alphabetArray[ch-'a']++;
}
}
}

Related

How does the Map type find a value by its key?

I want to know how the Map type works when it looks up the value for a key.
Does it create an index, like SQL does?
Why are the lookup times all the same when I run this code?
I have never taken a programming class. Can someone explain this in a simple way?
import 'dart:math';
void main() {
int limit = 1000000;
Random random = new Random();
Map<String, int> map = {};
List<String> list = [];
for (int i = 0; i < limit; i++) {
String key = i.toString() + random.nextInt(limit).toString();
map[key] = i;
if (i == 0 ||
i == limit - 1 ||
i == limit / 2 ||
i == limit / 4 ||
i == limit / 10) {
list.add(key);
print("$key : $i");
}
}
for (var e in list) {
print("\n$e");
int time = DateTime.now().microsecondsSinceEpoch;
print(
"$e : ${map[e]} -> time = ${DateTime.now().microsecondsSinceEpoch - time}");
}
}
The result is:
0225402 : 0
100000478677 : 100000
250000355840 : 250000
50000052625 : 500000
999999681585 : 999999
0225402
0225402 : 0 -> time = 0
100000478677
100000478677 : 100000 -> time = 0
250000355840
250000355840 : 250000 -> time = 0
50000052625
50000052625 : 500000 -> time = 0
999999681585
999999681585 : 999999 -> time = 0

Unable to understand firstTerm = secondTerm; secondTerm = nextTerm; in Fibonacci series

class Main {
public static void main(String[] args) {
int n = 5, firstTerm = 0, secondTerm = 1;
System.out.println("Fibonacci Series till " + n + " terms:");
for (int i = 1; i <= n; ++i) {
System.out.print(firstTerm + " ");
// compute the next term
int nextTerm = firstTerm + secondTerm;
firstTerm = secondTerm;
secondTerm = nextTerm;
}
}
}
I am unable to understand why the statements firstTerm = secondTerm; and secondTerm = nextTerm; are written. Can anyone explain this concept to me?
The Fibonacci sequence is defined by
F(0) = 0 // This is our first term
F(1) = 1 // This is the second term
F(n) = F(n - 1) + F(n - 2)
To calculate a term that is neither the first nor the second term, we need to sum the two previous terms.
This is why, on each iteration, the value of the second term is assigned to the first term and the newly computed term is assigned to the second term: the pair of variables slides one position along the sequence.
You will have more details here
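To make the sliding pair visible, here is a small sketch that runs the same loop as the question but also prints the state of both variables after each step. The class name FibonacciTrace and the method firstTerms are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class FibonacciTrace {
    // Returns the first n terms, using the same loop as in the question.
    static List<Integer> firstTerms(int n) {
        List<Integer> terms = new ArrayList<>();
        int firstTerm = 0, secondTerm = 1;
        for (int i = 1; i <= n; i++) {
            terms.add(firstTerm);
            int nextTerm = firstTerm + secondTerm; // F(n) = F(n-2) + F(n-1)
            firstTerm = secondTerm;  // old F(n-1) becomes the new F(n-2)
            secondTerm = nextTerm;   // F(n) becomes the new F(n-1)
            System.out.println("after step " + i + ": firstTerm=" + firstTerm
                    + ", secondTerm=" + secondTerm);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(firstTerms(5));
    }
}
```

Without the two assignments, firstTerm and secondTerm would stay at 0 and 1 forever, and every iteration would compute the same nextTerm.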

Dart: how to convert a column letter into number

Currently using Dart with gsheets_api, which doesn't seem to have a function to convert column letters to numbers (column index).
As an example, this is what I use with Apps Script for the opposite direction (input: column index number, output: column letter):
function Column_Nu_to_Letter(column_nu)
{
var temp, letter = '';
while (column_nu > 0)
{
temp = (column_nu - 1) % 26;
letter = String.fromCharCode(temp + 65) + letter;
column_nu = (column_nu - temp - 1) / 26;
}
return letter;
};
This is the code I came up with for Dart. It works, but I am sure there is a more elegant or correct way to do it.
String colLetter = 'L'; //Column 'L' as example
int c = "A".codeUnitAt(0);
int end = "Z".codeUnitAt(0);
int counter = 1;
while (c <= end) {
//print(String.fromCharCode(c));
if(colLetter == String.fromCharCode(c)){
print('Conversion $colLetter = $counter');
}
counter++;
c++;
}
// this outputs: Conversion L = 12
Do you have any suggestions on how to improve this code?
First we need to agree on the meaning of the letters.
I believe the traditional approach is "A" is 1, "Z" is 26, "AA" is 27, "AZ" is 52, "BA" is 53, etc.
Then I'd probably go with something like these functions for converting:
int lettersToIndex(String letters) {
var result = 0;
for (var i = 0; i < letters.length; i++) {
result = result * 26 + (letters.codeUnitAt(i) & 0x1f);
}
return result;
}
String indexToLetters(int index) {
if (index <= 0) throw RangeError.range(index, 1, null, "index");
const _letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
if (index < 27) return _letters[index - 1];
var letters = <String>[];
do {
index -= 1;
letters.add(_letters[index.remainder(26)]);
index ~/= 26;
} while (index > 0);
return letters.reversed.join("");
}
The former function doesn't validate that the input only contains letters, but it works correctly for strings containing only letters (and it ignores case as a bonus).
The latter does check that the index is greater than zero.
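For readers who want to check the numbering scheme outside Dart, here is a rough Java port of the two functions. It is a sketch under the same assumption as above ("A" is 1, "Z" is 26, "AA" is 27); the class name ColumnLetters is made up, and the exception type is my choice rather than anything from the answer.

```java
final class ColumnLetters {
    // Treats the letters as digits of a base-26 number without a zero digit.
    // Masking with 0x1f maps both 'A' and 'a' to 1, so case is ignored.
    // Like the Dart version, this does not validate the input.
    static int lettersToIndex(String letters) {
        int result = 0;
        for (int i = 0; i < letters.length(); i++) {
            result = result * 26 + (letters.charAt(i) & 0x1f);
        }
        return result;
    }

    static String indexToLetters(int index) {
        if (index <= 0) {
            throw new IllegalArgumentException("index must be >= 1: " + index);
        }
        StringBuilder reversed = new StringBuilder();
        while (index > 0) {
            index -= 1; // shift to zero-based so that 26 maps to 'Z'
            reversed.append((char) ('A' + index % 26));
            index /= 26;
        }
        return reversed.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(lettersToIndex("L"));
        System.out.println(indexToLetters(53));
    }
}
```

The subtraction before taking the remainder is what makes the scheme work: there is no letter for zero, so each position is shifted down by one before it is treated as a base-26 digit.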
A simplified version based on Irn's answer:
int lettersToIndex(String letters) =>
letters.codeUnits.fold(0, (v, e) => v * 26 + (e & 0x1f));
String indexToLetters(int index) {
var letters = '';
do {
// Shift to a zero-based value first so that 26 maps to "Z" rather than "A@".
index -= 1;
final r = index % 26;
letters = '${String.fromCharCode(65 + r)}$letters';
index ~/= 26;
} while (index > 0);
return letters;
}

How to convert large number to shorten K/M/B in Dart

How can I create a function that converts a large number into a shortened form with a suffix character in Dart?
For example:
1000 => 1K
10000 => 10K
1000000 => 1M
10000000 => 10M
1000000000 => 1B
The intl package provides a formatter for this, and it's simple to use:
var f = NumberFormat.compact(locale: "en_IN");
print(f.format(12345));
To make it a method:
getShortForm(var number) {
var f = NumberFormat.compact(locale: "en_US");
return f.format(number);
}
For this to work, import
import 'package:intl/intl.dart';
Refer to the documentation for more: https://pub.dev/documentation/intl/latest/intl/NumberFormat-class.html
If you are looking for the hard way:
getShortForm(int number) {
var shortForm = "";
if (number != null) {
if (number < 1000) {
shortForm = number.toString();
} else if (number >= 1000 && number < 1000000) {
shortForm = (number / 1000).toStringAsFixed(1) + "K";
} else if (number >= 1000000 && number < 1000000000) {
shortForm = (number / 1000000).toStringAsFixed(1) + "M";
} else if (number >= 1000000000 && number < 1000000000000) {
shortForm = (number / 1000000000).toStringAsFixed(1) + "B";
}
}
return shortForm;
}
String toString(int value) {
const units = <int, String>{
1000000000: 'B',
1000000: 'M',
1000: 'K',
};
return units.entries
.map((e) => '${value ~/ e.key}${e.value}')
.firstWhere((e) => !e.startsWith('0'), orElse: () => '$value');
}
A simpler approach, if you only need the suffix. It may not compile, but this is the idea.
String getSuffix (int t)
{
int i = -1;
for ( ; (t /= 1000) > 0 ; i++ );
return ['K','M','B'][i];
}
Edit
This is the mathematical way to do it, and it compiles (it needs log and pow from dart:math). The point is that you are counting the number of groups of three decimal digits:
x 000 - 1
x 000 000 - 2
and so on, which is the base-1000 logarithm of the number, rounded down.
String getSuffix (int num)
{
int i = ( log(num) / log(1000) ).truncate();
return (num / pow(1000,i)).truncate().toString() + [' ','K','M','B'][i];
}
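The same "count the groups of three digits" idea can also be done with integer division instead of a logarithm, which sidesteps floating-point rounding near exact powers of 1000. The sketch below uses that swapped-in technique; the class name ShortNumber and the method shorten are hypothetical, and like the code above it truncates rather than rounds.

```java
final class ShortNumber {
    private static final String[] SUFFIXES = {"", "K", "M", "B"};

    static String shorten(long number) {
        if (number < 0) {
            throw new IllegalArgumentException("negative numbers are not handled: " + number);
        }
        int groups = 0; // how many groups of three digits were dropped
        long value = number;
        // Stop at the last known suffix so the array index stays valid.
        while (value >= 1000 && groups < SUFFIXES.length - 1) {
            value /= 1000; // integer division truncates, e.g. 1500 -> 1
            groups++;
        }
        return value + SUFFIXES[groups];
    }

    public static void main(String[] args) {
        System.out.println(shorten(1000));
        System.out.println(shorten(10_000_000));
    }
}
```

Numbers beyond the billions keep the "B" suffix with a larger leading value, because the loop stops once the last suffix has been reached.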
The Intl package does this as "compact" numbers, but it has a fixed format and it will also change with different locales, which might or might not be what you want.
Make a class and use its static method everywhere.
class NumberFormatter {
static String formatter(String currentBalance) {
try {
// suffix = {' ', 'k', 'M', 'B', 'T', 'P', 'E'};
double value = double.parse(currentBalance);
if (value < 1000) { // less than a thousand
return value.toStringAsFixed(2);
} else if (value < 1000000) { // less than a million
return (value / 1000).toStringAsFixed(2) + "k";
} else if (value < 1000000000) { // less than a billion
return (value / 1000000).toStringAsFixed(2) + "M";
} else if (value < 1000000000000) { // less than a trillion
return (value / 1000000000).toStringAsFixed(2) + "B";
} else { // a trillion or more
return (value / 1000000000000).toStringAsFixed(2) + "T";
}
} catch (e) {
print(e);
// Fall back to the unformatted input if it cannot be parsed.
return currentBalance;
}
}
}

Huffman's encoding and decoding

I have to build a compressor based on the Huffman algorithm.
So far, I managed to create the tree with the frequencies of each character and generate a representation with a smaller number of bits for each character.
It is something like this for the phrase "good this sugarplum":
'o' 000, ' ' 001, 't' 0100, 'r' 0101, 'p' 0110, 'm' 0111, 'l' 1000, 'i' 1001, 'h' 1010, 'd' 1011, 'a' 1100, 'u' 1101, 'g' 1110, 's' 1111
The problem I'm having now is finding a way to save the tree in the archive, so I can rebuild it and then decompress the file.
Any suggestions?
I did some research but found it difficult to understand, so if you can explain in detail, I would appreciate it.
The code I used to read the frequencies from file is:
int main (int argc, char *argv[])
{
int i;
TipoSentinela *sentinela;
TipoLista *no = NULL;
Arv *arvore, *arvore2, *arvore3;
int *repete = (int *) calloc (256, sizeof(int));
if(argc == 2)
{
in = load_base(argv[1]);
le_dados_arquivo (repete); //read the frequencies from the file
sentinela = cria_lista (); //create a marker for the tree node list
for (i = 0; i < 256; i++)
{
if(repete[i] > 0 && i != 0)
{
arvore = arv_cria (Cria_info (i, repete[i])); //create a tree node with the character i and the frequence of it in the file
no = inicia_lista (arvore, no, sentinela); //create the list of tree nodes
}
}
Ordena (sentinela); //sort the tree nodes list by the frequencies
for(Seta_primeiro(sentinela); Tamanho_lista(sentinela) != 1; Move_marcador(sentinela))
{
Seta_primeiro(sentinela); //put the marker in the first element of the list
no = Retorna_marcador(sentinela);
arvore2 = Retorna_arvore (no); //return the tree represented by the list marker
Move_marcador(sentinela); //put the marker to the next element
arvore3 = Retorna_arvore (Retorna_marcador (sentinela)); //return the tree represented by the list marker
arvore = Cria_pai (arvore2, arvore3); //create a tree node that will contain the both arvore2 and arvore3
Insere_arvoreFinal (sentinela, arvore); //insert the node at the end of the list
Remove_arvore (sentinela); //remove the node arvore2 from the list
Remove_arvore (sentinela); //remove the node arvore3 from the lsit
Ordena (sentinela); //sort the list again
}
out = load_out(argv[1]); //open the output file
Codificacao (arvore); //generate the code from each node of the tree
rewind(in);
char c;
while(!feof(in))
{
c = fgetc(in);
if(c != EOF)
arvore2 = Procura_info (arvore, c); //search the character c in the tree
if(arvore2 != NULL)
imprimebit(Retorna_codigo(arvore2), out); //write the code in the file
}
fclose(in);
fclose(out);
free(repete);
arvore = arv_libera (arvore);
Libera_Lista(sentinela);
}
return 0;
}
//bit_counter and cur_byte are global variables
void write_bit (unsigned char bit, FILE *f)
{
static int k = 0;
if(k != 0)
{
if(++bit_counter == 8)
{
fwrite(&cur_byte,1,1,f);
bit_counter = 0;
cur_byte = 0;
}
}
k = 1;
cur_byte <<= 1;
cur_byte |= ('0' != bit);
}
//aux is the code of a character in the tree
void imprimebit(char *aux, FILE *f)
{
int i, j;
if(aux == NULL)
return;
for(i = 0; i < strlen(aux); i++)
{
write_bit(aux[i], f); //write the bits of the code in the file
}
}
With this, I can write the code of all characters in the output file, but I can't see a way to store the tree too.
You don't need to send the tree. Just send the code lengths. Then use the same algorithm on both ends to convert the lengths to codes. The result is called a "canonical" Huffman code. You sort the symbols by code length and, within each length, by symbol value. Then assign codes in increasing order, starting at 0. So you would end up with (_ means space):
_ 000
o 001
a 0100
d 0101
g 0110
h 0111
i 1000
l 1001
m 1010
p 1011
r 1100
s 1101
t 1110
u 1111
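The assignment described above can be sketched in a few lines. This is an illustration, not a full compressor: the class name CanonicalHuffman and the method assignCodes are made up, and the codes are returned as strings of '0' and '1' only to make them easy to compare with the table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CanonicalHuffman {
    // Takes symbol -> code length and returns symbol -> canonical code.
    static Map<Character, String> assignCodes(Map<Character, Integer> lengths) {
        List<Map.Entry<Character, Integer>> symbols = new ArrayList<>(lengths.entrySet());
        // Sort by code length first, then by symbol value.
        symbols.sort((a, b) -> {
            int byLength = Integer.compare(a.getValue(), b.getValue());
            return byLength != 0 ? byLength : Character.compare(a.getKey(), b.getKey());
        });

        Map<Character, String> codes = new LinkedHashMap<>();
        int code = 0;
        int previousLength = 0;
        for (Map.Entry<Character, Integer> entry : symbols) {
            int length = entry.getValue();
            // When the length grows, append zero bits to the running code.
            code <<= (length - previousLength);
            codes.put(entry.getKey(), toBinary(code, length));
            code++;
            previousLength = length;
        }
        return codes;
    }

    // Formats the lowest 'length' bits of code, most significant bit first.
    static String toBinary(int code, int length) {
        StringBuilder sb = new StringBuilder();
        for (int bit = length - 1; bit >= 0; bit--) {
            sb.append((code >> bit) & 1);
        }
        return sb.toString();
    }
}
```

Because the decoder can run the same routine on the same lengths, the file header only needs to carry one length per symbol.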
I did find a way to store the code of each character.
For example:
I write the tree starting at the root, going down to the left and then to the right.
So, if my tree was something like
        0
      /   \
     0     1
    / \   / \
  'a' 'b' 'c' 'd'
The header of my file would be something like this:
001[8 bits from 'a']1[8 bits from 'b']01[8 bits from 'c']1[8 bits from 'd']
With this, I would be able to rebuild my tree.
My problem now is reading the file header bit by bit to know in which direction I have to create a new node.
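One way to read such a header is a recursive function that mirrors the way it was written: a 0 bit means "internal node, read the left subtree and then the right subtree", and a 1 bit means "leaf, the next 8 bits are the character". The sketch below is in Java rather than C and works on an in-memory byte array; it assumes the bits are packed most significant bit first, as write_bit in the question does. All names (TreeHeaderReader, BitReader, Node, pack) are hypothetical.

```java
final class TreeHeaderReader {
    static final class Node {
        final char symbol; // only meaningful for leaves
        final Node left, right;
        Node(char symbol, Node left, Node right) {
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Reads single bits from a byte array, most significant bit first.
    static final class BitReader {
        private final byte[] data;
        private int position; // index of the next bit to read
        BitReader(byte[] data) { this.data = data; }
        int readBit() {
            int bit = (data[position >> 3] >> (7 - (position & 7))) & 1;
            position++;
            return bit;
        }
        int readByte() {
            int value = 0;
            for (int i = 0; i < 8; i++) value = (value << 1) | readBit();
            return value;
        }
    }

    // Pre-order rebuild: the recursion decides the direction, not the caller.
    static Node readTree(BitReader in) {
        if (in.readBit() == 1) {
            return new Node((char) in.readByte(), null, null);
        }
        Node left = readTree(in);
        Node right = readTree(in);
        return new Node('\0', left, right);
    }

    // Test helper: packs a string of '0'/'1' into bytes, padding with zeros.
    static byte[] pack(String bits) {
        byte[] out = new byte[(bits.length() + 7) / 8];
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') out[i >> 3] |= (byte) (0x80 >> (i & 7));
        }
        return out;
    }
}
```

The direction never has to be tracked explicitly: after a 0 bit, the first recursive call consumes exactly the bits of the left subtree, so the bits that follow belong to the right subtree.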

Resources