How to generate random lines of text of a given length from a dictionary of words (bin-packing problem)? - actionscript

I need to generate three lines of text (essentially gibberish) that are each 60 characters long, including a hard return at the end of each line. The lines are generated from a dictionary of words of various lengths (typically 1-8 characters). No word may be used more than once, and words must be separated by spaces. I think this is essentially a bin-packing problem.
The approach I've taken so far is to create a hashMap of the words, grouped by their lengths. I then choose a random length, pull a word of that length from the map, and append it to the end of the line I'm currently generating, accounting for spaces or a hard return. It works about half the time, but the other half of the time I'm getting stuck in an infinite loop and my program crashes.
One problem I'm running into is this: as I add random words to the lines, groups of words of a given length may become depleted. This is because there are not necessarily the same number of words of each length in the dictionary, e.g., there may only be one word with a length of 1. So, I might need a word of a given length, but there are no longer any words of that length available.
Below is a summary of what I have so far. I'm working in ActionScript, but would appreciate insight into this problem in any language. Many thanks in advance.
dictionary // map of words with word lengths as keys and arrays of corresponding words as values
lengths // array of word lengths, sorted numerically
min = lengths[0] // minimum word length
max = lengths[lengths.length - 1] // maximum word length
line = ""
while ( line.length < 60 ) {
    len = lengths[round( rand() * ( lengths.length - 1 ) )]
    if ( dictionary[len] != null && dictionary[len].length > 0 ) {
        diff = 60 - line.length // number of characters needed to complete the line
        if ( line.length + len + 1 == 60 ) {
            // this word will complete the line exactly
            line += dictionary[len].splice(0, 1) + "\n"
        }
        else if ( min + max + 2 >= diff ) {
            // find the two word lengths that will complete the line
            // ==> this is where I'm having trouble
        }
        else if ( line.length + len + 1 < 60 - max ) {
            // this word will fit safely, so just add it
            line += dictionary[len].splice(0, 1) + " "
        }
        if ( dictionary[len].length == 0 ) {
            // delete any empty arrays and update min and max lengths accordingly
            dictionary[len] = null
            delete dictionary[len]
            i = lengths.indexOf( len )
            if ( i >= 0 ) {
                // words of this length have been depleted, so
                // update lengths array to ensure that next random
                // length is valid
                lengths.splice( i, 1 )
            }
            if ( lengths.indexOf( min ) == -1 ) {
                // update the min
                min = lengths[0]
            }
            if ( lengths.indexOf( max ) == -1 ) {
                // update the max
                max = lengths[lengths.length - 1]
            }
        }
    }
}

You should think of an n-letter word as being n+1 letters, because each word has either a space or return after it.
Counted that way, every word takes up at least 2 characters (one letter plus its space or return), so you never want to reach a point where 59 characters are filled in. If you get to 57, you need to pick something that is 2 letters plus the return. If you get to 58, you need a 1-letter word plus the return.
Are you trying to optimize for time? Can you have the same word multiple times? Multiple times in one line? Does it matter if your words are not uniformly distributed, e.g. a lot of lines contain "a" or "I" because those are the only one-letter words in English?
Here's the basic idea. For each line, start choosing word lengths, and keep track of the word lengths and total character count so far. As you get toward the end of the line, choose word lengths less than the number of characters you have left. (e.g. if you have 5 characters left, choose words in the range of 2-5 characters, counting the space.) If you get to 57 characters, pick a 3-letter word (counting return). If you get to 58 characters, pick a 2-letter word (counting return.)
If you want, you can shuffle the word lengths at this point, so all your lines don't end with short words. Then for each word length, pick a word of that length and plug it in.
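The idea above can be sketched in Python roughly as follows. This is only a sketch under a few assumptions: the names group_by_length, pick_lengths and make_lines are made up for illustration, each word is counted as its length plus one separator, and a dead end is handled by simply retrying the line a bounded number of times rather than by a smarter search.

```python
import random
from collections import defaultdict

LINE_WIDTH = 60  # characters per line, including the trailing newline


def group_by_length(words, rng):
    """Map each word length to a shuffled list of the distinct words of that length."""
    groups = defaultdict(list)
    for word in sorted(set(words)):
        groups[len(word)].append(word)
    for bucket in groups.values():
        rng.shuffle(bucket)
    return groups


def pick_lengths(counts, width, rng):
    """Pick word lengths whose slots (length + 1) add up to exactly `width`.

    `counts` maps a word length to the number of unused words of that length.
    Returns a list of lengths, or None if this attempt ran into a dead end.
    """
    counts = dict(counts)  # copy, so a failed attempt consumes nothing
    chosen, remaining = [], width
    while remaining > 0:
        available = [n for n, left in counts.items() if left > 0]
        if not available:
            return None
        smallest_slot = min(available) + 1
        # Safe choices either finish the line or leave room for one more word.
        safe = [n for n in available
                if n + 1 == remaining or n + 1 <= remaining - smallest_slot]
        if not safe:
            return None
        n = rng.choice(safe)
        counts[n] -= 1
        chosen.append(n)
        remaining -= n + 1
    return chosen


def make_lines(words, line_count=3, width=LINE_WIDTH, max_attempts=1000, seed=None):
    """Build `line_count` lines of exactly `width` characters without reusing a word."""
    rng = random.Random(seed)
    groups = group_by_length(words, rng)
    lines = []
    for _ in range(line_count):
        counts = {n: len(bucket) for n, bucket in groups.items()}
        for _ in range(max_attempts):
            lengths = pick_lengths(counts, width, rng)
            if lengths is not None:
                break
        else:
            raise ValueError("could not fill a line from the remaining words")
        rng.shuffle(lengths)  # so short filler words do not always end the line
        line = " ".join(groups[n].pop() for n in lengths)
        lines.append(line + "\n")
    return lines
```

Calling make_lines(word_list) returns three strings of 60 characters each, newline included. If the dictionary runs dry it raises ValueError instead of looping forever, which is the failure mode described in the question.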

dictionary = Group your words by lengths (like you already do)
total_length = 0
phrase = ""
while (total_length < 60) {
    random_length = generate_random_number(1, 8)
    if (total_length + random_length + 1 > 60)
    {
        // shrink the last word so that it plus its trailing blank (or \n) fills the line exactly;
        // this becomes 0 when 59 characters are already used, a case this loop does not handle
        random_length = 60 - total_length - 1
    }
    phrase += dictionary.get_random_word_of_length(random_length) + " "
    total_length += random_length + 1
}

Related

How do I fix the error "subscript out of range" in QBASIC?

I'm trying to write code that generates random numbers within the range 10-30 while making sure that no number is repeated. It shows "subscript out of range" on NumArray(Count) = Count when I run the code.
'Make an array of completely sorted numbers
FOR Count = 10 TO 30
    NumArray(Count) = Count
NEXT Count
RANDOMIZE TIMER
FOR Count = 10 TO 30
    Number = (RND * (31 - Count)) + 10
    PRINT #1, NumArray(Number)
    FOR Counter = Number TO 30 - Count
        NumArray(Counter) = NumArray(Counter + 1)
    NEXT Counter
NEXT Count
This isn't actually my code. Copied and pasted for my assignment.
It looks like you're missing some DIM statements.
Variables containing numbers have type SINGLE by default, and the RND function returns a number between 0 and 1, excluding 1. Number can therefore hold a fractional value, so you might see something like FOR Counter = 18.726493 TO 20 and end up trying to use NumArray(18.726493), which will not work.
Arrays that are not explicitly declared can only have 11 items with an index from 0 to 10, but the range 10-30 requires you to store 21 items (30 - 10 + 1 = 21). You can also specify a custom upper and lower bound if it will make your code easier for you to understand. Add these lines before the first line in your code shown above:
DIM Number AS INTEGER
DIM NumArray(10 TO 30) AS INTEGER
This will ensure Number only contains integers (any fractional values are rounded to the nearest integer), and NumArray will work from NumArray(10) to NumArray(30), but you can't use NumArray(9), NumArray(8), NumArray(31), etc. The index must be in the range 10-30.
I think that should fix your code, but I don't know for certain since I don't fully understand how it is supposed to work. At the very least, it will fix the type and subscript problems in your code.
You need to declare the array:
'Make an array of completely sorted numbers
DIM NumArray(30) AS INTEGER
FOR Count = 10 TO 30
    NumArray(Count) = Count
NEXT Count
RANDOMIZE TIMER
FOR Count = 10 TO 30
    Number = (RND * (31 - Count)) + 10
    PRINT #1, NumArray(Number)
    FOR Counter = Number TO 30 - Count
        NumArray(Counter) = NumArray(Counter + 1)
    NEXT Counter
NEXT Count
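The loop in the question appears to be aiming at a draw without repetition: pick a random position in a pool of numbers, output that entry, then remove it from the pool. Assuming that is the intent, here is a minimal Python sketch of the idea. The function name and its parameters are illustrative and not taken from the original code.

```python
import random


def unique_random_numbers(low=10, high=30, rng=random):
    """Return every integer in [low, high] exactly once, in random order."""
    pool = list(range(low, high + 1))      # 21 numbers for the range 10-30
    result = []
    while pool:
        index = rng.randrange(len(pool))   # integer index, always inside the pool
        result.append(pool.pop(index))     # removing the entry prevents repeats
    return result


print(unique_random_numbers())
```

Because the index is always an integer below the current pool size, the lookup can never leave the bounds of the list, which is the Python counterpart of the "subscript out of range" problem.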

How do I fill up a number's decimal places with zeroes?

Assume the following numbers:
local a = 2
local b = 3.1
local c = 1.43
local d = 1.0582
My goal is to round these numbers to two decimal places. The result should be this, respectively:
a = 2.00
b = 3.10
c = 1.43
d = 1.06 or 1.05
Obviously I understand that any number with trailing decimal zeroes will get rounded. 2.00 will be 2. But I need the numbers as strings, and to make it visually more appealing, I would need these two decimal places.
Here's a function I use to round to two decimal places:
function round(num, numDecimalPlaces)
    local mult = 10^(numDecimalPlaces or 0)
    return math.floor(num * mult + 0.5) / mult
end
This works fine for test cases c and d, but will produce wrong results with a and b: it won't fill up with zeroes. I understand it is because the rounding function takes the numbers and calculates them - therefore the excess zeroes get cut off.
But that is exactly not my goal - not cutting them off.
I've tried string manipulation, by checking if and where a . is in a number, but that didn't work at all, for any case. My method:
local zei
if i < 100 then
    if tostring(i):find("%.") == nil then
        zei = round(i, 2) .. ".00" --No decimal point found, append .00
    else
        zei = round(i, 2) --Found decimal point, round to 2
    end
    if tostring(i):find("%.")+2 == tostring(i):len() then
        zei = round(i, 2) .. "0" --Found point, but only one trailing number, append 0
    end
else
    zei = round(i, 0) --Number is over 100, no decimal points needed
end
The above 100 case is just for aesthetics and not relevant here. Where zei is the displayed string, and i is one of the test case numbers.
Summary
How would I round a number to two decimal places, but append trailing zeroes, even if they were excess, e.g. 2.30? I understand I need strings for this.
Contradicting question: Strip off excess zeroes
You don't round numbers. You create string representations of those numbers. That would be done by string.format, with an appropriate format. Like this:
string.format("%.2f", a);
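For comparison, here is a minimal sketch of the same approach in Python, whose % operator follows the same printf-style conventions as the format string above. The helper name format_two_places is made up for illustration; the values are the four test cases from the question.

```python
def format_two_places(value):
    """Return `value` as a string with exactly two decimal places."""
    return "%.2f" % value


for value in (2, 3.1, 1.43, 1.0582):
    print(format_two_places(value))
```

The number itself is never changed here. Only its text representation is padded or rounded, which is why the trailing zeroes survive.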

Rounding to specific value?

I need to round a number, let's say 543 to either the hundreds or the tens place. It could be either one, as it's part of a game and this stage can ask you to do one or the other.
So for example, it could ask, "Round number to nearest tens", and if the number was 543, they would have to enter in 540.
However, I don't see a function that you can specify target place value to round at. I know there's an easy solution, I just can't think of one right now.
From what I see, the round function rounds the last decimal place?
Thanks
To round to the 100's place
NSInteger num=543;
NSInteger deci=num%100;//43
if(deci>49){
    num=num-deci+100;//543-43+100 =600
}
else{
    num=num-deci;//543-43=500
}
To round to the 10's place
NSInteger num=543;
NSInteger deci=num%10;//3
if(deci>4){
    num=num-deci+10;//543-3+10 =550
}
else{
    num=num-deci;//543-3=540
}
EDIT:
Tried to merge the above in one:
NSInteger num=543;
NSInteger place=100; //rounding factor, 10 or 100 or even more.
NSInteger condition=place/2;
NSInteger deci=num%place;//43
if(deci>=condition){
    num=num-deci+place;//543-43+100 =600.
}
else{
    num=num-deci;//543-43=500
}
You may just use an algorithm in your code:
For example, let's say that you need to round a number to the hundreds place.
int c = 543
int k = c % 100
if k >= 50
    c = (c - k) + 100
else
    c = c - k
To round numbers, you can use the modulus operator, %.
The modulus operator gives you the remainder after division.
So 543 % 10 = 3, and 543 % 100 = 43.
Example:
int place = 10;
int numToRound=543;
// Remainder is 3
int remainder = numToRound%place;
if(remainder>=(place/2)) {
    // Called if remainder is 5 or greater. In this case, it is 3, so this line won't be called.
    // Subtract the remainder, and round up by 10.
    numToRound=(numToRound-remainder)+place;
}
else {
    // Called if remainder is less than 5. In this case, 3 < 5, so it will be called.
    // Subtract the remainder, leaving 540
    numToRound=(numToRound-remainder);
}
// numToRound will output as 540
NSLog(@"%i", numToRound);
Edit: My original answer was submitted before it was ready, because I accidentally hit a key to submit it. Oops.
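The modulus approach from these answers can be condensed into one small function. Below is a minimal Python sketch, assuming non-negative integers and that a remainder of exactly half the place value rounds up. The name round_to_place is hypothetical.

```python
def round_to_place(num, place):
    """Round a non-negative integer to the nearest multiple of `place` (halves round up)."""
    remainder = num % place
    if 2 * remainder >= place:           # remainder is at least half of the place value
        return num - remainder + place   # round up to the next multiple
    return num - remainder               # round down


print(round_to_place(543, 10))
print(round_to_place(543, 100))
```

Comparing 2 * remainder with place avoids the integer division place / 2, so the same test also works for an odd place value.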

How to take in digits 0-255 from a file with no delimiters

I have a plaintext file that has only numerical digits in it (no spaces, commas, newlines, etc.) which contains n digits which range from 0 to 255. I want to take it in and store these values in an array.
Example
Let's say we have this sequence in the file:
581060100962552569
I want to take it in like this, where in.read is the file input stream, tempArray is a local array of at most 3 variables that is wiped every time something is stored in endArray, which is where I want the final values to go:
in.read tempArray endArray
5 [5][ ][ ] [] //It reads in "5", sees that any single-digit number X guarantees that "5X" is less than or equal to 255, and continues
8 [5][8][ ] [58] //It reads in "8", realizes that there's no number X that could make "58X" smaller than or equal to "255", so it stores "58" in endArray
1 [1][ ][ ] [58] //It wipes tempArray and reads the next value into it, repeating the logic of the first step
0 [1][0][ ] [58] //It realizes that all single-digit numbers X guarantee that "10X" is less than or equal to "255", so it continues
6 [1][0][6] [58][106] //It reads "6" and adds "106" to the endArray
0 [0][ ][ ] [58][106] //It wipes tempArray and stores the next value in it
1 [0][1][ ] [58][106]
0 [0][1][0] [58][106][10] //Even though all single-digit numbers X guarantee that "010X" is less than or equal to "255", tempArray is full, so it stores its contents in endArray as "10".
0 [0][ ][ ] [58][106][10]
9 [0][9][ ] [58][106][10]
6 [0][9][6] [58][106][10][96] //Not only can "96" not have another number appended to it, but tempArray is full
2 [2][ ][ ] [58][106][10][96]
5 [2][5][ ] [58][106][10][96] //There are numbers that can be appended to "25" to make a number less than or equal to "255", so continue
5 [2][5][5] [58][106][10][96][255] //"5" can be appended to "25" and still be less than or equal to "255", so it stores it in tempArray, finds tempArray is full, so it stores tempArray's values in endArray as "255"
2 [2][ ][ ] [58][106][10][96][255]
5 [2][5][ ] [58][106][10][96][255] //There are numbers that can be appended to "25" to make a number less than or equal to "255", so continue
6 [6][ ][ ] [58][106][10][96][255][25] //It sees that adding "6" to "25" would make a number that's larger than 255, so it stores "25" in the endArray and remembers "6" in the tempArray
9 [6][9][ ] [58][106][10][96][255][25][69] //It sees that there is no number X such that "69X" is less than or equal to "255", so it stores "69" in endArray
Does anyone know how I can accomplish this behavior? Please try to keep your answers in pseudocode, so it can be translated to many programming languages.
I would not use the temp array for holding the intermediate numbers - for the CPU numbers are stored in binary format and you are reading decimal numbers.
Something like this could solve your problem:
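array = []
accumulator = 0
count = 0    # number of digits currently held in accumulator
while not EOF:
    n = readDigit()
    if accumulator*10 + n > 255 or count == 3:
        array.push(accumulator)
        accumulator = n
        count = 1
    else:
        accumulator = accumulator*10 + n
        count = count + 1
if count > 0:
    array.push(accumulator)    # store the last number once the input is exhausted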
The results are appended to the array called array.
Edit: Thanks to DeanOC for noticing the missing counter. But DeanOC's solution initializes the counter for the first iteration to 0 instead of 1.
antiguru's response is nearly there.
The main problem is that it doesn't take into consideration that the numbers can only have 3 digits. This modification should work for you.
array = []
accumulator = 0
digitCounter = 0
while not EOF:
    n = readDigit()
    if accumulator*10 + n > 255 or digitCounter == 3:
        array.push(accumulator)
        accumulator = n
        digitCounter = 1
    else:
        accumulator = accumulator*10 + n
        digitCounter = digitCounter + 1
if digitCounter > 0:
    array.push(accumulator)    # store the last number once the input is exhausted
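As a runnable illustration of the accumulator approach, here is a minimal Python sketch. It reads the digits from a string instead of a file stream, and the name split_bytes is made up for this example. It also stores the final number after the input ends, a step the pseudocode needs as well.

```python
def split_bytes(digits):
    """Greedily split a digit string into values from 0 to 255, at most 3 digits each."""
    values = []
    accumulator = 0
    digit_count = 0  # number of digits currently held in accumulator
    for ch in digits:
        n = int(ch)
        if digit_count == 3 or accumulator * 10 + n > 255:
            values.append(accumulator)       # current number is complete
            accumulator, digit_count = n, 1  # start the next number with this digit
        else:
            accumulator = accumulator * 10 + n
            digit_count += 1
    if digit_count > 0:
        values.append(accumulator)           # flush the last number
    return values


print(split_bytes("581060100962552569"))
```

For the digit string from the question this yields 58, 106, 10, 96, 255, 25, 69. A leading zero is absorbed into the number that follows it, so "010" becomes 10, as in the question's walkthrough.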

Scaling a number between two values

If I am given a floating point number but do not know beforehand what range the number will be in, is it possible to scale that number in some meaningful way to be in another range? I am thinking of checking to see if the number is in the range 0<=x<=1 and if not scale it to that range and then scale it to my final range. This previous post provides some good information, but it assumes the range of the original number is known beforehand.
You can't scale a number in a range if you don't know the range.
Maybe what you're looking for is the modulo operator. Modulo is basically the remainder of division; the operator in most languages is %.
0 % 5 == 0
1 % 5 == 1
2 % 5 == 2
3 % 5 == 3
4 % 5 == 4
5 % 5 == 0
6 % 5 == 1
7 % 5 == 2
...
No, that is not possible. You can define a range and ignore all values outside it. Or you can collect statistics to find the range at run time (e.g. via histogram analysis).
Is it really about image processing? There are lots of related problems in image segmentation field.
You want to scale a single random floating point number to be between 0 and 1, but you don't know the range of the number?
What should 99.001 be scaled to? If the range of the random number was [99, 100], then our scaled-number should be pretty close to 0. If the range of the random number was [0, 100], then our scaled-number should be pretty close to 1.
In the real world, you always have some sort of information about the range (either the range itself, or how wide it is). Without further info, the answer is "No, it can't be done."
I think the best you can do is something like this:
double scale(double x) {
    if (x < -1) return -2 - 1 / x;
    if (x > 1) return 2 - 1 / x;
    return x;
}
This function is monotonic, and has a range of -2 to 2, but it's not strictly a scaling.
I am assuming that you have the result of some 2-dimensional measurements and want to display them in color or grayscale. For that, I would first want to find the maximum and minimum and then scale between these two values.
static double[][] scale(double[][] in, double outMin, double outMax) {
    // First pass: find the smallest and largest input value
    double inMin = Double.POSITIVE_INFINITY;
    double inMax = Double.NEGATIVE_INFINITY;
    for (double[] inRow : in) {
        for (double d : inRow) {
            if (d < inMin)
                inMin = d;
            if (d > inMax)
                inMax = d;
        }
    }
    double inRange = inMax - inMin;
    double outRange = outMax - outMin;
    // Second pass: map every value from [inMin, inMax] to [outMin, outMax]
    double[][] out = new double[in.length][];
    for (int i = 0; i < in.length; i++) {
        double[] inRow = in[i];
        double[] outRow = new double[inRow.length];
        for (int j = 0; j < inRow.length; j++) {
            double normalized = (inRow[j] - inMin) / inRange; // 0 .. 1
            outRow[j] = outMin + normalized * outRange;
        }
        out[i] = outRow; // store the scaled row in the result
    }
    return out;
}
This code is untested and just shows the general idea. It further assumes that all your input data is in a "reasonable" range, away from infinity and NaN.
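The same min-max idea for a flat list of values, as a minimal Python sketch. The function name rescale is illustrative, and mapping a constant input to out_min is an assumption made here to avoid dividing by zero, not something the answers above specify.

```python
def rescale(values, out_min, out_max):
    """Linearly map `values` from their own [min, max] onto [out_min, out_max]."""
    in_min, in_max = min(values), max(values)
    in_range = in_max - in_min
    if in_range == 0:
        # Assumption: with no spread in the input, map everything to out_min.
        return [out_min for _ in values]
    out_range = out_max - out_min
    return [out_min + (v - in_min) / in_range * out_range for v in values]


print(rescale([99, 99.5, 100], 0, 1))
print(rescale([0, 50, 100], 0, 255))
```

The input range is measured from the data itself, so this only works once all values are available. A single number with no known range still cannot be scaled this way.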
