Why does the first snippet generate a list of random numbers with repetition, while the second generates numbers without repetition? The only difference is that the collection is declared with <int>[] in (1) and <int>{} in (2).
1.)
final generatedRandoms = <int>[];
final rng = Random();
while (generatedRandoms.length < 100) {
final gr = rng.nextInt(100) + 1;
generatedRandoms.add(gr);
}
2.)
final generatedRandoms = <int>{};
final rng = Random();
while (generatedRandoms.length < 100) {
final gr = rng.nextInt(100) + 1;
generatedRandoms.add(gr);
}
Because of this declaration:
final generatedRandoms = <int>[];
versus
final generatedRandoms = <int>{};
The first declares a list, the second declares a set.
Lists can hold multiple values, even duplicates, while sets cannot contain duplicates. Since both loops run until the container holds 100 items, the first example may (and almost certainly will) contain duplicates, and the second cannot.
If you time them, you will notice the second one takes longer on average, because it has to generate far more than 100 random numbers to reach 100 unique ones: adding a value that is already in the set leaves its length unchanged, so the loop keeps going.
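To make the difference visible, here is a minimal JavaScript sketch of the same two loops that also counts the draws. The function name drawsNeeded and the returned fields are made up for this illustration; an array stands in for the Dart list and a Set for the Dart set.

```javascript
// Fill a container with random integers in 1..100 until it holds 100 items,
// and report how many draws that took and how many values are distinct.
function drawsNeeded(useSet) {
  const target = 100;
  const items = useSet ? new Set() : [];
  const count = () => (useSet ? items.size : items.length);
  let draws = 0;
  while (count() < target) {
    const n = Math.floor(Math.random() * 100) + 1; // 1..100
    if (useSet) {
      items.add(n); // adding an existing value does not grow the set
    } else {
      items.push(n); // the array grows on every draw
    }
    draws++;
  }
  return { draws, unique: new Set(items).size };
}

const asList = drawsNeeded(false);
const asSet = drawsNeeded(true);
console.log(asList); // draws is always 100; unique is usually well below 100
console.log(asSet); // unique is always 100; draws is usually several hundred
```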
Maybe I'm missing a keyword in my searches for a solution, but I didn't find what I'm looking for.
In Google Sheets I want to take a set of numbers and reorder it randomly. For example, start with the set [1,2,3,4] and get back [4,2,1,3].
Any ideas which function or a combination of functions may achieve this goal?
The entire process that I want to achieve is something like this:
I have a set of 4 fields. Their sum is fixed. I want to assign them randomized values.
So, I was thinking to iterate through this process:
Create a random integer between 0 and the max possible value (in the first iteration it's the fixed sum)
The new max value is the last max value minus the new random number.
Check if the new max is zero.
If not:
Return to the 1st step and repeat - This goes on until there are four values
If needed the 4th value shall be increased so the total will match the fixed sum.
Else, continue.
Randomize the order of the 4 values.
Assign the values to the 4 fields.
try:
=INDEX(SORT({{1; 2; 3; 4}, RANDARRAY(4, 1)}, 2, ),, 1)
or:
=INDEX(SORT({ROW(1:4), RANDARRAY(4, 1)}, 2, ),, 1)
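Both formulas pair each number with a random key, sort the rows by that key, and keep only the first column. The same idea, sketched in JavaScript (shuffleByRandomKey is a name chosen for this illustration, not a built-in):

```javascript
// Shuffle by attaching a random sort key to every value, sorting on the key,
// and then dropping the key again.
function shuffleByRandomKey(values) {
  return values
    .map((value) => ({ value, key: Math.random() })) // value column + random column
    .sort((a, b) => a.key - b.key) // order the rows by the random column
    .map((pair) => pair.value); // keep only the value column
}

const shuffled = shuffleByRandomKey([1, 2, 3, 4]);
console.log(shuffled); // some permutation of 1, 2, 3, 4
```

The sort direction does not matter here, since the keys are random either way.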
Here are a couple of Apps Script examples as well:
function DiceRolls(nNumRolls) {
  var anRolls = [];
  nNumRolls = nNumRolls || 1000; // default to 1000 rolls when no count is given
  for (var i = 1; i <= nNumRolls; i++) {
    anRolls.push(getRndInteger(1, 6)); // die face, 1 to 6
  }
  return anRolls;
}
function CoinFlips(nNumFlips) {
  var anFlips = [];
  nNumFlips = nNumFlips || 1000; // default to 1000 flips when no count is given
  for (var i = 1; i <= nNumFlips; i++) {
    anFlips.push(getRndInteger(1, 2)); // 1 or 2
  }
  return anFlips;
}
// Returns a random integer between min and max, both inclusive.
function getRndInteger(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}
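The full process described in the question (split a fixed sum into four random values, then randomize their order) could be sketched roughly as follows. This is only one reading of those steps: randomParts and randomInt are hypothetical names, and the last value simply absorbs whatever is left so that the total always matches.

```javascript
// Random integer between min and max, both inclusive.
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Split `total` into `count` non-negative integers that add up to `total`.
function randomParts(total, count) {
  const parts = [];
  let remaining = total;
  for (let i = 0; i < count - 1; i++) {
    const part = randomInt(0, remaining); // new value, at most what is left
    parts.push(part);
    remaining -= part; // the new maximum for the next round
  }
  parts.push(remaining); // last value makes the sum match exactly

  // Fisher-Yates shuffle, so large values are not always in the first fields.
  for (let i = parts.length - 1; i > 0; i--) {
    const j = randomInt(0, i);
    [parts[i], parts[j]] = [parts[j], parts[i]];
  }
  return parts;
}

const parts = randomParts(20, 4);
console.log(parts); // four non-negative integers that sum to 20
```

Note that this does not make every possible split equally likely: the first value drawn tends to be the largest. The shuffle only spreads that bias evenly over the four fields.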
How can I iterate over the digits of an integer? For example, the sum-of-digits function below works, but is there a better way to do it?
int sumOfDigits(int num) {
int sum = 0;
String numtostr = num.toString();
for (var i = 0; i < numtostr.length; i++) {
sum = sum + int.parse(numtostr[i]);
}
return sum;
}
If you're looking for a shorter way to do this, you can convert the number to a string and combine split, map and reduce:
int sum = num.toString().split('').map((e) => int.parse(e)).reduce((t, e) => t + e);
You can even pass int.parse directly:
int sum = num.toString().split('').map(int.parse).reduce((t, e) => t + e);
Thank you #julemand101
It's fairly inefficient to create a string, then split the string, and parse the individual digits back to integers.
How about something like:
Iterable<int> digitsOf(int number) sync* {
  do {
    yield number.remainder(10); // least significant digit
    number ~/= 10; // drop that digit
  } while (number != 0);
}
This iterates the digits of the (non-negative) number in base 10, from least significant to most significant, without allocating any strings along the way.
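For comparison, the same arithmetic approach written as a JavaScript generator. This is a sketch that assumes a non-negative safe integer; digitsLowToHigh and sumOfDigits are names chosen for the example.

```javascript
// Yield the base-10 digits of a non-negative integer, least significant first.
function* digitsLowToHigh(number) {
  do {
    yield number % 10; // current last digit
    number = Math.floor(number / 10); // drop it
  } while (number !== 0);
}

function sumOfDigits(number) {
  let sum = 0;
  for (const digit of digitsLowToHigh(number)) {
    sum += digit;
  }
  return sum;
}

console.log([...digitsLowToHigh(1234)]); // [4, 3, 2, 1]
console.log(sumOfDigits(1234)); // 10
```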
If you want the digits in the reverse order, you can either create a list from the iterable above and reverse it, or use a different approach:
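Iterable<int> digitsHighToLow(int number) sync* {
  var base = 1;
  // Find the largest power of 10 that does not exceed the number.
  while (base * 10 <= number) {
    base = base * 10;
  }
  do {
    var digit = number ~/ base;
    yield digit;
    number = number - digit * base;
    base = base ~/ 10; // move to the next lower digit, so zeros are emitted too
  } while (base != 0);
}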
(Again, this only works on non-negative numbers. You'll have to decide what you want for negative numbers: throw, negate the number first since it has the same digits after all, or something else.)
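One of the options mentioned for negative numbers, negating first, might look like this in JavaScript. It is a sketch that assumes safe integers; the function name mirrors the Dart one but the code is not a port of it.

```javascript
// Yield the base-10 digits of an integer, most significant first.
// A negative input is negated first, so -507 yields the same digits as 507.
function* digitsHighToLow(number) {
  let rest = Math.abs(number);
  let base = 1;
  while (base * 10 <= rest) {
    base *= 10; // largest power of 10 not exceeding the number
  }
  while (base >= 1) {
    const digit = Math.floor(rest / base);
    yield digit;
    rest -= digit * base;
    base = Math.floor(base / 10); // step down one decimal place
  }
}

console.log([...digitsHighToLow(1200)]); // [1, 2, 0, 0]
console.log([...digitsHighToLow(-507)]); // [5, 0, 7]
```

Stepping the base down, rather than stopping when the remainder reaches zero, is what keeps trailing zeros such as those in 1200.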
I am trying to query two long columns by agent name. The problem is that the names are repeated across two tables: one holds the total productivity and the other the total utilization.
When I query the columns, the result returns the Productivity and Utilization numbers combined.
How can I make the query sum Productivity and Utilization separately?
Link is here: https://docs.google.com/spreadsheets/d/12Sydw6ejFobySHUj5JoYkAPbhr0mKoInCWxtHY1W4lk/edit#gid=0
Apps Script would be a better solution in this case. The code below works as follows:
Gets the names from Column D and Column A.
For each name of Column D, it will compare it with each name of Column A (that's the 2 for loops)
If the names coincide (first if), it will check the background color (second if) of the Column A name to accumulate Total Prod and Total Util.
Once it reaches the end of the Column A, writes the values in Total Prod and Total Util (Columns E and F) for each name in D.
function onOpen() { //Will run every time you open the sheet
//Gets the active Spreadsheet and sheet
let sprsheet = SpreadsheetApp.getActiveSpreadsheet();
let sheet = sprsheet.getActiveSheet();
var lastRow = sheet.getLastRow();
var getNames = sheet.getRange(3, 1, lastRow - 2).getValues(); //Names from row 3, col 1, until the last row
var totalNames = sheet.getRange("D4:D5").getValues(); //Change the range for more names
let prodColor = '#f2f4f7'; //hexadecimal codes of the background colors of names in A
let utilColor = '#cfe2f3'; //
for (var i = 0; i < totalNames.length; i++) {
var totalProd = 0, totalUtil = 0; //Starts at 0 for each name in D
for (var j = 0; j < getNames.length; j++) {
if (totalNames[i][0] == getNames[j][0]) {
if (sheet.getRange(j + 3, 1).getBackgroundObject().asRgbColor().asHexString() == prodColor) { //if colors coincide
totalProd += sheet.getRange(j + 3, 2).getValue();
} else if (sheet.getRange(j + 3, 1).getBackgroundObject().asRgbColor().asHexString() == utilColor) {
totalUtil += sheet.getRange(j + 3, 2).getValue();
}
}
}
sheet.getRange(i+4, 5, 1 ,2).setValues([[totalProd, totalUtil]]);
}
}
Note: You will have to run the code manually and accept permissions the first time you run it. After that it will run automatically each time you open the Sheet. It might take a few seconds for the code to run and to reflect changes on the Sheet.
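The accumulation logic itself can be sketched in plain JavaScript, independent of the sheet. The sample data and the function name totalsByName are hypothetical. In Apps Script the three arrays could presumably be filled with one getValues() call and one getBackgrounds() call on the ranges, instead of reading cell by cell inside the loops, but that wiring is an assumption and is not shown here.

```javascript
// Sum values per name, split by the background color of the name cell.
function totalsByName(names, values, backgrounds, prodColor, utilColor) {
  const totals = new Map();
  for (let i = 0; i < names.length; i++) {
    const entry = totals.get(names[i]) || { prod: 0, util: 0 };
    if (backgrounds[i] === prodColor) {
      entry.prod += values[i]; // row belongs to the productivity table
    } else if (backgrounds[i] === utilColor) {
      entry.util += values[i]; // row belongs to the utilization table
    }
    totals.set(names[i], entry);
  }
  return totals;
}

// Hypothetical sample rows; the colors are the two hex codes used above.
const PROD = "#f2f4f7";
const UTIL = "#cfe2f3";
const names = ["Ann", "Bob", "Ann", "Bob", "Ann"];
const values = [5, 3, 7, 2, 1];
const backgrounds = [PROD, PROD, UTIL, UTIL, PROD];

const totals = totalsByName(names, values, backgrounds, PROD, UTIL);
console.log(totals.get("Ann")); // { prod: 6, util: 7 }
console.log(totals.get("Bob")); // { prod: 3, util: 2 }
```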
To better understand loops and 2D arrays, I recommend you to take a look at this.
References:
Range Class
Get Values
Get BackgroundObject
Set Values
You can learn more about Apps Script and Sheets by following the Quickstart.
Is there any way to generate random numbers without duplication?
For instance, I want to generate 50 random numbers from 1 to 100 with no duplicates. Is there a way to do this, or do I have to check each new number against the ones already generated?
You can use shuffle, as in the following code:
import 'dart:math';
var list = List<int>.generate(10, (int index) => index); // [0, 1, 2, ..., 9]
list.shuffle();
print(list);
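Applied to the numbers in the question (50 values from 1 to 100), the shuffle idea can stop early: only the first 50 positions need to be shuffled. A JavaScript sketch of that partial Fisher-Yates shuffle follows; sampleWithoutReplacement is a name made up for the example, and it assumes count is not larger than the number of values in the range.

```javascript
// Pick `count` distinct integers from min..max (inclusive) by shuffling
// only the first `count` positions of the pool.
function sampleWithoutReplacement(min, max, count) {
  const pool = [];
  for (let v = min; v <= max; v++) {
    pool.push(v);
  }
  for (let i = 0; i < count; i++) {
    // Swap position i with a random position from i to the end.
    const j = i + Math.floor(Math.random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

const picks = sampleWithoutReplacement(1, 100, 50);
console.log(picks.length); // 50
console.log(new Set(picks).size); // 50, so there are no duplicates
```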
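You can use a Set. Each value can occur in it only once. Just try this:
final rng = Random(); // create the generator once, not on every iteration
const range = 100; // values are drawn from 1..range
Set<int> setOfInts = {};
while (setOfInts.length < 50) {
  setOfInts.add(rng.nextInt(range) + 1);
}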
You can read the documentation here: Set Doc
Here is an alternative that avoids creating an array of all the possible values, and avoids repeatedly looping until no collision occurs. It may be useful when there is a large range to select from.
import 'dart:math';
class RandomList {
static final _random = Random();
// Returns n distinct integers in the range 0..limit-1 (requires n <= limit).
static List<int> uniqueSample({required int limit, required int n}) {
final List<int> sortedResult = [];
final List<int> result = [];
for (int i = 0; i < n; i++) {
int rn = _random.nextInt(limit - i); // We select from a smaller list of available numbers each time
// Increment the number so that it picks from the remaining list of available numbers
int j = 0;
for (; j < sortedResult.length && sortedResult[j] <= rn; j++) rn++;
sortedResult.insert(j, rn);
result.add(rn);
}
return result;
}
}
I haven't tested it exhaustively but it seems to work.
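As a rough check of that approach, here is a JavaScript port of the same steps with a small randomized test around it. The port and the looksValid helper are written for this illustration; passing such a test is evidence, not a proof.

```javascript
// Port of the approach above: draw from a shrinking range, then shift the
// draw past every already-taken value that is less than or equal to it.
function uniqueSample(limit, n) {
  const sorted = []; // taken values, kept in ascending order
  const result = []; // taken values, in the order they were drawn
  for (let i = 0; i < n; i++) {
    let rn = Math.floor(Math.random() * (limit - i));
    let j = 0;
    for (; j < sorted.length && sorted[j] <= rn; j++) {
      rn++;
    }
    sorted.splice(j, 0, rn); // insert at its sorted position
    result.push(rn);
  }
  return result;
}

// True when all values are distinct integers in 0..limit-1.
function looksValid(sample, limit) {
  return (
    new Set(sample).size === sample.length &&
    sample.every((v) => Number.isInteger(v) && v >= 0 && v < limit)
  );
}

let allValid = true;
for (let trial = 0; trial < 1000; trial++) {
  // Asking for all 10 of 10 values forces every collision case to occur.
  if (!looksValid(uniqueSample(10, 10), 10)) {
    allValid = false;
  }
}
console.log(allValid); // true
```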
I am trying to extract topics from 7 million tweets. I treat each tweet as a document, so I stored all tweets in a file where each line (one tweet) is a document. I used this file as the input file for the Mallet API.
public static void LDAModel(int numofK,int numbofIteration,int numberofThread,String outputDir,InstanceList instances) throws Exception
{
// Create a model with numofK topics, alpha sum = 1.0, beta_w = 0.01
// Note that the first parameter is passed as the sum over topics, while
// the second is the parameter for a single dimension of the Dirichlet prior.
int numTopics = numofK;
ParallelTopicModel model = new ParallelTopicModel(numTopics, 1.0, 0.01);
model.addInstances(instances);
// Use two parallel samplers, which each look at one half the corpus and combine
// statistics after every iteration.
model.setNumThreads(numberofThread);
// Run the model for 50 iterations and stop (this is for testing only,
// for real applications, use 1000 to 2000 iterations)
model.setNumIterations(numbofIteration);
model.estimate();
// Show the words and topics in the first instance
// The data alphabet maps word IDs to strings
Alphabet dataAlphabet = instances.getDataAlphabet();
FeatureSequence tokens = (FeatureSequence) model.getData().get(0).instance.getData();
LabelSequence topics = model.getData().get(0).topicSequence;
Formatter out = new Formatter(new StringBuilder(), Locale.US);
for (int position = 0; position < tokens.getLength(); position++) {
// out.format("%s-%d ", dataAlphabet.lookupObject(tokens.getIndexAtPosition(position)), topics.getIndexAtPosition(position));
out.format("%s-%d ", dataAlphabet.lookupObject(tokens.getIndexAtPosition(position)), topics.getIndexAtPosition(position));
}
System.out.println(out);
// Estimate the topic distribution of the first instance,
// given the current Gibbs state.
double[] topicDistribution = model.getTopicProbabilities(0);
// Get an array of sorted sets of word ID/count pairs
ArrayList<TreeSet<IDSorter>> topicSortedWords = model.getSortedWords();
// Show top 10 words in topics with proportions for the first document
String topicsoutput="";
for (int topic = 0; topic < numTopics; topic++) {
Iterator<IDSorter> iterator = topicSortedWords.get(topic).iterator();
out = new Formatter(new StringBuilder(), Locale.US);
out.format("%d\t%.3f\t", topic, topicDistribution[topic]);
int rank = 0;
while (iterator.hasNext() && rank < 10) {
IDSorter idCountPair = iterator.next();
out.format("%s (%.0f) ", dataAlphabet.lookupObject(idCountPair.getID()), idCountPair.getWeight());
//out.format("%s ", dataAlphabet.lookupObject(idCountPair.getID()));
rank++;
}
System.out.println(out);
topicsoutput += out + "\n"; // collect the line; otherwise the file written below stays empty
}
// Create a new instance with high probability of topic 0
StringBuilder topicZeroText = new StringBuilder();
Iterator<IDSorter> iterator = topicSortedWords.get(0).iterator();
int rank = 0;
while (iterator.hasNext() && rank < 10) {
IDSorter idCountPair = iterator.next();
topicZeroText.append(dataAlphabet.lookupObject(idCountPair.getID()) + " ");
rank++;
}
// Create a new instance named "test instance" with empty target and source fields.
InstanceList testing = new InstanceList(instances.getPipe());
testing.addThruPipe(new Instance(topicZeroText.toString(), null, "test instance", null));
TopicInferencer inferencer = model.getInferencer();
double[] testProbabilities = inferencer.getSampledDistribution(testing.get(0), 10, 1, 5);
System.out.println("0\t" + testProbabilities[0]);
File pathDir = new File(outputDir + File.separator+ "NumofTopics"+numTopics); //FIXME replace all strings with constants
pathDir.mkdir();
String DirPath = pathDir.getPath();
String stateFile = DirPath+File.separator+"output_state.gz";
String outputDocTopicsFile = DirPath+File.separator+"output_doc_topics.txt";
String topicKeysFile = DirPath+File.separator+"output_topic_keys";
PrintWriter writer=null;
String topicKeysFile_fromProgram = DirPath+File.separator+"output_topic";
try {
writer = new PrintWriter(topicKeysFile_fromProgram, "UTF-8");
writer.print(topicsoutput);
writer.close();
} catch (Exception e) {
e.printStackTrace();
}
model.printTopWords(new File(topicKeysFile), 11, false);
model.printDocumentTopics(new File (outputDocTopicsFile));
model.printState(new File (stateFile));
}
public static void main(String[] args) throws Exception{
// Begin by importing documents from text to feature sequences
ArrayList<Pipe> pipeList = new ArrayList<Pipe>();
// Pipes: lowercase, tokenize, remove stopwords, map to features
pipeList.add( new CharSequenceLowercase() );
pipeList.add( new CharSequence2TokenSequence(Pattern.compile("\\p{L}[\\p{L}\\p{P}]+\\p{L}")) );
pipeList.add( new TokenSequenceRemoveStopwords(new File("H:\\Data\\stoplists\\en.txt"), "UTF-8", false, false, false) );
pipeList.add( new TokenSequence2FeatureSequence() );
InstanceList instances = new InstanceList (new SerialPipes(pipeList));
Reader fileReader = new InputStreamReader(new FileInputStream(new File("E:\\Thesis Data\\DataForLDA\\freshnewData\\cleanTweets.txt")), "UTF-8");
instances.addThruPipe(new CsvIterator (fileReader, Pattern.compile("^(\\S*)[\\s,]*(\\S*)[\\s,]*(.*)$"),
3, 2, 1)); // data, label, name fields
int numberofTopic=5;
int numberofIteration=50;
int numberofThread=6;
String outputDir="J:\\Topics\\";
//int numberofTopic=5;
LDAModel(numberofTopic,numberofIteration,numberofThread,outputDir,instances);
TimeUnit.SECONDS.sleep(30);
numberofTopic=10; }
I have got three files from the above program.
1. state file
2. topic proportion file
3. key topic list
I would like to find out the number of documents allocated per topic.
For example I got the following output from key topic list file
0.004 obama (5471) canada (5283) woman (5152) vote (4879) police(3965)
where the first column is the topic serial number, the second column is the topic weight, and the third column lists the words under this topic (with the word counts in parentheses).
Here I get the number of words under this topic, but I would also like to show the documents in which this topic occurs. It would be helpful to write this output to a separate file like this. For example,
Topic 1: doc1(80%) doc2(70%) .......
Could anyone please give some idea or any source code for this?
Thanks.
The information you are looking for is contained in the topic proportion file (file 2 in your list). Note that every document contains every topic to some degree, although the proportion may be large for one topic and extremely small for the others. You will have to decide what you want to extract from the file, for example: the dominant topic (it is in column 3); or the dominant topic, but only when its proportion is at least 50% (sometimes two topics have almost the same proportion) ...
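A minimal JavaScript sketch of the second option, grouping documents by their dominant topic when its proportion reaches a threshold. It assumes each data line has the layout "docIndex docName topic proportion topic proportion ..." with the pairs sorted by descending proportion, which is what the column 3 remark implies. The layout of your own file may differ, and the sample text, docsPerTopic and minShare are all hypothetical.

```javascript
// Group document names by dominant topic, keeping only documents whose
// dominant topic has a proportion of at least `minShare`.
function docsPerTopic(docTopicsText, minShare) {
  const byTopic = new Map();
  for (const line of docTopicsText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue; // skip header and blanks
    const fields = trimmed.split(/\s+/);
    const docName = fields[1];
    const topic = Number(fields[2]); // column 3: dominant topic (assumed layout)
    const share = Number(fields[3]); // column 4: its proportion
    if (share < minShare) continue;
    if (!byTopic.has(topic)) byTopic.set(topic, []);
    byTopic.get(topic).push({ docName, share });
  }
  return byTopic;
}

// Hypothetical sample in the assumed layout.
const sample = [
  "#doc name topic proportion ...",
  "0 doc1 1 0.80 0 0.15 2 0.05",
  "1 doc2 1 0.70 2 0.20 0 0.10",
  "2 doc3 0 0.45 1 0.40 2 0.15",
].join("\n");

const grouped = docsPerTopic(sample, 0.5);
for (const [topic, docs] of grouped) {
  const list = docs.map((d) => `${d.docName}(${Math.round(d.share * 100)}%)`).join(" ");
  console.log(`Topic ${topic}: ${list}`); // Topic 1: doc1(80%) doc2(70%)
}
```

The number of documents per topic is then the length of each list. In the sample, doc3 is left out because its dominant topic stays below the 50% threshold.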