About the inference result of Blei's lda-c-dist [closed] - machine-learning

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I have a question about the inference result of the lda-c-dist package. How many words should be displayed when viewing the results of inference? For example, if I set the number of words to a very large number N (assume the total number of terms is N), the output seems to contain groups of words. In each group, the word indices range from 1 to N.
What I got looks like this.
Assume the number of terms is 10, and I set the number of words displayed to 10.
Topic 0xx:
001
008
009
002
003
007
000
004
005
006
It seems that maybe I should set the number of words displayed to 3, not 10.
So, for a single topic, how many words should be specified when viewing topics by calling topics.py?
Besides, I'm going to use this output to calculate the similarity of two topics. So ...

Actually, a topic can list as many items as there are terms in the vocabulary. What is displayed here is just the terms in descending order of probability, truncated to the number of words you specify.
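Since the goal is to compare two topics, it may be easier to work with the full row of each topic than with a truncated word list. Here is a minimal sketch in Python, assuming each topic is available as a row of log word probabilities (as far as I know, that is the layout of the beta file lda-c writes, but check your own output). The numbers are made up for illustration, and `top_words` and `cosine_similarity` are hypothetical helper names, not part of lda-c.

```python
import math

def top_words(log_probs, n):
    """Return the indices of the n most probable terms, most probable first."""
    order = sorted(range(len(log_probs)), key=lambda i: log_probs[i], reverse=True)
    return order[:n]

def cosine_similarity(log_p, log_q):
    """Cosine similarity between two topics given as log-probability rows."""
    p = [math.exp(x) for x in log_p]
    q = [math.exp(x) for x in log_q]
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# Made-up topics over a 5-term vocabulary (assumption: not real lda-c output).
topic_a = [math.log(x) for x in (0.50, 0.30, 0.10, 0.05, 0.05)]
topic_b = [math.log(x) for x in (0.05, 0.05, 0.10, 0.30, 0.50)]

print(top_words(topic_a, 3))                 # the truncated view a word list gives
print(cosine_similarity(topic_a, topic_b))   # uses every term, not only the top few
```

Comparing full rows avoids having to pick a cutoff at all; the number of displayed words then only matters for reading the topics by eye.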

Reverse percentage formula [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
Apparently "reverse percentage calculations" are a confusing topic (I looked through existing questions) - but I'm not sure I understand it myself.
I have a counter in my Google Sheets that is counting down a certain number of cells, and then I have the total range of cells.
So for example: my counter is at 6000 and it's counting down through a range of 7000 cells.
What I'm trying to do is calculate the percentage that it's done counting down.
When I divide the part by the total, I obviously get something like 90% - which is not what I want. It should be like 2% (or whatever) :P
Does anyone know the formula I should apply here?
try:
=1-COUNTA(A:A)/A1
(cell formatted as a percentage)
I think I solved it.
If A is counting down from 100 and 0 = 100%, then the information I was originally missing was the difference between 100 and A.
So in my case:
A3:
=ROWS(A1:A)-A2
(A1:A is 100%)
(A2 is the current countdown)
So for example if ROWS(A1:A) = 10 (total rows in your sheet) & if my current countdown is 3 then:
A3:
=ROWS(A1:A)-A2 = 7
Then we calculate the percentage by:
A4:
=A3/ROWS(A1:A)
That should give you the correct percentage.
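The same arithmetic, outside of Sheets, as a small Python sketch. The function name `percent_done` is made up for illustration; the idea is simply (total - remaining) / total.

```python
def percent_done(remaining, total):
    """Percentage of a countdown already completed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (total - remaining) / total * 100

# Counter at 6000, counting down through 7000 cells (numbers from the question).
print(round(percent_done(6000, 7000), 2))

# Countdown at 3 of 10 rows (numbers from the answer above).
print(round(percent_done(3, 10), 2))
```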

Is there such thing as a Perabyte? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 6 years ago.
I recently saw a marketing video that cited the "Perabyte" as a unit of disk storage measure. I emailed a representative responsible for the video and got a response that "1 Perabyte (PB) = 1024 Terabytes (TB)."
A quick google search seems to indicate that a Petabyte is defined as 1000 TB.
Is a Perabyte a real thing, and is it 1024 (vs 1000) TB?
Edit: This thread has answered my question, despite the votes to close. No one has heard of a "Perabyte" except as a misspelling of "Petabyte." Thanks.
Edit (2): Petabytes have been tagged real by "BRIGHT SIDE"
A perabyte doesn't exist, but a petabyte is a real thing. In the binary convention it is 2^50 bytes, or 1024 terabytes; in the decimal convention it is 10^15 bytes, or a million gigabytes.
It seems likely that your perabyte was the result of a typo, seeing as 'r' and 't' are close on the keyboard.
The 1024 versus 1000 question most likely arises from which base you are using.
1024 is a power of 2 (2^10), 1000 is a power of 10 (10^3).
See https://en.wikipedia.org/wiki/Petabyte
A petabyte is 1000^5 bytes, or 1000 terabytes.
A pebibyte is 2^50 bytes, or 1024 tebibytes.
A perabyte is not a thing.
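To make the difference concrete, here is a quick check in Python. The constant names are my own; the values follow the decimal (SI) and binary (IEC) definitions given above.

```python
PETABYTE = 1000 ** 5   # decimal prefix: 10^15 bytes
PEBIBYTE = 2 ** 50     # binary prefix: 1024^5 bytes
TERABYTE = 1000 ** 4
TEBIBYTE = 2 ** 40

print(PETABYTE // TERABYTE)   # terabytes per petabyte
print(PEBIBYTE // TEBIBYTE)   # tebibytes per pebibyte
print(PEBIBYTE / PETABYTE)    # how much larger a pebibyte is than a petabyte
```

The ratio is roughly 1.126, so the two conventions differ by about 12.6% at this scale.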

'Georgetown' in UTC-3:00 - where is it exactly? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 6 years ago.
I am trying to verify a timezone list from:
https://github.com/tamaspap/timezones
And I have noticed that there is a 'Georgetown' in UTC-3:00. But I am wondering where exactly this 'Georgetown' is located. I searched on the Internet, but I don't see any Georgetown in UTC-3:00.
This case is a perfect example of why one has to be very careful when building custom lists of time zones. The fact is, there are many cities named Georgetown.
In particular, there are two different cities being confused here.
Georgetown, Guyana - the capital city of that country
IANA time zone ID: America/Guyana
Time Zone Offset: UTC-4 (no DST)
Georgetown, Córdoba, Argentina - a suburb of the city of Córdoba
IANA time zone ID: America/Argentina/Cordoba
Time Zone Offset: UTC-3 (no DST)
It appears that the author of that time zone list was looking at various other time zone lists that have an entry for Georgetown. Perhaps it was one of these (or based on them):
The Windows operating system, which has an entry: Georgetown, La Paz, Manaus, San Juan
Rails, whose ActiveSupport::TimeZone list contains an entry for Georgetown
However, in both cases, the Georgetown being referred to is the one in Guyana, and the author of the project you mentioned incorrectly mapped it to Argentina, as seen in their source code:
'(UTC-03:00) Georgetown' => 'America/Argentina/Buenos_Aires',
They also chose the wrong sub-region within Argentina, but both are on UTC-3.
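If you want to check the offsets yourself rather than trust a hand-built list, here is a small sketch using Python's standard `zoneinfo` module. It assumes Python 3.9+ and that the IANA time zone database is available on the machine; the helper name `utc_offset_hours` and the sample date are my own choices.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def utc_offset_hours(zone_id, when):
    """UTC offset, in hours, of an IANA zone at a given naive local datetime."""
    aware = when.replace(tzinfo=ZoneInfo(zone_id))
    return aware.utcoffset().total_seconds() / 3600

when = datetime(2020, 1, 15, 12, 0)  # arbitrary sample date
for zone_id in (
    "America/Guyana",
    "America/Argentina/Cordoba",
    "America/Argentina/Buenos_Aires",
):
    print(zone_id, utc_offset_hours(zone_id, when))
```

Looking up offsets by IANA ID sidesteps the ambiguity of city names like Georgetown.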

Convert from inches,lbs to centimeter,kg in iOS [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I have a problem:
My app allows the user to choose between 2 options: Metric or Imperial.
If the user selects "Metric": height (cm) and weight (kg).
If the user selects "Imperial": height (inch) and weight (lbs).
I need to convert the height from inches to centimeters and the weight from lbs to kg.
Example:
user input: 70 (inch) and 120 (lbs) ==> convert to 177.8 (cm) and about 54.4 (kg)
Thanks in advance.
The best example of unit conversion is at the link below.
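The conversion itself is just two multiplications. Here is a minimal sketch in Python (the arithmetic carries over unchanged to Objective-C or Swift). The function and constant names are made up for illustration; the factors are the exact definitions of the inch and the avoirdupois pound.

```python
CM_PER_INCH = 2.54          # exact by definition
KG_PER_POUND = 0.45359237   # exact by definition

def inches_to_cm(inches):
    """Convert a length in inches to centimeters."""
    return inches * CM_PER_INCH

def pounds_to_kg(pounds):
    """Convert a weight in pounds to kilograms."""
    return pounds * KG_PER_POUND

# Values from the question: 70 inches and 120 lbs.
print(round(inches_to_cm(70), 1))
print(round(pounds_to_kg(120), 1))
```

Store the values internally in one unit system (for example metric) and convert only for display, so that switching the setting back and forth does not accumulate rounding errors.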

how wireshark marks some packets as "tcp segment of a reassembled pdu" [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I opened a pcap in Wireshark and it displays a lot of packets as "tcp segment of a reassembled pdu". How is Wireshark able to determine which TCP packets are segments of a reassembled PDU? I am not able to find any header field or anything else by which Wireshark could determine this.
Any help will be greatly appreciated. Thanks!
The sequence number is the field that helps with reassembly. Say you have data bytes 1-300 to send.
Suppose they are divided into 3 segments of 100 bytes each: the first (bytes 1-100), the second (101-200) and the third (201-300). Even if they are received out of order, their sequence numbers don't change. So when reassembling the data, you know the original order of the segments, and hence Wireshark can display the reassembled data.
If the SYN flag is clear (0), then this is the accumulated sequence number of the first data byte of this packet for the current session.
TCP
Remember, this is different from IP fragmentation and reassembly. The IP header has fields that indicate whether a packet is a fragment and, if so, the offset of that fragment within the original datagram.
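The 300-byte example above can be sketched in Python as follows. This is only an illustration of ordering by sequence number, not Wireshark's actual implementation; `reassemble` is a hypothetical helper, and it assumes there are no gaps and that any retransmission is an exact duplicate of an earlier segment.

```python
def reassemble(segments):
    """Rebuild a byte stream from (sequence_number, payload) pairs.

    Segments may arrive in any order. Duplicates of data already placed
    are skipped; a gap in the sequence space raises an error.
    """
    stream = bytearray()
    next_seq = None
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if next_seq is None:
            next_seq = seq              # first byte of the stream
        if seq < next_seq:
            continue                    # retransmission, already have it
        if seq > next_seq:
            raise ValueError(f"gap: missing bytes {next_seq}..{seq - 1}")
        stream += payload
        next_seq = seq + len(payload)   # sequence number expected next
    return bytes(stream)

# Three 100-byte segments received out of order.
segments = [
    (201, b"C" * 100),
    (1, b"A" * 100),
    (101, b"B" * 100),
]
data = reassemble(segments)
print(len(data), data[:1], data[100:101], data[200:201])
```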
