I read that the coherence measure can help estimate the optimal number of topics (K) for an LDA model. I wrote the code below to train multiple LDA models with different numbers of topics and compute the coherence measure for each.
import gensim
from gensim.models import CoherenceModel

def compute_coherence_values(dictionary, corpus, texts, limit, start=2, step=1):
    """Train one LDA model per topic count and score each with c_v coherence."""
    coherence_values = []
    model_list = []
    for num_topics in range(start, limit, step):
        model = gensim.models.ldamodel.LdaModel(corpus=corpus, num_topics=num_topics, random_state=100,
                                                chunksize=200, passes=10, per_word_topics=True,
                                                id2word=dictionary)  # use the argument, not the global id2word
        model_list.append(model)
        coherencemodel = CoherenceModel(model=model, texts=texts, dictionary=dictionary, coherence='c_v')
        coherence_values.append(coherencemodel.get_coherence())
    return model_list, coherence_values

model_list, coherence_values = compute_coherence_values(dictionary=id2word, corpus=corpus,
                                                        texts=data_lemmatized, start=2, limit=500, step=5)
# Show the coherence graph
import matplotlib.pyplot as plt

limit = 500; start = 2; step = 5
x = range(start, limit, step)
plt.plot(x, coherence_values)
plt.xlabel("Num Topics")
plt.ylabel("Coherence score")
plt.legend(["coherence_values"], loc='best')  # a list, so the label is not split into characters
plt.show()
Now, according to this graph, the coherence measure peaks at, for example, 150, 187, 200, and above!
I do not quite understand why the coherence doesn't go down when
using too many topics. Why does it plateau?
When I run my LDA with any of the above-mentioned numbers of topics, I get the same topic words applied to the extracted topics, with 0% relevance for each word!
Am I doing something wrong here?
Sample of output:
[(149, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (103, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (49, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (40, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (56, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (105, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (35, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (63, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (146, '0.288*"role" + 0.135*"machine" + 0.119*"age"'), (120, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (18, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (92, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (157, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (143, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (39, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (141, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (78, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (151, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (90, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"'), (101, '0.000*"klebsiella" + 0.000*"cyclade" + 0.000*"slope"')]
If I choose fewer than 100 topics, I get OK results:
[(132, '0.263*"subject" + 0.155*"deputy" + 0.091*"plastic"'), (110, '0.208*"club" + 0.164*"fan" + 0.096*"ground"'), (200, '0.225*"book" + 0.080*"writer" + 0.078*"winner"'), (16, '0.000*"cellnet" + 0.000*"katherine" + 0.000*"accommodation"'), (71, '0.000*"cellnet" + 0.000*"katherine" + 0.000*"accommodation"'), (29, '0.543*"language" + 0.095*"rush" + 0.074*"hip"'), (66, '0.312*"case" + 0.195*"court" + 0.129*"charge"'), (34, '0.000*"cellnet" + 0.000*"katherine" + 0.000*"accommodation"'), (191, '0.492*"film" + 0.146*"cinema" + 0.094*"director"'), (28, '0.295*"number" + 0.132*"chart" + 0.107*"week"'), (116, '0.130*"email" + 0.109*"union" + 0.086*"difference"'), (144, '0.283*"rate" + 0.253*"interest" + 0.036*"month"'), (207, '0.202*"camera" + 0.107*"message" + 0.058*"text"'), (174, '0.260*"revenue" + 0.188*"earning" + 0.104*"world"'), (121, '0.631*"distribution" + 0.000*"accommodation" + 0.000*"cambridgeshire"'), (68, '0.382*"price" + 0.179*"oil" + 0.095*"demand"'), (163, '0.417*"action" + 0.258*"official" + 0.095*"lawsuit"'), (206, '0.135*"race" + 0.111*"world" + 0.066*"year"'), (215, '0.305*"technology" + 0.194*"device" + 0.085*"generation"'), (125, '0.312*"man" + 0.072*"hunt" + 0.069*"ban"')]
N.B: The corpus is made up of 10k unique documents
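One way to read the peak off the sweep above, rather than from the plot, is to pair each coherence score with the topic count that produced it. The sketch below is a minimal, self-contained illustration: `best_num_topics` and its `tolerance` parameter are hypothetical names that are not part of the original code, and the scores in the usage example are made up. It returns the smallest K whose score is within `tolerance` of the maximum, which is just one possible heuristic for handling a plateau, not an established rule.

```python
def best_num_topics(coherence_values, start, limit, step, tolerance=0.0):
    """Return (k, score) for the smallest k whose coherence is within
    `tolerance` of the best score. Hypothetical helper for illustration."""
    ks = list(range(start, limit, step))
    if len(ks) != len(coherence_values):
        raise ValueError("expected one coherence value per topic count")
    best = max(coherence_values)
    for k, score in zip(ks, coherence_values):
        if score >= best - tolerance:
            return k, score


# Made-up scores for k = 2, 7, 12, 17, 22 (not real results).
example_scores = [0.31, 0.42, 0.47, 0.48, 0.48]
print(best_num_topics(example_scores, start=2, limit=27, step=5))
print(best_num_topics(example_scores, start=2, limit=27, step=5, tolerance=0.02))
```

With a non-zero tolerance the helper prefers a smaller model when the curve is nearly flat, which may be closer to what is wanted when the scores plateau.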
I need help with the analysis of the data for my thesis. I have imported all the data from an Excel file, but the data were classified as string variables. To run the syntax compute variable, I had to change the variables to numeric variables. So I did that with Automatic Recode and it worked, but when I run the syntax now, it calculates all the labels instead of the actual values.
This is the formula
COMPUTE trajlength = ABS(X_2 - X_1) + ABS(X_3 - X_2) + ABS(X_4 - X_3) + ABS(X_5 - X_4) + ABS(X_6 - X_5) +
ABS(X_7 - X_6) + ABS(X_8 - X_7) + ABS(X_9 - X_8) + ABS(X_10 - X_9) + ABS(X_11 - X_10) +
ABS(X_12 - X_11) + ABS(X_13 - X_12) + ABS(X_14 - X_13) + ABS(X_15 - X_14) + ABS(X_16 - X_15) +
ABS(X_17 - X_16) + ABS(X_18 - X_17) + ABS(X_19 - X_18) + ABS(X_20 - X_19) + ABS(X_21 - X_20) +
ABS(X_22 - X_21) + ABS(X_23 - X_22) + ABS(X_24 - X_23) + ABS(X_25 - X_24) + ABS(X_26 - X_25) +
ABS(X_27 - X_26) + ABS(X_28 - X_27) + ABS(X_29 - X_28) + ABS(X_30 - X_29) + ABS(X_31 - X_30) +
ABS(X_32 - X_31) + ABS(X_33 - X_32) + ABS(X_34 - X_33) + ABS(X_35 - X_34) + ABS(X_36 - X_35) +
ABS(X_37 - X_36) + ABS(X_38 - X_37) + ABS(X_39 - X_38) + ABS(X_40 - X_39) + ABS(X_41 - X_40) +
ABS(X_42 - X_41) + ABS(X_43 - X_42) + ABS(X_44 - X_43) + ABS(X_45 - X_44) + ABS(X_46 - X_45) +
ABS(X_47 - X_46) + ABS(X_48 - X_47) + ABS(X_49 - X_48) + ABS(X_50 - X_49) + ABS(X_51 - X_50) +
ABS(X_52 - X_51) + ABS(X_53 - X_52) + ABS(X_54 - X_53) + ABS(X_55 - X_54) + ABS(X_56 - X_55) +
ABS(X_57 - X_56) + ABS(X_58 - X_57) + ABS(X_59 - X_58) + ABS(X_60 - X_59) + ABS(X_61 - X_60) +
ABS(X_62 - X_61) + ABS(X_63 - X_62) + ABS(X_64 - X_63) + ABS(X_65 - X_64) + ABS(X_66 - X_65) +
ABS(X_67 - X_66) + ABS(X_68 - X_67) + ABS(X_69 - X_68) + ABS(X_70 - X_69) + ABS(X_71 - X_70) +
ABS(X_72 - X_71) + ABS(X_73 - X_72) + ABS(X_74 - X_73) + ABS(X_75 - X_74) + ABS(X_76 - X_75) +
ABS(X_77 - X_76) + ABS(X_78 - X_77) + ABS(X_79 - X_78) + ABS(X_80 - X_79) + ABS(X_81 - X_80) +
ABS(X_82 - X_81) + ABS(X_83 - X_82) + ABS(X_84 - X_83) + ABS(X_85 - X_84) + ABS(X_86 - X_85) +
ABS(X_87 - X_86) + ABS(X_88 - X_87) + ABS(X_89 - X_88) + ABS(X_90 - X_89) + ABS(X_91 - X_90) +
ABS(X_92 - X_91) + ABS(X_93 - X_92) + ABS(X_94 - X_93) + ABS(X_95 - X_94) + ABS(X_96 - X_95) +
ABS(X_97 - X_96) + ABS(X_98 - X_97) + ABS(X_99 - X_98) + ABS(X_100 - X_99) + ABS(X_101 - X_100).
For example, the actual values are -0.002 and -0.004, but because of the labels they are numbered 12 and 14. That means the results in the new variable are not based on the actual values and are far higher than they should be.
Is there any way to remove these labels, or something else I can do, so that the formula uses the right values?
If you edit your post to add the syntax you used to import the data from Excel, it may be possible to correct the problem there.
If it can't be changed there: AUTORECODE is not the right way to turn string data into numbers. Try this instead:
alter type X_1 to x_101 (f12.4).
While you're at it, here's some shorter code to do what you were doing with the numbers:
compute trajlength=0.
do repeat s1=X_1 to x_100 /s2=X_2 to x_101.
compute trajlength=sum(trajlength, abs(s2-s1)).
end repeat.
NOTES:
I used (f12.4) for the new number format - you should change that according to the actual data.
Using X_1 to x_100 will only work if these variables are consecutive in the file. If they are not you'll have to name them individually.
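To see why computing on recoded values inflates the result, it can help to reproduce the arithmetic outside SPSS. The sketch below is Python rather than SPSS syntax and uses made-up values; `traj_length` and `rank_codes` are hypothetical helpers. `rank_codes` only imitates the general idea of replacing each distinct string with a consecutive integer code (here in plain alphabetical order of the strings) and is not meant to reproduce AUTORECODE's exact ordering.

```python
def traj_length(values):
    """Sum of absolute differences between consecutive values."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def rank_codes(strings):
    """Map each distinct string to a consecutive integer code (1, 2, ...)."""
    codes = {s: i for i, s in enumerate(sorted(set(strings)), start=1)}
    return [codes[s] for s in strings]


raw = ["-0.002", "-0.004", "0.001", "-0.004"]  # made-up string data

as_numbers = [float(s) for s in raw]  # the numeric conversion ALTER TYPE aims for
as_codes = rank_codes(raw)            # integer codes standing in for the strings

print(traj_length(as_numbers))  # on the scale of the data
print(traj_length(as_codes))    # on the scale of the codes, so much larger
```

The second result depends only on how many distinct strings there are and in what order they sort, not on the measured values.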
My program builds and the simulator starts, but then the app suddenly terminates.
The error occurs on this line:
mWebView.LoadDataWithBaseURL("", str1, "text/html", "utf-8", null);
WebView mWebView = FindViewById<WebView>(Resource.Id.webView1);
String str1 =
"<html>" +
"<head>" +
"<style type='text/css'>" +
"@font-face {font-family:SFont; src:url('file:///android_asset/fonts/MyFont.TTF');}" +
"@font-face {font-family:TFont; src:url('file:///android_asset/fonts/times.ttf');}" +
"Ptext {" +
"font-family: SFont;" +
"font-size: 19px;" +
"text-align: justify;" +
"}" +
"Etext {" +
"font-family: TFont;" +
"font-size: 13px;" +
"text-align: justify;" +
"}" +
"body {" +
"text-align: justify;" +
"}" +
"</style>" +
"</head>" +
"<body dir = rtl>" +
"<font style='opacity:0.79'>" +
"<font color='white'>" +
"<Ptext>" + "Hello Hello" + " " + "</Ptext>" +
"<Etext>" + "Hello Hello" + "</Etext>" +
"</font>" +
"</font>" +
"</body>" +
"</html>";
mWebView.SetBackgroundColor(Color.ParseColor("#00000000"));
mWebView.LoadDataWithBaseURL("", str1, "text/html", "utf-8", null);
ERRORS:
[ERROR:gl_surface_egl.cc(327)] No suitable EGL configs found.
[ERROR:gl_surface_egl_android.cc(23)] GLSurfaceEGL::InitializeOneOff failed.
[ERROR:browser_main_loop.cc(698)] GLSurface::InitializeOneOff failed
[FATAL:gl_surface_android.cc(58)] Check failed: kGLImplementationNone != GetGLImplementation() (0 vs. 0)
I send a request to http://api-3t.paypal.com/nvp/ but receive error 10002, even though my API signature, username, and password are correct.
The invoice number is a 10-digit number generated randomly in C#.
My code is:
string strNVP = "METHOD=DoDirectPayment" +
"&VERSION=" + ApiVersion +
"&PWD=" + ApiPassword +
"&USER=" + ApiUsername +
"&SIGNATURE=" + ApiSignature +
"&PAYMENTACTION=Sale" +
"&IPADDRESS=151.243.189.92" +
"&RETURNFMFDETAILS=0" +
"&CREDITCARDTYPE=" + creditCard.type +
"&ACCT=" + creditCard.number +
"&EXPDATE=" + expirationMonth + "20" + expirationYear +
"&CVV2=" + creditCard.cvv2 +
"&STARTDATE=" +
"&ISSUENUMBER=" +
"&EMAIL=MatinF@outlook.com" +
//the following represents the billing details
"&FIRSTNAME=" + billingFirstName +
"&LASTNAME=" + billingLastName +
"&STREET=" + billingAddress1 +
"&STREET2=" + "" +
"&CITY=" + Address[8].ToString() +
"&STATE=" + stateName +
"&COUNTRYCODE=SW" +
"&ZIP=" + Address[5].ToString() +
"&AMT=" + TotalPrice +//orderdetails.GrandTotal.ToString("0.0")+
"&CURRENCYCODE=SEK" +
"&DESC=Test Sale Tickets" +
"&INVNUM=" + InvoiceNumber;
This appears to be a limit issue on the recipient's side, per the version 122 update for the DoDirect APIs:
(10002) You've exceeded the receiving limit. This transaction can't be completed
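Independently of the limit error, the request string in the question is built by plain concatenation, so values containing spaces, `&`, or `=` (for example `DESC=Test Sale Tickets`) are not URL-encoded. Below is a minimal sketch of building an encoded name-value-pair body, written in Python rather than C# and using placeholder values. `build_nvp` is a hypothetical helper, and this is not claimed to resolve error 10002.

```python
from urllib.parse import urlencode, parse_qs


def build_nvp(fields):
    """URL-encode a dict of NVP fields into a request body string."""
    return urlencode(fields)


# Placeholder values only; real credentials and card data are omitted.
body = build_nvp({
    "METHOD": "DoDirectPayment",
    "PAYMENTACTION": "Sale",
    "DESC": "Test Sale Tickets",
    "STREET": "1 Main St & Co",
})
print(body)

# Decoding recovers the original values, including spaces and '&'.
decoded = parse_qs(body)
print(decoded["DESC"][0])
print(decoded["STREET"][0])
```

Encoding every value this way keeps a stray `&` inside an address or description from being read as the start of a new field.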
I have a dispatch_async block that, for some reason, crashes unless an NSLog() call is executed right before it. The block runs a method that retrieves a username from a database.
crash:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{
    user_web_communicator *usrWeb = [[user_web_communicator alloc] init];
    NSString *author = [usrWeb getUsernameFromID:author_string];
    [_author_label setText:[NSString stringWithFormat:@"Author: %@", author]];
});
working:
NSLog(@"Fetching author for id: %@", author_string);
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{
    user_web_communicator *usrWeb = [[user_web_communicator alloc] init];
    NSString *author = [usrWeb getUsernameFromID:author_string];
    [_author_label setText:[NSString stringWithFormat:@"Author: %@", author]];
});
error
2013-08-19 13:56:06.149 Poll Me[4995:c07] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: '{
Rows: AsyncImageView:0x827aa00.minX == 10 + 1*0x827b8d0.marker +
-1*0x8281210.marker + 0.5*0x8281260.marker AsyncImageView:0x827aa00.minY == 27.5 + -1*0x827b910.marker +
-1*0x82812a0.marker + 0.5*0x82812e0.marker + 0.5*0x82823f0.marker + -0.5*AsyncImageView:0x827aa00.Height Regular_Cell:0x827a400.Height == 56 + 1*0x82823f0.marker Regular_Cell:0x827a400.Width == 320 +
1*0x8281ee0.marker Regular_Cell:0x827a400.minX == 0 +
1*0x8281740.marker + -0.5*0x8281ee0.marker
Regular_Cell:0x827a400.minY == 46 + 1*0x8281b30.marker +
-0.5*0x82823f0.marker UILabel:0x827a8e0.Width == 0 + 1*0x827b730.marker + 1*0x827b790.marker + -1*0x827b7d0.marker +
1*UILabel:0x827adc0.Width UILabel:0x827a8e0.minX == 18 +
1*0x827b7d0.marker + 1*0x827b8d0.marker + -1*0x8281210.marker +
0.5*0x8281260.marker + 1*AsyncImageView:0x827aa00.Width UILabel:0x827a8e0.minY == 19.5 + -1*0x827b810.marker +
1*0x827b890.marker + -1*0x827b910.marker + -1*0x82812a0.marker +
0.5*0x82812e0.marker + 0.5*0x82823f0.marker + 0.5*AsyncImageView:0x827aa00.Height + -1*UILabel:0x827a8e0.Height + -1*UILabel:0x827b140.Height UILabel:0x827adc0.minX == 18 + 1*0x827b730.marker + 1*0x827b8d0.marker + -1*0x8281210.marker +
0.5*0x8281260.marker + 1*AsyncImageView:0x827aa00.Width UILabel:0x827adc0.minY == 20.5 + 1*0x827b6f0.marker +
-1*0x82812a0.marker + 0.5*0x82812e0.marker UILabel:0x827b140.minX == 18 + 1*0x827b850.marker + 1*0x827b8d0.marker + -1*0x8281210.marker +
0.5*0x8281260.marker + 1*AsyncImageView:0x827aa00.Width UILabel:0x827b140.minY == 27.5 + 1*0x827b890.marker +
-1*0x827b910.marker + -1*0x82812a0.marker + 0.5*0x82812e0.marker + 0.5*0x82823f0.marker + 0.5*AsyncImageView:0x827aa00.Height + -1*UILabel:0x827b140.Height UITableViewCellContentView:0x827a560.Height == 55 +
1*0x82812e0.marker UITableViewCellContentView:0x827a560.Width == 300
+ 1*0x8281260.marker UITableViewCellContentView:0x827a560.minX == 0 + 1*0x8281210.marker + -0.5*0x8281260.marker
UITableViewCellContentView:0x827a560.minY == 0.5 + 1*0x82812a0.marker
+ -0.5*0x82812e0.marker objective == <> + <750:-1>*0x8280fc0.negError + <250:-1>*0x8280fc0.posErrorMarker + <750:-1>*0x8281030.negError + <250:-1>*0x8281030.posErrorMarker
Constraints: Marker:0x8281210.marker
Marker:0x8281260.marker
Marker:0x82812a0.marker (Integralization adjustment:0.5) Marker:0x82812e0.marker
Marker:0x8281740.marker
Marker:0x8281b30.marker
Marker:0x8280fc0.posErrorMarker
Marker:0x8281030.posErrorMarker
Marker:0x827b6f0.marker
Marker:0x827b730.marker
Marker:0x827b790.marker
Marker:0x827b7d0.marker
Marker:0x827b810.marker
Marker:0x827b850.marker
Marker:0x827b890.marker
Marker:0x827b8d0.marker
Marker:0x827b910.marker
Marker:0x8281ee0.marker
Marker:0x82823f0.marker }: internal
error. Cannot find an outgoing row head for incoming head
0x8280fc0.negError, which should never happen.'
* First throw call stack: (0x195d012 0x166ae7e 0x195cdeb 0xefef89 0xf01fcf 0xf025c7 0xf0d58f 0xf0d6d4 0x7d860a 0x7e02af 0x7e03be
0x2e7601 0x49484e 0x354ced 0x2e940c 0x354a7b 0x359919 0x3599cf
0x3421bb 0x352b4b 0x2ef2dd 0x167e6b0 0x17dfc0 0x17233c 0x172150
0xf00bc 0xf1227 0xf18e2 0x1925afe 0x1925a3d 0x19037c2 0x1902f44
0x1902e1b 0x29be7e3 0x29be668 0x29effc 0x1e5ed 0x1d75) 2013-08-19
13:56:06.149 Poll Me[4995:4f03] * Terminating app due to uncaught
exception 'NSInternalInconsistencyException', reason: '{ Rows:
AsyncImageView:0x827aa00.minX == 10 + 1*0x827b8d0.marker +
-1*0x8281210.marker + 0.5*0x8281260.marker AsyncImageView:0x827aa00.minY == 27.5 + -1*0x827b910.marker +
-1*0x82812a0.marker + 0.5*0x82812e0.marker + 0.5*0x82823f0.marker + -0.5*AsyncImageView:0x827aa00.Height Regular_Cell:0x827a400.Height == 56 + 1*0x82823f0.marker Regular_Cell:0x827a400.Width == 320 +
1*0x8281ee0.marker Regular_Cell:0x827a400.minX == 0 +
1*0x8281740.marker + -0.5*0x8281ee0.marker
Regular_Cell:0x827a400.minY == 46 + 1*0x8281b30.marker +
-0.5*0x82823f0.marker UILabel:0x827a8e0.Width == 0 + 1*0x827b730.marker + 1*0x827b790.marker + -1*0x827b7d0.marker +
1*UILabel:0x827adc0.Width UILabel:0x827a8e0.minX == 18 +
1*0x827b7d0.marker + 1*0x827b8d0.marker + -1*0x8281210.marker +
0.5*0x8281260.marker + 1*AsyncImageView:0x827aa00.Width UILabel:0x827a8e0.minY == 19.5 + -1*0x827b810.marker +
1*0x827b890.marker + -1*0x827b910.marker + -1*0x82812a0.marker +
0.5*0x82812e0.marker + 0.5*0x82823f0.marker + 0.5*AsyncImageView:0x827aa00.Height + -1*UILabel:0x827a8e0.Height + -1*UILabel:0x827b140.Height UILabel:0x827adc0.minX == 18 + 1*0x827b730.marker + 1*0x827b8d0.marker + -1*0x8281210.marker +
0.5*0x8281260.marker + 1*AsyncImageView:0x827aa00.Width UILabel:0x827adc0.minY == 20.5 + 1*0x827b6f0.marker +
-1*0x82812a0.marker + 0.5*0x82812e0.marker UILabel:0x827b140.minX == 18 + 1*0x827b850.marker + 1*0x827b8d0.marker + -1*0x8281210.marker +
0.5*0x8281260.marker + 1*AsyncImageView:0x827aa00.Width UILabel:0x827b140.minY == 27.5 + 1*0x827b890.marker +
-1*0x827b910.marker + -1*0x82812a0.marker + 0.5*0x82812e0.marker + 0.5*0x82823f0.marker + 0.5*AsyncImageView:0x827aa00.Height + -1*UILabel:0x827b140.Height UITableViewCellContentView:0x827a560.Height == 55 +
1*0x82812e0.marker UITableViewCellContentView:0x827a560.Width == 300
+ 1*0x8281260.marker UITableViewCellContentView:0xlibc++abi.dylib: terminate called throwing an exception827a560.minX == 0 +
1*0x8281210.marker + -0.5*0x8281260.marker
UITableViewCellContentView:0x827a560.minY == 0.5 + 1*0x82812a0.marker
+ -0.5*0x82812e0.marker objective == <> + <750:-1>*0x8280fc0.negError + <250:-1>*0x8280fc0.posErrorMarker + <750:-1>*0x8281030.negError + <250:-1>*0x8281030.posErrorMarker
Constraints: Marker:0x8281210.marker
Marker:0x8281260.marker
Marker:0x82812a0.marker (Integralization adjustment:0.5) Marker:0x82812e0.marker
Marker:0x8281740.marker Marker:0x8281b30.marker
Marker:0x8280fc0.posErrorMarker
Marker:0x8281030.posErrorMarker
Marker:0x827b6f0.marker
Marker:0x827b730.marker
Marker:0x827b790.marker
Marker:0x827b7d0.marker
Marker:0x827b810.marker
Marker:0x827b850.marker
Marker:0x827b890.marker
Marker:0x827b8d0.marker
Marker:0x827b910.marker
Marker:0x8281ee0.marker
Marker:0x82823f0.marker }: internal
error. Cannot find an outgoing row head for incoming head
0x8280fc0.negError, which should never happen.'
* First throw call stack: (0x195d012 0x166ae7e 0x195cdeb 0xefef89 0xf01fcf 0xf020d3 0x7d86dc 0x7d9280 0x7dd4a3 0x3f7e3c 0x3f8022
0x3f8064 0x2f33b 0x277553f 0x2787014 0x27782e8 0x2778450 0x92710e72
0x926f8d2a) (lldb)
Can you please tell me why this could be happening? I don't want to keep the NSLog() there because it runs several times and gets in the way when I'm trying to read the output. Thank you in advance =)
It is undefined behavior to set the text property of a label on a background thread, or to make any other UI change there for that matter. Since it is undefined behavior, I cannot explain why it works with the NSLog, but you need to dispatch setting the label's text to the main thread.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{
    user_web_communicator *usrWeb = [[user_web_communicator alloc] init];
    NSString *author = [usrWeb getUsernameFromID:author_string];
    // UI updates go through the main queue.
    dispatch_sync(dispatch_get_main_queue(), ^{
        [_author_label setText:[NSString stringWithFormat:@"Author: %@", author]];
    });
});