Variation in BLEU Score - machine-learning

I have a question about BLEU score calculation for machine translation. I realized there may be different variants of the BLEU metric. The code I found reports five values for BLEU, namely BLEU-1, BLEU-2, BLEU-3, BLEU-4 and finally BLEU, which seems to be an exponential average of the previous four BLEUs. Still, it is not clear to me what the difference between them is. Do you have any ideas? Thanks
P.S. At first I thought this question was more theoretical and posted it on Meta Stack Exchange. A moderator closed it and commented that it is a Stack Overflow type of question. So please don't punish me again. =)

source: http://www.statmt.org/book/slides/08-evaluation.pdf
I haven't heard of BLEU-1 and BLEU-2, but I guess BLEU-1 to BLEU-4 refer to the 1-gram, 2-gram, 3-gram and 4-gram precisions in the formula of the BLEU score, i.e. precision[i] = BLEU-i in your question.

Actually, BLEU-n doesn't use the n-gram scores only. It computes the 1-gram through n-gram scores and gives them equal weight to compute a final score. See the "Cumulative N-Gram Scores" section at this link for more info.
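To make the cumulative idea concrete, here is a minimal sketch of BLEU-n for a single sentence and a single reference, with equal weights over the 1-gram through n-gram precisions and the usual brevity penalty. The function names and the example sentences are made up for illustration, there is no smoothing, and real toolkits may differ in details such as tokenization and multiple references.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Clip each candidate n-gram count by its count in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def cumulative_bleu(candidate, reference, max_n):
    """Geometric mean of the 1..max_n precisions times the brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # unsmoothed: a single zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical example sentences, not taken from the question.
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
for n in range(1, 5):
    print(f"BLEU-{n}:", cumulative_bleu(candidate, reference, n))
```

Under this reading, BLEU-1 is just the unigram precision (times the brevity penalty), while BLEU-4 combines all four precisions, which is why the values usually shrink as n grows.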

Related

Do I need to do a Bonferroni correction on a 2x2 chi-square analysis?

I'm really hoping someone here can help.
I have performed a chi-square test of independence, looking at men/women and early/late drop out from therapy. I have a p-value of 0.047. Men drop out almost 50:50 early:late whereas women drop out almost 25:75 early:late. Do I need post hoc testing for this and a Bonferroni correction, or is the answer simply:
The frequency of retention rates was compared across gender, finding a significant interaction (χ²(1) = 3.94, p = 0.047), indicating that females were more likely to be retained past the third CBT session than men.
Any help would be greatly appreciated, stats hurt my head and I can't continue past this problem.
Since there's only one test performed, with a single degree of freedom, there's no way (or need) to do any multiple comparison correction.
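As a quick sanity check, the p-value for a chi-square statistic with one degree of freedom can be computed with the standard library alone, using the identity p = erfc(sqrt(χ²/2)), which holds only for df = 1. The sketch below plugs in the statistic reported in the question; the function name and the 0.05 significance level are assumptions for illustration.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


statistic = 3.94       # value reported in the question
number_of_tests = 1    # a single 2x2 test was run
alpha = 0.05           # assumed significance level

p_value = chi2_p_value_df1(statistic)
# Bonferroni divides alpha by the number of tests; with one test nothing changes.
bonferroni_alpha = alpha / number_of_tests
print(round(p_value, 3), p_value < bonferroni_alpha)
```

With only one test, the Bonferroni-adjusted threshold equals the original alpha, so the correction has no effect.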

How to get probability of topic given a query using Mallet

I want to use Mallet as part of an expert-finding project. I'm fairly new to Mallet, but I know that it trains topics from a set of documents. Let's say that I have 50 topics trained by Mallet. I want to calculate this probability: either p(topic|q) or p(q|topic).
q is the query. It's a word (such as algorithm, android, etc.) for which I want to find the experts in the specified area.
As I read in this post (how to get word-topic probability using mallet), one of the users said we can calculate the probability using the --word-topic-counts-file option. Let's say that I have generated this file with Mallet. It has the following structure:
0 android 2:21
1 is 3:3
.
.
.
I know the semantics of this structure, but I don't know how to calculate the probability of a topic given the query (i.e. either p(topic|q) or p(q|topic)).
P.S.: I use the word "either" because I'm not sure which of them Mallet calculates.
Any help would be appreciated.
Take this example line from GlieBrt's answer to the linked question
1 needham 19:2 17:1
Here p(topic|q) can be calculated as
p(19|needham) = 2/3 = 0.67
and
p(17|needham) = 1/3 = 0.33
With your own example, it is even simpler:
0 android 2:21
p(2|android) = 1.0
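The calculation above can be sketched in a few lines: parse one line of the word-topic-counts file and normalize the counts over topics. The function name is hypothetical, and the sketch assumes the line layout shown in the examples (index, word, then topic:count pairs).

```python
def topic_given_word(line):
    """Parse one word-topic-counts line into (word, {topic: p(topic | word)})."""
    fields = line.split()
    word = fields[1]  # fields[0] is the word index
    counts = {}
    for pair in fields[2:]:
        topic, count = pair.split(":")
        counts[int(topic)] = int(count)
    total = sum(counts.values())
    return word, {topic: count / total for topic, count in counts.items()}


for line in ["1 needham 19:2 17:1", "0 android 2:21"]:
    print(topic_given_word(line))
```

Looking up the query word in the file and applying this normalization gives an estimate of p(topic|q) for every topic the word was assigned to.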

Silhouette coefficient - Information retrieval

I have been trying to get my hands dirty with Information Retrieval. My professor gave us this problem to solve, but I can't find my way around it. If the given matrix is a distance matrix, the diagonal elements should all be 0. But in the table, they're given as 1. The other entries are also less than 1. How is this possible? Can someone please explain?
Please see question 5.c. I could not enter the table manually and apologize for that.
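The table is most likely a similarity matrix rather than a distance matrix. In a normalized similarity measure, 1 means totally similar and 0 means there is no similarity between documents, so each document has similarity 1 with itself on the diagonal.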
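If distances are needed, for instance for the silhouette coefficient, one common convention (an assumption here, not something stated in the problem) is d = 1 - s for similarities in [0, 1]. A minimal sketch with a made-up 3x3 similarity matrix:

```python
def similarity_to_distance(similarity):
    """Convert similarities in [0, 1] to distances via d = 1 - s."""
    return [[1.0 - s for s in row] for row in similarity]


# Hypothetical similarity matrix; the values are illustrative only.
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
distance = similarity_to_distance(similarity)
for row in distance:
    print(row)
```

After the conversion the diagonal is 0, as expected for a distance matrix.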

Excel finding the code for a cell to be equal or more than another cell to give 1- 5 points

I am using a Google Spreadsheet on Google Drive. I have just made a survey on Google Drive for people to fill in their football predictions for fun! I can put all of the public's predictions into an Excel document in one click, but I want the public to earn points when they predict the right result. For example, if a friend of mine predicted Arsenal 2 - 1 Liverpool and the result is Arsenal 1 - 0 Liverpool, they should earn 1 point for predicting the right winner, and if the result was exactly Arsenal 2 - 1 Liverpool, the prediction should earn 3 points.
I am struggling with the code for it but here is the closest I have got:
COUNTIF(F5, ">=" & E5)
The code above does not work and I cannot figure it out. Any help will be greatly appreciated as soon as possible. Thanks a lot.
I am sorry that this is not an answer, but I am unable to comment without higher reputation. It would be helpful to have access to a copy or example of what you have, as your cell references mean very little without it. Then we can see exactly what you have tried.
EDIT:
Without knowing how your sheet works or is laid out, it is difficult, however I've put this together for you on Google Sheets here
https://docs.google.com/spreadsheets/d/13Ejaddmj2d8cysxdZE_Inro65KDFgzeJW7FqPWwaCpE/edit?usp=sharing
It's a very clunky set of nested IF statements but it does work. I'm sure there is a neater way of doing it.
In case you can't view it, here is the formula
=IF(A2>B2, IF(D2>E2, IF(A2=D2, IF(B2=E2,3,1),1),0), IF(A2=B2, IF(D2=E2, IF(A2=D2, IF(B2=E2,3,1),1),0), IF(D2<E2, IF(A2=D2, IF(B2=E2,3,1),1),0)))
In this case, A2 is Arsenal's score, B2 is Liverpool's score, D2 is Fred's Arsenal score prediction and E2 is his Liverpool prediction. I placed the above formula in cell F2 to give Fred's points.
The above code first checks to see if Arsenal beat Liverpool. If true, it then checks if Fred predicted that Arsenal would beat Liverpool. If true, it checks if Fred got Arsenal's score correct. If true, it checks if Fred got Liverpool's score correct. If that's true, then Fred got everything right and gets 3 points (and a cookie =]).
If Fred gets the winner right but not the scores, he gets 1 point. Otherwise he gets 0.
This formula also checks to see if Liverpool beat Arsenal and runs the same checks against Fred's prediction, then to see if the teams drew and to see if Fred predicted a draw or not. If he predicts a draw and the correct scores he gets 3 points, otherwise he gets 1 for getting the draw correct.
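For readers who find the nested IFs hard to follow, the same rule can be sketched outside the spreadsheet: 3 points for the exact score, 1 point for the right outcome (home win, draw or away win), 0 otherwise. The function names are hypothetical and scores are passed as (Arsenal, Liverpool) pairs.

```python
def outcome(home_goals, away_goals):
    """Return 1 for a home win, 0 for a draw, -1 for an away win."""
    return (home_goals > away_goals) - (home_goals < away_goals)


def score_prediction(actual, predicted):
    """Points for one prediction: 3 for exact score, 1 for right outcome, else 0."""
    if actual == predicted:
        return 3
    if outcome(*actual) == outcome(*predicted):
        return 1
    return 0


# The example from the question: predicted 2 - 1, result 1 - 0.
print(score_prediction(actual=(1, 0), predicted=(2, 1)))
```

Comparing the outcome directly replaces the three separate win, draw and loss branches of the formula with a single check.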
I believe that this is what you are looking for .. attached are a formula and result view of my answer ;)
[Images: formulas view and results view]

Add data series to highlight cases on a box plot (Excel, SPSS or R)

First-time user of this forum, so guidance on how to provide enough information is much appreciated. I am trying to replicate a presentation of data used in the medical education field. This will help improve the quality of examiners' marking of trainees in a clinical exam. What I would like to communicate is similar to what the College of General Practitioners already communicates about one of its own exams; please see www.gp10.com.au/slides/thursday/slide29.pdf to help understand what I want to present. I have access to Excel, SPSS and R, so any help with any of these would be great. As a first attempt I have used SPSS and created 3 variables: a dummy variable, a "station score" (ST) and a "global rating score" (GRS). The station score is a value between 0 and 10 (non-integer) and is on the y-axis, similar to the "Candidate Final Marks" in the PDF presentation. The x-axis is the global rating score, an integer from 1 to 6, which is represented in the PDF as the "Overall Performance Scale". When I use SPSS's boxplot I get a boxplot as depicted.
What I would like to do is overlay a single examiner's own scoring of X number of examinees. For example, one examiner (examiner A) provided the following marks:
ST: 5.53,7.38,7.38,7.44,6.81
GRS: 3,4,4,5,3
(this is transposed into two columns).
Whether it be SPSS, Excel or R, how can I overlay the box and whisker plots with the individual data points provided by the one examiner? This would help show the degree to which the examiner's marking style is in concordance with the expected distribution of ST scores across GRS. Any help is greatly appreciated! I like Excel graphics, but I have found it very difficult to work with when adding the examiner's data as a separate series: somehow the examiner's GRS scores do not line up nicely on the x-axis. I am very new to R but also very interested in it, and would spend time to get a good result in R if one is viable. I understand JMP may be preferable for this type of thing, but access to it may not be possible.
