My 2 tailed sample significance is not showing in paired sample t test result on SPSS - spss

I am trying to do a paired sample t-test analysis with SPSS, but the column that should hold the two-tailed significance is split in two: "1 sided p" and "2 sided p". I do not know how to interpret this result. Please help me out.
I want to either get the result I am looking for, which is the two-tailed significance, or understand how to interpret the results I am getting, which is 1 sided p and 2 sided p. I'm not permitted to add images yet so here is a link to the report:
C:\Users\User\Documents\paired sample t test stack.png
Thank you in advance.
I tried to run the test about 4 times with different variables and tried clicking on other options before running the analysis but the result is the same.

I unfortunately don't have SPSS anymore and cannot see your link, but alas I have looked on YouTube and found a video that shows the output of a paired samples t-test for SPSS. Here is what they have and I have highlighted what I suspect your interpretation issue is:
Basically, SPSS by default gives you the result of a one-tail and two-tail test automatically without really saying which is "correct" (this is what "one-sided" and "two-sided" mean by the way). If you are only interested in testing if there is a significant difference in either direction (two-tailed), then you only use the two-tail test p value. So in your case, just ignore the "one-sided" p-value and use the "two-sided" p value instead.

Related

How to predict a textual field on the basis of input features

I'm stuck with a problem statement of predicting an identifier for a product on the basis of couple of product features. A sample of data available to me looks like the one shown below:
ABC10L 20.0 34 XYZ G345F FG MKD -> 000000DEF_VYA
Here, ABC10L,20.0,34,XYZ,G345F,FG,MKD are the features and 000000DEF_VYA is the unique identifier associated with the product. Initially I tried to formulate this problem as a regression problem but I'm not sure how to generate textual output from my model and what should be my cost function. Also, I'm not sure is regression the right tool to solve the issue here.
Please help in suggesting the right approach and how I may proceed to solve this !!!

Z3: Complex numbers?

I have been searching on whether z3 supports complex numbers and have found the following: https://leodemoura.github.io/blog/2013/01/26/complex.html
The author states that (1) Complex numbers are not yet implemented in Z3 as a built-in (this was written in 2013), and (2) that Complex numbers can be encoded on top of the Real numbers provided by Z3.
The basic idea is to represent a Complex number as a pair of Real numbers. He defines the basic imaginary number with I=(0,1), that is: I means the real part equals 0 and the imaginary part equals 1.
He offers the encoding (I mean, we can test it on our machines), where we can solve the equation x^2+2=0. I received the following result:
sat
x = (-1.4142135623?)*I
The sat result does make sense, since this equation is solvable in the simulation of the theory of complex numbers (as a result of the theory of algebraically closed fields) we have just made. However, the root result does not make sense to me. I mean: what about (1.4142135623?)*I?
I would understand to receive the two roots, but, if only one received, I do not understand why I get the negated solution.
Maybe I misread something or I missed something.
Also, I would like to say if complex numbers have already been implemented built in Z3. I mean, with a standard:
x = Complex("x")
And with tactics of kind of a NCA (from nonlinear complex arithmetic).
I have not seen any reference to this theory in SMT-LIB either.
AFAIK there is no plan to add complex numbers to SMT-LIB. There's a Google group for SMT-LIB and it might make sense to send a post there to see if there is any interest there.
Note, that particular blog post says "find a root"; this is just satisfiability, i.e. it finds one solution, not all of them. (But you can ask for another one by adding an assertion that says x should be different from the first result.)

How to handle homophones in speech recognition?

For those who are not familiar with what a homophone is, I provide the following examples:
our & are
hi & high
to & too & two
While using the Speech API included with iOS, I am encountering situations where a user may say one of these words, but it will not always return the word I want.
I looked into the [alternativeSubstrings] (link) property wondering if this would help, but in my testing of the above words, it always comes back empty.
I also looked into the Natural Language API, but could not find anything in there that looked useful.
I understand that as a user adds more words, the Speech API can begin to infer context and correct for these, but my use case will not work well with this since it will often only want one or two words at most, limiting the effectiveness of context.
An example of contextual processing:
Using the words above on their own, I get these results:
are
hi
to
However, if I put together the following sentence, you can see they are all wrong:
I am too high for our ladder
Ideally, I would either get a list back containing [are, our], [to, too, two], [hi, high] for each transcription segment, or would have a way to compare a string against a function that supports homophones.
An example of this would be:
if myDetectedWord == "to" then { ... }
Where myDetectedWord can be [to, too, two], and this function would return true for each of these.
This is a common NLP dilemma, and I'm not so sure what might be your desired output in this application. However, you may want to bypass this problem in your design/architecture process, if possible and if you could. Otherwise, this problem is to turn into a challenge.
Being said that, if you wish to really get into it, I like this idea of yours:
string against a function
This might be more efficient and performance friendly.
One way, I'd be liking to solve this problem would be though RegEx processing, instead of using endless loops and arrays. You could maybe prototype loops and arrays to begin with and see how it works, then you might want to use regular expression for gaining performance.
You could for instance define fixed arrays in regular expressions and quickly check against your string (word by word, maybe using back-referencing) and you can add many boundaries in your expressions for string processing, as you wish.
Your fixed arrays also can be designed based on probabilities of occurring certain words in certain part of a string. For instance,
^I
vs
^eye
The probability of I being the first word is much higher than that of eye.
The probability of I in any part of a string is higher than that of eye, also.
You might want to weight words based on that.
I'd say the key would be that you'd narrow down your desired outputs as focused as possible and increase accuracy, [maybe even with 100 words if possible], if you wish to have a good/working application.
Good project though, I hope you like/enjoy the challenge.

How to account for variation in spelling (especially for slang) for Word Embeddings/Word2Vec generation using song lyrics?

So I am working on a artist classification project that utilizes hip hop lyrics from genius.com. The problem is these lyrics are user generated, so the same word can be spelled in various different ways, especially if it is slang which is a very common case in hip hop.
I looked into spell correction using hunspell/pyhunspell, but the problem with that is it doesn't fix slang misspellings. I technically could make a mini dictionary with a bunch of misspelled variations but that is effectively useless because there could be a dozen variations of the same word over my (growing) 6000 song corpus.
Any suggestions?
You could try to stem your words. More information on stemming here. This would help grouping together words with close spelling variations.
A popular stemming scheme is the Porter Stemmer, which implementation can be found in most NLP packages, eg. NLTK
I would discard, if possible, short words, or contracted words which somehow are too hard to automatically correct them (conditioned on checking that it won't affect your final result).
For longer words, you may want to use metrics like Levenshtein distance or Jaro similarity. The first one consists of the minimum number of additions, deletes or replaces to convert one candidate word into another. The second one, provides a similar result, between 0 and 1, and putting more emphasis in the last characters of a word.
If you have access to the correct version of your slang word, you could convert the closest candidates to the correct one. Of course, trying not to apply it to different correct words.
If you're working with Python, here some implementations are provided.

What would short versions of given formulas be and how do you do it in general?

I am studying for a test and I would really appreciate some help with this. The instructions are simply to write these formulas in a shorter version. I understand how IF works; if the expression is true, it returns the first value, otherwise the second one (they are separated by semicolons).
Given examples are:
=D$5+IF(C4>1;F8-A12/4-53;F8-A12/4-B12+1)
=IF(C4="test";A4-55+C13;22+(C13+B2+A4))
=C3*IF(AND(C4="P";C5="P");4*A4/(1-m);m*A4-n*A4);
=IF(B3<>0;2*C6;2*IF(B3=2;SIN(omega*t-$A$2);0)
Explanation will be greatly appreciated.
The first formula can be shortened to:
=D$5+F8-A12/4+IF(C4>1;-53;-B12+1)
It's just a matter of factoring out F8-A12 since it is used twice.
To test this, I filled in some numbers into the cells mentioned, and it produced the correct result.
Similarly in the second formula, A4 and C13 are used twice and they could be factored out to be used only once.
In the third formula, everything is getting multiplied by A4 so it could be factored out.
In the fourth formula, it looks like 2 is the only thing that could be factored out.

Resources