I am trying to build an iOS application. In one of the screens the user can type something in a search bar, and I have to take the same action for different spellings of the same word.
For example, the user can type "elephant" or "alephant" or "elefant". I have to take the same action for all three of these words.
Is there any library that identifies these words as similar? I cannot use a spellchecker, as I need this to work in languages other than English as well.
I did some research and found that there are phonetic algorithms like Text::Soundex for achieving this on the server side. Are there any such libraries for iOS?
Thanks in advance !!
A better alternative to Soundex would be Double Metaphone or, even better, Metaphone 3. You don't say what language you are using, but both of these algorithms are available in C++, C#, and Java.
There's no Soundex available in NSString, for example, but if that's what you want, it's fairly easy to implement. Here's a (albeit horribly formatted) Soundex NSString category from CocoaDev.
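For reference, here is a minimal Python sketch of the (simplified) classic Soundex coding, just to show how small it is before porting it to an NSString category. Note that Soundex keeps the first letter verbatim, so it matches "elefant" but not "alephant".

def soundex(word):
    # Simplified American Soundex: keep the first letter, encode the rest as digits.
    codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
             **dict.fromkeys("DT", "3"), "L": "4",
             **dict.fromkeys("MN", "5"), "R": "6"}
    word = "".join(c for c in word.upper() if c.isalpha())
    if not word:
        return ""
    result, prev = word[0], codes.get(word[0], "")
    for c in word[1:]:
        digit = codes.get(c, "")
        if digit and digit != prev:
            result += digit
        prev = digit
    return (result + "000")[:4]

print(soundex("elephant"), soundex("elefant"), soundex("alephant"))
# E415 E415 A415 -- the first letter is kept, so "alephant" still differs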
You could also use the Levenshtein distance algorithm to catch simple spelling errors. Also easy to implement (read the Wikipedia article for the details), but here's an NSString category for that.
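The whole algorithm fits in a few lines; here is a sketch of the standard dynamic-programming version in Python (the linked NSString category does essentially the same thing):

def levenshtein(a, b):
    # Classic edit distance: minimum number of insertions, deletions and substitutions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

print(levenshtein("elephant", "alephant"))  # 1
print(levenshtein("elephant", "elefant"))   # 2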
Before you use these algorithms, normalize the input. There's the amazing CFStringTransform function in Core Foundation (see this great article about it on NSHipster, especially the last part about normalization) that can automatically transform input in different languages into normalized forms.
I'm trying to do some algorithm comparison for plagiarism detection. I've found many text comparison approaches for plagiarism.
But comparing algorithms is very different. Say an algorithm uses a huge number of variables, functions and user-defined structures. If someone copies the source code, they will at least change the variable and function names. With a simple text comparison algorithm, this difference in names will count as a "difference", making the algorithm return "false" for plagiarism.
What I want to do is "generalize" (I don't know if that's the right word) all the variable, function and user-defined structure names in a C++ source file, so the variables will be named like "a", "b", and the same for functions: "... fa(...)", "... fb(...)".
I have the C++ source of the algorithms in string variables in PHP, ready to be compared.
I know that many other things should be analysed for an accurate source code comparison, but that would be enough for me.
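To make the idea concrete, here is a toy sketch of what I mean (in Python just for illustration; the same regex idea ports directly to PHP's preg_replace_callback). It renames every non-keyword identifier to a canonical name in order of first appearance, so two copies that differ only in naming compare equal. The keyword list is deliberately incomplete; a real solution would need a proper tokenizer.

import re

# Not a real parser -- just enough to show the renaming idea on toy snippets.
CPP_KEYWORDS = {
    "int", "float", "double", "char", "bool", "void", "if", "else", "for",
    "while", "return", "include", "std", "cout", "cin", "endl", "using",
    "namespace", "struct", "class", "const", "auto",
}

def canonicalize(source):
    # Rename every identifier to id0, id1, ... in order of first appearance.
    mapping = {}
    def rename(match):
        name = match.group(0)
        if name in CPP_KEYWORDS:
            return name
        if name not in mapping:
            mapping[name] = "id" + str(len(mapping))
        return mapping[name]
    return re.sub(r"\b[A-Za-z_]\w*\b", rename, source)

a = "int total(int count) { int sum = 0; for (int i = 0; i < count; i++) sum += i; return sum; }"
b = "int soma(int n) { int s = 0; for (int j = 0; j < n; j++) s += j; return s; }"
print(canonicalize(a) == canonicalize(b))  # True: same code modulo identifier names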
It's an interesting question. Depending on how complex the algorithm is, however, it might be the variable names that give the plagiarism away. How many ways can you really code up a tree traversal, for example?
I think there was a paper a few years ago on identifying coders through their style, looking at all the little things like whitespace, where {}s are placed, etc. Who knows, but maybe that is the way to go: look for a negative match against the student's previous style rather than a positive match against the known sources. That said, students aren't likely to have developed a very personal coding style at an early stage of learning.
One thought: what language are the examples written in? Can they be compiled? If you compile C and then do a binary comparison of the executables, will identical programs with different local variable names produce the exact same binary? (Global variables and functions wouldn't, though.)
I've used MOSS in the past: http://theory.stanford.edu/~aiken/moss/ to detect plagiarized code. Since it works on a semantic level, it will detect the situations you presented above. The tool is language-aware, so comments are not considered in the analysis, and it goes a long way in detecting code that has been modified through simple search-and-replace of variable and/or function names.
Note: I used the tool a few years ago when I taught computer science in grad school, and it worked wonderfully in detecting code that had been yanked from the internet. Here is a well-documented account of a similar application: http://fie2012.org/sites/fie2012.org/history/fie99/papers/1110.pdf
If you google "measure software similarity", you should find a few more useful hits: http://www.ics.heacademy.ac.uk/resources/assessment/plagiarism/detectiontools_sourcecode.html
I am searching for information on algorithms to process text sentences or to follow a structure when creating sentences that are valid in a normal human language such as English. I would like to know if there are projects working in this field that I can go learn from or start using.
For example, if I gave a program a noun, and provided it with a thesaurus (for related words) and part-of-speech information (so it understood where each word belonged in a sentence), could it create a random, valid sentence?
I'm sure there are many sub-sections of this kind of research so any leads into this would be great.
The field you're looking for is called natural language generation, a subfield of natural language processing:
http://en.wikipedia.org/wiki/Natural_language_processing
Sentence generation is either really easy or really hard depending on how good you want the sentences to be. Currently, there aren't programs that will be able to generate 100% sensible sentences about given nouns (even with a thesaurus) -- if that is what you mean.
If, on the other hand, you would be satisfied with nonsense that was sometimes ungrammatical, then you could try an n-gram based sentence generator. These just chain together words that tend to appear in sequence, and 3- or 4-gram generators look quite okay sometimes (although you'll recognize them as what generates a lot of spam email).
Here's an intro to the basics of n-gram based generation, using NLTK:
http://www.nltk.org/book/ch02.html#generating-random-text-with-bigrams
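For example, a minimal bigram generator with NLTK could look roughly like this (it assumes NLTK is installed and the sample genesis corpus has been downloaded):

import random
import nltk
from nltk.corpus import genesis

# nltk.download("genesis")  # one-time download of the sample corpus

words = genesis.words("english-kjv.txt")
cfd = nltk.ConditionalFreqDist(nltk.bigrams(words))  # word -> counts of following words

def generate(seed, length=20):
    out = [seed]
    for _ in range(length - 1):
        followers = cfd[out[-1]]
        if not followers:
            break
        nxt, = random.choices(list(followers), weights=list(followers.values()))
        out.append(nxt)
    return " ".join(out)

print(generate("living"))  # ungrammatical but vaguely English-looking output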
This is called NLG (Natural Language Generation), although that is mainly the task of generating text that describes a set of data. There is also a lot of research on completely random sentence generation.
One starting point is to use Markov chains to generate sentences. The way this is done is that you have a transition matrix that says how likely it is to transition between every pair of parts-of-speech. You also have the most likely starting and ending part-of-speech of a sentence. Put this all together and you can generate likely sequences of parts-of-speech.
Now, you are far from done. First of all, this will not give a very good result, as you are only considering the probability between adjacent words (also called bigrams). What you want to do is extend this to look at, for instance, the transition matrix between three parts-of-speech (this makes a 3D matrix and gives you trigrams). You can extend it to 4-grams, 5-grams, etc., depending on the processing power and whether your corpus can fill such a matrix.
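As a toy illustration of the transition-matrix idea, here is a sketch with a hand-made (completely made-up) bigram table over parts-of-speech and a tiny lexicon:

import random

# Made-up transition probabilities between part-of-speech tags.
transitions = {
    "<s>":  {"DET": 0.7, "NOUN": 0.3},
    "DET":  {"ADJ": 0.4, "NOUN": 0.6},
    "ADJ":  {"NOUN": 1.0},
    "NOUN": {"VERB": 0.8, "</s>": 0.2},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "</s>": 0.2},
}
lexicon = {"DET": ["the", "a"], "ADJ": ["big", "old"],
           "NOUN": ["dog", "tree", "idea"], "VERB": ["sees", "likes"]}

def generate_sentence():
    tag, words = "<s>", []
    while True:
        options = transitions[tag]
        tag = random.choices(list(options), weights=list(options.values()))[0]
        if tag == "</s>":
            break
        words.append(random.choice(lexicon[tag]))
    return " ".join(words)

print(generate_sentence())  # e.g. "the old dog sees a tree"

In practice you would estimate the transition table from a POS-tagged corpus instead of writing it by hand.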
Lastly, you need to patch up things such as agreement (subject-verb agreement, adjective agreement (not in English, though), etc.) and tense, so that everything is congruent.
Yes. There is some work on solving NLG problems with AI techniques. As far as I know, there is currently no method that you can use for practical purposes.
If you have the background, I suggest getting familiar with some work by Alexander Koller from Saarland University. He describes how to encode NLG problems in PDDL. The main article you'll want to read is "Sentence Generation as a Planning Problem".
If you do not have any background in NLP, just search for online courses or course materials by Michael Collins or Dan Jurafsky.
Writing random sentences is not that hard. Any parser textbook's simple-english-grammar example can be run in reverse to generate grammatically correct nonsense sentences.
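For instance, a hand-rolled toy grammar expanded top-down at random (a sketch, not from any particular textbook) is only a few lines:

import random

# A textbook-style toy grammar; nonterminals map to lists of possible expansions.
grammar = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["Det", "Adj", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "Adj": [["green"], ["sleepy"]],
    "N":   [["dog"], ["philosopher"], ["sandwich"]],
    "V":   [["eats"], ["contemplates"]],
}

def expand(symbol):
    if symbol not in grammar:      # terminal word
        return [symbol]
    production = random.choice(grammar[symbol])
    return [word for part in production for word in expand(part)]

print(" ".join(expand("S")))  # e.g. "a sleepy philosopher contemplates the sandwich"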
Another way is the word-tuple-random-walk, made popular by the old BYTE magazine TRAVESTY, or stuff like
http://www.perlmonks.org/index.pl?node_id=94856
Is it possible for a person working with statistics to replace their specialized programs with F#? I'm thinking mainly of SAS/SPSS.
Is there any native support for this in F#?
I am not talking about trivial things like standard deviation and the like, but about, for example, item-response modeling.
UPDATE: Don't let the item-response modeling put you off! I don't even know it; it's just an example of the things I know people do with SPSS, to clarify that this is about more advanced features.
In short: is there a way to use F# as your main statistical tool and replace SPSS altogether?
Sadly, there is nothing comparable to the combination of
R + PostgreSQL + Python/Java/Groovy/Scala/... + VisAD
Of course, there is the nice http://www.codeplex.com/vslab instead of gnuplot,
and some C# statistics code packaged in http://ta-lib.org/ and http://www.alglib.net/.
You can use R within F# with the R type provider; see
http://blogs.msdn.com/b/dsyme/archive/2013/01/30/twelve-type-providers-in-pictures.aspx
and
http://techblog.bluemountaincapital.com/2012/08/01/announcing-the-f-r-type-provider/
From that announcement: "Here at BlueMountain we like to perform statistical analysis of data. The stats package R is great for doing that. We also like to use the data retrieval and processing capabilities of F#. F#’s interactive environment lends itself pretty well to data exploration, and we can also easily access our existing .NET-based libraries. Once we are done, we can build and release production-supportable applications."
Here is something that may be useful:
http://fsmathtools.codeplex.com/
or
http://mathnetnumerics.codeplex.com/
When I write math in LaTeX I often need to perform simple arithmetic on numbers in my LaTeX source, like 515.1544 + 454 = ???.
I usually copy-paste the LaTeX code into Google to get the result, but I still have to manually change the syntax, e.g.
\frac{154,7}{25} - (289 - \frac{1337}{42})
must be changed to
154,7/25 - (289 - 1337/42)
It seems trivial to write a program to do this for the most commonly used operations.
Is there a calculator which understands this syntax?
EDIT:
I know that doing this perfectly is impossible (because of the halting problem). Doing it for the simple cases I need is trivial. \frac, \cdot, \sqrt and a few other tags would do the trick. The program could just return an error for cases it does not understand.
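For illustration, a quick Python sketch that handles only \frac, \cdot and \sqrt (and treats the comma as a decimal separator) could look like this; it simply fails on anything else:

import re

def latex_to_value(tex):
    # Evaluate a small subset of LaTeX math: \frac, \cdot, \sqrt and parentheses.
    expr = tex.replace(",", ".")                    # comma as decimal separator
    expr = expr.replace(r"\cdot", "*")
    expr = expr.replace(r"\left", "").replace(r"\right", "")
    frac = re.compile(r"\\frac\{([^{}]*)\}\{([^{}]*)\}")
    sqrt = re.compile(r"\\sqrt\{([^{}]*)\}")
    changed = True
    while changed:                                  # rewrite innermost \frac and \sqrt first
        expr, n1 = frac.subn(r"((\1)/(\2))", expr)
        expr, n2 = sqrt.subn(r"((\1)**0.5)", expr)
        changed = bool(n1 or n2)
    return eval(expr, {"__builtins__": {}})         # fine for trusted input only

print(latex_to_value(r"\frac{154,7}{25} - (289 - \frac{1337}{42})"))
# about -250.98, with "154,7" read as 154.7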
WolframAlpha can take input in TeX form.
http://blog.wolframalpha.com/2010/09/30/talk-to-wolframalpha-in-tex/
The LaTeXCalc project is designed to do just that. It will read a TeX file and do the computations. For more information check out the home page at http://latexcalc.sourceforge.net/
The calc package allows you to do some calculations in source, but only within commands like \setcounter and \addtolength. As far as I can tell, this is not what you want.
If you already use Sage, then the sagetex package is pretty awesome (if not, it's overkill). It allows you to get nicely formatted output from input like this:
The square of
$\begin{pmatrix}
1 & 2 \\
3 & 4
\end{pmatrix}$
is \sage{matrix([[1, 2], [3,4]])^2}.
The prime factorization of the current page number is \sage{factor(\thepage)}
As Andy says, the answer is yes, there is a calculator that can understand most LaTeX formulas: Emacs.
Try the following steps (assuming vanilla emacs):
Open Emacs
Open your .tex file (or activate latex-mode)
Position the point somewhere between the two $s, or e.g. inside the begin/end environment of the formula (or even matrix)
Use Calc embedded mode for maximum awesomeness
So with point in the formula you gave above:
$\frac{154,7}{25} - (289 - \frac{1337}{42})$
press C-x * d to duplicate the formula in the line below and enter Calc embedded mode, which should already have activated a LaTeX variant of Calc for you. Your buffer now looks like this:
$\frac{154,7}{25} - (289 - \frac{1337}{42})$
$\frac{-37651}{150}$
Note that the fraction has already been simplified as far as possible. Doing the same again (C-x * d) and pressing c f to convert the fraction into a floating point number yields the following buffer:
$\frac{154,7}{25} - (289 - \frac{1337}{42})$
$\frac{-37651}{150}$
$-251.006666667$
I used C-x * d to duplicate the formula and then enter embedded mode, in order to keep the intermediate values; however, there is also C-x * e, which avoids the duplication and simply enters embedded mode for the current formula.
If you are interested you should really have a look at the info page for Emacs Calc - Embedded Mode, and in general the help for the GNU Emacs Calculator, together with the awesome interactive tutorial.
You can run an R function called Sweave on a file (mostly TeX with some R) to replace the R expressions with their results in the TeX output.
A tutorial can be found here: http://www.scribd.com/doc/6451985/Learning-to-Sweave-in-APA-Style
My calculator can do that. To get the formatted output, double-click the result formula and press ctrl+c to copy it.
It can do fairly advanced stuff too (differentiation, easy integrals, and some not-so-easy ones...).
https://calculator-algebra.org/
A sample computation:
https://calculator-algebra.org:8166/#%7B%22currentPage%22%3A%22calculator%22%2C%22calculatorInput%22%3A%22%5C%5Cfrac%7B1%2B2%7D%7B3%7D%3B%20d%2Fdx(arctan%20(2x%2B3))%22%2C%22monitoring%22%3A%22true%22%7D
There is a way to do what you want, just not quite how you describe.
You can use the fp package (\usepackage[options]{fp}). The floating point package will do anything you want: solving equations, adding, dividing and much more. Unfortunately it will not read LaTeX math; instead you have to do something a little different. The documentation is very poor, so I'll give an example here.
For instance, if you want to compute (2x3)/5 you would type:
\FPmul\p{2}{3} % \p is the assignment of the operation 2x3
\FPupn\p{\p{} 7 round} % upn evaluates the assignment \p and rounds to 7dp
\FPdiv\q{\p}{5} % divides the assigned value p by 5 names result q
\FPupn\q{\q{} 4 round} % rounds the result to 4 decimal places and evaluates
$\frac{2\times3}{5}=\FPprint\q$ % This will print the result of the calculations in the math.
The FP commands are always invisible; only \FPprint prints the result associated with it, so your documents will not be messy. FP commands can be placed wherever you wish (not in verbatim) as long as they come before the associated \FPprint.
You could just paste it into Symbolab, which as a bonus has free step-by-step solutions. Also, since Symbolab uses MathQuill, it instantly formats your LaTeX.
Considering that LaTeX itself is a Turing-complete markup language, I strongly doubt you can build something like this that isn't built directly into LaTeX. Furthermore, LaTeX math markup itself has next to no semantic meaning; it merely describes the visual appearance.
That being said, you can probably hack together something which recognizes a non-programmable subset of LaTeX math markup and spits out the result in the same way. If all you're interested in is simple arithmetic with fractions and integers (careful with decimal fractions, though, as they may appear as 3{,}141... in German texts :)) this shouldn't be too hard. But once you start with integrals, matrices, etc., I fear that LaTeX lacks the expressiveness to accurately describe your intentions. It is a document preparation system, after all, and thus not very suitable as input for computer algebra systems.
Side note: you can switch to Word, which (in its current version) has a math markup language that is sufficiently LaTeX-like (by now it even supports LaTeX markup) and yet still Google-friendly for simpler terms. With the free Microsoft Math add-in you can even let Word calculate expressions in-place.
There is none, because it is generally not possible.
LaTeX math mode markup is presentational markup and there are cases in which it does not provide enough information to calculate the expression.
That was one of the reasons MathML content markup was created and also why MathML is used in Mathematica. MathML actually is sort of two languages in one:
presentation markup
content markup
To accomplish what you are after, you'll need MathML with combined presentation and content markup (see the MathML spec).
In my opinion your best bet is to use MathML (even if it is verbose) and convert to LaTeX when necessary. That said, I also like LaTeX syntax best and maybe what we need is a compact syntax for MathML (something similar in spirit to RelaxNG compact syntax).
For calculations with LaTeX you can use the CalcTeX package.
This package understands elements of the LaTeX language and does the calculations; for example, your problem is available at
http://sg.bzip.pl/CalcTeX/examples/frac.tgz
or just write
\noindent
For calculation please use the following environments
$515.1544 + 454$
or
\[ \frac{154.7}{25}-(289-\frac{1337}{42.})
\]
or
\begin{equation}
154.7/25-(289-1337/42.)
\end{equation}
For more info please visit the project web site or contact the author of this project.
For performing the math within your LaTeX itself, you might also look into the pgfmath package, which is more powerful and convenient than the calc package. You can find out how to use it from Part VI of The TikZ and PGF Packages Manual, which you can find here (version 2.10 currently): http://mirror.unl.edu/ctan/graphics/pgf/base/doc/generic/pgf/pgfmanual.pdf
Emacs Calc mode accepts LaTeX input. I use it daily. Press "d", followed by "L", to enter LaTeX input mode. Press "'" to open a prompt where you can paste your TeX.
Anyone saying it is not possible is wrong.
IIRC Mathematica can do it.
"There is none, because it is generally not possible. LaTeX math mode markup is presentational markup and there are cases in which it does not provide enough information to calculate the expression."
You are right: LaTeX as it is does not provide enough information to make any calculations; moreover, it does not represent that information at all. But nothing prevents you from writing a text in LaTeX format that contains such information.
It is a difficult path, because you need to build a system of rules, superimposed on what the content of the LaTeX text needs to contain, so that it is recognizable by your interpreter. And then you have to convince the user that it is worth learning, etc., etc.
The easier way is to create a logical and intuitive calculator for mathematical expressions, and then convert the expression to LaTeX. It's almost like what you said. This is implemented in the program I pointed to: AnEasyCalc lets you type an expression as you would type plain text in any text editor. It then checks it, calculates it and generates the LaTeX string by itself. It's very quick and easy. Just try it and you will see.
This is not exactly what you are asking for, but it is a nice package that you can include in a LaTeX document to do all kinds of operations, including arithmetic, calculus and even vectors and matrices:
The package name is "calculator"
http://mirror.unl.edu/ctan/macros/latex/contrib/calculator/calculator.pdf
The latex2sympy2 Python library can parse LaTeX math expressions.
from latex2sympy2 import latex2sympy
tex_str = r"""YOUR TEX MATH HERE"""
tex_str = r"\frac{9\pi}{\ln(12)}+22" # example TeX math
sympy_object = latex2sympy(tex_str)
evaluated_tex = float(sympy_object.evalf())
print(evaluated_tex)
This Python 3 code evaluates 9𝜋/ln(12)+22 (in its LaTeX form above) to 33.37842899841745.
The snippet above only handles basic algebraic simplification (math expressions without variables). Since the library converts LaTeX math to SymPy objects, the above code can easily be tweaked and extended to handle much more complicated LaTeX math (including solving derivatives, integrals, etc...).
The latex2sympy2 library can be installed via the pip command: pip install --user latex2sympy2
Try the AnEasyCalc program. It allows you to get the LaTeX formula very easily:
http://steamandwater.od.ua/AnEasyCalc/
:)