Just wondering whether anyone has recommendations for optimisation utilities for Delphi.
e.g. Simplex, genetic algorithms, etc.
Basically I need to optimise my overarching model as a complete black-box function, with input variables like tilt angle or array size, within pre-determined boundaries. The output is usually a smooth curve, usually with no false summits.
The old Numerical Recipes Pascal stuff is looking a bit dated (no functions as variables, etc.).
Many thanks, Brian
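(For illustration: the "functions as variables" gap is closed in modern Delphi, which has procedural and anonymous-method types. Below is a minimal sketch, not taken from any library mentioned in this thread, that passes a black-box objective as a first-class value into a golden-section search over one bounded variable; the tilt-angle objective is a made-up placeholder for the real model.)

program BlackBoxOpt;
{ Minimal sketch: a black-box objective passed as a value (Delphi
  2009+ anonymous method), driving a golden-section search on one
  bounded variable. Assumes a single optimum in the interval, as in
  the "no false summits" case described above. }

{$APPTYPE CONSOLE}

uses
  SysUtils, Math;

type
  TObjective = reference to function(X: Double): Double;

{ Minimise F on [A, B]; the interval shrinks by the golden ratio each step. }
function GoldenSection(const F: TObjective; A, B, Tol: Double): Double;
const
  InvPhi = 0.6180339887498949;
var
  C, D: Double;
begin
  while Abs(B - A) > Tol do
  begin
    C := B - InvPhi * (B - A);
    D := A + InvPhi * (B - A);
    if F(C) < F(D) then
      B := D   { minimum lies in [A, D] }
    else
      A := C;  { minimum lies in [C, B] }
  end;
  Result := 0.5 * (A + B);
end;

var
  BestTilt: Double;
begin
  { Hypothetical model: maximise output by minimising its negation. }
  BestTilt := GoldenSection(
    function(Tilt: Double): Double
    begin
      Result := -Cos(DegToRad(Tilt - 35.0)); { placeholder for the real model }
    end,
    0.0, 90.0, 1e-4);
  WriteLn(Format('Best tilt: %.3f degrees', [BestTilt]));
end.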
I found a program, written in Pascal, that implements the Simplex method. It's a little old, but you could convert it to Delphi. You can find it here.
I hope it's of some use to you.
PS: If you have some cash to spend, try here.
TSimplex Class
https://iie.fing.edu.uy/svn/SimSEE/src/rchlib/mat/usimplex.pas
For Mixed-Integer Simplex
TMIPSimplex Class
https://iie.fing.edu.uy/svn/SimSEE/src/rchlib/mat/umipsimplex.pas
User: simsee_svn
Password: publico
I am confused about how linear regression works in supervised learning. I want to build an evaluation function for a board game using linear regression, so I need both input and output data. The input is my board state, and I need the corresponding value for that state, right? But how can I get this expected value? Do I need to write an evaluation function myself first? I thought the point was to generate the evaluation function using linear regression, so I'm a little confused.
It's supervised learning after all, meaning you will need both input and output.
Now the question is how to obtain these, and this is not trivial!
Candidates are:
historical data (e.g. online play history)
some form of self-play / reinforcement learning (more complex)
But then a new problem arises: what output is available, and what kind of input will you use?
If there were some a-priori implemented AI, you could just take its scores. With historical data, however, you typically only get -1, 0, 1 (A wins, draw, B wins), which makes learning harder. This touches the credit assignment problem: a single move may have lost the game, and it is hard to tell which of 30 moves led to the final result of 1.
This is also related to the input. Take chess, and take a random position from some online game: that position may be unique across 10 million games (or at least very rare), which conflicts with the expected performance of your approach. Here I assumed the input is the full board position. This changes for other inputs, e.g. chess material, where the input is just a histogram of pieces (3 of these, 2 of those); then there are far fewer unique inputs and learning will be easier.
Long story short: it's a complex task with a lot of different approaches, and most of it depends on your exact task! A linear evaluation function is not uncommon in reinforcement learning approaches, where this function is a core component; you might want to read some of the literature comparing table lookup vs. linear regression vs. neural networks for approximating the value or policy function.
I might add that your task points toward the self-learning approach to AI, which is very hard. It's a topic that has gained additional popularity in recent years (there were earlier successes too: see the Backgammon AIs), but all of these approaches are highly complex, and a good understanding of RL and of the mathematical basics, such as Markov decision processes, is important.
For more classic hand-made evaluation functions, a lot of people use an additional regressor for tuning / weighting already-implemented components; there is an overview at the chessprogramming wiki. (The chess-material example from above is a good one: the assumption is that more pieces are better than fewer, but it is hard to assign them values. A sketch of such a linear evaluation follows below.)
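(To make the "linear evaluation function" idea concrete, here is a small sketch in Pascal. The material-histogram features, the sample position, and its target score are all invented placeholders; obtaining real targets is exactly the hard part discussed above.)

program LinearEval;
{ Sketch: a linear evaluation function over hand-made board features
  (a hypothetical material histogram: our piece counts minus the
  opponent's), trained by stochastic gradient descent on squared
  error. The toy sample and its label are invented. }

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  NumFeatures = 5; { e.g. pawn, knight, bishop, rook, queen differences }

type
  TFeatures = array[0..NumFeatures - 1] of Double;

var
  W: TFeatures; { learned weights; globals start zeroed }

function Evaluate(const F: TFeatures): Double;
var
  i: Integer;
begin
  Result := 0.0;
  for i := 0 to NumFeatures - 1 do
    Result := Result + W[i] * F[i];
end;

{ One gradient step on the squared error (Evaluate(F) - Target)^2. }
procedure TrainStep(const F: TFeatures; Target, LearnRate: Double);
var
  Err: Double;
  i: Integer;
begin
  Err := Evaluate(F) - Target;
  for i := 0 to NumFeatures - 1 do
    W[i] := W[i] - LearnRate * Err * F[i];
end;

var
  Sample: TFeatures;
  Epoch: Integer;
begin
  { Toy position: down one pawn, up one knight; labelled slightly winning. }
  Sample[0] := -1; Sample[1] := 1; Sample[2] := 0; Sample[3] := 0; Sample[4] := 0;
  for Epoch := 1 to 100 do
    TrainStep(Sample, 0.5, 0.05);
  WriteLn(Format('Eval after training: %.3f', [Evaluate(Sample)]));
end.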
The Objective-C math library seems pretty basic.
I'm looking for statistical analysis functions, like the Excel function LINEST, to retrieve the quadratic or polynomial regression of a data set with a given order.
Is there any function similar to LINEST for Objective-C, or a known statistics library/framework?
I have a hard time believing I'm the first person to stumble upon this problem in iOS.
I spent several days working through the math and getting it into code because I couldn't find a math library for iOS with the function I needed. I wouldn't recommend anyone to do that again; it wasn't a walk in the park. So I published my solution on my GitHub. You can find it here:
https://github.com/KingIsulgard/iOS-Polynomial-Regression
It's easy to use: just give it the x values and y values of your data and the order of polynomial you want, and voilà, you've got it.
Hope this might help some people. Feel free to improve it if you can. I'm just happy it finally worked.
The standard math library in general only gives you an interface to the elementary mathematical operations that are implemented in the FPU part of a CPU.
For linear regression you need either your own algorithm (it is not that complicated to implement, just a handful of loops) or a dedicated statistics library.
Writing your own algorithm for higher-order or general regression is simple if a QR decomposition algorithm is available, for instance via bindings for LAPACK or similar. Then to solve

minimize sum_k ( b[0]*f[0](x[k]) + ... + b[n]*f[n](x[k]) - y[k] )^2

one just has to construct the matrix [X|Y], where X[k,j] = f[j](x[k]) contains the values of the ansatz functions and Y[k] = y[k] is the column vector of the values to approximate. Apply the QR algorithm to [X|Y], extract the R factor from the result, and solve for b in the top rows of

R * [b; -1] = 0

via back-substitution.
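(For completeness, a sketch of the same least-squares idea without QR: accumulate the normal equations (X'X) b = X'y from the ansatz-function values and solve by Gaussian elimination with back-substitution. This is simpler to code by hand and fine for a handful of well-scaled basis functions, though the QR route above is more robust numerically. Written in Pascal rather than Objective-C; the basis functions 1, x, x^2 and the sample data are placeholders.)

program GeneralFit;
{ Sketch: general least-squares fit via the normal equations
  (X'X) b = X'y, solved by Gaussian elimination with partial
  pivoting and back-substitution. }

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  NumFuncs = 3; { ansatz functions 1, x, x^2 -> a quadratic fit }

type
  TAnsatz = function(X: Double): Double;
  TVec = array[0..NumFuncs - 1] of Double;
  TMat = array[0..NumFuncs - 1, 0..NumFuncs] of Double; { augmented [X'X | X'y] }

function F0(X: Double): Double; begin Result := 1.0; end;
function F1(X: Double): Double; begin Result := X; end;
function F2(X: Double): Double; begin Result := X * X; end;

const
  Funcs: array[0..NumFuncs - 1] of TAnsatz = (F0, F1, F2);

function Solve(const Xs, Ys: array of Double): TVec;
var
  A: TMat;
  Fi: TVec;
  i, j, k, p: Integer;
  Tmp, Factor, S: Double;
begin
  FillChar(A, SizeOf(A), 0);
  { Accumulate the normal equations X'X and X'y. }
  for k := 0 to High(Xs) do
  begin
    for i := 0 to NumFuncs - 1 do
      Fi[i] := Funcs[i](Xs[k]);
    for i := 0 to NumFuncs - 1 do
    begin
      for j := 0 to NumFuncs - 1 do
        A[i, j] := A[i, j] + Fi[i] * Fi[j];
      A[i, NumFuncs] := A[i, NumFuncs] + Fi[i] * Ys[k];
    end;
  end;
  { Gaussian elimination with partial pivoting. }
  for i := 0 to NumFuncs - 1 do
  begin
    p := i;
    for k := i + 1 to NumFuncs - 1 do
      if Abs(A[k, i]) > Abs(A[p, i]) then p := k;
    for j := 0 to NumFuncs do
    begin
      Tmp := A[i, j]; A[i, j] := A[p, j]; A[p, j] := Tmp;
    end;
    for k := i + 1 to NumFuncs - 1 do
    begin
      Factor := A[k, i] / A[i, i];
      for j := i to NumFuncs do
        A[k, j] := A[k, j] - Factor * A[i, j];
    end;
  end;
  { Back-substitution. }
  for i := NumFuncs - 1 downto 0 do
  begin
    S := A[i, NumFuncs];
    for j := i + 1 to NumFuncs - 1 do
      S := S - A[i, j] * Result[j];
    Result[i] := S / A[i, i];
  end;
end;

var
  B: TVec;
begin
  { Data sampled exactly from y = 2 + 3x - x^2. }
  B := Solve([0, 1, 2, 3, 4], [2, 4, 4, 2, -2]);
  WriteLn(Format('b = %.3f, %.3f, %.3f', [B[0], B[1], B[2]]));
end.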
I found this nice paper about color reduction
http://asp.eurasipjournals.com/content/2013/1/95
The approach sounds really interesting and I'd like to evaluate the algorithm.
Does anyone know:
- Is there any implementation publicly available?
- Or is it "easy" to implement, for instance with OpenCV? (I don't have much experience with OpenCV, but I'm willing to learn it if necessary.)
Regards
The SWT part of this you can find here: https://github.com/aperrau/DetectText (it detects text regions with SWT). But it works rather slowly, more than several seconds per image.
The paper about this implementation is here: http://www.cs.cornell.edu/courses/cs4670/2010fa/projects/final/results/group_of_arp86_sk2357/Writeup.pdf
I have to find similar URLs like
'http://teethwhitening360.com/teeth-whitening-treatments/18/'
'http://teethwhitening360.com/laser-teeth-whitening/22/'
'http://teethwhitening360.com/teeth-whitening-products/21/'
'http://unwanted-hair-removal.blogspot.com/2008/03/breakthroughs-in-unwanted-hair-remo'
'http://unwanted-hair-removal.blogspot.com/2008/03/unwanted-hair-removal-products.html'
'http://unwanted-hair-removal.blogspot.com/2008/03/unwanted-hair-removal-by-shaving.ht'
and gather them into groups or clusters. My problems:
The number of URLs is large (1,580,000).
I don't know which clustering method or similarity measure is best.
I would appreciate any suggestion on this.
There are a few problems at play here. First you'll probably want to wash the URLs with a dictionary, for example to convert
http://teethwhitening360.com/teeth-whitening-treatments/18/
to
teeth whitening 360 com teeth whitening treatments 18
then you may want to stem the words somehow, e.g. using the Porter stemmer:
teeth whiten 360 com teeth whiten treatment 18
Then you can use a simple vector space model to map the URLs into an n-dimensional space and run k-means clustering on them. It's a basic approach, but it should work.
The number of URLs involved shouldn't be a problem; it depends on what language/environment you're using. I would think Matlab would be able to handle it.
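(A minimal sketch of the tokenizing step, in Pascal. Dictionary-based splitting of glued words like "teethwhitening", and Porter stemming, would be separate passes over the tokens and are not shown.)

program UrlTokens;
{ Sketch: "wash" a URL into lowercase word tokens by keeping runs of
  letters and dropping the scheme separators, digits and punctuation. }

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

function Tokenize(const Url: string): TStringList;
var
  S: string;
  i, Start: Integer;
begin
  Result := TStringList.Create;
  S := LowerCase(Url);
  i := 1;
  while i <= Length(S) do
    if CharInSet(S[i], ['a'..'z']) then
    begin
      Start := i;
      while (i <= Length(S)) and CharInSet(S[i], ['a'..'z']) do
        Inc(i);
      Result.Add(Copy(S, Start, i - Start));
    end
    else
      Inc(i);
end;

var
  Tokens: TStringList;
begin
  Tokens := Tokenize('http://teethwhitening360.com/teeth-whitening-treatments/18/');
  try
    { Prints: http,teethwhitening,com,teeth,whitening,treatments }
    WriteLn(Tokens.CommaText);
  finally
    Tokens.Free;
  end;
end.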
Tokenizing and stemming are obvious things to do. You can then easily turn the token lists into sparse TF-IDF vector data. Crawling the actual web pages to get additional tokens is probably too much work.
After this, you should be able to use any flexible clustering algorithm on the data set. By flexible I mean that you need to be able to use, for example, cosine distance instead of Euclidean distance (which does not work well on sparse vectors). k-means in GNU R, for example, only supports Euclidean distance and dense vectors, unfortunately. Ideally, choose a framework that is very flexible but also optimizes well. If you want to try k-means, since it is a simple (and thus fast) and well-established algorithm, I believe there is a variant called "convex k-means" that is applicable to cosine distance and sparse TF-IDF vectors.
Classic "hierarchical clustering" (apart from being outdated and not performing very well) is usually a problem due to the O(n^3) complexity of most algorithms and implementations. There are some specialized cases where an O(n^2) algorithm is known (SLINK, CLINK), but often the toolboxes only offer the naive cubic-time implementation (including GNU R, Matlab and SciPy, from what I just googled). Plus, they often have only a limited choice of distance functions available, probably not including cosine.
The methods are, however, often easy enough to implement yourself, in an optimized way for your actual use case.
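(A sketch of the cosine measure recommended above, in Pascal with invented toy vectors: each sparse TF-IDF vector is stored as (term id, weight) pairs sorted by term id, so the dot product is a linear-time merge.)

program SparseCosine;
{ Sketch: cosine similarity between two sparse tf-idf vectors, each
  stored as an array of (term id, weight) entries sorted by term id. }

{$APPTYPE CONSOLE}

type
  TEntry = record
    Term: Integer;
    Weight: Double;
  end;
  TSparseVec = array of TEntry;

function Cosine(const A, B: TSparseVec): Double;
var
  i, j: Integer;
  Dot, NA, NB: Double;
begin
  Dot := 0; NA := 0; NB := 0;
  for i := 0 to High(A) do NA := NA + Sqr(A[i].Weight);
  for j := 0 to High(B) do NB := NB + Sqr(B[j].Weight);
  i := 0; j := 0;
  while (i <= High(A)) and (j <= High(B)) do { merge on sorted term ids }
    if A[i].Term = B[j].Term then
    begin
      Dot := Dot + A[i].Weight * B[j].Weight;
      Inc(i); Inc(j);
    end
    else if A[i].Term < B[j].Term then
      Inc(i)
    else
      Inc(j);
  if (NA = 0) or (NB = 0) then
    Result := 0
  else
    Result := Dot / (Sqrt(NA) * Sqrt(NB));
end;

var
  U, V: TSparseVec;
begin
  { Two toy documents sharing only term 1. }
  SetLength(U, 2); U[0].Term := 1; U[0].Weight := 0.5;  U[1].Term := 7; U[1].Weight := 1.0;
  SetLength(V, 2); V[0].Term := 1; V[0].Weight := 0.25; V[1].Term := 9; V[1].Weight := 2.0;
  WriteLn(Cosine(U, V):0:4);
end.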
These two research papers published by Google and Yahoo respectively go into detail on algorithms for clustering similar URLs:
http://www.google.com/patents/US20080010291
http://research.yahoo.com/files/fr339-blanco.pdf
I have a set of X-Y values (i.e. a scatter plot) and I want a Pascal routine to generate the coefficients of a Nth order polynomial that fits those points, in the same way that Excel does.
I used David J Taylor's Polyfit example (curvefit.zip), which implements a least-squares curve fitting algorithm (also known as linear regression). David's site is here, but keep reading, because my version is better (see below).
The algorithms David uses come from a book of scientific math for Pascal programmers: Allen Miller's curve fitting routine from "Pascal Programs For Scientists And Engineers", typed and submitted to MTPUG in Oct. 1982 by Juergen Loewner, and corrected and adapted for Turbo Pascal by Jeff Weiss.
You can grab curvefit.zip directly from Bitbucket here. (You can clone the source code with Mercurial/TortoiseHG, or download a ZIP from Bitbucket.)
hg clone https://bitbucket.org/wpostma/curvefit curvefit
It runs in any Delphi version from 5 up, Unicode or not, even Delphi 10 Berlin. It has a little chart in the demo, added by me.
I also added a way to force the result through the origin, a common technique when you want a best fit on all terms other than the constant term, which is forced either to zero or to some experimentally derived average. A forced "blank subtraction", set equal to the average of a series of analytical "zero samples", is common in certain types of analytical chemistry with certain kinds of instrumentation, and in other scientific cases where it can be more useful than a plain best fit, because you may wish to minimize error around the origin rather than across the part of the curve farthest from the origin.
I should also clarify that, for purposes of linear regression, a "curve" may also be a line, which is the case I needed for analytical chemistry purposes; the equation of a straight line (y = mx + b) is also called the "calibration curve". A first-order curve fit is a line (y = mx + b); a second-order curve fit is a parabola (y = nx^2 + mx + b). As you might guess, the algorithm scales from first order up to any order you might wish, though I haven't tested it above 8 terms. A sketch of the first-order case follows below.
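(To make the first-order case concrete, here is a sketch of mine, not the curvefit.zip code: the closed-form least-squares line, plus the forced-through-origin variant described above. The sample data is invented.)

program CalibrationLine;
{ Sketch: closed-form least squares for the first-order calibration
  line y = m*x + b, plus a variant with b forced to zero. }

{$APPTYPE CONSOLE}

uses
  SysUtils;

procedure FitLine(const X, Y: array of Double; out M, B: Double);
var
  i, N: Integer;
  SumX, SumY, SumXY, SumXX: Double;
begin
  N := Length(X);
  SumX := 0; SumY := 0; SumXY := 0; SumXX := 0;
  for i := 0 to N - 1 do
  begin
    SumX := SumX + X[i];
    SumY := SumY + Y[i];
    SumXY := SumXY + X[i] * Y[i];
    SumXX := SumXX + X[i] * X[i];
  end;
  M := (N * SumXY - SumX * SumY) / (N * SumXX - SumX * SumX);
  B := (SumY - M * SumX) / N;
end;

{ Variant forced through the origin: minimises the same error with b = 0. }
procedure FitLineThroughOrigin(const X, Y: array of Double; out M: Double);
var
  i: Integer;
  SumXY, SumXX: Double;
begin
  SumXY := 0; SumXX := 0;
  for i := 0 to High(X) do
  begin
    SumXY := SumXY + X[i] * Y[i];
    SumXX := SumXX + X[i] * X[i];
  end;
  M := SumXY / SumXX;
end;

var
  M, B: Double;
begin
  FitLine([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8], M, B);
  WriteLn(Format('y = %.3f*x + %.3f', [M, B]));
  FitLineThroughOrigin([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8], M);
  WriteLn(Format('through origin: y = %.3f*x', [M]));
end.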
Bitbucket project link:
https://bitbucket.org/wpostma/curvefit/overview
Try TPMath http://tpmath.sourceforge.net/ - I've been using this for years for fitting a hill regression and can recommend it.
Check the functions in TurboPower's SysTools library, which is now open source; it includes math functions in the StStat unit.
Even though you've already awarded an answer, for completeness, I thought I'd add this:
We use SDL Components' Math pack and have been very happy with it.
http://www.lohninger.com/delfcomp.html
It's well thought out, and does exactly what we need.
He's got a variety of other interesting tools on his site.
XlXtrFun is the best curve fitting tool I know and use, but it is for Excel:
http://www.xlxtrfun.com/XlXtrFun/XlXtrFun.htm