I have gone through numerous online sources on the lambda calculus, searching for the difference between beta reduction and single-step beta reduction. All I know so far is that beta reduction is defined as:
(λx.L) M → {M/x}L
and the following definition of one-step beta reduction:
Can someone please clarify the difference between these two notions with an example? They seem equivalent to me. There is also n-step beta reduction, which I understand as single-step beta reduction applied inductively. But since the difference between beta reduction and single-step beta reduction is not clear to me, I am stuck. Thanks in advance.
I would think that "beta reduction" can designate both single-step and multi-step beta reductions.
For example, I can say that beta reduction can yield λz.a from (λx.λy.λz.x) a b, but I cannot say that a single-step beta reduction can do that, since it takes two steps.
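Written out, that example takes two single steps, each contracting exactly one redex:

```latex
(\lambda x.\,\lambda y.\,\lambda z.\,x)\; a\; b
  \;\to_\beta\; (\lambda y.\,\lambda z.\,a)\; b   % one step: substitute a for x
  \;\to_\beta\; \lambda z.\,a                     % one step: substitute b for y, which does not occur in the body
```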
The rest of what you said is correct.
I have a sample with 10,000 observations, and I would like to test the normality of the distribution of the variables in this sample in order to work with Z-scores. The Shapiro-Wilk and Kolmogorov-Smirnov tests seem to reach their limits on such a large sample. I have drawn QQ plots, but I wonder whether they are sufficient?
Thanks for your answers !
Claire
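For what it's worth, here is a minimal Python sketch of that workflow; the simulated data and the array name x are placeholders, not Claire's actual sample:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)  # placeholder for the real sample

# Graphical check: QQ plot against the normal distribution.
stats.probplot(x, dist="norm", plot=plt)
plt.show()

# Formal tests still run at this size, but with n = 10,000 they flag
# tiny, practically irrelevant deviations as "significant", which is
# why the QQ plot is often the more informative check here.
stat, p = stats.kstest((x - x.mean()) / x.std(ddof=1), "norm")
print(f"KS statistic = {stat:.4f}, p = {p:.4f}")
```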
I watched the machine learning lecture videos of course 10-701 (2011) by Tom Mitchell at CMU. He was teaching maximum likelihood estimation when he used the Beta distribution as the prior on theta, and I wonder why he chose that one in particular.
In this lecture, Prof. Mitchell gives an example of flipping a coin and estimating its fairness, i.e. the probability of heads, theta. He reasonably chose a binomial distribution for this experiment.
The reason to choose the Beta distribution as the prior is to simplify the math when computing the posterior. This works well because the Beta distribution is a conjugate prior for the binomial; at the very end of the same lecture, the professor mentions this. That doesn't mean one couldn't use some other prior, e.g. normal, Poisson, etc., but other priors lead to complicated posterior distributions that are hard to optimize, integrate, and so on.
This is a general principle: prefer a conjugate prior over more complex distributions, even if it doesn't fit the data exactly, because the math is simpler.
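To make the conjugacy concrete, here is a minimal sketch; the Beta(2, 2) prior and the 7-heads-in-10-flips data are made-up numbers, not from the lecture:

```python
from scipy import stats

alpha, beta = 2, 2    # hypothetical Beta(2, 2) prior on theta
heads, flips = 7, 10  # hypothetical observed coin flips

# Conjugacy: Beta prior x binomial likelihood => Beta posterior.
# The update is just adding the observed counts to the prior parameters.
posterior = stats.beta(alpha + heads, beta + flips - heads)

print(posterior.mean())  # (alpha + heads) / (alpha + beta + flips) = 9/14
```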
I am using Azure Machine Learning to build a model which will predict if a project will be approved (1) or not (0).
My dataset is composed of a list of projects. Each line represents a project and its details - starting day, theme, author, place, people involved, stage, date of last stage and approved.
There are 15 sequential stages a project can pass through before being approved. However, in some special cases a project can be approved midway, that is, before reaching the last stage, which is the most common case.
I will be receiving daily updates on some projects, as well as new projects coming in. I am trying to build a model which will predict the probability of a project being approved based on my inputs (which will include stage).
I want to use stage as an input, but if I use it with a two-class boosted decision tree, it will indirectly give the answer away to my model.
I've read a little bit about HMMs and tried to learn how to apply one to my model, but I did not understand how. Could anyone guide me down the right path, please? Should I really use an HMM?
Rather than stage, I would recommend using the duration in the last stage, the duration in stage − 1, the duration in stage − 2, and so on.
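A minimal pandas sketch of that idea; the event log and the column names (project_id, stage, entered_at) are hypothetical, since the question doesn't show the actual schema:

```python
import pandas as pd

# Hypothetical event log: one row per (project, stage) transition.
log = pd.DataFrame({
    "project_id": [1, 1, 1, 2, 2],
    "stage":      [1, 2, 3, 1, 2],
    "entered_at": pd.to_datetime(
        ["2024-01-01", "2024-01-05", "2024-01-20", "2024-02-01", "2024-02-02"]),
})

log = log.sort_values(["project_id", "stage"])
# Time spent in each stage = time until the next stage was entered
# (NaN for the stage a project is currently in).
log["duration_days"] = (
    log.groupby("project_id")["entered_at"].shift(-1) - log["entered_at"]
).dt.days

# One row per project, one column per stage: usable as model features
# without handing the model the raw stage number.
features = log.pivot(index="project_id", columns="stage", values="duration_days")
print(features)
```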
Recently I came across this term, but I really have no idea what it refers to. I've searched online, but with little gain.
Thanks.
Take a sample of the time of day that you wake up on Saturdays. Some Friday nights you have a few too many drinks, so you wake up early (but go back to bed). Other days you wake up at a normal time. Other days you sleep in.
Here are the results:
[3.1, 4.8, 6.3, 6.4, 6.6, 7.3, 7.5, 7.7, 7.9, 10.1]
What is the mean time that you wake up?
Well it's 6.8 (o'clock, or 6:48). A touch early for me.
How good a prediction is this of when you'll wake up next Saturday? Can you quantify how wrong you are likely to be?
It's a pretty small sample, and we're not sure of the distribution of the underlying process, so it might not be a good idea to use standard parametric statistical techniques†.
Why don't we take a random sample of our sample (with replacement, and of the same size), calculate the mean, and repeat? This will give us an estimate of how bad our estimate is.
I did this several times, and the resampled means ranged from 5.98 to 7.8.
This is called the bootstrap, and it was introduced by Bradley Efron in 1979.
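Here is a minimal NumPy sketch of that resampling loop (the 10,000 resamples and the seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)
wake = np.array([3.1, 4.8, 6.3, 6.4, 6.6, 7.3, 7.5, 7.7, 7.9, 10.1])

# Bootstrap: resample the data with replacement, at the original size,
# and record the mean of each resample.
boot_means = np.array([
    rng.choice(wake, size=wake.size, replace=True).mean()
    for _ in range(10_000)
])

print(wake.mean())                             # 6.77
print(np.percentile(boot_means, [2.5, 97.5]))  # rough 95% interval for the mean
```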
A variant is called the jackknife, where you take all but one observation of your dataset, compute the mean, and repeat, leaving out each observation in turn. The jackknife mean is 6.8 (the same as the arithmetic mean), and the leave-one-out means range from 6.4 to 7.2.
Another variant is called k-fold cross-validation, where you (at random) split your data set into k equally sized sections, calculate the mean of all but one section, and repeat k times. The 5-fold cross-validation mean is 6.8, and the held-out fold means range from 4 to 9.
† This distribution does happen to be normal. The 95% confidence interval of the mean is 5.43 to 8.11, reasonably close to, but wider than, the bootstrap range.
If you don't have enough data to train your algorithm, you can increase the size of your training set by uniformly randomly selecting items (with replacement) and duplicating them.
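A small sketch with scikit-learn's resample; the toy arrays X and y stand in for a real training set:

```python
import numpy as np
from sklearn.utils import resample

# Toy data standing in for a real training set.
X = np.arange(10).reshape(5, 2)
y = np.array([0, 1, 0, 1, 1])

# Enlarge the training set by sampling uniformly at random with replacement.
X_big, y_big = resample(X, y, replace=True, n_samples=2 * len(X), random_state=0)
print(X_big.shape, y_big.shape)  # (10, 2) (10,)
```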
In machine learning, bootstrapping is iterative training on a known set; see http://en.wikipedia.org/wiki/Bootstrapping_(machine_learning).
What can we do with qubits that we can't do with normal bits, and how do they work? I read about them some time ago, and it appears that qubits can store not just 0 or 1 but also 0 and 1 at the same time. I don't really understand how they work. Can someone please explain this to me?
What are their pros and cons, and what impact will they have on programming languages like C once quantum computers are actually built?
How would we manage memory when a bit (a qubit) can take multiple values at once? How can we determine whether something is true or false when there is more than just 1 and 0?
Any "classical" (as it will be called once the technology is in wider use) problem which is solved by "classical" code can be solved using some sort of quantum processor by transforming the problem. For example, to do a database search, instead of using an index-based search/binary search, or a linear search for an unsorted database, you can use Grover's algorithm. Also, to take a step back from the previous poster's mention of BQP problems, problems with a classical "solution" that runs in NP-time can be sped up considerably by Grover's algorithm (a speedup in the time to search through every possible solution). RSA cryptography is also made much more insecure by the advent of Shor's algorithm, since it makes factorising large numbers into their prime factors (the hinge upon which RSA sits) solvable in logarithmic time.
EDIT: Shor's algorithm actually runs in O((log N)^3), which is polynomial-over-logarithmic time.
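To give a feel for Grover's algorithm, here is a tiny classical simulation of its state vector; this is purely illustrative, since a real quantum computer applies these operations as gates rather than as explicit vector arithmetic:

```python
import numpy as np

n = 4
N = 2 ** n   # search space of N = 16 items, representable on n = 4 qubits
marked = 11  # arbitrary index we are searching for

state = np.full(N, 1 / np.sqrt(N))         # uniform superposition
iterations = int(np.pi / 4 * np.sqrt(N))   # ~sqrt(N) Grover iterations

for _ in range(iterations):
    state[marked] *= -1                    # oracle: flip the marked amplitude
    state = 2 * state.mean() - state       # diffusion: invert about the mean

# After only 3 iterations the marked item carries ~96% of the probability,
# versus checking the 16 items one by one classically.
print(np.argmax(state ** 2), state[marked] ** 2)
```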
The conclusion of this sort of thing is that pre-existing programming languages like C will not be usable as-is on a quantum computer, due to the nature of quantum algorithms (applying certain operations to quantum states), unless someone invents a way to map quantum gates onto logical gates (EDIT: this has apparently been mostly addressed here), in which case about all we get is a very, very fast logical processor when using languages like C.
PS: I'm sure there'll be OpenGL bindings for quantum computing eventually :P
If we can make a working quantum computer (still an open question) then it can efficiently solve certain algorithmic problems that (we think) a classical computer cannot efficiently solve. These are the problems in the complexity class BQP but not in P. One big one is integer factorization. As Will A mentioned, if you can factor enormous integers quickly, you can break a lot of modern ciphers.
The catch is that nobody knows for sure if BQP is actually "bigger" than P — it might be that anything a quantum computer can do quickly, so can a classical computer.
We also don't know if BQP is as big as NP — for instance, nobody has found an efficient way to solve the Traveling Salesman Problem on a quantum computer. This is a common misconception about quantum computers. They might be able to solve NP-complete problems quickly, and then again they might not. Nobody knows.
http://scottaaronson.com/blog/?p=208 be good readin' on this topic (as is the rest of the blog).
Regarding what can be solved with quantum computers: a quantum computer would break current asymmetric encryption schemes. It is a common misconception that quantum computers can solve most optimization problems. They cannot. See this article for more details on what can and cannot be solved using quantum computers.
Qubits don't store 0 and 1 as two separate values; a qubit's state is a superposition of 0 and 1. So while a normal bit represents either 0 or 1 at any given time, a qubit holds a weighted combination of both. Three normal bits can store any one of 000, 001, 010, ..., 111, but three qubits can be in a superposition of all eight strings at once. In general, n qubits are described by 2^n amplitudes simultaneously, although measuring them still yields only a single n-bit outcome.
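In standard notation (my addition, not part of the original answer), a single qubit's state and the general n-qubit state look like this:

```latex
% One qubit: a superposition of |0> and |1> with complex amplitudes.
|\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle,
\qquad |\alpha|^2 + |\beta|^2 = 1
% n qubits: one amplitude per classical n-bit string, 2^n in total,
% though a measurement still returns just one of those strings.
|\psi\rangle = \sum_{x \in \{0,1\}^n} \alpha_x\,|x\rangle,
\qquad \sum_{x} |\alpha_x|^2 = 1
```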
One physical realization of a qubit is the spin of an electron, which behaves like a tiny magnetic dipole; manipulating and reading out such spins is one of the candidate routes to quantum information processing, which many see as the future of computing.