Intelligent code completion? Is there an AI that writes code by learning? - machine-learning

I am asking this question because I know there are a lot of well-read CS types on here who can give a clear answer.
I am wondering if such an AI exists (or is being researched/developed) that writes programs by generating and compiling code all on its own and then progresses by learning from former iterations. I am talking about working to make us, programmers, obsolete. I'm imagining something that learns what works and what doesn't in a programming language by trial and error.
I know this sounds pie-in-the-sky so I'm asking to find out what's been done, if anything.
Of course even a human programmer needs inputs and specifications, so such an experiment would have to have carefully defined parameters. For example, if the AI were going to explore different timing functions, that aspect would have to be clearly defined.
But with a sophisticated learning AI I'd be curious to see what it might generate.
I know there are a lot of human qualities computers can't replicate, like our judgement, tastes and prejudices. But my imagination likes the idea of a program that spits out a web site after a day of thinking and lets me see what it came up with. Even then I would often expect it to be garbage, but maybe once a day I could give it feedback and help it learn.
Another avenue of this thought is that it would be nice to give a high-level description like "menued website" or "image tools" and have it generate code with enough depth to be useful as a code-completion module for me to then code in the details. But I suppose that could be envisioned as a non-intelligent, static, hierarchical code-completion scheme.
How about it?

Such tools exist. They are the subject of a discipline called Genetic Programming. How you evaluate their success depends on the scope of their application.
They have been extremely successful (orders of magnitude more efficient than humans) at designing optimal programs for the management of industrial processes, automated medical diagnosis, or integrated circuit design. Those processes are well constrained, with an explicit and immutable success measure and a great amount of "universe knowledge", that is, a large set of rules about what is a valid, working program and what is not.
They have been totally useless for building mainstream programs that require user interaction, because the main thing a learning system needs is an explicit "fitness function", i.e. an evaluation of the quality of the current solution it has come up with.
Another domain that deals with "program learning" is Inductive Logic Programming, although it is more often used for automatic demonstration or language/taxonomy learning.

Disclaimer: I am not a native English speaker nor an expert in the field; I am an amateur - expect imprecisions and/or errors in what follows. So, in the spirit of stackoverflow, don't be afraid to correct and improve my prose and/or my content. Note also that this is not a complete survey of automatic programming techniques (code generation (CG) from Model-Driven Architectures (MDAs) merits at least a passing mention).
I want to add more to what Varkhan answered (which is essentially correct).
The Genetic Programming (GP) approach to Automatic Programming conflates, with its fitness functions, two different problems ("self-compilation" is conceptually a no-brainer):
self-improvement/adaptation - of the synthesized program and, if so desired, of the synthesizer itself; and
program synthesis.
w.r.t. self-improvement/adaptation, refer to Jürgen Schmidhuber's Goedel machines: self-referential universal problem solvers making provably optimal self-improvements. (As a side note: his work on artificial curiosity is also interesting.) Also relevant for this discussion are Autonomic Systems.
w.r.t. program synthesis, I think it is possible to identify 3 main branches: stochastic (probabilistic - like the above-mentioned GP), inductive and deductive.
GP is essentially stochastic because it explores the space of candidate programs with heuristics such as crossover, random mutation, gene duplication, gene deletion, etc. (it then tests the programs with the fitness function and lets the fittest survive and reproduce).
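To make that loop concrete, here is a minimal, mutation-only sketch (my own illustration, not taken from any particular GP system; crossover is left out for brevity): it evolves arithmetic expression trees toward a fixed target function, with squared error on a few sample points as the fitness function.

```python
# Minimal GP-style sketch: evolve expression trees toward TARGET by mutation + selection.
import random

TARGET = lambda x: x * x + x + 1          # the "specification" that the fitness encodes
SAMPLES = [float(i) for i in range(-5, 6)]
OPS = ["+", "-", "*"]

def random_expr(depth=3):
    """Grow a random expression tree: ('op', left, right), 'x', or an integer constant."""
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else random.randint(-5, 5)
    return (random.choice(OPS), random_expr(depth - 1), random_expr(depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, a, b = expr
    a, b = evaluate(a, x), evaluate(b, x)
    return a + b if op == "+" else a - b if op == "-" else a * b

def fitness(expr):
    """Lower is better: squared error against the target on the sample points."""
    return sum((evaluate(expr, x) - TARGET(x)) ** 2 for x in SAMPLES)

def mutate(expr):
    """Replace a random subtree with a freshly grown one."""
    if not isinstance(expr, tuple) or random.random() < 0.3:
        return random_expr(2)
    op, a, b = expr
    return (op, mutate(a), b) if random.random() < 0.5 else (op, a, mutate(b))

population = [random_expr() for _ in range(200)]
for generation in range(50):
    population.sort(key=fitness)                     # selection pressure: rank by fitness
    if fitness(population[0]) == 0:
        break
    survivors = population[:50]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(150)]

best = min(population, key=fitness)
print("best program:", best, "error:", fitness(best))
```

Note how everything interesting is hidden in the fitness function: as Varkhan says, when no such explicit quality measure exists, this whole machinery has nothing to optimize.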
Inductive program synthesis is usually known as Inductive Programming (IP), of which Inductive Logic Programming (ILP) is a sub-field. That is, in general the technique is not limited to the synthesis of logic programs or to synthesizers written in a logic programming language (nor are the two limited to "...automatic demonstration or language/taxonomy learning").
IP is often deterministic (but there are exceptions): it starts from an incomplete specification (such as example input/output pairs) and uses it either to constrain the search space of candidate programs satisfying that specification, which are then tested (generate-and-test approach), or to directly synthesize a program by detecting recurrences in the given examples, which are then generalized (data-driven or analytical approach). The process as a whole is essentially statistical induction/inference - i.e. choosing what to include in the incomplete specification is akin to random sampling.
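As a toy illustration of the generate-and-test flavour (my own sketch with invented primitives, not an example from the IP literature), the following enumerates short compositions of primitive functions and returns the first one consistent with the given input/output examples:

```python
# Toy generate-and-test synthesis: search compositions of primitives against I/O examples.
from itertools import product

PRIMITIVES = {
    "inc":    lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg":    lambda x: -x,
}

def synthesize(examples, max_length=3):
    """examples: list of (input, expected_output) pairs."""
    for length in range(1, max_length + 1):
        for names in product(PRIMITIVES, repeat=length):
            def program(x, names=names):
                for name in names:          # apply the primitives left to right
                    x = PRIMITIVES[name](x)
                return x
            if all(program(i) == o for i, o in examples):
                return names                # an incomplete spec may admit many programs
    return None

# f(x) = (x + 1) ** 2, described only by three examples:
print(synthesize([(1, 4), (2, 9), (5, 36)]))   # ('inc', 'square')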
Generate-and-test and data-driven/analytical§ approaches can be quite fast, so both are promising (even if only small synthesized programs have been demonstrated publicly so far), but generate-and-test (like GP) is embarrassingly parallel, so notable improvements (scaling to realistic program sizes) can be expected. Note, however, that Incremental Inductive Programming (IIP)§, which is inherently sequential, has been demonstrated to be orders of magnitude more effective than non-incremental approaches.
§ These links point directly to PDF files: sorry, I was unable to find an abstract page.
Programming by Demonstration (PbD) and Programming by Example (PbE) are end-user development techniques known for leveraging inductive program synthesis in practice.
Deductive program synthesis starts with a (presumed) complete (formal) specification (logic conditions) instead. One of these techniques leverages automated theorem provers: to synthesize a program, it constructs a proof of the existence of an object meeting the specification; then, via the Curry-Howard-de Bruijn isomorphism (proofs-as-programs correspondence and formulae-as-types correspondence), it extracts a program from the proof. Other variants include the use of constraint solving and the deductive composition of subroutine libraries.
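A tiny, hedged illustration of the proofs-as-programs idea (written here in Lean 4 purely by way of example; real deductive-synthesis systems are far more elaborate): a constructive existence proof carries an explicit witness, and that witness is the extracted "program".

```lean
-- A constructive proof that every natural number has a strictly larger one.
-- The witness, n + 1, is exactly the program that a proof-extraction step would yield.
theorem exists_larger : ∀ n : Nat, ∃ m : Nat, n < m :=
  fun n => ⟨n + 1, Nat.lt_succ_self n⟩

-- The extracted witness, written out as a runnable function.
def larger (n : Nat) : Nat := n + 1

#eval larger 41   -- 42
```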
In my opinion inductive and deductive synthesis are in practice attacking the same problem from two somewhat different angles, because what constitutes a complete specification is debatable (besides, a complete specification today can become incomplete tomorrow - the world is not static).
When (if) these techniques (self-improvement/adaptation and program synthesis) mature, they promise to raise the amount of automation provided by declarative programming (whether such a setting should still be considered "programming" is sometimes debated): we will concentrate more on Domain Engineering and Requirements Analysis and Engineering than on manual software design and development, manual debugging, manual system performance tuning and so on (possibly with less accidental complexity than that introduced by current manual, non-self-improving/adapting techniques). This will also promote a level of agility yet to be demonstrated by current techniques.

Related

Does Lean enhance proof surveyability?

By proof surveyability I mean that a human user can "trace" all the details of a proof. Some things are not easily traceable. For instance, an SMT proof is based on specific heuristics that are then translated into the prover. In such situations, it may be useful to have easy mechanisms (ones you don't need to be an expert to have at your disposal) to examine why a proof failed or to inspect the internal structures of the proof procedure.
I was wondering if Lean enhances this kind of proof surveyability in contrast to Coq or Isabelle. I get the impression that this may be the case from skimming through A Metaprogramming Framework for Formal Verification.
If I understand proof-surveyability or -traceability correctly, then by definition, a fully detailed proof is "100% traceable", whereas just stating the result (e.g. a lemma) is "0% traceable".
In that case, I don't see why Lean would improve over Coq or Isabelle, or any other tool whose core purpose is to check correctness of a fully detailed proof. Such tools often provide means to increase automation, which is convenient, but arguably reduces traceability, depending on how the additional proof steps are represented. E.g. a Coq-like tactic can increase automation, but traceability can be "recovered" because the steps the tactic infers can be represented in the same way the explicitly provided steps are represented: as proof rule applications or deduction steps.
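As a minimal illustration of that last point (my own sketch, not an example from the paper linked in the question): in Lean 4 a tactic such as `decide` infers the proof automatically, yet the elaborated proof term and its axiom dependencies remain available for inspection.

```lean
-- The tactic hides the individual steps, but the proof term it elaborated to
-- can be recovered, so the automated part stays traceable in principle.
theorem pow_fact : 2 ^ 10 = 1024 := by
  decide

#print pow_fact          -- the proof term produced by `decide`
#print axioms pow_fact   -- the axioms the proof ultimately depends on
```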
This recovery is difficult for SMT-inferred proof steps: SMT solvers can achieve a much higher degree of automation than proof checkers such as Coq, but at the expense of traceability, because their "reasoning" is much more technical and less human-like/deductive.
As a side remark: this difference between proof checkers and SMT solvers reminds me of the difference between classical and AI-based image recognition. The former is less automated/efficient, but easier to trace/explain.

TML (Tractable Markov Logic) is a wonderful model! Why haven't I seen it used in a wide range of AI application scenarios?

I have been reading papers about Markov models, and suddenly a great extension, TML (Tractable Markov Logic), came out.
It is a subset of Markov logic, and uses probabilistic class and part hierarchies to control complexity.
This model has both complex logical structure and uncertainty.
It can represent objects, classes, and relations between objects, subject to certain restrictions which ensure that any model built in TML can be queried efficiently.
I am just wondering why such a good idea has not spread widely into application areas like activity analysis.
More info
My understanding is that inference in TML is polynomial in the size of the model, but the model has to be compiled for a given problem and the compiled form may become exponentially large. So, in the end, it's still not really tractable.
However, it may be advantageous to use it when the compiled form will be used multiple times, because then the compilation is done only once for many queries. Also, once you obtain the compiled form, you know what to expect in terms of run-time.
However, I think the main reason you don't see TML being used more broadly is that it is just an academic idea. There is no robust, general-purpose system based on it. If you try to work on a real problem with it, you will probably find out that it lacks certain practical features. For example, there is no way to represent a normal distribution with it, and lots of problems involve normal distributions. In such cases, one may still use the ideas behind the TML paper but would have to create their own implementation that includes further features needed for the problem at hand. This is a general problem that applies to lots and lots of academic ideas. Only a few become really useful and the basis of practical systems. Most of them exert influence at the level of ideas only.

Ordering movie tickets with ChatBot

My question is related to a project I've just started working on: a chatbot.
The bot I want to build has a pretty simple task. It has to automate the process of purchasing movie tickets. This is a pretty closed domain and the bot has all the required access to the cinema database. Of course it is okay for the bot to answer "I don't know" if the user's message is not related to the process of ordering movie tickets.
I already created a simple demo just to show it to a few people and see if they are interested in such a product. The demo uses a simple DFA approach and some easy text matching with stemming. I hacked it together in a day, and it turned out that users were impressed that they were able to successfully order the tickets they wanted. (The demo uses a connection to the cinema database to provide users with all the information needed to order the tickets they desire.)
My current goal is to create the next version, a more advanced one, especially in terms of Natural Language Understanding. For example, the demo version asks users to provide only one piece of information per message, and doesn't recognize when they provide more relevant information in one go (movie title and time, for example). I read that a useful technique here is called "frame and slot semantics", and it seems promising, but I haven't found any details about how to use this approach.
Moreover, I don't know which approach is best for improving the Natural Language Understanding. For the most part, I am considering:
Using "standard" NLP techniques to understand user messages better: for example, synonym databases, spelling correction, part-of-speech tags, training some statistics-based classifiers to capture similarities and other relations between words (or between whole sentences, if that's possible?), etc.
Using AIML to model the conversation flow. I'm not sure if it's a good idea to use AIML in such a closed domain. I've never used it, so that's the reason I'm asking.
Using a more "modern" approach and training a neural-network classifier for user message classification. It might, however, require a lot of labeled data.
Any other method I don’t know about?
Which approach is the most suitable for my goal?
Do you know where I can find more resources about how "frame and slot semantics" works in detail? I'm referring to this PDF from Stanford when talking about the frame and slot approach.
The question is pretty broad, but here are some thoughts and practical advice, based on experience with NLP and text-based machine learning in similar problem domains.
I'm assuming that although this is a "more advanced" version of your chatbot, the scope of work which can feasibly go into it is quite limited. In my opinion this is a very important factor as different methods widely differ in the amount and type of manual effort needed to make them work, and state-of-the-art techniques might be largely out of reach here.
Generally the two main approaches to consider would be rule-based and statistical. The first is traditionally more focused on pattern matching, and in the setting you describe (limited effort can be invested) it would involve manually crafting rules and/or patterns. An example of this approach would be using a closed (but large) set of templates to match against user input (e.g. using regular expressions). This approach often has a "glass ceiling" in terms of performance, but can lead to pretty good results relatively quickly.
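To make this concrete, and to tie it to the "frame and slot" idea from your question, here is a hedged sketch with invented patterns and slot names: a handful of regular-expression templates fill a frame of slots (movie, time, number of tickets) from a single free-text message, and a trivial policy asks for whichever slot is still missing.

```python
# Rule/template sketch: regex-based slot filling for a movie-ticket frame.
import re

SLOT_PATTERNS = {
    "movie":   re.compile(r"(?:see|watch|tickets? for)\s+(?P<value>[A-Z][\w: ]+?)"
                          r"(?=\s+(?:at|on|tomorrow|tonight)\b|[.!?]|$)"),
    "time":    re.compile(r"\b(?P<value>\d{1,2}(?::\d{2})?\s*(?:am|pm))\b", re.IGNORECASE),
    "tickets": re.compile(r"\b(?P<value>\d+)\s+tickets?\b", re.IGNORECASE),
}

def fill_frame(message, frame=None):
    """Update a partially filled frame with whatever slots the message mentions."""
    frame = dict(frame or {})
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(message)
        if match and slot not in frame:
            frame[slot] = match.group("value").strip()
    return frame

def next_question(frame):
    """Dialogue policy: ask for the first slot that is still missing."""
    prompts = {"movie": "Which movie would you like to see?",
               "time": "What time works for you?",
               "tickets": "How many tickets do you need?"}
    for slot in prompts:
        if slot not in frame:
            return prompts[slot]
    return "Great, I'll book that for you."

frame = fill_frame("I want 2 tickets for Inception at 8pm")
print(frame)              # {'movie': 'Inception', 'time': '8pm', 'tickets': '2'}
print(next_question(frame))
```

The patterns themselves are where the manual effort (and the glass ceiling) lives; the frame and the "ask for the missing slot" policy stay the same however the extraction is done.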
The statistical approach is more about giving some ML algorithm a bunch of data and letting it extract regularities from it, focusing the manual effort in collecting and labeling a good training set. In my opinion, in order to get "good enough" results the amount of data you'll need might be prohibitively large, unless you can come up with a way to easily collect large amounts of at least partially labeled data.
Practically, I would suggest considering a hybrid approach here. Use some ML-based, statistical, general-purpose tools to extract information from user input, then apply manually built rules/templates. For instance, you could use Google's Parsey McParseface to do syntactic parsing, then apply some rule engine on the results, e.g. match the verb against a list of possible actions like "buy", use the extracted grammatical relationships to find candidates for movie names, etc. This should get you to pretty good results quickly, as the strength of the syntactic parser would allow "understanding" even elaborate and potentially confusing sentences.
I would also suggest postponing some of the elements you are thinking about, like spell correction, and even stemming and a synonym DB - since the problem is relatively closed, you'll probably get better ROI from investing in a rule/template framework and manual rule creation. This advice also applies to explicit modeling of the conversation flow.

What is Function Point Analysis?

What does Function Point Analysis mean?
Is it used for cost estimation of software? Or is there a proper definition of Function Point Analysis?
Can you please give me a short description of it?
While I agree with Leo's answer, I'll try a more practical description:
What it is
Function Point Analysis (FPA) is one of the currently five ISO-approved standards for Functional Sizing (see ISO/IEC 14143). FPA is actually the widely used short term for the ISO/IEC 20926 standard titled "IFPUG Functional Size Measurement".
FPA is a means to rate (the term 'measure' is actually misleading) the amount of functional requirements of software. To achieve this rating, a technique is used that was known as 'functional decomposition' in earlier times. This concept is in fact very close to describing requirements with 'use cases', even though the detailed rules and notations are quite different.
In short, the functional requirements are decomposed into 'elementary functions', which then are rated each with a point value. The total of points for all elementary functions is used as an indication of the 'size' or amount of requirements. This is called the 'functional size' expressed in the unit of 'function points' (fp).
The natural representation of a functional decomposition is the functional tree.
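As a simplified, illustrative sketch of the rating step (the weights below are the commonly cited IFPUG weights for unadjusted function points; the component list is invented example data, not a real count):

```python
# weights[function_type][complexity] -> function points contributed by one elementary function
weights = {
    "external_input":        {"low": 3, "average": 4, "high": 6},
    "external_output":       {"low": 4, "average": 5, "high": 7},
    "external_inquiry":      {"low": 3, "average": 4, "high": 6},
    "internal_logical_file": {"low": 7, "average": 10, "high": 15},
    "external_interface":    {"low": 5, "average": 7, "high": 10},
}

# Each elementary function found in the decomposition, with its rated type and complexity.
elementary_functions = [
    ("create customer",  "external_input",        "average"),
    ("monthly report",   "external_output",       "high"),
    ("customer lookup",  "external_inquiry",      "low"),
    ("customer data",    "internal_logical_file", "average"),
]

unadjusted_fp = sum(weights[ftype][cx] for _, ftype, cx in elementary_functions)
print(f"functional size: {unadjusted_fp} fp")   # 4 + 7 + 3 + 10 = 24 fp
```

The real standard defines detailed rules for identifying the elementary functions and rating their complexity; the point here is only that the result is a weighted sum over the decomposition, not a measurement of code.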
The FPA standard also has a set of rules for rating changes to existing applications, so it can be used to rate the functional requirements for the adaptation or extension of existing systems ('enhancements' or 'releases').
What it is not
FPA is not an effort estimation technique by itself. Obviously, the relation between the size of functional requirements and the implementation effort can be and often is rather loose. Function points can be used as (one) input to more complex estimation models (such as COCOMO), which have to take into account all other effort drivers.
FPA is not a 'software metric' - functional size is always related to the user requirements fulfilled by software. While you can count and measure lines of code or code complexity, functional size is the result of an analytical process.
When to use it
FPA can be helpful for estimating the effort of a software project at an early stage, when the requirements are known but the details of implementation have not yet been specified or evaluated. The functional requirements are reflected in the functional size; the non-functional requirements need to be input into an estimation model. You need to have/use a good, proven (and trusted) model, otherwise the functional size is useless for this purpose.
FPA can also help to rate the 'value' of an application in the sense of 'recovery costs'.
Finally, in the context of IT client/vendor relationships, FPA can be used as a basis for pricing: clients are invoiced based on an agreed 'price per fp' instead of an hourly rate.
When not to use it
By definition, FPA requires a basic understanding of the functional requirements. Thus, if you do not have or know the functional requirements, it will be difficult if not impossible to use FPA.
FPA is also not suited to rating the performance of individuals, as it is a rather holistic rating of an application and cannot be used to size only parts of it.
The authoritative answer, from IFPUG:
http://www.ifpug.org/about-ifpug/about-function-point-analysis/
Function Point Analysis (FPA) is a sizing measure of clear business
significance. First made public by Allan Albrecht of IBM in 1979, the
FPA technique quantifies the functions contained within software in
terms that are meaningful to the software users. The measure relates
directly to the business requirements that the software is intended to
address. It can therefore be readily applied across a wide range of
development environments and throughout the life of a development
project, from early requirements definition to full operational use.
Other business measures, such as the productivity of the development
process and the cost per unit to support the software, can also be
readily derived. The function point measure itself is derived in a
number of stages. Using a standardized set of basic criteria, each of
the business functions is assigned a numeric index according to its type and
complexity. These indices are totaled to give an initial measure of
size which is then normalized by incorporating a number of factors
relating to the software as a whole. The end result is a single number
called the Function Point index which measures the size and complexity
of the software product.
In summary, the function point technique provides an objective,
comparative measure that assists in the evaluation, planning,
management and control of software production.
PS: the IFPUG definition is what is taken as authoritative in court here in Brazil when someone has any kind of dispute about function points (mostly because government contracts are usually defined in FPs).

How to test a Machine Learning or statistical NLP algorithm implementation package?

I am working on testing several Machine Learning algorithm implementations, checking whether they work as efficiently as described in the papers and making sure they can bring real power to our statistical NLP (Natural Language Processing) platform.
Could you show me some methods for testing an algorithm implementation?
1) What aspects should be tested?
2) How?
3) Do I have to follow some basic steps?
4) Do I have to consider specific situations that arise when using different programming languages?
5) Do I have to understand the algorithm? I mean, does it help if I really know what the algorithm is and how it works?
Basically, we are using C or C++ to implement the algorithms and our working environment is Linux/Unix. Our testing methods currently focus only on black-box testing and testing the input/output of functions. I am eager to improve them but I don't have any better ideas right now...
Thanks!
For many machine learning and statistical classification tasks, the standard metrics for measuring quality are precision and recall. Most published algorithms will make some kind of claim about these metrics, or you could implement them and run the tests yourself. This should provide a good indicative measure of the quality you can expect.
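If you end up computing the metrics yourself, the core calculation is small enough to live inside your test harness. Here is a minimal sketch with made-up labels (in Python for brevity, even though your implementations are in C/C++):

```python
# Precision and recall for a binary classifier, computed from predicted vs. true labels.
def precision_recall(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0   # of what we predicted, how much was right
    recall    = tp / (tp + fn) if tp + fn else 0.0   # of what was there, how much we found
    return precision, recall

y_true = [1, 1, 0, 1, 0, 0, 1]   # invented example labels
y_pred = [1, 0, 0, 1, 1, 0, 1]
p, r = precision_recall(y_true, y_pred)
print(f"precision={p:.2f} recall={r:.2f}")   # precision=0.75 recall=0.75
```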
When you talk about efficiency of an algorithm, this is usually some statement about the time or space performance of an algorithm in terms of the size or complexity of its input (often expressed in Big O notation). Most published algorithms will report an upper bound on the time and space characteristics of the algorithm. You can use that as a comparative indicator, although you need to know a little bit about computational complexity in order to make sure you're not fooling yourself. You could also possibly derive this information from manual inspection of program code, but it's probably not necessary, because this information is almost always published along with the algorithm.
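The published bounds can also be complemented with a quick empirical check: time the implementation on inputs of growing size and see how the run time scales when the size doubles. A rough sketch (the timed function is just a stand-in for your implementation):

```python
# Empirical scaling check: run the implementation on doubling input sizes and compare times.
import random
import time

def run_algorithm(data):
    """Placeholder for the implementation under test; here, a plain sort."""
    return sorted(data)

for n in [10_000, 20_000, 40_000, 80_000, 160_000]:
    data = [random.random() for _ in range(n)]     # build input outside the timed region
    start = time.perf_counter()
    run_algorithm(data)
    elapsed = time.perf_counter() - start
    print(f"n={n:>7}  time={elapsed:.4f}s")

# Roughly doubling times when n doubles points at (near-)linear growth; a consistent
# quadrupling would hint at quadratic behaviour, and so on.
```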
Finally, understanding the algorithm is always a good idea. It makes it easier to know what you need to do as a user of that algorithm to ensure you're getting the best possible results (and indeed to know whether the results you are getting are sensible or not), and it will allow you to apply quality measures such as those I suggested in the first paragraph of this answer.
