Software development metrics and reporting [closed] - code-coverage

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I've had some interesting conversations recently about software development metrics, in particular how they can be used in a reasonably large organisation to help development teams work better. I know there have been Stack Overflow questions about which metrics are good to use - like this one, but my question is more about which metrics are useful to which stakeholders, and at what level of aggregation.
As an example, my view is that code coverage is a useful metric in the following ways (and maybe others):
For a team's own internal use when combined with other measurements.
For facilitating/enabling/mentoring
teams, where it might be instructive
when considered on a team-by-team
basis as a trend (e.g. if team A and
B have coverage this month of 75 and
50, I'd be more concerned with team A
than B if the previous month they'd
had 80 and 40).
For senior management
when presented as an aggregated
statistic across a number of teams or
a whole department.
But I don't think it's useful for senior management to see this on a team-by-team basis, as this encourages artifical attempts to bolster coverage with tests that merely exercise, rather than test, code.
I'm in an organisation with a couple of levels in its management hierarchy, but where the vast majority of managers are technically minded and able (with many still getting their hands dirty). Some of the development teams are leading the way in driving towards agile development practices, but others lag, and there is now a serious mandate from the top for this to be the way the organisation works. A couple of us are starting a programme to encourage this. In this sort of an organisation, what sort of metrics do you think are useful, to whom, why, and at what level of aggregation?
I don't want people to feel their performance is being assessed based on a metric that they can artificially influence; at the same time, the senior management are going to want some sort of evidence that progress is being made. What advice or caveats can you provide based on experience in your own organisations?
EDIT
We are definitely wanting to use metrics as a tool for organisational improvement not as a tool for individual performance measurement.

A tale from personal experience. Apologies for the length.
A few years ago our development group tried setting "proper" measurable objectives for individuals and team leaders. The experiment lasted for just one year, because hard metrics didn't really work very well for individual objectives (see my question on the subject for some links and further discussion).
Note that I was a team leader, and involved in planning it all with my technical boss and the other team leaders, so the objectives weren't something dictated from on high by clueless upper management -- at the time we really wanted them to work. It is also worth noting that the bonus structure inadvertently encouraged competition between developers. Here are my observations on the things we tried.
Customer-visible issues
In our case, we counted outages on the service we provided to customers. In a shrink-wrapped product it might be the number of bugs reported by customers.
Advantages: This was the only real measure that was visible to upper management. It was also the most objective, being measured outside the development group.
Disadvantages: There weren't that many outages -- just around one per developer for the whole year -- which meant that failing or exceeding the objective was a matter of "pinning blame" for the few outages that did occur in each team. This led to bad feeling and loss of morale.
Amount of work completed
Advantages: This was the only positive measure. Everything else was "we notice when bad things happen," which was demoralising. Its inclusion was also necessary because, without it, a developer who did nothing all year would exceed all the other objectives, which clearly wouldn't be in the interests of the company. Measuring the amount of work completed checked the natural optimism of developers when estimating task size, which was useful.
Disadvantages: The measure of "work completed" was based on estimates provided by the developers themselves (usually a good thing), but making it part of their objectives encouraged gaming of the system to inflate estimates. We had no other viable measure of work completed: I think the only possible valuable way of measuring productivity is "impact on the company bottom line," but most developers are so far removed from direct sales that this is rarely practical at an individual level.
Defects found in new production code
We measured defects introduced into new production code during the year, as it was felt that bugs from previous years should not count against any individual in this year's objectives. Defects spotted by internal quality teams were included in the count even if they didn't impact customers.
Advantages: Surprisingly few. The time lag between the introduction of a defect and its discovery meant that there was really no immediate feedback mechanism to improve code quality. Macro trends at a team level were more useful.
Disadvantages: There was a heavy focus on the negative, since this objective was only invoked when a defect was found and we needed someone to blame for it. Developers were reluctant to record defects they found themselves, and a simple count meant that minor bugs were as bad as severe problems. Since the number of defects per individual was still quite low, the number of minor and severe defects didn't even out as it might with a larger sample. Old defects were not included, so the group's reputation for code quality (based on all bugs found) did not always match the measurable introduced-this-year count.
Timeliness of project delivery
We measured timeliness as the percentage of work delivered to internal QA teams by the stated deadline.
Advantages: Unlike counting defects, this was a measure that was under immediate, direct control of the developers, as they effectively decided when the work was complete. The presence of the objective focused the mind on completing tasks. This helped the team commit to realistic amounts of work, and improved the perception by internal customers of the development group's ability to deliver on promises.
Disadvantages: As the only objective directly under the developers' control, it was maximised at the expense of code quality: on the day of a deadline, given the choice between saying a task is complete or doing further testing to improve confidence in its quality, the developer would choose to mark it complete and hope any resulting bugs never come to the surface.
Complaints from internal customers
To gauge how well developers communicated with internal customers during development and subsequent support of their software, we decided that the number of complaints received about each individual would be recorded. The complaints would be validated by the manager, to avoid any possible vindictiveness.
Advantages: Really nothing I can recall. Measured at a sufficiently large group level it becomes a more useful "customer satisfaction" score.
Disadvantages: Not only highly negative, but also a subjective measure. As with other objectives, the numbers for each individual were around the zero mark, which meant that a single comment about someone could mean the difference between "infinitely exceeded" and "did not meet".
General comments
Bureaucracy: While our task management tools held much of the data for these metrics, there was still quite a lot of manual effort involved to collate it all. The time spent obtaining all the numbers was not enjoyable, generally focused on negative aspects of our work and may not even have been reclaimed by increased productivity.
Morale: For the measures where individuals were blamed for problems, not only did those with "bad" scores feel demotivated, but so did those with "good" scores, as they didn't like the loss in team morale and sometimes felt they were ranked higher not because they were better but because they were luckier.
Summary
So what did we learn from the episode? In later years we tried to re-use some of the ideas but in a "softer" way, where there was less emphasis on individual blame and more on team improvement.
It is impossible to define objectives for individual developers that are objectively measurable, add value to the company and cannot be gamed, so don't bother to try.
Customer issues and defects can be counted at a wider team level, if the location of the defect is unequivocally the responsibility of that team -- that is, you don't ever have to play the "blame game".
Once you measure defects only at the level of responsibility for a code module, you can (and should) measure old bugs as well as new ones, since it is in that group's interest to eliminate all defects.
Measuring defect counts at a group level increases the sample size per group, and so anomalies between minor and severe defects are smoothed out and a simple "number of bugs" measure can mean something, such as to see if you are improving month-on-month.
Include something that upper management care about, because keeping them happy is your primary purpose as a development group. In our case it was customer-visible outages, so even if the measure is sometimes arbitrary or seemingly unfair, if it's what the bosses are measuring then you need take notice too.
Upper management don't need to see metrics they don't have in their own objectives. This way it avoids the temptation to blame individuals for errors.
Measuring timeliness of project delivery did change developer behaviour and put a focus on completing tasks. It improved estimation and allowed the group to make realistic promises. If it were easy to collect the timeliness information then I would consider using it again at a team level to measure improvement over time.
All of this doesn't help when you are required to set measurable objectives for individual developers, but hopefully the ideas will be more useful for team improvement.

The key thing about metrics is knowing what you are using them for. Are you using them as a tool for improvement, a tool for reward, a tool for punishment, etc. It sounds like you're planning to use them as a tool for improvement.
The number one principle when setting metrics is to keep the information relevant so that the person receiving it can use it to make a decision. Most likely a senior manager cannot dictate the micro level of whether you need more tests, less complexity, etc. But a team leader can do that.
Therefore, I don't believe a measure of code coverage is going to be useful to management beyond the individual team. At the macro level, the organisation is probably interested in:
Cost of delivery
Timeliness of delivery
Scope of delivery & external quality
Internal quality won't be high on their list of things to cover off. It's a development team's mission to make it clear that internal quality (maintainability, test coverage, self-documenting code, etc) is a key factor in achieving the other three.
Therefore you should target metrics to more senior managers which cover off those three such as:
Overall Velocity (note that comparing velocity between teams is often artificial)
Expected vs Actual scope delivered to agreed timelines
Number of production defects (possibly per capita)
And measure things like code coverage, code complexity, cut 'n' paste score (code repetition using flay or similar), method length, etc at a team level where the recipients of the information can really make a difference.

A metric is a way of answering a question about a project, team or company. Before you start looking for the answers, you need to decide what questions you want to ask.
Typical questions include:
what is the quality of our code?
is the quality improving or degrading over time?
how productive is the team? Is it improving or degrading?
how effective is our testing?
...and so on.
Each question will require a different set of metrics to answer. Collecting metrics without knowing what questions you want answered is at best a waste of time and at worst counterproductive.
You also need to be aware that there is an 'uncertainty principle' at work - unless you are very careful the act of collecting metrics will change people's behaviour, often in unexpected and sometimes detrimental ways. This is especially so if people believe they are being evaluated on the metrics, or worse still have the metrics tied to some reward or punishment scheme.
I recommend reading Gerald Weinberg's Quality Software Management Vol 2: First Order Measurement. He goes into a lot of detail on software metrics, but says the most important are often what he calls "Zero Order Measurement" - asking people their opinion on how a project is going. All four volumes in the series are expensive and hard to get hold of, but well worth it.

Software writing
What must be optimised?
CPU(s) use, memory(s) use, memory cache(s) use, user time use, code size at run-time, data size at run-time, graphics performance, file access performance, network access performance, bandwidth use, code conciseness and readability, electricity use, (count of) distinct API calls used, (count of) distinct methods and algorithms used, maybe more.
How much must it be optimised?
It must be optimised the minimum reasonable amount (except in areas where surpassing acceptance test criteria is desirable) required to pass acceptance tests, facilitate maintenance, facilitate audit and meet user requirements.
("... for legal/illegal input test data and legal/illegal test events in all test states at all required test data volumes and test request volumes for all current and future test integration scenarios.")
Why the minimum reasonable amount?
Because optimised code is harder to write and so costs more.
What leadership is required?
Coding standards, basic structure, acceptance criteria and guidance on levels of optimisation required.
How can success of software writing be measured?
Cost
Time
Acceptance test passes
Extent to which acceptance tests it is desirable to surpass are surpassed
User approval
Ease of maintenance
Ease of audit
Degree of absence of over-optimisation
What cost/time should be ignored in assessing aggregate performance of programmers?
Wasted cost/time incurred because of requirements (inc architecture) changes
Extra cost/time incurred because of deficiencies in platforms/tools
But this cost/time should be included in assessing aggregate performance of teams (inc architects, managers).
How can success of architects be measured?
Other measures plus:
Instances of "avoiding early" being affected by deficiencies in platforms/tools
Degree of absence of changes in architecture

As I said in What is the fascination with code metrics?, metrics include:
different populations, meaning the scope of interest is not the same for developer or for manager
trends meaning any metrics in itself is meaningless without its associated trend, in order to take the decision to act upon it or to ignore it.
We are using a tool able to provide:
lots of micro-level metrics (interesting for developers), with trends.
lots of rules with multi-level (UI, Data, Code) static analysis capabilities
lots of aggregations rules (meaning those vast number of metrics are condensed in several domains of interests, adequate for higher level of populations)
The result is an analysis which can be drilled-down, from high level aggregation domains (security, architecture, practices, documentation, ...) all the way down to some line of code.
The current feedback is:
project managers can get defensive very quickly when some rules are not respected and make their global note significantly lower.
Each study has to be re-tailored to respect each project quirks.
The benefit is the definition of a contract where exceptions are acknowledged but rules to be respected are defined.
higher levels (IT department, stakeholder) use the global notes just as one element of their evaluation of the progress made.
They will actually look more closely at other elements based on delivery cycles: how often are we able to iterate and put an application into production?, how many errors did we had to solve before that release? (in term of merges, or in term of pre-production environment not correctly setup), what immediate feedbacks are generated by a new release of an application?
So:
which metrics are useful to which stakeholders, and at what level of aggregation
At high level:
the (static analysis) metrics are actually the result of low-level metric aggregations, and organized by domains.
Other metrics (more "operational-oriented", based on the release cycle of the application, and not just on the static analysis of the code) are taken into account
The actual ROI is achieved through other actions (like six-sigma studies)
At lower level:
the static analysis is enough (but has to encompass multi-level tiers applications, with sometimes multi-languages developments)
the actions are piloted by the trends and importance
the study has to be approved/supported by all levels of hierarchy to be accepted/acted upon (in particular, budget for the ensuing refactoring has to be validated)

If you have some Lean background/knowledge, then I would suggest the system that Mary Poppendieck recommends (that I've already mentioned in this previous answer). This system is based on three holistic measurements that must be taken as a package:
Cycle time
From product concept to first release or
From feature request to feature deployment or
From bug detection to resolution
Business Case Realization (without this, everything else is irrelevant)
P&L or
ROI or
Goal of investment
Customer Satisfaction
e.g. Net Promoter Score
The aggregation level is product/project level and I believe that these metrics are helpful for everybody (developers should never forget that they don't write code for fun, they write code to create value and should always keep that in mind).
Teams may (and actually do) use technical metrics to measure quality standards conformance which are integrated in the Definition of Done (as "no increase of the technical debt"). But high quality is not a end in itself, it's just a mean to achieve short cycle time (to be a fast company) which is the real target (with Business Case Realization and Customer Satisfaction).

This is a bit of a side note to the main question, but I had a very similar experience to Paul Stephensons answer above. One thing I would add to that is about collection of data and visibility of metrics.
In our case, the development director was meant to collate a bunch of data from various disparate systems and distribute individual metric results once a month. This often didn't happen, as it was a time consuming job and he was a busy man.
The results of this were:
Unhappy developers, as performance bonuses were based on metrics and people didn't know how they were getting on.
Some time consuming multiple entry of data into various different systems.
If you are going down this route, you need to be sure that all metric data can be collated automatically and is easily visible to those it affects.

One of the interesting approaches that's currently getting some hype is Kanban. It's fairly Agile. What's particularly interesting is that it permits a metric of "work done" to be applied. I havn't used/encountered this in actual practice yet, but I'd like to work towards getting a kanban-ish flow going at my job.

Interestingly I just finished reading PeopleWare, and the authors strongly discourage individual metrics being made visible to superiors (even direct managers), but that aggregate metrics should be very visible.
As far as code specific metrics I think it's good for a team to know the state of the code at the current time, and to know the trends affecting the code as it matures and grows.
The question is obviously not focussed on .NET, but I think the .NET product NDepend has done a lot of work to define and document common metrics that are useful.
The documentation section on metrics is educational reading, even if you're not doing .NET.

Software metrics have been with us for a long time and as best I
can tell nothing to date has emerged individually or in aggregate
that is capable of guiding projects during development. The nut of
the problem is that we want to use objective measures and these
can only measure what has happened,
not what is happening or about to happen.
By the time we have measured, analyzed and interpreted some
series of metrics we are reacting to things that
have already gone wrong, or very occasionally, gone right.
I don't want to underplay the importance of learning from
objective metrics but I do want to
point out that this is a reactive not a pro-active response.
Developing a "confidence index" may be a better way of monitoring
whether project is on-track or headed for trouble. Try
developing a voting system where a reasonable number of
representatives from each project area of interest are asked
to anonymously vote their
confidence from time to. Confidence is voted in two areas:
1) Things are on-track 2) Things will continue to be on-track or get
back on-track.
These are purely subjective measurements from people closest to the
"action".
Feed the results into a Kanban type chart where the
columns represent voting areas and you
should have a pretty good idea where to focus your attention. Use
question 1 to evaluate whether management reacted to the
previous voting cycle appropriately. Use question 2 to identify
where management should focus next.
This idea is based on each of us having a comfort level
within our own area of responsibility. Our confidence level
is a product of experience, knowledge within our
domain of expertise, the number and severity of problems
we are facing, the amount of time we have to accomplish our
tasks, the quality of the information we are working with and
a whole bunch of other factors.
MBWA (Management By Walking Around) is often touted as
one of the most effective tools we have - this is a variation of it.
This technique is not much use at the level of
individual teams because it only reflects the general mood
of the team. Kind of like using someone’s watch to tell them
the time. However, at higher levels of management it should
be quite informative.

Related

Creation of a platform for scientific studies and surveys

I am looking for competent people to create a platform for scientific studies and globalized surveys. I would like each human being to have an identification number (identity verification is important otherwise the study will not be valid) and with this system each person will be able to give their opinion on different issues. The objective is therefore on a large scale, the system will have to be efficient anywhere and with many people connected at the same time.
The data war has already begun. My goal is to counterbalance the power by giving it to humanity. The platform will therefore disturb -> many hacking attacks to be expected and coveted for its precious data that we must protect.
It is a project of the heart, the dimension is not pecuniary even if we will earn money, it is humanistic. Let's stop the mass manipulation, let's put the people with real practical solutions in the spotlight and let's be aware of the real percentages behind an idea.
I'm going on a world tour in a few days, I'd like to take the opportunity to highlight the project around the globe but for that I need the interface to be functional.
I hope you like the idea. I look forward to talking to you about it and finding out how and how quickly it can be done.
Sincerely Louise

Using Fortify Tool in project life cycle

I ran fortify SCA on my source code which is already developed found several issues..How should I go about fixing these issues? What approach should I take? Because while I start fixing the existing vulnerabilities, new vulnerabilities might come up.
If I don't have a proper approach to this ,I might spend a lot of going around in circles.
Please suggest a viable approach I should take.
TLDR: Start with the scariest vulnerability and work your way down the list of horrors.
Fortify provides a generalized ranking of issues called "Fortify Priority Order" which Fortify adjusts with your entries in the project description. If this is a web application, you may prefer to use one of the several OWASP Top 10 attributes. If this is a US Federal government application, you may select the FISMA attribute. Personally I prefer the Fortify Priority because it is colorful: it can be gratifying to reduce the red column of critical issues. Rest assured that management will focus on those eye-catching criticals. Once those baddies are gone, then the bright orange Highs will be grabbing their attention.
Fortify also gives you a filter set to help focus your efforts. These range from the bare essentials in the "Developer View" to the increase number in "Critical Exposure View" to all the gory details in the "Security Auditor View." Although the "Developer View" might be tempting for its brevity, if QA or auditors look at your scans, guess which one they will open up. :-)
Each of these attributes can be further filtered by priority order (critical, high, medium, and low) and/or by category of weakness. If time is short (Isn't it always!), you may consider fixing the "low hanging fruit" that are clearly exploitable and readily remediable, instead of getting bogged down a complex refactoring effort. Finally your security operations center may recommend that you focus on specific weaknesses because that those are the attack vectors they are seeing on your network or hosts.
The timing of re-scanning your code depends. If the application is small and scan can be accomplished quickly, then immediate rescanning will minimize the difficulty of dealing with errors "injected" by you during code remediation. Frequent rescanning reduces the chance that multiple injected errors will interact or otherwise confound one another. Frequent scanning makes it easier to focus on the impacts of your fixes to the issue at hand. However, as the time to scan increases, the delay in getting scan results goes from annoying to impractical. As a consequence of scanning overhead for large and complex applications, most teams will scan once per build; the frequency therefore depends on their build cycle. Managing the number of FPR files you generate with frequent scanning can be a minor nuisance; you probably don't want to clutter your Software Security Center with thousands of FPR files with minor differences and little historical value.
Fortunately Fortify allows you to hide or suppress issues you have determined to be insignificant. (See my answer to HP Fortify — annotating method parameters.) Nevertheless you will still see all the issues you haven't dealt with.

Proving the ROI of a technology?

How does one prove the ROI of a technology to their manager?
The closest thing I have found to a document on how to do this is:
http://www.agilejournal.com/pdf/Finding-ROI-in-Build-Automation.pdf
There are formulas in this document, but I can't really tell if they are just alot of marketing or if they are accurate formulas on how to calculate ROI.
I'm not really trying to calculate the ROI of the build tool in the above paper, I was just trying to calculate the ROI of a simple build tool like ANT.
They don't cut to the meat of the question: the intangible benefits - though they at least try to walk through an example. The formulas are just to get ROI as a nice percentage - if "using build tools" was a stock, how much return would I get on my investment?
Which already shows that the question itself is flawed: An automated build is mainly an instrument to improve quality; improving productivity is usually a secondary concern.
However, this doesn't help when talking to the guys sitting on the money.
Metrics I woud use to analyse effect of a build tool:
Turnaround time from checkin to final media
Number of builds (for testing, for release, ..)
Number of build requested (with faster builds, you can expect an increase in demand)
Number of errors introduced during manual build (assuming you track those)
Number of developers able to publish a release
Estimated resources (time, licences, build server, ..) for implementation and maintenance
Analysis of low-probability, high risk scenarios
Often, an automated build tool pays for itself just by removing a bottleneck: every developer can publish the software, not just John the Builder.
The last point is important (but hardest to give numbers for), as the total cost of bugs doesn't have a normal distribution, but is highly "pareto": a single bug can give you some nasty press, or make key customers switch to competition.
The core argument for maintaining an automated build is that publishing bugs are mostly avoidable.
I can't imagine that there would be any sensible way to accurately measure ROI on developer tools and practices. The only thing I can think of where that would might be possible would be in factory environments where you you might be able to measure productivity and average quality.
I'd suggest doing what everyone else does, which is pick some formulas that will support what you want and then tweak them until the ROI is high enough to justify the investment.
Put the estimate in hours:
Estimate how much time you currently spend, and how much time you'd spend
Put the estimate in customer complaints:
Estimate your current number of bugs. Estimate how many of those bugs the new system would have caught. Find out the percentage of bugs reported by users, and document how many fewer bugs will be user visible.
Add to the hours:
Figure out how long it would take to fix the bugs that would be caught, and tack that onto the hourly estimate.
Add a non-quantifiable "salability".
With the extra time, we build extra features. With fewer bugs, that's fewer demos where sales guys shoot themselves in the foot. How many extra copies of software can we sell if we do this?
The last bit won't be successful; it's there primarily to focus attention on the first two metrics; hours and customer-visible defects.
My advice is to discredit what's there now then offer the alternative.
You could try putting the emphasis on the problems of how you do build and deployment now and win that battle first. Manager's don't want to change something that isn't causing them grief so you need to prove it is going to be a problem if nothing's done.
You should consider: how much time and credibility is lost with bad builds; how many mistakes are made currently; how much manual repetition takes place etc. and try to put metrics or examples of these things.
If you can win the support of your developers then you can also add their approval to the strength of your argument. Another point to make is that good developers like to work with good tools so progressive Management equals motivated developers.
If you win the hearts-and-minds of the developers and your Manager that may mean more than a piece of paper with some numbers on it.

KPI for UX team

I'm going to lead a new User Experience Team and I'm struggling with the definition of KPI for the team. My question is non-technical and I'm asking after what KPI's I should measure.
That's a pretty general question - you probably need to be more specific. However, here are some starting points that you might be able to think about.
Cost of user support - number of user support incidents as a proportion of the user base. Break it down by task, module or some other breakdown that you can link to actual application functionality that could be fixed. Bonus if you can get stats on the rate of effective fixes.
Cost of data fixes - depending on your application, cost of having to fix incorrect data may be a support cost metric that you should track.
Number of mistakes - if you can spot erroneous navigation patterns in web server logs you could try to bring the level of incidence of this down.
Retention or number of aborted attempts. If you have a public web site then you might want to spot the incidence of users losing interest.
These are all questions that you are quite likely to be able to get hard data on (at least for a web application) and they will give you some objective measures that are correlated to real aspects of performance. Notice the correlation to costs and effectiveness (punters leaving the site, cost of support calls). From 'naive' economic perspective these will give you basic quantitative measures about how effective the site is at doing its job (I'm assuming you're working for someone maintaining a web application).
You can probably think of similar measures that are more directly appropriate to your application.
To get at a deeper view, you may want to do some traditional marketing/HCI stuff like surveys or even focus groups or usability testing. Beyond that you are evaluating the underlying strategy of the system which (I guess) is probably starting to fall outside your brief. Start with things that you can measure first.

"Multi-agent computing" in simple terms

I've encountered the term "multi-agent computing" as of late, and I don't quite get what it is. I've read a book about it, but that didn't answer the fundamental question of what an agent was.
Does someone out there have a pointer to some reference which is clear and concise and answers the question without a load of bullshit/marketing speak? I want to know if this is something I should familiarise myself, or whether it's some crap I can probably ignore, because I honestly can't tell.
In simple terms, multiagent research tries to design system composed of autonomous agents. That is, you have a bunch of robots/people/software-agents around, each of which can take its own actions but can only "see" stuff that is around him, how do get the system to behave as you want?
Example,
Given a bunch of robots with limited sensing capabilities, how do you get them to monitor a field for enemies? to find all the mines in a field?
Given a bunch of people, how do you get them to maximize the happiness of the least happy person? without taking away their freedom.
Given a group of people, how do you set up a meeting time(s) that maximizes their happiness? without revealing their private information?
Some of these questions might appear really easy to solve, but they are not.
Multiagent research mixes techniques from game theory, Economics, artificial intelligence, and sometimes even Biology in order to answer these questions.
If you want more details, I have a free textbook that I am working on called Fundamentals of Multiagent Systems.
A multi-agent system is a concept borrowed from AI. It's almost like a virtual world where you have agents that are able to observe, communicate, and react. To give an example, you might have a memory allocation agent that you have to ask for memory and it decides whether or not to give it to you. Or you might have an agent that monitors a web server and restarts it if it hangs. The main goal behind multiagent systems is to have a more Smalltalk-like communication system between different parts of the system in order to get everything to work together, as opposed to more top-down directives that come from a central program.
"Agents" are another abstraction in software design.
As a crude hierarchy;
Machine code, assembly, machine-independent languages, sub-routines, procedures, abstract data types, objects, and finally agents.
As interconnection and distribution become more important in computing, the need for systems that can co-operate and reach agreements with other systems (with different interests) becomes apparent; this is where agents come in. Acting independently agents represent your best interests in their environment.
Other examples of agents:
Space craft control, to make quick decisions when there's no time for craft-ground crew-craft messaging (eg NASA's Deep Space 1)
Air traffic control (Systems over-riding pilots; this is in place in most commercial flights, and has saved lives)
Multi-agent systems are related to;
Economics
Game theory
Logic
Philosophy
Social sciences
I don't think agents are something you should gloss over. There's 2 million hits on google scholar for "multi agent" and more on CiteSeer; it's a rapidly evolving branch of computer science.
There are several key aspects to multi-agent computing, distribution and independence are among them.
Multi-agents don't have to be on different machines, they could as #Kyle says, be multiple processes on a single chip or machine, but they act without explicit centralised direction. They might act in concert, so they have certain synchronisation rules - doing their jobs separately before coming together to compare results, for example.
Generally though the reasoning behind the segmentation into separate agents is to allow for differing priorities to guide each agent's actions and reactions. Perhaps using an economic model to divide up common resources or because the different functions are physically separated so don't need to interact tightly with each other.
<sweeping generalisation>
Is it something to ignore? Well it's not really anything in particular so it's a little like "can I ignore the concept of quicksort?" If you don't understand what quicksort is then you're not going to fail to be a developer because most of your life will be totally unaffected. If you have more understanding of different architectures and models, you'll have more knowledge to deploy in new and unpredictable places.
<sweeping generalisation>
Ten years ago, 'multi-agent systems' (MAS) was one of those phrases that appeared everywhere in the academic literature. These days it is less prevalent, but some of the ideas it represents are really useful in some places. But totally unnecessary in others. So I hope that's clear ;)
It is difficult to say what multi-agent computing is, because the definition of an agent is usually very soft surrounded by markting terms etc. I'll try to explain what is it and where it could be used based on the research of manufacturing systems, which is the area, I am familiar with.
One of the "unsolved" problems of modern manufacturing is scheduling. When the definition of the problem is static, an optimal solution can be found, but in reality, people don't come to work, manufacturing resources fail, computers fail etc. The demand is changing all the time, different products are required (i.e. mass customization of the product - one produced car has air conditioning, the next one doesn't, ...). This all leads to the conclusions that a) manufacturing is very complex, b) static approaches, like scheduling in advance for a week, don't work. So the idea is this: why wouldn't we have intelligent programs representing parts of the systems, working the way out of this mess on their own? These programs are called agents. They should communicate and negotiate amongst themselves and make sure the tasks are done in due time. By using agents we want to lower the complexity of the control system, make it more manageable, enable better human - machine interaction, make it more robust and less error prone and very importantly: make the control system decentralized.
In short: agents are just a concept, but they are a concept everyone can intuitively understand. Code still needs to be written, but it is written in a different way, one abstraction higher than OOP.
There was a time when it was hard to find good material on software agents, primarily because of the perception of marketing potential. The bloom on that rose has diminished so the signal to noise ratio on the Internet has improved vis-a-vie software agents.
Here is a good introduction to software agents on this blog post of an open source project for software agents. The term multi-agent systems just means a system where multiple software agents run and communicate and delegate sub tasks to each other.
According to Jennings and Wooldridge who are 2 of the top Mulit-agent researchers an agent is an object that is reactive to its environment, proactive and social. That is an agent is a piece of software that can react to its environment in real time in a way that is suitable to its own objeective. It is proactive, which means that it doesnt just always wait to be asked to perform a task, if it sees a chance to do something that it feels would be beneficial to its objectives it does it. And that it is social, ie that it can communicate with other Agents, doesnt nessecaily ever have to do any of these things in meeting its own objectives but it should be able to to do these if the situation arose. And thus a multi-agent system is just a collection of these in a distrubuted system that can all communicate and try to perform their own personal goals hat normally lead to an overall achievement of the system goal.
You can find a concentration of white papers concerning agents here.

Resources