I have read that one of Erlang's biggest adopters is the telecom industry. I'm assuming that they use it to send binary data between their nodes and to provide for easy redundancy, efficiency, and parallelism.
Does Erlang actually send just the binary to a central node?
Is it directly responsible for parsing the binary data into actual voice? Or is it fed to another language/program via ports?
Is it responsible for the speed of a telephone call, speed as in the delay between me saying something and you hearing it?
Is it possible that Erlang is used solely for its ease of parallel behavior, with C++ or similar used for processing speed in sequential functions?
I can only guess at how things are implemented in actual telecom switches, but I can recommend an approach to take:
First, you implement everything in Erlang, including much of the low-level stuff. This probably won't scale that much, since signal processing is very costly. As a prototype, however, it works and you can make calls and whatnot.
Second, you decide on what to do with the performance bottlenecks. You can push them to C(++) and get a factor of roughly 10 or you can push them to an FPGA and get a factor of roughly 100. Finally you can do CMOS work and get a factor of 1000. The price of the latter approach is also much steeper, so you decide what you need and go buy that.
Erlang remains in control of the control backplane, in the sense of what happens when you push buttons, the call setup, and so on. But once a call has been allocated, we hand over the channel to the lower layer. ATM switching is easier here because once the connection is set you don't need to change it (ATM is connection-oriented, IP is packet-oriented).
Erlang's distribution features are there primarily for providing redundancy in the control backplane. That is, we synchronize tables of call setups and so on between multiple nodes to facilitate node takeover in case of hardware failure.
The trick is to use ports and NIFs after the prototype stage to speed up the slower parts of the program.
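For illustration, a minimal sketch of the port approach might look like the following; the "./filter" executable, the message framing, and the module name are all hypothetical, and a NIF would be the in-process alternative once profiling shows where the real bottleneck is.

    %% Hypothetical sketch: hand a chunk of samples to an external C program
    %% over a port and wait for the processed result.
    -module(dsp_port).
    -export([start/0, process/2]).

    start() ->
        open_port({spawn_executable, "./filter"},
                  [{packet, 4}, binary, exit_status]).

    process(Port, Samples) when is_binary(Samples) ->
        Port ! {self(), {command, Samples}},
        receive
            {Port, {data, Processed}} -> {ok, Processed}
        after 5000 ->
            {error, timeout}
        end.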
Related
This post on Erlang scalability says there's an overhead for every call, cast or message to a gen_server. How much overhead is it, and what is it for?
The cost that is being referenced is the cost of a (relatively blind) function call to an external module. This happens because everything in the gen_* abstractions is a callback to externally defined functions (the functions you write in your callback module), not function calls that can be optimized by the compiler within a single module. Part of that cost is the resolution of the call (finding the right code to execute -- the same reason each "dot" in a.long.function.or.method.call in Python or Java raises the cost of resolution) and another part is the actual call itself.
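To make that concrete, here is a minimal (purely illustrative) callback module; the call in bump/1 is dispatched by gen_server into handle_call/3, which lives in this module but is invoked from gen_server's own code as an external call rather than a local, compiler-optimizable one.

    -module(counter_server).
    -behaviour(gen_server).
    -export([start_link/0, bump/1]).
    -export([init/1, handle_call/3, handle_cast/2]).

    start_link() -> gen_server:start_link(?MODULE, 0, []).

    %% Client API: a synchronous request routed through gen_server.
    bump(Pid) -> gen_server:call(Pid, bump).

    %% Callbacks: invoked by gen_server as external calls into this module.
    init(Count) -> {ok, Count}.

    handle_call(bump, _From, Count) -> {reply, Count + 1, Count + 1}.

    handle_cast(_Msg, Count) -> {noreply, Count}.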
BUT
This is not something you can calculate as a simple quantity and then multiply by to get a meaningful answer regarding the cost of operations across your system.
There are too many variables, points of constraint, and unexpectedly cheap elements in a massively concurrent system like Erlang where the hardest parts of concurrency are abstracted away (scheduling related issues) and the most expensive elements of parallel processing are made extremely cheap (context switching, process spawn/kill and process:memory ratio).
The only way to really know anything about a massively concurrent system, which by its very nature will exhibit emergent behavior, is to write one and measure it in actual operation. You could write exactly the same program in pure Erlang once and then again as an OTP application using gen_* abstractions and measure the difference in performance that way -- but the benchmark numbers would only mean anything to that particular program and probably only mean anything under that particular load profile.
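As a sketch only (with all the caveats above about benchmarks meaning little outside their own load profile), such a comparison could look like the following; plain_counter is a hypothetical hand-rolled process with the same API as the counter_server sketched earlier.

    -module(bench).
    -export([compare/1]).

    %% Time N synchronous requests against each implementation; the absolute
    %% numbers only mean something for this workload on this machine.
    compare(N) ->
        {ok, Plain} = plain_counter:start_link(),
        {ok, Otp}   = counter_server:start_link(),
        {PlainUs, _} = timer:tc(fun() -> [plain_counter:bump(Plain) || _ <- lists:seq(1, N)] end),
        {OtpUs, _}   = timer:tc(fun() -> [counter_server:bump(Otp) || _ <- lists:seq(1, N)] end),
        io:format("plain: ~p us, gen_server: ~p us~n", [PlainUs, OtpUs]).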
All this taken in mind... the numbers that really matter when we start splitting hairs in the Erlang world are the reduction budget costs the scheduler takes into account. Lukas Larsson at Erlang Solutions put out a video a while back about the scheduler that details the way these costs impact the system, what they are, and how to tweak the values under certain circumstances (Understanding the Erlang Scheduler). Aside from external resources (iops delay, network problems, NIF madness, etc.) that have nothing to do with Erlang/OTP, the overwhelming factor is the behavior of the scheduler, not the "cost of a function call".
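If you want to look at the raw reduction counts the scheduler works with, the VM does expose them; for example:

    %% {TotalReductions, ReductionsSinceLastCall} for the whole VM:
    erlang:statistics(reductions).
    %% Reductions consumed by one process:
    erlang:process_info(self(), reductions).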
In all cases, though, the only way to really know is to write a prototype that represents the basic behavior you expect in your actual system and test it.
I am currently planning to set up a service that should be (sooner or later) globally available, with high demands on availability and fault tolerance. There will be both a high read and a high write ratio, and the system should be able to scale on demand.
A more particular property of my planned service is that the data will be strongly bound to a certain geo-location - e.g. in 99.99% of all cases, data meant for a city in the USA will never be queried from Europe (in fact, even data meant for a certain city is unlikely to be queried from the city next to it).
What I want to minimize is:
Administration overhead
Network latency
Unnecessary data replication (I don't want to have a full replication of the data meant for Europe in USA)
In terms of storage technologies, I think my best storage solution would be Cassandra. The options that I see for my use case are:
Use a completely isolated cassandra cluster per geo-location combined with a manually configured routing service that chooses the right cluster per insert/select query
Deploy a global cluster and define multiple data centers for certain geo-locations to ensure high availability in those regions
Deploy a global cluster without using data centers
Deploy a global cluster without using data centers and manipulate the partitioning to be geo-aware. My plan here is to set the first 3 bits of the partition key based on the geo-location (e.g. 000: North America, 001: South America, 010: Africa, 011: South/West Europe, etc.) and to assign the remaining bits using a hash algorithm (similar to Cassandra's random partitioner); a sketch of this idea follows this list.
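As a rough illustration of option 4's key construction (written in Erlang purely to show the bit layout; an actual geo-aware partitioner would have to live on the Cassandra side, and the region codes are the ones proposed above):

    -module(geo_token).
    -export([token/2]).

    %% Region prefixes as proposed in option 4.
    region_bits(north_america) -> 2#000;
    region_bits(south_america) -> 2#001;
    region_bits(africa)        -> 2#010;
    region_bits(sw_europe)     -> 2#011.

    %% Build a 128-bit token: 3 geo bits followed by 125 bits of an MD5 hash
    %% of the partition key (in the spirit of the random partitioner).
    token(Region, PartitionKey) when is_binary(PartitionKey) ->
        <<_:3, Hash:125>> = crypto:hash(md5, PartitionKey),
        <<(region_bits(Region)):3, Hash:125>>.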
The disadvantage of solution 1 would probably be a huge administrative overhead and a lot of manual work; the disadvantage of the second solution would be a huge amount of unnecessary data replication; and the disadvantage of the third solution would be a quite high network latency due to random partitioning across the world.
Therefore, in theory, I like solution 4 most. Here I would have a fair amount of administrative overhead, a low amount of unnecessary data replication and decent availability. However, to implement this (as far as I know) I will need the ByteOrderedPartitioner, which is strongly discouraged by many sources.
Is there a way to implement a solution close to solution 4 without using the ByteOrderedPartitioner, or is this a case where the ByteOrderedPartitioner could make sense, or am I missing an obvious fifth solution?
Reconsider option 2.
Not only will it solve your problems; it will also give you geo-redundancy. As you mentioned, you need high availability, and having a copy in a different data center sounds good in case one of the data centers dies.
If you are dead set on refraining from replication between DCs, then that's an option too. You can have multiple DCs over different regions without replicating between them.
We have a few non-Erlang-connected clusters in our infrastructure and currently use term_to_binary to encode Erlang terms for messages between the clusters. On the receiving side we use binary_to_term(Bin, [safe]) to only convert to existing atoms (should there be any in the message).
Occasionally (especially after starting a new cluster/stack), we run into the problem that there are partially known atoms encoded in the message, i.e. the sending cluster knows an atom but the receiving one does not. This can happen for various reasons, the most common being that the receiving node simply has not loaded a module containing some record definition. We currently employ some nasty workarounds which basically amount to maintaining a short-ish list of potentially used atoms, but we're not quite happy with this error-prone approach.
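A minimal sketch of the receiving side (module name and return values are illustrative): with the safe option, binary_to_term/2 raises badarg when the payload contains an atom this node has never created, which is exactly the failure described above.

    -module(safe_decode).
    -export([decode/1]).

    decode(Bin) when is_binary(Bin) ->
        try binary_to_term(Bin, [safe]) of
            Term -> {ok, Term}
        catch
            error:badarg -> {error, unsafe_or_unknown_atom}
        end.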
Is there a smart way to share atoms between these clusters? Or is it recommended to not use the binary format for such purposes?
Looking forward to your insights.
I would think hard about why non-Erlang nodes are sending atom values in the first place. Most likely there is some adjustment that can be made to the protocol being used to communicate -- or most often there is simply not a real protocol defined and the actual protocol in use evolved organically over time.
Not knowing any details of the situation, there are two solutions to this:
Go deep and use an abstract serialization technique like ASN.1 or JSON or whatever, using binary strings instead of atoms. This makes the most sense when you have a largish set of well understood, structured data to send (which may wrap unstructured or opaque data).
Remain shallow and instead write a functional API interface for the processes/modules you will be sending to/calling first, to make sure you fully understand what your protocol actually is, and then back that up by making each interface call correspond to a matching network message which, when received, dispatches the same procedures an API function call would have.
The basic problem is the idea of non-Erlang nodes being able to generate atoms that the cluster may not be aware of. This is a somewhat sticky problem. In many cases, the places where you are using atoms can use binaries instead, to similar effect and with the same semantics, without confusing the runtime. It's the difference between {<<"new_message">>, Data} and {new_message, Data}; matching within a function head works the same way, just slightly more noisy syntactically.
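A tiny sketch of that difference (the module and function are illustrative): both heads match the same shape of tuple, but the binary tag never touches the atom table, so an untrusted or out-of-date peer can safely send it.

    -module(tags).
    -export([handle/1]).

    %% Binary tag: nothing new enters the atom table on the receiver.
    handle({<<"new_message">>, Data}) ->
        {ok, Data};
    %% Atom tag: semantically the same match, but the atom must already
    %% exist (or be created) on the receiving node.
    handle({new_message, Data}) ->
        {ok, Data}.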
Brief description of requirements
(Lots of good answers here, thanks to all, I'll update if I ever get this flying).
A detector runs along a track, measuring several different physical parameters in real time (deterministic), as a function of curvilinear distance. The user can click a button to 'mark' waypoints during this process, then uses the GUI to enter the details for each waypoint (in human time, but while the data acquisition continues).
Following this, the system performs a series of calculations/filters/modifications on the acquired data, taking into account the constraints entered for each waypoint. The output of this process is a series of corrections, also as a function of curvilinear distance.
The third part of the process involves running along the track again, but this time writing the corrections to a physical system which corrects the track (still as a function of curvilinear distance).
My current idea for your input/comments/warnings
What I want to determine is whether I can do this with a PC + FPGA. The FPGA would do the 'data acquisition', and I would use C# on the PC to read the data from a buffer. The waypoint information could be entered via a WPF/WinForms application and stored in a database/flat file/anything, pending 'processing'.
For the processing, I would use F#.
The FPGA would then be used for 'writing' the information back to the physical machine.
The one problem that I can foresee currently is if processing algorithms require a sampling frequency which makes the quantity of data to buffer too big. This would imply offloading some of the processing to the FPGA - at least the bits that don't require user input. Unfortunately, the only pre-processing algorithm is a Kalman filter, which is difficult to implement with an FPGA, from what I have googled.
I'd be very grateful for any feedback you care to give.
UPDATES (extra info added here as and when)
At the input to the Kalman filter we're looking at one sample every 1 ms. But on the other side of the Kalman filter we would be sampling every 1 m, which at the speeds we're talking about works out to roughly 2 samples per second.
So I guess more precise questions would be:
implementing a Kalman filter on an FPGA - seems that it's possible, but I don't understand enough about either subject to be able to work out just HOW possible it is.
I'm also not sure whether an FPGA implementation of a Kalman will be able to cycle every 1ms - though I imagine that it should be no problem.
If I've understood correctly, FPGAs don't have huge amounts of memory. For the third part of the process, where I would be sending an (approximately) 4 x 400 array of doubles to use as a lookup table, is this feasible?
Also, would swapping between the two processes (reading/writing data) imply re-programming the FPGA each time, or could it be instructed to switch between the two? (Maybe possible just to run both in parallel and ignore one or the other).
Another option I've seen is compiling F# to VHDL using Avalda FPGA Developer, I'll be trying that soon, I think.
You don't mention your goals, customers, budget, reliability or deadlines, so this is hard to answer, but...
Forget the FPGA. Simplify your design, development environment and interfaces unless you know you are going to blow your real-time requirements with another solution.
If you have the budget, I'd first take a look at LabVIEW.
http://www.ni.com/labview/
http://www.ni.com/dataacquisition/
LabVIEW would give you the data acquisition system and user GUI all on a single PC. In my experience, developers don't choose LabVIEW because it doesn't feel like a 'real' programming environment, but I'd definitely recommend it for the problem you described.
If you are determined to use compiled languages, then I'd isolate the real-time data acquisition component to an embedded target with an RTOS, preferably one that takes advantage of the MMU for process and thread isolation and lets you write in C. If you get a real RTOS, you should be able to reliably schedule the processes that need to run, and also be able to debug them if need be! Keep this separate target system as simple as possible, with defined interfaces. Make it do just enough to get the data you need.
I'd then implement the interfaces back to the PC GUI using a common interface file for maintenance. Use standard interfaces for data transfer to the PC, something like USB2 or Ethernet. The FTDI chips are great for this stuff.
Since you are moving along a track, I have to assume the sampling frequency isn't more than 10 kHz. You can offload the data to the PC at that rate easily: with 8-byte samples that's only about 80 kB/s per channel, which fits comfortably even over 12 Mbit/s (full-speed) USB.
For serious processing of math data, Matlab is the way to go. But since I haven't heard of F#, I can't comment.
4 x 400 doubles is no problem: that's 4 x 400 x 8 = 12,800 bytes, or about 100 kbit. Even low-end FPGAs have hundreds of kilobits of memory.
You don't have to change images to swap between reading and writing. That is done all the time in FPGAs.
Here is a suggestion.
Dump the FPGA concept.
Get a DSP evaluation board from TI
Pick one with enough gigaflops to make you happy.
Enough RAM to store your working set.
Program it in C. TI supplies a small RT kernel.
It talks to the PC over, say, a serial port or Ethernet, whatever.
It sends the PC cooked data with a handshake so the data doesn't get lost.
There is enough RAM in the DSP to store your data while the PC has senior moments.
No performance problems with the DSP.
The real-time bit does the real-time work, with MBs of RAM.
Processing is fast, and the GUI is not time-critical.
What is your connection to the PC? .NET will be a good fit if it is a network-based connection, as you can use streams to deal with the data input.
My only warning to you regarding F# or any functional programming language involving large data sets is memory usage. They are wonderful and mathematically provable, but when you get a stack overflow exception from too many recursions, it means that your program won't work and you lose time and effort.
C# will be great if you need to develop a GUI; WinForms and GDI+ should get you to something usable without a monumental effort.
Give us some more information regarding data rates and connection and maybe we can offer some more help?
There might be something useful in Microsoft Robotics Studio, especially for the real-time aspect. The CCR (Concurrency and Coordination Runtime) has a lot of this thought out already, and the simulation tools might help you build a model that would help your analysis.
Sounds to me like you can do all the processing offline. If this is the case, then offline is the way to go. In other words, divide the process into 3 steps:
Data acquisition
Data analysis
Physical system corrections based on the data analysis.
Data Acquisition
If you can't collect the data using a standard interface, then you probably have to go with a custom interface. It's hard to say whether you should be using an FPGA without knowing more about your interface. Building custom interfaces is expensive, so you should do a trade-off study to select the approach. Anyway, if this is FPGA-based, then keep the FPGA simple and use it for raw data acquisition. With current hard drive technology you can easily store hundreds of gigabytes of data for post-processing, so store the raw data on a disk drive. There's no way you want to be implementing even a one-dimensional Kalman filter in an FPGA if you don't have to.
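For scale, one step of the standard discrete-time Kalman recursion (written out here for reference, not taken from the question) is only a handful of small matrix multiplies plus one inversion whose size equals the number of measurements, which a PC handles trivially at the 1 kHz rate mentioned in the question:

    \hat{x}_{k|k-1} = F \hat{x}_{k-1|k-1} + B u_k
    P_{k|k-1}       = F P_{k-1|k-1} F^T + Q
    K_k             = P_{k|k-1} H^T (H P_{k|k-1} H^T + R)^{-1}
    \hat{x}_{k|k}   = \hat{x}_{k|k-1} + K_k (z_k - H \hat{x}_{k|k-1})
    P_{k|k}         = (I - K_k H) P_{k|k-1}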
Data Analysis
Once you've got the data on a hard drive, then you have lots of options for data analysis. If you already know F#, then go with F#. Python and Matlab both have lots of data analysis libraries available.
This approach also makes it much easier to test your data analysis software than a solution where you have to do all the processing in real time. If the results don't seem right, you can easily rerun the analysis without having to go and collect the data again.
Physical System Corrections
Take the results of the data analysis and run the detector along the track again feeding it the appropriate inputs through the interface card.
I've done a lot of embedded engineering, including hybrid systems such as the one you've described. At the data rates and sizes you need to process, I doubt that you need an FPGA ... simply find an off-the-shelf data acquisition system to plug into your PC.
I think the biggest issue you're going to run into is more related to language bindings for your hardware APIs. In the past, I've had to develop a lot of my software in C and assembly (and even some Forth) simply because that was the easiest way to get the data from the hardware.
I've encountered the term "multi-agent computing" as of late, and I don't quite get what it is. I've read a book about it, but that didn't answer the fundamental question of what an agent was.
Does someone out there have a pointer to some reference which is clear and concise and answers the question without a load of bullshit/marketing speak? I want to know if this is something I should familiarise myself with, or whether it's some crap I can probably ignore, because I honestly can't tell.
In simple terms, multi-agent research tries to design systems composed of autonomous agents. That is, you have a bunch of robots/people/software-agents around, each of which can take its own actions but can only "see" the stuff around it. How do you get the system to behave as you want?
Examples:
Given a bunch of robots with limited sensing capabilities, how do you get them to monitor a field for enemies? To find all the mines in a field?
Given a bunch of people, how do you get them to maximize the happiness of the least happy person, without taking away their freedom?
Given a group of people, how do you set up a meeting time (or times) that maximizes their happiness, without revealing their private information?
Some of these questions might appear really easy to solve, but they are not.
Multi-agent research mixes techniques from game theory, economics, artificial intelligence, and sometimes even biology in order to answer these questions.
If you want more details, I have a free textbook that I am working on called Fundamentals of Multiagent Systems.
A multi-agent system is a concept borrowed from AI. It's almost like a virtual world where you have agents that are able to observe, communicate, and react. To give an example, you might have a memory allocation agent that you have to ask for memory and it decides whether or not to give it to you. Or you might have an agent that monitors a web server and restarts it if it hangs. The main goal behind multiagent systems is to have a more Smalltalk-like communication system between different parts of the system in order to get everything to work together, as opposed to more top-down directives that come from a central program.
"Agents" are another abstraction in software design.
As a crude hierarchy:
Machine code, assembly, machine-independent languages, sub-routines, procedures, abstract data types, objects, and finally agents.
As interconnection and distribution become more important in computing, the need for systems that can co-operate and reach agreements with other systems (with different interests) becomes apparent; this is where agents come in. Acting independently, agents represent your best interests in their environment.
Other examples of agents:
Spacecraft control, to make quick decisions when there's no time for craft-ground crew-craft messaging (e.g. NASA's Deep Space 1)
Air traffic control (systems overriding pilots; this is in place in most commercial flights, and has saved lives)
Multi-agent systems are related to:
Economics
Game theory
Logic
Philosophy
Social sciences
I don't think agents are something you should gloss over. There are 2 million hits on Google Scholar for "multi agent" and more on CiteSeer; it's a rapidly evolving branch of computer science.
There are several key aspects to multi-agent computing, distribution and independence are among them.
Multi-agents don't have to be on different machines; they could, as @Kyle says, be multiple processes on a single chip or machine, but they act without explicit centralised direction. They might act in concert, so they have certain synchronisation rules - doing their jobs separately before coming together to compare results, for example.
Generally, though, the reasoning behind the segmentation into separate agents is to allow differing priorities to guide each agent's actions and reactions - perhaps using an economic model to divide up common resources, or because the different functions are physically separated and so don't need to interact tightly with each other.
<sweeping generalisation>
Is it something to ignore? Well it's not really anything in particular so it's a little like "can I ignore the concept of quicksort?" If you don't understand what quicksort is then you're not going to fail to be a developer because most of your life will be totally unaffected. If you have more understanding of different architectures and models, you'll have more knowledge to deploy in new and unpredictable places.
</sweeping generalisation>
Ten years ago, 'multi-agent systems' (MAS) was one of those phrases that appeared everywhere in the academic literature. These days it is less prevalent, but some of the ideas it represents are really useful in some places. But totally unnecessary in others. So I hope that's clear ;)
It is difficult to say what multi-agent computing is, because the definition of an agent is usually very soft, surrounded by marketing terms, etc. I'll try to explain what it is and where it could be used based on research into manufacturing systems, which is the area I am familiar with.
One of the "unsolved" problems of modern manufacturing is scheduling. When the definition of the problem is static, an optimal solution can be found, but in reality, people don't come to work, manufacturing resources fail, computers fail, etc. The demand is changing all the time, different products are required (i.e. mass customization of the product - one produced car has air conditioning, the next one doesn't, ...). This all leads to the conclusions that a) manufacturing is very complex, b) static approaches, like scheduling a week in advance, don't work. So the idea is this: why wouldn't we have intelligent programs representing parts of the systems, working their way out of this mess on their own? These programs are called agents. They should communicate and negotiate amongst themselves and make sure the tasks are done in due time. By using agents we want to lower the complexity of the control system, make it more manageable, enable better human-machine interaction, make it more robust and less error-prone and, very importantly, make the control system decentralized.
In short: agents are just a concept, but they are a concept everyone can intuitively understand. Code still needs to be written, but it is written in a different way, one abstraction higher than OOP.
There was a time when it was hard to find good material on software agents, primarily because of the perception of marketing potential. The bloom on that rose has diminished, so the signal-to-noise ratio on the Internet has improved vis-à-vis software agents.
Here is a good introduction to software agents in this blog post from an open-source software-agent project. The term "multi-agent system" just means a system where multiple software agents run, communicate, and delegate subtasks to each other.
According to Jennings and Wooldridge, who are two of the top multi-agent researchers, an agent is an object that is reactive to its environment, proactive, and social. That is, an agent is a piece of software that can react to its environment in real time in a way that is suitable to its own objective. It is proactive, which means that it doesn't just wait to be asked to perform a task; if it sees a chance to do something that it feels would be beneficial to its objectives, it does it. And it is social, i.e. it can communicate with other agents; it doesn't necessarily ever have to do any of these things in meeting its own objectives, but it should be able to do them if the situation arose. A multi-agent system is then just a collection of these in a distributed system that can all communicate and try to perform their own personal goals, which normally leads to an overall achievement of the system goal.
You can find a concentration of white papers concerning agents here.