experimenting with BGP using Quagga and a set of openWrt routers - openwrt

I want to learn and experiment with BGP protocol by some non-trivial scenarios: setup anycasting, see how quickly and what way routing changes after I disconnect some link, etc. As I understand I cannot easily and should not do it on "the real internet" as I would need to register/obtain an Autonomous System(s), obtain a pool of some IP addresses etc, not to mention a chaos I could cause by my experiments.
Therefore I'm considering buying few (5-6) cheap, openWRT compatible routers (I was thinking about MikroTik RB750Gr3), setting up my own small isolated "clone of the internet" and play with BGP using Quagga that I would install on these routers. So now I need help with verifying whether my idea makes sense:
is my understanding (described at the beginning) that I cannot/should not do it on "the real internet" correct? or maybe there are some publicly available "sandboxes" that would allow me to experiment?
is it even possible to create such a small isolated clone of the internet as I described or maybe it will not work because of some reasons that I'm missing? (for example some central registry like IANA would need to be also present on my clone or something else that I'm not aware of?)
is there maybe an easier/simpler way to conduct such experiments than by purchasing several routers? Maybe I could somehow create several interconnected virtual networks on Qemu-KVM/libvirt instead and play there? (I couldn't google anything related)
is Quagga BGP software capable of doing what I intend or maybe it has some limitations which will not allow me to try some/many of the typical "real internet" scenarios?
assuming that I'm more or less on the right track up to this moment, is the MikroTik RB750Gr3 a good model to conduct such experiments? or maybe I could use something significantly cheaper? or maybe the opposite: I need something "more capable"?
are there any resources on the web that describe more or less the thing that I intend to do? so far I found mostly either very high-level overviews of BGP or documents that describe situation from the point of view of a single AS.
I've asked this question originally on network engineering stackexchange, but it turned that openWRT and Quagga are forbidden topics there, so it was closed immediately: hope here is a good place ;)

I admire the ambition, but I don't see how a network of 5-6 routers is really going to be an effective "clone of the Internet".
You talk of "how quickly and in what way routing changes after I disconnect some link" -- but in the real Internet, the response to a failed link depends first on the network in which that happens, and then on how the resulting change(s) propagate across the "BGP Mesh", which in turn depends on the networks involved. A small network of closely connected routers is going to struggle to simulate that -- even if you could find out how to configure your simulation to emulate real networks.
You say that the resources you have found describe BGP in terms of a single AS. I guess that mostly because all the BGP does is to exchange routeing information between an AS and the neighbour ASes it is connected to. In a way, global routeing is a "emergent property" of all the individual AS-to-AS BGP connections across the Internet -- what I call the "BGP Mesh". If you look for "Route Flap Damping" or "BGP Route Convergence" or "BGP Route Stability" using your chosen search engine, you should start to find stuff related to the behavior the BGP Mesh. Also caida.org, RIPE and renesys.
With a 5-6 routers and data extracted from the RIPE or Caida route-collectors, you could probably set up a Quagga instance to be some AS connected to a couple of transit providers and three or four peers... but it sounds like a lot of work for not mush return.
Sorry to be negative.
It's been a while since I last did anything with Quagga, but it's a capable BGP implementation. There's also BIRD.

Related

How to draw a use case diagram for microservices (no users + Docker)?

I know that the question seems odd, but I'll try to explain it as well as I can.
I have a docker-compose file with 4 services: zookeeper, kafka, schema registry and redis. Besides that, I also have 2 SpringBoot microservices which use the dockerized services. The first microservice receives HTTP requests from Postman and then processes those requests before sending them to a topic. The second microservice reads from that topic, processes the messages and sends them to, you've guessed it, another topic.
How would a UML use-case diagram for this situation look like? I have no idea how to start because I've always identified roles first and took it from there. In this case, there are no people actually using it because it's just meant to collect some data and manipulate it.
My professor told me I had to have a use-case diagram, so I suppose it has to be possible to draw it, I'm just confused in how to start.
Also, if you have any recommendations on which diagrams can explain microservices nicely, let me know.
You are already focusing on a technical solution, i.e. how should it work. You have described here in details a system made of several microservices, and using some identified technologies for their implementation.
But the systems have a purpose. They are meant to interact with some actors and help themto achieve some goals. This is what use-cases are about: what's the purpose of the system for which actor. And whether it’s 12 microservices or 1 monolith, does absolutely not influence the use-cases, since these are independent of the inner structure of the system.
Unfortunately, nothing in the narrative tells us what this system is supposed to to nor for whom. This is where you should start.
Finally actors of the UC diagram can be human users or technical systems (i.e. independent systems).
You can use feature in use case diagram called include. Basically, include means a use-case will call other specific use-case. (see https://www.uml-diagrams.org/use-case-diagrams.html for reference). If you have chain of activity on multiple microservices, you can connect them using include like so:
Note that use case diagram doesn't care about technology used to build an app. It doesn't care what's under the include between your services. It was intended as pure abstraction to allow non-developers to understand it.

How trustworthy are polls by pinpoll

I was recently looking at security issues from online-polls and the problem with online-elections and how they can sometimes very easily be tampered with.
Now it sprung to my eye that a lot of websites that I visit and even local newspapers in my area use "pinpoll" for online-polls.
So I wanted to know how trustworthy and secure these polls are?
Tobias here, Founder and CEO of Pinpoll.
I agree with #GreyFairer, let's not discuss this on SO (unless you want to know why fingerprinting libraries shouldn't be used to identify individual clients or how Pinpoll is applying ws to broadcast live updates across the globe).
Just send me an e-mail to privacy#pinpoll.com and I'm happy to explain to you in more detail what we do (and cannot do) to protect polls agains bots and fake votes.
And let me make one thing clear: we're one of the most trustworthy providers in Europe, especially when it comes to complying with the EU's strict data protection laws.
One example: You won't find a single request to a server other than our own (located in the EU) in our interactive elements.
And one last thing: What might be annoying to you (which is fully accepted), is interesting and entertaining to others. So let's agree to disagree when it comes to online polls in news portals ;)

Does Seaside scale?

Seaside is known as "the heretical web framework". One of the points that make it heretical is that it has much shared state. That however is something which, in my current understanding, hinders easy scaling.
Ruby on rails on the other hand shares as less state as possible. It has been known to scale pretty well, even if it is dog slow compared to modern smalltalk vms. flickr uses php and has scaled to an extremly big infrastructure...
So has anybody some experience in the scaling of Seaside?
Ramon Leon shares some of his experience on upscaling seaside on his (excellent) blog. You can read very concrete ideas with sample code about configuring and tuning seaside.
Enjoy :-)
http://onsmalltalk.com/scaling-seaside-more-advanced-load-balancing-and-publishing
http://onsmalltalk.com/scaling-seaside-redux-enter-the-penguin
http://onsmalltalk.com/stateless-sitemap-in-seaside
Short answer:
you can scale Seaside applications like hell yah
Long answer:
In the IT domain, scaling is one thing but it has two dimensions:
horozontal
vertical
Almost everybody thought about scaling in the vertical dimension. That was until intel and friends reached some physical barriers and started to add cores to compensate the current impossibility of adding MHz.
That's when all we started to be more aware of scaling horizontally as the way to go.
Why am I telling you this?
Because Seaside is a smalltalk image running in a VM and that is roughly the same situation of a system in a server of a monocore processor.
Taking that as foundation, you scale web apps by making a cluster of servers. It's the natural thing to do, it's the fault tolerant thing to do, is the topologically intelligent thing to do, is the flexible thing to do, I guess you get the idea...
So, if for scaling, you do the same as intel & friends, you embrace the horizontal way. And it's even cheaper that the vertical way (that will lead you to IBM and Sun servers that are as good as expensive).
RoR applications are typically scaled horizontally. Google has countless cheap servers to do their thing. It works perfectly fine no matter how dramatized people want's to impress you throwing at you a bunch of forgettable twitter whales.
If they talk to you about that, you just be polite and hear what they say but remember this:
perfect is the enemy of the good
the unfinished perfect will never be as value as the good thing done
BTW, Amazon does something like that too (and it kind of couple geolocation so they enhance the chances of attending your requests with the cluster that is closest to your location).
On the other hand, the way Avi scaled dabbledb (Seaside based web application company bought by twitter) was using one vm per customer account (starting up and shutting down those on demand).
Having a lot of state in an image doesn't make scaling impossible nor complicated.
Just different.
The way to go is with a load balancer that uses sticky sessions so you can have one image attenting all the requests of an user session. You make things so any worker-image behind the load balancer can attend any user of a given app. And that's pretty much it.
To be able to do that you need to share the persistent objects among workers. All the users databases needs to be accessible by the workers at anytime and need to deal well with concurrency.
We designed airflowing scalable in that way.
It's economically convenient too because you can start with N very small (depending on the RAM of your first server) and increase it on demand until you reach the hardware limit.
Once you reach the hardware limit, you just add another host to the cluster and recofigure the balancer (and the access to the databases).
Simple, economic and elegant.
http://dabbledb.com/ seems to scale quite well. Moreover, one can use GemStone GLASS to run Seaside.
On this interview Avi Bryant the creator of Seaside and Co founder DabbleDB
Explains how they make it scale.
From what I understand:
each customer has it's own Squeak
Image.
When a customer comes Apache decides based on the user name which port to send it to.
Based on the port it starts the customer's Squeak Image.
That way it can grow to an infinite number of servers.
I think this solution works for them based on the specifics of their application each customer doesn't need to share info between them. So no need for o centralized DB.
Anyway it is better to watch the interview rather than my half-made summary.
Yes, Seaside scales down fantastically. A single developer can create and maintain complex applications for small groups very well.
[coming back to this after a few years]
This actually is much more important than scaling up. Computer speed still grows a lot, and 99% of all applications can now run on one machine. Speed of development, and especially maintenance now totally dominates TCO.
I would rephrase your question slightly to: does Seaside prevent/discourage you from creating applications that scale? I would say usually no. Seaside doesn't have a default way to store your data (just like php on its doesn't, though Seaside gives you a few extra options) and my impression is interacting with your data tends to be the biggest hurdle when it comes to scaling.
If you want to store your data in a monolithic SQL db, like with rails, you can do that. Or you can use an object database. Or you can use a separate object database for each user, or separate db for each project, or a separate db for each user and project. Or you can store everything in a series of flat files or you can just store your data as objects in your VM's memory.
And because of continuations you don't need re-fetch your data out of your datastore on every webpage call. As when you are using a desktop application you can pull data out of your datastore when your user begins interacting with it, set the appropriate variables, and then use those variables between webcalls until the user is done with the data at which point you can update the datastore. When you don't share state you have to hit the datastore on every single webcall.
Of course this doens't mean scaling is free, it just means you have a larger domain in which to search for scaling solutions.
All that said, for many applications rails will scale much easier simply because there exist large hosting solutions for rails (and php for that matter) that will offer you a huge amount of resources without having to rent and setup a custom box.
These are just my impressions from reading and talking to people.
I just reminded that there is link on Pharo's success stories : a Seaside Web application with up to 1000 concurrent users for a large health insurance in Argentina .
Pharo success stories
Issys Tracking:
Load balancing: Apache as a proxy/balancer (round robin with session
affinity)
Server setup: 5 Pharo images on 3 different servers (Linux and Windows 2003)
GUI: Heavily AJAX-based. All code written in Smalltak: Seaside 3.0, Seaside JQuery integration and JQWidgetBox.
Persistency: Glorp (OR mapper) and OpenDBX (DB client)
Databases: large PostgreSQL and MS SQL Server DBs
From the Wikipedia article, it's a total pig. Prior to that, it hadn't scaled to the point where I'd heard of it. :)

What is the most critical piece of code you have written and how did you approach it?

Put it another way: what code have you written that cannot fail. I'm interested in hearing from those who have worked on projects dealing with heart monitors, water testing, economic fundamentals, missile trajectories, or the O2 concentration on the space shuttle.
How did you prepare for writing this sort of code: methodologically, intellectually, and emotionally?
Edit
I've marked this wiki in case the rep issue is keeping people from replying. I thought there would be a good deal more perspective on this issue than there has been.
While I am not personally involved in what is described there, this article will hopefully contribute to the spirit of your question: They Write the Right Stuff.
I wrote a driver for a blood pressure measuring device for hospital use. If it "fails", the patient will not have his blood pressure checked at the scheduled time; if his blood pressure is abnormal, no alarm (in the larger system) will be triggered. Such an event could be clinically significant.
My approach was to thoroughly read the spec/documentation in a non-work environment (to avoid the temptation to start coding right away), then read it again at work. After that, I summarized the possible states and actions on paper and "flowcharted" an algorithm, and annotated all the potential real-world "bad events" (cables getting unplugged, batteries dying, etc). Finally, I wrote and rewrote the driver three times, each with different mechanisms (e.g. FSM), and compared their results. Each iteration helped me identify weaknesses I hadn't yet discovered. The third rewrite was the "official" result. I reviewed each iteration with my co-worker.
Emotional preparation consisted of convincing myself that should the unthinkable happen, at least I wasn't willfully negligent -- just incompetent (the old "I'm only human" excuse). ;-)
I have written computer interface to a MRI machine. It had no chance of hurting the end user as it was just record management, but it could potentially have given an incorrect diagnosis or omit important information.
Tests, lots and lots of tests.
Unit tests, mid and high level tests. Simulate all possible input combinations. Also a great deal of testing with the hardware itself. Testing must be done in a complete and methodical way. It should take a great deal more time to test than to write.
Error Reporting
All errors must be reported and be obvious. If it won't hurt the patent to do so, fail fast.
For something that is actively keeping a person alive things are even worse. It must never stop working. If it fails it needs to restart and keep trying. Redundant internals are also a must in case the hardware fails.
At the wrong company it can really a difficult kind of situation to work in. However, if things are going well, you are well funded and release pressure is not high, it can be a very rewarding space to work in.
Not really an answer, but:
I've got a friend who writes embedded control software for laser eye surgery machines. When he had laser eye surgery himself, he made sure to go to an ophthalmologist who used his company's system. I have great admiration for this guy. I can't think of a piece of software I've ever written whose level of quality was high enough that I'd trust my own eyesight to it.
Right now I'm working on some base code for a system that retrieves medical patient information from clinics and hospitals for a medical billing office. We're starting out with a smaller client and a long break-in period to ensure quality, but eventually this code needs to securely handle a large variety of report formats from a number of clients at different facilities.
It's not quite in the same scale as your examples, but a bad mistake could result in the wrong people being billed or the right person billed to a defunct address (screwing up credit reports) or open people up to identity theft, so it's still pretty critical. Oh yeah, and it could mean doctors don't get paid quite as quick. That's important, too, especially from a business perspective, but not in the same class as data protection and integrity.
I've heard crazy stories of the processes used to write code at NASA for the spaceshuttles. Every line of code has about 10-20 lines of documentation, along with tests, full revision history, etc. Every time a bug is found, not only is the code evaluated and repaired, but the entire procedure of writing code, the entire command chain, etc. is reviewed to answer the question: "What happened wrong in our process that allowed this bug to get included in the first place?"
While nothing quite so important as an MRI machine or a blood pressure monitor, I did get tapped to do a rewrite of Blackjack when I worked for an online gambling provider. Blackjack is by far the most popular online game, and millions of dollars was going to go through this software (and did).
I wrote the game engine separate from the server and the client, and used Test Driven Development to ensure that what I was assuming was coming through in the results. I also had a wrapper "server" that had console output that would allow me to play. This was actually only useful in that it mimicked the real server interface, since playing a text version of blackjack isn't very fun or easy ("You draw a 10. You now have a 10 and a 6, while the dealer has a 6 showing. [bsd] >")
The game is still being run on some sites out there, and to my knowledge, has never had any financial bugs after years of play.
My first "real" software job was writing a GUI app for planning stereotactic brain surgery. Testing, testing, testing... absolutely no formal methods, engineering-style thoughts, just younger programmers cranking it out. When they started talking about using the software to control a robotic arm with a laser, without any serious engineering methods in place, i got a bit worried, left for more officey lands.
I've created information system application for local government cultures and tourism department in Bali island which were installed in several tourism denstinations, providing extensive informations about the culture, maps, accomodations etc.
if it failed then probably tourists couldnt get the right informations they need most, cheat by brookers, or lost somewhere :)

"Multi-agent computing" in simple terms

I've encountered the term "multi-agent computing" as of late, and I don't quite get what it is. I've read a book about it, but that didn't answer the fundamental question of what an agent was.
Does someone out there have a pointer to some reference which is clear and concise and answers the question without a load of bullshit/marketing speak? I want to know if this is something I should familiarise myself, or whether it's some crap I can probably ignore, because I honestly can't tell.
In simple terms, multiagent research tries to design system composed of autonomous agents. That is, you have a bunch of robots/people/software-agents around, each of which can take its own actions but can only "see" stuff that is around him, how do get the system to behave as you want?
Example,
Given a bunch of robots with limited sensing capabilities, how do you get them to monitor a field for enemies? to find all the mines in a field?
Given a bunch of people, how do you get them to maximize the happiness of the least happy person? without taking away their freedom.
Given a group of people, how do you set up a meeting time(s) that maximizes their happiness? without revealing their private information?
Some of these questions might appear really easy to solve, but they are not.
Multiagent research mixes techniques from game theory, Economics, artificial intelligence, and sometimes even Biology in order to answer these questions.
If you want more details, I have a free textbook that I am working on called Fundamentals of Multiagent Systems.
A multi-agent system is a concept borrowed from AI. It's almost like a virtual world where you have agents that are able to observe, communicate, and react. To give an example, you might have a memory allocation agent that you have to ask for memory and it decides whether or not to give it to you. Or you might have an agent that monitors a web server and restarts it if it hangs. The main goal behind multiagent systems is to have a more Smalltalk-like communication system between different parts of the system in order to get everything to work together, as opposed to more top-down directives that come from a central program.
"Agents" are another abstraction in software design.
As a crude hierarchy;
Machine code, assembly, machine-independent languages, sub-routines, procedures, abstract data types, objects, and finally agents.
As interconnection and distribution become more important in computing, the need for systems that can co-operate and reach agreements with other systems (with different interests) becomes apparent; this is where agents come in. Acting independently agents represent your best interests in their environment.
Other examples of agents:
Space craft control, to make quick decisions when there's no time for craft-ground crew-craft messaging (eg NASA's Deep Space 1)
Air traffic control (Systems over-riding pilots; this is in place in most commercial flights, and has saved lives)
Multi-agent systems are related to;
Economics
Game theory
Logic
Philosophy
Social sciences
I don't think agents are something you should gloss over. There's 2 million hits on google scholar for "multi agent" and more on CiteSeer; it's a rapidly evolving branch of computer science.
There are several key aspects to multi-agent computing, distribution and independence are among them.
Multi-agents don't have to be on different machines, they could as #Kyle says, be multiple processes on a single chip or machine, but they act without explicit centralised direction. They might act in concert, so they have certain synchronisation rules - doing their jobs separately before coming together to compare results, for example.
Generally though the reasoning behind the segmentation into separate agents is to allow for differing priorities to guide each agent's actions and reactions. Perhaps using an economic model to divide up common resources or because the different functions are physically separated so don't need to interact tightly with each other.
<sweeping generalisation>
Is it something to ignore? Well it's not really anything in particular so it's a little like "can I ignore the concept of quicksort?" If you don't understand what quicksort is then you're not going to fail to be a developer because most of your life will be totally unaffected. If you have more understanding of different architectures and models, you'll have more knowledge to deploy in new and unpredictable places.
<sweeping generalisation>
Ten years ago, 'multi-agent systems' (MAS) was one of those phrases that appeared everywhere in the academic literature. These days it is less prevalent, but some of the ideas it represents are really useful in some places. But totally unnecessary in others. So I hope that's clear ;)
It is difficult to say what multi-agent computing is, because the definition of an agent is usually very soft surrounded by markting terms etc. I'll try to explain what is it and where it could be used based on the research of manufacturing systems, which is the area, I am familiar with.
One of the "unsolved" problems of modern manufacturing is scheduling. When the definition of the problem is static, an optimal solution can be found, but in reality, people don't come to work, manufacturing resources fail, computers fail etc. The demand is changing all the time, different products are required (i.e. mass customization of the product - one produced car has air conditioning, the next one doesn't, ...). This all leads to the conclusions that a) manufacturing is very complex, b) static approaches, like scheduling in advance for a week, don't work. So the idea is this: why wouldn't we have intelligent programs representing parts of the systems, working the way out of this mess on their own? These programs are called agents. They should communicate and negotiate amongst themselves and make sure the tasks are done in due time. By using agents we want to lower the complexity of the control system, make it more manageable, enable better human - machine interaction, make it more robust and less error prone and very importantly: make the control system decentralized.
In short: agents are just a concept, but they are a concept everyone can intuitively understand. Code still needs to be written, but it is written in a different way, one abstraction higher than OOP.
There was a time when it was hard to find good material on software agents, primarily because of the perception of marketing potential. The bloom on that rose has diminished so the signal to noise ratio on the Internet has improved vis-a-vie software agents.
Here is a good introduction to software agents on this blog post of an open source project for software agents. The term multi-agent systems just means a system where multiple software agents run and communicate and delegate sub tasks to each other.
According to Jennings and Wooldridge who are 2 of the top Mulit-agent researchers an agent is an object that is reactive to its environment, proactive and social. That is an agent is a piece of software that can react to its environment in real time in a way that is suitable to its own objeective. It is proactive, which means that it doesnt just always wait to be asked to perform a task, if it sees a chance to do something that it feels would be beneficial to its objectives it does it. And that it is social, ie that it can communicate with other Agents, doesnt nessecaily ever have to do any of these things in meeting its own objectives but it should be able to to do these if the situation arose. And thus a multi-agent system is just a collection of these in a distrubuted system that can all communicate and try to perform their own personal goals hat normally lead to an overall achievement of the system goal.
You can find a concentration of white papers concerning agents here.

Resources