Reverse-engineering of communication protocols - communication

Just curious - what are some automatic or even semi-automatic techniques for reverse-engineering of communication protocols?
I am particularly interested in the case when one's sniffing traffic and trying to understand the protocol.
I found a number of papers on Google Scholar, but in my experience this is a completely manual process most of the time.
If anyone has experience in the field and feels like sharing it would be much appreciated.

Obtain some measure of control over a communication link and sniff the data. Then exercise the full range of the associated application's operations to see how the traffic relates to each one, so you can gather general observations.
Google for the protocol. Maybe it is published. Maybe someone has already figured it out, or someone has carelessly leaked details about it.
Write a test program which replaces one end of the protocol: try eliciting responses from the other side by trial and error.
Often a protocol is a descendant of—or at least related in many ways to—another one. By seeing the specifics and having familiarity with many protocols, one can often make good educated guesses about its features and abilities.
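One semi-automatic first pass, offered only as a sketch: take several captured messages of the same kind, compare them byte by byte, and mark which offsets never change. Constant offsets are candidates for magic numbers or type fields; varying ones are candidates for lengths, counters or payload. The captures below are made up for illustration, and the field layout in the comment is an assumption, not a real protocol.

```python
from typing import List, Optional


def constant_offsets(messages: List[bytes]) -> List[Optional[int]]:
    """For each offset present in every message, return the byte value if it
    is identical across all messages, or None if it varies."""
    if not messages:
        return []
    shortest = min(len(m) for m in messages)
    template: List[Optional[int]] = []
    for i in range(shortest):
        values = {m[i] for m in messages}
        template.append(values.pop() if len(values) == 1 else None)
    return template


def render(template: List[Optional[int]]) -> str:
    """Show constant bytes as hex and varying bytes as '??'."""
    return " ".join("??" if b is None else f"{b:02x}" for b in template)


# Hypothetical captures: 2-byte magic, 1-byte type, 1-byte length, payload.
captures = [
    bytes.fromhex("cafe0103616263"),
    bytes.fromhex("cafe0102686900"),
    bytes.fromhex("cafe0103787a79"),
]
print(render(constant_offsets(captures)))  # ca fe 01 ?? ?? ?? ??
```

This only works on messages you have already grouped by kind (for example, by the application action that triggered them), which is why exercising the application systematically comes first.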

Related

Is ejabberd a good choice as a core technology for turn-based MMORPG?

I am thinking about the technology stack for my project and am considering ejabberd. The project will look like a classic multi-user dungeon RPG where players move across the world from one location to another (locations look exactly like chat rooms), and they will also fight each other as well as AI-controlled creatures in turn-based mode.
I have never used ejabberd, but I have some experience in writing server applications using Erlang.
Is ejabberd overkill for this kind of game? It has a lot of features that I will never need. However, it is well known to Erlang developers and is also very stable and mature. Is ejabberd worth using as a kind of transport layer for my online game, or would I be better off reinventing the wheel with something tiny and simple?
I have several years of commercial experience, using ejabberd for things like this. So, my take:
Pros:
It is certainly technically capable.
It is quite easy to grok.
It is very easy to extend and modify.
If you update it regularly, it will take care of two really important aspects: A. network security (this is an extremely important thing for me); and B. properly done authentication. These two alone are enough of a reason to use it.
It is surprisingly fast.
It gives you chat, presence and friends list for free.
It gives you MUC (rooms) for free. With all the things like permissions solved quite well.
Cons:
Don't really expect to find any usable documentation. Source is mostly your only friend.
Don't really expect to find a community. It's a lonely path. There is a room - ejabberd#conference.jabber.ru but it's very quiet (and almost empty). Most of the people there are not developers, but just ejabberd users. Mailing list is a bit better, but usually not enough to find you the answer you seek.
The source code per se is not the best example of an Erlang project. If you want to learn how to write big, modular, distributed Erlang software, better have a look at something like Riak.
The internal APIs are not very stable (they change quite a lot between releases). Because of this, I recommend writing your software as a separate Erlang application, connecting to ejabberd as an external XMPP component. That way you are guaranteed to communicate over a stable protocol (XMPP). Of course, you cannot escape having to write some internal stuff as well. Authentication and Roster (friend list) modules are the first that come to mind. This combination is quite hard to maintain and update, especially if you need hot code loading, but it is still the best solution for me. Try to keep the "in-ejabberd" code to a viable minimum.
That being said, there is only one (to my knowledge) usable XMPP erlang library. It's called exmpp and is developed by the same company as ejabberd (ProcessOne). It's not yet considered stable. I have been using it for quite some time, and for now there are no problems, but you never know. It is also mostly undocumented (or was, when I was learning it).
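To give a feel for how small the stable surface of an external component is, here is a sketch of the handshake step as described in XEP-0114 (Jabber Component Protocol): the component hashes the stream ID it received from the server together with the shared secret and sends the lowercase hex SHA-1 digest. It is written in Python rather than Erlang purely for brevity; the function names and the sample values are made up, and the socket and XML stream handling around this step is omitted.

```python
import hashlib


def component_handshake(stream_id: str, secret: str) -> str:
    """Lowercase hex SHA-1 of the stream ID concatenated with the shared
    secret, which is the digest XEP-0114 asks the component to send."""
    return hashlib.sha1((stream_id + secret).encode("utf-8")).hexdigest()


def handshake_stanza(stream_id: str, secret: str) -> str:
    """Wrap the digest in the <handshake/> element sent to the server."""
    return f"<handshake>{component_handshake(stream_id, secret)}</handshake>"


# Hypothetical values: the stream ID comes from the server's stream header,
# the secret from the component's entry in the server configuration.
print(handshake_stanza("3BF96D32", "example-secret"))
```

Everything after a successful handshake is ordinary XMPP stanzas, which is what keeps the component insulated from changes in ejabberd's internal APIs.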

How to evaluate the design of a brand new application which falls into a knowledge domain fairly unfamiliar to you?

Recently I participated in designing and writing an application for which my team was given complete requirements and basically had to design and code it. It automated a 3rd party handwriting recognition platform so that it could interoperate with a couple of our systems. Now, a few months later, the customer called with what seemed at first glance to be a minor issue, but after investigating, it turns out that the whole application requires a redesign just to fix this inaccuracy (it's easier to redesign than to patch).
I personally don't think the application was particularly badly designed by any of the points mentioned on this thread, but there were way too many small unknowns for us, and it looks like they have now accumulated into a major design flaw, something we basically failed to see. All those small factors seemed insignificant and ignorable at the design stage, so we thought we were doing OK. Now that the problem has occurred, it seems silly that we couldn't spot it at design time, but I guess we ignored some 'small' details and nuances which turned out to be significant after all.
So is there any approach to take when you are entering the design stage of an application that you are not too familiar with, but whose design (falsely) seems more or less straightforward (create tables, write BOs, write UI etc.), so that you can increase your chance of foreseeing this type of pitfall in the implementation stage (or at least before customer deployment)?
PS: Sometimes we hire experts to help, like a mathematician one time or a geography specialist another, but who can help us incorporate a third-party platform into ours except us?
I think the approach must be to find the "best practices" in the domain. Each domain has established ways in which things have always been done; practitioners have often forgotten what the rationale for these practices originally was. As a newcomer, it is good to find out what these best practices are, and to follow them - blindly.
That way, you have a good chance to avoid making common mistakes, and if you do run into problems, there is a chance that these problems are typical for the domain, with well-known solutions/work-arounds.
All speaking in the abstract, of course.

Should I make and implement a network protocol by hand or use a middleware (if so which)?

I have some data that I need to share between multiple services on multiple machines. Stuffing the data into a database or shuffling it over http won't work in this situation and ideally the different pieces of software will need to communicate with each other directly (or through one central coordinator that can send and receive).
Is it recommended to create and implement a network protocol or use some tool to do the communication?
If I did go the route of creating a protocol myself, it wouldn't have to be very complex. Under 10 different message types, but it would have to be re-implemented in a few different languages for this project, and support unicode. I have read plenty (and done some) with handling sockets, but don't have much knowledge in handling a protocol I create. Are there any good resources on this?
There are also things like ICE and RPC that look interesting. The limit of my experience is using ICE and XMLRPC for a few days each. Is this the better route to go? If so, what tools are out there?
Recently I've been using Google Protocol Buffers for encoding and shipping data between different machines running software written in different languages. It is quite easy to do, and takes away a lot of the hassle of designing a custom protocol.
Without knowing what technologies and platforms you are dealing with, it's difficult to give you a very specific answer - so I'll try to give you some general feedback.
If the system(s) you are wishing to connect span more than a single platform and/or technology you are probably better using an existing transport mechanism and protocol to maximize the chance your base platform will already have a library (or multiple) to interact over it. Also, integrating security and other features in a stack with known behaviors is more likely to be documented (with examples floating around). RPC (and ICE, though I've less familiarity with it) has some useful capabilities, but it also requires a lot of control over the environment and security can be convoluted (particularly if you are passing objects between different languages).
With regard to avoiding polling, this is a performance-related issue; there are design patterns which can help you to handle such things, if you understand how you need the system to work (e.g. the observer pattern, a kind of don't-call-us-we'll-call-you approach). The network environment you are playing in will dictate which options are actually viable (e.g. a local LAN will have different considerations from something which runs over a WAN or the internet). Factors like firewall tunneling, VPN traversal, etc. should play a part in your final selected technology profile.
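As a rough in-process illustration of the observer idea (the `Coordinator` class and the topic names are hypothetical, and a real system would deliver these callbacks over the network rather than as local function calls), interested parties register once and are then called when something happens, instead of asking repeatedly:

```python
from typing import Callable, Dict, List

Handler = Callable[[str], None]


class Coordinator:
    """Hypothetical central coordinator that pushes updates to subscribers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a callback to be invoked for every message on a topic."""
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Call every handler registered for the topic; return how many ran."""
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


received: List[str] = []
coordinator = Coordinator()
coordinator.subscribe("price", received.append)
coordinator.publish("price", "42.0")
coordinator.publish("news", "nobody subscribed, so nobody is called")
print(received)  # ['42.0']
```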
The only other major consideration (that I can think of just now... ;-)) would be to consider the type of data you need to pass about. Is it just text, or do you need to stream binary objects? Would an encoding format (like XML or JSON or bJSON) do the trick? You mention "less than ten message types" as part of the question, but is that the only information which would ever need to be communicated by the system?
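For a sense of scale, a hand-rolled format for a handful of message types can be quite small. The sketch below is one possible layout, not a recommendation: each frame is a 4-byte big-endian length followed by a UTF-8 encoded JSON object with a `type` and a `body`. The field names and the `greet` message type are invented for the example.

```python
import json
import struct
from typing import Tuple

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length


def encode_message(msg_type: str, body: dict) -> bytes:
    """Serialize one message as a length-prefixed UTF-8 JSON frame."""
    payload = json.dumps(
        {"type": msg_type, "body": body}, ensure_ascii=False
    ).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_message(buffer: bytes) -> Tuple[dict, bytes]:
    """Decode one frame from the front of buffer and return (message, rest).
    Raises ValueError if the buffer does not yet hold a complete frame."""
    if len(buffer) < HEADER.size:
        raise ValueError("incomplete header")
    (length,) = HEADER.unpack_from(buffer)
    end = HEADER.size + length
    if len(buffer) < end:
        raise ValueError("incomplete payload")
    message = json.loads(buffer[HEADER.size:end].decode("utf-8"))
    return message, buffer[end:]


frame = encode_message("greet", {"text": "héllo, 世界"})
message, rest = decode_message(frame + b"extra")
print(message["type"], message["body"]["text"], rest)
```

The length prefix matters because TCP delivers a byte stream, not messages: the receiver has to know where one frame ends and the next begins. Every language in the project would need its own version of these two functions, which is exactly the maintenance cost that existing middleware takes off your hands.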
Either way, unless the overhead of existing protocols is unacceptable, you're better off leveraging established work 99% of the time. Creativity is great, but commercial projects usually benefit from well-known behaviors, even if they are not the coolest or slickest (kind of the "as long as it works..." approach).
hth!

Long-term memorization techniques to become an expert in the field?

I have been familiar with some mnemonic/memorization techniques for about a year.
I think that these techniques can give a developer a significant benefit or even make you an expert in the field.
If you are familiar with these techniques, you know that there are mnemonic techniques for long-term memorization. We often read lots of books, and there are many concepts which you don't remember because they don't appear often in your daily coding life. So you need to learn them again and again, months and years later.
The same goes for frameworks. It takes some time to become familiar with a framework's syntax, useful code constructs and so on. But after some time you forget many concepts from your previous framework (or from a framework which you rarely use but which is very important to you).
By using these techniques you can build a sustainable knowledge base over time, one which will reliably grow: you can be confident that you won't forget the concepts you learned earlier.
Please tell me what you think about this idea.
If you are already familiar with mnemonic techniques, please tell about your experience - it will be very useful and interesting to hear.
Useful links:
Method of loci
Mnemonic
My favorite method:
Type it into Google
I'm being totally serious - why do you need to remember it?
You don't memorize how to be a good programmer any more than you memorize how to be a good classical violinist. You practice, practice, practice. That will let you naturally recall the most important constructs, and as Chad says, Google is there for the less important ones. I have never felt the need to use mnemonic devices or rote memorization to learn a programming construct or technique.
"Expertise in the field" isn't about memorizing function calls. It's about the ability to break problems down, and provide performant, maintainable, reliable solutions in minimal time.
You could memorize every function call in the STL, and still be a complete neophyte programmer.
I read Harry Lorayne's "The Memory Book" a few years ago, and found that the techniques therein were great for remembering related facts. However, in my experience the techniques could have been more useful, namely:
The memorization didn't tend to work in the long run. If I wasn't practicing remembering a particular list, or body of facts, I would eventually completely forget them within a few days or weeks.
I had trouble applying the techniques to hierarchical data sets, like class libraries. This made their use less powerful for programming stuff.
The techniques were very useful for things that could be easily explained by voice, or a single stream of text. However, I had trouble applying them to things of a more visual nature, such as mathematical equations.
That said, I have used mnemonic techniques while coding for things that Google could not replace. I sometimes use the number memorization trick to recall a specific line of code (by its line number) while I jump around a code file, or to remember function names as I jump between files.
Agree with other answers, some of the more useful things you could focus on improving are:
Troubleshoot a problem, using the 'elimination' technique, basically eliminating problem areas, one by one, until you hit the right one
Quickly get to the resource/API/Information I need - Use Google, SO, CodePlex, Google code, Koders.com codesearch, Google codesearch, MSDN etc - Knowing what information lies where is enough to save time drastically
Avoid thrashing (stuck with a problem for too long, no results), once you've spent enough time on the problem, by giving others 'complete' and 'relevant' information on your problem you can help others help you
Finally, memorizing theories of programming is not helpful; however, just reading, listening to experts and podcasts, and attending conferences can help a great deal with 'access to information from memory'
HTH

Do you find that corporate buzzwords or heavy jargon gets in the way of software project communication? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
Do you find that corporate buzzwords or heavy management jargon get in the way of software project communication? For example, words such as:
Mainstreaming
Holistic
Contestability
Synergies
etc.
Would you rather see an initiative within the industry to put a stop to jargon such as this, to help people communicate better and keep project communication in plain English? Is it even a problem? What are your thoughts/anecdotes?
I actually like buzzwords, when they are used in moderation.
They became buzzwords for a practical reason: Even though the concepts may be very complex and/or abstract, there is a consensus on the meaning. So with only one word, you can convey a whole lot of information to a large group of people. I see it as a form of encapsulation of information.
(Notice the use of the slightly outdated buzzword encapsulation?)
Of course, that is exactly the reason why many people start to abuse them: They only convey the general concept (e.g. why it's great to do FizzBuzz), and avoid discussing the messy details (e.g. why it won't work).
And since using a buzzword gives the impression that you are deeply familiar with the subject at hand, it can be used to silence others in the discussion.
Conclusion:
Buzzwords are ok - if they are used in the right way. If you want to improve your team communication, train them in the proper use of buzzwords.
I think some kind of industry-wide initiative would be impractical as jargon is in the eye of the beholder.
I think all you can do is make sure that you don't use buzzwords yourself even when communicating with people who do. For example, use the word "people" when talking to a Project Manager who refers to you and your colleagues as "resources".
The use of technical language can both help and hinder a project team's progress, depending on appropriateness.
First it's necessary to point out that what is considered "too technical" depends purely on perspective. "Mainstreaming" is as much of a technical term as SSD, CORBA and SOAP. Something that sounds like jargon nonsense to one person is actually a shortcut for communicating a complex concept to another.
Software development as a rule is a cross-domain activity involving, in addition to software knowledge, one or more technical user domains. It is a big mistake to assume that sales, marketing, management and banking (just to name a few fields often incorrectly considered "non-technical") haven’t developed and advanced their own complex body of knowledge, in other words, technology: sales technology, marketing technology, management technology and banking technology.
And it’s project manager’s responsibility to facilitate productive communication between representatives of different technical domains. Some suggestions:
Make handy a project dictionary that can be accessed and updated by everyone involved.
Ensure that common denominator language is used for cross-domain documentation (e.g. functional specs).
Introduce domain specific terms only when necessary, but then always provide a brief explanation of the meaning (don’t “build from scratch” —leverage the wealth of online encyclopaedias by linking where possible).
Make sure that there is common understanding amongst the project team of the key terms.
Remember that what is considered “technical” depends purely on perspective and you need to facilitate communication in all directions, not just one-way (which is often from software developers to business users).
At the end the software will have to work in the realm of users and you have to make a judgement on how much the UI will rely on specific domain language (this is going to be a trade off between easiness-to-learn and usage-efficiency).
Technical jargon (ORM, TDD etc.) makes one's speech more precise. Corporate buzzwords (aka management jargon), on the other hand, are designed to be able to express vague ideas when full information is not available.
As such, management jargon serves its purpose pretty well, in the sense that it does allow managers to effectively communicate about things they have very limited understanding of. That said, a good manager knows when NOT to use the jargon, such as when talking with developers, or with executives, both of whom hate bullshit.
Based on the above, the (Anti-)Buzzword Movement should rather increase awareness of the proper usage and application of management jargon, and encourage proper information encapsulation only with an appropriate audience.
Personally, I think that the jargon should be used more. I see this occurring more and more: IT people want to simply hide behind the technical elements of the world and act as if it is completely the business folks' responsibility to speak more geeky.
I'll be honest, speaking more GEEK is not something that the business people can do and you should not want that to occur. Learn the jargon. Become one with the jargon. Own the jargon. Then the next time you are discussing things, you'll not be back pedaling.
Take ownership of the business terms and apply them to the technical side of things...
What's wrong with "holistic" or "synergy"? These are normal plain English words.
Every field has its own jargon, and that must tell us something - people like having special words, phrases or assigning special meanings to existing words that are only relevant within their own field. I suspect if we went back to the pyramids, there'd be a full set of architectural and building phrases that your average Egyptian just wouldn't understand. So banning jargon just won't work; creating an FAQ and glossary normally does the trick.
BTW This must be a case of pots and kettles. Does anyone outside IT think phrases like "...We'll use an ORM, and then WCF will talk over HTTPS, throw in a bit of AJAX and some clever CSS on the client and we're laughing ..." are plain English?
Absolutely, since managers only talk in generalities, and we as developers want to understand the precise meaning. I personally fall asleep trying to read abstract writing filled with buzzwords.
The worst being SOA. Neither academic folks nor managers understand it though both use it extensively.
I can't stand buzzwords. One person's "encapsulation" is another's Orwellian destruction of language. Buzzwords appeal to the same people who like "decks" rather than memos. For something to act as a representation of many possible things [e.g. "leveraging resources" can mean pretty well anything from using double-sided printing options to drafting people for the Army] there is necessarily going to be a dilution of meaning. If a senior lawyer in my firm were to ask me to "leverage the resources, then run it up the flagpole to get the ducks in line", I'd know that he was tossing back the Johnnie Walker at lunch.
Conversely, if I were to respond to a senior partner's memo request with empty catch phrases such as those above, I'd be fired on the spot for being an idiot - and rightly so. Too bad the rest of the white collar world isn't like that.
Grouchy Old-fashioned Gen X'r
I'm OK with buzzwords so long as all the stakeholders (see what I did there?) are clear on the shared meaning of each word/phrase.
In general I think that buzzwords are good when used for encapsulating ideas and concepts. It simplifies communication between people who understand the words. However, I draw the line when people use a buzzword when a perfectly normal word would do. I know someone who will say they were "On an Audio" when they mean a phone call, or say "dialogging" instead of talking. It makes me want to hit them. Hard!

Resources