Related
Hopefully this question won't be too convoluted or vague. I know what I want in my head, so fingers crossed I can get this across in text.
I'm looking for a language with a syntax of my own specification, so I assume I will need to create one myself. I've spent the last few days reading about compilers, lexers, parsers, assembly language, virtual machines, etc, and I'm struggling to sort everything out in terms of what I need to accomplish my goals (file attached at the bottom with some specifications). Essentially, I'm deathly confused as to what tools specifically I will need to use to go forward.
A little background: the language made would hopefully be used to implement a multiplayer, text-based MUD server. Therefore, it needs easy inbuilt functionality for creating/maintaining client TCP/IP connections, non-blocking IO, database access via SQL or similar. I'm also interested in security insofar as I don't want code that is written for this language to be able to be stolen and used by the general public without specialist software. This probably means that it should compile to object code
So, what are my best options to create a language that fits these specifications
My conclusions are below. This is just my best educated guess, so please contest me if you think I'm heading in the wrong direction. I'm mostly only including this to see how very confused I am when the experts come to make comments.
For code security, I should want a language that compiles and is run in a virtual machine. If I do this, I'll have a hell of a lot of work to do, won't I? Write a virtual machine, assembler language on the lower-level, and then on the higher-level, code libraries to deal with IO, sockets, etc myself, rather than using existing modules?
I'm just plain confused.
I'm not sure if I'm making sense.
If anyone could settle my brain even a little bit, I'd sincerely appreciate it! Alternatively, if I'm way off course and there's a much easier way to do this, please let me know!
Designing a custom domain-specific programming language is the right approach to a problem. Actually, almost all the problems are better approached with DSLs. Terms you'd probably like to google are: domain specific languages and language-oriented programming.
Some would say that designing and implementing a compiler is a complicated task. It is not true at all. Implementing compilers is a trivial thing. There are hordes of high-quality compilers available, and all you need to do is to define a simple transform from your very own language into another, or into a combination of the other languages. You'd need a parser - it is not a big deal nowdays, with Antlr and tons of homebrew PEG-based parser generators around. You'd need something to define semantics of your language - modern functional programming langauges shines in this area, all you need is something with a support for ADTs and pattern matching. You'd need a target platform. There is a lot of possibilities: JVM and .NET, C, C++, LLVM, Common Lisp, Scheme, Python, and whatever else is made of text strings.
There are ready to use frameworks for building your own languages. Literally, any Common Lisp or Scheme implementation can be used as such a framework. LLVM has all the stuff you'd need too. .NET toolbox is ok - there is a lot of code generation options available. There are specialised frameworks like this one for building languages with complex semantics.
Choose any way you like. It is easy. Much easier than you can imagine.
Writing your own language and tool chain to solve what seems to be a standard problem sounds like the wrong way to go. You'll end up developing yet another language, not writing your MUD.
Many game developers take an approach of using scripting languages to describe their own game world, for example see: http://www.gamasutra.com/view/feature/1570/reflections_on_building_three_.php
Also see: https://stackoverflow.com/questions/356160/which-game-scripting-language-is-better-to-use-lua-or-python for using existing languages (Pythong and LUA) in this case for in-game scripting.
Since you don't know a lot about compilers and creating computer languages: Don't. There are about five people in the world who are good at it.
If you still want to try: Creating a good general purpose language takes at least 3 years. Full time. It's a huge undertaking.
So instead, you should try one of the existing languages which solves almost all of your problems already except maybe the "custom" part. But maybe the language does things better than you ever imagined and you don't need the "custom" part at all.
Here are two options:
Python, a beautiful scripting language. The VM will compile the language into byte code for you, no need to waste time with a compiler. The syntax is very flexible but since there is a good reason for everything in Python, it's not too flexible.
Java. With the new Xtext framework, you can create your own languages in a couple of minutes. That doesn't mean you can create a good language in a few minutes. Just a language.
Python comes with a lot of libraries but if you need anything else, the air gets thin, quickly. On a positive side, you can write a lot of good and solid code in a short time. One line of python is usually equal to 10 lines of Java.
Java doesn't come with a lot of frills but there a literally millions of frameworks out there which do everything you can image ... and a lot of things you can't.
That said: Why limit yourself to one language? With Jython, you can run Python source in the Java VM. So you can write the core (web server, SQL, etc) in Java and the flexible UI parts, the adventures and stuff, in Python.
If you really want to create your own little language, a simpler and often quicker solution is to look at tools like lex and yacc and similar systems (ANTLR is a popular alternative), and then you can generate code either to an existing virtual machine or make a simple one yourself.
Making it all yourself is a great learning-experience, and will help you understand what goes on behind the scenes in other virtual machines.
An excellent source for understanding programming language design and implementation concepts is Structure and Interpretation of Computer Programs from MIT Press. It's a great read for anyone wanting to design and implement a language, or anyone looking to generally become a better programmer.
From what I can understand from this, you want to know how to develop your own programming language.
If so, you can accomplish this by different methods. I just finished up my own a few minutes ago and I used HTML and Javascript (And DOM) to develop my very own. I used a lot of x.split and x.indexOf("code here")!=-1 to do so... I don't have much time to give an example, but if you use W3schools and search "indexOf" and "split" I am sure that you will find what you might need.
I would really like to show you what I did and past the code below, but I can't due to possible theft and claim of my work.
I am pretty much just here to say that you can make your own programming language using HTML and Javascript, so that you and other might not get their hopes too low.
I hope this helps with most things....
I am a bit confused about this. If you're building a distributed application, which in some cases may perform parallel operations (although not necessarily mathematical), should you use ASIO or something like MPI? I take it MPI is a higher level than ASIO, but it's not clear where in the stack one would begin.
I know nothing about ASIO but from a quick Google it looks to me to be a lot lower level than MPI. For me the whole point of MPI is so that I can program against a higher level of abstraction from the messaging than, it seems, ASIO provides. Where you begin depends on your needs. For mine, parallelising scientific codes for high-performance, the obvious answer is MPI. I'm not sure I'd use it, or at least not sure it would be my default choice, if I were writing more general-purpose distributed, as opposed to parallel, applications. Well, actually, it probably would be my default choice to avoid learning another approach (most of which are less portable and less long-lived than MPI) but I'll admit it might not be the best choice if starting from an equal footing.
As far as I know MPI is currently incapable of handling the situation, when the new distributed nodes want to join the already started group. The problems also may occur if one of the nodes goes offline.
MPI does not reveal any network related machinery that is underneath. Thus if you would ever need something on the lower level -- you're in trouble. If you on the other hand do not aticipate such a need, then you'll save yourself a lot of time using MPI.
I think Erlang is very well suited for server systems developed in my workplace (currently developed in Java). I am a bit skeptical how this would be accepted both by developers (who have no idea about functional or Erlang) and by managers.
Any ideas on how to approach the issue? I am thinking about some hybrid system, where the hardcore highly reliable infra uses Elrang, and app specific stuff developed in Java (as nodes?)
There are a few approaches, and neither have any guarantees to actually work
Implement something substantial in a short time frame, perhaps using your own time. Don't tell anyone until you have something to display that works. Unless you have a colleague in on it.
Pull up lots of Erlang projects that are good demonstrations of the features you want. Present it to your managers and try to frame them about the risk in keeping using Java with this kind of technology available.
If the company you work for actually have a working code base in Java already, they're not likely to take you seriously when you suggest to rewrite it in another language.
The true test that you believe in Erlang being a much better choice: Quit and start up a competing company and bring the technology insight you have in your current industry. Your managers are really comparing a similar risk-scenario as you would do if you were to quit your job, and they are looking for the same assuring facts for success as you would do, to consider leaving a "safe" paycheck.
As for how to integrate, check out the jinterface application in Erlang. It allows Java code to send messages to Erlang nodes, and it allows Java to expose mailboxes to the Erlang nodes as if there were Erlang processes.
It's all about ROI (Return On Investment) to a manager: a manager will be concerned about performance (of the company). In order to appeal to his business nature, you'll have to make a case for it using dollar$ (or whatever appropriate currency).
Beware that undertaking a "skunkwork" project on the side to "prove" your solution based on Erlang might backfire: "so you had time to play with Erlang, why didn't you spend the time on the project then?" (Of course, not all managers/companies would think this way).
You have to take into account the whole proposal e.g. impact on the team, skills to be developed etc. It's all about money.
If I have an advice for you: start small, plant a seed, nurture it and watch it grow.
A wise man once said to me:
"It's not about technology, it's about
the product & market".
Start by not targetting a rewrite but using erlang for a new feature/project. Rewrites can be expensive and taking a chance on erlang for something that is already a time consuming and costly undertaking is a hard sell. But if there is a new piece that could be done in erlang and java, you stand a better chance. The project will be small enough hopefully that you can discover early if erlang is a good fit and adapt accordingly. And when erlang proves itself in that project you will have better data to make your case with.
We're introducing RabbitMQ into our infrastructure, which currently runs a combination of C++, Java and Python applications. I'm not specifically intending to move the team towards Erlang, but if I were, introducing a well-written third-party tool that just happens to use Erlang is a very good way to get the foot in the door.
One major caveat is that while Erlang is a wonderful language to learn, the surrounding technology (OTP in particular) has a huge learning curve and is extremely primitive in many ways (debugging, IDE's, etc.). It is getting better all the time, but reluctant converts will crucify you if you don't warn them about the pain of learning to program in a radically different environment. Even simple things like the lack of code-sense technology (E.g., type 'foo.' and the IDE tells you what methods you can call on foo) can leave a really bad taste in the mouth.
I have an image processing routine that I believe could be made very parallel very quickly. Each pixel needs to have roughly 2k operations done on it in a way that doesn't depend on the operations done on neighbors, so splitting the work up into different units is fairly straightforward.
My question is, what's the best way to approach this change such that I get the quickest speedup bang-for-the-buck?
Ideally, the library/approach I'm looking for should meet these criteria:
Still be around in 5 years. Something like CUDA or ATI's variant may get replaced with a less hardware-specific solution in the not-too-distant future, so I'd like something a bit more robust to time. If my impression of CUDA is wrong, I welcome the correction.
Be fast to implement. I've already written this code and it works in a serial mode, albeit very slowly. Ideally, I'd just take my code and recompile it to be parallel, but I think that that might be a fantasy. If I just rewrite it using a different paradigm (ie, as shaders or something), then that would be fine too.
Not require too much knowledge of the hardware. I'd like to be able to not have to specify the number of threads or operational units, but rather to have something automatically figure all of that out for me based on the machine being used.
Be runnable on cheap hardware. That may mean a $150 graphics card, or whatever.
Be runnable on Windows. Something like GCD might be the right call, but the customer base I'm targeting won't switch to Mac or Linux any time soon. Note that this does make the response to the question a bit different than to this other question.
What libraries/approaches/languages should I be looking at? I've looked at things like OpenMP, CUDA, GCD, and so forth, but I'm wondering if there are other things I'm missing.
I'm leaning right now to something like shaders and opengl 2.0, but that may not be the right call, since I'm not sure how many memory accesses I can get that way-- those 2k operations require accessing all the neighboring pixels in a lot of ways.
Easiest way is probably to divide your picture into the number of parts that you can process in parallel (4, 8, 16, depending on cores). Then just run a different process for each part.
In terms of doing this specifically, take a look at OpenCL. It will hopefully be around for longer since it's not vendor specific and both NVidia and ATI want to support it.
In general, since you don't need to share too much data, the process if really pretty straightforward.
I would also recommend Threading Building Blocks. We use this with the IntelĀ® Integrated Performance Primitives for the image analysis at the company I work for.
Threading Building Blocks(TBB) is similar to both OpenMP and Cilk. And it uses OpenMP to do the multithreading, it is just wrapped in a simpler interface. With it you don't have to worry about how many threads to make, you just define tasks. It will split the tasks, if it can, to keep everything busy and it does the load balancing for you.
Intel Integrated Performance Primitives(Ipp) has optimized libraries for vision. Most of which are multithreaded. For the functions we need that aren't in the IPP we thread them using TBB.
Using these, we obtain the best result when we use the IPP method for creating the images. What it does is it pads each row so that any given cache line is entirely contained in one row. Then we don't divvy up a row in the image across threads. That way we don't have false sharing from two threads trying to write to the same cache line.
Have you seen Intel's (Open Source) Threading Building Blocks?
I haven't used it, but take a look at Cilk. One of the big wigs on their team is Charles E. Leiserson; he is the "L" in CLRS, the most widely/respected used Algorithms book on the planet.
I think it caters well to your requirements.
From my brief readings, all you have to do is "tag" your existing code and then run it thru their compiler which will automatically/seamlessly parallelize the code. This is their big selling point, so you dont need to start from scratch with parallelism in mind, unlike other options (like OpenMP).
If you already have a working serial code in one of C, C++ or Fortran, you should give serious consideration to OpenMP. One of its big advantages over a lot of other parallelisation libraries / languages / systems / whatever, is that you can parallelise a loop at a time which means that you can get useful speed-up without having to re-write or, worse, re-design, your program.
In terms of your requirements:
OpenMP is much used in high-performance computing, there's a lot of 'weight' behind it and an active development community -- www.openmp.org.
Fast enough to implement if you're lucky enough to have chosen C, C++ or Fortran.
OpenMP implements a shared-memory approach to parallel computing, so a big plus in the 'don't need to understand hardware' argument. You can leave the program to figure out how many processors it has at run time, then distribute the computation across whatever is available, another plus.
Runs on the hardware you already have, no need for expensive, or cheap, additional graphics cards.
Yep, there are implementations for Windows systems.
Of course, if you were unwise enough to have not chosen C, C++ or Fortran in the beginning a lot of this advice will only apply after you have re-written it into one of those languages !
Regards
Mark
I am trying to build out a useful 3d game engine out of the Ogre3d rendering engine for mocking up some of the ideas i have come up with and have come to a bit of a crossroads. There are a number of scripting languages that are available and i was wondering if there were one or two that were vetted and had a proper following.
LUA and Squirrel seem to be the more vetted, but im open to any and all.
Optimally it would be best if there were a compiled form for the language for distribution and ease of loading.
One interesting option is stackless-python. This was used in the Eve-Online game.
The syntax is a matter of taste, Lua is like Javascript but with curly braces replaced with Pascal-like keywords. It has the nice syntactic feature that semicolons are never required but whitespace is still not significant, so you can even remove all line breaks and have it still work. As someone who started with C I'd say Python is the one with esoteric syntax compared to all the other languages.
LuaJIT is also around 10 times as fast as Python and the Lua interpreter is much much smaller (150kb or around 15k lines of C which you can actually read through and understand). You can let the user script your game without having to embed a massive language. On the other hand if you rip the parser part out of Lua it becomes even smaller.
The Python/C API manual is longer than the whole Lua manual (including the Lua/C API).
Another reason for Lua is the built-in support for coroutines (co-operative multitasking within the one OS thread). It allows one to have like 1000's of seemingly individual scripts running very fast alongside each other. Like one script per monster/weapon or so.
( Why do people write Lua in upper case so much on SO? It's "Lua" (see here). )
One more vote for Lua. Small, fast, easy to integrate, what's important for modern consoles - you can easily control its memory operations.
I'd go with Lua since writing bindings is extremely easy, the license is very friendly (MIT) and existing libraries also tend to be under said license. Scheme is also nice and easy to bind which is why it was chosen for the Gimp image editor for example. But Lua is simply great. World of Warcraft uses it, as a very high profile example. LuaJIT gives you native-compiled performance. It's less than an order of magnitude from pure C.
I wouldn't recommend LUA, it has a peculiar syntax so takes some time to get used to. Depending on who will be doing the scripting, this may not be a problem, but I would try to use something fairly accessible.
I would probably choose python. It normally compiles to bytecode, so you would need to embed the interpreter. However, if you must you can use PyPy to for example translate the code to C, and then compile it.
Embedding the interpreter is no issue. I am more interested in features and performance at this point in time. LUA and Squirrel are both interpreted, which is nice because one of the games i am building out is to include modifiable code, which has an editor in game.
I would love to hear about python, as i have seen its use within the battlefield series i believe.
python is also nice because it has actual OGRE bindings, just in case you need to modify something lower-level on the fly. I don't know of any equivalent bindings for Lua.
Since it's a C++ library, I would suggest either JavaScript or Squirrel, the latter being my personal favorite of the two for being even closer to C++, in particular to how it handles tables/structs and classes. It would be the easiest to get used to for a C++ coder because of all the similarities.
However, if you go with JavaScript and find an HTML5 version of Ogre3D, you should be able to port your game code directly into the web version with minimal (if any) changes necessary.
Both of these are a good choice, and both have their pros and cons, but both would definitely be the easiest to learn since you're likely already working in C++. If you're working with Java, the same may hold true, and if it's Game Maker, you wouldn't need either one unless you're trying to make an executable that people wouldn't need Game Maker itself to run, in which case, good luck finding an extension to run either of these.