Possible Duplicate:
What is the best way to learn Erlang?
I'm interested in learning Erlang and would appreciate suggestions on resources (books, websites, etc.) that can help me along. So far I've learned quite a bit from Learn You Some Erlang. At this point I'm comfortable with the syntax and most of the basic concepts. As a practice project I was thinking of writing a socket server app that serves XML data when a client connects. Unfortunately, I'm not sure where to start, i.e. which libraries to use and how to use them. Thanks.
gen_tcp is Erlang's interface to TCP/IP sockets. You can find many examples of how to use it in the Erlang/OTP libraries or in open-source applications. For example, take a look at these HTTP server and client libraries: https://github.com/mochi/MochiWeb, https://github.com/cmullaparthi/ibrowse
Handling XML in Erlang is more painful than it should be. JSON might be a little bit easier, if you have an option to use it instead of XML.
For XML, there's the standard xmerl library, which is part of Erlang/OTP. I found that the least painful way to extract the necessary pieces from XML is to use XPath (xmerl_xpath:string). For XML generation, xmerl:export_simple is the way to go.
I've also used the Erlsom library. It has a (rather) simple XML parsing interface.
trapexit has an excellent article on building a TCP server with OTP.
One of my college professors said that Ruby on Rails is used a lot for the web, and I'm wondering how much Ruby on Rails is actually used vs. jQuery, Node.js, PHP, etc. Also, what are the benefits?
You are mixing up several different things:
Ruby on Rails is a framework for creating server-side web applications in the Ruby language.
jQuery is a client-side JavaScript library that simplifies writing JavaScript web clients.
Node.js is a runtime for executing JavaScript on the server, i.e. outside the browser.
PHP is a language popular for server-side web application development.
Thus: Ruby on Rails is a mature framework which offers a template engine, an MVC architecture, an object-relational mapper between language objects and a relational database, and a routing facility between URIs and controllers.
Similarly designed frameworks exist for many programming languages and environments, e.g. Django for Python, or the Rails-inspired PHP frameworks in the case of PHP.
About its popularity, see e.g. http://hotframeworks.com/
Benefits: IMHO it is a very elegant framework and as the plethora of inspired frameworks shows, has found many developers who like it.
The concepts and techniques learned here might also turn out to be useful when working with other modern frameworks.
I should also note that some web applications need fewer features; see e.g. the Sinatra framework for a lighter alternative.
Also, what are the benefits?
There are a lot of things that websites have in common, e.g. html pages with forms, various javascript features, database interactions, security issues, logging in, etc. If you start from scratch, and try to program all those things yourself, it will be difficult and time consuming, and most likely your code will be full of exploitable security holes.
The other option is to use a web framework. Ruby on Rails is a web framework for the Ruby programming language. All the various server-side programming languages, such as Ruby, Python, PHP, Perl, Java, etc., have web frameworks (and usually many different frameworks to choose from!). A lot of smart people have come up with the best code for the various things that websites need, and you get to use their code for free in your website.
The disadvantage of frameworks is that they are often large and complex, e.g. Ruby on Rails or Java Servlets + JSP, so it can take a while to learn how to use them. Even then, you will probably not have a good grasp of their inner workings, so you are always sort of feeling around in the dark trying to get them to work the way you want them to. It's sort of like trying to push a large boulder which is at rest to another spot of your choosing: sometimes the boulder rolls cleanly into position, and other times the boulder seems to have a mind of its own and wanders off course.
I'm building a new game and I need to build a web app to help manage content generation. The app would consist of a couple simple forms that would tie into a MySQL db.
I've been really interested in learning Lua for a long time due to its popularity in the video game industry, and was wondering how well it works as a server-side language. I could easily write the web app in PHP, but I'd rather use this opportunity to learn Lua if it makes sense.
What do you all think?
Cheers,
Sure it can be done. Good idea if you just want to learn Lua. You should start here: http://www.keplerproject.org/
Of course, if your app consists of only a couple of simple forms, you can use whatever you want. But if it is more complex (or will become more complex in the future), it would be better to use an industry-standard language like Python or Ruby (or at least PHP). There are a lot of good frameworks written in them that greatly simplify your work (I don't know of any complete Lua web frameworks).
You should remember that in the future other people will have to maintain your code, and there are very few web developers who know Lua.
There will probably be problems with documentation and basic libraries too.
While Lua is a nice language for embedded development, I would vote strongly against Lua for web development.
The reason is that in games you simply don't have an external API. Everything is done with your own objects, plus some calls into your game engine.
But the web world is so full of stuff you need, like SMTP, POP3, IMAP, SSL, Amazon APIs, Google APIs, RSS APIs, imaging, etc., and while the checklist for Lua may have a check mark behind all these words, it doesn't mean anything. Most of the stuff I have seen is just a "me too" implementation, not industrial strength. They are projects by hobbyists and are published on an "it's good enough for me" basis, which is totally unacceptable if you ever go mission critical.
There is a reason why it takes years and a huge community to get this up. Lua has an extremely small community of web developers.
So if this is a professional project where you put your money, I can only say hands off. On the other hand, if you have enough money, I still have some snake oil here for sale; please contact me.
I have been using Lua for years as a web language, initially with the Xavante project and more recently with Apache 2.
Don't listen to any naysayers; it's a great language for web development. We use it to write business software, and not just for form processing but for graphical applications too.
Also it offers us seamless integration to any other lua or system functions we might need to call.
Good Luck!
Have a look at Nanoki, which is built on a pretty minimal set of libraries (lfs, luasocket, lzlib, slncrypto), and Sputnik, which is built on Xavante or CGI.
Lua is a good language but it is best suited to embedding within an existing project in order to quickly extend the capabilities of that project. In particular, the interesting aspect comes with how you bind it to the host application. This is definitely the case when programming for games where it is an embedded language rather than the language the whole app tends to be written in. So using a web app to learn about Lua with a view to making games is probably not a very good approach, especially since the syntax is very simple and would be picked up quite quickly anyway.
I think that specific variants of Lua can be used successfully for web applications, and I have done that in the past using the maintained web library. It can depend on whether lower-level software on the machine is itself written in Lua (for its speed), since that may cause a clash of Lua versions. As for running it server side, the server would need a compatible version of the scripting toolchain for the hardware, plus suitable bytecode or VM instructions and a custom VM runtime implementation to run the application.
I've been developing a pure Lua web server; you could always check it out and see if it suits your needs:
Lua4Web https://github.com/schme16/Lua4Web
We're designing a large scale web scraping/parsing project. Basically, the script needs to go through a list of web pages, extract the contents of a particular tag, and store it in a database.
What language would you recommend for doing this on a large scale (tens of millions of pages)?
We're using MongoDB for the database, so anything with solid MongoDB drivers is a plus.
So far, we have been using (don't laugh) PHP, curl, and Simple HTML DOM Parser, but I don't think that's scalable to millions of pages, especially as PHP doesn't have proper multithreading.
We need something that is easy to develop in, can run on a Linux server, has a robust HTML/DOM parser to easily extract that tag, and can easily download millions of webpages in a reasonable amount of time.
We're not really looking for a web crawler, because we don't need to follow links and index all content, we just need to extract one tag from each page on a list.
If you're really talking about large scale, then you'll probably want something that lets you scale horizontally, e.g., a Map-Reduce framework like Hadoop. You can write Hadoop jobs in a number of languages, so you're not tied to Java. Here's an article on writing Hadoop jobs in Python, for instance. BTW, Python is probably the language I'd use, thanks to libs like httplib2 for making the requests and lxml for parsing the results.
If a Map-Reduce framework is overkill, you could keep it in Python and use multiprocessing.
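To make the "keep it in Python" option a bit more concrete, here is a minimal sketch of fanning a URL list out over a worker pool. It sticks to the standard library so it is self-contained: `html.parser` stands in for lxml and `urllib` for httplib2, and `multiprocessing.pool.ThreadPool` (the thread-backed pool from the multiprocessing package) is used on the assumption that the work is mostly waiting on the network. The names `extract_tag`, `scrape_one`, `scrape_all` and the `fetcher` parameter are made up for illustration.

```python
from html.parser import HTMLParser
from multiprocessing.pool import ThreadPool
from urllib.request import urlopen


class FirstTagText(HTMLParser):
    """Collects the text inside the first occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.inside = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag and not self.done:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == self.tag and self.inside:
            self.inside = False
            self.done = True

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)


def extract_tag(html, tag="title"):
    """Return the stripped text of the first <tag>...</tag>, or None."""
    parser = FirstTagText(tag)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() if parser.done else None


def fetch(url, timeout=10):
    """Default fetcher: download one page and decode it leniently."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape_one(url, fetcher=fetch, tag="title"):
    """Fetch and parse one URL; a failing page yields (url, None)."""
    try:
        return url, extract_tag(fetcher(url), tag)
    except Exception:
        # One bad page should not stop the whole batch.
        return url, None


def scrape_all(urls, fetcher=fetch, tag="title", workers=32):
    """Run scrape_one over all URLs on a pool of worker threads."""
    with ThreadPool(workers) as pool:
        return dict(pool.map(lambda u: scrape_one(u, fetcher, tag), urls))
```

Calling `scrape_all(urls)` returns a dict mapping each URL to the extracted text (or `None`). If parsing rather than downloading turns out to be the bottleneck, the same shape should carry over to a process pool, with the lambda replaced by a picklable top-level function.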
UPDATE:
If you don't want a MapReduce framework, and you prefer a different language, check out the ThreadPoolExecutor in Java. I would definitely use the Apache Commons HTTP client stuff, though. The stuff in the JDK proper is way less programmer-friendly.
You should probably use tools used for testing web applications (WatiN or Selenium).
You can then compose your workflow separated from the data using a tool I've written.
https://github.com/leblancmeneses/RobustHaven.IntegrationTests
You shouldn't have to do any manual parsing when using WatiN or Selenium. You'll instead write a CSS selector (querySelector).
Using TopShelf and NServiceBus you can scale the number of workers horizontally.
FYI: With Mono, the tools I mention can run on Linux (although your mileage may vary).
If JavaScript doesn't need to be evaluated to load data dynamically:
Anything requiring the whole document to be loaded in memory is going to waste time. If you know where your tag is, all you need is a SAX parser.
I do something similar using Java with the Apache Commons HttpClient library, although I avoid the DOM parser because I'm looking for a specific tag which can be found easily with a regex.
The slowest part of the operation is making the http requests.
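As a rough illustration of the SAX-style approach suggested above, the sketch below feeds a page to an event-driven parser chunk by chunk and stops reading as soon as the wanted tag has closed, so no DOM is ever built. It uses Python's standard `html.parser` as a stand-in for a real SAX parser; `TagGrabber` and `first_tag_text` are hypothetical names, and the chunks could come from any streaming HTTP response.

```python
from html.parser import HTMLParser


class TagGrabber(HTMLParser):
    """Event-driven (SAX-style) handler that keeps only one tag's text."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.inside = False
        self.text = None      # set once the first matching tag has closed
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag and self.text is None:
            self.inside = True

    def handle_data(self, data):
        if self.inside:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self.inside:
            self.inside = False
            self.text = "".join(self._parts).strip()


def first_tag_text(chunks, tag):
    """Feed chunks one at a time and stop once the tag is complete.

    Returns (text or None, number of chunks consumed).
    """
    grabber = TagGrabber(tag)
    consumed = 0
    for chunk in chunks:
        grabber.feed(chunk)
        consumed += 1
        if grabber.text is not None:
            break  # the rest of the page never needs to be read
    return grabber.text, consumed
```

Since the HTTP request is the slow part, the early `break` is where the savings would come from: for a tag near the top of the page, most of the body never has to be downloaded or parsed.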
What about C++? There are many libraries that can help you at large scale.
Boost.Asio can handle the networking.
TinyXML can parse XML files.
I have no particular database in mind, but almost all databases have C++ interfaces, so that is not a problem.
I recently discovered Erlang and am now working my way through a couple of tutorials. By now I'm looking forward to actually implementing something as a hobby project. I'm not really interested in yet another chat server. I would like to code something more interesting (yes, I'm aware that this is a rather fuzzy term) which is also manageable, so I can finish it in my spare time.
Any suggestions?
Edit: The project should preferably highlight Erlang's strengths (concurrency, distribution).
Build a distributed system that searches twitter feeds in real time and allows anyone to perform searches from a web front end.
Build a distributed file system. Implement distributed B*Trees or B+Trees as the base of this file system. Do it in Erlang.
Build a distributed key value store on top of the distributed file system built in step 2.
Build a distributed web index (to be used by a distributed web search engine) on top of the key value store.
Build a distributed linker. Advanced build automation offers remote agent processing for distributed builds and/or distributed processing.
Build a MMORPG backend that relies on distributed storage of the game/player state and distributed processing of user requests.
For something for yourself, consider writing a simple server; something that, for example, services date/time requests or -- a little fancier -- an HTTP daemon that serves only static content.
The best part of Erlang is the way it handles concurrency; exercise that.
Project Euler, for sure.
Some things from my copious to-do list that would both be good learning exercises and helpful to the Erlang community at large:
Profile all the available Key/Value stores:
Write a library for testing insert, lookup, delete, search times for a variety of K/V stores
Create a benchmark suite people can run
Make it work with ets, dets, proplists, gb_trees, dict, orddict, redblack trees, bdb, tokyocabinet, ...
Produce pretty graphs
Make it easy to update, contribute to and run on anyone's machine
Write a new io_lib:format routine that uses named parameters:
io_lib:nformat("Hi there ~{name}s~n.", [{name, "Bob"}]).
This is useful for internationalisation if the position of parameters changes when the language of the format string changes.
Extend erl -make (make.erl)
Allow adding code paths (so that you don't need to do erl -pa LibraryPath -make)
Compile/load behaviour modules before modules that implement those behaviours
Handle hierarchical modules correctly (output path in particular)
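On the named-parameter formatting idea from the list above: the Erlang routine does not exist yet (that is the point of the exercise), but the intended behaviour can be sketched quickly in Python. This is only an illustration of the idea, not Erlang code; the `nformat` name and the `~{name}s` placeholder syntax are borrowed from the proposal, and everything else is an assumption.

```python
import re

# Matches the proposed ~{name}s placeholder syntax.
_PLACEHOLDER = re.compile(r"~\{(\w+)\}s")


def nformat(template, params):
    """Fill ~{name}s placeholders from (name, value) pairs; ~n is a newline.

    Raises KeyError if the template names a parameter that was not supplied.
    """
    values = dict(params)
    # Expand ~n first so substituted values are never rewritten afterwards.
    expanded = template.replace("~n", "\n")
    return _PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), expanded)


params = [("count", 3), ("folder", "tmp")]
english = nformat("~{count}s files in ~{folder}s~n", params)
reordered = nformat("In ~{folder}s: ~{count}s Dateien~n", params)
```

The two calls take the same parameter list even though the placeholders appear in a different order, which is exactly what a translator needs when word order differs between languages.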
This doesn't exactly answer your question, but if you are looking for an interesting free, open-source project that is written in Erlang, you should definitely check out CouchDB. From the website:
Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.
CouchDB is written in Erlang, but can be easily accessed from any environment that provides means to make HTTP requests. There are a multitude of third-party client libraries that make this even easier for a variety of programming languages and environments.
The CouchDB website has more details. Happy coding!
Find something Erlang doesn't have that you understand and like. I did that with etap: https://github.com/ngerakines/etap/ Now Nick has taken over management and it's used internally at EA Games. It was fun to make and, as a previous poster suggested, it was something real, so I learned to solve real-world problems while working on it.
File indexing/search system. This was going to be my intro project, but I've switched over to something else.
Once you've got it working you could move the indexes to Mnesia, and then spread the thing out over other nodes to have a whole-network index.
I am considering Erlang as a potential for my upcoming project. I need a "Highly scalable, highly reliable" (duh, what project doesn't?) web server to accept HTTP requests, but not really serve up HTML. We have thousands of distributed clients (other systems, not users) that will be submitting binary data to central cluster of servers for offline processing. Responses would be very short, success, fail, error code, minimal data. We want to use HTTP since it is our best chance of traversing firewalls.
Given this limited information about the project, can you provide any weaknesses that might pop up using a technology like Erlang? For instance, I understand Erlang's text processing capabilities might leave something to be desired.
Your comments are appreciated.
Thanks.
This sounds like a perfect candidate for a language like Erlang. The scaling properties of the language are very good, but if you're worried about the data processing abilities, you shouldn't be. It's a very powerful language, with many libraries available for developers. It's an old language, and it's been heavily used/tested in the past, so everything you want to do has probably already been done to some degree.
Make sure you use Erlang version R11B5 or newer! Earlier versions of Erlang did not provide the ability to time out TCP sends. As a result, stalled or malicious clients could mount a DoS attack on your application by refusing to recv the data you send them, thus locking up the sending process.
See issue OTP-6684 from R11B5's release notes.
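The stalled-client problem is not specific to Erlang, so here is a small Python sketch of the mechanism, purely as an illustration: a peer that never reads eventually fills the socket buffers, and a send without a timeout would then block forever. In Erlang the relevant knob is, as far as I know, the `send_timeout` socket option; the `send_with_timeout` helper below is a made-up name.

```python
import socket


def send_with_timeout(sock, payload, timeout):
    """Return True if the whole payload was sent, False if the peer stalled.

    The timeout (in seconds) bounds the total time spent in sendall().
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(payload)
        return True
    except socket.timeout:
        # The peer stopped reading and the buffers are full. Some bytes may
        # already have gone out, so the connection should be closed, not reused.
        return False


# A connected pair where the second socket plays the client that never reads.
sender, stalled_peer = socket.socketpair()
small_ok = send_with_timeout(sender, b"x" * 64, 0.2)
big_ok = send_with_timeout(sender, b"x" * (16 * 1024 * 1024), 0.2)
sender.close()
stalled_peer.close()
```

The small payload fits in the socket buffers and goes through, while the large one would hang indefinitely without the timeout. With it, the sending side gets control back and can drop the connection.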
With Erlang the scalability and reliability are there, but your project definition doesn't outline what type of text processing you will need.
I think Erlang's main limitation might be finding experienced developers in your area. Do some research on the availability of Erlang architects and coders.
If you are going to teach yourself or have your developers learn it on the job, keep in mind that it is a very different way of coding and that, while the core documentation is good, a lot of people do wish there were more examples. Of course, the very active community easily makes up for that.
I understand Erlang's text processing capabilities might leave something to be desired.
The starling project already provides basic Unicode support, and there is an EEP (Erlang Enhancement Proposal) currently in draft that aims to bring it into mainstream Erlang/OTP.
I encountered some problems with Redis read performance from Erlang. Here is my question. I tend to think the cause is the client module written in Erlang, which struggles when processing tons of strings while communicating with Redis.