Is FastCGI still a right answer?

FastCGI is old but it still seems like it must be the right answer in some cases.
It seems like the preferred deployment of Perl/Catalyst web applications is with FastCGI.
FastCGI was popular with Rails but seems to no longer be. (Why?)
The Java world doesn't seem to have anything to do with FastCGI. Is something like Tomcat way better than Apache+FastCGI?
Is choosing FastCGI still a good idea or just a lingering technology?
Ted

Since it depends a lot on your setup and requirements, I'll leave the "Is X still a right answer?" part up to you. However, by looking at different architectures, you can come up with a list of questions to ask to determine whether it is still a right answer given your specific circumstances.
Concerns of frequent interest
The questions you'll want to ask are usually related to security and flexibility. For security, you'll want to follow the principle of least privilege. For flexibility, you'll want to know if you can run multiple frameworks, multiple versions of the framework and how easily you can delegate work to other tasks.
Other concerns
For a simple web front-end to a database-backed application, not all of these questions are important. You also need to keep in mind that some of the recommendations out there have nothing to do with what's outlined here. Many web frameworks will recommend whatever architecture is easiest to set up with their framework. They do this because it helps new users try out the framework with minimal fuss and without flooding the mailing list. Also, the Java community tends to stick to a common denominator rather than take full advantage of the platform at hand, so they'll often recommend an all-Java solution.
Popular architectures
Single process architectures
From a pure performance point of view, a single process (probably threaded) with an embedded framework probably gives the best performance, as it eliminates most of the communication overhead between whatever receives the request and whatever produces the response.
Security: a single process must have all of the permissions required to perform every single task it is handed. In simple applications, this might not be a problem. However, it's possible you might serve multiple services from that one process, each needing different privileges.
Flexibility: you probably can't run multiple versions of the same framework (e.g. code for different parts of your website requires different versions of Java, Rails, Python, etc.). Moreover, changing your setup to serve some of the work on different machines becomes painful (less difficult when split up on virtual hosts).
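To make that concrete, here's a minimal sketch of the single-process, threaded model using only the Python standard library (purely illustrative, not tied to any particular framework; requires Python 3.7+ for ThreadingHTTPServer):

# a single OS process serves every request from worker threads,
# so this one process needs every permission the whole site needs
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # no gateway, no IPC: the response is produced entirely in-process
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(b'Handled in-process\n')

if __name__ == '__main__':
    ThreadingHTTPServer(('127.0.0.1', 8080), Handler).serve_forever()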
Sub-process based architectures
Under the CGI model, you have to pay the price of spawning a new process for each request. Even on UNIX machines where spawning a process is considered cheap, 600 requests a second will kill your server if you spawn a process for each.
Security: to spawn child processes under different user accounts, your gateway probably runs under quite high privileges.
Flexibility: you gain additional flexibility (multiple frameworks, multiple versions, multiple languages), but you're still stuck on the same machine.
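For contrast, a CGI program is just an executable the gateway spawns anew for every request; a minimal, purely illustrative Python version looks like this:

#!/usr/bin/env python3
# a fresh interpreter (and OS process) is started for every single request,
# which is exactly the per-request cost FastCGI was designed to remove
import os
import sys

sys.stdout.write("Content-Type: text/plain\r\n\r\n")
sys.stdout.write("Served by brand-new process %d\n" % os.getpid())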
Distributed architectures
The FastCGI/SCGI approach tried to solve the CGI process management problem in a clean way. Just keep the process alive. Have the gateway talk to that process to serve the request.
Security: Because the gateway doesn't spawn the processes that serve requests, the gateway can run with far less privileges enabled. Actually, if it only serves as a gateway and doesn't do any work itself, it can run with hardly any privileges at all.
Flexibility: you get even better flexibility than the CGI model because you can forward the request to any machine on the network.
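As a hedged sketch of that model in Python (assuming the third-party flup package, which speaks FastCGI for WSGI applications), the application runs as its own long-lived process and listens on a TCP socket, so the gateway can sit on a different machine:

# assumes the third-party flup package (the Python 3 fork is usually
# published as flup6); pip install it before running
from flup.server.fcgi import WSGIServer

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello from a persistent FastCGI worker\n']

if __name__ == '__main__':
    # bind to a TCP address instead of a Unix socket so the gateway
    # (Apache, nginx, lighttpd, ...) can forward requests over the network
    WSGIServer(app, bindAddress=('0.0.0.0', 9000)).run()

The gateway side is then just a matter of pointing a URL at host:9000 with your web server's FastCGI module.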
Conclusion
I like FastCGI because it gives me high flexibility at a price (i.e. requests forwarded through a socket) I can afford to pay. It's not my full-time job to administer systems, and I don't develop all the apps I host. This means I look for the easiest solution for hosting whatever I try to host. FastCGI is popular enough to be supported by major web servers and popular web frameworks. Adding another app usually just boils down to installing it and mapping the desired URL to the application over FastCGI.

Related

Neo4j Standalone vs Embedded server?

I want to know what the difference is between these two implementations of Neo4j. Of course the names of both techniques are self-explanatory, but what are the main differences?
What factors should be considered in deciding which technique to use in the project?
Pros and cons.
P.S. Sorry if it is a repeat question, but I searched and was not able to find any question which answers mine.
Because the standalone server is built on the embedded server, the general rule of thumb is that the embedded server is more capable and has (obviously) lower latency. Either can operate in High-Availability mode, allow monitoring, and even accept connections from the neo4j-shell. With the server though, you get more functionality out-of-the-box, like remoting, basic visualization, monitoring interface, etc.
The differences are otherwise the practical ones you'd imagine. Choosing a deployment approach is influenced by two things:
Language - embedded mode requires that you implement your application in a JVM-compatible language. The server supports any language/framework that can send HTTP requests (see the sketch after this list).
Hardware - sharing physical resources between your application and Neo4j can be demanding. Scaling may argue for a dedicated machine to split out the persistence layer. The server obviously has a remote API to support segmenting your application.
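As a rough illustration of the server/HTTP option, any language with an HTTP client will do; the sketch below assumes a local server on the default port and the REST transactional Cypher endpoint, so adjust the URL, version and authentication for your setup:

import json
import urllib.request

# run a Cypher statement against a standalone server over plain HTTP;
# the endpoint and port are assumptions based on the server defaults
payload = json.dumps({
    "statements": [{"statement": "MATCH (n) RETURN count(n)"}]
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:7474/db/data/transaction/commit",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))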
It's otherwise difficult to give guidance without a specific usage scenario. Deploying into an existing Service Oriented Architecture? Probably server. Running on a copier machine? Go embedded. A from-scratch web application? What's the rest of your stack?

What is your experience with Nitrogen on Erlang?

I've been checking out the Nitrogen Project which is supposed to be the most mature web development framework for Erlang.
Erlang, as a language, is extremely impressive. However, with regard to Nitrogen, what I am not too keen on is using Erlang's rather uncommon syntax (unless you're fluent in Prolog) to build UIs.
What is your experience with it as opposed to other mainstream web frameworks such as Django or Rails?
I've done very little with Nitrogen so far, but I've been monitoring the mailing list for months, so I think I have something useful to say about it.
To your concern about the syntax of Erlang and the Nitrogen framework, I'd respond that that sounds like a pure case of unfamiliarity, rather than unsuitability. Objectively, HTML is not a beautiful language, and it has plenty of quirks. You're used to this now, so it doesn't seem so bad. Give Nitrogen/Erlang a chance and you may find that you get used to it soon enough, too.
To your question about comparison to other languages and frameworks, I'd say the biggest difference is that with Nitrogen, the entire web site is being served directly by the Erlang runtime. Ruby on Rails has such a mode, but it's intended only for testing. Many other frameworks don't even offer the option of running everything within a single long-running process.
Running the entire web application and its underlying infrastructure within a single long-running process has significant implications on how the site runs:
With Apache, each child gets killed off every N connections, where N=500 or so, and you can't say whether a given child will always handle all of a given client's requests. Because HTTP is stateless but web apps almost always require some client state, an Apache child must rebuild its view of client state as part of handling a new connection. By default, this means going back to disk for persistent data stored about that client. There are alternatives like memcached, but these aren't built into the core of a LAMP type stack. With Erlang, nothing is killed off periodically, and Erlang offers standard facilities like Mnesia which provide disk-backed in-memory DBs.
Incidentally, if you're familiar with nginx, it's built on the same principles as Erlang, and it's fast for the same reason. The main difference between nginx and an Erlang instance running a web server is that nginx isn't a programming environment, so it still has to delegate a lot of processing to outside code. That means it shares the same IPC and persistent state problems as Apache.
Because the runtime stays up continuously and is a fully-functional programming environment, you can probably build more parts of your system in Erlang than with a lashed-together LAMP type stack. This magnifies the above benefits. The various parts of your system can coordinate via message passing and Mnesia instead of heavyweight IPC and MySQL, and all the pieces stay up and running continually, leading to less time-consuming state reconstruction.
A dozen or so Apache children all accessing the persistent client state data store is a lock-based hairball. The frameworks all handle locking and such for you transparently, but what they can't hide is the time it takes to do all this correctly.
Erlang is an impure functional language, which implies but does not require data purity; it is also built with multiprocessing in mind, going clear down to the core of the runtime design. These two facts mean you're less likely to spend time waiting on locks in an Erlang based server than one naively built on one of the other frameworks. It is certainly possible to optimize away lock delays in the other systems, but is that really what you want to be doing? Do you want to be on the thousandth team that has to learn how to optimize its web stack after the service becomes popular, or would you rather leave it all up to the tooling so you can spend your time doing something no one else has done yet?
I, too, was once concerned about clunky Erlang syntax. I've built a couple of tools to alleviate its annoyances for everyday web programming, and perhaps you will find one or both of them helpful:
ErlyDTL is an Erlang implementation of the Django Template Language; it's not available in Nitrogen, but it is available in other frameworks, such as Zotonic, Erlang Web, BeepBeep, and Chicago Boss
Chicago Boss is a full-stack Erlang framework that does a lot of code generation so that you can access data fields with function calls instead of Erlang's rather verbose record syntax (e.g. Person:name() instead of Person#person.name)
Note that Nitrogen does not include a database layer, so it's not really comparable to Rails or Django. For a comprehensive comparison of the database-driven frameworks, check out my answer to this StackOverflow question:
https://stackoverflow.com/questions/1822518/current-state-of-erlang-web-development-frameworks-template-languages/2898271#2898271
I would check out Webmachine if I were you. It is quite simple, fast, and leaves the interface up to you.
Erlang Web should also be considered mature. It is an MVC framework, whereas Nitrogen is more event based. It's a matter of preference.
I haven't used the other tools mentioned here except Webmachine, which I think is a wonderful tool, but it is not a web framework like the others. It is an HTTP processor, and is ideal for building RESTful interfaces.
I would also suggest you give the Erlang syntax a chance. Erlang is one of my favourite languages to use.

Cloud-aware programming and help choosing a good framework

How can I write a cloud-aware application? e.g. an application that takes advantage of being deployed on the cloud. Is it the same as an application that runs on a VPS/dedicated server? If not, what are the differences? Are there any design changes? What steps do I need to take if I am to migrate an application to being cloud-aware?
Also, I am about to implement a web application idea which would need features like security, performance, caching, and, more importantly, be free. I have been comparing some frameworks and found that Django has the least RAM/CPU usage and works great in prefork+threaded mode, but I have also read that Django-based sites stop responding under a huge load of connections. Other frameworks that I have seen/know are Zend, CakePHP, Lithium/Cake3, CodeIgniter, Symfony, Ruby on Rails....
So I would leave this to your opinion as well: suggest a good free framework based on my needs.
Finally thanks for reading the essay ;)
I feel a matrix moment coming on... "what is the cloud? The cloud is all around us, a prison for your program..." (what? the FAQ said bring your sense of humour...)
Ok, so seriously, what is the cloud? It depends on the implementation, but usual features include scalable computing resources and charges per CPU-hour, per storage used, etc. So yes, it is a bit like developing on your VPS/a normal server.
As I understand it, Google App Engine allows you to consume as much as you want. The back-end resource management is done by Google and billed to you and you pay for what you use. I believe there's even a free threshold.
Amazon EC2 exposes an API that allows you to add virtual machine instances (someone correct me please if I'm wrong), having pre-configured them, deploy another instance of your web app, and talk between private IP ranges if you wish (Slicehost definitely allows this). As such, EC2 can allow you to act like a giant load balancer on the front end, passing work off to a whole number of VMs on the back end, or expose all of that publicly; take your pick. I'm not sure on the exact detail because I didn't build the system, but that's how I understand it.
I have a feeling (but I know least about Azure) that on Azure, resource management is done automatically, for you, by Microsoft, based on what your app uses.
So, in summary, the cloud is different things depending on which particular cloud you choose. EC2 seems to expose an API for managing resources, while GAE and Azure appear to be environments which grow and shrink in the background based on your use.
Note: I am aware there are certain constraints on developing for GAE, particularly with Java. In a minute, I'll edit in another thread where someone made an excellent comment on one of my posts to this effect.
Edit as promised, see this thread: Cloud Agnostic Architecture?
As for a choice of framework, it really doesn't matter as far as I'm concerned. If you are planning on deploying to one of these platforms you might want to check framework/language availability. I personally have just started Django and love it, having learnt python a while ago, so, in my totally unbiased opinion, use Django. Other developers will probably recommend other things, based on their preferences. What do you know? What are you most comfortable with? What do you like the most? I'd go with that. I chose Django purely because I'm not such a big fan of PHP, I like Python and I was comfortable with the framework when I initially played around with it.
Edit: So how do you write cloud-aware code? You design your software in such a way it fits on one of these architectures. Again, see the cloud-agnostic thread for some really good discussion on ways of doing this. For example, you might talk to some services on GAE which scale. That they are on GAE (example) doesn't really matter, you use loose coupling ideas. In essence, this is just a step up from the web service idea.
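A tiny, purely illustrative sketch of that loose-coupling idea (the service name and URL below are made up): the application only knows a configurable endpoint, so whether the service behind it lives on GAE, EC2 or a box in the corner doesn't change your code.

import os
import urllib.parse
import urllib.request

# hypothetical thumbnailing service; the URL comes from configuration so
# the deployment, not the code, decides what actually backs it
THUMBNAIL_SERVICE = os.environ.get(
    "THUMBNAIL_SERVICE_URL", "http://localhost:8000/thumbnail")

def request_thumbnail(image_url):
    # delegate the heavy work to whichever service the deployment points at
    query = urllib.parse.urlencode({"src": image_url})
    with urllib.request.urlopen(THUMBNAIL_SERVICE + "?" + query) as resp:
        return resp.read()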
Also, another feature of the cloud I forgot to mention is the idea of CDNs being provided for you - some cloud implementations might move your data around the globe to make it more efficient to serve, or just because that's where they've got space. If that's an issue, don't use the cloud.
I cannot answer your question - I'm not experienced in such projects - but I can tell you one thing... both CakePHP and CodeIgniter are designed for PHP4 - in other words: for really old technology. And it seems nothing is going to change in their case. Symfony (especially 2.0 version which is still in heavy beta) is worth considering, but as I said on the very beginning - I can not support this with my own experience.
For designing applications for deployment to the cloud, the main thing to consider is recoverability. If your server is terminated, you may lose all of your data. If you're deploying on Amazon, I'd recommend putting all data that needs to persist onto an Elastic Block Storage (EBS) device. This would be data like user-generated content/files, the database files and logs. I also take EBS snapshots on a 5-day rotation so the volume itself is backed up. That said, I've had a cloud server up on AWS for over a year without any issues.
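As a hedged sketch of automating that rotation (assuming boto3 and AWS credentials are already configured; the volume id is a placeholder):

import boto3

# snapshot the EBS volume that holds user content, database files and logs,
# so the data survives even if the instance itself is terminated
ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # hypothetical volume id
    Description="rotating backup of the application data volume",
)
print(response["SnapshotId"])

Cleaning up snapshots older than the rotation window would be a separate scheduled job.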
As for frameworks, I'm giving Grails a try at the minute and I'm quite enjoying it. Built to be syntactically similar to Rails but runs on the JVM. It means you can take advantage of all the Java goodness, like threading, concurrency and all the great libraries out there to build your web application.

Performance implications of using WF to control UI navigation in ASP.NET MVC application

I need to build a web application with different process flows and different UI steps depending on the locale of the logged in user.
I have developed a number of ASP.NET applications in C# and like the separation of concerns an MVC approach would give me. So I am looking at using these technologies.
The kicker is that different users in different locales need to have very different experiences, despite accessing the same datasource. I'm also constrained by the requirement to have new process flows be able to be configured easily. The XAML based Windows Workflow Foundation looks like a good candidate, and would allow me to avoid developing my own process flow engine.
However, I am a little concerned about the performance implications of such an approach. Has anyone tried this sort of architecture? What sort of impact can I expect on request time, CPU utilization and memory consumption?
All opinions gratefully received, Thanks.
Howdo
The principal objection, and I suppose performance risk, although not specifically related to your question, is that Windows Workflow hasn't been designed for that type of scenario. Simply put, WF's first design tenet was to enable long-running transactions built around web services and SOA, i.e. transactional connectivity lasting days, and its runtime is optimised for that. It doesn't really make a good fit: you will need to shoehorn it into working, and it will be much more resource-hungry for short-running workflows (state changes, caching, etc.) than comparable process-flow code/engines. If you do go ahead, get a pilot up and running and test it to death using Mercury Load Runner.
scope_creep

What are the limits of ruby on rails?

I have a memory of talking to people who got so far using Ruby on Rails and then had to abandon it when they hit limits, or found it was ultimately too rigid. I forget the details, but it may have had to do with using more than one database.
So what I'd like to know is what features/requirements fall outside of Ruby on Rails, or at least require such contortions that it is better to use another, more flexible framework, even though you may have to lose some elegance or write extra boilerplate code.
Rails (not Ruby itself) is proud to be "Opinionated Software".
What this means in practice is that the authors of rails have a certain target audience in mind (themselves basically) and aim rails specifically at that. If X feature isn't needed for that target audience, it doesn't get added.
Off the top of my head, things that rails explicitly doesn't support that people may care about:
Foreign keys in databases
Connections to multiple DB's at once
SOAP web services (since rails 2.0)
Connections to multiple database servers at once
That said, it is very easy to extend rails with plugins, and there are plugins which add all of the above functionality to rails, and a lot more, so I wouldn't really count these as limits.
The only other caveat is that rails is built around the idea of creating CRUD web applications using MVC. If you're trying to do something which is NOT a CRUD web app (like twitter, which is actually a messaging system, or if you are insane and want to use a model like ASP.NET webforms) then you will also encounter problems. In this case you're better off not using rails, as you're essentially trying to build a boat out of bicycle parts.
In all likelihood, the problems you will run into that can't just be fixed with a quick plugin or a day or 2 of coding are all inherent problems with the underlying C Ruby runtime (memory leaks, green threads, crap performance, etc).
Ruby on Rails does not support two-phase commits out of the box, which may be required if your database-backed application needs to guarantee immediate consistency AND you need to use two or more database schemas.
For many web applications, I would venture that this is not a common use-case. One can perfectly well support eventual consistency with two or more databases. Or one could support immediate consistency with one database schema. The former case is a great problem to have if your app has to support a mondo amount of transactions (note the technical term :). The latter case is more typical, and Rails does just fine.
Frankly, I wouldn't worry about limits to using Ruby on Rails (or any framework) until you hit real scalability problems. Build a killer app first, and then worry about scalability.
CLARIFICATION: I'm thinking of things that Rails would have a hard-time supporting because it might require a fundamental shift in its architecture. I'll be generous and include some things that are part of the gem/plugin ecosystem such as foreign key enforcement or SOAP services.
By two-phase commits, I mean attempting to make two commits to physically distinct servers within one transactional context.
Use case #1 for a two-phase commit: you've clustered your database, so that you have 2 or more database servers and your schema is spread across both. You may want to commit to both servers, because you want to allow ActiveRecord to do a "foreign key map" that traverses across the different servers.
Use case #2 for a two-phase commit: you're attempting to implement a messaging solution (sorry, I'm a J2EE developer by day). The message producer commits to the messaging broker (one server) and to the database (a different server).
Also found some good discussion about the limits of ActiveRecord.
I think there is a greater “meta-question” here, that could be answered and that is “when is it OK to lean on external libraries to speed up development time?”
Third party libraries are often great and can drastically reduce development time; however, there is a major problem, which Joel Spolsky calls "the law of leaky abstractions." If you look that up on Google, his post will come up first. Essentially, the trade-off for saved development time is that you have no idea what is going on under the covers. So when something breaks you are completely stuck and have very limited methods of debugging. It also means that if you hit one of the features that are simply unsupported in Rails, and that you really need, you'll have no next step except to write the feature yourself, if you're lucky. Many libraries can make this difficult to do.
We’ve been burned badly in my dev shop by this issue. Our solutions worked fine under normal load, but we found that the third party subscription libraries that we were using simply could not stand up to the kind of load that we experienced once our site started to get a large number of concurrent users. This puts us in a very difficult place; essentially we have to rewrite the entire subscription service ourselves, with performance in mind. Doing this means that we’ve wasted all the time that we spent using the library.
Third party libraries can be great for small to medium sized applications; they can drastically reduce development time and hide complexities that aren't necessary to deal with in the early stages of development. However, eventually they will catch up with you and you'll likely have to rewrite or re-engineer your solution to get past the "law of leaky abstractions".
Ruby doesn't have functionality like IsPostBack in ASP.NET.
Orion's answer is right on. There are few hard limits to AR/Rails (deploying to Windows, AR connectors that aren't frequently used, e.g. Firebird), but even for the things he mentioned, multiple databases and DB servers, there are gems and plugins that address those for legacy, sharding, and other reasons.
The real limitation is how time-consuming it is to keep on top of all the things that Rails devs are working on and to research specific issues, given how many blogs there are and how much mailing-list volume there is.
