SSH tunneling to a remote DB from Heroku? - ruby-on-rails

I'm thinking of deploying a small Rails app on Heroku. In an effort to save money, I'd like my app to use an external database (to which I have free access), rather than a Heroku-hosted database. The trouble is that the free database only accepts local connections. To access it from Heroku, I'd need to do so via an SSH tunnel.
Is it possible for a Heroku app to persist its data in an external DB accessed via SSH? If so, how?
(For bonus points, here's a second question: is this a good idea? On the one hand, this scheme would save me from paying for a Heroku database. On the other hand, it means having to encrypt all my database traffic. I imagine that this would massively slow down my web dynos, and reduce the number of requests they can serve. Would the money I save on the database get used up paying for more dynos? Am I likely to come out ahead by doing this?)

Yes, you can.
It is possible to set up a tunnel on Heroku to an external database.
You don't want to do it for the reason the OP mentions (to avoid paying for a Heroku-hosted database), for the reasons sgrif mentions: it would be painfully slow and probably wouldn't really save anything.
But there are legitimate reasons for wanting to tunnel to an external database, for example if data is residing in a legacy system that you need to analyze.
Rather than simply repeat myself (it's long), here's a link to the recipe that worked for me: SSH tunneling from Heroku
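In case that link goes stale, here is a rough sketch of one way such a tunnel can be set up; this is not the linked recipe verbatim, and the hostname, user, key path, and ports are placeholders. It assumes an ssh client is available on the dyno and that an unlocked private key has been written to disk (e.g. from a config var):

    # config/initializers/db_tunnel.rb (hypothetical file name)
    require "socket"
    require "timeout"

    TUNNEL_PORT = 5433 # local end of the tunnel; Postgres listens on 5432 at the far end

    pid = spawn(
      "ssh", "-N",
      "-i", "config/tunnel_key",              # private key shipped to the dyno (placeholder path)
      "-o", "StrictHostKeyChecking=no",
      "-L", "#{TUNNEL_PORT}:127.0.0.1:5432",  # forward local 5433 to the remote database
      "deploy@db.example.com"                 # placeholder SSH endpoint
    )
    Process.detach(pid)

    # Block until the forwarded port accepts connections, so Rails doesn't
    # try to connect before the tunnel is ready.
    Timeout.timeout(10) do
      begin
        TCPSocket.new("127.0.0.1", TUNNEL_PORT).close
      rescue Errno::ECONNREFUSED
        sleep 0.2
        retry
      end
    end

Your database configuration then points at 127.0.0.1:5433 instead of the remote host.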

No, and even if it were an option it would be a really bad idea, as you'd be adding massive latency to every request: for all intents and purposes you'd have to open a new tunnel for every request.
Your best option is likely to use Heroku's development or starter tiers. The free development tier will work if your database has fewer than 10,000 rows. Their $15/mo starter tier works for up to 1,000,000 rows.

Related

share session among different type of web servers

Some web services in my company are built with different web apps (Rails, Django, PHP).
What's the best practice for sharing session state,
so that users won't have to log in again and again as they move between servers?
I run my Rails apps in an AWS auto scaling group.
That is, even when I browse the same website, my next request may land on another server, so I have to log in again, because that server doesn't have my session state.
I need some ideas, or keywords to search for, about this kind of issue.
Thanks in advance.
I can think of two ways in which you can achieve this objective:
1) Implement a custom session handling mechanism that makes use of database session management, i.e. all sessions will be stored in a special table in the database and will be accessible to all the servers (a Rails sketch follows below).
2) Have a Central Authentication Service (CAS) which will act as a proxy in front of all the other servers. This means the authentication step has to happen before requests reach the load balancer.
If you look around, option 1 might be recommended by many, but it may also be overkill, since you'll need custom session management in each of the servers. However, your choice will probably depend on the specific objectives you want to achieve, the overall flexibility of your system architecture, and the amount of time you have on your hands. The CAS might be a more straightforward way of solving the problem.
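For the Rails side of option 1), the activerecord-session_store gem is the usual building block. A minimal sketch; the cookie name and domain setting are assumptions, so adjust them to your setup and double-check the gem's README:

    # Gemfile
    gem "activerecord-session_store"

    # config/initializers/session_store.rb
    Rails.application.config.session_store :active_record_store,
      key: "_shared_session",   # use the same cookie name in every app (assumption)
      domain: :all              # share the cookie across subdomains, if that applies to you

    # Then create the sessions table:
    #   rails generate active_record:session_migration
    #   rails db:migrate

The Django and PHP apps would then need to read and write that same sessions table (and agree on the cookie and serialization format), which is the "custom session management in each of the servers" part mentioned above.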
Storing user sessions in your application's database wouldn't be the recommended option on AWS.
The biggest problem with using a database is that you need to write a clean-up script that runs every so often to clear the table of expired user sessions. This is messy, creates more overhead, and puts more pressure on your DB.
However, if you do want to use an actual database for this, you should use a NoSQL database like Dynamo. This will give you much better performance than a relational database. It's probably more cost effective too in terms of data transfer. However, the biggest problem with this is that you still need that annoying clean-up script. Note: there is built-in support in the SDK for using PHP with Dynamo to store the user's session:
http://docs.aws.amazon.com/aws-sdk-php/v2/guide/feature-dynamodb-session-handler.html
The best but most costly solution is to use an ElastiCache cluster. You can set a TTL of your choice, which means you won't have to worry about clean-up scripts. ElastiCache will also get you much better performance than Dynamo or any relational DB, as the data is stored in the RAM of the ElastiCache nodes. The main drawback of ElastiCache is that it currently can't scale dynamically, so if too many users logged in at once and you didn't have a big enough cluster already provisioned, things could get ugly.
http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/WhatIs.html
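If the Rails app is one of the services behind ElastiCache, a Redis-backed session store is one way to get that TTL behaviour. A sketch assuming a Redis-flavoured ElastiCache endpoint and the redis-rails gem; the config var name and URL are placeholders:

    # Gemfile
    gem "redis-rails"

    # config/initializers/session_store.rb
    Rails.application.config.session_store :redis_store,
      servers: [ENV.fetch("ELASTICACHE_REDIS_URL")], # e.g. redis://my-cluster.xxxxxx.use1.cache.amazonaws.com:6379/0/session
      expire_after: 90.minutes                       # the TTL, so no clean-up script is needed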
But you can bet that all the biggest, baddest and best applications being hosted on AWS are either using Dynamo, ElastiCache or a custom NoSQL or Cache cluster solution for storing user sessions.
Note that both of these services are supported in all of the AWS SDKs:
https://aws.amazon.com/tools/

Multiple Projects, Multiple languages, Same authentication

So I have multiple projects that I want to use a common core set of tables in a Postgres database to map an authentication scheme between them with. Things like a 'user' 'account' 'group' or other related user information is stored in these tables. The projects I have currently are a nodejs (multiple devices) and a Ruby web app (planning on multiple devices later on) and we could have a Django or another node project in the future as well. Is there an efficient, cost effective way to do this that would be scalable and reliable? I was thinking about using an s3 instance with a Postgres database hosted on it and pointing all my authentication from my multiple apps to that database but I wanted to see if others had thought about this problem as well.
First, please don't attempt to host Postgres in S3. I believe you may have meant EC2 with an EBS volume (whose snapshots are stored in S3). From an ease-of-use standpoint (particularly when considering ongoing maintenance), hosting any Postgres instance on Amazon's RDS product is truly a pleasure. Without going into all the details of that product, I'll simply state that you can set up high availability (failover), backups, upgrades, and monitoring with just a few clicks.
That being said, RDS is not the cheapest solution, but the cost is not exorbitant either, depending on your load and number of simultaneous connections.
If all this database is going to do is authenticate people and then disconnect, it may well be overkill and a waste of resources. However, if you're housing a fairly complex set of permissions and other user data, it'll likely be a fairly straightforward solution.
Depending on your budget and requirements, you may benefit from running pgPool on your app server somewhere to pool the connections, but I wouldn't start out using pgPool unless you need it.
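To make the shared-Postgres idea concrete on the Rails side, the auth models can live on their own connection (pointed at the shared RDS database) while the rest of the app keeps its normal database. A rough sketch; AUTH_DATABASE_URL and the table names are my own placeholders, and the Node/Django apps would connect to the same database with their own drivers:

    # app/models/auth_record.rb
    class AuthRecord < ActiveRecord::Base
      self.abstract_class = true
      # All auth models share this connection to the common database.
      establish_connection ENV["AUTH_DATABASE_URL"]
    end

    # app/models/user.rb
    class User < AuthRecord
      self.table_name = "users"
    end

    # app/models/account.rb
    class Account < AuthRecord
      self.table_name = "accounts"
    end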

How can one rails app on heroku access many databases

I want to be able to have one app access multiple databases on the HEROKU "system".
Can the connection to the database be changed dynamically?
Why I ask...
I have an app that has a lot of very processor-heavy background jobs. If a given user uploads a product feed of, say, 50,000 products that have to be compared to existing products, updating only the deltas, it can take a "few" minutes.
Now to mitigate the delay I spin up multiple workers, each taking small bites out of the lot until there's none. I can get to about 20 workers before the GUI starts to feel sluggish because the DB is being hammered.
I've tuned some of the code and indexed the DB to some extent, and I'm sure there's more I could do, but it will eventually suffer the law of diminishing returns.
For one user, I don't much care... if you upload 50k products you need to wait a bit.
But user one's choice to upload impacts user two (different company, so no crossover of data).
Currently I handle different users by separating their data with schemas in postgresql.
The different users, however, share the same DB connection, and even on the best plan I can see a time when 20 users try to upload 50,000 products at the same time (first of the month/quarter, for example).
User 21 would see a huge slowdown on their system because of this.
So the question: Can I assign different users to different databases? User logs in, validates their info against a central DB, and then a different DB takes over?
My current solution is different instances of Heroku. It's easy to maintain the code because it's one base and I just script the git push(es). The only issue is the different login URLs, which I suppose I could deal with if I can't find an easy DB-switching solution.
It sounds like you're able to shard your data by user, or set of users without much concern since you already separate them by schema. If that's the case, and you're using Ruby and ActiveRecord, look at https://github.com/tchandy/octopus. I imagine you're not looking to spin up databases on the fly, rather, you'll have them already built and ready to be used, and can add more as you go.
Granted, it sounds like what you're doing could be done a lot more effectively by using the right tool for that type of intensive processing like one of the Heroku Hadoop add-ons; nonetheless, if that's not an option for whatever reason, check out the gem above. There are a couple other gems like it, and of course you could technically manage your own ActiveRecord connections without this gem, but I think you'll find that will be painful really fast.
Of course, if you aren't using Ruby or ActiveRecord, still shard the data, and look for something like the gem above in your app's language :).
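To give a flavour of the octopus approach once your shards are defined in config/shards.yml: heavy per-customer work can be pinned to that customer's database so the primary database (and the GUI) is left alone. The shard name and job class below are placeholders, and the gem's API may differ between versions:

    # Gemfile
    gem "ar-octopus"

    # In a background worker: run one customer's import against their shard.
    Octopus.using(:acme_corp) do
      ProductFeedImport.new(feed_path).run   # stand-in for your existing import job
    end

    # One-off queries can also be pointed at a shard per call:
    Product.using(:acme_corp).where(sku: "ABC-123").first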
The Postgres databases on Heroku are configured with environment variables. When you run heroku config you should see:
DATABASE_URL: postgres://xxx.compute.amazonaws.com:5432/xxx
You can use these variables to connect to databases on other Heroku instances, or to share a single database between different Heroku apps.
If you try to run this kind of stuff on free Heroku instances, I think it is against their terms of service.
If it's about scalability, I think you will just have to pay for a more expensive database instance...

Moving Heroku Shared Database

I recently reached the 5 MB database limit with Heroku; the costs rise dramatically after this point, so I'm looking to move the database elsewhere.
I am very new to using VPS and setting up servers from scratch, however, I have done this recently for another app.
I have a couple questions related to this:
Is it possible to create a database on a VPS and point my rails app on heroku to use that database?
If so, what would database.yml actually look like? For example, what would replace localhost when the database lives outside the app?
These may be elementary questions but my knowledge of servers and programming is very much self taught, so I admit, there may be huge loopholes in things that I "should" already understand.
Note: Other (simpler) suggestions for moving my database are welcomed. Thanks.
OK, for starters: yes, you can host a database external to Heroku and point your database.yml at that server. It's simply a case of setting the hostname to point at the right address and giving it the correct credentials.
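As for what the configuration looks like: a database.yml entry for an external server just uses the usual adapter/host/port/database/username/password keys, with the VPS's address in place of localhost. On Heroku it's often easier to keep the credentials out of the repo and in a config var; a rough sketch, where EXTERNAL_DATABASE_URL is a made-up name:

    # Set once from your machine:
    #   heroku config:set EXTERNAL_DATABASE_URL=postgres://myuser:mypassword@my-vps.example.com:5432/myapp_production

    # config/initializers/external_database.rb
    ActiveRecord::Base.establish_connection(ENV.fetch("EXTERNAL_DATABASE_URL"))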
However, you need to consider a couple of things:
1) Latency: unless you're hosting inside EC2 US East, the latency between Heroku and your DB will cause you all sorts of performance issues.
2) Setting up a database server is not a simple task. You need to consider how secure it is, how it performs, keeping it up to date, keeping it backed up, and having to worry day and night about whether it is up. With Heroku you don't need to do this, as it's fully managed.
Price-wise, are you aware of the new low-cost Postgres plans at Heroku? $15/mo will get you 20 GB (shared instance), and $50/mo will get you a terabyte (dedicated instance). To me, that is absurdly cheap, as I value my time much more, and I know how many hours I would need to invest in running my own server to save maybe $10 a month.
It would be cheaper to use Amazon RDS, which is officially supported by Heroku and served from the same datacenter (Amazon US-East). If you do want to use a VPS, use an Amazon EC2 instance in US-East for maximum performance. This tutorial shows exactly how to do it with Django in detail. Even if you don't decide to use EC2, refer to that tutorial to see how to properly add external database information to your Heroku application so that Heroku doesn't try to overwrite it.
Still, Heroku's shared database is extremely cost-competitive, far more so than most VPSes, and with much less setup and maintenance.

Rescue rails app from server failure

I have a Rails app which is currently hosted on a dedicated server. Today something happened: the app doesn't respond, I have no SSH access, restarting doesn't help, and I am waiting for tech support to respond. But that is not the question; I just need this app to stay online even if the server fails. What is the easiest option? Can I set up a second server on different hosting and serve from there in case of failure? If so, how do I sync the DB and files? The application is not heavily loaded; I just need it to be available.
Difficult problem to solve. There's no one proven way to make this happen, but in general you need "no single point of failure".
There's an entire science devoted to reliability in web applications; there's no way you can get that fully answered in an SO question.
You can take frequent backups of your database and store them on S3 (and/or somewhere else). You can then:
have an image of your application server at your host,
spin it up when your server dies,
restore the database,
and have the new application server take over responsibility (easiest way: assume the old server's IP address).
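A rough sketch of the "frequent backups to S3" piece, assuming Postgres, the aws-sdk-s3 gem, and a bucket you have already created (bucket name, region, and paths are placeholders); run it from cron or a scheduled job:

    require "aws-sdk-s3"

    # Dump the database to a local file...
    dump_path = "/tmp/myapp-#{Time.now.utc.strftime('%Y%m%d%H%M')}.dump"
    system("pg_dump", "-Fc", "-f", dump_path, ENV.fetch("DATABASE_URL")) or abort("pg_dump failed")

    # ...and upload it to S3.
    s3 = Aws::S3::Resource.new(region: "us-east-1")
    s3.bucket("myapp-db-backups").object(File.basename(dump_path)).upload_file(dump_path)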
