Ruby aws-sdk - timeout error - ruby-on-rails

I am trying to upload a file to S3 with the following simple code:
bucket.objects.create("sitemaps/event/#{file_name}", open(file))
I get the following:
Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.
What could be going wrong? Any tips will be appreciated.

This timeout generally happens when the content length cannot be correctly determined from the opened file. S3 then waits for additional bytes that never arrive. The fix is pretty simple: just open your file in binary mode.
Ruby 1.9
bucket.objects.create("sitemaps/event/#{file_name}", open(file, 'rb', :encoding => 'BINARY'))
Ruby 1.8
bucket.objects.create("sitemaps/event/#{file_name}", open(file, 'rb'))
The aws-sdk gem will handle this for you if you pass in the path to the file:
# use a Pathname object
bucket.objects.create(key, Pathname.new(path_to_file))
# or just the path as a string
bucket.objects.create(key, :file => path_to_file)
Also, you can write to an object in s3 before it exists, so you could also do:
# my favorite syntax
obj = s3.buckets['bucket-name'].objects['object-key'].write(:file => path_to_file)
Hope this helps.

Try modifying the timeout parameters and see if the problem persists.
From the AWS website: http://aws.amazon.com/releasenotes/5234916478554933 (New Configuration Options)
# the new configuration options are:
AWS.config.http_open_timeout #=> new session timeout (15 seconds)
AWS.config.http_read_timeout #=> read response timeout (60 seconds)
AWS.config.http_idle_timeout #=> persistent connections idle longer than this are closed (60 seconds)
AWS.config.http_wire_trace #=> When true, HTTP wire traces are logged (false)
# you can update the timeouts (in seconds)
AWS.config(:http_open_timeout => 5, :http_read_timeout => 120)
# log to the rails logger
AWS.config(:http_wire_trace => true, :logger => Rails.logger)
# send wire traces to standard out
AWS.config(:http_wire_trace => true, :logger => nil)
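As a minimal sketch for the original upload (values are only examples, and it assumes file holds a path, in which case the :file form from the first answer lets the SDK determine the content length itself):

# bump the idle/read timeouts, then retry the upload with a file path
AWS.config(:http_idle_timeout => 120, :http_read_timeout => 120)
bucket.objects.create("sitemaps/event/#{file_name}", :file => file)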

Related

Rails how to parse text/event-stream?

I have an API URL that is a stream of data with the content type text/event-stream.
How is it possible to listen to the stream? Like subscribe to each event to print the data? I have tried to use the Ruby library em-eventsource.
My test.rb file:
require "em-eventsource"
EM.run do
  source = EventMachine::EventSource.new("my_api_url_goes_here")
  source.message do |message|
    puts "new message #{message}"
  end
  source.start
end
When I visit my API URL I can see the data updated each second. But when I run the Ruby file in the terminal it does not print any data/messages.
Set a timer to check source.ready_state; it seems it does not connect to the API for some reason.
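A minimal sketch of that check, assuming the em-eventsource API from the question, with a periodic timer that prints ready_state every second:

require "em-eventsource"

EM.run do
  source = EventMachine::EventSource.new("my_api_url_goes_here")
  source.message { |message| puts "new message #{message}" }
  source.start

  # 0 = CONNECTING, 1 = OPEN, 2 = CLOSED (per the EventSource spec)
  EM.add_periodic_timer(1) { puts "ready_state: #{source.ready_state}" }
end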
EDIT: it seems your problem is HTTPS SNI, which is not supported by the current eventmachine release, so on connect eventsource ends up talking to the default virtual host on the API machine rather than the one the API is on, hence the stuck CONNECTING state.
Try using eventmachine from the master branch; it already claims SNI support, which is due to be released in 1.2.0:
gem 'eventmachine', github:'eventmachine/eventmachine'
require 'eventmachine'
require 'em-http'
require 'json'

EventMachine.run do
  # create and issue the request inside the running reactor
  http = EM::HttpRequest.new("api_url", :keepalive => true, :connect_timeout => 0, :inactivity_timeout => 0)
  s = http.get({'accept' => 'application/json'})
  s.stream do |data|
    puts data
  end
end
I used the EventMachine HTTP library.

Rails AWS-SDK: Set Expiration Date for Objects

In my Rails app, I allow users to upload images directly to S3, which creates a temporary file that gets automatically deleted after the image record is saved in the database.
Instead of automatically deleting the image after the record is saved, I'd like to set an expiration date for the file on S3 so that it automatically gets deleted after a period (say 24 hours).
I've seen documentation on how to set the expiration date on a bucket (http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/S3/BucketLifecycleConfiguration.html), but I only want a certain folder within the bucket to have files that automatically get removed.
Does anyone have suggestions for how to do it?
s3 = AWS::S3.new(:access_key_id => ENV['AWS_ACCESS_KEY_ID'], :secret_access_key => ENV['AWS_ACCESS_KEY'])
foldername = @image.s3_filepath.split("/")[5]
folder_path = 'uploads/' + foldername
s3.buckets[ENV['AWS_BUCKET']].objects.with_prefix(folder_path).each do |object|
  # set expiration date header here?
end
You set the lifecycle configuration on the bucket itself, not on each individual object. Using the REST API you'd just write out an XML configuration (there's a prefix field that lets you apply the lifecycle config only to keys with that prefix) and PUT it into the bucket.
Converting that over to the ruby SDK, it looks like the example is doing what you want; that first parameter of add_rule appears to be the prefix.
Although you set the lifecycle on the bucket, you don't set it on the bucket object directly. You need to use the AWS::S3::BucketLifecycleConfiguration class. There is more about how to manage the lifecycle here:
http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/S3/BucketLifecycleConfiguration.html
You can now specify a folder or prefix within a bucket to narrow the lifecycle rule.
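A minimal sketch of that, assuming the aws-sdk v1 lifecycle API linked above, where the first argument to add_rule is the key prefix and the second is the number of days before matching objects are expired (bucket and prefix names are placeholders):

require 'aws-sdk'

s3 = AWS::S3.new
bucket = s3.buckets['bucket-name']

# expire (delete) objects under 'uploads/' one day after creation
bucket.lifecycle_configuration.update do
  add_rule 'uploads/', 1
end

Note that lifecycle expiration is expressed in whole days, so a 24-hour window is the smallest granularity S3 offers here.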
I was struggling with the same issue you have, and the AWS documentation for Rails apps doesn't say much about how to send this parameter through the write method. Here are some links describing how to upload a file to an S3 bucket: AWS SDK for Ruby and Upload an Object Using the AWS SDK for Ruby.
Using AWS SDK for Ruby - Version 1
#!/usr/bin/env ruby
require 'rubygems'
require 'aws-sdk'
bucket_name = '*** Provide bucket name ***'
file_name = '*** Provide file name ****'
# Get an instance of the S3 interface.
s3 = AWS::S3.new
# Upload a file.
key = File.basename(file_name)
s3.buckets[bucket_name].objects[key].write(:file => file_name)
puts "Uploading file #{file_name} to bucket #{bucket_name}."
Amazon S3: Cache-Control and Expiry Date difference and setting through REST API was also a good source for me, but it still doesn't mention how to set the expiration date, nor do any of the answers or links there.
So I went through the documentation of the code itself (I'm using the aws-sdk-v1 gem) and found at Method: AWS::S3::S3Object#write all the possible options for configuring the upload of the S3 object; there were two that seemed to serve the same purpose:
:metadata (Hash) - A hash of metadata to be included with the object. These will be sent to S3 as headers prefixed with x-amz-meta. Each name, value pair must conform to US-ASCII.
:expires (String) - The date and time at which the object is no longer cacheable.
So I started looking into which of them I should configure, and I found what I was looking for here: Set content expires and cache-control metadata for AWS S3 objects with Ruby:
require 'rubygems'
require 'aws-sdk'
s3 = AWS::S3.new(
  :access_key_id => 'your_access_key',
  :secret_access_key => 'your_secret_access_key')
bucket = s3.buckets['bucket_name']

one_year_in_seconds = 365 * 24 * 60 * 60
one_year_from_now = Time.now + one_year_in_seconds

# for a new object / to update an existing object
o = bucket.objects['object']
o.write(:file => 'path_to_file',
        :cache_control => "max-age=#{one_year_in_seconds}",
        :expires => one_year_from_now.httpdate)

# to update an existing object
o.copy_to(o.key,
          :cache_control => "max-age=#{one_year_in_seconds}",
          :expires => one_year_from_now.httpdate)
And that is pretty much what I've done to configure the expiration date and cache control; here is the code adapted to the app I'm working on:
one_year_in_seconds = 365 * 24 * 60 * 60

files.each do |path, s3path|
  # puts "Uploading #{path} to s3: #{File.join(prefix, s3path)}"
  s3path = File.join(prefix, s3path) unless prefix.nil?
  one_year_from_now = Time.now + one_year_in_seconds
  bucket.objects[File.join(s3path)].write(
    :file => path,
    :acl => (public == true ? :public_read : :private),
    :cache_control => "max-age=#{one_year_in_seconds}",
    :expires => one_year_from_now.httpdate
  )
end
Also, here http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html you will find support for my decision to configure expires and cache_control instead of metadata:
The Cache-Control max-age directive lets you specify how long (in seconds) that you want an object to remain in the cache before CloudFront gets the object again from the origin server. The minimum expiration time CloudFront supports is 0 seconds for web distributions and 3600 seconds for RTMP distributions. The maximum value is 100 years. Specify the value in the following format:
Cache-Control: max-age=seconds
For example, the following directive tells CloudFront to keep the associated object in the cache for 3600 seconds (one hour):
Cache-Control: max-age=3600
If you want objects to stay in CloudFront edge caches for a different duration than they stay in browser caches, you can use the Cache-Control max-age and Cache-Control s-maxage directives together. For more information, see Specifying the Amount of Time that CloudFront Caches Objects for Web Distributions.
The Expires header field lets you specify an expiration date and time using the format specified in RFC 2616, Hypertext Transfer Protocol -- HTTP/1.1 Section 3.3.1, Full Date, for example:
Sat, 27 Jun 2015 23:59:59 GMT
Regarding your question: yes, you can specify an expiration date and a cache-control value for each object in your bucket.

Can I use a Request / Reply - RPC pattern in Rails 3 with AMQP?

For reasons similar to the ones in this discussion, I'm experimenting with messaging in lieu of REST for a synchronous RPC call from one Rails 3 application to another. Both apps are running on thin.
The "server" application has a config/initializers/amqp.rb file based on the Request / Reply pattern in the rubyamqp.info documentation:
require "amqp"
EventMachine.next_tick do
connection = AMQP.connect ENV['CLOUDAMQP_URL'] || 'amqp://guest:guest#localhost'
channel = AMQP::Channel.new(connection)
requests_queue = channel.queue("amqpgem.examples.services.time", :exclusive => true, :auto_delete => true)
requests_queue.subscribe(:ack => true) do |metadata, payload|
puts "[requests] Got a request #{metadata.message_id}. Sending a reply..."
channel.default_exchange.publish(Time.now.to_s,
:routing_key => metadata.reply_to,
:correlation_id => metadata.message_id,
:mandatory => true)
metadata.ack
end
Signal.trap("INT") { connection.close { EventMachine.stop } }
end
In the 'client' application, I'd like to render the results of a synchronous call to the 'server' in a view. I realize this is a bit outside the comfort zone of an inherently asynchronous library like the amqp gem, but I'm wondering if there's a way to make it work. Here is my client config/initializers/amqp.rb:
require 'amqp'

EventMachine.next_tick do
  AMQP.connection = AMQP.connect 'amqp://guest:guest@localhost'
  Signal.trap("INT") { AMQP.connection.close { EventMachine.stop } }
end
Here is the controller:
require "amqp"
class WelcomeController < ApplicationController
def index
puts "[request] Sending a request..."
WelcomeController.channel.default_exchange.publish("get.time",
:routing_key => "amqpgem.examples.services.time",
:message_id => Kernel.rand(10101010).to_s,
:reply_to => WelcomeController.replies_queue.name)
WelcomeController.replies_queue.subscribe do |metadata, payload|
puts "[response] Response for #{metadata.correlation_id}: #{payload.inspect}"
#message = payload.inspect
end
end
def self.channel
#channel ||= AMQP::Channel.new(AMQP.connection)
end
def self.replies_queue
#replies_queue ||= channel.queue("reply", :exclusive => true, :auto_delete => true)
end
end
When I start both applications on different ports and visit the welcome#index view, @message is nil in the view, since the result has not yet returned. The result arrives a few milliseconds after the view is rendered and is displayed on the console:
$ thin start
>> Using rack adapter
>> Thin web server (v1.5.0 codename Knife)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:3000, CTRL+C to stop
[request] Sending a request...
[response] Response for 3877031: "2012-11-27 22:04:28 -0600"
No surprise here: subscribe is clearly not meant for synchronous calls. What is surprising is that I can't find a synchronous alternative in the AMQP gem source code or in any documentation online. Is there an alternative to subscribe that will give me the RPC behavior I want? Given that there are other parts of the system in which I'd want to use legitimately asynchronous calls, the bunny gem didn't seem like the right tool for the job. Should I give it another look?
edit in response to Sam Stokes
Thanks to Sam for the pointer to throw :async / async.callback. I hadn't seen this technique before and this is exactly the kind of thing I was trying to learn with this experiment in the first place. send_response.finish is gone in Rails 3, but I was able to get his example to work for at least one request with a minor change:
render :text => @message
rendered_response = response.prepare!
Subsequent requests fail with !! Unexpected error while processing request: deadlock; recursive locking. This may have been what Sam was getting at with the comment about getting ActionController to allow concurrent requests, but the cited gist only works for Rails 2. Adding config.allow_concurrency = true in development.rb gets rid of this error in Rails 3, but leads to This queue already has default consumer. from AMQP.
I think this yak is sufficiently shaven. ;-)
While interesting, this is clearly overkill for simple RPC. Something like this Sinatra streaming example seems a more appropriate use case for client interaction with replies. Tenderlove also has a blog post about an upcoming way to stream events in Rails 4 that could work with AMQP.
As Sam points out in his discussion of the HTTP alternative, REST / HTTP makes perfect sense for the RPC portion of my system that involves two Rails apps. There are other parts of the system involving more classic asynchronous event publishing to Clojure apps. For these, the Rails app need only publish events in fire-and-forget fashion, so AMQP will work fine there using my original code without the reply queue.
You can get the behaviour you want - have the client make a simple HTTP request, to which your web app responds asynchronously - but you need more tricks. You need to use Thin's support for asynchronous responses:
require "amqp"
class WelcomeController < ApplicationController
def index
puts "[request] Sending a request..."
WelcomeController.channel.default_exchange.publish("get.time",
:routing_key => "amqpgem.examples.services.time",
:message_id => Kernel.rand(10101010).to_s,
:reply_to => WelcomeController.replies_queue.name)
WelcomeController.replies_queue.subscribe do |metadata, payload|
puts "[response] Response for #{metadata.correlation_id}: #{payload.inspect}"
#message = payload.inspect
# Trigger Rails response rendering now we have the message.
# Tested in Rails 2.3; may or may not work in Rails 3.x.
rendered_response = send_response.finish
# Pass the response to Thin and make it complete the request.
# env['async.callback'] expects a Rack-style response triple:
# [status, headers, body]
request.env['async.callback'].call(rendered_response)
end
# This unwinds the call stack, skipping the normal Rails response
# rendering, all the way back up to Thin, which catches it and
# interprets as "I'll give you the response later by calling
# env['async.callback']".
throw :async
end
def self.channel
#channel ||= AMQP::Channel.new(AMQP.connection)
end
def self.replies_queue
#replies_queue ||= channel.queue("reply", :exclusive => true, :auto_delete => true)
end
end
As far as the client is concerned, the result is indistinguishable from your web app blocking on a synchronous call before returning the response; but now your web app can process many such requests concurrently.
CAUTION!
Async Rails is an advanced technique; you need to know what you're doing. Some parts of Rails do not take kindly to having their call stack abruptly dismantled. The throw will bypass any Rack middlewares that don't know to catch and rethrow it (here is a rather old partial solution). ActiveSupport's development-mode class reloading will reload your app's classes after the throw, without waiting for the response, which can cause very confusing breakage if your callback refers to a class that has since been reloaded. You'll also need to ask ActionController nicely to allow concurrent requests.
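For Rails 3 that is the allow_concurrency flag already mentioned in the question's edit; a one-line sketch:

# config/environments/development.rb (Rails 3.x)
config.allow_concurrency = true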
Request/response
You're also going to need to match up requests and responses. As it stands, if Request 1 arrives, and then Request 2 arrives before Request 1 gets a response, then it's undefined which request would receive Response 1 (messages on a queue are distributed round-robin between the consumers subscribed to the queue).
You could do this by inspecting the correlation_id (which you'll have to explicitly set, by the way - RabbitMQ won't do it for you!) and re-enqueuing the message if it's not the response you were waiting for. My approach was to create a persistent Publisher object which would keep track of open requests, listen for all responses, and look up the appropriate callback to invoke based on the correlation_id.
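A minimal sketch of that Publisher idea (the class and queue names here are hypothetical), keeping a hash of callbacks keyed by correlation_id and dispatching each reply to the matching callback:

require "amqp"
require "securerandom"

class RpcPublisher
  def initialize(channel)
    @channel   = channel
    @callbacks = {}
    # single replies queue shared by all outstanding requests
    @replies   = channel.queue("rpc.replies", :exclusive => true, :auto_delete => true)
    @replies.subscribe do |metadata, payload|
      callback = @callbacks.delete(metadata.correlation_id)
      callback.call(payload) if callback   # ignore replies we are not waiting for
    end
  end

  def request(routing_key, payload, &callback)
    correlation_id = SecureRandom.uuid
    @callbacks[correlation_id] = callback
    @channel.default_exchange.publish(payload,
      :routing_key    => routing_key,
      :correlation_id => correlation_id,
      :reply_to       => @replies.name)
    correlation_id
  end
end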
Alternative: just use HTTP
You're really solving two different (and tricky!) problems here: persuading Rails/thin to process requests asynchronously, and implementing request-response semantics on top of AMQP's publish-subscribe model. Given you said this is for calling between two Rails apps, why not just use HTTP, which already has the request-response semantics you need? That way you only have to solve the first problem. You can still get concurrent request processing if you use a non-blocking HTTP client library, such as em-http-request.
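As a rough sketch of that alternative (the URL is hypothetical), an em-http-request call is non-blocking, so Thin's reactor keeps serving other requests while the response is pending:

require "em-http-request"

EventMachine.run do
  http = EventMachine::HttpRequest.new("http://localhost:3001/time").get
  http.callback do
    puts http.response
    EventMachine.stop
  end
  http.errback do
    puts "request failed: #{http.error}"
    EventMachine.stop
  end
end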

Mongodb server goes down, how to prevent Rails app from timing out?

I'm using central_logger to store logs from our Rails app in mongodb. When the mongo server went down recently our app started timing out on mongo inserts. How can I prevent Rails from timing out if the mongo server goes down?
The Ruby driver supports timeouts, like so:
@conn = Connection.new("localhost", 27017, :pool_size => 5, :timeout => 5)
But the central_logger gem isn't using that, so you can either fork it to add that in, or monkey-patch the CentralLogger::MongoLogger.connect method.
It currently has
def connect
  @mongo_connection ||= Mongo::Connection.new(@db_configuration['host'],
                                              @db_configuration['port'],
                                              :auto_reconnect => true).db(@db_configuration['database'])
  if @db_configuration['username'] && @db_configuration['password']
    # the driver stores credentials in case reconnection is required
    @authenticated = @mongo_connection.authenticate(@db_configuration['username'],
                                                    @db_configuration['password'])
  end
end
You would need to monkey-patch in :timeout => 5 (or whatever) to the Mongo::Connection.new call.
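A hedged sketch of that monkey-patch, e.g. in a Rails initializer, assuming CentralLogger::MongoLogger is a plain class as the snippet above suggests; it reopens connect and only adds the :timeout option:

# config/initializers/central_logger_timeout.rb (sketch)
module CentralLogger
  class MongoLogger
    def connect
      @mongo_connection ||= Mongo::Connection.new(@db_configuration['host'],
                                                  @db_configuration['port'],
                                                  :auto_reconnect => true,
                                                  :timeout => 5).db(@db_configuration['database'])
      if @db_configuration['username'] && @db_configuration['password']
        # the driver stores credentials in case reconnection is required
        @authenticated = @mongo_connection.authenticate(@db_configuration['username'],
                                                        @db_configuration['password'])
      end
    end
  end
end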
I would bet the author of central-logger would like to have this in there, so a fork and pull request would likely be welcome.
You could use replica sets - so if the master goes down, it can fail over automatically to one of the replicas.
Usually the database insert should be fast, so you could work with the Ruby timeout library:
require 'timeout'

Timeout::timeout(0.2) do
  # ... write to the log server
end
This code will time out and continue after 200 milliseconds in any case.

Ruby-Rails serve ftp file direct to client

I am new to Ruby and to Rails, so excuse my question.
What I want to know is how to take a file from an FTP server with Ruby without saving the file on my Rails application's hard drive (streaming the file data directly to the client). I am working with the Ruby Net::FTP class.
With the method retrbinary from the Net::FTP class I have the following snippet:
ftp.retrbinary('RETR ' + filename, 4096) { |data|
  buf << data
}
In my Rails view I can do something like this:
send_data( buf )
So how do I combine these two? I don't know how to instantiate a buffer object, fill it with the stream, and then serve it to the user. Does anybody have an idea how to do this?
Thank you very much for your support! Your post got me going. After some cups of coffee I found a working solution. Actually I am doing the following, which works for me:
def download_file
  filename = params[:file]
  raw = StringIO.new('')
  @ftp.retrbinary('RETR ' + filename, 4096) { |data|
    raw << data
  }
  @ftp.close
  raw.rewind
  send_data raw.read, :filename => filename
end
I will test this in production (a real-life situation). If it does not work well enough, I will have to use an NFS mount.
fin
Do you want the following?
1) Client (browser) sends a request to the Rails server
2) Server should respond with the contents of a file that is located on an ftp server.
Is that it?
If so, then simply redirect the browser to the ftp location, e.g.
# in controller
ftp_url = "ftp://someserver.com/dir_name/file_name.txt"
redirect_to ftp_url
The above works if the ftp file has anonymous get access.
If you really need to access the file from the server and stream it, try the following:
# in controller (requires 'net/ftp')
render :text => proc { |response, output|
  ftp_session = Net::FTP.open(host, user, passwd, acct)
  ftp_session.gettextfile(remotefile) { |data| output.write(data) }
  ftp_session.close
}
You should check the headers in the response to see if they're what you want.
PS: Setting up the FTP connection and streaming from a second server will probably be relatively slow. I'd use JS to show a busy graphic to the user.
I'd try alternatives to ftp. Can you set up an NFS connection or mount the remote disk? Would be much faster than ftp. Also investigate large TCP window sizes.
