I'm going to be starting a project soon that requires support for large-ish binary files. I'd like to use Ruby on Rails for the webapp, but I'm concerned with the BLOB support. In my experience with other languages, frameworks, and databases, BLOBs are often overlooked and thus have poor, difficult, and/or buggy functionality.
Does RoR support BLOBs adequately? Are there any gotchas that creep up once you're already committed to Rails?
BTW: I want to be using PostgreSQL and/or MySQL as the backend database. Obviously, BLOB support in the underlying database is important. For the moment, I want to avoid focusing on the DB's BLOB capabilities; I'm more interested in how Rails itself reacts. Ideally, Rails should be hiding the details of the database from me, and so I should be able to switch from one to the other. If this is not the case (ie: there's some problem with using Rails with a particular DB) then please do mention it.
UPDATE: Also, I'm not just talking about ActiveRecord here. I'll need to handle binary files on the HTTP side (file upload effectively). That means getting access to the appropriate HTTP headers and streams via Rails. I've updated the question title and description to reflect this.
As for streaming, you can do it all in an (at least memory-) efficient way. On the upload side, file parameters in forms are abstracted as IO objects that you can read from; on the download side, look into the form of render :text => that takes a Proc argument:
render :content_type => 'application/octet-stream', :text => Proc.new { |response, output|
  # do something that reads data and writes it to output
}
If your stuff is in files on disk, though, the aforementioned solutions will certainly work better.
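To make the "reads data and writes it to output" part concrete, here is a minimal sketch in plain Ruby, outside of Rails. The helper name copy_in_chunks and the 16 KB chunk size are my own assumptions, and StringIO objects stand in for the real data source and the response output.

```ruby
require 'stringio'

# Copy from any readable IO to any writable IO in fixed-size chunks, so that
# only one chunk (chunk_size bytes at most) is read into memory at a time.
def copy_in_chunks(input, output, chunk_size = 16 * 1024)
  total = 0
  while (chunk = input.read(chunk_size))  # read returns nil at end of data
    output.write(chunk)
    total += chunk.bytesize
  end
  total
end

source = StringIO.new("x" * 40_000)  # stand-in for a file or BLOB reader
sink   = StringIO.new                # stand-in for the response output
copied = copy_in_chunks(source, sink)
```

Inside the Proc, the same loop would write to the output object that Rails hands you instead of a StringIO.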
+1 for attachment_fu
I use attachment_fu in one of my apps and MUST store files in the DB (for annoying reasons which are outside the scope of this convo).
The (one?) tricky thing dealing w/BLOB's I've found is that you need a separate code path to send the data to the user -- you can't simply in-line a path on the filesystem like you would if it was a plain-Jane file.
e.g. if you're storing avatar information, you can't simply do:
<%= image_tag @youruser.avatar.path %>
you have to write some wrapper logic and use send_data, e.g. (below is JUST an example w/attachment_fu, in practice you'd need to DRY this up)
send_data(@youruser.avatar.current_data, :type => @youruser.avatar.content_type, :filename => @youruser.avatar.filename, :disposition => 'inline')
Unfortunately, as far as I know attachment_fu (I don't have the latest version) does not do clever wrapping for you -- you've gotta write it yourself.
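One way to DRY up that send_data call might look like the sketch below. The helper name inline_send_data_args and the FakeAvatar struct are hypothetical; the only thing assumed about the attachment is that it responds to current_data, content_type and filename, as in the call above.

```ruby
# Hypothetical helper: builds the two arguments for send_data from any object
# that exposes current_data, content_type and filename.
def inline_send_data_args(attachment)
  [attachment.current_data,
   { :type        => attachment.content_type,
     :filename    => attachment.filename,
     :disposition => 'inline' }]
end

# Stand-in for an attachment_fu model, used only to exercise the helper.
FakeAvatar = Struct.new(:current_data, :content_type, :filename)
avatar = FakeAvatar.new("GIF89a", "image/gif", "me.gif")

data, options = inline_send_data_args(avatar)
# In a controller action this would become: send_data(data, options)
```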
P.S.
Seeing your question edit - Attachment_fu handles all that annoying stuff that you mention -- about needing to know file paths and all that crap -- EXCEPT the one little issue when storing in the DB. Give it a try; it's the standard for rails apps. IF you insist on re-inventing the wheel, the source code for attachment_fu should document most of the gotchas, too!
You can use the :binary type in your ActiveRecord migration and also constrain the maximum size:
class BlobTest < ActiveRecord::Migration
  def self.up
    create_table :files do |t|
      t.column :file_data, :binary, :limit => 1.megabyte
    end
  end
end
ActiveRecord exposes the BLOB (or CLOB) contents as a Ruby String.
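Since the column contents arrive as a String, it helps to know how Ruby treats binary Strings. This is a small sketch in plain Ruby, assuming Ruby 1.9 or later (where Strings carry an encoding); the four sample bytes are made up for illustration.

```ruby
# Four raw bytes, as they might come back from a :binary column.
blob = "\xFF\xD8\xFF\xE0".dup.force_encoding(Encoding::BINARY)

byte_count  = blob.bytesize        # counts bytes, not characters
first_bytes = blob.bytes.first(2)  # integer values of the first two bytes

# The same bytes tagged as UTF-8 are not valid text, which is why binary
# data should keep the BINARY (ASCII-8BIT) encoding.
as_text    = blob.dup.force_encoding(Encoding::UTF_8)
valid_text = as_text.valid_encoding?
```

When measuring or slicing BLOB data, bytesize and byteslice are the safer choices over length and slice, which work in characters.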
I think your best bet is the attachment_fu plug-in:
http://github.com/technoweenie/attachment_fu/tree/master
UPDATE: Found some more info here http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/a81beffb93708bb3
Look into the x_send_file plugin, too.
"The XSendFile plugin provides a simple interface for sending files via the X-Sendfile HTTP header. This enables your web server to serve the file directly from disk, instead of streaming it through your Rails process. This is faster and saves a lot of memory if you're using Mongrel. Not every web server supports this header. YMMV."
I'm not sure if it's usable with Blobs, it may just be for files on the file system. But you probably need something that doesn't tie up the web server streaming large chunks of data.
I'm using the I18n Gem from Sven Fuchs in my Ruby on Rails 3.2 Application and while the gem works great I came across a situation, which I don't know the solution to:
I have a seed file that contains the basic translations for my MVCs and is seeded when I install my application on a new machine. The problem is that when one of these translations changes, I have to go to my seed file, edit it, delete the record in the database, and reseed it, which is probably not the best way to do this.
Furthermore, my application can create complete MVC's on the fly, which of course need translations as well. These translations get only stored in the database. But it would be nice to store them in a real file, keep them under version control and import or export them if I need to.
So, basically, what I'm looking for is an intelligent connection between the translations in my database and the ones in my files, so I can populate one from the other, or vice versa, and keep them in sync.
And also I looked at solutions like Globalize3 or localeapp, but they don't seem to fit.
Summarized, what I have is:
The I18n Gem from Sven Fuchs with a Backend I created myself
A Seed File which changes sometimes and has to be edited manually but seeds the basic translations
A database which contains translations that are created on the fly and are not under version control, nor stored in some file
What I want:
A sync between translations in my seed file and my database
A way to put my translations under version control
I'm sure I can't be the only one who needs this...
Thanks in advance!
Here is how I solved a problem closer to the question asked:
require 'csv'

task :task_name => [:environment] do
  file = "db/file_name.txt"
  CSV.foreach(file, :headers => true, :col_sep => "^", :quote_char => "~") do |row|
    identifier = row[0].to_i
    model_name = ModelName.find_or_create_by_identifier(identifier)
    I18n.locale = row[1]
    model_name.name = row[3]
    model_name.save!
  end
end
Note that identifier needs to be a unique identifier that doesn't change and exists both in the file and in the database. In this example, the columns are separated by "^" and the quote character is "~".
As @tigrish said in the comments, it is not a good idea to insert in the file and in the database, so it is important to restrict this.
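To see how those separator and quote settings behave, here is a small sketch that runs without Rails. The sample rows are hypothetical; only the :col_sep and :quote_char options and the column positions come from the task above.

```ruby
require 'csv'

# Hypothetical sample data in the same layout: "^" separates columns and
# "~" quotes a field that itself contains the separator.
data = <<~ROWS
  identifier^locale^key^name
  1^en^greeting^Hello
  1^de^greeting^~Hallo^Welt~
ROWS

rows  = CSV.parse(data, :headers => true, :col_sep => "^", :quote_char => "~")
names = rows.map { |row| [row[0].to_i, row[1], row[3]] }
```

The second data row shows why a quote character matters: without the "~" pair, the "^" inside the translation would be read as a column break.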
These links may also help:
http://railscasts.com/episodes/396-importing-csv-and-excel
http://jasonseifer.com/2010/04/06/rake-tutorial
As the question is a little old, I hope it can help somebody else.
I'm using Paperclip / S3 for file uploading. I upload text-like files (not .txt, but they are essentially a .txt). In a show controller, I want to be able to get the contents of the uploaded file, but don't see contents as one of its attributes. What can I do here?
attachment_file_name: "test.md", attachment_content_type: "application/octet-stream", attachment_file_size: 58, attachment_updated_at: "2011-06-22 01:01:40"
PS - Seems like all the Paperclip tutorials are about images, not text files.
In Paperclip 3.0.1 you could just use the io_adapter which doesn't require writing an extra file to (and removing from) the local file system.
Paperclip.io_adapters.for(attachment.file).read
@jon-m's answer needs to be updated to reflect the latest changes to Paperclip; for this to work, it needs to change to something like:
class Document
  has_attached_file :revision
  def revision_contents(path = 'tmp/tmp.any')
    revision.copy_to_local_file :original, path
    File.read(path)
  end
end
A bit convoluted. As @jwadsack mentioned, using the Paperclip.io_adapters.for method accomplishes the same thing and seems like a better, cleaner way to do this, IMHO.
To access the file you can use the path method:
csv_file.path
http://rdoc.info/gems/paperclip/Paperclip/Attachment#path-instance_method
This can be used along with for example the CSV reader.
Here's how I access the raw contents of my attachment:
class Document
  has_attached_file :revision
  def revision_contents
    revision.copy_to_local_file.read
  end
end
Please note, I've omitted my paperclip configuration options and any sort of error handling.
You would need to load the contents of the file (using Ruby's File.open) into a variable before you show it. This may be an expensive operation if your app gets lots of use, so it may be worthwhile to read the contents of the file and put them into a text column in your database after uploading it.
Attachment already inherits from IOStream. http://rdoc.info/github/thoughtbot/paperclip/master/Paperclip/Attachment
So it should just be "#{attachment}", <%= RDiscount.new(attachment).to_html %>, or send_data(attachment), depending on how you want to display the data.
This is a method I used for upload from paperclip to active storage and should provide some guidance on temporarily working with a file in memory. Note: This should only be used for relatively small files.
Written for gem paperclip 6.1.0
Where I have a simple model
class Post
has_attached_file :image
end
Using Tempfile.create with a block, so we do not have to worry about closing the file:
Tempfile.create do |tmp_file|
  post.image.copy_to_local_file(nil, tmp_file.path)
  post.image_temp.attach(
    io: tmp_file,
    filename: post.image_file_name,
    content_type: post.image_content_type
  )
end
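For anyone unsure what the block form of Tempfile.create does, here is a minimal sketch in plain Ruby with no Paperclip or Active Storage involved. The 'upload' basename and the sample text are placeholders of my own.

```ruby
require 'tempfile'

saved_path = nil
contents = Tempfile.create('upload') do |tmp_file|
  saved_path = tmp_file.path
  tmp_file.write("hello from paperclip")
  tmp_file.rewind   # move back to the start before reading
  tmp_file.read     # the block's value is what create returns
end

# The file is closed and removed once the block finishes.
still_there = File.exist?(saved_path)
```

This is also why the attach call has to happen inside the block: once the block ends, the temp file no longer exists.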
I am adding tests to a Rails app that remotely stores files. I'm using the default Rails functional tests. How can I add file uploads to them? I have:
test "create valid person" do
post(:create, :person => { :avatar => fixture_file_upload('avatar.jpeg') })
end
This for some reason uploads a Tempfile and causes the AWS/S3 gem to fail with:
NoMethodError: undefined method `bytesize' for Tempfile
Is there any way that I can get the test to use an ActionDispatch::Http::UploadedFile and perform more like it does when testing with the web browser? Is fixture_file_upload the way to test uploading files to a controller? If so, why doesn't it work like the browser?
As a note, I really don't want to switch testing frameworks. Thanks!
I use the s3 gem instead of the aws/s3 gem. The main reasons are that aws/s3 does not support European buckets and its development seems to have stopped.
If you want to test file upload, then using the fixture_file_upload method is correct; it maps directly to Rack::Test::UploadedFile.new (you can use this if the test file isn't in the fixtures folder).
But I've also noticed that the behavior of Rack::Test::UploadedFile objects isn't exactly the same as that of ActionDispatch::Http::UploadedFile objects (that's the class of uploaded files). The basic methods (original_filename, read, size, ...) all work, but there are some differences when working with the file method. So limit your controller to these methods and all will be fine.
Another possible solution is to create an ActionDispatch::Http::UploadedFile object and use that:
upload = ActionDispatch::Http::UploadedFile.new({
:filename => 'avatar.jpeg',
:type => 'image/jpeg',
:tempfile => File.new("#{Rails.root}/test/fixtures/avatar.jpeg")
})
post :create, :person => { :avatar => upload }
I'd recommend using mocks.
A quick google search reveals:
http://www.ibm.com/developerworks/web/library/wa-mockrails/index.html
You should be able to create an object that will respond to the behaviors you want it to. Mostly used in a Unit test environment, so you can test your stuff in isolation, as integration tests are supposed to fully exercise the entire stack. However, I can see in this case it'd be useful to mock out the S3 service because it costs money.
I'm not familiar with the AWS/S3 gem, but it seems that you probably aren't using the :avatar param properly. bytesize is defined on String in ruby1.9. What happens if you call read on the uploaded file where you pass it into AWS/S3?
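A rough sketch of that idea: hand the storage library a String of bytes rather than the file object itself. The helper name upload_bytes is hypothetical, and StringIO and Tempfile stand in for whatever upload object the test or the browser produces.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: return the raw bytes of an upload whether it arrives
# as a String or as an IO-like object (Tempfile, StringIO, uploaded file).
def upload_bytes(upload)
  return upload if upload.is_a?(String)
  upload.rewind if upload.respond_to?(:rewind)
  upload.read
end

from_string = upload_bytes("abc")
from_io     = upload_bytes(StringIO.new("abcd"))
from_temp   = Tempfile.create('avatar') do |f|
  f.write("abcde")
  upload_bytes(f)
end
```

All three results are Strings, so bytesize is defined on each of them, which is what the failing call was looking for.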
I have paper_clip installed on my Rails 3 app, and can upload a file - wow that was fun and easy!
Challenge now is, allowing a user to upload multiple objects.
Whether it be clicking select fileS and being able to select more than one. Or clicking a more button and getting another file upload button.
I can't find any tutorials or gems to support this out of the box. Shocking I know...
Any suggestions or solutions? Seems like a common need.
Thanks
Okay, this is a complex one but it is doable. Here's how I got it to work.
On the client side I used http://github.com/valums/file-uploader, a javascript library which allows multiple file uploads with progress-bar and drag-and-drop support. It's well supported, highly configurable and the basic implementation is simple:
In the view:
<div id='file-uploader'><noscript><p>Please Enable JavaScript to use the file uploader</p></noscript></div>
In the js:
var uploader = new qq.FileUploader({
element: $('#file-uploader')[0],
action: 'files/upload',
onComplete: function(id, fileName, responseJSON){
// callback
}
});
When handed files, FileUploader posts them to the server as an XHR request where the POST body is the raw file data, while the headers and filename are passed in the URL string (this is the only way to upload a file asynchronously via javascript).
This is where it gets complicated. Since Paperclip has no idea what to do with these raw requests, you have to catch them and convert them back to standard files (preferably before they hit your Rails app) so that Paperclip can work its magic. This is done with some Rack middleware that creates a new Tempfile (remember: Heroku is read-only):
# Embarrassing note: This code was adapted from an example I found somewhere online;
# if you recognize any of it, please let me know so I can give credit.
module Rack
  class RawFileStubber
    def initialize(app, path=/files\/upload/) # change for your route, careful.
      @app, @path = app, path
    end
    def call(env)
      if env["PATH_INFO"] =~ @path
        convert_and_pass_on(env)
      end
      @app.call(env)
    end
    def convert_and_pass_on(env)
      tempfile = env['rack.input'].to_tempfile
      fake_file = {
        :filename => env['HTTP_X_FILE_NAME'],
        :type => content_type(env['HTTP_X_FILE_NAME']),
        :tempfile => tempfile
      }
      env['rack.request.form_input'] = env['rack.input']
      env['rack.request.form_hash'] ||= {}
      env['rack.request.query_hash'] ||= {}
      env['rack.request.form_hash']['file'] = fake_file
      env['rack.request.query_hash']['file'] = fake_file
      if query_params = env['HTTP_X_QUERY_PARAMS']
        require 'json'
        params = JSON.parse(query_params)
        env['rack.request.form_hash'].merge!(params)
        env['rack.request.query_hash'].merge!(params)
      end
    end
    def content_type(filename)
      case type = (filename.to_s.match(/\.(\w+)$/)[1] rescue "octet-stream").downcase
      when %r"jp(e|g|eg)" then "image/jpeg"
      when %r"tiff?" then "image/tiff"
      when %r"png", "gif", "bmp" then "image/#{type}"
      when "txt" then "text/plain"
      when %r"html?" then "text/html"
      when "js" then "application/js"
      when "csv", "xml", "css" then "text/#{type}"
      else 'application/octet-stream'
      end
    end
  end
end
Later, in application.rb:
config.middleware.use 'Rack::RawFileStubber'
Then in the controller:
def upload
  @foo = ModelWithPaperclip.create({ :img => params[:file] })
end
This works reliably, though it can be a slow process when uploading a lot of files simultaneously.
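If the input object in your environment does not offer a to_tempfile method, the conversion step can be sketched with the standard library alone. This is only an illustration under assumptions: the helper name raw_body_to_upload is made up, and env is a plain Hash standing in for a real Rack environment, with the same header name the middleware reads.

```ruby
require 'stringio'
require 'tempfile'

# Copy the raw request body into a Tempfile and describe it with the same
# keys the middleware uses for its fake_file hash.
def raw_body_to_upload(env)
  tempfile = Tempfile.new('raw-upload')
  tempfile.binmode                             # uploads are raw bytes
  IO.copy_stream(env['rack.input'], tempfile)  # stream the body to disk
  tempfile.rewind
  { :filename => env['HTTP_X_FILE_NAME'], :tempfile => tempfile }
end

env = {
  'rack.input'       => StringIO.new("raw file bytes"),
  'HTTP_X_FILE_NAME' => 'notes.txt'
}
upload = raw_body_to_upload(env)
body   = upload[:tempfile].read
upload[:tempfile].close!   # close and delete the temp file when done
```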
DISCLAIMER
This was implemented for a project with a single, known & trusted back-end user. It almost certainly has some serious performance implications for a high traffic Heroku app and I have not fire tested it for security. That said, it definitely works.
The method Ryan Bigg recommends is here:
https://github.com/rails3book/ticketee/commit/cd8b466e2ee86733e9b26c6c9015d4b811d88169
https://github.com/rails3book/ticketee/commit/982ddf6241a78a9e6547e16af29086627d9e72d2
The file-uploader recommendation by Daniel Mendel is really great. It's a seriously awesome user experience, like Gmail drag-and-drop uploads. Someone wrote a blog post about how to wire it up with a rails app using the rack-raw-upload middleware, if you're interested in an up-to-date middleware component.
http://pogodan.com/blog/2011/03/28/rails-html5-drag-drop-multi-file-upload
https://github.com/newbamboo/rack-raw-upload
http://marc-bowes.com/2011/08/17/drag-n-drop-upload.html
There's also another plugin that's been updated more recently which may be useful
jQuery-File-Upload
Rails setup instructions
Rails setup instructions for multiples
And another one (Included for completeness. I haven't investigated this one.)
PlUpload
plupload-rails3
These questions are highly related
Drag-and-drop file upload in Google Chrome/Chromium and Safari?
jQuery Upload Progress and AJAX file upload
I cover this in Rails 3 in Action's Chapter 8. I don't cover uploading to S3 or resizing images however.
Recommending you buy it based solely on it fixing this one problem may sound a little biased, but I can just about guarantee you that it'll answer other questions you have down the line. It has a Behaviour Driven Development approach as one of the main themes, introducing you to Rails features during the development of an application. This shows you not only how you can build an application, but also make it maintainable.
As for the resizing of images after they've been uploaded, Paperclip's got pretty good documentation on that. I'd recommend having a read and then asking another question on SO if you don't understand any of the options / methods.
And as for S3 uploading, you can do this:
has_attached_file :photo, :styles => { ... }, :storage => :s3
You'd need to configure Paperclip::Storage::S3 with your S3 details to set it up, and again Paperclip's got some pretty awesome documentation for this.
Good luck!
Let's say I have an image that does not reside in the normal location:
{appname}/public/images/unconventional.gif
But instead here:
{appname}/unconventional.gif
I understand this is a complete violation of Rails conventions, is immoral and you should never do this under any circumstances and, furthermore, why would I even suggest such a foolish thing?
Ok, now that we have that out of the way, assuming I am on Windows and therefore symbolic links are out of the question, how is it possible to set this up?
Rails does not serve these images; it lets the web server do that. You had best change the configuration of your web server to handle this scenario. If you use Apache, for example, it would be fairly easy to set up with mod_rewrite.
Making Rails serve these images will be ugly, but it is possible if you provide a route in your routes.rb that matches /public/images/unconventional.gif, and if the file itself does not exist. For example:
map.connect "public/images/unconventional.gif",
  :controller => "static_image",
  :action => "serve"
And then create a controller StaticImageController:
class StaticImageController < ApplicationController
  def serve
    # Read in binary mode ("rb") so Windows does not mangle the image data
    image = File.open(File.join(Rails.root, "unconventional.gif"), "rb") { |f| f.read }
    send_data image, :type => "image/gif", :disposition => "inline"
  end
end
Warning: if you use the above concept, note that if you use input from the URL to decide which file to serve (with params[:file], for example), you need to thoroughly sanitize the input, because you are risking exposing your entire file system to the outside world.
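One common way to do that sanitizing is to resolve the requested name against a fixed directory and refuse anything that lands outside it. The sketch below is plain Ruby; the helper name resolve_inside and the /srv/app/images directory are hypothetical, and it is a starting point rather than a complete defence.

```ruby
# Resolve requested_name relative to base_dir and return the absolute path
# only if it stays inside base_dir; return nil for anything else.
def resolve_inside(base_dir, requested_name)
  base      = File.expand_path(base_dir)
  candidate = File.expand_path(requested_name.to_s, base)
  candidate.start_with?(base + File::SEPARATOR) ? candidate : nil
rescue ArgumentError
  nil  # expand_path rejects some malformed names, e.g. with a null byte
end

ok        = resolve_inside("/srv/app/images", "logo.gif")
traversal = resolve_inside("/srv/app/images", "../../etc/passwd")
absolute  = resolve_inside("/srv/app/images", "/etc/passwd")
```

In the controller you would call this with params[:file] and respond with a 404 when it returns nil, before ever touching the file system.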