How to return a pdf file from a rest api? - ruby-on-rails

I have set up a REST API inside a Ruby on Rails application. I now have a requirement to generate a PDF and return it from a GET request, and I am looking for advice on how to implement this.
Some of my requirements are as follows: I can't save the file and give the end user a link to it, because the data in the file can be updated at any time. I am using the application as a microservice, so there is no front end that I can use to display the file.
Here is my thinking:
I would like to make a GET request to a specific endpoint in the application. I expect a PDF file to be returned, which I can then display to the end user.
I am currently using the WickedPdf gem to generate a temporary PDF file, but I am struggling with what the response should look like.
Any advice would be much appreciated.

One way is to create the PDF in memory and stream it to the client. I prefer this approach because you may later need to send the PDFs by email, save them to a backup disk, and so on.
def get_pdf
  pdf = WickedPdf.new.pdf_from_string('<h1>Hello There!</h1>')
  send_data pdf, filename: 'file_name.pdf'
end
You can move the PDF generation into a separate service and just call it from the controller. This provides isolation, and you can test it separately.
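One way to read "a separate service" is a small plain-Ruby class that the controller calls. The sketch below is an assumption, not the answerer's code: `InvoicePdfService`, its `renderer:` argument, and `FakeRenderer` are hypothetical names. In the application, the renderer would presumably be `WickedPdf.new`, whose `pdf_from_string` call appears in the answer.

```ruby
# Hypothetical service: wraps PDF generation so it can be tested without a controller.
class InvoicePdfService
  # renderer is anything that responds to pdf_from_string(html),
  # e.g. WickedPdf.new in the real application (assumed, not loaded here).
  def initialize(renderer:)
    @renderer = renderer
  end

  # Returns the PDF as a string, ready to hand to send_data.
  def call(title)
    @renderer.pdf_from_string("<h1>#{title}</h1>")
  end
end

# Stand-in renderer used only to exercise the service without WickedPdf installed.
class FakeRenderer
  def pdf_from_string(html)
    "%PDF-1.4 fake #{html}"
  end
end

pdf = InvoicePdfService.new(renderer: FakeRenderer.new).call("Hello There!")
puts pdf.start_with?("%PDF")
```

Injecting the renderer keeps the service testable in isolation; the controller action would then only pass the returned string to `send_data`.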
You can also debug the endpoint response with HTTPie:
http get http://localhost:3000/invoices/1/get_pdf
Rails will set all the necessary HTTP response headers:
Content-Disposition: attachment; filename="file_name.pdf"
Content-Length: 5995
Content-Transfer-Encoding: binary
Content-Type: application/pdf
So when the user clicks on a link that points to the endpoint, the download dialog will most probably pop up because of the Content-Disposition: attachment header.
Another solution is to render the get_pdf.html view as a PDF and send it back to the client:
def get_pdf
  render pdf: "file_name"
end
But in this case the Content-Disposition header will be inline, which means the browser will open the pdf (if it can read PDF format) instead of offering to download it.
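To make the difference between the two dispositions concrete, here is a minimal plain-Ruby sketch that builds the header value by hand. Rails builds this header itself when you call `send_data` or `render pdf:`; the `content_disposition` helper is a hypothetical name used only for illustration, and it handles plain ASCII filenames only.

```ruby
# Builds a Content-Disposition header value for the given disposition and filename.
# Illustration only: Rails sets this header for you.
def content_disposition(disposition, filename)
  unless %w[inline attachment].include?(disposition)
    raise ArgumentError, "disposition must be inline or attachment"
  end
  # Escape backslashes and double quotes for the quoted-string form.
  safe = filename.gsub(/["\\]/) { |c| "\\#{c}" }
  %(#{disposition}; filename="#{safe}")
end

puts content_disposition("attachment", "file_name.pdf") # asks the browser to download
puts content_disposition("inline", "file_name.pdf")     # asks the browser to display
```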

Upload the PDF to Amazon S3, generate a link to it, and return that link from the API.

I don't know if you still need this, but for anyone in the future I found a nice solution:
pdf = WickedPdf.new.pdf_from_string(render_to_string "entradas/entradaspdf.pdf.erb")
send_data pdf, filename: "bergha.pdf", disposition: "inline"
I load my HTML template for the PDF through the Rails "render_to_string" method, which returns the view contents as a string. WickedPdf then converts it to a PDF binary, which I save in the "pdf" variable.
Finally, instead of "render" I use the "send_data" method, where the first parameter is the output data (my pdf variable), the second is the filename of the output data, and the third (optional) changes the Content-Disposition header to tell the browser whether to display the file (inline) or download it (attachment).
Hope it works for you; it works just fine for me.

Related

How to validate a file as image on the server before uploading to S3?

The flow is:
The user selects an image on the client.
Only the filename, content type and size are sent to the server (e.g. "file.png", "image/png", "123123").
The response contains the fields and policy for uploading directly to S3 (e.g. "key": xxx, "acl": ...).
The problem is that if I change the extension of "file.pdf" to "file.png" and then upload it, the data sent to the server before the upload to S3 is:
"file.png"
"image/png"
The server says "ok" and returns the S3 fields for the upload.
But the content type sent is not the real content type. How can I validate this on the server?
Thanks!
Example:
The Redactor.js server-side code (https://github.com/dybskiy/redactor-js/blob/master/demo/scripts/image_upload.php) checks the file content type. When I try to upload a fake image (test here: http://imperavi.com/redactor/), it rejects the fake image, which is what I want.
But how is that possible? Look at the request params: (It is sent as image/jpeg, which should be valid)
When I was dealing with this question at work I found a solution using Mechanize.
Say you have an image url, url = "http://my.image.com"
Then you can use img = Mechanize.new.get(url)[:body]
The way to test whether img is really an image is by issuing the following test:
img.is_a?(Mechanize::Image)
If the image is not legitimate, this will return false.
There may be a way to load the image from file instead of URL, I am not sure, but I recommend looking at the mechanize docs to check.
With older browsers there's nothing you can do, since there is no way for you to access the file contents or any metadata beyond its name.
With the HTML5 file api you can do better. For example,
document.getElementById("uploadInput").files[0].type
Returns the mime type of the first file. I don't believe that the method used to perform this identification is mandated by the standard.
If this is insufficient then you could read the file locally with the FileReader apis and do whatever tests you require. This could be as simple as checking for the magic bytes present at the start of various file formats to fully validating that the file conforms to the relevant specification. MDN has a great article that shows how to use various bits of these apis.
Ultimately none of this would stop a malicious attempt.
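The magic-byte check mentioned above can also be done on the server, wherever the file's bytes are actually available (in the flow described in the question, that would be after the upload, since only metadata is sent beforehand). The following is a minimal Ruby sketch under that assumption; `MAGIC_BYTES`, `sniff_mime_type` and `claimed_type_matches?` are hypothetical names, and the table covers only a few formats.

```ruby
# Leading "magic bytes" for a few common formats, as binary strings.
MAGIC_BYTES = {
  "image/png"       => "\x89PNG\r\n\x1A\n".b,
  "image/jpeg"      => "\xFF\xD8\xFF".b,
  "image/gif"       => "GIF8".b,
  "application/pdf" => "%PDF-".b
}.freeze

# Returns the detected MIME type, or nil when no known signature matches.
def sniff_mime_type(data)
  bytes = data.b
  MAGIC_BYTES.each do |mime, magic|
    return mime if bytes.start_with?(magic)
  end
  nil
end

# True only when the claimed type matches what the bytes look like.
def claimed_type_matches?(claimed_type, data)
  sniff_mime_type(data) == claimed_type
end

fake_png = "%PDF-1.4 renamed to file.png"
puts claimed_type_matches?("image/png", fake_png) # false
```

A signature check only shows that the file starts like the claimed format; it does not prove the rest of the file is valid.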

Setting the name of a file downloaded from the browser

Disclaimer: I am aware of the Content-Disposition header that can be sent back to the client to set the downloaded file name; however, my problem is a little more complicated than just that.
I have an application (Ruby on Rails, using Rails 3.1.3) that is essentially a document search/view application (search for documents and then render them in the browser). This is accomplished using an iframe.
<iframe src="<%= @frameURL %>" width="100%" height="100%"></iframe>
@frameURL is a call to the plugin function of our Documents controller. The plugin function makes a RESTful call to our back end API to retrieve the referenced document, and then sends the document contents back to the browser for rendering inside the iframe.
This works perfectly for documents like JPEG, PDF, TXT, etc. However, when the browser does not know how to handle the content-type (like a word document - we run Mac OS-X) - then the browser downloads the returned file as plugin.doc <- NOTE this is without setting the Content-Disposition header.
Since we want to name the file appropriately when it needs to be downloaded, we set the Content-Disposition header:
response.headers['Content-Disposition'] = 'attachment; filename="filename.extension"'
Now the file gets downloaded as filename.doc - however, with this header set, even files like JPEG which the browser can render internally, get downloaded.
Questions:
Does anyone know where rails or the browser is getting the name of plugin.extension when we don't set the Content-Disposition header?
Is there a way to set Content-Disposition but have it only applied IF the browser can't render the document - so the default should be browser handles everything it can, and as a fallback, the browser uses the Content-Disposition content to name the downloaded file.
Thanks!
If you are calling some Rails function like "send_file", then search the source code of your version of Rails to find the source code of that function and see what headers it sets. You have to follow the call stack down a couple of levels but you should be able to find out how it sets the headers; I have done this before. As for the browser, I think if it doesn't find a file name in the Content-Disposition header it will more or less use the last portion of the URL for a filename.
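As a rough illustration of the "last portion of the URL" guess, the Ruby sketch below extracts the final path segment of a URL. The URL and the `fallback_filename` name are made up for the example, and real browsers may apply further rules (such as appending an extension based on the content type), so treat this as an approximation only.

```ruby
require "uri"

# Approximates the name a browser may fall back to: the last path segment of the URL.
def fallback_filename(url)
  File.basename(URI.parse(url).path)
end

puts fallback_filename("http://example.com/documents/plugin?id=42") # plugin
```

Under this reading, an action called `plugin` would explain a download named plugin.doc once the browser adds an extension for the content type.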
Try using "inline" instead of "attachment" in the header.

How to change filename prompt text browser Save As dialog?

In my web page (rendered by Rails), I'd like to let the user right-click on a photo to bring up the browser's Save As dialog, to let the user save the photo to their hard drive.
However, the photos on my server have unusual filenames (long hex names) with no file extension. The filename prompt in the Save As dialog has this ugly filename. If the user hits save, they'll end up with a poorly-named file, with no file extension.
The web page is aware of the photo's real file name (the name that came off the camera, for example). Is there a way for me to programmatically override the Save As dialog's filename prompt with a filename of my choosing?
I'm aware of the Content-Disposition header, and that via this header a filename can be specified. However, I think that in order to be able to make use of this header, I need to load/render the entire file to the browser. If the asset to be made available for download is a movie, say a 100 MB video, loading the file could time out the browser.
Thoughts?
-A
I think I understand the problem here because I encountered (and resolved) at least part of it myself not too long ago.
I have some large mp3's and I link to them on my website
A few problems
I needed to set my content-disposition header to attachment in order to prevent files from automatically streaming whenever a user clicked the download button
my files are on a remote server
my files are large (100MB)
large files can tie up rails controllers if not handled properly
Now, Michael Koziarski advises in this article that the best way to keep your Rails processes free when serving large files is to create a download action in your controller, and then do something like this (note the use of x_sendfile=>true):
def download
  send_file '/path/to/podcast.mp3', :type => 'application/octet-stream', :disposition => 'attachment', :filename => 'something.mp3', :x_sendfile => true
end
:x_sendfile tells apache to let the file through without tying up a rails controller process. The rest of the code sets the filename and the content-disposition header.
Great, but I'm on heroku, like everyone else nowadays. So I can't use x_sendfile.
I found that I couldn't modify the nginx configuration file either as it's locked down by heroku so it was not possible to get x-accel-redirect (nginx equivalent of x-sendfile) working
So, I decided to add a perl script (see below) to the cgi-bin on our asset-host and this script sets the content-disposition to attachment and gives our file a name too.
Instead of doing a restful download like this:
link_to "download", download_podcast_path(@podcast.mp3)
we just link to the mp3 making sure that we go in through the cgi-bin so that the perl script gets called on every mp3 that leaves the server
# I'm using haml
%a{:href=>"http://afmpodcast.com/cgi-bin/download.cgi?ID=#{@podcast.mp3}"}
  download
The result is that my rails controller is no longer called into action when someone downloads a file
I found the perl script here and chopped it up a bit to work for me:
#!/usr/local/bin/perl -wT
use CGI ':standard';
use CGI::Carp qw(fatalsToBrowser);
my $files_location;
my $ID;
my @fileholder;
$files_location = "../";
$ID = param('ID');
open(DLFILE, "<$files_location/$ID") || Error('open', 'file');
@fileholder = <DLFILE>;
close (DLFILE) || Error ('close', 'file');
print "Content-Type:application/x-download\n";
print "Content-Disposition:attachment;filename=$ID\n\n";
print @fileholder
My code is on GitHub, but you'll likely have all sorts of problems using it on your machine, as I make heavy use of ENV variables that I store in bashrc and I have no documentation or tests ^hides^
You could do some smart server-side URL rewriting, for example rewriting foo.mpeg to yourveryuglyfilenamewithoutextension.
Set the Content-Disposition to "attachment; filename=..."; that's fine. "attachment" explicitly means the file is not to be rendered in the browser; file renaming works nonetheless (or possibly particularly for that case).
Based on your comments, you have a few problems.
You want to set the filename using your Rails app.
The file is on a remote host and your Rails app is acting as a middleman.
The file might be big, so you want the file to be sent out to the browser as you receive it instead of queuing the whole thing.
Streaming only with Rails is tricky for a few reasons.
You would need an HTTP client that lets you access the message body as you receive data instead of blocking until you have everything. Net::HTTP is not that client. I'm not sure what library would be better suited.
Once you have a more event-driven way to get your file in pieces, you can pass a proc to the render:
render :text => proc { |response, output| ... }
output can be used like an IO object. Some servers may buffer before sending anyway, though, so that's something to look out for.
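The idea of handing the body over in pieces can be sketched in plain Ruby, independent of the `render :text => proc` API above. `ChunkedBody` is a hypothetical class; it follows the Rack convention that a response body responds to `each` and yields strings. A `StringIO` stands in for the remote file here, and fetching the remote data incrementally is not shown.

```ruby
require "stringio"

# Hypothetical body object: yields an IO in fixed-size chunks instead of one big string.
class ChunkedBody
  def initialize(io, chunk_size: 16 * 1024)
    @io = io
    @chunk_size = chunk_size
  end

  # Rack-style bodies respond to #each and yield strings.
  def each
    while (chunk = @io.read(@chunk_size))
      yield chunk
    end
  ensure
    @io.close
  end
end

source = StringIO.new("a" * 40)
sizes = []
ChunkedBody.new(source, chunk_size: 16).each { |chunk| sizes << chunk.bytesize }
p sizes # [16, 16, 8]
```

Only one chunk is held in memory at a time, though a server or proxy may still buffer the response before sending it.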
It would be easier not to handle the byte-shuffling in Rails.
If your webserver or the proxy in front of your webserver supports the X-REPROXY-URL HTTP header, your application can set that header and your webserver or proxy will stream the file.
Perlbal is the only proxy server I know of that supports that header out of the box.
An Apache2 module is also available.

Rails Paperclip XML POST File

I am able to 'POST' to a Rails application (with Paperclip) using XML instead of the standard web form (trying to do it from another Ruby script). However, I would like to include a binary file.
Is there any way to include the binary data within an XML tag? Or can I do something like B64 encode the data on the client and then decode it before it hits the Paperclip plugin?
UPDATE:
The browser sends a POST with this data (among others):
Content-Disposition: form-data; name="upload[upload]"; filename="foo.jpg"
Content-Type: image/jpeg
ÿØÿà�JFIF��`�`��ÿþ�Created by AccuSoft Corp.ÿÛ�C�...
I'd like to replicate that, but within XML
The short version is: use type="file", base64-encode the file, and put it inside a CDATA block. I originally found an explanation at this link:
http://techblog.floorplanner.com/2010/02/15/restful-uploading-of-files-using-xml/
That link appears to have died, so I recommend checking out the Internet Archive copy of the blog post:
http://web.archive.org/web/20100825030057/http://techblog.floorplanner.com/2010/02/15/restful-uploading-of-files-using-xml/
Also linked from that post is a gem that implements an encoder for files posted to Rails as XML: https://github.com/nragaz/encoded_attachment
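On the client side, the short version above (type="file", base64, CDATA) might look roughly like the Ruby sketch below. The element name `upload` and the `name` and `content_type` attributes are assumptions chosen to mirror the form field in the question; `file_upload_xml` is a hypothetical helper, and whether the receiving Rails version decodes the payload is not verified here.

```ruby
# Builds an XML payload with the file base64-encoded inside a CDATA block.
# Element and attribute names are assumptions; filename is not XML-escaped in this sketch.
def file_upload_xml(filename, content_type, data)
  encoded = [data].pack("m0") # "m0" = base64 without line breaks
  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <upload>
      <upload type="file" name="#{filename}" content_type="#{content_type}"><![CDATA[#{encoded}]]></upload>
    </upload>
  XML
end

xml = file_upload_xml("foo.jpg", "image/jpeg", "\xFF\xD8\xFF\xE0".b)
encoded = xml[/<!\[CDATA\[(.*?)\]\]>/m, 1]
puts encoded.unpack1("m") == "\xFF\xD8\xFF\xE0".b # true: the bytes survive the round trip
```

The resulting string would be sent as the POST body with an XML content type.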

Load testing multipart form

I'm trying to load-test a Rails application using JMeter. A critical part of the application involves a form that includes both text inputs and file uploads. It works fine in a browser, but when I try to post that page in JMeter, Rails is saving all of the parts of the multipart form as temp files, which causes things to break when it's looking for a string and gets a tempfile instead.
It appears that the difference is that, from a browser, the piece of the multipart request that contains a text input looks like this:
-----------------------------7d93b4186074c
Content-Disposition: form-data; name="field_name"
test
-----------------------------7d93b4186074c
while from JMeter it looks like this:
-----------------------------7d159c1302d0y0
Content-Disposition: form-data; name="field_name"
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
test
-----------------------------7d159c1302d0y0
So apparently Rails sees the former and interprets it as a plain text value and treats it as a string, but sees the latter and saves it to a temp file.
I have not been able to find a setting to convince JMeter not to send the additional headers in the multipart form for non-file fields.
Is there a way to convince Rails to ignore those headers and treat the text/plain text as strings instead of text files? Or a quick way to put a filter in front of my controller that will strip the extra headers?
Alternately, is there a better tool to load-test a Rails application that includes file upload?
Turns out these days you can just tick "use browser compatible headers" in JMeter. Could've saved myself a hell of a lot of time there :-)
So, I customized the multipart request posting part of JMeter's source code to send a request that Rails understands. The change is easy, as shown below, but setting up an environment to compile Java/JMeter took time. :(
Anyways, now I can successfully upload a file by multipart post via JMeter.
in src/protocol/http/org/apache/jmeter/protocol/http/sampler/PostWriter.java
writeStartFileMultipart()
//writeln(out, "Content-Transfer-Encoding: binary"); // $NON-NLS-1$
writeFormMultipart()
/*****
writeln(out, "Content-Type: text/plain; charset=" + charSet); // $NON-NLS-1$
writeln(out, "Content-Transfer-Encoding: 8bit"); // $NON-NLS-1$
*****/
P.S.
A tip for creating the build environment for 2.4:
comment out the third-party libraries check in the build.xml file, and
copy lib/xstream-1.3.1.jar from the binary archive into the lib/ directory.
There may be a better way, but I ended up adding a quick filter to turn the text/plain tempfiles into strings within the parameter hash:
def change_text_files_to_strings
  params.each_pair do |key, value|
    params[key] = value.read if value.class.to_s == 'Tempfile' && value.content_type.start_with?('text/plain')
  end
end
By the way, it turns out that jmeter is correct here, and rails incorrect: according to RFC 2388, each item in a multipart request should have a content type (not just files), so Rails really shouldn't be using the presence of a content-type header to determine whether it's a file. Ah well.
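To illustrate the alternative the RFC remark points at, here is a small Ruby sketch that classifies a multipart part as a file by the presence of a `filename` parameter in Content-Disposition rather than by a Content-Type header. This is not how Rails or Rack is implemented; `file_part?` and the header hashes are hypothetical, modeled on the requests quoted in this thread.

```ruby
# Decides whether a multipart part is a file upload by looking for a filename
# parameter in Content-Disposition, ignoring any Content-Type header.
def file_part?(headers)
  disposition = headers.fetch("Content-Disposition", "")
  disposition.match?(/;\s*filename\*?=/i)
end

browser_part = { "Content-Disposition" => 'form-data; name="field_name"' }
jmeter_part = {
  "Content-Disposition"       => 'form-data; name="field_name"',
  "Content-Type"              => "text/plain; charset=utf-8",
  "Content-Transfer-Encoding" => "8bit"
}
upload_part = {
  "Content-Disposition" => 'form-data; name="upload[upload]"; filename="foo.jpg"',
  "Content-Type"        => "image/jpeg"
}

p [browser_part, jmeter_part, upload_part].map { |part| file_part?(part) }
# [false, false, true]
```

With this rule, the JMeter text field and the browser text field are treated the same way, and only the part that carries a filename is handled as an upload.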
I also used the solution above as ColdFusion was sending similar headers (minus the Content-Transfer-Encoding) with each piece of form data. I wonder if there's a better way.
EDIT: Anyone know if this has been fixed in Rails 3?
What kind of error do you get? Something like
NoMethodError (undefined method `rewind' for "1":String):
There is an issue with Rack that could explain your problem. See https://github.com/rack/rack/issuesearch?state=open&q=rewind#issue/116
We had a similar issue. In addition to the above answers, we also correlated the X-CSRF-Token in the HTTP Header Manager for that request, and were then able to upload the required media as many times as we wanted.
