Opening File in Ruby returning empty file - ruby-on-rails

I am currently trying to store a PDF in a hash for an API call in Ruby. The path to the PDF is stored as:
/Users/myUserName/Desktop/REPOSITORY/path_to_file.pdf
I am using a block to store the file in the hash like so:
File.open(pdf_location, "rb") do |file|
params = {
# other irrelevant entries
:document => file
}
pdf_upload_request('post', params, headers)
end
I am receiving a 400 error from the server that my document is empty and when I do puts file.read, it is empty. However, when I visit the filepath, it's clear that the file is not empty. Am I missing something here? Any help would be greatly appreciated. Thank you!
Edit------
I recorded my HTTP request with VCR; here it is:
request:
method: post
uri: request_uri
body:
encoding: US-ASCII
string: ''
headers:
Authorization:
- Bearer 3ZOCPnwfoN7VfdGh7k4lrBuEYs4gN1
Content-Type:
- multipart/form-data; boundary=-----------RubyMultipartPost
Content-Length:
- '246659'
So I don't think the issue is with me sending the file with multipart encoding.
Update--------
The file paths to the PDFs are generated from a URL, and the files are stored in the tmp folder of my application. They are generated through this method:
def get_temporary_pdf(chrono_detail, recording, host)
auth_token = User.find(chrono_detail.user_id).authentication_token
# pdf_location = "https://54.84.224.252/recording/5/analysis.pdf/?token=Ybp37kw7HrSt8NyyPnBZ"
pdf_location = host + '/recordings/' + recording.id.to_s + '/analysis.pdf/?format=pdf&download=true&token=' + auth_token
filename = "Will" + '_' + recording.id.to_s + '_' + Date.new.to_s + '.pdf'
Thread.new do
File.open(Rails.root.join("tmp",filename), "wb") do |file|
file.write(open(pdf_location, {ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE}).read)
end
end
Rails.root.join("tmp",filename)
end
They are then called using the api call:
client.upload_document(patient_id, file_path, description)
I can see them physically in my temp folder, and can view them with Preview. Everything seems to work. But as a sanity check, I changed file_path to point to a different PDF:
Users/myUsername/Desktop/example.pdf.
Using this file path worked. The PDF was uploaded to the third-party system correctly; I can see it there. Do you think this means the issue is with the tmp folder or with how I generate the temporary PDFs?

Most likely, the API is expecting a POST with Content-Type: multipart/form-data. Just sending the file handle (which document: file does) won't work, as the file handle is only relevant to your local Ruby process; and even sending the binary string as a parameter won't work, since your content-type isn't properly set to encode a file.
Since you're already using HTTParty, though, you can fix this by using HTTMultiParty:
require 'httmultiparty'
class SomeClient
include HTTMultiParty
base_uri 'http://localhost:3000'
end
SomeClient.post('/', :query => {
:foo => 'bar',
:somefile => File.new('README.md')
})

Try this:
file = File.read(pdf_location)
params = {
# other irrelevant entries
document: file
}
headers = {}
pdf_upload_request('post', params, headers)
Not sure, but maybe you need to close the file first...

So the issue arose from the multithreading I used to avoid timeout errors. The file path was generated and referenced in the API call before anything had actually been written to the file.
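One way to avoid this kind of race, as a minimal sketch: keep a handle on the writer thread and join it before the path is used. The names fetch_pdf_bytes and write_temporary_pdf are hypothetical stand-ins (not from the original code), and the sleep only simulates a slow download.

```ruby
require 'tmpdir'

# Hypothetical stand-in for the slow remote download in get_temporary_pdf.
def fetch_pdf_bytes
  sleep 0.1
  "%PDF-1.4 example bytes"
end

# Write the file in a background thread, but return the thread together
# with the path so the caller can wait for the write to finish.
def write_temporary_pdf(dir, filename)
  path = File.join(dir, filename)
  writer = Thread.new do
    File.open(path, "wb") { |file| file.write(fetch_pdf_bytes) }
  end
  [path, writer]
end

dir = Dir.mktmpdir
path, writer = write_temporary_pdf(dir, "example.pdf")

# Block until the write has completed; only then is it safe to upload.
writer.join
size_after_join = File.size(path)
puts size_after_join
```

Joining right before the upload keeps the download off the main thread while other work happens, but guarantees the file has content by the time it is read.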

Related

How to parse a very huge XML file from a remote server rails

I have a very large XML from a remote server which I have to parse and get the data.
I have tried to open the file using the open() function but it is taking more than 15 minutes and still no response.
Then I tried Nokogiri::XML(open(URL)) where URL is the link which contains the data to parse.
Also, I have tried using Net::HTTP::Get but again with no fruitful results.
Can anyone suggest which gem and function can be used to parse the data?
As mentioned before, Nokogiri::XML::Reader is your friend here. The example in the documentation works fine if you have the file locally.
It is also possible to parse the data as soon as it comes in, fully streaming. This involves getting the data in chunks (e.g. using Net::HTTP) and connecting it to the Nokogiri::XML::Reader by means of an IO.pipe.
Example (adapted from this gist):
require 'nokogiri'
require 'net/http'
# setup request
uri = URI("http://example.com/articles.xml")
req = Net::HTTP::Get.new(uri.request_uri)
# read response in a separate thread using a pipe to communicate
rd, wr = IO.pipe
reader_thread = Thread.new do
Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
http.request(req) do |response|
response.read_body {|chunk| wr.write(chunk) }
end
wr.close
end
end
# parse the incoming data chunk by chunk
reader = Nokogiri::XML::Reader(rd)
reader.each do |node|
next if node.node_type != Nokogiri::XML::Reader::TYPE_ELEMENT
next if node.name != "article"
# now that we have the desired fragment, put it to use
doc = Nokogiri::XML(node.outer_xml)
puts("Got #{doc.text}")
end
rd.close
# let the reader thread finish cleanly
reader_thread.join
If you are working with large XML files, you can use the Nokogiri::XML::Reader class. I have successfully opened 1 GB files without any problems. For optimal performance, you could download the file first and then parse it locally on your server using the XML::Reader class.
The usage is something like this (replace XML_FILE with your path):
Nokogiri::XML::Reader(File.open(XML_FILE)).each do |node|
if node.name == 'Node' && node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
puts node.outer_xml # you can do something like this also Nokogiri::XML(node.outer_xml).at('./Node')
end
end
Here is the documentation: http://www.rubydoc.info/github/sparklemotion/nokogiri/master/Nokogiri/XML/Reader
Hope it helps

How do you post video data from ios to Signed Url to Google Cloud Bucket?

I have a python method that successfully creates a GET Signed Url that will download the video that is in the Google Cloud Bucket.
def _MakeUrlForApp(self, verb, path, content_type='', content_md5=''):
"""Forms and returns the full signed URL to access GCS."""
base_url = '%s%s' % (self.gcs_api_endpoint, path)
signature_string = self._MakeSignatureString(verb, path, content_md5,
content_type)
signature_signed = self._Base64Sign(signature_string)
# replace @ with %40, + with %2B, / with %2F and = with %3D
signature_signed = signature_signed.replace("+", "%2B")
signature_signed = signature_signed.replace("/", "%2F")
signature_signed = signature_signed.replace("=", "%3D")
self.client_id_email = self.client_id_email.replace("@", "%40")
signedURL = base_url + "?Expires=" + str(self.expiration) + "&GoogleAccessId=" + self.client_id_email + "&Signature=" + signature_signed
print 'this is the signed URL '
print signedURL
return signedURL
This is called from iOS (Swift) over HTTP. It returns the signed URL, and the iOS app then downloads the video.
With the method below, if I specify the bucket name, the object name, text/plain as the content type, and a couple of words for the data, it creates that file and puts it into the Google Cloud bucket for me.
def Put(self, path, content_type, data):
"""Performs a PUT request.
Args:
path: The relative API path to access, e.g. '/bucket/object'.
content_type: The content type to assign to the upload.
data: The file data to upload to the new file.
Returns:
An instance of requests.Response containing the HTTP response.
"""
md5_digest = base64.b64encode(md5.new(data).digest())
base_url, query_params = self._MakeUrl('PUT', path, content_type,
md5_digest)
headers = {}
headers['Content-Type'] = content_type
headers['Content-Length'] = str(len(data))
headers['Content-MD5'] = md5_digest
return self.session.put(base_url, params=query_params, headers=headers,
data=data)
What I want to know is one of these two things and nothing else. How do I upload the data from a video to this data parameter in my Python webapp2.RequestHandler from iOS? OR how do I get the correct PUT signed URL to upload video data?
Please do not comment with anything that will not solve this specific question and do not bash me for my methods. Please provide suggests that you feel will specifically help me and nothing else.
There are a few ways to upload images to GCS, and each way works with signed URLs. If the video files are small, your simplest option is to have the users perform a non-resumable upload, which has the same URL signature except that the verb is PUT instead of GET. You'll also need to add the "Content-Type" header to the signature.
Video files can be fairly large, though, so you may prefer to use resumable uploads. These are a bit more complicated but do work with signed URLs as well. You'll need to use the "x-goog-resumable: start" header (and include it in the signature) and set "Content-Length" to 0. You'll get back a response with a Location header containing a new URL. Your client will then use that URL to do the upload. Only the original URL needs to be signed. The client can use the followup URL directly.

Ruby: Open URI. Get file name from the remote url

I have a remote URL, that doesn't have any reference to filename.
https://example.site.com/sadfasfsadfasdfasfdsafas/
But on downloading it gives file as 'Intro.pdf'
I would like to get that filename in my Ruby code so that I can use it to create the file or send the file in requests. As of now, I am sending the hardcoded name attachment.pdf:
obj = open(url, :ssl_verify_mode => OpenSSL::SSL::VERIFY_NONE)
data = obj.read
send_data data, :disposition => 'attachment', :filename=>"attachment.pdf"
Please advise.
Thanks
Check the result of the meta method:
p obj.meta
It probably has a Content-Disposition header. If so, that header can carry the file name as an optional parameter.
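As a rough sketch of pulling the name out of that header: the helper name filename_from_content_disposition is hypothetical, and the regex only handles the plain filename= form (quoted or unquoted), not the encoded filename*= variant.

```ruby
# Extract the filename parameter from a Content-Disposition header value.
# Returns nil when the header is missing or carries no plain filename.
def filename_from_content_disposition(header)
  return nil if header.nil?

  match = header.match(/filename="?([^";]+)"?/i)
  match && match[1].strip
end

puts filename_from_content_disposition('attachment; filename="Intro.pdf"')
# => Intro.pdf
```

With open-uri, the header value should be available as obj.meta['content-disposition'], since the meta hash uses lowercased header names; falling back to a default such as attachment.pdf when the helper returns nil keeps the existing behavior.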

How can I have Grape return error messages in CSV format?

I have a Rails app in which I have implemented an API using the Grape gem. I created a custom error formatter (CSVFormatter) to return error responses in CSV format.
And, also I have this in my application's v2.rb file:
error_formatter :csv, Api::Base::Errors::CSVFormatter
When I hit a url like this:
http://example.com/api/v2/datasets/CODE/data.csv?&trim_start=06/01/99&trim_end=2014-05/28&sort_order=desc
It shows the error in the console like this which is good and means that my custom error formatter is working properly:
Error
trim_start is invalid
trim_end is invalid
But, I just need to download this error message in a csv file. After looking at Grape's documentation, I found a way of setting Content-type and I tried this:
rack = Rack::Response.new(as_csv , 422, { "Content-type" => "text/csv" }).finish
rack[2].body[0]
But, this is not working as I expected.
EDIT:
It looks like there is no clean way of doing this with Grape without forcibly overriding the status code, as Simon's answer describes. But one may not wish to do that, as it may cause other issues in the application: for example, another program reading data from the API could receive a misleading response without knowing why.
You're looking for the Content-Disposition header. Include it in your response like this:
Content-Disposition: attachment; filename=error.csv
And the Web browser will treat the response body as a file to be downloaded (to "error.csv", in this example).
However, modifying your code to do this is complicated by two things:
From the Grape source code it's apparent there's no way to set response headers from within an error formatter, so you'll need to add a custom exception handler that formats the response body and sets the response headers appropriately for each output format you plan to support.
According to my experimentation, browsers will ignore the Content-Disposition header if the HTTP status code indicates an error (e.g. anything in the 400 or 500 range), so the status code will also need to be overridden when the user requests a CSV file.
Try adding this to your API class:
# Handle all exceptions with an error response appropriate to the requested
# output format
rescue_from :all do |e|
# Edit this hash to override the HTTP response status for specific output
# formats
FORMAT_SPECIFIC_STATUS = {
:csv => 200
}
# Edit this hash to add custom headers specific to each output format
FORMAT_SPECIFIC_HEADERS = {
:csv => {
'Content-Disposition' => 'attachment; filename=error.csv'
}
}
# Get the output format requested by the user
format = env['api.format']
# Set the HTTP status appropriately for the requested output format and
# the error type
status = FORMAT_SPECIFIC_STATUS[format] ||
(e.respond_to? :status) && e.status ||
500
# Set the HTTP headers appropriately for the requested format
headers = {
'Content-Type' => options[:content_types][format] || 'text/plain'
}.merge(FORMAT_SPECIFIC_HEADERS[format] || { })
# Format the message body using the appropriate error formatter
error_formatter =
options[:error_formatters][format] || options[:default_error_formatter]
body = error_formatter.call(e.message, nil, options, env)
# Return the error response to the client in the correct format
# with the correct HTTP headers for that format
Rack::Response.new(body, status, headers).finish
end
Now if you configure your API class to handle two different formats (I've picked CSV and plain-text here for simplicity), like this:
module Errors
module CSVErrorFormatter
def self.call(message, backtrace, options, env)
as_csv = "CSV formatter:" + "\n"
message.split(",").each do |msg|
as_csv += msg + "\n"
end
# Note this method simply returns the response body
as_csv
end
end
module TextErrorFormatter
def self.call(message, backtrace, options, env)
as_txt = "Text formatter:" + "\n"
message.split(",").each do |msg|
as_txt += msg + "\n"
end
as_txt
end
end
end
content_type :csv, 'text/csv'
content_type :txt, 'text/plain'
error_formatter :csv, Api::Base::Errors::CSVErrorFormatter
error_formatter :txt, Api::Base::Errors::TextErrorFormatter
You should find your API always returns an error response suitable for the requested format, and triggers the browser to download the response only when CSV format is requested. Naturally this can be extended to support as many formats as you like, by explicitly declaring content types and error formatters.
Note there's one case in which this code doesn't automatically do the right thing, and that's when an error response is invoked directly using error!. In that case you'll have to supply the correct body and headers as part of the call itself. I'll leave extracting the relevant parts of the above code into reusable methods as an exercise for the reader.

Ruby on Rails - OAuth 2 multipart Post (Uploading to Facebook or Soundcloud)

I am working on a Rails App that Uses OmniAuth to gather Oauth/OAuth2 credentials for my users and then posts out to those services on their behalf.
Creating simple posts to update status feeds work great.. Now I am to the point of needing to upload files. Facebook says "To publish a photo, issue a POST request with the photo file attachment as multipart/form-data." http://developers.facebook.com/docs/reference/api/photo/
So that is what I am trying to do:
I have implemented the module here: Ruby: How to post a file via HTTP as multipart/form-data? to get the headers and data...
if appearance.post.post_attachment_content_type.to_s.include?('image')
fbpost = "https://graph.facebook.com/me/photos"
data, headers = Multipart::Post.prepare_query("title" => appearance.post.post_attachment_file_name , "document" => File.read(appearance.post.post_attachment.path))
paramsarray = {:source=>data, :message=> appearance.post.content}
response = access_token.request(:post, fbpost, paramsarray, headers)
appearance.result = response
appearance.save
end
But I am getting an OAuth2::HTTPError - HTTP 400 error.
Any assistance would be incredible... As I see it, this information will also be needed for uploading files to SoundCloud.
Thanks,
Mark
Struggled with this myself. The oauth2 library is backed by Faraday for its HTTP interaction. With a little configuration, it supports file uploads out of the box. The first step is to add the appropriate Faraday middleware when building your connection. An example from my code:
OAuth2::Client.new client_id, secret, site: site do |stack|
stack.request :multipart
stack.request :url_encoded
stack.adapter Faraday.default_adapter
end
This adds the multipart encoding support to the Faraday connection. Next when making the request on your access token object you want to use a Faraday::UploadIO object. So:
upload = Faraday::UploadIO.new io, mime_type, filename
access_token.post('some/url', params: {url: 'params'}, body: {file: upload})
In the above code:
io - An IO object for the file you want to upload. Can be a File object or even a StringIO.
mime_type - The mime type of the file you are uploading. You can either try to detect this server-side or if a user uploaded the file to you, you should be able to extract the mime type from their request.
filename - What you are calling the file you are uploading. You can choose this yourself, or just use whatever name the user uploading the file gave it.
some/url - Replace this with the URL you want to post to
{url: 'params'} - Replace this with any URL params you want to provide
{file: upload} - Replace this with your multipart form data. Obviously one (or more) of the key/value pairs should have an instance of your file upload.
I'm successfully using this code to upload a photo to a Facebook page:
require 'net/http/post/multipart' # from the multipart-post gem
dir = Dir.pwd.concat("/public/system/posts/images")
fb_url = URI.parse("https://graph.facebook.com/#{@page_id}/photos")
img = File.open("myfile.jpg")
req = Net::HTTP::Post::Multipart.new(
"#{fb_url.path}?access_token=#{@token}",
"source" => UploadIO.new(img, "image/jpeg", img.path),
"message" => "some message"
)
n = Net::HTTP.new(fb_url.host, fb_url.port)
n.use_ssl = true
n.verify_mode = OpenSSL::SSL::VERIFY_NONE
n.start do |http|
@result = http.request(req)
end
