Check remote file size through fetched link in Ruby - ruby-on-rails

I was wondering if there was a way to check on the size of files you have a link to?
I have extracted the path to an image (with mechanize) from a site and want to put a condition on it that turns true or false depending on the file size.
page = Mechanize.new.get('http://www.someurl.com/').parser
image = page.search('//img[@id="img1"]/@src').text
Now, what I want to do is check the file size of image.
For a local file I could do something like File.size to get its size in bytes. Is there any way to check the size of image?

I think the Mechanize#head method will work:
image_size = Mechanize.new.head( image_url )["content-length"].to_i
HTTP HEAD requests are a lesser-known cousin of HTTP GET: the server is expected to respond with the same headers it would send for the equivalent GET request, but without the body. It is often used in web caching.
More on HTTP HEAD
Example using eBay's Mobile Phones listing (requested by Arup Rakshit):
start_url = 'http://www.ebay.in/sch/Mobile-Phones-/15032/i.html'
crawler = Mechanize.new
page = crawler.get( start_url ).parser
image_url = page.search('//img/@src').first.text
image_size = crawler.head( image_url )["content-length"].to_i
=> 4244
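
If you prefer not to pull in Mechanize just for the size check, the same HEAD trick works with Ruby's standard library. Here is a minimal sketch; the image URL is hypothetical, and note that not every server reports a Content-Length for a HEAD request:

require 'net/http'
require 'uri'

uri = URI.parse('http://www.someurl.com/images/img1.png')  # hypothetical URL
response = Net::HTTP.start(uri.host, uri.port) do |http|
  http.head(uri.request_uri)                               # headers only, no body
end
image_size = response['content-length'].to_i               # size in bytes, 0 if the header is missing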

Related

Rails request.headers data not updating without refresh

When I hit the site for the first time, the result of request.headers["HTTP_ACCEPT"] is "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
but for requests made via internal links it shows "text/html, application/xhtml+xml"
until I hard refresh the page.
Is it due to turbolinks or any other issue?
It's a Turbolinks issue. By default, the Turbolinks XMLHttpRequest only sends "text/html, application/xhtml+xml" in the Accept header, but you can add "image/webp" by capturing the request-start event like this:
document.addEventListener("turbolinks:request-start", function(event) {
  var xhr = event.data.xhr
  xhr.setRequestHeader("Accept", "image/webp")
})
The question then is how to know from JavaScript whether the browser accepts WebP images, so that the "image/webp" value can be sent to the server.
You can use the Modernizr library, or a custom function that guesses whether WebP is available by trying to decode a small WebP image, as described here: https://developers.google.com/speed/webp/faq. However, these solutions add a certain overhead that eats into the benefit of the faster load time of WebP images.
Perhaps the most efficient solution is to set a boolean cookie on the server side that stores whether WebP images are accepted. The cookie is stored on the browser's first GET request to the server (a server-side sketch follows the snippet below). In subsequent Turbolinks calls you can check this cookie in your JavaScript code:
document.addEventListener("turbolinks:request-start", function(event) {
  var browserAcceptsWebp = document.cookie.includes('webp_available=true');
  if (browserAcceptsWebp) {
    var xhr = event.data.xhr;
    xhr.setRequestHeader("Accept", "image/webp");
  }
});
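
For completeness, here is a minimal Rails-side sketch of that cookie; the before_action and the webp_available cookie name are assumptions, not part of Turbolinks:

# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  before_action :remember_webp_support

  private

  # On the first full-page request the browser sends its real Accept header,
  # so record whether it advertised WebP support for later Turbolinks visits.
  def remember_webp_support
    return if cookies.key?(:webp_available)
    cookies[:webp_available] = request.headers['Accept'].to_s.include?('image/webp').to_s
  end
end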

How do you post video data from ios to Signed Url to Google Cloud Bucket?

I have a python method that successfully creates a GET Signed Url that will download the video that is in the Google Cloud Bucket.
def _MakeUrlForApp(self, verb, path, content_type='', content_md5=''):
    """Forms and returns the full signed URL to access GCS."""
    base_url = '%s%s' % (self.gcs_api_endpoint, path)
    signature_string = self._MakeSignatureString(verb, path, content_md5,
                                                 content_type)
    signature_signed = self._Base64Sign(signature_string)
    # Replace @ with %40, + with %2B, / with %2F, and = with %3D.
    signature_signed = signature_signed.replace("+", "%2B")
    signature_signed = signature_signed.replace("/", "%2F")
    signature_signed = signature_signed.replace("=", "%3D")
    self.client_id_email = self.client_id_email.replace("@", "%40")
    signedURL = base_url + "?Expires=" + str(self.expiration) + "&GoogleAccessId=" + self.client_id_email + "&Signature=" + signature_signed
    print 'this is the signed URL '
    print signedURL
    return signedURL
This is called from iOS (Swift) with an HTTP GET request. It returns the signed URL and downloads the video to the iOS app.
With the method below, if I specify the bucket name, the object name, text/plain as the content type, and a couple of words as the data, it creates and puts that file into the Google Cloud bucket for me:
def Put(self, path, content_type, data):
    """Performs a PUT request.

    Args:
      path: The relative API path to access, e.g. '/bucket/object'.
      content_type: The content type to assign to the upload.
      data: The file data to upload to the new file.

    Returns:
      An instance of requests.Response containing the HTTP response.
    """
    md5_digest = base64.b64encode(md5.new(data).digest())
    base_url, query_params = self._MakeUrl('PUT', path, content_type,
                                           md5_digest)
    headers = {}
    headers['Content-Type'] = content_type
    headers['Content-Length'] = str(len(data))
    headers['Content-MD5'] = md5_digest
    return self.session.put(base_url, params=query_params, headers=headers,
                            data=data)
What I want to know is one of these two things and nothing else: how do I upload video data to this data parameter in my Python webapp2.RequestHandler from iOS? Or how do I get the correct PUT signed URL to upload video data?
Please do not comment with anything that will not solve this specific question, and do not bash me for my methods. Please provide suggestions that you feel will specifically help me and nothing else.
There are a few ways to upload files to GCS, and each of them works with signed URLs. If the video files are small, your simplest option is to have the users perform a non-resumable upload, which uses the same URL signature except that the verb is PUT instead of GET. You'll also need to add the "Content-Type" header to the signature.
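As an illustration of the non-resumable case, here is a rough sketch in Ruby rather than Swift or Python (the signed URL and file name are placeholders): the client simply PUTs the raw bytes to the signed URL, sending the same Content-Type that was included in the signature.

require 'net/http'
require 'uri'

signed_url = URI.parse('https://storage.googleapis.com/my-bucket/video.mp4?Expires=...&GoogleAccessId=...&Signature=...')
request = Net::HTTP::Put.new(signed_url)
request['Content-Type'] = 'video/mp4'      # must match the content type that was signed
request.body = File.binread('video.mp4')   # the raw video bytes

response = Net::HTTP.start(signed_url.host, signed_url.port, use_ssl: true) do |http|
  http.request(request)
end
puts response.code                         # 200 on success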
Video files can be fairly large, though, so you may prefer to use resumable uploads. These are a bit more complicated but do work with signed URLs as well. You'll need to use the "x-goog-resumable: start" header (and include it in the signature) and set "Content-Length" to 0. You'll get back a response with a Location header containing a new URL. Your client will then use that URL to do the upload. Only the original URL needs to be signed. The client can use the followup URL directly.
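A rough Ruby sketch of that resumable flow (hypothetical URLs and file name; the initiation request here uses POST, with x-goog-resumable included in the signature as described above):

require 'net/http'
require 'uri'

# Step 1: initiate the resumable session against the signed URL.
start_url = URI.parse('https://storage.googleapis.com/my-bucket/video.mp4?Expires=...&GoogleAccessId=...&Signature=...')
start = Net::HTTP::Post.new(start_url)
start['x-goog-resumable'] = 'start'
start['Content-Length'] = '0'
start.body = ''
start_response = Net::HTTP.start(start_url.host, start_url.port, use_ssl: true) { |http| http.request(start) }

# Step 2: upload the bytes to the session URL returned in the Location header.
# This follow-up URL does not need to be signed.
session_url = URI.parse(start_response['Location'])
upload = Net::HTTP::Put.new(session_url)
upload.body = File.binread('video.mp4')
upload_response = Net::HTTP.start(session_url.host, session_url.port, use_ssl: true) { |http| http.request(upload) }
puts upload_response.code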

Browser caching of images served by Grails

I have a controller action which reads an image from the database and serves it to the client:
def profilePicture() {
    def profilePicture = ProfilePicture.get(1)
    response.setHeader("Content-disposition", "attachment; filename=1.png")
    response.contentType = "image/png"
    response.outputStream << profilePicture.profilePicture
    response.outputStream.flush()
}
Every time the client requests the image, the server serves the whole image with status 200. What can I do to instruct the client that this content can be cached?
I have already tried response.setHeader("Expires", "...") with a date in the future but this didn't help (I'm guessing this is only part of the story as the server is not returning 304).
Actually, there is a plugin for Grails called Caching Headers which handles all the needed caching and ETag generation, so browsers won't re-request the file while it is unmodified and within your configured caching period.
It's pretty simple to use if you read through the documentation and take your time to test your configuration.

Grails HTTP Proxy

I want to create a proxy controller in Grails: something that just takes whatever is passed in based on a URL mapping, records what was asked for, sends the request to another server, records the response, and sends the response back to the browser.
I'm having trouble when the request has an odd file extension (.gif) or no file extension (/xxx?sdcscd).
My url mapping is:
"/proxy/$target**"
and I've attempted (per an answer to another question):
def targetURL = params.target
if (!FilenameUtils.getExtension(targetURL) && request.format) {
targetURL += ".${response.format}"
}
but this usually appends .html and never .gif or ?csdcsd.
Not sure what to do; I might just write the thing in straight Java.
Actually, the real answer was sitting in the post you linked to previously all along, by Peter Ledbrook:
Disable file extension truncation by adding this line to grails-app/conf/Config.groovy:
grails.mime.file.extensions = false
This will disable the usage of file extensions for format, but will leave the file extension on params.target. You can completely ignore response.format!

Ruby-Rails serve ftp file direct to client

I am new to Ruby and to Rails, so excuse my question...
What I want to know is how to take a file from an FTP server with Ruby without saving the file to my Rails application's hard drive (streaming the file data directly to the client). I am working with Ruby's Net::FTP class.
With the retrbinary method from the Net::FTP class I have the following snippet:
ftp.retrbinary('RETR ' + filename, 4096) { |data|
buf << data
}
In my Rails view I can do something like this:
send_data( buf )
So how do I combine these two? I don't know how to instantiate a buffer object, fill it with the stream, and then serve it to the user. Does anybody have an idea how to do this?
Thank you very much for your support! Your post got me going. After some cups of coffee I found a working solution. Actually I am doing the following, which works for me:
def download_file
  filename = params[:file]
  raw = StringIO.new('')
  @ftp.retrbinary('RETR ' + filename, 4096) { |data|
    raw << data
  }
  @ftp.close
  raw.rewind
  send_data raw.read, :filename => filename
end
I will test this in production (a real-life situation). If this does not work well enough, I will have to use an NFS mount.
fin
Do you want the following?
1) Client (browser) sends a request to the Rails server
2) Server should respond with the contents of a file that is located on an ftp server.
Is that it?
If so, then simply redirect the browser to the FTP location, e.g.
# in controller
ftp_url = "ftp://someserver.com/dir_name/file_name.txt"
redirect_to ftp_url
The above works if the ftp file has anonymous get access.
If you really need to access the file from the server and stream it, try the following:
# in controller
render :text => proc { |response, output|
  ftp_session = Net::FTP.open(host, user, passwd, acct)
  ftp_session.gettextfile(remotefile) { |data| output.write(data) }
  ftp_session.close
}
You should check the headers in the response to see if they're what you want.
P.S. Setting up the FTP connection and streaming from a second server will probably be relatively slow. I'd use JS to show a busy graphic to the user.
I'd try alternatives to FTP. Can you set up an NFS connection or mount the remote disk? It would be much faster than FTP. Also investigate large TCP window sizes.
