I am trying to download images from the web and upload them back to Cloudinary. The code I have works for some images, but not for others. I have isolated the problem down to this line (it requires open-uri):
image = open(params[:product_image][:main])
For this image, it works fine. image is
#<Tempfile:/var/folders/49/bmhbmmzj5fl31dm9j6m6gxr00000gn/T/open-uri20150526-7662-1b676ws>
and Cloudinary accepts this. However, when I try to pull this image, image becomes
#<StringIO:0x007fa0267c8f80 #base_uri=#<URI::HTTP:0x007fa0267c92c8 URL:http://www.spiresources.net/WebImages/480/swatch/CELW.JPG>,
#meta={"date"=>"Tue, 26 May 2015 22:17:47 GMT", "server"=>"Apache/2.2.22 (Ubuntu)",
"last-modified"=>"Mon, 29 Jun 2009 00:00:00 GMT", "etag"=>"\"44700f-c35-46d715f090000\"",
"accept-ranges"=>"bytes", "content-length"=>"3125", "content-type"=>"image/jpeg"}, #metas={"date"=>["Tue, 26 May 2015 22:17:47 GMT"], "server"=>["Apache/2.2.22 (Ubuntu)"],
"last-modified"=>["Mon, 29 Jun 2009 00:00:00 GMT"], "etag"=>["\"44700f-c35-46d715f090000\""], "accept-ranges"=>["bytes"],
"content-length"=>["3125"], "content-type"=>["image/jpeg"]}, #status=["200", "OK"]>
which Cloudinary rejects, raising an error of "No conversion of StringIO to string". Why does open-uri return different kinds of objects for what seem like similar images? How can I make open-uri always return a Tempfile, or at least turn my StringIO into a Tempfile?
You can simply give the URL to the Cloudinary upload method. Then Cloudinary will fetch the remote resource directly.
Related
I have two valid URLs to two images.
When I run open() on the first URL, it returns an object of type Tempfile (which is what the fog gem expects to upload the image to AWS).
When I run open() on the second URL, it returns an object of type StringIO (which causes the fog gem to crash and burn).
Why is open() not returning a Tempfile for the second URL?
Further, can open() be forced to always return Tempfile?
From my Rails Console:
2.2.1 :011 > url1
=> "https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xpf1/v/t1.0-1/c0.0.448.448/10298878_10103685138839040_6456490261359194847_n.jpg?oh=e2951e1a1b0a04fc2b9c0a0b0b191ebc&oe=56195EE3&__gda__=1443959086_417127efe9c89652ec44058c360ee6de"
2.2.1 :012 > url2
=> "https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/v/t1.0-1/c0.17.200.200/1920047_10153890268465074_1858953512_n.jpg?oh=5f4cdf53d3e59b8ce4702618b3ac6ce3&oe=5610ADC5&__gda__=1444367255_396d6fdc0bdc158e4c2e3127e86878f9"
2.2.1 :013 > t1 = open(url1)
=> #<Tempfile:/var/folders/58/lpjz5b0n3yj44vn9bmbrv5180000gn/T/open-uri20150720-24696-1y0kvtd>
2.2.1 :014 > t2 = open(url2)
=> #<StringIO:0x007fba9c20ae78 #base_uri=#<URI::HTTPS https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/v/t1.0-1/c0.17.200.200/1920047_10153890268465074_1858953512_n.jpg?oh=5f4cdf53d3e59b8ce4702618b3ac6ce3&oe=5610ADC5&__gda__=1444367255_396d6fdc0bdc158e4c2e3127e86878f9>, #meta={"last-modified"=>"Tue, 25 Feb 2014 19:47:06 GMT", "content-type"=>"image/jpeg", "timing-allow-origin"=>"*", "access-control-allow-origin"=>"*", "content-length"=>"7564", "cache-control"=>"no-transform, max-age=1209600", "expires"=>"Mon, 03 Aug 2015 22:01:40 GMT", "date"=>"Mon, 20 Jul 2015 22:01:40 GMT", "connection"=>"keep-alive"}, #metas={"last-modified"=>["Tue, 25 Feb 2014 19:47:06 GMT"], "content-type"=>["image/jpeg"], "timing-allow-origin"=>["*"], "access-control-allow-origin"=>["*"], "content-length"=>["7564"], "cache-control"=>["no-transform, max-age=1209600"], "expires"=>["Mon, 03 Aug 2015 22:01:40 GMT"], "date"=>["Mon, 20 Jul 2015 22:01:40 GMT"], "connection"=>["keep-alive"]}, #status=["200", "OK"]>
This is how I'm using fog:
tempfile = open(params["avatar"])
user.avatar.store!(tempfile)
I assume you are using Ruby's built-in open-uri library, which allows you to download URLs using open().
In this case Ruby is only obligated to return an IO object; there is no guarantee that it will be a file. My guess is that Ruby decides based on memory consumption: if the download is large, it spills it to a file to save memory; otherwise it keeps it in memory as a StringIO.
As a workaround, you could write a method that writes the stream to a tempfile if it is not already downloaded to a file:
require "open-uri"
require "tempfile"

def download_to_file(uri)
  stream = open(uri, "rb")
  return stream if stream.respond_to?(:path) # Already file-backed

  # Small responses come back as StringIO; copy them into a Tempfile.
  Tempfile.new("download").tap do |file|
    file.binmode
    IO.copy_stream(stream, file)
    stream.close
    file.rewind
  end
end
If you're looking for a full-featured gem that does something similar, take a look at "down": https://github.com/janko-m/down
The open-uri library uses a 10 KB size limit to choose between StringIO and Tempfile.
My suggestion is to change the constant OpenURI::Buffer::StringMax, which open-uri uses to set that default.
In an initializer you could do this:
require 'open-uri'
OpenURI::Buffer.send :remove_const, 'StringMax' if OpenURI::Buffer.const_defined?('StringMax')
OpenURI::Buffer.const_set 'StringMax', 0
This doesn't answer my question, but it provides a working alternative using the httparty gem:
require "httparty"
File.open("file.jpg", "wb") do |tempfile|
  tempfile.write HTTParty.get(params["avatar"]).parsed_response
  user.avatar.store!(tempfile)
end
I'm trying to use the iCalendar gem to import some iCal files on a Rails 4 site.
Sometimes the file is of type 'text/calendar;charset=utf-8' and sometimes it's 'text/calendar; charset=UTF-8;'.
I am retrieving it like this:
uri = URI.parse(url)
calendar = Net::HTTP.get_response(uri)
new_calendar = Icalendar.parse(calendar.body)
When it's text/calendar;charset=utf-8 it works fine, but when it's text/calendar; charset=UTF-8 I get raw UTF-8 byte sequences in the string:
SUMMARY:Tech Job Fair – City(ST) – Jul 1, 2015
ends up being
["Tech Job Fair \xE2\x80\x93 City(ST) \xE2\x80\x93 Jul 1", " 2015"]
which is then saved to the database, and that is undesirable.
Is the charset/content-type revealing the problem here, or could the file actually just be encoded wrong at the source?
How do I change my retrieval code to strip those byte sequences out, or to tell Ruby it's a UTF-8 string so they don't show up in the first place?
Update: it looks like some are text/calendar;charset=utf-8, some are text/calendar;charset=UTF-8, and some are text/calendar; charset=UTF-8. Note that the last one has a space between the two segments. Could this be causing an issue?
Update2: Opening up my three example iCal files in Notepad++ shows them encoded as "UTF-8 without BOM" in the menu.
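For what it's worth, this is only a hedged guess, not a confirmed diagnosis: the \xE2\x80\x93 sequences are the UTF-8 bytes of an en dash, and Net::HTTP typically tags response bodies as ASCII-8BIT (binary) regardless of the charset header. If the bytes really are valid UTF-8, retagging the string before parsing may be all that's needed:

```ruby
# The \xE2\x80\x93 sequences are the UTF-8 bytes of an en dash.
# Net::HTTP hands back response bodies tagged as ASCII-8BIT (binary),
# so if the bytes are genuinely UTF-8 they only need to be retagged:
body = "SUMMARY:Tech Job Fair \xE2\x80\x93 City(ST) \xE2\x80\x93 Jul 1, 2015"
body = body.dup.force_encoding(Encoding::ASCII_8BIT) # simulate Net::HTTP's tagging

utf8 = body.force_encoding(Encoding::UTF_8)
utf8.valid_encoding? # => true; the dashes now display as proper characters
```

force_encoding only relabels the bytes; if valid_encoding? came back false, the source really would be mis-encoded and a transcode (String#encode) or upstream fix would be needed instead.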
I have a date (Which is actually parsed from a PDF) and it could be any of the following format:
MM/DD/YYYY
MM/DD/YY
M/D/YY
October 15, 2007
Oct 15, 2007
Is there any gem or function available in Rails or Ruby to parse my date, or do I need to parse it using a regex?
BTW, I'm using Ruby on Rails 3.2.
You can try Date.parse(date_string).
You might also use Date.strptime if you need a specific format:
> Date.strptime("10/15/2013", "%m/%d/%Y")
=> Tue, 15 Oct 2013
For a general solution:
format_str = "%m/%d/" + (date_str =~ /\d{4}/ ? "%Y" : "%y")
date = Date.parse(date_str) rescue Date.strptime(date_str, format_str)
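One caveat with the fallback above: for ambiguous slash dates, Date.parse guesses day-first, so "10/12/2013" would parse as 10 December rather than October 12. A hedged sketch that picks the strptime format by shape first (parse_pdf_date is a made-up helper name) avoids that:

```ruby
require "date"

# Choose a strptime format by the string's shape, so slash dates are
# always read as month/day; fall back to Date.parse for named months.
def parse_pdf_date(str)
  case str
  when %r{\A\d{1,2}/\d{1,2}/\d{4}\z} then Date.strptime(str, "%m/%d/%Y")
  when %r{\A\d{1,2}/\d{1,2}/\d{2}\z} then Date.strptime(str, "%m/%d/%y")
  else Date.parse(str) # handles "October 15, 2007" and "Oct 15, 2007"
  end
end

parse_pdf_date("10/15/2007")       # => 2007-10-15
parse_pdf_date("October 15, 2007") # => 2007-10-15
```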
I find the Chronic gem very easy to use for time parsing, and it should work for you; I tried the examples you gave.
https://github.com/mojombo/chronic
I have a Rails app (3.2.11, Ruby 1.9.3) and I'm trying to read the feed at http://www2c.cdc.gov/podcasts/createrss.asp?t=r&c=66 using feedzirra. Looking at the source XML of the feed, this is what entries looks like:
<item>
<title>In the News - Novel (New) Coronavirus in the Arabian Peninsula and United Kingdom</title>
<description>Novel (New) Coronavirus in the Arabian Peninsula and United Kingdom</description>
<link>http://wwwnc.cdc.gov/travel/notices/in-the-news/coronavirus-arabian-peninsula-uk.htm</link>
<guid isPermaLink="true">http://wwwnc.cdc.gov/travel/notices/in-the-news/coronavirus-arabian-peninsula-uk.htm</guid>
<pubDate>Thu, 07 Mar 2013 05:00:00 EST</pubDate>
</item>
<item>
<title>Outbreaks - Dengue in Madeira, Portugal</title>
<description>Dengue in Madeira, Portugal</description>
<link>http://wwwnc.cdc.gov/travel/notices/outbreak-notice/dengue-madeira-portugal.htm</link>
<guid isPermaLink="true">http://wwwnc.cdc.gov/travel/notices/outbreak-notice/dengue-madeira-portugal.htm</guid>
<pubDate>Wed, 20 Feb 2013 05:00:00 EST</pubDate>
</item>
As you can see, this feed doesn't seem to expose the entry contents, just a link to the underlying article. My question is this: can I use feedzirra to access the content of the original article? If not, any recommendations on good tools out there? wget? mechanize? httparty? Thanks!
Well, I don't know if it's possible with feedzirra, but from what I see in the XML, all you can get is the title and a few snippets such as the description and publication date. I can, however, recommend a tool for this: check out FeedsAPI. It has a nice, simple RSS feeds API and can do what you are trying to achieve. I hope this helps.
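As a rough stdlib-only alternative (this is not feedzirra's API; article_links and fetch_article are hypothetical helper names), you could pull each item's <link> out of the raw feed and then fetch the articles yourself:

```ruby
require "rexml/document"
require "net/http"
require "uri"

# Collect each item's <link> from the raw RSS, since this feed
# only carries titles and descriptions, not full article bodies.
def article_links(rss_xml)
  REXML::Document.new(rss_xml).get_elements("//item/link").map(&:text)
end

# Fetch one linked article's HTML (network call, shown for shape only).
def fetch_article(url)
  Net::HTTP.get(URI(url))
end
```

From there you would still need something like Nokogiri or mechanize to extract the article text out of the fetched HTML.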
I want to send some binary data via GET using the Indy components.
So, I have a URL like www.awebsite.com/index.php?data=xxx where xxx is the binary data encoded using the ParamsEncode function. After encoding, the binary data is converted to something like bB7%18%11z\ so my URL is something like:
www.awebsite.com/bB7%18%11z\
I have seen that if my URL contains the backslash char (see the last char in the URL), it is replaced with a slash char (/) in TIdURI.NormalizePath, so my binary data gets corrupted. What am I doing wrong?
Backslashes aren't allowed in URLs, and to avoid confusion between Windows and *nix systems, all backslashes are replaced by slashes to attempt to keep things working. See section 5 (HTTP, httpurl) of RFC 1738: http://www.faqs.org/rfcs/rfc1738.html
You could try replacing the backslashes with %5C yourself.
That said, you should either try MIME-encoding the data or get the hang of POST requests.
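The question's code is Delphi/Indy, but the escaping rule itself is language-neutral; as a quick illustration in Ruby, standard percent-encoding turns the backslash into %5C along with every other unsafe byte, so it survives URL normalization:

```ruby
require "cgi"

# Percent-encode arbitrary binary data for use in a query string;
# control bytes and the backslash are all escaped, so nothing is
# left for path normalization to rewrite.
data = "bB7\x18\x11z\\"
encoded = CGI.escape(data)
# => "bB7%18%11z%5C"
```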
You're using an old version of Indy. Backslashes are included in the UnsafeChars list that Indy uses now. Remy changed it in July 2010 with revision 4272 in the Tiburon branch:
r4272 | Indy-RemyLebeau | 2010-07-07 03:12:23 -0500 (Wed, 07 Jul 2010) | 1 line
Internal logic changes for TIdURI, and moved some sharable logic into IdGlobalProtocols.pas for later use in TIdHTTP.
It was merged into the trunk with the rest of Indy 10.5.7 with revision 4394, in September 2010.