Ruby/Rails - Bad URI - ruby-on-rails

Not sure why I'm getting the following error when the URI works just fine in the browser:
http://oracleofbacon.org/cgi-bin/xml?a=Kevin Bacon&b=Tom Cruise&u=1&p=google-apps
This is my code:
def kb(to)
  uri = "http://oracleofbacon.org/cgi-bin/xml?a=Kevin Bacon&b=#{to.strip}&u=1&p=google-apps"
  doc = Nokogiri::XML(open(uri)) # throws error on this line
  return parse(doc)
end
I get the following error:
in `split': bad URI(is not URI?): http://oracleofbacon.org/cgi-bin/xml?a=Kevin Bacon&b=Tom Cruise&u=1&p=google-apps (URI::InvalidURIError)
I execute the method in the following way:
kb("Tom Cruise")

It's because a browser is pathologically friendly, like a puppy, and will go to great lengths to render a page or resolve a URL. An application won't do that because you have to tell it how to be friendly.
Your URL is not valid because it has embedded spaces. Replace the spaces with %20:
irb -f
irb(main):001:0> require 'open-uri'
=> true
irb(main):002:0> open('http://oracleofbacon.org/cgi-bin/xml?a=Kevin%20Bacon&b=Tom%20Cruise&u=1&p=google-apps').read
=> "<?xml version=\"1.0\" standalone=\"no\"?>\n<link><actor>Tom Cruise</actor><movie>A Few Good Men (1992)</movie><actor>Kevin Bacon</actor></link>"
Escaping the characters needing to be escaped is easy:
irb -f
irb(main):001:0> require 'uri'
=> true
irb(main):002:0> URI.escape('http://oracleofbacon.org/cgi-bin/xml?a=Kevin Bacon&b=Tom Cruise&u=1&p=google-apps')
=> "http://oracleofbacon.org/cgi-bin/xml?a=Kevin%20Bacon&b=Tom%20Cruise&u=1&p=google-apps"
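Note that URI.escape was deprecated and later removed in Ruby 3.0, so the call above fails on current interpreters. Here is a minimal sketch of an alternative using the standard library's URI.encode_www_form. It assumes the server accepts + for spaces in the query string (the usual form encoding) instead of %20; the helper name bacon_url is hypothetical.

```ruby
require 'uri'

# Hypothetical helper: builds the Oracle of Bacon URL with an escaped query string.
def bacon_url(to)
  params = { 'a' => 'Kevin Bacon', 'b' => to.strip, 'u' => 1, 'p' => 'google-apps' }
  "http://oracleofbacon.org/cgi-bin/xml?#{URI.encode_www_form(params)}"
end

url = bacon_url('Tom Cruise')
URI.parse(url) # no URI::InvalidURIError, because the spaces are gone
puts url
```

Building the query from a hash also means each value is escaped on its own, so a name containing & or = cannot break the query apart.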

Related

Ruby hexdigest sha1 pack('H*') string encoding...

I meet an encoding problem... No errors in the console, but the output is not well encoded.
I must use Digest::SHA1.hexdigest on a string and then must pack the result.
The below example should outputs '{´p)ODýGΗ£Iô8ü:iÀ' but it outputs '{?p)OD?GΗ?I?8?:i?' in the console and '{�p)OD�G^BΗ�I�8^D�:i�' in the log file.
So, my variable called pack equals '{?p)OD?GΗ?I?8?:i?' and not '{´p)ODýGΗ£Iô8ü:iÀ'. That's a big problem... I'm doing it in a Rails task.
Any idea guys?
Thanks
# encoding: utf-8
require 'digest/sha1'
namespace :my_app do
namespace :check do
desc "Description"
task :weather => :environment do
hexdigest = Digest::SHA1.hexdigest('29d185d98c984a359e6e6f26a0474269partner=100043982026&code=34154&profile=large&filter=movie&striptags=synopsis%2Csynopsisshort&format=json&sed=20130527')
pack = [hexdigest].pack("H*")
puts pack # => {?p)OD?GΗ?I?8?:i?
puts '{´p)ODýGΗ£Iô8ü:iÀ' # => {´p)ODýGΗ£Iô8ü:iÀ
end
end
end
This is what I did (my conversion from PHP to Ruby)
# encoding: utf-8
require 'open-uri'
require 'base64'
require 'digest/sha1'
class Allocine
$_api_url = 'http://api.allocine.fr/rest/v3'
$_partner_key
$_secret_key
$_user_agent = 'Dalvik/1.6.0 (Linux; U; Android 4.2.2; Nexus 4 Build/JDQ39E)'
def initialize (partner_key, secret_key)
$_partner_key = partner_key
$_secret_key = secret_key
end
def get(id)
# build the params
params = { 'partner' => $_partner_key,
'code' => id,
'profile' => 'large',
'filter' => 'movie',
'striptags' => 'synopsis,synopsisshort',
'format' => 'json' }
# do the request
response = _do_request('movie', params)
return response
end
private
def _do_request(method, params)
# build the URL
query_url = $_api_url + '/' + method
# new algo to build the query
http_build_query = Rack::Utils.build_query(params)
sed = DateTime.now.strftime('%Y%m%d')
sig = URI::encode(Base64.encode64(Digest::SHA1.digest($_secret_key + http_build_query + '&sed=' + sed)))
return sig
end
end
Then call
allocine = Allocine.new(ALLOCINE_PARTNER_KEY, ALLOCINE_SECRET_KEY)
puts allocine.get('any ID')
The get method returns 'e7RwKU9E%2FUcCzpejSfQ4BPw6acA%3D' in PHP and 'cPf6I4ZP0qHQTSVgdKTbSspivzg=%0A' in Ruby...
thanks again
I think this "encoding" issue has turned up while debugging other parts of a conversion from PHP to Ruby. The target API that will consume a digest of params looks like it will accept a signature variable constructed in Ruby as follows (edit: well, this is a guess; there may also be relevant differences between Ruby and PHP in URI encoding and base64 defaults):
require 'digest/sha1'
require 'base64'
require 'uri'
sig_data = 'edhefhekjfhejk8edfefefefwjw69partne...'
sig = URI.encode( Base64.encode64( Digest::SHA1.digest( sig_data ) ) )
=> "+ZabHg22Wyf7keVGNWTc4sK1ez4=%0A"
The exact construction of sig_data from the parameters that are being signed is also important. It is generated by the PHP function http_build_query, and I do not know what order or escaping that will apply to the input params. If your Ruby version gets them in a different order, or escapes them differently from PHP, the signature will be wrong (edit: actually it is possible that we are looking for a signature on the exact query string sent to the API; I don't know). Is it possibly an issue of that sort that has led you down the rabbit hole of how the signature is constructed?
Thank you guys for your help.
Problem is solved. With the following code I obtain exactly the same string as with PHP:
http_build_query = Rack::Utils.build_query(params)
sed = DateTime.now.strftime('%Y%m%d')
sig = CGI::escape(Base64.strict_encode64(Digest::SHA1.digest($_secret_key + http_build_query + '&sed=' + sed)))
Now I've another problem for which I opened a new question here.
Thank you very much.
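For anyone comparing the two versions, the outputs quoted above suggest two differences: Base64.encode64 appends a line feed (which became %0A), and URI.encode left = and / unescaped where PHP percent-encoded them. A small sketch of both effects follows; the secret and query values are placeholders, not real credentials.

```ruby
require 'digest/sha1'
require 'base64'
require 'cgi'

# Placeholder input, not real API credentials or parameters.
digest = Digest::SHA1.digest('secret' + 'partner=123&code=456' + '&sed=20130527')

# The digest is raw binary, not UTF-8 text, which is likely why
# printing it directly shows "?" characters in a console.
puts digest.encoding # binary (ASCII-8BIT)

loose  = Base64.encode64(digest)        # ends with a "\n"
strict = Base64.strict_encode64(digest) # no line feeds at all

puts loose.inspect
puts strict.inspect

# CGI.escape percent-encodes "/", "+" and "=" as well.
sig = CGI.escape(strict)
puts sig
```

The raw digest never needs to be printable; only the Base64 form has to match what PHP produces.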

Rails Facebook avatar to data-uri

I'm trying to pull a facebook avatar via auth. Here's what i'm doing:
def image_uri
require 'net/http'
image = URI.parse(params[:image]) # https://graph.facebook.com/565515262/picture
fetch = Net::HTTP.get_response(image)
based = 'data:image/jpg;base64,' << Base64.encode64(fetch)
render :text => based
end
I'm getting the following error (new error — edited):
Connection reset by peer
I've tried googling about, I can't seem to get a solution, any ideas?
I'm basically looking for the exact functioning of PHP's file_get_contents()
Try escaping the URI before parsing:
URI.parse URI.escape(params[:image])
Make sure that params[:image] does contain the uri you want to parse... I would instead pass the userid and interpolate it into the uri.
URI.parse URI.escape("https://graph.facebook.com/#{params[:image]}/picture")
Does it throw the same error when you use a static string "https://graph.facebook.com/565515262/picture"
What does it say when you do
render :text => params[:image]
If both of the above don't answer your question then please try specifying the use of HTTPS-
uri = URI('https://secure.example.com/some_path?query=string')
Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
  request = Net::HTTP::Get.new uri.request_uri
  response = http.request request # Net::HTTPResponse object
end
Presuming you are on ruby < 1.9.3, you will also have to
require 'net/https'
If you are on ruby 1.9.3 you don't have to do anything.
Edit
If you are on the latest version, you can simply do:
open(params[:image]) # http://graph.facebook.com/#{@user.facebook_id}/picture
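Separately from the connection error, note that Base64.encode64 expects a string, so it needs the response body and not the Net::HTTPResponse object itself. It also inserts line feeds into longer output, which a data URI should not contain. A minimal sketch of the encoding step; the helper name data_uri and the image/jpeg default are assumptions.

```ruby
require 'base64'

# Hypothetical helper: wraps raw bytes in a data URI.
# strict_encode64 is used because it emits no line feeds.
def data_uri(bytes, mime_type = 'image/jpeg')
  "data:#{mime_type};base64,#{Base64.strict_encode64(bytes)}"
end

# In the controller this would be called with the fetched body, e.g.
#   data_uri(Net::HTTP.get_response(uri).body)
puts data_uri('hi', 'text/plain') # => data:text/plain;base64,aGk=
```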

Nokogiri parsing for metawords

I know this question has been asked earlier but I am not able to get the parsed result. I am trying to parse metawords using nokogiri, can any one point out my mistake?
keyword = []
meta_data = doc.xpath('//meta[@name="Keywords"]/@content') #parsing for keywords
meta_data.each do |meta|
keyword << meta.value
end
key_str=keyword.join(",")
I tried running this in irb as well but keyword returns a nil.
This is how I used it in irb
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::HTML("www.google.com")
I have already tried alternatives from other Stack Overflow posts, like
"Nokogiri html parsing question", but to no avail; they still return nil. I guess I am doing something wrong somewhere.
www.google.com does not have any meta keywords in the source. View Source on the page to see for yourself. So even if everything else went perfectly, you'd still get no results there.
The result of doc = Nokogiri::HTML("www.google.com") is
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>www.google.com</p></body></html>
If you want to fetch the contents of a URL, you want to use something like:
require 'open-uri'
doc = Nokogiri::HTML( open('http://www.google.com' ) )
If you get a valid HTML page, and use the proper casing on keywords to match the source, it works fine. Here's an example from my IRB session, fetching a page from one of the apps on my site that happens to use name="keywords" instead of name="Keywords":
irb(main):001:0> require 'open-uri'
#=> true
irb(main):002:0> require 'nokogiri'
#=> true
irb(main):003:0> url = "http://pentagonalrobin.phrogz.net/choose"
#=> "http://pentagonalrobin.phrogz.net/choose"
irb(main):004:0> doc = Nokogiri::HTML( open(url) ); nil # don't show doc here
#=> nil
irb(main):005:0> doc.xpath('//meta[@name="keywords"]/@content').map(&:value)
#=> ["team schedule free round-robin league"]

HTTP post request via Ruby

I'm very new to ruby and trying some basic stuff. When I send HTTP request to the server using:
curl -v -H "Content-Type: application/json" -X GET -d "{"myrequest":"myTest","reqid":"44","data":{"name":"test"}}" localhost:8099
My server sees JSON data as "{myrequest:myTest,reqid:44,data:{name:test}}"
But when I send the request using the following ruby code:
require 'net/http'
@host = 'localhost'
@port = '8099'
@path = "/posts"
@body = ActiveSupport::JSON.encode({
  :bbrequest => "BBTest",
  :reqid => "44",
  :data => {
    :name => "test"
  }
})
request = Net::HTTP::Post.new(@path, initheader = {'Content-Type' =>'application/json'})
request.body = @body
response = Net::HTTP.new(@host, @port).start {|http| http.request(request) }
puts "Response #{response.code} #{response.message}: #{response.body}"
It sees it as "{\"bbrequest\":\"BBTest\",\"reqid\":\"44\",\"data\":{\"name\":\" test\"}}" and server is unable to parse it. Perhaps there are some extra options I need to set to send request from Ruby to exclude those extra characters?
Can you please help. Thanks in advance.
What you are doing on the shell produces invalid JSON. Your server should not accept it.
$ echo "{"myrequest":"myTest","reqid":"44","data":{"name":"test"}}"
{myrequest:myTest,reqid:44,data:{name:test}}
This is JSON with unquoted keys and values, and it will NEVER parse. Check it at http://jsonlint.com/
If your server accepts this "sort of kind of JSON" but does not accept the second one in your example, your server is broken.
My server sees JSON data as "{myrequest:myTest,reqid:44,data:{name:test}}"
Your server sees a string. When you try to parse it as JSON, it will produce an error or garbage.
It sees it as "{\"bbrequest\":\"BBTest\",\"reqid\":\"44\",\"data\":{\"name\":\" test\"}}"
No, this is how it's printed via Ruby's Object#inspect. You are printing the return value of inspect somewhere and then trying to judge whether it's valid JSON. It is not, because the string you've pasted is in the form meant for the interactive Ruby console (irb) or a Ruby script, and it contains escape sequences. To see your raw JSON string, just print the string instead of inspecting it.
I think your server is either broken or not finished yet, your curl example is broken and your ruby script is correct and will work once the server is fixed (or finished). Simply because
irb(main):002:0> JSON.parse("{\"bbrequest\":\"BBTest\",\"reqid\":\"44\",\"data\":{\"name\":\" test\"}}")
# => {"bbrequest"=>"BBTest", "reqid"=>"44", "data"=>{"name"=>" test"}}
Your problem is something other than the existence of escape characters in the string. Those are not put in by the code you show, but by irb or .inspect. If you put a simple puts @body in your code (or in a Rails context, logger.debug @body), you'll see this. Here's an irb session showing the difference:
ruby-1.9.2-p180 :002 > require 'active_support'
=> true
ruby-1.9.2-p180 :003 > json = ActiveSupport::JSON.encode({
ruby-1.9.2-p180 :004 > :bbrequest => "BBTest",
ruby-1.9.2-p180 :005 > :reqid => "44",
ruby-1.9.2-p180 :006 > :data =>
ruby-1.9.2-p180 :007 > {
ruby-1.9.2-p180 :008 > :name => "test"
ruby-1.9.2-p180 :009?> }
ruby-1.9.2-p180 :010?> })
=> "{\"bbrequest\":\"BBTest\",\"reqid\":\"44\",\"data\":{\"name\":\"test\"}}"
ruby-1.9.2-p180 :013 > puts json
{"bbrequest":"BBTest","reqid":"44","data":{"name":"test"}}
=> nil
In any case, the best way to do json encoding in Rails is not to call ActiveSupport::JSON.encode directly, but rather override as_json in your model or use the serializable_hash feature. This will make your code cleaner as well. See the top answers to this stackoverflow question for details.
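The same inspect-versus-print distinction can be checked outside Rails with the standard json library. This is a sketch using the payload from the question; only the variable names are made up.

```ruby
require 'json'

payload = { 'bbrequest' => 'BBTest', 'reqid' => '44', 'data' => { 'name' => 'test' } }
body = JSON.generate(payload)

puts body         # the raw string, exactly what goes over the wire
puts body.inspect # Ruby literal form: \" escapes are added for display only

# The raw string contains no backslashes and parses back to the same data.
p JSON.parse(body) == payload
```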

invalid URI - How to prevent, URI::InvalidURIError errors?

I got the following back from delayed_job:
[Worker(XXXXXX pid:3720)] Class#XXXXXXX failed with URI::InvalidURIError: bad URI(is not URI?): https://s3.amazonaws.com/cline-local-dev/2/attachments/542/original/mac-os-x[1].jpeg?AWSAccessKeyId=xxxxxxxx&Expires=1295403309&Signature=xxxxxxx%3D - 3 failed attempts
The way this URI comes from in my app is.
In my user_mailer I do:
@comment.attachments.each do |a|
attachments[a.attachment_file_name] = open(a.authenticated_url()) {|f| f.read }
end
Then in my attachments model:
def authenticated_url(style = nil, expires_in = 90.minutes)
AWS::S3::S3Object.url_for(attachment.path(style || attachment.default_style), attachment.bucket_name, :expires_in => expires_in, :use_ssl => attachment.s3_protocol == 'https')
end
That being said, is there some type of URI.encode or parsing I can do to prevent a valid URI (I checked that the URL works in my browser) from erroring and killing delayed_job in Rails 3?
Thank you!
Ruby has (at least) two modules for dealing with URIs.
URI is part of the standard library.
Addressable::URI is a separate gem; it is more comprehensive and claims to conform to the spec.
Parse a URL with either one, modify any parameters using the gem's methods, then convert it using to_s before passing it on, and you should be good to go.
I tried ' open( URI.parse(URI.encode( a.authenticated_url() )) ' but that errored with OpenURI::HTTPError: 403 Forbidden
If you navigated to that page via a browser and it succeeded, then later failed going to it directly via code, it's likely there is a cookie or session state that is missing. You might need to use something like Mechanize, which will maintain that state while allowing you to navigate through a site.
EDIT:
require 'addressable/uri'
url = 'http://www.example.com'
uri = Addressable::URI.parse(url)
uri.query_values = {
:foo => :bar,
:q => '"one two"'
}
uri.to_s # => "http://www.example.com?foo=bar&q=%22one%20two%22"
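In the failing URL from the question, the square brackets in the file name mac-os-x[1].jpeg are characters that URI does not accept in a path. Below is a standard-library-only sketch that percent-encodes each path segment. It assumes the escaping can be applied before the URL is signed, since rewriting an already signed URL may invalidate it; that could be related to the 403 above, but this is a guess. The helper name escape_path is hypothetical.

```ruby
require 'erb'
require 'uri'

# Hypothetical helper: percent-encodes every segment of a path,
# leaving the "/" separators untouched.
def escape_path(path)
  path.split('/', -1).map { |segment| ERB::Util.url_encode(segment) }.join('/')
end

path = '/cline-local-dev/2/attachments/542/original/mac-os-x[1].jpeg'
uri = URI.parse("https://s3.amazonaws.com#{escape_path(path)}")
puts uri.path # => /cline-local-dev/2/attachments/542/original/mac-os-x%5B1%5D.jpeg
```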

Resources