How can I parse subpages of another website in Rails? - ruby-on-rails

I'm building a third-party web application for Last.fm and I'm having trouble getting information about a given artist from their API.
I have a method that parses data about some #{artist} from JSON:
artists_helper.rb
require 'net/http'
require 'json'
module ArtistsHelper
def about(artist)
artist = "?"
url = "http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=#{artist}&api_key=f5cb791cfb2ade77749afcc97b5590c8&format=json"
uri = URI(url)
response = Net::HTTP.get(uri)
JSON.parse(response)
end
end
If I change '?' to an artist name in that method, I can successfully parse that artist's info from the JSON response. But when I go to a page such as http://localhost:3000/artists/Wild+Nothing, I need the method 'about(artist)' to receive the value 'Wild+Nothing' and parse the data for Wild Nothing from Last.fm's JSON response.
How can I tell the method 'about' that whatever comes after http://localhost:3000/artists/ is the required value?

In the routes, define a GET route with a dynamic segment:
get 'artists/:name', to: 'artists#about'
In the artists controller, add an about action:
def about
artist = params[:name]
url = "http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=#{artist}&api_key=f5cb791cfb2ade77749afcc97b5590c8&format=json"
uri = URI(url)
response = Net::HTTP.get(uri)
response = JSON.parse(response)
render json: response
end
and the JSON is rendered as the response.
If you need the param in the helper, just pass params[:name] to the helper as an argument.
about(params[:name]) # wherever you call this method in the controller or view
def about(artist)
url = "http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=#{artist}&api_key=f5cb791cfb2ade77749afcc97b5590c8&format=json"
uri = URI(url)
response = Net::HTTP.get(uri)
JSON.parse(response)
end
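Interpolating the artist name straight into the URL breaks as soon as a name contains a character such as & or a space. A minimal sketch of building the same request URI with encoded query values, assuming a hypothetical helper name artist_info_uri and a placeholder API key:

```ruby
require 'uri'

# Hypothetical helper: builds the artist.getinfo URI with every query value
# URL-encoded. The api_key argument is a placeholder, not a real key.
def artist_info_uri(artist, api_key)
  uri = URI('http://ws.audioscrobbler.com/2.0/')
  uri.query = URI.encode_www_form(
    method: 'artist.getinfo',
    artist: artist,
    api_key: api_key,
    format: 'json'
  )
  uri
end

puts artist_info_uri('Wild Nothing', 'YOUR_API_KEY')
```

The returned URI can be passed to Net::HTTP.get exactly as in the methods above.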

Related

Posting to other website's form and getting response with Rails

I am trying to send some params to this website (http://www.degraeve.com/translator.php) and get the response to my rails application. I want to select 'binary' from the radio buttons whose name is 'd' and put just 'a' on the text field whose name is 'w' to be translated.
I am using this action on my controller:
class RoomsController < ApplicationController
require "uri"
require "net/http"
require 'json'
def test
uri = URI.parse("http://www.degraeve.com/translator.php")
header = {'Content-Type': 'text/json'}
params = { d: 'binary', w: 'a' }
# Create the HTTP objects
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Post.new(uri.request_uri, header)
request.body = params.to_json
# Send the request
response = http.request(request)
render json: response.body
end
end
Is there something wrong? It just renders the body of http://www.degraeve.com/translator.php before submitting the form, but I would like to get the body after it has been submitted.
When you look at what happens after you press the "Translate!" button, you may notice that no form is submitted via POST. Instead, a GET request is sent and an HTML file is returned. You can see this for yourself in your browser's network inspector.
Consequently, you can send a simple GET request with a prepared URL, like this (note the d and w query parameters):
uri = URI.parse("http://www.degraeve.com/cgi-bin/babel.cgi?d=binary&url=http%3A%2F%2Fwww.multivax.com%2Flast_question.html&w=a")
response = Net::HTTP.get(uri) # returns the body as a String; get_print only prints it and returns nil
and then parse the response accordingly.

How to remove special characters from params hash?

I have one application with the following code:
quantity = 3
unit_types = ['MarineTrac','MotoTrac','MarineTrac']
airtime_plan = 'Monthly Airtime Plan'
url = "http://localhost:3000/home/create_units_from_paypal?quantity=#{quantity}&unit_types=#{unit_types}&airtime_plan=#{airtime_plan}"
begin
resp = Net::HTTP.get(URI.parse(URI.encode(url.strip)))
resp = JSON.parse(resp)
puts "resp is: #{resp}"
true
rescue => error
puts "Error: #{error}"
return nil
end
It sends data to my other application via the URL params query string. This is what the controller method of that other application looks like:
def create_units_from_paypal
quantity = params[:quantity]
unit_types = params[:unit_types]
airtime_plan = params[:airtime_plan]
quantity.times do |index|
Unit.create! unit_type_id: UnitType.find_by_name(unit_types[index]),
airtime_plan_id: AirtimePlan.find_by_name(airtime_plan),
activation_state: ACTIVATION_STATES[:activated]
end
respond_to do |format|
format.json { render :json => {:status => "success"}}
end
end
I get this error:
<h1>
NoMethodError
in HomeController#create_units_from_paypal
</h1>
<pre>undefined method `times' for "3":String</pre>
<p><code>Rails.root: /Users/johnmerlino/Documents/github/my_app</code></p>
I tried using both raw and html_safe on the params[:quantity] and other params, but still I get the error. Note I had to use URI.encode(url) because URI.parse(url) returned bad uri probably because of the array of unit_types.
Change:
quantity.times do |index|
To:
quantity.to_i.times do |index|
You are having this problem because you are treating the params values as the types you originally sent, but they always arrive as strings. Converting back to the expected type solves your problem.
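To see why the value arrives as a string, here is a small sketch that decodes a query string with the standard library only. The query string is an illustrative stand-in for what the receiving application sees, and the use of Integer() instead of to_i is a design choice, not part of the original answer:

```ruby
require 'uri'

# Query-string values are always decoded as strings, whatever type the sender used.
params = URI.decode_www_form('quantity=3&airtime_plan=Monthly+Airtime+Plan').to_h
quantity = params['quantity']
puts quantity.class

# Integer() raises on junk such as "three"; to_i would silently return 0.
count = Integer(quantity, 10)
count.times { |index| puts "unit #{index}" }
```

Use to_i when a silent fallback to 0 is acceptable, and Integer() when bad input should fail loudly.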
However, you have some more fundamental problems. Firstly, you are trying to send an array by simply formatting it to a string. However, this is not the format that the receiving application expects to translate back to an array. Secondly, there is duplication in your request - you don't need to specify a quantity. The length of the array itself is the quantity. A better method would be to build your url like this:
url = 'http://localhost:3000/home/create_units_from_paypal?'
url << URI.escape("airtime_plan=#{airtime_plan}") << "&"
url << unit_types.map{|ut| URI.escape "unit_types[]=#{ut}" }.join('&')
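Note that URI.escape and URI.encode were removed in Ruby 3.0. A sketch of the same URL built with URI.encode_www_form instead, reusing the variable names from the question:

```ruby
require 'uri'

unit_types   = ['MarineTrac', 'MotoTrac', 'MarineTrac']
airtime_plan = 'Monthly Airtime Plan'

# An array value is expanded into one key=value pair per element, and the
# "unit_types[]" key is what Rails reads back as an array.
query = URI.encode_www_form('airtime_plan' => airtime_plan, 'unit_types[]' => unit_types)
url   = "http://localhost:3000/home/create_units_from_paypal?#{query}"
puts url
```

The brackets in the key are percent-encoded in the output; the receiving side decodes them before parsing the parameters.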
On the receiving side, you can do this:
def create_units_from_paypal
unit_types = params[:unit_types]
airtime_plan = params[:airtime_plan]
quantity = unit_types.try(:length) || 0
#...

How to get a full URL given a shortened one passed to Nokogiri?

I want to traverse some HTML documents with Nokogiri.
After getting the document object, I want the final URL that the document was actually fetched from to be part of my JSON response.
url = "http://ow.ly/hh8ri"
doc = Nokogiri::HTML(open(url))
...
Nokogiri internally redirects it to http://www.mp.rs.gov.br/imprensa/noticias/id30979.html, but I want to have access to it.
I want to know if the "doc" object has access to some URL as attribute or something.
Does someone know a workaround?
By the way, I want the full URL because I'm traversing the HTML to find <img> tags and some have relative ones like: "/media/image/image.png", and then I adjust some using:
URI.join(url, relative_link_url).to_s
The image URL should be:
http://www.mp.rs.gov.br/media/imprensa/2013/01/30979_260_260__trytr.jpg
Instead of:
http://ow.ly/hh8ri/media/imprensa/2013/01/30979_260_260__trytr.jpg
EDIT: IDEA
class Scraper < Nokogiri::HTML::Document
attr_accessor :url
class << self
def new(url)
html = open(url, ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE)
self.parse(html).tap do |d|
url = URI.parse(url)
response = Net::HTTP.new(url.host, url.port)
head = response.start do |r|
r.head url.path
end
d.url = head['location']
end
end
end
end
Use Mechanize. The URLs will always be converted to absolute:
require 'mechanize'
agent = Mechanize.new
page = agent.get 'http://ow.ly/hh8ri'
page.images.map{|i| i.url.to_s}
#=> ["http://www.mp.rs.gov.br/images/imprensa/barra_area.gif", "http://www.mp.rs.gov.br/media/imprensa/2013/01/30979_260_260__trytr.jpg"]
Because your example uses OpenURI to fetch the page, OpenURI is the code to ask, not Nokogiri. Nokogiri has no idea where the content came from.
OpenURI can tell you easily:
require 'open-uri'
starting_url = 'http://www.example.com'
final_uri = nil
puts "Starting URL: #{ starting_url }"
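# URI.open is the supported form on Ruby 3.0+, where Kernel#open no longer opens URLs.
# The block's return value (the page body) becomes the value of the URI.open call.
doc = URI.open(starting_url) do |io|
  final_uri = io.base_uri
  io.read
end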
puts "Final URL: #{ final_uri.to_s }"
Which outputs:
Starting URL: http://www.example.com
Final URL: http://www.iana.org/domains/example
base_uri is documented in the OpenURI::Meta module.
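Once the final URI is known, relative image links can be resolved against it rather than against the shortened URL. A minimal sketch using only the standard library; the helper name absolutize is hypothetical, and the base URI is the redirect target quoted in the question:

```ruby
require 'uri'

# Hypothetical helper: resolves each (possibly relative) link against the
# URI of the page that was actually served after redirects.
def absolutize(base_uri, links)
  links.map { |link| URI.join(base_uri.to_s, link).to_s }
end

final_uri = URI('http://www.mp.rs.gov.br/imprensa/noticias/id30979.html')
puts absolutize(final_uri, ['/media/imprensa/2013/01/30979_260_260__trytr.jpg'])
```

Links that are already absolute pass through unchanged, so the helper can be applied to every src value on the page.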
I had the exact same issue recently. What I did was create a class that inherits from Nokogiri::HTML::Document, override the new class method to parse the document, and then save the URL in an instance variable with an accessor:
require 'nokogiri'
require 'open-uri'
class Webpage < Nokogiri::HTML::Document
attr_accessor :url
class << self
def new(url)
html = open(url)
self.parse(html).tap do |d|
d.url = url
end
end
end
end
Then you can just create a new Webpage, and it will have access to all the normal methods you would have with a Nokogiri::HTML::Document:
w = Webpage.new("http://www.google.com")
w.url
#=> "http://www.google.com"
w.at_css('title')
#=> #<Nokogiri::XML::Element:0x4952f78 name="title" children=[#<Nokogiri::XML::Text:0x4952cb2 "Google">]>
If you have some relative url that you got from an image tag, you can then make it absolute by passing the return value of the url accessor to URI.join:
relative_link_url = "/media/image/image.png"
#=> "/media/image/image.png"
URI.join(w.url, relative_link_url).to_s
#=> "http://www.google.com/media/image/image.png"
Hope that helps.
p.s. the title of this question is quite misleading. Something more along the lines of "Accessing URL of Nokogiri HTML document" would be clearer.

Getting data out of a JSON Response in Rails 3

So I am trying to pull tweets off of Twitter and put them into a Rails app (note: because this is an assignment, I can't use the Twitter gem) and I am pretty confused.
I can get the tweets I need in the form of a JSON string but I'm not sure where to go from there. I know that the Twitter API call I'm making returns a JSON array with a bunch of Tweet objects but I don't know how to get at the tweet objects. I tried JSON.parse but was still unable to get the required data out (I'm not sure what that returns). Here's the code I have so far, I've made it pretty clear with comments/strings what I'm trying for. I'm super new to Rails, so this may be way off for what I'm trying to do.
def get_tweets
require 'net/http'
uri = URI("http://search.twitter.com/search.json?q=%23bieber&src=typd")
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Get.new(uri.request_uri)
response = http.request(request)
case response
when Net::HTTPSuccess then #to get: text -> "text", date: "created_at", tweeted by: "from_user", profile img url: "profile_img_url"
JSON.parse(response.body)
# Here I need to loop through the JSON array and make n tweet objects with the indicated fields
t = Tweet.new(:name => "JSON array item i with field from_user", :text => "JSON array item i with field text", :date => "as before")
t.save
when Net::HTTPRedirection then
location = response['location']
warn "redirected to #{location}"
fetch(location, limit - 1)
else
response.value
end
end
Thanks!
The JSON.parse method returns a Ruby hash or array representing the JSON document.
In your case, the JSON is parsed as a hash with a "results" key (which holds your tweets) and some metadata: "max_id", "since_id", "refresh_url", etc. Refer to the Twitter documentation for a description of the fields returned.
So again with your example it would be:
parsed_response = JSON.parse(response.body)
parsed_response["results"].each do |tweet|
t = Tweet.new(:name => tweet["from_user_name"], :text => tweet["text"], :date => tweet["created_at"])
t.save
end
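The same loop can be tried without a network call or a Tweet model. The sketch below uses a made-up sample body in the shape described above and collects plain hashes instead of saving records; the sample values are illustrative only:

```ruby
require 'json'

# Made-up sample in the shape described above: a hash with a "results" array.
body = <<~SAMPLE
  {"max_id": 2, "results": [
    {"from_user_name": "Alice", "text": "first tweet",  "created_at": "Mon, 01 Apr 2013 10:00:00 +0000"},
    {"from_user_name": "Bob",   "text": "second tweet", "created_at": "Mon, 01 Apr 2013 10:05:00 +0000"}
  ]}
SAMPLE

parsed = JSON.parse(body)

# fetch with a default avoids a NoMethodError when "results" is missing.
tweets = parsed.fetch('results', []).map do |tweet|
  { name: tweet['from_user_name'], text: tweet['text'], date: tweet['created_at'] }
end

tweets.each { |attrs| puts "#{attrs[:name]}: #{attrs[:text]}" }
```

Each hash has the same keys as the Tweet.new call above, so it could be passed to the model unchanged.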

Problems with MailChimp API in Ruby Error Code: -90

I am using the following code in my MailChimp controller to submit simple newsletter data. When the request is sent, I receive the error "Method is not exported by this server" with code -90. My controller code is below; I use it for a simple newsletter signup form (name, email).
class MailchimpController < ApplicationController
require "net/http"
require "uri"
def subscribe
if request.post?
mailchimp = {}
mailchimp['apikey'] = 'f72328d1de9cc76092casdfsd425e467b6641-us2'
mailchimp['id'] = '8037342dd1874'
mailchimp['email_address'] = "email#gmail.com"
mailchimp['merge_vars[FNAME]'] = "FirstName"
mailchimp['output'] = 'json'
uri = URI.parse("http://us2.api.mailchimp.com/1.3/?method=listSubscribe")
response = Net::HTTP.post_form(uri, mailchimp)
mailchimp = ActiveSupport::JSON.decode(response.body)
if mailchimp['error']
render :text => mailchimp['error'] + "code:" + mailchimp['code'].to_s
elsif mailchimp == 'true'
render :text => 'ok'
else
render :text => 'error'
end
end
end
end
I highly recommend the Hominid gem: https://github.com/tatemae-consultancy/hominid
The problem is that Net::HTTP.post_form is not passing the "method" GET parameter. Not being a big ruby user, I'm not certain what the actual proper way to do that with Net::HTTP is, but this works:
require "net/http"
data="apikey=blahblahblah"
response = nil
Net::HTTP.start('us2.api.mailchimp.com', 80) {|http|
response = http.post('/1.3/?method=lists', data)
}
p response.body
That's the lists() method (for simplicity), and you'd have to build up the full POST params yourself (and urlencode your values!) rather than simply providing the hash.
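One way to get the URL-encoded body without hand-building the string is Net::HTTP::Post#set_form_data. This is a sketch, not a verified MailChimp recipe: every value in the hash is a placeholder, and the request is only constructed here, not sent:

```ruby
require 'net/http'

# Placeholder values only; none of these are real credentials.
form = {
  'apikey'            => 'YOUR_API_KEY',
  'id'                => 'YOUR_LIST_ID',
  'email_address'     => 'user@example.com',
  'merge_vars[FNAME]' => 'FirstName',
  'output'            => 'json'
}

# The API method stays in the request path; the form fields go in the body.
request = Net::HTTP::Post.new('/1.3/?method=listSubscribe')
request.set_form_data(form) # URL-encodes each key and value

puts request.path
puts request.body

# Sending it would look like this (not run here):
# response = Net::HTTP.start('us2.api.mailchimp.com', 80) { |http| http.request(request) }
```

Because the path is given explicitly, the method query parameter is kept alongside the encoded form body.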
Did you take a look at the many gems already available for ruby?
http://apidocs.mailchimp.com/downloads/#ruby
The bigger problem, and main reason I'm replying to this, is that your API Key is not obfuscated nearly well enough. Granted I'm used to working with them, but I was able to guess it very quickly. I would suggest immediately going and disabling that key in your account and then editing the post to actually have completely bogus data rather than anything close to the correct key. The list id on the other hand, doesn't matter at all.
You'll be able to use your hash if you convert it to json before passing it to Net::HTTP. The combined code would look something like:
mailchimp = {}
mailchimp['apikey'] = 'APIKEYAPIKEYAPIKEYAPIKEY'
mailchimp['id'] = '8037342dd1874'
mailchimp['email_address'] = "email#gmail.com"
mailchimp['merge_vars[FNAME]'] = "FirstName"
mailchimp['output'] = 'json'
response = nil
Net::HTTP.start('us2.api.mailchimp.com', 80) {|http|
response = http.post('/1.3/?method=listSubscribe', mailchimp.to_json)
}
