Rails: Read the contents from another site - ruby-on-rails

In Rails, how can I make an http request to a page, like "http://google.com" and set the response to a variable?
Basically I'm trying to get the contents of a CSV file off of Amazon S3:
https://s3.amazonaws.com/datasets.graf.ly/24.csv
My Rails server needs to return that content as a response to an AJAX request.
Get S3 bucket
Access the file and read it
Render its contents (so the ajax request receives it)
A few questions have suggested screen scraping, but this sounds like overkill (and probably slow) for simply taking a response and pretty much just passing it along.

API
Firstly, you need to know how you're accessing the data.
The problems you've cited are only valid if you just access someone's site through HTTP (with something like cURL). As you instinctively know, this is highly inefficient and will likely get your IP blocked for continuous access.
A far better way to access data (from any reputable service) is to use their API. This is as true of S3 as of Twitter, Facebook, Dropbox, etc.:
AWS-SDK
# Gemfile
gem "aws-sdk-core", "~> 2.0.0.rc2"
# config/application.rb
Aws.config = {
  access_key_id: '...',
  secret_access_key: '...',
  region: 'us-west-2'
}
# config/initializers/s3.rb
S3 = Aws::S3.new
# or, equivalently: S3 = Aws.s3
Then you'll be able to use the API resources to help retrieve objects:
# controller
# yields once per response, even works with non-paged requests
S3.list_objects(bucket: 'aws-sdk').each do |resp|
  puts resp.contents.map(&:key)
end
CORS
If you were thinking of making an XHR (Ajax) request to the server directly from the browser, you need to ensure you have the correct CORS permissions to do so.
Considering you want to use S3, I would look at this documentation to ensure you set the permissions correctly. This does not apply to the API or to an HTTP request made from your own server (only to Ajax).

To do as you asked, combine:
the open-uri solution from "How make a HTTP request using Ruby on Rails?" (to read from https in the simplest way possible),
the set headers solution from "in rails, how to return records as a csv file", and
a jQuery library to decode the CSV, e.g. http://code.google.com/p/jquery-csv/
Alternatively, decode the CSV file in Rails and pass a JSON array of arrays back:
decode the CSV as suggested in "Rails upload CSV file with header"
return the decoded data with the appropriate type
Off the top of my head it should be something like:
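require 'open-uri'
def get_csv
  # to_i keeps arbitrary input out of the URL
  url = 'https://s3.amazonaws.com/datasets.graf.ly/%d.csv' % params[:id].to_i
  data = URI.open(url).read
  # pass the body along with a CSV content type
  render plain: data, content_type: 'text/csv'
end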
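For the alternative route mentioned above (decoding the CSV on the server and passing a JSON array of arrays back), a minimal sketch using only Ruby's standard csv and json libraries might look like this. The helper name csv_to_json_rows is made up for illustration, and it assumes the CSV text has already been read into a string:

```ruby
require "csv"
require "json"

# Hypothetical helper: turn raw CSV text into a JSON array of arrays.
# Every cell stays a string; no header handling or type conversion is done.
def csv_to_json_rows(csv_text)
  CSV.parse(csv_text).to_json
end
```

In the action this could then be returned with something like render json: csv_to_json_rows(data), so the Ajax caller does not need a CSV library at all.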

Related

Rails-API How to test Active Storage

I added Active Storage into my Rails API application and want to test it, but I don't know how to do it. I was trying send file with JSON data in Postman, but JSON data doesn't send correctly or I am doing something wrong. I did it like that:
[Screenshot of the Postman request]
Is there any option to send request with file and JSON data without creating any view?
As far as I know, a file upload can't be sent as a plain JSON REST request. You need to use enctype multipart/form-data, as with a regular POST made from a form.
In your screenshot you are already sending a form post, because you chose to send via form-data. This is why your JSON isn't properly parsed by Rails.
If you want to upload an image and post data in the same request, you will need to break your JSON attributes into form data fields like:
data[name]=asaa
data[last_name]=foo
...
Or, you can send a JSON string in your data field and parse it manually in the controller, like:
def upload
  file = params[:file]
  data = parsed_params_data!(params)
  # do your magic
end
def parsed_params_data!(params)
  # the data field arrives as a plain string, so it has to be parsed explicitly
  JSON.parse(params[:data])
end
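If the data field might be missing or malformed, the manual parsing step can be made a little more defensive. This is only a sketch: parse_data_field is a hypothetical name, and a plain Hash stands in for the Rails params object here.

```ruby
require "json"

# Hypothetical helper: parse the JSON string sent in the "data" form field.
# Raises ArgumentError with a readable message when the field is absent
# or does not contain valid JSON.
def parse_data_field(params)
  JSON.parse(params.fetch(:data))
rescue KeyError, JSON::ParserError => e
  raise ArgumentError, "invalid or missing data field: #{e.message}"
end
```

The controller can then rescue ArgumentError and answer with a 400-style response instead of letting the request fail with a server error.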

Verify Shopify webhook

I believe that to have a Shopify webhook integrate with a Rails app, the Rails app needs to disable the default verify_authenticity_token method, and implement its own authentication using the X_SHOPIFY_HMAC_SHA256 header. The Shopify docs say to just use request.body.read. So, I did that:
def create
  verify_webhook(request)
  # Send back a 200 OK response
  head :ok
end
def verify_webhook(request)
  header_hmac = request.headers["HTTP_X_SHOPIFY_HMAC_SHA256"]
  digest = OpenSSL::Digest.new("sha256")
  request.body.rewind
  calculated_hmac = Base64.encode64(OpenSSL::HMAC.digest(digest, SHARED_SECRET, request.body.read)).strip
  puts "header hmac: #{header_hmac}"
  puts "calculated hmac: #{calculated_hmac}"
  puts "Verified:#{ActiveSupport::SecurityUtils.secure_compare(calculated_hmac, header_hmac)}"
end
The Shopify webhook is directed to the correct URL and the route gives it to the controller method shown above. But when I send a test notification, the output is not right. The two HMACs are not equal, and so it is not verified. I am fairly sure that the problem is that Shopify is using the entire request as their seed for the authentication hash, not just the POST contents. So, I need the original, untouched HTTP request, unless I am mistaken.
This question seemed like the only promising thing on the Internet after at least an hour of searching. It was exactly what I was asking and it had an accepted answer with 30 upvotes. But his answer... is absurd. It spits out an unintelligible, garbled mess of all kinds of things. Am I missing something glaring?
Furthermore, this article seemed to suggest that what I am looking for is not possible. It seems that Rails is never given the unadulterated request, but it is split into disparate parts by Rack, before it ever gets to Rails. If so, I guess I could maybe attempt to reassemble it, but I would have to even get the order of the headers correct for a hash to work, so I can't imagine that would be possible.
I guess my main question is, am I totally screwed?
The problem was in my SHARED_SECRET. I assumed this was the API secret key, because a few days ago it was called the shared secret in the Shopify admin page. But now I see a tiny paragraph at the bottom of the notifications page that says,
All your webhooks will be signed with ---MY_REAL_SHARED_SECRET--- so
you can verify their integrity.
This is the secret I need to use to verify the webhooks. Why there are two of them, I have no idea.
Have you tried doing it in the order they show in their guides? They have a working sample for ruby.
def create
  request.body.rewind
  data = request.body.read
  header = request.headers["HTTP_X_SHOPIFY_HMAC_SHA256"]
  verified = verify_webhook(data, header)
  head :ok
end
They say in their guides:
Each Webhook request includes a X-Shopify-Hmac-SHA256 header which is
generated using the app's shared secret, along with the data sent in
the request.
The key words being "generated using the shared secret AND the DATA sent in the request", so all of this should be available on your end: both the data and the shared secret.
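Put together, the check only needs the raw request body, the header value and the webhook signing secret. Here is a minimal sketch using just the Ruby standard library; webhook_verified? and its parameter names are made up for illustration, and the comparison is done in constant time by hand so that no framework helper is required:

```ruby
require "openssl"
require "base64"

# Hypothetical helper: true when the Base64-encoded HMAC-SHA256 of the raw
# body, keyed with the webhook secret, equals the header value.
def webhook_verified?(raw_body, header_hmac, secret)
  return false if header_hmac.nil?

  calculated = Base64.strict_encode64(
    OpenSSL::HMAC.digest("sha256", secret, raw_body)
  )
  return false unless calculated.bytesize == header_hmac.bytesize

  # Constant-time comparison: fold every byte difference into one value
  diff = 0
  calculated.bytes.zip(header_hmac.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end
```

In the controller this would be called with request.body.read and the X-Shopify-Hmac-SHA256 header, and the secret must be the one shown on the notifications page, not the API secret key.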

Rails reading API response header

I am getting familiarised with APIs. As a start, I am using the Forecast API.
In the docs, you will find a section entitled "Response Headers". What are they, and how can I use them?
Also, to get a response, it says you need to pass an API key, along with lat and long data. But aren't API keys supposed to be kept secret? Will anyone find out the contents of the request?
This is the code I have:
Forecast model
require 'json'
class Forecast
  include HTTParty
  debug_output $stdout
  default_params :apiKey => 'xxxxxxxxxxxxxxxxxxxxxxxxx'
  base_uri "api.forecast.io"
  format :json
  def self.get_weather(api,lat,long)
    @response = get("/forecast/#{apiKey}/#{lat},#{long}")
  end
  def self.show_weather
    JSON.parse(@response.body)
  end
end
Forecast controller
def index
  @weather = Forecast.get_weather("28.5355", "77.3910")
  @response = Forecast.show_weather
end
Forecast view
<%= @response["currently"]["summary"] %>
You're asking a couple of different questions here.
Response headers: They are part of an HTTP response, and contain information about the response. For example, they might tell you the MIME type of the response, e.g. Content-Type: application/json. In this case, Forecast uses them to tell you how many API calls you've made (X-Forecast-API-Calls) and how long it took them to respond (X-Response-Time), as well as some caching information.
API keys: Yes, these should be kept secret. The Forecast API works over HTTPS, so (in theory) your API key should be kept secret from people sniffing traffic on your network. The main danger is keeping it in your code and, for example, committing it to GitHub. You should figure out a safer way to store the API key. One example, whilst not perfect, would be to have it as an environment variable.
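To make both points concrete, here is a rough sketch using only Net::HTTP from the standard library rather than HTTParty. The helper names and the FORECAST_API_KEY environment variable are assumptions for illustration; the header names are the ones mentioned above.

```ruby
require "net/http"
require "uri"

# Build the request URI, taking the key from the environment instead of the
# source code (the variable name FORECAST_API_KEY is hypothetical).
def forecast_uri(lat, long, api_key: ENV.fetch("FORECAST_API_KEY"))
  URI("https://api.forecast.io/forecast/#{api_key}/#{lat},#{long}")
end

# Pull the headers discussed above out of any Net::HTTP response object.
# Header lookup is case-insensitive; missing headers come back as nil.
def forecast_usage(response)
  {
    content_type: response["Content-Type"],
    api_calls: response["X-Forecast-API-Calls"]&.to_i,
    response_time: response["X-Response-Time"]
  }
end
```

A call would look like response = Net::HTTP.get_response(forecast_uri("28.5355", "77.3910")) followed by forecast_usage(response). HTTParty responses expose the same information through response.headers.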
I hope that helps.

Rails, Rackspace Cloud Files, Referrer ACL

I am using Rackspace Cloud Files as File Storage server for my application. The files that users upload must be authorized from within my application, then from a controller it would redirect to the correct Rackspace Cloud Files CDN URL. I am trying to do authorization using Rackspace Cloud Files' Referrer ACL.
So let me just add a very simple snippet to clarify what I am trying to accomplish.
class FilesController < ApplicationController
  def download
    redirect_to(some_url_to_a_file_on_cloud_files_url)
  end
end
The URL the user would access to get to this download action would be the following:
http://a-subdomain.domain.com/projects/:project_id/files/:file_id/download
So with the CloudFiles gem I have set up an ACL Referrer regular expression that should work.
http\:\/\/.+\.domain\.com\/projects\/\d+\/files\/\d+\/download
When the user clicks on a link in the web UI, it routes them to the above URL and depending on the parameters, it will from the download action redirect the user to the correct Rackspace Cloud Files File URL.
Well, what I get is an error, saying that I am unauthorized (wrong http referrer). I have a hunch that because I am doing a redirect from the download action straight to cloud files, that it doesn't "count" as a HTTP Referrer and, rather than use this URL as a referrer, I think it might be using this URL:
http\:\/\/.+\.domain\.com\/projects\/\d+\/files
Since this is the page you are on when you want to click on the "download" link, that directs the user to the download action in the FilesController.
When I set the HTTP Referrer for Rackspace ACL to just this:
http\:\/\/.+\.domain\.com\/projects\/\d+\/files
And then click on a link, I am authorized to download. However, this isn't safe enough since then anyone could for example just firebug into the html and inject a raw link to the file and gain access.
So I guess my question is, does anyone have any clue how or why, what I am trying to accomplish is not working, and have any suggestions/ideas? As I said I think it might be that when a user clicks the link, that the referrer is being set to the location of which the file is being clicked, not the url where the user is being redirected to the actual file on cloud files.
Is something like this possible?
class FilesController < ApplicationController
  def download
    # Dynamically set a HTTP Referrer here before
    # redirecting the user to the actual file on cloud files
    # so the user is authorized to download the file?
    redirect_to(some_url_to_a_file_on_cloud_files_url)
  end
end
Any help, suggestions are much appreciated!
Thanks!
Generally Micahel's comment is more than enough to explain why S3 tops rackspace for this matter, but if you'd really like to add some special HTTP headers to your Rackspace request - do an HTTP request of your own and fetch the file manually:
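class DownloadsController < ApplicationController
  def download
    # fetch the file server-side with custom headers, then pass the body on to the user
    response = HTTParty.get(some_url_to_a_file_on_cloud_files_url, :headers => { "x-special-headers" => "AWESOME" })
    send_data response.body, :filename => "myfile.something"
  end
end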
Yes, you can code this example better but it's the general idea.
Although there is still no 'Referer' check, you can create temp urls (signed urls) with the current version of Rackspace CloudFiles.
The following code is taken from Rackspace documentation site.
require "openssl"
unless ARGV.length == 4
  puts "Syntax: <method> <url> <seconds> <key>"
  puts ("Example: GET https://storage101.dfw1.clouddrive.com/v1/" +
        "MossoCloudFS_12345678-9abc-def0-1234-56789abcdef0/" +
        "container/path/to/object.file 60 my_shared_secret_key")
else
  method, url, seconds, key = ARGV
  method = method.upcase
  base_url, object_path = url.split(/\/v1\//)
  object_path = '/v1/' + object_path
  seconds = seconds.to_i
  expires = (Time.now + seconds).to_i
  hmac_body = "#{method}\n#{expires}\n#{object_path}"
  sig = OpenSSL::HMAC.hexdigest("sha1", key, hmac_body)
  puts ("#{base_url}#{object_path}?" +
        "temp_url_sig=#{sig}&temp_url_expires=#{expires}")
end

Rails: Obfuscating Image URLs on Amazon S3? (security concern)

To make a long explanation short, suffice it to say that my Rails app allows users to upload images to the app that they will want to keep in the app (meaning, no hotlinking).
So I'm trying to come up with a way to obfuscate the image URLs so that the address of the image depends on whether or not that user is logged in to the site, so if anyone tried hotlinking to the image, they would get a 401 access denied error.
I was thinking that if I could route the request through a controller, I could re-use a lot of the authorization I've already built into my app, but I'm stuck there.
What I'd like is for my images to be accessible through a URL to one of my controllers, like:
http://railsapp.com/images/obfuscated?member_id=1234&pic_id=7890
If the user were to right-click on the image displayed on the website and select "Copy Address", then paste it in, it would be the SAME URL (as in, it wouldn't betray where the image is actually hosted).
The actual image would be living on a URL like this:
http://s3.amazonaws.com/s3username/assets/member_id/pic_id.extension
Is this possible to accomplish? Perhaps using Rails' render method? Or something else? I know it's possible for PHP to return the correct headers to make the browser think it's an image, but I don't know how to do this in Rails...
UPDATE: I want all users of the app to be able to view the images if and ONLY if they are currently logged on to the site. If the user does not have a currently active session on the site, accessing the images directly should yield a generic image, or an error message.
S3 allows you to construct query strings for requests which allow a time-limited download of an otherwise private object. You can generate the URL for the image uniquely for each user, with a short timeout to prevent reuse.
See the documentation, look for the section "Query String Request Authentication Alternative". I'd link directly, but the frame-busting javascript prevents it.
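As a rough illustration of what such a time-limited URL looks like, here is a sketch of the older query string authentication scheme (Signature Version 2) as I understand it from that documentation section. Newer AWS regions only accept Signature Version 4, so in practice the presigning support in the AWS SDK is probably the safer choice. The method name and all arguments are placeholders:

```ruby
require "openssl"
require "base64"
require "erb"

# Hypothetical helper: build a time-limited GET URL for a private S3 object
# using the legacy query string authentication (Signature Version 2).
def s3_signed_url(bucket, key, access_key_id, secret_access_key, expires_at)
  expires = expires_at.to_i
  # Verb, empty Content-MD5, empty Content-Type, expiry, then the resource path
  string_to_sign = "GET\n\n\n#{expires}\n/#{bucket}/#{key}"
  signature = Base64.strict_encode64(
    OpenSSL::HMAC.digest("sha1", secret_access_key, string_to_sign)
  )
  "https://s3.amazonaws.com/#{bucket}/#{key}" \
    "?AWSAccessKeyId=#{access_key_id}&Expires=#{expires}" \
    "&Signature=#{ERB::Util.url_encode(signature)}"
end
```

Generating the URL per request with a short expiry, for example s3_signed_url(bucket, key, id, secret, Time.now + 60), means a copied link stops working after a minute.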
Should the images be available to only that user or do you want to make it available to a group of users (friends)?
In any case if you want to stop hotlinking you should not store the image files under DocumentRoot of your webserver.
If the former, you could store the image on the server as MD5(image_file_name_as_exposed_to_user + logged_in_username_from_cookie). When the user requests image_file_name_as_exposed_to_user, in your rails app, construct the image filename as previously mentioned and then open the file in rails app and write it out (after first setting Content-Type in response header appropriately). This is secure by design.
If the image could be shared with friends, then you should not incorporate username in constructed filename but rest of the advice should work.
This is late in the day to be answering, but another option altogether would be to store the files in MongoDB's GridFS, served through a bit of Rack Middleware that requires auth to be passed. Pretty much as secure as you like, and the URLs don't even need obfuscation.
The other benefit of this is in the availability of the files and the future scalability of the system.
Thanks for your responses, but I'm still skeptical as to whether or not "timing out" the URL from Amazon is a very effective way to go.
I've updated my question above to be a little more clear about what I'm trying to do, and trying to prevent.
After some experimentation, I've come up with a way to do what I want to do in my Rails app, though this solution is not without downsides. Effectively, I construct my image_tag with a URL that points to a controller and takes a path parameter. That controller first tests whether or not the user is authorized to see the image, then fetches the content of the image in a separate request and stores it in an instance variable. That variable is passed to a respond_to view to return the image, successfully obfuscating the actual image's URL (since that request is made separately).
Cons:
Adds to request time (I feel that the additional time it takes to do this double-request is acceptable considering the privacy this method gives me)
Adds some clutter to views and routes (a small amount, maybe a bit more than I'd like)
If the user is authorized, and tries to access the image directly, the image is downloaded immediately rather than displayed in the browser (anyone know how to fix this? Modify HTTP headers? Only seems to do this with the jpg, though...)
You have to make a separate view for each file format you intend to serve (two for me, jpg and png)
Are there any other cons or considerations I should be aware of with this method? So far what I've listed, I can live with...
(Refactoring welcome.)
application_controller.rb
class ApplicationController < ActionController::Base
  def obfuscate_image
    respond_to do |format|
      if current_user
        format.jpg { @obfuscated_image = fetch_url "http://s3.amazonaws.com/#{Settings.bucket}/#{params[:path]}" }
      else
        format.png { @obfuscated_image = fetch_url "#{root_url}/images/assets/profile/placeholder.png" }
      end
    end
  end
  protected
  # helps us fetch an image, obfuscated
  def fetch_url(url)
    r = Net::HTTP.get_response(URI.parse(url))
    if r.is_a? Net::HTTPSuccess
      r.body
    else
      nil
    end
  end
end
views/application/obfuscate_image.png.haml & views/application/obfuscate_image.jpg.haml
= @obfuscated_image
routes.rb
map.obfuscate_image 'obfuscate_image', :controller => 'application', :action => 'obfuscate_image'
config/environment.rb
Mime::Type.register "image/png", :png
Mime::Type.register "image/jpg", :jpg
Calling an obfuscated image
= image_tag "/obfuscate_image?path=#{@user.profile_pic.path}"
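On the JPG being downloaded instead of displayed: one possible cause (a guess, not something verified against this app) is the MIME type, since the registered type for JPEG is image/jpeg rather than image/jpg. A small sketch of choosing the response headers by file extension, with a hypothetical helper name:

```ruby
# Registered MIME types for the two formats served above.
IMAGE_TYPES = {
  ".jpg"  => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".png"  => "image/png"
}.freeze

# Hypothetical helper: headers asking the browser to display the image inline
# rather than offering it as a download.
def inline_image_headers(path)
  type = IMAGE_TYPES.fetch(File.extname(path).downcase) do
    raise ArgumentError, "unsupported image type: #{path}"
  end
  { "Content-Type" => type, "Content-Disposition" => "inline" }
end
```

In a Rails controller the same effect should be achievable with send_data and its :type and :disposition options, which would also remove the need for one view per file format.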
The problem you have is that as far as I know you need the images on S3 to be World-readable for them to be accessible. At some point in the process an HTTP GET is going to have to be performed to retrieve the image, which is going to expose the real URL to tools that can sniff HTTP, such as Firebug.
Incidentally, 37signals don't consider this to be a huge problem because if I view an image in my private Backpack account I can see the public S3 URL in the browser address bar. Your mileage may vary...

Resources