HTTP::ConnectionError & Errno::EHOSTUNREACH Errors in Rails App - ruby-on-rails

I'm working on a Rails app and here are two important pieces of the error message I get when I try to seed data in my database:
HTTP::ConnectionError: failed to connect: Operation timed out - SSL_connect
Errno::ETIMEDOUT: Operation timed out - SSL_connect
Here is my code, where I'm pulling data from a file, and creating Politician objects:
politician_data.each do |politician|
  photo_from_congress = "https://theunitedstates.io/images/congress/original/" + politician["id"]["bioguide"] + ".jpg"
  # Use the remote photo only if the server answers 200; otherwise fall back to the placeholder
  image = HTTP.get(photo_from_congress).code == 200 ? photo_from_congress : "noPoliticianImage.png"
  Politician.create(
    name: "#{politician["name"]["first"]} #{politician["name"]["last"]}",
    image: image
  )
end
I put in a pry, and the iteration works for the first loop, so the code is OK. After several seconds, the loop breaks, and I get that error, so I think it has something to do with the number of HTTP.get requests I'm making?
https://github.com/unitedstates/images is a Git repo. Perhaps that repo can't handle that many get requests?
I did some Googling and saw it may have something to do with a "Request timed out" error? Or might I have to set up a proxy server? I'm a junior programmer, so please be very specific when responding.
*EDIT TO ADD THIS:
I found this blurb on the site where I'm making get requests to cull photos (https://github.com/unitedstates/images), that may help?
Note: Our HTTPS permalinks are provided through CloudFlare's Universal SSL, which also uses "Flexible SSL" to talk to GitHub Pages' unencrypted endpoints. So, you should know that it's not an end-to-end encrypted channel, but is encrypted between your client use and CloudFlare's servers (which at least should dissociate your requests from client IP addresses).

By the way, using Net::HTTP instead of the HTTP Ruby gem worked. Instead of checking the status code, I just checked whether the body contained key text:
photo_from_congress = "https://theunitedstates.io/images/congress/original/" + politician["id"]["bioguide"] + ".jpg"
photo_as_URI = URI(photo_from_congress)
image = Net::HTTP.get_response(photo_as_URI).body.include?("File not found") ? "noPoliticianImage.png" : photo_from_congress
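The workaround above still downloads every image body and will raise if a single request times out. A minimal sketch of a more defensive check follows, assuming the image host answers HEAD requests (not confirmed by the original post) and that a 5-second timeout is acceptable. The names photo_exists?, image_for, PHOTO_BASE_URL and FALLBACK_IMAGE are hypothetical helpers introduced for illustration only.

```ruby
require "net/http"
require "openssl"
require "uri"

# Hypothetical constants for this sketch; the values come from the question.
PHOTO_BASE_URL = "https://theunitedstates.io/images/congress/original/"
FALLBACK_IMAGE = "noPoliticianImage.png"

def photo_url_for(bioguide_id)
  "#{PHOTO_BASE_URL}#{bioguide_id}.jpg"
end

# Returns true when the server answers the HEAD request with a 2xx status.
# Timeouts and connection failures count as "no photo" instead of aborting the seed.
def photo_exists?(url)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: 5, read_timeout: 5) do |http|
    http.head(uri.request_uri).is_a?(Net::HTTPSuccess)
  end
rescue Net::OpenTimeout, Net::ReadTimeout, SystemCallError, SocketError, OpenSSL::SSL::SSLError
  false
end

# The checker is injectable so the selection logic can be exercised without a network.
def image_for(bioguide_id, checker: method(:photo_exists?))
  url = photo_url_for(bioguide_id)
  checker.call(url) ? url : FALLBACK_IMAGE
end
```

In the seed loop this would be called as image_for(politician["id"]["bioguide"]). Rescuing SystemCallError covers Errno::ETIMEDOUT and Errno::EHOSTUNREACH, so one unreachable request no longer stops the whole run.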

Related

Running ActionCable behind Cloudfront

We've set up CloudFront in front of our application, but unfortunately it strips the Upgrade header required for ActionCable to run.
We'd like to have a different subdomain that points to the same servers but bypasses CloudFront (socket.site.com, for instance). We've done this and it's somewhat working, but it seems a persistent connection can't be made. ActionCable retries the connection every 10s and seems unable to hold it open.
Any advice related to Cloudfront or different domains for ActionCable is appreciated.
To all who follow, hopefully this helps.
As of the time of writing (Oct. 2018), it doesn't appear that you can use ActionCable behind CloudFront at all. CF will discard the Upgrade header, which prevents a WebSocket connection from ever being made.
Our setup was CF -> Application Load Balancer (ALB) -> EC2. On the AWS side, we began by making a subdomain (socket.example.com) that pointed directly to the same ALB and bypassed CF entirely. Note that Classic Load Balancers absolutely will not work. You can only use ALBs.
This alone did not fix the issue. On your Rails config, you have to add the following lines to your production.rb:
config.action_cable.url = 'wss://socket.example.com:28080/cable'
config.action_cable.allowed_request_origins = ['https://example.com'] # Not the subdomain
You may also need to update your CSP to include wss://socket.example.com/cable for connect_src.
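As a rough illustration of the CSP change, and assuming a Rails version that ships the content security policy DSL (5.2 or later) with the policy defined in an initializer, the connect_src entry could look like this. The file location and the :self source are assumptions; only the wss URL comes from the answer.

```ruby
# config/initializers/content_security_policy.rb (assumed location)
Rails.application.config.content_security_policy do |policy|
  # Allow XHR/WebSocket connections to the app itself and to the socket subdomain
  policy.connect_src :self, "wss://socket.example.com/cable"
end
```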
If at this point you're getting a message about failing to upgrade, you need to ensure that your NGINX config is correct. This answer may help.
You will also need to reflect this change in your cable.js. The following snippet works for me in local development as well as production, but you may need to alter it. I wrote it with pre-ES6 syntax in mind because this file never hit Babel in our configuration.
(function() {
this.App || (this.App = {})
var wsUrl
if(location.host.indexOf('localhost') != -1) {
wsUrl = '/cable'
} else {
var host = location.host
var protocol = location.protocol
wsUrl = protocol + '//socket.' + host + '/cable'
}
App.cable = ActionCable.createConsumer(wsUrl)
}).call(this)
That may be all you need, depending on your authentication scheme. However, I was using cookies shared between the main application and ActionCable and this caused a difficult bug. The connection would appear to be made correctly, but it would actually fail and ActionCable would retry every 10s. The final step was to ensure the auth cookies being set would work across the socket subdomain. I updated my cookie as such:
cookies.signed[:cookie_name] = {
value: payload,
domain: ['.socket.example.com', '.example.com']
# Some people have to specify tld_length, but I was fine without it
}

In ruby/rails, can you differentiate between no network response vs long-running response?

We have a Rails app with an integration with box.com. It happens fairly frequently that a request for a box action to our app results in a Passenger process being tied up for right around 15 minutes, and then we get the following exception:
Errno::ETIMEDOUT: Connection timed out - SSL_connect
Often it's on something that should be fairly quick, such as listing the contents of a small folder, or deleting a single document.
I'm under the impression that these requests never actually got an open channel: either we got no initial response at the TCP or SSL level, or the full handshake/session setup never completed.
I'd like to get either such condition to timeout quickly, say 15 seconds, but allow for a large file that is successfully transferring to continue.
Is there any way to get TCP or SSL to raise a timeout much sooner when the connection at either of those levels fails to complete setup, but not raise an exception if the session is successfully established and it's just taking a long time to actually transfer the data?
Here is what our current code looks like - we are not tied to doing it this way (and I didn't write this code):
def box_delete(uri)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_NONE
request = Net::HTTP::Delete.new(uri.request_uri)
http.request(request)
end
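One way to approach this with Net::HTTP is to set open_timeout and read_timeout separately. The sketch below assumes a Ruby version where open_timeout also bounds the TLS handshake (worth verifying on the version in use), and it relies on read_timeout applying to each individual read rather than to the whole transfer. The helper name build_box_http and the 15/600 second values are illustrative assumptions.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds a connection that gives up quickly when the
# connection cannot be established, but tolerates long transfers.
def build_box_http(uri, connect_timeout: 15, transfer_timeout: 600)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  # Raises Net::OpenTimeout if the connection is not set up within this many seconds
  http.open_timeout = connect_timeout
  # Raises Net::ReadTimeout only if a single read blocks this long;
  # a large file that keeps delivering data does not trip it
  http.read_timeout = transfer_timeout
  http
end

def box_delete(uri)
  http = build_box_http(uri)
  http.request(Net::HTTP::Delete.new(uri.request_uri))
end
```

With this split, a request that never gets a connection should fail after about 15 seconds with Net::OpenTimeout, which can be rescued and reported separately from a slow but healthy transfer.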

Mechanize error "too many bad responses"

While scraping, I found that some URLs failed. The URL looked fine in the browser, and Wireshark showed the remote server answering with a 200, but this URL:
http://www.segundamano.es/electronica-barcelona-particulares/galaxy-note-3-mas.htm
was failing with
Net::HTTP::Persistent::Error: too many bad responses after 0 requests on 42319240, last used 1414078471.6468294 seconds ago
Stranger still, if you remove a character from the last part, it works. If you add the character in another place, it fails again.
Update 1
The "code"
agent = Mechanize.new
page = agent.get("http://www.segundamano.es/electronica-barcelona-particulares/galaxy-note-3.htm")
Net::HTTP::Persistent::Error: too many bad responses after 0 requests on 41150840, last used 1414079640.353221 seconds ago
This is a network error which normally occurs if you make too many requests to a certain source from the same IP and thus the page takes too long to load. You could try adding a custom timeout to your connection agent, keep the connection alive and ignore bad chunking (potentially bad):
agent = Mechanize.new
agent.keep_alive = true
agent.ignore_bad_chunking = true
agent.open_timeout = 25
agent.read_timeout = 25
page = agent.get("http://www.segundamano.es/electronica-barcelona-particulares/galaxy-note-3.htm")
But that does not guarantee that the connection will be successful; it just increases the chances.
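Since longer timeouts only improve the odds, another option is to retry a failed request a few times with a growing pause. This is a generic sketch, not part of Mechanize; the helper name with_retries and its parameters are hypothetical.

```ruby
# Runs the block, retrying on the listed error classes with exponential backoff.
# The sleeper is injectable so the helper can be exercised without real waiting.
def with_retries(attempts: 3, base_delay: 1.0, on: [StandardError], sleeper: Kernel.method(:sleep))
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on
    # Give up and re-raise once the attempt budget is spent
    raise if tries >= attempts
    # Wait base_delay, then 2x, then 4x, ... before the next attempt
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Possible usage with the agent configured above (assumes the mechanize gem is loaded):
#   page = with_retries(on: [Net::HTTP::Persistent::Error]) { agent.get(url) }
```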
It's hard to say why you get the error on one url and not on another. When you remove the 3 you request a different page; one that might be easier for the server to process? My point being: There is nothing wrong with your Mechanize setup but with the response you are getting back.
I agree with Severin: the problem was on the other side. As I can't change anything on the server, I tried different libraries to fetch the data. Oddly, some of them worked and others didn't. After trying different Mechanize setups, I finally found one that works:
agent = Mechanize.new do |a|
  a.gzip_enabled = false
end

Connect to a password protected FTP through PROXY in Ruby

I'm trying to upload to my server (on Heroku) a file stored on a password-protected FTP server.
The problem is that this FTP server doesn't have my production IP address on its whitelist (and I can't add it), so I have to use a proxy to connect my Rails app to it.
I tried this code :
proxy_uri = URI(ENV['QUOTAGUARDSTATIC_URL'] || 'http://login:password@myproxy.com:9293')
Net::HTTP::Proxy(proxy_uri.host, proxy_uri.port,"login","password").start('ftp://login:password@ftp.website.com') do |http|
  http.get('/path/to/myfile.gz').body
end
But my http.get fails with "lookup ftp: no such host".
I also have this code for an FTP download, but I don't know how to make it work with a proxy:
ftp = Net::FTP.new('ftp.myftp.com', 'login', 'password')
ftp.chdir('path/to')
ftp.getbinaryfile('myfile.gz', 'public/myfile.gz', 1024)
ftp.close
Thanks in advance.
I realise that you asked this question over 6 months ago, but I recently had a similar issue and found that this (unanswered) question is the top Google result, so I thought I would share my findings.
mudasobwa's comment below your original post has a link to the net/ftp documentation which explains how to use a SOCKS proxy...
Although you don't mention a specific requirement for an HTTP proxy in your original post, it seems obvious to me that is what you were trying to use. As I'm sure you're aware, this makes the SOCKS documentation irrelevant.
The following code has been tested on ruby-1.8.7-p357 using an HTTP proxy that does not require authentication:
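# Requires net/http. The file is opened in binary mode for the .gz payload.
File.open('myfile.gz', 'wb') do |file|
  http = Net::HTTP.start('myproxy.com', 9293)
  # On Ruby 1.9+ http.get returns only the response, so read the body from it
  resp = http.get('ftp://login:password@ftp.website.com')
  file.write(resp.body) if resp.code == "200"
end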
Source
This should give you a good starting point to figure the rest out for yourself.
To get you going, I would guess that you could use user:pass@myproxy.com for basic auth, or perhaps send a Proxy-Authorization header in your GET request.

405 Method not allowed on Net::HTTP request [ruby on rails]

I'm trying to verify that a remote URL exists with the following code:
endpoint_uri = URI.parse(@endpoint.url)
endpoint_http = Net::HTTP.new(endpoint_uri.host, endpoint_uri.port)
endpoint_request = Net::HTTP::Head.new(endpoint_uri.request_uri)
endpoint_response = endpoint_http.request(endpoint_request)
I keep getting 405 Method Not Allowed. When I use Get instead of Head (Net::HTTP::Get.new), I get 200 Success, but the response includes the whole remote document, which increases the response time (0.3s => 0.9s).
Any ideas why this is happening? Thanks.
There's a chance that the URL in @endpoint doesn't support HEAD requests (which would be really weird, but may still be the case). Your code works fine for me with a handful of URLs (google.com, stackoverflow.com, etc.).
Have you tried a curl request to see what it returns?
curl -I http://www.the_website_you_want_to_test.com
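If the server really does reject HEAD, one possible workaround is to try HEAD first and fall back to a GET that stops before the body is read. This is a sketch under assumptions: the helper names method_not_allowed? and probe are hypothetical, and it relies on Net::HTTP yielding the response to the block before the body has been downloaded.

```ruby
require "net/http"
require "uri"

# True when the server rejected the request method itself (HTTP 405).
def method_not_allowed?(response)
  response.is_a?(Net::HTTPMethodNotAllowed)
end

# Returns a response whose status can be inspected. After the GET fallback
# the body has not been read, so only the status and headers should be used.
def probe(uri, open_timeout: 5, read_timeout: 5)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: open_timeout, read_timeout: read_timeout) do |http|
    response = http.request(Net::HTTP::Head.new(uri.request_uri))
    return response unless method_not_allowed?(response)

    # Fallback: issue a GET, but return as soon as the status line and
    # headers arrive; the connection is closed when the start block exits.
    http.request(Net::HTTP::Get.new(uri.request_uri)) do |get_response|
      return get_response
    end
  end
end
```

A caller could then check probe(URI.parse(@endpoint.url)).is_a?(Net::HTTPSuccess) without paying for the full document download in the common case.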

Resources