In Rails I want to read excel file form Live Path like http://www.carsa.jp/admin/data.xlsx - ruby-on-rails

I wants to read a excel file existing on Live URL of another website.
When I hit that URL in browser file is downloading. While in my rails app it is giving below error
No such file or directory # rb_sysopen - http://www.carsa.jp/admin/data.xlsx (Errno::ENOENT)
My Rails app code is as below
data = Roo::Excelx.new('http://www.carsa.jp/admin/data.xlsx')
header = data.row(1)
puts header
Note: If I download file and place it within my application it is working fine but the requirement is to read it from the third-party website in a scheduled job as per the above script.
data = Roo::Excelx.new('lib/data.xlsx')
header = data.row(1)
puts header

Try using Roo::Spreadsheet.open instead of Roo::Excelx.new. According to the Roo Readme:
Roo::Spreadsheet.open can accept both paths and File instances.
This should do the trick:
Roo::Spreadsheet.open('http://www.carsa.jp/admin/data.xlsx')

Related

Downloading all files from URL with extension filter

I want to download all files from FTP or HTTP using extension filter
For example, I have one URL that contains many MKV files and I want to set the extension filtering to download all MKV files from URL or download all jpg
I can use open-Uri for this but this method only download one file and save it
require 'open-uri'
download = open('https://img.webmd.com/dtmcms/live/webmd/consumer_assets/site_images/article_thumbnails/video/caring_for_your_kitten_video/650x350_caring_for_your_kitten_video.jpg')
IO.copy_stream(download, '650x350_caring_for_your_kitten_video.png')
That's a limitation on the server site. If your server doesn't allow directory listing (which most of the time HTTP servers dont) there is not a lot you can do.
So to answer your question: open-uri will not allow you to list files.
As for FTP. you are able to list all files in a directory like this. Not sure if you're able to pass in wildcards. If not you will have to use the select method to filter on the filenames you want.
I found the solution to get all links and put it to a text file
Then using system command to get a text file to download by the download manager, for example, I need to get mkv links from URL pages
require 'mechanize'
mechanize = Mechanize.new
page = mechanize.get("URL")
page.search('a').each do |link|
uri = mechanize.resolve(' ' + link).to_s
if uri.include?(".mkv")
File.open("url.txt", "a") do |file|
file.puts uri
end
end
end
puts "The File has been created!"
#exec 'idman /d URL'
#puts "Done!"

Attach a pdf in asset pipeline using ActionMailer Rails 4

I'm trying to attach a file to an email. The file is in assets/downloads/product.pdf
In the mailer, I have:
attachments["product.pdf"] = File.read(ActionController::Base.helpers.asset_path("product.pdf"))
I've tried:
attachments["product.pdf"] = File.read(ActionController::Base.helpers.asset_url("product.pdf"))
...and even:
attachments["product.pdf"] = File.read(ActionController::Base.helpers.compute_asset_host("product.pdf") + ActionController::Base.helpers.compute_asset_path("product.pdf"))
I always get the same error:
EmailJob crashed!
Errno::ENOENT: No such file or directory - //localhost:3000/assets/product.pdf
...or a variation on the theme. But even when I try using asset_url in the view or just put the url in the browser it works:
http://localhost:3000/assets/product.pdf
I've also tried using straight up:
File.read("app/assets/downloads/product.pdf")
File.read("downloads/product.pdf")
...which works in dev environment but not on staging server (heroku). Error is still:
Errno::ENOENT: No such file or directory - downloads/product-market-fit-storyboard.pdf
Also tried:
File.read("/downloads/product.pdf")
File.read("http://lvh.me:3000/assets/product.pdf")
...don't work at all.
Ideas?
you should use syntex like this.it is work for me may be will work for you also.
File.open(Dir.glob("#{Rails.root}/app/assets/downloads/product.pdf"), "r")
When using mailer, you shouldn't use assets pipline. Asset pipeline would be useful if you wanted to have link to a file inside your email. When rendering an email, action mailer has access to files in app directory.
Please read about attachments in action mailer guide. As you can see, you just need to pass path to a file, not url:
attachments['filename.jpg'] = File.read('/path/to/filename.jpg')

retrieve carrierwave file uploaded to Amazon S3

Using Rails 3.2, carrier wave, and recently switched to store on Amazon S3. My setup and uploads are all working fine.
1. I have image_uploader.rb to upload and store images. Displaying them all works fine
2. I have file_uploader.rb to upload and store files. I've even taken it a step further to upload ZIP files and extract a version so that both the ZIP file and TXT files are stored in the correct place on S3.
My problem is I run a method on the TXT file. In the past, I used storage :file
With that I was able to:
Dir.chdir("public/uploads/")
import_file = Dir['*.TXT'].first
f = File.new(import_file)
Now, that I'm using storage :fog I can't get seem to retrieve/File.new/Open the file.
I see the file with the usual commands:
#upload1.team_file # stored file
#upload1.team_file.url # url
#upload1.team_file_url(:data_file).to_s # version created
I've been pouring through all kinds of very limited leads on retrieving and/or opening the file, but everything I try seems to return errors, such as:
Errno::ENOENT: No such file or directory - https://teamfiles.s3.amazonaws.com/data_files…
Thoughts on the difference here of retrieving and USING a file from AmazonS3? Thanks!
Pulling from multiple threads, APIs, etc. I'm answering my own question with what I've found. I welcome any corrections or improvements:
To retrieve carrierwave files uploaded to AmazonS3, you have to understand that open(#upload.file_url) or File.open(#upload.file_url) does NOT open the file, it only opens the PATH to the file. (ref: Ruby OpenURI )
I use: open_uri_url = open(#upload.file_url)
You then have to find the specific file in that path that you want. For me, I then find a ZIP file that was uploaded to AmazonS3 and Extract the specific file within the ZIP file that I want with a unique *.ABC extension:
zip_content_file = Zip::File.open(open_uri_url).map{|content| content if content.to_s.split('.').last == "ABC"}.compact.first
Now, from here, where to extract to?? I create a unique directory in the Rails tmp directory to extract the file to, use it and then delete the directory:
tmp_directory = "tmp/extracts/#{#upload.parent_id}/"
FileUtils.mkdir_p(tmp_directory) unless File.directory?(tmp_directory)
extract = zip_content_file.extract(tmp_directory + content_file.to_s)
Now with found from the AmazonS3 stored ZIP file and extracted, I can open, read, etc:
f = File.new(tmp_directory + extract.to_s)
I hope this helps with Carrierwave, AmazonS3, ZIP files and using them once uploaded.

How to call input file which is qlready in the package

In my Hadoop Map Reduce application I have one input file.I want that when I execute the jar of my application, then the input file will automatically be called.To do this I code one class to specify the input,output and file itself but from where I am calling the file, there I want to specify the file path. To do that I have used this code:
QueriesTest.class.getResourceAsStream("/src/main/resources/test")
but it is not working (cannot read the input file from the generated jar)
so I have used this one
URL url = this.getClass().getResource("/src/main/resources/test") here I am getting the problem of URL. So please help me out. I am using Hadoop 0.21.
I'm not sure what you want to tell us with your resource loading, but the usual way to add an input file is this:
Configuration conf = new Configuration();
Job job = new Job(conf);
Path in = new Path("YOUR_PATH_IN_HDFS");
FileInputFormat.addInputPath(job, in);
job.setInputFormatClass(TextInputFormat.class); // could be a sequencefile also
// set the other stuff
job.waitForCompletion(true);
Make sure your file resides in HDFS then.

ruby reading files from S3 with open-URI

I'm having some problems reading a file from S3. I want to be able to load the ID3 tags remotely, but using open-URI doesn't work, it gives me the following error:
ruby-1.8.7-p302 > c=TagLib2::File.new(open(URI.parse("http://recordtemple.com.s3.amazonaws.com/music/745/original/The%20Stranger.mp3?1292096514")))
TypeError: can't convert Tempfile into String
from (irb):8:in `initialize'
from (irb):8:in `new'
from (irb):8
However, if i download the same file and put it on my desktop (ie no need for open-URI), it works just fine.
c=TagLib2::File.new("/Users/momofwombie/Desktop/blah.mp3")
is there something else I should be doing to read a remote file?
UPDATE: I just found this link, which may explain a little bit, but surely there must be some way to do this...
Read header data from files on remote server
Might want to check out AWS::S3, a Ruby Library for Amazon's Simple Storage Service
Do an AWS::S3:S3Object.find for the file and then an use about to retrieve the metadata
This solution assumes you have the AWS credentials and permission to access the S3 bucket that contains the files in question.
TagLib2::File.new doesn't take a file handle, which is what you are passing to it when you use open without a read.
Add on read and you'll get the contents of the URL, but TagLib2::File doesn't know what to do with that either, so you are forced to read the contents of the URL, and save it.
I also noticed you are unnecessarily complicating your use of OpenURI. You don't have to parse the URL using URI before passing it to open. Just pass the URL string.
require 'open-uri'
fname = File.basename($0) << '.' << $$.to_s
File.open(fname, 'wb') do |fo|
fo.print open("http://recordtemple.com.s3.amazonaws.com/music/745/original/The%20Stranger.mp3?1292096514").read
end
c = TagLib2::File.new(fname)
# do more processing...
File.delete(fname)
I don't have TagLib2 installed but I ran the rest of the code and the mp3 file downloaded to my disk and is playable. The File.delete would clean up afterwards, which should put you in the state you want to be in.
This solution isn't going to work much longer. Paperclip > 3.0.0 has removed to_file. I'm using S3 & Heroku. What I ended up doing was copying the file to a temporary location and parsing it from there. Here is my code:
dest = Tempfile.new(upload.spreadsheet_file_name)
dest.binmode
upload.spreadsheet.copy_to_local_file(:default_style, dest.path)
file_loc = dest.path
...
CSV.foreach(file_loc, :headers => true, :skip_blanks => true) do |row|}
This seems to work instead of open-URI:
Mp3Info.open(mp3.to_file.path) do |mp3info|
puts mp3info.tag.artist
end
Paperclip has a to_file method that downloads the file from S3.

Resources