Trying to save xml file from pdftables api call - ruby-on-rails

I'm trying to make a request to the PDFTables API, and save what is returned (an xml doc) in a new file. I have this code:
result = RestClient.post "https://pdftables.com/api?key=nn123450hsn", :myfile => File.new("./lib/assets/PeterValleyHexacoResults.pdf", "rb")
File.open('./lib/assets/test.xml', "w") do |f|
f.puts result
end
When I view the newly saved file, it looks like a bunch of random symbols and characters in the editor. I'm not entirely sure what I'm doing wrong. Any help is appreciated.

You are getting the result in XLSX format. You need to specify XML in your request:
result = RestClient.post "https://pdftables.com/api?key=nn123450hsn&format=xml", :myfile => File.new("./lib/assets/PeterValleyHexacoResults.pdf", "rb")
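A quick way to confirm this diagnosis is to inspect the first bytes of the response: XLSX files are ZIP archives, and ZIP data begins with the signature `PK\x03\x04`. The sketch below is only an illustration; `zip_payload?` and `save_response` are hypothetical helper names, not part of RestClient or the PDFTables API. It also writes the file in binary mode, which avoids any newline or encoding translation if the body is not plain text.

```ruby
# ZIP local-file-header signature; XLSX files are ZIP archives.
ZIP_MAGIC = "PK\x03\x04".b

# Returns true when the response bytes look like a ZIP/XLSX payload.
def zip_payload?(body)
  body.to_s.b.start_with?(ZIP_MAGIC)
end

# Writes the body byte-for-byte ("wb" opens the file in binary mode).
def save_response(body, path)
  File.open(path, "wb") { |f| f.write(body) }
end
```

If `zip_payload?(result)` returns true, the API probably returned a spreadsheet rather than XML, and the request's format parameter is the thing to check.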

Related

Read and write JSON data from form to file

I am trying to work with JSON data in Rails.
We need to save the countries we support in a JSON file. We have created a form in which a user can enter a new country/state/pincode pair, and the form should append that pair to the JSON file. After that, we need to read the JSON file and print which countries are supported.
We know how to read data from the JSON file, but we are having some issues while writing the data in the proper format.
This is the code for reading the data:
@data = JSON.parse(IO.read("public/dealer.json"))
How can I write data to a file from the form in JSON format?
Given a Ruby object, you can generate a file containing its JSON representation like so:
require 'json'
data = { "foo" => "bar" }
File.open("output.json", "w+") do |f|
f.write(JSON.generate(data))
end
require 'json'
data = [{ "foo" => "bar" } , { "foo1" => "bar1" }]
File.open("output.json", "w+") do |f|
f.write(JSON.generate(data))
end
Try this...!
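Since the question is about appending a new pair to an existing file, one possible approach is to read the current array, push the new entry, and rewrite the whole file. This is a minimal sketch that assumes the file holds a JSON array; `append_json_entry` is a hypothetical helper name, and the keys in the usage note are only illustrative.

```ruby
require "json"

# Reads the JSON array stored at `path` (or starts a new one if the file
# does not exist), appends `entry`, and rewrites the file.
# Returns the updated array.
def append_json_entry(path, entry)
  entries = File.exist?(path) ? JSON.parse(File.read(path)) : []
  entries << entry
  File.write(path, JSON.pretty_generate(entries))
  entries
end
```

In a controller this might be called as `append_json_entry("public/dealer.json", "country" => params[:country], "state" => params[:state], "pincode" => params[:pincode])`, where the parameter names are assumptions about the form. Rewriting the whole file is simple but not safe under concurrent writes.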

how to read uploaded files

I'm giving users the opportunity to upload files. However, the application does not save those files to the database; it only needs to extract information from them.
So from a form which looks like this:
= simple_form_for @channel, method: :post do |f|
= f.input :name
= f.input :configuration_file, as: :file
= f.submit
comes params[:channel][:configuration_file]:
#<ActionDispatch::Http::UploadedFile:0xc2af27c @original_filename="485.csv", @content_type="text/csv", @headers="Content-Disposition: form-data; name=\"channel[configuration_file]\"; filename=\"485.csv\"\r\nContent-Type: text/csv\r\n", @tempfile=#<File:/tmp/RackMultipart20140822-6972-19sqxq2>>
How exactly can I read from this object? I tried simply
File.open(params[:channel][:configuration_file])
but it raises an error:
!! #<TypeError: can't convert ActionDispatch::Http::UploadedFile into String>
PS
Additional solutions for xml and csv would be much appreciated!
According to the Rails docs:
http://api.rubyonrails.org/classes/ActionDispatch/Http/UploadedFile.html
an uploaded file supports the following instance methods, among others:
open()
path()
read(length=nil, buffer=nil)
you could try:
my_data = params[:channel][:configuration_file].read
to get a string of the file contents?
or even:
my_data = File.read params[:channel][:configuration_file].path
Also, if the file can be long, you may want to open the file and read line by line. A few solutions here:
How to read lines of a file in Ruby
If you want to read a CSV file, you could try:
require 'csv'
CSV.foreach(params[:channel][:configuration_file].path, :headers => true) do |row|
row_hash = row.to_hash
# Do something with the CSV data
end
Assuming you have headers in your CSV of course.
For XML I recommend the excellent Nokogiri gem:
http://nokogiri.org/
At least partly because it uses an efficient C library for navigating the XML. (This can be a problem if you're using JRuby). Its use is probably out of scope of this answer and adequately explained in the Nokogiri docs.
From the documentation
The actual file is accessible via the tempfile accessor, though some
of its interface is available directly for convenience.
You can change your code to:
file_content = params[:channel][:configuration_file].read
or if you want to use the File API:
file_content = File.read params[:channel][:configuration_file].path
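For long files, the line-by-line approach mentioned above can be sketched with `File.foreach` on the upload's path. `FakeUpload` is a hypothetical stand-in so the example runs outside Rails; in a controller you would pass `params[:channel][:configuration_file]` instead, which also responds to `path`.

```ruby
require "tempfile"

# Hypothetical stand-in for ActionDispatch::Http::UploadedFile, used only
# so this sketch runs without Rails. The real object also exposes #path.
FakeUpload = Struct.new(:tempfile) do
  def path
    tempfile.path
  end
end

# Yields each line (without its trailing newline) one at a time, so the
# whole file is never loaded into memory at once.
def each_upload_line(upload)
  File.foreach(upload.path) { |line| yield line.chomp }
end
```

Usage would look like `each_upload_line(upload) { |line| process(line) }`, where `process` is whatever handling the application needs.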

How do I create a Paperclip attachment (from raw bytes? (maybe)) without having the file stored on disk?

This works (but seems tedious):
File.new("temp.pdf", "w").close
File.open("temp.pdf", "w+") do |f|
f.write(response.body)
pdf = PDF.new({
:document => f,
})
pdf.save
end
# delete the temp file
File.delete("temp.pdf")
But I would rather not create, write, upload, and delete a temp file every time I want to create a PDF in the S3 bucket.
This is what I would like to do:
pdf = PDF.new({
:document => response.body,
})
pdf.save
But since response.body is just a string of bytes (I think; I'm not sure about the format of response.body or how to look it up), Paperclip doesn't know how to convert it to a file.
Note: response.body is from DocRaptor: http://docraptor.com/ which converts html to PDF.
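One commonly suggested workaround is to wrap the bytes in an in-memory `StringIO` that also carries a filename and content type, so nothing touches the disk. The sketch below shows only the wrapper; it assumes the attachment library accepts a `StringIO` in place of a `File`, which you should verify against your Paperclip version. `NamedStringIO` and `bytes_as_upload` are hypothetical names.

```ruby
require "stringio"

# In-memory IO carrying the metadata an uploaded file would normally provide.
class NamedStringIO < StringIO
  attr_accessor :original_filename, :content_type
end

# Wraps raw bytes (e.g. response.body) so they can be assigned like a file.
def bytes_as_upload(bytes, filename, content_type)
  io = NamedStringIO.new(bytes)
  io.original_filename = filename
  io.content_type = content_type
  io
end
```

Under that assumption, the model code from the question might become `PDF.new(:document => bytes_as_upload(response.body, "document.pdf", "application/pdf"))`, with the filename chosen by you.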

CSV.open and send_data in rails....?

I may end up figuring this out later, but i thought I'd try.
Can someone help me combine send_data and CSV.open?
According to the docs, you can call CSV.open with a filename and a mode (whatever that is), and a file will be saved to your current path. However, if you want to send that file to a user through their browser, as most of us who offer CSV downloads do, can CSV.open be combined with send_data?
Thoughts?
Examples are welcome if you do something like this too.
I don't think you want to combine those two things.
CSV.open will save the data to a file, which you would need to read back in in order to send it through send_data.
But you can do something like:
csv = []
csv << ["titles", "for", "csv"]
csv << ["data", "for", "csv"]
send_data(csv.collect{|s| s.join(",")}.join("\n"),
:type => 'text/csv; charset=utf-8; header=present',
:filename => "mytitle.csv")
Which should prompt the user to download the csv file.
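Joining fields with commas by hand breaks as soon as a value itself contains a comma, a quote, or a newline. A variation on the same idea, sketched here, builds the string with `CSV.generate` from Ruby's csv library, which quotes such fields; `rows_to_csv` is a hypothetical helper name.

```ruby
require "csv"

# Builds a CSV string in memory; CSV.generate quotes fields that contain
# commas, quotes, or newlines.
def rows_to_csv(rows)
  CSV.generate { |csv| rows.each { |row| csv << row } }
end
```

In a controller action, the result can be passed straight to send_data, for example `send_data(rows_to_csv(rows), :type => 'text/csv; charset=utf-8; header=present', :filename => "mytitle.csv")`.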

Parsing a document in a table

How do I parse a document stored in a table and send it as a JSON file to another database?
Detailed Desc:
I have crawled websites using Anemone and stored the data in a table. I now need to parse it and transfer it as a JSON file to another server. I think I will first have to convert the document in the table into a Nokogiri document, which can then be parsed and converted to a JSON file. Any idea how I can convert the document into a Nokogiri document, or does anyone have another way to parse it and send it as a JSON file?
Nokogiri is your best bet for the HTML parsing, but as for converting it to JSON you're on your own from what I can tell.
Once you have it parsed via Nokogiri it shouldn't be terribly hard to extract the elements you need and generate JSON that represents them. What you're doing isn't a very common task, so you'll have to bridge the gap between Nokogiri and whichever gem you're using to generate the JSON.
Okay, I found the answer a long time ago. I basically made use of REST to send a message from one application to another, and I sent it across as a hash. And the obvious one: I used Nokogiri for parsing the table.
def post_me
@page_hash = page_to_hash
res = Net::HTTP.post_form(URI.parse('http://127.0.0.1:3007/element_data/save.json'), @page_hash)
end
For sending the hash from one application to another using net/http.
def page_to_hash
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'domainatrix'
#page = self.page.sub(/^<!DOCTYPE html(.*)$/, '<!DOCTYPE html>')
hash={}
doc = Nokogiri::HTML(self.page)
doc.search('*').each do |n|
puts n.name
end
Using Nokogiri for parsing the page table in my model. page table had the whole body of a webpage.
file_type = []
file_type_data = doc.xpath('//a/@href[contains(. , ".pdf") or contains(. , ".doc")
or contains(. , ".xls") or contains(. , ".cvs") or contains(. , ".txt")]')
file_type_data.each do |href|
if href[1] == "/"
href = "http://" + website_url + href
end
file_type << href
end
file_type_str = file_type.join(",")
hash ={:head => head,:title => title, :body => self.body,
:image => images_str, :file_type => file_type_str, :paragraph => para_str, :description => descr_str,:keyword => key_str,
:page_url=> self.url, :website_id=>self.parent_request_id, :website_url => website_url,
:depth => self.depth, :int_links => @int_links_arr, :ext_links => @ext_links_arr
}
A simple parsing example, and how I formed my hash.
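Net::HTTP.post_form sends the hash as form-encoded fields, which flattens nested values such as arrays of links. If the receiving application expects an actual JSON body, one alternative is to serialize the hash and set the content type explicitly. This is a sketch using only the standard library; `build_json_post` is a hypothetical helper name, and the URL is the one from the answer above.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) a POST request whose body is the payload
# serialized as JSON. Returns the parsed URI and the request object.
def build_json_post(url, payload)
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.request_uri, "Content-Type" => "application/json")
  request.body = JSON.generate(payload)
  [uri, request]
end

# Sending it (not run here, since it needs the receiving server):
#   uri, request = build_json_post('http://127.0.0.1:3007/element_data/save.json', page_hash)
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Whether the receiving side parses JSON bodies automatically depends on that application, so treat this as one option rather than a drop-in replacement.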
