Import CSV from url address and export as XML -- Rails

Two questions:
How can I import a file from a web address, without a form?
Example: Organisation.import(:from => 'http://wufoo.com/report.csv')
How can I use xml builder without pulling from the db?
More Info
My company uses wufoo for web forms. The data from wufoo is exported as csv files. To get the data into my company's cms, it needs to be formatted as xml. I don't need to store any of the data, aside from the url to the csv file. I thought this might work well as a simple rails app.

Use open-uri (http://www.ruby-doc.org/stdlib/libdoc/open-uri/rdoc/) to fetch the file, and ruby's csv library to parse it. Or, use csv-mapper which is nice and simple (http://csv-mapper.rubyforge.org/).

Here is a way:
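require 'rio'        # gem install rio
require 'fastercsv'  # gem install fastercsv (it became the standard csv library in Ruby 1.9)
url = 'http://remote-url.com/file.csv'
people = FasterCSV.parse(rio(url).read)
# people[0] is the header row; every later row becomes one <record>
xml = ''
1.upto(people.size - 1) do |row_idx|
  xml << "  <record>\n"
  people[0].each_with_index do |column, col_idx|
    # parameterize comes from ActiveSupport (Rails); values are not XML-escaped here
    xml << "    <#{column.parameterize}>#{people[row_idx][col_idx]}</#{column.parameterize}>\n"
  end
  xml << "  </record>\n"
end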
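For the second half of the question (building XML without touching the db), here is a minimal sketch that uses only Ruby's standard csv library. The names csv_to_xml, tag_name and XML_ESCAPES are made up for this example, and the gsub in tag_name is only a rough stand-in for Rails' parameterize. It assumes the first CSV row holds the column names and that each header yields a usable XML element name.

```ruby
require 'csv'

# Characters that must be escaped inside XML text nodes.
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

# Hypothetical helper: turn a header such as "First Name" into "first_name".
def tag_name(header)
  header.to_s.strip.downcase.gsub(/[^a-z0-9]+/, '_')
end

# Hypothetical helper: convert CSV text (first row = headers) into an XML string.
def csv_to_xml(csv_text)
  rows = CSV.parse(csv_text, headers: true)
  xml = +"<records>\n"
  rows.each do |row|
    xml << "  <record>\n"
    row.each do |header, value|
      tag = tag_name(header)
      escaped = value.to_s.gsub(/[&<>]/, XML_ESCAPES)
      xml << "    <#{tag}>#{escaped}</#{tag}>\n"
    end
    xml << "  </record>\n"
  end
  xml << "</records>\n"
end
```

The CSV text could come from open-uri, for example csv_to_xml(URI.open(url).read) after require 'open-uri', so nothing has to be stored apart from the url.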

Related

how to convert pdf file into xlsx file in ruby on rails

I upload a PDF and want to convert it to an xlsx file. I have tried several approaches but none gives the expected output: pdf2xls writes only a single line rather than the whole file's data. I want all of the PDF's data to appear in the xlsx file.
I have one method that converts the PDF to xlsx, but the result is not formatted properly.
def do_excel_to_pdf
  @user = User.create!(pdf: params[:pdf])
  @path_in = @user.pdf.path
  temp1 = @user.pdf.path
  @path_out = @user.pdf.path.slice(0..@user.pdf.path.rindex(/\//))
  query = "libreoffice --headless --invisible --convert-to pdf " + @path_in + " --outdir " + @path_out
  system(query)
  file = @path_out + @user.pdf.original_filename.slice(0..@user.pdf.original_filename.rindex('.') - 1) + ".pdf"
  send_file file, :type => "application/msexcel", :x_sendfile => true
end
If anyone has done this, please help me; any gem or script is welcome.
I would start with reading from the PDF. Inserting the data into the XLSX is easy; if you have problems with that, ask another question and specify which gem you use and what you tried for that part.
You use libreoffice to read the PDF but according to the FAQ your PDF needs to be hybrid, perhaps that is the problem.
As an alternative you could try to use some conversion tool for ebooks like the one in Calibre but I'm afraid you will lose too much formatting to recover the data you need.
It all depends on how the data in your PDF is structured. If it is regular text without much formatting and positioning, it can be as easy as using the pdf-reader gem.
I used it in the past and my data had a lot of formatting (you would be surprised how complicated the PDF structure is), so I had to specify for each field exactly which location the data had to be read from. Not for the faint of heart.
Here is a simple example.
require 'pdf/reader' # gem install pdf-reader
reader = PDF::Reader.new("my.pdf")
reader.pages.each do |page|
  # page.text returns the text extracted from this page
  puts page.text
end
I could not find an option for converting PDF to xlsx, but there are API options for converting PDF to image and PDF to PowerPoint (link below).
I am not sure whether you can change the requirement to show the results in another format.
http://www.convertapi.com/

Parse a string like a CSV file with seek, rewind, position

My application accepts an uploaded file from the user and parses it, making use of seek and rewind methods quite heavily to parse blocks from the file (lines can begin with 'start' or 'end' to enclose a section of data, etc).
A new requirement allows the user to upload encrypted files. I've implemented decryption of the content of the file and return the content string to the existing method. I can parse the string as a CSV but lose the file controls.
Storing an unencrypted version of the file is not an option for business reasons.
I'm using FasterCSV but not averse to using something else if I can keep the seek/rewind behaviour.
Current code:
FasterCSV.open(path, 'rb') do |csv| # Can I open a string as if it were a file?
  unless csv.eof? # Catch empty files
    # Read, store position, seek, rewind all used during parsing
    position = csv.pos
    row = csv.readline
    csv.seek(position)
After some digging and experimentation I've found that it was possible to retain the IO methods by using the StringIO class like so:
csv = StringIO.new(decrypted_content)
unless csv.nil?
  unless csv.eof? # Catch empty files
    position = csv.pos
    row = csv.readline.chomp.split(',')
    csv.seek(position)
The only change is that I need to split each line manually to use it like a CSV row, which is not much extra work.
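One caveat with split(',') is that it breaks quoted fields that contain commas. A possible variation, sketched here under the assumption that no field contains an embedded newline, is to keep StringIO for pos/seek/rewind and hand each line to CSV.parse_line from the standard csv library. The helper name peek_row is made up for this example.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: parse the next line as a CSV row, then restore the
# read position so the same line can be read again later.
def peek_row(io)
  position = io.pos
  row = CSV.parse_line(io.readline)
  io.seek(position)
  row
end

# Sample data standing in for the decrypted file content.
io = StringIO.new("start\nname,\"Smith, Jo\"\nend\n")
peek_row(io)   # parses the "start" line and leaves io.pos unchanged
io.readline    # consume "start" for real
peek_row(io)   # the quoted "Smith, Jo" stays one field
```

As in the original code, check io.eof? first, because readline raises EOFError at the end of the data.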
You don't need the CSV gem anymore, but if you prefer the seek/rewind behaviour you can roll your own for strings. Something like this might work for your scenario:
# "\n" must be double-quoted; '\n' in single quotes is a literal backslash followed by n
array_of_lines = unencrypted_file_string.split("\n")
array_of_lines.each_with_index do |line, index|
  position = index
  row = line
  seek = line[10] # the character at offset 10 within the line
end

Ruby on rails string parsing

I have a string that is a bunch of XML tags.
Basically there is the contents to one tag I want and ignore everything else:
The input would look like:
<Some><XML><stuff>
<title type='text'>key</title>
<Some><other><XML><stuff>
The output would look like:
key
I'm not sure if XML is appropriate since there doesn't seem very much structure to this particular XML.
Can regex do this in RoR or is it more of just a pattern matching thing (true or false) in ruby on rails?
Thanks so much!
Cheers,
Zigu
No. If your source might not be strictly valid XML, I strongly suggest using Nokogiri.
Handle the source as an HTML document and extract the info you need in this way:
require 'nokogiri' # gem install nokogiri
doc = Nokogiri::HTML("Your string with <key>some value</key>")
doc.search('key').each do |value|
  puts value.content # do whatever you want
end
Here's why you don't parse xml with regexen: RegEx match open tags except XHTML self-contained tags

How can i include image into CSV

In my Rails application the admin can export user data in CSV format. Every user in my application has a profile photo, and my client wants the user photos included in the CSV file. I do not know how to do this. Can anyone please help me?
I am using the fastercsv gem, and here is some of my controller code for reference.
In my controller:
require 'fastercsv'
def getcsv
  entries = User.find(:all)
  csv_string = FasterCSV.generate do |csv|
    csv << ["first_name", "last_name", "username", "address", "photo"]
    entries.each do |e|
      csv << [e.first_name, e.last_name, e.username, e.address, e.photo.url]
    end
  end
  send_data csv_string, :type => 'text/csv;charset=iso-8859-1; header=present', :filename => "entries.csv", :disposition => 'attachment'
end
Saving the actual photo in a CSV file is technically possible, but very ill-advised. CSV is simply not intended for that sort of job. You obviously cannot simply embed the binary image data into the ASCII text-based CSV file. It would be possible to use Base64 encoding to convert the 8-bit binary data into 7-bit text, and then store this in one of the fields in your CSV file. (This will also expand the storage required by about 33%, since every 3 bytes become 4 characters.)
But then what software could decode this? You would have to write some other software to convert the images back on the other end, which would defeat the purpose. Furthermore, each line of the CSV file would be massive, and probably cause problems on importing.
You would be much better off exporting the profile photos as a set of PNGs and saving the filename in the CSV file against each record.
CSV is plain text. There's no way to include graphic data unless both the generator and the reader agree on a format, such as base64-encoded PNG.
You could try adding either links to the image files or a single-line Base64-encoded string.
CSV (comma-separated values) is not a format suitable for binary data. The general rule of thumb for saving binary data to a text file is to convert it to a format that does suit text files, such as Base64 or a hexadecimal representation of the raw data.
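To illustrate the Base64 option, here is a rough sketch of the round trip using Ruby's standard csv library. The byte string is a made-up stand-in for a real photo, and the column name photo_base64 is hypothetical; whoever reads the file would have to know to decode that column.

```ruby
require 'csv'

# Made-up stand-in for the bytes of a profile photo (768 bytes).
photo_bytes = (0..255).to_a.pack('C*') * 3

# 'm0' is strict Base64 without line breaks (what Base64.strict_encode64
# does), so the encoded photo stays on a single CSV line.
encoded = [photo_bytes].pack('m0')

csv_string = CSV.generate do |csv|
  csv << %w[username photo_base64]
  csv << ['alice', encoded]
end

# Reader side: both ends must agree that this column holds Base64 data.
row = CSV.parse(csv_string, headers: true).first
decoded = row['photo_base64'].unpack1('m0')

puts "#{photo_bytes.bytesize} bytes became #{encoded.bytesize} characters"
```

Every 3 bytes become 4 characters, which is why rows get large quickly and why storing a filename or URL is usually the more practical choice.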

In Ruby on Rails, how can I convert html to word?

How can I convert HTML to Word?
Thanks.
I have created a Ruby html to word gem that should help you do just that. You can check it out at https://github.com/nickfrandsen/htmltoword - You simply pass it a html string and it will create a corresponding word docx file.
def show
  respond_to do |format|
    format.docx do
      file = Htmltoword::Document.create params[:docx_html_source], "file_name.docx"
      send_file file.path, :disposition => "attachment"
    end
  end
end
Hope you find it helpful.
I am not aware of any solution which does this, i.e. convert HTML to Word format. If you literally mean that, you will have to parse the HTML document first using something like Nokogiri. If you mean you want to output data persisted in your model objects, there is obviously no need to parse HTML! As far as outputting to Word, I'm afraid it looks as if you will have to directly interface with a running instance of Microsoft Word via OLE!
A quick google search for win32ole ruby word will get you started:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/241606
Good luck!
I agree with CodeJoust that it is better to generate a PDF. However, if you really need to generate a Word document then you can do the following:
If your server is a Windows machine, you can install Office in it and use ruby's OLE binding to generate the Word document into the public folder and then deliver the file in the response.
To use ruby's OLE binding, see the "Programming Ruby" ebook that comes with the one-click ruby installer for Windows. You may have to use custom logic to convert from HTML to Word unless you can find a function in the OLE api of Word to do that.
http://prawn.majesticseacreature.com/
You could allow the user to download a PDF or an .html file instead, since there aren't any helpful Ruby libraries for generating Word documents. You're better off generating a 'printable and downloadable' version, without much styling, and/or a PDF version using a library like prawn.
You could always generate a simple .rtf file; I think Word will be pretty happy reading that.
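For the .rtf idea, a minimal sketch might look like the following. The helper name text_to_rtf is made up for this example, and it handles plain ASCII paragraphs only: converting HTML markup into RTF formatting is not covered, and non-ASCII characters would need extra escaping.

```ruby
# Hypothetical helper: wrap plain-text paragraphs in a minimal RTF document.
def text_to_rtf(paragraphs)
  body = paragraphs.map do |text|
    # Backslashes and braces are control characters in RTF and must be escaped.
    escaped = text.gsub(/[\\{}]/) { |ch| "\\#{ch}" }
    "#{escaped}\\par"
  end.join("\n")
  # \fs24 is a 12pt font size; the font table declares a single font, f0.
  "{\\rtf1\\ansi\\deff0 {\\fonttbl{\\f0 Arial;}}\n\\f0\\fs24 #{body}\n}"
end

rtf = text_to_rtf(['Hello from Rails', 'Braces {like these} are escaped'])
File.write('example.rtf', rtf)
```

In a controller the string could be sent with send_data and a content type such as application/rtf instead of being written to disk.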

Resources