How to write a Tempfile as binary - ruby-on-rails

When trying to write a string / unzipped file to a Tempfile by doing:
temp_file = Tempfile.new([name, extension])
temp_file.write(unzipped_io.read)
Which throws the following error when I do this with an image:
Encoding::UndefinedConversionError - "\xFF" from ASCII-8BIT to UTF-8
When researching it I found out that this is caused because Ruby tries to write files with an encoding by default (UTF-8). But the file should be written as binary, so it ignores any file specific behavior.
Writing regular File you would be able to do this as following:
File.open('/tmp/test.jpg', 'rb') do |file|
file.write(unzipped_io.read)
end
How to do this in Tempfile

Tempfile.new passes options to File.open which accepts the options from IO.new, in particular:
:binmode
If the value is truth value, same as “b” in argument mode.
So to open a tempfile in binary mode, you'd use:
temp_file = Tempfile.new([name, extension], binmode: true)
temp_file.binmode? #=> true
temp_file.external_encoding #=> #<Encoding:ASCII-8BIT>
In addition, you might want to use Tempfile.create which takes a block and automatically closes and removes the file afterwards:
Tempfile.create([name, extension], binmode: true) do |temp_file|
temp_file.write(unzipped_io.read)
# ...
end

I have encountered the solution in an old Ruby forum post, so I thought I would share it here, making it easier for people to find:
https://www.ruby-forum.com/t/ruby-binary-temp-file/116791
Apparently Tempfile has an undocumented method binmode, which changes the writing mode to binary and thus ignoring any encoding issues:
temp_file = Tempfile.new([name, extension])
temp_file.binmode
temp_file.write(unzipped_io.read)
Thanks unknown person who mentioned it on ruby-forums.com in 2007!

Another alternative is IO.binwrite(path, file_content)

Related

Ruby: Is there a way to specify your encoding in File.write?

TL;DR
How would I specify the mode of encoding on File.write, or how would one save image binary to a file in a similar fashion?
More Details
I'm trying to download an image from a Trello card and then upload that image to S3 so it has an accessible URL. I have been able to download the image from Trello as binary (I believe it is some form of binary), but I have been having issues saving this as a .jpeg using File.write. Every time I attempt that, it gives me this error in my Rails console:
Encoding::UndefinedConversionError: "\xFF" from ASCII-8BIT to UTF-8
from /app/app/services/customer_order_status_notifier/card.rb:181:in `write'
And here is the code that triggers that:
def trello_pics
#trello_pics ||=
card.attachments.last(config_pics_number)&.map(&:url).map do |url|
binary = Faraday.get(url, nil, {'Authorization' => "OAuth oauth_consumer_key=\"#{ENV['TRELLO_PUBLIC_KEY']}\", oauth_token=\"#{ENV['TRELLO_TOKEN']}\""}).body
File.write(FILE_LOCATION, binary) # doesn't work
run_me
end
end
So I figure this must be an issue with the way that File.write converts the input into a file. Is there a way to specify encoding?
AFIK you can't do it at the time of performing the write, but you can do it at the time of creating the File object; here an example of UTF8 encoding:
File.open(FILE_LOCATION, "w:UTF-8") do
|f|
f.write(....)
end
Another possibility would be to use the external_encoding option:
File.open(FILE_LOCATION, "w", external_encoding: Encoding::UTF_8)
Of course this assumes that the data which is written, is a String. If you have (packed) binary data, you would use "wb" for openeing the file, and syswrite instead of write to write the data to the file.
UPDATE As engineersmnky points out in a comment, the arguments for the encoding can also be passed as parameter to the write method itself, for instance
IO::write(FILE_LOCATION, data_to_write, external_encoding: Encoding::UTF_8)

Confusion about creating and writing to File in Ruby on rails

I am trying to create a new file and then write some content to it just to create a basic backup of a template.
When I log out the values of filename and file_content they are correct, but when I send the data all I get is a file named after the method (download_include) and a fixnum inside the file, the last one made was 15.
# POST /download_include/:id
def download_include
#include = Include.find(params[:id])
version_to_download = #include.latest_version_record
filename = "#{version_to_download.name}"
file_content = "#{version_to_download.liquid_code.to_s}"
file = File.open(filename, "w") { |f| f.write (file_content) }
send_data file
end
I also tried send_file but that produces the error
no implicit conversion of Fixnum into String
I also tried to just write dummy values like below, and it still produced a file named after the method with a fixnum inside it.
file = File.open("DOES THIS CHANGE THE FILENAME?", "w") { |f| f.write ("FILE CONTENT?") }
I feel I am missing something obvious but I cannot figure it out after looking at many examples here and in blogs.
If you don't end along the filename as an option for send_data, it defaults to the method name.
Secondly, the download wants to read the data from a buffer. My guess is your syntax is just sending a file handle.
Try this...
send_data(file.read, filename: filename)
Or skip the intermediate file and try...
send_data(version_to_download.liquid_code.to_s, filename: filename)

Can't parse files uploaded in UTF-16 on Heroku

I'm writing a Rails app that allows the user to upload TSV (Tab-separated values) files to be parsed on the server. Those files are encoded in UTF-16. All goes fine in local, but when I try to open the file with such encoding on Heroku, I'm getting a warning that says warning: Unsupported encoding utf-16 ignored. When later I try to read such file, it obviously fails stating invalid byte sequence in UTF-8. See an excerpt of the code below:
File.open(params[:batch_import][:file].path, 'r:utf-16') do |f|
#recipients = Recipient.from_tsv(f.read)
end
Is there any walkaround that I can do?
UTF-16 files must be opened with the binary mode. Try this:
File.open(params[:batch_import][:file].path, 'rb:utf-16') do |f|
#recipients = Recipient.from_tsv(f.read)
end

MalformedCSVError with rails CSV (FasterCSV)

I'm having serious issues trying to parse some CSV in rails right now.
Basically my app gets a user to upload a CSV file. The app then converts the file to ensure it is in UTF-8 format, then attempts to parse it and process it. Whenever the app attempts to parse it however, I get the MalformedCSVError stating "Illegal quoting on line 1"
Now what I don't get, is if I copy the original file into a new document and save it, then I can parse it on a rails console without a problem.
If I attempt to parse the original file, it complains about an invalid character for UTF-8 encoding (the file isn't in UTF-8 hence the app converts it)
If I attempt to parse the file which the app has converted to UTF-8 and changed the line endings to LF, it fails to parse.
If I do a file diff between the version the app has produced, and the copy/paste version that I have made (which works) there are 0 differences so I really can't figure out why one is parsable, and one is not.
Any suggestions? My app is processing the file as follows :
def create
#survey = Survey.new(params[:survey])
# Now we need to try and convert this to UTF-8 if it isn't already
encoded = File.read(#survey.survey_data.current_path)
encoding = CharlockHolmes::EncodingDetector.detect(encoded)
# We've got a guess at the encoding,
# so we can try and convert it but it
# may still fail so we need to handle
# that
begin
re_encoded = CharlockHolmes::Converter.convert(encoded, encoding[:encoding], 'UTF-8')
re_encoded = re_encoded.gsub(/\r\n?/, "\n")
# Now replace the uploaded file
File.open(#survey.survey_data.current_path, 'w') { |f|
f.write(re_encoded)
}
rescue ArgumentError
puts "UH OH!!!!!"
end
puts "#{#survey.survey_data.current_path}"
#parsed = CSV.read(#survey.survey_data.current_path)
end
The file uploading gem is CarrierWave if that makes any difference.
Please can someone help me as this is driving me insane!
Edit
The error says it's on line 1. Line 1 (assuming it doesn't index from 0) is
"Survey","RD","GarrysMDs","NigelsMDs","PaulsMDs","StephensMDs","BrinleyJ","CarolineP","DaveL","GrantR","GregS","Kent","NeilC","NicolaP","AndyC","DarrenS","DeanB","KarenF","PaulR","RichardF","SteveG","BrianG","GordonA","NickD","NickR","NickT","RayL","SimonH","EdmondH","JasonF","MikeS","SamanthaN","TimB","TravisF","AlanS","Q1","Q2","Q3","Q4","Q5","Q6","Q7","Q8PM","Q8N","Q9","Q10","Q11","Q12","Q13","Q14","Q15","Q16PM","Q16N","Q17PM","Q17N","Q18PM","Q18N","Q19","Q20","Q21","Q22","comment","Q23.1","Q23.2","Q23.3","TQ23.1","TQ23.2","VPM","VN","VQ1","VQ2","VQ3","VQ4","VQ5","VQ6","VQ7","VQ8N","VQ8PM","VQ9","VQ10","VQ11","VQ12","VQ13","VQ14","VQ15","VQ16","VQ16N","VQ16PM","VQ17","VQ17N","VQ17PM","VQ18","VQ18N","VQ18PM","VQ19","VQ20","VQ21","VQ22","VQ23.1","VQ23.2","VQ23.3","VRD","XQ16","XQ17","XQ18"
Well that was irritating!
Turns out the file had a BOM which was causing the CSV parser to break. Loading the file with
CSV.open("path/to/file.csv", "rb:bom|encoding")
allowed it to parse it perfectly! So annoyed how long it took to track down but it's now working and with no need to convert to UTF-8 now either!

Spreadsheet - encoding problem with reading cyrillic characters

I'm working on a rails app for a small shop. It needs to load an .xls file, parse it and maybe load to the database.
I use Spreadsheet gem to work with the file.
The problem is that the file contains russian characters which are displayed as "└ÛÛ.ExT H-1727F (ÓÝÓÙ¯Ò GP T304)"
The reference says, I need to specify the encoding, but I don't know which one is used in this file. I tried "win-1251" but it gave me an error about being unable to find a "utf-8 to win-1251 converter"
I've setting encoding to "WINDOWS-1251" but it gave me this error:
U+00BE to WINDOWS-1251 in conversion from CP850 to UTF-8 to WINDOWS-1251
So then I've tried CP850, which didn't throw an error, but the characters were still not readable.
There's not much code really.
# -*- encoding : utf-8 -*-
...
def show
require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'
book = Spreadsheet.open 'c:\rails\renergy23\public\price-16-04-11.xls'
#sheet = book.worksheet 0
end
For simpicity I don't load it to the database right now. Instead I output it in my view:
- 30.times do |i|
= #sheet.row i+10
%br
http://dl.dropbox.com/u/4976861/price-16-04-11.xls
I kinda solved this after 1.5 months by first saving the document in .xlsx and then saving it in .xls (97-2003). I couldn't use the .xlsx because of some weird OLE signature incorrect error.

Resources