Download file from PostgreSQL bytea escape - ruby-on-rails

I am having trouble letting users download a file that is stored in a PostgreSQL bytea escaped field:
1.9.3p385 :023 > data = PG::Connection.unescape_bytea(m[:data])
=> "JVBERi0xLjMKJcTl8uXrp/Og0MTGCjQgMCBvYmoKPDwgL0xlbmd0aCA1IDAg\r\nUiAvRmlsdGVyIC9GbGF0ZURlY29kZSA+Pgpzd..."
1.9.3p385 :023 >
1.9.3p385 :023 > data.bytesize
=> 3878164
But when I use send_data, or send_file with a tempfile, I get a file in an invalid format (it is a PDF file). It is much bigger than the original and PDF readers cannot open it.
The data in this field is a MIME part of an email. If I build a raw email from all these parts (using the boundary as separator), the email contains a valid PDF attachment.
How should I convert this data to bytes so that users can download the file separately?

See the following:
The syntax is something like PGConn.unescape_bytea($field);
Depending on your version of pg, you may need to upgrade that gem
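One detail in the IRB output above may matter here: the unescaped value starts with JVBERi0x, which is the Base64 encoding of %PDF-1. That suggests the column holds the Base64 text of the MIME part rather than raw PDF bytes, which would also explain why the downloaded file is larger than the original. A minimal sketch under that assumption (the helper name is hypothetical and not part of the pg gem):

```ruby
require 'base64'

# Hypothetical helper (not from the original post): decodes the Base64 text
# of a MIME part into the raw bytes of the attachment.
# Assumption: `unescaped` is what PG::Connection.unescape_bytea returned.
def mime_part_to_bytes(unescaped)
  # decode64 skips the "\r\n" line breaks that MIME inserts into the text
  Base64.decode64(unescaped)
end

# The first characters from the question, with a MIME-style line break added
sample = "JVBERi0x\r\nLjMK"
pdf_bytes = mime_part_to_bytes(sample)

puts pdf_bytes.inspect   # the start of a PDF header
puts pdf_bytes.encoding  # a binary encoding, not UTF-8
```

In a controller, the decoded bytes could then be passed to send_data with type 'application/pdf' instead of the Base64 text.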


How to convert Base64 string to pdf file using prawn gem

I want to generate a PDF file from a DB record, encode it to a Base64 string and store it in the DB. That part works fine. Now I want the reverse: how can I decode the Base64 string and generate the PDF file again?
Here is what I have tried so far:
def data_pdf_base64
  # Create Prawn object
  my_pdf = Prawn::Document.new
  # Write text to the PDF
  my_pdf.text("Hello Gagan, How are you?")
  # Save to the tmp folder as a PDF file
  my_pdf.render_file("#{Rails.root}/tmp/pdf/gagan.pdf")
  # Read the PDF file and encode it to Base64
  encoded_string = Base64.encode64(File.open("#{Rails.root}/tmp/pdf/gagan.pdf") { |i| i.read })
  # Delete the generated PDF file from the tmp folder
  File.delete("#{Rails.root}/tmp/pdf/gagan.pdf") if File.exist?("#{Rails.root}/tmp/pdf/gagan.pdf")
  # Now convert the Base64 string to a PDF again
  pdf = Prawn::Document.new
  # I have used a TTF font because it was giving me the error below:
  # Your document includes text that's not compatible with the Windows-1252 character set. If you need full UTF-8 support, use TTF fonts instead of PDF's built-in fonts.
  pdf.font Rails.root.join("app/assets/fonts/fontawesome-webfont.ttf")
  pdf.text Base64.decode64 encoded_string
rescue => e
  return render :text => "Error: #{e}"
end
Now I am getting the error below:
Encoding ASCII-8BIT can not be transparently converted to UTF-8.
Please ensure the encoding of the string you are attempting to use is
set correctly
I have tried the approach from "How to convert base64 string to PNG using Prawn without saving on server in Rails", but it gives me this error:
"\xFF" from ASCII-8BIT to UTF-8
Can anyone point out what I am missing?
The answer is to decode the Base64 encoded string and either send it directly or save it directly to disk (naming it as a PDF file, but without using prawn).
The decoded string is a binary representation of the PDF file data, so there's no need to use Prawn or to re-calculate the content of the PDF data.
raw_pdf_str = Base64.decode64 encoded_string
render :text, raw_pdf_str # <= this isn't the correct rendering pattern, but it's good enough as an example.
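The round trip described above can be sketched without Prawn or Rails. The sample bytes are a stand-in for real PDF data (an assumption for illustration); in the question they would come from my_pdf.render:

```ruby
require 'base64'

# Stand-in for real PDF data (assumption: any binary string behaves the same)
original_pdf = "%PDF-1.3\n\xFF\xFE binary payload\n%%EOF\n".b

encoded_string = Base64.encode64(original_pdf)    # plain text, safe to store in the DB
raw_pdf_str    = Base64.decode64(encoded_string)  # binary string again

puts raw_pdf_str == original_pdf   # true: the bytes survive the round trip
puts raw_pdf_str.encoding          # binary, which is why Prawn's text method rejects it

# Writing the bytes in binary mode would produce the PDF file itself:
# File.binwrite('gagan.pdf', raw_pdf_str)
```

Because the decoded string already is the file content, passing it to pdf.text treats binary data as text, which is where the encoding errors in the question come from.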
To clarify some of the information given in the comments:
It's possible to send the string as an attachment without saving it to disk, either using render text: raw_pdf_str or the #send_data method (these are 4.x API versions, I don't remember the 5.x API style).
It's possible to encode the string (from the Prawn object) without saving the rendered PDF data to a file (save it to a String object instead). i.e.:
encoded_string = Base64.encode64(my_pdf.render)
The String data can be used directly as an email attachment, similar to the pattern provided here, only using the String directly instead of reading any data from a file, i.e.:
# inside a method in the Mailer class
attachments['my_pdf.pdf'] = { :mime_type => 'application/pdf',
:content => raw_pdf_str }

How do I create a checksum of carrierwave upload to verify the download?

How do I create a checksum (MD5, sha512, whatever) of a file when I upload it, so that when I download (using cache_stored_file!), I can verify that it is indeed the original file that was uploaded?
The Ruby Digest module can help with this.
One solution would be to read the file on upload and assign it a unique digest with a before_create callback. I would add it as a column on the file table in your database.
Here's some output from IRB to show how it would work:
2.2.2 :001 > require 'digest'
=> true
2.2.2 :002 > f = File.read('test.rb')
=> "Original content\n"
2.2.2 :003 > Digest::SHA256.hexdigest(f)
=> "646722e7ee99e28d618142b9d3a1bfcbe2196d8332ae632cc867ae5d1c8c57b5"
# (... file modified ...)
2.2.2 :004 > f = File.read('test.rb')
=> "Original content with more content\n"
2.2.2 :005 > Digest::SHA256.hexdigest(f)
=> "c29f2f77c0777a78dbdf119bf0a58b470c098635dfc8279542e4c49d6f20e62c"
You can use this digest in your download method to check the integrity of the file. If you read the file again, produce a digest, and it matches the original digest, you can be confident the file hasn't been altered since it was uploaded.
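A minimal sketch of that upload/download check, using a temporary file in place of a real CarrierWave upload. The helper names are hypothetical, and storing the digest in a database column is only indicated by a comment:

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helpers; the names are illustrative only.
def file_checksum(path)
  # Digest::SHA256.file reads the file in chunks instead of loading it whole
  Digest::SHA256.file(path).hexdigest
end

def unchanged?(path, stored_checksum)
  file_checksum(path) == stored_checksum
end

intact_before, intact_after = Tempfile.create('upload') do |file|
  file.write("Original content\n")
  file.flush
  stored = file_checksum(file.path)    # would be saved in a DB column on upload

  before = unchanged?(file.path, stored)
  file.write("with more content\n")    # simulate the file being altered
  file.flush
  after = unchanged?(file.path, stored)

  [before, after]
end

puts intact_before  # true
puts intact_after   # false
```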
Ruby Digest Module
md5 = Digest::MD5.file('path_to_file').hexdigest
This reads the file in blocks, which avoids loading the whole file into RAM.
For a SHA checksum:
Digest::SHA2.hexdigest(File.read("/path/to/my/file.txt"))
=> "fa5880ac744f3c05c649a864739530ac387c8c8b0231a7d008c27f0f6a2753c7"
More details on SHA checksum generation: SHA Checksum

How to convert ascii-8bit file data to readable string

I'm trying to parse the contents of a CSV file (saved on Windows Excel then uploaded to dropbox) from my dropbox via the Dropbox Core api.
I created a rake task (part of a Rails app) with the following code. It creates a magnum-opus.csv file on my local hard drive that contains the original text from the Excel file. Calling contents.encoding reports that the encoding is ASCII-8BIT.
contents, metadata = client.get_file_and_metadata('/magnum-opus.csv')
open('magnum-opus.csv', 'w') {|f| f.puts contents }
Instead of creating a local file, I'd like to convert the binary data in "contents" to readable text on the fly and parse through it. I don't want to save it anywhere and then have to open it.
How do I go about doing that?
If I do
p contents
I end up with some kind of unreadable data, e.g. ... \x00e\x00d\x00u
1) How do I convert this into a string I can parse with Ruby?
2) The other thing I'm wondering about: if I do
puts contents
the original, human-readable text of the CSV file is written to STDOUT. What is puts doing?
I tried:
Calling CSV.parse on contents.encode( "UTF-8", "binary", :invalid => :replace, :undef => :replace, :replace => ''), but I end up getting an error such as:
CSV::MalformedCSVError: Unquoted fields do not allow \r or \n
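The \x00e\x00d\x00u pattern in the p output suggests the bytes may be UTF-16 text, which Excel can produce; this is an assumption, not something the post confirms. If so, puts would simply write the same bytes to the terminal, which typically does not display the NUL bytes, so the text looks readable there. A minimal sketch of converting such bytes to UTF-8 in memory before parsing (the helper name is hypothetical, and little-endian is assumed when no byte order mark is present):

```ruby
require 'csv'

# Hypothetical helper: converts UTF-16 bytes to a UTF-8 string.
# Assumption: the data is UTF-16; little-endian is assumed without a BOM.
def utf16_bytes_to_utf8(bytes)
  raw = bytes.b  # binary copy, leaves the caller's string untouched
  if raw.start_with?("\xFF\xFE".b)       # little-endian byte order mark
    raw.byteslice(2..-1).force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
  elsif raw.start_with?("\xFE\xFF".b)    # big-endian byte order mark
    raw.byteslice(2..-1).force_encoding(Encoding::UTF_16BE).encode(Encoding::UTF_8)
  else
    raw.force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
  end
end

# Stand-in for `contents` as returned by the Dropbox call
contents = "\uFEFFname,edu\r\nAnn,MIT\r\n".encode(Encoding::UTF_16LE).b

rows = CSV.parse(utf16_bytes_to_utf8(contents))
p rows
```

Nothing is written to disk: the binary string is relabeled with its real encoding, transcoded, and handed straight to CSV.parse.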

Soft signs in rails console

I want to create multiple categories via the console, and I want to be able to use soft signs in their names. At the moment I can't do that.
It's very important to the project that I can save category names with soft signs.
Can somebody tip me off on where to look? I searched for "soft signs rails",
but there wasn't any useful resource.
Soft signs in my native language look like this:
Ā, Š, Ē, Ž, with the symbol called a soft sign above the character.
At the moment, when I try to save a new category record, it shows me this kind of error:
NoMethodError: undefined method `cache_ancestry!' for #
But I am sure that I didn't change anything in the models or controllers :(
What version of Ruby is this? What you're seeing there are either US-ASCII strings with UTF-8 data in them (Ruby 1.9) or byte arrays (Ruby 1.8).
If you're using Ruby 1.8, you may need to use Iconv to convert your encoding from US-ASCII to UTF-8. If you're using Ruby 1.9, then make sure you're creating UTF-8 strings and it should work just fine.
Note that those escape sequences are correct - that is the literal byte array of those characters, assuming the proper encoding is applied, so you may not need to actually change anything. If the bytes are right, everything's fine - you're just seeing ruby interpret the string as ASCII rather than UTF-8 or whatnot.
In Ruby 1.8, when you #inspect a string, you get the escaped version, but putsing it will show you the actual string:
1.8.7 :021 > s = "Komunālās mašīnas"
=> "Komun\304\201l\304\201s ma\305\241\304\253nas"
1.8.7 :022 > puts s
Komunālās mašīnas
In 1.9, you get the correct display all around, so long as your encoding is right:
1.9.3p327 :001 > s = "Komunālās mašīnas"
=> "Komunālās mašīnas"
1.9.3p327 :004 > s.force_encoding "US-ASCII"
=> "Komun\xC4\x81l\xC4\x81s ma\xC5\xA1\xC4\xABnas"
1.9.3p327 :005 > puts s
Komunālās mašīnas
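The point that only the interpretation changes, not the bytes, can be checked directly in Ruby 1.9 or later. This is a small sketch; the string is written with \u escapes so that it does not depend on the source file encoding:

```ruby
# "Komunalas masinas" with the Latvian diacritics, written as \u escapes
s = "Komun\u0101l\u0101s ma\u0161\u012Bnas"

ascii_view = s.dup.force_encoding("US-ASCII")

puts ascii_view.bytes == s.bytes               # true: force_encoding only relabels the bytes
puts ascii_view.valid_encoding?                # false: bytes above 0x7F are not valid US-ASCII
puts ascii_view.force_encoding("UTF-8") == s   # true: relabeling back restores the string
```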
Check this out Edgars:
#encoding: UTF-8
fallback = {
'Š'=>'S', 'š'=>'s', 'Ð'=>'Dj','Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss','à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a',
'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'ƒ'=>'f'
}
p t.encode('us-ascii', :fallback => fallback)
See Ruby 1.9.x replace sets of characters with specific cleaned up characters in a string
To get all the characters for your language you will need to add them as desired to the fallback hash. When I run "Komunālās mašīnas" as the variable 't' I get this:
t = "Komunālās mašīnas"
t.encode('us-ascii', :fallback => fallback)
Encoding::UndefinedConversionError: U+0101 from UTF-8 to US-ASCII
You can tell from this where the problem lies by googling U+0101, which shows that it is the character 'ā' (Latin small letter a with macron).
So now you know which letter is not working and you can add it to the fallback hash like so:
fallback = { OTHER DEFINITIONS , 'ā'=>'a'}
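As a minimal sketch of that step, here is a fallback hash that covers only the three non-ASCII characters in the example string. The constant and method names are hypothetical, and the subset is an assumption for illustration; a real hash would list every character the language needs:

```ruby
# Illustrative subset only: extend it with every character that is needed.
# Keys are written as \u escapes so the example does not depend on the
# source file encoding.
LATVIAN_FALLBACK = {
  "\u0101" => 'a',  # a with macron
  "\u0161" => 's',  # s with caron
  "\u012B" => 'i'   # i with macron
}.freeze

def to_ascii(text, fallback)
  # Characters missing from the hash still raise
  # Encoding::UndefinedConversionError, which names the code point to add.
  text.encode('US-ASCII', fallback: fallback)
end

t = "Komun\u0101l\u0101s ma\u0161\u012Bnas"
puts to_ascii(t, LATVIAN_FALLBACK)   # Komunalas masinas
```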

Providing Content-MD5 header through paperclip to S3

I'm using Paperclip to upload files directly to s3 for my rails web app.
I'm currently trying to exploit the md5 check integrated in amazon s3 to verify that the upload was carried out successfully. Paperclip offers a s3_headers hash that you can populate with whatever fields you need. Content-Type is automatically filled. Content-MD5 needs to be Base64 encoded so I provide it this way:
:s3_headers => {:content_md5 => Base64.strict_encode64(md5sum)},
I use strict_encode64 because encode64 adds an unnecessary trailing \n.
With this setup I always receive an InvalidDigest error from aws-sdk, even though paperclip correctly shows the calculated header. I also tried to use plain, unencoded md5sum, with similar results.
If md5sum is a string of hex digits, like the std output from the Linux application md5sum, try this:
:s3_headers => {:content_md5 => [[md5sum].pack("H*")].pack("m0") }
For example, from the rails console:
> md5sum = "7d592a3129ab6a867cf6e2eb60f9ef83"
> [[md5sum].pack("H*")].pack("m0")
=> "fVkqMSmraoZ89uLrYPnvgw=="
Take the MD5 of your source as a hex string, convert each pair of hex characters into one raw byte (so 2 characters become 1 byte), then Base64-encode the result and you will be fine.
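The conversion can be checked against Ruby's own Digest library, which can also produce the Base64 form directly from the data. This is a minimal sketch: the helper name and the sample body are hypothetical, and nothing here talks to Paperclip or S3:

```ruby
require 'digest'

# Hypothetical helper: hex MD5 string -> Base64 of the raw 16-byte digest
def hex_md5_to_base64(hex)
  [[hex].pack('H*')].pack('m0')
end

body = "example upload body"   # stand-in for the uploaded file's contents

from_hex = hex_md5_to_base64(Digest::MD5.hexdigest(body))
direct   = Digest::MD5.base64digest(body)

puts from_hex == direct   # true: both routes encode the same raw digest
puts hex_md5_to_base64("7d592a3129ab6a867cf6e2eb60f9ef83")   # fVkqMSmraoZ89uLrYPnvgw==
```

Base64-encoding the hex string itself, rather than the raw bytes it represents, yields a different and longer value, which is one plausible cause of the InvalidDigest error.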
