Carrierwave: Use spaces instead of underscore - ruby-on-rails

We have an uploader for PDFs. When a file name has spaces in it, they are automatically converted to underscores:
some file test -> some_file_test
I'd like to keep the spaces. Can someone tell me how?
I tried:
def filename
  original_filename
end

You can override sanitize regexp by adding whitespace:
CarrierWave::SanitizedFile.sanitize_regexp = /[^[:word:]\.\-\+\ ]/
This regexp is used in the sanitize method, which replaces forbidden characters with underscores.
From CarrierWave documentation:
Filenames and unicode chars
Another security issue you should care for is the file names (see Ruby On Rails Security Guide). By default, CarrierWave provides only English letters, arabic numerals and some symbols as white-listed characters in the file name. If you want to support local scripts (Cyrillic letters, letters with diacritics and so on), you have to override sanitize_regexp method. It should return regular expression which would match all non-allowed symbols.
CarrierWave::SanitizedFile.sanitize_regexp = /[^[:word:]\.\-\+]/
Also make sure that allowing non-latin characters won't cause a compatibility issue with a third-party plugins or client-side software.
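To illustrate what the extra space in the character class changes, here is a minimal sketch in plain Ruby. It mimics only the substitution step described above (every character matched by the regexp becomes an underscore). The `sanitize_name` helper is hypothetical and is not CarrierWave's own method, which may do more than this.

```ruby
# Default pattern quoted in the documentation: matches every character that is NOT allowed.
DEFAULT_REGEXP = /[^[:word:]\.\-\+]/
# The same pattern with a space added to the allowed set.
WITH_SPACES_REGEXP = /[^[:word:]\.\-\+\ ]/

# Hypothetical helper: mimics only the "replace forbidden characters" step.
def sanitize_name(name, regexp)
  name.gsub(regexp, "_")
end

puts sanitize_name("some file test.pdf", DEFAULT_REGEXP)      # some_file_test.pdf
puts sanitize_name("some file test.pdf", WITH_SPACES_REGEXP)  # some file test.pdf
puts sanitize_name("what? ever.pdf", WITH_SPACES_REGEXP)      # what_ ever.pdf
```

Other characters outside the allowed set, such as the question mark in the last call, are still replaced.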

Try:
original_filename.gsub("_", " ")
UPDATE (possible workaround):
Before passing filenames to CarrierWave, replace underscores with a character or a string you are not expecting in filenames (e.g. "xyxyxyxyxyxyxyxyz"):
filename.gsub("_", "your_special_character/s")
Later, replace underscores with spaces, and then the special character/s with underscores:
original_filename.gsub("_", " ").gsub("your_special_character/s", "_")
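The round trip of this workaround could look roughly like the sketch below. It is plain Ruby with hypothetical helper names, and `simulate_sanitize` is only a stand-in that assumes the sanitizing step does nothing more than replace disallowed characters with underscores.

```ruby
# Placeholder string, assumed never to occur in a real filename.
PLACEHOLDER = "xyxyxyxyxyxyxyxyz"

# Step 1: hide genuine underscores before the name is sanitized.
def protect_underscores(name)
  name.gsub("_", PLACEHOLDER)
end

# Stand-in for the sanitizing step (assumption: default regexp substitution only).
def simulate_sanitize(name)
  name.gsub(/[^[:word:]\.\-\+]/, "_")
end

# Step 2: underscores now mark former spaces, the placeholder marks former underscores.
def restore_name(name)
  name.gsub("_", " ").gsub(PLACEHOLDER, "_")
end

stored = simulate_sanitize(protect_underscores("annual_report final draft.pdf"))
puts stored                # annualxyxyxyxyxyxyxyxyzreport_final_draft.pdf
puts restore_name(stored)  # annual_report final draft.pdf
```

Note that any other character the sanitizer replaced (not only spaces) also comes back as a space, so the restored name is not guaranteed to match the original.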

Related

String from filename showing %20 where spaces are in the name, unwanted

I'm bringing in images with names that contain spaces, English names and Chinese names, and using those filenames to create strings that are presented beneath their images.
This worked fine until I started putting Chinese into the image file names.
Now, unfortunately all spaces are coming in as
%20
and an example of Chinese looks like this:
%E7%99%BE%E9%A6%99
I have Chinese Simplified installed and set up in the internationalisation settings of the project in Xcode... but somehow I'm missing something. Which isn't the first time, and won't be the last.
What do I need to do to make Unicode work?
The path is URL-encoded (percent-encoded). If you URL-decode the path, the spaces should turn out fine.
For Swift 3:
substring.removingPercentEncoding
For Swift 2.3:
substring.stringByRemovingPercentEncoding

Rails: Is it a bad idea to put double-byte character inside a model?

I've learnt that you can declare a Ruby source file as UTF-8 so that you can type double-byte characters (e.g. ¤) directly in it instead of their HTML entities (e.g. &curren;):
# encoding: UTF-8
class Price < ActiveRecord::Base
  def currency_symbol
    '¤'
  end
end
Without the encoding statement, I would need to write '&curren;'.html_safe as the body of the method.
I don't like the latter because it assumes I'm writing HTML (my app has Excel output in addition to HTML).
My question is: are there any problems or performance hits I should be aware of when doing this?
Note: Ruby 2.0 brings UTF-8 as the default encoding; does it mean all Ruby files will automatically support all those characters?
Character chart: http://dev.w3.org/html5/html-author/charref
This is exactly the kind of thing that should go in the locales (config/locales). These are YAML files that define words and characters that will be used in the various parts of your application, including currency symbols. It also has the benefit of allowing you to easily introduce translations for other languages.
Take a look at the Ruby on Rails guide for i18n for more.
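As a rough illustration of the idea, the sketch below keeps the symbol in YAML rather than in the model. It uses only Ruby's standard YAML library so that it runs outside Rails. The `lookup` helper is hypothetical, and the YAML string stands in for a file such as config/locales/en.yml. In a Rails app the equivalent lookup would normally go through I18n.t.

```ruby
require "yaml"

# Stand-in for the contents of a locale file such as config/locales/en.yml.
LOCALE_YAML = <<~YAML
  en:
    number:
      currency:
        format:
          unit: "¤"
YAML

TRANSLATIONS = YAML.safe_load(LOCALE_YAML)

# Hypothetical helper: resolves a dotted key for a locale, similar in spirit to I18n.t.
def lookup(locale, key)
  TRANSLATIONS.dig(locale.to_s, *key.split("."))
end

puts lookup(:en, "number.currency.format.unit")  # ¤
```

Because the symbol is plain text in the locale file, the same value can be used for both HTML and Excel output.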

Incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string) on Heroku

I have a Rails application where I use regex-based rules to categorize transactions. In my seeds.rb, I create some categories and rules, then import transactions from a CSV file (also utf8-encoded) and allow them to be categorized. This process works fine on my development machine, but when I run it on Heroku, I get:
incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)
I am running the Cedar Stack, Rails 2.3.15. I have put
# encoding: utf-8
at the top of all my source files and I've set the encoding to utf-8 in my app config, so I'm not sure what else could be causing this problem. I'm wondering if it has something to do with the Heroku configuration.
The issue could be caused by invisible characters that your local operating system ignores, so the encoding is applied correctly there, whereas on Heroku those characters break the magic encoding comment at the top of the file and you end up with both ASCII-8BIT and UTF-8.
Since the file that is having issues contains the regex, it's probably your model class instead of seeds.rb.
There are many ways to view invisible characters in your file. In vi, just set the option :set list
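As an alternative to inspecting the file in an editor, a short Ruby sketch can list the bytes on the first line of a source file that are neither printable ASCII nor a tab (a UTF-8 byte order mark, for instance). The helper name `suspicious_leading_bytes` and the file path in the comment are hypothetical.

```ruby
# Hypothetical helper: returns [offset, hex] pairs for every byte on the first
# line that is neither printable ASCII (0x20..0x7E) nor a tab.
def suspicious_leading_bytes(source)
  first_line = source.b.lines.first.to_s.chomp
  found = []
  first_line.bytes.each_with_index do |byte, index|
    printable = byte.between?(0x20, 0x7E) || byte == 0x09
    found << [index, format("0x%02X", byte)] unless printable
  end
  found
end

clean    = "# encoding: utf-8\nclass Rule; end\n"
with_bom = "\uFEFF# encoding: utf-8\nclass Rule; end\n"

p suspicious_leading_bytes(clean)     # []
p suspicious_leading_bytes(with_bom)  # [[0, "0xEF"], [1, "0xBB"], [2, "0xBF"]]

# For a real file (path is only an example):
#   suspicious_leading_bytes(File.binread("app/models/rule.rb"))
```

An empty result means the first line contains only ordinary characters, so the cause is likely elsewhere.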

Given a Paperclip file, how to clean up the filename to be url friendly?

I want to remove non-UTF-8 characters, downcase the name, and remove spaces. Is there a built-in way in Rails to make filenames friendly and safe before Paperclip saves?

ruby on rails x charset

I'm having trouble dealing with charsets in a Ruby on Rails app, specifically in my templates. Text that comes from my database works fine, but characters such as ç or ~ that are written directly in my views are not displayed correctly.
I added a function like the following to my code, but it still doesn't work: the ç and ~ characters in my application.rhtml are still broken.
before_filter :configure_charsets

# Configure the charset to UTF-8
def configure_charsets
  headers["Content-Type"] = "text/html; charset=UTF-8"
end
I also added a meta http-equiv tag setting the charset to UTF-8 in the HTML, and the .htaccess parameter AddDefaultCharset UTF-8.
It's still not working. Any other tips?
Put this piece of code in your config (environment.rb)
Rails::Initializer.run do |config|
  config.action_controller.default_charset = "iso-8859-1"
end
This will do it.
Also, remove the default charset line, if any, in layouts/application.html
Is the text editor you're using to put the special characters into the file (either source or views) treating those characters as UTF-8? For example, if you're using TextMate, you can deliberately save a file as UTF-8. If for some reason you used a different encoding earlier (a default, perhaps), those UTF-8 characters might be getting transcoded at the code editing stage, so even if the rendering process is using UTF-8 throughout, it'll still not work.
Further, if you're using something from a shell, like vi, or whatever, is your terminal set up to accept UTF-8 as default? If you had it set to ISO-8859-1 or whatever, you'd get the same issue.
Is your application.rhtml file written in the correct character set? Make sure it's UTF-8, and not ISO-8859-1.
So if the contents of your file are UTF-8, and the output is being interpreted as UTF-8, something in between is changing the data. Can you give us the hex interpretation of the input bytes (anything non-ASCII will be at least two bytes in UTF-8) for one of your special characters, and the hex interpretation of the output byte or bytes? Perhaps we can figure out what the change is, and work back from there.
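One way to produce such a hex dump is sketched below in plain Ruby. The `hex_bytes` helper is hypothetical. The example compares ç encoded as UTF-8 and as ISO-8859-1, and shows what happens when UTF-8 bytes are wrongly read as ISO-8859-1 and re-encoded.

```ruby
# Hypothetical helper: renders the bytes of a string as space-separated hex.
def hex_bytes(text)
  text.bytes.map { |byte| format("%02x", byte) }.join(" ")
end

utf8   = "\u00E7"                    # the character ç as a UTF-8 string
latin1 = utf8.encode("ISO-8859-1")   # the same character in ISO-8859-1

puts hex_bytes(utf8)    # c3 a7
puts hex_bytes(latin1)  # e7

# UTF-8 bytes misread as ISO-8859-1, then converted to UTF-8 again:
misread = utf8.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts hex_bytes(misread) # c3 83 c2 a7
```

Seeing four bytes where two were expected is a typical sign that the text was transcoded once too often somewhere between the file and the browser.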
