How to translate a website in all languages? - ruby-on-rails

I'm trying to translate my website into all languages supported by Google Translate.
I'm using Ruby on Rails 6 and want to implement this as an I18n translation backend, though the problem is not specific to Ruby or Ruby on Rails.
When I had to support 6 languages I would correct the mistakes myself, but I can't do that for every language.
I tried different things, but my latest strategy has been storing everything in the database:
class ActiveRecordBackend
  include I18n::Backend::Base
  include I18n::Backend::Transliterator
  SEPARATE_INTERPOLATIONS = /(?<interpolation>%{[^}]+})|(?<text>[^%]+)/
  NETWORK_ERRORS = [SocketError, Errno::EHOSTUNREACH].freeze
  LOCALES_PATH = Rails.root.join("lib/data/locales.yml")
  LOCALES = YAML.safe_load(LOCALES_PATH.read).map(&:to_struct).sort_by(&:name)
  LOCALE_NAMES = LOCALES.map(&:locale).map(&:to_sym)
  def available_locales
    LOCALE_NAMES
  end
  def reload!
    @translations = nil
    self
  end
  def initialized?
    !@translations.nil?
  end
  def init_translations
    @translations = Translation.to_hash
  end
  def translations(do_init: false)
    init_translations if do_init || !initialized?
    @translations ||= {}
  end
  private
  def lookup(locale, key, _scope = [], _options = {})
    Translation.find_by(locale: locale, key: key)&.value ||
      store_translation(locale: locale, key: key)
  end
  def store_translation(locale:, key:)
    default = Translation.find_by(locale: I18n.default_locale, key: key)
    return unless default
    translated_value =
      easy_translate(default.value, from: I18n.default_locale, to: locale)
    return unless translated_value
    Translation.find_or_create_by(
      locale: locale,
      key: key,
      value: translated_value
    )
    translated_value
  end
  def easy_translate(original, from:, to:)
    original
      .scan(SEPARATE_INTERPOLATIONS)
      .map do |interpolation, text|
        next interpolation if interpolation
        spaces_before = text.scan(/\A */).first
        spaces_after = text.scan(/ *\z/).first
        translated_text =
          EasyTranslate.translate(text, from: from, to: to).strip
        "#{spaces_before}#{translated_text}#{spaces_after}"
      end
      .join
  rescue *NETWORK_ERRORS, EasyTranslate::EasyTranslateException
    nil
  end
end
But I get things like
"<b>7976membri attivi tra cui1in linea<br>562attività con5945partecipazioni"
for Italian
instead of:
"<b>7976</b> membres actifs dont <b>1</b> en ligne <br><b>562</b> activités avec <b>5945</b> participations"
for French
And I also don't handle returning a group of translations like t(".js").
How would you do it?
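For the grouped lookup such as t(".js"), one possible approach is to rebuild a nested hash from every stored key under the requested prefix. This is only a minimal sketch, assuming translations are stored under flat dotted keys; `subtree` and the sample rows are hypothetical and not part of I18n:

```ruby
# Hypothetical helper: rebuilds a nested hash from flat, dotted keys.
# `rows` maps full keys to values, as they might come out of the database.
def subtree(rows, prefix)
  rows.each_with_object({}) do |(key, value), tree|
    next unless key.start_with?("#{prefix}.")

    # Split the remainder of the key into parent segments and a leaf.
    *parents, leaf = key.delete_prefix("#{prefix}.").split(".")
    node = parents.reduce(tree) { |hash, part| hash[part.to_sym] ||= {} }
    node[leaf.to_sym] = value
  end
end

rows = {
  "js.confirm" => "Are you sure?",
  "js.errors.network" => "Network error",
  "home.title" => "Welcome"
}
p subtree(rows, "js")
```

In a backend, `lookup` could fall back to something like this when no single record matches the key exactly.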

How would you do it?
I wouldn't do it.
If your website only natively supports a few languages (e.g. English) and a user wants to view it in an unsupported language (e.g. Italian), then let the user apply Google Translation themselves.
There's a very popular plugin to do this. But, like you found already, it won't always give perfect results: Sometimes it can mess up your page layout, in addition to just giving sub-optimal translations due to mis-interpreted context.
If you discover a magic way to accurately apply website translations in the backend to all possible languages and contexts, without breaking the UI, then congratulations -- you'll soon be incredibly wealthy.

Agree with @GiacomoCatenazzi, it looks very unprofessional to have obvious spelling mistakes. If you have to translate the page, I recommend you use I18n and do it manually.
If you feel like you have to use GT, I would do something like this:
1. Create the I18n files for each language.
2. Only manually populate the English one.
3. Create a class which reads the English version of the I18n file into a hash.
4. Loop through all the files you want to populate. Wherever a key does not exist, use the GT API to translate the English value and write it to that file.
5. Create a cron job and run the class every day.
You can reduce the number of requests in step 4 as much as you want; with some dedication it should be possible to limit them to the number of languages you support.
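Steps 3 and 4 could be sketched roughly as follows. This is a minimal illustration under stated assumptions: `fill_missing` is a hypothetical helper, and the translator is a stand-in lambda so the example runs offline; in practice it would wrap the real translation API call.

```ruby
require "yaml"

# Hypothetical helper: returns a copy of `target` in which every key that
# exists in `source` but is missing from `target` is filled in by calling
# `translator` (any callable that takes a string and returns a string).
def fill_missing(source, target, translator)
  source.each_with_object(target.dup) do |(key, value), result|
    if value.is_a?(Hash)
      result[key] = fill_missing(value, result[key] || {}, translator)
    elsif !result.key?(key)
      result[key] = translator.call(value)
    end
  end
end

english = YAML.safe_load(<<~YML)
  greeting: Hello
  menu:
    home: Home
    about: About
YML
italian = { "greeting" => "Ciao", "menu" => { "home" => "Home page" } }

# Stand-in translator (assumption): tags the text instead of calling an API.
fake_translator = ->(text) { "[it] #{text}" }
filled = fill_missing(english, italian, fake_translator)
puts filled.to_yaml
```

Existing translations are left alone, so manual corrections survive the daily run. Collecting all missing values per language and sending them in one batch, if the API allows it, would be one way to get down to one request per language.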

I found a solution:
def easy_translate(original, from:, to:)
interpolations_in_original = original.scan(INTERPOLATION)
spaces_before = original.scan(/\A */).first
spaces_after = original.scan(/ *\z/).first
translated_text = EasyTranslate.translate(original, from: from, to: to).strip
translated_text = translated_text.gsub("% {", "%{")
bad_interpolations = translated_text.scan(INTERPOLATION)
interpolations_in_original.size.times do |index|
translated_text.gsub!(bad_interpolations[index], interpolations_in_original[index])
end
"#{spaces_before}#{translated_text}#{spaces_after}"
rescue *NETWORK_ERRORS, EasyTranslate::EasyTranslateException
nil
end
This can obviously be improved, for example by replacing each interpolation with its corresponding original one and by preserving capitalization.
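Replacing each interpolation with its counterpart from the original could look roughly like the sketch below. The pattern is an assumption (the INTERPOLATION constant is not shown above), `restore_interpolations` is a hypothetical name, and the approach only works if the translator keeps the placeholders in the same order:

```ruby
# Assumed pattern: matches "%{name}" and the "% {name}" form that
# machine translation sometimes produces.
INTERPOLATION_SKETCH = /% ?\{[^}]+\}/

# Replaces the n-th placeholder in `translated` with the n-th placeholder
# from `original`. Extra placeholders in the translation are left as-is.
def restore_interpolations(original, translated)
  originals = original.scan(INTERPOLATION_SKETCH)
  index = -1
  translated.gsub(INTERPOLATION_SKETCH) do |match|
    index += 1
    originals[index] || match
  end
end

original   = "%{count} active members, %{online} online"
translated = "% {conteggio} membri attivi, %{in linea} in linea"
puts restore_interpolations(original, translated)
```

Unlike calling gsub! once per bad interpolation, the block form walks the placeholders by position, so two identical mistranslated names cannot be mixed up.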

Related

rails - How to refactor this method

I got a little bit of a challenge but I don't know where to start. Long story short, I'm trying to make a method that will automatically translate a record from its model via Deepl or Google Translate.
I've got something working but I want to refactor it so it gets more versatile:
def translate
texts = [self.title_fr, self.desc_fr, self.descRequirements_fr, self.descTarget_fr, self.descMeet_fr, self.descAdditional_fr]
translations = DeepL.translate texts, 'FR', 'EN'
self.update(title_en: translations[0], descRequirements_en: translations[2], descTarget_en: translations[3], descMeet_en: translations[4], descAdditional_en: translations[5])
end
Hopefully this is self explanatory.
I would love to have a method/concern that works like this:
def deeplTranslate(record, attributes)
# Code to figure out
end
and use it like this: deeplTranslate(post, ['title', 'desc', 'attribute3']). That would translate the attributes and save the translated values to the database in the en language.
Thanks in advance to anyone that can point me to a valid direction to go towards.
Okay, I actually managed to create an auto translate method for active record :
def deeplTranslate(record, attributes, originLang, newLang)
keys = attributes.map{|a| record.instance_eval(a + "_#{originLang}")}
translations = DeepL.translate keys, originLang, newLang
new_attributes = Hash.new
attributes.each_with_index do |a, i|
new_attributes[a + "_#{newLang}"] = translations[i].text
end
record.update(new_attributes)
end
Maybe it can get cleaner... But it's working : )
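One way it might get cleaner is to read the attributes with public_send instead of instance_eval and to build the new attribute hash with zip. A minimal sketch, assuming an injected translator; `translate_attributes`, the `Post` struct and the upcasing lambda are hypothetical stand-ins for the model and the DeepL call:

```ruby
# Hypothetical variant: `translator` is any callable that takes an array of
# strings plus source/target languages and returns the translated strings
# in the same order (a stand-in for the DeepL call).
def translate_attributes(record, attributes, from:, to:, translator:)
  sources = attributes.map { |name| record.public_send("#{name}_#{from}") }
  translations = translator.call(sources, from, to)

  # Pair each attribute name with its translation: { "title_en" => "..." }
  attributes.zip(translations).to_h do |name, text|
    ["#{name}_#{to}", text]
  end
end

# Plain struct standing in for an ActiveRecord model.
Post = Struct.new(:title_fr, :desc_fr, :title_en, :desc_en, keyword_init: true)
post = Post.new(title_fr: "Bonjour", desc_fr: "Un texte")

# Stand-in translator (assumption): upcases instead of translating.
upcase_translator = ->(texts, _from, _to) { texts.map(&:upcase) }
new_attributes = translate_attributes(
  post, %w[title desc], from: "fr", to: "en", translator: upcase_translator
)
p new_attributes
```

With a real model, the returned hash could be passed to record.update as in the version above.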

I18n translation with i18n-active_record: same form for same key

I am working on an app in Rails 4 using i18n-active_record 0.1.0 to keep my translations in the database rather than in a .yml-file. It works fine.
One thing that I am struggling with, however, is that each translation record is one record per locale, i.e.
#1. { locale: "en", key: "hello", value: "hello" }
#2. { locale: "se", key: "hello", value: "hej" }
which makes updating them a tedious effort. I would like instead to have it as one, i.e.:
{ key: "hello", value_en: "hello", value_se: "hej" }
or similar in order to update all instances of one key in one form. I can't seem to find anything about that, which puzzles me.
Is there any way to easily do this? Any type of hacks would be ok as well.
You could make an ActiveRecord object for the translation table, and then create read and write functions on that model.
Read function would pull all associated records then combine them into a single hash.
Write function would take your single hash input and split them into multiple records for writing/updating.
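The read and write functions described above could be sketched like this. It works on plain hashes shaped like translation rows rather than on real ActiveRecord objects, and `combine` and `split` are hypothetical names:

```ruby
# Read: merge all rows for one key into { key:, value_en:, value_se:, ... }.
def combine(rows, key)
  rows.select { |row| row[:key] == key }
      .each_with_object({ key: key }) do |row, combined|
        combined[:"value_#{row[:locale]}"] = row[:value]
      end
end

# Write: split the combined hash back into one row per locale.
def split(combined)
  combined.reject { |name, _| name == :key }.map do |name, value|
    { locale: name.to_s.delete_prefix("value_"), key: combined[:key], value: value }
  end
end

rows = [
  { locale: "en", key: "hello", value: "hello" },
  { locale: "se", key: "hello", value: "hej" },
  { locale: "en", key: "bye", value: "goodbye" }
]

combined = combine(rows, "hello")
p combined
p split(combined)
```

In the model, the write side would then find or create one record per returned row.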
I ended up building my own translation functionality with Globalize. It does not rely on I18n, so it is a parallel system rather than a replacement for it, and it is not pretty. It works, though, and it has the important feature of making it easy to add a locale and handle all translations in one form.
Translation model with key:string
In Translation model:
translates :value
globalize_accessors :locales => I18n.available_locales, :attributes => [:value]
In ApplicationHelper:
def t2(key_str)
key_stringified = key_str.to_s.gsub(":", "")
t = Transl8er.find_by_key(key_stringified)
if t.blank?
# Translation missing
if t.is_a? String
return_string = "Translation missing for #{key_str}"
else
return_string = key_str
end
else
begin
return_string = t.value.strip
rescue
return_string = t.value
end
end
return_string
end

Formatting dates coming from params

In Grails, if I define a locale and put a date format in the i18n file (e.g. dd/mm/yyyy), and then make a request like:
http://myapp/myaction?object.date=10/12/2013
then when I print params.date, I get a date object.
How can I do the same in Rails?
Normally Rails handles this for you. For instance, the form helper datetime_select works in conjunction with some ActiveRecord magic to ensure that time/date types survive the round trip. There are various alternatives to the standard date pickers.
If this doesn't work for you, e.g. because Rails isn't generating the forms, there are (at least) a couple of options.
One option, slightly evil, is to monkey-patch HashWithIndifferentAccess (used by request params) to do type conversions based on the key name. It could look something like:
module AddTypedKeys
def [](key)
key?(key) ? super : find_candidate(key.to_s)
end
private
# look for key with a type extension
def find_candidate(key)
keys.each do |k|
name, type = k.split('.', 2)
return typify_param(self[k], type) if name == key
end
nil
end
def typify_param(value, type)
case type
when 'date'
value.to_date rescue nil
else
value
end
end
end
HashWithIndifferentAccess.send(:include, AddTypedKeys)
This will extend params[] in the way you describe. To use it within Rails, you can drop it into an initializer, e.g. config/initializers/typed_params.rb
To see it working, you can test with
params = HashWithIndifferentAccess.new({'a' => 'hello', 'b.date' => '10/1/2013', 'c.date' => 'bob'})
puts params['b.date'] # returns string
puts params['b'] # returns timestamp
puts params['a'] # returns string
puts params['c'] # nil (invalid date parsed)
However... I'm not sure it's worth the effort, and it will likely not work with Rails 4 / StrongParameters.
A better solution would be using virtual attributes in your models. See this SO post for a really good example using chronic.
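A virtual attribute for this might look like the sketch below. It uses a plain Ruby class and Date.strptime instead of an ActiveRecord model and chronic, and the `Event` class, the `date_text` name and the day/month/year format are all assumptions for illustration:

```ruby
require "date"

# Plain class standing in for a model with a `date` column.
class Event
  attr_reader :date

  DATE_FORMAT = "%d/%m/%Y" # assumed locale format (day/month/year)

  # Virtual attribute reader: the date formatted for display in a form.
  def date_text
    date&.strftime(DATE_FORMAT)
  end

  # Virtual attribute writer: parses the submitted string into a Date.
  # Unparseable input leaves the date as nil.
  def date_text=(text)
    @date = Date.strptime(text, DATE_FORMAT)
  rescue ArgumentError, TypeError
    @date = nil
  end
end

event = Event.new
event.date_text = "10/12/2013"
puts event.date      # ISO 8601 form of 10 December 2013
puts event.date_text
```

The form would then submit date_text rather than date, so the controller never has to deal with the raw string format.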

How to disable ActiveRecord logging for a certain column?

I'm running into a problem which, in my opinion, must be a problem for most rails users but I could not find any solution for it yet.
When, for instance, performing a file upload of a potentially large, binary file and storing it in the database, you most certainly don't want rails or ActiveRecord to log this specific field in development mode (log file, stdout). In case of a fairly big file, this causes the query execution to break and almost kills my terminal.
Is there any reliable and non-hacky method of disabling logging for particular fields? Remember, I'm not talking about disabling logging for request parameters - this has been solved quite nicely.
Thanks for any information on that!
If this helps anyone, here is a Rails 4.1 compatible version of the snippet above that also includes redaction of non-binary bind params (e.g. a text or json column), and increases the logging to 100 char before redaction. Thanks for everyone's help here!
class ActiveRecord::ConnectionAdapters::AbstractAdapter
protected
def log_with_binary_truncate(sql, name="SQL", binds=[], statement_name = nil, &block)
binds = binds.map do |col, data|
if data.is_a?(String) && data.size > 100
data = "#{data[0,10]} [REDACTED #{data.size - 20} bytes] #{data[-10,10]}"
end
[col, data]
end
sql = sql.gsub(/(?<='\\x[0-9a-f]{100})[0-9a-f]{100,}?(?=[0-9a-f]{100}')/) do |match|
"[REDACTED #{match.size} chars]"
end
log_without_binary_truncate(sql, name, binds, statement_name, &block)
end
alias_method_chain :log, :binary_truncate
end
Create a file in config/initializers which modifies ActiveRecord::ConnectionAdapters::AbstractAdapter like so:
class ActiveRecord::ConnectionAdapters::AbstractAdapter
protected
def log_with_trunkate(sql, name="SQL", binds=[], &block)
b = binds.map {|k,v|
v = v.truncate(20) if v.is_a? String and v.size > 20
[k,v]
}
log_without_trunkate(sql, name, b, &block)
end
alias_method_chain :log, :trunkate
end
This will truncate all fields that are longer than 20 chars in the output log.
NOTE: Works with rails 3, but apparently not 4 (which was not released when this question was answered)
In your application.rb file:
config.filter_parameters << :parameter_name
This will remove that attribute from displaying in your logs, replacing it with [FILTERED]
The common use case for filtering parameters is of course passwords, but I see no reason it shouldn't work with your binary file field.
Here's an implementation of the approach suggested by @Patrik that works for both inserts and updates against PostgreSQL. The regex may need to be tweaked depending upon the formatting of the SQL for other databases.
class ActiveRecord::ConnectionAdapters::AbstractAdapter
protected
def log_with_binary_truncate(sql, name="SQL", binds=[], &block)
binds = binds.map do |col, data|
if col.type == :binary && data.is_a?(String) && data.size > 27
data = "#{data[0,10]}[REDACTED #{data.size - 20} bytes]#{data[-10,10]}"
end
[col, data]
end
sql = sql.gsub(/(?<='\\x[0-9a-f]{20})[0-9a-f]{20,}?(?=[0-9a-f]{20}')/) do |match|
"[REDACTED #{match.size} chars]"
end
log_without_binary_truncate(sql, name, binds, &block)
end
alias_method_chain :log, :binary_truncate
end
I'm not deliriously happy with it, but it's good enough for now. It preserves the first and last 10 bytes of the binary string and indicates how many bytes/chars were removed out of the middle. It doesn't redact unless the redacted text is longer than the replacing text (i.e. if there aren't at least 20 chars to remove, then "[REDACTED xx chars]" would be longer than the replaced text, so there's no point). I did not do performance testing to determine whether using greedy or lazy repetition for the redacted chunk was faster. My instinct was to go lazy, so I did, but it's possible that greedy would be faster especially if there is only one binary field in the SQL.
In rails 5 you could put it in initializer:
module SqlLogFilter
FILTERS = Set.new(%w(geo_data value timeline))
def render_bind(attribute)
return [attribute.name, '<filtered>'] if FILTERS.include?(attribute.name)
super
end
end
ActiveRecord::LogSubscriber.prepend SqlLogFilter
This filters the attributes geo_data, value and timeline, for instance.
Here is a Rails 5 version. Out of the box Rails 5 truncates binary data, but not long text columns.
module LogTruncater
def render_bind(attribute)
num_chars = Integer(ENV['ACTIVERECORD_SQL_LOG_MAX_VALUE']) rescue 120
half_num_chars = num_chars / 2
value = if attribute.type.binary? && attribute.value
if attribute.value.is_a?(Hash)
"<#{attribute.value_for_database.to_s.bytesize} bytes of binary data>"
else
"<#{attribute.value.bytesize} bytes of binary data>"
end
else
attribute.value_for_database
end
if value.is_a?(String) && value.size > num_chars
value = "#{value[0,half_num_chars]} [REDACTED #{value.size - num_chars} chars] #{value[-half_num_chars,half_num_chars]}"
end
[attribute.name, value]
end
end
class ActiveRecord::LogSubscriber
prepend LogTruncater
end
I didn't find much on this either, though one thing you could do is
ActiveRecord::Base.logger = nil
to disable logging entirely, though you would probably not want to do that. A better solution might be to set the ActiveRecord logger to some custom subclass that doesn't log messages over a certain size, or does something smarter to parse out specific sections of a message that are too large.
This doesn't seem ideal, but it does seem like a workable solution, though I haven't looked at specific implementation details. I would be really interested to hear any better solutions.
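As a rough illustration of the size-limit idea, the sketch below uses a custom formatter rather than a logger subclass, since a formatter sees every message just before it is written. `TruncatingFormatter` and its `max_chars` option are hypothetical names, and the example uses only Ruby's standard Logger, not Rails:

```ruby
require "logger"
require "stringio"

# Formatter that shortens overly long messages before they are written.
class TruncatingFormatter < Logger::Formatter
  def initialize(max_chars: 200)
    super()
    @max_chars = max_chars
  end

  def call(severity, time, progname, msg)
    super(severity, time, progname, shorten(msg))
  end

  private

  # Keeps the first @max_chars characters and notes how many were dropped.
  def shorten(msg)
    return msg unless msg.is_a?(String) && msg.size > @max_chars

    "#{msg[0, @max_chars]} [TRUNCATED #{msg.size - @max_chars} chars]"
  end
end

output = StringIO.new
logger = Logger.new(output)
logger.formatter = TruncatingFormatter.new(max_chars: 20)
logger.info("INSERT INTO files (data) VALUES ('#{'ab' * 500}')")
puts output.string
```

In a Rails app something similar could presumably be assigned to the ActiveRecord logger in an initializer, though that wiring is not shown here and would need checking against the Rails version in use.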
I encountered the same problem, but I couldn't figure out a clean solution to the problem. I ended up writing a custom formatter for the Rails logger that filters out the blob.
The code above needs to be placed in config/initializers, and replace file_data with the column you want to remove and file_name with the column that appears after in the regular expression.
Version for Rails 5.2+:
module LogTruncater
def render_bind(attr, value)
num_chars = Integer(ENV['ACTIVERECORD_SQL_LOG_MAX_VALUE']) rescue 120
half_num_chars = num_chars / 2
if attr.is_a?(Array)
attr = attr.first
elsif attr.type.binary? && attr.value
value = "<#{attr.value_for_database.to_s.bytesize} bytes of binary data>"
end
if value.is_a?(String) && value.size > num_chars
value = "#{value[0,half_num_chars]} [REDACTED #{value.size - num_chars} chars] #{value[-half_num_chars,half_num_chars]}"
end
[attr && attr.name, value]
end
end
class ActiveRecord::LogSubscriber
prepend LogTruncater
end
This is what works for me for Rails 6:
# initializers/scrub_logs.rb
module ActiveSupport
module TaggedLogging
module Formatter # :nodoc:
# Hide PlaygroundTemplate#yaml column from SQL queries because it's huge.
def scrub_yaml_source(input)
input.gsub(/\["yaml", ".*, \["/, '["yaml", "REDACTED"], ["')
end
alias orig_call call
def call(severity, timestamp, progname, msg)
orig_call(severity, timestamp, progname, scrub_yaml_source(msg))
end
end
end
end
Replace yaml with the name of your column.

Localizing Ruby alphabet

I'm working on I18N for a web application (Rails), and part of the app needs to display a select containing the alphabet for a selected locale. My question is, is there a way to get Ruby to handle this or do I need to go thru the Rails-provided I18N API?
This is the array I'm using for generating the select options:
'A'.upto('Z').to_a.concat(0.upto(9).to_a)
I need to translate that to Russian, Chinese & Arabic.
You need to create an HTML select, with all the letters of a particular alphabet?
That would theoretically work for Russian and Arabic, but Chinese doesn't have an 'alphabet'.
The writing system contains thousands of characters.
I think you need to implement this yourself. Afaik Rails i18n plugins don't provide this information.
A nice solution would be to create your own Range.
Example from the docs:
class Xs # represent a string of 'x's
  include Comparable
  attr :length
  def initialize(n)
    @length = n
  end
  def succ
    Xs.new(@length + 1)
  end
  def <=>(other)
    @length <=> other.length
  end
  def to_s
    sprintf "%2d #{inspect}", @length
  end
  def inspect
    'x' * @length
  end
end
r = Xs.new(3)..Xs.new(6) #=> xxx..xxxxxx
r.to_a #=> [xxx, xxxx, xxxxx, xxxxxx]
r.member?(Xs.new(5)) #=> true
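A simpler alternative to a custom Range, for alphabets that do map onto a contiguous Unicode block, is a per-locale lookup table. The sketch below covers only English and Russian; `ALPHABETS` and `alphabet_options` are hypothetical names, and other scripts such as Arabic would need their own hand-checked lists:

```ruby
# Basic Cyrillic capitals А..Я occupy U+0410..U+042F. Ё (U+0401) lies
# outside that block, so it is inserted after Е by hand.
RUSSIAN = (0x0410..0x042F).map { |cp| [cp].pack("U") }.insert(6, "Ё").freeze

ALPHABETS = {
  en: ("A".."Z").to_a.freeze,
  ru: RUSSIAN
}.freeze

# Returns the select options for a locale: its letters followed by digits.
# Raises KeyError for locales that have no alphabet defined.
def alphabet_options(locale)
  ALPHABETS.fetch(locale) + ("0".."9").to_a
end

puts alphabet_options(:ru).join(" ")
```

Raising for unknown locales keeps a missing alphabet from silently falling back to Latin letters.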
