Creating lorem ipsum translations for Rails I18n

I have a big Rails app that I've internationalized with I18n. It hasn't been translated into any language yet, but I've extracted 99% of the strings into YAML. They come from all over the place: views, models, JS.
There's that remaining 1%, and tracking it down is proving tricky.
I would like to take my existing YAML files and "translate" them into a new language, Lorem Ipsum. That is, to run the files through some processor that would produce valid I18n locale YAML with gibberish content.
I would then import these, switch to the test locale (I could call these es.yml, or whatever) and cruise through my app looking for broken formatting and English-language strings.
The only slight problem is... how do I produce this lorem ipsum file? I can't just lorem all of the quoted strings, because they contain I18n interpolation tokens in several formats.
Any ideas/advice would be appreciated.

Here, I'm overriding I18n.translate to return lorem ipsum of the expected text length. I haven't tested this, but in theory it would look something like this (translate is a module method, so the aliasing has to happen on I18n's singleton class):
module I18n
  class << self
    alias_method :translate_without_lorem_ipsum, :translate

    def translate_with_lorem_ipsum(*args)
      actual_text = translate_without_lorem_ipsum(*args)
      # Imaginary method returning gibberish text of the given length
      LoremIpsum.text_with_length(actual_text.length)
    end

    alias_method :translate, :translate_with_lorem_ipsum
  end
end
Since you're using Rails, you could replace the alias_method pair with alias_method_chain (or Module#prepend, which has since superseded it).
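An alternative closer to the original ask, actually transforming the locale files, can be sketched as below. The method names are mine and the word list is arbitrary; the important part is that %{...} interpolation tokens pass through untouched so the keys still render:

```ruby
require "yaml"

# Walk a locale hash and replace every word with lorem ipsum, leaving
# %{...} interpolation tokens intact.
LOREM_WORDS = %w[lorem ipsum dolor sit amet consectetur adipiscing elit]

def loremize(value)
  case value
  when Hash  then value.transform_values { |v| loremize(v) }
  when Array then value.map { |v| loremize(v) }
  when String
    i = -1
    # Split on interpolation tokens (the capture group keeps them in
    # the result), then lorem only the plain-text parts.
    value.split(/(%\{[^}]+\})/).map do |part|
      next part if part.start_with?("%{")
      part.gsub(/\w+/) { LOREM_WORDS[(i += 1) % LOREM_WORDS.size] }
    end.join
  else value
  end
end

# Usage (produces a gibberish "es" test locale from the real "en" one):
#   en = YAML.load_file("config/locales/en.yml")
#   File.write("config/locales/es.yml", { "es" => loremize(en["en"]) }.to_yaml)
```

This only handles the %{token} interpolation style; other token formats (e.g. sprintf-style %d) would need extra branches in the split regex.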

Related

Truncate Head in paragraph/label in iOS

When truncating a string that spans multiple lines in a label, setting lineBreakMode to byTruncatingHead only truncates the last line instead of the initial line. Is there any way to get around this?
So if the text is
"It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English. Many desktop publishing packages and web page editors now use Lorem Ipsum as their default model text, and a search for 'lorem ipsum' will uncover many web sites still in their infancy. Various versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like)."
and the numberOfLines is 2 then the expected outcome is
... versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like).
Instead of
It is a long established fact that a reader will be distracted by the readable content by accident,
and in the second line
... sometimes on purpose (injected humour and the like).

Inverse translation with rails-i18n

I've been happily using the built-in rails i18n support for translating strings to different languages, which works great. Recently though I've had a need for something that goes a bit beyond the default behaviour of this gem.
I'll call this "inverse translation" for lack of a better word. Basically the idea is that I have some string in some language, and I want to be able to call a method with another locale, and get back the string translated to that locale if a mapping exists in the locale strings.
For example, assume I have in config/locales/en.yml
en:
  hello: Hello World!
and in config/locales/ja.yml:
ja:
  hello: Konnichi wa!
then when I call this method l2l_translate ("locale to locale translate") while in the English locale, with the string and the locale as arguments, I get back the Japanese translation:
I18n.locale = :en
l2l_translate("Hello World!", :ja) #=> "Konnichi wa!"
Also, and this is more tricky, I want to be able to inverse match interpolated strings. So say I have:
config/locales/en.yml
en:
  minutes: "%d minutes"
config/locales/ja.yml
ja:
  minutes: "%d分"
Then I should be able to translate from English to Japanese like so:
l2l_translate("5 minutes", :ja) #=> "5分"
So basically the string should be matched with a regex to the English translation string, and the "5" pulled out and sent as an argument "%d" to the Japanese translation.
Obviously there are potential problems here, if: 1) there is no match, or 2) there are multiple matches. Those could be handled by raising an exception, for example, or by returning nil in the former case and an array of translations in the latter. In any case those are minor points.
My basic question is: does anything like this exist? And if not, does anyone have any suggestions on how to go about developing it (say as a gem)?
The application I'm specifically thinking of is an API wrapper for a service in Japanese. I want to be able to specify patterns in Japanese which can be matched and translated into other languages. The default i18n support won't do this, and I don't know of any other gems that will.
Any advice or suggestions would be much appreciated! For reference see also this discussion in 2010 on the topic of inverse translation with i18n-rails.
We use gettext, which is a standard unix i18n solution. For Rails, you can use gettext_i18n_rails. One caveat is that FastGettext, which gettext_i18n_rails is backed by, doesn't seem to have complete gettext support, and some advanced features such as pluralization didn't work as expected.
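As a starting point for building this as a gem, the matching step the question describes can be sketched against plain in-memory locale hashes. Everything here is illustrative, not an existing API: the LOCALES constant, the from:/to: signature, and the assumption that %d is the only interpolation marker:

```ruby
# Inverse translation sketch: match a string against the source
# locale's templates (treating %d as a number capture), then render
# the corresponding key in the target locale.
LOCALES = {
  en: { "hello" => "Hello World!", "minutes" => "%d minutes" },
  ja: { "hello" => "Konnichi wa!", "minutes" => "%d分" }
}.freeze

def l2l_translate(string, from:, to:)
  LOCALES.fetch(from).each do |key, template|
    # Turn "%d minutes" into /\A(\d+) minutes\z/ so the number is captured
    pattern = Regexp.new("\\A" + Regexp.escape(template).gsub("%d", "(\\d+)") + "\\z")
    next unless (m = pattern.match(string))
    target = LOCALES.fetch(to).fetch(key)
    return m.captures.empty? ? target : target.sub("%d", m[1])
  end
  nil # no match; the question suggests raising here instead
end

# l2l_translate("5 minutes", from: :en, to: :ja) # => "5分"
```

A real implementation would pull the hashes from I18n's backend and handle the ambiguity cases (no match, multiple matches) the question mentions; this only shows the regex-inversion idea.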

Cut text block in two parts, based on finding a tag wihin the text

This should be simple, once I figure out how, so I'm interested in the various ways to solve this while I search for an answer.
I have what is essentially a DB text field of HTML. I need to cut it in two parts based on finding a tag within the text.
A simple example (but imagine this is a gigantic block of HTML):
lorem ipsum lorem ipsum lorem ipsum {CUT} lorem ipsum lorem ipsum lorem ipsum
I need to be able to deconstruct this into two parts, based on a {CUT} tag in the text. I've not done this yet, but imagine it's been done.
What is the most efficient way to do this with Ruby? It's in a Rails app, so if there is something Rails(ish) to simplify this that I'm unaware of, that would be great. There is probably a thread here on Stack Overflow, but I haven't found one yet.
The block of HTML is so gigantic that a string.split("{CUT}") won't cut it?
split("{CUT}", 2) will split it into, at most, two pieces, based on the first occurrence of {CUT}.
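Concretely, the limit argument of 2 means any later {CUT} markers stay intact inside the second piece:

```ruby
html = "lorem ipsum lorem ipsum lorem ipsum {CUT} lorem ipsum lorem ipsum lorem ipsum"

# Split into at most two pieces at the first {CUT}; a second {CUT},
# if present, would survive untouched inside `tail`.
head, tail = html.split("{CUT}", 2)
# head => "lorem ipsum lorem ipsum lorem ipsum "
# tail => " lorem ipsum lorem ipsum lorem ipsum"
```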

Changing the textwidth of the notes in Beamer (LaTeX)

I've been using the beamer class to create presentations in LaTeX and I love it. Recently I started using the \note command to add notes to my handout so that I have a printed version with some pointers to remind myself of things I want to say in the lecture.
I have a problem with longer lines in the notes environment, as they seem to spill off the right edge of the page without wrapping correctly. I don't know if this is so for a reason, but in any case I would like to find out how to change it. Clearly, I do not want to change the width of the text everywhere, only in the note environment.
Here is a minimal example:
\documentclass[beamer]{beamer}
\title{An example of itemize in notes not working in beamer}
\usetheme{Boadilla}
\setbeameroption{show notes}
\begin{document}
\begin{frame}
$$ e^{i\pi}+1=0$$
\end{frame}
\note[itemize]{
\item At vero eos et accusamus et iusto odio dignissimos ducimus qui blandiis pra
}
\end{document}
Without the [itemize] option it works fine, but if you put a \begin{itemize}...\end{itemize} environment manually the result is the same.
Any ideas?
Thanks
I finally found a good answer, by re-posting on TeX.SE. It turns out that there's a small bug in Beamer that is responsible for this behavior. A workaround is given in the TeX.SE site. Hopefully, the workaround or a real fix will be included in the next Beamer release, as is currently planned.
Cheers.
I had the same problem, so I created a command in the preamble which defines a new style for my note page, and I also changed the template of the notes a bit. This is what I have (just before \begin{document}):
\usepackage{setspace}
\usetemplatenote{\setlength{\leftmargin}{1cm} \beamertemplatefootempty \insertnote}
\newcommand{\notepage}[1]{\note{\setlength{\parskip}{0.7em}
\setlength{\parindent}{0.4em}
\scriptsize #1 }}
So instead of using \note in the document, I call \notepage, and the note is formatted the way I defined above. Try this formatting, and if you don't like it you can change the values of the margins, indentation and skip between paragraphs to suit your needs.
By the way, I don't understand why you are using
\documentclass[beamer]{beamer}
\setbeameroption{show notes}
The way I do it is to have the three options and comment/uncomment according to what I need:
%\documentclass[notes]{beamer}
%\documentclass[notes=hide]{beamer}
\documentclass[notes=only]{beamer}
Try changing the theme before going to more drastic measures.
I noticed that changing the theme from Boadilla to something else, or deleting the reference to a theme altogether, solved the problem. FWIW, the two themes I used to test this were Warsaw and Berlin.
I found the above to be true for the following versions of Beamer: 3.07-2 and 3.10-2.

How to write an ANTLR parser for JSP/ASP/PHP like languages?

I am new to parser generators and I am wondering what an ANTLR grammar for an embedded language like JSP/ASP/PHP might look like; unfortunately the ANTLR site doesn't provide any such grammar files.
More precisely, I don't know how to define an AnyText token that matches everything (including keywords, which have no meaning outside the code blocks) while still recognizing those keywords correctly inside the blocks.
For example, the following snippet should be tokenized as something like: AnyText, BlockBegin, Keyword, BlockEnd, AnyText.
lorem ipsum KEYWORD dolor sit <% KEYWORD %> amet
Maybe there is also another parser generator which is suited better for my needs. I have only tried ANTLR up to now, because of its huge popularity here at stackoverflow :)
Many thanks in advance!
I can't speak for ANTLR, as I use a different lexer/parser (the DMS Software Reengineering Toolkit), for which I have developed precisely such JSP and PHP lexers/parsers. (ASP isn't different, as you observed in your question.)
But the basic idea is that the lexer needs lexical modes to recognize when you are picking up "anytext" and when you are processing "real" programming language text.
So you need a starting lexical mode, say HTML, whose job is to absorb the HTML text and, when it encounters a transition into PHP, switch modes. You also need a PHP mode which picks up all the PHP tokens and switches back to HTML mode when the transition-out characters are encountered.
Here's a sketch:
%%HTML -- mode
#token HTMLText "~[]* \< \% "
<< (GotoPHPMode) >>
%%PHP -- mode
#token KEYWORD "KEYWORD"
...
#token '%>' "\%\>"
<< (GotoHTMLMode) >>
Your lexer generator is likely to have some kind of mode-switching capability that you'll have to use instead of this. And you'll likely find that lexing the HTML stuff is more complicated than it looks (you have to worry about <SCRIPT> tags and lots of other crazy HTML), but those are details I presume you can handle.
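For ANTLR specifically, ANTLR 4 supports this mode-switching idea natively as lexer modes. An untested sketch of a lexer for the snippet in the question (grammar and token names are illustrative):

```antlr
lexer grammar TemplateLexer;

// Default mode: swallow raw text until a code block opens
BLOCK_BEGIN : '<%' -> pushMode(CODE) ;
ANY_TEXT    : ~'<'+ | '<' ;   // runs of raw text; a lone '<' falls through here

mode CODE;

// Code mode: real tokens, including keywords, until the block closes
BLOCK_END : '%>' -> popMode ;
KEYWORD   : 'KEYWORD' ;
WS        : [ \t\r\n]+ -> skip ;
```

The KEYWORD rule only exists inside the CODE mode, so the same word in surrounding text is just matched by ANY_TEXT, which is exactly the behavior the question asks for.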
I've come across this project http://code.google.com/p/phpparser/
which also contains an ANTLR grammar file for parsing PHP: http://code.google.com/p/phpparser/source/browse/grammar/Php.g
Hope this helps.
