Single YAML file VS multiple YAML files in different folders and subfolders - ruby-on-rails

I am working on automating the translation workflow and improving the Localization process as a whole of a Rails website. I am using SimpleBackend so only YAML files are used for storing translations.
The current locales directory consists of folders, then sub-folders (in some cases) and those sub-folders containing yml files. I am considering to integrate the project with some third-party tool like Transifex for translation management so may be using a single YAML file for each language may be good for management of workflow.
If someone can highlight the pros and cons of both structures then it would be really helpful to decide whether I should switch from nested file structure to single file pattern or not. Also, the project is an Open-Source project with active contributors and so thinking for a long-term solution.
Thanks!

I think whatever tools you are using to make the process flow smoothly factors a lot in this decision. You should explore how exactly Transifex wants things to be structured in output, and try to keep your current input structure, and give that a shot before making a decision.
However, in my opinion, for a large app with a lot of translatable text, my preference would be to allow for multiple yaml files in your default locale, and one or two consolidated yaml files for each foreign translation. If there isn't a lot of translatable text in your app, maybe a single file is fine for you, but given it's already split up, there's a good chance that's the better choice. On a team with many contributors you can end up with a very high churn file (maybe with a lot of merge conflicts) that everyone changes all the time.
Splitting into separate files lets you logically separate out text to match a domain in your app, like a separate yaml file for mailers (or even each mailer), and one for each domain (or controller). Either way, it puts you in control of your organization strategy.
However, there isn't a lot of value, IMO in separating your foreign translations to mirror that structure. The systems I have experience with (not Transifex) generate your foreign translation files for you, so you just need to sync with the web interface and commit the results.

Related

Performance of flat directory structures in iOS

On the iOS filesystem, is there a way to optimize file access performance by using a tiered directory structure vs. a flat directory structure?
Specifically, my app has Objects that each contain a number of images and data files. A user could create thousands of these Objects and I need to optimize access to one image for ~100 arbitrary Objects at a time.
In this situation, how should I organize files on the filesystem? Would a tiered directory structure be faster than a flat one? And if so, how should I structure the tiered system (i.e. how many tiers, and how many subdirectories / files per tier)?
THANKS!
Well first of all you might as well try it with a flat structure to see if it is slow or not. Perhaps apple has put in code to optimize how files are found and you don't even need to worry about this. You can probably build out the whole app and just test how quickly it loads and see if that meets your requirements.
If you need to speed it up I would suggest trying to make some sort of structure based on the name of the file. You could have a folder which has all of the items beginning with the letter 'a' or 'b' and so on and so forth. This would split it into 26 folders which should significantly decrease the amount of items in each. Depending on how you name the files you might want a different scheme so that each of the folders had a similar amount of items in it
If you are using Core Data, you could always just enable the Allows External Storage option in the attribute of your model and let the system decide where it should go.
That would be a decent first step to see if the performance is ok.

in a Rails apps, how to organize Cucumber .feature files?

when i got to this project there were cucumber tests in "features/enhanced", which ran with javascript and a few in "features/plain" which did not require js. with the later development of per-scenario #javascript, this doesn't make sense. and as the number of features files we have grows and grows, it'd be awesome if this stayed tidy.
so, in best practice land:
1) how long should .feature files be? i try to keep each narrow and specific with 1 or 2 "Scenarios".
2) what folder/file structure should one keep them in?
2a) how might one group similar features?
1) Once you've done them for a few months you'll soon find what works best for you. My advice is you should make them small ish. We have often split our earlier features down into smaller chunks, but have never ended up combining them. It's handy for making use of backgrounds etc...
2) We had a big problem with this and spent ages doing it one way then another. In the end we gunned to group them by the services that our company provides. e.g. payments, customer registration, stock management
Inconveniently, features don't always conform to a hierarchical tree view of the world, so make liberal use of tagging and your primary grouping of features is less important.
Have you tried yard? There's an example here We've just built it into our CI, it lets you pull together sets of scenarios based on tags, you can do unions, intersections etc... well worth it :)
I would keep the JavaScript and non-JavaScript versions of a scenario together, since they should be very similar.
Anything more than 8 scenarios in a feature file is probably too much.
A useful approach is to have a folder to represent the high-levels features (sometimes call epics or themes), and separate feature files within those folders for the different aspects of the behaviour.
For example, you may have a feature "Employee Directory" which would have separate feature files contains scenarios for a photograph, office location, job title, etc.
Depending on the size and complexity of your app, you could group those folders into other folders.
(Note that none of the above is specific to Rails apps).

Rails CMS: static files or database records?

I'm trying to figure out the cut-off with respect to when a "text entry" should be stored in the database vs. as a static file. Are there any rules of thumb here? The text entries will be at the most several paragraphs and have links to images and tables (and hyperlinks to other text entries). Some criteria for the text entry:
I'm thinking of using DITA as the content format
The text should be searchable
If the text is revised, a new version will be created
thanks in advance, Chuck
The "rails way" would be using a database.
The solution will be more scalable, therefore faster and probably easier to develop with (using migration and so on). Using the file system, you will have to build lots of functions on your own, that are already implemented for database usage.
You could create a Model (e.g.) Document and easily use existing versioning systems, like paper_trail. When using an indexed search, you can just have an has_many relation enabling you to realise the depencies between the models (destroy a model means to destroy the search index).
Rather than a cut-off, you could look at what databases provide and ask yourself if those features would be useful. Take Isolation (the I in ACID): if you have any worries that multiple people could be trying to edit an entry at the same time, a database would handle that well while you'd have to handle the locks yourself working with files. Or Atomicity: you might want to update two things at once (e.g. an index page and an entry page) and know they will either both succeed or both fail.
Databases do a number of things beyond ACID, such as taking advantage of multiple datatypes, making querying easier, and allowing for scaling. It's a question worth asking since most databases end up having data stored in a bunch of files on disk. Would you end up writing a mini-database if you used files yourself?
Besides, if you're using rails you mind as well take advantage of its ActiveRecord functionality, and make it possible to use the many plugins that expect a database.
I'd use a database for even a small, single user rails app.

What is the best way to store multiple language versions of a website?

My web site (on Linux servers) needs to support multiple languages.
What is the best practice to have/store multiple languages versions of the same site?
Some I can think of:
store in DB
different view file for each language
gettex
hard coded words in PHP files (like in phpBB)
With web sites, you really have several categories of content to consider for localization:
The article-type content elements that you would in many cases create, edit and publish in a CMS.
The smaller content blocks that are common to every page (or a sub-group of pages), such as tagline, blurb, text around a contact form, but also imported content such as a news ticker or ads and affiliate links. Some of these may only appear for one language (for example, if you don't offer some services in some regions, or don't have, say, language-appropriate imported content for a particular language: it can be better to remove an element rather than offering English to people who may not speak it).
The purely functional elements, like "Click here to comment", "More...", high-level navigation, etc., which are sometimes part of your template. Some of these may be inside images.
For 1. the main decision is using a CMS or not. If yes, you absolutely need to choose one that supports multiple languages. I'm not up-to-date with recent developments in PHP CMS's, but several of the Django CMS apps (Django-CMS-2, FeinCMS) support multi-language content. Don't forget that date stamps, for example, need to be localized, too (or you can get around this by choosing ISO dates, though that may not always be possible). If you don't use a CMS, and everything is in your HTML files, then gettext is the way to go, and keep the .mo files (and your offline .po files) in folders by language.
For 2. if you have a CMS with good multi-lingual support, get as much as possible inside the CMS. The reason is that these bits do change, and you want to edit your template as little as possible. If you write code yourself, think of ways of exporting all in-CMS strings per language, to hand them to translators. Otherwise, again, gettext. The main issue is that these elements may require hard-coding language-selection code (if $language = X display content1 ...)
For 3., if it's in your template, use gettext. For images, the per-language folders will come in handy, and for heaven's sake make choose images the generation of which can be automated, or you (or your graphic artist) will go mad with editing 100s of custom images with strings in languages you don't understand.
For both 2. and 3., abstracting from the language selection may help selecting the appropriate blocks or content directory (where localized images or .mo files are kept).
What you definitely want to avoid is keeping a pile of HTML files with extensive text content in them that would be a nightmare to maintain.
EDIT: Everything about gettext, .po and .mo files is in the GNU gettext manual (more than you ever wanted to know) or a slightly dated but friendlier tutorial. For PHP, there's are the PHP gettext functions, and also the Zend Locale documentation
I recommend using Zend_Translate's Gettext adapter which parses mo files. Very efficient + caching. Your calls would be like
echo $translation->_("Hello World");
Which would find the locale specific key for that specified string.
Check out i18n support for php: http://php-flp.sourceforge.net/getting_started_english.htm

Localization asset management

We have several large products we'd like to integrate with a consistent localization strategy.
We're already doing the right things from a code point of view - ie. strings in resource files.
I'm looking for something that will organize localized strings in a database, and generate the appropriate resource files (ie. .RESX files for .NET, .js files, etc.) during the build process. Ideally, it would also be able to read in the files as well (detecting strings that have been added/removed).
The database would allow us to reuse translations in different products, switch to different technologies, and track what translations are missing in each release.
Has anyone found a good product that handles these requirements? What have others done to manage localized assets?
Found some good links in the answers for this question: Do you know of a good program for editing/translating resource (.rc) files?
There's a number of products which we're now evaluating:
http://www.lingobit.com/
http://www.sisulizer.com/
http://www.multilizer.com/
WinTrans - http://www.schaudin.com/
None of have quite the database-based approach we were initially looking for, but they seem to have the core functionality. Lingobit is an early favorite, but we haven't trialed in too much detail yet. Does anyone have a recommendation between those products (or similar)?
Check out GlobalSight or Alchemy's Catalyst
Catalyst is a standalone translation memory and localization engine that can be used in your build process (and is used by many large software companies). GlobalSight is a relatively new and open source translation database and workflow tool that looks very promising.

Resources