Find differences in a .po file - localization

I have a .po file where most translated strings are identical to original ones. However, few are different. How do I quickly find the ones that differ from original?

use podiff
I used it, an it workd for me. Its in C, so you have to compile it. make is your friend

I would suggest using one of the many web-based localization management platforms. To name a few:
Amanuens (disclaimer: my company builds this product)
Web Translate It
Transifex
GetLocalization
This kind of platforms allows you to keep your resource files in sync, edit them in a web-based editor (useful for no-technical translators) and most importantly to highlight/see only the changed/untranslated strings.

Related

How to identify what projects have been affected by a code change

I have a large application to manage consisting of of three or four executables and as many as fifty .dlls. Many of the source code files are shared across many of the projects.
The problem is a familiar one to many of us - if I change some source code I want to be able to identify which of the binaries will change and, therefore, what it is appropriate to retest.
A simple approach would be simply to compare file sizes. That is an 80% acceptable solution, but there is at least a theoretical possibility of missing something. Secondly, it gives me very little indication as to WHAT has changed; It would be ideal to get some form of report on this so I can then filter out irrelevant (e.g. dates/versions copyrights etc..)
On the plus side :
all my .dcus are in a row - I mean they are all built into a single folder
the build is controlled by a script (.bat)(easy, for example, to emit .obj files if that helps)
svn makes it easy to collect together any (two) revisions for comparison
On the minus side
There is no policy to include all used units in all projects; some units get included because they are on a search path.
Just knowing that a changed unit is used/compiled by a project is not sufficient proof that the binary is affected.
Before I begin writing some code to solve the problem I would like to ask the panel what suggestions they might have as to how to approach this.
The rules of StackOverflow forbid me to ask for recommended software, but if anyone has any positive experiences of continuous integration tools that would help - great
I am open to any suggestion or observation that is relevant in this context.
It seems to me that your question boils down to knowing which units are contained in your various executables. Since you are using search paths, it will be hard for you to work this out ahead of time. The most robust way to find out is to consult the .map file that the compiler emits. This contains a list of all units contained in your executable.
Once you know which units are contained in each executable, you need to know whether or not anything has changed in those units. That information is contained in your revision control system. Put this all together and you have the information that you need.
Of course, just because the source code for a unit has changed, you might argue that re-testing is not needed. Perhaps the only change made was the version, or the date in a copyright label or some such. But it is asking too much to be able to ask a computer to make such a judgement. At some point you need a human to step up and take responsibility.
What is odd about this though is that you are asking the question at all. It seems to me to be enormously risky to attempt partial testing. I cannot understand why you don't simply retest the entire product.
After using it for > 10 years for commercial in-house and freelancer work in large projects, I can recommend to try Apache Ant. It is a build tool which supports dependencies, and has many very helpful features.
Apache Ant also integrates nicely with CI tools such as Hudson/Jenkins, Bamboo etc.
Another suggestion - based on experience with Maven - is to design the general software architecture as modular as possible. If modules (single or multiple source or DCU files in one directory) use a version number in the directory name as a version number, it is possible to control exactly how application are composed from these modules.
If you want to program such a tool yourself the approach would be something like this:
First you need to detect wheter there were any changes made to seperate source files. As you already figured out comparing the file size is bad idea as the file size can stay the same despite lots of changes made to it (as long as there is same amount of text in pas file its size won't change). So instead you could check the last modification time for specific file or create some hash value like MD5 hash for comparison (can be quite slow).
Then you need to generate yourself a dependancy tree which will tell you which files are used for which project/subproject.
Finally based on changes detected in seperate files you check the dependancy tree to see which projects needs to be recompiled.
The problem of such approach is that you would probably have to update the dependancy tree manually each time when new unit is added to the project or an existing one is removed from the project.
But the best way would be to go and use some version controll software istead of reinventing the wheel. I myself like the way how GIT works and I belive that with proper implementation of GIT into the project mannager itself could be quite powerfull do to GIT support of branching/subbranching (each project is its own branch, each version of your software can be its own subbranch).
Now latest version of Delphi does have GIT integration done though SVN but this unfortunately limits some of best GIT functionality. So if you maybe decide to go and integrate GIT support directly into Delphi I'm first in line to use it.

The best approach to modular programming in Delphi

this is a continuation of the discussion I started here. I would like to find the best way to modularize Delphi source code as I'm not experienced on this field. I will be gratefull for all your suggestions.
Let me post what I have already written there.
The software developed by the company I work for consists of more than 100 modules (most of them being something like drivers for different devices). Most of them share the same code - in most cases classes. The problem is that those classes are not always put into separate, standalone PAS units. I mean that the shared code is often put into units containing code specific to a module. This means that when you fix a bug in a shared class, it is not enough to copy the PAS unit it is defined in into all software modules and recompile them. Unfortunately, you have to copy and paste the fixed pieces of code into each module, one by one, into a proper unit and class. This takes a lot of time and this is what I would like to eliminate in the nearest future by choosing a correct approach - please help me.
I thought that using BPLs distributed with EXEs would be a good solution, but it has some downsides, as some mentioned during the previous discussion. The worst problem is that if each EXE needs several BPLs, our technical support people will have to know which EXE needs which BPLs and then provide end users with proper files. As long as we don't have a software updater, this will be a great deal for both our technicians and end users. They will certainly get lost and angry :-/.
Also compatibility issues may occur - if one BPL is shared by many EXEs, a modification of that BPL can bee good for one EXE and bad for some other ones.
What should I do then to make bug fixes quicker in so many projects? I think of one of the following approaches. If you have better ideas, please let me know.
Put shared code into separate and standalone PAS units, so when there is a bug fix in one of them, it is enough to copy it to all projects (overwrite the old files) and recompile all of them. This means that each unit is copied as many times as many projects it is used by.
This solution seems to be OK as far as a rarely modified code is concerned. But we also have pas units with general use functions and procedures, which often undergo modifications. It would be impossible to do the same procedure (of copying and recompiling so many projects) every time someone adds a new function to this file.
Create BPLs for all the shared code, but link them into EXEs, so that EXEs are standalone.
For me it seems the best solution now, but there are some cons. If I make a bug fix in a BPL, each programmer will have to update the BPL on their computer. What if they forget to do that? However, I think it is a minor problem. If we take care of informing each other about changes, everything should be fine. What do you think?
And the last idea, suggested by CodeInChaos (I don't know if I understood it properly). Sharing PAS files between projects. It probably means that we would have to store shared code in a separate folder and make all projects search for that code there, right? And whenever it is necessary to modify a project, it would have to be downloaded from SVN together with the shared files folder, I guess. Each change in the shared code would have to cause recompilation of each project that uses that code.
Please help me choose a good solution. I just don't want the company to lose much more time and money than necessary on bugfixes, just because of a stupid approach to software development. So far nobody has cared about it and you can imagine how many problems it causes.
Thank you very much.
You say:
Create BPLs for all the shared code, but link them into EXEs, so
that EXEs are standalone.
You can't link BPLs into an executable. You are simply linking in the separate units that are also in the BPL. That way you don't actually use or even need the BPL at all.
BPLs are meant to be used as shared code, i.e. you put the code that is shared into one or several BPLs and use that from each of the .exes, .dlls or other .bpls. Bugfixes (if they don't change the public interface of the BPL) merely require the redistribution of that one fixed BPL.
As I said, decide on the public interface of a DLL and then don't change it. You can add routines, types and classes, but you should not modify the public interfaces of any existing classes, types, interfaces, constants, global variables, etc. that are already in use. That way, a fixed version of the BPL can easily be distributed.
But note that BPLs are highly compiler version dependent. If you use a new version of the compiler, you will have to recompile the BPL too. That is why it makes sense to give BPLs suffixes like 100, 110, etc., depending on the compiler version. An executable compiled with compiler version 15.0 will then be told to use the BPL with suffix 150, and an executable compiled with version 14.0 will use the BPL with suffix 140. That way, different versions of the BPLs can peacefully co-exist. The suffix can be set in the project options.
How do you manage different versions? Make a directory with a structure like I have for my ComponentInstaller BPL (this is the expert you can see in the Delphi/C++Builder/RAD Studio XE IDE under menu Components -> Install Component):
Projects
ComponentInstaller
Common
D2007
D2009
D2010
DXE
The Common directory contains the .pas files and resources (bitmaps, etc.) shared by each version, and each of the Dxxxx directories contains the .dpk, .dproj, etc. for that particular version of the BPL. Each of the packages uses the files in the Common directory. This can of course be done for several BPLs at once.
A versioning system might make this a lot easier, BTW. Just be sure to give each version of the BPL a different suffix.
If you actually want standalone executables, you don't use BPLs and simply link in the separate units. The option "compile with BPLs" governs this.
From my point of view trying to manage artifacts like Delphi units, libraries and executable files, you search at wrong place. I suggest you to turn around and start with refactoring of code, based on Design patterns implementation.
E.g. all common functions can be placed into one Singleton class, instances of common classes can be constructed with Abstract Factory, classes can interact through native Delphi implementation of interfaces instead of direct usage and so on. Even you can choose to implement Facade for all common parts of projects.
Of course, concrete choice of patterns and details of implementation depends on project specific and only you can decide what applicable in your case.
I suppose, that after looking to project in this vein you can find more natural ways of code organization and solution for your problems.
Some other things:
Of course, you must follow #CodeInChaos suggestion and share one copy of source file between all projects instead of copying it to each project manually. It may be useful if you adopt some standard for building environment, which will be mandatory for all developers (same folder structure, location of libraries, environment settings).
Try to analyze building and deployment process: for me it's looking abnormal when solution not built with latest version of code and not tested before deployment. (it's for your "If I make a bug fix in a BPL, each programmer ..." phrase).
Variant with standalone executable files looks better because significantly simplifies organization of testing environment and project deployment. Just choose adequate conventions for versioning.

Tools and methods for localization of ASP.NET application needed

We have a multilingual site that is currently using 2 languages, but with several others coming soon. The site is localized primarily by resx files, but with some localized data in a database.
We need to find some tools to manage localization of the site - something that picks up on changes in resx files so translators will only need to translate new or updated texts.
Any ideas or recommendations? We're also interested in any articles about the logistics of localization if anyone has some.
I'm researching this area as we speak and I came across your post. I'm not sure if this is any help but in the past we used RCWinTrans for our localization. This was for mutiple C++/MFC products although it does support .Net . We would have a RCWinTrans project for each language/product we intended to support although you could have multiple languages in a single project. It kept track of state (i.e. not-translated, translated, changed, etc) and would allow us to export the strings to an excel spreadsheet which we could then send onto a translator. They would updated the spreadsheet and we would reimport the data.
Hope this helps, apologies if I'm on the wrong track and I'm teaching grandma to suck egss. I will be creating another thread today with a similar requirement to this btw, but with a few more snagettes - might be worth a look to see if I get any answers. Cheers, Roger
An idea might be to have all the localization data the database (or a localization database), this way one could build the localization tools/interface as part of the application and poyentially use it for many applications?
There is a free tool called Resource Translation Helper ( http://www.winking.be/resource-translation-helper )
Install it
Point it to your project directory
Export to excel (.xls)
Give the excel to your translaters and let them update the columns.
Import the updated xls file again
The tool creates also a file to map the resx translation to the excel rows. Don't remove that or you will not be able to import again.
Works very good, but backup your data before adjusting things ;-)
You cant try to use this Visual Studio extension https://visualstudiogallery.msdn.microsoft.com/2676967b-0516-4f5f-b312-6873e2f9d219.
It allows to export/import your project resources to/from excel and to add new cultures.
Old question, but I would like to add http://www.zeta-resource-editor.com/index.html
Free tool for .NET resource files.
This directly edits the resource files, no need to ex-/import.

Is there some sort of open source repository of translated strings stored in .mo, or similar formats?

Just curious, I reckon I'll have to hire a translator to do my .mo files manually but would be great if there was some sort of resource for this.
Sorry if this question doesn't belong or isn't a "real" question, but it is related to web application localization.
You should try one of next resources:
Rosetta - Ubuntu translation platform
OpenOffice.org Localization Project
Google Translation - The quality is better than you may think and is already using existing open source translations.

Is it expected that all the units of a Project Group in Delphi 7 to be in one folder?

Maybe this applied to other Delphi's (I've only used 7). We've got our code broken up so that nearly every DLL in our fairly massive app is in a different folder.
99% of the open source stuff I've downloaded to plug into Delphi have had all their source munged into one folder.
It seems like this was an assumption that the developers of Delphi made about the coding practices of it's users that may be non-obvious.
I don't think so. In fact, In more recent versions they've added features to the project manager to make it easier to deal with the fact that code is spread around different directories (such as the flatten directories option), so I think it is accepted that this is how many people organize their code.
I suspect it's more to do with projects growing organically over time, and whether anyone takes the time to tidy up.
I for one definitely do not put all the sources into one directory but rather keep them in groups that have something in common. e.g. I use subversion externals quite extensively
(see http://www.dummzeuch.de/delphi/subversion/english.html , the section about externals).
I prefer different modules to be hosted on different folders, then have a common folder for units that is shared among different modules, makes management easy. e.g
myClientServerApp:(parent)
Client folder :(child)
server filder (child)
lib - (child)
Back in DELPHI 7 I also had all files in one folder. It has easy for small projects, but very hard for med to big one.
So I began to create a folder structure for all DELPHI projects small or big.
Over the year I am trying to improve, this folder structure, and every new project I make a small improvement so that it is simpler, logical, and more organized.
This day I am trying to make some parts of it sharable to several project. Its work in progress.
It would seem that having all the units in one folder would save you headaches in doubly named units. On the other hand, it might be handier to keep your projects in different folders when checking in and out of your version control. On the other hand it really doesn't promote code reuse to have them separated out like that.

Resources