Two column, parallel text, epub, Is it possible? - epub

I am trying to make a dual-language ebook. I'd like it to be typeset in two columns with one language on the left and the other on the right.
Is this possible with the epub specs?

It is quite possible, with two different ways:
Columns (used by wikisource epub export):
You can try with the printable version of this page (choose epub of course) : http://fr.wikisource.org/wiki/Trait%C3%A9_de_la_r%C3%A9forme_de_l%E2%80%99entendement_%28bilingue%29
On my Kobo, the regular epub engine screws the cells (the end of cells, if out of the display, is just ignored...) so you need to rename your epub file into .kepub.epub for the adobe engine (I guess) to be used and a correct display to be used.
Interlinear:
Using a simple css trick (display:inline-table) you can have different lines / words one under the other.
I have made a chinese version with chinese, pinyin and translation quite easily.
Hope this helps.

I'd use a simple table.
If you were thinking about CSS columns property : it is unreliable on iBooks and not even implemented on ADE based devices.

Related

Max size for PO file strings

I know that PO / MO files are meant to be used for small strings like button names, labels, etc. Not long text like an About page, etc.
But lately I am encountering a lot of situations that are in the middle. For example, a two sentence call to action. Or a short paragraph.
Is there best practice or "rule of thumb" for when a string is too long to put in a PO file?
update
For "long" text I use partials and include the correct language version. My question is WHEN is it optimal to use one vs the other. I've heard that PO files are "inefficient" for "long" pieces of text. But what does that mean and when is it too "long"? Or is this not a concern?
Use one entry for a self-contained chunk of text; e.g. a sentence as you say.
Two sentences that belong together and don't make sense without each other should be one entry. Why? Because otherwise the translator wouldn't have the context necessary to translate it well. Same goes for a short paragraph, e.g. explaining a setting: if it's inseparable in the code, it should be one entry.
If you encounter a situation where you have lots of long texts regularly (e.g. entire pages or paragraphs of pages), that's usually a sign that you are using an ill-fitting tool. Some people do it, using Gettext for entire articles, but you're better off having separate documents in such cases. But that doesn't seem to be the case here.

Retrieve text from pdf in iOS, using zachron iphonepdf?

I'm using the same zachrone iphonepdf but I did not get the text. My text view shows nothing. What's the problem?
Here is my code:
NSString *text=convertPDF(#"Course.pdf");
texview.text=text;
But I did not get anything in text view?
The text extractor zachron / pdfiphone (I assume you meant that one) is extremely naive and makes very many assumptions.
It ignores the PDF file structure and, therefore completely ignores whether the data it inspects is still used in the current revision.
It ignores encryption and therefore will fail completely for many documents with usage restrictions.
It completely ignores font encodings and implicitely assumes an ASCII'ish one --- this is fairly often true in small PDFs with English text only and not embedded fonts; otherwise the result can be anything.
... many many more assumptions ...
Unless one only has to deal with very simple documents and the extracted text is not really necessary for the functionality of one's code, I would propose using different code for text extraction.

Combining keys and full text when working with gettext and .po files

I am looking at gettext and .po files for creating a multilingual application. My understanding is that in the .po file msgid is the source and msgstr is the translation. Accordingly I see 2 ways of defining msgid:
Using full text (e.g. "My name is %s.\n") with the following advantages:
when calling gettext you can clearly see what is about to be
translated
it's easier to translate .po files because they
contain the actual content to be translated
Using a key (e.g. my-name %s) with the following advantages:
when the source text is long (e.g. paragraph about company), gettext calls are more concise which makes your views cleaner
easier to maintain several .po files and views, because the key is less likely to change (e.g. key of company-description far less likely to change than the actual company description)
Hence my question:
Is there a way of working with gettext and .po files that allows combining the advantages of both methods, that is:
-usage of a keys for gettext calls
-ability for the translator to see the full text that needs to be translated?
gettext was designed to translate English text to other languages, and this is the way you should use it. Do not use it with keys. If you want keys, use some other technique such as an associative array.
I have managed two large open-source projects (50 languages, 5000 translations), one using the key approach and one using the gettext approach - and I would never use the key approach again.
The cons include propagating changes in English text to the other langauges. If you change
msg_no_food = "We had no food left, so we had to eat the cats"
to
msg_no_food = "We had no food left, so we had to eat the cat's"
The new text has a completely different meaning - so how do you ensure that other translations are invalidated and updated?
You mentioned having long text that makes your scripts hard to read. The solution to this might be to put these in a separate script. For example, put this in the main code
print help_message('help_no_food')
and have a script that just provides help messages:
switch ($help_msg) {
...
case 'help_no_food': return gettext("We had no food left, so we had to eat the cat's");
...
}
Another problem for gettext is when you have a full page to translate. Perhaps a brochure page on a website that contains lots of embedded images. If you allow lots of space for languages with long text (e.g. German), you will have lots of whitespace on languages with short text (e.g. Chinese). As a result, you might have different images/layout for each language.
Since these tend to be few in number, it is often easier to implement these outside gettext completely. e.g.
brochure-view.en.php
brochure-view.de.php
brochure-view.zh.php
I just answered a similar (much older) question here.
Short version:
The PO file format is very simple, so it is possible to generate PO/MO files from another workflow that allows the flexibility you're asking for. (your devs want identifiers, your translators want words)
You could roll this solution yourself, or use a cloud-based app like Loco to manage your translations and export a Gettext file with identifiers when your devs need them.

Ruby on Rails: In there a way to convert word to html?

maybe a way to batch convert also?
You could use Google Docs API to upload and convert .doc's.
http://code.google.com/apis/documents/overview.html
Some samples and code: http://code.google.com/apis/documents/code.html
Ruby example and demo:
http://code.google.com/p/gdata-samples/source/browse/#svn/trunk/doclist/DocListManager
http://doclistmanager.googlecodesamples.com/
The short answer is no, but the long answer is sorta.
MS Word itself will save a file out as html - but it's a total friggin' mess. To an extent this is simply because the customer base that is converting word files to html directly are not concerned about it being sloppy, so Word hasn't worked hard on making a clean output. On the other hand, it's intrinsically difficult, because word is oriented to create fixed size, non-dynamic documents, like a paper-base book. So it's easy to convert to other static formats (say a PDF), but how do you convert to HTML? Do you just make the text flow across? Do you set a width that will hopefully make the layout stay the same? What if there is fonts or layout elements in the word doc that are not available in the HTML renderer?
The easiest thing to do is to do it project by project - you can create a DTD to convert an RTF file, for instance - but this involves you making programmer level decisions about how these will be converted.

Including full LaTeX documents within others

I'm currently finishing off my dissertation, and would like to be able to include some documents within my LaTeX document.
The files I'd like to include are weekly reports done in LaTeX to my supervisor. Obviously all documents are page numbered seperately.
I would like them to be included in the final document.
I could concatenate all the final PDFs using GhostScript or some other tool, but I would like to have consistent numbering throughout the document.
I have tried including the LaTeX from each document in the main document, but the preamble etc causes problems and the small title I have in each report takes a whole page...
In summary, I'm looking for a way of including a number of 1 or 2 page self-complete LaTeX files in a large report, keeping their original layouts, but changing the page numbering.
For a possible solution of \input-ing the original LaTeX files while skipping their preamble, the newclude package might help.
Otherwise, you can use pdfpages for inserting pre-existing PDFs into your dissertation. I seem to recall that it has a feature of "suppressing" the original page numbers by covering them up with white boxes.
The suggestion from #Will Robertson works great. I'd just like to add an example for all lazy people:
\usepackage{pdfpages}
...
% Insert _all_ pages from some_pdf.pdf:
\includepdf[pages=-]{some_pdf} % the .pdf extension may be omitted
From the documentation of the package:
To include a specific range of pages, you could do pages={4-9}. If start is omitted, it defaults to the first page, if end is omitted, it defaults to the last page.
To include it in landscape mode, do landscape=true
Maintaining the original formatting per document will be difficult if they're using different formats. For example, concatenating different document classes will be near impossible.
I would suggest you go with the GhostScript solution with a slight twist. Latex allows you to set the starting page number using \setcounter{page}{13} for example. If you can find an application that can count the pages of a PDF document (pdfinfo in the pdfjam Ubuntu package is one example), then you can do the following:
Compile the next document to PDF
Concatenate the latest PDF with the current full PDF
Find the page count of the full PDF
Use sed to pluck in a \setcounter{page}{N} command into the next latex file
Go back to the beginning
If you need to do any other processing, again use sed. You should (assuming you fix the infinite loop in the above algorithm ;-) ) end up with a final PDF document with all original PDFs concatenated and continuous line numbers.
Have a look a the combine package, which seems to be exactly what you're searching for.
Since it merges documents at the source level, I guess the page numbers will be correct.

Resources