Resources to simplify W3C standard implementation - parsing

At my company we’re looking to implement support for W3C standards, such as SVG 1.1 (2nd edition), in our app. I’ve been doing research on how people approach this problem.
There are various resources available at w3.org, but the standard only appears to be available in HTML files. I would like to be able to parse a single document that gives me all of the concepts in the standard, which I can then generate objects from in a programming language of my choice.
Apart from simply parsing the HTML itself, it seems possible to parse the document type definition (DTD) file, but this doesn’t include type information for attributes such as “color,” whose constraints are described in EBNF in the HTML document.
What is the best approach to implementing a W3C standard, such as SVG 1.1 (2nd edition)?
Does it simply come down to manually tracking the different parts of the standard, as defined in the HTML?
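A minimal sketch of the DTD route mentioned above, assuming a simplified DTD with no parameter entities. The fragment below is hypothetical and not copied from the SVG 1.1 DTD; the real DTD relies heavily on parameter entities, which this regex-based approach does not expand. It mainly illustrates the limitation described: the DTD yields attribute names and defaults, but a value like a color is typed only as CDATA.

```python
import re

# Hypothetical DTD fragment for illustration only (not the real SVG 1.1 DTD).
SAMPLE_DTD = """
<!ELEMENT rect EMPTY>
<!ATTLIST rect
    x      CDATA #IMPLIED
    width  CDATA #REQUIRED
    fill   CDATA #IMPLIED
>
<!ELEMENT circle EMPTY>
<!ATTLIST circle
    r      CDATA #REQUIRED
>
"""

# One ATTLIST declaration: element name, then everything up to the closing '>'.
# Assumption: default values never contain '>'.
ATTLIST_RE = re.compile(r"<!ATTLIST\s+(\S+)\s+(.*?)>", re.DOTALL)

# One attribute definition: name, type (keyword or enumeration), default.
ATTR_RE = re.compile(
    r'(\S+)\s+(\([^)]*\)|\S+)\s+'
    r'(#REQUIRED|#IMPLIED|#FIXED\s+"[^"]*"|"[^"]*")'
)


def parse_attlists(dtd_text):
    """Return {element: [(attr_name, attr_type, default), ...]}."""
    result = {}
    for element, body in ATTLIST_RE.findall(dtd_text):
        result.setdefault(element, []).extend(ATTR_RE.findall(body))
    return result


if __name__ == "__main__":
    for element, attrs in parse_attlists(SAMPLE_DTD).items():
        for name, attr_type, default in attrs:
            print(element, name, attr_type, default)
```

Every attribute here comes back as CDATA, so the finer constraints (such as the color grammar) would still have to be transcribed from the prose of the specification.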

Related

Microsoft word clipboard HTML documentation

I could not find any documentation describing the conventions used in the text/html clipboard data that results from copying part of a Word document.
Specifically, I want to know which classes, such as MsoNormal, TableGrid313, MsoTableGrid, MsoHeading9, and MsoListParagraph, exist and what they mean. Or does the styling information for text always live in the style attribute of a span element containing the text?
The Word round-trip HTML is undocumented, as it's not an official Word file format.
It was created many years ago to enable round-tripping Word documents for viewing (and some editing) in a browser. Even then, it was not documented, as it was meant for use by internal Microsoft software. Being HTML, anyone could read and produce it, but MS made a conscious decision not to document it (and so not to need to put resources into maintaining that documentation).

Is there a Way to localize an Application on Various Platforms

We are developing an application that runs on various platforms (Windows, Windows RT, Mac OS X, iOS, Android).
The problem is how to manage the different localizations on the different platforms in an easy way. The language files on the different platforms have various formats (some are XML-based, others are simple key-value pairs, and others are totally crazy formats like on Mac OS).
I'm sure we aren't the first company with this problem, but I wasn't able to find an easy-to-use solution that gives us one "data source" where the strings are collected in different languages (ideally with a user interface for the translators) and can then be exported to the different formats for the different platforms.
Does anybody have a solution for this problem?
Greetings
Alexander
I recommend using the GNU Gettext toolchain for management and, at runtime, one of the following:
an alternate implementation for runtime reading, such as Boost.Locale,
your own implementation (the .mo format is pretty trivial), or
the Translate Toolkit to convert the message catalogs to some other format of your liking.
You can't use the libintl component of GNU Gettext, because it is licensed under LGPL and terms of both Apple AppStore and Windows Live Store are incompatible with that license. But it is really trivial to reimplement the bit you need at runtime.
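To give a sense of how small the runtime part can be, here is a minimal sketch of a .mo reader. It is not the answer author's code. It assumes the catalog is UTF-8 encoded (a real reader would take the charset from the catalog's header entry), and it ignores plural forms, message contexts, and the optional hash table. The function name `read_mo` is made up for this example.

```python
import struct

MO_MAGIC = 0x950412DE          # magic number when read in the file's own byte order
MO_MAGIC_SWAPPED = 0xDE120495  # same magic, file written in the opposite byte order


def read_mo(data):
    """Parse the bytes of a GNU .mo catalog into a {msgid: msgstr} dict.

    Assumptions: UTF-8 strings, no plural or context handling.
    """
    magic = struct.unpack("<I", data[:4])[0]
    if magic == MO_MAGIC:
        order = "<"   # little-endian file
    elif magic == MO_MAGIC_SWAPPED:
        order = ">"   # big-endian file
    else:
        raise ValueError("not a .mo file")

    # Header after the magic: revision, string count,
    # offset of the original-strings table, offset of the translations table.
    _revision, count, orig_off, trans_off = struct.unpack(
        order + "4I", data[4:20]
    )

    catalog = {}
    for i in range(count):
        # Each table entry is a (length, file offset) pair of 32-bit integers.
        o_len, o_pos = struct.unpack_from(order + "2I", data, orig_off + 8 * i)
        t_len, t_pos = struct.unpack_from(order + "2I", data, trans_off + 8 * i)
        msgid = data[o_pos:o_pos + o_len].decode("utf-8")
        msgstr = data[t_pos:t_pos + t_len].decode("utf-8")
        catalog[msgid] = msgstr
    return catalog
```

Typical use would be `read_mo(open("messages.mo", "rb").read())`, where the file name is only an example. The entry with the empty msgid holds the catalog metadata.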
The Translate Toolkit actually reimplements all or most of GNU Gettext and supports many additional localization formats, but the Gettext .po format has the most free tools available for it (e.g. poedit for local editing and Weblate for online editing), so I recommend sticking with it anyway. And read the GNU Gettext manual; it describes the intended process and the rationale behind it well.
I have quite good experience with the toolchain. The Translate Toolkit is easy to script when you need some special processing like extracting translatable strings from your custom resource files and Weblate is easy to use for your translators, especially when you rely on business partners and testers in various countries for most translations like we do.
Translate Toolkit also supports extracting translatable strings from HTML, so the same process can be used for translating your web site.
I did a project for iPhone and Android which had many translations and I think I have exactly the solution you're looking for.
The way I solved it was to put all translation texts in an Excel spreadsheet and use a VBA macro to generate the .strings and .xml translation files from there. You can download my example Excel sheet plus VBA macro here:
http://members.home.nl/bas.de.reuver/files/multilanguage.zip
Just recently I've also added preliminary Visual Studio .resx output, although that's untested.
edit:
btw also my javascript xcode/eclipse converter might be of use..
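The single-source idea in this answer can also be sketched outside Excel. The following is a minimal illustration in Python, not the VBA macro from the download: it takes one table of strings and emits an iOS Localizable.strings body and an Android strings.xml body. The sample data and function names are hypothetical, and the escaping covers only the common cases (for example, it does not handle Android's leading `@` or `?`, or format placeholders).

```python
from xml.sax.saxutils import escape

# Hypothetical sample data: key -> {language code -> text}.
TRANSLATIONS = {
    "greeting": {"en": "Hello", "de": "Hallo"},
    "quit": {"en": "Don't \"quit\"", "de": "Beenden"},
}


def to_ios_strings(translations, lang):
    """Render one language as the text of an iOS Localizable.strings file."""
    lines = []
    for key in sorted(translations):
        text = translations[key][lang]
        # Escape backslashes first, then quotes and newlines.
        text = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        lines.append('"%s" = "%s";' % (key, text))
    return "\n".join(lines) + "\n"


def to_android_xml(translations, lang):
    """Render one language as the text of an Android strings.xml file."""
    lines = ['<?xml version="1.0" encoding="utf-8"?>', "<resources>"]
    for key in sorted(translations):
        # XML-escape &, <, > first, then apply Android's backslash escapes.
        text = escape(translations[key][lang])
        text = text.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')
        lines.append('    <string name="%s">%s</string>' % (key, text))
    lines.append("</resources>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_ios_strings(TRANSLATIONS, "en"))
    print(to_android_xml(TRANSLATIONS, "en"))
```

Keeping the escaping rules in one function per platform means that adding another output format, such as .resx, is a matter of adding one more renderer over the same table.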
You can store your translations on https://l10n.ws and fetch them via their API.
Disclaimer: I am the CTO and Co-Founder at Tethras, but will try to answer this in a way that is not just "Use our service".
As loldop points out above, you really need to normalize your content across all platforms if you want to have a one-stop solution for managing your localized content. This can be a lot of work, and would require much coding and scripting and calling of various tools from the different SDKs to arrive at a common format that would service the localization needs of all the various file formats you need to support. The length and complexity of my previous sentence is inversely proportional to the amount of work you would need to do to arrive at a favorable solution for all of this.
At Tethras, we have built a platform that alleviates the need for multi-platform software publishers to have to do this. We support all of the native formats from the platforms you list above, and can leverage translations from one file format to another. For example, translate the content in Localizable.strings from your iOS app into a number of languages, then upload your equivalent strings.xml file from Android or foo.resx from Windows RT to the system, and it will leverage translations for you automatically. Any untranslated strings will be flagged and you can order updates for these strings.
In effect, Tethras is a CMS for localized content across many different native file formats.

Tradeoff between LaTeX, MathML, and XHTMLMathML in an iOS app?

I plan to use Xcode to make an app for the iPhone that displays math equations that high school and college students often use. I do all my math with Mathematica, and it allows me to save such equations in three relevant formats: (1) LaTeX (.tex); (2) MathML (.mml); (3) XHTMLMathML.
The Mathematica documentation says the third format is XHTML with embedded MathML. I found that some of the examples at this browser test don't look so good on my iPhone, so I will probably not rely completely on MathML.
I am a total beginner with Xcode and the three file formats that I mention above, but I have some experience with OOP in C++. Assuming Mathematica can do a good job writing the required LaTeX, MathML, or XHTMLMathML for whatever equation, what are the tradeoffs between the three file formats? Can I mix the formats in the same app?
I would suggest using HTML. The "right" way to include mathematical content is MathML -- which is part of HTML5 (but see below for using TeX).
iOS's UIWebView is WebKit-based and therefore has the same partial MathML support (though on iOS 5 it's significantly worse due to a font bug), so I would also suggest using MathJax (disclaimer: I'm part of MathJax).
MathJax is an open source JavaScript library that understands TeX and AsciiMath input, converts either one to MathML, and renders MathML as HTML-CSS or SVG (in any modern browser).
MathJax has no problem mixing these input formats. Additionally, it has better MathML support than WebKit (and you can always configure MathJax to use the native MathML support if you want -- say, when you know your content should render fine with WebKit's native support).
To get you started, you can take a look at this open source app to see how MathJax can be integrated in an iOS app.

Is WebGL part of the HTML5 specification?

So I know (think?) WebGL depends on the <canvas> element of HTML5, but is it part of the HTML5 spec itself?
I used to think they were two different things, much like CSS3 and HTML5. But then I saw it as one of the criteria tested in http://html5test.com/.
Let's start by identifying what HTML 5 is. The W3C has a spec with a history section that details how the different HTML version numbers came about.
The WHATWG, for its part, considers HTML to be a "Living Standard", free of version numbers, but still includes HTML5 in its description of that standard.
WebGL itself is not part of either of the above specifications directly, although you'll find a reference to it if you search the WHATWG document above. So officially, no, not part of HTML5. WebGL does, as you mention, depend on the <canvas> element from HTML5.
In practice, however, I've seen a lot of people use "HTML5" as a buzzword or umbrella term to refer to the latest web technologies, including WebGL. In particular, you can't always describe an app as being a "WebGL app" because it almost always relies on newer aspects of HTML 5, CSS, JavaScript, etc., to make it work. I've often heard these referred to as "HTML5 apps" even if that's not strictly the definition. It's more modern than saying "Web 2.0" I suppose.
WebGL is not part of the HTML 5 specification. It is maintained by the Khronos Group. More information is available at this link:
www.khronos.org/webgl

What's a solid, full-featured open rich text representation usable on the Web?

I'm looking for an internal representation format for text that supports basic formatting (font face, size, weight, indentation, basic tables) as well as the following features:
Bidirectional input (Hebrew, Arabic, etc.)
Multi-language input (i.e. UTF-8) in same text field
Anchored footnotes (i.e. a superscript number that's a link to that numbered footnote)
I guess TEI or DocBook are rich enough, but here's the snag -- I want these text buffers to be Web-editable, so I need either an edit control that eats TEI or DocBook, or reliable and two-way conversion between one of them and whatever the edit control can eat.
UPDATE: The edit control I'm thinking of is something like TinyMCE, but AFAICT, TinyMCE lacks footnotes, and I'm not sure about its scalability (how about editing 1 or 2 megabytes of text?)
Any pointers much appreciated!
FCKeditor has a great API, supports several programming languages (considering it is javascript this isn't hard to achieve), can be loaded through HTML or instantiated in code; but most of all, allows easy access to the underlying form field, so having a jQuery or prototype ajax buffer shouldn't be terribly difficult to achieve.
The load time is very quick compared to previous versions. I'd give it a whirl.
In my experience a two-way conversion between HTML and XML formats like TEI or DocBook is very hard to make 100% reliable.
You could use Xopus (demo) to have your users directly edit TEI or DocBook XML. Xopus is a commercial browser based XML editor designed specifically for non-technical users. It supports bidi and UTF-8. The WYSIWYG view is rendered using XSLT, so that gives you sufficient control to render footnotes the way you describe.
As TEI and DocBook don't have means to store styling information, those formats will not allow your users to change font face, size and weight. But I think that is a good thing: users should insert headers and emphasis, designers should pick font face and size.
Xopus has a powerful table editor and indentation is handled by nesting sections or lists and XSLT reacting to that.
Unfortunately Xopus 3 will only scale to about 200KB of XML, but we're working on that.
I can't really decide on one of them. IMHO they are all not very good and complete. They all have their advantages and clear disadvantages. If TinyMCE is your favorite then afaik, it also does tables.
This list will probably come in handy: WysiwygEditorComparision.
I've also used FCKEditor and it performed well and was easy to integrate into my project. It's worth checking out.
Small correction to laurens' answer above: As of now (May 2012), Xopus supports UTF-8, but not BiDi editing. Right-to-left text is displayed fine if it came from another source, but it cannot be edited correctly.
Source: I was recently asked to evaluate this, so have been testing it.
