What is the basic file extension list for Apache Tika standard library? - apache-tika

I am trying to determine a basic list of file extensions supported by tika-parsers-standard-package. There is a supported formats page at https://tika.apache.org/2.6.0/formats.html, but is there an easy way to get the list of extensions for each supported parser?

Related

Which asciidoc editors support custom extensions?

I am building a documentation solution where I need to use some custom extensions. For a writer it is often important to see a preview of their work, so I am looking for an editor that supports preview generation with these custom extensions. AsciiDoc is very versatile, but the ecosystem is sometimes difficult to navigate, because several engines are available, and they support different features in different languages.
Which editors support preview with custom extensions?
The AsciiDoc ecosystem has many engines: Ruby-based, JS-based and Java-based. Different editors use different engines and support different subsets of features.
AsciidocFX, asciidoc-vscode, Adobe Brackets and Atom use the JS-based engine (Asciidoctor.js).
AsciidocFX can be hacked to include JS-based plugins, according to its maintainers.
Asciidoc-vscode recently dropped the ability to use an external processor (any of those mentioned), so there is now no official means of using other extensions.
Atom and Brackets do not mention this kind of functionality.
Eclipse and IntelliJ use a Java-based processor.
Eclipse can use an external processor to handle extensions. This is not possible with the embedded engine.
IntelliJ IDEA can use compiled Java extensions, or Ruby extensions if they do not require any other libraries. To enable an extension, you only need to place it in .asciidoctor/lib in the root of the project. If your custom Ruby-based extension has dependencies, they should be placed under the gem path, which is ~/.gems/jruby/2.5.0 for me. That can be done with the gem install --install-dir <gempath> <gem> command.
Conclusions: Asciidoctor has a very versatile and easy-to-extend engine, and a broad ecosystem exists. The downsides are:
A choice paradox
Only a few editors can support your homemade customizations, and it is not easy to figure out which do.

Annotating C++ code for generating swagger json/yaml

I know that there is a way to generate client-side code from the Swagger YAML with swagger-codegen, but is there a way to generate the Swagger YAML from C++ annotations, similar to what can be done in Java?
There seem to be annotation libraries available for other languages, e.g. Python and C# (https://swagger.io/blog/api-development/swagger-annotation-libraries/), but I am not able to find any support for C++.
I don't think so. As far as I know, the Swagger Core Annotations are part of the Swagger Core project, and in the documentation you can find:
Swagger Core is a Java implementation of the OpenAPI Specification.
Current version supports JAX-RS2.
The Prerequisites section also says:
You need the following installed and available in your $PATH:
Java 8
Apache Maven 3.0.4 or greater
Jackson 2.4.5 or greater
Update:
I'm not quite sure if oatpp-swagger can fit your requirements.

Meteor js localization

I have a Meteor app and I want to support multiple languages.
How can I support multiple languages in a Meteor.js app?
Is there a recommended pattern? I couldn't find anything in the documentation.
Meteor suggests you don't write your own packages at the moment, and there appears not to be a bundled l10n package as part of the project.
Two projects worth looking at for server side patterns in Node are i18n-node and node-polyglot.
Both these projects use simple JSON structures loaded from locale directories and both create new translation keys when you first use them. If you need to create a JSON language pack from an existing source, you could use my PO->JSON converter.
UPDATE:
I just found this Meteor wrapper for i18next
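The pattern described above (one key-to-text map per locale, with missing keys added the first time they are requested) can be sketched as follows. This is only an illustration of the idea in plain Java, not the API of i18n-node, node-polyglot or i18next; the class name LocaleCatalog and its methods are hypothetical, and the in-memory maps stand in for the JSON files that would normally be loaded from a locale directory.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one key -> text map per locale, kept in memory.
public class LocaleCatalog {
    // locale code -> (translation key -> translated text)
    private final Map<String, Map<String, String>> catalogs = new LinkedHashMap<>();

    // Register a known translation, as if it had been read from a locale file.
    public void put(String locale, String key, String text) {
        catalogs.computeIfAbsent(locale, l -> new LinkedHashMap<>()).put(key, text);
    }

    // Look up a key; an unknown key is stored with itself as the default
    // text, so it shows up in the catalog for a translator to fill in later.
    public String translate(String locale, String key) {
        Map<String, String> catalog =
                catalogs.computeIfAbsent(locale, l -> new LinkedHashMap<>());
        return catalog.computeIfAbsent(key, k -> k);
    }

    // Read-only view of everything currently known for a locale.
    public Map<String, String> entries(String locale) {
        Map<String, String> catalog = catalogs.get(locale);
        if (catalog == null) {
            return Collections.emptyMap();
        }
        return Collections.unmodifiableMap(catalog);
    }

    public static void main(String[] args) {
        LocaleCatalog catalog = new LocaleCatalog();
        catalog.put("de", "Hello", "Hallo");
        System.out.println(catalog.translate("de", "Hello"));
        System.out.println(catalog.translate("de", "Goodbye"));
        System.out.println(catalog.entries("de"));
    }
}
```

A real setup would write the updated map back to the locale file so that newly seen keys persist between runs.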

How to convert ODT to DOC/RTF without openoffice.org

Is there any way to convert ODT documents to DOC or RTF on Linux without OpenOffice or any library that relies on having OpenOffice installed?
OpenOffice.org and its derivatives (LibreOffice, Symphony, etc.) currently have one of the best converters between ODF and the Microsoft formats (besides the ODF support built into MS Office).
If those converters are not an option for you, there are some alternatives. First, you might want to check out the KOffice project, which also offers command line tools for file conversion:
KOffice - File Filters
Then there is another open source project with a free BSD license available on SourceForge:
OpenXML/ODF Translator
This project offers not only add-ins for Microsoft Office, but also a stand-alone command line version which also runs on Linux.
Then there would also be a different approach: You can automate Google Docs using command line tools:
googlecl: Command line tools for the Google Data APIs
Google Docs file conversion was originally based on the OpenOffice.org file filters internally, but as far as I know these have been replaced by Aspose, a library for document formats.
Aspose is available in several versions, and since you depend on Linux you might want to check out their Java version:
Aspose.Words for Java
The library has its price, but you won't find another library of that quality that is not a full office suite.
If you don't want to use OpenOffice, Google Docs is your best bet. It is cross-platform, web-based and free, and the conversion takes about 2 minutes. Upload the file with the convert option checked, then download it again as a DOC or PDF (depending on what you want).
http://docs.google.com/
You could try this freeware (Docx2Rtf) and run it under WINE.
Check out unoconv. It relies on OpenOffice.org at its core, but it doesn't rely on any GUI packages. I assume this is what you want?
Use http://zamzar.com/. It has great support for all those formats and does not rely on any other installed program.
And of course, being a web page, it will work on any OS.

What is the standard format for localised resource files on different development platforms?

When developing in .NET, the framework provides resx files as the standard way of storing localised resources (e.g. translations of UI text).
I would like to know if there is a standard format for this in other development platforms (e.g. Java, RoR, etc.) and what that format is.
Thank you!
Please limit each answer to one development technology (e.g. Java, C++, PHP, etc.).
Java uses Properties, which are key-value pairs.
They can be serialized to the following two formats:
.properties
foo=bar
.XML
<entry key="foo">bar</entry>
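As a small illustration of the two formats above, the following sketch reads the `foo=bar` example as .properties text and writes the same pair out as XML using `java.util.Properties`. The class name PropertiesFormats and its helper methods are made up for this example; `load` and `storeToXML` are the standard library calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper class; only Properties.load and
// Properties.storeToXML come from the Java standard library.
public class PropertiesFormats {

    // Parse text in the .properties format (key=value lines).
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    // Serialize the same key-value pairs to the XML properties format.
    static String toXml(Properties props) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.storeToXML(out, null); // null means no comment element; UTF-8 by default
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Properties props = parse("foo=bar\n");
        System.out.println(props.getProperty("foo"));
        System.out.println(toXml(props));
    }
}
```

For localized text, the .properties files are typically named per locale and looked up through `java.util.ResourceBundle`, which picks the file matching the requested locale.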
Like Java, Adobe Flex also uses ResourceBundles that are serialized to .properties files.
There is a standard, called POSIX, that applies to just about every non-Windows operating system.
See http://www.freebsd.org/doc/en/books/developers-handbook/posix-nls.html
See http://www.php.net/manual/en/book.intl.php for the PHP-specific implementation of internationalization.
Large translation vendors accept the TMX file format for interchange of translation strings. Because they only have to deal with a standard XML file rather than strings embedded in controls, the amount of work these vendors have to do is reduced, and so are their fees.
The standard way to do this on Linux is to use the gettext library, which stores its translations in .po files.
Cocoa applications (Mac/iPhone) are distributed as bundles (essentially folders with a known, file-like type). Inside a bundle, you can provide copies of strings files or other localized resources in a locale-specific subfolder. Xcode provides IDE support for this, and the Cocoa frameworks provide many methods to conveniently fetch these resources.
See http://developer.apple.com/mac/library/documentation/MacOSX/Conceptual/BPInternational/Articles/InternatAndLocaliz.html for details.

Resources