Splitting an Avro file? - avro

The Avro-Tools package provides an easy way to concatenate multiple Avro files; however, there doesn't seem to be an easy way to split a file.
Does anyone know of a simple command-line tool that allows one to split an Avro file?

Related

Are there any tools that can give a class/struct diagram of a Lua project

Is there any tool that can give me a UML or table diagram for a set of related Lua files? If it can handle XML at the same time (the project has mixed Lua and XML that work together), that's a bonus.
No, the tool that you are looking for does not exist.

How to read and write id3v1 and id3v2 tags in Elixir

I would like to scan music files and read/write metadata using Elixir (this whole project is about learning Elixir - so please don't tell me to use Python!). As I understand it, I have two choices: call a system utility or (as no libraries exist in Erlang or Elixir that I am aware of) write an Elixir library. For m4a files, I make a system call to MP4Box and it writes an xml file to disk. I then read in the file, parse it, and load the data into a database.
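def parse(file_name) do
  # MP4Box dumps the file's metadata to an XML file on disk
  System.cmd("MP4Box", ["-diso", file_name])
  # Read that XML file back in and extract the tags
  Ainur.XmlParser.parse(xml_file_name(file_name))
  |> get_tags
end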
This is very slow, especially for thousands of files, and I want it to run at startup every time to check for changed or new files.
Now I am trying to do the same for MP3s with ID3 tags. I tried libid3-tools on Ubuntu and it only found the ID3v1 tags. eyeD3 only found the ID3v2 tags. My MP3s have both, so I need to make sure they are the same (I suppose I could delete the ID3v1 tags, but I have been led to believe that ID3v1 tags are needed on legacy equipment).
Are there any Erlang or Elixir libraries for music metadata? If not, are system calls to Ubuntu utilities my best choice (any recommendations on which ones)?
Or do I need to write a library to obtain reasonable performance? If so, is there an existing library in a functional language that I could try to port?
Or is it possible to call a library written in another language directly from Elixir (without the system call)?
You can always use Erlang NIFs (http://erlang.org/doc/tutorial/nif.html) to wrap an external library.
In this project we have a module written in Elixir which extracts ID3 tags from mp3:
https://github.com/anisiomarxjr/shoutcast_server/blob/master/lib/mp3_file.ex
To use:
id3 = Mp3File.extract_id3("./test/fixtures/nederland.mp3")
I've implemented ID3v2 tag reading (not writing) in Elixir. It's on GitHub and Hex.
Support is very basic; I implemented the bare minimum to support my use case. There are lots of bugs, but all the building blocks are there to fork/improve/contribute.
You could also try directly reading the binary of the file to find the tag in question.
Check the File.stream!/3 docs to get started.
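To illustrate what "reading the binary directly" involves: an ID3v1 tag is a fixed 128-byte block at the very end of the file that begins with "TAG", while an ID3v2 tag sits at the start of the file behind a 10-byte header that begins with "ID3". The sketch below is in Python purely to show the byte layout; the same slicing should map onto Elixir binary pattern matching. The function names are made up for this example, and it ignores ID3v1.1 track numbers and does not parse any ID3v2 frames.

```python
import io

ID3V1_SIZE = 128  # an ID3v1 tag is always exactly 128 bytes


def parse_id3v1(block):
    """Decode a 128-byte ID3v1 block, or return None if it is not one."""
    if len(block) != ID3V1_SIZE or block[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width and padded with NUL bytes (or spaces).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(block[3:33]),
        "artist": text(block[33:63]),
        "album": text(block[63:93]),
        "year": text(block[93:97]),
        "comment": text(block[97:127]),
        "genre": block[127],  # numeric index into the ID3v1 genre list
    }


def read_id3v1(stream):
    """Read the last 128 bytes of a seekable binary stream as an ID3v1 tag."""
    stream.seek(0, io.SEEK_END)
    if stream.tell() < ID3V1_SIZE:
        return None
    stream.seek(-ID3V1_SIZE, io.SEEK_END)
    return parse_id3v1(stream.read(ID3V1_SIZE))


def id3v2_size(header):
    """Return the ID3v2 tag size (excluding the 10-byte header), or None."""
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    size = 0
    for byte in header[6:10]:
        # The size is "synchsafe": only the low 7 bits of each byte count.
        size = (size << 7) | (byte & 0x7F)
    return size
```

With a file opened in binary mode, read_id3v1(f) covers the ID3v1 side, and id3v2_size(first_ten_bytes) says how many bytes of ID3v2 data follow the header, so both tags can be located without shelling out to an external tool.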

Creating DDS (DXT5) from two files (RGB + alpha). Command-line tool needed

I need a command-line tool to create DDS files (DXT5 format) from two .png files: one with the RGB channels and one with the alpha channel. I have a vast number of images to process, so I can't do it manually. I have no problem writing a script that generates a batch file to process the images one by one, but I need a tool that creates a DDS from two PNGs.
Does anyone know of such a command-line tool?
Thanks.
P.S. nvDXT.exe is very good but it can't combine rgb and alpha from different files.
If you have Photoshop, you could use a batch script (see Batch Scripting Tutorial for an example) to merge the channels. With the NVIDIA plug-in installed, you could probably do the DDS conversion too. Just a thought.

Are there any Java libraries that can read hierarchical XMP data from files?

I tried looking into Apache Tika, but it seems to flatten XMP keywords to a single level.
Are there any Java libraries that can read hierarchical XMP data from files? Even just image files would suffice, but the more file types the better.

What is the standard format for localised resource files on different development platforms?

When developing in .NET, the framework provides resx files as the standard way of storing localised resources (e.g. translations of UI text).
I would like to know if there is a standard format for this in other development platforms (e.g. Java, RoR, etc.) and what that format is.
Thank you!
Please limit each answer to one development technology (e.g. Java, C++, PHP, etc.).
Java uses Properties, which are key-value pairs.
They can be serialized to the following two formats:
.properties:
    foo=bar
XML:
    <entry key="foo">bar</entry>
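To make it concrete that both serializations carry the same key-value data, here is a rough sketch that reads each form into a plain dictionary. It is written in Python only for illustration (in Java itself, java.util.Properties handles both via load and loadFromXML). The function names are made up for this example, and the .properties parser is deliberately simplified: it assumes plain key=value lines and ignores escapes, line continuations, and the other separators the real format allows.

```python
import xml.etree.ElementTree as ET


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":  # '#' and '!' start comment lines
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


def parse_properties_xml(text):
    """Collect every <entry key="...">value</entry> element into a dict."""
    root = ET.fromstring(text)
    return {entry.get("key"): (entry.text or "") for entry in root.iter("entry")}


plain = "# greeting\nfoo=bar\n"
xml_form = '<properties><entry key="foo">bar</entry></properties>'

# Both forms describe the same mapping.
same = parse_properties(plain) == parse_properties_xml(xml_form)
```

In practice the .properties form is the one most localisation tooling expects; the XML form mainly helps when an explicit character encoding is needed.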
Like Java, Adobe Flex also uses ResourceBundles that are serialized to .properties files
POSIX defines a standard Native Language Support (NLS) interface, based on message catalogs, that is available on most non-Windows operating systems.
See http://www.freebsd.org/doc/en/books/developers-handbook/posix-nls.html
See http://www.php.net/manual/en/book.intl.php for the PHP-specific implementation of internationalization.
Large translation vendors accept the TMX file format for interchange of translation strings. Because they only have to deal with a standard xml file rather than strings embedded in controls, the amount of work these vendors have to do is reduced and so are their fees.
The standard way to do this on Linux is to use the gettext library, which stores its translations in .po files.
Cocoa applications (Mac/iPhone) are distributed as bundles (essentially folders that the system treats as a single file of a known type). Inside a bundle, you can provide copies of .strings files or other localized resources in a locale-specific subfolder. Xcode provides IDE support for this, and the Cocoa frameworks provide many methods to conveniently fetch these resources.
See http://developer.apple.com/mac/library/documentation/MacOSX/Conceptual/BPInternational/Articles/InternatAndLocaliz.html for details.
