I am new to HDF5. I am trying to convert a hyperspectral image raw file to an HDF5 file, but I cannot find the proper way. Does anyone know how to convert a raw file to an HDF5 file?
Thanks in advance.
The only way is to learn the HDF5 APIs. If you use C++, this detailed document should be very useful: http://www.hdfgroup.org/HDF5/doc/H5.intro.html
Moreover, you can use the high-level APIs (http://www.hdfgroup.org/HDF5/doc/HL/). It should be more convenient since all you need is to create a new HDF5 dataset.
You can also consult the HDF help desk at help@hdfgroup.org. They are always willing to help every user very patiently.
If you want to convert a binary file to an HDF5 file, you need:
to know the specification of your raw binary file (how data are stored)
to know how to store data in your HDF5 file
If I had to do such a conversion, I would use a high-level language to do the reading and writing. I would use Python (2 or 3), with the struct module for the reading and the h5py library for writing the HDF5 file. It would cause much less trouble than using a compiled language like C, C++ or Fortran.
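As a rough sketch of that approach, the snippet below reads a headerless raw file with struct and hands the values to h5py. Everything specific here is an assumption made for illustration: the function names read_raw_values and write_hdf5 are hypothetical, the samples are assumed to be unsigned 16-bit little-endian integers, and the image dimensions are assumed to be known from the sensor documentation. h5py and numpy are third-party packages and are only imported when the HDF5 file is written.

```python
import struct


def read_raw_values(path, count, code="H", byte_order="<"):
    """Read `count` numbers from a headerless binary file.

    Assumption: every sample has the same type (default: unsigned
    16-bit, little-endian). Check the sensor's format description.
    """
    fmt = f"{byte_order}{count}{code}"
    size = struct.calcsize(fmt)
    with open(path, "rb") as f:
        data = f.read(size)
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return list(struct.unpack(fmt, data))


def write_hdf5(path, values, shape, name="cube"):
    """Store the values as one HDF5 dataset (needs h5py and numpy)."""
    import h5py
    import numpy as np

    # Assumption: the flat values are in the order implied by `shape`.
    arr = np.asarray(values, dtype=np.uint16).reshape(shape)
    with h5py.File(path, "w") as f:
        f.create_dataset(name, data=arr)
```

A call could look like write_hdf5("image.h5", read_raw_values("image.raw", 100 * 50 * 30), (30, 100, 50)), where the file names and the band, line and sample counts are made up. For very large files, reading in chunks rather than all at once would be the next step.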
Related
I want to develop some simulation software. It produces long arrays of data. Is it a good idea to store this data in an MKV file with a custom codec? The goal is to get fast random access to the data and avoid the headache of handling big arrays (bigger than a 32-bit address space).
And if so, is there a simple MKV C++ library?
Also, mkv is a specific application of EBML, a sort of binary xml language, optimized for media. If you decided the features are right for you, EBML would be what you would use, which would allow you to customize for your specific application.
mkv is the file extension for the Matroska format, which would help you with your search.
Here is the Matroska source code page, which includes links to the EBML and Matroska C libraries.
http://www.matroska.org/team/source-code.html
I'm pretty sure the things you get from MKV are nowhere near as sophisticated for scientific (simulation) data as what HDF5 offers. HDF5 was designed for exactly the use case you describe.
I am looking to speed up the reading of a data file which has been converted from binary to plaintext. (It is my understanding that "binary" can mean a lot of different things. I do not know what type of binary file I have, just that it's a binary file.) I looked into reading files quickly a while ago, and was informed that reading/parsing a binary file is faster than text. So, I would like to parse/read the binary file (the one that was converted to plaintext) in an effort to speed up the program.
I'm using Matlab for this project (I have a Matlab "program" that needs the data in the file). I guess I need some information on the different "types" of binary, but I really want information on how to read/parse said binary file (I know what I'm looking for in plaintext, so I imagine I'll need to convert that to binary, search the file, then pull the result out into plaintext). The file is a logfile, if that helps in any way.
Thanks.
There are several issues in what you are asking, but above all you need to know the format of the file you are reading. If you can say "At position xx, I can expect to find data yy", that's what you need to know. In your question and comments you talk about searching for strings. You can do that too (much like with a text file): "when I find xxxx in the file, give me the following data up to the nth character, or up to the next yyyy".
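The "when I find xxxx, give me what follows" idea can be sketched in a few lines. The example is in Python rather than Matlab, purely to illustrate the logic; the function names and the marker bytes are hypothetical and would have to be replaced by whatever the real logfile format uses.

```python
def extract_after(data, marker, length, start=0):
    """Return `length` bytes following the first `marker`, or None."""
    pos = data.find(marker, start)
    if pos == -1:
        return None
    begin = pos + len(marker)
    return data[begin:begin + length]


def extract_between(data, start_marker, end_marker):
    """Return the bytes between `start_marker` and the next `end_marker`."""
    pos = data.find(start_marker)
    if pos == -1:
        return None
    begin = pos + len(start_marker)
    end = data.find(end_marker, begin)
    if end == -1:
        return None
    return data[begin:end]


# Made-up sample: binary noise around a text field.
sample = b"\x00\x01TEMP=23.5;\xff"
print(extract_after(sample, b"TEMP=", 4))
print(extract_between(sample, b"TEMP=", b";"))
```

In practice the data would come from open(path, "rb").read(), and the extracted bytes would still need decoding according to the file's format.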
You want to look at the documentation for fread. In the documentation there are snippets of code that will get you started, but as I (and others) said you need to know the format of your binary files. You can use a hex editor to ascertain some information if you are desperate, but what should be quicker is the documentation for the program that outputs these files.
Regarding different "binary files": multi-byte values can be stored least significant byte first (little-endian) or most significant byte first (big-endian). You really don't need to know about that for this work. There are also other platform-dependent issues which I am almost certain you don't need to know about (unless you are moving the binary files between Mac, PC and Unix machines). If you read to almost the bottom of the fread documentation, there is a section entitled "Reading Files Created on Other Systems" which talks about the issues and how to deal with them.
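To make the byte-order point concrete, here is a small illustration. It uses Python's struct module rather than Matlab, only to show the concept: the same four bytes give two different integers depending on which byte order is assumed.

```python
import struct

raw = bytes([0x01, 0x02, 0x03, 0x04])  # four bytes as they sit in the file

little = struct.unpack("<I", raw)[0]  # least significant byte first
big = struct.unpack(">I", raw)[0]     # most significant byte first

print(f"{little:#010x}")  # 0x04030201
print(f"{big:#010x}")     # 0x01020304
```

If the numbers read from a file look wildly wrong, trying the other byte order is a cheap first check.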
One more comment: you say that "reading/parsing a binary file is faster than text". This is not true (or even if it is, odds are you won't notice the performance gain). In terms of development time, however, reading/parsing a text file will save you huge amounts of time.
The simple way to store data in a binary file is to use the 'save' command.
If you load from a saved variable it should be significantly faster than if you load from a text file.
How can I write a script or program to manipulate Adobe Photoshop files? I'd like to be able to do something like read an Adobe PSD file, rename the layers, and save it back to PSD format.
The files look to be saved with a combination of XML and serialized data. I looked at the file's contents and saw that it has <x:xmpmeta near the start. Some Google searching led me to the Wikipedia article about XMP (Extensible Metadata Platform), but I'm unclear whether that is the format for the entire file or just for the metadata portion.
I saw that there is a PSD parser class for PHP available, and not a bad article about how to use it, although it seems like it is just for reading / converting and not for writing / saving.
But I'd like to know:
What format are these files stored in?
Where are the guidelines for interfacing with that format?
Are there some classes / tools available for manipulating that file format? Any language would be fine for a start.
I'm happy to do more research on my own but I'm hoping for some guidance to know what I should be looking for.
I'm not familiar with it myself, but there is an official SDK for Photoshop available that should let you do all that and more with .psd files.
There are not many options. The general advice would be to look into buying Adobe InDesign Server. In some cases it can be cost prohibitive, and you might be interested in third-party SDKs. Unfortunately, there are only a few options on the market. One of them is the Graphics Mill image processing SDK (http://www.graphicsmill.com/photoshop-psd).
Disclaimer: I work for Aurigma which runs Graphics Mill project.
I need to create an .xls file from array data programmatically on the iPhone. How can this be done?
Maybe you're in trouble, maybe not. The "old" xls format is a binary one, and I am not aware of any free libraries which are able to read or write that format. If this one is required, you're probably out of luck.
If, however, a more recent format will do, you're back in business, because you can use XML (Objective-C wrappers for libxml2 are readily available). Wikipedia features a short overview of the format which you might want to check out: Excel file formats on Wikipedia
I am trying to deserialize an old file format that was serialized in Delphi using binary serialization. I know nothing about the structure of the file except some very high-level records that are in it.
What steps would you take to solve this problem? Any tools etc?
A good hex editor, and use the gray matter to identify structures.
If you get a hint what kind of file it is, you can search for more specialized tools.
Running the Unix/Linux "file" command can be good too (*). See Barry's comment below for how it works. It can be a quick check for common file types like DBF, ZIP, etc. hidden behind a different extension.
(*) there are 3rd party builds for windows, but they might lag in versions. If you can do it on a recent *nix distro, it is advised to do so.
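Where the "file" command is not available, the basic idea behind it (comparing the first bytes of a file against known signatures) can be imitated for a handful of formats. This is only a minimal sketch: the real tool knows far more formats and performs deeper checks, and the table and the function name sniff are made up for this example.

```python
# A few well-known signatures ("magic numbers") found at offset 0.
SIGNATURES = {
    b"PK\x03\x04": "ZIP archive",
    b"%PDF": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\x1f\x8b": "gzip compressed data",
    b"\x89HDF\r\n\x1a\n": "HDF5 file",
    b"8BPS": "Adobe Photoshop document",
}


def sniff(header):
    """Guess a file type from its first bytes; 'unknown' if no match."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path):
    """Read the first 16 bytes of a file and guess its type."""
    with open(path, "rb") as f:
        return sniff(f.read(16))
```

A result of "unknown" says nothing about the file; it only means that none of the few signatures listed here matched.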
The serialization process simply loops over all published properties and streams their values to a text file. If you do not know the exact classes that were streamed to the file, you will have a very hard time deserializing it, if it is possible at all.
A good hex editor comes first. If the file is read without buffering (e.g. read directly from a TFileStream), you could gain some information by using ProcMon from SysInternals: you can see exactly what data is read in what chunks, and thus determine more quickly where the boundaries are between the structures you already identified.
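If no hex editor is at hand, a few lines of Python give a comparable view of the first bytes of a file. This is a minimal sketch, and the function name hexdump is chosen for this example only: it prints the offset, the bytes in hex, and the printable ASCII characters, which is often enough to spot strings, length prefixes and repeating record boundaries.

```python
def hexdump(data, width=16):
    """Format bytes as offset, hex values and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as hex editors do.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Made-up sample bytes; in practice use open(path, "rb").read(256).
print(hexdump(b"ABC\x00\x01 some record name\x05\x00\x00\x00"))
```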