Does speed of tar.gz file listing depend on tar size? - tar

I am using the tf option to list the contents of a tar.gz file. It is pretty large, ~1 GB. There are around 1000 files organized in a year/month/day directory structure.
The listing operation takes quite a bit of time. It seems like a listing should be fast. Can anyone enlighten me on the internals?
Thanks -

Take a look at Wikipedia, for example, to verify that each file inside the tar is preceded by a header. To list all files inside the tar, it is necessary to read the whole archive.
There is no "index" at the beginning of the tar to indicate its contents.
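If it helps to make that concrete, here is a minimal sketch (Python, purely illustrative, ignoring long-name and PAX extensions) of what a listing has to do: read each 512-byte header, record the name, and skip over the member's data. Because the gzip layer is one continuous stream, even "skipping" means decompressing everything, which is why listing a ~1 GB archive takes noticeable time.

import gzip

def list_tar_gz(path):
    names = []
    with gzip.open(path, "rb") as f:               # one long compressed stream
        while True:
            header = f.read(512)                   # every member starts with a 512-byte header
            if len(header) < 512 or header == b"\0" * 512:
                break                              # end-of-archive marker (or truncated file)
            name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
            size = int(header[124:136].split(b"\0", 1)[0] or b"0", 8)   # size field is octal text
            names.append(name)
            f.read((size + 511) // 512 * 512)      # skip the data, padded to 512-byte blocks
    return names

print(list_tar_gz("somefile.gz"))                  # prints every member name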

Tar has a simple file structure. If you want to list the files, you must parse the whole archive.
If you only want to find one file, you can stop processing as soon as you reach it. But you must be sure the archive contains only one version of that file; this is typically the case for compressed archives, because appending to them is not supported.
For example, you can do something like this:
tar tvzf somefile.gz | grep 'pattern_to_find' | \
while read file; do foundfile="$file"; break; done
This way the loop breaks early and does not read everything, only from the start of the archive up to the matching file's position.
If you need to do anything more with the list, save it to a temporary file. You can gzip this file to save space if needed:
tar tvzf somefile.gz | gzip > temporary_filelist.gz

Related

Erlang : exception error: no match of right hand side value {error,enoent} while reading a text file

I am currently working on an Erlang project and am stuck reading a file. I want to read a text file which is in the src folder, where all the Erlang files and the text file sit together. Even so, I am not able to read the file despite specifying file paths. Any help would be appreciated.
start() ->
    {ok,DataList} = file:consult("Calls.txt"),
    io:format("** Calls to be made **"),
    io:fwrite("~w~n",[DataList]).
The data file stores contents like : {john, [jill,joe,bob]}.
Try adding the folder name to the path, or set the full path to the file:
1> {ok,DataList} = file:consult("src/Calls.txt").
Note: the error {error,enoent} means that the file does not exist or that you do not have the rights to read/write it; in the latter case you need to set appropriate permissions (777 or similar).
If you need to use src/calls.txt, that simply means that your IDE (or you) created a src folder in which the calls.txt file was placed, while the IDE runs from a path that only includes the top-level folder (i.e., the root folder for the IDE project). So src/calls.txt must be used in that case. This isn't a problem with Erlang, or even the IDE; it's just the way your project is set up.
You can do either of two things. Move the calls.txt file up one level in the IDE file manager, so that it can be referenced as calls.txt rather than src/calls.txt. You can also just change the path to "calls.txt" before you run it from the command line.
enoent means "Error: No Entry/Entity". It means the file couldn't be found. When I try your code, it works correctly and outputs
[{john,[jill,joe,bob]}]

Is there a way to determine the coverage of a .PO file?

I've got a python program under active development, which uses gettext for translation.
I've got a .POT file with translations, but it is slightly out of date. I've got a script to generate an up-to-date .PO file. Is there a way to check how much of the new .PO file is covered by the .POT file?
I've got a .POT file with translations, but it is slightly out of date. I've got a script to generate an up-to-date .PO file
I think you mean the other way around: POT files are generated from your source code, and PO files contain the translations.
Is there a way to check how much of the new .PO file is covered by the .POT file?
The Gettext command line msgmerge program can be used for syncing your out-of-date PO files with your latest source strings. To create a new PO file from an updated POT you would issue this command:
msgmerge old.po new.pot > updated.po
The new file will contain all the existing translations that are still valid and add any new source strings. Open it in your favourite PO editor and you should see how many strings now remain untranslated.
Update
As pointed out in the comments, you can see how many strings remain untranslated with the "statistics" option of the msgfmt program (normally used for compiling to .mo) e.g.
msgfmt --statistics updated.po
Or without bothering with the interim file:
msgmerge old.po new.pot | msgfmt --statistics -
This would produce a synopsis like:
123 translated messages, 77 untranslated messages.
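If you would rather compute the coverage from inside your Python program instead of parsing msgfmt's output, the third-party polib package (an extra dependency, not part of gettext itself) can report the same numbers from the merged file:

import polib

# "updated.po" is the file produced by the msgmerge step above
po = polib.pofile("updated.po")
translated = len(po.translated_entries())
fuzzy = len(po.fuzzy_entries())
untranslated = len(po.untranslated_entries())
print(f"{translated} translated, {fuzzy} fuzzy, {untranslated} untranslated "
      f"({po.percent_translated()}% coverage)")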

LibTiff.net - Save Directory

I have a massive TIFF file that contains 8 directories (resolutions). It's also tiled.
I can cycle through the directories and get the resolution of each. I want to save the 4th directory to a new TIFF file. I think it's possible, but I can't quite get my hands on it.
Basically want to do this:
using (LibTiff.Classic.Tiff image = LibTiff.Classic.Tiff.Open(file, "r"))
{
    if (image.NumberOfDirectories() > 4) {
        image.SetDirectory(4);
        image.WriteDirectory("C:\\Temp\\Test.tif");
    }
}
It would be so nice if that were possible, but I know I have to create an output image and copy the rows of data into it. Not sure how yet. Any help would be much appreciated.
There are no built-in methods in the LibTiff.Net library that can be used to copy one directory into a new file.
The task is quite complex, and the best place to start is the TiffCP utility's source code.
The utility can not only copy images, it can also extract individual directories.
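For what it's worth, the "pull one directory out into its own file" idea is easy to sketch outside LibTiff.Net, for example with Python's Pillow library (an assumption that Pillow is available and that the page fits in memory; Pillow re-encodes the pixel data rather than copying tags and tiles verbatim, which is exactly why TiffCP is the better model for a faithful copy):

from PIL import Image

with Image.open(r"C:\Temp\massive.tif") as img:    # hypothetical input path
    img.seek(3)                                    # directories are 0-based, so 3 = the 4th directory
    img.save(r"C:\Temp\Test.tif")                  # writes only the selected directory/page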

How to compress multiple folders into one archive?

I have some compression components (like KAZip, JVCL, zLib) and know exactly how to use them to compress files, but I want to compress multiple folders into one single archive and keep the folder structure after extraction. How can I do it?
In all those components I can only give a list of files to compress; I cannot give a structure of folders for extraction. There is no way (or I couldn't find one) to tell where each file must be extracted to:
I have a file named myText.txt in folder FOLDER_A and a file with the same name, myText.txt, in folder FOLDER_B:
|
|__________ FOLDER_A
| |________ myText.txt
|
|__________ FOLDER_B
| |________ myText.txt
|
I can give a list of files to compress: myList(myText.txt, myText.txt), but I can't give the structure for uncompressing the files. What is the best way to record which file belongs to which folder?
The zip format just does not have folders. Well, it kind of does, but they are empty placeholders, only inserted if you need metadata storage like user access rights. Other than those rather rare advanced things, there is no need for folders at all. What is really done - and what you can observe by opening a zip file in Notepad and scrolling to the end - is that each file has its path in it, starting from the "archive root". In your example the zip file should have two entries (two files):
FOLDER_A/myText.txt
FOLDER_B/myText.txt
Note that the separators used are true forward slashes, as in the UNIX world, not the backslashes used in the DOS/Windows world. Some libraries will fix backslashes for you, some will not - just run your own tests.
Now, let's assume that tree is contained in D:\TEMP\Project - just as an example.
D:\TEMP\Project\FOLDER_A\myText.txt
D:\TEMP\Project\FOLDER_B\myText.txt
There are two more questions (other than path separators): are there other folders within D:\TEMP\Project\ that should be ignored rather than zipped (like maybe D:\TEMP\Project\FOLDER_C\*.*)? And does your zip library have a direct API to pack a folder with all its internal subfolders and files, or should you do it file by file?
Those are three questions you should ask yourself and check while choosing the library; the code drafts would differ somewhat accordingly.
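If it helps to see the whole idea outside Delphi before the library-specific drafts, here is a short sketch using Python's zipfile module (illustration only; it assumes the D:\TEMP\Project layout above): walk the tree, strip the common root, and store each file under its relative, forward-slash path.

import os
import zipfile

root = r"D:\TEMP\Project"                      # the "archive root" to strip from every path
with zipfile.ZipFile("project.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)          # e.g. FOLDER_A\myText.txt
            zf.write(full, rel.replace(os.sep, "/"))   # stored as FOLDER_A/myText.txt

with zipfile.ZipFile("project.zip") as zf:
    print(zf.namelist())                       # ['FOLDER_A/myText.txt', 'FOLDER_B/myText.txt']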
Now let's start drafting for the libraries themselves:
The default variant is just using Delphi itself.
Enumerate the files in the folder: http://docwiki.embarcadero.com/CodeExamples/XE3/en/DirectoriesAndFilesEnumeraion_(Delphi)
If that enumeration results in absolute paths, then strip the common D:\TEMP\Project from the beginning: something like If AnsiStartsText('D:\TEMP\Project\', filename) then Delete(filename, 1, Length('D:\TEMP\Project\'));. You should get paths relative to the chosen containing folder - especially important if you do not compress the whole tree and leave some FOLDER_C out of the archive.
Maybe you should also call StringReplace to change '\' into '/' in the filenames.
Then you can zip them using http://docwiki.embarcadero.com/Libraries/XE2/en/System.Zip.TZipFile.Add - take care to specify the correct relative ArchiveFileName, like the aforementioned FOLDER_A/myText.txt.
You can use the ZipMaster library. It is very VCL-bound and may cause trouble with threads or DLLs, but for simple applications it just works. http://www.delphizip.org/
The latest version's page has links to a "setup" package which includes sources, help and demos. Among the demos there is a full-featured archive browser capable of storing folders, so you can read the code directly from it. http://www.delphizip.org/191/v191.html
You talked about JVCL, which means you already have the JEDI Code Library installed. And JCL comes with a class and function that, judging by the name, can directly do what you want: function TJclSevenzipCompressArchive.AddDirectory(const PackedName: WideString; const DirName: string = ''; RecurseIntoDir: Boolean = False; AddFilesInDir: Boolean = False): Integer;
Actually, all those libraries are rather similar at a basic level. When I made an XLSX export I just made a uniform zipping API that is used identically no matter which zipping engine is installed. It works with in-memory TStream objects rather than on-disk files, so it would not help you directly, but I learned that apart from a few quirks (like instant vs. postponed zipping) all those libs work the same at ground level.

How do I create a zip file of a given compressed size in rails

I have a pile of records that need to be converted to XML and then zipped up into a file, so I can send it on to a server that is expecting said records.
The problem I have is that the server can only accept files that are smaller than a given size. Let's say, for argument's sake, 10 MB.
require 'zip/zip'
Zip::ZipOutputStream.open("tmp/myfile_#{Process.pid}.zip") do |zos|
  i_xml.each_with_index do |xml, index|
    zos.put_next_entry("#{index}.xml")
    zos << xml
  end
end
The code above creates the zip file perfectly, but I don't see how I can get the compressed size.
I can give some leeway for the zip header and such, so once I can tell how big my output is, I could tinker. It's just that getting that size doesn't seem to be in the cards with this class.
Note: I've tried installing zipRuby because it has a compressed-size method, but that just leads me down another rabbit hole: native extensions and such.
Can't see anything in the Zip library to do this, sorry.
Consider, if you can:
pushing further with getting zipRuby to compile
breaking the finished zip file into fixed-size chunks with simple File.read statements and putting the chunks back together at the server.
limiting the size of the zip file by limiting the number of files added, e.g. add files until the file size limit is exceeded, then remove the last added file and add it to a new zip file (a sketch of this rolling-over idea follows below)
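Here is a minimal sketch of that last, rolling-over approach (written with Python's zipfile purely to illustrate the logic, not with rubyzip; xml_strings and the 10 MB limit are placeholders). It measures how much each entry compresses on its own and starts a new archive when the next entry would push the current one past the limit, which avoids having to delete an entry that has already been written:

import io
import zipfile

LIMIT = 10 * 1024 * 1024   # leave yourself some leeway: the central directory is added on close

def pack(xml_strings, limit=LIMIT):
    archives = []
    buf = io.BytesIO()
    zf = zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED)
    for index, xml in enumerate(xml_strings):
        # measure the compressed size of this entry on its own
        probe = io.BytesIO()
        with zipfile.ZipFile(probe, "w", zipfile.ZIP_DEFLATED) as p:
            p.writestr(f"{index}.xml", xml)
        entry_size = probe.tell()
        # roll over to a fresh archive if this entry would bust the limit
        if buf.tell() > 0 and buf.tell() + entry_size > limit:
            zf.close()
            archives.append(buf.getvalue())
            buf = io.BytesIO()
            zf = zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED)
        zf.writestr(f"{index}.xml", xml)
    zf.close()
    archives.append(buf.getvalue())
    return archives       # each zip payload stays (approximately) under the limit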
