GZip stream compression with Delphi (optionally with tar) - delphi

I have been searching for hours for a way to create a valid .tar.gz file using streams in Delphi 10.
I was able to solve the tarball part using LibTar, which works well.
After some searching I also found examples to decompress gzip data using just System.ZLib. The secret lies in the WindowBits parameter:
// windowBits = 15 + 16: 15-bit window, +16 selects gzip-only mode
DecompStream:= TZDecompressionStream.Create(SourceStream, 15 + 16);
TarStream:= TTarArchive.Create(DecompStream);
TarStream.Reset;
while TarStream.FindNext(DirRec) do {...} TarStream.ReadFile(TargetStream);
Great! But can System.ZLib really decompress gzip (I guess the +16 just makes it skip the gzip header?) and yet not write such a header itself? Whatever I try, I only get a file that 7zip or WinRar cannot open, because the header is missing.
Maybe it just can't work, because the gzip header contains a checksum, so it's not possible to write the header without knowing the following data. How to solve this? (Edit: this is wrong, see comments; the CRC-32 is in the trailer.)
It seems many others also have this problem. I found and tried multiple solutions for adding this header, but nothing really worked, and everything requires adding long units (not nice but acceptable) or even DLLs (not acceptable for me).

The secret lies in the WindowBits parameter - sounds familiar? :)
Believe it or not, compressing to gzip just works the same way! I couldn't find this anywhere using Google, or in the Embarcadero documentation/help. But have a look at this comment in the System.ZLib source of Delphi Tokyo:
Add 16 to windowBits to write a simple gzip header and
trailer around the compressed data instead of a zlib wrapper. The
gzip header will have no file name, no extra data, no comment, no
modification time (set to zero), no header crc, and the operating
system will be set to 255 (unknown).
It works:
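TargetStream:= TFileStream.Create(TargetFilename, fmCreate);
CompressStream:= TZCompressionStream.Create(TargetStream, zcDefault, 15 + 16);
TarStream:= TTarWriter.Create(CompressStream);
TarStream.AddStream(SourceStream1, SourceFilename1, Now);
TarStream.AddString(SourceString2, SourceFilename2, Now);
// Free in reverse order of creation so that the end of the tar archive
// and the gzip trailer are flushed before the file is closed
TarStream.Free;
CompressStream.Free;
TargetStream.Free;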
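The same windowBits convention exists in other zlib bindings, which makes it easy to check the result independently of Delphi. The following is a minimal Python sketch, not the Delphi code above: it builds a small tar archive in memory, wraps it using wbits = 15 + 16, and compares the last eight bytes of the output with the CRC-32 and length of the input. The helper names (make_tar, gzip_bytes) and the member names are made up for illustration.

```python
import io
import struct
import tarfile
import zlib

GZIP_WBITS = 15 + 16  # 15-bit window; adding 16 selects the gzip wrapper


def make_tar(members):
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def gzip_bytes(data):
    """Compress data with a gzip header and trailer instead of a zlib wrapper."""
    compressor = zlib.compressobj(wbits=GZIP_WBITS)
    return compressor.compress(data) + compressor.flush()


tar_data = make_tar({"a.txt": b"hello", "b.txt": b"world"})
tar_gz = gzip_bytes(tar_data)

# The trailer is the last 8 bytes: CRC-32 and size of the uncompressed data,
# both little-endian. Because they come last, the header can be written
# before any of the data is known.
crc, size = struct.unpack("<II", tar_gz[-8:])
print(tar_gz[:2] == b"\x1f\x8b", crc == zlib.crc32(tar_data), size == len(tar_data))
```

Reading the result back with tarfile.open(fileobj=..., mode="r:gz") is a quick way to confirm that a standard gzip reader accepts the stream.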

Related

Django silently discarding uploaded files with long paths

I am having an issue where Django Rest Framework appears to be silently discarding uploaded files with long paths.
Here is my view class and post method:
class UploadMediaViewSet(viewsets.ViewSet):
    parser_classes = [parser.MultiPartParser]

    # POST /api/upload/media/
    def create(self, request):
        LOG.info(f"************** request.FILES = {request.FILES}")
The form data that is sent is as follows:
------WebKitFormBoundaryBEDAIwXzG6Ik2xVY
Content-Disposition: form-data; name="transactionId"
804d4146-0947-4d96-90b5-8ffbbc0b2135
------WebKitFormBoundaryBEDAIwXzG6Ik2xVY
Content-Disposition: form-data; name="oOJGp433ODZvBOZTCXNz1oO7ogG0j3BRRBo98jpx1iIlvMPeNoc8nBKvpoTjx9PsOl5ulGGWniur3TdbDSd9TpgsnWhhqurcQO3TnssSQNHWti7xm7nZGW6tFRtrjrvwoJm9Bds5AsMcNKxT7oBkzA35fA1fgo5jkiUAfHHiduMdGIYf3NJGk8LP54JAORfYEK05mdHdQ4zfpMKfDUNJLnc5tk3H/AndroidLandscape.mp4"; filename="oOJGp433ODZvBOZTCXNz1oO7ogG0j3BRRBo98jpx1iIlvMPeNoc8nBKvpoTjx9PsOl5ulGGWniur3TdbDSd9TpgsnWhhqurcQO3TnssSQNHWti7xm7nZGW6tFRtrjrvwoJm9Bds5AsMcNKxT7oBkzA35fA1fgo5jkiUAfHHiduMdGIYf3NJGk8LP54JAORfYEK05mdHdQ4zfpMKfDUNJLnc5tk3H/AndroidLandscape.mp4"
Content-Type: video/mp4
------WebKitFormBoundaryBEDAIwXzG6Ik2xVY
Content-Disposition: form-data; name="oOJGp433ODZvBOZTCXNz1oO7ogG0j3BRRBo98jpx1iIlvMPeNoc8nBKvpoTjx9PsOl5ulGGWniur3TdbDSd9TpgsnWhhqurcQO3TnssSQNHWti7xm7nZGW6tFRtrjrvwoJm9Bds5AsMcNKxT7oBkzA35fA1fgo5jkiUAfHHiduMdGIYf3NJGk8LP54JAORfYEK05mdHdQ4zfpMKfDUNJLnc5tk3H/Yym32tTMGQAfAMVGFTUJA1z9zQB3YremlDV1Hluotwj21UZWP9Aop6QTPvUMVIZVS8Hk6gADadVu4TihPloTy5N7JX99SgPqf3JZILRSMtEMCXLeT4gw34aq5e0HfxetOlKHTx6m2uS1SLFHi8OvcujtWEIAlTfXQW5pvsFGMJYOwNwWjncOoZETXaTs1LspDUHchPEHypp4CHEM5Y3e5HhsKBkA9cFJs6oA26XQW7y/AndroidPortrait.mp4"; filename="oOJGp433ODZvBOZTCXNz1oO7ogG0j3BRRBo98jpx1iIlvMPeNoc8nBKvpoTjx9PsOl5ulGGWniur3TdbDSd9TpgsnWhhqurcQO3TnssSQNHWti7xm7nZGW6tFRtrjrvwoJm9Bds5AsMcNKxT7oBkzA35fA1fgo5jkiUAfHHiduMdGIYf3NJGk8LP54JAORfYEK05mdHdQ4zfpMKfDUNJLnc5tk3H/Yym32tTMGQAfAMVGFTUJA1z9zQB3YremlDV1Hluotwj21UZWP9Aop6QTPvUMVIZVS8Hk6gADadVu4TihPloTy5N7JX99SgPqf3JZILRSMtEMCXLeT4gw34aq5e0HfxetOlKHTx6m2uS1SLFHi8OvcujtWEIAlTfXQW5pvsFGMJYOwNwWjncOoZETXaTs1LspDUHchPEHypp4CHEM5Y3e5HhsKBkA9cFJs6oA26XQW7y/AndroidPortrait.mp4"
Content-Type: video/mp4
------WebKitFormBoundaryBEDAIwXzG6Ik2xVY--
When my create() method receives the request, I find that request.FILES contains only the first file (AndroidLandscape.mp4). The second file (AndroidPortrait.mp4) seems to be silently discarded.
I suspect that this is being done by parser.MultiPartParser, but I'm not sure.
Is it being discarded because the path is too long?
(Update: I did some testing, and 470 characters seems to be the magic path length limit. If the path is 471 characters or longer, the file is NOT included in request.FILES)
If upload paths cannot be that long, I can accept that, but I need to detect that this has happened so that I can return an appropriate error response to the client, instead of silently discarding files. If so, how can I detect that in my method?
I finally found out why this is happening. The multipart parser that Django Rest Framework's MultiPartParser relies on (Django's own django/http/multipartparser.py) has the maximum length of each part header hard-coded at 1024 bytes. With a long file path, the part header exceeds that size, and the parser stops reading the header after 1024 bytes. The header is then treated as invalid and the file is discarded. This is consistent with the observed 470-character limit: the path appears twice in the header (as name and as filename), so two 470-character paths plus roughly 80 bytes of fixed header text come to about 1024 bytes. Unfortunately, this "overflow" is silently swallowed, leaving the developer's code no way to know that it happened, or that an upload of the file was even attempted.
I was able to implement a working solution by subclassing/overriding the affected classes, copy/pasting the library code for the affected methods, and finally changing the hard coded 1024 byte limit to a higher number.
It's not a great solution, because patching library code is brittle, and the solution could cause conflicts in future versions of DRF, but that's the only solution that I see at this point.
If anyone wants to implement this solution, the code with the hard coded limit is this:
in .../site-packages/django/http/multipartparser.py:
class Parser:
    def __init__(self, stream, boundary):
        self._stream = stream
        self._separator = b'--' + boundary

    def __iter__(self):
        boundarystream = InterBoundaryIter(self._stream, self._separator)
        for sub_stream in boundarystream:
            # Iterate over each part
            yield parse_boundary_stream(sub_stream, 1024)
The 1024 in the last line must be increased to the desired max header length.
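For the detection part of the question, one option that avoids patching the library is to scan the raw request body for part headers that are longer than the limit. The following is only a sketch under assumptions: oversized_part_headers and make_part are hypothetical helpers, not Django or DRF functions, the raw body and boundary are assumed to be available as bytes, and the length check only approximates how the parser measures a header.

```python
def oversized_part_headers(body, boundary, limit=1024):
    """Return (part_index, header_length) for parts whose header exceeds limit."""
    separator = b"--" + boundary
    oversized = []
    # Everything before the first separator is preamble, so skip piece 0.
    for index, piece in enumerate(body.split(separator)[1:]):
        if piece.startswith(b"--"):
            break  # closing delimiter "--boundary--"
        if piece.startswith(b"\r\n"):
            piece = piece[2:]
        end = piece.find(b"\r\n\r\n")
        # Count the blank line that terminates the header block.
        header_length = len(piece) if end == -1 else end + 4
        if header_length > limit:
            oversized.append((index, header_length))
    return oversized


def make_part(boundary, name, payload):
    """Build one multipart part shaped like the ones in the question."""
    header = b"".join([
        b'Content-Disposition: form-data; name="', name,
        b'"; filename="', name, b'"\r\n',
        b"Content-Type: video/mp4\r\n\r\n",
    ])
    return b"--" + boundary + b"\r\n" + header + payload + b"\r\n"


boundary = b"ExampleBoundary"
body = (
    make_part(boundary, b"short.mp4", b"data")
    + make_part(boundary, b"x" * 600 + b".mp4", b"data")
    + b"--" + boundary + b"--\r\n"
)
flagged = oversized_part_headers(body, boundary)
print(flagged)
```

A non-empty result could be turned into an error response. Splitting on the separator is naive (binary payloads are not streamed), so this is a diagnostic aid rather than a replacement parser.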

Which compression types support chunking in dask?

When processing a large single file, it can be broken up like so:
import dask.bag as db
my_file = db.read_text('filename', blocksize=int(1e7))
This works great, but the files I'm working with have a high level of redundancy and so we keep them compressed. Passing in compressed gzip files gives an error that seeking in gzip isn't supported and so it can't be read in blocks.
The documentation here http://dask.pydata.org/en/latest/bytes.html#compression suggests that some formats support random access.
The relevant internal code I think is here:
https://github.com/dask/dask/blob/master/dask/bytes/compression.py#L47
It looks like lzma might support it, but it's been commented out.
Adding lzma into the seekable_files dict like in the commented out code:
from dask.bytes.compression import seekable_files
import lzmaffi
seekable_files['xz'] = lzmaffi.LZMAFile
data = db.read_text('myfile.jsonl.lzma', blocksize=int(1e7), compression='xz')
Throws the following error:
Traceback (most recent call last):
  File "example.py", line 8, in <module>
    data = bag.read_text('myfile.jsonl.lzma', blocksize=int(1e7), compression='xz')
  File "condadir/lib/python3.5/site-packages/dask/bag/text.py", line 80, in read_text
    **(storage_options or {}))
  File "condadir/lib/python3.5/site-packages/dask/bytes/core.py", line 162, in read_bytes
    size = fs.logical_size(path, compression)
  File "condadir/lib/python3.5/site-packages/dask/bytes/core.py", line 500, in logical_size
    g.seek(0, 2)
io.UnsupportedOperation: seek
I assume that the functions at the bottom of the file (get_xz_blocks, for example) can be used for this, but they don't seem to be in use anywhere in the dask project.
Are there compression libraries that do support this seeking and chunking? If so, how can they be added?
Yes, you are right that the xz format can be useful to you. The confusion is that the file may be block-formatted, but the standard implementation lzmaffi.LZMAFile (or lzma) does not make use of this blocking. Note that block-formatting is optional for xz files; it is enabled, e.g., by using --block-size=size with xz-utils.
The function compression.get_xz_blocks will give you the set of blocks in a file by reading the header only, rather than the whole file, and you could use this in combination with delayed, essentially repeating some of the logic in read_text. We have not put in the time to make this seamless; the same pattern could be used to write blocked xz files too.
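The general idea of reading only one block can be illustrated with the standard lzma module. Note that this sketch swaps in a simpler technique than the one described above: instead of reading the block index of a block-formatted xz file (which is what get_xz_blocks is said to do), it compresses each chunk as an independent xz stream and keeps its own list of offsets. The function names and the sample data are made up for illustration.

```python
import lzma


def compress_in_blocks(data, block_size):
    """Compress data as independent xz streams; return (blob, index).

    index holds one (offset, compressed_length) pair per block.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), block_size):
        chunk = lzma.compress(data[start:start + block_size])
        index.append((len(blob), len(chunk)))
        blob += chunk
    return bytes(blob), index


def read_block(blob, index, i):
    """Decompress only block i, without touching the other blocks."""
    offset, length = index[i]
    return lzma.decompress(blob[offset:offset + length])


data = b"".join(b'{"id": %d}\n' % n for n in range(1000))
blob, index = compress_in_blocks(data, block_size=4096)
third = read_block(blob, index, 2)
print(len(index), len(third))
```

Each read_block call could be wrapped in dask.delayed so that blocks are decompressed in parallel. The concatenated streams still form a valid xz file, so lzma.decompress(blob) returns the whole input.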

Using MSXML2.ServerXMLHTTP to access data from a web page returns truncated data in Lua

I am trying to download a source code file from a web site which works fine for small files, but a couple of larger ones get truncated.
The example below should be returning a file 146,135 bytes in size, but returns one of 141,194 bytes with a status of 200.
I have tried winhttp.winhttprequest.5.1 as well, but both seem to truncate at the same point.
I have also found quite a few people with similar problems, but have not been able to find a solution.
require('luacom')
http = luacom.CreateObject('MSXML2.ServerXMLHTTP')
http:Open("GET","http://www.family-historian.co.uk/wp-content/plugins/forced-download2/download.php?path=/wp-content/uploads/formidable/tatewise/&file=Map-Life-Facts3.fh_lua&id=190",true)
http:Send()
http:WaitForResponse(30)
print('Status: '..http.Status)
print('----------------------------------------------------------------')
headers = http:GetAllResponseHeaders()
data = http.Responsetext
print('Data Size = '..#data)
print('----------------------------------------------------------------')
print(headers)
I finally worked out what was going on so will post it here for others.
To avoid the truncation I needed to use ResponseBody and not ResponseText. What appears to be happening is that the file is being sent in binary format. The ResponseText data is the same number of bytes as the ResponseBody one, but is in UTF-8 format. This means that a number of bytes equal to the number of special characters in the file (which are double-byte in UTF-8) is dropped from the end of ResponseText. I am not sure at what level the "mistake" in the length is made, but the way to avoid it is to use ResponseBody.
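The size mismatch described above is consistent with the difference between counting bytes and counting decoded characters. As a small Python illustration (not the Lua/COM code from the question, and with a made-up helper name and sample string), every two-byte UTF-8 character makes the byte count exceed the character count by one:

```python
def length_gap(raw, encoding="utf-8"):
    """Return how many more bytes than decoded characters the payload has."""
    return len(raw) - len(raw.decode(encoding))


# Two characters in this sample (i with diaeresis, e with acute accent)
# take two bytes each in UTF-8.
raw = "na\u00efve caf\u00e9".encode("utf-8")
gap = length_gap(raw)
print(len(raw), gap)
```

If a consumer sizes its buffer by the character count but copies bytes, it loses exactly that many bytes at the end, which is why working with the raw body avoids the problem.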

any functions to create zip file of directory/file on vista with delphi 2009

I am looking for a simple method of zipping and compressing with Delphi. I have already looked at the components at Torry's Delphi Pages: http://www.torry.net/pages.php?s=99. They all seem as though they would accomplish what I want. However, none of them run in Delphi 2009, and they are very complex, which makes it difficult for me to port them to Delphi 2009. Besides, the documentation on them is scarce, at least to me. I need basic zipping functionality without the overhead of using a bunch of DLLs.
My quest led me to FSCTL_SET_COMPRESSION, which I thought would settle the issue, but unfortunately this did not work either. CreateFile looked promising, until I tried it and it yielded the same result as FSCTL_SET...
I know that there is some limited native zipping capability on Windows. For instance, if one right-clicks a file or folder and selects Send To -> Zipped Folder, a zipped archive is smartly created. I think that if I were able to access that capability from Delphi, it would be a solution. On a side issue, does Linux have its own native zipping functions that can be used similarly?
TurboPower's excellent Abbrevia can be downloaded for D2009 here. D2010 support is underway and already available in svn according to their forum.
Abbrevia used to be a commercial (for $$$) product, which means that the documentation is quite complete.
I use Zipforge. Why are there problems porting these to D2009? Is it because of the 64bit??
Here is some sample code
procedure ZipIt;
var
  Archiver: TZipForge;
begin
  // Create before the try block so that Free is only reached for a valid object
  Archiver := TZipForge.Create(nil);
  try
    with Archiver do begin
      // Inside the with block, FileName refers to Archiver.FileName
      FileName := 'c:\temp\myzip.zip';
      // Create a new archive file
      OpenArchive(fmCreate);
      // Set BaseDir to the folder that contains the files to add
      BaseDir := 'c:\temp\';
      // Add a single file from BaseDir to the archive
      AddFiles('myfiletozip.txt');
      // Close the archive
      CloseArchive;
    end;
  finally
    Archiver.Free;
  end;
end;
If you can "do" COM from Delphi, then you can take advantage of the built-in zip capability of the Windows shell. It gives you good basic capability.
In VBScript it looks like this:
Sub CreateZip(pathToZipFile, dirToZip)
    WScript.Echo "Creating zip (" & pathToZipFile & ") from folder (" & dirToZip & ")"
    Dim fso
    Set fso= Wscript.CreateObject("Scripting.FileSystemObject")
    If fso.FileExists(pathToZipFile) Then
        WScript.Echo "That zip file already exists - deleting it."
        fso.DeleteFile pathToZipFile
    End If
    If Not fso.FolderExists(dirToZip) Then
        WScript.Echo "The directory to zip does not exist."
        Exit Sub
    End If
    NewZip pathToZipFile
    dim sa
    set sa = CreateObject("Shell.Application")
    Dim zip
    Set zip = sa.NameSpace(pathToZipFile)
    WScript.Echo "opening dir (" & dirToZip & ")"
    Dim d
    Set d = sa.NameSpace(dirToZip)
    For Each s In d.items
        WScript.Echo s
    Next
    ' http://msdn.microsoft.com/en-us/library/bb787866(VS.85).aspx
    ' ===============================================================
    ' 4 = do not display a progress box
    ' 16 = Respond with "Yes to All" for any dialog box that is displayed.
    ' 128 = Perform the operation on files only if a wildcard file name (*.*) is specified.
    ' 256 = Display a progress dialog box but do not show the file names.
    ' 2048 = Version 4.71. Do not copy the security attributes of the file.
    ' 4096 = Only operate in the local directory. Don't operate recursively into subdirectories.
    WScript.Echo "copying files..."
    zip.CopyHere d.items, 4
    ' wait until finished
    sLoop = 0
    Do Until d.Items.Count <= zip.Items.Count
        Wscript.Sleep(1000)
    Loop
End Sub
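The script calls a NewZip helper that is not shown. The shell copies into an archive file that has to exist already, so such a helper presumably writes an empty zip file first; that is an assumption about the missing code, not something taken from it. An empty zip archive consists of just the 22-byte end-of-central-directory record. The Python sketch below writes one (new_zip is a made-up name) and checks the bytes with the standard zipfile module:

```python
import io
import zipfile

# End-of-central-directory signature followed by 18 zero bytes:
# no entries, no central directory, no comment.
EMPTY_ZIP = b"PK\x05\x06" + b"\x00" * 18


def new_zip(path):
    """Create an empty zip archive at path (hypothetical NewZip equivalent)."""
    with open(path, "wb") as f:
        f.write(EMPTY_ZIP)


# Confirm that a standard zip reader accepts the bytes as an empty archive.
with zipfile.ZipFile(io.BytesIO(EMPTY_ZIP)) as archive:
    names = archive.namelist()
print(len(EMPTY_ZIP), names)
```

The same 22 bytes can be written from Delphi or VBScript before handing the file to Shell.Application.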
COM also allows you to use DotNetZip, which is a free download that does password-encrypted zips, zip64, self-extracting archives, Unicode, spanned zips, and other things.
Personally I use VCL Zip, which runs perfectly fine with D2009 and D2010. It costs $120 at the time of this post, but it is very simple, flexible and, most of all, FAST.
Have a look at VCLZIP and download the trial if you're interested.
Code-wise:
VCLZip1.ZipName := 'myfiles.zip';
VCLZip1.FilesList.add('c:\mydirectory\*.*');
VCLZip1.Zip;
is all you need for a basic zip, you can of course set compression levels, directory structures, zip streams, unzip streams and much more.
Hope this is of some assistance.
RE
Take a look at this OpenSource SynZip unit. It's even faster for decompression than the default unit shipped with Delphi, and it will generate a smaller exe (crc tables are created at startup).
No external dll is needed. Works from Delphi 6 up to XE. No problem with Unicode version of Delphi. All in a single unit.
I just made some changes to handle Unicode file names inside Zip content, not only Win-Ansi charset but any Unicode chars. Feedback is welcome.

Opening File paths with spaces in Delphi 5

(Using Delphi 5)
I am attempting to open a log file using the following code:
// The result of this is:
// C:\Program Files\MyProgram\whatever\..\Blah\logs\mylog.log
fileName := ExtractFilePath(Application.ExeName) + '..\Blah\logs\mylog.log';
// The file exists check passes
if (FileExists(fileName)) then
begin
  logs := TStringList.Create();
  // An exception is thrown here: 'unable to open file'
  logs.LoadFromFile(fileName);
end;
If I relocate the log file to C:\mylog.log the code works perfectly. I'm thinking that the spaces in the file path are messing things up. Does anyone know if this is normal behavior for Delphi 5? If it is, is there a function to escape the space or transform the path into a windows 8.3 path?
I'm pretty sure that Delphi 5 handles spaces in filenames OK, but it has been a very long time since I have used that specific version. Is the file currently open by another process? It could also be a permissions issue. Instead of loading it into a tStringList, can you try opening it with a tFileStream with the file mode set to "fmOpenRead or fmShareDenyNone"?
fStm := tFileStream.Create( filename, fmOpenRead or fmShareDenyNone );
then load your tStringlist from the stream:
Logs.LoadFromStream ( fStm );
Are you sure it's not the "..\" that's causing the problem rather than the spaces? Have you tried to see if it works at
c:\My\Path\nospaces\
If so and you are always using the ..\ path, maybe write a simple function to remove the last folder from your application path and create a full correct pathname.
It's odd that Delphi 5 would throw errors about this. I know of an issue with FileExists failing on files with an invalid last-modified-date (since it internally uses FileAge), but it's the opposite here. Instead of using "..\" I would consider risking the current path, and loading from a relative path: LoadFromFile('..\Something\Something.log'); especially for smaller applications, or by calling ExtractFilePath twice: ExtractFilePath(ExtractFilePath(Application.ExeName))
I'm pretty sure Delphi has always handled spaces so I doubt that is the issue.
You don't show the full path. Any chance it is really long? For example I could believe an issue with paths longer than 255 characters.
It's also a bad idea to put log files under Program Files. Often normal users are not given permission to write to anything under Program Files.
Delphi 5 can open files with spaces; that is certainly not the problem. To prove it, try copying the file to c:\my log.log. It should open fine.
Is there any more information in the error message you receive? The most likely thing is that someone else (perhaps your own program) is still writing to the log.
The spaces are not a problem. While the '..' could be a problem in Delphi 5, most probably the file is locked by the process that writes to it. If you have control of it, make sure it opens the file with fmShareDenyWrite and not fmShareExclusive or fmShareCompat (which is the default).
Also, you can use:
fileName := ExpandFileName(ExtractFilePath(Application.ExeName) + '..\Blah\logs\mylog.log');
to obtain the absolute path from a relative path.
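What this normalization does to the path from the question can be shown with Python's ntpath module, which applies Windows path rules on any platform. This is an illustration only: ntpath.normpath is purely lexical, whereas ExpandFileName also resolves against the current directory, and the executable name app.exe and the helper name are made up.

```python
import ntpath


def expand_log_path(exe_name, relative):
    """Join relative onto the executable's folder and collapse '..' segments."""
    exe_dir = ntpath.dirname(exe_name)
    return ntpath.normpath(ntpath.join(exe_dir, relative))


resolved = expand_log_path(
    r"C:\Program Files\MyProgram\whatever\app.exe",
    r"..\Blah\logs\mylog.log",
)
print(resolved)
```

The spaces in "Program Files" pass through untouched; only the "whatever\.." pair is removed.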
Also, as others have said, it is not a good idea to write anything in Program Files. Regular users (those who are not Administrators or Power Users) do not have rights to write there (although in Vista it will be virtualized, it is still not a good idea). Use the appropriate Application Data folder for the user (or all users). This folder can be obtained using:
SHGetFolderPath(0, folder, 0, SHGFP_TYPE_CURRENT, @path[0])
where folder is either CSIDL_COMMON_APPDATA or CSIDL_LOCAL_APPDATA. See this delphi.about.com article for an example.
Simple :
// if log file = "C:\Program files\mylog.log"
// you'll get :
// »»»»» fileName = 'C:\Program files..\Blah\logs\mylog.log'
// if log file = "C:\mylog.log"
// you'll get :
// »»»»» fileName = 'C:..\Blah\logs\mylog.log'
Try this code instead, I'm pretty sure it will fit your needs :
fileName := IncludeTrailingPathDelimiter(ExtractFilePath(Application.ExeName))
+ '..\Blah\logs\mylog.log';
Regards,
Olivier
Delphi 5 has never had a problem opening files with spaces and I am still using it since it is uber stable and works great for older XP apps. You need to check your code closely.

Resources