I have this file, it contains:
"AAAAAAA"
I want to add "11111" to the file above. I tried two different calls, BOTH with seekToFileOffset:0:
fileHandleForWritingAtPath:
"11111AA"
Some of the data at the front of the file is overwritten (gone).
I also tried:
fileHandleForUpdatingAtPath:
It also ended with:
"11111AA"
You have two choices, depending on your skill level:
1. Rewrite the file under a new name, delete the original file, and rename the newly written file to the original name.
2. Rewrite the file in place. For example, using a 1K buffer, start at the end: read the last 1K and rewrite it at the same location plus an offset equal to the number of bytes you want to insert. Repeat for all of the preceding data. When you reach the front of the file, you will have moved all the data by the desired offset and can then write the new data (a sketch of this is shown below).
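A minimal sketch of option 2 in Java, purely for illustration (the same idea applies with NSFileHandle's seek/read/write calls); the class and method names are made up:

import java.io.IOException;
import java.io.RandomAccessFile;

public class PrependInPlace {
    // Inserts `insert` at the start of `path` by shifting the existing
    // contents toward the end, working backwards in 1K chunks.
    static void prepend(String path, byte[] insert) throws IOException {
        final int BUF = 1024;
        try (RandomAccessFile f = new RandomAccessFile(path, "rw")) {
            long pos = f.length();
            byte[] buf = new byte[BUF];
            while (pos > 0) {
                int chunk = (int) Math.min(BUF, pos);
                pos -= chunk;
                f.seek(pos);                     // read a chunk, starting from the tail
                f.readFully(buf, 0, chunk);
                f.seek(pos + insert.length);     // rewrite it shifted by the insert size
                f.write(buf, 0, chunk);
            }
            f.seek(0);                           // the front is now free for the new data
            f.write(insert);
        }
    }
}

Calling prepend("file.txt", "11111".getBytes()) on the example file would leave it containing "11111AAAAAAA".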
Let's assume there is a binary file format that contains a header and blocks of data whose locations and sizes are derivable from the info in the header.
Let's assume that the data can span multiple blocks and has a certain structure within it, e.g. an uncompressed video stream.
Let's assume we want to insert some data into the middle of the file. We can copy the tail into a separate temp file, then overwrite the blocks formerly belonging to the tail, then write the tail back after the new data. This feels inefficient!
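For concreteness, that tail-copy workaround might look roughly like this in Java (a minimal sketch with a made-up method name; a robust version would also loop around transferTo, since it may transfer fewer bytes than requested):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InsertViaTempFile {
    // Inserts `data` into `path` at `offset` by spilling the tail to a temp file.
    static void insert(Path path, long offset, byte[] data) throws IOException {
        Path tmp = Files.createTempFile("tail", ".bin");
        try (FileChannel src = FileChannel.open(path,
                 StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileChannel spill = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            long tailLen = src.size() - offset;
            src.transferTo(offset, tailLen, spill);      // 1. copy the tail out
            src.position(offset);
            src.write(ByteBuffer.wrap(data));            // 2. overwrite with the new data
            try (FileChannel back = FileChannel.open(tmp, StandardOpenOption.READ)) {
                back.transferTo(0, tailLen, src);        // 3. write the tail back after it
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}

Every byte after the insertion point is written twice, which is the inefficiency complained about above.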
If the file format were designed a certain way, e.g. if the header determined the order of the blocks, then it would be possible to just append the new block at the end and rewrite only the header. We assume the file format is not designed this way.
We also assume that the data must stay in a single file.
If we assume that all the blocks within the file are aligned with filesystem blocks, then the block information in the header is redundant, and it might be possible to modify the filesystem metadata instead: allocate a new FS block, write the new data into it, and then modify the chain of FS blocks so that the newly created block sits in the right place.
Does any OS provide such an API?
I've developed a Delphi service that writes a log to a file. Each entry is written on a new line. Once this logfile reaches a specific size limit, I'd like to trim the first X lines from the beginning of the file to keep its size below that limit. I've found some code here on SO which demonstrates how to delete fixed-size chunks of data from the beginning of the file, but how do I go about deleting whole, arbitrarily sized lines rather than fixed chunks?
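One way to approach it, sketched here in Java purely for illustration (the same logic ports to a Delphi file stream; names and the buffer size are made up): scan forward until X line breaks have passed, shift everything after that point to the start of the file, then truncate.

import java.io.IOException;
import java.io.RandomAccessFile;

public class TrimLogHead {
    static void trimFirstLines(String path, int linesToDrop) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "rw")) {
            // 1. Scan forward until `linesToDrop` newline characters have passed.
            long cut = 0;
            int seen = 0, b;
            while (seen < linesToDrop && (b = f.read()) != -1) {
                cut++;
                if (b == '\n') seen++;
            }
            // 2. Shift the remaining bytes to the beginning of the file.
            byte[] buf = new byte[8192];
            long readPos = cut, writePos = 0;
            int n;
            while (true) {
                f.seek(readPos);
                n = f.read(buf);
                if (n <= 0) break;
                f.seek(writePos);
                f.write(buf, 0, n);
                readPos += n;
                writePos += n;
            }
            // 3. Drop the now-duplicated tail.
            f.setLength(writePos);
        }
    }
}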
I have a BigQuery table where each row represents a text file (gs://...) and a line number.
file, line, meta
file1.txt, 10, meta1
file2.txt, 12, meta2
file1.txt, 198, meta3
Each file is about 1.5 GB and there are about 1,000 files in my bucket. My goal is to extract the lines specified in the BQ table.
I decided to implement the following plan:
Map table => KV<file,line>
Reduce KV<file,line> => KV<file, [lines]>
Map KV<file, [lines]> => [KV<file, rowData>]
where rowData means the actual data from the file at one of the line numbers in lines.
If I read the docs and SO correctly, TextIO.Read isn't meant to be used under such conditions. As a workaround I can use GcsIoChannelFactory to read files from GCS. Is that correct? Is it the preferable approach for the described task?
Yes, your approach is correct. There is currently no better approach to reading lines with line numbers from text files, except for doing it yourself using GcsIoChannelFactory (or writing a custom FileBasedSource, but this is more complex, and wouldn't work in your case because the filenames are not known in advance).
This and other similar scenarios will get much better with Splittable DoFn - work on that is in progress, but it is a large amount of work, so no timeline yet.
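To make the do-it-yourself reading step concrete, the per-file logic could look roughly like this plain-Java sketch. How you obtain the ReadableByteChannel for a gs:// path (e.g. via GcsIoChannelFactory in the Dataflow 1.x SDK) is deliberately left out, and the class and method names here are made up:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LinePicker {
    // Returns lineNumber -> line text for the requested (1-based) line numbers,
    // reading the file once and stopping as soon as all requested lines are found.
    static Map<Long, String> readLines(ReadableByteChannel channel, Set<Long> wanted)
            throws IOException {
        Map<Long, String> result = new HashMap<>();
        Set<Long> remaining = new HashSet<>(wanted);
        try (BufferedReader reader = new BufferedReader(
                Channels.newReader(channel, StandardCharsets.UTF_8.name()))) {
            String line;
            long lineNumber = 0;
            while (!remaining.isEmpty() && (line = reader.readLine()) != null) {
                lineNumber++;
                if (remaining.remove(lineNumber)) {
                    result.put(lineNumber, line);
                }
            }
        }
        return result;
    }
}

Inside the final step of the plan above (KV<file, [lines]> => [KV<file, rowData>]), each returned entry would be emitted as one KV<file, rowData>.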
What is the difference between a .pag file and an .ind file?
I know the page file contains the actual data, i.e. the data blocks and cells, and the index file holds pointers to the data blocks in the page file.
But is there any other difference, for example regarding size?
In my opinion the page file is always larger than the index file. Is that right?
If the index file turns out to be larger than the page file, what does that mean? Is that still correct?
And if I delete the page file, does that affect the index file? Or, if I delete some data blocks from the page file, how does that affect the index file?
You are correct about the page file including the actual data of the cube (although there is no data without the index, so in effect they are both the data).
Very typically the page files are bigger than the index. It's simply based on the number of dimensions and whether they are sparse or dense, the number of stored members in the dimensions, the density of the data blocks, the compression scheme used in the data blocks, and the number of index entries in the database.
It's not a requirement that one be larger than the other; it simply depends on how you use the cube. I would advise you not to worry about it unless you run into specific performance problems. At that point it becomes useful to consider, for the purposes of optimizing retrieval, calc, or data-load time, whether you should change the configuration of the cube.
If you delete the page file it doesn't affect the index file necessarily, but you would lose all of the data in the cube. You would also lose the data if you just deleted all the index files. While the page files have data in them, as I mentioned, it is truly the combination of the page and index files that make up the data in the cube.
Under the right circumstances you can delete data from the database (such as doing a CLEARDATA operation) and you can reduce the size of the page files and/or the index. For example, deleting data such that you are clearing out some combination of sparse members may reduce the size of the index a bit as well as any data blocks associated with those index entries (that is, those particular combinations of sparse dimensions). It may be necessary to restructure and compact the cube in order for the size of the files to decrease. In fact, in some cases you can remove data and the size of the store files could grow.
I need to append MIDI files: keep the header (which is the same for all files) and the other meta information, and just copy the music/score part.
I already have the MIDI files in ByteArrays. As I guessed, I need to use writeBytes, but unfortunately I couldn't figure out which bytes I need to take and copy.
Something like this:
var newFileBytes:ByteArray=new ByteArray();
newFileBytes.writeBytes(firstMIDIBytes);
newFileBytes.writeBytes(secondMIDIBytes,8);
This works only partially: the file is playable, the first part plays fully, but the second plays only some of its notes (and then the player hangs).
To tell the truth, ByteArrays aren't my strong suit, and neither is the MIDI file structure.
Can you suggest how to solve this?
Thanks in advance.
As per my comment, you probably mean to append these files, not merge them. Assuming that to be the case, you can't simply slap the data from the second file to the end of the first. As the MIDI protocol is bandwidth-optimized, it makes a number of assumptions regarding the streaming of events. These behaviors mean that you must take special care when appending MIDI data.
MIDI files can (and usually do) use running status, which means that an event may omit the status byte, in which case the event reuses the status byte of the previous event. This may not be the cause of your problems, but are you absolutely sure that you are parsing only the raw MIDI event data, and not the file headers and such? If you were including the headers, all sorts of non-event data would be erroneously interpreted as valid MIDI events.
Events in MIDI files use time offsets relative to the previous event in the sequence. The way these are interpreted is a bit complicated, but it involves a few properties (such as tempo and the timing resolution) which are defined in the MIDI file header. If you stripped this information and those properties are different for the second file, then the timing of its events will be wrong.
Basically, the only safe way to append the two MIDI files is to play them through a sequencer and re-write them to a new stream. Appending the byte arrays will probably be the cause of many mysterious bugs.
The structure of a MIDI file doesn't allow you to just "append" more data to it, for the following reasons:
Each track ends with an End of Track event, rendering all notes after that event meaningless.
Each track header chunk defines the size of the data that follows. Even if you append new data, any reader will only read [size] bytes before it starts looking for a new track.
A MIDI file defines how many tracks are present in the file, so even if you appended the byte array of a single MIDI track, unless you also update the track count of the header data, any reader would simply ignore the track you added.
If you add data to a MIDI file, you need to make sure the structural integrity of the file format is maintained. Simply appending data does not accomplish this.
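For comparison, here is roughly what a structure-aware append looks like when a MIDI library does the parsing and rewriting for you. This sketch uses Java's built-in javax.sound.midi rather than ActionScript, and it assumes both files use PPQ timing with the same resolution; it is only meant to show the work (tick offsets, end-of-track handling, header and track-count rewrite) that a raw byte append skips.

import java.io.File;
import javax.sound.midi.*;

public class MidiAppend {
    // Appends the second file's music after the first by parsing both,
    // shifting the second file's events past the end of the first,
    // and writing out a well-formed file.
    public static void append(File first, File second, File out) throws Exception {
        Sequence a = MidiSystem.getSequence(first);
        Sequence b = MidiSystem.getSequence(second);
        if (a.getResolution() != b.getResolution()) {
            throw new IllegalArgumentException("Resolutions differ; events would need rescaling");
        }
        long offset = a.getTickLength();         // start the second part where the first ends
        for (Track src : b.getTracks()) {
            Track dst = a.createTrack();
            for (int i = 0; i < src.size(); i++) {
                MidiEvent e = src.get(i);
                MidiMessage m = e.getMessage();
                // Skip the source End of Track meta events; each Track keeps its own.
                if (m instanceof MetaMessage && ((MetaMessage) m).getType() == 0x2F) {
                    continue;
                }
                dst.add(new MidiEvent(m, e.getTick() + offset));
            }
        }
        MidiSystem.write(a, 1, out);             // type-1 file: header sizes and track count rewritten
    }
}

If the two files' resolutions or tempo maps differ, the second part's events would need rescaling first, which is exactly the kind of detail the answers above warn about.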