Getting the current index in the input string (flex lexer)

I am using the flex lexer. Is there a way to (1) get the current index in the input string and (2) jump back to that index at a later point?
Thanks.

It's fairly easy to maintain the current input position. When any rule is matched, yyleng contains the length of the match, so it is sufficient to add yyleng to the cumulative length processed. Assuming you are using flex, it is not necessary to insert the code directly into every rule action, which would be tedious. Instead, you can use the YY_USER_ACTION macro:
#define YY_USER_ACTION input_pos += yyleng;
(This assumes that you have defined input_pos somewhere, and arranged for it to be initialized to 0 when the lexical scan commences.)
This will lead to incorrect results if you use REJECT, yymore(), yyless() or input(); in all of these cases, you will have to adjust the value of input_pos. For every call to yymore(), you need to subtract yyleng from input_pos; this will also work for REJECT. For a call to yyless(), you can subtract yyleng before the call and add it back after the call. For each call to input(), you need to add one to input_pos.
Within a rule, you can then use input_pos as the position at the end of the match, or input_pos - yyleng as the position at the beginning of the match.
Returning to a saved position is trickier.
(F)lex does not maintain the entire input in memory, so in principle you would need to use fseek() to rewind yyin to the correct place. However, in the common case where yyin has not been opened in binary mode, you cannot reliably use fseek() to return to a computed input offset. So at a minimum, you would have to ensure that yyin was opened (or reopened) in binary mode.
Moreover, it is not in general possible to guarantee that whatever stream yyin is attached to can be rewound at all (it might be console input, a pipe, or some other non-seekable device). So to be fully general, you might have to use a temporary file to store data read from the stream. This will create additional complications when you attempt to reread previous input, because you will have to switch to the temporary file for reading until it is finished, at which point you would have to return to the main file. Creative use of yywrap will simplify this procedure.
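The temporary-file idea can be sketched in plain C. This is a simplified illustration, not part of flex: the helper name spool_to_tmpfile is hypothetical, and it copies the whole stream up front instead of spooling incrementally while scanning. It relies on tmpfile() opening the file in binary update mode, so computed offsets are safe to use with fseek().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy everything readable from `in` (which may be a
 * pipe or console) into a seekable temporary file and return that file,
 * positioned at offset 0. Returns NULL on failure. */
FILE *spool_to_tmpfile(FILE *in)
{
    assert(in != NULL);

    FILE *tmp = tmpfile();          /* binary update mode, removed on close */
    if (tmp == NULL)
        return NULL;

    char chunk[4096];
    size_t n;
    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (fwrite(chunk, 1, n, tmp) != n) {
            fclose(tmp);
            return NULL;
        }
    }
    if (ferror(in) || fseek(tmp, 0L, SEEK_SET) != 0) {
        fclose(tmp);
        return NULL;
    }
    return tmp;
}
```

The returned stream could then be assigned to yyin before scanning starts, so that a later fseek() to a tracked input position works regardless of where the data originally came from.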
Note that after you rewind the input stream -- whether or not you switch to reading from a temporary file -- you must call yyrestart() to reset the scanner's input buffer. (This is also a flex-only feature; Posix lex does not specify the mechanism by which you inform the scanner that its buffer needs to be reset, so if you are not using flex you will have to consult the relevant documentation for your scanner generator.)

Related

How do I get the TimeFrame for an open order in MT mq4?

I'm scanning through the order list using the standard OrderSelect() function. Since there is a great function to get the current _Symbol for an order, I expected to find the equivalent for finding the timeframe (_Period). However, there is no such function.
Here's my code snippet.
...
for (int i=orderCount()-1; i>=0; i--) {
    if (OrderSelect(i, SELECT_BY_POS, MODE_TRADES)) {
        if (OrderMagicNumber()==magic && OrderSymbol()==_Symbol ) j++;
        // Get the timeframe here
    }
}
...
Q: How can I get the open order's timeframe given its ticket number?
In other words, how can I roll my own OrderPeriod() or something like it?
There is no such function. Two approaches might be helpful here.
First and most reasonable is to have a unique magic number for each timeframe. This usually helps to avoid unexpected behavior and errors. You can derive the magic number from the input magic so that the timeframe is automatically added to it: if your input magic is 123 and the timeframe is M5, the new magic number will be 1235 or something similar. You then use this new magic when sending orders and when checking whether a particular order belongs to your timeframe. You can also check both the input magic and the timeframe-dependent part, if you need that.
The second approach is to create a comment for each order that includes the timeframe, e.g. "myRobot_5", and to parse OrderComment() to get the timeframe value. I doubt it makes sense, as you would have to parse the string needlessly many times per tick. Another problem is that the comment can usually be changed by the broker, e.g. if a stop loss or take profit is executed (and you need to analyze history), or if an order was partially closed.
One more way is to have instances of some structure or of a class inherited from CObject, and keep a CArrayObj or an array of such instances. You will be able to add as much data as needed to such structures, and even change the timeframe when needed (e.g., you opened a deal at M5, you trail it at M5, it performs fine so you close part of it and virtually change the timeframe of that deal to M15 and trail it on the M15 chart). That is probably the most convenient option for complex systems, even though it requires some coding (do not forget to write the list of existing deals to a file, or serialize it somehow, in OnDeinit() and then deserialize it in OnInit()).
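A minimal sketch of such an encoding, written in C for illustration (MQL4 syntax is close). The function names and the fixed-width layout are assumptions of this sketch, not MQL4 APIs: instead of appending digits as in "1235", the period in minutes (5 for M5, 43200 for MN1) is stored in the low five decimal digits so that it can be decoded unambiguously. The result is assumed to have to fit a 32-bit signed magic number, which limits the base magic to 21473.

```c
#include <assert.h>
#include <stdint.h>

#define PERIOD_FIELD   100000   /* low digits hold the period in minutes */
#define MAX_BASE_MAGIC 21473    /* keeps the result inside a 32-bit int  */

/* Combine the user's base magic with the chart period (in minutes). */
int32_t make_magic(int32_t base_magic, int32_t period_minutes)
{
    assert(base_magic >= 0 && base_magic <= MAX_BASE_MAGIC);
    assert(period_minutes > 0 && period_minutes < PERIOD_FIELD);
    return base_magic * PERIOD_FIELD + period_minutes;
}

/* Recover the period (in minutes) from a combined magic number. */
int32_t magic_period(int32_t magic)
{
    return magic % PERIOD_FIELD;
}

/* Recover the user's base magic from a combined magic number. */
int32_t magic_base(int32_t magic)
{
    return magic / PERIOD_FIELD;
}
```

With this layout, an order opened with base magic 123 on M5 carries the magic 12300005, and the period can be read back from OrderMagicNumber() without any string parsing.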

Get some information from HEVC reference software

I am new to HEVC and I am trying to understand the reference software (looking at intra prediction right now).
I need to get information as below after encoding.
the CU structure for a given CTU
for each CU during calculations, its information (e.g. QP value, selected mode for luma, selected mode for chroma, whether the CU is in the final CU structure of the CTU-split decision, etc.)
I know CTU decisions are made when m_pcCuEncoder->compressCtu( pCtu ) is called in TEncSlice.cpp. But where exactly can I get this specific information? Can someone help me with this?
p.s. I am learning C++ too (I have a Java background).
EDIT: This post is a solution for the encoder side. However, the decoder side solution is far less complex.
Getting CTU information (partitioning etc.) is a bit tricky at the encoder if you are new to the code, but I will try to help you with it.
Everything that I am going to tell you is based on the JEM code and not HM, but I am pretty sure that you can apply them to HM too.
As you might have noticed, there are two completely separate phases for compression/encoding of each CTU:
The RDO phase: first there is the Rate-Distortion Optimization loop to "make the decisions". In this phase, literally all possible combinations of the parameters are tested (e.g. different partitionings, intra modes, filters etc.). At the end of this phase the RDO determines the best combination and passes it to the second phase.
The encoding phase: Here the encoder does the actual final encoding step. This includes writing all the bins into the bitstream, based on the parameters determined during the RDO phase.
In the CTU level, these two phases are performed by the m_pcCuEncoder->compressCtu( pCtu ) and the m_pcCuEncoder->encodeCtu( pCtu ) functions, respectively, both in the compressSlice() function of the TEncSlice.cpp file.
Given the above, you should look for your information in the second phase and not the first (you may already know this, but I suspected that you might be looking at the first phase).
So, now this is my suggestion for getting your information. It's not the best way to do it, but is easier to explain here.
You first go to this point in your HM code:
compressGOP() -> encodeSlice() -> encodeCtu() -> xEncodeCU()
Then you find the line where the prediction mode (intra/inter) is encoded:
m_pcEntropyCoder->encodePredMode()
At this point, you have access to the pcCU object which contains all the final decisions, including the information you look for, that are made during the first phase. At this point of the code, you are dealing with a single CU and not the entire CTU. But if you want your information for the entire CTU, you may go back to
compressGOP() -> encodeSlice() -> encodeCtu()
and find the line where the xEncodeCU() function is called for the first time. There, you will have access to the pCtu object.
Reminder: each TComDataCU object (pcCU if you are at the CU level, or pCtu if you are at the CTU level) of size WxH is split into NumPartition=(W/4)x(H/4) partitions of size 4x4. Each partition is accessible by an index (uiAbsPartIdx) which indicates its Z-scan order. For example, the uiAbsPartIdx for the partition at <x=8,y=0> is 4.
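To make the Z-scan numbering concrete, here is a small standalone C sketch (not HM code; the function name zscan_index is hypothetical). It assumes the pixel coordinates are relative to the top-left corner of the CTU and that the index is obtained by interleaving the bits of the coordinates measured in 4x4 units, which reproduces the example above.

```c
#include <assert.h>

/* Z-scan index of the 4x4 partition whose top-left pixel is (x_pel, y_pel),
 * with both coordinates relative to the CTU origin. The bits of the x unit
 * coordinate go to even bit positions, those of y to odd bit positions. */
unsigned zscan_index(unsigned x_pel, unsigned y_pel)
{
    assert(x_pel % 4 == 0 && y_pel % 4 == 0);

    unsigned xu = x_pel / 4;    /* position in 4x4 units */
    unsigned yu = y_pel / 4;
    unsigned idx = 0;

    for (unsigned bit = 0; bit < 16; ++bit) {
        idx |= ((xu >> bit) & 1u) << (2 * bit);
        idx |= ((yu >> bit) & 1u) << (2 * bit + 1);
    }
    return idx;
}
```

For <x=8,y=0> the x unit coordinate is 2 (binary 10), whose set bit lands at bit position 2 of the index, giving 4.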
Now, you do the following steps:
Get the number of partitions (NumPartition) within your pCtu by calling pCtu->getTotalNumPart().
Loop over all NumPartition partitions and call the functions pCtu->getWidth(idx), pCtu->getHeight(idx), pCtu->getCUPelX(idx) and pCtu->getCUPelY(), where idx is your loop iterator. For the CU that covers the 4x4 partition at idx, these functions return its width, height, x position and y position. [Both positions are relative to pixel <0,0> of the frame.]
The above information is enough for deriving the CTU partitioning of the current pCtu! So the last step is to write a piece of code to do that.
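One possible shape for that last step, as a standalone C sketch rather than HM code. The arrays width[] and height[] are hypothetical stand-ins for the values returned by pCtu->getWidth(idx) and pCtu->getHeight(idx). The sketch assumes a pure quadtree partitioning, where every CU covers a contiguous run of (W/4)x(H/4) Z-scan indices; that assumption may not hold for other split types.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned first_idx;   /* Z-scan index of the CU's first 4x4 partition */
    unsigned width;
    unsigned height;
} cu_info;

/* Walk the per-partition size arrays and emit one entry per CU.
 * Returns the number of CUs found; at most out_cap entries are stored. */
size_t list_cus(const unsigned *width, const unsigned *height,
                size_t num_part, cu_info *out, size_t out_cap)
{
    size_t count = 0;
    size_t idx = 0;

    while (idx < num_part) {
        unsigned w = width[idx];
        unsigned h = height[idx];
        assert(w >= 4 && h >= 4 && w % 4 == 0 && h % 4 == 0);

        if (count < out_cap) {
            out[count].first_idx = (unsigned)idx;
            out[count].width = w;
            out[count].height = h;
        }
        ++count;

        /* Skip all 4x4 partitions that belong to this CU. */
        idx += (size_t)(w / 4) * (h / 4);
    }
    return count;
}
```

For a 16x16 block whose first 8x8 quadrant is split into four 4x4 CUs while the other three quadrants stay 8x8, this yields seven CUs starting at indices 0, 1, 2, 3, 4, 8 and 12.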
This was an example of how to extract CTU partitioning information during the second phase (i.e. the encoding phase). You can call other accessor functions in the same way to get the rest of the information in your second question. For example, to get the selected luma intra mode, you may call pCtu->getIntraDir(CHANNEL_TYPE_LUMA, idx) instead of the getWidth()/getHeight() functions, or pCtu->getQP(CHANNEL_TYPE_LUMA, idx) to get the QP value.
You can always find a list of functions that provide useful information at the pCtu level, in the TComDataCU class (TComDataCU.cpp).
I hope this helps you. If not, let me know!
Good luck,

Going back to old position in lex

During my lex processing, I need to go back in the lex input file, to process the same input several times with different local settings.
However, just doing fseek(yyin, old_pos, SEEK_SET); does not work, since the input data are buffered by lex. How can I (portably) deal with this?
I tried to add a YY_FLUSH_BUFFER after the fseek(), but it didn't help, since the old file position was incorrect (it was set to the point after filling the buffer, not to the point where I evaluate the token).
The combination of YY_FLUSH_BUFFER() and fseek(yyin, position, SEEK_SET) (in either order, but I would do the YY_FLUSH_BUFFER() first) will certainly cause the next token to be scanned starting at position. The problem is figuring out the correct value of position.
It is relatively simple to track the character offset (but see the disclaimer below if you require a portable scanner which could run on non-Posix platforms such as Windows):
%{
long scan_position = 0;
%}
%%
[[:space:]]* scan_position += yyleng;
"some pattern" { scan_position += yyleng; ... }
Since it's a bit tedious to insert scan_position += yyleng; into every rule, you can use flex's helpful YY_USER_ACTION macro hook: this macro is expanded at the beginning of every action (even empty actions). So you could write the above more simply:
%{
long scan_position = 0;
#define YY_USER_ACTION scan_position += yyleng;
%}
%%
[[:space:]]*
"some pattern" { ... }
One caveat: This will not work if you use any of the flex actions which adjust token length or otherwise alter the normal scanning procedure. That includes at least yyless, yymore, REJECT, unput and input. If you use any of the first three, you need to undo the increment with scan_position -= yyleng; (that needs to go just before the invocation of yyless, yymore or REJECT; after yyless, add the new yyleng back). For input and unput, you need to increment / decrement scan_position to account for the character read outside of the scanning process.
Disclaimer:
Tracking positions like that assumes that there is a one-to-one correspondence between bytes read from an input stream and raw bytes in the underlying file system. For Posix systems, this is guaranteed to be the case: fread(3) and read(2) will read the same bytes and the b open mode flag has no effect.
In general, though, there is no reliable way of tracking file position. You could open the stream in binary mode and deal with the system's idiosyncratic line endings yourself (this will work on Windows but there is no portable way of establishing what the line ending sequence is, so it is not portable either). But on other non-Posix systems, it is possible that a binary read produces a completely different result (for example, the underlying file might use fixed-length records so that each line is padded (with some system-specific padding character) to make it the correct length).
That's why the C standard prohibits the use of computed offset values:
For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET. (§7.21.9.2 "The fseek function", paragraph 4).
There is no way to turn buffering off in flex -- or any version of lex that I know of -- because correctly handling fallback depends on being able to buffer. (Fallback happens when the scan has proceeded beyond the end of a token, because the token matches the prefix of a longer token which happens not to be present.)
I think the only portable solution would be to copy the input stream token by token into your own buffer (or temporary file) and then use yy_push_buffer_state and yy_scan_buffer (if you're using a buffer) to insert that buffer into the input stream. That solution would look a lot like the tracking code above, except that YY_USER_ACTION would append the tokens read to your own string buffer or temporary file. (You would want to make that conditional on a flag so that it only happens in the segment of the file you want to rescan.) If you have nested repeats, you could track the position in your own buffer/file in order to be able to return to it.
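A rough sketch of the recording side in plain C. The type replay_buf and the function replay_append are hypothetical names for this illustration. The buffer always keeps two trailing NUL bytes, because yy_scan_buffer() expects the last two bytes of the buffer it is given to be NUL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *data;       /* recorded text followed by two NUL bytes */
    size_t len;        /* number of recorded bytes (without the NULs) */
    size_t cap;        /* allocated size of data */
    int    recording;  /* only record while this flag is set */
} replay_buf;

/* Append one token's text. Returns 0 on success, -1 if out of memory. */
int replay_append(replay_buf *rb, const char *text, size_t n)
{
    assert(rb != NULL && text != NULL);
    if (!rb->recording)
        return 0;

    size_t need = rb->len + n + 2;          /* +2 for the NUL terminators */
    if (need > rb->cap) {
        size_t cap = rb->cap ? rb->cap : 64;
        while (cap < need)
            cap *= 2;
        char *p = realloc(rb->data, cap);
        if (p == NULL)
            return -1;
        rb->data = p;
        rb->cap = cap;
    }
    memcpy(rb->data + rb->len, text, n);
    rb->len += n;
    rb->data[rb->len] = '\0';
    rb->data[rb->len + 1] = '\0';
    return 0;
}
```

In the scanner, YY_USER_ACTION could call replay_append(&rb, yytext, yyleng), and the recorded segment could later be rescanned with yy_scan_buffer(rb.data, rb.len + 2). Note that yy_scan_buffer scans the memory in place, so a copy would be needed if the same segment has to be replayed more than once.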

Tracking address when writing to flash

My system needs to store data in an EEPROM flash. Strings of bytes will be written to the EEPROM one at a time, not continuously at once. The length of strings may vary. I want the strings to be saved in order without wasting any space by continuing from the last write address. For example, if the first string of bytes was written at address 0x00~0x08, then I want the second string of bytes to be written starting at address 0x09.
How can this be achieved? I found that some EEPROMs' write commands do not require the address to be specified and just continue from the last written point, but the EEPROM I am using does not support that. (I am using Spansion's S25FL1-K.) I thought about allocating part of the memory to track the address and storing the address every time I write, but that might wear out the flash faster. What is a widely used method to handle such a case?
Thanks.
EDIT:
What I am asking is how to track/save the address in a non-volatile way so that when next write happens, I know what address to start.
I have never worked with this particular flash, but I've implemented something similar. Unfortunately, without knowing your constraints / priorities (memory or CPU efficiency, how often writes happen, etc.) it is impossible to give a definite answer. Here are some techniques that you may want to consider. I don't know if they are widely used though.
Option 1: Write X bytes containing the string length before the string. Then on initialization you can parse your flash: read the length n, jump n bytes forward, and read the next byte. If it's empty (all ones for your flash according to the datasheet), then you have found your first empty byte. Otherwise you've just read the length of the next string, so repeat the process.
This method allows you to quickly search for the last used sector, since the first byte of a used sector is guaranteed to have a value. The flip side is the overhead of X extra bytes (depending on the max string length) each time you write a string, and having to parse the flash to find the write position (although this only needs to be done once, on boot).
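Option 1 can be sketched against a RAM array that stands in for the flash. This is an illustration under assumptions, not driver code for the S25FL1-K: the length prefix is one byte (X = 1), an erased byte reads 0xFF, so a length of 0xFF is reserved, and the names find_free_offset and append_record are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERASED 0xFFu   /* value of an erased flash byte */

/* Skip from length byte to length byte until an erased byte is found.
 * Returns the first free offset, or `size` if the area is full. */
size_t find_free_offset(const uint8_t *flash, size_t size)
{
    assert(flash != NULL);
    size_t pos = 0;
    while (pos < size && flash[pos] != ERASED)
        pos += 1u + flash[pos];          /* length byte + payload */
    return pos < size ? pos : size;
}

/* Append one record (length byte + payload). Returns 0 on success,
 * -1 if the length is reserved or the record does not fit. */
int append_record(uint8_t *flash, size_t size, const uint8_t *data, uint8_t len)
{
    assert(flash != NULL && data != NULL);
    if (len == ERASED)
        return -1;
    size_t pos = find_free_offset(flash, size);
    if (size - pos < (size_t)len + 1u)
        return -1;
    flash[pos] = len;
    memcpy(flash + pos + 1, data, len);
    return 0;
}
```

On a real device the scan in find_free_offset would run once at boot, and the result would be kept in RAM and advanced after every write.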
Option 2: Instead of prepending the size, append the unique "end-of-string" sequence, and then parse on boot for the last sequence before ones that represent empty flash.
Disadvantage here is longer parse, but you possibly could get away with just 1 byte-long overhead for each string.
Option 3 would be just what you already thought of: allocating a separate sector that would contain the value you need. To reduce flash wear you could also write these values back-to-back and search for the last one each time you boot. Also, you might weigh the expected lifetime of the device that you program against the 100,000 erases that your flash can sustain (again according to the datasheet): is wear even a problem? That of course depends on how often data will be saved.
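The back-to-back variant of Option 3 might look like the following C sketch, again using a RAM array in place of the reserved sector. The names are hypothetical, and the sketch assumes 32-bit entries, an erased entry value of 0xFFFFFFFF, and that entries are only ever written in order, which is what makes a binary search possible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_ERASED 0xFFFFFFFFu   /* an unwritten (erased) log slot */

/* Index of the first erased slot, or `entries` if the log is full.
 * Written slots are assumed to form a contiguous prefix of the log. */
size_t log_first_free(const uint32_t *log, size_t entries)
{
    assert(log != NULL);
    size_t lo = 0, hi = entries;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (log[mid] == LOG_ERASED)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}

/* Fetch the most recently logged address. Returns 0 on success,
 * -1 if nothing has been logged yet. */
int log_latest(const uint32_t *log, size_t entries, uint32_t *out)
{
    assert(out != NULL);
    size_t free_slot = log_first_free(log, entries);
    if (free_slot == 0)
        return -1;
    *out = log[free_slot - 1];
    return 0;
}
```

The sector only needs to be erased when log_first_free() reports that it is full, so one erase is spread over as many address updates as the sector has slots.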
Hope that helps.

Append MIDI files using bytearray in Actionscript 3

I need to append MIDI files: leave header (same for all files) and other meta information, just copy music/score part.
I already have MIDI files in appropriate bytearrays, as I guessed I need to use writeBytes, but unfortunately couldn't find which bytes I need to take and copy.
Something like this:
var newFileBytes:ByteArray=new ByteArray();
newFileBytes.writeBytes(firstMIDIBytes);
newFileBytes.writeBytes(secondMIDIBytes,8);
This works only partially: the file is playable, with the first part played fully but only some notes of the second (then the player hangs).
To tell the truth, byte arrays aren't my strong suit, and neither is the MIDI file structure.
Can you suggest how to solve this?
Thanks in advance.
As per my comment, you probably mean to append these files, not merge them. Assuming that to be the case, you can't simply slap the data from the second file to the end of the first. As the MIDI protocol is bandwidth-optimized, it makes a number of assumptions regarding the streaming of events. These behaviors mean that you must take special care when appending MIDI data.
MIDI files can (and usually do) use running status, which means that an event may omit the status byte, in which case the event should use the status byte of the previous event. This may not be the cause of your problems, but are you absolutely sure that you are only parsing raw MIDI data, and not the file headers and such? If this were the case, all sorts of weird data would be erroneously interpreted as valid MIDI events.
Events in MIDI files use relative offsets to the previous event in the sequence. The way that this is calculated is a bit complicated, but it involves a few properties (such as the tempo and the number of pulses per quarter note) which are defined in the file's header and meta events. If you stripped these, and the properties are different for the second file, then the timing of the events will be wrong.
Basically, the only safe way to append the two MIDI files is to play them through a sequencer and re-write them to a new stream. Appending the byte arrays will probably be the cause of many mysterious bugs.
The structure of a MIDI file doesn't allow you to just "append" more data to it, for the following reasons:
Each track ends with an End of Track event, rendering all notes after that event meaningless.
Each track header chunk defines the size of the data that follows. Even if you append new data, any reader will only read [size] bytes before it starts looking for a new track.
A MIDI file defines how many tracks are present in the file, so even if you appended the byte array of a single MIDI track, unless you also update the track count of the header data, any reader would simply ignore the track you added.
If you add data to a MIDI file, you need to make sure the structural integrity of the file format is maintained. Simply appending data does not accomplish this.
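The structural points above can be checked with a short C sketch that walks the chunks of a Standard MIDI File held in memory. The function names are hypothetical. The layout used is the standard one: an "MThd" chunk whose 6-byte body holds format, track count and division as big-endian 16-bit values, followed by "MTrk" chunks that each carry a big-endian 32-bit length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Track count declared in the header chunk, or -1 if it is malformed. */
int midi_declared_tracks(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 14 || memcmp(buf, "MThd", 4) != 0)
        return -1;
    uint32_t hdr_len = be32(buf + 4);
    if (hdr_len < 6 || hdr_len > len - 8)
        return -1;
    return (buf[10] << 8) | buf[11];
}

/* Number of MTrk chunks actually present, or -1 on a truncated file. */
int midi_count_track_chunks(const uint8_t *buf, size_t len)
{
    if (midi_declared_tracks(buf, len) < 0)
        return -1;
    size_t pos = 8 + (size_t)be32(buf + 4);   /* skip the header chunk */
    int tracks = 0;
    while (len - pos >= 8) {
        uint32_t chunk_len = be32(buf + pos + 4);
        if (chunk_len > len - pos - 8)
            return -1;                        /* chunk runs past the end */
        if (memcmp(buf + pos, "MTrk", 4) == 0)
            ++tracks;
        pos += 8 + (size_t)chunk_len;
    }
    return tracks;
}
```

If a whole extra track chunk is appended without touching the header, midi_count_track_chunks() reports one more track than midi_declared_tracks(), which is exactly the mismatch that makes readers ignore the added track.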
