End of File detection in ESPER - esper

I am using ESPER to read events from a CSV file.
How can I make the query output something when reading the CSV file is finished?
For example, I want to output every 30 minutes or at the end of the file:
SELECT id FROM stream output every 30 min or [ EOF reached ]
Thanks in advance
Regards

The "adapter.start()" call returns when the CSV file has been fully read, so at that point your code can send an EOF event into the engine. You could declare a context that ends on that EOF event, and there is an "output every 30 minutes and when terminated" option.
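The pattern in this answer (send a marker event once the reader is done, and flush on either the interval or the marker) can be sketched outside Esper. This is plain Python, not Esper's API; the `EOF` sentinel, the `ts` column and the function names are all made up for illustration, and the interval is measured on the event timestamps rather than on wall-clock time.

```python
import csv
import io

EOF = object()  # hypothetical end-of-file marker event


def events_with_eof(handle):
    """Yield each CSV row as a dict, then one EOF marker when the file is exhausted."""
    for row in csv.DictReader(handle):
        yield row
    yield EOF


def batches(events, interval_seconds=1800):
    """Yield the collected ids every interval_seconds (event time) and once more at EOF."""
    pending = []
    window_end = None
    for event in events:
        if event is EOF:
            yield pending  # final output when the stream terminates
            return
        ts = float(event["ts"])  # assumed timestamp column, in seconds
        if window_end is None:
            window_end = ts + interval_seconds
        while ts >= window_end:
            yield pending  # periodic output
            pending = []
            window_end += interval_seconds
        pending.append(event["id"])


sample = io.StringIO("id,ts\na,0\nb,100\nc,1800\nd,1900\n")
result = list(batches(events_with_eof(sample)))
```

Without the EOF marker the last partial batch (`c` and `d` here) would never be emitted, which is the situation the question describes.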

Related

Is there any way to convert .log file generated from BUS master (CAN tool) to .asc file (Vector support) with out bus master?

I am trying to write a script that converts BUS master .log files into .asc files without BUS master. The conversion is mostly a matter of rearranging columns, but I am stuck on the different time formats the two file types use.
Example: 11:29:26:2229 (.log) = 41366.222900 (.asc)
Please help me out to understand the .asc file time format.
In the header of the .asc file, the start time and date is given in the very first line. E.g.:
date Mon Okt 24 10:01:03 2022
The timestamps of the individual CAN messages (and other events) are just offsets to the start time in seconds. I.e. given the header above, the timestamp
123.456
Would be 123 seconds and 456 milliseconds after 10:01:03, which is 10:03:06.456
Details about the .asc format can be found in the Doc folder in CANoe's installation directory.
.asc time format = hour * 3600 + min * 60 + sec.
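Both answers can be checked with a few lines of Python. This is only a sketch: it assumes the last field of the .log timestamp is a fractional part of a second (as the 11:29:26:2229 example suggests), and the function names are made up.

```python
from datetime import datetime, timedelta


def log_time_to_asc(stamp):
    """Convert 'HH:MM:SS:ffff' to seconds as a string with six decimals."""
    hours, minutes, seconds, fraction = stamp.split(":")
    whole = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    # Keep the fraction as text and right-pad it with zeros to avoid float rounding
    return f"{whole}.{fraction:0<6}"


def asc_offset_to_wall_clock(start, offset_seconds):
    """Add an .asc offset (in seconds) to the start time from the file header."""
    return start + timedelta(seconds=offset_seconds)


converted = log_time_to_asc("11:29:26:2229")
wall_clock = asc_offset_to_wall_clock(datetime(2022, 10, 24, 10, 1, 3), 123.456)
```

With the values from the question, `converted` is "41366.222900", and `wall_clock` is 10:03:06.456 on the header date. Whether your .asc timestamps should be seconds since midnight or offsets from the header start depends on how the file was written, so compare against a known-good .asc file.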

Wireshark MATE: Calculate response time

I'm trying to use MATE to calculate the response time between each SMPP submit_sm and submit_sm_resp. This is the MATE script I'm using:
Pdu smpp_pdu Proto smpp Transport mate {
    Extract cmd From smpp.command_id;
    Extract seq From smpp.sequence_number;
};
Gop smpp_session On smpp_pdu Match (seq) {
    Start (cmd=4);
    Stop (cmd=2147483652);
};
Done;
So basically, it extracts the command ID and sequence number, then the Gop uses the command ID for start/stop:
4 = 0x00000004 = SUBMIT_SM
2147483652 = 0x80000004 = SUBMIT_SM_RESP
This should do the trick. But, now what?
I added a Delta Time Displayed column, hoping it would show the response time for each submit_sm_resp, but that column does not use MATE; it only shows the time since the previous displayed packet.
How can I use the MATE script instead?
If I use the following filter in a specific column:
mate.smpp_pdu.RelativeTime
I only get, for each packet, the seconds elapsed since the start of the trace.
As far as I understand, MATE should record the time between START and STOP, but which filter should I use?
This doesn't show anything:
mate.smpp_session.Time
Please advise,
Thank you,
Lucas
Found the solution! I'm posting it here in case someone needs it.
Because cmd is a string, the correct Gop is:
Start (cmd="0x00000004");
Stop (cmd="0x80000004");
This did the trick.
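For anyone who wants to cross-check the numbers outside Wireshark, the grouping the Gop performs (match on sequence number, start on submit_sm, stop on submit_sm_resp) can be sketched in Python. This is not MATE or Wireshark code; the tuple layout and the function name are assumptions made for the example.

```python
SUBMIT_SM = 0x00000004
SUBMIT_SM_RESP = 0x80000004  # 2147483652 in decimal


def response_times(pdus):
    """Map sequence number -> seconds between submit_sm and its submit_sm_resp.

    pdus is an iterable of (timestamp_seconds, command_id, sequence_number).
    """
    pending = {}
    deltas = {}
    for timestamp, command_id, seq in pdus:
        if command_id == SUBMIT_SM:
            pending[seq] = timestamp  # the "Start" of the group
        elif command_id == SUBMIT_SM_RESP and seq in pending:
            deltas[seq] = timestamp - pending.pop(seq)  # the "Stop"
    return deltas


capture = [
    (0.0, SUBMIT_SM, 1),
    (0.25, SUBMIT_SM, 2),
    (0.5, SUBMIT_SM_RESP, 1),
    (1.0, SUBMIT_SM_RESP, 2),
]
times = response_times(capture)
```

Here `times` maps sequence 1 to 0.5 s and sequence 2 to 0.75 s.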

Read data from csv file with foreach function

I am reading data from a CSV file. For a large file, to avoid the timeout (Rack's 12-second timeout), I read only 25 rows per request: after 25 rows the request returns and a new request is made, and this continues until all the rows have been read.
require 'csv'
def read_csv(offset)
  r_count = 1
  # file and options are defined elsewhere (not shown in the question)
  CSV.foreach(file.tempfile, options) do |row|
    if r_count > offset.to_i
      # process
    end
    r_count += 1
  end
end
But this creates a new issue. Say the first request reads 25 rows; when the next request comes in with an offset of 25, it still iterates over the first 25 rows before it starts processing from row 26. How can I skip the rows that have already been read? I tried using next to skip the iteration, but that fails. Is there a more efficient way to do this?
Code
def read_csv(fileName)
  lines = (`wc -l #{fileName}`).to_i + 1
  lines_processed = 0
  open(fileName) do |csv|
    csv.each_line do |line|
      #process
      lines_processed += 1
    end
  end
end
Pure Ruby - SLOWER
def read_csv(fileName)
  # count lines in the file passed in, without loading it all into memory
  lines = File.foreach(fileName).count
  lines_processed = 0
  open(fileName) do |csv|
    csv.each_line do |line|
      #process
      lines_processed += 1
    end
  end
end
Benchmarks
I ran a new benchmark comparing the original method you provided with my own. I also included the test file information.
"File Information"
Lines: 1172319
Size: 126M
"django's original method"
Time: 18.58 secs
Memory: 0.45 MB
"OneNeptune's method"
Time: 0.58 secs
Memory: 2.18 MB
"Pure Ruby method"
Time: 0.96 secs
Memory: 2.06 MB
Explanation
NOTE: I added a pure Ruby method, since using wc is sort of cheating and not portable. In most cases it's important to use pure-language solutions.
You can use this method to process a very large CSV file.
I feel ~2 MB of memory is pretty good considering the file size. It is a small increase in memory usage, but the time savings seem a fair trade, and this will prevent timeouts.
I modified the method to take a fileName, but only because I was testing many different CSV files to make sure they all worked correctly. You can remove this if you'd like, but it will likely be helpful.
I also removed the concept of an offset, since you stated you originally included it to try to optimize the parsing yourself, and this is no longer necessary.
I also keep track of how many lines are in the file and how many were processed, since you needed that information. Note that the wc-based lines count only works on Unix-based systems; it is a trick to avoid loading the entire file into memory. It counts the newlines, and I add 1 to account for the last line. If you are not going to count the header as a line, you could remove the +1 and rename lines to "rows" to be more accurate.
Another logistical problem you may run into is deciding how to handle a CSV file that has headers.
You could use lazy reading to speed this up: the whole file wouldn't be read, only the part from the beginning of the file up to the chunk you need.
See http://engineering.continuity.net/csv-tricks/ and https://reinteractive.com/posts/154-improving-csv-processing-code-with-laziness for examples.
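The same idea, shown in Python rather than Ruby purely for illustration: a lazy slice skips the first `offset` rows and stops parsing as soon as the chunk is complete, so rows after the chunk are never parsed. The function name and the 25-row default are assumptions taken from the question, not from the linked articles.

```python
import csv
import io
from itertools import islice


def read_chunk(handle, offset, size=25):
    """Return up to `size` parsed rows, skipping the first `offset` rows.

    islice consumes the reader lazily: skipped rows are still parsed,
    but nothing after offset + size is.
    """
    reader = csv.reader(handle)
    return list(islice(reader, offset, offset + size))


data = io.StringIO("1,a\n2,b\n3,c\n4,d\n5,e\n")
chunk = read_chunk(data, offset=2, size=2)
```

Note that the skipped rows are still read and parsed on every request, so the cost grows with the offset; it only avoids touching the rest of the file.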
You could also use SmarterCSV to work in chunks like this.
SmarterCSV.process(file_path, {:chunk_size => 1000}) do |chunk|
  chunk.each do |row|
    # Do your processing
  end
  do_something_else
end
The way I did this was by streaming the result to the user. If users can see what is happening, having to wait doesn't bother them as much, and the timeout you mention won't happen.
I'm not a Rails user, so I give an example in Sinatra; this can be done with Rails too. See e.g. http://api.rubyonrails.org/classes/ActionController/Streaming.html
require 'sinatra'
get '/' do
  line = 0
  stream :keep_open do |out|
    1.upto(100) do |line| # this would be your CSV file opened
      out << "processing line #{line}<br>"
      # process line
      sleep 1 # for simulating the delay
    end
  end
end
A better but somewhat more complicated solution would be to use websockets: the browser would receive the results from the server once the processing is finished. You will also need some JavaScript on the client to handle this. See https://github.com/websocket-rails/websocket-rails

Shift in the columns of spool file

I am using a shell script to extract the data from the 'extr' table. The extr table is very big: it has 410 columns and 61047 rows of data, and one record is around 5 KB.
The script is as follows:
#!/usr/bin/ksh
sqlplus -s \/ << rbb
set pages 0
set head on
set feed off
set num 20
set linesize 32767
set colsep |
set trimspool on
spool extr.csv
select * from extr;
/
spool off
rbb
#-------- END ---------
One fine day the extr.csv file contained 2 records with an incorrect number of columns (one record with too many columns and the other with too few). Upon investigation I found that two records were duplicated in the file. The primary key of each record should be unique in the file, but in this case 2 records were repeated. Also, the shift in the columns was abrupt.
Small example of the output file:
5001|A1A|AAB|190.00|105|A
5002|A2A|ABB|180.00|200|F
5003|A3A|AAB|153.33|205|R
5004|A4A|ABB|261.50|269|F
5005|A5A|AAB|243.00|258|G
5006|A6A|ABB|147.89|154|H
5003|A7A|AAB|249.67|AAB|153.33|205|R
5004|A8A|269|F
5009|A9A|AAB|368.00|358|S
5010|AAA|ABB|245.71|215|F
Here the records with primary keys 5003 and 5004 have reappeared in place of 5007 and 5008. The duplicate records have also shifted the records for 5007 and 5008, appending columns to one and cutting columns from the other.
I need your help in analysing why this happened. Why were the 2 rows extracted multiple times? Why were the other 2 rows missing from the file? And why were the records shifted?
Note: This script has been working fine for the last two years and has never failed except for this one time. It ran successfully during the next run. Recently we added one more program which accesses the extr table with a cursor (select only).
I reproduced a similar behaviour.
;-> cat input
5001|A1A|AAB|190.00|105|A
5002|A2A|ABB|180.00|200|F
5003|A3A|AAB|153.33|205|R
5004|A4A|ABB|261.50|269|F
5005|A5A|AAB|243.00|258|G
5006|A6A|ABB|147.89|154|H
5009|A9A|AAB|368.00|358|S
5010|AAA|ABB|245.71|215|F
Think of the input file as your database.
Now I write a script that accesses "the database" and shows some random freezes.
;-> cat writeout.sh
# Start this script twice
while IFS=\| read a b c d e f; do
   # I think you need \c for skipping \n, but I do it different one time
   echo "$a|$b|$c|$d|" | tr -d "\n"
   (( sleeptime = RANDOM % 5 ))
   sleep ${sleeptime}
   echo "$e|$f"
done < input >> output
EDIT: Removed cat input | in script above, replaced by < input
Start this script twice in the background
;-> ./writeout.sh &
;-> ./writeout.sh &
Wait until both jobs are finished and see the result
;-> cat output
5001|A1A|AAB|190.00|105|A
5002|A2A|ABB|180.00|200|F
5003|A3A|AAB|153.33|5001|A1A|AAB|190.00|105|A
5002|A2A|ABB|180.00|205|R
5004|A4A|ABB|261.50|269|F
5005|A5A|AAB|243.00|200|F
5003|A3A|AAB|153.33|258|G
5006|A6A|ABB|147.89|154|H
5009|A9A|AAB|368.00|358|S
5010|AAA|ABB|245.71|205|R
5004|A4A|ABB|261.50|269|F
5005|A5A|AAB|243.00|258|G
5006|A6A|ABB|147.89|215|F
154|H
5009|A9A|AAB|368.00|358|S
5010|AAA|ABB|245.71|215|F
When I change the last line of writeout.sh to done > output I do not see the problem, but that might be due to buffering and the small amount of data.
I still don't know exactly what happened in your case, but it really looks like 2 programs writing simultaneously to the same file.
A job in TWS could have been restarted manually, 2 scripts in your master script might write to the same file, or something else might have happened.
You can prevent this in the future with some locking or checks (when the output file already exists, quit and return an error code to TWS).
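One way to do such a check is an atomically created lock file: the second writer fails to create it and can exit with an error code. A minimal sketch in Python (your job is a ksh script, so treat this as an illustration of the idea only); the lock file name and function names are made up.

```python
import os
import tempfile


def acquire_lock(lock_path):
    """Create the lock file atomically; return True only if this call created it."""
    try:
        # O_EXCL makes the call fail if the file already exists
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for troubleshooting
    return True


def release_lock(lock_path):
    """Remove the lock file once the extract has finished."""
    os.remove(lock_path)


lock_path = os.path.join(tempfile.mkdtemp(), "extr.csv.lock")  # hypothetical name
first = acquire_lock(lock_path)   # first run gets the lock
second = acquire_lock(lock_path)  # a concurrent run is refused
release_lock(lock_path)
```

A job that gets `False` back would quit with a non-zero exit code instead of spooling. A stale lock left behind by a crashed run has to be cleaned up separately.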

How to gracefully kill an unresponsive tcl script?

Let's say I have a Tcl script which should normally execute in less than a minute. How can I make sure that the script NEVER takes more than 'x' minutes/seconds to execute, and that it is simply stopped if it does?
For example, if the script has taken more than 100 seconds, I should be able to automatically switch control to a cleanup function which gracefully ends the script, so that I keep all the data from the run so far while also ensuring that it doesn't take too long or get stuck forever.
I'm not sure if this can be done in Tcl, so any help or pointers would be welcome.
You could use interp limit if you run the work in a child interpreter.
Note that this throws an error that cannot be caught inside the child; if you want to do some cleanup there, you need to remove the limit from the parent interp first.
set interp [interp create]
# initialize the interp
interp eval $interp {
    source somestuff.tcl
}
# Add the limit. From now on you have 60 seconds or an error will be thrown
interp limit $interp time -seconds [clock seconds] -milliseconds 60000
# run the work inside the limited child interp
set errorcode [catch {interp eval $interp {DoExpensiveStuff}} res opts]
# remove the limit so you can clean up the mess if needed.
interp limit $interp time -seconds {}
if {$errorcode} {
    # Do some cleanup here
}
# delete the interp, or reuse it?
interp delete $interp
# And what shall be done with the error? Throw it.
return -options $opts $res
Resource limits are the best bet with Tcl, but they are not bullet-proof. Tcl cannot (and will not) abort C procedures, and there are some ways to keep the Tcl core itself busy with long-running work.
There must be a loop that you're worried might take more than 100 seconds, yes? Save clock seconds (current time) before you enter the loop, and check the time again at the end of each iteration to see if more than 100 seconds have elapsed.
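The shape of that check, sketched in Python instead of Tcl (in Tcl you would compare [clock seconds] values the same way). The function and parameter names are invented for the example.

```python
import time


def run_with_deadline(work_items, limit_seconds, process):
    """Process items in order; stop early once limit_seconds have elapsed.

    Returns (results_so_far, within_limit).
    """
    deadline = time.monotonic() + limit_seconds
    results = []
    for item in work_items:
        results.append(process(item))
        # Check the clock at the end of each iteration
        if time.monotonic() >= deadline:
            return results, False  # out of time: keep the partial results
    return results, True


done, ok = run_with_deadline(range(5), 60, lambda x: x * x)
partial, in_time = run_with_deadline(range(5), 0, lambda x: x * x)
```

This only helps when each iteration is short; a single iteration that hangs never reaches the check.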
If for some reason that's not possible, you can try devising something using after—that is, kick off a timer to a callback that sets (or unsets) some global variable that your executing code is aware of—so that on detection, it can attempt to exit.
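A rough Python analogue of the timer-and-flag approach, for illustration only: here the timer fires on a separate thread, whereas a Tcl after callback only runs when the event loop gets serviced (for example via update or vwait), so the Tcl version needs the loop to yield to the event loop now and then. All names below are made up.

```python
import threading
import time


def run_until_flag(limit_seconds, step, max_steps):
    """Call step() repeatedly until the timer sets the flag or max_steps is reached.

    Returns (steps_done, timed_out).
    """
    stop = threading.Event()
    timer = threading.Timer(limit_seconds, stop.set)  # the callback just sets the flag
    timer.start()
    steps = 0
    try:
        while steps < max_steps and not stop.is_set():
            step()
            steps += 1
    finally:
        timer.cancel()  # harmless if the timer has already fired
    return steps, stop.is_set()


quick = run_until_flag(60, lambda: None, max_steps=3)
slow = run_until_flag(0.05, lambda: time.sleep(0.005), max_steps=10**9)
```

The working code only has to look at the flag between steps and then run its cleanup.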

Resources