Reading consoleText URL from within a running build only returns first 10000 lines - Jenkins

I have Groovy code that reads from the current build's consoleText and does some work with it. When I run the code from the IDE it works perfectly, but when I run it as part of a step in Jenkins, it only reads 10,000 lines out of a total of approximately 2.8 million. The code that reads from the console is:
url.withReader { bufferedReader ->
    while ((line = bufferedReader.readLine()) != null) {
        // do something
    }
}
The URL is
${BUILD_URL}/consoleText

The .../consoleText URL will not "grow" automatically -- it just provides a "snapshot" of console data that's available at query time.
So, if you GET that URL for a build while that build is still running, then you will only see part of the console log. The amount that you see will depend on the time when you issue the GET -- and possibly it will also depend on the status of some buffers.
If this used to work better in the past, then you probably moved the point in time when you tried to read the console.
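If you need the whole log of a build that is still running, one option (not mentioned in the answer above, just a rough sketch) is to poll Jenkins' progressive log endpoint, which returns only the output appended after a given offset. In Groovy, assuming the standard logText/progressiveText endpoint and its X-Text-Size / X-More-Data response headers:
def start = 0L
def more = true
while (more) {
    def conn = new URL("${BUILD_URL}/logText/progressiveText?start=${start}").openConnection()
    conn.inputStream.withReader { reader ->
        reader.eachLine { line ->
            // do something with each newly appended line
        }
    }
    start = (conn.getHeaderField('X-Text-Size') ?: start) as Long
    more = 'true'.equalsIgnoreCase(conn.getHeaderField('X-More-Data'))
    if (more) {
        sleep(2000)  // give the build time to produce more output
    }
}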

Related

Way to get some sort of schedule in TCL without blocking on-going code

I need some sort of scheduler to make a task happen at a given time of day (12:00, for example) in Tcl.
The scenario is a router running OpenWrt with Tcl 8.6.10, with limited RAM and storage, where I have an IRC client "bot" (using socket to connect). The "bot" was just a bare-bones script that I modify to suit my needs. Most things work fine, except that I have no easy way to schedule things. I wanted something like eggdrop's "bind time", whose form is "bind time flag "cron-style string" caller".
The "bot" is structured like this:
Main Tcl script:
<info+code to connect to IRC>
<while loop>
<some code in case of IRC disconnection>
<list of files with tcl code aka sub-scripts>
<usage of source based from a list of the filenames>
<code for error handling>
<end of while loop>
The list of files is loaded with source filelist.tcl, where filelist.tcl contains a set var {filename1.tcl filename2.tcl...}. Each filenameX.tcl has some basic code to respond to the IRC server or to channel input and reply to the channels.
I can fake a schedule by gating execution on something like if {[clock format [clock seconds] -format "%H:%M"]=="12:00"} {code to execute} and hoping that a server ping/pong arrives around that time, but that leads to repeated code inside the if body.
I have been looking around and found a package called cron, but I don't know how to use it correctly because there aren't many examples; I also don't know how to use vwait properly, and I don't want vwait to hang the bot while it waits for a value to change. I also read about Tcl threads for possible parallel execution.
So I need some code inside a sub-script that looks like this (cron-package style):
#beginning of file
#add a task specifying hour and minute
task-at "12:00" proccaller
proc procname {optional} {
<some code to be executed at specific hour+time>
}
#end of file
I also don't know how to use the after command for this.
How can I accomplish what I want?
Thanks for the replies, and yes, it would help if I studied event loops and coroutines; that probably comes next.
Some time has passed since I posted the question, and I kind of sorted it out by creating a sub-script in a folder named scripts with the following structure:
#beginning of the script
if {![info exists executed]} {set executed "no"}
#the following clock instruction returns, for example: Tuesday 22:14
switch -glob -- [clock format [clock seconds] -format "%A %H:%M"] {
    "*12:00" - "*12:01" {
        #Basic example of sending a message to the IRC channel when it's midday
        if {$executed == "no"} {
            puts $fd "PRIVMSG #CODE :It's midday right now."
            flush $fd
            set executed "yes"
        }
    }
    #...more time comparisons and code
    default {set executed "no"}
}
#end of script
The script is near the top of the list of scripts to be loaded, so if I want to send some command downstream at a given time, the command can be executed.
There are two timings per entry because the "bot" only reacts, at minimum, to the IRC server's ping, which happens every 90 seconds, so a single minute could be skipped.
This is not a real answer, just a rough workaround.
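For reference, a rough sketch (untested, just an illustration) of the task-at idea from the question, built on the plain after command: it assumes the bot sits in the Tcl event loop (vwait or fileevent-driven I/O) rather than a blocking while loop, and that fd is the IRC socket as above.
#add a task: run "script" at the next occurrence of hh:mm, using the event loop
proc task-at {hhmm script} {
    set now  [clock seconds]
    set when [clock scan $hhmm -format "%H:%M" -base $now]
    if {$when <= $now} {
        #that time has already passed today, so schedule it for tomorrow
        set when [clock add $when 1 day]
    }
    #after takes milliseconds and fires once control returns to the event loop
    after [expr {($when - $now) * 1000}] $script
}

#usage: send a message at midday, then re-arm for the next day
proc midday-msg {} {
    global fd
    puts $fd "PRIVMSG #CODE :It's midday right now."
    flush $fd
    task-at "12:00" midday-msg
}
task-at "12:00" midday-msg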

Redis - monitoring maximum memory before inserts fail?

This Q/A does not address the actual issue, which is: how can a client (e.g. redis-py) detect that Redis is running out of memory, constrained not by the machine but by the maxmemory configuration? Which command should the program use to detect that memory is about to be full, before inserts start to fail?
My first guess is: run INFO and check whether used_memory_peak is still below the maxmemory setting. Is this correct?
(As an aside: for running out of machine memory, which setting should be used, given defragmentation? None of the returned INFO fields seem to help there.)
Or should I just try an insert and see whether it fails (but that would be after the fact)?
Trial and error, good enough. Tested by running
while true; do redis-cli lpush mm longstringhere; done
which results in maxmemory - used_memory < 0.1 MB and insert failures:
(error) OOM command not allowed when used memory > 'maxmemory'.
So what I have set up is: I poll via the redis-py client, and once the difference drops below a 1 MB threshold I raise an error. Make sure the used_memory added by your largest command is smaller than that threshold too, of course, otherwise you still run into the limit on insert.
I also worked out how to calculate the approximate percentage of used memory so I get notified much earlier, e.g. at 90% of maxmemory, so this solution works fine for me.
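For reference, a rough sketch of that polling check with redis-py (the function name and threshold are illustrative, and it assumes maxmemory is actually configured):
import redis

HEADROOM_THRESHOLD = 1 * 1024 * 1024  # raise once less than ~1 MB remains below maxmemory

def check_redis_headroom(r):
    mem = r.info(section='memory')
    headroom = mem['maxmemory'] - mem['used_memory']
    if mem['maxmemory'] and headroom < HEADROOM_THRESHOLD:
        raise RuntimeError("Redis nearly full: only %d bytes below maxmemory" % headroom)
    return headroom

check_redis_headroom(redis.Redis())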
Info dump:
# Memory
used_memory:3126272
used_memory_human:2.98M
used_memory_rss:5292032
used_memory_rss_human:5.05M
used_memory_peak:4914296
used_memory_peak_human:4.69M
used_memory_peak_perc:63.62%
used_memory_overhead:696654...
Furthermore, maxmemory is not a hard cap; it can be pushed past slightly, e.g. by adding members to an existing set:
used_memory:3162584
used_memory_human:3.02M
Code to get the percentage (0-100):
import math

rmem_info = r.info(section='memory')  # r is a redis.Redis() client, not a pipeline
stats = {'redis_mem_percent': math.ceil(rmem_info['used_memory'] / rmem_info['maxmemory'] * 100)}

Defining "global" behavior in Gulp (measuring task duration)

I'm working on moving us from Ant to gulp, and as part of the effort I want to write timing stats to Graphite. We're doing this in Ant as well (no idea how; that's beside the point anyway). My question is: I'd prefer not to have to manually add some plugin or other to every task we have (we have over 60), but rather have some sort of global behavior where, for every task, a timer is started before the task runs, and when the task signals completion we push some data to Graphite (over statsd).
Can someone point me in the right direction where to hook into gulp for this? I couldn't find anything particularly useful in the docs / recipes...
We're running gulp#4.
Instead of adding timing code to your numerous tasks, you could make use of the NPM gulp-duration package.
A snippet of an example of its use is shown below:
function rebundle() {
  var uglifyTimer = duration('uglify time')
  var bundleTimer = duration('bundle time')
  return bundler.bundle()
    .pipe(source('bundle.js'))
    .pipe(bundleTimer)
    // start just before uglify receives its first file
    .once('data', uglifyTimer.start)
    .pipe(uglify())
    .pipe(uglifyTimer)
    .pipe(gulp.dest('example/'))
}
gulp-duration's duration function, which:
Creates a new pass-through duration stream. When this stream is
closed, it will log the amount of time since its creation to your
terminal.
will then allow you to log the duration of the task.
Whilst this is not a global behaviour solution, at least you can specify the timing code in your gulp file, as opposed to having to modify all 60+ of your tasks.
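For the truly global behaviour you asked about, gulp 4's task system (undertaker) emits 'start' and 'stop' events for every task run, so a single listener in the gulpfile can time all 60+ tasks without touching them. A rough sketch (the node-statsd client and the metric naming here are assumptions, not part of the original answer):
const gulp = require('gulp');
const StatsD = require('node-statsd');
const statsd = new StatsD();

// undertaker (gulp 4) emits these events for every registered task
gulp.on('start', (e) => {
  // e.name is the task name, e.uid identifies this particular run
});

gulp.on('stop', (e) => {
  // e.duration is a process.hrtime() pair: [seconds, nanoseconds]
  const ms = e.duration[0] * 1000 + e.duration[1] / 1e6;
  statsd.timing(`gulp.${e.name}`, ms);
});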

Neo4j: Java API IndexHits<Node>.size() is 0

I'm trying to use the Java API for Neo4j but I seem to be stuck at IndexHits. If I query the DB with Cypher using
START n=node:types(type="Process") RETURN n;
I get all 2087 nodes of type "Process".
In my application I have the following lines
Index<Node> nodeIndex = db.index().forNodes("types");
IndexHits<Node> hits = nodeIndex.get("type", "Process");
System.out.println("Node index size: " + hits.size());
which leads my console to spit out a value of 0. Here, db is of course an instance of GraphDatabaseService.
I expected an object that included all 2087 nodes. What am I doing wrong?
The .size() question is just the prelude to my iterator
for(Node process : hits) { ... }
but that does not do much when hits.size() == 0. According to http://api.neo4j.org/1.9.2/org/neo4j/graphdb/index/IndexHits.html this should be possible, provided there is something in hits.
Thanks in advance for your help.
I figured it out. Man, I feel so embarrassed...
It so happens that I had set DB_PATH to my default data folder, whereas the actual store directory is the data folder plus graph.db. When I ran the code with the corrected DB_PATH, I got an error saying that a lock file was in place because the Neo4j server was still running. After shutting the server down, it worked perfectly.
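In other words, the embedded database has to be pointed at the graph.db directory itself, not its parent (the path below is only an example), and the standalone server must be stopped first:
GraphDatabaseService db = new GraphDatabaseFactory()
        .newEmbeddedDatabase("/var/lib/neo4j/data/graph.db");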
So, if you happen to see the following error, just stop the server and run the code again:
Caused by: org.neo4j.kernel.StoreLockException: Could not create lock file
at org.neo4j.kernel.StoreLocker.checkLock(StoreLocker.java:74)
at org.neo4j.kernel.StoreLockerLifecycleAdapter.start(StoreLockerLifecycleAdapter.java:40)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:491)
I found on several forums that you cannot run the Neo4j server and use the embedded Java API against the same store at the same time.

Why is my Ruby script utilizing 90% of my CPU?

I wrote an admin script that tails a Heroku log and, every n seconds, summarizes averages and notifies me if I cross a certain threshold (yes, I know and love New Relic -- but I want to do custom stuff).
Here is the entire script.
I have never been a master of IO and threads, so I wonder if I am making a silly mistake. I have a couple of daemon threads that run while(true){} loops, which could be the culprit. For example:
# read new lines
f = File.open(file, "r")
f.seek(0, IO::SEEK_END)
while true do
  select([f])
  line = f.gets
  parse_heroku_line(line)
end
I use one daemon to watch for new lines of a log, and the other to periodically summarize.
Does someone see a way to make it less processor-intensive?
This probably runs hot because you never really block while reading from the temporary file. IO::select is a thin layer over POSIX select(2). It looks like you're trying to block until the file is ready for reading, but select(2) considers EOF to be ready ("a file descriptor is also ready on end-of-file"), so you always return right away from select then call gets which returns nil at EOF.
You can get a truer EOF reading and nice blocking behavior by avoiding the thread which writes to the temp file and instead using IO::popen to fork the %x[heroku logs --ps router --tail --app pipewave-cedar] log tailer, connected to a ruby IO object on which you can loop over gets, exiting when gets returns nil (indicating the log tailer finished). gets on the pipe from the tailer will block when there's nothing to read and your script will only run as hot as it takes to do your line parsing and reporting.
EDIT: I'm not set up to actually try your code, but you should be able to replace the log tailer thread and your temp file read loop with this code to get the behavior described above:
IO.popen( %w{ heroku logs --ps router --tail --app my-heroku-app } ) do |logf|
  while line = logf.gets
    parse_heroku_line(line) if line =~ /^/
  end
end
I also notice your reporting thread does not do anything to synchronize access to @total_lines, @total_errors, etc. So you have some minor race conditions where you can get inconsistent values from the instance vars that the parse_heroku_line method updates.
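A minimal sketch of that synchronization, using the instance variable names above (report stands in for whatever the reporting thread actually does):
@stats_lock = Mutex.new

# inside parse_heroku_line, wrap the counter updates
@stats_lock.synchronize { @total_lines += 1 }

# in the reporting thread, read the counters under the same lock
@stats_lock.synchronize { report(@total_lines, @total_errors) }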
select is about whether a read would block. f is just a plain old file, so when you get to the end, reads don't block; they just return nil instantly. As a result, select returns instantly rather than waiting for something to be appended to the file, as I assume you're expecting. Because of this you're sitting in a tight busy loop, so high CPU is to be expected.
If you are at EOF (you could either check f.eof? or check whether gets returns nil), then you could either start sleeping (perhaps with some sort of back-off) or use something like the listen gem to be notified of filesystem changes.
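For example, the sleep-on-EOF variant could look roughly like this (the file name is a placeholder):
f = File.open("heroku.log", "r")
f.seek(0, IO::SEEK_END)
loop do
  line = f.gets
  if line.nil?
    sleep 0.5           # back off instead of spinning at EOF
  else
    parse_heroku_line(line)
  end
end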
