Query Execution Time in Jena Fuseki

Is there a way to get the query execution time for SPARQL queries running in Jena Fuseki?

Check your Fuseki log output; you will see something like:
13:03:59 INFO [1] Query = XXX
13:04:00 INFO [1] exec/select
13:04:00 INFO [1] 200 OK (467 ms)
The last line (467 ms) shows the execution time.

The log file has entries with timestamps for the start and end of each request, and the last entry for a request records the time taken.
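If you also want an end-to-end measurement from the client side, curl's built-in timing is a simple option. A minimal sketch, assuming a dataset served at the default http://localhost:3030/ds endpoint:
curl -s -o /dev/null \
  -w 'total: %{time_total}s\n' \
  --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 10' \
  http://localhost:3030/ds/query
This reports the total request time as seen by the client, which includes network overhead on top of the execution time that Fuseki logs.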

Related

PROFILE and EXPLAIN not showing anything on cypher-shell

After seeing this question, I've been reading this blog post about the need to avoid Eager when loading a very large CSV into Neo4j.
In my case, I have a ~27 million line CSV, totaling ~8.5 GB in size. It seems pretty important that I break up my query into several queries to avoid Eager transactions.
EXPLAIN and PROFILE both offer ways to "test" a query. In Mark Needham's blog post linked above, he mentions:
You'll notice that when we profile each query we're stripping off the
periodic commit section and adding a 'WITH row LIMIT 0'. This allows
us to generate enough of the query plan to identify the 'Eager'
operator without actually importing any data.
However, when I try to test my query on the cypher shell with PROFILE prepended... nothing happens. I don't get any output or report back.
$ ./bin/cypher-shell
Connected to Neo4j 3.3.5 at bolt://localhost:7687 as user neo4j.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
neo4j> :begin
neo4j# PROFILE LOAD CSV WITH HEADERS FROM "file:///myfile.tsv" AS line FIELDTERMINATOR '\t'
WITH line LIMIT 0
MERGE ...
I also tried EXPLAIN and saw the same behavior -- no report or output.
If I paste the same PROFILE ... command into the Neo4j web interface, I do see the graphical plan show up, and even a warning tab telling me about EAGER. That is better than nothing, I suppose, but it's hard to read through the graphical display. I'd really like to use cypher-shell for this, but it bizarrely shows nothing.
I have also tried piping the EXPLAIN or PROFILE query to cypher-shell, but that just gives me some meta-data, not the actual plan.
$ cat query.cypher | ./bin/cypher-shell --format plain
Plan: "EXPLAIN"
Statement: "READ_WRITE"
Version: "CYPHER 3.3"
Planner: "COST"
Runtime: "INTERPRETED"
Time: 155
And with PROFILE:
$ cat query.cypher | ./bin/cypher-shell --format plain
Plan: "PROFILE"
Statement: "READ_WRITE"
Version: "CYPHER 3.3"
Planner: "COST"
Runtime: "INTERPRETED"
Time: 285
DbHits: 0
Rows: 1
count(*)
0
Any ideas what is going on?
That :begin opens a transaction; the query itself won't execute until you end it with :commit.
In this case, you can leave off :begin completely and just end the query with a semicolon. Also, since you're only after the query plan here, use EXPLAIN so it doesn't actually execute the query.
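For example (reusing the query from the question, with the MERGE clause elided as in the original):
neo4j> EXPLAIN LOAD CSV WITH HEADERS FROM "file:///myfile.tsv" AS line FIELDTERMINATOR '\t'
       WITH line LIMIT 0
       MERGE ...;
With the trailing semicolon and no surrounding :begin/:commit, the statement runs immediately instead of sitting in an open transaction.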

batch query is not allowed to request data from "derivatives"."autogen"

Good afternoon,
I have created the following TICKscript with a standard TICK stack setup, which includes InfluxDB (latest version) and Kapacitor (latest version):
dbrp "derivatives"."default"
var data = batch
|query('select sum(value) from "derivatives"."default".derivative_test where time > now() - 10m')
.every(1m)
.period(2m)
var slope = data
|derivative('value')
.as('slope')
.unit(2m)
slope
|eval(lambda: ("slope" - "value") / "value")
.as('percentage')
|alert()
.crit(lambda: "percentage" <= -50)
.id('derivative_test_crit')
.message('{{ .Level }}: DERIVATIVE FOUND!')
.topic('derivative')
// DEBUGGING
|influxDBOut()
.database('derivatives')
.measurement('derivative_logs')
.tag('sum', 'sum')
.tag('slope', 'slope')
.tag('percentage', 'percentage')
But every time I try to define it I get the following message:
batch query is not allowed to request data from "derivatives"."autogen"
I never had this problem before with streams, but every batch TICKscript I write returns the same message.
My Kapacitor user has full admin privileges and I am able to get the data via a curl request. Does anyone have any idea what could possibly be the problem here?
My thanks in advance.
Change this
dbrp "derivatives"."default"
var data = batch
|query('select sum(value) from "derivatives"."default".derivative_test where time > now() - 10m')
to this:
dbrp "derivatives"."autogen"
var data = batch
|query('select sum(value) from "derivatives"."autogen".derivative_test where time > now() - 10m')
It might not be obvious, but the retention policy is most likely incorrect.
If you run SHOW RETENTION POLICIES on the derivatives database you will see its RPs. I suspect you have an RP named autogen, which is the default RP. However, "default" doesn't normally exist as an RP name unless you create it; it just signifies which RP is the default, if that makes sense.
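For example, on a database that only has the default RP, the influx CLI output would look something like this (illustrative output; your durations may differ):
> SHOW RETENTION POLICIES ON "derivatives"
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true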
The retention policy documentation in the InfluxDB database documentation might help clear it up.

error: ORA-02289 - Sequence doesn't exist in Agile PLM 9.3.5

Not sure if this is the right place to ask this question.
I am facing issues while performing any action in Agile PLM 9.3.5. I have upgraded PLM from 9.3.3 to 9.3.5. I have also checked the sequence table, and all the sequences are available. Still, I am getting the above error while creating any object or updating any user profile.
Thanks!
If the issue is still not resolved, you can try this:
After you upgrade to Agile 9.3.5, you need to run the 'reorder_query.bat' script in the [AUT_HOME]/AUT/bin directory. This tool clears out temporary records and gaps to compact the query table so that sequence IDs can be reused. This information is in the Agile Database Upgrade Guide.
If that doesn't work, please refer to Doc ID 1606365.1 in the MOS knowledge base.
If you don't have access, I am pasting the relevant excerpt about the plan of action below.
Stop the application server and bounce the database server to make sure all in-flight transactions are committed. While the database is down, take a cold backup. Leave the application server down during this process to prevent users from connecting.
Download the attached script called GAP_HUNTER_GC_v1.0.sql to a machine that has the Oracle database client installed and can connect to your Agile schema through SQL*Plus, and run it. The output on the screen will look similar to this:
SQL> @GAP_HUNTER_GC_v1.0.sql
You are logging on DB User - AGILE
Your agile database data version is 9.3.095.0
Your agile database schema version is 9.3.095
Please enter the gap threshold, default 5000:
Please enter the number of top largest gaps, default 10:
>>>>>>>> Start to collect gap ....
>>>>>>>> Prepare for scanning tables....
>>>>>>>> Start to collect tables and Generate the mapping tables ....
>>>>>>>> Step 1: Collect Reused ids....Begin time:20131208 11:39:17
table is not existing:Regulation_addorreplace_action
table is not existing:Regulation_addorreplace_task
table is not existing:INSTANCES
table is not existing:REFERENCE_OBJECT
>>>>>>>> Step 2: Generate gap .... Begin time:20131208 11:39:17
>>>>>>>> Step 3: Finish the Gap Hunter Process ....
>>>>>>>> Report: There are 0 id(s) have been collected in the GAP
Sequence Indexer Number, Gap Size, Starting Number, Ending Number
67018473, 131226320, 1352956646, 1484182965
50955717, 94058060, 1031324895, 1125382954
89993219, 87600000, 1812982965, 1900582964
78036370, 87424300, 1573458652, 1660882951
29531387, 77700000, 601882965, 679582964
86572585, 68412680, 1744470274, 1812882953
59910085, 67800000, 1210682962, 1278482961
25834330, 59801320, 527781692, 587583011
83797585, 55500000, 1688882958, 1744382957
12104050, 47011460, 252171585, 299183044
>>>>>>>> End .........
The output from the script run is placed into log files on the file system. The log files are located in the same directory from which SQL*Plus was launched. Look for the following files:
gap_hunter_version.log
gap_hunter.log
gap_hunter_report.log
Open the gap_hunter_report.log file and review the first set of numbers in the list. For example:
Sequence Indexer Number, Gap Size, Starting Number, Ending Number
67018473, 131226320, 1352956646, 1484182965
This indicates the largest set of numbers available, with a gap size of 131226320, starting at 1352956646 and ending at 1484182965.
Drop and recreate the AGILEOBJECTIDSEQUENCE sequence using the numbers identified in the report above:
drop sequence AGILEOBJECTIDSEQUENCE;
create sequence AGILEOBJECTIDSEQUENCE minvalue 1 maxvalue [Ending Number] increment by 20 cache 20 noorder nocycle start with [Starting Number];
For example:
SQL> drop sequence AGILEOBJECTIDSEQUENCE;
Sequence dropped.
SQL> create sequence AGILEOBJECTIDSEQUENCE minvalue 1 maxvalue 1484182965 increment by 20 cache 20 noorder nocycle start with 1352956646;
Sequence created.
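As a quick sanity check afterwards (run as the Agile schema owner; this only queries the data dictionary), you can confirm the recreated sequence exists with the expected settings:
SELECT sequence_name, min_value, max_value, increment_by, last_number
FROM user_sequences
WHERE sequence_name = 'AGILEOBJECTIDSEQUENCE';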

Rails: export millions of rows to CSV

So I found a lot of articles where people are having issues exporting big data to a CSV with Rails. I'm able to do this, but it takes about 40 seconds per 20k rows.
Has anyone overcome this issue? I've searched everywhere for the past couple of hours and couldn't find anything that worked for me.
Thanks!
Suppose you want to export 1k rows to CSV. You can write a rake task which accepts a limit and offset to pull data from the table, then drive it from a small Ruby script, something like the one below:
batch_size = 100
offset = 0
(0..9).each do |index|
  # Launch each slice as a background rake task so the slices run in parallel.
  system("nohup rake 'my_task:load_csv[#{batch_size},#{offset},#{index}]' > rake_#{index}.out 2>&1 &")
  offset += batch_size
end
** Refer to this link to learn more about how to run rake tasks in the background.
The rake task will look something like this:
namespace :my_task do
  task :load_csv, [:limit, :offset, :index] => :environment do |_t, args|
    # write code here to load data from the table using the limit and offset
    # write the data returned by that query to FILE_NAME_#{index}.csv
  end
end
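As a minimal sketch of what that body might look like (assuming a model named Record and the standard csv library; adjust the model and columns to your schema):
namespace :my_task do
  task :load_csv, [:limit, :offset, :index] => :environment do |_t, args|
    require 'csv'
    # Pull one slice of the table, ordered so the slices don't overlap.
    rows = Record.order(:id).limit(args[:limit].to_i).offset(args[:offset].to_i)
    CSV.open("FILE_NAME_#{args[:index]}.csv", "w") do |csv|
      csv << Record.column_names
      rows.each { |row| csv << row.attributes.values }
    end
  end
end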
Once you see that all the rake tasks are finished, combine all the files by index. If you want to automate the process of combining files, you need to write some code for process monitoring: grep for all active rake tasks and store their PIDs in an array, then every 15 seconds or so check the status of each process by its PID. If a process is no longer running, pop its PID from the array. Continue until the array is empty, i.e. all rakes are finished, and then merge the files by their index (a rough sketch follows below).
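A rough sketch of that monitoring loop in Ruby (assuming the tasks were launched with the command line shown above, so pgrep can find them by name):
# Wait for the background rake tasks to finish, then merge the CSV files by index.
pids = `pgrep -f "my_task:load_csv"`.split.map(&:to_i)
until pids.empty?
  sleep 15
  pids.reject! do |pid|
    begin
      Process.kill(0, pid)   # signal 0 only checks whether the process still exists
      false
    rescue Errno::ESRCH
      true                   # process has exited, drop it from the list
    end
  end
end
files = Dir.glob("FILE_NAME_*.csv").sort_by { |f| f[/\d+/].to_i }
File.open("combined.csv", "w") do |out|
  files.each { |f| out.write(File.read(f)) }
end
If each per-index file carries its own header row, drop the duplicate headers while merging.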
Hopefully this helps you. Thanks!

AWS SimpleDB where clause 'and' operator behaving unexpectedly

The following simpledb query returns 51 results:
select * from logger where time > '2011-07-29 17:45:10.540284+00:00'
This query returns 20534 results:
select * from logger where time < '2011-07-29 17:50:08.615626'
These two queries both return 0 results!!?:
select * from logger where time between '2011-07-29 17:45:10.540284+00:00' and '2011-07-29 17:50:08.615626'
select * from logger where time > '2011-07-29 17:45:10.540284+00:00' and time < '2011-07-29 17:50:08.615626'
What am I missing here?
But are any of your 51 results returned from the first query actually within the time span you are searching? If they are all later than 17:50:08.615626 then your queries are performing as expected.
I am also suspicious of the fact that you are being inconsistent in how you are representing the time. You should really be using ISO 8601 timestamps if you want consistent lexicographic matching of times with SDB.
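For example, in Ruby (illustrative; any language's ISO 8601 formatter will do), fixed-width UTC timestamps compare correctly as strings:
require 'time'
# Zero-padded UTC timestamps sort lexicographically in chronological order.
Time.now.utc.iso8601(6)   # => e.g. "2011-07-29T17:50:08.615626Z"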
The other option is that the queries are taking longer than the query timeout to run. Are you checking for errors?
Finally, perhaps SDB is having a bad day and the query is just a bit slow. In those circumstances you can find that you get 0 results but DO get a next token, and the actual results follow in the next batch.
Does any of that help?
