Perform an INSERT in Jena Fuseki with SPARQL gem (Ruby) - ruby-on-rails

So I'm developing an API in Rails and using Jena Fuseki to store triples, and right now I'm trying to perform an INSERT into a named graph. The query itself is correct, since I ran it in Jena and it worked perfectly. However, no matter what I do from the Rails console, I keep getting the same error message:
SPARQL::Client::MalformedQuery: Error 400: SPARQL Update: No 'update=' parameter
I've created a method that takes the parameters of the object I'm trying to insert, and specified the graph where I want them.
def self.insert_taxon(uri, label, comment, subclass_of)
  endpoint = SPARQL::Client.new("http://app.talkiu.com:8030/talkiutest/update")
  query = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX gpc: <http://www.ebusiness-unibw.org/ontologies/pcs2owl/gpc/>
    PREFIX tk: <www.web-experto.com.ar/ontology#>
    INSERT DATA {
      GRAPH <http://app.talkiu.com:8030/talkiutest/data/talkiu> {
        <#{uri}> a owl:Class .
        <#{uri}> rdfs:label '#{label}'@es .
        <#{uri}> rdfs:comment '#{comment}' .
        <#{uri}> rdfs:subClassOf <#{subclass_of}> .
      }
    }"
  resultset = endpoint.query(query)
end
As you can see, I'm using the UPDATE endpoint. Any ideas? Thanks in advance

Well... Instead of endpoint.query, I tried
resultset = endpoint.update(query)
and it worked. The method returned
<SPARQL::Client:0x2b0158a050e4(http://app.talkiu.com:8030/talkiutest/update)>
and the data is showing up in my database and graph. Hope this helps anyone with the same problem.
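One more caveat with this approach: interpolating label and comment straight into the query string will break the update (or open it to injection) if they contain quotes or newlines. A minimal escaping helper, sketched here under that assumption (the sparql_escape name is my own, not part of the gem):

```ruby
# Escape a Ruby string for use inside a single-quoted SPARQL literal.
# Backslashes, quotes, and newlines are special in the SPARQL grammar.
def sparql_escape(value)
  value.to_s.gsub(/[\\'\n\r]/) do |ch|
    { "\\" => "\\\\", "'" => "\\'", "\n" => "\\n", "\r" => "\\r" }[ch]
  end
end

# Interpolating the escaped value keeps the triple well-formed:
puts "rdfs:label '#{sparql_escape("Rock 'n' Roll")}'@es ."
```

Calling sparql_escape on each interpolated value before building the INSERT DATA string keeps a stray apostrophe in a label from producing another Error 400.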

Related

Microsoft.Jet.OLEDB.4.0 , JOIN two MDB Files

I am having trouble figuring out the correct syntax for the file path:
SELECT c1.Produkt,c2.Name
from Reservation c1
left join [Articles.mdb].[Articles] c2 on c1.Produkt=c2.Produkt
group by c1.Produkt,c2.Name
It now searches for Articles.mdb inside the application folder. I would like to specify the path, for example c:\database\articles.mdb, but unfortunately I cannot figure out how to do it.
I tried [c:\database\articles.mdb] and ['c:\database\articles.mdb'], but I get either Incorrect Parameter or Incorrect Filename.
Please help.
UPDATE:
After removing the parameter check from the TADOQuery and entering the text like this:
left join [c:\Articles.mdb].[Articles] c2 on c1.Produkt=c2.Produkt
It works.

How to fix custom function class not registered in apache jena fuseki error?

I need a custom filter function in Apache Jena Fuseki. I tried adding the custom function class name to config.ttl and added the function class files to the classpath, but it always throws an error saying the function is not registered.
Can anyone please share a detailed approach I can try, or some documentation? I desperately need it.
I added the following line to the configuration file:
[] ja:loadClass "org.apache.jena.sparql.function.library.function" .
The class file is in the folder /home/user/custom_functions/
The class file's package name is org.apache.jena.sparql.function.library.
The Java command to launch the Fuseki server is:
java -cp /home/user/custom_functions/function.class:/home/user/apache-jena-4.5.0/lib-src/*:/home/user/apache-jena-4.5.0/lib/* -jar fuseki-server.jar
The function takes one argument.
When I run a query, the error log says the function has not been registered with the FunctionFactory.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX java: <http://www.w3.org/2007/uwa/context/java.owl#>
PREFIX f: <java:org.apache.jena.sparql.function.library.>
SELECT ?s ?o {
  ?s rdfs:label ?o .
  FILTER ( f:function(?o) ) .
}

Strange URL containing 'A=0 or '0=A in web server logs

During the last weekend some of my sites logged errors implying wrong usage of our URLs:
...news.php?lang=EN&id=23'A=0
or
...news.php?lang=EN&id=23'0=A
instead of
...news.php?lang=EN&id=23
I found only one page originally which mentioned this (https://forums.adobe.com/thread/1973913) where they speculated that the additional query string comes from GoogleBot or an encoding error.
I recently changed my sites to use PDO instead of mysql_*. Maybe this change caused the errors? Any hints would be useful.
Additionally, all of the requests come from the same user-agent shown below.
Mozilla/5.0 (Windows; U; Windows NT 5.1; pt-PT; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)
This led me to find the following threads:
pt-BR
and
Strange parameter in URL - what are they trying?
It is a bot testing for SQL injection vulnerabilities by closing a query with an apostrophe, then setting a variable. There are also similar injects that deal with shell commands and/or file path traversals. Whether it's a "good bot" or a bad bot is unknown, but if the inject works, you have bigger issues to deal with. There's a 99% chance your site is not generating these styles of links, and there is nothing you can do to stop bots from crafting those URLs unless you block the request(s) with a simple regex string or a more complex WAF such as ModSecurity.
Blocking based on user agent is not an effective angle. You need to look for the request heuristics and block based on that instead. Some examples of things to look for in the url/request/POST/referrer, as both utf-8 and hex characters:
double apostrophes
double periods, especially followed by a slash in various encodings
words like "script", "etc" or "passwd"
paths like dev/null used with piping/echoing shell output
%00 null-byte style characters used to initialize a new command
http in the url more than once (unless your site uses it)
anything regarding cgi (unless your site uses it)
random "enterprise" paths for things like coldfusion, tomcat, etc
If you aren't using a WAF, here is a regex concat that should capture many of those within a URL. We use it in PHP apps, so you may/will need to tweak some of the escapes depending on where you are using this. Note that this has .cgi, WordPress, and wp-admin along with a bunch of other stuff in the regex; remove them if you need to.
$invalid = "(\(\))"; // let's not look for quotes; [good] bots use them constantly. Looking for () since technically parentheses aren't valid
$period = "(\\002e|%2e|%252e|%c0%2e|\.)";
$slash = "(\\2215|%2f|%252f|%5c|%255c|%c0%2f|%c0%af|\/|\\\)"; // http://security.stackexchange.com/questions/48879/why-does-directory-traversal-attack-c0af-work
$routes = "(etc|dev|irj)" . $slash . "(passwds?|group|null|portal)|allow_url_include|auto_prepend_file|route_*=http";
$filetypes = $period . "+(sql|db|sqlite|log|ini|cgi|bak|rc|apk|pkg|deb|rpm|exe|msi|bak|old|cache|lock|autoload|gitignore|ht(access|passwds?)|cpanel_config|history|zip|bz2|tar|(t)?gz)";
$cgis = "cgi(-|_){0,1}(bin(-sdb)?|mod|sys)?";
$phps = "(changelog|version|license|command|xmlrpc|admin-ajax|wsdl|tmp|shell|stats|echo|(my)?sql|sample|modx|load-config|cron|wp-(up|tmp|sitemaps|sitemap(s)?|signup|settings|" . $period . "?config(uration|-sample|bak)?))" . $period . "php";
$doors = "(" . $cgis . $slash . "(common" . $period . "(cgi|php))|manager" . $slash . "html|stssys" . $period . "htm|((mysql|phpmy|db|my)admin|pma|sqlitemanager|sqlite|websql)" . $slash . "|(jmx|web)-console|bitrix|invoker|muieblackcat|w00tw00t|websql|xampp|cfide|wordpress|wp-admin|hnap1|tmunblock|soapcaller|zabbix|elfinder)";
$sqls = "((un)?hex\(|name_const\(|char\(|a=0)";
$nulls = "(%00|%2500)";
$truth = "(.{1,4})=\1"; // catch OR always-true (1=1) clauses via sql inject - not used atm, it's too broad and may capture search=chowder (ch=ch) for example
$regex = "/$invalid|$period{1,2}$slash|$routes|$filetypes|$phps|$doors|$sqls|$nulls/i";
Using it, at least with PHP, is pretty straight forward with preg_match_all(). Here is an example of how you can use it: https://gist.github.com/dhaupin/605b35ca64ca0d061f05c4cf423521ab
WARNING: Be careful if you set this to autoban (i.e., a fail2ban filter). MS/Bing DumbBots (and others) often muck up URLs by entering things like strange triple dots from following truncated URLs, or trying to hit a tel: link as a URI. I don't know why. Here is what I mean: a link with the text www.example.com/link-too-long...truncated.html may point to a correct URL, but Bing may try to access it "as it looks" instead of following the href, resulting in a WAF hit due to double dots.
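For reference, the same heuristics port easily to other languages. Here is a much-simplified Ruby sketch covering just a few of the patterns above (the SUSPICIOUS constant and suspicious? helper are my own names, and this is nowhere near the full PHP concat):

```ruby
# A small subset of the request heuristics: null bytes, dot-dot
# traversal, the 'A=0 style inject, and sensitive path probes.
SUSPICIOUS = Regexp.union(
  /%00|%2500/i,                       # encoded null bytes
  /(\.|%2e){2,}(\/|%2f|%5c)/i,        # double dots followed by a slash
  /'\s*[0a]\s*=\s*[0a]/i,             # the '0=A / 'A=0 style inject
  /(etc|dev)(\/|%2f)(passwds?|null)/i # sensitive path probes
)

# Returns true when the query string matches any of the patterns.
def suspicious?(query_string)
  !!(query_string =~ SUSPICIOUS)
end
```

As with the PHP version, treat a match as a signal to log or block the request, not as proof of compromise, and expect to tune the patterns for your own site.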
Since this is a very old version of Firefox, I blocked it in my .htaccess file:
RewriteCond %{HTTP_USER_AGENT} Firefox/3\.5\.2 [NC]
RewriteRule .* err404.php [R,L]

Configuring Jena Fuseki + inference and TDB?

I am new to Jena TDB and Fuseki. I would like to load Lehigh University Benchmark (LUBM) data generated with their data generator (ver. 1.7) into Fuseki. This is about 400 .owl files. I used the following configuration file, which comes with Fuseki, for inferencing:
<#service1> rdf:type fuseki:Service ;
    fuseki:name "inf" ;                        # http://host/inf
    fuseki:serviceQuery "sparql" ;             # SPARQL query service
    #fuseki:serviceUpdate "update" ;
    fuseki:serviceReadWriteGraphStore "data" ;
    # A separate read-only graph store endpoint:
    fuseki:serviceReadGraphStore "get" ;
    fuseki:dataset <#dataset> ;
    .

<#dataset> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#model_inf> ;
    .

<#model_inf> a ja:InfModel ;
    ja:baseModel <#tdbGraph> ;
    ja:reasoner [
        ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>
    ] .

<#tdbDataset> rdf:type tdb:DatasetTDB ;
    tdb:location "myDB" ;
    tdb:unionDefaultGraph true ;
    .

<#tdbGraph> rdf:type tdb:GraphTDB ;
    tdb:dataset <#tdbDataset> .
Fuseki starts without any issues. However when I execute the following command:
./s-put http://localhost:3030/inf/data default ~/Owl/univ-bench.owl
I get an error: 405 HTTP method PUT is not supported by this URL http://localhost:3030/inf/data?default
I have a couple of questions:
1. The update in the config file is clearly not disabled, so why do I get this message?
2. In order to load all 400 .owl files as one graph, apparently I have to disable the update and enable tdb:unionDefaultGraph true (this is mentioned in the config file that came with Fuseki). If that is the case, how on earth am I supposed to load the data into Fuseki?
Please let me know what I am missing here and how I can do this correctly.
Thanks in advance for the help.
Edit: I found out that you will need to add the following:
fuseki:serviceReadWriteGraphStore "data" ;
# A separate read-only graph store endpoint:
fuseki:serviceReadGraphStore "get" ;
in order to be able to use s-put to load data. However, every time I add a new file it overwrites the data from the previous file, and therefore the inferencing doesn't work. What did I do wrong here? How do I load the data correctly, so that all the files are loaded into the same graph and inferencing works?
Edit
So, digging more into this problem, I found out that there are two ways to load the data.
1. You can add the following where you define the model in the config file:
ja:content [ja:externalContent <file://// Path_to_owl_file >] ;
For me, I added it under <#model_inf> a ja:InfModel ;. However, if you have 400 files, that will be really tedious.
2. You can separately load the data using tdbloader2 and point the config file to the directory that tdbloader2 builds as a database, which is also described here:
$ tdbloader2 --loc tdb PATH_TO_DIR_or_OWL_Files
The issue currently is that when I run a simple query, for instance the following, I get an Out of memory error.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ub: <http://cs.uga.edu#>
SELECT * WHERE {
  ?X rdf:type ub:GraduateStudent .
  ?X ub:takesCourse <http://www.Department0.University0.edu/GraduateCourse0>
}
I increased the memory for fuseki-server (in the server script) to up to 5GB and still get an out-of-memory error for this simple query. Any idea why that might be happening?
s-put does a PUT - which is defined to be a "replace contents".
Use s-post to add to a graph.
LUBM is sufficiently simple in structure that (1) it is not very realistic and (2) inference can be applied to each university alone and the data loaded, so that at query time it has all been expanded.
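The difference between the two tools is just the HTTP verb on the graph store endpoint: PUT replaces the graph's contents, POST appends to them. A minimal sketch of what s-post does, in Ruby (the endpoint URL and file name are taken from the question as placeholders; the line that actually sends the request is commented out):

```ruby
require "net/http"
require "uri"

# POST appends triples to the target graph; PUT would replace it.
uri = URI("http://localhost:3030/inf/data?default")
request = Net::HTTP::Post.new(uri)
request["Content-Type"] = "application/rdf+xml"
# Placeholder file from the question; read it if it exists locally.
request.body = File.read("univ-bench.owl") if File.exist?("univ-bench.owl")

# Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Running this once per .owl file accumulates all 400 files in the same graph, which is what the inferencing setup needs.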

Extract pieces from URL

I need to extract pieces from a URL and I am trying to learn preg_match_all().
The output in a PHP variable:
$content = 'http://www.domain.com/folder1/firstname_lastname.jpg';
Here is my attempt:
preg_match_all('/http://(.*?).jpg/s', $content, $out, PREG_SET_ORDER);
echo $out[0][0] . "\n";
Matching the URL is not easy.
I need to pick out from:
hxxp://www.domain.com/folder1/firstname_lastname.jpg
the following: "www.domain.com" and "folder1" and "firstname_lastname"
Could I get one preg_match_all() for each example?
Thanks in advance.
I like to learn by example and trial and error.
You can use these:
http://uk3.php.net/parse_url
http://uk1.php.net/manual/en/function.parse-str.php
to get the constituent parts of a URL and query string.
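The same split works without any regex once the URL is parsed; here is the idea sketched with Ruby's standard URI library (PHP's parse_url gives you the same host/path pieces):

```ruby
require "uri"

url = URI.parse("http://www.domain.com/folder1/firstname_lastname.jpg")

host = url.host                              # the domain part
# The path is "/folder1/firstname_lastname.jpg"; split it into segments.
folder, filename = url.path.split("/").reject(&:empty?)
basename = File.basename(filename, ".jpg")   # strip the extension
```

This yields "www.domain.com", "folder1", and "firstname_lastname" as three separate values, rather than trying to capture all of them in one pattern.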
