I define an assembler file with name dataset2.ttl. The content of this file is:
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
<#dataset> rdf:type tdb:DatasetTDB ;
tdb:location "DB" ;
tdb:unionDefaultGraph true ;
.
<#data1> rdf:type tdb:GraphTDB ;
tdb:dataset <#dataset> ;
tdb:graphName <http://example.org/data1> ;
ja:content [ja:externalContent <file:///C:/Users/data/data1.ttl>;];
.
The related Jena code to create a dataset is:
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TDB {
    public static void main(String[] args) {
        Dataset ds = null;
        try {
            ds = TDBFactory.assembleDataset("Dataset2.ttl");
            if (ds == null) {
                System.out.println("initial tdb failed");
            } else {
                System.out.println("Default Model:");
                Model model = ds.getDefaultModel();
                ds.begin(ReadWrite.WRITE);
                model.write(System.out, "TURTLE");
            }
        } finally {
            if (ds != null) {
                ds.close();
            }
        }
    }
}
The content in data1.ttl is:
@prefix : <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
:alice
a foaf:Person ;
foaf:name "Alice" ;
foaf:mbox <mailto:alice@example.org> ;
foaf:knows :bob ;
foaf:knows :charlie ;
foaf:knows :snoopy ;
.
:bob
foaf:name "Bob" ;
foaf:knows :charlie ;
.
:charlie
foaf:name "Charlie" ;
foaf:knows :alice ;
.
A dataset has been created using this code. However, the content of the file "data1.ttl" has not been read into the model. What is the problem with my code?
You also have
<#dataset> rdf:type tdb:DatasetTDB ;
tdb:location "DB" ;
tdb:unionDefaultGraph true ;
.
and
ds = TDBFactory.assembleDataset("Dataset2.ttl");
so you are asking Jena to assemble a dataset. That dataset will be <#dataset> (found by its type). It is not connected to the graph you define, so that part is ignored; you can remove it. Assembling the dataset is the way to do this.
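If only the dataset is needed, a minimal assembler file could look like this (a sketch reusing the location and prefixes from the question):

```
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<#dataset> rdf:type tdb:DatasetTDB ;
    tdb:location "DB" ;
    tdb:unionDefaultGraph true .
```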
You have tdb:unionDefaultGraph true so the default graph for query is the combination of all named graphs in the dataset.
Pick one of them out with ds.getNamedModel(...).
If you use SPARQL, use the GRAPH keyword.
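For example, a query restricted to one named graph could look like this (a sketch using the graph IRI from the question):

```
SELECT ?s ?p ?o
WHERE {
  GRAPH <http://example.org/data1> { ?s ?p ?o }
}
```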
I would try validating your Turtle files online to make sure that dataset2.ttl and data1.ttl are both valid. I noticed you seem to add an extra semicolon at the end when it's not needed (a statement should end with just a period).
Try changing your line to this:
ja:content [ja:externalContent <file:///C:/Users/data/data1.ttl>] .
<#data1> rdf:type tdb:GraphTDB ;
tdb:dataset <#dataset> ;
tdb:graphName <http://example.org/data1> ;
ja:content [ja:externalContent <file:///C:/Users/data/data1.ttl>;];
.
Note the tdb:GraphTDB, which means "attach to a graph in the database". It does not load data with ja:content.
As a persistent store, it is expected that the data is already loaded, e.g. by tdbloader, rather than being loaded every time the assembler is used.
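For instance, a one-time load into the named graph ahead of assembly might look like this (a sketch only; the exact script name and path format depend on your TDB distribution and OS):

```
tdbloader --loc DB --graph http://example.org/data1 C:/Users/data/data1.ttl
```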
Related
I want to change every entry in csv file to 'BlahBlah'
For that I have antlr grammar as
grammar CSV;
file : hdr row* row1;
hdr : row;
row : field (',' value1=field)* '\r'? '\n'; // '\r' is optional at the end of a row of CSV file ..
row1 : field (',' field)* '\r'? '\n'?;
field
: TEXT
{
$setText("BlahBlah");
}
| STRING
|
;
TEXT : ~[,\n\r"]+ ;
STRING : '"' ('""' | ~'"')* '"' ;
But when I run this on antlr4
error(63): CSV.g4:13:3: unknown attribute reference setText in $setText
make: *** [run] Error 1
why is setText not supported in antlr4 and is there any other alternative to replace text?
Couple of problems here:
First, you have to identify the receiver of the setText method. You probably want:
field : TEXT { $TEXT.setText("BlahBlah"); }
| STRING
;
Second, setText is not defined in the Token interface.
Typically, you create your own token class extending CommonToken and a corresponding token factory class. Set the TokenLabelType (in the options block) to your token class name. The setText method in CommonToken will then be visible.
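A minimal sketch of that setup (the class name MyToken is illustrative; note that in ANTLR4, CommonToken already implements WritableToken, so extending it is what makes setText available on typed labels):

```java
import org.antlr.v4.runtime.CommonToken;
import org.antlr.v4.runtime.Token;

// Hypothetical token class; labels typed as MyToken expose setText(...)
// because CommonToken implements WritableToken.
public class MyToken extends CommonToken {
    public MyToken(Token oldToken) {
        super(oldToken);
    }
}

// In the grammar, the options block would then declare:
//   options { TokenLabelType = MyToken; }
```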
tl;dr:
Given the following grammar (derived from the original CSV.g4 sample and the OP's grammar attempt, cf. question):
grammar CSVBlindText;
@header {
import java.util.*;
}
/** Derived from rule "file : hdr row+ ;" */
file
locals [int i=0]
: hdr ( rows+=row[$hdr.text.split(",")] {$i++;} )+
{
System.out.println($i+" rows");
for (RowContext r : $rows) {
System.out.println("row token interval: "+r.getSourceInterval());
}
}
;
hdr : row[null] {System.out.println("header: '"+$text.trim()+"'");} ;
/** Derived from rule "row : field (',' field)* '\r'? '\n' ;" */
row[String[] columns] returns [Map<String,String> values]
locals [int col=0]
@init {
$values = new HashMap<String,String>();
}
@after {
if ($values!=null && $values.size()>0) {
System.out.println("values = "+$values);
}
}
// rule row cont'd...
: field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
( ',' field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
)* '\r'? '\n'
;
field
: TEXT
| STRING
|
;
TEXT : ~[',\n\r"]+ {setText( "BlahBlah" );} ;
STRING : '"' ('""'|~'"')* '"' ; // quote-quote is an escaped quote
One has:
$> antlr4 -no-listener CSVBlindText.g4
$> grep setText CSVBlindText*java
CSVBlindTextLexer.java: setText( "BlahBlah" );
Compiling it works flawlessly:
$> javac CSVBlindText*.java
Test data (the users.csv file, just renamed):
$> cat blinded_by_grammar.csv
User, Name, Dept
parrt, Terence, 101
tombu, Tom, 020
bke, Kevin, 008
Yields in test:
$> grun CSVBlindText file blinded_by_grammar.csv
header: 'BlahBlah,BlahBlah,BlahBlah'
values = {BlahBlah=BlahBlah}
values = {BlahBlah=BlahBlah}
values = {BlahBlah=BlahBlah}
3 rows
row token interval: 6..11
row token interval: 12..17
row token interval: 18..23
So it looks as if the setText() should be injected before the semicolon of a production and not between alternatives (wild guessing here ;-)
Previous iterations below:
Just guessing, as I (1) have no working antlr4 available currently and (2) have not written ANTLR4 grammars for quite some time now: maybe without the dollar sign ($)?
grammar CSV;
file : hdr row* row1;
hdr : row;
row : field (',' value1=field)* '\r'? '\n'; // '\r' is optional at the end of a row of CSV file ..
row1 : field (',' field)* '\r'? '\n'?;
field
: TEXT
{
setText("BlahBlah");
}
| STRING
|
;
TEXT : ~[,\n\r"]+ ;
STRING : '"' ('""' | ~'"')* '"' ;
Update: Now that an antlr 4.5.2 (at least via brew) instead of a 4.5.3 is available, I dug into this. Answering a comment below from the OP: the setText() will be generated in the lexer Java module if the grammar is well defined. Unfortunately, debugging antlr4 grammars is hard for a dilettante like me, but it is nevertheless a very nice language construction kit IMO.
Sample session:
$> antlr4 -no-listener CSV.g4
$> grep setText CSVLexer.java
setText( String.valueOf(getText().charAt(1)) );
The grammar used:
(hacked up from example code retrieved via:
curl -O http://media.pragprog.com/titles/tpantlr2/code/tpantlr2-code.tgz )
grammar CSV;
@header {
import java.util.*;
}
/** Derived from rule "file : hdr row+ ;" */
file
locals [int i=0]
: hdr ( rows+=row[$hdr.text.split(",")] {$i++;} )+
{
System.out.println($i+" rows");
for (RowContext r : $rows) {
System.out.println("row token interval: "+r.getSourceInterval());
}
}
;
hdr : row[null] {System.out.println("header: '"+$text.trim()+"'");} ;
/** Derived from rule "row : field (',' field)* '\r'? '\n' ;" */
row[String[] columns] returns [Map<String,String> values]
locals [int col=0]
@init {
$values = new HashMap<String,String>();
}
@after {
if ($values!=null && $values.size()>0) {
System.out.println("values = "+$values);
}
}
// rule row cont'd...
: field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
( ',' field
{
if ($columns!=null) {
$values.put($columns[$col++].trim(), $field.text.trim());
}
}
)* '\r'? '\n'
;
field
: TEXT
| STRING
| CHAR
|
;
TEXT : ~[',\n\r"]+ ;
STRING : '"' ('""'|~'"')* '"' ; // quote-quote is an escaped quote
/** Convert 3-char 'x' input sequence to string x */
CHAR: '\'' . '\'' {setText( String.valueOf(getText().charAt(1)) );} ;
Compiling works:
$> javac CSV*.java
Now test with a matching weird csv file:
a,b
"y",'4'
As:
$> grun CSV file foo.csv
line 1:0 no viable alternative at input 'a'
line 1:2 no viable alternative at input 'b'
header: 'a,b'
values = {a="y", b=4}
1 rows
row token interval: 4..7
So in conclusion, I suggest reworking the logic of the grammar (I presume inserting "BlahBlah" was not essential but a mere debugging hack).
And citing http://www.antlr.org/support.html :
ANTLR Discussions
Please do not start discussions at stackoverflow. They have asked us to
steer discussions (i.e., non-questions/answers) away from Stackoverflow; we
have a discussion forum at Google specifically for that:
https://groups.google.com/forum/#!forum/antlr-discussion
We can discuss ANTLR project features, direction, and generally argue about
whatever we want at the google discussion forum.
I hope this helps.
I'm starting out in ANTLR4. What I want is to recognize this format while performing some action according to the token read.
what I'm trying to produce:
IDENTIFIER:Test1 ([a-zA-Z09]{10})
{insert 'Test1' in personId column}
CODE: F0101F
FULL_NAME: FIRST_NAME ( [A-Z]+)LAST_NAME ( [A-Z]+ )
{insert FIRST_NAME.value in firstName column and insert LAST_NAME.value in
lastName column}
ADRESS: DIGIT+ STREET_NAME ([A-Z]+)
{insert STREET_NAME.value in streetName column }
OTHER_INFORMATION: ([A-Z]+)
{insert OTHER_INFORMATION.value in other column}
What I did:
prod
:
read_information+
;
read_information
:
{getCurrentToken().getType()== ID }?
idElement
|
{getCurrentToken().getType()== CODE }?
codeElement
|
{getCurrentToken().getType()== FULLNAME}?
fullNameElement
|
{getCurrentToken().getType()== STREET}?
streetElement
|
{getCurrentToken().getType()== OTHER}?
otherElement
;
codeElement
:
CODE
{getCurrentToken().getText().matches("[A-F0-9]{6}")}?
codeInformation
|
{/*throw someException*/}
;
codeInformation
:
HEXCODE
;
HEXCODE
:
[a-fA-F0-9]+
;
CODE
:
'CODE:'
;
otherElement
:
OTHER otherInformation
;
otherInformation
:
STR
;
OTHER
:
'OTHER:'
;
streetElement
:
STREET streetInformation
;
STREET
:
'STREET:'
;
streetInformation
:
STR
;
STR
:
[a-zA-Z0-9]+
;
WORD
:
[a-zA-Z]+
;
fullNameElement
:
FULLNAME firstNameInformation lastNameInformation
;
FULLNAME
:
'FULL_NAME:'
;
firstNameInformation
:
WORD
;
lastNameInformation
:
WORD
;
idElement
:
ID idInformation
;
ID
:
'ID:'
;
idInformation
:
{getCurrentToken().getText().length()<=10}?
STR
;
I'm not sure if this is the right approach, since I have problems reading the WORD token.
Since all the tokens are basically of the same format, I'm trying to find a way to keep track of the preceding token or context to resolve the ambiguity, and to check the format at the same time (for example, if it's more than 10 characters, throw an exception).
To find out which rules the generated parser enters (i.e. which context is visited), you could use ANTLR to create visitors. There is a great explanation of this here (see Bart Kiers' response).
Generally, if two rules are identical, you can merge them into one and label their usages. For example, for these rules:
firstNameInformation
:
WORD
;
lastNameInformation
:
WORD
;
there is no reason to actually have them. Instead, you could write the grammar for the full name this way:
fullNameElement
:
FULLNAME firstname=WORD lastname=WORD
;
In that case, you only use the WORD token, but you label the occurrences so you can distinguish between them when doing a tree walk.
I intend to add object properties to classes using the Jena API.
I can't find a proper way to do this. I would like to achieve something similar to what can be done in Protégé:
ExampleObjectProperty is my own ObjectProperty.
I tried adding this property using ontClass.addProperty, and also adding a new statement to ontModel, but the result wasn't the same.
As far as I know, in Protégé a blank node is generated (saying that :blank_node has some onProperty ExampleObjectProperty and ExampleClass has someValuesFrom :blank_node... I'm not sure about this though).
loopasam's comment is correct; you're not trying to "add a property to a class" or anything like that. What you're trying to do is add a subclass axiom. In the Manchester OWL syntax, it would look more or less like:
ExampleResource subClassOf (ExampleObjectProperty some ExampleClass)
Jena's OntModel API makes it pretty easy to create that kind of axiom though. Here's how you can do it:
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.ontology.OntProperty;
import com.hp.hpl.jena.rdf.model.ModelFactory;
public class SubclassOfRestriction {
public static void main(String[] args) {
final String NS = "https://stackoverflow.com/q/20476826/";
final OntModel model = ModelFactory.createOntologyModel( OntModelSpec.OWL_DL_MEM );
// Create the two classes and the property that we'll use.
final OntClass ec = model.createClass( NS+"ExampleClass" );
final OntClass er = model.createClass( NS+"ExampleResource" );
final OntProperty eop = model.createOntProperty( NS+"ExampleObjectProperty" );
// addSuperClass and createSomeValuesFromRestriction should be pretty straight-
// forward, especially if you look at the argument names in the Javadoc. The
// null just indicates that the restriction class will be anonymous; it doesn't
// have a URI of its own.
er.addSuperClass( model.createSomeValuesFromRestriction( null, eop, ec ));
// Write the model.
model.write( System.out, "RDF/XML-ABBREV" );
model.write( System.out, "TTL" );
}
}
The output is:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<owl:Class rdf:about="https://stackoverflow.com/q/20476826/ExampleResource">
<rdfs:subClassOf>
<owl:Restriction>
<owl:someValuesFrom>
<owl:Class rdf:about="https://stackoverflow.com/q/20476826/ExampleClass"/>
</owl:someValuesFrom>
<owl:onProperty>
<rdf:Property rdf:about="https://stackoverflow.com/q/20476826/ExampleObjectProperty"/>
</owl:onProperty>
</owl:Restriction>
</rdfs:subClassOf>
</owl:Class>
</rdf:RDF>
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
<https://stackoverflow.com/q/20476826/ExampleObjectProperty>
a rdf:Property .
<https://stackoverflow.com/q/20476826/ExampleClass>
a owl:Class .
<https://stackoverflow.com/q/20476826/ExampleResource>
a owl:Class ;
rdfs:subClassOf [ a owl:Restriction ;
owl:onProperty <https://stackoverflow.com/q/20476826/ExampleObjectProperty> ;
owl:someValuesFrom <https://stackoverflow.com/q/20476826/ExampleClass>
] .
In Protégé that appears as follows:
I am using jQuery Autocomplete 1.8. Every time, it returns every string that contains the input as a substring. How can I make it return only the strings that contain the input as a prefix?
I'm doing it in the SQL. I'm using PHP to generate the JSON:
<?php
set_include_path(get_include_path() . ':' . '/home/lms/library/php');
set_include_path(get_include_path() . ':' . '/home/lms/systems/ORM');
require_once("Configuration.php");
require_once("DALI_Class.php");
//$unitID = $_POST['unitID'];
$unitID = $_GET["term"];
$return_array=array();
$row_array=array();
$lmsAdminSysDB = DALI::connect(LMS_MIDDLEWARE_DATABASE);
$selectUnit = "SELECT " .
"UnitID, " .
"Title " .
"FROM UnitTBL " .
"WHERE UnitID LIKE '".$unitID."%' " .
"ORDER BY UnitID " .
"";
$resultUnit = $lmsAdminSysDB->Execute($selectUnit);
while ($row = $resultUnit->FetchRow()) {
    $row_array['label'] = $row['UnitID']." - ".$row['Title'];
    $row_array['value'] = $row['UnitID'];
    $row_array['title'] = $row['Title'];
    array_push($return_array, $row_array);
}
unset($resultUnit);
//header('Content-type: application/json');
//echo json_encode($result);
DALI::disconnect($lmsAdminSysDB);
echo json_encode($return_array);
?>
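Alternatively, the prefix filtering can be done client-side before handing suggestions to the widget. A minimal sketch (the prefixMatches helper and the units array are illustrative; jQuery UI's source(request, response) callback signature is assumed):

```javascript
// Keep only the suggestions that start with the typed term (case-insensitive).
function prefixMatches(items, term) {
  const t = term.toLowerCase();
  return items.filter(function (item) {
    return item.toLowerCase().startsWith(t);
  });
}

// With jQuery UI Autocomplete this could be wired up as a source callback:
// $("#unitID").autocomplete({
//   source: function (request, response) {
//     response(prefixMatches(units, request.term));
//   }
// });
```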
I've converted the 'easy' parts (fragment, @header and @members declarations, etc.), but since I'm new to ANTLR I'm having a really hard time converting the tree statements etc.
I am using the following migration guide.
The grammar file can be found here....
Below you can find some examples where I run into problems:
For instance, I have problems with:
n3Directive0!:
d:AT_PREFIX ns:nsprefix u:uriref
{directive(#d, #ns, #u);}
;
or
propertyList![AST subj]
: NAME_OP! anonnode[subj] propertyList[subj]
| propValue[subj] (SEMI propertyList[subj])?
| // void : allows for [ :a :b ] and empty list "; .".
;
propValue [AST subj]
: v1:verb objectList[subj, #v1]
// Reverse the subject and object
| v2:verbReverse subjectList[subj, #v2]
;
subjectList![AST oldSub, AST prop]
: obj:item { emitQuad(#obj, prop, oldSub) ; }
(COMMA subjectList[oldSub, prop])? ;
objectList! [AST subj, AST prop]
: obj:item { emitQuad(subj,prop,#obj) ; }
(COMMA objectList[subj, prop])?
| // Allows for empty list ", ."
;
n3Directive0!:
d=AT_PREFIX ns=nsprefix u=uriref
{directive($d, $ns, $u);}
;
You have to use '=' for assignments.
Tokens can then be used as '$tokenname.getText()', etc.
Rule results can then be used in your code as '$rulename.result'.
If you have rules with declared result names, you have to use those names instead of 'result'.
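Applying the same pattern, the subjectList rule from the question might end up roughly like this in ANTLR4 (a sketch only: ANTLR4 has no AST operators, so the ! suffix and the # references disappear; Node is a placeholder type, and it assumes emitQuad is adapted to work on rule results, with item declaring a result via a returns clause):

```
subjectList[Node oldSub, Node prop]
    : obj=item { emitQuad($obj.result, $prop, $oldSub); }
      (COMMA subjectList[$oldSub, $prop])?
    ;
```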