Initialising Apache Jena TDB Dataset

Initialising Apache Jena TDB Dataset - jena

I am using Apache Jena from within Tomcat 8 under Windows 10 (Eclipse IDE) and am not able to initialise the TDB Dataset. The initialisation code is put in the static initialiser inside a try-catch block but no exception is being thrown and the finally clause is getting invoked. I tried with relative directory names, absolute path names, as well as empty path (in memory dataset). The
dataset remains null and so triples cannot be written. What do I need to change in the code
to initialise the dataset?
Here is the code:
package knowledgegraph;
import org.apache.jena.tdb.TDBFactory;
import org.apache.jena.rdf.model.*;
import org.apache.jena.shared.JenaException;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.OutputStreamWriter;
import org.apache.jena.query.Dataset;
import org.apache.jena.query.ReadWrite;
public class JenaProcessor {
static Dataset dataset = null;
static String ns = "http://www.lke.com/lke.owl#";
static {
try {
// dataset = TDBFactory.createDataset("lke");
// dataset = TDBFactory.createDataset("C:\\Users\\Diptendu\\Desktop\\lke");
dataset = TDBFactory.createDataset();
System.out.println("TDB initialised");
}
// catch(Exception ex) {
catch(JenaException ex) {
ex.printStackTrace();
}
finally {
System.out.println("Finally clause");
}
}
static public void writeTriple(String corpus_file_id, String subject, String predicate, String object) {
dataset.begin(ReadWrite.WRITE) ;
Model model = null;
try {
model = dataset.getNamedModel(corpus_file_id);
// model.enterCriticalSection(Lock.WRITE);
// write triples to model
Resource subjectResource = model.createResource(ns.concat(subject));
Property property = model.createProperty(ns.concat(predicate));
Resource objectResource = model.createResource(ns.concat(object));
// model.add(subjectResource, property, objectResource);
Statement statement = model.createStatement(subjectResource, property, objectResource);
model.add(statement);
dataset.commit();
// TDB.sync(model);
} finally {
// model.leaveCriticalSection();
model.close();
dataset.end();
}
}
}

sked and responded to on users#jena:
https://lists.apache.org/thread.html/r9f788bf21ceb3991329ab0ba3c649d94f2983f92aa3c0a76af788e52%40%3Cusers.jena.apache.org%3E
It's because you are trying to close the model after you've committed
the transaction so the error message is quite correct in that you are
no longer in a transaction at that point
Put the dataset.commit() after the model.close() line and it will work
Rob

Related

Xtext global auto-generation

In Xtext, how does one auto-generate a single file containing information from multiple model files.
Consider the following simple Xtext grammar.
grammar org.example.people.People with org.eclipse.xtext.common.Terminals
generate people "http://www.example.org/people/People"
People:
people+=Person*;
Person:
'person' name=ID ';';
In the launched workspace I create a project with two files, friends.people
// friends
person Alice;
person Bob;
and enemies.people
// enemies
person Malice;
person Rob;
How do I auto-generate a single file listing everyone when the global index changes?
Alice
Bob
Malice
Rob

For ease of future reference, here is the solution obtained by combining the various references given by Christian Dietrich. Note that the solution is Eclipse dependent.
Anyone who finds themselves with this requirement should perhaps try to find a better way of modelling the problem. For example a singleton model element All that generates the required list by finding everyone in the model using the standard API. This is independent of Eclipse, and requires non of the following complexity.
In the generator package of the grammar project, create an Java interface IPeopleGenerator extending IGenerator2.
package org.example.people.generator;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.xtext.generator.IFileSystemAccess2;
import org.eclipse.xtext.generator.IGenerator2;
import org.eclipse.xtext.generator.IGeneratorContext;
public interface IPeopleGenerator extends IGenerator2{
public void doGenerate(ResourceSet input, IFileSystemAccess2 fsa, IGeneratorContext context);
}
and edit the existing generator PeopleGenerator as follows.
/*
* generated by Xtext 2.14.0
*/
package org.example.people.generator
import org.eclipse.emf.ecore.resource.Resource
import org.eclipse.emf.ecore.resource.ResourceSet
import org.eclipse.xtext.generator.IFileSystemAccess2
import org.eclipse.xtext.generator.IGeneratorContext
import org.example.people.people.Person
/**
* Generates code from your model files on save.
*
* See https://www.eclipse.org/Xtext/documentation/303_runtime_concepts.html#code-generation
*/
class PeopleGenerator implements IPeopleGenerator {
override doGenerate(ResourceSet rs, IFileSystemAccess2 fsa, IGeneratorContext context) {
val people = rs.resources.map(r|r.allContents.toIterable.filter(Person)).flatten
fsa.generateFile("all.txt", people.compile)
}
override afterGenerate(Resource input, IFileSystemAccess2 fsa, IGeneratorContext context) {
}
override beforeGenerate(Resource input, IFileSystemAccess2 fsa, IGeneratorContext context) {
}
override doGenerate(Resource input, IFileSystemAccess2 fsa, IGeneratorContext context) {
}
def compile (Iterable<Person> entities) '''
«FOR e : entities»
«e.name»
«ENDFOR»
'''
}
and add the method
def Class<? extends IPeopleGenerator> bindIPeopleGenerator () {
return PeopleGenerator
}
to the existing runtime module PeopleRuntimeModule in the grammar project.
Work needs to be done in the UI project org.example.people.ui. Consequently this solution is Eclipse dependent.
Create a Java class org.example.people.ui.PeopleBuilderParticipant as follows (the complexity being the need to ensure that the global generated file is only created once).
package org.example.people.ui;
import java.util.List;
import org.eclipse.core.runtime.CoreException;
import org.eclipse.core.runtime.IProgressMonitor;
import org.eclipse.core.runtime.NullProgressMonitor;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.xtext.builder.BuilderParticipant;
import org.eclipse.xtext.builder.EclipseResourceFileSystemAccess2;
import org.eclipse.xtext.builder.MonitorBasedCancelIndicator;
import org.eclipse.xtext.generator.GeneratorContext;
import org.eclipse.xtext.resource.IContainer;
import org.eclipse.xtext.resource.IResourceDescription;
import org.eclipse.xtext.resource.IResourceDescription.Delta;
import org.eclipse.xtext.resource.IResourceDescriptions;
import org.eclipse.xtext.resource.impl.ResourceDescriptionsProvider;
import org.example.people.generator.IPeopleGenerator;
import com.google.inject.Inject;
public class PeopleBuilderParticipant extends BuilderParticipant {
#Inject
private ResourceDescriptionsProvider resourceDescriptionsProvider;
#Inject
private IContainer.Manager containerManager;
#Inject(optional = true)
private IPeopleGenerator generator;
protected ThreadLocal<Boolean> buildSemaphor = new ThreadLocal<Boolean>();
#Override
public void build(IBuildContext context, IProgressMonitor monitor) throws CoreException {
buildSemaphor.set(false);
super.build(context, monitor);
}
#Override
protected void handleChangedContents(Delta delta, IBuildContext context,
EclipseResourceFileSystemAccess2 fileSystemAccess) throws CoreException {
super.handleChangedContents(delta, context, fileSystemAccess);
if (!buildSemaphor.get() && generator != null) {
invokeGenerator(delta, context, fileSystemAccess);
}
}
private void invokeGenerator(Delta delta, IBuildContext context, EclipseResourceFileSystemAccess2 access) {
buildSemaphor.set(true);
Resource resource = context.getResourceSet().getResource(delta.getUri(), true);
if (shouldGenerate(resource, context)) {
IResourceDescriptions index = resourceDescriptionsProvider.createResourceDescriptions();
IResourceDescription resDesc = index.getResourceDescription(resource.getURI());
List<IContainer> visibleContainers = containerManager.getVisibleContainers(resDesc, index);
for (IContainer c : visibleContainers) {
for (IResourceDescription rd : c.getResourceDescriptions()) {
context.getResourceSet().getResource(rd.getURI(), true);
}
}
MonitorBasedCancelIndicator cancelIndicator = new MonitorBasedCancelIndicator(
new NullProgressMonitor()); //maybe use reflection to read from fsa
GeneratorContext generatorContext = new GeneratorContext();
generatorContext.setCancelIndicator(cancelIndicator);
generator.doGenerate(context.getResourceSet(), access, generatorContext);
}
}
}
and bind this build participant by adding
override Class<? extends IXtextBuilderParticipant> bindIXtextBuilderParticipant() {
return PeopleBuilderParticipant;
}
to the existing UI module org.example.people.ui.PeopleUiModule.

I added the validation code to the answer of fundagain to eliminate invalid resources. However, this will not work when last modified resource is invalid because doGenerate is not invoked when invalid. When any valid resource is saved invalid resources will be discarded from all.txt .
override doGenerate(ResourceSet rs, IFileSystemAccess2 fsa, IGeneratorContext context) {
var valid_rs = new ArrayList<Resource>
for(r : rs.resources)
if (( r as XtextResource)
.getResourceServiceProvider()
.getResourceValidator()
.validate(r,CheckMode.ALL, null)
.map(issue | issue.severity)
.filter[it === Severity.ERROR]
.size == 0)
valid_rs.add(r)
val types = valid_rs.map(r|r.allContents.toIterable.filter(Person)).flatten
fsa.generateFile("all.txt", people.compile)
}

Generate one file for a list of parsed files using source_gen in dart

I have a list of models that I need to create a mini reflective system.
I analyzed the Serializable package and understood how to create one generated file per file, however, I couldn't find how can I create one file for a bulk of files.
So, how to dynamically generate one file, using source_gen, for a list of files?
Example:
Files
user.dart
category.dart
Generated:
info.dart (containg information from user.dart and category.dart)

Found out how to do it with the help of people in Gitter.
You must have one file, even if empty, to call the generator. In my example, it is lib/batch.dart.
source_gen: ^0.5.8
Here is the working code:
The tool/build.dart
import 'package:build_runner/build_runner.dart';
import 'package:raoni_global/phase.dart';
main() async {
PhaseGroup pg = new PhaseGroup()
..addPhase(batchModelablePhase(const ['lib/batch.dart']));
await build(pg,
deleteFilesByDefault: true);
}
The phase:
batchModelablePhase([Iterable<String> globs =
const ['bin/**.dart', 'web/**.dart', 'lib/**.dart']]) {
return new Phase()
..addAction(
new GeneratorBuilder(const
[const BatchGenerator()], isStandalone: true
),
new InputSet(new PackageGraph.forThisPackage().root.name, globs));
}
The generator:
import 'dart:async';
import 'package:analyzer/dart/element/element.dart';
import 'package:build/build.dart';
import 'package:source_gen/source_gen.dart';
import 'package:glob/glob.dart';
import 'package:build_runner/build_runner.dart';
class BatchGenerator extends Generator {
final String path;
const BatchGenerator({this.path: 'lib/models/*.dart'});
#override
Future<String> generate(Element element, BuildStep buildStep) async {
// this makes sure we parse one time only
if (element is! LibraryElement)
return null;
String libraryName = 'raoni_global', filePath = 'lib/src/model.dart';
String className = 'Modelable';
// find the files at the path designed
var l = buildStep.findAssets(new Glob(path));
// get the type of annotation that we will use to search classes
var resolver = await buildStep.resolver;
var assetWithAnnotationClass = new AssetId(libraryName, filePath);
var annotationLibrary = resolver.getLibrary(assetWithAnnotationClass);
var exposed = annotationLibrary.getType(className).type;
// the caller library' name
String libName = new PackageGraph.forThisPackage().root.name;
await Future.forEach(l.toList(), (AssetId aid) async {
LibraryElement lib;
try {
lib = resolver.getLibrary(aid);
} catch (e) {}
if (lib != null && Utils.isNotEmpty(lib.name)) {
// all objects within the file
lib.units.forEach((CompilationUnitElement unit) {
// only the types, not methods
unit.types.forEach((ClassElement el) {
// only the ones annotated
if (el.metadata.any((ElementAnnotation ea) =>
ea.computeConstantValue().type == exposed)) {
// use it
}
});
});
}
});
return '''
$libName
''';
}
}

It seems what you want is what this issue is about How to generate one output from many inputs (aggregate builder)?

[Günter]'s answer helped me somewhat.
Buried in that thread is another thread which links to a good example of an aggregating builder:
1https://github.com/matanlurey/build/blob/147083da9b6a6c70c46eb910a3e046239a2a0a6e/docs/writing_an_aggregate_builder.md
The gist is this:
import 'package:build/build.dart';
import 'package:glob/glob.dart';
class AggregatingBuilder implements Builder {
/// Glob of all input files
static final inputFiles = new Glob('lib/**');
#override
Map<String, List<String>> get buildExtensions {
/// '$lib$' is a synthetic input that is used to
/// force the builder to build only once.
return const {'\$lib$': const ['all_files.txt']};
}
#override
Future<void> build(BuildStep buildStep) async {
/// Do some operation on the files
final files = <String>[];
await for (final input in buildStep.findAssets(inputFiles)) {
files.add(input.path);
}
String fileContent = files.join('\n');
/// Write to the file
final outputFile = AssetId(buildStep.inputId.package,'lib/all_files.txt');
return buildStep.writeAsString(outputFile, fileContent);
}
}

How can Univocity Parsers read a .csv file when the headers are not on the first line?

How can Univocity Parsers read a .csv file when the headers are not on the first line?
There are errors if the first line in the .csv file is not the headers.
The code and stack trace are below.
Any help would be greatly appreciated.
import com.univocity.parsers.csv.CsvParserSettings;
import com.univocity.parsers.common.processor.*;
import com.univocity.parsers.csv.*;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;
import java.lang.IllegalStateException;
import java.lang.String;
import java.util.List;
public class UnivocityParsers {
public Reader getReader(String relativePath) {
try {
return new InputStreamReader(this.getClass().getResourceAsStream(relativePath), "Windows-1252");
} catch (UnsupportedEncodingException e) {
throw new IllegalStateException("Unable to read input", e);
}
}
public void columnSelection() {
RowListProcessor rowProcessor = new RowListProcessor();
CsvParserSettings parserSettings = new CsvParserSettings();
parserSettings.setRowProcessor(rowProcessor);
parserSettings.setHeaderExtractionEnabled(true);
parserSettings.setLineSeparatorDetectionEnabled(true);
parserSettings.setSkipEmptyLines(true);
// Here we select only the columns "Price", "Year" and "Make".
// The parser just skips the other fields
parserSettings.selectFields("AUTHOR", "ISBN");
CsvParser parser = new CsvParser(parserSettings);
parser.parse(getReader("list2.csv"));
List<String[]> rows = rowProcessor.getRows();
String[] strings = rows.get(0);
System.out.print(strings[0]);
}
public static void main(String arg[]) {
UnivocityParsers univocityParsers = new UnivocityParsers();
univocityParsers.columnSelection();
}
}
Stack trace:
Exception in thread "main" com.univocity.parsers.common.TextParsingException: Error processing input: java.lang.IllegalStateException - Unknown field names: [author, isbn]. Available fields are: [list of books by author - created today]
Here is the file being parsed:
List of books by Author - Created today
"REVIEW_DATE","AUTHOR","ISBN","DISCOUNTED_PRICE"
"1985/01/21","Douglas Adams",0345391802,5.95
"1990/01/12","Douglas Hofstadter",0465026567,9.95
"1998/07/15","Timothy ""The Parser"" Campbell",0968411304,18.99
"1999/12/03","Richard Friedman",0060630353,5.95
"2001/09/19","Karen Armstrong",0345384563,9.95
"2002/06/23","David Jones",0198504691,9.95
"2002/06/23","Julian Jaynes",0618057072,12.50
"2003/09/30","Scott Adams",0740721909,4.95
"2004/10/04","Benjamin Radcliff",0804818088,4.95
"2004/10/04","Randel Helms",0879755725,4.50

As of today, on 2.0.0-SNAPSHOT you can do this:
settings.setNumberOfRowsToSkip(1);
On version 1.5.6 you can do this to skip the first line and correctly grab the headers:
RowListProcessor rowProcessor = new RowListProcessor(){
#Override
public void processStarted(ParsingContext context) {
super.processStarted(context);
context.skipLines(1);
}
};
An alternative is to comment the first line if your input file (if you have control over how the file is generated) by adding a # at the beginning of the line you want to discard:
#List of books by Author - Created today

Importing Triples with Tinkerpop/bluebrints into OrientDB

Im trying to import RDF-Triples into OrientDB with help of tinkerpop/blueprints.
I found the basic usage here.
Im now that far:
import info.aduna.iteration.CloseableIteration;
import org.openrdf.model.Statement;
import org.openrdf.model.ValueFactory;
import org.openrdf.sail.Sail;
import org.openrdf.sail.SailConnection;
import org.openrdf.sail.SailException;
import com.hp.hpl.jena.graph.Node;
import com.hp.hpl.jena.graph.Triple;
import com.tinkerpop.blueprints.impls.orient.OrientGraph;
import com.tinkerpop.blueprints.oupls.sail.GraphSail;
import de.hof.iisys.relationExtraction.jena.parser.impl.ParserStreamIterator;
import de.hof.iisys.relationExtraction.neo4j.importer.Importer;
public class ImporterJenaTriples extends Importer {
private OrientGraph graph = null;
private Sail sail = null;
private SailConnection sailConnection = null;
private ValueFactory valueFactory = null;
private Thread parserThread = null;
public ImporterJenaTriples(ParserStreamIterator parser, String databasePath) throws SailException {
this.parser = parser;
this.databasePath = databasePath;
this.initialize();
}
private void initialize() throws SailException {
this.graph = new OrientGraph(this.databasePath);
this.sail = new GraphSail<OrientGraph>(graph);
sail.initialize();
this.sailConnection = sail.getConnection();
this.valueFactory = sail.getValueFactory();
}
public void startImport() {
this.parserThread = new Thread(this.parser);
this.parserThread.start();
try {
Triple next = (Triple) this.parser.getIterator().next();
Node subject = next.getSubject();
Node predicate = next.getPredicate();
Node object = next.getObject();
} catch (SailException e) {
e.printStackTrace();
}
try {
CloseableIteration<? extends Statement, SailException> results = this.sailConnection.getStatements(null, null, null, false);
while(results.hasNext()) {
System.out.println(results.next());
}
} catch (SailException e) {
e.printStackTrace();
}
}
public void stopImport() throws InterruptedException {
this.parser.terminate();
this.parserThread.join();
}
}
What i need to do now is to differ the types of subject, predicate and object
but the problem is i dont know which types they are and how i have to use
the valuefactory to create the type and to add the Statement to my SailConnection.
Unfortunately i cant find an example how to use it.
Maybe someone has done it before and knows how to continue.

I guess you need to convert from Jena object types to Sesame ones and use the
The unsupported project https://github.com/afs/JenaSesame may have some code for that.
But mixing Jena and Sesame seems to make things more complicated - have you consider using the Sesame parser and getting Sesame objects that can go into the SailConnection?

Run JesterRecommenderEvaluationRunner, but get no results of evaluation

I downloaded the Jester example code in Mahout, and tries to run it on jester dataset to see the evaluation results. the running is done successfully, but the console only has the results:
log4j:WARN No appenders could be found for logger (org.apache.mahout.cf.taste.impl.model.file.FileDataModel).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
I expect to see the evaluation score range from 0 to 10. any one can help me found out how to get the score?
I am using mahout-core-0.6.jar and the following is the code:
JesterDataModel.java:
package Jester;
import java.io.File;
import java.io.IOException;
import java.util.Collection;
import java.util.regex.Pattern;
import com.google.common.collect.Lists;
import org.apache.mahout.cf.taste.example.grouplens.GroupLensDataModel;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.model.GenericDataModel;
import org.apache.mahout.cf.taste.impl.model.GenericPreference;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.common.iterator.FileLineIterator;
//import org.apache.mahout.cf.taste.impl.common.FileLineIterable;
public final class JesterDataModel extends FileDataModel {
private static final Pattern COMMA_PATTERN = Pattern.compile(",");
private long userBeingRead;
public JesterDataModel() throws IOException {
this(GroupLensDataModel.readResourceToTempFile("\\jester-data-1.csv"));
}
public JesterDataModel(File ratingsFile) throws IOException {
super(ratingsFile);
}
#Override
public void reload() {
userBeingRead = 0;
super.reload();
}
#Override
protected DataModel buildModel() throws IOException {
FastByIDMap<Collection<Preference>> data = new FastByIDMap<Collection<Preference>> ();
FileLineIterator iterator = new FileLineIterator(getDataFile(), false);
FastByIDMap<FastByIDMap<Long>> timestamps = new FastByIDMap<FastByIDMap<Long>>();
processFile(iterator, data, timestamps, false);
return new GenericDataModel(GenericDataModel.toDataMap(data, true));
}
#Override
protected void processLine(String line,
FastByIDMap<?> rawData,
FastByIDMap<FastByIDMap<Long>> timestamps,
boolean fromPriorData) {
FastByIDMap<Collection<Preference>> data = (FastByIDMap<Collection<Preference>>) rawData;
String[] jokePrefs = COMMA_PATTERN.split(line);
int count = Integer.parseInt(jokePrefs[0]);
Collection<Preference> prefs = Lists.newArrayListWithCapacity(count);
for (int itemID = 1; itemID < jokePrefs.length; itemID++) { // yes skip first one, just a count
String jokePref = jokePrefs[itemID];
if (!"99".equals(jokePref)) {
float jokePrefValue = Float.parseFloat(jokePref);
prefs.add(new GenericPreference(userBeingRead, itemID, jokePrefValue));
}
}
data.put(userBeingRead, prefs);
userBeingRead++;
}
}
JesterRecommenderEvaluatorRunner.java
package Jester;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.model.DataModel;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.IOException;
public final class JesterRecommenderEvaluatorRunner {
private static final Logger log = LoggerFactory.getLogger(JesterRecommenderEvaluatorRunner.class);
private JesterRecommenderEvaluatorRunner() {
// do nothing
}
public static void main(String... args) throws IOException, TasteException {
RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
DataModel model = new JesterDataModel();
double evaluation = evaluator.evaluate(new JesterRecommenderBuilder(),
null,
model,
0.9,
1.0);
log.info(String.valueOf(evaluation));
}
}

Mahout 0.7 is old, and 0.6 is very old. Use at least 0.7, or better, later from SVN.
I think the problem is exactly what you identified: you don't have any slf4j bindings in your classpath. If you use the ".job" files in Mahout you will have all dependencies packages. Then you will actually see output.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Initialising Apache Jena TDB Dataset - jena

Related

Xtext global auto-generation

Generate one file for a list of parsed files using source_gen in dart

How can Univocity Parsers read a .csv file when the headers are not on the first line?

Importing Triples with Tinkerpop/bluebrints into OrientDB

Run JesterRecommenderEvaluationRunner, but get no results of evaluation

Categories

Resources