Identifying source of parser errors in Apache Fuseki - jena

I am getting the following error in trying to load a large RDF/XML document into Fuseki:
> Code: 4/UNWISE_CHARACTER in PATH: The character matches no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.
How do I find out which line contains the offending character?
I have tried turning up the logging level in log4j.properties, and I also tried validating the RDF/XML file using the Jena command-line rdfxml tool (as well as utf8 and riot). It validates with no errors reported. But I'm new to this toolset.

(version?)
Check the quoted ("...") strings in your RDF/XML data for undesirable URIs, especially URIs that contain spaces.
It is best to validate before loading: try riot YourFile and send stderr and stdout to a file. The errors should appear at approximately the point the parser output (N-Triples) had reached at the time.
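To narrow down which line holds a suspect URI, a small script can scan the quoted attribute values directly. The following is a minimal Ruby sketch, not part of Jena. The helper name find_suspect_iris, the list of attributes checked, and the set of characters flagged (whitespace plus the curly braces, pipe, backslash, caret and backtick that RFC 2396 lists as "unwise") are assumptions for illustration. A line-by-line regex scan like this will also miss URIs that are assembled from XML entities or that span several lines.

```ruby
# Sketch: report line numbers of RDF/XML attribute values that contain
# whitespace or "unwise" characters. The helper name and the character set
# are assumptions for illustration; this is not how Jena itself checks IRIs.

# Attributes that normally carry a URI in RDF/XML, with either quote style.
URI_ATTRIBUTE = /\b(?:rdf:about|rdf:resource|rdf:datatype|xml:base)\s*=\s*(["'])(.*?)\1/

# Whitespace plus { } | \ ^ ` (a subset of the RFC 2396 "unwise" set).
SUSPECT_CHARACTER = /[\s{}|\\^`]/

def find_suspect_iris(text)
  hits = []
  text.each_line.with_index(1) do |line, line_number|
    line.scan(URI_ATTRIBUTE) do |_quote, value|
      hits << [line_number, value] if value.match?(SUSPECT_CHARACTER)
    end
  end
  hits
end

# Usage: ruby find_suspect_iris.rb YourFile.rdf
path = ARGV[0]
if path && File.file?(path)
  find_suspect_iris(File.read(path, encoding: "UTF-8")).each do |line_number, value|
    puts "line #{line_number}: #{value.inspect}"
  end
end
```

Each reported line is only a candidate; the parser's own message remains the authority on what it rejected.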

Related

Problem SamplingRateCalculatorList (00000283DDC3C0E0) : All classes are empty ! OTB + QGis

I use OTB (Orfeo Tool Box) in QGis for classification. When I use the ImageTrainClassifier tool in a batch process, I have a problem for some images. Instead of returning a model in a xml/txt file format, it returns several files with those extensions : .xml_rates_1, .xml_samples_1.dbf, .xml_samples_1.prj, .xml_samples_1.shp, .xml_samples_1.shx, .xml_stats_1 (I have the same files with txt instead of xml if I use txt file format as output).
During the execution of the algorithms, I have only one warning message :
(WARNING): file ..\Modules\Learning\Sampling\src\otbSamplingRateCalculatorList.cxx, line 99, SamplingRateCalculatorList (00000283DDC3C0E0): All classes are empty !
And after that :
(FATAL) TrainImagesClassifier: No samples found in the inputs!
The problem is that I then want to use ImageClassifier, which takes the model produced by ImageTrainClassifier as input, and I don't have that model.
Thanks for your help

Using Jena to convert an owl file to N-Triples from terminal returns an empty file

I have generated an owl file using this generator http://swat.cse.lehigh.edu/projects/lubm/
I want to transform the file in N-triples and have done it before using
$ riot -out N-TRIPLE ~/lubm20/*.owl > lubm20.nt
for some reason now I get an empty file (lubm20.nt)
and when I use
$ rdfcat -out N-TRIPLE ~/lubm20/*.owl > lubm20.nt
I get this error
Exception in thread "main" org.apache.jena.riot.RiotException: <file:///root/lubm20/classes\University0_0.owl> Code: 4/UNWISE_CHARACTER in PATH: The character matches no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.
at org.apache.jena.riot.system.IRIResolver.exceptions(IRIResolver.java:371)
at org.apache.jena.riot.system.IRIResolver.resolve(IRIResolver.java:328)
at org.apache.jena.riot.system.IRIResolver$IRIResolverSync.resolve(IRIResolver.java:489)
at org.apache.jena.riot.system.IRIResolver.resolveIRI(IRIResolver.java:254)
at org.apache.jena.riot.system.IRIResolver.resolveString(IRIResolver.java:233)
at org.apache.jena.riot.SysRIOT.chooseBaseIRI(SysRIOT.java:109)
at org.apache.jena.riot.adapters.AdapterFileManager.readModelWorker(AdapterFileManager.java:286)
at org.apache.jena.util.FileManager.readModel(FileManager.java:341)
at jena.rdfcat.readInput(rdfcat.java:328)
at jena.rdfcat$ReadAction.run(rdfcat.java:473)
at jena.rdfcat.go(rdfcat.java:231)
at jena.rdfcat.main(rdfcat.java:206)
The generator produces a well-known Semantic Web benchmark dataset, so how can it contain UNWISE_CHARACTERs?
edit:
for the question asked
I used this line to generate the *.owl files
java edu.lehigh.swat.bench.uba.Generator -onto http://swat.cse.lehigh.edu/onto/univ-bench.owl univ 20
then moved the *.owl files to lubm20 folder
I used rdf2rdf instead of jena
java -jar rdf2rdf-1.0.1-2.3.1.jar /lubmData/lubm100/*.owl lubm100.nt
worked like a charm

YML Parsing error - symfony documentation

In order to allow users to upload documents on my website, I am trying to add form validation on a symfony2 application. According to this doc : http://symfony.com/doc/current/reference/constraints/File.html , I should create a validation.yml file with this syntax :
# src/Acme/BlogBundle/Resources/config/validation.yml
Acme\BlogBundle\Entity\Author
  properties:
    bioFile:
      - File:
          maxSize: 1024k
          mimeTypes: [application/pdf, application/x-pdf]
          mimeTypesMessage: Please upload a valid PDF
I have tried to type/edit this file in a lot of ways, yet I always get a parsing error when the file is executed :
Unable to parse in "\/***\/***\/dev\/***\/src\/***\/***Bundle\/Resources\/config\/validation.yml" at line 1 (near "***\***\Entity\Author").
I tried to test this code with this online YML parsing tool : http://yaml-online-parser.appspot.com/, and it says the colon on line 3 just after "properties" is wrong :
Output
ERROR:
mapping values are not allowed here
in "<unicode string>", line 3, column 13:
properties:
^
What am I missing here? Why is the YML syntax used in symfony documentation not accepted by this online parser? Note that I am aware of the tab indentation vs. space indentation for .yml files.
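One likely explanation, assuming the file really matches the snippet as posted: the class-name line has no trailing colon, so a YAML parser reads it as a plain scalar and then rejects the properties: mapping beneath it. The Ruby sketch below uses Ruby's bundled YAML parser rather than Symfony's, so the exact wording of the message may differ. The helper name yaml_error and the two-space indentation are assumptions for illustration.

```ruby
require "yaml"

# The mapping from the question, with the first line exactly as posted
# (no colon after the class name). Two-space indentation is an assumption.
AS_POSTED = <<~'YAML'
  Acme\BlogBundle\Entity\Author
    properties:
      bioFile:
        - File:
            maxSize: 1024k
            mimeTypes: [application/pdf, application/x-pdf]
            mimeTypesMessage: Please upload a valid PDF
YAML

# Same text with a colon added so the class name becomes a mapping key.
WITH_COLON = AS_POSTED.sub("Author\n", "Author:\n")

# Hypothetical helper: returns the parser's message, or nil if the text parses.
def yaml_error(text)
  YAML.safe_load(text)
  nil
rescue Psych::SyntaxError => e
  e.message
end

puts yaml_error(AS_POSTED).inspect   # a syntax error message
puts yaml_error(WITH_COLON).inspect  # nil, the text parses
```

If adding the colon does not help in the real file, the next things to check would be stray tabs and inconsistent indentation.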

Offending Command error while Printing EPS

I am printing an EPS file that was generated with the following header.
%-12345X#PJL JOB
#PJL ENTER LANGUAGE = POSTSCRIPT
%!PS-Adobe-3.0
%%Title: InvoiceDetail_combine
%%Creator: PScript5.dll Version 5.2.2
%%CreationDate: 10/7/2011 4:46:59
%%For: Administrator
%%BoundingBox: (atend)
%%Pages: (atend)
%%Orientation: Portrait
%%PageOrder: Special
%%DocumentNeededResources: (atend)
%%DocumentSuppliedResources: (atend)
%%DocumentData: Clean7Bit
%%TargetDevice: (HP Color LaserJet 4500) (2014.200) 0
%%LanguageLevel: 2
%%EndComments
When I print a selection of pages on a Ricoh Aficio 2090, or with any other driver/printer, I get the following error printed on the sheets:
ERROR: undefined
OFFENDING COMMAND: F4S47
Stack:
.
Kindly review and suggest a workaround, as I am completely stuck on this. I have tried to convert/extract to PS, but all in vain. I am using GSview to print and view these files.
This is the problem:
%%PageOrder: Special
A PS document with "Special" page order can NOT be re-ordered. You cannot do a selection or range with this file because it is broken for this use. You must reprocess the file using Distiller or Ghostscript (ps2ps or ps2pdf) in order to print selected or re-ordered pages from the document.
You can avoid this by generating your postscript files with a real Postscript™ driver (one not created by Microsoft).
The GSView Documentation has more about this.
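Before trying a page selection, it can help to check what page order a file declares. The following is a minimal Ruby sketch under stated assumptions: the helper name dsc_page_order and the file name in the usage comment are hypothetical, and the scan only looks at header comments before %%EndComments.

```ruby
# Sketch: read the %%PageOrder comment from a PostScript file's DSC header.
# The helper name is hypothetical; it only looks at lines before %%EndComments.
def dsc_page_order(text)
  text.each_line do |line|
    break if line.start_with?("%%EndComments")
    value = line[/\A%%PageOrder:\s*(\S+)/, 1]
    return value if value
  end
  nil
end

# Usage: ruby page_order.rb InvoiceDetail_combine.ps
path = ARGV[0]
if path && File.file?(path)
  # Read as raw bytes, since PostScript files may contain binary data.
  order = dsc_page_order(File.binread(path))
  puts "PageOrder: #{order || 'not declared'}"
  puts "Page selection is likely to fail on this file." if order == "Special"
end
```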
Previously:
This line ...
%%TargetDevice: (HP Color LaserJet 4500) (2014.200) 0
... tells us that the file was generated with HP printers as a target. So this really is not an EPS file, because it's not encapsulatable. To generate output on a printer, the file has to execute the showpage operator, which is a no-no for EPS files.
So uncheck the EPS box (it's a big fat lie, anyway), and select (install) a Generic Postscript driver. If you need to send it to multiple makes of printer, the file needs to make as few assumptions about the printer as possible.
The first thing is that this is not a valid EPS file, as it has PJL attached at the front. Many PostScript printers will strip this off, but by no means all.
This probably is not the source of the problem.
There is no way to 'review' the problem, as you have not supplied the complete PostScript program; without that there is no way to tell what is actually wrong. The error message tells you that the interpreter encountered 'F4547' while trying to parse a token, and that this has not been defined as a routine.
Most likely the file is corrupt: either damaged in some way, or possibly it is a binary file that has been transmitted by some process which has done some kind of conversion (CR/LF is common). The offending command looks like it's ASCIIHex encoded, so that may be a red herring.
If you want additional help, you are going to have to make the whole program available somewhere.

Character Encoding issue in Rails v3/Ruby 1.9.2

I get this error sometimes "invalid byte sequence in UTF-8" when I read contents from a file. Note - this only happens when there are some special characters in the string. I have tried opening the file without "r:UTF-8", but still get the same error.
open(file, "r:UTF-8").each_line { |line| puts line.strip(",") } # line.strip generates the error
Contents of the file:
# encoding: UTF-8
290919,"SE","26","Sk‰l","",59.4500,17.9500,, # this errors out
290956,"CZ","45","HornÌ Bradlo","",49.8000,15.7500,, # this errors out
290958,"NO","02","Svaland","",58.4000,8.0500,, # this works
This is the CSV file I got from outside and I am trying to import it into my DB, it did not come with "# encoding: UTF-8" at the top, but I added this since I read somewhere it will fix this problem, but it did not. :(
Environment:
Rails v3.0.3
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-darwin10.5.0]
Ruby has a notion of an external encoding and internal encoding for each file. This allows you to work with a file in UTF-8 in your source, even when the file is stored in a more esoteric format. If your default external encoding is UTF-8 (which it is if you're on Mac OS X), all of your file I/O is going to be in UTF-8 as well. You can check this using File.open('file').external_encoding. What you're doing when you open your file and pass "r:UTF-8" is forcing the same external encoding that Ruby is using by default.
Chances are, your source document isn't in UTF-8 and those non-ASCII characters aren't mapping cleanly to UTF-8 (if they were, you would get the correct characters and no error; if they mapped incorrectly, you would get incorrect characters and no error). What you should do is try to determine the encoding of the source document, then have Ruby transcode the document on read, like so:
File.open(file, "r:windows-1251:utf-8").each_line { |line| puts line.strip(",") }
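To see the effect of transcoding on read, here is a minimal Ruby sketch under stated assumptions. The helper name read_lines_as_utf8 is hypothetical, the sample line is made up, and ISO-8859-1 is only a stand-in for whatever encoding the source file actually uses. It also calls split(",") where the question's snippet calls strip(","), on the assumption that splitting the CSV fields is the goal.

```ruby
require "tempfile"

# Sketch: read a file stored in `source_encoding` and hand back UTF-8 lines.
# The helper name is hypothetical; the mode string has the same
# "r:external:internal" shape as the File.open call above.
def read_lines_as_utf8(path, source_encoding)
  File.open(path, "r:#{source_encoding}:UTF-8") do |file|
    file.each_line.map(&:chomp)
  end
end

# Demo with a made-up line whose 0xE4 byte is not valid UTF-8 by itself.
Tempfile.create("cities") do |tmp|
  tmp.binmode
  tmp.write("290919,\"SE\",\"26\",\"Sk\xE4l\"\n".b)
  tmp.flush

  # Treating the raw bytes as UTF-8 leaves an invalid string.
  untranscoded = File.binread(tmp.path).force_encoding("UTF-8")
  puts untranscoded.valid_encoding?

  # Transcoding on read yields valid UTF-8 that string methods accept.
  line = read_lines_as_utf8(tmp.path, "ISO-8859-1").first
  puts line.valid_encoding?
  p line.split(",")
end
```

If the guess about the source encoding is wrong, the read will usually still succeed but produce the wrong characters, so it is worth eyeballing a few known values afterwards.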
If you need help determining the encoding of the source, give this Python library a whirl. It's based on the automatic charset detection fallback that was in Seamonkey/Mozilla (and is possibly still in Firefox).
If you want to change your file's encoding, you can use the charlock_holmes gem:
https://github.com/brianmario/charlock_holmes
require 'charlock_holmes/string'
content = File.read('test2.txt')
if !content.is_utf8?
  detection = CharlockHolmes::EncodingDetector.detect(content)
  utf8_encoded_content = CharlockHolmes::Converter.convert content, detection[:encoding], 'UTF-8'
end
Then you can save your new content in a temp file and overwrite your original file.
Hope this helps.
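The temp-file-then-overwrite step can be sketched as follows. This is a minimal illustration, not the charlock_holmes approach: it uses Ruby's built-in String#encode, expects the caller to supply the source encoding instead of detecting it, and the helper name rewrite_as_utf8 and the ".utf8.tmp" suffix are hypothetical.

```ruby
# Hypothetical helper: rewrite `path` in place as UTF-8. Uses Ruby's built-in
# String#encode instead of charlock_holmes, and expects the caller to supply
# the source encoding rather than detecting it.
def rewrite_as_utf8(path, source_encoding)
  bytes = File.binread(path)
  # Nothing to do if the bytes already form valid UTF-8.
  return false if bytes.dup.force_encoding("UTF-8").valid_encoding?

  converted = bytes.force_encoding(source_encoding).encode("UTF-8")

  # Write next to the original so the rename stays on one filesystem,
  # then replace the original in a single step.
  temp_path = "#{path}.utf8.tmp"
  File.binwrite(temp_path, converted)
  File.rename(temp_path, path)
  true
end

# Usage (assumes test2.txt exists and is stored as ISO-8859-1):
#   rewrite_as_utf8("test2.txt", "ISO-8859-1")
```

Writing the converted text to a separate file first means a failed conversion raises before the original is touched.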
