Difference between rdf:seeAlso and rdfs:seeAlso - hyperlink

What is the difference between rdf:seeAlso and rdfs:seeAlso?
When can I use rdf:seeAlso and when can I use rdfs:seeAlso?
Can you give any examples?

First, note that rdf and rdfs are prefixes commonly used to reference the RDF syntax and RDF Schema vocabularies respectively. The rdf prefix is typically bound to http://www.w3.org/1999/02/22-rdf-syntax-ns#, so rdf:seeAlso would expand to http://www.w3.org/1999/02/22-rdf-syntax-ns#seeAlso. However, if you follow the vocabulary reference, you won't find a term defined for seeAlso. The RDF syntax vocabulary is used for basic terms such as rdf:type, rdf:XMLLiteral, and rdf:langString. The RDF Schema vocabulary is typically bound to the rdfs prefix and is at http://www.w3.org/2000/01/rdf-schema#. It is mostly used to define terms useful in performing simple reasoning over RDF graphs, such as rdfs:subClassOf, rdfs:domain, and rdfs:range.
In reality, the terms are split between the two vocabularies somewhat arbitrarily, and in retrospect there should probably have been a single vocabulary definition at a more easily understood location (such as http://www.w3.org/ns/rdf#), but it's too late for that now.
Why you would use rdfs:seeAlso may be unclear. The description says "Further information about the subject resource.", but there are no rules defined for how to use it. In Linked Data, it can be used to do just what it says: a hypothetical linked data client might dereference IRI values of rdfs:seeAlso to find out more information that might be useful.
You can find out more in the RDF Concepts document and other publications of the RDF Working Group.
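As a quick illustration of the two namespaces (and of the fact that seeAlso lives in RDF Schema, not in the RDF syntax vocabulary), here is a minimal sketch using the rdflib Python library:

from rdflib.namespace import RDF, RDFS

# rdflib exposes both vocabularies as namespaces; printing a term shows
# the full IRI it expands to.
print(RDF.type)         # http://www.w3.org/1999/02/22-rdf-syntax-ns#type
print(RDFS.seeAlso)     # http://www.w3.org/2000/01/rdf-schema#seeAlso
print(RDFS.subClassOf)  # http://www.w3.org/2000/01/rdf-schema#subClassOf
# The RDF syntax vocabulary defines no seeAlso term, so rdf:seeAlso would
# expand to an IRI that nothing defines.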

What is the difference between rdfs:seeAlso and rdfs:isDefinedBy?
These are defined pretty clearly in the specification:
5.4.1 rdfs:seeAlso
rdfs:seeAlso is an instance of rdf:Property that is used to indicate a
resource that might provide additional information about the subject
resource.
A triple of the form:
S rdfs:seeAlso O
states that the resource O may provide additional
information about S. It may be possible to retrieve representations of
O from the Web, but this is not required. When such representations
may be retrieved, no constraints are placed on the format of those
representations.
5.4.2 rdfs:isDefinedBy
rdfs:isDefinedBy is an instance of rdf:Property that is used to
indicate a resource defining the subject resource. This property may
be used to indicate an RDF vocabulary in which a resource is
described.
A triple of the form:
S rdfs:isDefinedBy O
states that the resource O defines S. It may be
possible to retrieve representations of O from the Web, but this is
not required. When such representations may be retrieved, no
constraints are placed on the format of those representations.
rdfs:isDefinedBy is a subproperty of rdfs:seeAlso.
When can I use rdfs:seeAlso and when can I use rdfs:isDefinedBy?
Can you give any examples?
You can use these whenever they're appropriate; just include the corresponding triples in your data. I don't think there's really a whole lot of need for examples in this case: if something is a related resource, add a seeAlso link; if something is defined by another resource, add an isDefinedBy link. Note that last bit, "rdfs:isDefinedBy is a subproperty of rdfs:seeAlso". That means that whenever you assert "x rdfs:isDefinedBy y", you're implicitly asserting "x rdfs:seeAlso y".
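To make this concrete, here is a minimal sketch using the rdflib Python library; the example.org resources are purely hypothetical:

from rdflib import Graph, URIRef
from rdflib.namespace import RDFS

g = Graph()
widget = URIRef("http://example.org/vocab#Widget")  # hypothetical term
vocab = URIRef("http://example.org/vocab")          # the vocabulary that defines it
wiki = URIRef("http://example.org/wiki/Widget")     # a page with extra information

# "Widget is defined by the example.org vocabulary" ...
g.add((widget, RDFS.isDefinedBy, vocab))
# ... and "further information may be found on its wiki page".
g.add((widget, RDFS.seeAlso, wiki))

print(g.serialize(format="turtle"))

Because isDefinedBy is a subproperty of seeAlso, an RDFS-aware reasoner would also infer "widget rdfs:seeAlso vocab" from the first triple.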

rdfs:seeAlso is an instance of rdf:Property that is used to indicate a resource that might provide additional information about the subject resource.
rdf is the original Resource Description Framework vocabulary, whereas rdfs provides the schema (RDF Schema) vocabulary.

Related

Ontology where the same word has different meaning in different contexts?

Are there any example ontologies where the same word has different meaning in different contexts?
For example, when building an ontology for a large company, it is not uncommon for different departments and systems to have a different definition and understanding of common words like "customer", "account", etc.
Is there a generally accepted way to model this in Protege that preserves the original words in their context, while also introducing a layer of disambiguating words for enterprise use?
This is a problem we encounter often in the biological community. For example, the concept Eye is very dependent on the context: human eye vs. fish eye vs. spider eye, etc. You can see a search for eye on the Ontology Lookup Service (OLS) and the results it returns for eye from different ontologies. Disclosure: I am responsible for this tool.
Provide an IRI for your concept. This IRI should be similar to a surrogate key for your concept. That is, instead of giving your Account concept an IRI like http://MyBusiness/someBusinessContext/Account, you give it an IRI like http://MyBusiness/someBusinessContext/Context0000001. For the Eye concept, the IRI for a human eye is http://purl.obolibrary.org/obo/NCIT_C12401 and for an insect it is http://purl.obolibrary.org/obo/SIBO_0000086.
I explain in this StackOverflow question the reason for using "surrogate keys".
Assign a context-specific label and definition to your concept. You can use rdfs:label for the label and rdfs:comment or skos:definition for the definition.
You may find that you need alternative names for your concept. For example, maybe you also refer to customers as members. In this case you can use skos:altLabel to provide alternative names for your concept and skos:prefLabel to define a preferred label.
So how does this work? For user interfaces you make use of rdfs:label/skos:prefLabel and rdfs:comment/skos:definition for display purposes. From a data integration perspective you use the IRI.
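Putting the pieces together, here is a minimal sketch using the rdflib Python library; the IRI and the labels are hypothetical, following the surrogate-key pattern described above:

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import RDFS, SKOS

g = Graph()
# Surrogate-key style IRI for the sales department's notion of "Account".
account = URIRef("http://example.org/sales/Context0000001")

g.add((account, RDFS.label, Literal("Account", lang="en")))
g.add((account, SKOS.prefLabel, Literal("Customer account", lang="en")))
g.add((account, SKOS.altLabel, Literal("Member", lang="en")))
g.add((account, SKOS.definition,
       Literal("A billing relationship as understood by the sales department.", lang="en")))

print(g.serialize(format="turtle"))

User interfaces would display the labels and definition; data integration would key off the IRI.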

Is there a well-defined difference between "normalizing" and "canonicalizing" data?

I understand canonicalization and normalization to mean removing any non-meaningful or ambiguous parts of a data's presentation, turning effectively identical data into actually identical data.
For example, if you want to get the hash of some input data and it's important that anyone else hashing the canonically same data gets the same hash, you don't want one file indenting with tabs and the other using spaces (and no other difference) to cause two very different hashes.
In the case of JSON:
object properties would be placed in a standard order (perhaps alphabetically)
unnecessary white spaces would be stripped
indenting either standardized or stripped
the data may even be re-modeled in an entirely new syntax, to enforce the above
Is my definition correct, and the terms are interchangeable? Or is there a well-defined and specific difference between canonicalization and normalization of input data?
"Canonicalize" & "normalize" (from "canonical (form)" & "normal form") are two related general mathematical terms that also have particular uses in particular contexts per some exact meaning given there. It is reasonable to label a particular process by one of those terms when the general meaning applies.
Your characterizations of those specific uses are fuzzy. The formal meanings for general & particular cases are more useful.
Sometimes given a bunch of things we partition them (all) into (disjoint) groups, aka equivalence classes, of ones that we consider to be in some particular sense similar or the same, aka equivalent. The members of a group/class are the same/equivalent according to some particular equivalence relation.
We pick a particular member as the representative thing from each group/class & call it the canonical form for that group & its members. Two things are equivalent exactly when they are in the same equivalence class. Two things are equivalent exactly when their canonical forms are equal.
A normal form might be a canonical form or just one of several distinguished members.
To canonicalize/normalize is to find or use a canonical/normal form of a thing.
From Wikipedia's article on Canonical form:
The distinction between "canonical" and "normal" forms varies by subfield. In most fields, a canonical form specifies a unique representation for every object, while a normal form simply specifies its form, without the requirement of uniqueness.
Applying the definition to your example: do you have a bunch of values that you are partitioning, and are you picking some member(s) per class instead of the other members of that class? Well, you have JSON values, and short of re-modeling them you are partitioning them by which same-class member they map to under a function. So you can reasonably call the resulting JSON values canonical forms of the inputs. If you characterize re-modeling as applicable to all inputs, then you can also reasonably call the post-re-modeling form of those canonical values canonical forms of the re-modeled input values. But if not, then people probably won't complain if you call the re-modeled values canonical forms of the input values, even though technically they wouldn't be.
Consider a set of objects, each of which can have multiple representations. From your example, that would be the set of JSON objects, where each object has multiple valid representations, e.g., with different permutations of its members, varying whitespace, etc.
Canonicalization is the process of converting any representation of a given object to one and only one, unique per object, representation (a.k.a, canonical form). To test whether two representations are of the same object, it suffices to test equality on their canonical forms, see also wikipedia's definition.
Normalization is the process of converting any representation of a given object to a set of representations (a.k.a., "normal forms") that is unique per object. In such case, equality between two representations is achieved by "subtracting" their normal forms and comparing the result with a normal form of "zero" (typically a trivial comparison). Normalization may be a better option when canonical forms are difficult to implement consistently, e.g., because they depend on arbitrary choices (like ordering of variables).
Section 1.2 of the "A=B" book has some really good examples of both concepts.
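For the JSON case in the question, here is a minimal Python sketch of one possible canonical form (keys sorted, no insignificant whitespace); this is only an illustration, not a full canonicalization standard:

import hashlib
import json

def canonical_hash(value):
    # Serialize with sorted keys and no insignificant whitespace,
    # then hash the canonical text.
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = {"b": 1, "a": [1, 2]}
b = {"a": [1, 2], "b": 1}   # same object, different member order
assert canonical_hash(a) == canonical_hash(b)

Two representations of the same object hash identically because they map to the same canonical form.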

DICOM file with CT and MR tags

A DICOM file (an artificial axial slice) has been generated from both a CT and an MR image. Can the aggregated file contain both CT and MR DICOM tags, e.g., Echo Time (0x18, 0x81) and KVP (0x18, 0x60)?
I cannot find any information on whether one image modality module is exclusive of the other, and I want to find out whether such an artificial image might run into trouble with other vendors' software. Any help would be greatly appreciated.
The attribute SOP Class UID (0008,0016) determines which "type of object" you have and by this, the so-called Information Object Definition (IOD). The IOD tells you, which attributes are mandatory and which are allowed (and implicitly: which are not allowed) for the type of object.
So, merging attributes about the acquisition processes from two different IODs is not a good idea. What is going to fail widely is the annotation of these objects in a DICOM viewer. Most viewers have a SOP Class- or Modality-dependent configuration that defines how the images are annotated with DICOM header information. SOP Class UID and Modality each have to provide exactly one value, which cannot be entirely right in your case. So you have to decide whether another application treats the images as "CT only" or "MR only".
So, there is no way of merging IOD tables and still claiming DICOM conformance for the application that generates images of this type.
A lot of systems I know just treat the DICOM header as a "stream of attributes" without checking correctness and consistency. As long as your pixel data and ordering information (Patient name, ID, ... , Study Instance UID, Series Instance UID) are properly encoded, you might not run into severe issues.
However, I would never advise anyone to implement such a thing. It is just a question of time before someone validates your objects against the DICOM standard, finds out that they are blatantly wrong, and blames no one but you for that.
As explained by others, you are required to follow the DICOM standard. Basically you need to implement what is defined in the IOD related to your SOP Class instance.
Again, as explained by others, you are allowed to use a so-called 'Standard Extended SOP Class'. But be sure to read the definition of such a class:
7.3 Rules Governing Types of SOP Classes
Quoting the paragraph:
Standard Extended SOP Classes shall:
be a proper super set of one Standard SOP Class;
not change the semantics of any Standard Attribute of that Standard SOP Class;
not contain any Private Type 1, 1C, 2, or 2C Attributes, nor add additional Standard Type 1, 1C, 2 or 2C Attributes;
not change any Standard Type 3 Attributes to Type 1, 1C, 2, or 2C;
use the same UID as the Standard SOP Class on which it is based.
So in summary: no, you certainly cannot create an MR instance with a leftover kVp (0018,0060) attribute. It cannot possibly mean anything for an MR modality, and including it changes the semantics of a public attribute.
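As a quick sanity check for this kind of mixed object, here is a minimal sketch assuming the pydicom library (the file name is hypothetical):

import pydicom

ds = pydicom.dcmread("merged_slice.dcm")

print(ds.SOPClassUID)   # identifies the IOD the object claims to conform to
print(ds.Modality)      # e.g. "CT" or "MR"; only one value is possible

# An MR instance carrying KVP (0018,0060) is a red flag: that attribute
# belongs to the CT acquisition context, not the MR IOD.
if ds.Modality == "MR" and (0x0018, 0x0060) in ds:
    print("Suspicious: MR object contains KVP =", ds[0x0018, 0x0060].value)

A real conformance check would validate all attributes against the IOD tables, but even this simple test flags the kVp-in-MR case discussed above.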

Ontology comparison in owlapi

I am using the OWLAPI for a project, and I need to compare two ontologies for differences between them. The comparison should ignore blank nodes so that, for instance, I can determine whether the same OWL restrictions are in both ontologies. Not only do I need to know whether there are differences, I need to find out what those differences are. Does such functionality exist in the OWLAPI, or is there a relatively simple way to do this?
The equality between anonymous class expressions is not based on the blank node ids: anonymous class expressions only have blank nodes in the textual output; in memory the ids are ignored. So checking whether an axiom exists in an ontology will by default match expressions correctly for your diff.
This is not true for individuals - anonymous individuals will not be found to be the same across ontologies, and this is by specs. An anonymous individual in one ontology cannot be found in another, because the anonymous individual ids are scoped to the containing ontology.
Note: the unit tests for OWLAPI have to carry out a very similar task, to verify that an ontology can be parsed, written and parsed again without change (i.e., roundtripped between input syntax and output syntax), so there is code that you can look at for inspiration. See the equal() method in TestBase.java for more details. This includes code to deal with different ids for anonymous individuals.
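If a triple-level diff outside the OWLAPI is also acceptable, the Python rdflib library can compare two RDF graphs up to blank-node renaming, which covers the anonymous-restriction case; this is a sketch of an alternative approach, not OWLAPI functionality, and the file names are hypothetical:

from rdflib import Graph
from rdflib.compare import to_isomorphic, graph_diff

# RDF/XML serialization assumed for both files.
g1 = to_isomorphic(Graph().parse("ontology1.owl", format="xml"))
g2 = to_isomorphic(Graph().parse("ontology2.owl", format="xml"))

in_both, only_in_first, only_in_second = graph_diff(g1, g2)
for triple in only_in_first:
    print("only in ontology1:", triple)
for triple in only_in_second:
    print("only in ontology2:", triple)

Note that this works at the level of RDF triples rather than OWL axioms, so a single axiom that differs may show up as several differing triples.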

Alpha renaming in many languages

I have what I imagine will be a fairly involved technical challenge: I want to be able to reliably alpha-rename identifiers in multiple languages (as many as possible). This will require special consideration for each language, and I'm asking for advice on how to minimize the amount of work I need to do by sharing code. Something like a unified parsing or abstract syntax framework that already has support for many languages would be great.
For example, here is some python code:
def foo(x):
    def bar(y):
        return x+y
    return bar
An alpha renaming of x to y changes the x to a y and preserves semantics. So it would become:
def foo(y):
    def bar(y1):
        return y+y1
    return bar
See how we needed to rename y to y1 in order to keep from breaking the code? That is why this is a hard problem. It seems like the program would have to have a pretty good knowledge of what constitutes a scope, rather than just doing, say, a string search and replace.
I would also like to preserve as much of the formatting as possible: comments, spacing, indentation. But that is not 100% necessary, it would just be nice.
Any tips?
To do this safely, you need to be able to determine:
all the identifiers (and those things that are not, e.g., the middle of a comment) in your code
the scopes of validity for each identifier
the ability to substitute a new identifier for an old one in the text
the ability to determine if renaming an identifier causes another name to be shadowed
To determine identifiers accurately, you need at least a language-accurate lexer. Identifiers in PHP look different than they do in COBOL.
To determine scopes of validity, you have to determine program structure in practice, since most "scopes" are defined by such structure. This means you need a language-accurate parser; scopes in PHP are different than scopes in COBOL.
To determine which names are valid in which scopes, you need to know the language scoping rules. Your language may insist that the identifier X will refer to different Xes depending on the context in which X is found (consider object constructors named X with different arguments). Now you need to be able to traverse the scope structures according to the naming rules. Single inheritance, multiple inheritance, overloading, and default types will all pretty much require you to build a model of the scopes for the programs, insert the identifiers and corresponding types into each scope, and then climb from the point of encounter of an identifier in the program text through the various scopes according to the language semantics. You will need symbol tables, inheritance linkages, ASTs, and the ability to navigate all of these. These structures are different for PHP and COBOL, but they share lots of common ideas, so you likely need a library with support for the common concepts.
To rename an identifier, you have to modify the text. In a million lines of code, you need to point carefully. Modifying an AST node is one way to point carefully. Actually, you need to modify all the identifiers that correspond to the one being renamed; you have to climb over the tree to find them all, or record in the AST where all the references exist so they can be found easily. After modifying the AST you have to regenerate the source text. That's a lot of machinery; see my SO answer on how to prettyprint ASTs preserving all of the stuff you reasonably suggest should be preserved.
(Your other choice is to keep track in the AST of where the text for each identifier is, and then read/patch/write the file.)
Before you update the file, you need to check that you haven't shadowed something. Consider this code:
{ local x;
  x=1;
  { local y;
    y=2;
    { local z;
      z=y;
      print(x);
    }
  }
}
We agree this code prints "1". Now we decide to rename y to x.
We've broken the scoping, and now the print statement which referred
conceptually to the outer x refers to an x captured by the renamed y. The code now prints "2", so our rename broke it. This means that one must check all the other identifiers in scopes in which the renamed variable might be found, to see if the new name "captures" some name we weren't expecting. (This would be legal if the print statement printed z).
This is a lot of machinery.
Yes, there is a framework that has almost all of this, as well as a number of robust language front ends. See our DMS Software Reengineering Toolkit. It has parsers producing ASTs, prettyprinters to produce text back from ASTs, generic symbol table management machinery (including support for multiple inheritance), and AST visiting/modification machinery. It has front ends for C, C++, COBOL and Java that implement name and type resolution (e.g., instantiating symbol table scopes and identifier-to-symbol-table-entry mappings); it has front ends for many other languages that don't have scoping implemented yet.
We've just finished an exercise in implementing "rename" for Java (all the above issues of course appeared), and we are about to start one for C++.
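To make the capture problem concrete in the question's own language, here is a deliberately naive sketch using Python's standard ast module; it renames every matching identifier without any scope analysis and therefore reproduces exactly the breakage described above:

import ast

class NaiveRename(ast.NodeTransformer):
    # Renames every parameter and Name node that matches, with no scope
    # model, so it cannot detect when the new name is captured.
    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_arg(self, node):
        if node.arg == self.old:
            node.arg = self.new
        return node

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

src = "def foo(x):\n    def bar(y):\n        return x + y\n    return bar\n"
tree = NaiveRename("x", "y").visit(ast.parse(src))
print(ast.unparse(tree))   # Python 3.9+; the renamed x now collides with the inner y

A safe renamer would first build the scope model described above and pick a fresh name (y1) for the inner binding before applying the edit.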
You could try to create Xtext-based implementations for the involved languages. The Xtext framework provides reliable infrastructure for cross-language rename refactoring. However, you'll have to provide a grammar and at least a "good enough" scope resolution for each language.
Languages mostly guarantee tokens will be unique, whatever the context. A naive first approach (and this will break many, many pieces of code) would be:
cp file file.orig
sed -i 's/\bnewTokenName\b/TEMPTOKEN/g' file
sed -i 's/\boldTokenName\b/newTokenName/g' file
With GNU sed, this will break on PHP. Rewriting \b to a general token match, like ([^a-zA-Z~$-_][^a-zA-Z0-9~$-_]), would work on most C, Java, PHP, and Python, but not Perl (you would need to add # and % to the token characters). Beyond that, it would require a plugin architecture that works for any language you wanted to add. At some point, there will be two languages whose variable and function naming rules are incompatible, and at that point you'll need to do more and more in the plugin.
