Why does my custom spaCy entity type get detected? - named-entity-recognition

I am writing a spaCy program for which I want to define a custom named entity tag. Following the example here, I add a label called MY_NEW_LABEL to the pipeline.
import spacy
nlp = spacy.load("en_core_web_lg")
ner = nlp.get_pipe("ner")
new_label = "MY_NEW_LABEL"
ner.add_label(new_label)
documents_path = "my_document.txt"
document = nlp(open(documents_path).read())
print([e for e in document.ents if e.label_ == new_label])
When I run the above program it prints out a list of entities labeled with MY_NEW_LABEL. I don't see how this is possible because I never do anything with the label.
Clearly I'm misunderstanding how to work with custom entity tags, but I can't figure out why this would be happening from the documentation. Can anyone tell me why my program doesn't print out an empty list?

This is unexpected behavior. I opened it as spaCy issue 1697: Custom Entity Labels Are Erroneously Detected.

Related

How to use the instance name as a string in Modelica code?

I have a Modelica simulation model composed of several models connected to each other.
I would like to save some data of some of the model instances in my simulation model at a given time using the built-in function
Modelica.Utilities.Streams.writeRealMatrix();
To be sure which instance writes which file, I would like to include the instance name in the writeRealMatrix() output file name, e.g., in case I have an instance called myModel, using the name:
myModelOut.mat.
To do this, I need a way to get the instance name and put it into a string.
I know that Modelica allows using instance names in model icons, through a Text record, using the keyword "%name", but I don't know how to do the same in a regular string (I mean outside any record or icon annotation).
Does anyone know if there is a way to do this?
Thank you in advance.
In your case I think the function getInstanceName() should be a good approach. Using it requires editing the model, but given that you are writing information from within the class using writeRealMatrix(), this shouldn't be an issue.
I have created a small example package with a constant block that stores its name in a final parameter of type String. The example then writes the string to the console at the termination of the simulation:
package GetName
  block ConstantNamed "Generate constant signal of type Real"
    extends Modelica.Blocks.Sources.Constant;
    final parameter String name = getInstanceName();
  end ConstantNamed;
  model Example
    extends Modelica.Icons.Example;
    ConstantNamed myConst(k=23) annotation (Placement(transformation(extent={{-10,-10},{10,10}})));
  equation
    when terminal() then
      Modelica.Utilities.Streams.print("### Here is the models full path: '" + myConst.name + "'");
    end when;
  end Example;
  annotation (uses(Modelica(version="4.0.0")));
end GetName;
This should result in a simulation log containing the path of the instance of ConstantNamed, which is Example.myConst:
Note: The print function is added to Example in the above code. It could be added to ConstantNamed as well. For the case from the question, the print shouldn't be necessary anyway.
Besides that, in case you are using Dymola, there is the ModelManagement library, which contains some functions like ModelManagement.Structure.AST.Classes.ComponentsInClass. But these are more intended to be applied from "outside" to a given model.

Return Filtered Module in DXL

I need to apply a filter to a certain module, get the filtered objects, loop over them, and perform an operation on each.
The problem is that the filtering isn't applied. Something is wrong in the following code:
Filter SwTest = includes(attribute "aVerificationStrategy" ,"SwTest")
Filter Implemented = (attribute "aObjectStatus" < "inReview")
Filter SwTestReqsCASTLE = SwTest && Implemented
Module m = srs_doc
set(m, SwTestReqsCASTLE, accepted , rejected)
followed by either "filtering on" or "ApplyFiltering(m)". I tried each, as I don't know the difference.
So what is wrong?
Before I answer your main question, first allow me to answer your implied question about the difference between "filtering on" and "ApplyFiltering(m)". The difference is that "filtering on" displays the current filter in the module window, meaning that objects are either shown or hidden depending on the filter. "ApplyFiltering(m)" applies the current filter settings to the module explorer (the area to the left of your objects that shows the hierarchy). "filtering on" shows and hides objects and "ApplyFiltering(m)" reflects the status of those objects in the module explorer.
As for why your filters are not being applied, there could be several reasons:
It is good practice to turn filtering off before you start setting filters. Add the line "filtering off" before the rest of your code.
Your "Implemented" filter is not defined properly. DOORS will see "inReview" as a string, and it will perform a direct comparison with the string value of your "aObjectStatus" attribute in order to determine if an object is accepted or rejected. Is this what you intended?
What type of variable is srs_doc? If it's a string then you need to call read(), share(), or edit() in order to actually open the module. If it is a module variable then that line is correct.
I am assuming that "accepted" and "rejected" are integers, but if they are not previously declared then they need to be.
Based on the first paragraph of this answer, your last line should read "filtering on".
Is the module you want to filter being displayed? I realize this is probably obvious, but I have made this mistake before so I thought I should mention it. A filter cannot be applied on a module that is not currently being displayed.
As a side note, you can compound your SwTest and Implemented filters without creating extra Filter variables as follows:
Filter SwTestReqsCASTLE = includes(attribute "aVerificationStrategy", "SwTest") && (attribute "aObjectStatus" < "inReview")
I hope some of that helps! Good luck, and let me know if none of the above solves your problem.

Grails find existing record by criteria

I have (among others) two domain classes:
class Course {
    String name
    ...
}
class Round {
    Course course
    String startweek // e.g. '201504'
    String endweek // e.g. '201534'
    String applcode // e.g. 'DA542133'
    ...
}
Application codes may be issued on several occasions and are then concatenated in 'applcode', separated by blanks. As I am streaming and parsing large amounts of data (in XML format) from different sources, I might come across the same data from several sources, so I look up the records in the database to see whether I can discard the rest of the stream. This is possible because the outermost tag contains the attributes declared above. I search the database using:
def c = Course.findByName(name);
def found =
Round.findByCourseAndStartweekAndEndweekAndApplcodeLike(c, sw, ew,'%'+appc+'%')
where the parameters are fairly obvious and which works well but I find these 'findByBlaAndBlablaAnd...' very long and not very readable. My aim here is to find some more readable and thereby more comprehensible method. I have started to read about Criteria and HQL but I think one example or two would help me on the way.
Edit after reading the pages linked by @injecteer:
The query above turned out to be fairly simple to write. I have harder ones to figure out, but with criteria the query in my example became:
def found = Round.createCriteria().get {
    eq ('course', c)
    eq ('startweek', sw)
    eq ('endweek', ew)
    like ('applcode', '%'+appc+'%')
};
This is much easier to read and understand than the original finder call.
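Since the question also mentions HQL, here is a rough sketch of the same lookup written as an HQL query. It assumes the variables c, sw, ew and appc from the question and uses GORM's find method with named parameters. It is meant as an illustration to adapt, not a tested drop-in:

```groovy
// Sketch only: c, sw, ew and appc are assumed to be defined as in the question.
// find() with an HQL string returns the first matching Round, or null if none matches.
def found = Round.find(
    "from Round r where r.course = :course and r.startweek = :sw " +
    "and r.endweek = :ew and r.applcode like :appc",
    [course: c, sw: sw, ew: ew, appc: '%' + appc + '%'])
```

The named parameters keep the values out of the query string, and the conditions read in the same order as in the criteria version.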

Reasoning with Pellet on SWRL rules in Jena Framework

I am trying to use the Jena framework to edit an existing ontology built with Protégé 4.2, i.e. to change property values or add individuals or classes, and then do reasoning. Assume the ontology contains a rule such as: hasAge(?p,?age)^swrlb:greaterThan(?age,18)->Adult(?p). I would like to be able to change the hasAge property on the Jena side and see whether someone is an Adult or not. Can you please provide some sample code for this? Any help is appreciated.
Assuming that:
you know how to populate your model by reading in the ontology that you built
you have put Pellet on the classpath
you replace the IRIs below with those from your domain
you have assertions enabled
The following code snippet will add an age to the individual x-test://individual and assert that the property that would be introduced by the SWRL rule is satisfied.
// create an empty ontology model using Pellet spec
final OntModel model = ModelFactory.createOntologyModel( PelletReasonerFactory.THE_SPEC );
// read the file
model.read( ont );
// Grab a resource and a property, and then set the property on that individual
final Resource Adult = ResourceFactory.createResource("x-domain://Adult");
final Property hasAge = ResourceFactory.createProperty("x-domain://hasAge");
final Resource res = model.createResource("x-test://individual");
res.addLiteral(hasAge, 19);
// Test that the SWRL rule has executed
assert( res.hasProperty(RDF.type, Adult) );

MyEntity.findAllByNameNotLike('bad%')

I'm attempting to pull up all entities that have a name that doesn't partially match a given string.
MyEntity.findAllByNameNotLike('bad%')
This gives me the following error:
No such property: nameNot for class: MyEntity
Possible solutions: name (groovy.lang.MissingPropertyException)
I had a quick look at the criteria style but I can't seem to get that going either,
def results = MyEntity.withCritieria {
not(like('name', 'bad%'))
}
No signature of method: MyEntity.withCritieria() is applicable for argument types: (MyService$_doSomething_closure1)
Ideally I would like to be able to apply this restriction at the finder level as the database contains a large number of entities that I don't want to load up and then exclude for performance reasons.
[grails 1.3.1]
I've worked out how to do this using withCriteria, the not should have been a closure of its own.
def results = MyEntity.withCriteria {
    not {
        like('name', 'bad%')
    }
}
The problem I initially had using withCriteria was that I was trying to test this as a unit test, which works fine with the dynamic finders, but not with the criteria API (as far as I can tell).
(I'll leave this unanswered for a day to see if anyone has a better solution, otherwise I'll accept my answer)
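As one possible alternative that also filters at the query level, the negated match can be written as an HQL query through GORM's findAll. This is a minimal sketch that assumes only the MyEntity class and its name property from the question. Like the criteria API, it needs a real Hibernate session, so it should not be expected to work in a plain unit test:

```groovy
// Sketch only: MyEntity and its name property are taken from the question.
// The "not like" condition is evaluated by the database, so excluded rows are never loaded.
def results = MyEntity.findAll(
    "from MyEntity e where e.name not like :pattern",
    [pattern: 'bad%'])
```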
