What's the name for this evidence grouping method on an ontology?

Say we have this simple ontology as a toy example: a small tree in which vehicle has the children car and boat.
I now receive evidence about a certain object x (for example, "x is a boat", "x is a car") and want to make a statement about what I know about this object.
A sensible approach would be to use my tree and calculate the lowest common ancestor (LCA) of my evidence, and return that as a result. In my example (car and boat), this would result in vehicle.
Let's now say that I have evidence that an object y is a car and a vehicle. The LCA of these is vehicle. In this case, however, I want the result to be car, since the evidence of y being a vehicle does not contradict it being a car.
I implemented this alternative LCA algorithm (by ignoring any evidence that lies on another piece of evidence's path to the root), but have not found anything about such an approach on the internet. Is there a name for the alternative algorithm I used?
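For concreteness, here is a minimal sketch of that variant in Python, assuming the ontology is stored as a child-to-parent map (the names, including the root "thing", are made up for illustration):

PARENT = {"car": "vehicle", "boat": "vehicle", "vehicle": "thing"}

def path_to_root(node):
    # Ordered path from a node up to the root, inclusive.
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def lca(nodes):
    # Plain LCA: the deepest node that lies on every node's path to the root.
    common = set.intersection(*(set(path_to_root(n)) for n in nodes))
    return next(n for n in path_to_root(nodes[0]) if n in common)

def evidence_conclusion(evidence):
    # The variant from the question: drop any evidence node that is a strict
    # ancestor of another piece of evidence, then take the plain LCA.
    kept = [e for e in evidence
            if not any(e in path_to_root(o)[1:] for o in evidence)]
    return lca(kept)

print(lca(["car", "boat"]))                     # vehicle
print(evidence_conclusion(["car", "vehicle"]))  # car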

From your description, I think you're implementing a least common subsumer (LCS) algorithm: the LCS is the most specific class that subsumes two or more class expressions, and it can also be equal to one of those classes.

Related

Can Word2Vec be used for information extraction?

I am using Gensim to train Word2Vec. I know word similarities are determined by whether the words can replace each other and still make sense in a sentence. But can word similarities be used to extract relationships between entities?
Example:
I have a bunch of interview documents, and in each interview the interviewee always says the name of their manager. If I wanted to extract the name of the manager from these interview transcripts, could I just get a list of all human names in the document (using NLP) and take the name that is most similar to the word "manager" according to Word2Vec as the most likely manager?
Does this thought process make any sense with Word2Vec? If it doesn't, would the ML solution to this problem then be to feed my word embeddings into a sequence-to-sequence model?
Yes, word-vector similarities & relative-arrangements can indicate relationships.
In the original Word2Vec paper, this was demonstrated by using word-vectors to solve word-analogies. The most famous example involves the analogy "'man' is to 'king' as 'woman' is to ?".
By starting with the word-vector for 'king', then subtracting the vector for 'man', and adding the vector for 'woman', you arrive at a new point in the coordinate system. And then, if you look for other words close to that new point, often the closest word will be queen. Essentially, the directions & distances have helped find a word that's related in a particular way – a gender-reversed equivalent.
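Since the question mentions Gensim, the analogy query looks roughly like this there (a sketch; `sentences` stands in for your own tokenized corpus, and the hyperparameters are just illustrative):

from gensim.models import Word2Vec

model = Word2Vec(sentences, vector_size=100, window=5, min_count=5)
# king - man + woman: the nearest remaining word is often "queen",
# given a large and varied enough training corpus.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))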
And, in large news-based corpuses, famous names like 'Obama' or 'Bush' do wind up with vectors closer to their well-known job titles like 'president'. (There will be many contexts in such corpuses where the words appear immediately together – "President Obama today signed…" – or simply in similar roles – "The President appointed…" or "Obama appointed…", etc.)
However, I suspect that's less-likely to work with your 'manager' interview-transcripts example. Achieving meaningful word-to-word arrangements depends on lots of varied examples of the words in shared usage contexts. Strong vectors require large corpuses of millions to billions of words. So the transcripts with a single manager wouldn't likely be enough to get a good model – you'd need transcripts across many managers.
And in such a corpus, each manager's name might not be strongly associated with just manager-like contexts. The same name(s) will be repeated when other roles are mentioned as well, and transcripts may not refer to managerial action in the helpful third-person ways that would position specific name-vectors well. (That is, there won't be clean expository statements like "John_Smith called a staff meeting" or "John_Smith cancelled the project", alongside others like "…manager John_Smith…" or "The manager cancelled the project".)

How to use word embeddings/word2vec .. differently? With an actual, physical dictionary

If my title is incorrect/could be better, please let me know.
I've been trying to find an existing paper/article describing the problem that I'm having: I'm trying to create vectors for words so that they are equal to the sum of their parts.
For example: Cardinal (the bird) would be equal to the sum of the vectors for red and bird, and ONLY those.
In order to train such a model, the input might be something like a dictionary, where each word is defined by its attributes.
Something like:
Cardinal: bird, red, ....
Bluebird: blue, bird,....
Bird: warm-blooded, wings, beak, two eyes, claws....
Wings: Bone, feather....
So in this instance, each word-vector is equal to the sum of the word-vector of its parts, and so on.
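A tiny numeric sketch of that idea (entirely invented: primitives get arbitrary vectors, and defined words sum their parts):

import numpy as np

rng = np.random.default_rng(0)
definitions = {"cardinal": ["red", "bird"],
               "bird": ["warm-blooded", "wings"],
               "wings": ["bone", "feather"]}
primitive_vecs = {}  # lazily assigned random vectors for undefined words

def vec(word):
    if word in definitions:
        # A defined word's vector is exactly the sum of its parts' vectors.
        return sum(vec(part) for part in definitions[word])
    if word not in primitive_vecs:
        primitive_vecs[word] = rng.normal(size=50)
    return primitive_vecs[word]

cardinal = vec("cardinal")  # == vec("red") + vec("warm-blooded") + vec("bone") + ...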
I understand that in the original word2vec, semantic distance was preserved, such that Vec(Madrid) - Vec(Spain) + Vec(France) ≈ Vec(Paris).
Thanks!
PS: Also, if it's possible, new words should be able to be added later on.
If you're going to be building a dictionary of the components you want, you don't really need word2vec at all. You've already defined the dimensions you want specified: just use them, e.g. in Python:
kb = {"wings": {"bone", "feather"},
"bird": {"wings", "warm-blooded", ...}, ...}
Since the values are sets, you can do set intersection (& in Python) to find shared attributes:
kb["bird"] & kb["reptile"]  # -> {"claws"}
You'll need to find ways to decompose the elements recursively for comparisons, simplifications, etc. These are decisions you'll have to make based on what you expect to happen during such operations.
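For instance, one simple design choice is to expand a word all the way down to attributes that have no further definition (a sketch; assumes the kb has no definition cycles):

def primitives(word, kb):
    # Recursively expand a word into attributes with no further definition.
    if word not in kb:
        return {word}
    out = set()
    for part in kb[word]:
        out |= primitives(part, kb)
    return out

print(primitives("bird", kb))  # {"bone", "feather", "warm-blooded", "beak", "claws"}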
This sort of manual dictionary development is quite an old-fashioned approach. Folks like Schank and Abelson used to do this kind of thing in the 1970s. The problem is that as these dictionaries get more complex, they become intractable to maintain and more inaccurate in their approximations. You're welcome to try as an exercise---it can be kind of fun!---but keep your expectations low.
You'll also find aspects of meaning lost in these sorts of decompositions. One of word2vec's remarkable properties is its sensitivity to the gestalt of words---a word's meaning may be composed of parts, but there's something in that composition that makes the whole greater than the sum of the parts. In a decomposition, the gestalt is lost.
Rather than trying to build a dictionary, you might be best off exploring what W2V gives you anyway, from a large corpus, and seeing how you can leverage that information to your advantage. The linguistics of what exactly W2V renders from text aren't wholly understood, but in trying to do something specific with the embeddings, you might learn something new about language.

Rails - Simplifying calculation models & objects

I have asked a few questions about this recently and I am getting where I need to go, but have perhaps not been specific enough in my last questions to get all the way there. So, I am trying to put together a structure for calculating some metrics based on app data, which should be flexible to allow additional metrics to be added easily (and securely), and also relatively simple to use in my views.
The overall goal is that I will be able to have a custom helper that allows something like the following in my view:
calculate_metric(@metrics.where(:name => 'profit'), @customer, @start_date, @end_date)
This should be fairly self-explanatory - the name can be substituted with any of the available metric names, and the calculation can be performed for any customer or group of customers, for any given time period.
Where the complexity arises is in how to store the formula for calculating the metric - I have shown below the current structure that I have put together for doing this:
You will note that the key models are metric, operation, operation_type and operand. This kind of structure works OK when the formula is very simple, like profit - one would only have two operands, @customer.sales.selling_price.sum and @customer.sales.cost_price.sum, with one operation of type subtraction. Since we don't need to store any intermediate values, register_target will be 1, as will return_register.
I don't think I need to write out a full example to show where it becomes more complicated, but suffice to say if I wanted to calculate the percentage of customers with email addresses for customers who opened accounts between two dates (but did not necessarily buy), this would become much more complex since the helper function would need to know how to handle the date variations.
As such, it seems like this structure is overly complicated, and would be hard to use for anything other than a simple formula - can anyone suggest a better way of approaching this problem?
EDIT: On the basis of the answer from Railsdog, I have made some slight changes to my model and re-uploaded the diagram for clarity. Essentially, I have ensured that the reporting_category model can be used to hide intermediate operands from users, and that operands that may be used in user calculations can be presented in a categorised format. All I need now is for someone to assist me in modifying my structure to allow an operation to use either an actual operand or the result of a previous operation, in a Rails-esque way.
Thanks for all of your help so far!
Oy vey. It's been years (like 15) since I did something similar to what it seems like you are attempting. My app was used to model particulate deposition rates for industrial incinerators.
In the end, all the computations boiled down to two operands and an operator (honoring order of operations, parentheticals, etc.). Operands were either constants, db values, or the results of other computations (a pointer to another computation). Any Operand (through model methods) could evaluate itself, whether that value was intrinsic or required a child computation to evaluate itself first.
The interface wasn't particularly elegant (that's the real challenge I think), but the users were scientists, and they understood the computation decomposition.
Thinking about your issue, I'd have any individual Metric able to return its value, and create the necessary methods to arrive at that answer. After all, a single metric just needs to know how to combine its two operands using the indicated operator. If an operand is itself a metric, you just ask it what its value is.
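A language-agnostic sketch of that composite idea (in Python rather than Ruby, with invented class names; in Rails these would be model methods):

class Operand:
    # An operand knows how to evaluate itself.
    def value(self):
        raise NotImplementedError

class Constant(Operand):
    def __init__(self, v):
        self.v = v
    def value(self):
        return self.v

class Computation(Operand):
    OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def value(self):
        # Child operands evaluate themselves first, recursively.
        return self.OPS[self.op](self.left.value(), self.right.value())

# profit = sales - costs; either side could itself be another Computation
profit = Computation("-", Constant(1000), Constant(400))
print(profit.value())  # 600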

Liskov Substitution Principle and the directionality of the original statement

I came across the original statement of the Liskov Substitution Principle on Ward's wiki tonight:
"What is wanted here is something like the following substitution property: If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T." - Barbara Liskov, Data Abstraction and Hierarchy, SIGPLAN Notices, 23(5) (May, 1988).
I've always been crap at parsing predicate logic (I failed Calc IV the first time through), so while I kind of understand how the above translates to:
Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.
what I don't understand is why the property Liskov describes implies that S is a subtype of T and not the other way around.
Maybe I just don't know enough yet about OOP, but why does Liskov's statement only allow for the possibility S -> T, and not T -> S?
The hypothetical set of programs P (defined in terms of T) is not defined in terms of S, and therefore it doesn't say much about S. On the other hand, we do say that S works as well as T in that set of programs P, and so we can draw conclusions about S and its relationship to T.
One way to think about this is that P demands certain properties of T. S happens to satisfy those properties. Perhaps you could even say that 'every o1 in S is also in T'. This conclusion is used to define the word subtype.
As pointed out by sepp2k, there are multiple views explaining this in the other post.
Here are my two cents about it.
I like to look at it this way:
If for each object o1 of type TallPerson there is an object o2 of type Person such that for all programs P defined in terms of Person, the behavior of P is unchanged when o1 is substituted for o2 then TallPerson is a subtype of Person. (replacing S with TallPerson and T with Person)
We typically think of objects of a derived class as having more functionality, since the base class is extended. However, with more functionality we are specializing: we reduce the scope in which the objects can be used, and they thus become subtypes of their base class (the broader type).
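In code, the asymmetry looks like this (a Python sketch using the same invented types):

class Person:
    def greeting(self):
        return "hello"

class TallPerson(Person):
    # Specializes Person while preserving its observable behavior.
    def reach_top_shelf(self):
        return True

def program(p):
    # A "program defined in terms of Person", in Liskov's phrasing.
    return p.greeting()

program(Person())      # works
program(TallPerson())  # also works: TallPerson substitutes for Person
# The reverse direction fails: a program written in terms of TallPerson
# may call reach_top_shelf(), which a plain Person does not provide.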
A derived class inherits its base class's public interface and is expected to either use the inherited implementation or to provide an implementation that behaves similarly (e.g., a Count() method should return the number of elements regardless of how those elements are stored.)
A base class wouldn't necessarily have the interface of any (let alone all) of its derived classes, so it wouldn't make sense to expect an arbitrary reference to a base class to be substitutable for a specified derived class. Even if it appears that only the subset of the interface supported by the base class is required, that might not be the case (e.g., a shadowing method in the specific derived class could be the one referred to).

Hierarchy of meaning

I am looking for a method to build a hierarchy of words.
Background: I am an "amateur" natural language processing enthusiast, and right now one of the problems I am interested in is determining the hierarchy of word semantics from a group of words.
For example, suppose I have a set in which one word is a "super" representation of the others, i.e.
[cat, dog, monkey, animal, bird, ... ]
I am interested in any technique that would allow me to extract the word 'animal', which is the most meaningful and accurate representation of the other words inside this set.
Note: they are NOT the same in meaning. cat != dog != monkey != animal
BUT cat is a subset of animal and dog is a subset of animal.
I know by now a lot of you will be telling me to use WordNet. Well, I will try it, but I am actually working in a very domain-specific area to which WordNet doesn't apply, because:
1) Most words are not found in WordNet
2) All the words are in another language; translation is possible, but to limited effect.
Another example would be:
[ noise reduction, focal length, flash, functionality, .. ]
so functionality includes everything in this set.
I have also tried crawling Wikipedia pages and applying some techniques based on tf-idf etc., but Wikipedia pages don't really do much either.
Can someone possibly enlighten me as to what direction my research should go towards? (I could use anything)
It looks like you want to use something like the hypernym/hyponym relationships in WordNet, but without actually using WordNet due to language and domain-specific coverage issues? That is, if you had the domain-specific hypernym relationships, you could get the "super" representation by just looking for the nearest parent that subsumed all of the words in the list, or the nearest node that was equal to one of the list words and subsumed all of the others.
To start, I would first point out that WordNets are actually available for many of the world's major languages; see the list at Global WordNet.
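Where a WordNet does exist, the lookup described above is only a few lines with NLTK (a sketch; requires nltk and its wordnet data):

from nltk.corpus import wordnet as wn  # after nltk.download("wordnet")

synsets = [wn.synset("cat.n.01"), wn.synset("dog.n.01"), wn.synset("monkey.n.01")]
common = synsets[0]
for s in synsets[1:]:
    # Fold the pairwise lowest-common-hypernym over the list.
    common = common.lowest_common_hypernyms(s)[0]
print(common)  # something like Synset('placental.n.01')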
To get domain-specific hypernym relationships, you could use the technique presented in Snow et al.'s Learning syntactic patterns for automatic hypernym discovery. That is, you could start off with a small list of seed hypernym pairs, then use them to train a classifier to detect hypernyms in a corpus. You would then run this classifier over data from your domain in order to build a list of domain-specific hypernym pairs.
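To give a flavor of the indicator patterns involved, here is a hand-written Hearst-style pattern, a much simpler cousin of what Snow et al.'s classifier learns automatically (illustration only):

import re

PATTERN = re.compile(r"(\w+) such as ((?:\w+, )*\w+(?: and \w+)?)")

def hypernym_pairs(text):
    # "X such as A, B and C" suggests A, B, C are hyponyms of X.
    pairs = []
    for hyper, hypos in PATTERN.findall(text):
        for hypo in re.split(r", | and ", hypos):
            pairs.append((hypo, hyper))
    return pairs

print(hypernym_pairs("We tested animals such as cats, dogs and monkeys."))
# [('cats', 'animals'), ('dogs', 'animals'), ('monkeys', 'animals')]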
The opinion mining and sentiment analysis folks might be doing related things, in terms of deciding what words represent features of products, without knowing anything about the products.
A quick sketch of an idea for how you might do this, which I've totally made up on the spot:
Parse a bunch of sentences in the relevant domain; find the noun phrases and adjectives. Figure out which noun phrases are associated with which adjectives. Cluster the noun phrases together based on the set of adjectives used to describe them. Animals will tend to group together because they're described by adjectives like "furry" or "cute", etc. (In particular, hierarchical clustering would probably be most appropriate.)
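A toy sketch of that last step, with invented adjective counts (hierarchical clustering via scikit-learn):

from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction import DictVectorizer

# Noun phrase -> adjectives observed with it (data entirely invented).
profiles = {"cat":    {"furry": 1, "cute": 1},
            "dog":    {"furry": 1, "loyal": 1},
            "camera": {"digital": 1, "compact": 1},
            "lens":   {"compact": 1, "sharp": 1}}

X = DictVectorizer().fit_transform(profiles.values()).toarray()
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
print(dict(zip(profiles, labels)))  # animals vs. camera gear, with luck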
If you try this, and it works, let me know. :)
