Terminology: Is there a word/phrase that describes the purpose of the keywords "class", "struct", "interface", "enum"?

Greetings StackOverflow,
While doing some reflection programming in C#, I got to thinking about the keywords used to define classes, structures, and interfaces. There are keywords like "public" and "internal", which are called "access modifiers". Then you've got the "sealed" keyword, which is a sort of "inheritance modifier". Altogether these form a "type declaration", but what word or phrase describes the keywords "class", "interface", or "struct" themselves?
Some Google searching turned up nothing, I couldn't find anything concrete in the C# language specification, and a much more experienced co-worker I talked it over with didn't know either. Together we thought the phrase "type classification" might work, since the keywords describe what "kind" of type is being declared. However, it also sounds a bit too broad, so I'm hoping for a better term/phrase.
Anyone know the proper term/phrase that effectively describes this group of keywords?
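To make it concrete, here's a minimal declaration annotated with the categories I mentioned (Widget is just a placeholder name):

public   // <- access modifier
sealed   // <- "inheritance modifier"
class    // <- what is the term for this keyword's category?
Widget
{
}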

Related

Support for Type Abbreviations / Type Aliases

Please forgive me if I've missed something basic, but does language-ext have support for type abbreviations / type aliases as described here?
In one of my projects I've attempted a rudimentary implementation that allows inheriting from the basic types string, int, long, decimal, Guid, etc. This lets 'orderNumber' on your POCO be strongly typed rather than just being an int.
It would be desirable to have json.net custom JsonConverters as well for serialization.
Are these a regular feature of functional languages, or just an F# thing? If the former is true and language-ext doesn't have an implementation, is it something I could help add?
Cheers :-)
I think the generic types NewType and FloatType might be what you are looking for. On the Language Ext GitHub home page, search for "newtype".
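For illustration, a rough sketch of the NewType pattern (OrderNumber and Order are made-up names here; check the README for the exact API in the version you're using):

using LanguageExt;

// Hypothetical example: wrap an int so an order number can't be
// confused with any other int in the program.
public class OrderNumber : NewType<OrderNumber, int>
{
    public OrderNumber(int value) : base(value) { }
}

public class Order
{
    // Strongly typed instead of a bare int
    public OrderNumber Number { get; set; }
}

// Usage (assuming the NewType<SELF, A> API from the README):
// var n = OrderNumber.New(1234);
// int raw = n.Value;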
Also, feel free to ask questions like this on the Language Ext GitHub by creating a new issue.

Apple Appstore keywords – Syntax, stemming & matching rules

I'm not hoping for clues on picking keywords; there are guides about that already.
I'm hoping to get a decisive idea, with a reference to documentation or statements from Apple of:
The correct keyword list syntax.
How they are employed for matching in Apple's back end.
For example:
Should they be comma-delimited: "ham,chips,beans"?
Or space-delimited: "ham chips beans"?
Or comma-and-space-delimited: "ham, chips, beans"?
If customers might search for me by a phrase, such as "hungry cat", should I include "hungry, cat, hungry cat, hungry-cat"? Or is "hungry cat" sufficient?
I believe it's not necessary to add plural forms: "cats" isn't needed, provided I have "cat". But what about other stemming? If people search for "eating cats", is "eat, cat" in my search terms enough?
Thanks.
There are two votes to close stating this question is "opinion based". I've adjusted the question to make it clear that I am not looking for opinion, but for statements or documentation from Apple.

Probabilistic Generation of Semantic Networks

I've studied some simple semantic network implementations and basic techniques for parsing natural language. However, I haven't seen many projects that try to bridge the gap between the two.
For example, consider the dialog:
"the man has a hat"
"he has a coat"
"what does he have?" => "a hat and coat"
A simple semantic network, based on the grammar tree parsing of the above sentences, might look like:
class Entity:
    def __init__(self, name):
        self.name, self.rels = name, {}
    def relations(self, rel):
        return [t.name for t in self.rels.get(rel, [])]

def Relation(subj, rel, obj):
    subj.rels.setdefault(rel, []).append(obj)

the_man, has = Entity('the man'), Entity('has')
a_hat, a_coat = Entity('a hat'), Entity('a coat')
Relation(the_man, has, a_hat)
Relation(the_man, has, a_coat)
print(the_man.relations(has))  # => ['a hat', 'a coat']
However, this implementation assumes the prior knowledge that the text segments "the man" and "he" refer to the same network entity.
How would you design a system that "learns" these relationships between segments of a semantic network? I'm used to thinking about ML/NL problems based on creating a simple training set of attribute/value pairs, and feeding it to a classification or regression algorithm, but I'm having trouble formulating this problem that way.
Ultimately, it seems I would need to overlay probabilities on top of the semantic network, but that would drastically complicate an implementation. Is there any prior art along these lines? I've looked at a few libraries, like NLTK and OpenNLP, and while they have decent tools to handle symbolic logic and parse natural language, neither seems to have any kind of probabilistic framework for converting one to the other.
There is quite a lot of history behind this kind of task. The best place to start is probably Question Answering.
The general advice I always give is that if you have some highly restricted domain where you know about all the things that might be mentioned and all the ways they interact then you can probably be quite successful. If this is more of an 'open-world' problem then it will be extremely difficult to come up with something that works acceptably.
The task of extracting relationships from natural language is called 'relationship extraction' (funnily enough), and sometimes 'fact extraction'. This is a pretty large field of research; this guy did a PhD thesis on it, as have many others. There are a large number of challenges here, as you've noticed, like entity detection, anaphora resolution, etc. This means that there will probably be a lot of 'noise' in the entities and relationships you extract.
As for representing facts that have been extracted in a knowledge base, most people tend not to use a probabilistic framework. At the simplest level, entities and relationships are stored as triples in a flat table. Another approach is to use an ontology to add structure and allow reasoning over the facts. This makes the knowledge base vastly more useful, but adds a lot of scalability issues. As for adding probabilities, I know of the Prowl project that is aimed at creating a probabilistic ontology, but it doesn't look very mature to me.
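To make the flat-table idea concrete, here's a minimal sketch using plain collections (the TripleStore type and its members are invented for illustration, not from any particular library):

using System;
using System.Collections.Generic;
using System.Linq;

// Facts stored as flat (subject, predicate, object) triples.
record Triple(string Subject, string Predicate, string Object);

class TripleStore
{
    readonly List<Triple> facts = new();

    public void Add(string s, string p, string o) => facts.Add(new Triple(s, p, o));

    // Query: all objects matching a (subject, predicate) pattern.
    public IEnumerable<string> Objects(string s, string p) =>
        facts.Where(t => t.Subject == s && t.Predicate == p).Select(t => t.Object);
}

// var kb = new TripleStore();
// kb.Add("the man", "has", "a hat");
// kb.Add("the man", "has", "a coat");
// Console.WriteLine(string.Join(" and ", kb.Objects("the man", "has")));  // a hat and a coat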
There is some research into probabilistic relational modelling, mostly into Markov Logic Networks at the University of Washington and Probabilistic Relational Models at Stanford and other places. I'm a little out of touch with the field, but this is a difficult problem and it's all early-stage research as far as I know. There are a lot of issues, mostly around efficient and scalable inference.
All in all, it's a good idea and a very sensible thing to want to do. However, it's also very difficult to achieve. If you want to look at a slick example of the state of the art (i.e. what is possible with a bunch of people and money), maybe check out PowerSet.
Interesting question, I've been doing some work on a strongly-typed NLP engine in C#: http://blog.abodit.com/2010/02/a-strongly-typed-natural-language-engine-c-nlp/ and have recently begun to connect it to an ontology store.
To me it looks like the issue here is really: How do you parse the natural language input to figure out that 'He' is the same thing as "the man"? By the time it's in the Semantic Network it's too late: you've lost the fact that statement 2 followed statement 1 and the ambiguity in statement 2 can be resolved using statement 1. Adding a third relation after the fact to say that "He" and "the man" are the same is another option but you still need to understand the sequence of those assertions.
Most NLP parsers seem to focus on parsing single sentences or large blocks of text but less frequently on handling conversations. In my own NLP engine there's a conversation history which allows one sentence to be understood in the context of all the sentences that came before it (and also the parsed, strongly-typed objects that they referred to). So the way I would handle this is to realize that "He" is ambiguous in the current sentence and then look back to try to figure out who the last male person was that was mentioned.
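As a rough sketch of that look-back idea (all the type and member names here are invented for illustration, not taken from my engine):

using System;
using System.Collections.Generic;
using System.Linq;

enum Gender { Male, Female, Neuter }

// A mention of an entity earlier in the conversation.
record Mention(string Text, Gender Gender);

class ConversationHistory
{
    readonly List<Mention> mentions = new();

    public void Note(string text, Gender gender) => mentions.Add(new Mention(text, gender));

    // "He" resolves to the most recent male mention, "it" to the most recent neuter one, etc.
    public Mention Resolve(Gender pronounGender) =>
        mentions.LastOrDefault(m => m.Gender == pronounGender);
}

// var history = new ConversationHistory();
// history.Note("the man", Gender.Male);   // "the man has a hat"
// history.Note("a hat", Gender.Neuter);
// var he = history.Resolve(Gender.Male);  // "he has a coat" -> "the man"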
In the case of my home, for example, it might tell you that you missed a call from a number that's not in its database. You can type "It was John Smith" and it can figure out that "It" means the call that was just mentioned to you. But if you typed "Tag it as Party Music" right after the call, "it" would still resolve to the song that's currently playing, because the house looks back for something that is ITaggable.
I'm not exactly sure if this is what you want, but take a look at natural language generation (see the Wikipedia article): the "reverse" of parsing, constructing derivations that conform to the given semantic constraints.

How do you think the "Quick Add" feature in Google Calendar works?

I'm thinking about a project which might use functionality similar to the way "Quick Add" parses natural language into something that can be understood with some level of semantics. I'm interested in understanding this better and wondered what your thoughts were on how it might be implemented.
If you're unfamiliar with what "Quick Add" is, check out Google's KB about it.
6/4/10 Update
Additional research on "natural language processing" (NLP) yields results which are MUCH broader than what I feel is actually implemented in something like "Quick Add". Given that this feature expects specific types of input rather than truly free-form text, I'm thinking it's a much narrower application of NLP. If anyone could suggest a narrower topic to research, rather than the entire breadth of NLP, it would be greatly appreciated.
That said, I've found a nice collection of resources about NLP, including this great FAQ.
I would start by deciding on a standard way to represent all the information I'm interested in: event name, start/end time (and date), guest list, location. For example, I might use an XML notation like this:
<event>
  <name>meet Sam</name>
  <starttime>16:30 07/06/2010</starttime>
  <endtime>17:30 07/06/2010</endtime>
</event>
I'd then aim to build up a corpus of diary entries about dates, annotated with their XML forms. How would I collect the data? Well, if I were Google, I'd probably have all sorts of ways. Since I'm me, I'd probably start by writing down all the ways I could think of to express this sort of stuff, then annotating it by hand. If I could add to this by going through friends' e-mails and whatnot, so much the better.
Now that I've got a corpus, it can serve as a set of unit tests. I need to code a parser to fit the tests. The parser should translate a string of natural language into the logical form of my annotation. First, it should split the string into its constituent words. This is called tokenising, and there is off-the-shelf software available to do it. (For example, see NLTK.) To interpret the words, I would look for patterns in the data: for example, text following 'at' or 'in' should be tagged as a location; 'for X minutes' means I need to add that number of minutes to the start time to get the end time. Statistical methods would probably be overkill here; it's best to create a series of hand-coded rules that express your own knowledge of how to interpret the words, phrases and constructions in this domain.
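For instance, here is a minimal sketch of such hand-coded rules (the regexes and names are purely illustrative; I have no knowledge of Google's actual implementation):

using System;
using System.Text.RegularExpressions;

class QuickAddParser
{
    public static void Parse(string input)
    {
        // Rule: "for X minutes" gives the duration (end = start + X minutes).
        Match duration = Regex.Match(input, @"for (\d+) minutes");
        // Rule: text following "at" is tagged as a location.
        Match location = Regex.Match(input, @"\bat ([a-z ]+)$", RegexOptions.IgnoreCase);
        // Rule: a clock time like "4pm" or "4:30pm" is the start time.
        Match start = Regex.Match(input, @"\b\d{1,2}(:\d{2})?(am|pm)\b");

        if (start.Success)    Console.WriteLine("start:    " + start.Value);
        if (duration.Success) Console.WriteLine("duration: " + duration.Groups[1].Value + " minutes");
        if (location.Success) Console.WriteLine("location: " + location.Groups[1].Value);
    }
}

// QuickAddParser.Parse("meet Sam 4pm for 60 minutes at the office");
// start:    4pm
// duration: 60 minutes
// location: the office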
It would seem that there's really no narrow approach to this problem. I wanted to avoid having to pull along the entirety of NLP to figure out a solution, but I haven't found any alternative. I'll update this if I find a really great solution later.

Why would M# be harder to Google than C#?

I read just now in a comment on another question titled "Effective Googling for short names":
C# isn't bad to Google for at all. It would be a lot harder if it were called M#, by the way.
Why? What am I missing?
It turns out I was somewhat wrong. I had thought that C# just happened to benefit from an understanding of musical keys - a search for "G#" finds plenty of results about the musical key of G#. (This is shown by experimentation, by the way - despite working at Google I don't know anything about the search engine. At least, not on this front.)
However, in this case not only does C# benefit from the musical key side of things, but Google's own help pages explain that C# and other programming languages are special-cased:
Punctuation that is not ignored

Punctuation in popular terms that have particular meanings, like [ C++ ] or [ C# ] (both are names of programming languages), are not ignored.

The dollar sign ($) is used to indicate prices. [ nikon 400 ] and [ nikon $400 ] will give different results.

The hyphen - is sometimes used as a signal that the two words around it are very strongly connected. (Unless there is no space after the - and a space before it, in which case it is a negative sign.)

The underscore symbol _ is not ignored when it connects two words, e.g. [ quick_sort ].
It would be interesting to know how long it would take a theoretical language "M#" to become searchable... but I'm not going to start speculating on that in a public forum :)
(Note that the Spec# home page comes up as the second link when you search Google for Spec#. At least it's there and pretty prominent though.)
I'll put up my opinion, extrapolated from my comment.
As others have suggested, special characters are ignored by Google. But C# may have had a head start in not being ignored (or at least not being turned into "C") because of the musical note C#, which was probably allowed for searches like "some piece of music in C#". M# would not have benefited in the same way.
