Is it possible to define a data property of a class as the number of individuals of another class, with this number computed automatically?
There is no built-in functionality for counting, as far as I know.
It could be implemented for asserted axioms, but I don't think it can be guaranteed to work reliably. The open world assumption and the absence of a unique name assumption by default mean that it's impossible to say whether there are unknown individuals or whether any of the known individuals are sameAs each other.
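To make the caveat concrete, here is a minimal Ruby sketch (not an OWL API; the data and the count_asserted helper are made up for illustration) that counts the asserted members of a class after merging individuals linked by asserted sameAs statements.

```ruby
# Hypothetical asserted data: class memberships and owl:sameAs pairs.
CLASS_ASSERTIONS = {
  "Student" => %w[alice bob robert carol]
}.freeze
SAME_AS = [%w[bob robert]].freeze

# Count asserted members of a class, treating individuals linked by an
# asserted sameAs statement as one individual (simple union-find).
def count_asserted(class_name, assertions, same_as)
  parent = Hash.new { |h, k| h[k] = k } # each name starts as its own root

  find = lambda do |x|
    x = parent[x] until parent[x] == x
    x
  end

  # Merge the two sides of every asserted sameAs pair.
  same_as.each { |a, b| parent[find.call(a)] = find.call(b) }

  assertions.fetch(class_name, []).map { |m| find.call(m) }.uniq.size
end

puts count_asserted("Student", CLASS_ASSERTIONS, SAME_AS)
```

Even this number is not a guaranteed bound in either direction: under the open world assumption there may be members that were never asserted, and two names with no asserted sameAs link may still denote the same individual.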
Related
I am developing a Neo4j database that will contain genomic and clinical data for cancer patients. A common design issue in developing graph databases is whether a data item should be represented by a Node or by a property within a Node. In my case, patients will have hundreds of clinical and demographic measurements (e.g. sex, medications, tumor size). Some of these data will be constant (e.g. sex) while others will be subject to variation with each patient visit. The responses I've seen to previous node vs property questions have recommended using the anticipated queries against the data to make the decision. I think I can identify some properties that will be common search criteria and should be nodes (e.g. smoking history, sex, cancer type) but that still leaves me with hundreds of other properties. Is there a practical limit in Neo4j for the number of properties that a Node should contain? Also, a hybrid approach, where some data are properties and others are Nodes would seem to make both loading data from source files and subsequent queries more complicated.
The main idea behind "look at your queries to decide" is that how data items relate to each other affects whether a node or a property is better. Really, the main point of a graph database is to make walking relationships easier to query. So the real question you should ask yourself is "Does (a)-->()<--(b) have any significant meaning?" In other words, do I need to be able to find other nodes that share this property?
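As a rough illustration of the (a)-->()<--(b) walk, here is a small in-memory sketch in Ruby rather than Cypher. The edge list, the LIVES_AT relationship name, and the neighbors_via_shared_node helper are all hypothetical.

```ruby
# Hypothetical in-memory graph: each edge is [person, relationship, target node].
EDGES = [
  ["alice", :LIVES_AT, "12 Oak St"],
  ["bob",   :LIVES_AT, "12 Oak St"],
  ["carol", :LIVES_AT, "9 Elm Rd"]
].freeze

# Walk (person)-->(shared node)<--(other): find everyone else attached
# to a node that `person` is also attached to via `rel`.
def neighbors_via_shared_node(person, rel, edges)
  targets = edges.select { |src, r, _t| src == person && r == rel }.map(&:last)
  others = edges.select { |src, r, t| r == rel && src != person && targets.include?(t) }
  others.map(&:first).uniq
end

p neighbors_via_shared_node("alice", :LIVES_AT, EDGES)
```

If a walk like this is something you would actually query, the shared value is a good candidate for a node; if you would never ask that question, a property is probably enough.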
Here are some quick rule-of-thumb guidelines:
Node
Has its own sub-values or relations
Multiple nodes sharing this value has meaning, and you need to be able to walk along this shared value between them
Changes infrequently
If more than 1 value can apply at the same time
Properties
Has a large range of possible values
Changes over time
If more than 1 value can apply, values are usually updated as a set (rather than individually)
Label
Has a small, finite range of mutually exclusive values
Almost never changes
So let's go through the thought process for a few of your properties...
Sex
Will either be "Male" or "Female", and everyone will be connected to one of the two, so they will both end up being super nodes (overloaded). Also, if you do end up needing to find two people that share the same sex, almost any other method would be more efficient than finding them through the super node. However these are mutually exclusive, immutable, genetic traits so making this a label is also perfectly acceptable (and sometimes preferred).
Address
This is a variable value with sub-properties, won't be shared by very many nodes, and the walk from one person to another at the same address (or, by extension, living in the same area) has valuable meaning. So this should almost definitely be a node.
Height and Weight
These change constantly with time, have no sub-values, and two people sharing this value has little to no meaning. The range of values is far too wide, so Labels make no sense either. This should be a property.
Blood type
While this has more options than Sex does, all the same logic applies, except that the relation does matter now (because people must have compatible blood types to donate). The problem is that this value will be so overloaded that you will need to filter on area first and then just verify blood type. Could be a property or label. The case for a node is if you include a "Can_Donate_To" or "Can_Accept" relation between the blood types. While you likely won't walk these relations to find a potential donor (because they are too overloaded, and you will have to filter by area first), you can use them to verify that someone can be a donor.
Social Security Number
Is highly sensitive, and a lawsuit waiting to happen. Keep it out of the DB if at all possible. If you absolutely have to store it: this property is immutable but unique to every person, so because of the lack of reuse it is a bad label and would be pointless as a node. Definitely a property. (But it should be salted and hashed if it is used only for verification.)
Mother's maiden name
The possible values are endless, and two nodes sharing this value has no real meaning. Definitely a property.
First born child
Since the child is already their own node, with its own sub-properties, just create a relation between the two. While the value of this info is questionable, any time you need to reference another node, always use a relationship for it. Definitely a node.
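For the blood type case above, the "filter by area first, then verify via Can_Donate_To" idea can be sketched in plain Ruby. The table is the standard red-cell ABO/Rh compatibility chart written as adjacency lists; the PEOPLE data and the potential_donors helper are hypothetical.

```ruby
# CAN_DONATE_TO edges between blood type nodes (red-cell compatibility).
CAN_DONATE_TO = {
  "O-"  => %w[O- O+ A- A+ B- B+ AB- AB+],
  "O+"  => %w[O+ A+ B+ AB+],
  "A-"  => %w[A- A+ AB- AB+],
  "A+"  => %w[A+ AB+],
  "B-"  => %w[B- B+ AB- AB+],
  "B+"  => %w[B+ AB+],
  "AB-" => %w[AB- AB+],
  "AB+" => %w[AB+]
}.freeze

# Hypothetical people, with area and blood type stored as plain properties.
PEOPLE = [
  { name: "ann", area: "north", blood_type: "O-" },
  { name: "ben", area: "north", blood_type: "AB+" },
  { name: "cy",  area: "south", blood_type: "O-" }
].freeze

def potential_donors(recipient_type, area, people)
  # Cheap, selective filter first: same area.
  nearby = people.select { |person| person[:area] == area }
  # Then use the CAN_DONATE_TO edges only to verify compatibility.
  compatible = nearby.select do |person|
    CAN_DONATE_TO.fetch(person[:blood_type]).include?(recipient_type)
  end
  compatible.map { |person| person[:name] }
end

p potential_donors("A+", "north", PEOPLE)
```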
In modeling, instances of the same label, e.g. Student, usually have the same set of properties. However, is it normal for instances of the same label to have different sets of properties? For example, I have a Product node:
(p:Product)-[:HAS_ATTRIBUTE]->(a:Attributes)
Different instances of Product result in different instances of Attributes. In this case, different Attributes nodes have very different properties.
Is this modeling normal? Different categories of products can have very different attributes.
It's very useful to have different properties. For instance, I have a Y-DNA project with single nucleotide polymorphism (SNP) Nodes. Some are on the known haplotree and some are not. So, I set a property InHGTree to Y or blank to reflect this. Now I can more readily create queries using the haplotree branching.
BTW, relationships can also have different properties with the same value. DNA results from an individual are in a "kit." The kit is related to numerous SNPs. You want to be able to determine whether the kit is positive or negative for the SNP. It is most logical to put this fact in the relationship between the kit and the SNP.
It's certainly allowed, as there is no table schema as in relational dbs to enforce homogenous properties.
While this provides great flexibility, it may introduce complexity. It's up to the modelers and administrators of the database to provide any guidelines or implement restrictions, if needed.
While that would usually be in the form of convention, APOC triggers (or kernel extensions if you want to implement this yourself) could be used to enforce only certain properties for a node of a given label.
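As a convention-level alternative to a trigger, the check can also live in application code. This is only a sketch in Ruby; the ALLOWED_PROPERTIES table and the unexpected_properties helper are hypothetical and have nothing to do with the APOC API.

```ruby
# Hypothetical convention: the properties each label is allowed to carry.
ALLOWED_PROPERTIES = {
  "Product"    => %i[name sku],
  "Attributes" => %i[color size weight_kg voltage]
}.freeze

# Return the property keys that are not permitted for the label.
# An empty array means the node follows the convention.
def unexpected_properties(label, properties)
  properties.keys - ALLOWED_PROPERTIES.fetch(label, [])
end

p unexpected_properties("Attributes", { color: "red", flavour: "mint" })
```

Running a check like this before writing a node keeps the flexibility of optional properties while still rejecting names nobody agreed on.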
I have a question:
I have a domain: LoanAccount. We have different loan products, but they differ only in how the interest is calculated.
For example:
1. Regular Loan calculates interest using the Annuity Interest Rate formula
2. Vehicle Loan calculates interest using the Flat Interest Rate formula
3. Temporary Loan calculates interest with another formula (I have no idea what it is).
We could also change the rules every year, in which case we would use a different formula as well.
My Question:
Should I put all the formula logic in services?
Should I make every loan a different domain class?
Or should I make one domain class that has different interest rate calculation methods?
Any example would be good :)
Thank you in advance !
My suggestion is to separate interest calculating logic from the domain objects.
Hard-wiring the domain object to its interest calculation is likely to get you into trouble.
It would be more complicated to change the type of interest calculation for an existing account type (which could be an expected business request).
When a new account type is created, you can easily reuse all the calculation methods you have already implemented.
It's likely that the interest-calculating algorithm will grow in complexity in the future, and it may need properties that should not be part of the Account domain object, like some business constants, a list of transactions, etc.
Grails (because it builds on Spring) naturally supports keeping business logic in services (declarative transactions, etc.) rather than in the domain objects. You will always have less pain going along with the framework than against it.
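A minimal sketch of that separation, written in Ruby rather than Groovy to keep it short. All class names here (FlatInterest, AnnuityInterest, InterestService, the LoanAccount struct) are hypothetical, and the formulas are the textbook flat and annuity ones, which may not match your business rules.

```ruby
# Plain domain object: it knows its data, not how interest is computed.
LoanAccount = Struct.new(:product, :principal, :rate_per_period, :periods, keyword_init: true)

class FlatInterest
  # Interest is charged on the full principal for every period.
  def total_interest(principal:, rate_per_period:, periods:)
    principal * rate_per_period * periods
  end
end

class AnnuityInterest
  # Equal instalments; interest accrues on the declining balance.
  # Assumes rate_per_period is greater than zero.
  def total_interest(principal:, rate_per_period:, periods:)
    payment = principal * rate_per_period / (1 - (1 + rate_per_period)**(-periods))
    payment * periods - principal
  end
end

# The service picks a calculator per product, so changing a product's
# formula (or adding a product) never touches LoanAccount.
class InterestService
  def initialize(calculators)
    @calculators = calculators
  end

  def interest_for(account)
    @calculators.fetch(account.product).total_interest(
      principal: account.principal,
      rate_per_period: account.rate_per_period,
      periods: account.periods
    )
  end
end

service = InterestService.new({ regular: AnnuityInterest.new, vehicle: FlatInterest.new })
vehicle = LoanAccount.new(product: :vehicle, principal: 1200.0, rate_per_period: 0.01, periods: 12)
puts service.interest_for(vehicle).round(2)
```

When the rules change next year, you register a different calculator for the product instead of editing the domain class.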
Say you have a class, Car, which has a Driver. If you wanted to access the driver's age, you would do:
#car.driver_age
Instead of
#car.driver.age
If you have delegated the driver's age attribute in the Car model (with prefix set to true).
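For readers without Rails at hand, roughly the same effect can be sketched with Ruby's standard Forwardable module. This is a stand-in for the Rails delegate macro, not its implementation; the Driver struct is an assumption for the example.

```ruby
require "forwardable"

Driver = Struct.new(:name, :age)

class Car
  extend Forwardable

  # Expose driver.age as car.driver_age, similar in effect to Rails'
  # `delegate :age, to: :driver, prefix: true`.
  def_delegator :@driver, :age, :driver_age

  def initialize(driver)
    @driver = driver
  end
end

car = Car.new(Driver.new("Sam", 42))
puts car.driver_age
```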
However, if you also had a Passenger class and you wanted to access the number of passengers in the car, is the following not breaking the Law of Demeter, or is my thinking over-zealous:
#car.passengers.count
I think that count is so general that I would not find it necessary to proxy the call. I would ask myself the question:
Is it possible that there will be an implementation of passengers in the future, that will not respond to count?
Since passengers is extremely likely to be a container type forever, and all container types in Ruby (Array, Hash, …) respond to count in the way you would expect, I would answer this question with “no” and therefore stick with #car.passengers.count.
EDIT
But if you’re being strict, you are indeed breaking the Law of Demeter. Consider for example a class RobotCar < Car that has no passengers at all. Now, following LoD you could simply return 0 from the method car.passenger_count, whereas when not following LoD you would have to return an empty container from passengers in order not to break other code.
In the end, you will have to decide for yourself just how likely it is that the interface will ever change. If you are very certain that it won't ever change, then I guess it's OK to disobey LoD.
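The RobotCar case can be sketched like this. The class bodies are an assumption for illustration; only the names Car, RobotCar and passenger_count come from the discussion above.

```ruby
class Car
  def initialize(passengers = [])
    @passengers = passengers
  end

  # Callers ask the car directly; how passengers are stored stays private.
  def passenger_count
    @passengers.count
  end
end

class RobotCar < Car
  # No passenger collection at all, so nothing to initialize.
  def initialize; end

  # Following LoD, the subclass only has to answer the one question.
  def passenger_count
    0
  end
end

puts Car.new(%w[ann ben]).passenger_count
puts RobotCar.new.passenger_count
```

Without passenger_count, RobotCar would have to keep returning an empty container from passengers just so that existing callers of passengers.count keep working.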
If your class/method needs to know the driver's age, it should have a direct reference to the driver:
#driver.age
Or to the array of passengers:
#passengers.count
Accessing these through #car makes many assumptions. Imagine an automatic car without a driver or a toy car without passengers. #car.driver_age or #car.passengers_count wouldn't make much sense.
LoD isn't just "counting dots", and trading an underscore for a dot doesn't help.
http://haacked.com/archive/2009/07/13/law-of-demeter-dot-counting.aspx
http://www.dan-manges.com/blog/37
Cars have a driver who has an age; there's nothing unreasonable about this.
(Well, not really, because we're about to enter an age of driverless cars and this model may not account for that, but that's a separate issue.)
The Law of Demeter for functions requires that a method M of an object O may only invoke the methods of the following kinds of objects:
O itself
M's parameters
any objects created/instantiated within M
O's direct component objects
In particular, an object should avoid invoking
methods of a member object returned by another method.
I am reading http://en.wikipedia.org/wiki/Domain-driven_design right now, and I just need 2 quick examples so I understand what 'value objects' and 'services' are in the context of DDD.
Value Objects: An object that describes a characteristic of a thing. Value Objects have no conceptual identity. They are typically read-only objects and may be shared using the Flyweight design pattern.
Services: When an operation does not conceptually belong to any object. Following the natural contours of the problem, you can implement these operations in services. The Service concept is called "Pure Fabrication" in GRASP.
Value objects: can someone give me a simple example of this, please?
Services: so if it isn't an object/entity and doesn't belong to a repository or factory, then it's a service? I don't understand this.
The archetypical example of a Value Object is Money. It's very conceivable that if you build an international e-commerce application, you will want to encapsulate the concept of 'money' into a class. This will allow you to perform operations on monetary values - not only basic addition, subtraction and so forth, but possibly also currency conversions between USD and, say, Euro.
Such a Money object has no inherent identity - it contains the values you put into it, and when you dispose of it, it's gone. Additionally, two Money objects containing 10 USD are considered identical even if they are separate object instances.
Other examples of Value Objects are measurements such as length, which might contain a value and a unit, such as 9.87 km or 3 feet. Again, besides simply containing the data, such a type will likely offer conversion methods to other measurements and so forth.
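Here is a minimal Ruby sketch of such a Money value object. The design choices are assumptions for the example: amounts are stored as integer cents, the object is frozen after construction, and adding two different currencies raises instead of converting.

```ruby
class Money
  attr_reader :cents, :currency

  def initialize(cents, currency)
    @cents = cents
    @currency = currency
    freeze # value objects are read-only
  end

  # Operations return a new Money rather than mutating this one.
  def +(other)
    raise ArgumentError, "currency mismatch" unless currency == other.currency

    Money.new(cents + other.cents, currency)
  end

  # Equality is based on the values, not on object identity.
  def ==(other)
    other.is_a?(Money) && cents == other.cents && currency == other.currency
  end
  alias eql? ==

  # Equal values must hash equally so Money works as a Hash key.
  def hash
    [cents, currency].hash
  end
end

a = Money.new(1000, "USD")
b = Money.new(1000, "USD")
puts a == b      # same value
puts a.equal?(b) # but two separate instances
```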
Services, on the other hand, are types that perform an important Domain operation but don't really fit well into the other, more 'noun'-based concepts of the Domain. You should strive to have as few Services as possible, but sometimes a Service is the best way of encapsulating an important Domain Concept.
You can read more about Value Objects, Services and much more in the excellent book Domain-Driven Design, which I can only recommend.
Value Objects: a typical example is an address. Equality is based on the values of the object, hence the name, and not on identity. That means that for instance 2 Person objects have the same address if the values of their Address objects are equal, even if the Address objects are 2 completely different objects in memory or have a different primary key in the database.
Services: offer actions that do not necessarily belong to a specific domain object but do act upon domain objects. As an example, I'm thinking of a service that sends e-mail notifications in an online shop when the price of a product drops below a certain price.
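The price-drop notification could look roughly like this in Ruby. Everything here is hypothetical: the Product struct, the RecordingMailer stand-in (it only records what would be sent), and the watchers hash mapping an e-mail address to a price threshold.

```ruby
Product = Struct.new(:name, :price)

# Stand-in for a real mailer; it just records the messages.
class RecordingMailer
  attr_reader :sent

  def initialize
    @sent = []
  end

  def deliver(to:, subject:)
    @sent << { to: to, subject: subject }
  end
end

# The service acts on domain objects, but the operation belongs to
# neither Product nor the customer, so it gets a class of its own.
class PriceDropNotifier
  def initialize(mailer)
    @mailer = mailer
  end

  # watchers maps an e-mail address to the price threshold it cares about.
  def notify(product, watchers)
    watchers.each do |email, threshold|
      next unless product.price < threshold

      @mailer.deliver(to: email, subject: "#{product.name} is now #{product.price}")
    end
  end
end

mailer = RecordingMailer.new
PriceDropNotifier.new(mailer).notify(Product.new("Lamp", 18), { "a@example.com" => 20, "b@example.com" => 15 })
puts mailer.sent.length
```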
InfoQ has a free book on DDD (a summary of Eric Evans's book): http://www.infoq.com/minibooks/domain-driven-design-quickly
This is a great example of how to identify Value Objects vs Entities. My other post also gives another example.