I am attempting to implement ZooKeeper as a shared state engine for an application I am creating in Erlang. The structure for the state would be like the following:
/appRoot
  /parent1:{json}
    /child1:{json}
    /child2:{json}
  /parent2:{json}
    /child1:{json}
    /child2:{json}
I would like a single method that, given /appRoot, returns all parent nodes along with their data, say as a list of tuples [{parent1,{json}}, {parent2,{json}}]. Or, given /appRoot/parent1, a list of its subnodes with their data. So far, all I see is a way to get the keys (getChildren) and then recursively retrieve the data for each key. It seems like I should be able to make just one call to do this.
I am currently using the ezk Erlang client library. If anybody knows a better solution, that would be appreciated as well.
TIA
AFAIK there is no way to atomically list a node's children along with their data in the ZooKeeper wire protocol. It seems the only way to do this is the following:
List all of the node's children and set a watch for children changes.
For each child node, read its data and set a watch for data changes.
Update the corresponding values on every change event that arrives after you started the traversal.
This can be done easily over ezk, but the exact wiring depends heavily on the atomicity guarantees your app logic demands.
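For concreteness, here is a minimal sketch of that traversal against the standard Apache ZooKeeper Java client (the class and variable names are illustrative; the same list-then-read pattern should map onto ezk's ls and get calls, and the watcher body is stubbed because the right reaction to change events is app-specific):

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChildSnapshot {
    // Returns child name -> data for every child of root.
    // Not atomic: a child removed between the two reads will make
    // getData throw KeeperException.NoNodeException, which is why
    // both calls also register watches.
    static Map<String, byte[]> readChildren(ZooKeeper zk, String root)
            throws KeeperException, InterruptedException {
        Watcher watcher = event -> {
            // NodeChildrenChanged / NodeDataChanged lands here:
            // re-read the affected path and update your local copy.
        };
        Map<String, byte[]> snapshot = new HashMap<>();
        List<String> children = zk.getChildren(root, watcher);
        for (String child : children) {
            snapshot.put(child, zk.getData(root + "/" + child, watcher, null));
        }
        return snapshot;
    }
}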
I am creating a Google Dataflow pipeline, using the Apache Beam Java SDK. I have a few transforms there, and I finally create a collection of entities (PCollection<Entity>). I need to write this into Google Datastore and then perform another transform AFTER all entities have been written (such as broadcasting the IDs of the saved objects through a PubSub message to multiple subscribers).
Now, the way to store a PCollection is:
entities.apply(DatastoreIO.v1().write().withProjectId("abc"));
This returns a PDone object, and I am not sure how I can chain another transform to occur after this write has completed. Since the DatastoreIO.write() call does not return a PCollection, I am not able to extend the pipeline further. I have two questions:
How can I get the IDs of the objects written to Datastore?
How can I attach another transform that will act after all entities are saved?
We don't have a good way to do either of these things (returning IDs of written Datastore entities, or waiting until entities have been written), though this is far from the first similar request (people have asked for this for BigQuery, for example) and we're thinking about it.
Right now your only option is to wait until the entire pipeline finishes, e.g. via pipeline.run().waitUntilFinish(), and then do what you wanted in your main program (e.g. you can run another pipeline).
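A rough sketch of that workaround, assuming the Beam Java SDK from the question (buildPipeline() and publishSavedIds() are hypothetical stand-ins for your pipeline construction and PubSub follow-up):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;

public class RunAndNotify {
    public static void main(String[] args) {
        // hypothetical: builds the pipeline, ending with DatastoreIO.v1().write()
        Pipeline pipeline = buildPipeline();

        PipelineResult result = pipeline.run();
        // blocks until the whole pipeline, including the Datastore write, is done
        PipelineResult.State state = result.waitUntilFinish();

        if (state == PipelineResult.State.DONE) {
            publishSavedIds();  // hypothetical: notify subscribers, or run a second pipeline
        }
    }
}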
https://facebook.github.io/relay/graphql/objectidentification.htm is very clear around what Node is and how it behaves, but it doesn't specify which objects must implement it, or what the consequences are if your object doesn't implement it. Is there a set of features that don't work? Are such objects completely ignored? Not all objects in the existing spec (e.g. pageInfo) implement it, so it's clearly not universally required, but pageInfo is somewhat of a special case.
Another way of thinking about the Node interface is that objects that implement it are refetchable. Refetchability effectively means that an object has an ID that I can use to identify the object and retrieve it; by convention, these IDs will usually be opaque, but will contain type information and an identifier within that type (e.g. a Base64 encoding of a string like "Account:1234").
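In schema terms that convention looks roughly like this (the Account type and its fields are illustrative, not from the spec):

interface Node {
  id: ID!
}

type Account implements Node {
  id: ID!          # opaque, e.g. Base64("Account:1234") = "QWNjb3VudDoxMjM0"
  name: String
  address: String
  location: String
  createdAt: String
}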
Relay will leverage refetchability in two ways:
Under a process known as "diffing", if you already have some data for an object identified by ID QWNjb3VudDoxMjM0 (say, the name and address fields), and you then navigate to a view that shows some additional fields (location, createdAt), then Relay can make a minimal query that "refetches" the node but requests only the missing fields.
Relatedly, Relay will diff connections and will use the Node interface to fill in missing data on those (for example, through some combination of navigation you might have full information for some items in a view but need to fill in location for other items within the range, or you might modify an item in a connection via a mutation). So, in basic pagination, Relay will often end up making a first + after query to extend a connection, but if you inspect its network traffic in a real app you will also see that it makes node queries for items within connections.
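The refetch described under "diffing" would look something like this on the wire, reusing the illustrative Account type from above:

query RefetchMissingFields {
  node(id: "QWNjb3VudDoxMjM0") {
    ... on Account {
      location      # only the fields missing from the store
      createdAt
    }
  }
}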
So yes, you're right that pageInfo doesn't implement Node, and it wouldn't really make sense for it to do so.
I'm using Neo4j (but this probably also applies to other databases). The user can supply his own key/value pairs, but I also need to define some properties by the system. How do I prevent a name clash on the key? I could prefix all the system properties, but that seems a bit weird. I could also make another node and put all the system properties there, but that might make for some difficult queries. What's a good way to solve this?
Neo4j is a property graph database.
Basically, there is no magic there. You already mentioned all possible solutions.
From my perspective, the best solution is to add prefixes to user-defined properties (for example, #). This keeps queries simple enough and doesn't cause any performance problems.
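For example, in Cypher (the Thing label and id are hypothetical; the backticks are needed because # is not a plain identifier character):

MATCH (n:Thing {id: 42})
SET n.created_at = timestamp()   // system-defined property: no prefix
SET n.`#color` = 'red'           // user-defined property: prefixed to avoid clashes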
Additionally, if these properties are only ever read and you are never going to run queries against them, then you can look into storing the user-defined data as JSON in your nodes:
SET n.user_data = '{"key": "value"}'
As a complete novice programmer I am trying to populate my Neo4j DB with data from heterogeneous sources. For this I am trying to use the Neo4jClient C# API. The heterogeneity of my data comes from a custom, continuously evolving DSL/DSML/metamodel that defines the possible types of elements, i.e. models, so creating classes for each type would not be ideal.
As I understand, my options are the following:
Have a predefined class for each type of element: this way I can easily serialize my objects, provided all properties are primitive types or arrays/lists.
Have a base class (with a Dictionary to hold properties) that I use as an interface between the models that I'm trying to serialize and Neo4j. I've seen an example of this at Can Neo4j store a dictionary in a node?, but I don't understand how to use the converter (defined in the answer) to add a node. Also, I don't see how an int-based dictionary would allow me to store key/value pairs where the keys (which are strings) would translate to property names in Neo4j.
Generate a custom query dynamically, as seen at https://github.com/Readify/Neo4jClient/wiki/cypher#manual-queries-highly-discouraged. This is discouraged and possibly not performant.
Ultimately, what I would like to achieve is to avoid the need to define a separate class for every type of element that I have, but still be able to add properties that are defined by types in my metamodel.
I would also be interested in somehow influencing the serializer to ignore non-compatible properties (similarly to XmlIgnore), so that I would not need to create a separate class for each class that has more than just primitive types.
Thanks,
J
There are two problems you're trying to solve: the first is how to program the C# part of this, the second is how to store the solution to the first problem.
At some point you'll need to access this data in your C# code - unless you're going fully dynamic you'll need to have some sort of class structure.
Taking your 3 options:
Please have a look at this question: neo4jclient heterogenous data return, which I think covers this scenario.
In that answer, the converter does the work for you: you create, delete, etc. as before, and the converter just handles the IDictionary instance. The IDictionary<int, string> in the answer is only an example; you can use whatever you want, including IDictionary<string, string>. In fact, in that example all you'd need to do is change the IntString property to an IDictionary<string, string> and it should just work.
Even if you went down the route of using custom queries (which you really shouldn't need to), you would still need to bring back objects as classes. Nothing changes; it just makes your life a lot harder.
In terms of XmlIgnore - have you tried JsonIgnore?
Alternatively, look at the custom converter and get the non-compatible properties into your DB.
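Putting those two pieces together, a minimal sketch (the class and property names are illustrative; the attributes are Json.NET, which Neo4jClient uses for serialization, and the dictionary still relies on the custom converter from the linked answer):

using System.Collections.Generic;
using Newtonsoft.Json;

public class ModelElement
{
    // fixed, system-level property
    public string Name { get; set; }

    // metamodel-defined properties; with the custom converter from the
    // linked answer, each key becomes a property name on the node
    public IDictionary<string, string> Properties { get; set; }

    // not storable in Neo4j; the serializer skips it
    [JsonIgnore]
    public object RuntimeOnlyState { get; set; }
}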
I understand that in the Usergrid UI I can create an individual collection, but it does not allow me to create a collection under a collection. Is there a way of doing that? Otherwise we will be forced to write business logic in the proxy layer, which we don't want to do.
With Regards
-S
There is no concept of a subcollection, but you can use connections. So you can do something like this:
POST cats/fluffy/hasa/toys/ball
The above means that an entity of type "cat" is connected to an entity of type "toy" called "ball" by the connection verb "hasa".
You can also store sub-objects in an individual entity (full JSON is supported). If you want to describe your use case a bit more, I can maybe recommend other ways to structure your data.
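For example (paths follow the same Usergrid REST conventions as the POST above; the entity values are illustrative):

GET cats/fluffy/hasa

should return the connected "toy" entities (ball, in this case), while the sub-object alternative would be a single entity holding nested JSON:

POST cats { "name": "fluffy", "toys": { "favorite": "ball" } }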