I have some questions about Neo4j and data integrity!
How does Neo4j ensure data integrity? All ACID properties are said to be supported, so how are they actually implemented: atomicity, consistency, isolation, durability?
Could you also point me to one or two information sources on this?
Thank you in advance and best regards.
start here:
http://docs.neo4j.org/chunked/milestone/introduction-highlights.html
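If you want to see the atomicity part in practice, here is a minimal sketch against the transactional Cypher HTTP endpoint that ships with Neo4j 2.x (the URL, labels, and properties are illustrative): every statement in the request commits or rolls back as a single unit.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Neo4j 2.x transactional Cypher endpoint: all statements in one
# request run in a single transaction, so one failure rolls back all.
uri = URI('http://localhost:7474/db/data/transaction/commit')

payload = {
  statements: [
    { statement: 'CREATE (a:Account {name: {name}, balance: {balance}})',
      parameters: { name: 'alice', balance: 100 } },
    { statement: 'CREATE (b:Account {name: {name}, balance: {balance}})',
      parameters: { name: 'bob', balance: 0 } }
  ]
}

request = Net::HTTP::Post.new(uri.path, 'Content-Type' => 'application/json')
request.body = payload.to_json
response = Net::HTTP.new(uri.host, uri.port).request(request)

# A non-empty "errors" array means the whole transaction rolled back.
result = JSON.parse(response.body)
puts result['errors'].empty? ? 'committed atomically' : result['errors']
```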
Since I don't understand your question exactly, could you please post your goal: what are you about to do with Neo4j, and what technology are you currently using for it, if any? Then maybe I could answer whether it's possible or not.
I work with Ruby on Rails and want to cache some objects that I receive from the database. However, security is my priority and I am not sure if marshalling is the best choice over, for example, JSON.
Are there any security risks related to unmarshalling database objects? Is it possible to construct such an object that unmarshalling will result in remote code execution? If yes, how?
OK, I thought about it more and attained enlightenment. Of course, I can store those objects and most likely nothing will happen, but I know that this is a possible attack vector. So I can avoid possible issues completely and not summon Murphy's law upon me. Thanks to @SergioTulentsev for his patience!
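For future readers, here is a minimal sketch of why this is a real vector. The Evil class is a hypothetical stand-in for a "gadget": any class already loaded in the app whose deserialization hook has side effects, because Marshal.load invokes that hook while rebuilding the object.

```ruby
# Hypothetical "gadget" class: in a real attack this would be some
# class already loaded in the victim app. Marshal.load automatically
# calls marshal_load on it during deserialization.
class Evil
  def marshal_dump
    'ignored'
  end

  def marshal_load(_data)
    # Anything here runs during deserialization; in a real exploit
    # this could be system(...) or worse.
    puts 'arbitrary code executed during Marshal.load!'
  end
end

payload = Marshal.dump(Evil.new) # attacker-controlled bytes
Marshal.load(payload)            # prints the message: code already ran

# JSON.parse, by contrast, only ever builds strings, numbers, arrays,
# and hashes, which is why it is the safer choice for untrusted data.
```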
GitHub for Neo4J?
I'm evaluating graph databases as a possible solution for modeling a complex computer network. It occurs to me that something like a revision control system would be useful for planning and testing updates to the database. I had been assuming we would instantiate a test network graph for such planning and then write a routine to sync the changes back.
I see that this question has been asked and answered for relational databases (How do you maintain revision control of your database structure?). But I'm asking for graph databases, probably Neo4J.
In that relational thread someone pitches the Rails approach of making rollback a required element of database development. I like this idea too; I'm not sure how easy it is in graph databases.
How is this handled in the real world?
I found your question while also searching for an answer, so I don't have tested solutions to offer. But I can share that there's some discussion of this at How do I implement revisions with neo4j?, including a specific case at Neo4j / Strategy to keep history of node changes.
There's also a more detailed blog post at http://iansrobinson.com/2014/05/13/time-based-versioned-graphs/, which weighs the read-time / write-time / storage requirements of several alternatives. It also includes a number of diagrams and example queries that helped me wrap my head around what all this would look like.
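To make that concrete, here is a rough sketch of the identity-node / state-node pattern those posts describe, sent through the Neo4j 2.x transactional Cypher endpoint. All labels, relationship types, property names, and the URL are illustrative, not taken from the posts.

```ruby
require 'net/http'
require 'json'
require 'uri'

# An identity node keeps a CURRENT pointer to its latest state node;
# superseded states stay reachable via PREVIOUS links, each carrying
# a validity range, so the graph can be queried "as of" any time.
cypher = <<-CYPHER
  MATCH (r:Router {id: {id}})-[cur:CURRENT]->(old:State)
  DELETE cur
  CREATE (new:State {config: {config}, valid_from: timestamp()})
  CREATE (r)-[:CURRENT]->(new)
  CREATE (new)-[:PREVIOUS]->(old)
  SET old.valid_to = timestamp()
CYPHER

uri = URI('http://localhost:7474/db/data/transaction/commit')
request = Net::HTTP::Post.new(uri.path, 'Content-Type' => 'application/json')
request.body = {
  statements: [{ statement: cypher,
                 parameters: { id: 'core-1', config: 'v2' } }]
}.to_json
response = Net::HTTP.new(uri.host, uri.port).request(request)
puts JSON.parse(response.body)['errors']
```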
Hope that's still useful, lo these months later, and sorry I can't be of more help! If you've found something that works in the meantime, can you let us know?
I have searched the net but could not find a satisfying answer on the difference between mnesia:delete and mnesia:dirty_delete. Is this related to locking? Any pointers?
This piece of the docs will be helpful. It explains transactions in distributed Mnesia and notes that dirty functions run without them.
This is indeed related to transactions. Dirty operations give you the ability to bypass the transactional behaviour, but at a risk to data integrity. Thus, you have to know what you are doing when using dirty operations.
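To make the difference concrete, here is a minimal Erlang sketch (the person table is illustrative):

```erlang
%% Transactional delete: runs inside mnesia:transaction/1, takes a
%% write lock on the record, and is atomic with the rest of Fun.
delete_tx(Key) ->
    Fun = fun() -> mnesia:delete({person, Key}) end,
    mnesia:transaction(Fun).            %% {atomic, ok} | {aborted, Reason}

%% Dirty delete: no transaction, no locks, no isolation. Faster, but
%% concurrent readers may observe half-finished multi-table updates.
delete_dirty(Key) ->
    mnesia:dirty_delete(person, Key).   %% ok
```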
It's a RoR project.
We want to store user activities, like uploading a photo, voting for somebody, following somebody, etc. When listing the activities, we need to list your friends' activities as well. So, what is better to use in this case: a document-oriented database (CouchDB, MongoDB), a graph database (Neo4j), or maybe some other approach?
Thanks in advance for your help, guys :)
Yeah, I think Neo4j is a good choice. The Rails 3 support is excellent; see https://github.com/andreasronge/neo4j. Have a look at the social data-modeling examples with Cypher at http://docs.neo4j.org/chunked/snapshot/data-modeling-examples.html. For activity streams there are various cool approaches like Graphity; see http://www.rene-pickhardt.de/graphity-an-efficient-graph-model-for-retrieving-the-top-k-news-feeds-for-users-in-social-networks/
Depending on the scale of your application and the volume of activity, I'd recommend a combination of Couchbase (not CouchDB), which is extremely scalable and fast, for the actual activity data, and Neo4j for the graph discovery (both databases at the same time). I've used the combination very effectively in my application, which was both social and real-time.
If you want more info from me, please feel free to contact me directly and I can help with architectural decisions or implementation help.
Take a look at InfiniteGraph. It is scalable, unlike Neo4j. I think they have a free download for up to 1 million nodes.
Consider using SQLite. It is a single-file database and can be used as an embedded database.
I need to explain the practical problems that might be encountered when transforming transactional (and other) data from diverse sources into a data warehouse. According to my knowledge, this is about cleansing and scrubbing data. If anyone knows about any practical problems, please help me. Thanks for your help.
That's a broad topic, but I'll offer a few good starting points.
For starters, think about history. If a transaction updates some data point, do you need to apply that retroactively, or do you need to remember what the value was at any given point in time? For example, suppose you have a monthly report of customers by city, and one of your customers moves. How should the DW reflect that?
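The classic answer to the history question is a type-2 slowly changing dimension: instead of overwriting the customer's city, close out the old row and append a new one with its own validity range. A minimal sketch in plain Ruby (the structure and data are illustrative):

```ruby
require 'date'

# Type-2 slowly changing dimension: every change closes the current
# row and appends a new one, so reports can join on whichever row was
# valid at the reporting date.
customer_dim = [
  { key: 1, customer: 'Acme', city: 'Boston',
    valid_from: Date.new(2010, 1, 1), valid_to: nil }
]

def move_customer(dim, customer, new_city, on_date)
  current = dim.find { |r| r[:customer] == customer && r[:valid_to].nil? }
  current[:valid_to] = on_date                    # close the old row
  dim << { key: dim.size + 1, customer: customer, # open a new row
           city: new_city, valid_from: on_date, valid_to: nil }
end

move_customer(customer_dim, 'Acme', 'Denver', Date.new(2012, 6, 1))

# A report "as of" May 2012 still sees Boston; June onward sees Denver.
as_of = Date.new(2012, 5, 15)
row = customer_dim.find do |r|
  r[:valid_from] <= as_of && (r[:valid_to].nil? || as_of < r[:valid_to])
end
puts row[:city] # => Boston
```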
Think about data acceptance. Is every input row a good input? For example, if you're dealing with web data, there are crawlers and spammers that you might not want to count the same as you count user traffic.
Think about data synchronization. Do all your inputs use the same keys? Do you know how to translate between them? Does Team A mean the same thing by "cust_id" as Team B does? A project glossary is very helpful here.
Think about localization. Are your inputs all in the same time zone? Do they all use the same calendar system? Do you need to handle Unicode?
Think about reporting. Are the data you're capturing able to answer the questions people will ask of the DW? If not, how can you capture data that can?
Think about presentation. Should you be showing customers the same data you're using for internal reporting? Does finance need to see a different slice of the data than marketing?
This really only scratches the surface of the issues that come up on a major DW project. I would refer you to Ralph Kimball's assorted books on data warehousing for a more in-depth discussion of problems and solutions. Hope this helps you get started.
You give the answer in your question.
"According to my knowledge this is about cleansing and scrubbing data."
And you are correct. Cleansing data means that you have a company-wide list of clean element attributes, and a mapping that changes the unclean elements into clean elements.
Processing the data against the clean element attributes is a piece of cake compared to creating the company-wide list of clean element attributes.
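To illustrate the mechanical half, here is a minimal sketch; the clean list and the mapping are illustrative stand-ins for the company-wide versions:

```ruby
# The agreed company-wide list of clean values, plus a mapping from
# the unclean variants each source system produces.
CLEAN_STATUSES = %w[active inactive closed].freeze

STATUS_MAP = {
  'A' => 'active',   'ACT'     => 'active',
  'I' => 'inactive', 'dormant' => 'inactive',
  'X' => 'closed',   'CLSD'    => 'closed'
}.freeze

def cleanse_status(raw)
  clean = STATUS_MAP[raw.to_s.strip] || raw.to_s.strip.downcase
  return clean if CLEAN_STATUSES.include?(clean)

  # Unmapped values go to a reject file for a human to classify.
  raise ArgumentError, "unmapped status: #{raw.inspect}"
end

puts cleanse_status('ACT')     # => active
puts cleanse_status(' Active') # => active
```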
You have to get people from different departments to agree on what data to warehouse, and to agree on what each element means. This is a difficult sociological problem. It's not a terribly hard technical problem.
Good luck getting your company-wide list of clean element attributes.