Delete, Create and Add Nodes to Index in Neo4j

A quick question.
In a single transaction, can I do the following?
Delete the index, say indexMaster, if it already exists
Create the index indexMaster again
Add nodes to the index indexMaster
When I did the above, I got an exception:
This index (Index[indexMaster,Node]) has been marked as deleted in this transaction
The exception is thrown on the line where I add nodes to the index.
EDITED:
I am using Neo4j 2.0.4
The code uses the Java API, not the REST API
Any ideas?
Thanks

Not 100% sure here, but I guess it is not possible to delete and recreate the same index in the same transaction. Try using two transactions: one for deleting the index, the other for creating it and adding the nodes.
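A minimal sketch of that two-transaction approach against the embedded Java API, assuming an already started GraphDatabaseService named graphDb; the created node and its "name" property are only illustrative:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.index.Index;

public class RebuildIndexMaster {
    public static void rebuild(GraphDatabaseService graphDb) {
        // Transaction 1: drop the legacy index if it already exists.
        try (Transaction tx = graphDb.beginTx()) {
            if (graphDb.index().existsForNodes("indexMaster")) {
                graphDb.index().forNodes("indexMaster").delete();
            }
            tx.success();
        }
        // Transaction 2: recreate the index and add nodes to it.
        try (Transaction tx = graphDb.beginTx()) {
            Index<Node> indexMaster = graphDb.index().forNodes("indexMaster");
            Node node = graphDb.createNode();
            node.setProperty("name", "example"); // illustrative property
            indexMaster.add(node, "name", "example");
            tx.success();
        }
    }
}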

Related

Merge two nodes on session.save (unique nodes)

I'm trying to map some JSON objects to Java objects and then save these objects to my Neo4j db.
I have tried to use plain neo4j-ogm and run session.save(object), but if some nodes already exist they are duplicated instead of being merged.
If I create a unique constraint on the value, then I get an exception when I try to run session.save(object) if the nodes already exist.
I would like to know if there is a solution using neo4j-ogm, or whether I need to add Spring Data Neo4j (SDN) to solve this problem.
As of Neo4j OGM 2.1.0, you can use @Index for this.
Annotate your field with @Index(unique=true, primary=true) and session.save will use a MERGE instead of a CREATE.
See http://neo4j.com/docs/ogm-manual/current/reference/#reference_programming-model_indexing in the docs
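A minimal sketch of an entity using those annotations; the Person class and its name field are made up for illustration:

import org.neo4j.ogm.annotation.GraphId;
import org.neo4j.ogm.annotation.Index;
import org.neo4j.ogm.annotation.NodeEntity;

@NodeEntity
public class Person {
    @GraphId
    private Long id;

    // With unique = true and primary = true (OGM 2.1+), session.save(person)
    // merges on this property instead of always creating a new node.
    @Index(unique = true, primary = true)
    private String name;
}

With that in place, saving two objects with the same name should update the existing node rather than duplicate it.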

ServerPlugins in Neo4j 2.0.0-M03: Where to create schema index

I'm going to check out the new automatic indexing capabilities that come with Neo4j 2.0. They are described here: http://docs.neo4j.org/chunked/2.0.0-M03/tutorials-java-embedded-new-index.html
Now the automatic index must be created at some point. The old way to get an index was just "indexManager.forNodes()": the index was returned if it existed and created if not. With automatic indexing, we just have to create the index once via "schema.indexFor()..." and then be done with it.
My question is, where do I best put the index creation? In the documentation example, they have a main method. But I'm working with a ServerPlugin. I'd like to create the indexes once at startup, if they do not already exist. But where can I do this? And how do I check whether the index already exists? I can get all IndexDefinitions for a label. But since an IndexDefinition may depend on a label and on an arbitrary property, I would have to iterate through all IndexDefinitions for a specific label and check whether the one with the correct property exists.
I could of course simply do what I just wrote, but it seems a bit cumbersome compared to the old index handling which would check automatically whether the requested index exists and create it, if not. So I'm wondering if I simply missed some key points with the handling of the new indices.
Thank you!
I got a response from a Neo4j dev here: http://docs.neo4j.org/chunked/2.0.0-M03/tutorials-java-embedded-new-index.html
He proposes creating the automatic indexes in a Neo4j start script, for instance. I also saw that someone has already wished for unique indexes (that would be a great feature!). That would simplify the index creation, but in the end this is now part of the database setup, it seems.
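For what it's worth, here is a sketch of the "iterate the IndexDefinitions, create if missing" approach described in the question, against the 2.0 embedded API; the label and property used here are hypothetical:

import org.neo4j.graphdb.DynamicLabel;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Label;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.schema.IndexDefinition;
import org.neo4j.graphdb.schema.Schema;

public class StartupIndexes {
    public static void ensureIndex(GraphDatabaseService graphDb) {
        Label label = DynamicLabel.label("User"); // hypothetical label
        String property = "username";             // hypothetical property
        try (Transaction tx = graphDb.beginTx()) {
            Schema schema = graphDb.schema();
            boolean exists = false;
            // Walk the IndexDefinitions for this label and look for our property.
            for (IndexDefinition definition : schema.getIndexes(label)) {
                for (String key : definition.getPropertyKeys()) {
                    if (key.equals(property)) {
                        exists = true;
                    }
                }
            }
            if (!exists) {
                schema.indexFor(label).on(property).create();
            }
            tx.success();
        }
    }
}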

Add auto increment with scope to existing column in migration-file rails

I have posts and organisations in my database. Posts belongs_to organisation and organisation has_many posts.
I have an existing post_id column in my posts table which I currently increment manually when I create a new post.
How can I add auto increment to that column scoped to the organisation_id?
Currently I use MySQL as my database, but I plan to switch to PostgreSQL, so the solution should work for both if possible :)
Thanks a lot!
@richard-huxton has the correct answer, and it is thread safe.
Use a transaction block and use SELECT FOR UPDATE inside that transaction block. Here is my Rails implementation. Use 'transaction' on a Ruby class to start a transaction block, and use 'lock' on the row you want to lock, essentially blocking all other concurrent access to that row, which is what you want to ensure a unique sequence number.
class OrderFactory
  def self.create_with_seq(order_attributes)
    order_attributes.symbolize_keys!
    raise "merchant_id required" unless order_attributes.has_key?(:merchant_id)
    merchant_id = order_attributes[:merchant_id]
    SequentialNumber.transaction do
      seq = SequentialNumber.lock.where(merchant_id: merchant_id, type: 'SequentialNumberOrder').first
      seq.number += 1
      seq.save!
      order_attributes[:sb_order_seq] = seq.number
      Order.create(order_attributes)
    end
  end
end
We run Sidekiq for background jobs, so I tested this method by creating 1000 background jobs to create orders, using 8 workers with 8 threads each. Without the lock or the transaction block, duplicate sequence numbers occurred as expected. With the lock and the transaction block, all sequence numbers appear to be unique.
OK - I'll be blunt. I can't see the value in this. If you really want it though, this is what you'll have to do.
Firstly, create a table org_max_post (org_id, post_id). Populate it when you add a new organisation (I'd use a database trigger).
Then, when adding a new post you will need to:
BEGIN a transaction
SELECT FOR UPDATE that organisation's row to lock it
Increment the post_id by one, update the row.
Use that value to create your post.
COMMIT the transaction to complete your updates and release locks.
You want all of this to happen within a single transaction of course, and with a lock on the relevant row in org_max_post. You want to make sure that a new post_id gets allocated to one and only one post, and also that if the post fails to commit you don't waste post_ids.
If you want to get clever and reduce the SQL in your application code you can do one of:
Wrap the whole lot above in a custom insert_post() function.
Insert via a view that lacks the post_id and provides it via a rule/trigger.
Add a trigger that overwrites whatever is provided in the post_id column with a correctly updated value.
Deleting a post obviously doesn't affect your org_max_post table, so won't break your numbering.
Prevent any updates to the post_id at the database level with a trigger. Check for any change between the OLD and NEW post_id and throw an exception if there is one.
Then delete your existing redundant id column in your posts table and use (org_id,post_id) as your primary key. If you're going to this trouble you might as well use it as your pkey.
Oh - and post_num or post_index is probably better than post_id since it's not an identifier.
I've no idea how much of this will play nicely with rails I'm afraid - the last time I looked at it, the database handling was ridiculously primitive.
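For reference, here is a framework-agnostic JDBC sketch of the steps above; the org_max_post table comes from this answer, while the connection handling and the method name are assumptions. SELECT ... FOR UPDATE works on both PostgreSQL and MySQL (InnoDB), which covers the planned switch.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PostNumbering {
    // Returns the next per-organisation post_id; the caller inserts the post
    // with this value inside the same transaction before it commits.
    public static long nextPostId(Connection conn, long orgId) throws SQLException {
        conn.setAutoCommit(false); // BEGIN a transaction
        try {
            long next;
            // Lock this organisation's counter row so concurrent inserts wait here.
            try (PreparedStatement lock = conn.prepareStatement(
                    "SELECT post_id FROM org_max_post WHERE org_id = ? FOR UPDATE")) {
                lock.setLong(1, orgId);
                try (ResultSet rs = lock.executeQuery()) {
                    rs.next();
                    next = rs.getLong(1) + 1;
                }
            }
            try (PreparedStatement update = conn.prepareStatement(
                    "UPDATE org_max_post SET post_id = ? WHERE org_id = ?")) {
                update.setLong(1, next);
                update.setLong(2, orgId);
                update.executeUpdate();
            }
            // ... INSERT the post with (org_id, next) here, in the same transaction ...
            conn.commit(); // COMMIT releases the row lock
            return next;
        } catch (SQLException e) {
            conn.rollback(); // counter update is rolled back too, so no post_id is wasted
            throw e;
        }
    }
}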
It's good to know how to implement it, but I would prefer to use a gem myself.
https://github.com/austinylin/sequential (based on sequenced)
https://github.com/djreimer/sequenced
https://github.com/felipediesel/auto_increment
First, I must say this is not a good practice, but I will only focus on a solution for your problem:
You can always get the organisation's post count in your PostsController:
def create
  post = Post.new(...)
  ...
  post.post_id = Organization.find(organization_id).posts.count + 1
  post.save
  ...
end
You should not alter the database yourself. Let ActiveRecord take care of it.

elasticsearch index transactions

I am exploring Elasticsearch and comparing it with our current search solution. The use case I have is: every time I build the index, I have to drop the current index and create a new one with the same name, so that all the old docs are dropped with the old index and the new index has the fresh data. The indexing process takes a couple of minutes to finish.
My question is: what happens to the search requests coming in during this time? Does Elasticsearch use transactions and only commit all changes (dropping the index and creating the new index with the new documents) as one unit?
What happens if I delete the index and an error occurs in the middle of indexing?
If there are no transactions, is there any workaround to this situation?
Elasticsearch doesn't support transactions. When you delete an index, you delete an index. Until you create a new index, users will be getting IndexMissingException exceptions. Once the new index is created, they will see only the records that have been indexed and refreshed (by default a refresh occurs every second).
One way to hide this from users is by using aliases. You can create an alias that points to an index. When you need to reindex your data, you can create a new index, index the new data there, switch the alias to the new index, and then delete the old index.
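As a rough illustration of that alias swap with the Elasticsearch Java transport client of that era; the exact client API differs between versions, and the index and alias names are made up:

import org.elasticsearch.client.Client;

public class ReindexWithAlias {
    // Assumes `client` is already connected and the alias "posts"
    // currently points at "posts_v1".
    public static void reindex(Client client) {
        // 1. Build the new index while searches keep hitting the alias.
        client.admin().indices().prepareCreate("posts_v2").execute().actionGet();
        // ... bulk index the fresh documents into "posts_v2" here ...

        // 2. Move the alias to the new index in a single aliases request.
        client.admin().indices().prepareAliases()
                .removeAlias("posts_v1", "posts")
                .addAlias("posts_v2", "posts")
                .execute().actionGet();

        // 3. Drop the old index once the alias no longer points at it.
        client.admin().indices().prepareDelete("posts_v1").execute().actionGet();
    }
}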

neo4j auto-index item corrupted or orphaned

I somehow deleted a node without the auto index entry being updated. Now when I query for the missing node using the auto index:
node:index:node_auto_index:uname:test
I get the following error:
Node[19220] not found. This can be because someone else deleted this entity while we were trying to read properties from it, or because of concurrent modification of other properties on this entity. The problem should be temporary.
How do I clean up the index to flush the orphaned index entry, and how do I prevent this from happening in the production database?
FWIW, this issue has been fixed, so that entities will be automatically removed from the auto index when deleted. 1.9 will contain this fix.
