Neo4j auto-index item corrupted or orphaned

I somehow deleted a node without the auto-index entry being updated. Now when I query for the missing node using the auto-index:
node:index:node_auto_index:uname:test
I get the following error:
Node[19220] not found. This can be because someone else deleted this entity while we were trying to read properties from it, or because of concurrent modification of other properties on this entity. The problem should be temporary.
How do I clean up the index to flush the orphaned entry, and how do I prevent this from happening in the production database?

FWIW, this issue has been fixed: entities are now automatically removed from the auto index when they are deleted. The fix will be included in 1.9.

Related

Delete, Create and Add Nodes to Index Neo4j

A quick question.
In a single transaction, can I do the following?
Delete the index, say indexMaster, if it already exists
Create the index indexMaster again
Add nodes to the index indexMaster
When I did the above, I got this exception:
This index (Index[indexMaster,Node]) has been marked as deleted in this transaction
The exception occurs on the line where I add nodes to the index.
EDIT:
I am using Neo4j 2.0.4.
The code uses the Java API, not the REST API.
Any ideas?
Thanks
Not 100% sure here but I guess it is not possible to delete and recreate the same index in the same transaction. Try to use two transactions, one for deleting the index, the other for creating it.

ServerPlugins in Neo4j 2.0.0-M03: Where to create schema index

I'm going to check out the new automatic indexing capabilities that come with Neo4j 2.0. They are described here: http://docs.neo4j.org/chunked/2.0.0-M03/tutorials-java-embedded-new-index.html
Now the automatic index must be created at some point. The old way to get an index was just "indexManager.forNodes()", which returned the index if it existed and created it if not. With automatic indexing, we just have to create the index once via "schema.indexFor()..." and then be done with it.
My question is, where do I best put the index creation? In the documentation example, they have a main method. But I'm working with a ServerPlugin. I'd like to create the indexes once at startup, if they do not already exist. But where can I do this? And how do I check whether the index already exists? I can get all IndexDefinitions for a label. But since an IndexDefinition may depend on a label and on an arbitrary property, I would have to iterate through all IndexDefinitions for a specific label and check whether the one with the correct property exists.
I could of course simply do what I just wrote, but it seems a bit cumbersome compared to the old index handling which would check automatically whether the requested index exists and create it, if not. So I'm wondering if I simply missed some key points with the handling of the new indices.
Thank you!
I got a response from a Neo4j dev here: http://docs.neo4j.org/chunked/2.0.0-M03/tutorials-java-embedded-new-index.html
He proposes to create the automatic indexes in a neo4j start script, for instance. I also saw that someone already wished for unique indexes (would be a great feature!). That would simplify the index creation but in the end this is now a part of the database setup, it seems.

elasticsearch index transactions

I am exploring Elasticsearch and comparing it with our current search solution. In my use case, every time I build the index I have to drop the current index and create a new one with the same name, so that all the old docs are dropped with the old index and the new index holds only fresh data. The indexing process takes a couple of minutes to finish.
My question is what happens to search requests that come in during this time. Does Elasticsearch use transactions, committing all the changes (dropping the index and creating the new index with the new documents) in a single transaction?
What happens if I deleted the index, and an error occurs during the middle of the indexing?
If there are no transactions, is there any workaround to this situation?
Elasticsearch doesn't support transactions. When you delete an index, you delete an index. Until you create a new index users will be getting IndexMissingException exceptions. Once the new index is created they will see only records that were indexed and refreshed (by default refresh occurs every second).
One way to hide this from users is by using aliases. You can create an alias that will point to an index. When you need to reindex your data, you can create a new index, index new data there, switch the alias to the new index and delete the old index.
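As a rough sketch of the alias switch described above: Elasticsearch's POST /_aliases endpoint accepts a list of actions and applies them atomically, so the remove and the add can go in one request. The snippet below only builds the request body; sending it over HTTP is left out. The alias and index names (products, products_v1, products_v2) and the helper name alias_swap_body are made up for illustration.

```ruby
require "json"

# Build the body for POST /_aliases that moves an alias from the old index
# to the new one. Both actions travel in a single request, so searches
# against the alias never hit a moment where it points at nothing.
# (alias_swap_body is a hypothetical helper, not part of any client library.)
def alias_swap_body(alias_name, old_index, new_index)
  {
    "actions" => [
      { "remove" => { "index" => old_index, "alias" => alias_name } },
      { "add"    => { "index" => new_index, "alias" => alias_name } }
    ]
  }
end

# Hypothetical names: clients always search "products"; the real indexes
# carry a version suffix.
body = alias_swap_body("products", "products_v1", "products_v2")
puts JSON.pretty_generate(body)
```

The old index (products_v1 in this sketch) would be deleted only after the switch has succeeded, so a failed reindex leaves the previous data searchable.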

How to reindex only some objects in Sunspot Solr

We use Sunspot Solr for indexing and searching in our Ruby on Rails application.
We wanted to reindex some objects and someone accidentally ran the Product.reindex command from the Rails Console. The result was that indexing of all products started from scratch and our catalogue appeared empty while indexing was taking place.
Since we have a vast amount of data, the reindexing has taken three days so far. This morning when I checked on the progress, it seemed that one corrupt data entry had caused the reindexing to stop without completing.
I cannot restart the entire Product.reindex operation as it takes way too long. Is there a way to run reindexing only on selected products? I want to select a range of products that aren't indexed and then run indexing just on those. How can I add a single product to the index without having to run a complete reindex of the entire data set?
Sunspot indexes an object in the save callback, so you could save each object, but that might trigger other callbacks too. A more precise way to do it would be:
Sunspot.index [post1, post2]
Sunspot.commit
or with autocommit
Sunspot.index! [post1, post2]
You could even pass in object relations as they are just an array too
Sunspot.index! post1.comments
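For a large set of unindexed products, one possible way to apply the calls above is to index in small batches and fall back to single records when a batch fails, so that one corrupt entry does not stop the run. This is only a sketch: index_in_batches is a hypothetical helper, and the indexer is passed in as a lambda so the example runs without Solr. With Sunspot it would presumably be something like ->(batch) { Sunspot.index(batch) }, followed by a single Sunspot.commit at the end.

```ruby
# Index records in small batches. If a batch raises, retry its records one
# at a time and collect the ones that still fail instead of aborting.
# Returns the list of records that could not be indexed.
# (index_in_batches is a hypothetical helper, not part of Sunspot.)
def index_in_batches(records, indexer, batch_size: 100)
  failed = []
  records.each_slice(batch_size) do |batch|
    begin
      indexer.call(batch)
    rescue StandardError
      batch.each do |record|
        begin
          indexer.call([record])
        rescue StandardError
          failed << record
        end
      end
    end
  end
  failed
end

# Stand-in indexer for demonstration only: it rejects the symbol :corrupt.
indexed = []
stub_indexer = lambda do |batch|
  raise ArgumentError, "bad record" if batch.include?(:corrupt)
  indexed.concat(batch)
end

failed = index_in_batches([:a, :b, :corrupt, :c, :d], stub_indexer, batch_size: 2)
puts "indexed: #{indexed.inspect}, failed: #{failed.inspect}"
```

The returned list of failures can then be inspected and fixed by hand, without repeating the work for records that indexed cleanly.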
I have found the answer on https://github.com/sunspot/sunspot#reindexing-objects
Whenever an object is saved, it is automatically reindexed as part of the save callbacks. So all that was needed was to add all the objects that needed reindexing to an array and then loop through the array, calling save on each object. This successfully updated the required objects in the index.

Rails: Oracle constraint violation

I'm doing maintenance work on a Rails site that I inherited; it's driven by an Oracle database, and I've got access to both development and production installations of the site (each with its own Oracle DB). I'm running into an Oracle error when trying to insert data on the production site, but not the dev site:
ActiveRecord::StatementInvalid (OCIError: ORA-00001: unique constraint (DATABASE_NAME.PK_REGISTRATION_OWNERSHIP) violated: INSERT INTO registration_ownerships (updated_at, company_ownership_id, created_by, updated_by, registration_id, created_at) VALUES ('2006-05-04 16:30:47', 3, NULL, NULL, 2920, '2006-05-04 16:30:47')):
/usr/local/lib/ruby/gems/1.8/gems/activerecord-oracle-adapter-1.0.0.9250/lib/active_record/connection_adapters/oracle_adapter.rb:221:in `execute'
app/controllers/vendors_controller.rb:94:in `create'
As far as I can tell (I'm using Navicat as an Oracle client), the DB schema for the dev site is identical to that of the live site. I'm not an Oracle expert; can anyone shed light on why I'd be getting the error in one installation and not the other?
Incidentally, both dev and production registration_ownerships tables are populated with lots of data, including duplicate entries for company_ownership_id (driven by index PK_REGISTRATION_OWNERSHIP). Please let me know if you need more information to troubleshoot. I'm sorry I haven't given more already, but I just wasn't sure which details would be helpful.
UPDATE: I've tried dropping the constraint on the production server but it had no effect; I didn't want to drop the index as well because I'm not sure what the consequences might be and I don't want to make production less stable than it already is.
Curiously, I tried executing by hand the SQL that was throwing an error, and Oracle accepted the insert statement (though I had to wrap the dates in to_date() calls with string literals to get around an "ORA-01861: literal does not match format string" error). What might be going on here?
Based on the name of the constraint, PK_REGISTRATION_OWNERSHIP, you have a primary key violation. If these databases aren't maintaining this data in lockstep, something or someone has already inserted a record into the registration_ownerships table in your production database with company_ownership_id=3 & registration_id=2920. (I'm guessing at the specifics based on the names.)
If this particular set of values needs to exist in the production database:
1) Check that what's already there isn't what you're trying to insert. If it is, you're done.
2) If you need to insert your sample data as-is, you need to modify the existing data and re-insert it (and all the dependent/referring records); then you can insert your values.
If you query the table and find no matching rows, then one of the following may be the cause:
The session is trying to insert the row twice.
Another session has inserted the row, but hasn't committed yet.
Also, check that the state of the unique constraint is the same between dev and prod. Perhaps the one on dev is marked as not validated - check that the index exists on dev and is a unique index (note: in Oracle it is possible to have a unique constraint validated by a non-unique index).
Take a hard look at the underlying unique index for the constraint. The reason dropping the constraint doesn't change anything is because the index remains, and it's a unique index. What does the following tell you about the indexes in both environments? Are both indexes valid? Are both defined the same? Are they both actually unique?
SELECT ai.table_name, ai.index_name, ai.uniqueness, aic.column_name, ai.status
FROM all_constraints ac JOIN all_indexes ai ON (ac.index_name = ai.index_name)
JOIN all_ind_columns aic ON (ai.index_name = aic.index_name)
WHERE ac.owner = 'YOUR_USER'
AND ac.constraint_name = 'PK_REGISTRATION_OWNERSHIP'
ORDER BY ai.index_name, aic.column_position;
As it happens, there was a spare copy of the "registrations" model lying around the directory; even though it had a different name ("registrations_2349871.rb" or something like that) Rails was running all model functionality (saving, validating, etc) twice, hence the key constraint violation! I've never seen behavior like this before. Deleting the rogue file solved the problem.

Resources