Remote Neo4j Gremlin inconsistent results

I set up a Neo4j server, Gremlin Server and Gremlin Console. I am connecting Gremlin Server to Neo4j with SteelBridgeLabs/neo4j-gremlin-bolt. When I add several nodes and then try to fetch all nodes from Gremlin Console, I get inconsistent results: not all nodes are returned.
neo4j.properties
gremlin.graph=com.steelbridgelabs.oss.neo4j.structure.Neo4JGraph
#neo4j.graph.name=graph.db
neo4j.identifier=dummy
neo4j.url=bolt://localhost:7687
neo4j.username=neo4j
neo4j.password=pass
neo4j.readonly=false
neo4j.vertexIdProvider=com.steelbridgelabs.oss.neo4j.structure.providers.Neo4JNativeElementIdProvider
neo4j.edgeIdProvider=com.steelbridgelabs.oss.neo4j.structure.providers.Neo4JNativeElementIdProvider
This is how I add nodes, along with the results:
gremlin> g.addV('cat').property("name","sylvester")
==>v[null]
gremlin> g.addV('cat').property("name","tom")
==>v[null]
gremlin> g.addV('cat').property("name","garfield")
==>v[null]
gremlin> g.addV('mice').property("name","jerry")
==>v[null]
Neo4j Browser shows these nodes without a problem. But when I run the same query twice from Gremlin Console, I get different results, as follows:
gremlin> g.V().valueMap()
==>{name=[garfield]}
==>{name=[sylvester]}
==>{name=[tom]}
==>{name=[jerry]}
gremlin> g.V().valueMap()
==>{name=[garfield]}

The comments above apparently lead to the answer here, but I can't say that I'm clear on exactly why the fix mentioned actually resolves the problem.
For those who come across this question, I will summarize how this sort of connectivity is expected to work. Connecting from Gremlin Console to Gremlin Server with:
:remote connect tinkerpop.server conf/remote.yaml
opens a sessionless connection, in which the server manages transactions for you: each time you submit a request to the server, a transaction is opened, the traversal is executed and the transaction is closed. It also means that any mutations to the graph should be automatically committed (or rolled back on failure) at the end of each request. With that model, there should be no situation where you get stale or inconsistent data.
To build on that notion, performing a submission as:
gremlin> g.tx().rollback();g.V().valueMap()
under this (and any) connection model should explicitly start a fresh transaction and thus never produce a stale result set.
The following method of connection yields a session such that the user manages the transaction:
:remote connect tinkerpop.server conf/remote.yaml session
and therefore may yield an inconsistent state. You must explicitly commit and roll back transactions as needed, because a transaction can extend over multiple requests. In other words, expect that it will be necessary to call g.tx().rollback() before you can see the latest changes made to the graph in a different thread of execution.
The final connection option is as follows and it blends the two concepts:
:remote connect tinkerpop.server conf/remote.yaml session-managed
You get a session in the sense that variables are preserved between requests, but each request represents a single transaction which is committed or rolled back at the end of that request. As with a sessionless connection, you should not expect an inconsistent state or stale data and, as mentioned earlier, calling g.tx().rollback() prior to the query should force the start of a new transaction even if the managed transaction somehow failed to behave as expected.
If these things are not working as described here, I would likely wonder about the graph provider itself and whether or not its transaction semantics are completely compliant with the TinkerPop model.
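To make the difference between the connection modes concrete, here is a toy model in Python. It is not TinkerPop or Neo4j code: ToyGraph, ToyConnection and their methods are hypothetical names, and the sketch assumes a provider whose open transaction keeps reading from the snapshot taken when that transaction began.

```python
class ToyGraph:
    """Committed state shared by all connections (hypothetical stand-in for the graph)."""

    def __init__(self):
        self.committed = []


class ToyConnection:
    """One client connection; managed=True mimics sessionless or session-managed mode."""

    def __init__(self, graph, managed):
        self.graph = graph
        self.managed = managed
        self.snapshot = None  # data visible to the currently open transaction
        self.pending = []     # uncommitted additions made in this transaction

    def _begin(self):
        # Open a transaction only if none is open yet.
        if self.snapshot is None:
            self.snapshot = list(self.graph.committed)

    def commit(self):
        self.graph.committed.extend(self.pending)
        self.pending = []
        self.snapshot = None

    def rollback(self):
        self.pending = []
        self.snapshot = None

    def add_vertex(self, name):
        self._begin()
        self.pending.append(name)
        if self.managed:
            self.commit()  # the server closes the transaction for us

    def read_all(self):
        self._begin()
        result = self.snapshot + self.pending
        if self.managed:
            self.commit()
        return result


graph = ToyGraph()
writer = ToyConnection(graph, managed=True)    # like a sessionless connection
session = ToyConnection(graph, managed=False)  # like "session": user manages transactions

session.read_all()          # opens a transaction that stays open
writer.add_vertex("tom")    # committed through another connection
stale = session.read_all()  # still reads the old snapshot
session.rollback()          # discard the old transaction
fresh = session.read_all()  # a new transaction sees the committed vertex
```

In this model the unmanaged connection keeps returning its old snapshot until rollback() is called, which mirrors why g.tx().rollback() before the query is suggested above.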

Related

Use other database connection and execute query

In our app, we need to switch to read replica database and read from it for some read-only APIs.
We decided to use the around_action filter for that:
Switch DB to read_replica before the action
Yield
Switch back to master.
We decided to use establish_connection for switching, which did the job, but later we noticed that it is not thread-safe, i.e. it causes our other threads to hit the "#<ActiveRecord::ConnectionNotEstablished: No connection pool with 'primary' found.>" error. So this solution would only have worked on single-threaded servers.
Later we tried to create a new connection pool, as shown below, which is thread-safe:
databases = Rails.configuration.database_configuration
resolver = ActiveRecord::ConnectionAdapters::ConnectionSpecification::Resolver.new(databases)
spec = resolver.spec(:read_replica)
pool = ActiveRecord::ConnectionAdapters::ConnectionPool.new(spec)
pool.with_connection do |conn|
  # Execute the SQL query here, e.g. conn.execute(sql_query)
end
The only problem with the above approach is that we can only execute queries using the execute method, like conn.execute(sql_query). Any AR ORM query we execute inside this with_connection block runs on the original DB and not on read_replica.
It seems that ActiveRecord has its own default connection and uses it when we run AR ORM queries.
We are not sure how to execute an AR ORM query such as User.where(id: 1..10) inside the with_connection block.
Please note:
I am aware that we can do this natively in Rails 6, but we need to skip that for now.
I am also aware of the Octopus gem, but again we need to skip that.
Appreciate any help, Thanks.

Unable to connect to Neo4j from c# driver Session to fabric database

Using Neo4j.Driver (4.1.0), I am unable to connect a session to the server's configured fabric database. It works fine in the Neo4j Browser. Is there a trick to setting the context to a fabric database?
This times out:
var session = driver.AsyncSession(o => o.WithDatabase("fabric"));
Actual database names work fine.
Does the C# driver not support setting the session context to a fabric database?
I'm trying to execute something like the following:
use fabric.graph(0)
match ...
set...
I found a workaround by co-opting a sub-query as follows, but it seems that setting the session context would make more sense.
use fabric
call {
use fabric.graph(0)
match ...
set ...
return 0
}
return 0
I've not yet worked with fabric. But I have worked with clusters. You can only add nodes/edges to the one Neo4j database that has a WRITE role. To do this you need a small function to query the routing table and determine the write database. Here's the key query:
CALL dbms.cluster.routing.getRoutingTable({}) YIELD ttl, servers
UNWIND servers AS server
WITH server WHERE server.role = 'WRITE'
RETURN server.addresses
You then address your write query to that specific database.
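The "small function" mentioned above could look roughly like the following Python sketch. It only filters an already-fetched routing table; the sample data and the function name write_addresses are made up for illustration, and the shape of each entry (a role plus a list of addresses) is assumed from the query above.

```python
def write_addresses(servers):
    """Return the addresses of every routing-table entry whose role is WRITE."""
    addresses = []
    for server in servers:
        if server.get("role") == "WRITE":
            addresses.extend(server.get("addresses", []))
    return addresses


# Made-up sample shaped like the 'servers' value yielded by the procedure.
sample_servers = [
    {"role": "WRITE", "addresses": ["core1.example:7687"]},
    {"role": "READ", "addresses": ["core2.example:7687", "core3.example:7687"]},
    {"role": "ROUTE", "addresses": ["core1.example:7687", "core2.example:7687"]},
]

writers = write_addresses(sample_servers)
print(writers)
```

The write query would then be sent to one of the returned addresses.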

Deadlock on concurrent update, but I can see no concurrency

What could trigger a deadlock-message on Firebird when there is only a single transaction writing to the DB?
I am building a web app with a backend written in Delphi 2010 on top of a Firebird 2.1 database. I am getting a concurrent-update error that I cannot make sense of. Maybe someone can help me debug the issue or explain scenarios that may lead to the message.
I am trying an UPDATE to a single field on a single record.
UPDATE USERS SET passwdhash=? WHERE (RECID=?)
The message I am seeing is the standard:
deadlock
update conflicts with concurrent update
concurrent transaction number is 659718
deadlock
Error Code: 16
I understand what it tells me but I do not understand why I am seeing it here as there are no concurrent updates I know of.
Here is what I did to investigate.
I started my application server and checked the result of this query:
SELECT
A.MON$ATTACHMENT_ID,
A.MON$USER,
A.MON$REMOTE_ADDRESS,
A.MON$REMOTE_PROCESS,
T.MON$STATE,
T.MON$TIMESTAMP,
T.MON$TOP_TRANSACTION,
T.MON$OLDEST_TRANSACTION,
T.MON$OLDEST_ACTIVE,
T.MON$ISOLATION_MODE
FROM MON$ATTACHMENTS A
LEFT OUTER JOIN MON$TRANSACTIONS T
ON (T.MON$ATTACHMENT_ID = A.MON$ATTACHMENT_ID)
The result shows a number of connections, but only one of them has non-NULL values in the MON$TRANSACTIONS fields. This connection is the one I am using from IBExpert to query the monitoring tables.
Am I right to think that connections with no active transaction can be disregarded as not contributing to a deadlock situation?
Next I put a breakpoint on the line submitting the UPDATE-Statement in my application server and executed the request that triggers it. When the breakpoint stopped the application I then reran the Monitor-query above.
This time I could see another active transaction, just as I would expect.
Then I let my app server execute the UPDATE and got the error message shown above.
What can trigger the deadlock-message when there is only one writing transaction? Or are there more and I am misinterpreting the output? Any other suggestions on how to debug this?
Firebird uses MVCC (Multiversion Concurrency Control) for its transaction model. One of its features is that, depending on the transaction isolation, you will only see the record versions that were committed when your transaction started (consistency and concurrency isolation levels) or when your statement started (read committed). A change to a record creates a new version of the record, which only becomes visible to other active transactions once it has been committed (and then only to read committed transactions).
As a basic rule, there can only be one uncommitted version of a record. So attempts by two transactions to update the same record will fail for one of those transactions. For historical reasons, these types of errors are grouped under the deadlock error family, even though this is not actually a deadlock in the normal concurrency vernacular.
This rule is actually a bit more restrictive depending on your transaction isolation: at the consistency and concurrency levels there can also be no newer committed versions of a record that are not visible to your transaction.
My guess is that for you something like this happened:
1. Transaction 1 started
2. Transaction 2 started with concurrency or consistency isolation
3. Transaction 1 modifies record (new version created)
4. Transaction 1 commits
5. Transaction 2 attempts to modify same record
(Note: steps 1 to 3 could occur in a different order, e.g. 1, 3, 2 or 2, 1, 3.)
Step 5 fails, because the new version created in step 3 is not visible to transaction 2. If instead read committed had been used then step 5 would succeed as the new version would be visible to the transaction at that point.
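The scenario above can be replayed with a toy model. This Python sketch is not Firebird code: ToyRecord, ToyTransaction and run_scenario are hypothetical names, and the model covers only the "newer committed version that is not visible" rule, not the one-uncommitted-version rule.

```python
class UpdateConflict(Exception):
    """Stands in for 'update conflicts with concurrent update'."""


class ToyRecord:
    """One record plus a counter of committed versions (hypothetical model)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class ToyTransaction:
    def __init__(self, record, isolation):
        self.record = record
        self.isolation = isolation           # "concurrency" or "read_committed"
        self.start_version = record.version  # newest version committed at start
        self.pending = None

    def update(self, value):
        invisible = self.record.version > self.start_version
        if invisible and self.isolation == "concurrency":
            # A newer committed version exists that this snapshot cannot see.
            raise UpdateConflict("update conflicts with concurrent update")
        self.pending = value

    def commit(self):
        if self.pending is not None:
            self.record.value = self.pending
            self.record.version += 1


def run_scenario(isolation_of_tx2):
    record = ToyRecord("old-hash")
    tx1 = ToyTransaction(record, "read_committed")  # transaction 1 started
    tx2 = ToyTransaction(record, isolation_of_tx2)  # transaction 2 started
    tx1.update("hash-from-tx1")                     # transaction 1 modifies record
    tx1.commit()                                    # transaction 1 commits
    try:
        tx2.update("hash-from-tx2")                 # transaction 2 modifies same record
    except UpdateConflict:
        return "conflict"
    tx2.commit()
    return record.value


print(run_scenario("concurrency"))
print(run_scenario("read_committed"))
```

In this model the concurrency-isolated transaction hits the conflict even though no other transaction is active at the moment of its update, which matches the situation described in the question.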

The cypher PROFILE keyword asks for a transaction even if there is already one

I am trying to profile the following query on the neo4j server console (community edition, version 1.9.2):
PROFILE START ungrouped=node(1)
CREATE (grouped{__type__:'my.package.Grouped'})<-[:HAS_NEXT]-(ungrouped)
MATCH (ungrouped)-[:LEAF]->(leaf)
WITH leaf.`custom-GROUP` as groupValue, grouped, leaf
CREATE UNIQUE (grouped)-[:GROUP]->({__type__:'my.package.Group',groupKey:'GROUP',groupValue:groupValue,groupOrigin:ID(ungrouped)})-[:LEAF]->(leaf)
RETURN DISTINCT grouped;
When I run the above query, I get the message
==> I need a transaction!
Ok, so I created one with
BEGIN TRANSACTION
==> Transaction started
Afterwards I run the same query again. But unfortunately I get the same message again:
==> I need a transaction!
But there definitely is a transaction. When I type
ROLLBACK
the transaction is successfully rolled back:
==> Transaction rolled back
Am I doing something wrong? Does profiling not work for this kind of query by design? Or is this simply a bug in Neo4j?

Single user database connection best practices

With a single-user MS Access database, is it good practice, or at least acceptable, to maintain a persistent connection throughout?
Pseudocode:
app.start();
access.connect();
domanymanystuff();
access.disconnect();
app.exit();
--- OR ----
app.start();
access.connect();
doonetask();
access.disconnect();
...
access.connect();
doanothertask();
access.disconnect();
...
app.exit();
?
Honestly, it won't matter, since most data connections are pooled and will hang around for reuse after you have closed them. You do want to make sure that your transactions are performed in a "per unit of work" fashion.
Otherwise, even with a single-user DB, you could find your application locking itself out.
So, try this:
Open connection
Start transaction
Perform unit of work
Commit transaction
...
Start transaction
Perform unit of work
Commit transaction
...
Start transaction
Perform unit of work
Commit transaction
...
Close connection
You can maintain a persistent connection throughout with a single-user database.
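As a rough illustration of the pattern above, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for Access (an assumption made only so the example is runnable), and run_unit_of_work is a hypothetical helper name.

```python
import sqlite3


def run_unit_of_work(conn, statements):
    """Run one unit of work; commit on success, roll back on any error."""
    with conn:  # sqlite3 commits on normal exit and rolls back on an exception
        for sql, params in statements:
            conn.execute(sql, params)


conn = sqlite3.connect(":memory:")  # open the connection once

run_unit_of_work(conn, [("CREATE TABLE tasks (name TEXT NOT NULL)", ())])
run_unit_of_work(conn, [("INSERT INTO tasks (name) VALUES (?)", ("first",))])

try:
    # The second statement violates NOT NULL, so the whole unit is rolled back.
    run_unit_of_work(conn, [
        ("INSERT INTO tasks (name) VALUES (?)", ("second",)),
        ("INSERT INTO tasks (name) VALUES (?)", (None,)),
    ])
except sqlite3.IntegrityError:
    pass

names = [row[0] for row in conn.execute("SELECT name FROM tasks ORDER BY name")]
print(names)

conn.close()  # close the connection when the application exits
```

The connection stays open for the whole run, while each unit of work gets its own short transaction, so a failure in one unit does not leave a transaction hanging.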
