Is it safe to execute `DROP DATABASE mykeyspace CASCADE` in DSE Hive? - datastax-enterprise

Is it safe to execute DROP DATABASE mykeyspace CASCADE in DSE Hive? Will it remove the Cassandra data, or can I be sure that only the Hive metadata is removed? When I tried executing this, a snapshot of the table was created around the same time, which makes me worried.
This question is somewhat related to Can't connect to CFS node.

Yes, it appears safe: I tried dropping a table, and the equivalent table in Cassandra is still there.
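A minimal way to double-check this on a disposable keyspace (a sketch; the keyspace name comes from the question, and the cqlsh step is simply the assumed way to verify):

    -- Hive shell: removes the Hive metastore entries for the database
    -- and all of its tables. Per the answer above, the underlying
    -- Cassandra column families survive this.
    DROP DATABASE mykeyspace CASCADE;

    -- Then, from cqlsh (not Hive), confirm the data is still there:
    -- DESCRIBE KEYSPACE mykeyspace;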

Related

mnesia files damaged need to forensically dump everything

I have damaged my Mnesia database beyond repair as a result of underestimating the fragility of the implementation. When I use the Mnesia API, the records I need are not visible, even though their keys are visible in the file. And even though the documentation indicates that Mnesia artifacts are DETS files, they cannot be opened with, or identified as, DETS files. PS: dump_to_textfile() does not work either.
Eventually I was able to dump my DB. It did not end my Mnesia problems, but it gave me options I did not have before.
SETUP:
Originally I had implemented a master-master Mnesia cluster (read the docs). It turns out that not even the most seasoned Erlang programmers use Mnesia replication, as there are too many flaws; I got this information from the Erlang inner circle and a few L1 teams too. In my case, however, the work was already in production. And that's when the problems started.
We started getting DB consistency errors and, my favorite, network/DB partition errors. Recovering from these takes a highly skilled and knowledgeable individual, plus a lot of planning and code written in advance, none of which I had.
Ultimately I took two steps. (a) I removed the second app, so that even though the DB was still in a master-master cluster, one node was effectively a slave because it was never used as a master. (b) In a second implementation I split the cluster so that the app ran on a single node with a single DB. Setup (a) was in production and (b) was the warm standby; replication was manual, as writes were very rare.
In the single-node deployment there are actually two Erlang nodes. The first is the application node, app#ks; on the same hardware runs an "erl" admin node, used when I needed to rpc into the app and see how things were going.
MY SOLUTION:
When I posted this question I was trying to dump the contents of my Mnesia DB. I was having a number of problems because I was trying to access the DB from the admin node while the application node was operational.
Because I was accessing the mnesia library from the erl node, the DB was not LOCAL to the erl node, and so dump_to_textfile produced an empty file. I eventually had success when I used rpc to tell the app#ks node to dump its own tables.
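A minimal sketch of that rpc call, assuming the application node is reachable as 'app@ks' and shares the admin node's cookie:

    %% Run from the admin ("erl") node. Note that the dump file path is
    %% local to the application node, not to the admin node.
    Node = 'app@ks',
    pong = net_adm:ping(Node),
    ok = rpc:call(Node, mnesia, dump_to_textfile, ["/tmp/mnesia_dump.txt"]).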
STILL UNDEFINED
When I launched the admin node I set the mnesia dir parameter to the same folder as the app#ks node. I have a vague memory that this is undesirable.
There are many more Mnesia issues to solve, but none related to the problem I reported. I still do not know, however, how to extract the raw data from the various DB files.

neo4j broken/corrupted after ungraceful shutdown

I'm using Neo4j on Windows for testing purposes, working with a DB containing ~2 million relationships and about the same number of nodes. After an ungraceful shutdown of Neo4j while writing a batch of relationships, the DB got corrupted.
It seems there are some broken nodes/relationships in the DB, and whenever I try to read them I get this error (I'm using py2neo):
Error: NodeImpl#1292315 not found. This can be because someone else deleted this entity while we were trying to read properties from it, or because of concurrent modification of other properties on this entity. The problem should be temporary.
I tried rebooting, but Neo4j fails to recover from this error. I found this question:
Neo4j cannot read certain nodes. Throws NotFoundException. Corrupt database
but the answer there is no good for me, because it involves going over the DB and redoing the indexing, and I can't even read those broken nodes/relationships, so I can't fix their index (I tried it and got the same error).
In general I've had many stability issues with Neo4j (on multiple platforms, not just Windows). If no decent solution is found for this problem, I will have to switch to a different database.
Thanks in advance!
I wrote a tool a while ago that allows you to copy a broken store and keeps the good records intact.
You might want to check it out. I assume you used the 2.1.x version of Neo4j.
https://github.com/jexp/store-utils/tree/21
For 2.0.x check out:
https://github.com/jexp/store-utils/tree/20
To verify whether your datastore is consistent, follow the steps mentioned in http://www.markhneedham.com/blog/2014/01/22/neo4j-backup-store-copy-and-consistency-check/.
Are you referring to the batch inserter API when you say "while writing a batch of relations"?
If so, be aware that the batch inserter API requires a clean shutdown; see the big fat red warning on http://docs.neo4j.org/chunked/stable/batchinsert.html.
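A minimal sketch of the clean-shutdown pattern (assuming Neo4j 2.x and its org.neo4j.unsafe.batchinsert API; the store path and property are illustrative):

    import java.util.Collections;
    import org.neo4j.unsafe.batchinsert.BatchInserter;
    import org.neo4j.unsafe.batchinsert.BatchInserters;

    public class BatchLoad {
        public static void main(String[] args) {
            BatchInserter inserter = BatchInserters.inserter("data/graph.db");
            try {
                // ... create nodes and relationships here ...
                inserter.createNode(Collections.<String, Object>singletonMap("name", "example"));
            } finally {
                // Mandatory: flushes and closes the store files. Killing the
                // process before this runs is exactly what corrupts the store.
                inserter.shutdown();
            }
        }
    }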
Are the broken nodes schema-indexed, and are you attempting to read them via this indexed label/property? If so, it's possible you have a broken index following the sudden shutdown.
Assuming this is the case, you could try deleting the schema subdirectory within the graph store directory while the server is not running and let the database rebuild the index on restart. While this isn't an official way to recover from a broken index, it can sometimes work. Obviously, I suggest you back up your store before trying this.
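A sketch of that procedure (assuming a default store layout under data/graph.db; stop the server first and keep the backup):

    # with the Neo4j server stopped
    cp -r data/graph.db data/graph.db.backup
    rm -rf data/graph.db/schema    # schema indexes are rebuilt on the next startup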

Temporarily stop mnesia replication

I have an Erlang application that uses Mnesia to store some basic state that defines the users and roles of our system. We have a new feature to roll out that requires extending the record schema stored in one of our Mnesia tables.
Our deployment plan was to take one node out of the cluster (just by removing it from the network), deploy the code, run a script to upgrade the record schema on that node, and bring it back into service. However, once I upgraded the records on this node, the change replicated to the other nodes, and certain operations began failing on those nodes because of the mismatched record schema. Obviously a BIG PROBLEM for zero-downtime deployments.
Is there a way to isolate my schema changes so that the upgrade can be run on each node as it is upgraded? Preferably for only the table being upgraded, allowing the other tables to keep replicating. However, I could live with shutting off replication between all nodes for the few minutes it takes us to deploy to all nodes.
I had this exact problem. The only way I was able to solve it was to take all nodes out of the cluster and leave only one live, upgrade that "master" node's schema and code (which can hopefully be done while live), then, for each remaining node: delete its database files, upgrade the code, and bring the node up (creating the tables with the correct new schema) and back into the cluster.
I used an escript I wrote that adds and removes nodes from a cluster to make this easier, and an Ansible playbook to orchestrate it. I really don't want to do that again any time soon.
The essential problem is that Mnesia doesn't have schema versioning, otherwise this could be done in a much better way.
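For reference, a hedged sketch of the add/remove primitives such a script can be built on (function and variable names are illustrative, not taken from the escript mentioned above):

    %% Detach a node from the cluster (run on a surviving node):
    remove_node(Node) ->
        rpc:call(Node, mnesia, stop, []),
        mnesia:del_table_copy(schema, Node).

    %% Re-attach an upgraded, empty node and give it fresh table copies:
    add_node(Node, Tables) ->
        {ok, _} = mnesia:change_config(extra_db_nodes, [Node]),
        {atomic, ok} = mnesia:change_table_copy_type(schema, Node, disc_copies),
        [{atomic, ok} = mnesia:add_table_copy(T, Node, disc_copies) || T <- Tables],
        ok.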

Remove Garbage from Firebird Database

My Firebird 2.1.3 database seems to be accumulating a lot of garbage from uncompleted transactions, which causes it to run very slowly until the garbage is removed via a database sweep or a server restart. My database is 30 GB+.
Do you have any idea what could be causing this? Could any of the new stored procedures be creating excess garbage?
Please help.
A Firebird database getting slow after a period of time is usually a sign of bad client transaction management. This can be easily checked by inspecting various transaction counters from the header page, which can be queried by running:
gstat -h <yourdatabase>
when your database becomes slow. For example, pretty much all access libraries, when running transactions in auto-commit mode (basically, when you don't bother starting explicit transactions in your client application), use COMMIT RETAINING, which blocks the OIT/OAT (oldest interesting / oldest active transaction) markers from moving forward, so old record versions cannot be garbage-collected.
Besides the gstat command-line tool, Firebird 2.1 also gives you the monitoring tables, in particular MON$TRANSACTIONS, to identify long-running transactions.
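For example, a query along these lines (a sketch; column names as documented for the Firebird 2.1 monitoring tables) lists the active transactions, oldest first, together with the client that owns them:

    SELECT t.MON$TRANSACTION_ID,
           t.MON$TIMESTAMP,
           a.MON$REMOTE_ADDRESS,
           a.MON$REMOTE_PROCESS
    FROM MON$TRANSACTIONS t
    JOIN MON$ATTACHMENTS a ON a.MON$ATTACHMENT_ID = t.MON$ATTACHMENT_ID
    WHERE t.MON$STATE = 1            -- 1 = active
    ORDER BY t.MON$TIMESTAMP;

The oldest entries in that list are the usual suspects for holding OIT/OAT back.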

Migrate Data from Neo4j to SQL

Hi, I am using Neo4j in my application, and my structure is as follows:
I am using the Embedded Graph API.
I have several databases that I point to using a pool that I maintain in my application, e.g. db1, db2, db3, ..., db100.
When I want to access a particular database, I point to it using new EmbeddedGraphDatabase("Path to db(n)").
The problem is that as the pool count increases, the RAM consumed by the application keeps increasing, until at some limit the application breaks down.
So I am thinking of migrating from Neo4j to some other database.
Additionally, only a small part of my database makes use of the graph structure.
One migration option is to write a script for it. Is there any better option?
My other question is: which database would best let me maintain my structure?
Another option I am considering is keeping part of my data in Neo4j and moving the rest to some other database.
If anything is unclear I can clarify.
Thanks in advance.
An EmbeddedGraphDatabase instance is not the equivalent of a "connection" in SQL. It's designed to run a long time (days, months). Hence starting/stopping is costly.
What is the use case for having hundreds of separate databases in the same JVM?
Your many small databases will perform poorly, as the graph DB is designed to hold the whole data model on a single host.
Do you run a single JVM per database?
You can control the amount of memory used by Neo4j by providing the correct memory-mapping properties; you can also use the GCR cache from neo4j-enterprise and control its cache-size properties.
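A sketch of the relevant settings (property names from the Neo4j 1.9/2.x embedded configuration; the sizes are placeholder values to tune per store):

    # cap the memory-mapped regions per store file, instead of letting
    # them grow with every additionally opened database
    neostore.nodestore.db.mapped_memory=25M
    neostore.relationshipstore.db.mapped_memory=50M
    neostore.propertystore.db.mapped_memory=90M
    neostore.propertystore.db.strings.mapped_memory=100M

    # GCR cache (neo4j-enterprise only), with explicit size limits
    cache_type=gcr
    node_cache_size=100M
    relationship_cache_size=100M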
I think it still makes sense to keep the graph part in Neo4j and only move the non-graphy part.
