Cleaning up remains of DataStax Enterprise Hadoop install

We have removed the analytics datacenter, but I am seeing lots of stuff hanging around, for instance keyspaces:
select * from schema_keyspaces;
HiveMetaStore | True | org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"}
I also have references to CFSCompactionStrategy left in the log files. I want to clean up our ring completely: no weird keyspaces, and no CFSCompactionStrategy. Any ideas?
Edit, with some more info from the recommended solution:
UPDATE schema_keyspaces set strategy_options = '{"Cassandra":"2"}' where keyspace_name in ('keyspace1', 'keyspace2');
drop keyspace cfs_archive;
drop keyspace dse_security;
drop keyspace cfs;
DROP KEYSPACE "HiveMetaStore";
Then clean out the folders...
This may be needed as well:
DELETE from system.schema_columnfamilies where keyspace_name = 'cfs';
delete from system.schema_columns where keyspace_name in ('cfs', 'cfs_archive');

You can simply drop the HiveMetaStore (and cfs and cfs_archive). The keyspaces are created the first time an analytics node is started and behave exactly like standard Cassandra keyspaces.
At this point you only have the metadata for them; the data shouldn't be replicated on the other nodes unless you changed the replication strategy for those keyspaces at some point.
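
As a minimal sketch of the cleanup described above, the following Python helper only generates the DROP statements to paste into cqlsh; it does not talk to the cluster. The keyspace list is taken from this thread and is an assumption about what your ring still contains. The quoting rule reflects that CQL folds unquoted identifiers to lowercase, which is why "HiveMetaStore" needs double quotes. Reserved keywords are not handled here.

```python
import re

# Keyspaces named in this thread; adjust to what your own cluster shows.
LEFTOVER_KEYSPACES = ["HiveMetaStore", "cfs", "cfs_archive"]


def quote_identifier(name):
    """Return a CQL identifier, double-quoted when it is case-sensitive."""
    if re.fullmatch(r"[a-z][a-z0-9_]*", name):
        return name
    # Embedded double quotes are escaped by doubling them.
    return '"' + name.replace('"', '""') + '"'


def drop_statements(keyspaces):
    """Build one DROP KEYSPACE statement per leftover keyspace."""
    return ["DROP KEYSPACE {};".format(quote_identifier(ks)) for ks in keyspaces]


if __name__ == "__main__":
    for statement in drop_statements(LEFTOVER_KEYSPACES):
        print(statement)
```

Review the generated statements before running them, and clean out the corresponding data folders afterwards as noted in the question.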

ksqlDB server shuts down when config, offset and status topics are changed

I'm running a single ksqlDB Server in embedded mode on our Kubernetes cluster and I want to add a connector.
Adding a connector produces a "Request timed out" error on Kafka Connect, exactly as described in this blog post by Robin Moffatt.
He suggests changing the KAFKA_OFFSET_REPLICATION_FACTOR setting in his docker-compose example.
Unfortunately, in our test environment I don't have easy access to the existing Kafka cluster (it is managed by admins), so I think the fastest way forward is to change the following instead:
KSQL_CONNECT_CONFIG_STORAGE_TOPIC - change to a different topic name
KSQL_CONNECT_OFFSET_STORAGE_TOPIC - change to a different topic name
KSQL_CONNECT_STATUS_STORAGE_TOPIC - change to a different topic name
KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR to -1 (originally this value is 1)
KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR to -1 (originally this value is 1)
KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR to -1 (originally this value is 1)
When I change the topic names, I can see that the new topics are created (using ksqlDB's SHOW TOPICS command), but the server always shuts down and restarts in an endless loop. Here are the logs:
[2021-07-22 01:27:19,889] INFO ProcessingLogConfig values:
ksql.logging.processing.rows.include = false
ksql.logging.processing.stream.auto.create = false
ksql.logging.processing.stream.name = KSQL_PROCESSING_LOG
ksql.logging.processing.topic.auto.create = false
ksql.logging.processing.topic.name =
ksql.logging.processing.topic.partitions = 1
ksql.logging.processing.topic.replication.factor = 1
(io.confluent.ksql.logging.processing.ProcessingLogConfig:372)
[2021-07-22 01:27:19,891] ERROR Aborting application start (io.confluent.ksql.rest.server.KsqlRestApplication:378)
io.confluent.ksql.rest.server.KsqlRestApplication$AbortApplicationStartException: Shutting down application during waitForPreconditions
at io.confluent.ksql.rest.server.KsqlRestApplication.waitForPreconditions(KsqlRestApplication.java:441)
at io.confluent.ksql.rest.server.KsqlRestApplication.startKsql(KsqlRestApplication.java:386)
at io.confluent.ksql.rest.server.KsqlRestApplication.startAsync(KsqlRestApplication.java:370)
at io.confluent.ksql.rest.server.MultiExecutable.doAction(MultiExecutable.java:68)
at io.confluent.ksql.rest.server.MultiExecutable.startAsync(MultiExecutable.java:42)
at io.confluent.ksql.rest.server.KsqlServerMain.tryStartApp(KsqlServerMain.java:89)
at io.confluent.ksql.rest.server.KsqlServerMain.main(KsqlServerMain.java:64)
[2021-07-22 01:27:19,892] INFO Server up and running (io.confluent.ksql.rest.server.KsqlServerMain:90)
[2021-07-22 01:27:19,892] INFO Server shutting down (io.confluent.ksql.rest.server.KsqlServerMain:96)
[2021-07-22 01:27:19,892] INFO ksqlDB shutdown called (io.confluent.ksql.rest.server.KsqlRestApplication:498)
[2021-07-22 01:27:34,926] INFO API server stopped (io.confluent.ksql.api.server.Server:196)
[2021-07-22 01:27:34,927] INFO ksqlDB shutdown complete (io.confluent.ksql.rest.server.KsqlRestApplication:553)
I don't have any more details; that is all the log shows.
When I revert the config, offset and status topic names to what I had at first, the ksqlDB Server starts fine, but I'm again stuck with the problem that I can't create connectors.
I am afraid that if I delete the topics manually, the ksqlDB server won't be able to start properly, because it keeps looking for the original config, offset and status topics.
I have solved the issue. Apparently using -1 as the value for:
KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR
KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR
KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR
doesn’t work properly: the config topic ends up with 20 partitions,
whereas the Confluent docs say it should have only 1 partition. I think that’s why the ksqlDB Server just restarts endlessly; I still need to gather the evidence to confirm it.
Setting those values to 3 (our Kafka brokers' default replication factor) seems to have solved the issue. It was hard to track down because no error message is shown, for example one saying that the config topic must not have more than 1 partition.
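
To keep the six settings consistent, here is a small sketch in Python that builds the environment variables with an explicit replication factor. The variable names are the ones listed in the question; the topic prefix and topic suffixes are made up for illustration, and rejecting values below 1 simply mirrors the experience above that -1 caused trouble in this setup.

```python
# Hypothetical topic suffixes; pick names that suit your environment.
TOPIC_SUFFIXES = {"CONFIG": "configs", "OFFSET": "offsets", "STATUS": "status"}


def connect_storage_env(topic_prefix, replication_factor):
    """Build the KSQL_CONNECT_* storage settings as an env-var mapping."""
    if replication_factor < 1:
        # -1 (broker default) is what led to the restart loop described above.
        raise ValueError("use an explicit replication factor >= 1")
    env = {}
    for kind, suffix in TOPIC_SUFFIXES.items():
        env["KSQL_CONNECT_{}_STORAGE_TOPIC".format(kind)] = (
            "{}-{}".format(topic_prefix, suffix)
        )
        env["KSQL_CONNECT_{}_STORAGE_REPLICATION_FACTOR".format(kind)] = str(
            replication_factor
        )
    return env


if __name__ == "__main__":
    # "ksql-connect" is a placeholder prefix; 3 matches the brokers' default here.
    for name, value in sorted(connect_storage_env("ksql-connect", 3).items()):
        print("{}={}".format(name, value))
```

The printed lines can be copied into the Kubernetes deployment's env section or a docker-compose file.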

How can I create a OpenEBS cstor pool?

Setup -- OpenEBS 0.7
K8S -- 1.10, GKE
I have a 3-node cluster with 2 disks per node. Can I use these disks for cStor pool creation? How can I do that? Do I have to select the disks manually?
Yes, you can use the disks attached to the nodes to create a cStor storage pool with OpenEBS. The main prerequisite is to unmount the disks if they are in use.
With the latest OpenEBS 0.7 release, the following disk types/paths are excluded by the OpenEBS data plane component Node Disk Manager (NDM), which identifies the disks available for creating cStor pools on the nodes.
loop,/dev/fd0,/dev/sr0,/dev/ram,/dev/dm-
You can also customize the filter by adding more disk types associated with your nodes, for example used disks, unwanted disks and so on. This must be done in the 'openebs-operator-0.7.0.yaml' file that you downloaded before installation. Add the device path to the exclude list in openebs-ndm-config, under ConfigMap in the openebs-operator.yaml file, as follows.
"exclude":"loop,/dev/fd0,/dev/sr0,/dev/ram,/dev/dm-"
Example:
{
"key": "path-filter",
"name": "path filter",
"state": "true",
"include":"",
"exclude":"loop,/dev/fd0,/dev/sr0,/dev/ram,/dev/dm-"
}
So just install the openebs-operator.yaml mentioned on docs.openebs.io; after the installation it will detect the disks. Follow the instructions given in the docs. You can create a pool either by selecting the disks manually or automatically.
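
To get a feel for which disks such a filter would leave over, here is a rough Python sketch. It assumes that a device is excluded when its path contains any entry of the exclude list as a substring; that is an assumption made for illustration, not NDM's actual code, and the device names are invented.

```python
import json

# The path filter from the example above.
PATH_FILTER = json.loads("""
{
  "key": "path-filter",
  "name": "path filter",
  "state": "true",
  "include": "",
  "exclude": "loop,/dev/fd0,/dev/sr0,/dev/ram,/dev/dm-"
}
""")


def is_excluded(device_path, path_filter):
    """Assumed rule: excluded if the path contains any exclude entry."""
    if path_filter.get("state") != "true":
        return False  # filter switched off, nothing is excluded
    patterns = [p for p in path_filter.get("exclude", "").split(",") if p]
    return any(pattern in device_path for pattern in patterns)


# Hypothetical devices on one node.
devices = ["/dev/sdb", "/dev/sdc", "/dev/loop0", "/dev/dm-0", "/dev/sr0"]
candidates = [d for d in devices if not is_excluded(d, PATH_FILTER)]
print(candidates)
```

Adding another entry to "exclude" (for example the path of a disk that is already in use) would remove it from the candidate list in the same way.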

Neo4j Java VM Tuning (v2.3 Community)

From what I can tell, my Neo4j v2.3 Community JVM keeps adding items to the Old Gen heap and is never able to garbage collect them.
Here is a detailed outline of the situation.
I have a PHP file which calls the Dropbox Delta API and writes the file structure to my Neo4j database. Each call to Delta returns a data set of 2000 items, from which I pull out the information I need. The following is an example of what the query looks like with just one item; usually I send full batches of 2000 items, as that gave me the best results.
Example query:
MERGE (c:Cloud { type:'Dropbox', id_user:'15', id_account:''})
WITH c
UNWIND [
{ parent_shared_folder_id:488417928, rev:'15e1d1caa88',.......}
]
AS items MERGE (i:Item { id:items.path, id_account:'', id_user:'15', type:'Dropbox' })
ON Create SET i = { id:items.path, id_account:'', id_user:'15', is_dir:items.is_dir, name:items.name, description:items.description, size:items.size, created_at:items.created_at, modified:items.modified, processed:1446769779, type:'Dropbox'}
ON Match SET i+= { id:items.path, id_account:'', id_user:'15', is_dir:items.is_dir, name:items.name, description:items.description, size:items.size, created_at:items.created_at, modified:items.modified, processed:1446769779, type:'Dropbox'}
MERGE (p:Item {id_user:'15', id:items.parentPath, id_account:'', type:'Dropbox'})
MERGE (p)-[:Contains]->(i)
MERGE (c)-[:Owns]->(i)
The query is sent via Everyman:
static function makeQuery($client, $qry) {
return new Everyman\Neo4j\Cypher\Query($client, $qry);
}
This works fine and generally takes 8-10 seconds to run from start to finish.
The Dropbox account I am accessing contains around 35000 items, and it takes around 18 runs of my PHP script to populate my Neo4j database with the folder/file structure of the Dropbox account.
With every run, around 50 MB of items are added to the Neo4j JVM Old Gen heap, and 30 MB of that is not removed by GC.
The end result is that the JVM runs out of memory and gets stuck in a constant state of GC throttling.
I have tried a range of Neo4j JVM settings, as well as an update from Neo4j v2.2.5 to v2.3, which actually appears to have made the problem worse.
My current settings are as follows:
-server
-Xms4096m
-Xmx4096m
-XX:NewSize=3072m
-XX:MaxNewSize=3072m
-XX:SurvivorRatio=1
I am testing on a Windows 10 PC with 8 GB of RAM and a 2.5 GHz quad-core i5, running Java 1.8.0_60.
Any info on how to solve this issue would be greatly appreciated.
Cheers, Jack.
Reduce the new size to 1024m.
Change your settings to:
-server
-Xms4096m
-Xmx4096m
-XX:NewSize=1024m
Most likely your transaction grows too large.
I recommend sending each of the parents in separately: instead of one big UNWIND, send one statement per parent.
Make sure to use the new transactional HTTP endpoint; I recommend going with neoclient instead of Neo4jPHP.
You should also use parameters instead of literal values!
And don't repeat the user id, type, etc. properties on every item.
Are you sure you want to connect everything to c, and not just the root of the directory structure? I would do the latter.
MERGE (c:Cloud:Dropbox { id_user:{userId}})
MERGE (p:Item:Dropbox {id:{parentPath}})
// owning the parent should be good enough
MERGE (c)-[:Owns]->(p)
// carry p forward so the final MERGE reuses the parent node
WITH c, p
UNWIND {items} as item
MERGE (i:Item:Dropbox { id:item.path})
ON CREATE SET i += { is_dir:item.is_dir, name:item.name, created_at:item.created_at }
SET i += { description:item.description, size:item.size, modified:item.modified, processed:timestamp()}
MERGE (p)-[:Contains]->(i);
Make sure to use 2.3.0 for best MERGE performance for relationships.
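
The advice to send one statement per parent with parameters can be sketched on the client side as follows. This is an illustration in Python rather than the PHP used in the question, it only builds the parameter maps (userId, parentPath, items) for the query above, and it makes no assumptions about the driver. The batch size of 500 is an arbitrary starting point, and the parentPath field name is taken from the original query.

```python
from collections import defaultdict


def parameter_batches(items, user_id, batch_size=500):
    """Yield one parameter map per parent folder and per batch of children.

    Each map is meant to be sent as the parameters of a single statement,
    so that no single transaction has to hold a whole Delta response.
    """
    by_parent = defaultdict(list)
    for item in items:
        by_parent[item["parentPath"]].append(item)

    for parent_path, children in by_parent.items():
        for start in range(0, len(children), batch_size):
            yield {
                "userId": user_id,
                "parentPath": parent_path,
                "items": children[start:start + batch_size],
            }


if __name__ == "__main__":
    # Made-up sample entries standing in for one Delta response.
    sample = [
        {"path": "/docs/a.txt", "parentPath": "/docs", "is_dir": False},
        {"path": "/docs/b.txt", "parentPath": "/docs", "is_dir": False},
        {"path": "/pics/c.png", "parentPath": "/pics", "is_dir": False},
    ]
    for params in parameter_batches(sample, "15", batch_size=2):
        print(params["parentPath"], len(params["items"]))
```

Because the query text stays the same for every call, the server can reuse the query plan, and each transaction stays small enough to be collected before it reaches the old generation.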

Can't connect to CFS node

I removed (or decommissioned, I can't remember) a DSE analytics node (with IP 10.14.5.50) a couple of months ago. When I now try to execute a dse shark query (CREATE TABLE ccc AS SELECT ...), I receive:
15/01/22 13:23:17 ERROR parse.SharkSemanticAnalyzer: org.apache.hadoop.hive.ql.parse.SemanticException: 0:0 Error creating temporary folder on: cfs://10.14.5.50/user/hive/warehouse/mykeyspace.db. Error encountered near token 'TOK_TMP_FILE'
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1256)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1053)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8342)
at shark.parse.SharkSemanticAnalyzer.analyzeInternal(SharkSemanticAnalyzer.scala:105)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
at shark.SharkDriver.compile(SharkDriver.scala:215)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:342)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:977)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:347)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:240)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: java.lang.RuntimeException: java.io.IOException: Error connecting to node 10.14.5.50:9160 with strategy STICKY.
at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:216)
at org.apache.hadoop.hive.ql.Context.getExternalScratchDir(Context.java:270)
at org.apache.hadoop.hive.ql.Context.getExternalTmpFileURI(Context.java:363)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1253)
... 12 more
I guess the above error is due to my keyspace referring to the old node:
shark> DESCRIBE DATABASE mykeyspace;
OK
mykeyspace cfs://10.14.5.50/user/hive/warehouse/mykeyspace.db
Time taken: 0.997 seconds
Is there any way for me to fix this incorrect database path?
Tried (but failed) workaround to recreate the database: in cqlsh I created a keyspace thekeyspace and added a table thetable. I then opened up dse hive (and noticed that DESCRIBE DATABASE thekeyspace gives me a correct cfs path). However, I am unable to drop the database using DROP DATABASE thekeyspace.
Additional information:
I have no external tables in my keyspace.
Making the SELECT against the tables works.
Setting -hiveconf cassandra.host=WORKING_NODE_IP does not help.
The following commands return proper IPs (i.e. not X.X.X.50):
dsetool listjt
dsetool jobtracker
dsetool sparkmaster
I am getting the same error when I execute the query using dse hive.
No Shark variable is referring to X.X.X.50 when I execute set; in its REPL.
I am running DSE 4.5.
Stumbled across this page that says you need to TRUNCATE "HiveMetaStore"."MetaStore" (in cqlsh) after removing Hive nodes. That did the trick.

Neo4j.rb 1.9 HA in development working intermittently, then giving errors

Hullo,
We are attempting to set up a Neo4j HA cluster in our Rails dev environment, much like what is explained here: https://github.com/andreasronge/neo4j/wiki/Neo4j%3A%3ARails-Config
We have two instances in the cluster. Server 1 is the app, Server 2 is the Rails console. They both start fine, but eventually one of them will fall over. Usually, it's one of the following:
1) java.io.FileNotFound: /server_1_path/path/to/some/RailsModel_exact/_2.fxm file. Somehow, the indexes expect a file to exist that does not exist. Sometimes, the file does not exist in EITHER server directory, and the only thing that helps is to make both sets of index files identical by copying one to the other.
2) Orphaned index.lock files. The error here will say that a certain index is locked, and removing the specific .lock file fixes the issue. Annoying. (Maybe a similar issue.)
3) Data added in one instance never shows up in the other instance. For example, I create a node in the Rails console and it never shows up in the app, or vice versa. In this case, it seems that both instances start up as master and never sync. We usually have to delete one of the dbs and restart to get them working again.
I am not sure if the new 1.9 HA stuff isn't ready for prime time or we are being too nonchalant with how we quit the app/console and Neo4j is not shutting down cleanly.
This is a highly frustrating issue. We'd appreciate any help/pointers to get it working right.
We are using the 1.9 M03 version of the gem, and here is our config:
server_id = ((defined? Rails::Console)) ? 2 : 1
config.neo4j['enable_ha'] = true
config.neo4j['enable_remote_shell'] = "port=133#{server_id}"
config.neo4j['ha.server_id'] = server_id
config.neo4j['ha.server'] = "localhost:600#{server_id}"
config.neo4j['ha.pull_interval'] = '1s'
config.neo4j['ha.discovery.enabled'] = false
config.neo4j['ha.initial_hosts'] = [1,2,3].map{|id| ":500#{id}"}.join(',')
config.neo4j['ha.cluster_server'] = ":5001-5099" #"#{server_id}"
config.neo4j.storage_path = File.expand_path("db/ha_neo_#{server_id}", Object::Rails.root)
config.neo4j['online_backup_server']= "localhost:636#{server_id}"
config.neo4j['ha.cluster_server'] = "localhost:500#{server_id}"
config.neo4j['webserver.port'] = "747#{server_id}"
config.neo4j['webserver.https.port'] = "748#{server_id}"
config.neo4j['enable_remote_shell'] = "port=933#{server_id}"
config.neo4j['use_adaptive_cache'] = false
puts "Config HA cluster, ha.server_id: #{config.neo4j['ha.server_id']}, db: #{config.neo4j.storage_path}"
Thanks for any/all help/advice.
