Deleting all ItemNames in a single query in SimpleDB - amazon-simpledb

Hi,
I want to delete all ItemNames in a single query in SimpleDB.
Is that possible in SimpleDB? If so, please give me the query for deleting all items.
Thanks,
senthil

SimpleDB doesn't have any way to delete multiple records with a single query, and there is no equivalent to 'TRUNCATE TABLE'.
Your options are either to delete records one at a time or to delete the entire domain.

Use the DeleteDomain operation to delete an entire domain. You can re-create the domain using CreateDomain afterward.
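For example, with the legacy boto library for Python (the domain name here is a placeholder):
import boto

# Credentials come from the environment or boto config.
sdb = boto.connect_sdb()

# Drop every item in one call, then recreate the now-empty domain.
sdb.delete_domain('my_domain')
sdb.create_domain('my_domain')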

SimpleDB supports batch delete now: http://docs.amazonwebservices.com/AmazonSimpleDB/latest/DeveloperGuide/SDB_API_BatchDeleteAttributes.html
But you can only delete 25 items at a time.
If you want to delete the entire domain, do as Scrappydog suggests and delete the domain. It's much faster than deleting items one by one.
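If you do go the batch route, here is a rough sketch with boto that pages through every item name and deletes 25 at a time (the domain name is a placeholder):
import boto

sdb = boto.connect_sdb()                # credentials from environment/config
domain = sdb.get_domain('my_domain')    # placeholder domain name

# Fetch all item names; boto pages through the select results automatically.
names = [item.name for item in domain.select('select itemName() from `my_domain`')]

# BatchDeleteAttributes accepts at most 25 items per call.
for i in range(0, len(names), 25):
    batch = {name: None for name in names[i:i + 25]}  # None = delete all attributes
    domain.batch_delete_attributes(batch)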

Related

Add relationships to existing data in Neo4j

To start with Neo4j (4.2.3), I loaded a year's worth of flight data (7m rows) and wanted to try modelling a flight as a relationship between its origin and destination airports. However, the following query just eats up memory and has not finished after two days, so something is clearly amiss:
MATCH (f:Flight), (dest:Airport), (orig:Airport)
WHERE f.Dest = dest.IATA_Code AND f.Origin = orig.IATA_Code
CREATE (orig)-[r:FlightTo {DeptDateTime:f.DepDT, ArriveDateTime:f.ArrDT, Flight:f.Name}]->(dest)
I can do this instead:
LOAD CSV WITH HEADERS FROM 'file:///flights.csv' AS row
MERGE (o:Org_Airport {Org_IATA:row.Origin})
MERGE (d:Dest_Airport {Dest_IATA:row.Dest})
CREATE (o)-[r:FlightTo {DeptDateTime:row.DepDT, ArriveDateTime:row.ArrDT, Flight:row.Name}]->(d)
While this has the advantage of working (even in a reasonable time), it feels ugly to essentially duplicate the airports, and also to go through the CSV file again when all the required data is already in the database.
My graph thinking probably isn't quite there yet, so I'd appreciate some guidance on the best way to add a relationship like this, keeping in mind that the original load files might get lost.
Do you have indexes set? Looking at your first query, you'd need:
CREATE INDEX ON :Flight(Dest);
CREATE INDEX ON :Airport(IATA_Code);
If you don't have indexes/constraints set on the label/property, the lookup/merge will be very slow.
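A sketch of applying that advice end-to-end with the official neo4j Python driver (connection details are placeholders; the index syntax is the 3.x form used above, which 4.2 still accepts as deprecated):
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Create the lookup indexes first.
    session.run("CREATE INDEX ON :Flight(Dest)")
    session.run("CREATE INDEX ON :Airport(IATA_Code)")
    # With the indexes in place, each airport lookup is an index seek
    # rather than a full label scan per flight.
    session.run(
        "MATCH (f:Flight), (dest:Airport), (orig:Airport) "
        "WHERE f.Dest = dest.IATA_Code AND f.Origin = orig.IATA_Code "
        "CREATE (orig)-[:FlightTo {DeptDateTime: f.DepDT, "
        "ArriveDateTime: f.ArrDT, Flight: f.Name}]->(dest)"
    )

driver.close()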

How to delete a field for a given measurement from InfluxDB?

I created multiple fields to test output in Grafana; however, I now want to delete the unwanted fields from InfluxDB. Is there a way?
Q: I want to delete the unwanted fields from influxdb, is there a way?
A: Short answer: no. As of the latest release, 1.4.0, there is no straightforward way to do this.
The reason is that InfluxDB is explicitly designed to optimise point creation; functionality on the "UPDATE" and "DELETE" side of things is compromised in exchange.
To drop fields of a given measurement, the easiest way would be to (as sketched after this list):
Retrieve the data out first
Modify its content
Drop the measurement
Re-insert the modified data back
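A sketch of those four steps with the influxdb Python client (database, measurement, and field names are made up, and it assumes the measurement has no tags):
from influxdb import InfluxDBClient

client = InfluxDBClient(host='localhost', port=8086, database='mydb')

# 1. Retrieve the data out first.
points = list(client.query('SELECT * FROM "my_measurement"').get_points())

# 2. Modify its content: strip the unwanted field from every point.
for p in points:
    p.pop('unwanted_field', None)

# 3. Drop the measurement.
client.query('DROP MEASUREMENT "my_measurement"')

# 4. Re-insert the modified data back.
client.write_points([
    {'measurement': 'my_measurement', 'time': p.pop('time'), 'fields': p}
    for p in points
])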
Reference:
https://docs.influxdata.com/influxdb/v1.4/concepts/insights_tradeoffs/

Most effective, secure way to delete Postgres column data

I have a column in my Postgres table that I want to clear out for expired rows. What's the best way to do this securely? It's my understanding that simply writing 0's to those columns is ineffective, because Postgres creates a new row version on UPDATE and marks the old row as dead.
Is the best way to set the column to null and manually vacuum to clean up the old records?
I will say first that it is bad practice to alter data like this: you are changing history. Also, the below is only ONE way to do it (a quick and dirty way, not to be recommended):
1. Back up your database first.
2. Open pgAdmin, select the database, open the Query Editor and run a query.
3. It would be something like this:
UPDATE <table_name> SET <column_name>=<new value (eg null)>
WHERE <record is dead>
The WHERE part is for you to figure out, based on how you identify which rows are dead (e.g. is_removed=true or is_deleted=true are common flags for soft-deleted records).
Obviously you would have to run this script regularly. The better way would be to update your application to do this job instead.
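If you do script it, here is a rough psycopg2 sketch that also covers the manual VACUUM the question asks about (table, column, and flag names are placeholders):
import psycopg2

conn = psycopg2.connect('dbname=mydb user=me')  # placeholder DSN
conn.autocommit = True  # VACUUM cannot run inside a transaction block

with conn.cursor() as cur:
    # Null out the sensitive column on expired rows.
    cur.execute('UPDATE my_table SET secret_col = NULL WHERE is_expired = true')
    # Rewrite the table so the dead row versions holding the old
    # values are physically removed.
    cur.execute('VACUUM FULL my_table')

conn.close()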

Neo4J Batch Inserter is slow with big ids

I'm working on an RDF file importer, but I have a problem: my data files contain duplicate nodes. For that reason I use big ids to insert the nodes with the batch inserter, but the process is slow. I have seen this post where Michael recommends using an index, but the process remains slow.
Another option would be to merge duplicate nodes, but I think there is no automatic way to do so in Neo4j. Am I wrong?
Could anyone help me? :)
Thanks!
There is no duplicate handling in the CSV batch importer yet (it's planned for the next version), as it is non-trivial and memory expensive.
Best to de-duplicate on your side.
Don't use externally supplied ids as node-ids; they can get large from the beginning, and that just doesn't work. Use an efficient map (like Trove) to keep the mapping between your key and the node-id.
I usually use a two-pass approach with an array: sort the array so that the array index becomes the node-id, and after sorting do another pass that nulls out duplicate entries.
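A rough Python sketch of that two-pass scheme (the file name and key extraction are assumptions about the input layout; a plain dict stands in for a Trove primitive map):
# Pass 1: collect every external key and sort, so duplicates are adjacent.
keys = sorted(line.split()[0] for line in open('triples.nt'))

# Pass 2: null out duplicates; the array index of each surviving entry
# becomes its small, dense node-id.
node_id_by_key = {}
prev = None
for i, key in enumerate(keys):
    if key == prev:
        keys[i] = None              # duplicate entry, nulled out
    else:
        node_id_by_key[key] = i     # array index becomes the node-id
        prev = key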
Perfect :) The data would have the following structure:
chembl_activity:CHEMBL_ACT_102540 bao:BAO_0000208 bao:BAO_0002146 .
chembl_document:CHEMBL1129248 cco:hasActivity
chembl_activity:CHEMBL_ACT_102551 .
chembl_activity:CHEMBL_ACT_102540 cco:hasDocument
chembl_document:CHEMBL1129248 .
Each line corresponds to a relationship between two nodes, and we can see that the node chembl_activity:CHEMBL_ACT_102540 is duplicated.
I wanted to use the hashcode of the node name as the id, but that hashcode is a very large number, which slows the process. So I could check the ids to create only the relationship, not the nodes.
Thanks for all! :)

After clearing the Neo4j database, when creating a new node it starts to count from where the increment was before [duplicate]

Is there a way to reset the indices once I have deleted the nodes, just as if I had deleted the whole folder manually?
I am deleting the whole database with node.delete() and relation.delete(), and I just want the indices to start at 1 again rather than where I had actually stopped...
I assume you are referring to the node and relationship IDs rather than the indexes?
Quick answer: You cannot explicitly force the counter to reset.
Slightly longer answer: Generally speaking, these IDs should not carry any relevance within your application. There have been a number of discussions about this on the Neo4j mailing list and Stack Overflow, as the ID is an internal artifact and should not be used like a primary key. Its purpose is more akin to an in-memory address, and if you require unique identifiers, you are better off considering something like a UUID.
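For instance, a minimal illustration of generating your own identifier with Python's standard uuid module (storing it as a node property is up to your application):
import uuid

# A stable, application-level identifier; unlike the internal node ID,
# it is never reused or reset by the database.
node_uuid = str(uuid.uuid4())
print(node_uuid)  # e.g. stored later via CREATE (n:Item {uuid: $uuid})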
You can stop your database, delete all the files in the database folder, and start it again.
This way, the ID generation will start back from 1.
This procedure completely wipes your data, so handle with care.
You can certainly do this using Python; see https://stackoverflow.com/a/23310320
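A minimal sketch of that stop-wipe-restart cycle (the store path varies by installation and version, so treat it as an assumption):
import os
import shutil

DB_PATH = '/var/lib/neo4j/data/databases/graph.db'  # placeholder path

# Stop the Neo4j server first, then wipe the store files.
if os.path.exists(DB_PATH):
    shutil.rmtree(DB_PATH)
# On restart, Neo4j creates a fresh store and the ID counter starts over.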
