InfluxDB delete query with time AND non-time clause

Per the documentation, deleting is supported only if the WHERE clause contains time. That works fine. But if I want to narrow down by time and another clause, it errors out. Is that supported yet?
i.e. delete from datapoints where time > now() - 1h and code = '12345'

The relevant documentation: http://influxdb.com/docs/v0.8/api/query_language.html#deleting-data-or-dropping-series
The DELETE syntax for individual points is only valid in 0.8 and prior versions. All development is on the 0.9 version and no work is being done to fix any 0.8 issues. Therefore the current functionality of DELETE FROM is final for all 0.8.x versions.
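For anyone still on 0.8, here is a minimal sketch of the two forms issued against the 0.8 HTTP query endpoint described in the linked docs. The host, database name, and root credentials below are assumptions; the combined clause is the one the server rejects.
import requests

# InfluxDB 0.8 routes all queries, including DELETE, through the series endpoint.
# Assumed values: local instance on port 8086, database "mydb", user/password "root".
BASE = "http://localhost:8086/db/mydb/series"
AUTH = {"u": "root", "p": "root"}

# Supported in 0.8: a WHERE clause restricted to time.
ok = requests.get(BASE, params={**AUTH, "q": "delete from datapoints where time > now() - 1h"})
print(ok.status_code)

# Not supported in 0.8: time combined with another column; the server returns an error.
bad = requests.get(BASE, params={**AUTH, "q": "delete from datapoints where time > now() - 1h and code = '12345'"})
print(bad.status_code, bad.text)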

Related

How can I change an Optuna's trial result?

I'm using Optuna on a complex ML algorithm, where each trial takes around 3-4 days. After a couple of trials, I noticed that the values I was returning to Optuna were incorrect, but I do have the correct results in another file (saved as a backup). Is there any way I could change these defective results directly in the study object?
I know I can export the study to a pandas DataFrame using study.trials_dataframe() and then change it there, but I need to visualize it in optuna-dashboard, so I would need to change it directly in the study file. Any suggestions?
Create a new study, use optuna.trial.create_trial to create trials with the correct values, and use Study.add_trials to insert them into the new study:
import optuna

old_trials = old_study.get_trials(deepcopy=False)

# Rebuild each finished trial with the corrected objective value.
# correct_value() stands for whatever lookup maps a trial's params to the value saved in the backup file.
correct_trials = [
    optuna.trial.create_trial(
        params=trial.params,
        distributions=trial.distributions,
        value=correct_value(trial.params),
    )
    for trial in old_trials
]

new_study = optuna.create_study(...)
new_study.add_trials(correct_trials)
Note that Optuna doesn't allow you to change existing trials once they are finished, i.e., successfully returned a value, got pruned, or failed. (This is an intentional design; Optuna uses caching mechanisms intensively and we don't want to have inconsistencies during distributed optimization.)
You can only create a new study containing correct trials, and optionally delete the old study.
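Since the goal is to view the corrected results in optuna-dashboard, here is a minimal sketch of creating the new study against a persistent storage the dashboard can read. The SQLite file name, study name, and direction below are placeholders, not part of the question.
import optuna

# Placeholder storage URL and study name; reuse whatever optuna-dashboard is pointed at.
new_study = optuna.create_study(
    study_name="corrected-study",
    storage="sqlite:///db.sqlite3",
    direction="minimize",  # match the original study's direction
)
new_study.add_trials(correct_trials)  # the list built with create_trial above

# optuna-dashboard can then be launched against the same storage:
#   optuna-dashboard sqlite:///db.sqlite3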

Keeping min/max value in a BigTable cell

I have a problem where it would be very helpful if I was able to send a ReadModifyWrite request to BigTable where it only overwrites the value if the new value is bigger/smaller than the existing value. Is this somehow possible?
Note: I thought of a hacky way where I use the timestamp as my actual value and set the max number of versions to 1, so that it would keep the "latest" value, i.e. the one with the higher timestamp. But those timestamps would have values from 1 to 10 instead of ~1.5bn. Would this work?
I looked into the existing APIs but haven't found anything that would help me do this. It seems like it is available in DynamoDB, so I guess it's reasonable to ask for BigTable to have it as well https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateItem.html#API_UpdateItem_RequestSyntax
Your timestamp approach could probably be made to work, but would interact poorly with stuff like age-based garbage collection.
I also assume you mean CheckAndMutate as opposed to ReadModifyWrite? The former lets you do conditional overwrites, the latter lets you do unconditional increments/appends. If you actually want an increment that only works if the result will be larger, just make sure you only send positive increments ;)
My suggestion, assuming your client language supports it, would be to use a CheckAndMutateRow request with a value_range_filter. This will require you to use a fixed-width encoding for your values, but that's no different than re-using the timestamp.
Example: if you want to set the value to 000768, but only if that would be an increase, use a value_range_filter from 000000 to 000767, inclusive, and do your write in the true_mutation of the CheckAndMutate.
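If the Python client happens to be an option, a rough sketch of that pattern might look like the following. The project, instance, table, row key, and column names are placeholders, and the six-digit fixed-width encoding is just the example above.
from google.cloud import bigtable
from google.cloud.bigtable.row_filters import ValueRangeFilter

# Placeholder identifiers; swap in your own project/instance/table/row/column.
client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("my-table")

# Only write 000768 if the current value lies in [000000, 000767], i.e. is strictly smaller.
row = table.conditional_row(
    b"row-key",
    filter_=ValueRangeFilter(
        start_value=b"000000",
        end_value=b"000767",
        inclusive_start=True,
        inclusive_end=True,
    ),
)
row.set_cell("cf", b"col", b"000768", state=True)  # applied only when the filter matches (the true_mutation branch)
row.commit()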

Neo4J's APOC plugin (3.1.3.6) is running very slow

I recently upgraded my Neo4j to 3.1.3, and alongside that, got the most recent APOC plugin (3.1.3.6).
I had a bit of code that worked fine, and could create ~3 million relationships in about a minute and a half wall time. But now, it's been running for over 8 hours and shows no sign of stopping...
Because the code used to run without any problems, I'm assuming something must have changed between versions that has led to my code breaking.
Is it rock_n_roll that should be changed (maybe to apoc.periodic.commit with positional arguments or something)? Thanks for any insight.
Here's what I'm running:
CALL apoc.periodic.rock_n_roll(
  "MATCH (c:ChessPlayer),(r:Record) WHERE c.ChessPlayer_ID = r.ChessPlayer RETURN c,r",
  "CYPHER planner=rule WITH {c} AS c, {r} AS r CREATE (c)-[:HAD_RECORD]->(r)",
  200000)
My understanding is that the call is querying the Cartesian product of ChessPlayers and Records, then filtering them row by row, and then doing the batch update on the final results. That eats a lot of memory; I think that one opening transaction is what's killing you. So if you can break it up so that each transaction touches as few nodes as possible, it should perform massively better (especially if r.ChessPlayer is indexed, since then you don't need to load all of the Records):
CALL apoc.periodic.rock_n_roll(
  "MATCH (c:ChessPlayer) WHERE NOT EXISTS((c)-[:HAD_RECORD]->()) RETURN c",
  "MATCH (r:Record) WHERE c.ChessPlayer_ID = r.ChessPlayer WITH c,r CREATE UNIQUE (c)-[:HAD_RECORD]->(r)",
  100000)
apoc.periodic.commit() would work on a similar principle. The smaller (the fewer nodes touched) you can make each transaction, the faster the batch will run.
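On the indexing point, here is a quick sketch of adding that index from the Python driver. The connection details are placeholders, CREATE INDEX ON is the Neo4j 3.x syntax, and 3.1-era drivers import from neo4j.v1 rather than neo4j.
from neo4j import GraphDatabase  # older 1.x drivers: from neo4j.v1 import GraphDatabase

# Placeholder connection details.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    # Neo4j 3.x index syntax; lets the inner MATCH on r.ChessPlayer hit an index instead of a label scan.
    session.run("CREATE INDEX ON :Record(ChessPlayer)")
driver.close()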

How to avoid overlap between pages of firebase query results

I am trying to implement infinite scroll (aka paging) using Firebase's relatively new query functionality. I am stuck on one hopefully minor issue.
I ask for the first 10 results as follows:
offersRef.queryOrderedByChild(orderedByChildNamed).queryLimitedToFirst(10).observeEventType(.ChildAdded, andPreviousSiblingKeyWithBlock:childAddedBlock, withCancelBlock:childAddedCancelBlock)
But when I want to get the next 10, I will have to start with the 10th key as my starting value. What I really want is to pass the 10th key and tell Firebase that I want the results offset by 1, so that it will observe the next 10. But I think "offset" is old syntax (from before the query functionality was rolled out) and can't be used here.
So I tried asking for 11 and then ignoring the first one, but that is problematic as you may quickly guess, since the results I am observing can (and will) change:
offersRef.queryOrderedByChild(orderedByChildNamed).queryStartingAtValue(startingValue,childKey:startingKey!).queryLimitedToFirst(10+1).observeEventType(.ChildAdded, andPreviousSiblingKeyWithBlock:childAddedBlock, withCancelBlock:childAddedCancelBlock)
And just for clarity, the following are all variables defined in my app and not particularly germane to the question:
offersRef
orderedByChildNamed
childAddedBlock
childAddedCancelBlock

Mongoid min aggregation returning incorrect value

I have a collection called data. Each document looks something like this:
x: {"value"=>1358747699.6922424}, y: {"value"=>17.9}
Also, I have indexes on x.value and y.value. Using the built-in Mongoid aggregation .min, I wanted to get the minimum y value. I tried doing this:
data.min(:'y.value')
and it returns 16.2, which I know is not correct; it should be 14.4, which I can prove with:
data.map{|d| d['y']['value']}.sort.first
which returns 14.4
Or:
data.order_by([:'y.value', :asc]).limit(1).first['y']['value']
Also returns 14.4
So I can't figure out why .min does not seem to be working correctly.
Short Answer
In your case, just ignore the min function provided by Mongoid. Use the one you came up with instead:
data.order_by([:'y.value', :asc]).limit(1).first['y']['value']
Explanation
I did some digging into how the Mongoid aggregations are actually implemented. It turns out they use a map_reduce that runs over the entire collection. [1]
The query that you wrote should be much more efficient because it can use the index you built on y.value. That means one index lookup versus a scan of the entire collection (not to mention it actually works...).
Check the MongoDB profiler to ensure your indexes are actually being used. [2]
The Question you Actually Asked
As far as exactly why min fails, I am at a loss. Without seeing your data, I can't see any reason why the underlying map_reduce would fail. Maybe it has to do with the embedded field or maybe this.y.value is not defined for some objects.
Here's an opportunity to give back to open source. Try posting to the Mongoid issues board. Be sure to crosslink if you do. I'd like to see the resolution for this.
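If it helps to double-check the behaviour outside of Mongoid, the same indexed lookup can be issued directly with pymongo. The connection, database, and collection names below are assumptions.
from pymongo import ASCENDING, MongoClient

# Assumed connection string, database, and collection names.
coll = MongoClient()["mydb"]["data"]

# Same shape as the order_by/limit query above: walk the y.value index and take the first document.
doc = coll.find().sort("y.value", ASCENDING).limit(1).next()
print(doc["y"]["value"])  # 14.4 for the data described above

# Cursor.explain() shows whether the y.value index was actually used.
print(coll.find().sort("y.value", ASCENDING).limit(1).explain())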
