Counting unique values in 1 column based on 5 criteria? - google-sheets

I've been trying for hours to create such a formula. I would like it to count all unique e-mail addresses (in column G) based on five criteria. Each criterion is stored in a separate column (columns B, C, I, R, and S). I'm working in Google Sheets. Could anyone please help me correct the formula below?
=ArrayFormula((SUM(IF(("Startup English"='All Session Data'!B:B)*("Yes"='All Session Data'!C:C)*("IQ"='All Session Data'!I:I)*("Female"='All Session Data'!R:R)*("Syrian"='All Session Data'!S:S), 1/COUNTIFS('All Session Data'!G:G,'All Session Data'!G:G,'All Session Data'!B:B, "Startup English",'All Session Data'!C:C, "Yes",'All Session Data'!I:I, "IQ",'All Session Data'!R:R,"Female",'All Session Data'!S:S,"Syrian"),0))))
I've also tried this formula and get a formula parse error:
=IFERROR(ROWS(UNIQUE(FILTER('All Session Data'G:G,('All Session Data'!B:B="Startup English")*('All Session Data'!C:C="Yes"*('All Session Data'!I:I="IQ")*('All Session Data'!R:R="Female")*('All Session Data'!S:S="Syrian"))))),0)

=IFERROR(ROWS(UNIQUE(FILTER('All Session Data'!G:G,('All Session Data'!B:B="Startup English")*('All Session Data'!C:C="Yes")*('All Session Data'!I:I="IQ")*('All Session Data'!R:R="Female")*('All Session Data'!S:S="Syrian")))),0)
Your attempts were missing the ! after 'All Session Data' before G:G and had unbalanced parentheses. Adding * between your conditions means AND (using + would mean OR).
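Alternatively, Google Sheets has COUNTUNIQUEIFS, which expresses the same count directly (same sheet and columns as above):
=COUNTUNIQUEIFS('All Session Data'!G:G,'All Session Data'!B:B,"Startup English",'All Session Data'!C:C,"Yes",'All Session Data'!I:I,"IQ",'All Session Data'!R:R,"Female",'All Session Data'!S:S,"Syrian")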

Related

Neo4J: Compare a property value gathered with "order by Limit 1" with a property value of another label

There are two labels, Car and Session. The Sessions are ordered descending with LIMIT 1 to get the latest Session. This latest Session has one playerVehicleIdx as a property. This exact playerVehicleIdx should be matched with the right vehicleIdx of the available Cars.
Idea:
MATCH (m:Car) with (n:Session) Return n ORDER BY n.sessionStart DESC LIMIT 1
WHERE n.playerVehicleIdx = m.vehicleIdx RETURN m,n
The query is wrong, and I tried many different variants that did not work either.
How can the query be fixed so that it compares the property value of the one node that comes from sorting the Sessions with the property values of the Cars?
First, you should create a relationship between Car and Session using vehicleIdx. Then run the query below:
// Match every car together with its sessions
MATCH (c:Car)--(s:Session)
// Sort the sessions newest-first *before* collecting,
// so the first element of each car's collection is its latest session
WITH c, s ORDER BY s.sessionStart DESC
// Group by car; [0] picks the latest session per car
WITH c, collect(s)[0] AS s
RETURN c, s
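If creating the relationship is not an option, the idea from the question can also be expressed directly on the properties (a sketch, assuming the playerVehicleIdx and vehicleIdx property names from the question): pin down the latest Session first, then match the Car against it.
// Latest session first ...
MATCH (n:Session)
WITH n ORDER BY n.sessionStart DESC LIMIT 1
// ... then the car whose vehicleIdx matches its playerVehicleIdx
MATCH (m:Car)
WHERE m.vehicleIdx = n.playerVehicleIdx
RETURN m, n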

Insert a lot of data that depends on the previously inserted data

I am trying to store a "navigation" path in the database.
The paths are stored in the logfile as a string, something like "a1 b1 c1 d1", where each element is a "token".
For each token I want to store the path to it. As an example I can have:
a1 -> b1 -> c1
a1 -> b1 -> c2
a1 -> b2 -> c2
So, if I ask for all the subtokens of a1, I will get [b1 => 2, b2 => 1] in a token => count format.
This way I can get all the subtokens for a given token and the "usage count" for each of those subtokens.
It is possible to have
a1 -> b1 -> c1
g1 -> h1 -> b1
But to me, those two b1 are not the same token, so they should not share a count.
There should not be a LOT of distinct tokens, but there will be a lot of entries in the logfile, so I expect big count values for those tokens.
I am representing the data like this (SQLite 3):
id; parent_id; token; count
where the parent_id is a FK to the same table.
My issue is: I have around 50k entries in my log, and I can have more.
I am inserting the data into the database using the following procedure (sketched in code below):
Search for an entry that has the parent_id + token (for the first token the parent_id is null).
EXISTS: update the count.
DOESN'T EXIST: create an entry.
Save the ID of the updated/new entry as the parent_id.
Repeat until there are no more tokens to consume.
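The question uses Rails/ActiveRecord, but for illustration here is that loop as a minimal Python/sqlite3 sketch (table columns as listed above; insert_path is called once per logfile line):
import sqlite3

conn = sqlite3.connect("paths.db")
conn.execute("""CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES tokens(id),
    token TEXT NOT NULL,
    count INTEGER NOT NULL DEFAULT 0)""")

def insert_path(conn, line):
    parent_id = None                       # the first token has no parent
    for token in line.split():
        # Search for an entry with this parent_id + token
        # ("IS ?" also matches a NULL parent_id in SQLite).
        row = conn.execute(
            "SELECT id FROM tokens WHERE parent_id IS ? AND token = ?",
            (parent_id, token)).fetchone()
        if row:                            # EXISTS: update the count
            conn.execute("UPDATE tokens SET count = count + 1 WHERE id = ?",
                         (row[0],))
            parent_id = row[0]
        else:                              # DOESN'T EXIST: create an entry
            cur = conn.execute(
                "INSERT INTO tokens (parent_id, token, count) VALUES (?, ?, 1)",
                (parent_id, token))
            parent_id = cur.lastrowid

# usage: for line in open("paths.log"): insert_path(conn, line)
# then conn.commit()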
With 50k entries averaging 4 tokens each, that gives 200k tokens to process.
It does not write a lot of data to the database, since many of those tokens repeat, even though the same token can appear with different parent_ids.
The issue is... it is too slow. I cannot insert in chunks, because each insert depends on the id of an existing row or of a newly created one. Worse, I also need to update the count.
I was thinking of using some sort of tree to store this data, but there is the problem that old records may need to be preserved, and the new data needs to be counted on top of the existing counts.
I could build the tree from the database and update it with the current data, but that feels like an overcomplicated solution to the problem.
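Again only a sketch (Python/sqlite3, same hypothetical tokens table as above) of that tree idea: aggregate the whole log in memory first, then flush each distinct (parent, token) node exactly once inside a single transaction. Existing database counts are incremented rather than replaced, so old records are preserved; because dicts keep insertion order, a parent key is always resolved before its children.
from collections import defaultdict

def bulk_insert(conn, lines):
    # Pass 1: aggregate in memory; no database round trips per token.
    counts = defaultdict(int)           # (parent_key, token) -> added count
    for line in lines:
        parent_key = None
        for token in line.split():
            counts[(parent_key, token)] += 1
            parent_key = (parent_key, token)
    # Pass 2: one transaction; each distinct node is touched exactly once.
    ids = {None: None}                  # in-memory key -> database id
    with conn:
        for (parent_key, token), n in counts.items():
            parent_id = ids[parent_key]
            row = conn.execute(
                "SELECT id FROM tokens WHERE parent_id IS ? AND token = ?",
                (parent_id, token)).fetchone()
            if row:                     # existing node: add the new count
                conn.execute("UPDATE tokens SET count = count + ? WHERE id = ?",
                             (n, row[0]))
                ids[(parent_key, token)] = row[0]
            else:                       # new node
                ids[(parent_key, token)] = conn.execute(
                    "INSERT INTO tokens (parent_id, token, count) VALUES (?, ?, ?)",
                    (parent_id, token, n)).lastrowid
With 200k tokens but comparatively few distinct (parent, token) pairs, pass 2 executes far fewer statements than the row-at-a-time loop.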
Does anyone have any idea on how to optimize the insertion of this data?
I am using Rails (ActiveRecord) + SQLite 3.

grafana-influxdb get multiple rows for last timestamp

I am using Telegraf, InfluxDB, and Grafana together, but I could not get the rows for only the last timestamp.
Here is what I am doing:
Collecting DB statistics (the queries running at that moment) with Telegraf (exec plugin).
Storing the output in InfluxDB.
Trying to monitor the running queries in Grafana.
But I need to get all rows at the last timestamp.
Here is what I've tried:
> select * from postgresql_running_queries where time=(select max(time) from postgresql_running_queries)
ERR: error parsing query: found SELECT, expected identifier, string, number, bool at line 1, char 54
Here is what I want to see:
Time DB USER STATE QUERY
2017-06-06 14:25.00 mydb myuser active my_query
2017-06-06 14:25.00 mydb myuser idle in transaction my_query2
2017-06-06 14:25.00 mydb2 myuser2 active my_query3
Can anyone help me achieve this? I am open to any solution.
select last(fieldname) from measurement_name;
A query in this format will return the data at the last timestamp from InfluxDB.
But I am surprised by the fact that you are expecting 3 values for a single timestamp (unless you have different TAG values; refer to this documentation on how to store duplicate points). You will get ONLY ONE record for a given timestamp: InfluxDB overwrites the previous content if there is another entry for the same timestamp, here is why.
Your results will be something like this (if you don't have different TAG values):
Time DB USER STATE QUERY
2017-06-06 14:25.00 mydb2 myuser2 active my_query3
EDIT:
Based on the comment, my guess is that you are using TAGs to differentiate the series. The above query should still work; if not, you may try adding a WHERE clause.
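If db, user, and state are indeed tags and the SQL text is stored in a field named query (assumptions; adjust to your schema), grouping by the tags makes last() return the newest point per series, i.e. one row per db/user/state combination:
select last("query") as query from postgresql_running_queries group by "db", "user", "state"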

How to find document with SOLR query and exact string match

Considering a simple table:
CREATE TABLE transactions (
enterprise_id uuid,
transaction_id text,
state text,
PRIMARY KEY ((enterprise_id, transaction_id))
);
and a Solr core with default, auto-generated parameters.
How do I construct a Solr query that will find the record(s) in this table whose state value exactly matches an input, considering the state can be an arbitrary string?
I tried this with a state value of a+b. It works fine with q=state:"a+b", but that creates a "phrase query":
"rawquerystring": "state:\"a+b\"",
"querystring": "state:\"a+b\"",
"parsedquery": "PhraseQuery(state:\"a b\")",
"parsedquery_toString": "state:\"a b\"",
So, the same record is found if I use a query like q=state:"a(b", which results in the same phrase query and finds the record with state a+b. That is unacceptable to me, because I need an exact match.
I went through https://cwiki.apache.org/confluence/display/solr/Other+Parsers, and tried using q={!term f=state}a+b or q={!raw f=state}a+b, but neither even finds my sample transaction record.
Probably state got generated as a TextField, where standard tokenization (StandardTokenizer) is applied: a split is made on + and the plus sign itself is discarded. You could use a different tokenizer (whitespace?) or just make state a StrField to get an exact match.
This works for me with state as a StrField:
select * from transactions where solr_query='state:a+b';
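In the auto-generated schema that roughly amounts to switching the field from a tokenized text type to solr.StrField, along these lines (a sketch; the exact type names and attributes depend on your generated schema):
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="state" type="string" indexed="true" stored="true"/>
After changing the type, reindex so existing documents pick up the untokenized value.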

Using SimpleDB NextToken when records in query are updated

I have a case where we are doing a select on a domain like:
select * from mydomain where some_val = 'foo' and some_date < '2012-03-01T00:00+01:00'
When iterating the results of this query, we do some work and then update the row, setting the field some_date to the current date/time to mark it as processed.
The question I have is whether the NextToken request will break when it returns to SimpleDB for the next set of records. By the time it returns for the next batch, all of the rows in the first batch will have a some_date value that is no longer within the original query range.
I don't know how the NextToken is implemented, so I can't tell whether it's just a pointer to the next item or somehow an offset that might "skip" a whole batch of records.
So if we retrieved 3 records at a time and I had this in my domain:
record 1, '2012-01-12T19:20+01:00'
record 2, '2012-02-14T19:20+01:00'
record 3, '2012-01-22T19:20+01:00'
record 4, '2012-01-21T19:20+01:00'
record 5, '2012-02-22T19:20+01:00'
record 6, '2012-01-20T19:20+01:00'
record 7, '2012-01-18T19:20+01:00'
record 8, '2012-01-17T19:20+01:00'
record 9, '2012-02-12T19:20+01:00'
On my first execution I would get records 1, 2, 3.
If I set their some_date field to '2012-03-12T19:20+01:00' before returning for the next-token batch, would the next-token request then return 4, 5, 6? Or would it return 7, 8, 9 (because the token was set to start at the 4th record, and 1, 2, 3 are now no longer in the result set)?
If it is important: we are using the boto library (Python).
would the next-token request then return 4,5,6? Or would it return
7,8,9 [...]?
Good question, this can indeed be a bit confusing. Still, anything but the former (i.e. 4, 5, 6) wouldn't make sense for practical usage, and Amazon SimpleDB behaves accordingly, see Select:
Operations that run longer than 5 seconds return a time-out error
response or a partial or empty result set. Partial and empty result
sets contain a NextToken value, which allows you to continue the
operation from where it left off [emphasis mine]
Please take note of the additional note in the Request Parameters section though, which might be a bit surprising:
Note
The response to a Select operation with ConsistentRead set to
true returns a consistent read. However, for any following Select
operation requests that include a NextToken value, Amazon SimpleDB
ignores the ConsistentRead field, and the subsequent results are
eventually consistent. [emphasis mine]
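For completeness, a sketch of the processing loop with boto 2's SimpleDB API (method names from memory; verify against your boto version). boto's result set follows NextToken for you while you iterate, and per the note above only the first page honors ConsistentRead:
import boto

sdb = boto.connect_sdb()             # credentials come from the environment
domain = sdb.get_domain('mydomain')

query = ("select * from mydomain "
         "where some_val = 'foo' and some_date < '2012-03-01T00:00+01:00'")

# boto fetches further batches via NextToken as iteration proceeds, so
# rows updated in earlier batches are neither re-fetched nor skipped.
for item in domain.select(query, consistent_read=True):
    do_work(item)                                   # hypothetical processing
    item['some_date'] = '2012-03-12T19:20+01:00'    # mark as processed
    item.save()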
