Get data by duration (daily, weekly, monthly) in Neo4j CQL

New to neo4j.
My neo4j relationships look something like....
(p:Person{
name:"XYZ",
joinDate:'date_ in_datetime_format'
})
-[:PART_OF]->
(c:Club {name:"ABC"})
I want to produce a list of members for durations like daily, weekly, monthly, quarterly, yearly. So if the duration is daily, I want to get the number of people added for each day along with an index (a list of {ratio, index}). The same goes for weekly, monthly and so on.
I want to write different queries for the different duration cases.
I know how to count all schemes and members and how to get the cases; where I am getting stuck is counting the members added in a duration such as daily, weekly, and so on.

Cypher groups implicitly: in a RETURN (or WITH) clause, the non-aggregated expressions act as the grouping keys, and the aggregating functions are computed once per key.
https://neo4j.com/docs/cypher-manual/current/functions/aggregating/
So if you return the day of the Person joinDate property and a count(*) it will do what you're looking for.
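For months you return the month, for quarters you return the quarter, and for weeks the week (paired with weekYear). This assumes joinDate is stored as a temporal value (for example a datetime), not as a string. The components available on temporal instants are listed here: https://neo4j.com/docs/cypher-manual/current/syntax/temporal/#cypher-temporal-accessing-components-temporal-instants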
For daily:
MATCH (p:Person)-[:PART_OF]->(c:Club {name: "ABC"})
RETURN p.joinDate.year + '-' + p.joinDate.month + '-' + p.joinDate.day AS window, count(*) AS count
For monthly:
MATCH (p:Person)-[:PART_OF]->(c:Club {name: "ABC"})
RETURN p.joinDate.year + '-' + p.joinDate.month AS window, count(*) AS count
For quarterly:
MATCH (p:Person)-[:PART_OF]->(c:Club {name: "ABC"})
RETURN p.joinDate.year + '-' + p.joinDate.quarter AS window, count(*) AS count
For yearly:
MATCH (p:Person)-[:PART_OF]->(c:Club {name: "ABC"})
RETURN p.joinDate.year AS window, count(*) AS count
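To make the grouping idea concrete outside of Cypher, here is a minimal JavaScript sketch of what the queries compute: a window key is derived from each join date and the rows are counted per key. The helper names windowKey and countByWindow and the sample dates are hypothetical, chosen only for illustration; dates are read in UTC.

```javascript
// Hypothetical helper: builds the grouping key for one join date,
// mirroring the year/month/quarter/day components used in the queries.
function windowKey(date, duration) {
  const y = date.getUTCFullYear();
  const m = date.getUTCMonth() + 1; // 1..12
  const d = date.getUTCDate();
  switch (duration) {
    case 'daily': return `${y}-${m}-${d}`;
    case 'monthly': return `${y}-${m}`;
    case 'quarterly': return `${y}-${Math.ceil(m / 3)}`;
    case 'yearly': return `${y}`;
    default: throw new Error('Unknown duration: ' + duration);
  }
}

// Counts how many join dates fall into each window (the count(*) per key).
function countByWindow(joinDates, duration) {
  const counts = new Map();
  for (const date of joinDates) {
    const key = windowKey(date, duration);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Made-up sample data: two members joined in January 2019, one in May 2019.
const joinDates = [
  new Date(Date.UTC(2019, 0, 5)),
  new Date(Date.UTC(2019, 0, 20)),
  new Date(Date.UTC(2019, 4, 1)),
];
console.log(countByWindow(joinDates, 'monthly'));
```

The key plays the role of the non-aggregated column in RETURN, and the map value plays the role of count(*).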


How to find the overall avg and compare each row (line item) to it

I'd really appreciate any help on this:
Background on my data: [Date] (down to hour level), [Volume] -> Volume is equivalent to SUM[Number of Records]
How do I find the [avg volume (for each line item - each hour of each day) + std.dev (volume - for each line item - each hour of each day)] and then compare it against each hour of each day
So if each line item > (avg + std.dev) then it should say "ITEM A", Else "ITEM B"
Then, when I make a table: Date, Volume, Item - it should say whether each hour of each day is ITEM A or ITEM B
What I have tried:
IF [Volume] > (AVG([Volume]) + STDEV([Volume])) THEN "ITEM A" ELSE "ITEM B" END
Errors I am getting:
AVG is being called with (table), did you mean float
Tables can only be aggregated and only using COUNT function
STDEV is being called with (table), did you mean float
Can't compare table and integer values
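The arithmetic being asked for can be sketched in plain JavaScript, independent of whichever reporting tool produced the errors above: compute the overall average and standard deviation once over all rows, then compare each row against that single threshold. The function name labelAboveThreshold is hypothetical, and using the sample standard deviation (dividing by n - 1) is an assumption.

```javascript
// Labels each volume "ITEM A" if it exceeds (overall mean + overall std dev),
// otherwise "ITEM B". The aggregates are computed once over the whole list.
function labelAboveThreshold(volumes) {
  const n = volumes.length;
  const mean = volumes.reduce((sum, v) => sum + v, 0) / n;
  // Sample variance (n - 1 in the denominator); an assumption, see lead-in.
  const variance =
    volumes.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
  const threshold = mean + Math.sqrt(variance);
  return volumes.map((v) => (v > threshold ? 'ITEM A' : 'ITEM B'));
}

// Made-up hourly volumes: only the last hour is well above the rest.
console.log(labelAboveThreshold([10, 10, 10, 10, 50]));
```

The point of the sketch is the separation of levels: the threshold is a table-wide value, while the comparison happens per row.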

Should this be a SUMIF formula?

I'm trying to make a formula that can recognize in Column A the name Brooke B for instance here, from there I'd like to SUM the values listed in Column I Cash Discounts for that specific user.
(Yes this user has no Cash Discounts, thus column I states "Non-Cash Payment").
There's about 80 users total here, so I'd prefer to automate the name recognition in Column A.
Sheet: https://docs.google.com/spreadsheets/d/1xzzHT7VjG24UJ4ZXaiZWsfzroTpn7jCJLexuTOf6SQs/edit?usp=sharing
Desired Results listed in Cash Discounts sheet, listed per user in column C.
You are trying to calculate the total amount of the Cash Discount per person given to people in a list. You have data that has been exported from a POS system, to which you have added a formula to calculate the amount of the discount on a line-by-line basis. You have speculated whether the discount totals could be calculated using SUMIFS formulae.
In my view, the layout of the spreadsheet and the format of the POS report do not lend themselves to isolating discrete data elements through Google Sheets functions (though, no doubt, someone with greater skills than I will disprove this theory). Column A, containing names, also includes sub-groupings (and their sub-totals) as well as transaction dates. There are 83 unique persons and over 31,900 transaction lines.
This answer is a script-based solution which updates a sheet with the names and values of the discount totals. The elapsed execution time is about 11 seconds.
function so5882893202() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  // get the Discounts sheet
  var discsheetname = "Discounts";
  var disc = ss.getSheetByName(discsheetname);
  // get the Discounts data
  var discStartrow = 3;
  var discLR = disc.getLastRow();
  var discRange = disc.getRange(discStartrow, 1, discLR-discStartrow+1, 9);
  var discValues = discRange.getValues();
  // isolate Column A
  var discnameCol = discValues.map(function(e){return e[0];});//[[e],[e],[e]]=>[e,e,e]
  //Logger.log(discnameCol); // DEBUG
  // isolate Column I
  var discDiscounts = discValues.map(function(e){return e[8];});//[[e],[e],[e]]=>[e,e,e]
  //Logger.log(discDiscounts); // DEBUG
  // create an array to build a names list
  var names = [];
  // get the number of rows on the Discounts sheet
  var discNumrows = discLR-discStartrow+1;
  // Logger.log("DEBUG: number of rows = "+discNumrows);
  // identify search terms
  var searchPercent = "%";
  var searchTotal = "Total";
  // loop through Column A
  for (var i=0; i<discNumrows; i++){
    //Logger.log("DEBUG: i="+i+", content = "+discnameCol[i]);
    // test if value is a date
    if (Object.prototype.toString.call(discnameCol[i]) != "[object Date]") {
      //Logger.log("it isn't a date")
      // test whether the value contains a % sign
      if ( discnameCol[i].indexOf(searchPercent) === -1){
        //Logger.log("it doesn't have a % character in the content");
        // test whether the value contains the word Total
        if ( discnameCol[i].indexOf(searchTotal) === -1){
          //Logger.log("it doesn't have the word total in the content");
          // test whether the value is a blank
          if (discnameCol[i] != ""){
            //Logger.log("it isn't empty");
            // this is a name; add it to the list
            names.push(discnameCol[i]);
          }// end test for empty
        }// end test for Total
      } // end for percentage
    } // end test for date
  }// end for
  //Logger.log(names);
  // get the number of names
  var numnames = names.length;
  //Logger.log("DEBUG: number of names = "+numnames)
  // create an array for the discount details
  var discounts=[];
  // loop through the names
  for (var i=0;i<numnames;i++){
    // Logger.log("DEBUG: name = "+names[i]);
    // get the first row and last rows for this name
    var startrow = discnameCol.indexOf(names[i]);
    var endrow = discnameCol.lastIndexOf(names[i]+" Total:");
    var x = 0;
    var value = 0;
    // Logger.log("name = "+names[i]+", start row ="+ startrow+", end row = "+endrow);
    // loop through the Cash Discounts Column (Column I) for this name
    // from the start row to the end row
    for (var r = startrow; r<endrow;r++){
      // get the value of the cell
      value = discDiscounts[r];
      // test that it is a value
      if (!isNaN(value)){
        // increment x by the value
        x = +x+value;
        // Logger.log("DEBUG: r = "+r+", value = "+value+", x = "+x);
      }
    }
    // push the name and the total discount onto the array
    discounts.push([names[i],x]);
  }
  //Logger.log(discounts)
  // get the reporting sheet
  var reportsheet = "Sheet10";
  var report = ss.getSheetByName(reportsheet);
  // define the range (allow row 1 for headers)
  var reportRange = report.getRange(2,1,numnames,2);
  // clear any existing content
  reportRange.clearContent();
  // update the values
  reportRange.setValues(discounts);
}
Report Sheet - extract
Not everyone wants a script solution to their problem. This answer seeks to supply a repeatable solution using common garden-variety formula/functions.
As noted elsewhere, the layout of the spreadsheet does not lend itself to a quick/simple solution, but it IS possible to break down the data to compile a non-script answer. Though it may "seem" as though the following formulas are less than "simple", when taken one at a time they are logical, very easy to create, and very easy to verify.
Note: It is important to know at the outset that the first row of data = row#3, and the last row of data = row#31916.
Step#1 - get Text values from ColumnA
Enter this formula in Cell J3, and copy to row 31916
=if(isdate(A3),"",A3):
evaluates Column A; if the content is a date, returns blank, otherwise returns the content
Taking Customer "AJ" as an example, the content at this point includes:
AJ
10% BuildingDiscount
10% BuildingDiscount Total:
Northwestern 10%
Northwestern 10% Total:
AJ Total:
Step#2 - ignore the values that contain "10%" (this removes both headings and sub-subtotals)
Enter this formula in Cell K3 and copy to row 31916
=iferror(if(search("10%",J3)>0,"",J3),J3): searches for "10%" in Column J. Returns all values except those containing "10%".
Taking Customer "AJ" as an example, the content at this point includes:
AJ
AJ Total:
Step#3 - ignore the values that contain the word "Total"
Enter this formula in Cell L3 and copy to row 31916.
=iferror(if(search("total",K3)>0,"",K3),K3)
Taking Customer "AJ" as an example, the content at this point includes:
AJ
Results after Step#3
You might wonder, "couldn't this be done in a single formula?" and/or "wouldn't an array formula be more efficient?". Both those thoughts are true, but we're looking at simple and easy, and a single formula is NOT simple (as shown below); and given that, an array formula is out of the question unless/until an expert can wave a magic wand over the data.
FWIW - Combining Steps#1, 2 & 3
Steps#1, 2 and 3 build on each other, so it is possible to create a single formula that combines them.
Enter this formula in Cell J3, and copy down to row#31916.
=iferror(if(search("total",iferror(if(search("10%",if(isdate(A3),"",A3))>0,"",if(isdate(A3),"",A3)),if(isdate(A3),"",A3)))>0,"",iferror(if(search("10%",if(isdate(A3),"",A3))>0,"",if(isdate(A3),"",A3)),if(isdate(A3),"",A3))),iferror(if(search("10%",if(isdate(A3),"",A3))>0,"",if(isdate(A3),"",A3)),if(isdate(A3),"",A3)))
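For readers who find the nested formula hard to follow, the same three tests can be written as one small JavaScript function. This is only a sketch of the logic of Steps#1 to 3; the function name customerNameOrBlank is hypothetical, and the case-insensitive match on "total" mirrors the behaviour of SEARCH.

```javascript
// Returns the cell text if it looks like a customer name, otherwise "".
// Mirrors Steps 1-3: drop dates, drop "10%" headings/sub-totals, drop totals.
function customerNameOrBlank(cell) {
  if (cell instanceof Date) return '';                 // Step 1: dates
  const text = String(cell);
  if (text.includes('10%')) return '';                  // Step 2: "10%" rows
  if (text.toLowerCase().includes('total')) return '';  // Step 3: total rows
  return text;
}

// The Column A content for customer "AJ" from the example above,
// plus one transaction date.
const columnA = [
  'AJ',
  '10% BuildingDiscount',
  '10% BuildingDiscount Total:',
  'Northwestern 10%',
  'Northwestern 10% Total:',
  'AJ Total:',
  new Date(2019, 0, 15),
];
console.log(columnA.map(customerNameOrBlank).filter((v) => v !== ''));
```

Only the first occurrence of the customer name survives, which is exactly the state of Column L after Step#3.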
As the image showed, Step#3 concludes with mainly empty cells in Column L; the only populated cell is the first instance of the customer name at the start of their transactions, such as "Alec" in this example. However (props to @Rubén) it is possible to populate the blank transaction cells in Column L. "An arrayformula to find the previous non-empty cell in another column" on Web Applications explains how.
Step#4 - Create a customer name for each transaction row.
Enter this formula in Cell M3, it will automatically populate the cells to row#31916
=ArrayFormula(vlookup(ROW(3:31916),{IF(LEN(L3:L31916)>0,ROW(3:31916),""),L3:L31916},2))
Step#5 - Get the discount amount for each transaction value
The discount values are already displayed in Column I. They are interspersed with text values, so the formula first tests whether the row is a transaction line (rather than a total line) by checking Column D; only if there is a value (a product item) does the formula then test whether there is a numeric value in Column I.
Enter this formula in Cell N3, it will automatically populate the cells to row#31916
=ArrayFormula(if(len(D3:D31916)>0,if(ISNUMBER(I3:I31916),I3:I31916,0),""))
Screenshot after step#5
Reporting by Query
Reporting is done via queries. These can go anywhere, but it is probably more convenient to put it on a separate sheet.
Step#6.1 - query the results to create report showing total by ALL customers
=query(Discounts_analysis!$M$2:$N$31916,"select M, sum(N) where N is not null group by M label M 'Customer', sum(N) 'Total Discount' ",1)
Step#6.2 - query the results to create report showing total by customer where the customer received a discount
=query(Discounts_analysis!$M$2:$N$31916,"select M, sum(N) where N >0 group by M label M 'Customer', sum(N) 'Total Discount' ",1)
Step#6.3 - query the results to create report showing customers with no discount
=query(query(Discounts_analysis!$M$2:$N$31916,"select M, sum(N) where N is not null group by M label M 'Customer', sum(N) 'Total Discount' ",1),"select Col1 where Col2=0")
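Steps#4 to 6 amount to a fill-down of the customer name followed by a grouped sum. A minimal JavaScript sketch of that idea follows; the row shape ({name, product, discount}) and the function name totalsByCustomer are hypothetical stand-ins for Columns L, D and I, and the sample rows are made up.

```javascript
// rows: one object per sheet row, with
//   name     - Column L after Step 3 (customer name or "")
//   product  - Column D (non-empty only on transaction lines)
//   discount - Column I (a number, or text such as "Non-Cash Payment")
function totalsByCustomer(rows) {
  const totals = new Map();
  let current = '';
  for (const row of rows) {
    if (row.name !== '') current = row.name; // Step 4: carry the name down
    if (row.product === '') continue;        // Step 5: skip non-transaction lines
    const amount = typeof row.discount === 'number' ? row.discount : 0;
    totals.set(current, (totals.get(current) || 0) + amount); // Step 6: sum
  }
  return totals;
}

const sampleRows = [
  { name: 'AJ', product: '', discount: '' },
  { name: '', product: 'Widget', discount: 1.5 },
  { name: '', product: 'Gadget', discount: 'Non-Cash Payment' },
  { name: '', product: '', discount: 1.5 }, // sub-total line, ignored
  { name: 'Brooke B', product: '', discount: '' },
  { name: '', product: 'Widget', discount: 'Non-Cash Payment' },
];
console.log(totalsByCustomer(sampleRows));
```

Customers whose every transaction is a non-cash payment end up with a total of 0, which corresponds to the report in Step#6.3.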
Queries screenshot

Google Ads Script (AWQL) get custom date range for reporting

I need to pull a google ads report that will get data from a fixed date (28th May) until today and push the data to a spreadsheet. I can't figure out how to define the date range for this query
I've tried googling and reading the google documentation but I can't figure it out
function main() {
  var spreadsheet = SpreadsheetApp.openByUrl('https://docs.google.com/spreadsheets/d/XXX');
  var sheet = spreadsheet.getSheetByName('Data');
  var report = AdsApp.report(
    'SELECT Date, CampaignName, AverageFrequency, Impressions, ImpressionReach ' +
    'FROM CAMPAIGN_PERFORMANCE_REPORT ' +
    'WHERE Impressions > 0 ' +
    'DURING 20190528,TODAY');
  sheet.clearContents();
  report.exportToSheet(sheet);
}
I need to use today as the end date instead of the campaign end date as the end date for this query as I'm trying to pull frequency as a metric and it will just show blank values if the end date is in the future.
Please let me know if there is a way to make the query work. Thanks!
TODAY is a predefined date range in its own right, so it fills the whole DURING clause and cannot be used as the end of a custom range (as far as I know). Building the end date as a string should work:
function main() {
  var endDate = new Date();
  // use lowercase 'yyyy' (calendar year); 'YYYY' is the week year and can be wrong around New Year
  var endRange = Utilities.formatDate(endDate, 'America/Chicago', 'yyyyMMdd');
  var spreadsheet = SpreadsheetApp.openByUrl('https://docs.google.com/spreadsheets/d/XXX');
  var sheet = spreadsheet.getSheetByName('Data');
  var report = AdsApp.report(
    'SELECT Date, CampaignName, AverageFrequency, Impressions, ImpressionReach ' +
    'FROM CAMPAIGN_PERFORMANCE_REPORT ' +
    'WHERE Impressions > 0 ' +
    'DURING 20190528,' + endRange);
  sheet.clearContents();
  report.exportToSheet(sheet);
}
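If you want to check the date formatting outside of the Ads Scripts environment, the DURING clause can be built with plain JavaScript as in the sketch below. The helper names toAwqlDate and duringClause are hypothetical, and the sketch reads the dates in UTC for simplicity; in a real script the account's time zone matters.

```javascript
// Formats a Date as YYYYMMDD, the format AWQL expects in a custom date range.
// Uses the UTC calendar fields (an assumption made to keep the sketch simple).
function toAwqlDate(date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  const day = String(date.getUTCDate()).padStart(2, '0');
  return `${year}${month}${day}`;
}

// Builds the DURING clause for a fixed start date and a computed end date.
function duringClause(start, end) {
  return `DURING ${toAwqlDate(start)},${toAwqlDate(end)}`;
}

const start = new Date(Date.UTC(2019, 4, 28)); // 28 May 2019
const end = new Date(Date.UTC(2019, 6, 23));   // 23 July 2019
console.log(duringClause(start, end));         // DURING 20190528,20190723
```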
Date ranges for the report are defined in the DURING clause of the query. Date ranges can be specified in two different ways:
A custom date range using regular AWQL syntax, for example:
SELECT Id, Criteria, AdGroupName
FROM KEYWORDS_PERFORMANCE_REPORT
DURING 20190101,20190325
A date range type, for example:
SELECT Id, Criteria, AdGroupName
FROM KEYWORDS_PERFORMANCE_REPORT
DURING LAST_7_DAYS
In your case you should use:
DURING 20190528, 20190723
There is no other option for you to do that.

Get ISO dates out of a Neo4j node property

Within a graph there are Person-Nodes which have properties with information about the birthday and place of birth of a person e.g.
Jaroslavice 8.10.1679
Alcudia 26.7.1689
Is it possible to get ISO dates and the place out of that text property and put them in new properties?
It is certainly possible.
One way would be to search for nodes that do not contain your new property; then use the split function to divide the text on spaces and periods; and then reassemble the date in the format you require.
Something like this...
MATCH (person:Person)
WHERE person.birthdate IS NULL
WITH person,
     split(person.informations,' ')[0] AS place,
     split(person.informations,' ')[1] AS date
WITH person,
     place,
     split(date,'.')[0] AS day,
     split(date,'.')[1] AS month,
     split(date,'.')[2] AS year
// left-pad year, month and day with zeros and join them as yyyy-MM-dd
SET person.birth_place = place,
    person.birthdate = substring('0000', 0, 4 - size(year)) + year
        + '-'
        + substring('00', 0, 2 - size(month)) + month
        + '-'
        + substring('00', 0, 2 - size(day)) + day
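The same split-and-pad logic can be tried out in plain JavaScript before running it against the graph. This is only a sketch: the function name parseBirthInfo is hypothetical, and unlike the query above it splits on the last space, which is an assumption intended to tolerate place names that contain spaces.

```javascript
// Splits a value such as "Jaroslavice 8.10.1679" into a place and an
// ISO 8601 date string (yyyy-MM-dd), zero-padding each date component.
function parseBirthInfo(info) {
  const lastSpace = info.lastIndexOf(' ');
  const place = info.slice(0, lastSpace);
  const [day, month, year] = info.slice(lastSpace + 1).split('.');
  const isoDate = [
    year.padStart(4, '0'),
    month.padStart(2, '0'),
    day.padStart(2, '0'),
  ].join('-');
  return { place, isoDate };
}

console.log(parseBirthInfo('Jaroslavice 8.10.1679'));
console.log(parseBirthInfo('Alcudia 26.7.1689'));
```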

Neo4j Aggregate Multiple Lines into a Map

I have the following Cypher script:
MATCH (sy:SchoolYear)<-[:TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
RETURN d.date, s.abbreviation, count(e)
ORDER BY d.date
This gives me all of the dates in the range that I want and returns number of students that have enrolled for each school for that date, or null. The only issue I have is that different schools are on different lines, causing a single date to have multiple lines. I would like to aggregate those into a single line per date.
I'm given:
1/1/2000, School 1, 5
1/1/2000, School 2, 10
1/2/2000, null, null
1/3/2000, School 1, 6
What I would like:
1/1/2000, {School 1 : 5, School 2: 10}
1/2/2000, null
1/3/2000, {School 1: 6}
I've tried:
MATCH (sy:SchoolYear)<-[TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
WITH d, s.abbreviation as abb, count(e) as enr
RETURN d.date, {abb:enr}
ORDER BY d.date
How should I go about this?
Here is how I would go about this: aggregate each school into a map, and then collect the maps into a list per date.
MATCH (sy:SchoolYear)<-[:TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
WITH d, s, count(e) as students
RETURN d.date, collect({name:s.abbreviation, students:students})
ORDER BY d.date
This is a bit ugly, but I think it returns what you are after. I tried using the school name as a key just like you did in your example and I could not get that to work either. In the end I resorted to this.
MATCH (sy:SchoolYear)<-[:TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
// count the enrollments per day and school
with d, s.abbreviation as abbreviation, count(e) as students
// collect all of the school counts together by date
with d.date as date, collect([abbreviation, students]) as school_counts
// format the school counts as a string with the schools
// as keys and the counts as values
with date, reduce( out = "", s in school_counts | out + s[0] + " : " + s[1] + ", " ) as school_count_str
return date, '{ ' + left(school_count_str, size(school_count_str)-2) + ' }' as school_counts
order by date
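If the rows are post-processed on the client instead, folding them into one map per date is straightforward. The JavaScript sketch below works on rows shaped like the output of the original query ([date, school, count]); the function name groupByDate is hypothetical and the sample rows are the ones from the question.

```javascript
// Folds [date, school, count] rows into a Map of date -> {school: count}.
// Dates without any enrollment (school === null) map to null.
function groupByDate(rows) {
  const byDate = new Map();
  for (const [date, school, count] of rows) {
    if (!byDate.has(date)) byDate.set(date, null);
    if (school === null) continue;
    const schools = byDate.get(date) || {};
    schools[school] = count;
    byDate.set(date, schools);
  }
  return byDate;
}

const rows = [
  ['1/1/2000', 'School 1', 5],
  ['1/1/2000', 'School 2', 10],
  ['1/2/2000', null, null],
  ['1/3/2000', 'School 1', 6],
];
console.log(groupByDate(rows));
```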
