Google Datastudio show empty row combinations - google-sheets

I am creating a Google Data Studio report for a car dealership and I have a problem.
I have made these 3 screenshots to illustrate:
If you see on the first screenshot, the datasource is pretty simple, used/new indicates weather the car being sold is new or used and if it is a sportscar or family car, and exchange/clean deal indicates weather the dealership takes/buys the customers old car in for a trade off in price. The rest should be self explanatory.
On screenshot2-3 you see my report, I have one table for each salesperson and it shows the amount of sales for each combination that has sales.
The problem is this, I want the tables to show each combination even if it does not have any sales at all, it should just show 0 then in record count. Like Mike on the left has more combinations than john, I still want Johns table to show those combinations just with a 0 then, and it should be sorted the same on each table so they look the same, just different data in the cells.
Is this possible to do?

To solve this problem, you need to make a combination of data, from the database with itself. Your main analysis dimension, which will generate your combinations, is used/new and Exchange/Clean deal. So your combination should be:
The filter defined in the second database (right base) must contain a filter telling which person the table will be destined for. So, for each table, you must make a new combination that contains the person-specific filter.
I just took a sample from your original database (10 first lines) and the result is:

Related

Joining of non-unique cells across multiple columns in Google Sheets

I have some personal data with first/last names, e-mails, institutions people work at, etc. There are many, many duplicates because this was collected from a few sources over 2-3 years. Sometimes the same person provided different versions of their name, a different e-mail address, etc. I'd like to have a compact version of this data, where a single person (identified by a PersonID) is listed on a single row, with unique variants of their name, e-mail, etc. listed in each cell. Bonus points if the values in every cell are sorted, but far from required.
Example above also available at https://docs.google.com/spreadsheets/d/1jizgysC1dntZHg8pZ0--dSAPevSfyXyiVyenj02GiwQ/edit#gid=0
I'm looking for a way to display the unique values in each column of a filter result, ideally staying away from =QUERY if at all possible.
This is easy to do when working with just one resulting column:
=FILTER(A4:A9,D4:D9=1) --> =JOIN(", ",UNIQUE(FILTER(A4:A9,D4:D9=1)))
...but the moment the filter spits out results in multiple columns:
=FILTER(A4:C9,D4:D9=1) --> ???
...I have no clue what to do, other than doing the code above for each column separately (which would be a hassle, given the number of columns involved). Is this possible?
Here's one way to do that using MAP:
=MAP(UNIQUE(D4:D),LAMBDA(id,BYCOL(FILTER(A4:D,D4:D=id),LAMBDA(col,JOIN(CHAR(10),UNIQUE(col))))))
By stablishing a column in I with unique values (=UNIQUE(D4:D)), you can use this MAKE ARRAY:
=MAKEARRAY(COUNTA(I4:I),COUNTA(F3:H3),LAMBDA(r,c,
JOIN(CHAR(10),FILTER(INDEX(A4:C,,c),D4:D=INDEX(I4:I,r)))))

How to OCR scanned voting protocols

As part of a hobby project I'm trying to digitalise all the voting records of the Swedish parliament to see if I can extract any interesting statistics (yes a strange hobby I know).
From 1983 to 2001 the voting records look something like in the example. They are printed from some kind of voting machine and only exist on paper (that are now scanned and on my disk).
Every vote consists of three pages with two columns each of members and votes as in the example.
Some translations from Swedish: (Plats = Seat, Ledamöter = Members, Parti = party, Röst = Vote).
The list is sorted on the party column and then alphabetically on the member column.
After an election a member stays on their seat until next election or that the member quits parliament and is replaced by a replacement member. There can also be temporary replacements.
Replacements are also sorted into the list.
The Parti/Party column contains the party abbreviation and can only be one of nine letters (since there are only nine different parties in that time period). Members stay with their party but can technically change between or during election periods.
The Röst/vote column can be one of J,N,A,F (Ja = Yey, Nej = Ney, Avstår = Pass, Frånvarande = Absent) and are also aligned in a four sub-columns.
The columns and rows are not always in the same place in the picture. It can both be slightly translated and/or rotated. The quality of the scan is also not always this good.
At the top of the first page (not shown i the example) there is a summation of votes per party that can be checked against.
There are about 18.000 votes in total.
For votes before to 1983 the layout was different and I was able to make a custom program in node.js (although most languages would be ok for me) that semi-automatically could scan the votes but this looks like something that should be easy to do with tesseract or something similar.
My question is really if its possible to hint tesseract about the layout so that it can do some better guesses of what the text is. I'm aware one can make a custom wordlist (where I could for example add all members names manually).
I'm guessing that there might be a way to make a custom pattern list but I haven't figured that out.
Does anyone have any good suggestions on how to tackle this?

When using QUERY, how can I make it so that data moves together when using filter?

I am creating a tool for a video game I play.
Link to the example spreadsheet (Please make a copy to edit so that this copy stays intact for additional helpers).
Sheet 1 is “Choose Owned”. It contains a list of all of the champions available in the game and includes their attributes.
Column A contains checkboxes. Checking a checkbox indicates that the user owns that champion, and brings it to Sheet 2.
Sheet 2 is called “Owned”. It contains a list of the champions checked off in “Choose Owned” (aka the champions the user owns). “Owned” includes the champion attributes too, as first seen in “Choose Owned”.
Beyond those same attributes, “Owned” contains 8 additional columns.
These columns are from Columns G:N and are labeled ‘Level’, ‘Rank’,
‘Ascension Lvl’, and ‘Team Label(s)’ (‘Team Label(s)’ takes up
columns J:N). This data is all unique information and requires the
user to input the information themselves depending on their
champions.
Because there are so many champions, I want the user to be able to use the Filter function in “Owned” so they can easily locate the champion they need or sort the table however they wish.
However, because I use the QUERY function to get the data from “Choose Owned”, the Filter function tends to break. The most obvious error comes when you try to sort A-Z or Z-A; this simply cannot be done. I was fine with this, and have even included a note at the top telling the user to avoid sorting alphabetically.
Everything else works correctly until the user tries to add a new champion from “Choose Owned”. When the champion is added to “Owned”, the additional, unique data in columns G:N go out of order because they don’t move with their original champions.
Example:
I choose my champions. These champions are copied to “Owned”.
I pick their relative data in columns G:N.
A few days later, I obtain new champions and check them off in “Choose Owned” so they are added to “Owned”. However, when I do this, the champions stay in the same order as they are in “Choose Owned”, and columns G:N do not move with their champions so now, that information is with the wrong champion.
I want the additional data (G:N) to move with their champions when the table is edited due to champions being added. Or, in other words, I want those columns to stay linked to the first columns.
If there is a different way to achieve all of this like if I have to use a function other than QUERY, that is fine!
Please share any solutions you may have. I would prefer to not use a script but will consider the idea if it works.
this is a common issue within Google Sheets and it's solvable in 2 ways:
either by introducing a common value (unique ID) and then linking the manual input to query and aligning it by ID
or easier approach in your case - using timestamp/linear ID so every new entry would be added to the bottom and then the query would be sorted based on this order.

How to reflect multiple cells based on specific criteria

Imagine a list on the left filled with employees going down the spreadsheet and headers across the top based categorized on infractions that an employee might violate. this sheet is connected to another sheet which adds a one every time a form is submitted against the employee adding up for the quarter. So employee john smith has across his row would show a 0 if he never committed this infraction and add a 1 to the column each time he did so a row might look like this. John Smith 0 4 5 0 1
The goal is to show the experts name and infraction with how many times this infraction took place removing the infractions that he did not commit so ideally it would look like John Smith 4 5 1 and the header of each number would show what he did.
The goal is to make it much easier to see who did what essentially. There will be over 100 employees and alot of 0's so optically it would look better to distill in order to quickly identify who did what and how many times.
Any ideas?
V lookups and important ranges based on if this is greater than 0 is tedious and does not exactly pull what we want. Essentially omitting the 0s and just showing what an employee has done rather than what they have not done is the goal. All index and match formulas do not seem to specifically answer this problem
simple Index V lookups and matching formulas have been tried
Not able to reflect all three variables (employee/frequency/infraction) while not showing on a master list the people who did not commit the offense
There's a few ways you could set this up. I would set this up so it
Column A = Employee
Column B = Infraction
Column C = 1
Column D = Date
That way you can do a pivot summary and have the employees, with their infractions below their name and the months/years they occurred. Also you can adjust this table as necessary, such as filter by the employee name or by date or by infraction.
The added benefit is you could create a chart with all of these as filters, like cutoff a date range or pick an employee or infraction and it can show a bar graph of all the infractions by month or something like that.
I would agree that listing your data of infractions line by line (as they happen) and using a pivot table would probably be the easiest.
You could also use the AGGREGATE function to pull from a large database as well. This way you could type in an employees name, and a list of all infractions would pop up next to the name (or wherever you would want it) with as much detail as you would like. This way is more complex, but using both a pivot table and the AGGREGATE function might get you the best of both worlds (you could searching infraction types, dates, employees, employee types, and get all the details in the world if wanted).
Hope this helps!
JW

"Relationship may be needed" although relationship is set

I have two simple table in Excel 2010 - Products and Sales:
I then linked them into PowerPivot - and here created the relationship from Sales.ProductId to Products.Id - like this:
Now I'm trying to build a Pivot that for each productId in the Sales table also shows me the Category and the PurchasePrice from the ProductTable. However, here I'm stuck - because the relationship is somehow faulty. When configuring the Pivot, I get a message Relationship may be needed even if the relationship already exists. The resulting - wrong - pivot table looks like this:
though I'm actually trying to achieve this:
I know that I could create calculated columns in the Sales table of PowerPivot and pull all required data using the RELATED DAX function - but as the real world project requires a lot of joins and fields I'm hoping that there's a better solution than this workaround...
I uploaded the example file here.
Peter,
Your question was a bit more tricky than I first though and I spent quite some time digging around, but I think I found the solution after reading this article about making distinct count in related tables.
I have updated your source file (so that Sales table contains more than 1 sales per product) and uploaded it to my public Dropbox folder.
You can see that I created 4 new calculated measures to illustrate my solution and to make it a bit easier to understand (Excel 2010 terminology, in 2013 it's Calculated Field):
Sales Price Total
=SUM(Sales[SalesPrice])
Products Sold
=COUNT(Sales[ProductId])
Purchase Price per Item
=CALCULATE(SUM(Products[PurchasePrice]),Sales)
Purchase Price Total
=[Purchase Price per Item] * [Product Sold]
The key difference here is the formula for calculating purchase price per item. The reason why you should SUM the product purchase price within CALCULATE function is explained in great detail in the linked article (even though in a bit different context):
In this way, any filter active on Sales (the related table) propagates
to the lookup table.
There might be some other parts I missed, but I have tried this in couple other examples and it simply works as it should:

Resources