Associate scatter plot data with per-item labels - google-sheets

I have some data in a Google Sheets table, formatted like so:
Label | ValueA | ValueB
------+--------+-------
A | 1 | 1
B | 1 | 2
A | 3 | 3
B | 2 | 4
C | 9 | 1
I would like to render a scatterplot, with a single colored point for each entry, in which everything with an A label is color 1, everything with a B label is color 2, and so on, and they all share the same coordinate space.
I've poked around quite a bit in the options available in the UI, but nothing seems to do it. Multicolor plots can be made, but they never associate the labels the way I want them to.
I guess this will take some scripting to do, but I really don't know where to start.

Maybe try a bubble chart instead?
I suspect what you really want may be:
but the logic of the data layout that seems to be required to achieve this escapes me.


Esttab: Create new row with logarithm of a beta coefficient

I am using Stata and am trying to figure out how to create a new row that shows the relative effect of a certain coefficient.
To give some sample code, this is what I have:
eststo, title(log_total[1]): reg log_total a b
eststo, title(log_total[2]): reg log_total a b c
esttab using total.tex
However, in addition to the rows for a, b, and c, I want a row labelled "Effect" for a, in which I calculate exp(a)-1 and print it as a percentage.
The table should look like the following:
| | total[1]| total[2]|
|:---- |:------:| -----:|
| a| 0.014| 0.021|
| b| 0.031| 0.005|
| c| | 0.082|
| Effect| 1.4 %| 2.1 %|
How can I add this "Effect" row to my table using esttab? I tried using estadd which works for fixed values but I was not able to figure out how to include a calculation in there.
Thank you a lot!

Add categories in MDS plot

I) PROBLEM
Let’s say I have a matrix like this with distances (in kilometers) between the homes of different people.
| | Person 1 | Person 2 | Person 3 |
|----------|----------|----------|----------|
| Person 1 | | | |
| Person 2 | 24 | | |
| Person 3 | 17 | 153 | |
And I have a data table like this:
| Person | Party |
|----------|----------|
| Person 1 | Party A |
| Person 2 | Party B |
| Person 3 | Party C |
I want to do multidimensional scaling (dissimilarity by distance) to visualize i) how close each person lives to another; ii) which party each person votes for (different colors for each party)
II) CURRENT RESULT
My current MDS plot (made with SPSS) looks like this (I use the SPSS menu commands rather than command syntax):
III) EXPECTED RESULT
I want to add a different color for each person depending on which party this person votes for:
IV) QUESTION(S)
Can I do this in SPSS? How do I add the voting data to the matrix, and how do I show it in the MDS plot?
EDIT
There is a very similar problem, with a solution, for R: "Create double-labeled MDS plot".
But I want to do it in SPSS.
I don't believe it's possible to create a plot like the one you show directly from either of the MDS procedures currently available in SPSS Statistics, PROXSCAL or ALSCAL. I think what you'd need to do would be to save the common space coordinates to a new dataset or file, then add the Party variable to that new dataset or file, define it as Nominal in the measurement level designation in the Data Editor, and then use the Grouped Scatter option under Scatter/Dot in the chart Gallery in the Chart Builder, defining groups by the Party variable.
The PROXSCAL procedure lets you save things from the dialogs in the Output sub-dialog. The ALSCAL procedure only supports saving common space coordinates and other things through command syntax, specifically the OUTFILE subcommand (you can paste the command from the dialogs, then add this subcommand).

Calculate hierarchical labels for Google Sheets using native functions

Using Google Sheets, I want to automatically number rows like so:
The key is that I want this to use built-in functions only.
I have an implementation working where child items are in separate columns (e.g. "Foo" is in column B, "Bar" is in column C, and "Baz" is in column D). However, it uses a custom JavaScript function, and the slow way that custom JavaScript functions are evaluated, combined with the dependencies, possibly combined with a slow Internet connection, means that my solution can take over one second per row (!) to calculate.
For reference, here's my custom function (that I want to abandon in favor of native code):
/**
 * Calculate the Work Breakdown Structure id for this row.
 *
 * @param {range} priorIds IDs that precede this one.
 * @param {range} names The names for this row.
 * @return A WBS string id (e.g. "2.1.5") or an empty string if there are no names.
 * @customfunction
 */
function WBS_ID(priorIds,names){
  // A single-row range arrives as a 2D array; unwrap it.
  if (Array.isArray(names[0])) names = names[0];
  if (!names.join("")) return "";
  var lastId,pieces=[];
  // Walk backwards to find the most recent non-empty ID.
  for (var i=priorIds.length;i-- && !lastId;) lastId=priorIds[i][0];
  if (lastId) pieces = (lastId+"").split('.').map(function(s){ return s*1 });
  for (var i=0;i<names.length;i++){
    if (names[i]){
      // Keep the parts up to this level and increment the part at this level.
      pieces.length=i+1;
      pieces[i] = (pieces[i]||0) + 1;
      return pieces.join(".");
    }
  }
}
For example, cell A7 would use the formula:
=WBS_ID(A$2:A6,B7:D7)
...to produce the result "1.3.2"
Note that in the above example blank rows are skipped during numbering. An answer that does not honor this (where the ID is calculated deterministically from ROW()) is acceptable, and possibly even desirable.
Edit: Yes, I've tried to do this myself. I have a solution that uses three extra columns which I chose not to include in the question. I have been writing equations in Excel for at least 25 years (and Google Spreadsheets for 1 year). I have looked through the list of functions for Google Spreadsheets and none of them jumps out to me as making possible something that I didn't think of before.
When the question is a programming problem and the problem is an inability to see how to get from point A to point B, I don't know that it's useful to "show what I've done". I've considered splitting by periods. I've looked for a map equivalent function. I know how to use isblank() and counta().
Lol, this is hilariously the longest (and very likely the most unnecessarily complicated) way to combine formulas, but I thought it was interesting that it does in fact work. Just put a 1 in the first row, then in the second row add:
=if(row()=1,1,if(and(istext(D2),counta(split(A1,"."))=3),left(A1,4)&n(right(A1,1)+1),if(and(isblank(B2),isblank(C2),isblank(D2)),"",if(and(isblank(B2),isblank(C2),isnumber(indirect(address(row()-1,column())))),indirect(address(row()-1,column()))&"."&if(istext(D2),round(max(indirect(address(1,column())&":"&address(row()-1,column())))+0.1,)),if(and(isblank(B2),istext(C2)),round(max(indirect(address(1,column())&":"&address(row()-1,column())))+0.1,2),if(istext(B2),round(max(indirect(address(1,column())&":"&address(row()-1,column())))+1,),))))))
In my defense, I've had a very long day at work. Complicating what should be a simple thing seems to be my thing today :)
Foreword
Spreadsheet built-in functions don't include an equivalent to JavaScript's .map. The alternative is to use the spreadsheet's array-handling features and iteration patterns.
A "complete solution" could use built-in functions to automatically transform the user input into a simple table and return the Work Breakdown Structure (WBS) number. Some people refer to transforming the user input into a simple table as "normalization", but including it would make this post too long for the Stack Overflow format, so it focuses on presenting a short formula to obtain the WBS.
It's worth noting that using formulas to transform large data sets into a simple table as part of the continuous spreadsheet calculation (in this case, of the WBS) will make the spreadsheet slow to refresh.
Short answer
To keep the WBS formula short and simple, first transform the user input into a simple table including task name, id and parent id columns, then use a formula like the following:
=ArrayFormula(
  IFERROR(
    INDEX($D$2:$D,MATCH($C2,$B$2:$B,0))
    &"."
    &COUNTIF($C$2:$C2,C2),
    RANK($B2,FILTER($B$2:B,LEN($C$2:$C)=0),TRUE)&"")
)
Explanation
First, prepare your data
Put each task in one row. Include a General task / project to be used as the parent of all the root-level tasks.
Add an ID to each task.
For each task, add a reference to the ID of its parent task. Leave it blank for the General task / project.
After the above steps the data should look like the following:
+---+--------------+----+-----------+
| | A | B | C |
+---+--------------+----+-----------+
| 1 | Task | ID | Parent ID |
| 2 | General task | 1 | |
| 3 | Subtask 1    | 2  | 1         |
| 4 | Subtask 2    | 3  | 1         |
| 5 | Subsubtask 1 | 4 | 2 |
| 6 | Subsubtask 2 | 5 | 2 |
+---+--------------+----+-----------+
Remark: This could also help reduce the processing time required by a custom function.
Second, add the formula below to D2, then fill down as needed:
=ArrayFormula(
  IFERROR(
    INDEX($D$2:$D,MATCH($C2,$B$2:$B,0))
    &"."
    &COUNTIF($C$2:$C2,C2),
    RANK($B2,FILTER($B$2:B,LEN($C$2:$C)=0),TRUE)&"")
)
The result should look like the following:
+---+--------------+----+-----------+----------+
| | A | B | C | D |
+---+--------------+----+-----------+----------+
| 1 | Task | ID | Parent ID | WBS |
| 2 | General task | 1 | | 1 |
| 3 | Subtask 1    | 2  | 1         | 1.1      |
| 4 | Subtask 2    | 3  | 1         | 1.2      |
| 5 | Subsubtask 1 | 4 | 2 | 1.1.1 |
| 6 | Subsubtask 2 | 5 | 2 | 1.1.2 |
+---+--------------+----+-----------+----------+
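For anyone who wants to check the numbering outside the spreadsheet, here is a minimal Python sketch of the same idea: a task's WBS is its parent's WBS plus a running count of the siblings seen so far. The name wbs_numbers and the (id, parent id) tuple layout are made up for illustration, and the sketch assumes every parent row appears before its children, as in the table above.

```python
def wbs_numbers(tasks):
    """Return {task_id: wbs_string} for (task_id, parent_id) pairs in sheet order.

    parent_id is None for a root task. Assumes parents are listed before
    their children (an assumption of this sketch, not of the formula).
    """
    wbs = {}
    child_count = {}  # parent_id -> number of children numbered so far
    for task_id, parent_id in tasks:
        child_count[parent_id] = child_count.get(parent_id, 0) + 1
        position = str(child_count[parent_id])
        if parent_id is None:
            wbs[task_id] = position
        else:
            wbs[task_id] = wbs[parent_id] + "." + position
    return wbs


# Same data as the table: (ID, Parent ID)
table = [(1, None), (2, 1), (3, 1), (4, 2), (5, 2)]
result = wbs_numbers(table)
print(result)
```

Running it on the table's data should give the same WBS column as the formula.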
Here's an answer that does not allow a blank line between items, and requires that you manually type "1" into the first cell (A2). This formula is applied to cell A3, with the assumption that there are at most three levels of hierarchy in columns B, C, and D.
=IF(
  COUNTA(B3),                                // If there is a value in the 1st column
  INDEX(SPLIT(A2,"."),1)+1,                  // find the 1st part of the prior ID, plus 1
  IF(                                        // ...otherwise
    COUNTA(C3),                              // If there's a value in the 2nd column
    INDEX(SPLIT(A2,"."),1)                   // find the 1st part of the prior ID
    & "."                                    // add a period and
    & IFERROR(INDEX(SPLIT(A2,"."),2),0)+1,   // add the 2nd part of the prior ID (or 0), plus 1
    INDEX(SPLIT(A2,"."),1)                   // ...otherwise find the 1st part of the prior ID
    & "."                                    // add a period and
    & IFERROR(INDEX(SPLIT(A2,"."),2),1)      // add the 2nd part of the prior ID or 1 and
    & "."                                    // add a period and
    & IFERROR(INDEX(SPLIT(A2,"."),3)+1,1)    // add the 3rd part of the prior ID plus 1 (or 1 if there is none)
  )
) & ""                                       // Ensure the result is a string ("1.2", not 1.2)
Without comments:
=IF(COUNTA(B3),INDEX(SPLIT(A2,"."),1)+1,IF(COUNTA(C3),INDEX(SPLIT(A2,"."),1)& "."& IFERROR(INDEX(SPLIT(A2,"."),2),0)+1,INDEX(SPLIT(A2,"."),1)& "."& IFERROR(INDEX(SPLIT(A2,"."),2),1)& "."& IFERROR(INDEX(SPLIT(A2,"."),3)+1,1))) & ""
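The row-by-row logic (keep the leading parts of the prior ID, then bump the part at this row's level) may be easier to follow as ordinary code. Below is a rough Python sketch of that logic, not a translation of the formula: the name wbs_ids and the list-of-rows input are assumptions, and unlike the formula it tolerates blank rows by giving them an empty ID.

```python
def wbs_ids(rows):
    """rows: one list of cell values per sheet row (columns B, C, D, ...).

    The first non-empty cell decides the row's level. Returns one ID string
    per row; blank rows get "".
    """
    pieces = []  # numeric parts of the most recent ID, e.g. [1, 3, 2]
    ids = []
    for names in rows:
        level = next((i for i, name in enumerate(names) if name), None)
        if level is None:
            ids.append("")
            continue
        # Keep the parts above this level, pad if a level was skipped,
        # then increment the part at this level.
        pieces = pieces[:level + 1]
        pieces += [0] * (level + 1 - len(pieces))
        pieces[level] += 1
        ids.append(".".join(str(p) for p in pieces))
    return ids


rows = [
    ["Foo", "", ""],
    ["", "Bar", ""],
    ["", "", "Baz"],
    ["", "", ""],
    ["", "Bar2", ""],
    ["Foo2", "", ""],
]
ids = wbs_ids(rows)
print(ids)
```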

How to generate a line chart with too many data in ruby/ruby on rails?

I'm trying to generate a line chart with data I take from a database.
The data basically have a date field, an estimated progress field and a real progress field.
The progresses may be nil but the date is always there.
Since I don't know the intervals between the dates, and I need the dates distributed uniformly, I want to generate data from the first date to the last date in steps of 1 day.
For example, let's say I have this in the database:
| date | estimated progress | real progress |
| 2012-08-01 | 0.0 | |
| 2012-08-02 | | 0.15 |
| 2012-08-05 | 0.3 | |
I would like to generate a line chart with this info:
x = [2012-08-01, 2012-08-02, 2012-08-03, 2012-08-04, 2012-08-05]
ep = [0.0 , 0.0, 0.0, 0.0, 0.3]
rp = [nil , 0.15, 0.15, 0.15, 0.15 ]
But since the start date and the finish date can be very far apart, I'd like to show the x labels at a custom interval. It could be every 3, 5 or 7 days, depending on the distance between those dates.
I'm trying this with gchartrb, which uses the Google Chart API, but I realized I can't have nil values inside my data. So I would have to replace them with 0.0, even though the value is not 0; it's unknown.
The other problem I found is that I don't know how to specify the labels to show the intervals I mentioned before. It just shows me every label, and therefore it's not readable.
I'm looking for another gem, a solution for gchartrb, or ideas for generating the data differently and making it understandable.
Maybe you should check this link:
http://railscasts.com/episodes/223-charts.
Here is also a good library for charts:
Flotr 2, and the gem for it, flotr2-rails.
For Flotr 2, it is worth checking the example with time/date labels on the axis.
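Whichever chart library is used, the data preparation described in the question can be done before charting. The sketch below is in Python rather than Ruby and is only an illustration: daily_series and sparse_labels are made-up names, and "carry the last known value forward" is my reading of the expected ep/rp arrays in the question. Unknown values stay None until a first value is seen.

```python
from datetime import date, timedelta


def daily_series(rows):
    """rows: (date, estimated, real) tuples, with None for unknown values.

    Returns (x, ep, rp) with one entry per day from the first to the last
    date, carrying the last known value forward.
    """
    rows = sorted(rows, key=lambda row: row[0])
    by_date = {d: (est, real) for d, est, real in rows}
    x, ep, rp = [], [], []
    last_est = last_real = None
    day, end = rows[0][0], rows[-1][0]
    while day <= end:
        est, real = by_date.get(day, (None, None))
        if est is not None:
            last_est = est
        if real is not None:
            last_real = real
        x.append(day)
        ep.append(last_est)
        rp.append(last_real)
        day += timedelta(days=1)
    return x, ep, rp


def sparse_labels(x, step):
    """Keep every step-th date as a label and blank out the rest."""
    return [d.isoformat() if i % step == 0 else "" for i, d in enumerate(x)]


rows = [
    (date(2012, 8, 1), 0.0, None),
    (date(2012, 8, 2), None, 0.15),
    (date(2012, 8, 5), 0.3, None),
]
x, ep, rp = daily_series(rows)
print(ep, rp, sparse_labels(x, 2))
```

The step passed to sparse_labels could be chosen from the length of x (for example 3, 5 or 7), which is left to the caller here.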

What do Push and Pop mean for Stacks?

Long story short, my lecturer is crap, and was showing us infix to prefix stacks via an overhead projector, and his bigass shadow was blocking everything, so I missed the important stuff.
He was referring to push and pop, push = 0, pop = x.
He gave an example, but I can't see how he gets his answer at all:
2*3/(2-1)+5*(4-1)
Step 1, reverse: )1-4(*5+)1-2(/3*2 (OK, I can see that).
He then went on writing x's and o's operations and I got totally lost.
Answer: 14-5*12-32*/+, then reversed again to get +/*23-21*5-41.
If someone could explain push and pop to me so I could understand, I would be very grateful. I have looked online, but a lot of the stuff I'm finding seems to be a step above this, so I really need to get an understanding here first.
Hopefully this will help you visualize a Stack, and how it works.
Empty Stack:
| |
| |
| |
-------
After Pushing A, you get:
| |
| |
| A |
-------
After Pushing B, you get:
| |
| B |
| A |
-------
After Popping, you get:
| |
| |
| A |
-------
After Pushing C, you get:
| |
| C |
| A |
-------
After Popping, you get:
| |
| |
| A |
-------
After Popping, you get:
| |
| |
| |
-------
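The same sequence can be tried in code. This is a small Python sketch that uses a plain list as the stack (append is the push, pop takes from the end); the variable names are only for illustration.

```python
stack = []            # empty stack

stack.append("A")     # push A -> [A]
stack.append("B")     # push B -> [A, B], B is on top

first = stack.pop()   # pop removes the top item
stack.append("C")     # push C -> [A, C]
second = stack.pop()  # pop again
third = stack.pop()   # and once more, leaving the stack empty

print(first, second, third, stack)
```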
The rifle clip analogy posted by Oren A is pretty good, but I'll try another one and try to anticipate what the instructor was trying to get across.
A stack, as its name suggests, is an arrangement of "things" that has:
A top
A bottom
An ordering in between the top and bottom (e.g. second from the top, 3rd from the bottom).
(Think of it as a literal stack of books on your desk, where you can only take something from the top.)
Pushing something on the stack means "placing it on top".
Popping something from the stack means "taking the top 'thing'" off the stack.
A simple usage is for reversing the order of words. Say I want to reverse the word: "popcorn". I push each letter from left to right (all 7 letters), and then pop 7 letters and they'll end up in reverse order. It looks like this was what he was doing with those expressions.
push(p)
push(o)
push(p)
push(c)
push(o)
push(r)
push(n)
after pushing the entire word, the stack looks like:
| n | <- top
| r |
| o |
| c |
| p |
| o |
| p | <- bottom (first "thing" pushed on an empty stack)
======
when I pop() seven times, I get the letters in this order:
n,r,o,c,p,o,p
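Here is the popcorn example as a short Python sketch, again using a list as the stack. The function name reverse_with_stack is just an illustrative choice.

```python
def reverse_with_stack(word):
    """Reverse a word by pushing every letter, then popping them all."""
    stack = []
    for letter in word:
        stack.append(letter)        # push, left to right
    popped = []
    while stack:
        popped.append(stack.pop())  # pop returns the most recently pushed letter
    return "".join(popped)


reversed_word = reverse_with_stack("popcorn")
print(reversed_word)
```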
Conversion between infix, postfix, and prefix is a classic example in computer science when teaching stacks:
Infix to Postfix conversion.
Converting a postfix expression to infix is pretty straightforward:
(scan the expression from left to right)
For every number (operand), push it on the stack.
Every time you encounter an operator (+, -, /, *), pop twice from the stack, place the operator between the two popped items, and push the result back on the stack.
So if we have 53+2* we can convert that to infix in the following steps:
Push 5.
Push 3.
Encountered +: pop 3, pop 5, push 5+3 on stack (be consistent with ordering of 5 and 3)
Push 2.
Encountered *: pop 2, pop (5+3), push ((5+3) * 2).
When you reach the end of the expression, if it was formed correctly your stack should contain only one item.
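A minimal Python sketch of these steps, assuming single-character operands and only the four operators listed. The name postfix_to_infix is made up, and every combined expression is wrapped in parentheses so the grouping stays explicit. The second item popped is treated as the left operand, which is one consistent choice of ordering.

```python
def postfix_to_infix(expr):
    """Convert a postfix string such as "53+2*" to a parenthesized infix string."""
    stack = []
    for token in expr:
        if token in "+-*/":
            right = stack.pop()  # popped first: the right operand
            left = stack.pop()   # popped second: the left operand
            stack.append("(" + left + token + right + ")")
        else:
            stack.append(token)  # operand: push it
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


infix = postfix_to_infix("53+2*")
print(infix)
```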
By introducing 'x' and 'o' he may have been using them as temporary holders for the left and right operands of an infix expression: x + o, x - o, etc. (or with the order of x and o reversed).
There's a nice write-up on Wikipedia as well. I've left my answer as a wiki in case I've botched up any ordering of expressions.
The algorithm to go from infix to prefix expressions is:
- reverse input
- TOS = top of stack
- If next symbol is:
  - an operand -> output it
  - an operator ->
      while TOS is an operator of higher priority -> pop and output TOS
      push symbol
  - a closing parenthesis -> push it
  - an opening parenthesis -> pop and output TOS until TOS is matching
    parenthesis, then pop and discard TOS.
- at the end of the input, pop and output whatever is left on the stack
- reverse output
So your example goes something like this (x = PUSH, o = POP):
2*3/(2-1)+5*(4-1)
)1-4(*5+)1-2(/3*2
Next
Symbol      Stack  Output
)       x   )
1           )      1
-       x   )-     1
4           )-     14
(       o   )      14-
        o          14-
*       x   *      14-
5           *      14-5
+       o          14-5*
        x   +      14-5*
)       x   +)     14-5*
1           +)     14-5*1
-       x   +)-    14-5*1
2           +)-    14-5*12
(       o   +)     14-5*12-
        o   +      14-5*12-
/       x   +/     14-5*12-
3           +/     14-5*12-3
*       x   +/*    14-5*12-3
2           +/*    14-5*12-32
        o   +/     14-5*12-32*
        o   +      14-5*12-32*/
        o          14-5*12-32*/+
+/*23-21*5-41
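The trace can be reproduced with a short Python sketch of the algorithm. It assumes single-character operands, only the operators + - * /, and a well-formed expression; infix_to_prefix and PRIORITY are illustrative names. The final loop pops whatever is left on the stack once the input runs out, as the last three rows of the trace do.

```python
PRIORITY = {"+": 1, "-": 1, "*": 2, "/": 2}


def infix_to_prefix(expr):
    """Convert an infix string to prefix by scanning it in reverse."""
    stack = []
    output = []
    for symbol in reversed(expr):
        if symbol in PRIORITY:
            # Pop operators of strictly higher priority, then push this one.
            while stack and stack[-1] in PRIORITY and PRIORITY[stack[-1]] > PRIORITY[symbol]:
                output.append(stack.pop())
            stack.append(symbol)
        elif symbol == ")":
            stack.append(symbol)
        elif symbol == "(":
            # Pop and output until the matching ")", then discard it.
            while stack[-1] != ")":
                output.append(stack.pop())
            stack.pop()
        else:
            output.append(symbol)  # operand
    while stack:                   # end of input: flush the stack
        output.append(stack.pop())
    return "".join(reversed(output))


prefix = infix_to_prefix("2*3/(2-1)+5*(4-1)")
print(prefix)
```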
A Stack is a LIFO (Last In First Out) data structure. The push and pop operations are simple. Push puts something on the stack, pop takes something off. You put onto the top, and take off the top, to preserve the LIFO order.
edit -- corrected from FIFO, to LIFO. Facepalm!
to illustrate, you start with a blank stack
|
then you push 'x'
| 'x'
then you push 'y'
| 'x' 'y'
then you pop
| 'x'
A stack in principle is quite simple: imagine a rifle's clip - You can only access the topmost bullet - taking it out is called "pop", inserting a new one is called "push".
A very useful example for that is for applications that allow you to "undo".
Imagine you save each state of the application in a stack, e.g. the state of the application after every keystroke the user types.
Now when the user presses "undo" you just "pop" the previous state from the stack. For every action the user does - you "push" the new state to the stack (that's of course simplified).
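A toy Python sketch of the undo idea, heavily simplified. The Editor class and its method names are invented for illustration, and this version pushes the state as it was just before each action, so that undo can pop straight back to it.

```python
class Editor:
    """Tiny text buffer with undo backed by a stack of earlier states."""

    def __init__(self):
        self.text = ""
        self._history = []  # the stack: the most recent earlier state is on top

    def insert(self, chars):
        self._history.append(self.text)  # push the state before the change
        self.text += chars

    def undo(self):
        if self._history:                # nothing to undo on an empty stack
            self.text = self._history.pop()


editor = Editor()
editor.insert("a")
editor.insert("b")
editor.undo()
after_one_undo = editor.text
editor.undo()
editor.undo()  # extra undo on an empty history is a no-op
after_all_undos = editor.text
print(repr(after_one_undo), repr(after_all_undos))
```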
As for what your lecturer specifically was doing, some more information would be needed in order to explain it.
OK. As the other answerers explained, a stack is a last-in, first-out data structure. You add an element to the top of the stack with a Push operation. You take an element off the top with a Pop operation. The elements are removed in the reverse of the order in which they were inserted (hence Last In, First Out). For example, if you push the elements 1, 2, 3 in that order, the number 3 will be at the top of the stack. A Pop operation will remove it (it was the last in) and leave 2 at the top of the stack.
Regarding the rest of the lecture, the lecturer tried to describe a stack-based machine that evaluates arithmetic expressions. The machine operates by continuously popping 3 elements from the top of the stack. The first two elements are operands and the third is an operator (+, -, *, /). It then applies this operator to the operands and pushes the result onto the stack. The process continues until there is only one element on the stack, which is the value of the expression.
So, suppose we begin by pushing the values "+/*23-21*5-41" in left-to-right order onto the stack. We then pop 3 elements from the top. The last in is first out, which means the first 3 elements are "1", "4", and "-", in that order. We push the number 3 (the result of 4-1) onto the stack, then pop the three topmost elements: 3, 5, *. Push the result, 15, onto the stack, and so on.
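To try the evaluation in code, here is a hedged Python sketch. Note that it swaps in the usual right-to-left scan (push each operand, and on each operator pop two values and push the result) rather than the pop-three-at-a-time machine described above. The name eval_prefix and the restriction to single-digit operands are assumptions of the sketch.

```python
import operator

OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_prefix(expr):
    """Evaluate a prefix string of single-digit operands, scanning right to left."""
    stack = []
    for symbol in reversed(expr):
        if symbol in OPERATIONS:
            left = stack.pop()   # popped first: the left operand
            right = stack.pop()  # popped second: the right operand
            stack.append(OPERATIONS[symbol](left, right))
        else:
            stack.append(int(symbol))
    if len(stack) != 1:
        raise ValueError("malformed prefix expression")
    return stack[0]


value = eval_prefix("+/*23-21*5-41")
print(value)
```

The result can be checked by hand against the original infix expression: 2*3/(2-1) + 5*(4-1) = 6 + 15 = 21.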
push = add to the stack
pop = remove from the stack
Simply:
pop: returns the item at the top, then removes it from the stack.
push: adds an item onto the top of the stack.
After all these good examples, adam shankman still can't make sense of it. I think you should open up some code and try it. The second you try a myStack.Push(1) and a myStack.Pop(), you really should get the picture. But by the looks of it, even that will be a challenge for you!
