I have an ordinal variable, say A, with the labels 'absolutely yes', 'yes', 'neutral', 'no', 'absolutely no'. I want to select only the cases where A has the value 'yes', and then use only those cases to compare some other ordinal variables.
So how can I isolate only the data with the value 'yes' for the variable A?
Something like this should do the trick:
USE ALL. *removes any filters you have in place.
COMPUTE filter_$=(A = 'yes'). *creates a filter variable.
VARIABLE LABELS filter_$ "A = 'yes' (FILTER)". *labels it.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. *labels the values.
FORMATS filter_$ (f1.0). *formats the filter variable.
FILTER BY filter_$. *this uses the filter variable you just created.
EXECUTE.
You can also point-and-click your way to the answer under Data -> Select Cases. Select the "If condition is satisfied" button and make your selections from there.
I know that through
select if char.substr(variable_name,1,3)="I22".
I can select values based on their first few characters, but this is not exactly my question. I need to select a range of values that start with a few given characters. Here is an example of what I want:
if I have the following cases:
I22A33
I22B33
I22C33
I22D33
I want to select I22B33 and I22C33 out of the four values above, so it's like a range of cases between B and C.
One way to flag any cases that meet your criteria is using INDEX and a series of OR conditions. Not particularly modular, but if you just have a couple of conditions you're searching for it could get you on your way.
Edit: These searches are case-insensitive (due to UPCASE) and search for matches at the start of the string. To search for matches anywhere within the string set the condition to > 0 (instead of = 1).
COMPUTE f_I22 = (INDEX(UPCASE(var_name),'I22B33') = 1)
OR (INDEX(UPCASE(var_name),'I22C33') = 1) .
EXE .
Assuming that all the values you want to select start with either "I22B" or "I22C", you can simply use:
select if char.substr(variable_name,1,4)="I22B" or
char.substr(variable_name,1,4)="I22C".
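The prefix test in both answers does not depend on SPSS. As a minimal sketch of the same selection in Python, assuming the codes sit in a plain list (the names `codes` and `wanted_prefixes` are made up for illustration):

```python
# Hypothetical sample values, taken from the question.
codes = ["I22A33", "I22B33", "I22C33", "I22D33"]

# Prefixes to keep. Values are upper-cased before comparing,
# which mirrors the UPCASE call in the INDEX-based answer.
wanted_prefixes = ("I22B", "I22C")

# str.startswith accepts a tuple and is True if any prefix matches.
selected = [c for c in codes if c.upper().startswith(wanted_prefixes)]

print(selected)  # ['I22B33', 'I22C33']
```

Adding another prefix to the tuple extends the range without writing a further OR condition.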
I have a list of data with a title column (among many other columns) and I have a Power BI parameter that has, for example, a value of "a,b,c". What I want to do is loop through the parameter's values and remove any rows that begin with those characters.
For example:
Title
a
b
c
d
Should become
Title
d
This comma separated list could have one value or it could have twenty. I know that I can turn the parameter into a list by using
parameterList = Text.Split(<parameter-name>,",")
but then I am unsure how to continue to use that to filter on. For one value I would just use
#"Filtered Rows" = Table.SelectRows(#"Table", each Text.StartsWith([key], <value-to-filter-on>))
but that only allows one value.
EDIT: I may have worded my original question poorly. The comma separated values in the parameterList can be any number of characters (e.g.: a,abcd,foo,bar) and I want to see if the value in [key] starts with that string of characters.
Try using List.Contains to check whether the starting character is in the parameter list.
each List.Contains(parameterList, Text.Start([key], 1))
Edit: Since you've changed the requirement, try this:
Table.SelectRows(
#"Table",
(C) => not List.AnyTrue(
List.Transform(
parameterList,
each Text.StartsWith(C[key], _)
)
)
)
For each row, this transforms the parameterList into a list of true/false values by checking if the current key starts with each text string in the list. If any are true, then List.AnyTrue returns true and we choose not to select that row.
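The same row-by-row logic can be sketched outside Power Query. The Python below is only an illustration of the idea, not M code; the helper `remove_rows_with_prefixes`, the sample table and the parameter value are all hypothetical:

```python
def remove_rows_with_prefixes(rows, prefixes, key="key"):
    """Keep only the rows whose `key` value starts with none of the prefixes."""
    return [
        row for row in rows
        # any(...) plays the role of List.AnyTrue over the transformed list.
        if not any(row[key].startswith(p) for p in prefixes)
    ]

parameter = "a,abcd,foo,bar"            # hypothetical parameter value
parameter_list = parameter.split(",")   # same role as Text.Split

table = [{"key": "apple"}, {"key": "food"}, {"key": "dog"}, {"key": "bar"}]
filtered = remove_rows_with_prefixes(table, parameter_list)

print(filtered)  # [{'key': 'dog'}]
```

Here "apple" is dropped because it starts with "a", "food" because it starts with "foo", and "bar" because it starts with "bar"; only "dog" matches no prefix.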
Since you want to filter out all the values from the parameter, you can use something like:
= Table.SelectRows(#"Changed Type", each List.Contains(Parameter1,Text.Start([Title],1))=false)
Another way to do this would be to create a custom column in the table, which has the first character of title:
= Table.AddColumn(#"Changed Type", "FirstChar", each Text.Start([Title],1))
and then use this field in the filter step:
= Table.SelectRows(#"Added Custom", each List.Contains(Parameter1,[FirstChar])=false)
I tested this with a small sample set and it seems to be running fine. You can test both and see if it helps with the performance. If you are still facing performance issues, it would probably be easier if you can share the pbix file.
This seems to work fairly well:
= List.Select(Source[Title], each Text.Contains(Parameter1,Text.Start(_,1))=false)
Replace Source with the name of your table and Parameter1 with the name of your Parameter.
I'm counting the classes of each student with a COUNTIF formula, but some students have unusual usernames such as A*di (in the image), so the count comes out wrong. Other students use usernames like </John> and 'Angel, which also make the count wrong.
Formula: =COUNTIF('Data Asli'!$A:$A,$A$2)
Use SUMPRODUCT(--EXACT(..)) to run an exact, case-sensitive comparison that ignores wildcards:
=SUMPRODUCT(--EXACT('Data Asli'!$A:$A,$A2))
How it works:
EXACT(Value1, Value2) will return TRUE or FALSE, depending on whether the 2 values exactly match (same capitals, no wildcards, et cetera)
-- will convert TRUE/FALSE into 1/0
SUMPRODUCT(Array1[,Array2]) will run down the arrays, multiply the numbers together, then add them. It also forces many functions to both treat a Range as an array, and output an array.
So, as an example, the steps run like this:
=SUMPRODUCT(--EXACT(A1:A5, A2))
=SUMPRODUCT(--EXACT({Value1,Value2,Value3,Value4,Value2}, Value2))
a.k.a.
=SUMPRODUCT(--{EXACT(Value1,Value2),EXACT(Value2,Value2),EXACT(Value3,Value2),EXACT(Value4,Value2),EXACT(Value2,Value2)})
=SUMPRODUCT(--{FALSE,TRUE,FALSE,FALSE,TRUE})
=SUMPRODUCT({0,1,0,0,1})
=2
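The same three steps can be mirrored in Python to see why an exact comparison and a wildcard comparison give different counts. This is a rough analogy only: the usernames are made up, and `fnmatch` is used as a stand-in for wildcard matching, not as an exact model of COUNTIF.

```python
from fnmatch import fnmatchcase

values = ["A*di", "Aldi", "a*di", "A*di", "John"]  # hypothetical usernames
target = "A*di"

# Step 1, like EXACT: case-sensitive equality, "*" is an ordinary character.
matches = [v == target for v in values]   # [True, False, False, True, False]
# Step 2, like "--": turn TRUE/FALSE into 1/0.
as_numbers = [int(m) for m in matches]    # [1, 0, 0, 1, 0]
# Step 3, like SUMPRODUCT on a single array: add the numbers up.
count = sum(as_numbers)
print(count)  # 2

# For contrast: treating "*" as a wildcard and ignoring case
# also counts "Aldi" and "a*di", which inflates the result.
wildcard_count = sum(fnmatchcase(v.lower(), target.lower()) for v in values)
print(wildcard_count)  # 4
```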
How would you do the following in spss:
var participant_number = 0.
DO IF (condition =1 AND trial_order = 1).
participant_number = ppnr.
DO IF (ppnr = participant_number).
COMPUTE start_condition = 1.
END IF.
ELSE.
participant_number = ppnr.
DO IF (ppnr = participant_number).
COMPUTE start_condition = 0.
END IF.
END IF.
The variable participant_number needs to be defined for the inner loops and not change throughout the inner if. I am just trying to set a value for all the participant cases if the participant fulfills a condition.
In SPSS, in general (with exceptions, but let's keep things simple for now), variables are global. If they come from the dataset, they can be used in syntax without fear of going out of scope.
Note that variables need to be "computed"/created first, before being used. You can do that with syntax or manually in the Data window.
DO IF is useful if you want to perform multiple transformations. Otherwise, a structure like
IF [condition] [transformation].
EXECUTE.
would do the trick.
If I understood your goal correctly, you can re-write your code like this:
***create a temporary variable, to check each case if your condition is met. Set the temporary variable to 0 as default value.
compute tempvar=0.
***then set it to 1, if condition is met.
***This is at case level, not participant level.
if condition=1 and trial_order=1 tempvar=1.
exe.
***aggregate the temp variable, from case level at participant level.
***for each participant (ppnr), it will look at all values of tempvar, and set the start_condition as the maximum of tempvar - either 0 or 1.
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=ppnr
/start_condition=MAX(tempvar).
***optional.
delete variables tempvar.
At the end, start_condition will be 1 for each case of a participant if (condition =1 AND trial_order = 1) is met for at least one case of that participant; otherwise, it will be 0.
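To make the flag-then-aggregate idea concrete, here is a minimal sketch of the same three steps in Python. It is not SPSS syntax; the sample cases are invented, and only the variable names (ppnr, condition, trial_order, tempvar, start_condition) come from the thread:

```python
from collections import defaultdict

# Hypothetical cases: one dict per row of the dataset.
cases = [
    {"ppnr": 1, "condition": 1, "trial_order": 1},
    {"ppnr": 1, "condition": 2, "trial_order": 2},
    {"ppnr": 2, "condition": 2, "trial_order": 1},
    {"ppnr": 2, "condition": 1, "trial_order": 2},
]

# Step 1: case-level flag, the equivalent of tempvar.
for case in cases:
    case["tempvar"] = int(case["condition"] == 1 and case["trial_order"] == 1)

# Step 2: maximum of the flag per participant, like MAX(tempvar) with BREAK=ppnr.
max_flag = defaultdict(int)
for case in cases:
    max_flag[case["ppnr"]] = max(max_flag[case["ppnr"]], case["tempvar"])

# Step 3: write the result back to every case, like MODE=ADDVARIABLES.
for case in cases:
    case["start_condition"] = max_flag[case["ppnr"]]

print([c["start_condition"] for c in cases])  # [1, 1, 0, 0]
```

Participant 1 gets 1 on both rows because one of its cases meets the condition; participant 2 gets 0 because none of its cases has both condition = 1 and trial_order = 1.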
In SPSS I have a variable with a lot of different values (an 8-digit number; 00000000). Every row is a person. I want to aggregate this data by postal area and count the number of different values in each postal area. Is there a way?
Result within a postal area should be 1 to N : 1 = every person has the same value, N = every person has a different value
Aggregate in two steps. Assuming your dataset name is data1, with variables var1 (the variable of interest) and postalcode, I would do this:
Create a dataset step1, with one row for each combination of values of postalcode and var1. Also possible by using the command casestovars.
dataset declare step1.
dataset activate data1.
aggregate outf=step1 /break=postalcode var1 /n=n(var1).
Create a dataset result with one row for each postalcode, and a variable n for the number of rows from the previous dataset step1.
dataset declare result.
dataset activate step1.
aggregate outf=result /break=postalcode /n=n(var1).
So, in conclusion: first break by both of the variables, then break only by the variable of postal code. This should do the trick!
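The two-step idea (first collapse to distinct combinations, then count them per postal code) can be sketched in Python as follows. The data is invented for illustration, and the code only mirrors the logic of the two AGGREGATE commands:

```python
from collections import Counter

# Hypothetical rows: (postalcode, var1), one per person.
rows = [
    ("1011", "00000001"),
    ("1011", "00000001"),
    ("1011", "00000002"),
    ("2022", "00000003"),
    ("2022", "00000003"),
]

# Step 1: one entry per distinct (postalcode, var1) combination,
# like breaking by both variables.
distinct_pairs = set(rows)

# Step 2: count the remaining entries per postal code,
# like breaking by postalcode only.
n_distinct = Counter(postalcode for postalcode, _ in distinct_pairs)

print(dict(sorted(n_distinct.items())))  # {'1011': 2, '2022': 1}
```

Postal area 2022 gets 1 because both people share the same value, while 1011 gets 2 because its three people have two different values between them.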