How to merge zero values (vector(0)) with metric values in PromQL - monitoring

I'm using flexlm_exporter to export my license usage to Prometheus, and from Prometheus to a custom service (not Grafana).
As you know, Prometheus returns nothing for missing values.
However, I need those missing values in my metric values, so I appended or vector(0) to my query.
For example:
flexlm_feature_used_users{app="vendor_lic-server01",name="Temp"} or vector(0)
This query adds an empty, label-less metric with zero values.
My question is whether there is a way to merge the zero vector into each metric's values?
Edit:
I need grouping, at least for a user and name labels, so vector(0) is probably not the best option here?
I tried multiple solutions in different StackOverflow threads, however, nothing works.
Please assist.

Use absent() with label matchers: unlike vector(0), its result carries the labels from the selector. Its value is 1, so wrap it in clamp_max(..., 0) to turn it into 0:
( Metrics{label="a"} or clamp_max(absent(notExists{label="a"}), 0) )
+
( Metrics2{label="a"} or clamp_max(absent(notExists{label="a"}), 0) )
vector(0) has no labels.
clamp_max(absent(notExists{label="a"}), 0) is 0 with the label label="a".

If you do sum(flexlm_feature_used_users{app="vendor_lic-server01",name="Temp"} or vector(0)) you should get what you're looking for, but you'll lose the ability to group by, since vector(0) doesn't have any labels.
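
To make the label problem concrete, here is a toy Python model of the PromQL or operator, assuming the documented behaviour that the result keeps every element of the left-hand vector plus those right-hand elements whose label sets do not occur on the left. The promql_or and labels helpers and the user values are hypothetical, and the metric name is left out because it plays no part in the matching.

```python
def labels(**kv):
    """Build a hashable label set (metric name deliberately omitted)."""
    return frozenset(kv.items())

def promql_or(left, right):
    """Toy model of `left or right`; vectors map label sets to sample values."""
    result = dict(left)  # every left-hand element is kept as-is
    for label_set, value in right.items():
        if label_set not in left:  # right-hand element has no match on the left
            result[label_set] = value
    return result

# One series with data; "bob" has no sample at this timestamp.
used = {labels(name="Temp", user="alice"): 3.0}

# vector(0) is a single element with an empty label set.
vector0 = {labels(): 0.0}

# A zero vector that carries the labels we want to group by.
labelled_zero = {
    labels(name="Temp", user="alice"): 0.0,
    labels(name="Temp", user="bob"): 0.0,
}

with_vector0 = promql_or(used, vector0)
with_labelled = promql_or(used, labelled_zero)
```

In this model with_vector0 gains one extra element with no labels, which is the "empty metric" from the question, while with_labelled keeps alice's real value and adds a zero for bob under his own labels.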

I needed a similar thing, and ended up flattening the options. What worked for me was something like:
(sum by (xyz) (flexlm_feature_used_users{app="vendor_lic-server01",name="Temp1"}) + sum by (xyz) (flexlm_feature_used_users{app="vendor_lic-server01",name="Temp2"})) or
sum by (xyz) (flexlm_feature_used_users{app="vendor_lic-server01",name="Temp1"}) or
sum by (xyz) (flexlm_feature_used_users{app="vendor_lic-server01",name="Temp2"})

There is no easy generic way to fill gaps in returned time series with zeroes in Prometheus. But this can be easily done via the default operator in VictoriaMetrics:
flexlm_feature_used_users{app="vendor_lic-server01",name="Temp"} default 0
The expression q default N fills gaps with the given default value N in each time series returned by q. See more details in the MetricsQL docs.


max-series-per-database limit exceeded clarification needed / how to calculate number of series in use

We recently started to encounter this error:
{"error":"partial write: max-series-per-database limit exceeded: (1000000) dropped=1"}
When writing metric data like this:
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=1103,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
I know that Influx recommends you keep your series cardinality low, and our impression was that series cardinality would mean keeping each tag individually to a small number of values. e.g. we felt comfortable sending instance_id=1103 as a tag, because we know that there will never be more than 2000 distinct instance_id tag values.
But after running into this error... I'm afraid maybe I was mistaken here. Do we actually need to keep the cardinality of all possible combinations of all tags low? e.g. do these two things count as two separate series towards the 1,000,000 default max, because the instance_id is different?
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=1111,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=2222,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
If those count as two separate series... then is there a better way to structure this data in Influx? 1,000,000 total seems like a tiny amount if each separate combination of tags is a separate series...
Does InfluxDB 2.x help with this?
Is there a better tool that can handle a large number of tags and not bump into limits like this?
There is no way to figure out what data was not recorded. Update the max-series-per-database configuration to be more than 1M in order to stop dropping data.
Hitting this limit can be an indication that you are creating a lot of series. I have seen some documentation on why that isn't great.
Hope this helps!
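
On the question of whether the two example points count as separate series: in InfluxDB's documented data model a series is identified by the measurement together with its complete tag set, so points that differ only in instance_id do land in different series. The Python sketch below illustrates that counting rule; the sample lines are shortened versions of the ones in the question, the series_key helper is hypothetical, and the per-tag counts are made-up numbers used only to show how the worst case multiplies.

```python
import math

def series_key(line):
    """Return (measurement, sorted tag pairs) for one line-protocol point.

    Assumes no escaped spaces or commas in the measurement or tags.
    """
    head = line.split(" ", 1)[0]  # "measurement,tag=value,..."
    measurement, *tags = head.split(",")
    return measurement, tuple(sorted(tags))

# Shortened sample points; field values and timestamps do not affect the key.
lines = [
    "resque_job,environment=beta,instance_id=1111 seconds_spent_job=0.2 1649203450783000002",
    "resque_job,environment=beta,instance_id=2222 seconds_spent_job=0.2 1649203450783000002",
    "resque_job,instance_id=1111,environment=beta seconds_spent_job=0.5 1649203450783000003",
]
distinct_series = {series_key(line) for line in lines}

# Hypothetical distinct-value counts per tag: the upper bound is their product.
tag_value_counts = {
    "instance_id": 2000,
    "billing_status": 5,
    "database_host": 20,
    "queue_priority": 4,
}
worst_case_series = math.prod(tag_value_counts.values())
```

Here the first and third lines share a series because tag order and field values are irrelevant, so distinct_series has two entries. The worst-case bound shows how a tag with only 2000 values can still approach the limit once it is combined with a few other tags.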

Can I search through and compare commonly named variables in SPSS?

I have a list of about 30 variables, all named something like test_1, test_2, test_3, etc. I need to check if the values are all the same, and typically do so by exporting to excel and using an if statement comparing the min value to the max (i.e. if the min=max then all the values are the same).
Is there a way I can do this right in SPSS without having to export? It seems inefficient to compare if test_1=test_2 and test_2=test_3 etc.
This is sort of a hack, but it gets the job done: you can calculate the standard deviation across all your variables:
compute sd_test=SD(test_1, test_2, ..., test_n).
EXECUTE.
sd_test=0 for records where all test_i variables are equal.
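
As a quick sanity check of the idea outside SPSS, the Python sketch below compares the min = max test from the question with the zero-standard-deviation test from this answer on two made-up records. The rows data and the all_equal helper are purely illustrative.

```python
import statistics

def all_equal(values):
    """True when every value in the record is the same (min equals max)."""
    return min(values) == max(values)

# Hypothetical records, each holding the values of test_1 .. test_4.
rows = {
    "case_1": [3, 3, 3, 3],
    "case_2": [3, 4, 3, 3],
}

min_max_flags = {case: all_equal(vals) for case, vals in rows.items()}
sd_zero_flags = {case: statistics.stdev(vals) == 0 for case, vals in rows.items()}
```

Both dictionaries flag the same records, which is why a standard deviation of 0 can stand in for the min = max comparison.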

SPSS "No cases were input" warning - Is it possible to get a table with 0 counts?

I am running a huge syntax, with lots of CTABLES and FREQUENCIES commands. Some of them have a filter:
TEMPORARY.
SELECT IF [condition].
FREQUENCIES VAR1.
In some cases, this results in no cases being selected, so the output is just a warning text. Is it possible to still get a table with 0 counts...?
If all cases are screened out, a procedure never gets a chance to run. However, suppose you create one case with everything missing but a filter value of 1. Then use CTABLES instead of FREQUENCIES and specify that empty categories should be shown (on the Categories subdialog if using the GUI).
If you want to make this perfectly accurate, create a weight variable with case 1 weighted by a very small value (1e-8, say), and all the other cases with a weight of 1.

How do you include categories with 0 responses in SPSS frequency output?

Is there a way to display response options that have 0 responses in SPSS frequency output? The default is for SPSS to omit in the frequency table output any response option that is not selected by at least a single respondent. I looked for a syntax-driven option to no avail. Thank you in advance for any assistance!
It doesn't show because no case in the data has that attribute. So, when forcing a row of zero, you need to realize that you are asking SPSS to do something that is, strictly speaking, incorrect.
Having said that, you can introduce a fake case with the missing category. E.g. if you have Orange, Apple, and Pear, but no one answered that they like Pear, then add one fake case that says Pear.
Now, make a new weight variable that is 1 for every real case. But for the Pear case, make it very small, like 0.00001. Then, go to Data > Weight Cases > Weight cases by and move that new weight variable over. Click OK to apply. Now SPSS will treat each real case with a weight of 1 and the fake case with a weight that is 1/100,000 of a normal case. If you rerun the frequency, the category with a zero count should show up.
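
The arithmetic behind the trick can be sketched in Python, assuming the displayed count is the weighted sum for each category rounded to a whole number. The cases list and the weighted_counts helper are hypothetical.

```python
from collections import defaultdict

def weighted_counts(cases):
    """Sum the weights per category and round to whole counts for display."""
    totals = defaultdict(float)
    for category, weight in cases:
        totals[category] += weight
    return {category: round(total) for category, total in totals.items()}

# Real cases carry weight 1; the single fake Pear case carries a tiny weight.
cases = (
    [("Orange", 1.0)] * 3
    + [("Apple", 1.0)] * 2
    + [("Pear", 0.00001)]
)
counts = weighted_counts(cases)
```

Pear now exists as a category, but its weighted total is far too small to round up, so it appears with a count of 0 while the real categories keep their counts.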
If you have purchased the Custom Table module you can also do that directly as well, as far as I can tell from their technical document. That module costs 637 to 3630 depending on license type, so probably only worth a try if your institute has it.
So, I'm a noob with SPSS, I (shame on me) have a cracked version of SPSS 22 and if I understood your question correctly, this is my solution:
double click the Frequency table in Output
right click table, select Table Properties
go to General and then uncheck the Hide empty rows and columns option
Hope this helps someone!
If your SPSS version has no Custom Tables installed and you haven't collected money for that module yet then use the following (run this syntax):
*Note: please use variable names up to 8 characters long.
set mxloops 1000. /*in case your list of values is longer than 40
matrix.
get vars /vari= V1 V2 /names= names /miss= omit. /*V1 V2 here is your categorical variable(s)
comp vals= {1,2,3,4,5,99}. /*let this be the list of possible values shared by the variables
comp freq= make(ncol(vals),ncol(vars),0).
loop i= 1 to ncol(vals).
comp freq(i,:)= csum(vars=vals(i)).
end loop.
comp names= {'vals',names}.
print {t(vals),freq} /cnames= names /title 'Frequency'. /*here you are - the frequencies
print {t(vals),freq/nrow(vars)*100} /cnames= names /format f8.2 /title 'Percent'. /*and percents
end matrix.
*If variables have missing values, they are deleted listwise. To include missings, use
get vars /vari= V1 V2 /names= names /miss= -999. /*or other value
*To exclude missings individually from each variable, analyze by separate variables.

Please help on using SPSS to add scales of Likert-type

Since the last post was closed for being unclear, here is an edited one.
There are in total 20 items from 5 Likert-type scale questions from a questionnaire. I need to add the 20 items from 5 separate questions to create a total scale. I already got the data.
The question is just like the picture above. How can I run the command to add the 20 items from 5 separate questions? What is the command?
Is it something like Transform > Compute variable. Enter a variable name, specify which items to add up, and hey presto (e.g. "V1+V2+V3" etc)?
You can do exactly as you suggested, using the Transform -> Compute variable... function. Simply type the name of your new scale in the Target Variable box and the addition you want in the Numeric Expression box.
You will see that the following SPSS syntax command is run:
COMPUTE total=v1 + v2 + v3 + v4.
EXECUTE.
If any of the variables has a missing value, then simply adding them will result in a missing value as well. If you don't want to impute missing values, using the MEAN function in syntax works well. Also, if the variables are contiguous in the data file, you can make the syntax much more readable by using the TO keyword.
COMPUTE myscore=MEAN(variable1 TO variable5)*5.
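Because MEAN ignores missing values, multiplying it by the number of items gives a total in which each missing item is effectively replaced by the case's mean across the items it did answer.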
However, it seems like the problem in this case is that the data entry process has dummy coded all of the items, producing 20 separate variables instead of 5, where each block of 4 variables has a value of 0 or 1 but represents the values 1 to 4. In this case, you can use the following syntax:
COMPUTE mycounter=1.
COMPUTE myscore=0.
EXECUTE.
DO REPEAT a=variable1 TO variable20.
COMPUTE myscore=myscore+mycounter*a.
COMPUTE mycounter=mycounter+1.
IF (mycounter=5) mycounter=1.
END REPEAT.
EXECUTE.
Note that the variables from variable1 to variable20 must have each set of dummy codes from the original items clustered together in ascending order.
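
The counter logic in the DO REPEAT block can be traced with a small Python sketch, assuming 5 items that are each dummy coded into 4 consecutive 0/1 variables. The score_dummy_coded and one_hot helpers and the sample answers are hypothetical.

```python
def score_dummy_coded(values, levels=4):
    """Mirror the DO REPEAT loop: weight each 0/1 flag by its position in its block."""
    score = 0
    counter = 1
    for flag in values:
        score += counter * flag
        counter += 1
        if counter == levels + 1:  # end of one item's block, start over at 1
            counter = 1
    return score

def one_hot(answer, levels=4):
    """Dummy code one answer (1..levels) as `levels` flags."""
    return [1 if level == answer else 0 for level in range(1, levels + 1)]

# Hypothetical respondent answering 1, 4, 2, 3, 4 on the five items.
answers = [1, 4, 2, 3, 4]
dummy_values = [flag for answer in answers for flag in one_hot(answer)]
total_score = score_dummy_coded(dummy_values)
```

The recovered total equals the plain sum of the original answers, which only holds when each item's dummy variables sit together in ascending order, as the note above says.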
