We are using Elasticsearch on a single node, and unassigned monitoring shards are piling up.
Where do those shards come from, and how can they be avoided?
# curl 'http://localhost:9200/_cat/shards?pretty' | grep UNASSIGNED
.monitoring-es-6-2021.02.08 0 r UNASSIGNED
.monitoring-es-6-2021.01.25 0 r UNASSIGNED
.monitoring-es-6-2021.01.29 0 r UNASSIGNED
.monitoring-es-6-2021.02.17 0 r UNASSIGNED
.monitoring-es-6-2021.01.20 0 r UNASSIGNED
.monitoring-es-6-2021.01.31 0 r UNASSIGNED
.monitoring-es-6-2021.02.01 0 r UNASSIGNED
.monitoring-es-6-2021.02.09 0 r UNASSIGNED
.monitoring-es-6-2021.02.07 0 r UNASSIGNED
.monitoring-es-6-2021.02.12 0 r UNASSIGNED
.monitoring-es-6-2021.02.15 0 r UNASSIGNED
.monitoring-es-6-2021.02.16 0 r UNASSIGNED
.monitoring-es-6-2021.01.19 0 r UNASSIGNED
.monitoring-es-6-2021.01.22 0 r UNASSIGNED
.monitoring-es-6-2021.02.18 0 r UNASSIGNED
.monitoring-es-6-2021.02.11 0 r UNASSIGNED
.monitoring-es-6-2021.01.18 0 r UNASSIGNED
.monitoring-es-6-2021.01.24 0 r UNASSIGNED
....
We start Elasticsearch using a Docker image built from a Dockerfile like this:
ARG version=5.6.16
FROM docker.elastic.co/elasticsearch/elasticsearch:${version}
RUN /usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-phonetic \
&& /usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-icu
With this env:
environment:
- xpack.security.enabled=false
- discovery.type=single-node
- TAKE_FILE_OWNERSHIP=true
We found https://www.datadoghq.com/blog/elasticsearch-unassigned-shards/, but it only seems to solve this temporarily and would need to be executed regularly. We want to avoid a hacky cron-job solution and fix the root cause instead :-)
I believe your indexing pattern is creating multiple indices, and shards are tied to each index created; see https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster.
Check https://discuss.elastic.co/t/what-are-monitoring-indices/144855
Where do those shards come from?
"Those are system indices created by Elasticsearch and Kibana. They're
usually small and get deleted after seven days, so I wouldn't worry."
How can they be avoided?
"With 6.3, X-Pack ships with every distribution by default, and monitoring
is enabled by default. If you really want to disable it, you can; see
https://www.elastic.co/guide/en/elasticsearch/reference/current/monitoring-settings.html"
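For the Docker setup above, that disable switch would be one more entry in the environment block. This is a sketch, not a verified config: `xpack.monitoring.enabled` is the node-level setting for the 5.x/6.x line, while 6.3+ also exposes the dynamic cluster setting `xpack.monitoring.collection.enabled`; check the settings page linked above for your exact version.

```yaml
environment:
  - xpack.security.enabled=false
  - discovery.type=single-node
  - TAKE_FILE_OWNERSHIP=true
  # Assumed setting name for this version; stops .monitoring-* indices
  # (and thus their unassignable replica shards) from being created.
  - xpack.monitoring.enabled=false
```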
An "index pattern" here refers to the way you create your indices.
For example, the process above creates one index per day.
Other patterns:
Create an index per customer
Create an index per date
Create an index per year
etc.
There will also be those who have only one index and do not follow any pattern to create indices.
These patterns do not exist anywhere in ES itself.
A pattern is simply a convention that you, as an ES user, follow to create your indices according to your needs.
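To make the index-per-day pattern concrete, here is a small Python sketch (a hypothetical helper; the `.monitoring-es-6` prefix is taken from the shard listing above) showing how each day yields a brand-new index, and each index carries its own shards:

```python
from datetime import date, timedelta

def daily_index_name(prefix: str, day: date) -> str:
    # One index per day: the date becomes part of the index name.
    return f"{prefix}-{day.strftime('%Y.%m.%d')}"

# Three consecutive days produce three distinct indices,
# each of which gets its own primary (and replica) shards.
start = date(2021, 2, 8)
names = [daily_index_name(".monitoring-es-6", start + timedelta(days=i))
         for i in range(3)]
```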
Related
I want to check unassigned tickets in Jira using Java, Python, or any other script.
Using jira-python, we first search for all the possible issues, then check whether the assignee field is None, i.e. nobody is assigned.
# search_issues can only return up to 1000 issues at a time, so if there
# are more we have to search again, paging with startAt=count.
issues = []
count = 0
while True:
    tmp_issues = jira_connection.search_issues('', startAt=count, maxResults=999)
    if len(tmp_issues) == 0:
        # Since Python does not offer do-while, we have to break here.
        break
    issues.extend(tmp_issues)
    count += 999
not_assigned = []
for i in issues:
    if i.fields.assignee is None:
        not_assigned.append(i)
I have a dataset that has three variables which indicate a category of event at three time points (dispatch, beginning, end). I want to establish the number of cases where (a) the category is the same for all three time points (b) those which have changed at time point 2 (beginning) and (c) those which have changed at time point 3 (end).
Can anyone recommend some syntax or a starting point?
To measure a change (non-equivalence) against T0 (time zero, or in your case Dispatch), wouldn't you simply check for equivalence between the respective variables?
DATA LIST FREE /ID T0 T1 T2.
BEGIN DATA.
1 1 1 1.
2 1 1 0.
3 1 0 1.
4 0 1 1.
5 1 0 0.
6 0 1 0.
7 0 0 1.
8 0 0 0.
END DATA.
COMPUTE ChangeT1=T0<>T1.
COMPUTE ChangeT2=T0<>T2.
To check that all the values are the same across all three variables would just be the following (this assumes string variables; with numeric variables you could do it differently, for example using the standard deviation):
COMPUTE CheckNoChange=T0=T1 & T0=T2.
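For readers more at home outside SPSS, the same three checks can be sketched in plain Python (an illustration only; the rows mirror the DATA LIST example above):

```python
# Rows mirror the DATA LIST example: (ID, T0, T1, T2).
rows = [
    (1, 1, 1, 1), (2, 1, 1, 0), (3, 1, 0, 1), (4, 0, 1, 1),
    (5, 1, 0, 0), (6, 0, 1, 0), (7, 0, 0, 1), (8, 0, 0, 0),
]

def changes(row):
    _, t0, t1, t2 = row
    return {
        "ChangeT1": t0 != t1,        # category changed at the second time point
        "ChangeT2": t0 != t2,        # category changed at the third time point
        "NoChange": t0 == t1 == t2,  # same category at all three time points
    }

results = {row[0]: changes(row) for row in rows}
```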
I successfully wrote an intersection of text search and other criteria using Redis. To achieve that I'm using a Lua script. The issue is that I'm not only reading, but also writing values from that script. From Redis 3.2 it's possible to achieve that by calling redis.replicate_commands(), but not before 3.2.
Below is how I'm storing the values.
Names
> HSET product:name 'Cool product' 1
> HSET product:name 'Nice product' 2
Price
> ZADD product:price 49.90 1
> ZADD product:price 54.90 2
Then, to get all products that match 'ice', for example, I call:
> HSCAN product:name 0 MATCH *ice*
However, since HSCAN uses a cursor, I have to call it multiple times to fetch all results. This is where I'm using a Lua script:
local cursor = 0
local fields = {}
local ids = {}
local key = 'product:name'
local value = '*' .. ARGV[1] .. '*'

repeat
    local result = redis.call('HSCAN', key, cursor, 'MATCH', value)
    cursor = tonumber(result[1])
    fields = result[2]
    for i, id in ipairs(fields) do
        if i % 2 == 0 then
            ids[#ids + 1] = id
        end
    end
until cursor == 0

return ids
It's not possible to use the result of a script in another call, like SADD key EVAL(SHA) ..., and it's also not possible to use global variables within scripts. So I changed the part inside the fields loop to expose the list of IDs outside the script:
if i % 2 == 0 then
    ids[#ids + 1] = id
    redis.call('SADD', KEYS[1], id)
end
I had to add redis.replicate_commands() as the first line. With this change I can get all IDs from the key I passed when calling the script (see KEYS[1]).
And, finally, to get a list of 100 product IDs priced between 40 and 50 whose name contains "ice", I do the following:
> ZUNIONSTORE tmp:price 1 product:price WEIGHTS 1
> ZREMRANGEBYSCORE tmp:price 0 40
> ZREMRANGEBYSCORE tmp:price 50 +INF
> EVALSHA b81c2b... 1 tmp:name ice
> ZINTERSTORE tmp:result 2 tmp:price tmp:name
> ZCOUNT tmp:result -INF +INF
> ZRANGE tmp:result 0 100
I use the ZCOUNT call to know in advance how many result pages I'll have, doing count / 100.
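That count-to-pages step is a ceiling division, which is easy to get subtly wrong with plain `count / 100`. A quick sketch (page size 100, as in the ZRANGE call above):

```python
import math

def page_count(total: int, page_size: int = 100) -> int:
    # Number of result pages for a ZCOUNT total; 101 results need 2 pages.
    return math.ceil(total / page_size)
```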
As I said before, this works nicely with Redis 3.2. But when I tried to run the code on AWS, which only supports Redis up to 2.8, I couldn't make it work anymore. I'm not sure how to iterate with the HSCAN cursor without using a script, or without writing from the script. Is there a way to make it work on Redis 2.8?
Some considerations:
I know I can do part of the processing outside Redis (like iterating the cursor or intersecting the matches), but it would affect the application's overall performance.
I don't want to deploy a Redis instance of my own just to use version 3.2.
The criteria above (price range and name) are just an example to keep things simple here. I have other fields and types of matches, not only those.
I'm not sure the way I'm storing the data is the best way. I'm willing to listen to suggestions about it.
The only problem I see here is storing the values from inside a Lua script. So instead of storing them there, return them from the script (as an array of strings), store them in a set with a separate SADD key member [member ...] call, and then proceed with the intersection and return the results.
> ZUNIONSTORE tmp:price 1 product:price WEIGHTS 1
> ZREMRANGEBYSCORE tmp:price 0 40
> ZREMRANGEBYSCORE tmp:price 50 +INF
> nameSet[] = EVALSHA b81c2b... 0 ice
> SADD tmp:name nameSet
> ZINTERSTORE tmp:result 2 tmp:price tmp:name
> ZCOUNT tmp:result -INF +INF
> ZRANGE tmp:result 0 100
IMO your design is the most optimal one. One piece of advice would be to use pipelining wherever possible, as it processes everything in one go.
Hope this helps.
UPDATE
There is no such thing as an array ([ ]) in Lua; you have to use a Lua table to achieve it. In your script you are returning ids, right? That itself is an array, and you can use it in a separate call to achieve the SADD:
String[] nameSet = (String[]) evalsha b81c2b... 0 ice  // this is in Java
SADD tmp:name nameSet
And the corresponding lua script is the same as that of your 1st one.
local cursor = 0
local fields = {}
local ids = {}
local key = 'product:name'
local value = '*' .. ARGV[1] .. '*'

repeat
    local result = redis.call('HSCAN', key, cursor, 'MATCH', value)
    cursor = tonumber(result[1])
    fields = result[2]
    for i, id in ipairs(fields) do
        if i % 2 == 0 then
            ids[#ids + 1] = id
        end
    end
until cursor == 0

return ids
The problem isn't that you're writing to the database, it's that you're doing a write after a HSCAN, which is a non-deterministic command.
In my opinion there's rarely a good reason to use a SCAN command in a Lua script. The main purpose of the command is to allow you to do things in small batches so you don't lock up the server processing a huge key space (or hash key space). Since scripts are atomic, though, using HSCAN doesn't help—you're still locking up the server until the whole thing's done.
Here are the options I can see:
If you can't risk locking up the server with a lengthy command:
Use HSCAN on the client. This is the safest option, but also the slowest.
If you want to do as much processing as possible in a single atomic Lua command:
Use Redis 3.2 and script effects replication.
Do the scanning in the script, but return the values to the client and initiate the write from there. (That is, Karthikeyan Gopall's answer.)
Instead of HSCAN, do an HKEYS in the script and filter the results using Lua's pattern matching. Since HKEYS is deterministic you won't have a problem with the subsequent write. The downside, of course, is that you have to read in all of the keys first, regardless of whether they match your pattern. (Though HSCAN is also O(N) in the size of the hash.)
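The first option, driving the HSCAN cursor from the client, can be sketched as a small loop. `scan_fn` below is a stand-in for a real client call such as redis-py's `hscan(key, cursor, match=...)`; any client that returns `(next_cursor, {field: value})` fits (the wiring is an assumption, adapt it to your client):

```python
def collect_matching_ids(scan_fn, key, pattern):
    """Drive an HSCAN-style cursor to completion, collecting field values.

    scan_fn(key, cursor, match) must return (next_cursor, {field: value}),
    the shape redis-py's hscan returns. Iteration ends when the server
    hands back cursor 0.
    """
    cursor, ids = 0, []
    while True:
        cursor, page = scan_fn(key, cursor, pattern)
        ids.extend(page.values())  # hash values hold the product IDs
        if cursor == 0:
            return ids
```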
Using the Ruby Redis client:
I have a key that contains a list of values. They follow the pattern of
campaign_id|telephone|query_id
There are thousands of these in an individual list. What I want to do is delete all the entries that have, for example, a query_id of 4 from that Redis list. Can you do this through some sort of pattern matching? Could someone please give me an example, as I've been reading through other questions and am a bit lost.
You basically have one of two options: a) do it in your (RoR) application, or b) do it in Redis.
I'm not a RoR expert, so I can't advise on the how, but note that by taking path a) you'll basically be moving the entire list to your application and doing the filtering there. The bigger your list, the more time it will take to cross the network.
Option b) means that you'll be filtering the list right in Redis - this can be done simply and efficiently when you use Lua. Example:
$ cat dellistbyqueryid.lua
-- Removes the list elements that conform to a given query_id.
-- Elements are stored as: 'campaign_id|telephone|query_id'
-- KEYS[1] - a list
-- ARGV[1] - a query_id
-- return: number of elements removed
local l = tonumber(redis.call('LLEN', KEYS[1]))
local n = 0

while l > 0 do
    local curr = redis.call('LINDEX', KEYS[1], -1)
    local id = curr:match('.*|.*|(.*)')
    if id == ARGV[1] then
        redis.call('RPOP', KEYS[1])
        n = n + 1
    else
        redis.call('RPOPLPUSH', KEYS[1], KEYS[1])
    end
    l = l - 1
end

return n
Output:
$ redis-cli LPUSH list "foo|bar|1" "baz|qaz|2" "lua|redis|1"
(integer) 3
$ redis-cli --eval dellistbyqueryid.lua list , 1
(integer) 2
$ redis-cli LRANGE list 0 -1
1) "baz|qaz|2"
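To see the mechanics of the script's rotation trick (examine the tail element LLEN times; matching elements are popped, survivors are rotated to the head), here is the same algorithm sketched on a plain Python list. This is an illustration only, no Redis involved:

```python
from collections import deque

def remove_by_query_id(items, query_id):
    """Remove 'campaign_id|telephone|query_id' entries matching query_id.

    Mirrors the Lua script: look at the tail LLEN times; a match is
    popped (RPOP), anything else is rotated to the head (RPOPLPUSH).
    Mutates items in place and returns the number of elements removed.
    """
    lst = deque(items)
    removed = 0
    for _ in range(len(lst)):
        curr = lst[-1]
        if curr.rsplit("|", 1)[1] == query_id:   # the '(.*)' capture in Lua
            lst.pop()                            # RPOP
            removed += 1
        else:
            lst.appendleft(lst.pop())            # RPOPLPUSH key key
    items[:] = list(lst)
    return removed
```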
I'm using a Redis sorted set to store a ranking for a project I'm working on. We hadn't anticipated (!) how we wanted to handle ties. Redis sorts entries with the same score lexicographically, but what we want instead is to give the same rank to all the entries that have the same score, so for instance in the case of
redis 127.0.0.1:6379> ZREVRANGE foo 0 -1 WITHSCORES
1) "first"
2) "3"
3) "second3"
4) "2"
5) "second2"
6) "2"
7) "second1"
8) "2"
9) "fifth"
10) "1"
we want to consider second1, second2 and second3 as both having position 2, and fifth to have position 5. Hence there is no entry in the third or fourth position. ZREVRANK is not useful here, so what's the best way to get the number I'm looking for?
It seems to me one way is to write a little Lua script and use the EVAL command. The resulting operation still has logarithmic complexity.
For example, suppose we are interested in the position of second2. In the script, we first get its score with ZSCORE, obtaining 2. Then we get the first entry with that score using ZREVRANGEBYSCORE, obtaining second3. The position we're after is then the ZREVRANK of second3 plus 1.
redis 127.0.0.1:6379> ZSCORE foo second2
"2"
redis 127.0.0.1:6379> ZREVRANGEBYSCORE foo 2 2 LIMIT 0 1
1) "second3"
redis 127.0.0.1:6379> ZREVRANK foo second3
(integer) 1
So the script could be something like
local score = redis.call('zscore', KEYS[1], ARGV[1])
if score then
local member = redis.call('zrevrangebyscore', KEYS[1], score, score, 'limit', 0, 1)
return redis.call('zrevrank', KEYS[1], member[1]) + 1
else return -1 end
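The ranking rule this implements (standard competition ranking: tied members share the rank of the highest-placed among them, and the ranks after a tie are skipped) can be checked against the foo example with a plain Python sketch. The helper below is hypothetical and not part of the Redis answer; a member's rank is simply one plus the number of members with a strictly higher score:

```python
def competition_rank(scores, member):
    """1-based rank where ties share a rank.

    scores: dict of member -> score. Computes the same number as the
    ZREVRANK-of-first-tied-member trick above; returns -1 for a
    missing member, like the script's else branch.
    """
    if member not in scores:
        return -1
    s = scores[member]
    return 1 + sum(1 for v in scores.values() if v > s)
```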