How to combine search text with other criteria using Redis? - lua

I successfully wrote an intersection of text search and other criteria using Redis. To achieve that I'm using a Lua script. The issue is that I'm not only reading, but also writing values from that script. From Redis 3.2 it's possible to achieve that by calling redis.replicate_commands(), but not before 3.2.
Below is how I'm storing the values.
Names
> HSET product:name 'Cool product' 1
> HSET product:name 'Nice product' 2
Price
> ZADD product:price 49.90 1
> ZADD product:price 54.90 2
Then, to get all products that matches 'ice', for example, I call:
> HSCAN product:name 0 MATCH *ice*
However, since HSCAN uses a cursor, I have to call it multiple times to fetch all results. This is where I'm using a Lua script:
local cursor = 0
local fields = {}
local ids = {}
local key = 'product:name'
local value = '*' .. ARGV[1] .. '*'
repeat
local result = redis.call('HSCAN', key, cursor, 'MATCH', value)
cursor = tonumber(result[1])
fields = result[2]
for i, id in ipairs(fields) do
if i % 2 == 0 then
ids[#ids + 1] = id
end
end
until cursor == 0
return ids
Since it's not possible to use the result of a script with another call, like SADD key EVAL(SHA) .... And also, it's not possible to use global variables within scripts. I've changed the part inside the fields' loop to access the list of ID's outside the script:
if i % 2 == 0 then
ids[#ids + 1] = id
redis.call('SADD', KEYS[1], id)
end
I had to add redis.replicate_commands() to the first line. With this change I can get all ID's from the key I passed when calling the script (see KEYS[1]).
And, finally, to get a list 100 product ID's priced between 40 and 50 where the name contains "ice", I do the following:
> ZUNIONSTORE tmp:price 1 product:price WEIGHTS 1
> ZREMRANGEBYSCORE tmp:price 0 40
> ZREMRANGEBYSCORE tmp:price 50 +INF
> EVALSHA b81c2b... 1 tmp:name ice
> ZINTERSTORE tmp:result tmp:price tmp:name
> ZCOUNT tmp:result -INF +INF
> ZRANGE tmp:result 0 100
I use the ZCOUNT call to know in advance how many result pages I'll have, doing count / 100.
As I said before, this works nicely with Redis 3.2. But when I tried to run the code at AWS, which only supports Redis up to 2.8, I couldn't make it work anymore. I'm not sure how to iterate with HSCAN cursor without using a script or without writing from the script. There is a way to make it work on Redis 2.8?
Some considerations:
I know I can do part of the processing outside Redis (like iterate the cursor or intersect the matches), but it'll affect the application overall performance.
I don't want to deploy a Redis instance by my own to use version 3.2.
The criteria above (price range and name) is just an example to keep things simple here. I have other fields and type of matches, not only those.
I'm not sure if the way I'm storing the data is the best way. I'm willing to listen suggestion about it.

The only problem I found here is storing the values inside a lua scirpt. So instead of storing them inside a lua, take that value outside lua (return that values of string[]). Store them in a set in a different call using sadd (key,members[]). Then proceed with intersection and returning results.
> ZUNIONSTORE tmp:price 1 product:price WEIGHTS 1
> ZREVRANGEBYSCORE tmp:price 0 40
> ZREVRANGEBYSCORE tmp:price 50 +INF
> nameSet[] = EVALSHA b81c2b... 1 ice
> SADD tmp:name nameSet
> ZINTERSTORE tmp:result tmp:price tmp:name
> ZCOUNT tmp:result -INF +INF
> ZRANGE tmp:result 0 100
IMO your design is the most optimal one. One advice would be to use pipeline wherever possible, as it would process everything at one go.
Hope this helps
UPDATE
There is no such thing like array ([ ]) in lua you have to use the lua table to achieve it. In your script you are returning ids right, that itself is an array you can use it as a separate call to achieve the sadd.
String [] nameSet = (String[]) evalsha b81c2b... 1 ice -> This is in java
SADD tmp:name nameSet
And the corresponding lua script is the same as that of your 1st one.
local cursor = 0
local fields = {}
local ids = {}
local key = 'product:name'
local value = '*' .. ARGV[1] .. '*'
repeat
local result = redis.call('HSCAN', key, cursor, 'MATCH', value)
cursor = tonumber(result[1])
fields = result[2]
for i, id in ipairs(fields) do
if i % 2 == 0 then
ids[#ids + 1] = id
end
end
until cursor == 0
return ids

The problem isn't that you're writing to the database, it's that you're doing a write after a HSCAN, which is a non-deterministic command.
In my opinion there's rarely a good reason to use a SCAN command in a Lua script. The main purpose of the command is to allow you to do things in small batches so you don't lock up the server processing a huge key space (or hash key space). Since scripts are atomic, though, using HSCAN doesn't help—you're still locking up the server until the whole thing's done.
Here are the options I can see:
If you can't risk locking up the server with a lengthy command:
Use HSCAN on the client. This is the safest option, but also the slowest.
If you're want to do as much processing in a single atomic Lua command as possible:
Use Redis 3.2 and script effects replication.
Do the scanning in the script, but return the values to the client and initiate the write from there. (That is, Karthikeyan Gopall's answer.)
Instead of HSCAN, do an HKEYS in the script and filter the results using Lua's pattern matching. Since HKEYS is deterministic you won't have a problem with the subsequent write. The downside, of course, is that you have to read in all of the keys first, regardless of whether they match your pattern. (Though HSCAN is also O(N) in the size of the hash.)

Related

wrk executing Lua script

My question is that when I run
wrk -d10s -t20 -c20 -s /mnt/c/xxxx/post.lua http://localhost:xxxx/post
the Lua script that is only executed once? It will only put one item into the database at the URL.
-- example HTTP POST script which demonstrates setting the
-- HTTP method, body, and adding a header
math.randomseed(os.time())
number = math.random()
wrk.method = "POST"
wrk.headers["Content-Type"] = "application/json"
wrk.body = '{"name": "' .. tostring(number) .. '", "title":"test","enabled":true,"defaultValue":false}'
Is there a way to make it create the 'number' variable dynamically and keep adding new items into the database until the 'wrk' command has finished its test? Or that it will keep executing the script for the duration of the test creating and inserting new 'number' variables into 'wrk.body' ?
Apologies I have literally only being looking at Lua for a few hours.
Thanks
When you do
number = math.random
you're not setting number to a random number, you're setting it equal to the function math.random. To set the variable to the value returned by the function, that line should read
number = math.random()
You may also need to set a random seed (with the math.randomseed() function and your choice of an appropriately variable argument - system time is common) to avoid math.random() giving the same result each time the script is run. This should be done before the first call to math.random.
As the script is short, system time probably isn't a good choice of seed here (the script runs far quicker than the value from os.time() changes, so running it several times immediately after one another gives the same results each time). Reading a few bytes from /dev/urandom should give better results.
You could also just use /dev/urandom to generate a number directly, rather than feeding it to math.random as a seed. Like in the code below, as taken from this answer. This isn't a secure random number generator, but for your purposes it would be fine.
urand = assert (io.open ('/dev/urandom', 'rb'))
rand = assert (io.open ('/dev/random', 'rb'))
function RNG (b, m, r)
b = b or 4
m = m or 256
r = r or urand
local n, s = 0, r:read (b)
for i = 1, s:len () do
n = m * n + s:byte (i)
end
return n
end

How to use the result of SMEMBERS as input for SUNION in a Lua script

I'm trying to produce a Lua script that takes the members of a set (each representing a set as well) and returs the union.
This is a concrete example with these 3 sets:
smembers u:1:skt:n1
1) "s2"
2) "s3"
3) "s1"
smembers u:1:skt:n2
1) "s4"
2) "s5"
3) "s6"
smembers u:1:skts
1) "u:1:skt:n1"
2) "u:1:skt:n2"
So the set u:1:skts contains the reference of the other 2 sets and I
want to produce the union of u:1:skt:n1 and u:1:skt:n2 as follows:
1) "s1"
2) "s2"
3) "s3"
4) "s4"
5) "s5"
6) "s6"
This is what I have so far:
local indexes = redis.call("smembers", KEYS[1])
return redis.call("sunion", indexes)
But I get the following error:
(error) ERR Error running script (call to f_c4d338bdf036fbb9f77e5ea42880dc185d57ede4):
#user_script:1: #user_script: 1: Lua redis() command arguments must be strings or integers
It seems like it doesn't like the indexes variable as input of the sunion command. Any ideas?
Do not do this, or you'll have trouble moving to the cluster. This is from the documentation:
All Redis commands must be analyzed before execution to determine which keys the command will operate on. In order for this to be true for EVAL, keys must be passed explicitly. This is useful in many ways, but especially to make sure Redis Cluster can forward your request to the appropriate cluster node.
Note this rule is not enforced in order to provide the user with opportunities to abuse the Redis single instance configuration, at the cost of writing scripts not compatible with Redis Cluster.
If you still decide to go against the rules, use Lua's unpack:
local indexes = redis.call("smembers", KEYS[1])
return redis.call("sunion", unpack(indexes))

how to delete key from list in redis matching a pattern?

Using the ruby redis client
I have a key that contains a list of values they follow the pattern of
campaign_id|telephone|query_id
there are thousands of these in a individual list what i want to do is delete all the ones that have for example the query_id of 4 from that redis list. Can you do this through some sort of pattern matching? Please could someone give me an example as i've been reading through other questions and am a bit lost
You basically have one of two options: a) do it your (RoR) application or b) do it in Redis.
I'm not a RoR expert so I can't advise on the how, but note that taking the a) path you'll basically be moving the entire list to your application and there do the filtering. The bigger your list is, the more time it will take it to cross the network.
Option b) means that you'll be filtering the list right in Redis - this can be done simply and efficiently when you use Lua. Example:
$ cat dellistbyqueryid.lua
-- removes a range of list elements that confirm to a given
-- query_id. Elements are stored as: 'campaign_id|telephone|query_id'
-- KEYS[1] - a list
-- ARGV[1] - a query_id
-- return: number of elements removed
local l = tonumber(redis.call('LLEN', KEYS[1]))
local n = 0
while l > 0 do
local curr = redis.call('LINDEX', KEYS[1], -1)
local id = curr:match( '.*|.*|(.*)' )
if id == ARGV[1] then
redis.call('RPOP', KEYS[1])
n = n + 1
else
redis.call('RPOPLPUSH', KEYS[1], KEYS[1])
end
l = l - 1
end
return n
Output:
$ redis-cli LPUSH list "foo|bar|1" "baz|qaz|2" "lua|redis|1"
(integer) 3
$ redis-cli --eval dellistbyqueryid.lua list , 1
(integer) 2
$ redis-cli LRANGE list 0 -1
1) "baz|qaz|2"

Multiple-get in one go - Redis

How can we use lua scripting to implement multi-get.
Let's say if I've set name_last to Beckham and name_first to David. What should the lua script be in order to get both name_last and name_first in one go?
I can implement something like:
eval "return redis.call('get',KEYS[1])" 1 foo
to get the value of single key. Just wondering on how to enhance that scripting part to get values related to all keys (or multiple keys) by just making one call to redis server.
First, you want to send the fields you want to return to EVAL (the 0 indicates that there are no KEYS so these arguments will be accessible from ARGV):
eval "..." 0 name_last name_first
Second, you can query the values for the individual fields using MGET:
local values = redis.call('MGET', unpack(ARGV))
Third, you can maps the values back to the field names (the index of each value corresponds to the same field):
local results = {}
for i, key in ipairs(ARGV) do
results[key] = values[i]
end
return results
The command you'll end up executing would be:
eval "local values = redis.call('MGET', unpack(ARGV)); local results = {}; for i, key in ipairs(ARGV) do results[key] = values[i] end; return results" 0 name_last name_first
Do a loop over the KEYS table and for each store its GET response in a take that you return. Untested code:
local t = {}
for _, k in pairs(KEYS) do
t[#t+1] = redis.call('GET', k)
end
return t
P.S. You can also use MGET instead btw :)

How can filter any SET by its concat value according to another SET in Redis

I have a filter optimization problem in Redis.
I have a Redis SET which keeps the doc and pos pairs of a type in a corpus.
example:
smembers type_in_docs.1
result: doc.pos pairs
array (size=216627)
0 => string '2805.2339' (length=9)
1 => string '2410.14208' (length=10)
2 => string '3516.1810' (length=9)
...
Another redis set i create live according to user choices
It contains selected docs.
smembers filteredDocs
I want to filter doc.pos pairs "type_in_docs" set according to user Doc id choices.
In fact if i didnt use concat values in set it was easy with SINTER.
So i implement a php filter code as below.
It works but need an optimization.
In big doc.pairs set too much time need. (Nearly After 150000 members!)
$concordance= $this->redis->smembers('types_in_docs.'.$typeID);
$filteredDocs= $this->redis->smembers('filteredDocs');
$filtered = array_filter($concordance, function($pairs) use ($filteredDocs) {
if( in_array(substr($pairs, 0, strpos($pairs, '.')), $filteredDocs) ) return true;
});
I tried sorted set with scores as docId.
Bu couldnt find a intersect or filter option for score values.
I am thinking and searching a Redis based solution with supported keys, sets or Lua script for time optimization.
But nothing find.
How can i filter Redis sets with concat values?
Thanks for helps.
Your code is slow primarily because you're moving a lot of data from Redis to your PHP filter. The general motivation here should be perform as much filtering as possible on the server. To do that you'd need to pay some sort of price in CPU & RAM.
There are many ways to do this, here's one:
Ensure you're using Redis v2.8.9 or above.
To allow efficiently looking for doc only, keep your doc.pos pairs as is but use Sorted Sets with score = 0, your e.g.:
ZADD type_in_docs.1 0 2805.2339 0 2410.14208 0 3516.1810
This will allow you to mimic SISMEMBER for doc in the set with:
ZRANGEBYLEX type_in_docs.1 [<$typeID> (<$typeID + "\xff">
You can now just SMEMBERS on the (usually) smaller filterDocs set and then call ZRANGEBYLEX on each for immediate gains.
If you want to do better - in extreme cases (i.e. large filterDocs, small type_in_docs) you should do the reverse.
If you want to do even better, use Lua to wrap up the filtering logic - something like:
-- #usage: redis-cli --filter_doc_pos.lua <filter set keyname> <type pairs keyname>
-- #returns: list of matching doc.pos pairs
local r = {}
for _, fv in pairs(redis.call("SMEMBERS", KEYS[1])) do
local t = redis.call("ZRANGEBYLEX", KEYS[2], "[" .. fv , "(" .. fv .. "\xff")
for _, tv in pairs(t) do
r[#r+1] = tv
end
end
return r

Resources