Cassandra cql comparator type counter

I want to use the following code to update a counter field:
@@db.execute("UPDATE user_count SET counters = counters + #{val} WHERE cid = 1 ")
The first time I tried it, I got the following error:
CassandraCQL::Error::InvalidRequestException: invalid operation for non commutative columnfamily user_count
I found out that I have to use the counter comparator, but I can't find how to set that up with the cassandra-cql gem. Does anybody know how I can get this to work?
Below is my code, which does not work:
@@db.execute("CREATE COLUMNFAMILY user_count(cid varchar PRIMARY KEY, counters counter) with comparator = counter " )
@@db.execute("INSERT INTO user_count (cid, counters) VALUES (?,?)", 1, 0)

You need to set default_validation=CounterColumnType instead of comparator:
@@db.execute("CREATE COLUMNFAMILY user_count(cid varchar PRIMARY KEY, counters counter) with default_validation=CounterColumnType")
@@db.execute("update user_count set counters = counters + 1 where cid = 1")
You must use UPDATE to change a counter value; there is no INSERT syntax for counters. In CQL, UPDATE and INSERT do the same thing, so you can create new rows using UPDATE.
Currently you cannot mix counters and non-counters in the same column family (from the wiki: "A column family either contains only counters, or no counters at all.")
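As a rough sketch of the UPDATE-only pattern above: the helper below builds the statement text for a counter change and coerces the amount to an integer before interpolating it, since the question's original code interpolates #{val} directly. The name counter_update_cql is made up for this example and is not part of the cassandra-cql gem. The key is left as a ? bind marker, in the same style as the execute call in the question.

```ruby
# Hypothetical helper (not part of the cassandra-cql gem): builds the CQL
# text for a counter increment or decrement.
def counter_update_cql(column_family, column, delta)
  # Integer() raises ArgumentError or TypeError for anything non-numeric,
  # so arbitrary text never reaches the statement.
  amount = Integer(delta)
  operator = amount < 0 ? "-" : "+"
  "UPDATE #{column_family} SET #{column} = #{column} #{operator} #{amount.abs} WHERE cid = ?"
end

# Assumed usage, mirroring the calls in the question:
#   @@db.execute(counter_update_cql("user_count", "counters", val), 1)
```

The column family and column names are still interpolated as-is here, so they should come from your own code and never from user input.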

Related

How to properly parameterize my postgresql query

I'm trying to parameterize my postgresql query in order to prevent SQL injection in my ruby on rails application. The SQL query will sum a different value in my table depending on the input.
Here is a simplified version of my function:
def self.calculate_value(value)
  calculated_value = ""
  if value == "quantity"
    calculated_value = "COALESCE(sum(amount), 0)"
  elsif value == "retail"
    calculated_value = "COALESCE(sum(amount * price), 0)"
  elsif value == "wholesale"
    calculated_value = "COALESCE(sum(amount * cost), 0)"
  end
  query = <<-SQL
    select CAST(? AS DOUBLE PRECISION) as ? from table1
  SQL
  return Table1.find_by_sql([query, calculated_value, value])
end
If I call calculate_value("retail"), it will execute the query like this:
select location, CAST('COALESCE(sum(amount * price), 0)' AS DOUBLE PRECISION) as 'retail' from table1 group by location
This results in an error. I want it to execute without the quotes like this:
select location, CAST(COALESCE(sum(amount * price), 0) AS DOUBLE PRECISION) as retail from table1 group by location
I understand that the addition of quotations is what prevents the sql injection but how would I prevent it in this case? What is the best way to handle this scenario?
EDIT: I added an extra column to be fetched from the table to highlight that I can't use pick to get one value.

find_by_sql is used when you want to populate objects from a literal SQL query. But ActiveRecord can handle most of this; only one column needs custom SQL. To build objects, use select. If you just want the values, use pluck.
As you're picking from a fixed set of strings there's no risk of SQL injection in this code. Use Arel.sql to pass along a SQL literal you know is safe.
def self.calculate_value(result_name)
  sum_sql = case result_name
    when "quantity"
      "sum(amount)"
    when "retail"
      "sum(amount * price)"
    when "wholesale"
      "sum(amount * cost)"
  end
  sum_sql = Arel.sql(
    "coalesce(cast(#{sum_sql} as double precision), 0) as #{result_name}"
  )
  return Table1
    .group(:location)
    # replace pluck with select to get Table1 objects
    .pluck(:location, sum_sql)
end
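One thing to watch in the code above: the case expression returns nil for any other input, yet result_name is still interpolated into the SQL. Here is a minimal sketch of a stricter allow-list, in plain Ruby so it runs without Rails. SUM_EXPRESSIONS and select_fragment are names invented for this example; the returned string is what you would hand to Arel.sql.

```ruby
# Allow-list of result names and the SQL expression each one maps to.
# Only these keys can ever end up in the generated SQL.
SUM_EXPRESSIONS = {
  "quantity"  => "sum(amount)",
  "retail"    => "sum(amount * price)",
  "wholesale" => "sum(amount * cost)"
}.freeze

# Hypothetical helper: returns the select fragment for a known result name
# and raises KeyError for anything that is not on the list.
def select_fragment(result_name)
  sum_sql = SUM_EXPRESSIONS.fetch(result_name)
  "coalesce(cast(#{sum_sql} as double precision), 0) as #{result_name}"
end
```

Using fetch instead of a case with no else branch means an unexpected value fails loudly before any SQL is built.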

How to get the number of entries in a measurement

I am a newbie to InfluxDB. I just started to read the Influx documentation.
I can't seem to get the equivalent of 'select count(*) from table' to work in InfluxDB.
I have a measurement called cart:
time status cartid
1456116106077429261 0 A
1456116106090573178 0 B
1456116106095765618 0 C
1456116106101532429 0 D
but when I try to do
select count(cartid) from cart
I get the error
ERR: statement must have at least one field in select clause

I suppose cartid is a tag rather than a field value? count() currently can't be used on tag and time columns. So if your status is a non-tag column (a field), do the count on that.

This works as long as no field or tag exists with the name count:
SELECT SUM(count) FROM (SELECT *,count::INTEGER FROM MyMeasurement GROUP BY count FILL(1))
If it does, use some other name for the count field. This works by first selecting all entries, including an unpopulated field (count), and then grouping by that unpopulated field. The grouping does nothing by itself, but it allows us to use the fill operator to assign 1 to count for each entry. Then we select the sum of the count fields in an outer query. The result should look like this:
name: MyMeasurement
----------------
time sum
0 47799
It's a bit hacky but it's the only way to guarantee a count of all entries when no field exists that is always present in all entries.

PostgreSQL and ActiveRecord subselect for race condition

I'm experiencing a race condition in ActiveRecord with PostgreSQL where I'm reading a value then incrementing it and inserting a new record:
num = Foo.where(bar_id: 42).maximum(:number)
Foo.create!({
bar_id: 42,
number: num + 1
})
At scale, multiple threads will simultaneously read then write the same value of number. Wrapping this in a transaction doesn't fix the race condition because the SELECT doesn't lock the table. I can't use an auto increment, because number is not unique, it's only unique given a certain bar_id. I see 3 possible fixes:
Explicitly use a postgres lock (a row-level lock?)
Use a unique constraint and retry on fails (yuck!)
Override save to use a subselect, I.E.
INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42));
All these solutions seem like I'd be reimplementing large parts of ActiveRecord::Base#save! Is there an easier way?
UPDATE:
I thought I had found the answer with Foo.lock(true).where(bar_id: 42).maximum(:number), but that uses SELECT FOR UPDATE, which isn't allowed on aggregate queries.
UPDATE 2:
I've just been informed by our DBA that even if we could do INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42)); that doesn't fix anything, since the SELECT runs in a different lock than the INSERT.

Your options are:
Run in SERIALIZABLE isolation. Interdependent transactions will be aborted on commit as having a serialization failure. You'll get lots of error log spam, and you'll be doing lots of retries, but it'll work reliably.
Define a UNIQUE constraint and retry on failure, as you noted. Same issues as above.
If there is a parent object, you can SELECT ... FOR UPDATE the parent object before doing your max query. In this case you'd SELECT 1 FROM bar WHERE bar_id = $1 FOR UPDATE. You are using bar as a lock for all foos with that bar_id. You can then know that it's safe to proceed, so long as every query that's doing your counter increment does this reliably. This can work quite well.
This still does an aggregate query for each call, which (per next option) is unnecessary, but at least it doesn't spam the error log like the above options.
Use a counter table. This is what I'd do. Either in bar, or in a side-table like bar_foo_counter, acquire a row ID using
UPDATE bar_foo_counter SET counter = counter + 1
WHERE bar_id = $1 RETURNING counter
or the less efficient option if your framework can't handle RETURNING:
SELECT counter FROM bar_foo_counter
WHERE bar_id = $1 FOR UPDATE;
UPDATE bar_foo_counter SET counter = counter + 1
WHERE bar_id = $1;
Then, in the same transaction, use the generated counter value for the number (with the second variant, that is the selected value plus one). When you commit, the counter table row for that bar_id gets unlocked for the next query to use. If you roll back, the change is discarded.
I recommend the counter approach, using a dedicated side table for the counter instead of adding a column to bar. That's cleaner to model, and means you create less update bloat in bar, which can slow down queries to bar.
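To illustrate why the counter approach avoids the race, here is an in-memory analogy in plain Ruby. It is not database code: CounterTable and allocate_concurrently are invented for this sketch, and the Mutex stands in for the row lock that the UPDATE takes on the counter row. Each caller atomically increments and reads back the value, so no two callers can get the same number for a given bar_id.

```ruby
require "thread"

# In-memory stand-in for the bar_foo_counter table (hypothetical class).
class CounterTable
  def initialize
    @lock = Mutex.new          # plays the role of the row lock
    @counters = Hash.new(0)    # bar_id => last number handed out
  end

  # Analogue of: UPDATE ... SET counter = counter + 1 ... RETURNING counter
  def next_number(bar_id)
    @lock.synchronize { @counters[bar_id] += 1 }
  end
end

# Runs several threads that all request numbers for the same bar_id and
# returns every number that was handed out.
def allocate_concurrently(table, bar_id, workers: 8, per_worker: 100)
  results = Queue.new
  threads = Array.new(workers) do
    Thread.new { per_worker.times { results << table.next_number(bar_id) } }
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }
end
```

Calling allocate_concurrently(CounterTable.new, 42) should hand out each number exactly once, which is the property the read-max-then-insert version in the question lacks.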

Order by nil value in column

I have a table with a column position, which for some collections of records can be nil. I have a default order option like
order('position ASC')
id| name | position
1 5 null
2 6 null
3 7 null
If all records in the collection I sort have null in the position column (as in the example above), in which order will I get this collection from the DB?
I'm guessing I will get the collection in order of ids (1,2,3). Am I correct?
Addition #1: DB - Postgresql

According to the Postgres manual, if there is no sorting clause, rows are returned in an unspecified order that depends on the query plan and the physical layout on disk. It says nothing about the relative order of sorted rows with equal values in the sort columns. In practice they tend to come back in whatever order the index or table scan produces them, and you must expect that order to change whenever the database reorganizes its storage.
In the end, there is no guarantee on the order of records with the same values in the sort fields.
Note: in Postgres you can place the NULL values first or last with NULLS FIRST / NULLS LAST (this is detailed at the linked reference).
On this related question, I agree with @macek.

You can do something like this.
Cats:
id| name | position
1 5 null
2 6 null
3 7 not_null
nulls = Cat.where(position: nil).order("id ASC")              # ids [1, 2]
not_nulls = Cat.where("position is not null").order("id ASC") # ids [3]
not_nulls.to_a + nulls.to_a                                   # ids [3, 1, 2]
This preserves order.
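The same idea can be expressed as a single sort with an explicit tie-breaker. In Postgres that would be something like ORDER BY position ASC NULLS LAST, id ASC. Below is a small in-memory sketch of that ordering in plain Ruby, using hashes in place of Cat records; order_by_position is a name made up for this example.

```ruby
# Sorts rows by position with nulls last, breaking ties by id so the
# result is deterministic even when every position is nil.
def order_by_position(rows)
  rows.sort_by do |row|
    position = row[:position]
    # The first element pushes nil positions to the end; the 0 is only a
    # placeholder so that nil is never compared with an Integer.
    [position.nil? ? 1 : 0, position || 0, row[:id]]
  end
end
```

Without the id in the sort key, rows that share a position (or are all null) have no defined relative order.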

MySQL stored procedure causing problems?

EDIT:
I've narrowed my MySQL wait timeout down to these lines:
IF @resultsFound > 0 THEN
INSERT INTO product_search_query (QueryText, CategoryId) VALUES (keywords, topLevelCategoryId);
END IF;
Any idea why this would cause a problem? I can't work it out!
I've written a stored proc to search for products in certain categories. Due to certain constraints I came across, I was unable to do what I wanted (limiting, whilst still returning the total number of rows found, with sorting, etc.).
It splits up a string of category IDs, such as 1,2,3, into a temporary table, then builds the full-text search query based on sorting options and limits, executes the query string, and then selects out the total number of results.
Now, I know I'm no MySQL guru, very far from it. I've got it working, but I keep getting timeouts with product searches etc., so I'm thinking this may be causing some kind of problem.
Does anyone have any ideas how I can tidy this up, or even do it in a much better way that I probably don't know about?
Thanks.
DELIMITER $$
DROP PROCEDURE IF EXISTS `product_search` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `product_search`(keywords text, categories text, topLevelCategoryId int, sortOrder int, startOffset int, itemsToReturn int)
BEGIN
declare foundPos tinyint unsigned;
declare tmpTxt text;
declare delimLen tinyint unsigned;
declare element text;
declare resultingNum int unsigned;
drop temporary table if exists categoryIds;
create temporary table categoryIds
(
`CategoryId` int
) engine = memory;
set tmpTxt = categories;
set foundPos = instr(tmpTxt, ',');
while foundPos <> 0 do
set element = substring(tmpTxt, 1, foundPos-1);
set tmpTxt = substring(tmpTxt, foundPos+1);
set resultingNum = cast(trim(element) as unsigned);
insert into categoryIds (`CategoryId`) values (resultingNum);
set foundPos = instr(tmpTxt,',');
end while;
if tmpTxt <> '' then
insert into categoryIds (`CategoryId`) values (tmpTxt);
end if;
CASE
WHEN sortOrder = 0 THEN
SET @sortString = "ProductResult_Relevance DESC";
WHEN sortOrder = 1 THEN
SET @sortString = "ProductResult_Price ASC";
WHEN sortOrder = 2 THEN
SET @sortString = "ProductResult_Price DESC";
WHEN sortOrder = 3 THEN
SET @sortString = "ProductResult_StockStatus ASC";
END CASE;
SET @theSelect = CONCAT(CONCAT("
SELECT SQL_CALC_FOUND_ROWS
supplier.SupplierId as Supplier_SupplierId,
supplier.Name as Supplier_Name,
supplier.ImageName as Supplier_ImageName,
product_result.ProductId as ProductResult_ProductId,
product_result.SupplierId as ProductResult_SupplierId,
product_result.Name as ProductResult_Name,
product_result.Description as ProductResult_Description,
product_result.ThumbnailUrl as ProductResult_ThumbnailUrl,
product_result.Price as ProductResult_Price,
product_result.DeliveryPrice as ProductResult_DeliveryPrice,
product_result.StockStatus as ProductResult_StockStatus,
product_result.TrackUrl as ProductResult_TrackUrl,
product_result.LastUpdated as ProductResult_LastUpdated,
MATCH(product_result.Name) AGAINST(?) AS ProductResult_Relevance
FROM
product_latest_state product_result
JOIN
supplier ON product_result.SupplierId = supplier.SupplierId
JOIN
category_product ON product_result.ProductId = category_product.ProductId
WHERE
MATCH(product_result.Name) AGAINST (?)
AND
category_product.CategoryId IN (select CategoryId from categoryIds)
ORDER BY
", @sortString), "
LIMIT ?, ?;
");
set @keywords = keywords;
set @startOffset = startOffset;
set @itemsToReturn = itemsToReturn;
PREPARE TheSelect FROM @theSelect;
EXECUTE TheSelect USING @keywords, @keywords, @startOffset, @itemsToReturn;
SET @resultsFound = FOUND_ROWS();
SELECT @resultsFound as 'TotalResults';
IF @resultsFound > 0 THEN
INSERT INTO product_search_query (QueryText, CategoryId) VALUES (keywords, topLevelCategoryId);
END IF;
END $$
DELIMITER ;
Any help is very very much appreciated!

There is little you can do with this query.
Try this:
Create a PRIMARY KEY on categoryIds (CategoryId)
Make sure that supplier (SupplierId) is a PRIMARY KEY
Make sure that category_product (ProductId, CategoryId) (in this order) is a PRIMARY KEY, or that you have an index with ProductId leading.
Update:
If it's the INSERT that causes the problem and product_search_query is a MyISAM table, the issue may be MyISAM locking.
MyISAM locks the whole table if it decides to insert a row into a free block in the middle of the table, which can cause the timeouts.
Try using INSERT DELAYED instead:
IF @resultsFound > 0 THEN
INSERT DELAYED INTO product_search_query (QueryText, CategoryId) VALUES (keywords, topLevelCategoryId);
END IF;
This will put the records into the insertion queue and return immediately. The record will be added later, asynchronously.
Note that you may lose information if the server dies after the command is issued but before the records are actually inserted.
Update:
Since your table is InnoDB, it may be an issue with locking. INSERT DELAYED is not supported on InnoDB.
Depending on the nature of the query, DML queries on an InnoDB table may place gap locks, which will block the inserts.
For instance:
CREATE TABLE t_lock (id INT NOT NULL PRIMARY KEY, val INT NOT NULL) ENGINE=InnoDB;
INSERT
INTO t_lock
VALUES
(1, 1),
(2, 2);
This query performs ref scans and places the locks on individual records:
-- Session 1
START TRANSACTION;
UPDATE t_lock
SET val = 3
WHERE id IN (1, 2)
-- Session 2
START TRANSACTION;
INSERT
INTO t_lock
VALUES (3, 3)
-- Success
This query, while doing the same, performs a range scan and places a gap lock after key value 2, which will not let insert key value 3:
-- Session 1
START TRANSACTION;
UPDATE t_lock
SET val = 3
WHERE id BETWEEN 1 AND 2
-- Session 2
START TRANSACTION;
INSERT
INTO t_lock
VALUES (3, 3)
-- Locks

Try wrapping your EXECUTE with the following:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
EXECUTE TheSelect USING @keywords, @keywords, @startOffset, @itemsToReturn;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
I do something similar in T-SQL for all reporting stored procs and searches where repeatable reads aren't important, to reduce locking/blocking issues with other processes running on the database.

Turn on the slow query log; that will give you an idea of what is taking so long to execute that there is a timeout.
http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
Pick the slowest query and optimise that. Then run for a while and repeat.
There is some excellent information and tools here: http://hackmysql.com/nontech
DC
UPDATE:
Either you have a network problem causing the timeout (unlikely if you are using a local MySQL instance), or something is locking a table for far too long. The process that is locking the table or tables will be listed in the slow log as a slow query. You can also configure the slow query log to record any queries that fail to use an index, which result in inefficient queries.
If you can get the problem to occur while you are present, you can also use a tool like phpMyAdmin or the command line to run "SHOW PROCESSLIST\G". This gives you a list of the queries that are running while the problem is occurring.
You think the problem is in your INSERT statement, so something is locking that table. You therefore need to find what is running so slowly that it holds the lock for far too long. The slow query log is one way to do that.
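If the log gets long, a few lines of Ruby can pull out the worst offenders. This is only a sketch: it assumes the usual "# Query_time: ... Lock_time: ..." header line that MySQL writes before each logged statement, and the sample entries in SAMPLE_LOG are made up for illustration, not taken from a real server. slowest_queries is a name invented for this example.

```ruby
# Made-up sample in the general shape of a MySQL slow query log.
SAMPLE_LOG = <<~LOG
  # Query_time: 0.512340  Lock_time: 0.000120 Rows_sent: 20  Rows_examined: 48211
  SELECT SQL_CALC_FOUND_ROWS product_result.ProductId FROM product_latest_state product_result;
  # Query_time: 12.034201  Lock_time: 9.871002 Rows_sent: 0  Rows_examined: 0
  INSERT INTO product_search_query (QueryText, CategoryId) VALUES ('lamp', 3);
LOG

# Returns the `limit` slowest entries as hashes with :query_time,
# :lock_time and the :sql text that followed the header line.
def slowest_queries(log_text, limit = 3)
  entries = []
  current = nil
  log_text.each_line do |line|
    header = line.match(/\A# Query_time: (\d+(?:\.\d+)?)\s+Lock_time: (\d+(?:\.\d+)?)/)
    if header
      current = { query_time: header[1].to_f, lock_time: header[2].to_f, sql: String.new }
      entries << current
    elsif current && !line.start_with?("#")
      current[:sql] << line   # statement text belonging to the last header
    end
  end
  entries.sort_by { |entry| -entry[:query_time] }.first(limit)
end
```

A high lock_time relative to query_time, as in the second sample entry, is the kind of hint that a statement spent most of its time waiting on a lock and not doing work.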
Other things to look at:
CPU - is it idle or running at full pelt?
IO - is IO causing holdups?
RAM - are you swapping all the time? (This will cause excessive IO.)
Does the table product_search_query use an index?
What is the primary key?
Does your index use strings that are too long? You may build a huge index file that causes very slow inserts (the slow query log will also show that).
And yes, the problem may be elsewhere, but you must start somewhere, mustn't you?
DC
