So I'm trying to pivot a table using a stored procedure in Redshift. The issue is that the result set is dynamic. That means that we'd need to be able to dynamically pivot the table below. This is what I am trying to pivot:
| object_uid | field | value |
|------------|----------|----------|
| post:1 | field_1_a| test |
| post:2 | field_2_a| turtle |
| post:2 | field_2_b| frog |
| post:3 | field_3_a| mountain |
| ...... | ..... | ...... |
|------------|----------|----------|
This would be pivoted into the following:
| object_uid | field_1_a| field_2_a| field_2_b| field_3_a|
|------------|----------|----------|----------|----------|
| post:1 | test | | | |
| post:2 | | turtle | frog | |
| post:3 | | | | mountain |
| ....... | ..... | ....... | ....... | ....... |
|------------|----------|----------|----------|----------|
Essentially I am trying to construct a chained string of column names (the field_* columns) via SELECT LISTAGG statement in the subquery, and trying to interpolate that statement's output in the CREATE TABLE sql statement. Then once the CREATE TABLE sql statement is constructed, the sql statement gets executed via the EXECUTE command.
However, this is not behaving as expected. I am a relative newcomer to Redshift, so I apologize in advance if this is a terrible way to pivot a table from a tall one to a wide one. This is the code that I have so far:
CREATE OR REPLACE PROCEDURE my_proc (
    tmp_name INOUT varchar(256)
) AS $$
DECLARE
    sql VARCHAR(MAX) := '';
BEGIN
    WITH pivot_output AS (
        SELECT LISTAGG(temp.output, ', ') WITHIN GROUP (ORDER BY temp.output) AS metadata FROM
        (
            SELECT DISTINCT
                'MAX(IF(cm.metadata = ''' || metadata || ''',cm.field_value,NULL)) AS ' || QUOTE_LITERAL(metadata)
                AS output
            FROM "content_metadata" cm
            WHERE cm."source_uid_type" = 'post'
        ) AS temp
    );
    sql = 'CREATE TABLE ' || tmp_name || ' AS SELECT cm.object_uid, ' || pivot_output.metadata || ' FROM content_metadata cm GROUP BY cm.object_uid';
    EXECUTE sql;
END;
$$ LANGUAGE plpgsql;
CALL my_proc ('output_table');
I get the following error when trying to execute the above:
The database reported a syntax error: Amazon Invalid operation: syntax error at or near "$1";
A little bit stumped by the error. Does anyone have any clues / suggestions?
I added a working example of emulating the PIVOT FOR syntax in our GitHub repo "Amazon Redshift Utils". https://github.com/awslabs/amazon-redshift-utils/blob/master/src/StoredProcedures/sp_pivot_for.sql
I hope this is useful for you. Let me know if you have any issues with it.
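As a rough illustration of the string-building step the question describes (collect the distinct field names, turn each one into an aggregate expression, then splice the list into a CREATE TABLE ... AS SELECT statement), here is a minimal Python sketch. It is not Redshift code: it runs against an in-memory SQLite database through Python's sqlite3 module, uses the standard CASE WHEN form instead of IF, and takes the column names object_uid, field and value from the sample table in the question. The function names pivot_columns and build_pivot_sql are made up for this example.

```python
import sqlite3


def pivot_columns(fields):
    """Return one 'MAX(CASE ...) AS "name"' expression per field, comma-separated."""
    parts = []
    for name in fields:
        literal = name.replace("'", "''")   # escape for use as a SQL string literal
        ident = name.replace('"', '""')     # escape for use as a quoted identifier
        parts.append(f"MAX(CASE WHEN field = '{literal}' THEN value END) AS \"{ident}\"")
    return ", ".join(parts)


def build_pivot_sql(conn, source_table, target_table):
    """Build the CREATE TABLE ... AS SELECT statement from the fields found in source_table."""
    # Table names are assumed to be trusted identifiers in this sketch.
    fields = [row[0] for row in conn.execute(
        f'SELECT DISTINCT field FROM "{source_table}" ORDER BY field')]
    return (f'CREATE TABLE "{target_table}" AS '
            f'SELECT object_uid, {pivot_columns(fields)} '
            f'FROM "{source_table}" GROUP BY object_uid')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content_metadata (object_uid TEXT, field TEXT, value TEXT)")
conn.executemany("INSERT INTO content_metadata VALUES (?, ?, ?)", [
    ("post:1", "field_1_a", "test"),
    ("post:2", "field_2_a", "turtle"),
    ("post:2", "field_2_b", "frog"),
    ("post:3", "field_3_a", "mountain"),
])

# Build the statement as a string first, then execute it (the EXECUTE step in the procedure).
conn.execute(build_pivot_sql(conn, "content_metadata", "output_table"))
rows = conn.execute("SELECT * FROM output_table ORDER BY object_uid").fetchall()
for row in rows:
    print(row)
```

The point of the sketch is the order of operations: the column list is read into an ordinary variable before the final statement is concatenated, and only then is the statement executed.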
I have a DOCX file with this content:
# Heading
+---------------------+
| Paragraph |
| |
| ## Subheading |
| |
| +-----------------+ |
| | Nested table | |
| +-----------------+ |
+---------------------+
One last paragraph
Here is a sample file.
I want to run it through Pandoc and get this Markdown, with all tables unwrapped:
# Heading
Paragraph
## Subheading
Nested table
One last paragraph
I'm trying to write a Lua filter with walk_block, but I have no experience with Lua and am not making any progress. Can anyone point me in a helpful direction?
function Table(table)
  pandoc.walk_block(table, {
    Str = function(el)
      -- TODO now what???
    end
  })
end
The Lua interface to tables is currently rather complex, so it's much simpler to convert the table into a so-called simple table. We can do so with pandoc.utils.to_simple_table. A simple table has a header row (header) and multiple body rows (rows), and we get access to cells by iterating over a row. Each cell is just a Blocks list, which we can collect in an accumulator.
Here's what this looks like:
function Table (tbl)
  local simpleTable = pandoc.utils.to_simple_table(tbl)
  -- Accumulator for the contents of every cell, in document order.
  local blocks = pandoc.Blocks{}
  for _, headercell in ipairs(simpleTable.header) do
    blocks:extend(headercell)
  end
  for _, row in ipairs(simpleTable.rows) do
    for _, cell in ipairs(row) do
      blocks:extend(cell)
    end
  end
  -- Replace the table with the collected blocks.
  return blocks, false
end
Running that filter should unwrap all tables, leaving just their contents.
I am new to Ruby on Rails and cannot work out the meaning of this line of code. I saw in the Rails documentation that select builds an array of objects from the database for the scope and then filters it using Array#select. Even so, I can't understand what this line does or what its result is.
model.legal_storages.select{|storage| storage.send(document_type)==true}.last
model.legal_storages.select { |storage| storage.send(document_type) == true }.last  - From the result of the previous operation, take only the last element.
      |              |                          |
      |              |                          --------------------- For each element in model.legal_storages, invoke
      |              |                                                the method whose name is held by the variable document_type
      |              |                                                and check whether the result is equal to true.
      |              |
      |              --------- On the result of the previous method,
      |                        call select to keep those elements for which
      |                        the condition in the block evaluates to true.
      |
      ------------------- Invoke the method legal_storages on model.
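For readers more familiar with Python, the same chain can be mirrored roughly as follows. This is only an analogue, not Rails code: LegalStorage and its contract and invoice attributes are hypothetical stand-ins for the model, and getattr plays the role of Ruby's send by looking up a member from a name held in a variable.

```python
from dataclasses import dataclass


@dataclass
class LegalStorage:
    # Hypothetical stand-in for a record with boolean columns.
    name: str
    contract: bool
    invoice: bool


def last_matching(storages, document_type):
    """Return the last storage whose attribute named by document_type is exactly True."""
    # Filter step: corresponds to select { |storage| storage.send(document_type) == true }.
    # `is True` is used because, unlike Ruby, Python treats 1 == True as true.
    matches = [s for s in storages if getattr(s, document_type) is True]
    # Corresponds to .last, which gives nil (here None) for an empty result.
    return matches[-1] if matches else None


storages = [
    LegalStorage("a", contract=True, invoice=False),
    LegalStorage("b", contract=False, invoice=True),
    LegalStorage("c", contract=True, invoice=False),
    LegalStorage("d", contract=False, invoice=False),
]

print(last_matching(storages, "contract").name)
print(last_matching(storages, "invoice").name)
```

Because the attribute name arrives as a string, the same line works for any document type without writing a separate filter for each one.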
I'm wondering how one might handle a query like this. Let's suppose I had the following text contained in Cell A2 of a spreadsheet:
Case Bakers' Flats 12" White Flour Tortillas 10/12ct
and needed to put the following formula into B2:
=QUERY(importrange("KEY", "DATA!A1:Z1000"), "select Col24 where (Col1 = '"&A2&"')")
It would produce an error.
My question is: Is there any way to avoid tripping up the query when the string I am using contains any assortment of quotation marks and apostrophes?
Short answer
To escape a single quote / apostrophe, enclose the string containing the apostrophe in double quotes (").
To escape double quotes, apply a double substitution: first replace the double quotes with a placeholder, then restore them in the result.
Explanation
The Google Sheets built-in QUERY function automatically escapes some characters by internally adding \ before single quotes, but this doesn't work when the cell value used as the source for the criterion includes double quotes. As a workaround, a double substitution is proposed.
Example for single quote / apostrophe
The table below represents a spreadsheet range that contains:
Column A: The data source
Cell B1: The data value to be used in the criteria expression
Cell C1: The following formula =QUERY(A:A,"SELECT * WHERE A = """&B1&""" ")
+---+---------+-----+-----+
| | A | B | C |
+---+---------+-----+-----+
| 1 | I'm | I'm | I'm |
| 2 | You're | | |
| 3 | It's | | |
| 4 | I am | | |
| 5 | You are | | |
| 6 | It is | | |
+---+---------+-----+-----+
Example for single quote / apostrophe and double quotes
=SUBSTITUTE(
QUERY(
SUBSTITUTE(A:A,"""","''"),
"SELECT * WHERE Col1 = """&SUBSTITUTE(B1,"""","''")&""""
),
"''",""""
)
Note that Col1 is used as the identifier of the data source column instead of the letter A.
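The double substitution can be easier to follow outside of a formula. The Python sketch below only mimics the logic (it does not call Google Sheets): double quotes in both the data and the criterion are swapped for two apostrophes, the comparison is done on the swapped text, and the swap is undone on the result. The function names are invented for this illustration.

```python
def to_placeholder(text):
    """Replace each double quote with two apostrophes, like SUBSTITUTE(x, \"\"\"\", \"''\")."""
    return text.replace('"', "''")


def from_placeholder(text):
    """Undo the replacement on the result, like the outer SUBSTITUTE."""
    return text.replace("''", '"')


def query_equal(column, criterion):
    """Mimic SELECT * WHERE Col1 = "<criterion>" on placeholder-encoded values."""
    target = to_placeholder(criterion)
    # After encoding, the target contains no double quotes, so it can safely
    # sit between double quotes in the query text.
    assert '"' not in target
    encoded = [to_placeholder(value) for value in column]
    return [from_placeholder(value) for value in encoded if value == target]


column = [
    'Case Bakers\' Flats 12" White Flour Tortillas 10/12ct',
    "I'm",
    "It is",
]
print(query_equal(column, 'Case Bakers\' Flats 12" White Flour Tortillas 10/12ct'))
```

One limitation follows directly from the technique: a value that already contains two consecutive apostrophes will come back with a double quote in their place.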
Reference
https://developers.google.com/chart/interactive/docs/querylanguage
=query('Book list'!A:F,"select B,C,D,E where F=""free"" and D="""&E1&"""") - because:
Here are the formats for each type of literal:
String literals should be enclosed in either single or double quotes. Examples: "fourteen", 'hello world', "It's raining".
as mentioned in the reference above.
I have a feature that logs into a trading system and keys a number of trades. There are a lot of reusable steps at the beginning of each trade (the initial trade setup), but each trade has different arguments.
Here is an example
Scenario: Trade 1
Given I have selected my test data: "20003"
And I have connected to VMS with the following details:
| Field | Value |
| Username | user |
| Password | password |
| Session | myServer |
When I run the DCL command to set my privileges to full
Then I expect to see the following:
| Pass Criteria | Timeout |
| Privileges Set | 00:00:30 |
When I ICE to the test account: "System Test"
Then I expect to be ICED see the following:
| Pass Criteria | Timeout |
| "ICED to System Test" | "00:00:10" |
When I run a dcl to delete the company: "Test_Company"
Then I expect to see a confirmation that company: "Test_Company" has been deleted or doesnt exist
So within those steps, the two things that could change are the "Given" argument (the test data ID) and the test company at the end.
What I wanted was some way to run a background step that knows which parameters to enter. For example, for Trade 1 it would enter 20003, for Trade 2 it would enter 20004, and so on.
Can I do this? I was thinking of using the "Examples" table that Scenario Outline uses. Or is there a better way to do this? I don't want these repeated steps in all of my scenarios, as they take up a lot of room and are not very readable.
So I did some searching and couldn't find a solution that didn't require a lot of coding, so I came up with this.
This is what the background looks like:
Background:
Given I have selected my test data:
| Scenario | ID |
| DirectCredit_GBP | 20003 |
| Cheque_GBP | 20004 |
| ForeignCheque_GBP | 20005 |
To find which row it should use, the method behind the step uses ScenarioContext. Here is the method:
[Given(@"I have selected my test data:")]
[When(@"I have selected my test data:")]
public static void setTestDataID(Table data)
{
    // The title of the running scenario is the lookup key into the Background table.
    string scenario = ScenarioContext.Current.ScenarioInfo.Title;
    // ReadTable is my own extension method: find the row whose "Scenario" cell matches and return its "ID" cell.
    string testDataId = data.ReadTable("Scenario", scenario, "ID");
    TestDriver.LoadTestData(testDataId);
}
The method searches the table for the scenario name (using an extension method I wrote) and gets the ID. Once it has the ID, it passes it to my TestDriver method.
It seems to work fine and keeps the test readable.
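The extension method itself is not shown above, so here is a minimal sketch of the lookup it is described as doing, written in Python rather than C# purely for illustration. The name read_table, its parameters, and the list-of-dicts representation of the table are assumptions, not the actual implementation.

```python
def read_table(rows, key_column, key_value, result_column):
    """Return result_column from the first row whose key_column equals key_value."""
    for row in rows:
        if row[key_column] == key_value:
            return row[result_column]
    # Failing loudly makes a scenario that is missing from the table easy to spot.
    raise KeyError(f"no row with {key_column} == {key_value!r}")


# The Background table from above, one dict per row.
background_table = [
    {"Scenario": "DirectCredit_GBP", "ID": "20003"},
    {"Scenario": "Cheque_GBP", "ID": "20004"},
    {"Scenario": "ForeignCheque_GBP", "ID": "20005"},
]

# The scenario title would come from the test framework at run time.
scenario_title = "Cheque_GBP"
test_data_id = read_table(background_table, "Scenario", scenario_title, "ID")
print(test_data_id)
```

The lookup depends on each scenario title matching a row in the Background table exactly, so renaming a scenario means updating the table as well.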
I am using the TextDirectoryLoader in Weka, which takes as input a directory containing the training data as files arranged in folders, where each folder name indicates a class label. I pass the text_example directory name as an argument. The training part is fine.
Example:
+- text_example
|
+- class1
| |
| + file1.txt
| |
| + file2.txt
| |
| ...
|
+- class2
| |
| + another_file1.txt
| |
| + another_file2.txt
| |
| ...
The above illustration borrowed from here
For testing and predicting labels, I create a similar structure.
+- predictor_unknowns
|
+- unknown
| |
| + unknownfile1.txt
| |
| + unknownfile2.txt
| |
| ...
I again pass the directory predictor_unknowns as an argument to TextDirectoryLoader, and I can see that the prediction works fine, but I am not sure how to print the name of the file each prediction is for. I need to print unknownfile1.txt, unknownfile2.txt, etc. for each prediction.
Hope the question is clear enough.
In Weka, those text files and classes become Instance objects, and the filenames are not saved in the Instance class.
Instead, you can get the text content of the file that was classified.
double pred = 0d;
Instance current = getInstance(); // placeholder for however you obtain the instance
pred = classifier.classifyInstance(current);
// stringValue returns the text stored in the attribute; change the index according to your dataset
System.out.println("\nText: " + current.stringValue(0));
System.out.println("Class: " + tempInstances.classAttribute().value((int) pred));
In the interest of benefiting others who may have this question, the documentation for the TextDirectoryLoader explains that you can save the filename as an extra attribute.
On the command line, just add the -F flag.
In Java code, you can use the following line (tdl is an instance of TextDirectoryLoader):
tdl.setOutputFilename(true);
As long as you do not run the dataset through any filters, each instance will have a string attribute called "filename". If you are planning to run the dataset through filters, it may be useful to use a FilteredClassifier so that you can still access the filename.