How to get data back from Amazon SimpleDB in a specific column order - amazon-simpledb

I'm using Amazon's SimpleDB Java client to read data from SimpleDB. The problem is that even though I specified the columns in a particular order in the SelectRequest, like the following:
SelectRequest req = new SelectRequest("SELECT TIMESTAMP, TYPE, APP, http_status, USER_ID from mydata");
SelectResult res = _sdb.select(req);
..
it returned the data in the following column order:
APP, TIMESTAMP, TYPE, USER_ID, http_status
It seems the columns were automatically reordered in ascending order. Is there any way I can force the order I specified in the select clause?

The columns returned are not an ordered list but an unordered set of attributes. You can't control the order they come back in. SELECT is designed to work even in cases where some of the attributes in your query don't exist for every (or any) returned item. In those cases specifically you wouldn't be able to rely on order anyway. I realize that's small consolation if you have structured your data set so that the attributes are always present.
However, since you know the desired order ahead of time, it should be pretty easy to pull the data out of the result in the proper order. It's just XML after all, or in the case of the Java client, freshly parsed XML.
From the SimpleDB docs for SELECT:
"The Select operation returns a set of Attributes for ItemNames that match the select expression."
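As a rough illustration of pulling the values out in the order you asked for, here is a minimal Python sketch. It does not use the SimpleDB client at all: the `returned` list is a hypothetical stand-in for one item's attributes, and `in_requested_order` is a helper name invented for this example. The same idea carries over to the Java client's attribute list.

```python
# Hypothetical stand-in for one item's attributes as returned by a select:
# an unordered collection of (name, value) pairs.
returned = [
    ("APP", "web"),
    ("TIMESTAMP", "2011-01-01T00:00:00Z"),
    ("TYPE", "request"),
    ("USER_ID", "42"),
    ("http_status", "200"),
]

# The column order that was written in the SELECT clause.
wanted = ["TIMESTAMP", "TYPE", "APP", "http_status", "USER_ID"]


def in_requested_order(attributes, columns, missing=None):
    """Return one entry per requested column, in the requested order.

    Each entry is a list of values, because an attribute can hold
    several values. Columns absent from the item yield `missing`.
    """
    by_name = {}
    for name, value in attributes:
        by_name.setdefault(name, []).append(value)
    return [by_name.get(column, missing) for column in columns]


row = in_requested_order(returned, wanted)
```

Indexing by name first also covers the case mentioned above where an attribute is missing from an item: that column simply comes back as the `missing` placeholder.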

Related

RoR API - when I switch db from sqlite to postgres, any updated objects in API array are moved to end of array

When I update an object in my sqlite API with ajax, it keeps the order of my object array - so the front end looks the same. When I update an object in the API after switching the db to postgres, it changes the order of the array - mostly placing the updated objects at the end of the array. Any ideas what's going on here?
I've tried deleting and remaking the database, no luck. I switched back to sqlite and it is working normally again.
In SQL, row order is not guaranteed. If you desire a particular order, the safest thing to do is to add a sort key to your records, and make sure you're doing an ORDER BY on your select statement.
The fact that SQLite is preserving your ordering is kind of a "mistake" of implementation. You should not rely on the engine to do anything outside the specification.
Quote from the Postgres docs:
After a query has produced an output table (after the select list has been processed) it can optionally be sorted. If sorting is not chosen, the rows will be returned in an unspecified order. The actual order in that case will depend on the scan and join plan types and the order on disk, but it must not be relied on. A particular output ordering can only be guaranteed if the sort step is explicitly chosen.
In other words: without an explicit ORDER BY clause, the order of the returned records is unspecified and may change after an update.
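A minimal sketch of the "sort key plus ORDER BY" advice, using Python's built-in sqlite3 module only so that it runs anywhere. The `tasks` table and its columns are invented for the example. In a Rails app the equivalent step would be adding an explicit `.order(:id)` (or another sort column) to the query that feeds the API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the primary key doubles as the sort key here.
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO tasks (id, title) VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)

# An update like this is what moved rows to the end of the result on Postgres.
conn.execute("UPDATE tasks SET title = 'first (edited)' WHERE id = 1")

# With an explicit ORDER BY, the result order no longer depends on the engine.
ordered = conn.execute("SELECT id, title FROM tasks ORDER BY id").fetchall()
```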

Implementing a unique surrogate key in Advantage Database Server

I've recently taken over support of a system which uses Advantage Database Server as its back end. For some background, I have years of database experience but have never used ADS until now, so my question is purely about how to implement a standard pattern in this specific DBMS.
There's a stored procedure which has been previously developed which manages an ID column in this manner:
@ID = (SELECT ISNULL(MAX(ID), 0) FROM ExampleTable);
@ID = @ID + 1;
INSERT INTO ExampleTable (ID, OtherStuff)
VALUES (@ID, 'Things');
--Do some other stuff.
UPDATE ExampleTable
SET AnotherColumn = 'FOO'
WHERE ID = @ID;
My problem is that I now need to run this stored procedure multiple times in parallel. As you can imagine, when I do this, the same ID value is getting grabbed multiple times.
What I need is a way to consistently create a unique value which I can be sure will be unique even if I run the stored procedure multiple times at the same moment. In SQL Server I could create an IDENTITY column called ID, and then do the following:
INSERT INTO ExampleTable (OtherStuff)
VALUES ('Things');
SET @ID = SCOPE_IDENTITY();
ADS has autoinc, which seems similar, but I can't find anything conclusively telling me how to return the newly created value in a way that I can be 100% sure will be correct under concurrent usage. The ADS Developer's Guide actually warns me against using autoinc, and the online help files offer functions which seem to retrieve the last generated autoinc ID (which isn't what I want: I want the one created by the previous statement, not the last one created across all sessions). The help files also list these functions with a caveat that they might not work correctly in situations involving concurrency.
How can I implement this in ADS? Should I use autoinc, some other built-in method that I'm unaware of, or do I genuinely need to do as the developer's guide suggests, and generate my unique identifiers before trying to insert into the table in the first place? If I should use autoinc, how can I obtain the value that has just been inserted into the table?
You use LastAutoInc(STATEMENT) with autoinc.
From the documentation (under Advantage SQL->Supported SQL Grammar->Supported Scalar Functions->Miscellaneous):
LASTAUTOINC(CONNECTION|STATEMENT)
Returns the last used autoinc value from an insert or append. Specifying CONNECTION will return the last used value for the entire connection. Specifying STATEMENT returns the last used value for only the current SQL statement. If no autoinc value has been updated yet, a NULL value is returned.
Note: Triggers that operate on tables with autoinc fields may affect the last autoinc value.
Note: SQL script triggers run on their own SQL statement. Therefore, calling LASTAUTOINC(STATEMENT) inside a SQL script trigger would return the lastautoinc value used by the trigger's SQL statement, not the original SQL statement which caused the trigger to fire. To obtain the last original SQL statement's lastautoinc value, use LASTAUTOINC(CONNECTION) instead.
Example: SELECT LASTAUTOINC(STATEMENT) FROM System.Iota
Another option is to use GUIDs.
(I wasn't sure but you may have already been alluding to this when you say "or do I genuinely need to do as the developer's guide suggests, and generate my unique identifiers before trying to insert into the table in the first place." - apologies if so, but still this info might be useful for others :) )
The use of GUIDs as a surrogate key allows either the application or the database to create a unique identifier, with a guarantee of no clashes.
Advantage 12 has built-in support for a GUID datatype:
GUID and 64-bit Integer Field Types
Advantage server and clients now support GUID and Long Integer (64-bit) data types in all table formats. The 64-bit integer type can be used to store integer values between -9,223,372,036,854,775,807 and 9,223,372,036,854,775,807 with no loss of precision. The GUID (Global Unique Identifier) field type is a 16-byte data structure. A new scalar function NewID() is available in the expression engine and SQL engine to generate new GUID. See ADT Field Types and Specifications and DBF Field Types and Specifications for more information.
http://scn.sap.com/docs/DOC-68484
For earlier versions, you could store the GUIDs as a char(36). (Think about your performance requirements here of course.) You will then need to do some conversion back and forth in your application layer between GUIDs and strings. If you're using some intermediary data access layer, e.g. NHibernate or Entity Framework, you should be able to at least localise the conversions to one place.
If some part of your logic is in a stored procedure, you should be able to use the newid() or newidstring() function, depending on the type of the backing column:
INSERT INTO ExampleTable (ID, OtherStuff)
VALUES (newid(), 'Things');
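For the case where the identifier is generated in the application layer and stored as a char(36), a minimal sketch could look like the following. It uses Python's uuid module and an in-memory SQLite database purely as a runnable stand-in; it is not ADS code, and `insert_row` is a helper invented for this example. The table and column names follow the question.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Stand-in for the ADS table, with the key stored as 36-character text.
conn.execute(
    "CREATE TABLE ExampleTable ("
    "ID CHAR(36) PRIMARY KEY, OtherStuff TEXT, AnotherColumn TEXT)"
)


def insert_row(connection, other_stuff):
    """Generate the key before inserting, so the caller already knows it."""
    new_id = str(uuid.uuid4())  # textual GUID form, 36 characters
    connection.execute(
        "INSERT INTO ExampleTable (ID, OtherStuff) VALUES (?, ?)",
        (new_id, other_stuff),
    )
    return new_id


row_id = insert_row(conn, "Things")
# The follow-up statement can use the key without asking the database for it.
conn.execute(
    "UPDATE ExampleTable SET AnotherColumn = 'FOO' WHERE ID = ?", (row_id,)
)
```

Because each caller creates its own key, parallel runs never have to read back "the last generated value", which is the step that was unsafe in the MAX(ID) + 1 approach.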

Assign Key Field Value Only If Corresponding Lookup Result Value Exists

I have ten master tables and one Transaction table. In my transaction table (it is a memory table just like ClientDataSet) there are ten lookup fields pointing to my ten master tables.
Now I am trying to dynamically assign key field values to all the lookup key fields of the transaction table, using data from a different server (the data arrives as SOAP XML). Before assigning these values I need to check whether the corresponding value is valid in the master tables. I am using a filter (e.g. status = 1) to decide whether it is valid.
Currently, before assigning each key field value, we filter the master table with this filter and use the Locate function to check whether the value is there. If it is located, we assign its key field value.
This works fine if there are only a few records in my master tables. But consider master tables with fifty thousand records each (yes, the customer has that much data): this leads to a big performance issue.
Could you please help me to handle this situation.
Thanks
Basil
The only way to know if it is slow, why, where, and what solution works best is to profile.
Don't make a priori assumptions.
That being said, minimizing round trips to the server and the amount of data transferred is often a good thing to try.
For instance, if your master tables are on the server (not 100% clear from your question), sending only 1 Query (or stored proc call) passing all the values to check at once as parameters and doing a bunch of "IF EXISTS..." and returning all the answers at once (either output params or a 1 record dataset) would be a good start.
And 50,000 records is not much, so, as I said initially, you may not even have a performance problem. Check it first!
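To make the "one query, all answers at once" idea concrete, here is a small sketch. It assumes the master tables live in a SQL database; SQLite is used only so the example runs, and the table names, the `id`/`status` columns and the `valid_keys` helper are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical master tables; the real setup has ten.
for table in ("customers", "products"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, status INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, 1), (2, 0)])
conn.executemany("INSERT INTO products VALUES (?, ?)", [(10, 1)])


def valid_keys(connection, candidates):
    """candidates maps table name -> key to check.

    Builds a single SELECT with one EXISTS check per table and returns
    table name -> True/False from the one row that comes back.
    Table names must come from a fixed, trusted list, never from user input.
    """
    tables = list(candidates)
    checks = ", ".join(
        f"EXISTS (SELECT 1 FROM {t} WHERE id = ? AND status = 1)" for t in tables
    )
    row = connection.execute(
        f"SELECT {checks}", [candidates[t] for t in tables]
    ).fetchone()
    return {t: bool(flag) for t, flag in zip(tables, row)}


result = valid_keys(conn, {"customers": 2, "products": 10})
```

Here customer 2 exists but fails the status = 1 filter, so only the product key would be assigned. Whether this is actually faster than the filter-and-locate approach is something to confirm by profiling.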

How best to map this structure onto amazon SimpleDB

So simpledb has a kind of spreadsheet data model.
I have an app that simply needs to store keys against values. Except that a single key can have multiple values.
There will be multiple clients. Each client has an id with its own set of keys.
I'd like to stick with a single domain if I can at this stage.
How can I map this onto simpleDB?
I was thinking
domain = mydomain
item = clientid
attribute.n.name = key_1 ... key_n
attribute.n.value = val1 ... valn
That would satisfy the ability to store multiple values for the same key.
But then I found that I need to either get ALL attributes in my select or know exactly how many attributes I have. I will not know this up front.
Also I allow deleting a specific value from a key (or attribute). I will have to search for it first. It seems that in the select there is no attributeName() function, just the itemName() function.
Would it perhaps be better to make the item name a combination of id + key + _n ?
e.g. if the id is 'myid' and the key is 'boots' then the item name would be
'myidboots_1'
And then have a single attribute per item called say 'keyval'.
and then run a query like this?
select keyval from mydomain where itemName() like 'myidboots_%'
Still kind of cumbersome compared to a normal SQL database.
Maybe I should try encoding the values like a comma separated list?
Except that it's probably more cumbersome and also I've read that there is a 1000 character limit.
Any other suggestions?
I'm not sure I totally follow your question, but I think it might be helpful to point out that SimpleDB lets you do classic SQL style queries like:
select * from foo where bar = '1'
This will return all the attributes/values for the resulting records.

Fetch data from multiple tables and sort all by their time

I'm creating a page where I want to make a history page. So I was wondering if there is any way to fetch all rows from multiple tables and then sort by their time? Every table has a field called "created_at".
So is there any way to fetch from all tables and sort without having Rails sort them for me?
You may get a better answer, but I would presume you would need to
Create a History table with a Created date column, an autogenerated Id column, and any other contents you would like to expose [eg Name, Description]
Modify all tables that generate a "history" item to consume this new table via Foreign Key relationship on History.Id
"Mashing up" tables [ie merging different result sets into a single result set] is a very difficult problem, but you would effectively be doing the above anyway - just in the application layer, so why not do it correctly and more efficiently in the data layer.
Hope this helps :)
You would need to run SQL like this against each table:
SELECT * FROM table ORDER BY created_at ASC
Store each result in an array. Do this for each of the data sources, and then merge the sorted arrays in Ruby (the merge step of a merge sort). Of course this will work well for small data sets, but once a data set is too large to fit into memory you will have to use a different collect/merge algorithm.
So I guess the answer is that you do need to do some of the sorting in Ruby, unless you resort to the UNION method described in another answer.
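The merge step can be sketched as follows. The example is in Python rather than Ruby, and the `posts` and `comments` lists are made-up stand-ins for two result sets that are each already sorted by created_at.

```python
import heapq
from operator import itemgetter

# Hypothetical per-table results, each already sorted by created_at.
posts = [
    {"kind": "post", "created_at": "2020-01-01"},
    {"kind": "post", "created_at": "2020-01-04"},
]
comments = [
    {"kind": "comment", "created_at": "2020-01-02"},
    {"kind": "comment", "created_at": "2020-01-03"},
]

# heapq.merge walks the sorted inputs lazily and yields one combined,
# sorted stream without re-sorting everything from scratch.
history = list(heapq.merge(posts, comments, key=itemgetter("created_at")))
```

ISO-formatted date strings compare correctly as plain text, which is why no date parsing is needed in this sketch.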
Depending on whether these databases are all on the same machine or not:
On the same machine: Use ORDER BY and UNION in your SQL to return your result set
On different machines: You'll want to test this for performance, but you could use Linked Servers and UNION, ORDER BY. Alternatively, you could have ruby get the results from each db, and then combine them and sort
EDIT: From your last comment about different tables and not DBs; use something like this:
-- UNION ALL keeps rows that share a timestamp; plain UNION would drop duplicates
SELECT created_at FROM table1
UNION ALL
SELECT created_at FROM table2
ORDER BY created_at
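A runnable sketch of the UNION approach, again using an in-memory SQLite database as a stand-in. The `name` column, the sample rows and the extra `source` column (added so each history row says which table it came from) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, created_at TEXT)")
conn.execute("CREATE TABLE table2 (name TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", "2020-01-03"), ("b", "2020-01-01")],
)
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [("c", "2020-01-02")])

# One query: combine both tables, then sort the combined result once.
history = conn.execute(
    """
    SELECT 'table1' AS source, name, created_at FROM table1
    UNION ALL
    SELECT 'table2' AS source, name, created_at FROM table2
    ORDER BY created_at
    """
).fetchall()
```

The database does the sorting here, so the application only has to render the rows in the order they arrive.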

Resources