Is a multi-valued stored procedure parameter just bad practice?

I have a strange aversion to passing multiple ID parameters to a single stored procedure. For example, this feels just wrong:
GetMyObject(ListofIDs, OtherParam1, OtherParam2, ...)
I understand HOW to do it (correctly, if I must)... but I don't feel like I should do it. I feel like it defeats the purpose of a "get item" stored procedure/subroutine. I feel like I should build my SPs to support appropriate filter parameters. If my caller has a list of IDs, shouldn't they call the SP that many times?
Help?

A "get item by ID" routine should never return more than one object, because that makes absolutely no linguistic sense.
A "get items by IDs" routine? Sure, if you have a decent use-case for it and it'll be used often enough.
But most of the time, yes, instead of a routine returning multiple items by ID, you want a routine that returns items based on application-appropriate filtering parameters (e.g. "give me all transactions from January 8th for more than $10").
By the way, sometimes a range of IDs (e.g. everything between 5 and 10) is a perfectly valid set of filters!
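And as a hedged sketch of the filter-style routine, again assuming SQL Server and using invented table, column, and parameter names:
CREATE PROCEDURE dbo.GetTransactions
    @FromDate  DATE,
    @ToDate    DATE,
    @MinAmount DECIMAL(10, 2)
AS
BEGIN
    -- "all transactions from January 8th for more than $10" would be, e.g.:
    -- EXEC dbo.GetTransactions @FromDate = '2019-01-08', @ToDate = '2019-01-09', @MinAmount = 10.00;
    SELECT t.*
    FROM dbo.Transactions AS t
    WHERE t.TransactionDate >= @FromDate
      AND t.TransactionDate <  @ToDate
      AND t.Amount > @MinAmount;
END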
Incidentally, this isn't necessarily just a MySQL or SQL-in-general issue. Almost any sort of dataset querying API in any language will present these same design questions, and their answers will usually be highly similar.

Related

SSIS Merge Join component wrote 0 rows

First of all, thanks to the community for the amount of information on the site; it has helped me a lot with C# and SSIS. Second, I'm not very good with English, so please be patient; if you don't understand something, please ask and I'll try to explain it better.
I have two OLE DB connection sources from different databases, and both tables have a column with an ID that I use as the join key. In RUT CRUZADOS, the ID is a float datatype, while in the other source (CTACTE AÑO PAS) I don't know which data type it is (I can't open the database with SQL Server; I can only do SELECT operations).
When I combine them in the Merge Join, it doesn't report any error, but when I run the package, this happens:
[SSIS.Pipeline] Information: "component "CARGOS ABONOS" (239)" wrote 0 rows.
In Microsoft Access, the same inner join returns about 4 million rows. I think the problem is the metadata, but I don't know how to use the Data Conversion component. Can someone help me, please?
Thank you all
You can view the data types, at least as far as SSIS is concerned, by double-clicking the connector lines. In the Data Flow Path Editor that pops up, the Metadata tab will describe the column types.
That said, it doesn't matter because the Merge Join transformation is only going to allow you to merge data of the same type.
A Merge Join requires the source system data to be sorted. This is accomplished by either adding sort components into the stream (not recommended as this is an asynchronous transform that eats all your memory and kills your performance) or by explicitly sorting in your source systems and then marking them as sorted in the Advanced tab.
Since I don't see a Sort, that leads me to believe the sort is done in the source systems. Or the sorts are not done there, but someone has marked the output as sorted. There must be explicit ORDER BY clauses in those source queries. Sometimes SQL Server will return data in the same order, but unless there is an ORDER BY, it cannot be guaranteed. (I wish I could use the flash tag to emphasize that last point.)
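As a minimal illustration of that last point (table and column names are placeholders), each source query should carry its own explicit sort on the join key before its output is marked as sorted:
SELECT Id, SomeOtherColumn
FROM SourceTable
ORDER BY Id;  -- without this, the "sorted" flag on the source is a promise the data may not keep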
Future readers: if you have a sort in both systems and they are both sorted on the same column, then you need to examine collations. Case-insensitive is a different beast than case-sensitive, and a sort on an ASCII-based system yields a different order than one using EBCDIC for mixed alphanumeric data, as I once found out...
Since the source data type appears to be float, sorting is not the likely culprit. The realization dawning on me is that, instead of sort issues, you likely have an uglier and more insidious comparison issue. Floating-point numbers are approximations: 1 = 1, but 1.00000000000(etc) may or may not be equal to 1.0000000000(etc)1.
Do you actually need the decimal places to make the match? If not, casting to an integer in both systems (and sorting on the CAST'ed value) should make these matches work. If there are decimal places that matter, then you're going to need to cast to an exact numeric type (and pray that both systems convert the same way). The fact that Access makes the match leads me to believe an integer data type will be your salvation.
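A hedged sketch of that casting idea, with invented column names; both source queries would cast the float key the same way and sort on the casted value:
SELECT CAST(FloatId AS BIGINT) AS JoinId,  -- an exact type the Merge Join can compare reliably
       SomeOtherColumn
FROM SourceTable
ORDER BY JoinId;
The Merge Join would then join on JoinId from both (sorted) inputs.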

Is it bad to change _id type in MongoDB to integer?

MongoDB uses ObjectId type for _id.
Will it be bad if I make _id an incrementing integer?
(With this gem, if you're interested)
No, it isn't bad at all. In fact, the built-in ObjectId is quite sizeable within the index, so if you believe you have something better, you are more than welcome to change the default value of the _id field to whatever you like.
But, and this is a big but, there are some considerations when deciding to move away from the default formulated ObjectId, especially when using auto-incrementing _ids as shown here: https://docs.mongodb.com/v3.0/tutorial/create-an-auto-incrementing-field
Multi-threading isn't such a big problem, because findAndModify and the atomic locks can take care of that, but then you hit your first problem: findAndModify is neither the fastest nor the lightest function, and significant performance drops have been noticed when it is used regularly.
You also have to consider the overhead of doing this yourself anyway, even without findAndModify. For every insert you will need an extra query. Imagine having a unique id whose uniqueness you have to check with a query every time you want to insert. Eventually your insert rate will slow to a crawl and your lock time will build up.
Of course, the ObjectId is really good at being unique without having to touch the database prior to insertion to check or formulate its own uniqueness, so it doesn't have this overhead.
If you still feel an integer _id suits your scenario, then go for it, but bear in mind the overhead described above.
You can do it, but you are responsible to make sure that the integers are unique.
MongoDB doesn't support auto-increment fields the way most SQL databases do. When you have a distributed or multi-threaded application in which multiple processes and/or threads create new database entries, you have to make sure they all use the same counter. Otherwise it could happen that two threads try to store a document with the same _id in the database.
When that happens, one of them will fail. That means you have to wait for the database to return a success or error (by calling GetLastError or by setting the write concern to acknowledged), which takes longer than just sending data in a fire-and-forget manner.
I had a use case for this: replacing _id with a 64 bit integer that represented a simhash of a document index for searching.
Since I intended to "get or create", providing the initial simhash and creating a new record if one didn't exist was perfect. Also, for anyone Googling: MongoDB support explained to me that simhashes are absolutely perfect for sharding and scaling, and even better than the more generic ObjectId, because they divide the data across shards evenly and intrinsically, and you get the key stored essentially for free (a uint64 is much smaller than an ObjectId and would need to be stored anyway).
Also, for you Googlers: replacing a MongoDB _id with something other than an ObjectId is absolutely simple. Just create the document with _id already defined; use an integer if you like. That's it: Mongo will simply use it. If you try to create a document with an _id that already exists, you'll get an error (E11000 duplicate key). So, like me, if you're using simhashing, this is ideal in all respects.

Questions about implementing surrogate key in Ruby on Rails

For an upcoming project we need unique, real-world identifiers that are exposed to users, for things like Account Numbers or Case Numbers (like a bug-tracking ID). These will always be system-generated and unchangeable. Right now we plan to run strictly on Heroku.
While (as my name would suggest) I am new to the wonderfulness that is Ruby on Rails, I have a long background in enterprise application development. I'm trying to bridge between what I have done in the past and doing it the "RoR way".
Obviously RoR has wonderful primary key support. I have read dozens of posts here recommending adapting business requirements to just use the out-of-the-box id/key methodology.
So let me describe what I am trying to accomplish and please let me know if you have faced similar objectives and what approach you took.
1) I would like a human-readable key with a consistent length. There is value in an Account ID or Transaction ID always being the same length (for form validation, training sales staff, etc.). Using Ruby's innate key generation, one could just add buffer characters (e.g. 100000 instead of 1).
2) Compactness: my initial plan was to go with a base-36 unique key (i.e. 36 values: [0..9], [a..z]). As part of our API/interface we plan on exposing certain non-confidential objects via a short-form URL (e.g. xx.co/000001). I like the idea of a five-character identifier in base 36 vs. 7+ characters in decimal.
So I can think of two possible approaches:
a) add my own field and develop my own unique key generator (or maybe someone will point me to one).
b) Pad leading digits (I assume I can force the key generation to start at 1xxxxxxx rather than 0000001), then use the to_s(36) method to convert to and from base 36 for all interactions with humans. Maybe even store the actual ID value in the database in base-36 format to avoid ongoing conversions, but always do the conversion before a query to avoid needing another index.
I'm leaning towards approach (b), as it seems like it would be optimal from a DB performance standpoint and would require the least investment in non-value-added overhead. Once again, any real-world experience with these topics and thoughts on the best approach would be greatly appreciated.
Thanks in advance!
I would never use the primary key in a Rails table for anything of business importance. There will come a day when someone on the business end will want to change it, and it'll end up being an enormous pain in the butt and will invalidate a bunch of URLs you and your users thought were canonical and will mess up all your foreign keys and blah blah blah. It's just a really bad idea and I would encourage you not to do it.
The Rails way to do this is to have a new column, called something like number or bug_tracking_number or whatever strikes your fancy, and implement a before_validation callback that gives it a value. This is where you can let your creativity shine; something like this sounds like what you want:
before_validation( :on => :create ) do
self.number = CaseNumber.count + 1
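# you could pad the value to a fixed width or convert it with to_s(36) here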
end
You can pad the number there, ensure its uniqueness, or do whatever else you want.

drop-down combo box-like functionality for queries

Is it possible to provide the following type of functionality with Informix client tools?
As the user types the first two characters of a name, the drop-down list is empty. At the third character, the list fills with just the names beginning with those three characters. At the fourth character, MS-Access completes the first matching name (assuming the combo's AutoExpand is on). Once enough characters are typed to identify the customer, the user tabs to the next field.
The time taken to load the combo between keystrokes is minimal. This occurs once only for each entry, unless the user backspaces through the first three characters again.
If your list still contains too many records, you can reduce them by another order of magnitude by changing the value of constant conSuburbMin from 3 to 4.
This requires a combination of two things, only one of which is partially under the control of Informix the DBMS or Informix the Client API.
First of all, you need the gadget that is accepting user input to asynchronously generate a query which matches what the user has typed, fetches some of the results from the DBMS, and shows them. Secondly, you need the DBMS to respond rapidly to such queries. Part of the issue is 'what form does the query take'. But the basic functionality is:
SELECT TitleCaseName
FROM ReferenceTable
WHERE LowerCaseName[1,3] = 'abc';
You might or might not bother with 'first rows optimization'; you might or might not bother with an ORDER BY. Your code would only select the first N rows. You might do it with some prioritization information - most frequently used names, etc.
But this logic is basically the same for any DBMS, give or take details such as the choice of technique for dealing with case-mapping (function call vs column) and the notation for substrings vs LIKE 'abc%'.
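For example, a hedged variant of the query above using LIKE and first-rows limiting, in Informix syntax with the same placeholder table and column names:
SELECT FIRST 20 TitleCaseName
FROM ReferenceTable
WHERE LowerCaseName LIKE 'abc%'
ORDER BY TitleCaseName;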
The tricky stuff, though, is the asynchronous combination of user input plus collecting data from the DBMS; that is best handled with multiple threads: one dealing with the user input, one dealing with the DBMS, and (possibly) one dealing with the display (or that might also be the one dealing with user input). And that requires hooking into the UI API, which is not something the Informix APIs do of their own accord. The UI can get at Informix (or any other DBMS) easily enough through ODBC or any other faintly similar API.

Cleaning Up Query Strings

This is more of an open question: what is your opinion of query strings in a URL? When creating sites in ASP.NET MVC, you spend a lot of time thinking about and crafting clean URLs, only for them to be shattered the first time you have to use query strings, especially on a search form.
For example, I recently did a fairly simple search form with half a dozen text fields and two or three lists of checkboxes and selects. It produced the query string below when submitted:
countrylcid=2057&State=England&StateId=46&Where=&DateFrom=&DateTo=&Tags=&Keywords=&Types=1&Types=0&Types=2&Types=3&Types=4&Types=5&Costs=0.0-9.99&Costs=10.00-29.99&Costs=30.00-59.99&Costs=60.00-10000.00
Beautiful, I think you'll agree. Half the fields had no information in them, and the list inputs are very verbose indeed.
A while ago I implemented a simple solution to this for paging, which produced a URL such as
www.yourdomain.com/browse/filter-on/page-1/perpage-50/
This used a catchall route to grab what is essentially a replacement query string after the filter-on portion. Works quite well but breaks down when doing form submissions.
I'd be keen to hear what other solutions people have come up with. There are lots of articles on clean URLs, but they are aimed at ASP.NET developers creating basic RESTful URLs, which MVC already has covered. I am half considering diving into model binding to produce a proper solution along those lines. With the above convention, the large query string could be rewritten as:
filter-on/countrylcid-2057/state-England/stateId-46/types-{1,0,2,3,4,5}/costs-{0.0-9.99,10.00-29.99,30.00-59.99,60.00-10000.00}/
Is this worth the effort?
Thanks,
My personal view is that where users are likely to want to either bookmark or pass on URLs to other people, a nice, clean "friendly" URL is the way to go. Aesthetically they are much nicer. For simple pagination and ordering, a rewritten URL is a good idea.
However, for pages that have a large number of temporary, dynamic fields (such as a search), I think the humble query string is fine. Likewise for pages whose contents are likely to change significantly given the exact same URL in the future. In these cases, URLs with query strings are fine and perhaps even preferable, as they at least indicate to the observant user that the page is dynamic. However, in these cases it may be better to use form POST variables anyway; that way visitors are not tempted to "fiddle" with the values.
In addition to what others have said, a URL implies a semantic hierarchy. Whether it is true today or not, the ancestry is directories, and people still think of it that way. That's why you have controller/action/id. Likewise, to me a querystring implies options or queries.
Personally, I think a rewritten URL is best when you can't tell if there's an interpreter behind it -- maybe it's just a generated HTML file?
So however you choose to do it (and it's a pain on the client in a search form -- I'd say more trouble than it's worth), I'd support you doing it for hierarchies.
E.g. /Search/Country/State/City
but once you start getting into prices and types, or otherwise having to preface a "directory" with the type of value (e.g. /prices=50.00/ or worse, with an array), then that's where you've lost me.
In fact, if all elements are filled in, then all you've really done is taken the querystring, replaced "&" with "/", and combined your arrays into a single field.
If you're going to be writing the JavaScript anyway, why don't you just loop through the form elements and:
Remove the empty ones, cleaning up the querystring from the "&price_low=&price_high=&" sorts of things.
Combine multiple values into an array structure
But then submit as a querystring.
James
Aren't the values of the different fields available in the FormsCollection anyway on post?

Resources