Is Amazon SimpleDB a good choice for my data? - ruby-on-rails

I'm working on a database that stores user-created surveys. The database needs to store a unique ID for each survey. Using SQL, I'd just use a SERIAL column type so each row has an auto-incrementing numeric key.
SimpleDB seems to store everything as a string, meaning I would have to generate a unique key myself. Since this key will be part of the URL, I think a UUID is just too long. I want to be able to load a survey with something like:
foo.com/1a
Is there any way to have SimpleDB generate a unique row ID for each item you store? Thanks!

OK, so you're asking for opinions here... I would choose any other storage mechanism over SimpleDB. For example, you could easily go with MongoDB as a document storage alternative to a relational DB and get more benefits than with SimpleDB.
As far as wanting a short unique URL, you can search and find a Ruby implementation to turn an ID into a shortened ID. http://blog.kischuk.com/2008/06/23/create-tinyurl-like-urls-in-ruby/
That implementation will turn 1174229 into "7sH_" (according to the post. YMMV)
So, you'd have something like
class Survey
  include Mongoid::Document

  def to_param
    generate_url(self.id)
  end
end
in routes
resources :surveys, :path => ''
And that would create
http://yourapp/7sH_
Of course this technique can work for non-mongo installs.
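As a rough illustration of the ID-shortening idea, here is a minimal base-62 sketch. The names encode_id and decode_id and the alphabet are assumptions made for this example, not the code from the linked post, so the strings it produces will differ from the "7sH_" quoted above.

```ruby
# Hypothetical helpers: turn a non-negative integer ID into a short
# base-62 string and back. The alphabet is an assumption for this sketch.
ALPHABET = [*'0'..'9', *'a'..'z', *'A'..'Z'].freeze

def encode_id(id)
  raise ArgumentError, 'id must be a non-negative integer' unless id.is_a?(Integer) && id >= 0
  return ALPHABET[0] if id.zero?

  out = +''
  while id > 0
    # Peel off the least significant base-62 digit on each pass.
    id, rem = id.divmod(ALPHABET.size)
    out << ALPHABET[rem]
  end
  out.reverse
end

def decode_id(str)
  # Rebuild the integer digit by digit, most significant first.
  str.each_char.reduce(0) { |acc, ch| acc * ALPHABET.size + ALPHABET.index(ch) }
end
```

With something like this, to_param could return encode_id(numeric_id) and the controller could look the record up with decode_id(params[:id]), provided the model has a numeric ID to encode.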

SDB Explorer supports bulk upload from MySQL to Amazon SimpleDB. You can upload your MySQL data to Amazon SimpleDB using its Upload feature. During the upload, SDB Explorer generates a unique ID. You can also choose your own MySQL field as the itemName(), i.e. the unique ID for Amazon SimpleDB.

Related

Is the QuickBooks ListID unique across company files?

I'm using the Web Connector to fetch data from multiple company files, and I want to know whether storing the ListID of an Invoice will produce a unique identifier across company files, or if I need to also store in each record a reference to its company file to establish uniqueness.
Is the QuickBooks ListID unique across company files?
No, it's not. It's a hexadecimal incrementing integer.
if I need to also store in each record a reference to its company file to establish uniqueness.
Yes, you do need to do that.
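A small sketch of what that can look like, assuming a hypothetical company_file_id that you assign yourself to each company file; the ListID values below are made up for illustration.

```ruby
# Sketch: key each record by the pair (company file, ListID) rather than by
# ListID alone. company_file_id is a hypothetical identifier you assign.
RecordKey = Struct.new(:company_file_id, :list_id)

records = {}
records[RecordKey.new('company-a', 'EXAMPLE-1')] = { total: 100 }
records[RecordKey.new('company-b', 'EXAMPLE-1')] = { total: 250 }

# The same ListID in two company files now maps to two distinct records.
```

In a relational schema the equivalent would be a unique index over both columns.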

Ruby on Rails - Implementing UUID as Primary Key With Existing Schema

Currently I am creating a RESTful API for a mobile application. The RESTful API has a number of endpoints that allow users to exchange personal information with each other. I was testing how secure these endpoints were and quickly realized that if a third party managed to gain access to the API they could easily look up other users' information by guessing their user id or using an automated script to collect a wide range of personal information. This was due to the fact that I was using a primary key that was a simple auto-incremented integer, which made it predictable and easy to determine other users' ids. I immediately began looking for something that didn't follow a distinct pattern. I came across UUIDs and decided to implement them with my existing Rails app.
Was this a wise decision? I definitely see the upside to using UUIDs but upon further research I found that there were a number of negatives to this approach. Many sources claim that using UUIDs will cause performance issues with large tables. Are UUIDs right for my situation?
My second question is about implementing this in an existing Ruby on Rails application. I made the switch to UUIDs by following this article: http://rny.io/rails/postgresql/2013/07/27/use-uuids-in-rails-4-with-postgresql.html. I ran into an issue with enabling the uuid-ossp extension. I created a migration and put enable_extension 'uuid-ossp' inside the change function. I then changed the existing migrations to support UUIDs as their primary key and ran rake db:drop db:create db:migrate to recreate the database with the edited migrations. This failed with the error PG::UndefinedFunction: ERROR: function uuid_generate_v4() does not exist. I quickly realized that this was because I had created the migration that enabled the uuid-ossp extension after the migrations that I had edited to use UUIDs. When I changed the timestamp in the name of the migration to a date that preceded all migrations, the db:migrate command completed with no errors. This felt very hacky and defeated the purpose of having migrations. What is the correct way of adding this extension via a migration?
Edit in response to comments:
So a number of comments were made that suggested that I should just be properly authenticating users and checking their permissions before allowing them to view certain data. I have user authentication built into my application but will better explain my situation and why I needed something more than auto-incremented primary keys.
I have a number of users on this application and each user has the ability to create private and public contacts. Public contacts are viewable by everyone using the mobile application. Private contacts can only be viewed by the user who created them. However, a user can share their private contacts with other users by showing other users with the mobile application a QR code that has the contact's ID encoded into it. When the user decodes the contact ID, a request is sent to the backend to notify it that the user is now an owner of that private contact. This allows the second user to now receive updates from that private contact. This is a large feature of my application. The aim here is to force people to have to exchange these contacts in person and to disallow others from seeing these contacts unless this process has happened.
Implementing this concept proved to be fairly tricky as all users could potentially share all private contacts with any other user on the system. I found this extremely hard to implement using permissions as which contacts a user can view is constantly changing.
Originally I implemented this with auto-incremented integers as my primary key for the contact IDs. It worked but forced me to create a very insecure API endpoint that essentially would take a user ID and a private contact ID as parameters and would add that user as an owner of that contact. Because auto-incremented IDs are so predictable, a user with access to the API could essentially loop through a sequence of numbers, calling the endpoint each time with the sequence number as the contact ID, and add themselves as owners to contacts that hadn't been shared with them. This would bypass the whole process of having to share the contact in person and largely defeats the purpose of having my mobile application.
I decided I needed something less predictable, completely random and unique to each private contact. I found UUIDs while doing research to solve this problem and changed the contact ID in my model to be of type UUID. Are UUIDs the best way to solve this? Should I use something else? Have I gone about solving this problem the wrong way?
Are UUIDs the best way to solve this?
You could use them as a solution. If you do, you should build a new contacts table and model instead of trying to migrate the old model. As well as being tricky to implement, any migration would immediately make existing contact/invite emails invalid (since they contain the old id). Briefly support both models, and retire the old auto-incrementing id model once you are happy that traffic using it is no longer important to your application.
There is still a flaw - your contact share links will now be long-lasting, and if anyone gets access to a contact's id for any reason, and knows enough to construct the URL for gaining that user as a contact, then they gain the ability to share it to themselves and anyone else completely outside of the control of your application. This is because you are relying on knowledge of the id as the only thing preventing access to the contact details.
Should I use something else?
In my opinion, yes. Use a separate nonce or one-off code model (with UUIDs, or an indexed column containing a long random string - you could use SecureRandom for this) that can grant rights to complete the sharing. When someone wants to share a contact, create the nonce object with details about what is being shared - e.g. the contact_id - and use it to generate an email link pointing to a route that will find the nonce and allow access to the resource.
The model doesn't need to be called "Nonce" or contain that as a column, this is just a common name for the pattern. Instead you might call the new model "ContactShare" and the secret property "link_code".
This will allow you to resolve access to contacts using your app's permissions model as normal, and block the possible misuse of sharing links. When the controller with the nonce id or code is invoked, create permissions at that point in order to grant access to the contacts. Then expire or delete the nonce, so it cannot be re-used. I prefer expiry, so you can track usage - this can be as simple as a used boolean column that you update once the sharing request has succeeded.
Note I am not referring to Rack::Auth::Digest nonce routine, which is specific to server authentication. I did not find a RoR pre-built nonce model, but it is possible it goes under a different name.
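To make the one-off code pattern concrete, here is a minimal in-memory sketch. It is not a Rails model: the class-level Hash stands in for a database table, and the names ContactShare, link_code, create and redeem are assumptions for illustration only. SecureRandom.urlsafe_base64 is part of Ruby's standard library.

```ruby
require 'securerandom'

# In-memory stand-in for a ContactShare model. In a Rails app this would be
# an ActiveRecord model with an indexed link_code column and a used flag.
class ContactShare
  attr_reader :contact_id, :link_code

  @store = {}

  class << self
    attr_reader :store

    # Create a share record with a fresh random code for the given contact.
    def create(contact_id)
      share = new(contact_id)
      store[share.link_code] = share
      share
    end

    # Returns the contact_id the first time a valid code is presented;
    # returns nil for unknown codes and for codes that were already used.
    def redeem(link_code)
      share = store[link_code]
      return nil if share.nil? || share.used?

      share.mark_used!
      share.contact_id
    end
  end

  def initialize(contact_id)
    @contact_id = contact_id
    @link_code = SecureRandom.urlsafe_base64(24)
    @used = false
  end

  def used?
    @used
  end

  def mark_used!
    @used = true
  end
end
```

The controller that handles the share link would call something like ContactShare.redeem(params[:code]) and, on a non-nil result, create the permission row for the current user. Marking the record as used rather than deleting it keeps an audit trail, as described above.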

Store ruby Mail (from gem) object in ActiveRecord

I'm currently implementing a very basic IMAP client into an application I'm building in Rails. I'm using the Mail gem, which supplies lots of useful ways of parsing the IMAP data.
I'd like to store the Mail object that it's generating in the database. Is that possible?
i.e.
email = Email.new
email.uid = id
email.mail = Mail.new(imap.fetch(id, "RFC822")[0]["attr"]["RFC822"])
email.save
It's a convenience thing where I don't want to have to download the object again unless I have to since performance on the IMAP call is slow, but I'd like to be able to have it there to look back on (and do any breaking down I needed to later).
I could then call
Email.find(x).mail.body
and various other useful things without having to build out that functionality in my own email model.
Q1: How would I set up the active record model?
Q1a: Would I be better off doing something that excluded the attachments to make it an easier object to store? (is that even possible?)
Appreciate your help,
Several database schemata have been developed to store mail. I've worked on one, and there are others. Believe me, it's hard work. The result can be very useful, but since your question doesn't focus on the result I suspect it's not worthwhile in your case.
You might find it easier to use a json library to write your object graph to a file with an automatically inferred structure, as most json libraries seem to support these days. That won't let you do as much, but it's very much easier and lets you store both completely and incompletely retrieved messages. If you haven't fetched a particular body part, the json library will just write a null for that field.
It depends on what you want to do with the stored mails. If you need only specific parts of the mail to be easily accessible through the database, you won't need a complex setup like archiveopteryx, which basically maps a complete representation of emails to relational database tables. In most cases you won't need that much detail and a simple data model will be perfectly adequate.
A1: rails g model Email from to subject date:datetime message_id body. These are just the basic parts, but they should get you started.
A1a: You don't need to store the attachments if you don't want to. If you need them, you'll probably be better off not storing them in the database itself. Attachments are just like uploads so there are plenty of gems that can help you do that (https://www.ruby-toolbox.com/categories/rails_file_uploads).
Using Postgres jsonb columns, you can store the email as JSON. In my case I disregard the attachments (I store a reference to them and retrieve them as and when required).
This works pretty well with the Mail gem.

Logging data changes for synchronization

I am looking for a way of logging data changes for a public API.
The client app needs to be told which database tables have changed, and therefore need to be synchronised, since the app last synchronised. This also needs to be scoped to a specific brand and country.
Current Solution:
A Version table with the class_names of models, which is touched from every model on each create, delete, touch and save action.
When we touch the version for a specific model, we also look at the reflected associations and touch them too.
The Version model is scoped to brand and country.
The REST API responds to a request that includes last_sync_at:timestamp, brand and country.
Rails looks at Version with the given attributes and returns the class_names of models which were changed since the last_sync_at timestamp.
This solution works, but performance is a problem and it is hard to maintain.
UPDATE 1:
Maybe the simpler question is:
What is the best practice for finding out, and telling frontend apps, when and what needs to be synchronized, in terms of the whole concept?
Conditions:
Frontend apps need to download only their own content changes, not the whole dataset.
Synchronization must not be triggered for an app when only an application from a different country or brand needs to be synchronized.
Thank you.
I think that the best solution would be to use redis (or some other key-value store) and save your information there. Writing to redis is much faster than any sql db. You can write some service class that would save the data like:
RegisterTableUpdate.set(table_name, country_id, brand_id, timestamp)
Such a call would save the given timestamp under a key such as table-update-1-1-users, where the first number is the country id, the second number is the brand id, followed by the table name (or you could use country and brand names if needed). To find out which tables have changed, you would just need to find the redis keys matching the pattern "table-update-1-1-*", iterate through them and check which are newer than the timestamp sent through the API.
It is worth remembering that redis is not as reliable as SQL databases. Its reliability depends on configuration, so you might want to read the redis guidelines and decide whether you would like to go for it.
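A minimal sketch of that key scheme follows. To keep it runnable without a server, a plain Hash stands in for redis here, which is an assumption of this example; the method changed_since is also a hypothetical name. With a real redis client, the write would be a SET and the lookup a key scan over the same prefix.

```ruby
# Sketch only: STORE is a plain Hash standing in for redis.
class RegisterTableUpdate
  STORE = {}

  # Builds keys of the form table-update-<country>-<brand>-<table>.
  def self.key(table_name, country_id, brand_id)
    "table-update-#{country_id}-#{brand_id}-#{table_name}"
  end

  # Records the time of the latest change to a table for a country and brand.
  def self.set(table_name, country_id, brand_id, timestamp)
    STORE[key(table_name, country_id, brand_id)] = timestamp.to_i
  end

  # Returns the names of tables changed after last_sync_at for this
  # country and brand, sorted alphabetically.
  def self.changed_since(country_id, brand_id, last_sync_at)
    prefix = key('', country_id, brand_id)
    STORE.select { |k, ts| k.start_with?(prefix) && ts > last_sync_at.to_i }
         .keys
         .map { |k| k.delete_prefix(prefix) }
         .sort
  end
end
```

Because the prefix ends with a dash, country 1 and brand 1 do not accidentally match keys for brand 11.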
You can take advantage of the fact that ActiveRecord automatically records the time of every row update in the updated_at column.
When checking what needs to be updated, select the objects you are interested in and compare their updated_at with the timestamp from the client app.
The advantage of this approach is that you don't need to keep an additional table that lists all the updates on models, which should speed things up for the API users and be easier to maintain.
The disadvantage is that you cannot see the changes in data over time; you only know that a change occurred and you can access the latest version. If you need to track changes in data over time efficiently, then I'm afraid you'll have to rework things from the top.
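The comparison itself is simple. Here is an in-memory sketch, assuming records that carry brand, country and updated_at fields; the Record struct and the changed_since helper are hypothetical names used only for illustration. In a Rails app the same filter would normally be expressed as a database query on updated_at.

```ruby
# In-memory stand-in for rows that have brand, country and updated_at columns.
Record = Struct.new(:id, :brand, :country, :updated_at)

# Returns only the records for the given brand and country that were
# modified after the client's last synchronisation time.
def changed_since(records, brand:, country:, last_sync_at:)
  records.select do |r|
    r.brand == brand && r.country == country && r.updated_at > last_sync_at
  end
end
```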
(read last part - this is what you are interested in)
I would recommend that you use the decorator design pattern for changing the client queries. So the client sends a query of what he wants and the server decides what to give him based on the client's last update.
so:
the client sends a query that includes the time it last synched
the server sees the query and takes into account the client's nature (device-country)
the server decorates (changes accordingly) the query to request from the DB only the relevant data, and if that is not possible:
after the data are returned from the database manager they are trimmed to be relevant to where they are going
returns to the client all the new stuff that the client cares about.
I assume that you have a time entered field on your DB entries.
In that case the "decoration" of the query (abstractly) would be just to add something like a "WHERE" clause in your query and state you want data entered after the last update.
Finally, if you want that to be done for many devices/locales/whatever, implement a decorator for the query and the result of the query and serve them to your clients as they should be served. (Keep in mind that, in contrast with a subclassing approach, you will only have to implement one decorator for each device/locale/whatever - not for all combinations!)
Hope this helped!

Store data in Ruby on Rails without Database

I have a few data values that I need to store on my rails app and wanted to know if there are any alternatives to creating a database table just to do this simple task.
Background: I'm writing some analytics and dashboard tools for my ruby on rails app and i'm hoping to speed up the dashboard by caching results that will never change. Right now I pull all users for the last 30 days, and re-arrange them so I can see the number of new users per day. It works great but takes quite a long time, in reality I should only need to calculate the most recent day and just store the rest of the array somewhere else.
Where is the best way to store this array?
Creating a database table seems a bit overkill, and I'm not sure that global variables are the correct answer. Is there a best practice for persisting data like this?
If anyone has done anything like this before let me know what you did and how it turned out.
Ruby has a built-in Hash-based key-value store named PStore. It provides simple file-based, transactional persistence.
PStore documentation
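A minimal sketch of caching per-day counts with PStore, assuming the expensive calculation is passed in as a block. The helper name cached_daily_count and the :new_users root key are assumptions for this example.

```ruby
require 'pstore'
require 'date'

# Returns the cached count for the given day, computing and storing it via
# the block only if it is not already in the store at `path`.
def cached_daily_count(path, day)
  store = PStore.new(path)
  store.transaction do
    store[:new_users] ||= {}
    # Use an ISO 8601 string as the per-day key inside the cached hash.
    store[:new_users][day.iso8601] ||= yield(day)
  end
end
```

For example, cached_daily_count('dashboard.pstore', Date.today - 1) { |d| count_new_users_on(d) } would run the block once and serve later calls from the file, where count_new_users_on is a placeholder for your own query. Days that can still change, such as today, should be computed without the cache.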
If you've got a database already, it's really not a big deal to create a separate table for tracking this sort of thing. When doing reporting, it's often to your advantage to create derivative summary tables exactly like what you're describing. You can update these as required using a simple SQL statement and there's no worry that your temporary store will somehow go away.
That being said, the type of report you're trying to generate is actually something that can be done in real-time except on extravagantly large data sets. The key is to have indexes that describe the exact grouping operation you're trying to do. For instance, if you're grouping by calendar date, you can create a "created_date" date field and sync it to the "created_at" time as required. An index on this date field will make doing a GROUP BY created_date very quick:
SELECT created_date AS on_date, COUNT(id) AS new_users FROM users GROUP BY created_date
Using a lightweight database like SQLite shouldn't feel like overkill. Alternatively, you can use key-value stores like Tokyo Cabinet, or even store the array in a flat file manually, but I really don't see any overkill in using SQLite.
