I read some Firebase database structure guides on how to structure your data properly (without data nesting) but I have one question.
So, I have an iOS app that uses Firebase database. The users need to login/register.
In terms of data structure, my database looks like this:
-Database
---Users
-----User1
-------username: johndoe
-------email: johndoe@test.com
-------display_name: John Doe
-----User2 {....}
-----User3 {....}
Now, let's imagine I have 100K users in there. Every time a new user registers, I check whether the username and the email already exist in the database; if they don't, I create the new user account.
My question is - Do I need to create a new object that contains only the usernames and another that contains only the emails? I'm asking this because I'm concerned that if I iterate through the Users objects I will potentially be downloading hundreds of megabytes just to check if the username and the email already exist.
Firebase will not allow duplicate users (authentication names). So when you call createUser, Firebase will return an error if the user already exists.
Secondly, if you query Firebase for a specific item, nothing is downloaded unless that item is found. So whether there are 10 or 100K user nodes, a query downloads only the nodes that match it, which would be just one if there were a duplicate user. Again, though, this check is not needed, since Firebase rejects duplicate authentication names.
And to clarify; there is nothing wrong with nesting nodes. However, keeping them flat is usually better depending on your use case. So don't go overcomplicating your structure if you don't need to.
Oh, and your Firebase structure is spot on. Keep going with that.
Related
As of right now I am using Firebase Realtime Database to include chat functionality as part of an app I'm working on. The only issue I've seemingly run into is figuring out how to include a user's data (profile, username, birthday, etc.) so that if a user clicks on a chat, they can then seamlessly go to a user's profile page without needing to fetch more data from the backend. Here's the current structure I'm using in Firebase Realtime Database for this:
$chats
  $chatId
    id
    users
      0: some user id
      1: some user id
    lastMessage
$userChats
  $userId
    $chatId: true
$users
  $userId
    user info here
In my case what I would like to know is if it makes more sense to duplicate all the user data for each user into each chat within the users array or if I should just use the referencing userId and pull that data after in a separate request?
Considering I store my users primarily in a separate PostgreSQL database I wonder if I could do a separate query to that database and not even worry about storing the users in the realtime database as well (considering I have to include aggregate info for each user like counters).
If you are always going to be fetching user data alongside a chat, then you should store them together. No need to make more than one call unnecessarily.
However, if you will ever fetch user data and/or conversations separately, I would recommend storing the user data separately and not within the conversation.
Also, if you really want an "immediate" feel (beyond the already "realtime" database performance), you could also fetch the user data in the background as soon as a particular chat is opened. That way, if the user taps to view a profile, it'll have already been fetched and will give that "instantaneous" experience you are looking for.
Plus, you have to remember that Realtime Database charges on the amount of data being transferred, no matter how many calls it takes (as opposed to Firestore which charges on the number of queries), so storing it separately does not increase billing at all compared to one query, and actually saves money in the cases where you don't need both data sets.
I am writing an app, where a signed up user should be able to see which of his contacts have signed up, too. What is the most elegant way to do this?
I was planning to create an array of all locally saved email addresses extracted from the user's local iOS addressbook and create a query for those. Is there any better way to do this?
Edit: Is this actually possible without downloading the whole user list? I could use a for loop with queryStartingAtValue(emailAddress) and queryEndingAtValue(emailAddress). But this could possibly lead to hundreds of queries at the same time.
In NoSQL databases you'll often end up modeling the data in ways that your application wants to consume it.
In this case it seems your app needs to look up whether a user exists, based on their email address. For that purpose I'd add a list of email-to-uid data:
emailToUid
  "test@mail,com": "P0...wklsh1"
  "MJQZ1347": "Aj1278a..."
This is essentially a self-created index that allows you to check whether an email address is used without having to run a query.
Now you can loop over the contacts and check whether there is a user for each email address with:
ref.child("emailToUid").child(email).observeSingleEventOfType(.Value, withBlock: { snapshot in /* snapshot.exists() is true if the address is registered */ })
This is going to be very fast. Because of the way Firebase communicates with the back-end, there's going to be very little difference between a single request with 100 email addresses or 100 requests with a single email address. See my answer here for more on that: Speed up fetching posts for my social network app by using query instead of observing a single event repeatedly
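To make the index idea concrete, here is a minimal, language-neutral sketch in plain Ruby. It is not Firebase client code: the Hash stands in for the emailToUid node, and the uid value, the helper names and the lowercasing step are all assumptions made for illustration. The "." to "," replacement mirrors the keys shown above, since Firebase keys cannot contain ".".

```ruby
# In-memory stand-in for the emailToUid node (hypothetical data);
# a real app would read this from Firebase instead.
EMAIL_TO_UID = {
  "test@mail,com" => "uid-1"
}

# Firebase keys cannot contain ".", so the index stores "," in its place.
# Lowercasing and stripping are assumptions, not something Firebase requires.
def encode_email_key(email)
  email.strip.downcase.tr(".", ",")
end

# Returns the uid registered for the email, or nil if nobody signed up with it.
def uid_for_email(index, email)
  index[encode_email_key(email)]
end

# Maps each address-book email to its uid, keeping only contacts that signed up.
def registered_contacts(index, emails)
  emails.each_with_object({}) do |email, found|
    uid = uid_for_email(index, email)
    found[email] = uid if uid
  end
end
```

Each lookup is a direct key read rather than a query, which is what makes one read per contact cheap.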
You could have something like this
user
  -$user_id
    -email
    -username
    -contacts
      -contact_uid1:email1,
      -contact_uid2:email2,
      -contact_uid3:email3,
And then do:
1. Download the contacts child from Firebase and save it to a variable.
2. Loop over every contact in the address book.
3. If the email is already in the contacts child, do nothing (it means you have already verified that user once).
4. If the email is not in the contacts child, launch a single-event query to find the uid for that specific email.
5. In the query's callback, if the result is NSNull, the user is not in the app.
6. If the user exists, add them to your contacts child node in Firebase.
This way you only launch queries for the contacts you haven't checked yet.
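The steps above can be sketched roughly as follows, in plain Ruby rather than Firebase client code. The `sync_contacts` name, the `lookup` callable (standing in for the single-event query) and the in-memory `checked` Hash are assumptions made for illustration.

```ruby
require "set"

# checked: { uid => email } already stored under the user's contacts node.
# lookup:  callable returning a uid for an email, or nil when the email is
#          not registered (stands in for the single-event query).
# Returns the number of lookups that were actually launched.
def sync_contacts(address_book_emails, checked, lookup)
  known_emails = checked.values.to_set
  queries = 0
  address_book_emails.each do |email|
    next if known_emails.include?(email) # already verified once, skip the query
    queries += 1
    uid = lookup.call(email)
    next if uid.nil?                     # not a user of the app
    checked[uid] = email                 # a real app would also write this back
    known_emails << email
  end
  queries
end
```

Note that in this sketch an email that is not registered gets looked up again on the next run, because only confirmed users are stored in the contacts node.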
Currently I am creating a RESTful API for a mobile application. The RESTful API has a number of end points that allow users to exchange personal information between each other. I was testing how secure these endpoints were and quickly realized that if a third party managed to gain access to the API they could easily look up other user's information by guessing their user id or using an automated script to collect a wide range of personal information. This was due to the fact that I was using a primary key that was a simple auto-incremented integer which made it predictable and easy to determine other user's ids. I immediately began looking for something that didn't follow a distinct pattern. I came across UUIDs and decided to implement them with my existing rails app.
Was this a wise decision? I definitely see the upside to using UUIDs but upon further research I found that there were a number of negatives to this approach. Many sources claim that using UUIDs will cause performance issues with large tables. Are UUIDs right for my situation?
My second question is about implementing this in an existing Ruby on Rails application. I made the switch to UUIDs by following this article: http://rny.io/rails/postgresql/2013/07/27/use-uuids-in-rails-4-with-postgresql.html. I ran into an issue with enabling the uuid-ossp extension. I created a migration and put enable_extension 'uuid-ossp' inside the change function. I then changed the existing migrations to support UUIDs as their primary key and ran rake db:drop db:create db:migrate to recreate the database with the edited migrations. This failed with the error PG::UndefinedFunction: ERROR: function uuid_generate_v4() does not exist. I quickly realized that this was because the migration that enabled the uuid-ossp extension ran after the migrations that I had edited to use UUIDs. When I changed the timestamp in the name of the migration to a date that preceded all other migrations, the db:migrate command completed with no errors. This felt very hacky and defeated the purpose of having migrations. What is the correct way of adding this extension via a migration?
Edit in response to comments:
So a number of comments were made that suggested that I should just be properly authenticating users and checking their permissions before allowing them to view certain data. I have user authentication built into my application but will better explain my situation and why I needed something more than auto-incremented primary keys.
I have a number of users on this application and each user has the ability to create private and public contacts. Public contacts are viewable by everyone using the mobile application. Private contacts can only be viewed by the user who created them. However, a user can share their private contacts with other users by showing other users with the mobile application a QR code that has the contact's ID encoded into it. When the user decodes the contact ID, a request is sent to the backend to notify it that the user is now an owner of that private contact. This allows the second user to receive updates from that private contact. This is a large feature of my application. The aim here is to force people to exchange these contacts in person and to prevent others from seeing these contacts unless this process has happened.
Implementing this concept proved to be fairly tricky as all users could potentially share all private contacts with any other user on the system. I found this extremely hard to implement using permissions as which contacts a user can view is constantly changing.
Originally I implemented this with auto-incremented integers as my primary key for the contact IDs. It worked but forced me to create a very insecure API endpoint that essentially would take a user ID and a private contact ID as parameters and would add that user as an owner of that contact. Because auto-incremented IDs are so predictable, a user with access to the API could essentially loop through a sequence of numbers, calling the endpoint each time with the sequence number as the contact ID, and add themselves as an owner of contacts that hadn't been shared with them. This would bypass the whole process of having to share the contact in person and largely defeat the purpose of my mobile application.
I decided I needed something less predictable, completely random and unique to each private contact. I found UUIDs while doing research to solve this problem and changed the contact ID in my model to be of type UUID. Are UUIDs the best way to solve this? Should I use something else? Have I gone about solving this problem the wrong way?
Are UUIDs the best way to solve this?
You could use them as a solution. If you do, you should build a new contacts table and model instead of trying to migrate the old model. As well as being tricky to implement, any migration would immediately make existing contact/invite emails invalid (since they contain the old id). Briefly support both models, and retire the old auto-incrementing id model once you are happy that traffic using it is no longer important to your application.
There is still a flaw - your contact share links will now be long-lasting, and if anyone gets access to a contact's id for any reason, and knows enough to construct the URL for gaining that user as a contact, then they gain the ability to share it with themselves and anyone else, completely outside of the control of your application. This is because you are relying on knowledge of the id as the only thing preventing access to the contact details.
Should I use something else?
In my opinion, yes. Use a separate nonce or one-off code model (with UUIDs, or an indexed column containing a long random string - you could use SecureRandom for this) that can grant rights to complete the sharing. When someone wants to share a contact, create the nonce object with details about what is being shared - e.g. the contact_id - and use it to generate an email link pointing to a route that will find the nonce and allow access to the resource.
The model doesn't need to be called "Nonce" or contain that as a column, this is just a common name for the pattern. Instead you might call the new model "ContactShare" and the secret property "link_code".
This will allow you to resolve access to contacts using your app's permissions model as normal, and block the possible misuse of sharing links. When the controller with the nonce id or code is invoked, create permissions at that point in order to grant access to the contacts. Then expire or delete the nonce, so it cannot be re-used. I prefer expiry, so you can track usage - this can be as simple as a used boolean column that you update once the sharing request has succeeded.
Note I am not referring to Rack::Auth::Digest nonce routine, which is specific to server authentication. I did not find a RoR pre-built nonce model, but it is possible it goes under a different name.
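As a rough illustration of the one-off code pattern, here is a minimal sketch in plain Ruby with in-memory storage. It is not a Rails model: `ContactShare` and `link_code` follow the names suggested above, while `ShareRegistry`, `redeem` and `can_view?` are hypothetical names, and the Hashes stand in for database tables.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for a ContactShare model with a link_code column.
class ContactShare
  attr_reader :contact_id, :link_code

  def initialize(contact_id)
    @contact_id = contact_id
    @link_code = SecureRandom.urlsafe_base64(32) # long random, unguessable code
    @used = false
  end

  def used?
    @used
  end

  # Expire rather than delete, so usage can still be tracked afterwards.
  def mark_used!
    @used = true
  end
end

class ShareRegistry
  def initialize
    @shares = {}      # link_code => ContactShare (an indexed column in a real table)
    @permissions = {} # contact_id => array of user ids allowed to view the contact
  end

  # Called when an owner wants to share a contact; returns the code for the link.
  def create_share(contact_id)
    share = ContactShare.new(contact_id)
    @shares[share.link_code] = share
    share.link_code
  end

  # Grants access exactly once; returns false for unknown or already used codes.
  def redeem(link_code, user_id)
    share = @shares[link_code]
    return false if share.nil? || share.used?
    (@permissions[share.contact_id] ||= []) << user_id
    share.mark_used!
    true
  end

  # Access is decided by the stored permissions, never by knowing an id.
  def can_view?(contact_id, user_id)
    @permissions.fetch(contact_id, []).include?(user_id)
  end
end
```

The point of the design is that knowing a contact's id grants nothing; only redeeming an unused code creates a permission, and a second attempt with the same code fails.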
I have two type of models in the application I'm working on: User and Account.
Every account has many users. Every user has one account.
When I download the user object from an API, I get the account_id, but not the actual account object. The account object will be downloaded after the user object.
What is the best practice for establishing the relationship between the user and his account in this situation?
Should I insert an empty row into the Accounts table with just its account_id field filled in? And then later, when I download the account, update that row?
First, some Core Data-centric definitions: you have 2 entities (User and Account), not tables (Core Data is an object store, not a SQLite database).
So, you wouldn't have empty rows, you would have stub objects (partially complete objects that will be filled in later).
There is no best practice when it comes to stub objects. Whether you should create them is entirely dependent upon your use case. In some cases it helps to have the basic information about an item so that you have something to show the user while you go and get the details. In your case, you only have an identity so the benefit of stub objects seems very low.
I am developing a gallery which allows users to post photos, comments, vote and do many other tasks.
Now I think that it is correct to allow users to unsubscribe and remove all their data if they want to. However it is difficult to allow such a thing because you run the risk to break your application (e.g. what should I do when a comment has many replies? what should I do with pages that have many revisions by different users?).
Photos can be easily removed, but for other data (i.e. comments, revisions...) I thought that there are three possibilities:
assign it to the admin
assign it to a user called "removed-user"
maintain the current associations (i.e. the user ID) and only rename the user's data (e.g. assign a new username such as "removed-user-24" and a non-existent e-mail such as "noreply-removed-user-24@mysite.com")
What are the best practices to follow when we allow users to remove their accounts? How do you implement them (particularly in Rails)?
I've typically solved this type of problem by having an active flag on user, and simply setting active to false when the user is deleted. That way I maintain referential integrity throughout the system even if a user is "deleted". In the business layer I always validate a user is active before allowing them to perform operations. I also filter inactive users when retrieving data.
The usual thing to do, instead of deleting them from the database, is to add a boolean flag field and have it be true for valid users and false for invalid users. You will have to add code to filter on the flag. You should also remove all the user's data that you can. The primary purpose of this flag is to keep the links intact. It is a variant of renaming the user's data, but the flag will be easier to check.
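A minimal sketch of the flag approach, in plain Ruby rather than ActiveRecord. The `UserStore` class, its method names and the "removed-user" placeholder are assumptions made for illustration; in Rails the same idea would typically be a boolean column plus a scope.

```ruby
User = Struct.new(:id, :username, :active)
Comment = Struct.new(:id, :user_id, :body)

# Hypothetical in-memory store standing in for the users table.
class UserStore
  def initialize(users)
    @users = users.to_h { |u| [u.id, u] }
  end

  # "Deleting" only flips the flag, so comments keep a valid user_id.
  def soft_delete(id)
    @users.fetch(id).active = false
  end

  # Filter inactive users whenever data is retrieved for display.
  def active_users
    @users.values.select(&:active)
  end

  # Business-layer check before allowing a user to perform an operation.
  def can_act?(id)
    user = @users[id]
    !user.nil? && user.active
  end

  # The link from comment to user stays intact; only the displayed name changes.
  def author_name(comment)
    user = @users.fetch(comment.user_id)
    user.active ? user.username : "removed-user"
  end
end
```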
Ideally, in a system you would not want to "hard delete" data. The best way I know of, and one that we have implemented in the past, is "soft delete". Maintain a status column in all your data tables that indicates whether the row is active or not. Any row is "Active" by default when created; as entries are deleted, they are made inactive.
All select queries that display data on screen filter results for "active" records only. This gives you the following advantages:
1. Data recovery is possible.
2. You can have a scheduled task at the database level (such as a SQL procedure) that takes care of hard deletes once in a while, if really needed.
3. You can have an admin screen to decide which accounts, entries, etc. you'd really want to mark for deletion.
4. Temporarily disabling an account can also be implemented with the same solution.
In the production environments I have worked on, a hard delete is a strict no-no. In fact, audits are maintained for deletes as well. But if the application is really small, it is up to you.
I would still suggest a "virtual delete" or "soft delete" with periodic cleanup at the database level, which is a faster and more efficient way of cleaning up.
I generally don't like to delete anything and instead opt to mark records as deleted/unpublished using states (with AASM, i.e. acts as state machine).
I prefer states and events to just using flags, as you can use events to update attributes, send emails, etc. in one fell swoop. Then check states to decide what to do later on.
HTH.
I would recommend adding a delete date field that contains the date/time the user unsubscribed - not only on the user record, but on all information related to that user. The app should check the field prior to displaying anything. You can then run a hard delete for all records 30 days (your choice of time) after the delete date. This hides the information immediately (you will probably need to update the app in a few places), gives the user time to re-subscribe (after an accident or a change of mind), and leaves a scheduled process to delete old data. I would remove ALL information about the member and any related comments about the member or their previously published data (photos, etc.).
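Here is a small sketch of the delete-date idea in plain Ruby. The `Record` struct, the function names and the in-memory array are assumptions made for illustration; the 30-day window is the example figure from above, and in a real app the purge would run as a scheduled job against the database.

```ruby
RETENTION_SECONDS = 30 * 24 * 60 * 60 # 30 days, your choice of time

# Any row tied to a user (profile, comment, photo, ...) carries a deleted_at field.
Record = Struct.new(:id, :user_id, :deleted_at)

# Stamp every record belonging to the user when they unsubscribe.
def unsubscribe(records, user_id, now)
  records.each { |r| r.deleted_at = now if r.user_id == user_id && r.deleted_at.nil? }
end

# Clearing the stamp within the retention window restores the account.
def resubscribe(records, user_id)
  records.each { |r| r.deleted_at = nil if r.user_id == user_id }
end

# The app only displays records without a delete date.
def visible(records)
  records.select { |r| r.deleted_at.nil? }
end

# Scheduled cleanup: returns the records that survive the hard delete.
def purge_expired(records, now)
  records.reject { |r| r.deleted_at && now - r.deleted_at >= RETENTION_SECONDS }
end
```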
I am sure this has changed a lot since the introduction of data protection rules such as GDPR.
I found this page while looking for advice on Apple's new, extended account deletion requirements: https://developer.apple.com/news/?id=i71db0mv
We are using Ruby on Rails right now. Are the answers here a little outdated, or are they still useful?
I was thinking of something like this:
Create a new table, “old_user_table”, with the old user_id, first name, second name, email, and booking slug.
This would let us keep a record of every user who made a booking while deleting their user record in the app. We need to keep all booking records from the last 5 years for audit purposes.
If a user signed up but never made a booking, they would not be copied to “old_user_table”.
Does that make sense?