I checked out Redis pub/sub functionality and at first glance it looks perfect for something like forming a twitter feed.
However I tried to google for Redis pub/sub and newsfeed and I can barely find any example or use case about this. If Redis is actually not good for this, what are the disadvantages?
First of all, Redis pub/sub is not a data store; it is just a channel that messages flow through.
For example (in chronological order):
You create a channel named news:feed
User A joins news:feed
User B publishes to news:feed
This scenario works fine. But the following doesn't:
You create a channel named news:feed
User B publishes to news:feed
User A joins news:feed
In this case, user A will never receive the message that user B published before A joined.
If you want to implement newsfeed using pub/sub, you have to create several channels (at least as many as users). Here is an implementation of a simple Twitter clone: http://redis.io/topics/twitter-clone
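The two scenarios above can be reproduced with a toy model. This is only a sketch in plain Python, not the Redis API: `ToyPubSub` and the `timeline` list are hypothetical stand-ins for a pub/sub channel and for a stored list of posts such as the one the Twitter clone keeps.

```python
from collections import defaultdict


class ToyPubSub:
    """In-memory stand-in for a pub/sub channel: nothing is stored."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> list of inboxes

    def subscribe(self, channel):
        inbox = []
        self.subscribers[channel].append(inbox)
        return inbox

    def publish(self, channel, message):
        # Delivered only to inboxes that exist right now, then forgotten.
        for inbox in self.subscribers[channel]:
            inbox.append(message)
        return len(self.subscribers[channel])


bus = ToyPubSub()
timeline = []  # a stored feed, newest first (hypothetical)

# Second scenario: B publishes before A joins.
message = "B: hello"
bus.publish("news:feed", message)  # nobody is listening, so the message is lost
timeline.insert(0, message)        # the stored copy survives

inbox_a = bus.subscribe("news:feed")
print(inbox_a)   # A's live inbox is empty
print(timeline)  # A can still read the stored feed
```

The point of the sketch is that a late subscriber needs some stored copy of the feed; pub/sub alone only helps with pushing new items to clients that are already connected.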
So I got a task to prepare a simple analysis on how useful, from sociometrical point of view, are Slack API methods (https://api.slack.com/methods).
Yesterday I didn't even know that such a thing as sociometry existed, and I still don't know how to evaluate an API using its methodology. Has anyone here ever had a similar task, or does anyone have an idea how to approach such an analysis? What literature would be useful? I don't mean this analysis to be particularly long, but for now I don't even know where to start.
Frankly, I am not an expert on sociometry, but here is how I would approach it:
I would assume the goal is to create a sociogram depicting the relationships between all users on a Slack team using the API methods. So the question is how useful the API methods are for achieving that goal.
Slack does not have a "friends list", like Facebook, so you have to come up with your own approach on how to identify relationships on Slack. Slack is a messaging system, so it makes sense to define it based on who is communicating with whom.
Let's define users as having a relationship if they are
direct messaging each other (including groups)
talking to each other in a channel (using the @user mention)
or just part of the same channel and talking in that channel
Now to assess the effectiveness of the API methods. The basic approach would be to retrieve the messages of a public channel with channels.history (or im.history for direct messages, groups.history for private channels and mpim.history for direct messaging channels with multiple participants) for a given time period. In addition you can retrieve the members of a channel with channels.info (or its counterparts for the other channel types). Then you would parse all retrieved messages and the member list of each channel to identify the relationships and calculate the sociogram.
However, Slack only allows users to access channels that they are members of. That includes access through the API, and it also applies to users with the admin and owner roles.
So it's not possible to see all direct messages, group chats and private channels of a Slack team through the API, and we would therefore need to limit the approach to public channels and some private channels. Depending on where most of the conversation happens on a specific Slack team, and on which private channels our Slack user is a member of, this could significantly limit the ability to calculate a complete sociogram.
In summary, you can use the API methods to calculate a sociogram for your Slack team based on which users are communicating with each other. But that analysis will not be 100% complete, since it's not possible to access all private communication on a Slack team through the API. The calculated sociogram might still be useful, though, if the Slack user doing the calculation has access to all relevant private channels.
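The parsing step could be sketched roughly as follows. This works on messages that have already been retrieved and makes no API calls. The message shape (a `user` field and a `text` field) and the `<@U...>` mention markup are assumptions about what the history methods return, and `sociogram_edges` is a hypothetical helper.

```python
import re
from collections import Counter
from itertools import combinations

MENTION = re.compile(r"<@([A-Z0-9]+)>")  # assumed mention markup in message text


def sociogram_edges(channels):
    """Count undirected ties between users from per-channel message lists."""
    edges = Counter()
    for messages in channels.values():
        speakers = {m["user"] for m in messages}
        # Weak tie: both users talked in the same channel.
        for a, b in combinations(sorted(speakers), 2):
            edges[(a, b)] += 1
        # Stronger tie: one user explicitly mentions another.
        for m in messages:
            for target in MENTION.findall(m["text"]):
                if target != m["user"]:
                    edges[tuple(sorted((m["user"], target)))] += 1
    return edges


# Made-up sample data, standing in for one channel's retrieved history.
sample = {
    "general": [
        {"user": "U1", "text": "morning <@U2>"},
        {"user": "U2", "text": "hi"},
        {"user": "U3", "text": "hello all"},
    ],
}
edges = sociogram_edges(sample)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

The resulting weighted edge list is the input a graph library or a plotting tool would need to draw the sociogram; how much weight a mention gets compared with shared channel activity is a modelling choice.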
I'm looking to basically remake Kik within my app. In most guides I've seen on Firebase chat applications, there is one major Messages node, and underneath that there's a fan-out with messages for each user that reference messages in the main list.
With the way my Firebase is laid out at the moment, it would be easier to implement something like this:
users
    chatPartners
        02834092890428
            chatMessages
                2093840923840923
                    timestamp/userUID/etc.
and just have the actual chat inside of my user's node. This seems to also cut down majorly on having to sift through every single message in a messages node?
So when users send messages to each other, I'd update the "chat messages" node under both the sender and the recipient.
Is there any reason NOT to do it like this? I see everyone doing it the first way I described, yet I don't see a reason why storing each chat under user--->chat partner --> the chat log would be an issue.
The only issue you may run into is how the data is read. Note that because the chat log is a child of 'Users' and 'chatPartners', reading a node higher up in that branch also loads everything beneath it. If you read at the 'Users' level, you are essentially loading every piece of chat data in the database, which costs time and performance.
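The double write described in the question (one copy under the sender, one under the recipient) might be prepared like this. It is a sketch that only builds a path-to-value mapping in plain Python. The path layout follows the tree in the question, while `fan_out_message`, the `text` field and the sample IDs are hypothetical; whether the mapping can be written in a single call depends on the SDK in use.

```python
def fan_out_message(sender, recipient, message_id, text, timestamp):
    """Build one path -> value mapping that stores the message for both users."""
    payload = {"timestamp": timestamp, "userUID": sender, "text": text}
    paths = {}
    for owner, partner in ((sender, recipient), (recipient, sender)):
        path = f"users/{owner}/chatPartners/{partner}/chatMessages/{message_id}"
        paths[path] = dict(payload)  # each user gets an independent copy
    return paths


updates = fan_out_message("uidA", "uidB", "msg1", "hi", 1700000000)
for path in sorted(updates):
    print(path)
```

Reading a single conversation then means reading only `users/<uid>/chatPartners/<partner>/chatMessages`, never the whole `users` branch.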
I am developing an iOS app like Tinder. Users can chat only in private, 1:1.
Should I open one channel for every single "match"? Is this the correct design pattern for this case? What about performance if I have one channel per "match"?
A "match" is when one user matches with another and they can start a private chat.
If one person can have multiple matches, you can ask the PubNub client to open a separate channel for each matched person. So, when you have two matching persons, you take a unique identifier from each of them and, using a known algorithm, create a unique channel name that both clients subscribe to in order to communicate.
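One possible "known algorithm" is to sort the two identifiers and join them, so that both clients compute the same name independently. This is a sketch only: `match_channel`, the `match` prefix and the `-` separator are hypothetical, and it assumes user IDs never contain the separator.

```python
def match_channel(user_a, user_b, prefix="match"):
    """Return the same channel name regardless of argument order."""
    low, high = sorted((str(user_a), str(user_b)))
    return f"{prefix}-{low}-{high}"


# Both participants derive the name locally and subscribe to it.
print(match_channel("u42", "u7"))
print(match_channel("u7", "u42"))
```

Because such a name is guessable by anyone who knows both IDs, it makes sense to combine it with access control on the channel.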
One channel for the whole application is a really bad idea: it can carry a massive flow of data that is useless to most subscribers, because each message is meant for only one of them.
Yes, the best approach is that every "match" should have its own channel on which both participants publish/subscribe to communicate. PubNub has no limit on channels (nor does it charge based on channels), so this shouldn't create a performance or cost issue.
To add access control to the "match" channel (if you want to ensure no one else can access that channel), use PubNub Access Manager, documented here: http://www.pubnub.com/docs/javascript/tutorial/access-manager.html (use dropdown to change programming language)
If you want to provide chat history, so that the two participants can see messages from previous chat sessions, enable PubNub Storage & Playback, and use the PubNub.History() API, documented here: http://www.pubnub.com/docs/javascript/overview/storage-playback.html
If you want to see when those two participants are connected to the Match channel, use PubNub Presence, documented in the same place.
In the Simperium documentation/help section there is the following text:
All the data that is created seems like it must be tied to a user - is
that correct? Is it possible to have data that isn't tied to a user -
say a database of locations or beers?
Yes, though this isn't very clear yet. You can create a public user
(i.e., a public namespace) with an access token you share with other
users of your app so anyone can read/write to that namespace.
It's possible to limit this to read-only access as well if you need to
authoritatively publish data from a backend service.
Is there an actual example with this?
The scenario I have is as follows
My app will have a calendar
The primary user can add and remove data from the calendar
They will want to invite other users to add and remove data. My thought is that they can give them a token, and the invited user can sign in with their email address and this token
Am I on the right track?
You're definitely on the right track, but a little too far ahead on that track. The scenario you described is a great fit for Simperium, but sharing and collaboration features aren't yet released.
The help text you quoted is for authoritatively pushing content, for example from a custom backend to all users of your app. An example of this would be a news stream that updates on all clients as new content is added.
This is quite different than sharing calendar data among a group of users who have different access permissions, which is actually a better use of Simperium's strengths. We have a solution for this that we've tested internally, but we're using what we've learned to build a better version of it that will be more scalable for production use.
There's no timeline for this yet, but you'll see it announced on your dashboard at simperium.com.
Social networking websites probably maintain tables for users, friends and events...
How do they use these tables to compute friends' events in an efficient and scalable manner?
Many of the social networking sites like Twitter don't use an RDBMS at all but a message queue application. A lot of them start out with an existing application like RabbitMQ. Some of them get big enough that they have to heavily customize it or build their own. Twitter is in the process of doing this for the second time.
A message queue application works by holding messages from one service for one or more other services. For instance, say service Frank is publishing messages to a queue foo. Joe and Jill are subscribed to Frank's foo queue. The application will keep track of whether or not Joe and Jill have received the messages, and once every subscriber to the queue has received a message, it discards it. Frank fires messages and forgets about them. Joe and Jill ask for messages from foo and get whatever messages they haven't gotten yet. Joe and Jill do whatever they need to do with each message, perhaps keeping it around, perhaps not.
The message queue application guarantees that everyone who is supposed to get a message can and will get it when they request it. The publisher can send messages confident that the subscribers will get them eventually. This has the benefit of being completely asynchronous and not requiring costly joins.
EDIT: I should also mention that the storage for this kind of thing at high scale is usually heavily denormalized. So Joe and Jill may each be storing a copy of the exact same message. This is considered OK because it helps the application scale to billions of users.
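The Frank, Joe and Jill example can be modelled in a few lines. This is a toy in-memory sketch of the behaviour described above, not how RabbitMQ or Qpid are implemented; `ToyQueue` and its methods are hypothetical names.

```python
class ToyQueue:
    """Holds each message until every subscriber has fetched it."""

    def __init__(self, subscribers):
        self.subscribers = set(subscribers)
        self.messages = {}  # message id -> body
        self.pending = {}   # message id -> subscribers still waiting for it
        self.next_id = 0

    def publish(self, body):
        # Fire and forget: the publisher does not wait for anyone.
        self.messages[self.next_id] = body
        self.pending[self.next_id] = set(self.subscribers)
        self.next_id += 1

    def fetch(self, subscriber):
        # Hand over, in publish order, everything this subscriber has not seen.
        delivered = []
        for msg_id in sorted(self.pending):
            if subscriber in self.pending[msg_id]:
                delivered.append(self.messages[msg_id])
                self.pending[msg_id].discard(subscriber)
        # Discard messages that every subscriber has now received.
        done = [m for m, waiting in self.pending.items() if not waiting]
        for msg_id in done:
            del self.pending[msg_id]
            del self.messages[msg_id]
        return delivered


foo = ToyQueue(["joe", "jill"])
foo.publish("frank: first")
foo.publish("frank: second")
joe_got = foo.fetch("joe")    # Joe gets both messages
jill_got = foo.fetch("jill")  # Jill gets both; the queue can now drop them
print(joe_got, jill_got, foo.messages)
```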
Other reading:
http://www.rabbitmq.com/
http://qpid.apache.org/
The mainstay data structure of social networking sites is the graph. On Facebook the graph is undirected (when you're someone's friend, they're your friend). On Twitter the graph is directed (you follow someone, but they don't necessarily follow you).
The two popular ways to represent graphs are adjacency lists and adjacency matrices.
An adjacency list is simply a list of edges on the graph. Consider a user with an integer userid.
User1, User2
1 2
1 3
2 3
The undirected interpretation of these records is that user 1 is friends with users 2 and 3 and user 2 is also friends with user 3.
Representing this in a database table is trivial. It is the many to many relationship join table that we are familiar with. SQL queries to find friends of a particular user are quite easy to write.
Now that you know a particular user's friends, you just need to join those results to the updates table. This table contains all the user's updates indexed by user id.
As long as all these tables are properly indexed, you'd have a pretty easy time designing efficient queries to answer the questions you're interested in.
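A minimal sketch of that join, using Python's built-in `sqlite3` module with an in-memory database. The table and column names (`friends`, `updates`, `user1`, `user2`, `user_id`, `body`) are assumptions for illustration; the friendship rows are the three edges listed above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE friends (user1 INTEGER, user2 INTEGER);
    CREATE TABLE updates (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    CREATE INDEX idx_friends_user1 ON friends (user1);
    CREATE INDEX idx_friends_user2 ON friends (user2);
    CREATE INDEX idx_updates_user ON updates (user_id);
    INSERT INTO friends VALUES (1, 2), (1, 3), (2, 3);
    INSERT INTO updates (user_id, body) VALUES
        (1, 'one posts'), (2, 'two posts'), (3, 'three posts');
""")


def friend_updates(user_id):
    """Return (user_id, body) rows posted by the given user's friends."""
    rows = db.execute("""
        SELECT u.user_id, u.body
        FROM updates AS u
        JOIN (
            -- Undirected edges: the user may appear in either column.
            SELECT user2 AS friend FROM friends WHERE user1 = :me
            UNION
            SELECT user1 AS friend FROM friends WHERE user2 = :me
        ) AS f ON f.friend = u.user_id
        ORDER BY u.id DESC
    """, {"me": user_id})
    return rows.fetchall()


print(friend_updates(2))
```

For a directed graph such as a follower relationship, only one of the two `SELECT` branches would be needed.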
Travis wrote a great post on this: Activity Logs and Friend Feeds on Rails & pfeed
At small scale, doing a join on users.friends and users.events with query caching is probably fine, but it slows down pretty quickly as friends and events grow. You could also try an event-based model in which, every time a user creates an event, an entry is created in a join table (perhaps called "friends_events") for each of that user's friends. Then whenever a user wants to see what events their friends have created, they can simply look up their own id in the friends_events table. In this way you avoid grabbing all of a user's friends and then joining those friends with the events table.
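The event-based model could look roughly like this, again sketched with `sqlite3` in memory. All table and column names other than `friends_events` are hypothetical, and the friendship rows are stored in both directions to keep the fan-out query simple.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE friendships (user_id INTEGER, friend_id INTEGER);
    CREATE TABLE events (id INTEGER PRIMARY KEY, creator_id INTEGER, title TEXT);
    CREATE TABLE friends_events (viewer_id INTEGER, event_id INTEGER);
    CREATE INDEX idx_fe_viewer ON friends_events (viewer_id);
    INSERT INTO friendships VALUES (1, 2), (2, 1), (1, 3), (3, 1);
""")


def create_event(creator_id, title):
    """Insert the event, then add one friends_events row per friend."""
    cur = db.execute(
        "INSERT INTO events (creator_id, title) VALUES (?, ?)",
        (creator_id, title),
    )
    event_id = cur.lastrowid
    db.execute("""
        INSERT INTO friends_events (viewer_id, event_id)
        SELECT friend_id, ? FROM friendships WHERE user_id = ?
    """, (event_id, creator_id))
    return event_id


def feed(viewer_id):
    """Read the feed by viewer id alone, without touching friendships."""
    return db.execute("""
        SELECT e.title
        FROM friends_events AS fe
        JOIN events AS e ON e.id = fe.event_id
        WHERE fe.viewer_id = ?
        ORDER BY e.id DESC
    """, (viewer_id,)).fetchall()


create_event(1, "picnic")   # fans out to users 2 and 3
create_event(2, "concert")  # fans out to user 1
print(feed(1), feed(2), feed(3))
```

The trade-off is more work and more rows at write time in exchange for a cheap, index-backed read.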