Twitter-like model with RavenDB

I am playing around a bit with Raven and trying to figure out the best way to model my objects for a Twitter-like scenario. So far I have come up with a few options, but I am not sure which one is best.
public class User{
    public string Id{get;set;}
    public List<string> Following{get;set;}
    public List<string> Followers{get;set;}
}
The User object is simple and straightforward, just an ID and a list of IDs for people I follow and people following me. The feed setup is where I need help, getting all posts from users that I am following.
Option 1 - The easy route
This searches for all posts of people I follow just based on their UserId.
public class Post{
    public string UserId{get;set;}
    public string Content{get;set;}
}
Index
public class Posts : AbstractIndexCreationTask<Post>{
    public Posts(){
        Map = results => from r in results
                         select new{
                             r.UserId
                         };
    }
}
Querying
var posts = session.Query<Post,Posts>().Where(c=>c.UserId.In(peopleImFollowing));
This is the obvious route, but it smells bad. The query turns into a long chain of OR clauses sent to Lucene. There is an upper limit of somewhere around 1024 clauses that Raven will handle, so no single user could follow more than about 1,000 people.
Option 2 - One post for each follower
public class Post{
    public string UserId{get;set;}
    public string RecipientId{get;set;}
    public string Content{get;set;}
}
Adding a new post
foreach(string followerId in me.Followers){
    session.Store(new Post{
        UserId = me.Id,
        RecipientId = followerId,
        Content = "foobar" });
}
This is simple to follow and easy to query but it seems like there would be way too many documents created... perhaps that doesn't matter though?
Option 3 - List of recipients
So far I like this the best.
public class Post{
    public string UserId{get;set;}
    public List<string> Recipients{get;set;}
    public string Content{get;set;}
}
Index
public class Posts : AbstractIndexCreationTask<Post>{
    public Posts(){
        Map = results => from r in results
                         select new{
                             UserId = r.UserId,
                             Recipients = r.Recipients
                         };
    }
}
Adding new post
session.Store(new Post{
    UserId = me.Id,
    Recipients = me.Followers,
    Content = "foobar"
});
Querying
var posts = session.Query<Post,Posts>().Where(c => c.Recipients.Any(r => r == me.Id));
This seems like the best way but I have never worked with Lucene before. Would it be a problem for the index if someone has 10,000 followers? What if we want to post a message that goes to every single user? Perhaps there is another approach?

From my perspective, only option 1 really works, and you will probably want to tune how RavenDB talks to Lucene if you want to support following more than 1024 users.
Options 2 and 3 don't take into account that after you follow new users, you want their older tweets to show up in your timeline. Likewise, you want those tweets to disappear from your timeline after you unfollow them. To implement this with either of those two approaches, you would need to duplicate all of their tweets on the 'follow' operation and delete them on 'unfollow'. This would make following/unfollowing a very expensive operation, and it could also fail (what if the server that holds some of the tweets isn't available at the moment you're doing this?).
Option 2 also has the immense disadvantage that it would produce enormous amounts of duplicate data. Think about famous users with millions of followers and thousands of posts. Then multiply this by thousands of famous users... not even Twitter can handle such amounts of data.
Option 3 also has the problem that queries against the index get slow, because every Lucene document would have this 'recipient' field with perhaps millions of values. And you have trillions of documents... no, I'm not a Lucene expert, but I don't think that works fast enough to display the timeline (even ignoring that you are not the only concurrent user who wants to display a timeline).
As I said above, I think that only option 1 works. Maybe someone else has a better approach. Good question btw.
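One way to live with the clause limit in option 1 is to split the list of followed user IDs into batches, run one query per batch, and merge the results. Below is a minimal sketch of just the batching step. It is written in Java to keep it self-contained, but the logic carries over to C# directly. The FeedBatcher name and any particular batch size are assumptions, and the RavenDB query for each batch is deliberately not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits followed user IDs into batches small enough
// to stay under a per-query clause limit.
class FeedBatcher {
    static List<List<String>> partition(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }
}
```

With 2,500 followed users and a batch size of 1,000, this yields three batches (1,000, 1,000 and 500 IDs). The merged results would still need to be sorted by date on the client.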

Related

How to only fetch data owned by an authenticated user in GraphRepository

I'm trying to create a REST API using Spring Boot (version 1.4.0.M2) with Spring Data Neo4j, Spring Data REST and Spring Security. The domain consists of three types of entities:
The user entity storing the username/password and relationships to all user-owned entities:
@NodeEntity
public class User extends Entity {
    @Relationship(type = "OWNS")
    private Set<Activity> activities;
    @JsonProperty(access = JsonProperty.Access.WRITE_ONLY)
    private String password;
    private String username;
}
Content entities created by/owned by a user:
@NodeEntity
public class Activity extends Entity {
    private String name;
    @Relationship(type = "ON", direction = Relationship.INCOMING)
    private Set<Period> periods;
    @Relationship(type = "HAS", direction = Relationship.INCOMING)
    private User user;
}
Data entities storing date and time information, accessible to all users:
@NodeEntity
public class Period extends Entity {
    @Relationship(type = "HAS")
    private Set<Activity> activities;
    @Relationship(type = "HAS_PERIOD", direction = Relationship.INCOMING)
    private Day day;
    private PeriodNames name;
}
I'm trying to keep everything as simple as possible for now so I only use Repositories extending GraphRepository and a Spring Security configuration that uses Basic Authentication and the User Object/Repository in the UserDetailsService. This is working as expected, I am able to create a user, create some objects and so on.
The problem is that I want users to access only their own entities. Currently, everybody can access everything. As I understand it from my research, I have three ways to achieve this:
1) Use Spring Security annotations on the repository methods like this:
    @PostAuthorize("returnObject.user.username == principal.username")
    @Override
    Activity findOne(Long id);
    @PostFilter("filterObject.user.username == principal.username")
    @Override
    Iterable<Activity> findAll();
2) Annotate the methods with @Query and use a custom query to get the data.
3) Create custom controllers and services that do the actual data querying, similar to this: sdn4-university.
Now my question:
What would be the best way to implement the desired functionality using my available tools?
For me it seems to be the preferred way to use #2, the custom query. This way I can only fetch the data that I actually need. I would have to try to find a way to create a query that enables paging for Page<Activity> findAll(Pageable pageable) but I hope this is possible. I wasn't able to use principal.username in the custom query though. It seems as if spring-data-neo4j doesn't have support for SpEL right now. Is this correct or is there another way to access the currently authenticated user in a query?
Way #1, using Spring Security Annotations works for me (see the code above) but I could not figure out how to filter Page<Activity> findAll(Pageable pageable) because it returns a Page object and not an entity or collection. Also I'm not sure if this way is efficient as the database always has to query for all entities and not only the ones owned by a specific user. This seems like a waste of resources. Is this incorrect?
Or should I just go with #3 and implement custom controllers and services? Is there another way that I didn't read about?
I'm very grateful for any input on this topic!
Thanks,
Daniel

Since no one has answered yet, let me try...
1) I agree with your assessment that Spring Security's @PostAuthorize is not the way to go for filtering data. As you mentioned, it would either load all the data and then filter it, creating a heavy load, or probably wreck the paging mechanism:
Imagine you have a million results; loading them all would be heavy. And if you filter them afterwards, you might end up with, let's say, 1,000 valid results. But I strongly doubt that the paging mechanism will be able to cope with that; more likely you will end up with many seemingly empty pages. So if you loaded, let's say, the first 20 results, you might end up with an empty page because they were all filtered out.
Perhaps in some cases you could use @PreAuthorize to prevent a query from happening, if you only want a single result, as in findOne. This could lead to a 403 if access is not allowed, which would, IMHO, be OK. But for filtering collections, Spring Security doesn't seem like a good idea.
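The paging problem can be made concrete with a small simulation. This is only a sketch in plain Java with no Spring involved, and Activity, pageThenFilter and filterThenPage are hypothetical names. It compares filtering after a page has been loaded (roughly what a post-filter does) with filtering before paging (what a query restricted to the owner does).

```java
import java.util.List;
import java.util.stream.Collectors;

class PagingDemo {
    // Hypothetical stand-in for an entity that has an owner.
    static final class Activity {
        final String name;
        final String owner;

        Activity(String name, String owner) {
            this.name = name;
            this.owner = owner;
        }
    }

    // Post-filter style: load one page first, then drop entries of other users.
    static List<Activity> pageThenFilter(List<Activity> all, String user, int page, int size) {
        return all.stream()
                .skip((long) page * size)
                .limit(size)
                .filter(a -> a.owner.equals(user))
                .collect(Collectors.toList());
    }

    // Query style: restrict to the user's entries first, then page.
    static List<Activity> filterThenPage(List<Activity> all, String user, int page, int size) {
        return all.stream()
                .filter(a -> a.owner.equals(user))
                .skip((long) page * size)
                .limit(size)
                .collect(Collectors.toList());
    }
}
```

If 20 entries owned by someone else come before the user's own 5, the first variant returns an empty first page of size 20, while the second returns all 5 on the first page.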
3) That's always a possibility, but I wouldn't go there without trying for alternatives. Spring Data Rest is intended to make our code cleaner and coding easier, so we should not throw it away without being 100% sure that we cannot get it to do what we need.
2) Thomas Darimont wrote in this blog posting that there is a way to use principal (and other things) in @Query annotations. Let me summarize it so the answer is available here...
The basic idea is to create a new EvaluationContextExtensionSupport:
class SecurityEvaluationContextExtension extends EvaluationContextExtensionSupport {
    @Override
    public String getExtensionId() {
        return "security";
    }
    @Override
    public SecurityExpressionRoot getRootObject() {
        Authentication authentication = SecurityContextHolder.getContext().getAuthentication();
        return new SecurityExpressionRoot(authentication) {};
    }
}
...and...
@Configuration
@EnableJpaRepositories
class SecurityConfiguration {
    @Bean
    EvaluationContextExtension securityExtension() {
        return new SecurityEvaluationContextExtension();
    }
}
...which now allows a @Query like this...
@Query("select o from BusinessObject o where o.owner.emailAddress like "+
    "?#{hasRole('ROLE_ADMIN') ? '%' : principal.emailAddress}")
To me, that seems to be the cleanest solution, since your @Query now uses the principal without you having to write all the controllers yourself.

OK, I think I have found a solution. I guess it's not very pretty, but it works for now. I used @PostAuthorize("returnObject.user.username == principal.username") or similar for repository methods that work with single entities, and created a default implementation for Page<Activity> findAll(Pageable pageable) that gets the username by calling SecurityContextHolder.getContext().getAuthentication().getName() and calls a custom query method that fetches the correct data:
@RestResource(exported = false)
@Query("MATCH (u:User)-[:HAS]->(a:Activity) WHERE u.username={ username } RETURN a ORDER BY CASE WHEN NOT { sortingProperty } IS NULL THEN a[{ sortingProperty }] ELSE null END SKIP { skip } LIMIT { limit }")
List<Activity> findAllForUsernamePagedAndSorted(@Param("username") String username, @Param("sortingProperty") String sortingProperty, @Param("skip") int skip, @Param("limit") int limit);
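The default implementation has to turn a page request into the skip and limit parameters of the custom query. Here is a minimal sketch of that arithmetic in plain Java, so that it runs without Spring. PageWindow is a hypothetical name, and zero-based page numbers are assumed, as in Spring Data's Pageable.

```java
// Hypothetical value type holding the SKIP and LIMIT values for one page.
class PageWindow {
    final int skip;
    final int limit;

    private PageWindow(int skip, int limit) {
        this.skip = skip;
        this.limit = limit;
    }

    // pageNumber is zero-based; pageSize is the number of items per page.
    static PageWindow of(int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize < 1) {
            throw new IllegalArgumentException("invalid page request");
        }
        // multiplyExact throws on int overflow instead of wrapping silently.
        return new PageWindow(Math.multiplyExact(pageNumber, pageSize), pageSize);
    }
}
```

The two values would then be passed as the skip and limit arguments of findAllForUsernamePagedAndSorted. If the returned Page has to report a total element count, a separate count query would presumably be needed as well.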

Eager Load from Entity Framework SP

I'm trying to populate my domain models and child entities with one SQL stored procedure execution. Perhaps this is answered here. I'm pretty certain it's not possible, but I thought I would throw the question out there to find possible workarounds.
I have quite complex domain models and I'm looking for a more efficient way of loading my data than querying a customer and then lazy loading its children. I have presented a simple example of what I'm trying to achieve below:
public class Customer{
    public int Id { get; set; }
    public virtual Address Address { get; set; }
}
public class Address{
    public int Id { get; set; }
}
var customer = this.Database.SqlQuery<Customer>("exec SP_Name");
I know in EF5 you can return multiple data contexts, but I'm hopeful I can resolve multiple child entities.
I hope I've made sense. I'm lacking a lot of sleep, so apologies if I haven't. Following a sport in a timezone 10 hours behind makes it difficult! :(

Stored procedures in EF don't offer eager loading. They can only load a single level of entities. You can use a stored procedure with multiple result sets, as mentioned in the linked article, but that works only with EDMX, and you must execute the mapped function import instead of SqlQuery. Alternatively, you can simply use eager loading with a LINQ query instead of a stored procedure to avoid lazy loading:
var customer = context.Set<Customer>()
    .Include(c => c.Address)
    .FirstOrDefault(c => c.Name == someName);

MVC Entity Framework Partial Class access DB for property value

I am using Entity Framework mapped to my database. I have a Basket model which can have many BasketItem models, and I have Promotions and Coupons models.
This is for eCommerce checkout functionality and I just don't understand how this will work, here goes:
Because my BasketItems have a foreign key relationship to the Basket, I can sum up the Subtotal for my basket items in a partial class like this:
public decimal Subtotal {
    get {
        return this.BasketItems.Sum(pb => pb.Subtotal);
    }
}
This is helpful because I can use this inside a view, there's no mucking around with passing a DB context through and it's DRY, etc. etc.
Now I want to apply promotions or coupons to my Subtotal. Ideally I want it to look like this:
public decimal DiscountedSubtotal {
    get {
        decimal promotions_discount = 0;
        decimal coupons_discount = 0;
        return Subtotal - promotions_discount - coupons_discount;
    }
}
But there is no access to Promotions or Coupons without either creating some crazy and unnecessary relationships in my database or some light hacking to get this functionality to work. I don't know what I should do to overcome this problem.
Solution 1:
public decimal DiscountedSubtotal(DatabaseEntities db) {
    // Sum the amounts (assuming Amount is a decimal); the nullable cast makes an empty table yield 0
    decimal promotions_discount = db.Promotions.Sum(p => (decimal?)p.Amount) ?? 0m;
    decimal coupons_discount = db.Coupons.Sum(c => (decimal?)c.Amount) ?? 0m;
    return Subtotal - promotions_discount - coupons_discount;
}
I don't want to use this in my View pages, plus I have to send through my context every time I want to use it.
Solution 2: (untested)
public List<Promotion> Promotions { get; set; }
public List<Coupon> Coupons { get; set; }
public Basket()
    : base() {
    DatabaseEntities db = new DatabaseEntities();
    Promotions = db.Promotions.ToList();
    Coupons = db.Coupons.ToList();
}
A bit of light hacking could provide me with references to promotions and coupons, but I've had problems with creating new contexts before, and I don't know if there is a better way to get to the DiscountedSubtotal property I would ideally like.
So to sum up my question, I would like to know the best way to get a DiscountedSubtotal property.
Many thanks and apologies for such a long read :)

I think the problem here is that you're not really using a coherent architecture.
In most cases, you should have a business layer to handle this kind of logic. Then that business layer would have functions like CalculateDiscountForProduct() or CalculateNetPrice() that would go out to the database and retrieve the data you need to complete the business rule.
The business class would talk to a data layer that returns data objects. Your view only needs its view model, which you populate from the business objects returned by your business layer.
A typical method might be:
public ActionResult Cart() {
    var model = _cartService.GetCurrentCart(userid);
    return View(model);
}
So when you apply a discount or coupon, you would call a method like _cartService.ApplyDiscount(model.DiscountCode); then return the new model back to the view.
You might do well to study the Mvc Music Store sample project, as it includes cart functionality and promo codes.
http://www.asp.net/mvc/tutorials/mvc-music-store/mvc-music-store-part-1
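As a rough illustration of moving the calculation out of the entity, the sketch below computes a discounted subtotal in a service-style class that receives the amounts it needs from its caller. It is written in Java, but the same shape carries over to C#. CartCalculator and its parameters are hypothetical, and clamping the result at zero is an assumption rather than something the question specifies.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical business-layer helper; the caller loads the amounts from the data layer.
class CartCalculator {
    // Sums a list of monetary amounts; an empty list sums to zero.
    static BigDecimal sum(List<BigDecimal> amounts) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal amount : amounts) {
            total = total.add(amount);
        }
        return total;
    }

    // Subtotal minus all promotion and coupon amounts, never below zero (assumption).
    static BigDecimal discountedSubtotal(List<BigDecimal> itemSubtotals,
                                         List<BigDecimal> promotionAmounts,
                                         List<BigDecimal> couponAmounts) {
        BigDecimal result = sum(itemSubtotals)
                .subtract(sum(promotionAmounts))
                .subtract(sum(couponAmounts));
        return result.max(BigDecimal.ZERO);
    }
}
```

The entity stays free of any database context this way, and the view only ever sees the computed value on its view model.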

ASP.NET MVC help needed

Can someone explain the following code to me?
public class StoreEditorViewModel
{
    public List<Ticket> TotalView { get; set; }
    public StoreEditorViewModel()
    {
        using (MvcTicketsEntities storeDB = new MvcTicketsEntities())
        {
            var temp = storeDB.Tickets.Include(x => x.Genres).Include(x => x.Artists).ToList();
            TotalView = temp.ToList();
        }
    }
}
I don't understand the Include(x => x.Genres). Genres is another table in my database (I use Entity Framework).

The Include tells EF to fetch the Genres records as part of this SQL request, rather than making you query twice (once for the Tickets and again for the Tickets' Genres).
To quote Jon Galloway in the MVC Music Store example (your code looks very similar)
"We’ll take advantage of an Entity Framework feature that allows us to indicate other related entities we want loaded as well when the Genre object is retrieved. This feature is called Query Result Shaping, and enables us to reduce the number of times we need to access the database to retrieve all of the information we need. We want to pre-fetch the Albums for Genre we retrieve, so we’ll update our query to include from Genres.Include(“Albums”) to indicate that we want related albums as well. This is more efficient, since it will retrieve both our Genre and Album data in a single database request."

What is the best way to store a user's filtered query params in a database table?

I have an ASP.NET MVC website. In my backend I have a table called People with the following columns:
ID
Name
Age
Location
... (a number of other cols)
I have a generic web page that uses model binding to query this data. Here is my controller action:
public ActionResult GetData(FilterParams filterParams)
{
    return View(_dataAccess.Retrieve(filterParams.Name, filterParams.Age, filterParams.location, . . .));
}
which maps onto something like this:
http://www.mysite.com/MyController/GetData?Name=Bill .. .
The dataAccess layer simply checks each parameter to see if its populated to add to the db where clause. This works great.
I now want to be able to store a user's filtered queries, and I am trying to figure out the best way to store a specific filter. As some of the filters have only one param in the query string while others have 10+ fields, I can't figure out the most elegant way to store this query "filter info" in my database.
Options I can think of are:
1) Have a complete replica of the table (with some extra cols), call it PeopleFilterQueries, and populate each record with a FilterName and the value of the filter in each field (Name, etc.).
2) Store a table with just FilterName and a string where I store the actual query string, Name=Bill&Location=NewYork. This way I won't have to keep adding new columns if the filters change or grow.
What is the best practice for this situation?

If the purpose is to save a list of recently used filters, I would serialise the complete FilterParams object into an XML field/column after the model binding has occurred. By saving it into an XML field, you also give yourself the flexibility to use XQuery and DML should the need arise at a later date for more performance-focused querying of the information.
public ActionResult GetData(FilterParams filterParams)
{
    // Perform action to get the information from your data access layer here
    var someData = _dataAccess.Retrieve(filterParams.Name, filterParams.Age, filterParams.location, . . .);
    // Save the search that was used to retrieve later here
    _dataAccess.SaveFilter(filterParams);
    return View(someData);
}
And then in your DataAccess Class you'll want to have two Methods, one for saving and one for retrieving the filters:
public void SaveFilter(FilterParams filterParams){
    var ser = new System.Xml.Serialization.XmlSerializer(typeof(FilterParams));
    using (var stream = new StringWriter())
    {
        // serialise to the stream
        ser.Serialize(stream, filterParams);
        // Add new database entry here, with a serialised string created from the FilterParams obj
        // (done inside the using block, while the writer is still in scope)
        someDBClass.SaveFilterToDB(stream.ToString());
    }
}
Then when you want to retrieve a saved filter, perhaps by Id:
public FilterParams GetFilter(int filterId){
    // Get the XML blob from your database as a string
    string filter = someDBClass.GetFilterAsString(filterId);
    var ser = new System.Xml.Serialization.XmlSerializer(typeof(FilterParams));
    using (var sr = new StringReader(filter))
    {
        return (FilterParams)ser.Deserialize(sr);
    }
}
Remember that your FilterParams class must have a default (i.e. parameterless) constructor, and you can use the [XmlIgnore] attribute to prevent properties from being serialised into the database should you wish.
public class FilterParams{
    public string Name {get;set;}
    public string Age {get;set;}
    [XmlIgnore]
    public string PropertyYouDontWantToSerialise {get;set;}
}
Note: SaveFilter returns void and there is no error handling, for brevity.

Rather than storing the querystring, I would serialize the FilterParams object as JSON/XML and store the result in your database.
Here's a JSON Serializer I regularly use:
using System.IO;
using System.Runtime.Serialization.Json;
using System.Text;
namespace Fabrik.Abstractions.Serialization
{
    public class JsonSerializer : ISerializer<string>
    {
        public string Serialize<TObject>(TObject @object) {
            var dc = new DataContractJsonSerializer(typeof(TObject));
            using (var ms = new MemoryStream())
            {
                dc.WriteObject(ms, @object);
                return Encoding.UTF8.GetString(ms.ToArray());
            }
        }
        public TObject Deserialize<TObject>(string serialized) {
            var dc = new DataContractJsonSerializer(typeof(TObject));
            using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(serialized)))
            {
                return (TObject)dc.ReadObject(ms);
            }
        }
    }
}
You can then deserialize the object and pass it your data access code as per your example above.

You didn't mention the exact purpose of storing the filter.
If you insist on saving the filter in a database table, I would use the following table structure.
FilterId
Field
FieldValue
An example table might be
FilterId  Field     FieldValue
1         Name      Tom
1         Age       24
1         Location  IL
3         Name      Mike
...
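One way to read and write such a table is to flatten each filter into one row per populated field, and to regroup the rows by FilterId when loading. A sketch of that mapping in Java follows. FilterRow and FilterRows are hypothetical names, and the database access itself is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory image of one table row (FilterId, Field, FieldValue).
class FilterRow {
    final int filterId;
    final String field;
    final String fieldValue;

    FilterRow(int filterId, String field, String fieldValue) {
        this.filterId = filterId;
        this.field = field;
        this.fieldValue = fieldValue;
    }
}

class FilterRows {
    // One row per populated field; null or empty values are skipped.
    static List<FilterRow> toRows(int filterId, Map<String, String> filter) {
        List<FilterRow> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : filter.entrySet()) {
            if (e.getValue() != null && !e.getValue().isEmpty()) {
                rows.add(new FilterRow(filterId, e.getKey(), e.getValue()));
            }
        }
        return rows;
    }

    // Rebuilds the field/value map for one filter from the stored rows.
    static Map<String, String> fromRows(int filterId, List<FilterRow> rows) {
        Map<String, String> filter = new LinkedHashMap<>();
        for (FilterRow row : rows) {
            if (row.filterId == filterId) {
                filter.put(row.field, row.fieldValue);
            }
        }
        return filter;
    }
}
```

New filter fields then need no schema change, at the cost of storing every value as a string.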

The answer is much simpler than you are making it:
Essentially you should store the raw query in its own table and relate it to your People table. Don't bother storing individual filter options.
Decide on a value to store (2 options)
Store the URL Query String
This would be beneficial if you like open API-style apps and want something you can pass back and forth between the client and the server and re-use without transformation.
Serialize the Filter object as a string
This is a really nice approach if your purpose for storing these filters remains entirely server side, and you would like to keep the data closer to a class object.
Relate your People table to your Query Filters Table:
The best strategy here depends on what your intention and performance needs are. Some suggestions below:
Simple filtering (ex. 2-3 filters, 3-4 options each)
Use Many-To-Many because the number of combinations suggests that the same filter combos will be used lots of times by lots of people.
Complex filtering
Use One-To-Many, as there are so many possible individual queries that they are less likely to be reused often enough to make the extra normalization and performance hit worth your while.
There are certainly other options but they would depend on more detailed nuances of your application. The suggestions above would work nicely if you are say, trying to keep track of "recent queries" for a user, or "user favorite" filtering options...
Personal opinion
Without knowing much more about your app, I would say (1) store the query string, and (2) use OTM related tables... if and when your app shows a need for further performance profiling or issues with refactoring filter params, then come back... but chances are, it won't.
GL.
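Storing the query string mostly comes down to a round trip between the filter's fields and one encoded string. Here is a minimal sketch of that round trip in Java, using only the standard library. QueryStringFilter is a hypothetical name, and a flat filter with at most one value per field is assumed.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringFilter {
    // Builds "Name=Bill&Location=New+York" style text from the filter fields.
    static String toQueryString(Map<String, String> filter) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : filter.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    // Parses the stored text back into field/value pairs, keeping their order.
    static Map<String, String> fromQueryString(String query) {
        Map<String, String> filter = new LinkedHashMap<>();
        if (query.isEmpty()) {
            return filter;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            filter.put(decode(key), decode(value));
        }
        return filter;
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }
}
```

The stored string can be appended to the GetData URL as-is, which is the main attraction of this option.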

In my opinion, the best way to save the "filter" is to have some kind of JSON text string containing each of the "column names".
So you will have something like this in the DB:
Table Filters
FilterId = 5 ; FilterParams = {'age' : '>18' , ...
JSON provides a lot of capabilities, like using an array for age to apply more than one filter to the same "column", etc.
Also, JSON is a kind of standard, so you can use these "filters" with another DB some day, or just display the filter or edit it in a web form. If you save the query, you will be tied to it.
Well, hope it helps!

Assuming that a nosql/object database such as Berkeley DB is out of the question, I would definitely go with option 1. Sooner or later you'll find the following requirements or others coming up:
Allow people to save their filters, label, tag, search and share them via bookmarks, tweets or whatever.
Change what a parameter means or what it does, which will require you to version your filters for backward compatibility.
Provide auto-complete functions over filters, possibly using a user's filter history to inform the auto-complete.
The above will be somewhat harder to satisfy if you do any kind of binary/string serialization where you'll need to parse the result and then process them.
If you can use a NoSql DB, then you'll get all the benefits of a sql store plus be able to model the 'arbitrary number of key/value pairs' very well.

Have you thought about using Profiles? This is a built-in mechanism for storing user-specific info. From your description of the problem, it seems a good fit.
Profiles In ASP.NET 2.0
I have to admit that Microsoft's implementation is a bit dated, but there is essentially nothing wrong with the approach. If you wanted to roll your own, there's quite a bit of good thinking in their API.
