Mahout trust-aware collaborative filtering

I'm trying to develop a trust-aware collaborative filtering approach. I have two Epinions datasets: one recording who trusts whom, <ID_truster, ID_trusted>, and one with ratings, <ID_truster, ITEM, RATING>.
How can I make user-user based recommendations using only ratings from people I trust?
At the moment I make recommendations using only the second dataset, taking every user into consideration.
Thank you

The closest thing I can think of is to use a user-neighborhood-based approach, and only include trusted users in the neighborhood. You would need to write some extra code for that, to disqualify untrusted users, by returning a very negative similarity value for them. Look at the UserSimilarity interface.
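To make the idea concrete, here is a minimal sketch in plain Python rather than Mahout's Java API. It assumes the two datasets are already loaded into dictionaries; the names `ratings`, `trust`, `trust_aware_similarity` and `recommend` are hypothetical, and cosine similarity is only one possible choice of base similarity. In Mahout, the same disqualification logic would live in a custom UserSimilarity.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def trust_aware_similarity(user, other, ratings, trust):
    """Disqualify untrusted users by giving them the lowest possible value."""
    if other not in trust.get(user, set()):
        return float("-inf")
    return cosine(ratings[user], ratings[other])

def recommend(user, ratings, trust, n_neighbors=2, how_many=3):
    """User-based recommendations that only ever see trusted neighbours."""
    sims = {o: trust_aware_similarity(user, o, ratings, trust)
            for o in ratings if o != user}
    # Keep the N most similar users; untrusted users never pass the > 0 check.
    neighbors = sorted((o for o in sims if sims[o] > 0),
                       key=lambda o: sims[o], reverse=True)[:n_neighbors]
    scores, weights = {}, {}
    for o in neighbors:
        for item, rating in ratings[o].items():
            if item in ratings[user]:
                continue  # skip items the user has already rated
            scores[item] = scores.get(item, 0.0) + sims[o] * rating
            weights[item] = weights.get(item, 0.0) + sims[o]
    # Similarity-weighted average rating per candidate item.
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [(item, score) for score, item in ranked[:how_many]]

# Toy data (made up): user 1 trusts only user 2.
ratings = {
    1: {"a": 5, "b": 3},
    2: {"a": 5, "b": 3, "c": 4},
    3: {"a": 5, "b": 3, "d": 5},
}
trust = {1: {2}}
print(recommend(1, ratings, trust))
```

User 3 rates items just like user 1 but is not trusted, so item "d" is never considered.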

Related

Time Sensitive Collaborative Filtering

I am trying to use collaborative filtering to recommend items to users based on their past purchases. I have created a user vector representing each user's usage, and an item vector for each item A whose values are the probability of B given A. The objective is to capture, at least roughly, which items are sold together. Now I need to find the time at which these recommendations should be presented. The items I am recommending are used periodically, so timing is very important.
So I am exploring constraint-based recommendations to make my recommendations time sensitive. The approach I have in mind is to create a time-sensitive constraint based on the last date of purchase and the average consumption rate. The problem is that creating constraints at the user level will become computationally expensive.
I would like your opinion on this approach, or a suggestion for a better way to implement it. I want to develop a recommendation engine that uses customers' usage data for items that are consumed and need to be purchased again. The output should be a list of recommendations together with the time at which each one should be presented to the user.
Thanks
The way I see it, there are two basic options here that you can pursue. On the one hand, the temporal features can be incorporated as additional information and converted into a kind of hybrid recommendation. The Python package "lightfm" is a good example.
On the other hand, the problem can also be modeled as a time series problem. A well-known paper dealing with next-basket recommendation is "A Dynamic Recurrent Model for Next Basket Recommendation". Here too, there are already implementations on GitHub.
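As a simple baseline before either of those options, the timing rule described in the question (last purchase date plus average consumption interval) can be sketched as below. This is an illustration only: the function names, the `lead_days` parameter and the sample history are hypothetical, and the average gap between purchases stands in for the consumption rate.

```python
from datetime import date, timedelta

def next_purchase_date(purchase_dates):
    """Estimate when an item will be needed again.

    Assumption: the mean gap between past purchases approximates the
    consumption rate. Returns None with fewer than two purchases.
    """
    dates = sorted(purchase_dates)
    if len(dates) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    avg_gap = sum(gaps) / len(gaps)
    return dates[-1] + timedelta(days=round(avg_gap))

def due_recommendations(history, today, lead_days=3):
    """Items whose estimated repurchase date is within lead_days of today."""
    due = []
    for item, dates in history.items():
        when = next_purchase_date(dates)
        if when is not None and when - timedelta(days=lead_days) <= today:
            due.append((item, when))
    return sorted(due, key=lambda pair: pair[1])

# Made-up purchase history for one user.
history = {
    "coffee": [date(2024, 1, 1), date(2024, 1, 15), date(2024, 1, 29)],
    "filter": [date(2024, 1, 1), date(2024, 3, 1)],
}
print(due_recommendations(history, today=date(2024, 2, 10)))
```

Each estimate needs only one pass over a user's purchases of an item, so it can be computed per user and then used to decide when to show the items a recommender has already selected.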

Recommender System: Is it content-based filtering?

Can someone please help me clarify.
I am currently using collaborative filtering (ALS) which returns a recommendation list with scores corresponding to the recommended items. In addition to this, I am boosting the scores (+0.1) if the items contain a tag that corresponds with what the user has specified they prefer such as "romantic movies". To me, this is considered a hybrid collaborative approach since it's boosting the Collaborative filtering results with content-based filtering (Please correct me if I am wrong).
Now, what if I took the same approach without collaborative filtering? Would it be considered content-based filtering, since I would still be recommending dishes based on the content and attributes of each dish that correspond to what the user has specified they like (such as "romantic movies")?
The reason why I'm confused is because I've seen content-based filtering where they apply an algorithm such as Naive Bayes etc, and this approach would be similar to a simple search of the items (on the contents).
Not sure you can do what you suggest because you have no score to boost without CF.
You are indeed using a hybrid, much the same as the Universal Recommender. To do purely content-based recommendations you have to implement two methods:
Personalized recommendations: here you have to look at the content of items the user preferred and find items that have similar content. This can be done by using something like the Mahout spark-rowsimilarity job to create a model of item: list-of-similar-items then indexing the results with a search engine and using the user's preferred item ids as the query. This is being added to the Universal Recommender.
"People who liked this also liked these": these are items similar to one being viewed, for example, and are the same for all users. They are not personalized and so are useful even for anonymous users with no history. This can be done with the same indexed ids as above but using the items similar to the one being viewed as the query. One might think to use only the similar items themselves but by using them as a query you can put the categorical boost in the search engine query and have boosted items returned. This already works in the Universal Recommender but the similar items are not in the model yet.
That said mixing content with collaborative-filtering will almost surely give better results since CF works better when the data is available. The only time to rely on content-based recommendations is when your catalog is of one-off items, which never get enough CF interactions or you have rich content, which has a short lifetime like breaking news.
BTW anyone who wants to help add the pure content-based part to the Universal Recommender can contact the new maintainers of it at ActionML.com
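For reference, the boost described in the question can be sketched in a few lines. This is an illustration under assumptions, not how the Universal Recommender implements boosts: `cf_scores` stands in for the ALS output, and `item_tags`, `preferred_tags` and `boost_by_tags` are hypothetical names.

```python
def boost_by_tags(cf_scores, item_tags, preferred_tags, boost=0.1):
    """Add a fixed boost to every CF score whose item carries a preferred tag."""
    boosted = {}
    for item, score in cf_scores.items():
        matches = item_tags.get(item, set()) & preferred_tags
        boosted[item] = score + (boost if matches else 0.0)
    # Highest boosted score first.
    return sorted(boosted.items(), key=lambda pair: pair[1], reverse=True)

# Made-up ALS scores and tags.
cf_scores = {"m1": 0.80, "m2": 0.75, "m3": 0.60}
item_tags = {"m1": {"action"}, "m2": {"romantic"}, "m3": {"romantic", "drama"}}
print(boost_by_tags(cf_scores, item_tags, preferred_tags={"romantic"}))
```

If `cf_scores` were all zero, the same function would reduce to a tag match with no ordering among the matching items, which is the sense in which there is no score to boost without CF.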

Mahout Recommender - questions to setup user preference

I'm looking for some advice / guidance --
I'm working on a recommendation engine / personal assistant app, using Mahout as the framework -
What I want to do is for new users of the app to begin by answering 5 questions and use the answers from the questions to affect the recommendation -- pretty much feeding the answers as a user-preference
I'm just not sure how to incorporate this into my code, I'm not even sure where to begin looking - I've been Googling but none of the search results really address this...
Any suggestions / advice / guidance will be greatly appreciated
Thanks
I did just that with the new Spark Itemsimilarity implementation about a year ago. You'll need a search engine for the recommendations query because Mahout doesn't have a server. I'd suggest using the new "Universal Recommender" engine template with PredictionIO. It uses Mahout to calculate the model and Elasticsearch to serve it.
https://templates.prediction.io/PredictionIO/template-scala-parallel-universal-recommendation
PredictionIO is a framework of integrated components that provides an event server (for event storage), integration with Hadoop/HDFS, Spark, and HBase, and a REST or SDK API. All you do is install it and get the template as a plugin engine. This will provide pretty advanced recommendation queries with multiple event ingestion, a hybrid content-based method to tune results, and several methods of using popular items for backfill when no other recommendations can be made. It also uses realtime user actions for recommendations.
This last bit is super important if you want to have your users go through some training. This way they will see the benefit of training in realtime. Check this site, where I did exactly what you are talking about: https://guide.finderbots.com Notice the "Trainer". It presents you with movies and asks for thumbs up or down for as many as you care to do, then when you ask for recommendations they will be based on the realtime preferences of the user. You need to create an account first so we have a user-id.
The way I created the list for the trainer is by clustering popular items. By clustering I mean grouping items based on the users that preferred them. Clustering produces items that are differentiated because they belong to different clusters, which means different user-sets tended to like them, and the popular ones are more likely to be known by users when they go through training. These are good things to have in a trainer.

Apache Mahout: can we combine User-Item and Item-Item?

I am new to Mahout, and am still playing around with it.
My question is, is it appropriate to combine Item-Item and User-Item?
My use case is a social networking application that recommends something to the current user based on that user's historical data (with higher priority), combines this with recommendations derived from the historical data of the user's friends (with lower priority), and displays the result as a list ordered by rating.
The reason is that a new user, for example, might not have much historical data in the system, so we can recommend something based on his friends' historical data. Once the user accumulates enough historical data, the recommendations should rely more on that.
Is it appropriate to design the system in this way?
Thank you for your time,
George
This is fairly simple to write. You can create recommendations for the user, and then combine them with recommendations for the other users. A dumb version of this logic would be to merge the lists of recommendations by adding the scores for items that appear in more than one list. Maybe you add N friends' recs together, and then add N times the user's own recs, so the user's own history carries as much weight as all the friends combined. You then take recommendations from the top of this merged list.
This doesn't exist in the project per se but it's quite easy to write a method to do this on the List<RecommendedItem> that comes back from recommend().
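A minimal sketch of that merge in plain Python, with recommendations represented as (item, score) pairs instead of Mahout's RecommendedItem objects. The function name `merge_recommendations` and the sample data are hypothetical.

```python
from collections import defaultdict

def merge_recommendations(own_recs, friends_recs, how_many=10):
    """Sum scores across lists, weighting the user's own recs by N friends."""
    n = max(len(friends_recs), 1)  # with no friends, keep the user's own scores
    totals = defaultdict(float)
    for item, score in own_recs:
        totals[item] += n * score
    for recs in friends_recs:
        for item, score in recs:
            totals[item] += score
    # Highest combined score first.
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)[:how_many]

# Made-up scores: the user's own list plus two friends' lists.
own = [("x", 1.0), ("y", 0.4)]
friends = [[("y", 1.0), ("z", 0.9)], [("z", 0.8)]]
print(merge_recommendations(own, friends))
```

A new user with an empty `own` list simply gets the friends' combined ranking, and the user's own scores start to dominate as they appear.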

collaborative filtering approach for tips/recommendations related to registered courses

I am looking at a specific problem where I need to build a recommender.
The generalized problem is as follows,
Each user has registered for (say) x courses (c1, c2, c3, .. cx)
Depending on each course, I need to provide (say) top 5 tips/recommendations to the user (e.g. study materials that could be useful etc)
I need collaborative elements to be applied to learn what recommendations are proving helpful to users.
I looked at the recommendation engines like Apache Mahout Taste, but I am unable to model my problem in a way that it looks like the examples shown. (The extra filtering criteria where a user is associated with one or more courses and each recommendation/tip could be associated with one or more courses is throwing me off.)
Please let me know if there is any good way of modeling such a problem? Any pointers to documentation/examples would be very appreciated.
I am just starting my research in this area so please bear with me if I have misunderstood any concepts.
Thanks,
Vivek
This may be too simple to need a recommender. If each course has a set of associated materials, then it seems clear that taking course c1 means they should have the associated materials for the course. Maybe rank from among all materials by popularity. That might be very easy and accomplish most of what you need.
If you want to model this as CF, you can; I don't know how much data you have. If you have just a handful of users and courses it will be too sparse to give useful answers.
Your users have relations to two things: courses and materials. You don't want to recommend courses, but rather materials. I would build two data models: one with user-course info, and one with user-material purchase info. Use the user-course data as the basis of a UserSimilarity implementation that defines user-user similarity. Then piece that together with a NearestNUserNeighborhood and a GenericUserBasedRecommender, but build the recommender on the other data model, the user-material one.
You will be using user-user similarity based on courses to make recommendations from among materials.
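The two-model idea can be sketched in plain Python as follows. This is not Mahout code: the names `user_courses`, `user_materials` and `recommend_materials` are hypothetical, and Jaccard overlap of course sets is just one assumed choice for the course-based user similarity.

```python
def jaccard(a, b):
    """Overlap of two sets of courses, between 0 and 1."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_materials(user, user_courses, user_materials,
                        n_neighbors=2, how_many=5):
    """Find neighbours by shared courses, then rank their materials."""
    sims = {other: jaccard(user_courses[user], user_courses[other])
            for other in user_courses if other != user}
    # Nearest-N neighbourhood defined on the user-course model.
    neighbors = sorted((o for o in sims if sims[o] > 0),
                       key=lambda o: sims[o], reverse=True)[:n_neighbors]
    own = user_materials.get(user, set())
    scores = {}
    # Candidates and scores come from the user-material model.
    for other in neighbors:
        for material in user_materials.get(other, set()) - own:
            scores[material] = scores.get(material, 0.0) + sims[other]
    return sorted(scores, key=lambda m: scores[m], reverse=True)[:how_many]

# Made-up data: who takes which courses, who used which materials.
user_courses = {
    "u1": {"c1", "c2"}, "u2": {"c1", "c2"},
    "u3": {"c2", "c3"}, "u4": {"c9"},
}
user_materials = {
    "u1": {"m1"}, "u2": {"m1", "m2"}, "u3": {"m3"}, "u4": {"m4"},
}
print(recommend_materials("u1", user_courses, user_materials))
```

User u4 shares no course with u1, so material m4 is never suggested, even though u4 appears in the material data.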

Resources