How to structure data in order to use the recommendation engine in Mahout

For instance, I have a transaction table that tracks which user bought which item and in what quantity. My data only includes user, item and quantity. How can I use Mahout to recommend other items to the user?
Any recommendation method will do.
For example, the transaction table data:
User Item Quantity
user1 item1 20
user1 item2 50
user1 item3 0
user1 item4 10
user1 item5 0
user2 item1 50
user2 item2 1
user2 item3 100
user2 item4 77
user2 item5 40
user3 item1 150
user3 item2 0
user3 item3 5
user3 item4 10
user3 item5 40
How can I know which item I should recommend to user1?

I don't think using the quantity column is sensible in a recommendation system. At least, Mahout has no implementation for quantity data (implementations for ratings exist instead).
So you can remove the quantity column and the rows with 0 quantity, and then you will have a boolean preference dataset.
There are several implementation methods for boolean preference datasets, for example:
http://bigdatahandling.blogspot.co.uk/2014/01/recommendations-with-mahout-for-boolean.html
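As a minimal sketch of the preprocessing step described above (dropping zero-quantity rows and the quantity column), the transformation could look like this in Python. The function name `to_boolean_prefs` and the `user,item` CSV layout are illustrative assumptions, not part of Mahout.

```python
# Transaction rows from the question: (user, item, quantity).
transactions = [
    ("user1", "item1", 20), ("user1", "item2", 50), ("user1", "item3", 0),
    ("user1", "item4", 10), ("user1", "item5", 0),
    ("user2", "item1", 50), ("user2", "item2", 1), ("user2", "item3", 100),
    ("user2", "item4", 77), ("user2", "item5", 40),
    ("user3", "item1", 150), ("user3", "item2", 0), ("user3", "item3", 5),
    ("user3", "item4", 10), ("user3", "item5", 40),
]


def to_boolean_prefs(rows):
    """Keep only (user, item) pairs with a positive quantity (hypothetical helper)."""
    return [(user, item) for user, item, quantity in rows if quantity > 0]


boolean_prefs = to_boolean_prefs(transactions)

# One "user,item" line per pair; this layout is an assumption for illustration.
csv_lines = [f"{user},{item}" for user, item in boolean_prefs]
print("\n".join(csv_lines))
```

Each remaining pair simply means "this user bought this item", which is the boolean preference form the answer refers to.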

Related

Does pyspark.ml.recommendation.ALS create a pivot table under the hood?

An ALS recommendation model performs a matrix factorization: it factorizes a users-by-items matrix into latent factors.
A matrix of 3 users and 3 items would look like this:
users item_1 item_2 item_3
user_1 NA 4 1
user_2 4 3 0
user_3 NA 1 NA
My dataframe starts out like this:
users items rating
user_1 item_2 4
user_1 item_3 1
user_2 item_1 4
user_2 item_2 3
user_2 item_3 0
user_3 item_2 1
My question is: before passing my dataframe to the ALS module, do I need to transform it so that, in the end, I have a structure such as:
users items rating
user_1 item_1 NA
user_1 item_2 4
user_1 item_3 1
user_2 item_1 4
user_2 item_2 3
user_2 item_3 0
user_3 item_1 NA
user_3 item_2 1
user_3 item_3 NA
Or will the ml.recommendation.ALS function create, under the hood, the observations for the places without interactions? Such as:
users items rating
user_1 item_1 NA
If it does not, one way to produce the expected table would be to pivot it and then unpivot it, but that would produce a very large users-by-items matrix. However, from the examples in the documentation, it seems that this process (pivot, then unpivot) is not necessary.
Correct, it is not necessary.
After you train the ALS model, the fitted model should be used to predict the "missing interactions".
Thus, the term "fill" (in your sentence "ml.recommendation.ALS module fill those missing interactions") is not appropriate; you should use the term "predict".
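To make the distinction concrete, here is a small pure-Python sketch (not pyspark code) that lists the user/item pairs absent from the long-format data. These are the pairs a fitted model would be asked to predict, for example through the fitted model's `transform` method in pyspark; no NA rows have to be added before training. The variable names are illustrative assumptions.

```python
from itertools import product

# Long-format ratings from the question: only observed interactions are listed.
ratings = [
    ("user_1", "item_2", 4), ("user_1", "item_3", 1),
    ("user_2", "item_1", 4), ("user_2", "item_2", 3), ("user_2", "item_3", 0),
    ("user_3", "item_2", 1),
]

observed = {(user, item) for user, item, _ in ratings}
users = sorted({user for user, _, _ in ratings})
items = sorted({item for _, item, _ in ratings})

# Pairs with no recorded interaction: candidates for prediction, not training rows.
missing = [pair for pair in product(users, items) if pair not in observed]
print(missing)
```

Note that the rating of 0 for user_2 and item_3 is an observed row, so it is not among the missing pairs.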

Multiple Item price depends on remaining quantities

When I'm entering the order invoice, I want to bring in the buying price of (Item1) and its remaining quantities from the (purchaseDetail) table.
(Item1) has multiple records in the (purchaseDetail) table, depending on its purchase invoices.
In my continuous subform ordersDetailSubform I have a combobox cboItemsName.
So, depending on the quantity I enter and the previously ordered quantity of Item1, I want the buying price list as the row source query of cboItemsName, containing only the prices that still have available quantity.
ItemId quantity buyPrice
1 5 3$
1 10 2$
Let's say I have previous orders of Item1 with a total quantity of (4).
If I now enter a new order in my orders form, then when I type (Item1), the cboItemsName row source query should show me the remaining quantities and their prices, like this:
remaining quantities buying Price
1 3 $
10 2 $
Can you help with the best ideas for doing that?
It is possible to show the remaining quantity of each item and its purchase price in the combo box in your ordersDetailSubform, but the price shown should be the last price at which you purchased the item.
For each item, one price can appear alongside the remaining quantity in the combo box list.
Thanks a lot!
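The remaining-quantity list from the question can be sketched as a first-in-first-out allocation. This is only an illustration in Python, not Access code, and it assumes that earlier orders consume the oldest purchase record first; the function name `remaining_lots` is hypothetical.

```python
def remaining_lots(lots, already_ordered):
    """Return (remaining_quantity, buy_price) for lots that still have stock.

    lots: list of (quantity, buy_price) tuples in purchase order.
    already_ordered: total quantity consumed by previous orders.
    Assumption: previous orders are taken from the oldest lot first (FIFO).
    """
    result = []
    to_consume = already_ordered
    for quantity, price in lots:
        used = min(quantity, to_consume)
        to_consume -= used
        left = quantity - used
        if left > 0:
            result.append((left, price))
    return result


# Purchase records for Item1 from the question: 5 units at 3$, then 10 units at 2$.
purchase_lots = [(5, 3), (10, 2)]
available = remaining_lots(purchase_lots, already_ordered=4)
print(available)
```

With 4 units already ordered, one unit remains at the first price and the second purchase is untouched, which matches the list the question asks for.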

sum all occurrences of range values in other range

There are two sheets.
Sheet1:
Surname Frst Item1 Item2 Item3 Item4
Bernard Shaw SectionA SectionB SectionD SectionF
Lincoln Abel SectionB SectionE
Shakesp Earl SectionA SectionE SectionH
Sheet2:
Group1 Group2 Group3
SectionA SectionD SectionG
SectionB SectionE SectionH
SectionC SectionF SectionI
I want to count all occurrences of Sheet 2 column 1, all occurrences of Sheet 2 column 2 (etc) in each row of Sheet 1. For example:
Surname Frst Grp1 Grp2 Grp3
Bernard Shaw 2 2 0
Lincoln Abel 1 1 0
Shakesp Earl 1 1 1
I have tried countifs and count and vlookups with limited success. Example:
=COUNTIFS(Sheet1!C1:Y1,#'Sheet2'!A:A)
Thanks for the help!
I'm not sure whether using # will work, as I get an error when testing your formula. However, there is a simple way to solve your problem using COUNTIF: count each criterion against the row separately and add up the results. Here is the formula:
=COUNTIF($C3:$F3,A$10)+COUNTIF($C3:$F3,A$11)+COUNTIF($C3:$F3,A$12)
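The counting logic behind the summed COUNTIF formula can be sketched outside the spreadsheet. The Python below is only an illustration using the sample data from the question; the helper name `count_by_group` is hypothetical.

```python
# Sheet2: each group is one column of section names.
groups = {
    "Grp1": {"SectionA", "SectionB", "SectionC"},
    "Grp2": {"SectionD", "SectionE", "SectionF"},
    "Grp3": {"SectionG", "SectionH", "SectionI"},
}

# Sheet1: one row per person with the sections listed in Item1..Item4.
people = [
    ("Bernard", "Shaw", ["SectionA", "SectionB", "SectionD", "SectionF"]),
    ("Lincoln", "Abel", ["SectionB", "SectionE"]),
    ("Shakesp", "Earl", ["SectionA", "SectionE", "SectionH"]),
]


def count_by_group(sections, groups):
    """Count how many of a row's sections fall into each group (hypothetical helper)."""
    return {
        name: sum(1 for section in sections if section in members)
        for name, members in groups.items()
    }


summary = {
    (surname, first): count_by_group(sections, groups)
    for surname, first, sections in people
}
for key, counts in summary.items():
    print(key, counts)
```

Each group count is the sum of one membership test per section in the row, which is what adding one COUNTIF per criterion achieves in the sheet.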

Summing values and averaging values while removing duplicates in Excel or Google Sheets

I have a Google Sheet that I use to track the different occasions on which I stock items for small retail reselling. Since the things I buy can have different prices on different occasions (discounts, cashback, etc.), I need to average the cost of each unique item while also summing the total stock I have, removing the duplicates.
A B C D E F G
ITEMS STOCKS PRICE PER PCS SUBTOTAL DISCOUNT TOTAL FINALPRICE/PCS
ITEM1 3 $2 $6 10% $5.4 $1.8
ITEM2 2 $3
ITEM4 2 $1.5
ITEM1 2 $1.8 $3.6 10% $3.24 $1.62
Right now, I can put the following formula in column J to remove the duplicates and sum the stocks of the duplicates:
={unique(A2:A5),ArrayFormula(sumif(A2:A5,unique(A2:A5),B2:B5))}
The result is as below:
J K
ITEMS STOCKS
ITEM1 5
ITEM2 2
ITEM4 2
But I would also like to average the price per pcs for unique items in column L. Any help is appreciated!
I tried
={unique(A2:A5),ArrayFormula(sumif(A2:A5,unique(A2:A5),B2:B5)), ArrayFormula(averageif(A2:A5,unique(A2:A5),G2:G5))}
This is the error code
Error
Function ARRAY_ROW parameter 3 has mismatched row size. Expected: 24. Actual: 1.
Use QUERY instead, like:
=QUERY(A2:G,
"select A,sum(B),avg(G)
where A is not null
group by A
label sum(B)'',avg(G)''", 0)
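What the grouped query computes can be sketched in Python with the sample rows from the question. This is an illustration only; it assumes blank FINALPRICE/PCS cells are skipped when averaging, and the function name `summarize` is hypothetical.

```python
from collections import defaultdict

# (item, stocks, final price per pcs); None marks a blank cell in column G.
rows = [
    ("ITEM1", 3, 1.8),
    ("ITEM2", 2, None),
    ("ITEM4", 2, None),
    ("ITEM1", 2, 1.62),
]


def summarize(rows):
    """Group by item: total the stocks and average the non-blank prices."""
    stocks = defaultdict(int)
    prices = defaultdict(list)
    for item, stock, final_price in rows:
        stocks[item] += stock
        if final_price is not None:
            prices[item].append(final_price)

    summary = {}
    for item, total in stocks.items():
        item_prices = prices[item]
        average = sum(item_prices) / len(item_prices) if item_prices else None
        summary[item] = (total, average)
    return summary


summary = summarize(rows)
for item, (total, average) in summary.items():
    print(item, total, average)
```

Grouping once and aggregating both columns in the same pass avoids the row-size mismatch that arises when separate array formulas are placed side by side.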

Update a column value on multiple records based on selected ids

In the products table I have fields like
id product_name product_value quantity status
1 abc 10000 50 received
2 efg 5000 15 shipment
3 hij 850 100 received
4 klm 7000 20 shipment
5 nop 350 50 received
I can select multiple rows at a time. Here I selected id=2,4 and need to change the status to 'received'. How do I update multiple records at once in Rails?
Try
Product.where(id: [2, 4]).update_all(status: 'received')
If you are looking for all products that have a status of 'shipment', you can use:
Product.where(status: 'shipment')
From here, you can set all to 'received', or iterate through them and select only the ones you want to make changes to.
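For context, `update_all` builds a single SQL UPDATE statement and skips model validations and callbacks. The Python sketch below uses an in-memory SQLite database to show the kind of statement involved; the table schema is an assumption based on the columns listed in the question, and this is not Rails code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, product_name TEXT, "
    "product_value INTEGER, quantity INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [
        (1, "abc", 10000, 50, "received"),
        (2, "efg", 5000, 15, "shipment"),
        (3, "hij", 850, 100, "received"),
        (4, "klm", 7000, 20, "shipment"),
        (5, "nop", 350, 50, "received"),
    ],
)

# One UPDATE covering every selected id, instead of one statement per row.
selected_ids = [2, 4]
placeholders = ", ".join("?" for _ in selected_ids)
cursor = conn.execute(
    f"UPDATE products SET status = ? WHERE id IN ({placeholders})",
    ["received", *selected_ids],
)
conn.commit()

updated_rows = cursor.rowcount
statuses = dict(conn.execute("SELECT id, status FROM products ORDER BY id"))
print(updated_rows, statuses)
```

Because the change happens in one statement, it suits bulk status changes where per-record callbacks are not needed.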
