An Example of User-Based Collaborative Filtering with Mahout


Each line of the test data holds a user ID (uid), an item ID (itemid), a rating (rating), and the time of the rating (time):

3464,2502,3,973282547
3464,3160,2,973282494
3464,2505,3,967175070
3464,1703,2,967248043
3464,1704,5,967246680
3464,3163,1,967174266
3464,2369,4,973282339
3464,1569,4,967247436
3464,896,3,967247557
3464,3316,3,973282934
3464,2517,3,967174139
3464,3174,4,967174266
3464,3175,2,973282421
3464,3176,3,967174298
3464,1573,3,967247865
3464,3178,4,967247587
3464,105,3,967248019
3464,3325,4,973282547
3464,1721,3,967247042
3464,3327,4,973282892
3464,3185,3,967174298
3464,1727,4,967248268
3464,111,5,967174438
3464,3186,4,967242949
3464,1729,3,967247165
3464,1584,3,967247078
3464,2387,3,967247884
3464,2389,4,967175256
3464,1589,4,967248019
3464,1732,4,967247306
3464,2391,4,967246935
3464,2395,4,973282625
3464,2396,5,967246752
3464,1597,4,967174960
3464,2541,3,967247865
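One practical note on the data format: as far as I know, the ratings.dat file shipped with MovieLens 1M separates fields with "::", while Mahout's FileDataModel expects commas or tabs, so the file has to be converted into the comma-separated layout shown above before it can be loaded. The following is only a minimal sketch of such a conversion. The class name RatingsConverter and its methods are made up for illustration and are not part of Mahout, and both file paths are supplied by the caller.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Mahout): rewrites "::"-delimited rating lines as CSV. */
class RatingsConverter {

    /** Converts one line such as "3464::2502::3::973282547" to "3464,2502,3,973282547". */
    static String toCsvLine(String line) {
        String[] fields = line.trim().split("::");
        if (fields.length != 4) {
            throw new IllegalArgumentException(
                    "Expected uid::itemid::rating::time but got: " + line);
        }
        return String.join(",", fields);
    }

    /** Copies every non-blank line of the input file to the output file in CSV form. */
    static void convert(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines instead of failing on them
                }
                writer.write(toCsvLine(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java RatingsConverter <input> <output>");
            return;
        }
        // Both paths come from the command line; nothing is hard-coded here.
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

If the file is already comma-separated, as in the sample rows above, this step can be skipped.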

package userBased;

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

/**
 * User-based collaborative filtering with Mahout.
 */
public class UserBased {

    public static void main(String[] args) throws Exception {
        // Load the ratings file (uid,itemid,rating,time per line).
        DataModel model = new FileDataModel(new File("F:/ml-1m/ratings.dat"));

        /*
         * Similarity measures when the preference data includes ratings:
         *   Euclidean distance:  EuclideanDistanceSimilarity
         *   Pearson correlation: PearsonCorrelationSimilarity
         *   Cosine similarity:   UncenteredCosineSimilarity
         * Similarity measures when the preference data has no ratings:
         *   Manhattan distance:  CityBlockSimilarity
         *   Log-likelihood:      LogLikelihoodSimilarity
         */
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);

        /*
         * Neighboring users (UserNeighborhood).
         *
         * NearestNUserNeighborhood takes the N most similar users as neighbors.
         * Example: UserNeighborhood unb = new NearestNUserNeighborhood(10, us, dm);
         * The three arguments are the number of neighbors, the user similarity,
         * and the data model. The number of neighbors directly affects both the
         * quality of the recommendations and the cost of computing them.
         *
         * ThresholdUserNeighborhood takes every user whose similarity to the
         * target user reaches a given threshold as a neighbor.
         * Example: UserNeighborhood unb = new ThresholdUserNeighborhood(0.2, us, dm);
         * The three arguments are the similarity threshold (for example a value
         * between 0 and 1), the user similarity, and the data model.
         */
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model);

        // Build the recommender from the data model, the neighborhood, and the user similarity.
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

        // Recommend 2 items to user 100.
        List<RecommendedItem> recommendations = recommender.recommend(100, 2);
        for (RecommendedItem recommendation : recommendations) {
            // Print each recommended item.
            System.out.println(recommendation);
        }
    }
}
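To make the roles of the similarity and the neighborhood more concrete, here is a small plain-Java sketch of the idea behind a user-based recommender: compute a Pearson correlation over the items two users have both rated, then estimate a rating for an unseen item as a similarity-weighted average of the neighbors' ratings. This is a simplified illustration and not Mahout's implementation. The class PearsonSketch, its method names, and the choice to ignore neighbors with non-positive similarity are all assumptions made for this example.

```java
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical illustration of user-based CF; not Mahout's code. */
class PearsonSketch {

    /** Pearson correlation over the items both users rated; NaN when undefined. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<Long, Double> entry : a.entrySet()) {
            Double other = b.get(entry.getKey());
            if (other == null) {
                continue; // only co-rated items contribute
            }
            double x = entry.getValue();
            double y = other;
            n++;
            sumA += x;
            sumB += y;
            sumAA += x * x;
            sumBB += y * y;
            sumAB += x * y;
        }
        if (n < 2) {
            return Double.NaN; // too little overlap to correlate
        }
        double covariance = sumAB - sumA * sumB / n;
        double varianceA = sumAA - sumA * sumA / n;
        double varianceB = sumBB - sumB * sumB / n;
        double denominator = Math.sqrt(varianceA * varianceB);
        return denominator == 0.0 ? Double.NaN : covariance / denominator;
    }

    /**
     * Estimates the target user's rating for itemId as a similarity-weighted
     * average of neighbor ratings. Assumption of this sketch: neighbors with
     * undefined or non-positive similarity are ignored.
     */
    static double estimate(Map<Long, Double> target,
                           List<Map<Long, Double>> neighbors,
                           long itemId) {
        double weightedSum = 0;
        double totalWeight = 0;
        for (Map<Long, Double> neighbor : neighbors) {
            Double rating = neighbor.get(itemId);
            if (rating == null) {
                continue; // this neighbor has not rated the item
            }
            double sim = pearson(target, neighbor);
            if (Double.isNaN(sim) || sim <= 0) {
                continue;
            }
            weightedSum += sim * rating;
            totalWeight += sim;
        }
        return totalWeight == 0 ? Double.NaN : weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // Toy ratings keyed by item ID; the values are invented for the demo.
        Map<Long, Double> target = Map.of(1L, 1.0, 2L, 2.0, 3L, 3.0);
        Map<Long, Double> similarUser = Map.of(1L, 2.0, 2L, 4.0, 3L, 6.0, 4L, 5.0);
        Map<Long, Double> oppositeUser = Map.of(1L, 3.0, 2L, 2.0, 3L, 1.0, 4L, 1.0);

        System.out.println("similarity to similarUser:  " + pearson(target, similarUser));
        System.out.println("similarity to oppositeUser: " + pearson(target, oppositeUser));
        System.out.println("estimated rating for item 4: "
                + estimate(target, List.of(similarUser, oppositeUser), 4L));
    }
}
```

In the Mahout program the same three roles are played by PearsonCorrelationSimilarity, NearestNUserNeighborhood, and GenericUserBasedRecommender. The sketch needs Java 9 or later because of Map.of and List.of.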