Recommender systems apply knowledge discovery techniques to the problem of making personalized product recommendations. These systems are now achieving widespread success in e-commerce. The tremendous growth in the number of customers and products poses key challenges for collaborative filtering (CF) based recommender systems. These are: producing high-quality recommendations, performing many recommendations per second for millions of customers and products, achieving high coverage in the presence of data sparsity, and meeting the demands of widespread availability. New recommender system technologies are needed that can quickly produce high-quality recommendations, even for very large-scale problems. In this dissertation, we present our approaches to three such research challenges: sparsity, scalability, and distribution. For the first two challenges, we perform experiments on real-world data. Our experiments show improvements over the basic CF-based algorithm. For the third challenge, we provide a framework that can be extended to implement distributed recommender systems. We tackle the sparsity problem in two ways: by implementing a model for integrating content-based ratings into a CF system and by applying alternative algorithmic approaches to address sparsity. For the first approach, we apply semi-intelligent filtering agents that generate ratings by analyzing syntactic features of the item content. For the second approach, we apply singular value decomposition (SVD) based prediction algorithms and item-based CF algorithms. Our results suggest that both of these techniques are capable of addressing the sparsity issue. We address the scalability issue in two ways: by using dimensionality reduction for neighborhood formation and by using incremental model-based techniques. In the first approach, we use an SVD-based dimensionality reduction technique and a clustering-based technique.
In the second approach, we use an SVD-based incremental technique, an incremental item-based technique, and an association rule-based technique. Our results suggest that these methods have potential for improving the scalability of recommender systems. We address the distribution issue by analyzing different approaches to implementing distributed CF-based recommender systems. We present a taxonomy of recommender system applications based on the relative relevance of geographically proximate and distant users and items. We present three distribution frameworks and an evaluation of each framework-application pair.
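To make the item-based CF idea mentioned above concrete, the following is a minimal sketch and not the dissertation's implementation. It assumes cosine similarity between items and a similarity-weighted average for prediction; the toy `ratings` data and the function names `item_cosine` and `predict` are hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical toy data: ratings[user][item] = rating on a 1-5 scale.
ratings = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 5, "c": 2},
    "u3": {"a": 1, "b": 2, "c": 5},
    "u4": {"a": 5, "c": 1},  # u4 has not rated item "b"
}


def item_cosine(ratings, i, j):
    """Cosine similarity between items i and j over users who rated both."""
    common = [u for u, r in ratings.items() if i in r and j in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in common)
    norm_i = sqrt(sum(ratings[u][i] ** 2 for u in common))
    norm_j = sqrt(sum(ratings[u][j] ** 2 for u in common))
    return dot / (norm_i * norm_j)


def predict(ratings, user, target):
    """Similarity-weighted average of the user's ratings on other items.

    Returns None when no rated item has any similarity to the target.
    """
    num = 0.0
    den = 0.0
    for item, rating in ratings[user].items():
        if item == target:
            continue
        sim = item_cosine(ratings, item, target)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None


# Predict how u4 would rate the unseen item "b".
prediction = predict(ratings, "u4", "b")
print(prediction)
```

Because item-item similarities depend only on item columns, they can be precomputed offline in a sketch like this, which is one reason item-based approaches are often discussed in the context of scalability.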