Automated User Requirements Extraction from Reddit Using Machine Learning

Many social platforms host discussions among thousands of people on a wide range of topics, and some of these discussions are rich sources for gathering application requirements. Among these platforms, Reddit is a valuable source: it generates a tremendous amount of information that can be analyzed for useful outcomes, and its popularity is growing day by day. In this research, Reddit serves as the source for gathering requirements of selected popular apps. Because this step has not been carried out before, a dataset is generated and manually tagged to prepare training data. Preprocessing is applied to clean the dataset. Text-based features are then generated using TF-IDF, and the training dataset is fed to Support Vector Machine (SVM), Naive Bayes, and Random Forest (RF) classifiers, state-of-the-art machine learning algorithms that are widely used for classification because of their high accuracy and maturity. The resulting confusion matrices and accuracy scores will be useful in the software engineering domain. This work introduces a new social platform as a source of information for gathering user requirements.
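The pipeline described above (preprocessing, TF-IDF features, classification, then a confusion matrix and accuracy) can be illustrated with a minimal sketch. It is not the implementation used in this work: it uses only the Python standard library, it implements only a multinomial Naive Bayes classifier (SVM and RF are omitted for brevity), and the posts, the labels `requirement` and `other`, and all function names are hypothetical examples invented for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Simple preprocessing: lowercase and keep alphabetic tokens only.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()

def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(doc, idf):
    # Raw term count times IDF; terms unseen during fitting are ignored.
    counts = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}

class NaiveBayes:
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for v in vectors for t in v}
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.weight = {c: defaultdict(float) for c in self.classes}
        for v, y in zip(vectors, labels):
            for t, w in v.items():
                self.weight[y][t] += w
        self.total = {c: sum(self.weight[c].values()) for c in self.classes}
        return self

    def predict(self, v):
        def score(c):
            denom = self.total[c] + len(self.vocab)
            return self.prior[c] + sum(
                w * math.log((self.weight[c].get(t, 0.0) + 1) / denom)
                for t, w in v.items()
            )
        return max(self.classes, key=score)

def confusion_matrix(y_true, y_pred, classes):
    # Rows are true classes, columns are predicted classes.
    index = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        m[index[t]][index[p]] += 1
    return m

# Hypothetical, manually tagged posts (invented for this sketch).
train_posts = [
    ("Please add a dark mode option to the app", "requirement"),
    ("It would be great if the app supported offline playback", "requirement"),
    ("I wish there was a way to export my playlists", "requirement"),
    ("The app should add support for multiple accounts", "requirement"),
    ("Had a great weekend listening to music", "other"),
    ("Anyone else watching the game tonight", "other"),
    ("This meme made my day", "other"),
    ("What a great concert last night", "other"),
]
test_posts = [
    ("Please add an option to export my data", "requirement"),
    ("Watching a great game tonight", "other"),
]

train_docs = [tokenize(text) for text, _ in train_posts]
train_labels = [label for _, label in train_posts]
idf = fit_idf(train_docs)
model = NaiveBayes().fit([tfidf(d, idf) for d in train_docs], train_labels)

test_labels = [label for _, label in test_posts]
predictions = [model.predict(tfidf(tokenize(text), idf)) for text, _ in test_posts]
matrix = confusion_matrix(test_labels, predictions, model.classes)
accuracy = sum(t == p for t, p in zip(test_labels, predictions)) / len(test_labels)
print(matrix, accuracy)
```

In practice, a study of this kind would more likely rely on an established machine learning library for TF-IDF vectorization and for the SVM, Naive Bayes, and RF classifiers; the hand-written version here only makes each step of the pipeline visible.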