Fake News Detection Using N-Gram Analysis and Machine Learning Algorithms

Asha John, Meenakowshalya A


Fake news is untrue information presented as news. Fake news spreads more easily than real news on social networking sites. Detection of fake news is an important emerging research area that is gaining popularity. The main challenge in fake news detection is the limited availability of resources (datasets). This project detects fake news using n-gram analysis and machine learning algorithms. Two feature extraction techniques, namely Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF), and six machine learning algorithms, namely Support Vector Machines (SVM), Linear Support Vector Machines (LSVM), K-Nearest Neighbor (KNN), Stochastic Gradient Descent (SGD), Decision Trees (DT), and Logistic Regression (LR), are evaluated and compared. The best performance is obtained with TF-IDF as the feature extraction technique and with Linear SVM and SGD as the machine learning algorithms, giving an accuracy of 93.5%. Performance tuning is then applied to SGD to increase its accuracy. Of Grid Search CV and Random Search CV, the latter gives the better accuracy of 94.2%.


News detection, feature extraction, N-gram analysis, SVM, LSVM, KNN, DT, SGD, LR, GSCV, RSCV
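
To make the two feature extraction techniques named in the abstract concrete, the following is a minimal sketch of TF and TF-IDF weighting over word n-grams in plain Python. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`word_ngrams`, `term_frequency`, `tf_idf`) and the sample headlines are hypothetical, tokenization is simple lowercasing and whitespace splitting, and the IDF term is the textbook form log(N / df), whereas library implementations often use smoothed variants.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of a text (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def term_frequency(doc_ngrams):
    """TF: count of each n-gram divided by the number of n-grams in the document."""
    counts = Counter(doc_ngrams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()} if total else {}


def tf_idf(corpus, n=1):
    """TF-IDF per document, using the unsmoothed IDF log(N / df) (an assumption)."""
    docs = [word_ngrams(text, n) for text in corpus]
    num_docs = len(docs)
    # Document frequency: in how many documents each n-gram occurs at least once.
    doc_freq = Counter(gram for doc in docs for gram in set(doc))
    weights = []
    for doc in docs:
        tf = term_frequency(doc)
        weights.append(
            {gram: freq * math.log(num_docs / doc_freq[gram]) for gram, freq in tf.items()}
        )
    return weights


if __name__ == "__main__":
    # Hypothetical headlines, used only to exercise the functions.
    sample = [
        "breaking news today",
        "fake news today",
        "breaking story",
    ]
    for doc_weights in tf_idf(sample, n=1):
        print(doc_weights)
```

With this unsmoothed IDF, an n-gram that occurs in every document receives a weight of zero, which is the sense in which TF-IDF discounts terms that TF alone would count as informative. The resulting per-document weight dictionaries would still need to be converted to fixed-length vectors before being passed to any of the classifiers listed above.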


