AI News, Text Classification using machine learning

Text classification is one of the important tasks that can be done using machine learning algorithms. In this blog post I am going to share how I started with a baseline model, then tried different models to improve the accuracy, and finally settled on the best model.

Micro F1 score: In the micro-average method, you sum up the individual true positives, false positives, and false negatives of the system across the different sets (the classes, in this setting) and then apply them to compute the statistic.
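As a minimal sketch of the micro-average described above, the snippet below pools per-class counts before computing precision, recall, and F1. The helper name `micro_f1` and the per-class counts are hypothetical values chosen for illustration; they are not taken from the post's data.

```python
def micro_f1(counts):
    """Micro-averaged F1 from a mapping: class label -> (tp, fp, fn)."""
    # Pool the counts over all classes first ...
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0  # no true positives at all: define the score as 0
    # ... then compute a single precision, recall, and F1 from the totals.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-class (tp, fp, fn) counts for a three-class problem.
counts = {
    "sports": (8, 2, 1),
    "politics": (5, 1, 3),
    "tech": (2, 3, 2),
}
score = micro_f1(counts)
print(f"micro F1 = {score:.3f}")
```

In a single-label multi-class problem every misclassification is a false positive for one class and a false negative for another, so the pooled precision and recall coincide and micro F1 equals plain accuracy.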

Let’s start by importing all the required libraries. Here I am using some sample data; you can also use data from the sklearn library. Let’s split the data for training and testing purposes.
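The original code for these steps is not available here, so the following is only a sketch of what the import and split could look like with scikit-learn. The sample texts, the labels, and the 75/25 split ratio are assumptions made for illustration; a real dataset such as the one returned by `sklearn.datasets.fetch_20newsgroups` could be substituted.

```python
from sklearn.model_selection import train_test_split

# Hypothetical sample data: eight short documents with two labels.
texts = [
    "the team won the match", "a late goal decided the game",
    "the striker scored twice", "the coach praised the defence",
    "the new phone has a faster chip", "the laptop ships with more memory",
    "the update fixes a security bug", "the app now supports dark mode",
]
labels = ["sports"] * 4 + ["tech"] * 4

# Hold out 25% of the documents for testing; stratify keeps the
# class proportions the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)
print(len(X_train), "training documents,", len(X_test), "test documents")
```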

The score comparison below shows that micro and weighted F1 scores are generally similar for the same classifier, unlike macro F1 score.
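One way to see this effect is a small, imbalanced toy example scored with scikit-learn's `f1_score` under the three averaging modes. The labels and predictions below are hypothetical and are not the post's results; they only illustrate why macro F1 can drift away from the other two.

```python
from sklearn.metrics import f1_score

# Hypothetical imbalanced ground truth: 6 sports, 2 politics, 2 tech.
y_true = ["sports"] * 6 + ["politics"] * 2 + ["tech"] * 2
# Hypothetical predictions: all sports correct, one politics document
# mistaken for sports, one tech document mistaken for politics.
y_pred = ["sports"] * 6 + ["politics", "sports"] + ["tech", "politics"]

micro = f1_score(y_true, y_pred, average="micro")
macro = f1_score(y_true, y_pred, average="macro")
weighted = f1_score(y_true, y_pred, average="weighted")

print(f"micro    F1 = {micro:.3f}")
print(f"weighted F1 = {weighted:.3f}")
print(f"macro    F1 = {macro:.3f}")
```

Macro F1 gives every class the same weight, so the weaker scores on the two small classes pull it down, while micro and weighted F1 are dominated by the large, well-classified class.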

F1 score

In statistical analysis of binary classification, the F1 score (also F-score or F-measure) is a measure of a test's accuracy.

It considers both the precision p and the recall r of the test to compute the score: p is the number of correct positive results divided by the number of all positive results returned by the classifier, and r is the number of correct positive results divided by the number of all relevant samples (all samples that should have been identified as positive).

The F1 score is the harmonic average of the precision and recall, where an F1 score reaches its best value at 1 (perfect precision and recall) and worst at 0.
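Written out, the definitions above take the following form. The symbols TP, FP, and FN (true positives, false positives, false negatives) are notation introduced here for illustration; the text itself only names p and r.

```latex
p = \frac{TP}{TP + FP}, \qquad
r = \frac{TP}{TP + FN}, \qquad
F_1 = 2 \cdot \frac{p \cdot r}{p + r}
```

For example, with p = 0.8 and r = 0.5 the score is F1 = 2 · 0.4 / 1.3 ≈ 0.615, which is lower than the arithmetic mean of 0.65 because the harmonic mean penalises the gap between the two values.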

Earlier works focused primarily on the F1 score, but with the proliferation of large-scale search engines, performance goals changed to place more emphasis on either precision or recall.[5]

Note, however, that the F-measures do not take the true negatives into account, and that measures such as the Matthews correlation coefficient, Informedness or Cohen's kappa may be preferable to assess the performance of a binary classifier.[3]
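To illustrate the point about true negatives, the sketch below compares F1 with the Matthews correlation coefficient on two hypothetical confusion matrices that differ only in the number of true negatives. The helper names and the counts are assumptions made for this example.

```python
import math


def f1_from_counts(tp, fp, fn):
    """F1 computed directly from counts; true negatives do not appear."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc_from_counts(tp, fp, fn, tn):
    """Matthews correlation coefficient; uses all four cells."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Two hypothetical classifiers with identical TP, FP, FN but different TN.
f1_a = f1_from_counts(tp=90, fp=5, fn=5)
f1_b = f1_from_counts(tp=90, fp=5, fn=5)
mcc_a = mcc_from_counts(tp=90, fp=5, fn=5, tn=0)
mcc_b = mcc_from_counts(tp=90, fp=5, fn=5, tn=900)

print(f"F1:  {f1_a:.3f} vs {f1_b:.3f}")
print(f"MCC: {mcc_a:.3f} vs {mcc_b:.3f}")
```

Both cases have the same F1 of about 0.947, yet the first never identifies a negative correctly and gets a slightly negative MCC, while the second gets an MCC of about 0.94.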

The F-score has been widely used in the natural language processing literature, such as the evaluation of named entity recognition and word segmentation.
