How do I create a bag of words for text classification?

I have 500 lines of text labelled positive and 500 lines labelled negative, stored in two different files, and I have to build a naive Bayes classifier for them.

My proposed course of action is as follows:

1. Divide both files into 5 parts each (for 5-fold cross-validation).
2. Take 1 part of both the negative and the positive texts, and extract a bag of words from each.
3. Using naive Bayes, compute the word probabilities.
4. Get a threshold for deciding between positive and negative.
5. Run it over the remaining 4 parts and check the labelling error.
6. Change which part is taken initially and repeat.
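To make steps 2 to 4 concrete, here is a minimal standard-library sketch of one way they could look. The function names (`tokenize`, `bag_of_words`, `train`, `log_odds`, `classify`), the regex tokenizer, the add-one smoothing, and the toy sentences are all assumptions for illustration, not part of the original plan. With the score expressed as a log-odds ratio, a natural starting threshold is 0.

```python
import math
import re
from collections import Counter


def tokenize(line):
    # Assumed tokenizer: lowercase, keep runs of letters and apostrophes.
    return re.findall(r"[a-z']+", line.lower())


def bag_of_words(lines):
    # A bag of words is just a word -> count table over all lines.
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return counts


def train(pos_lines, neg_lines):
    pos_counts = bag_of_words(pos_lines)
    neg_counts = bag_of_words(neg_lines)
    return {
        "pos": pos_counts,
        "neg": neg_counts,
        "vocab": set(pos_counts) | set(neg_counts),
        # Log ratio of class priors; 0 when the classes are balanced.
        "prior": math.log(len(pos_lines) / len(neg_lines)),
    }


def log_odds(model, line):
    # log P(pos | line) - log P(neg | line), with add-one smoothing.
    vocab_size = len(model["vocab"])
    pos_total = sum(model["pos"].values())
    neg_total = sum(model["neg"].values())
    score = model["prior"]
    for word in tokenize(line):
        if word not in model["vocab"]:
            continue  # ignore words never seen in training
        p_pos = (model["pos"][word] + 1) / (pos_total + vocab_size)
        p_neg = (model["neg"][word] + 1) / (neg_total + vocab_size)
        score += math.log(p_pos / p_neg)
    return score


def classify(model, line, threshold=0.0):
    return "positive" if log_odds(model, line) > threshold else "negative"


# Toy data standing in for one part of each file.
model = train(
    ["great movie", "loved the great acting"],
    ["terrible movie", "hated the terrible plot"],
)
print(classify(model, "great acting"))
print(classify(model, "terrible plot"))
```

Working in log space avoids underflow when many small probabilities are multiplied, and it turns the "threshold" question into choosing a cut-off on a single number.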

I don't know how I should go about extracting the bag of words or selecting the threshold, or even whether my approach is right.

I could certainly use help here, and also suggestions for Python libraries so that I don't have to do all of this manually.
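As one possible library route, scikit-learn covers the whole plan with `CountVectorizer` (bag of words), `MultinomialNB` (naive Bayes), and `cross_val_score` (k-fold evaluation). The sketch below is only an illustration under assumptions: the helper name `cross_validate` and the toy sentences are made up, and the real lines would be read from the two files instead. Note that it follows the usual convention of training on four parts and testing on the held-out one, which is the reverse of the split listed above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def cross_validate(pos_lines, neg_lines, folds=5):
    # Hypothetical helper: returns one accuracy score per fold.
    texts = list(pos_lines) + list(neg_lines)
    labels = [1] * len(pos_lines) + [0] * len(neg_lines)
    # The pipeline refits the vocabulary on each training split,
    # so no words leak in from the held-out part.
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    splitter = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    return cross_val_score(model, texts, labels, cv=splitter)


# Toy data standing in for the two files.
pos = ["great movie", "loved the acting", "great plot",
       "wonderful film", "loved it"]
neg = ["terrible movie", "hated the acting", "terrible plot",
       "awful film", "hated it"]
scores = cross_validate(pos, neg)
print(len(scores), "fold scores computed")
```

`MultinomialNB` picks the class with the higher posterior probability, so no separate threshold has to be tuned by hand.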

submitted by new_in_montreal