Binary feature vector vs. Integer feature vector

I have a very basic, newbie question about binary feature vectors.

Suppose I want to classify 6-letter words into one of two classes (0/1). For a feature vector, I'd like to do one of two things:

 

1) Each letter has its own binary string associated with it: A -> (1,0,0,...,0), B -> (0,1,0,...,0), ..., Z -> (0,0,...,0,1), so 'BATMAN' would be represented as a 26*6 = 156 dimensional vector
(0,1,0,...,0, // B
 1,0,0,...,0, // A
 0,0,...,1,...,0, // T
 ..., // M
 ..., // A
 ...) // N

 

2) Each letter has a single integer representing its value: A -> 1, B -> 2, ..., Z -> 26, so 'BATMAN' would be represented as the 6-dimensional vector
(2, 1, 20, 13, 1, 14)
(both representations are sketched in code just below)
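To make the two options concrete, here is a minimal Python sketch of both encodings applied to 'BATMAN'. The helper names (one_hot_word, integer_word) and the fixed A-Z alphabet are just illustrative, not from any particular library.

    import numpy as np

    ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

    def one_hot_word(word):
        # Representation 1: concatenate one 26-dim binary (one-hot) block per letter.
        vec = np.zeros(len(ALPHABET) * len(word))
        for i, letter in enumerate(word):
            vec[i * len(ALPHABET) + ALPHABET.index(letter)] = 1.0
        return vec

    def integer_word(word):
        # Representation 2: map each letter to its 1-based position in the alphabet.
        return [ALPHABET.index(letter) + 1 for letter in word]

    print(one_hot_word("BATMAN").shape)   # (156,)
    print(integer_word("BATMAN"))         # [2, 1, 20, 13, 1, 14]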

 

Both representations encode all the information I want them to, but instinct tells me to go with the first one.
My sense of linear algebra makes me instantly notice that in the first representation, each letter has a linearly independent vector representing it. I'm sure that's relevant, but I'm not sure exactly how.
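As a quick check on that observation, here's a small NumPy sketch (variable names are just illustrative): the 26 per-letter binary vectors are exactly the rows of the 26x26 identity matrix, so they are mutually orthogonal and therefore linearly independent, whereas the integer codes 1..26 all lie along a single axis.

    import numpy as np

    letters = np.eye(26)            # row i is the binary vector for the i-th letter
    gram = letters @ letters.T      # matrix of all pairwise dot products
    print(np.array_equal(gram, np.eye(26)))   # True: orthonormal, hence linearly independent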

 

If my alphabet set becomes larger (not just 26 letters, but say 26,000), then the second representation still stays 6-dimensional, while the first one becomes crazy large (26,000 * 6 = 156,000 dimensions).

 

So my questions are:
1. Is there something fundamentally wrong with the second representation?
2. What could possibly go wrong when training a classifier?
3. Is there any situation where the second representation would be favored over the first representation?

submitted by eptheta
