Chapter 1: Label Encoder vs One Hot Encoder in Machine Learning

Label Encoding:

data format
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Import the dataset
dataset = pd.read_csv('Data.csv')
# Feature matrix: every column except the last
x = dataset.iloc[:, :-1].values
# Taking care of missing data
# (Imputer was removed in scikit-learn 0.22; SimpleImputer replaces it)
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(x[:, 1:3])
x[:, 1:3] = imputer.transform(x[:, 1:3])
#encoding categorical data
from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()
x[:, 0] = labelencoder.fit_transform(x[:, 0])
After this step the missing values in columns 1 and 2 are replaced by the column mean, and the country values in column 0 are label encoded as integers.
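
To make the effect of label encoding concrete, here is a minimal sketch on a small list of country names. The sample values are hypothetical and are not taken from Data.csv; the point is only to show that LabelEncoder maps each distinct category to an integer, assigned in sorted order of the category names.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample values, not taken from Data.csv
countries = ['France', 'Spain', 'Germany', 'Spain', 'France']

encoder = LabelEncoder()
codes = encoder.fit_transform(countries)

# Distinct categories, stored in sorted order
print(encoder.classes_.tolist())  # ['France', 'Germany', 'Spain']
# Integer code assigned to each input value
print(codes.tolist())             # [0, 2, 1, 2, 0]
```

The integers carry an implied order (France < Germany < Spain) that the categories do not actually have, which is the usual reason for following this step with one hot encoding.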

One Hot Encoder

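from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# categorical_features was removed from OneHotEncoder in scikit-learn 0.22;
# ColumnTransformer selects column 0 and passes the other columns through.
transformer = ColumnTransformer([('country', OneHotEncoder(), [0])], remainder='passthrough', sparse_threshold=0)
x = transformer.fit_transform(x)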
OneHot encoded country values.
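
For comparison with label encoding, the following is a small sketch of one hot encoding on a single column of country names. The sample values are again hypothetical rather than read from Data.csv. Each distinct category becomes its own 0/1 column, so no ordering between countries is implied.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical sample values, shaped as one column with four rows
countries = np.array([['France'], ['Spain'], ['Germany'], ['Spain']])

encoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default; convert it to a dense array
one_hot = encoder.fit_transform(countries).toarray()

# Column order of the encoded output (sorted category names)
print(encoder.categories_[0].tolist())  # ['France', 'Germany', 'Spain']
# One row per input value, with a single 1.0 in the matching column
print(one_hot.tolist())
# [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```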

ABHISHEK KUMAR
