Simple Linear Regression in Machine Learning Using Python

Simple linear regression is a supervised machine learning algorithm used for regression, that is, for predicting a continuous numeric value.

It is a statistical model that represents the relationship between one independent variable (X) and one dependent variable (y).

When this relationship is plotted, the model is a straight line, called the best-fit line or regression line.

It is represented by the regression equation:
y = m*X + c

where m is the slope (gradient) of the line, which can be positive, negative, or zero, and c is the y-intercept of the line, the value of y when X is 0.
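To make the roles of m and c concrete, here is a minimal sketch that evaluates the equation for a positive, a negative, and a zero slope. The function name `line` and the numeric values are illustrative assumptions, not part of the example that follows.

```python
def line(x, m, c):
    """Return y = m*x + c for a single input value x."""
    return m * x + c

# Illustrative values only (assumed, not taken from the dataset)
c = 10              # y-intercept: the value of y when x is 0
x_values = [0, 1, 2]

rising = [line(x, 2, c) for x in x_values]    # positive slope: y grows with x
falling = [line(x, -2, c) for x in x_values]  # negative slope: y shrinks as x grows
flat = [line(x, 0, c) for x in x_values]      # zero slope: y stays at c

print(rising)   # [10, 12, 14]
print(falling)  # [10, 8, 6]
print(flat)     # [10, 10, 10]
```

In all three cases the line passes through y = 10 at x = 0; only the slope changes the direction of the line.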


Example: Suppose we have a CSV file named “homeprice.csv” with two columns, area and price. Using simple linear regression, we will predict the price for a given area. Here area is X (the independent variable) and price is y (the dependent variable).


#import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

#Create dataframe (raw string, so the backslashes in the Windows path are not read as escape sequences)
df=pd.read_csv(r"E:\dataset\homeprice.csv")
print(df)

Output:

   area   price
0  2600  550000
1  3000  565000
2  3200  610000
3  3600  680000
4  4000  725000
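If you do not have the file, the five rows shown above can be written to a CSV first. This is a small sketch using only the standard library; the helper name `write_homeprice_csv` is an assumption, and the path you pass to it should match the one given to `pd.read_csv`.

```python
import csv

# The five (area, price) rows shown in the output above
ROWS = [
    (2600, 550000),
    (3000, 565000),
    (3200, 610000),
    (3600, 680000),
    (4000, 725000),
]

def write_homeprice_csv(path):
    """Write the sample data to `path` with an area,price header row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["area", "price"])
        writer.writerows(ROWS)
```

For example, calling `write_homeprice_csv("homeprice.csv")` creates the file in the current directory, after which `pd.read_csv("homeprice.csv")` should load the same five rows.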

df.shape

Output:
(5, 2)


df.info()

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   area    5 non-null      int64
 1   price   5 non-null      int64
dtypes: int64(2)
memory usage: 208.0 bytes


df.describe()

Output:

[Image: summary statistics table returned by df.describe()]

#Check for missing values
df.isnull().sum()

Output:

area     0
price    0
dtype: int64

#Scatter plot
plt.scatter(df.area,df.price,marker='*',color='red')
plt.xlabel("area values")
plt.ylabel("price values")
plt.show()
[Image: scatter plot of price values against area values]

#Create LinearRegression model
from sklearn.linear_model import LinearRegression
obj=LinearRegression()

#Train the model
obj.fit(df[['area']],df.price)

Output:
LinearRegression()


#Predict the price for area = 4500
obj.predict([[4500]])

Output:
array([791660.95890411])
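Recent scikit-learn versions may warn that the input has no feature names when a model fitted on a DataFrame is given a plain list. One possible way to avoid this, sketched below, is to pass a DataFrame with the same column name. The data is rebuilt inline so that the snippet runs on its own; the names `model` and `new_homes` and the second area value 5000 are assumptions for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same five rows as homeprice.csv, rebuilt inline
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000],
    "price": [550000, 565000, 610000, 680000, 725000],
})

model = LinearRegression().fit(df[["area"]], df["price"])

# New inputs use the same column name the model was fitted with
new_homes = pd.DataFrame({"area": [4500, 5000]})
predictions = model.predict(new_homes)

print(predictions)
```

The first prediction should match the value shown above for an area of 4500; passing several rows returns one prediction per row.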


#m value (slope)
obj.coef_

Output:
array([135.78767123])


#c value (y-intercept)
obj.intercept_

Output:
180616.43835616432
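For reference, with a single feature the least-squares slope and intercept have a closed form: m = sum((x - mean_x)*(y - mean_y)) / sum((x - mean_x)^2) and c = mean_y - m*mean_x. The sketch below applies these formulas to the five rows of the dataset in plain Python; it should agree with coef_ and intercept_ up to floating-point rounding. The variable names are assumptions.

```python
areas = [2600, 3000, 3200, 3600, 4000]
prices = [550000, 565000, 610000, 680000, 725000]

n = len(areas)
mean_x = sum(areas) / n
mean_y = sum(prices) / n

# Slope: sum of cross-deviations divided by sum of squared x-deviations
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices))
s_xx = sum((x - mean_x) ** 2 for x in areas)
m = s_xy / s_xx

# Intercept: the fitted line passes through the point (mean_x, mean_y)
c = mean_y - m * mean_x

print(round(m, 8))  # 135.78767123
print(round(c, 4))  # 180616.4384
```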


#Now substitute the m and c values into the regression equation y=m*X+c
y=135.78767123*4500+180616.43835616432
print(y)

Output:
791660.9588911643
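The manual result differs slightly from the predict() output in the later decimal places only because m was typed with eight decimals. A sketch that reads the unrounded values from the fitted model instead is shown below; the data is rebuilt inline as NumPy arrays so the snippet is self-contained, and the name `model` is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[2600], [3000], [3200], [3600], [4000]])  # area, one column
y = np.array([550000, 565000, 610000, 680000, 725000])  # price

model = LinearRegression().fit(X, y)

m = model.coef_[0]     # slope at full floating-point precision
c = model.intercept_   # y-intercept

manual = m * 4500 + c
predicted = model.predict(np.array([[4500]]))[0]

# Both routes evaluate the same equation, so they should agree closely
print(np.isclose(manual, predicted))
```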


#Plot the best-fit line (regression line)
plt.scatter(df.area,df.price,marker='*',color='red')
plt.plot(df.area,obj.predict(df[['area']]),color='blue')
plt.xlabel("area values")
plt.ylabel("price values")
plt.show()

[Image: scatter plot of the data with the best-fit line drawn in blue]

About the Author



Silan Software is one of India's leading providers of offline and online training for Java, Python, AI (Machine Learning, Deep Learning), Data Science, Software Development, and many other emerging technologies.

We provide academic training, industrial training, corporate training, and internships in Java, Python, AI using Python, Data Science, and more.




