In this quick post, I want to share a method for performing simple as well as multiple linear regression in literally 6 lines of Python code.
To perform linear regression quickly, we will use the scikit-learn library. If you don’t have it already, you can install it using pip: pip install scikit-learn
So now let’s start by making a few imports:
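A minimal version of those imports might look like the following. The aliases np, pd and plt are common conventions rather than requirements.

```python
import numpy as np                                  # numerical calculations
import pandas as pd                                 # loading the .csv data set
import matplotlib.pyplot as plt                     # plotting the data and the fitted line
from sklearn.linear_model import LinearRegression   # the regression model itself
```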
We need numpy to perform calculations, pandas to import the data set (a .csv file in this case), and matplotlib to visualize our data and the regression line. We will use the LinearRegression class from sklearn.linear_model to perform the linear regression.
Now let’s perform the regression:
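Here is a sketch of the regression step. The data set itself is not reproduced in this post, so the block builds a small synthetic stand-in (an assumption made purely for illustration). With a real file you would load it with pd.read_csv instead; the file name in the comment is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the data set (assumption): y is roughly 2x + 1 plus noise.
# With a real file you would load it instead, e.g. data = pd.read_csv("data.csv")
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
data = pd.DataFrame({"x": x, "y": 2.0 * x + 1.0 + rng.normal(0, 0.5, size=50)})

X = data.iloc[:, 0].values.reshape(-1, 1)  # first column as an array of shape (n_samples, 1)
Y = data.iloc[:, 1].values.reshape(-1, 1)  # second column, same shape

linear_regressor = LinearRegression()  # create the model object
linear_regressor.fit(X, Y)             # fit slope and intercept by ordinary least squares
Y_pred = linear_regressor.predict(X)   # predictions for the same inputs
```

The reshape(-1, 1) calls are there because scikit-learn expects the inputs as a two-dimensional array with one row per sample; -1 tells numpy to work out the number of rows on its own.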
We have our predictions in Y_pred. Now let’s visualize the data set and the regression line:
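A plotting sketch along those lines is shown here. To keep it runnable on its own, it refits the model on synthetic data (an assumption; substitute your own X, Y and Y_pred).

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption): y is roughly 2x + 1 plus noise.
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
Y = 2.0 * X + 1.0 + rng.normal(0, 0.5, size=X.shape)

Y_pred = LinearRegression().fit(X, Y).predict(X)

fig, ax = plt.subplots()
ax.scatter(X, Y)                  # the raw data points
ax.plot(X, Y_pred, color="red")   # the fitted regression line
ax.set_xlabel("X")
ax.set_ylabel("Y")
plt.show()
```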
That’s it! You can use any data set of your choice, and even perform multiple linear regression (more than one independent variable) using the LinearRegression class in sklearn.linear_model. This class fits the model with the ordinary least squares method, so accuracy may be lower than that of more flexible techniques, particularly when the relationship in the data is not linear. But if you want to make some quick predictions and get some insight into the data set given to you, then this is a very handy tool.
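For multiple linear regression, the only real change is that X has one column per independent variable. The sketch below uses two made-up features and made-up coefficients (all assumptions for illustration) and checks how closely the fit recovers them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption): y = 3*x1 - 2*x2 + 5 plus a little noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 2))   # 200 samples, 2 independent variables
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(0, 0.5, size=200)

model = LinearRegression()
model.fit(X, y)

print("coefficients:", model.coef_)     # one coefficient per column of X
print("intercept:", model.intercept_)
print("R^2 on the training data:", model.score(X, y))
```

The score method returns the coefficient of determination, which gives a quick sense of how much of the variation in y the linear model explains.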
Find the data set and code here: https://github.com/chasinginfinity/ml-from-scratch/tree/master/03%20Linear%20Regression%20in%202%20minutes
Note: This article was originally published on towardsdatascience.com, and was kindly contributed to DPhi to spread the knowledge.