So: with linear regression (aka simple linear regression) we have one feature which we use to predict a dependent variable (for example, number of rooms as a predictor of house price). With multivariate regression (aka multiple linear regression) we simply have multiple features that can be combined to predict that dependent variable (for example, number of rooms, proximity to business hubs, crime rate, etc. as joint potential predictors of house price). I struggled a bit with the details surrounding this concept until I found the most excellent article Simple and Multiple Linear Regression in Python by Adi Bronshtein.
It’s always nice to feel you are building on concepts learned previously, and for some reason I had a burning desire to satisfy myself that the ideas covered in Co-variance, Correlation & Linear Regression would carry over to the next level of multivariate regression. So, using Adi’s tutorial, I offer here some worked examples, starting with a single-feature demonstration showing that the prediction results from scipy.stats.linregress(), statsmodels.api.OLS() and sklearn.linear_model.LinearRegression() are consistent. I then delve into a couple of examples using different numbers of features, with and without constants, to see how the fit can differ quite dramatically.
There are many nuances which I hope I’ll get to grips with in due course, so this very much represents the nuts and bolts to get started: How it works – Multivariate Regression.