ML | PREDICTING PRICE OF PRE-OWNED CARS
In this article we discuss a practical application of machine learning through a case study on Linear Regression. Linear Regression is a supervised learning algorithm used to predict a real-valued output y from a given input x. It models the relationship between the dependent variable y and the independent variable x as a linear function.
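Written out, and allowing for several input features as in the case study that follows, the model has the form below. The notation is ours: the coefficients are the values learned during training, and p is the number of input features.

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p
```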
We downloaded the dataset for this problem from kaggle.com. Kaggle is an online community devoted to Data Science and Machine Learning; it was founded in 2010 and acquired by Google in 2017. The dataset file is named CAR DETAILS FROM CAR DEKHO.csv and is available on Kaggle.
Data Cleaning
Let us understand the problem statement using the dataset. Storm Motors acts as a mediator and wishes to set the price of each car based on its attributes. The selling price is therefore our output variable y, and the remaining variables are the inputs x to our model.
Raw data often arrives messy and unstructured; to bring it into a workable, structured form we need to “clean” it. Common cleaning steps include parsing, one-hot encoding, and removing unnecessary data. In our case, the data has four categorical independent variables: fuel, seller type, transmission, and owner. Our algorithm requires numbers and cannot work with text values, so we converted these columns to numeric codes using the pandas .cat.codes accessor.
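The encoding step can be sketched as follows. This is a minimal illustration, not the exact code used for the article: the column names (in particular seller_type with an underscore) and the sample rows are assumptions made so the snippet runs by itself.

```python
import pandas as pd

# Toy rows using the column names mentioned above; the values are made up
# for illustration and are not taken from the real dataset.
df = pd.DataFrame({
    "fuel": ["Petrol", "Diesel", "Petrol", "CNG"],
    "seller_type": ["Individual", "Dealer", "Individual", "Dealer"],
    "transmission": ["Manual", "Manual", "Automatic", "Manual"],
    "owner": ["First Owner", "Second Owner", "First Owner", "First Owner"],
})

categorical_cols = ["fuel", "seller_type", "transmission", "owner"]
for col in categorical_cols:
    # Cast to pandas' category dtype, then replace each label with its
    # integer code (categories are ordered alphabetically by default).
    df[col] = df[col].astype("category").cat.codes

print(df)
```

Integer codes imply an ordering between categories that may not be meaningful for a linear model; one-hot encoding (for example with pd.get_dummies) is a common alternative when that matters.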
Training model
Once the data is cleaned, it can be used as input to our Linear Regression model. We use Scikit-learn’s LinearRegression model and fit it on our dataset. We split the data into a training set and a test set so that we can monitor the model's performance on data it has not seen during training. Once the model was trained, we evaluated it on the test data. Having confirmed that the model performs well, we use it to build an app.
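A minimal sketch of the split, fit, and evaluate steps is shown here. It uses synthetic data as a stand-in for the cleaned car dataset (an assumption, so that the snippet is self-contained); the 80/20 split and the random seeds are likewise illustrative choices, not the article's settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned data (assumption): six numeric features
# and a target that is a linear function of them plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
true_coef = np.array([3.0, -2.0, 0.5, 1.0, -1.5, 2.0])
y = X @ true_coef + 10.0 + rng.normal(scale=0.1, size=200)

# Hold out 20% of the rows to measure performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) on the test set.
r2 = model.score(X_test, y_test)
print(f"Test R^2: {r2:.3f}")
```

On real data the score will typically be lower than on this near-perfectly linear toy example, which is exactly why the held-out test set is worth keeping.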
Building ML App using streamlit
Building an app with Streamlit follows the same steps shown in our previous articles. In this app we have written code that takes the various attributes of a car as input. We can type these attributes into the app's GUI and click Predict Car Price to predict the price of the car. The features are year of registration, kilometres driven, fuel type, seller type, transmission, and type of owner. These inputs are processed by the price_pred() function, and the result is the predicted price of the car.
UFuncTypeError
We would also like to share an error that occurred while running this app and how we resolved it. The error raised was a UFuncTypeError.
Our model has two numerical input features, year and km_driven. We originally read these inputs with the st.text_input() function. Because st.text_input() returns a string, the price_pred() function received text rather than numbers, even though the rest of our code expected numerical input. We therefore switched to st.number_input() for both features, and the error was resolved.
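The snippet below illustrates one plausible way this kind of error arises; it is not the article's actual traceback. It mimics, with plain NumPy, what happens when string values reach numeric arithmetic. The coefficient values are made up for illustration.

```python
import numpy as np

# Hypothetical coefficients for the two numeric features (year, km_driven);
# the values are made up for illustration.
coef = np.array([25000.0, -1.5])

# What a text field yields: strings, so NumPy builds a string array.
text_values = np.array(["2015", "50000"])
try:
    text_values * coef
    failed = False
except TypeError as err:
    # NumPy's ufunc type errors are subclasses of TypeError.
    failed = True
    print("String inputs failed:", err)

# What a number field yields: real numbers, so the arithmetic works.
numeric_values = np.array([2015, 50000])
contribution = float(numeric_values @ coef)
print("Numeric inputs worked:", contribution)
```

If a text field has to be kept for some reason, converting explicitly with int() or float() before calling the model has the same effect, at the cost of handling invalid input yourself.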