MACHINE LEARNING
A Linear Regression model that predicts professional salaries from years of experience, reaching an R² score of approximately 0.89 on held-out test data.
The Salary Prediction Tool solves a classic supervised learning problem: predicting a continuous target (Salary) from a single independent variable (Years of Experience). Training a Linear Regression model fits a straight trend line through the data, which can then be used to estimate earnings for a given experience level.
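To make the idea of a fitted trend line concrete, here is a minimal sketch of ordinary least squares for a single feature, written in plain Python. The function name `fit_line` and the sample values are illustrative assumptions, not part of the project or of Salary.csv.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative values, not taken from Salary.csv
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [40000.0, 45000.0, 50000.0, 55000.0, 60000.0]

slope, intercept = fit_line(years, salary)
print(f"salary ~ {intercept:.0f} + {slope:.0f} * years")
```

For one feature, Scikit-Learn's LinearRegression should arrive at the same line; the library version is used in the rest of this write-up.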
We load the dataset using Pandas and split it into training (80%) and testing (20%) sets using train_test_split. This ensures the model is evaluated on unseen data.
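import pandas as pd
from sklearn.model_selection import train_test_split
data = pd.read_csv("Salary.csv")
# Double brackets keep the feature as a 2-D table, which scikit-learn expects
x = data[["YearsExperience"]]
y = data[["Salary"]]
# Hold out 20% of the rows for testing; random_state makes the split reproducible
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=42)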
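As a quick check of what the split produces, the sketch below runs train_test_split on a small synthetic array instead of Salary.csv. The 20-row dataset and its values are assumptions made only for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 20 rows, one feature column
x = np.arange(20, dtype=float).reshape(-1, 1)
y = 30000.0 + 9000.0 * x.ravel()

xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.2, random_state=42
)
print(len(xtrain), len(xtest))

# Repeating the call with the same random_state selects the same test rows
_, xtest_again, _, _ = train_test_split(x, y, test_size=0.2, random_state=42)
same_split = np.array_equal(xtest, xtest_again)
print(same_split)
```

Fixing random_state keeps the reported score comparable between runs; without it, each run would be evaluated on a different 20% of the rows.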
We initialize Scikit-Learn's LinearRegression estimator and fit it to the training data, so the model learns the linear relationship between experience and salary.
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(xtrain, ytrain)
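After fitting, the learned line can be read from the estimator's coef_ and intercept_ attributes. The sketch below uses hypothetical noise-free data (salary = 30000 + 9000 × years) so the recovered values are easy to verify; the names `demo_model`, `years`, and `salary` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data: salary = 30000 + 9000 * years
years = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
salary = 30000.0 + 9000.0 * years.ravel()

demo_model = LinearRegression()
demo_model.fit(years, salary)

slope = demo_model.coef_[0]        # learned change in salary per extra year
intercept = demo_model.intercept_  # learned salary at zero years of experience
print(f"slope={slope:.1f}, intercept={intercept:.1f}")
```

If the target is two-dimensional (for example a single-column DataFrame), coef_ and intercept_ carry an extra dimension and need one more index to reach the scalar values.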
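Finally, we predict salaries for the test set and score the model. For a regressor, score returns the coefficient of determination (R²) rather than a classification accuracy; it measures how well the regression line fits the unseen data, with 1.0 being a perfect fit.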
y_pred = model.predict(xtest)
# score() returns R² (coefficient of determination) for regressors
r2 = model.score(xtest, ytest)
print(f"Model R² score: {r2}")
# Output: ~0.89
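For regressors, Scikit-Learn's score method returns R², defined as 1 − SS_res / SS_tot. The plain-Python sketch below computes that quantity by hand to show what the number means; `r_squared` is a hypothetical helper and the values are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error of the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of y_true
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only
actual = [40000.0, 50000.0, 60000.0]
perfect = list(actual)                  # predictions that match exactly
baseline = [50000.0, 50000.0, 50000.0]  # always predict the mean

print(r_squared(actual, perfect))
print(r_squared(actual, baseline))
```

A score of about 0.89 therefore means the fitted line removes roughly 89% of the squared error that a predict-the-mean baseline would leave on the test set.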
This project successfully implemented a machine learning pipeline to forecast professional income. Key takeaways include: