Medical Costs Estimation Using Linear Regression Method
Abstract
Medical costs are a significant issue in the health sector. High healthcare costs create a need to anticipate financial risk for both individuals and insurance providers, so analysis of medical cost data is necessary to estimate future medical expenses. This research applies data mining techniques, specifically Simple and Multiple Linear Regression, to estimate medical costs. The dataset consists of insurance claim data obtained from Kaggle and includes the attributes age, gender, body mass index, number of children, smoking habits, region, and medical charges. The findings show that Multiple Linear Regression outperforms Simple Linear Regression on this dataset, with an R² value of 80% and lower MSE and MAE values than Simple Linear Regression. Applying linear regression to insurance claim data analysis can benefit patients, hospitals, and insurance providers. Overall, this research highlights the effectiveness of data mining techniques, specifically linear regression, in estimating healthcare costs.
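As an illustration of the comparison summarized above, the following is a minimal sketch, not the procedure used in this research. It assumes synthetic data in place of the Kaggle dataset: the coefficients, the noise level, the 80/20 split, and the choice of age as the single predictor for the simple model are all hypothetical, and the helper names `fit_ols`, `predict`, and `metrics` are introduced here for illustration only.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def metrics(y_true, y_pred):
    """R^2, mean squared error, and mean absolute error."""
    err = y_true - y_pred
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": float(np.mean(err ** 2)),
        "mae": float(np.mean(np.abs(err))),
    }

# Synthetic stand-in for the insurance data (assumed values, not real claims).
rng = np.random.default_rng(0)
n = 1000
age = rng.uniform(18, 64, n)
bmi = rng.normal(30, 6, n)
smoker = (rng.random(n) < 0.2).astype(float)
charges = 250 * age + 300 * bmi + 20000 * smoker + rng.normal(0, 3000, n)

X_simple = age.reshape(-1, 1)                     # one predictor
X_multi = np.column_stack([age, bmi, smoker])     # several predictors

# Hold out the last 20% of rows for evaluation.
train, test = slice(0, 800), slice(800, n)

simple = metrics(
    charges[test],
    predict(fit_ols(X_simple[train], charges[train]), X_simple[test]),
)
multiple = metrics(
    charges[test],
    predict(fit_ols(X_multi[train], charges[train]), X_multi[test]),
)

print("simple  :", simple)
print("multiple:", multiple)
```

On data generated this way, the multiple model should report a higher R² and lower MSE and MAE than the simple model, because the simulated charges depend on predictors that the simple model cannot see. The numbers it prints say nothing about the real dataset.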