Neural Networks with R: Predictive Modeling using nnet and Regression Comparisons — Stats with R (2024)

Neural networks provide a powerful tool for predictive modeling, capable of capturing complex relationships in data. In this blog post, we will explore how to implement a neural network using the nnet package in R. We will build a neural network model to predict gender, education, and age based on personality test data. Furthermore, we will compare the neural network's performance to traditional regression models using Root Mean Squared Error (RMSE).

1. Setting up the RMSE Function

Before training any models, we define a custom RMSE function to evaluate their performance. The RMSE function computes the square root of the mean squared difference between predictions and actual values for each column in the dataset.

# Load necessary libraries
library(nnet)
library(psych)

# Function for calculating RMSE, one value per column
RMSE <- function(Predictions, Test_Data) {
  RMSE_Results <- rep(0, ncol(Predictions))
  for (i in 1:ncol(Predictions)) {
    # Non-numeric columns are skipped and keep the placeholder value 0
    if (is.numeric(Predictions[, i])) {
      RMSE_Results[i] <- mean((Predictions[, i] - Test_Data[, i])^2)^(1/2)
    }
  }
  RMSE_Results <- data.frame(matrix(RMSE_Results, nrow = 1))
  names(RMSE_Results) <- names(Test_Data)
  return(RMSE_Results)
}

The RMSE function calculates how far off our predictions are from the true values, providing a key metric to assess model accuracy.
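As a quick sanity check of the formula, here is a minimal sketch on three made-up values. The helper name rmse_vec and the numbers are illustrative assumptions, not part of the original analysis.

```r
# Hypothetical single-vector helper: square root of the mean squared error
rmse_vec <- function(pred, actual) {
  sqrt(mean((pred - actual)^2))
}

pred   <- c(2, 4, 6)   # made-up predictions
actual <- c(1, 4, 8)   # made-up true values

# Errors are 1, 0, -2; squared errors are 1, 0, 4; their mean is 5/3
toy_rmse <- rmse_vec(pred, actual)
print(toy_rmse)        # sqrt(5/3), roughly 1.29
```

Because the errors are squared before averaging, the single error of 2 contributes four times as much as the error of 1, which is why RMSE penalizes large misses more heavily than mean absolute error would.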

2. Preparing the Data

We will be using the bfi dataset from the psych package, which contains data from a personality test. The gender variable is recoded as 0 for females and 1 for males. We split the dataset into training and testing sets for model evaluation.

# Load and prepare the dataset
Data <- data.frame(bfi[complete.cases(bfi),])
Data$gender <- ifelse(Data$gender == 1, 1, 0)

# Splitting the data into train and test sets
set.seed(123)
In_train <- sample(1:nrow(Data), round(nrow(Data) / 2, 0), replace = FALSE)
train <- Data[In_train,]
test <- Data[-In_train,]

This code drops every row that contains a missing value (complete.cases) and recodes gender from the dataset's original coding (1 = male, 2 = female) to 1 = male, 0 = female. Setting a seed makes the random 50/50 split reproducible.
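The same split pattern can be tried on a small made-up data frame to confirm that the two halves do not overlap and that re-seeding reproduces the split. The names toy, toy_train, and toy_test are assumptions for illustration only.

```r
# Made-up data frame with 10 rows; not the bfi data
toy <- data.frame(id = 1:10, score = seq(0.1, 1, by = 0.1))

# Draw half of the row indices without replacement
set.seed(123)
in_train <- sample(seq_len(nrow(toy)), round(nrow(toy) / 2), replace = FALSE)
toy_train <- toy[in_train, ]
toy_test  <- toy[-in_train, ]   # negative indexing keeps the remaining rows

# Re-seeding gives the same indices again
set.seed(123)
in_train_again <- sample(seq_len(nrow(toy)), round(nrow(toy) / 2), replace = FALSE)

overlap <- intersect(toy_train$id, toy_test$id)
print(length(overlap))                      # no row appears in both halves
print(identical(in_train, in_train_again))  # same seed, same split
```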

3. Building a Neural Network

We use the nnet package to build a neural network with 10 hidden nodes. The model is trained to predict three target variables: gender, education, and age.

# Train the neural network
set.seed(123)
model <- nnet(x = train[,1:25], y = train[,26:28], size = 10, maxit = 10^6, MaxNWts = 10^6, linout = TRUE)

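The maxit argument caps the number of optimizer iterations (the default is 100) and MaxNWts caps the number of weights the model is allowed to have (the default is 1000). Both are set very high here so that training is not cut off by the iteration limit or rejected for being too large; neither setting guarantees convergence. The linout = TRUE argument requests linear output units instead of the default logistic ones, which is what makes the network suitable for regression tasks.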
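To put MaxNWts in context, the number of weights in a single-hidden-layer network can be counted by hand. The sketch below assumes no skip-layer connections (the nnet default) and uses a hypothetical helper, count_weights, that is not part of the nnet package.

```r
# Hypothetical helper: weights in a network with one hidden layer and no
# skip-layer connections. Each hidden unit has one weight per input plus a
# bias; each output unit has one weight per hidden unit plus a bias.
count_weights <- function(n_inputs, n_hidden, n_outputs) {
  (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs
}

# 25 personality items, 10 hidden nodes, 3 targets
n_wts <- count_weights(25, 10, 3)
print(n_wts)   # (26 * 10) + (11 * 3) = 293
```

Under these assumptions the model in this post stays well below the default limit of 1000 weights, so the very large MaxNWts value only starts to matter for wider networks.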

4. Making Predictions and Evaluating RMSE

After training the neural network, we use it to make predictions on the test set. The RMSE function is then applied to evaluate the model's accuracy.

# Make predictions using the neural network
NN_Predictions <- data.frame(predict(model, test[,1:25]))

# Calculate RMSE for the neural network
rmse_nnet <- RMSE(Predictions = NN_Predictions, Test_Data = test[,26:28])

This gives us the RMSE for the neural network model, helping us assess how well it performs compared to traditional models.

5. Comparing Predicted and Actual Distributions

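Gender and education are categorical, but the network outputs continuous values for them. We therefore round each prediction to the nearest integer code, convert it to a labeled factor, and compare the predicted and actual distributions. Rounded values that fall outside the valid codes become NA and are left out of the tables.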

# Predicted distribution of gender
NN_Predictions$gender <- factor(round(NN_Predictions$gender,0), levels = c(1,0), labels = c("Male","Female"))
Predicted_Gender <- table(NN_Predictions$gender)

# Actual distribution of gender
Actual_Gender <- table(factor(as.vector(ifelse(test$gender == 0, "Female", "Male")), levels = c("Male", "Female")))

# Summary of predicted and actual gender
rbind(c("Predicted", Predicted_Gender), c("Actual", Actual_Gender))

This code snippet shows how to compare the neural network's predictions with the actual gender values. A similar process is used for the education variable.

# Predicted distribution of education
NN_Predictions$education <- factor(round(NN_Predictions$education, 0), levels = 1:5, labels = c("HS", "HS complete", "Some College", "BS/BA", "MS/MA"))
Predicted_Edu <- table(NN_Predictions$education)

# Actual distribution of education
Actual_Edu <- table(factor(as.vector(test$education), levels = 1:5, labels = c("HS", "HS complete", "Some College", "BS/BA", "MS/MA")))

# Summary of predicted and actual education
rbind(c("Predicted", Predicted_Edu), c("Actual", Actual_Edu))

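Here, the predicted and actual distributions may match poorly for education. Treating a five-category variable as a continuous output tends to pull predictions toward the middle of the scale, so the lowest and highest categories may be predicted far less often than they actually occur.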
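The rounding step can be examined on a handful of made-up values. This sketch shows how out-of-range predictions turn into NA, one possible way to clamp them (an assumption, not something the original code does), and a cross-tabulation that compares predictions case by case rather than only by overall counts.

```r
edu_labels <- c("HS", "HS complete", "Some College", "BS/BA", "MS/MA")

raw_pred <- c(0.4, 2.6, 5.7, 3.2, 4.1)  # made-up continuous predictions
actual   <- c(1, 3, 5, 2, 4)            # made-up true education codes

# Rounding gives 0, 3, 6, 3, 4; the codes 0 and 6 are not valid levels
rounded   <- round(raw_pred)
as_factor <- factor(rounded, levels = 1:5, labels = edu_labels)
n_dropped <- sum(is.na(as_factor))      # these cases vanish from table()

# Optional clamp to the valid range 1..5 (illustrative choice)
clamped <- pmin(pmax(rounded, 1), 5)

# Cross-tabulate predicted against actual categories
confusion <- table(
  Predicted = factor(clamped, levels = 1:5, labels = edu_labels),
  Actual    = factor(actual,  levels = 1:5, labels = edu_labels)
)
accuracy <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)   # 4 of the 5 made-up cases land on the diagonal
```

Two distributions can look alike even when individual predictions are wrong, so a cross-tabulation of this kind is a useful complement to the side-by-side counts.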

6. Visualizing Predicted vs Actual for Age

Next, we plot the predicted vs. actual values for the age variable to visualize how well the neural network captures this continuous target variable.

# Plot predicted vs actual for age
plot(test[,28], NN_Predictions[,3], main = "Predicted vs Actual Neural Network", xlab = "Actual", ylab = "Predicted")
abline(0,1)

This plot helps to visualize the model's performance on a continuous outcome like age. Ideally, points should lie along the diagonal line (indicating perfect predictions).

7. Benchmarking with Traditional Regression Models

To compare the neural network's performance, we also fit three traditional regression models: a logistic regression for gender, a Poisson regression for education, and a linear regression for age. We calculate their RMSE for comparison.

# Traditional regression models (columns 1 to 25 are the predictors)
# Naming the variables and passing data = ... lets predict() use new data
model_gender <- glm(gender ~ ., data = train[, c(1:25, 26)], family = binomial())
model_education <- glm(education ~ ., data = train[, c(1:25, 27)], family = poisson())
model_age <- lm(age ~ ., data = train[, c(1:25, 28)])

# Predictions for the regression models on the test set
# type = "response" returns probabilities and expected values, not the link scale
Reg_Predictions <- data.frame(
  predict(model_gender, newdata = test[, 1:25], type = "response"),
  predict(model_education, newdata = test[, 1:25], type = "response"),
  predict(model_age, newdata = test[, 1:25])
)
names(Reg_Predictions) <- c("gender", "education", "age")

# Calculate RMSE for regression models
rmse_reg <- RMSE(Predictions = Reg_Predictions, Test_Data = test[, 26:28])

These traditional models serve as benchmarks for comparing the performance of the neural network. By comparing RMSE, we can see which method performs better for each variable.
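Two details of predict() are worth checking when scoring glm models on held-out data. The sketch below uses simulated data with hypothetical names (toy, fit, fit_matrix) to show that predictions default to the link scale, and that a matrix built inside the formula cannot be matched against newdata.

```r
# Simulated count data; nothing here comes from the bfi dataset
set.seed(42)
toy <- data.frame(x = rnorm(200))
toy$y <- rpois(200, lambda = exp(0.5 + 0.3 * toy$x))
toy_train <- toy[1:100, ]
toy_test  <- toy[101:200, ]

# Formula plus data argument: predict() can find `x` in newdata
fit <- glm(y ~ x, data = toy_train, family = poisson())
link_pred <- predict(fit, newdata = toy_test)                     # log scale
resp_pred <- predict(fit, newdata = toy_test, type = "response")  # expected counts
same_after_exp <- isTRUE(all.equal(exp(link_pred), resp_pred))

# Matrix built inside the formula: newdata holds no variable by that name,
# so predict() falls back to the training matrix and returns fitted values
fit_matrix <- glm(toy_train$y ~ as.matrix(toy_train[, "x", drop = FALSE]),
                  family = poisson())
ignored_newdata <- isTRUE(all.equal(
  unname(suppressWarnings(predict(fit_matrix, newdata = toy_test))),
  unname(predict(fit_matrix))
))

print(same_after_exp)    # link-scale predictions are the log of the response
print(ignored_newdata)   # newdata was silently ignored
```

When the training and test sets have the same number of rows, the second situation produces no warning at all, which is one reason to prefer named columns with a data argument when computing a test-set RMSE.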

8. RMSE Comparison: Neural Network vs. Regression

We now summarize the RMSE results for both the neural network and the traditional regression models.

# RMSE comparison
rbind(c("NNet", rmse_nnet), c("Reg", rmse_reg))

The table provides a direct comparison of the two approaches, allowing us to see how well the neural network performs against traditional methods for each target variable.

9. Visualizing Predicted vs Actual for Regression

Finally, we plot the predicted vs. actual values for age using the linear regression model to compare its performance with the neural network.

# Plot predicted vs actual for regression
plot(test[,28], Reg_Predictions[,3], main = "Predicted vs Actual Regression", xlab = "Actual", ylab = "Predicted")
abline(0,1)

This plot allows us to visually compare the performance of the regression model for predicting age, with points closer to the diagonal indicating better predictions.

Conclusion

In this post, we implemented a neural network using the nnet package in R and compared its performance to traditional regression models using RMSE. While neural networks offer the potential to capture complex relationships, their performance may vary depending on the dataset and tuning. Comparing RMSE values allows us to objectively evaluate the strengths and weaknesses of each approach for specific tasks like predicting gender, education, and age.

Author: Kieth Sipes
