Error while trying glmnet() in R: "Error in storage.mode(xd) <- "double" : 'list' object cannot be coerced to type 'double'"


I'm trying to fit a logistic regression model with a Ridge penalty. This is the code:

glmnet(X_Train, Y_Train, family='binomial', alpha=0, type.measure='auc')

And this is the error message I'm getting:

Error in storage.mode(xd) <- "double" : 'list' object cannot be coerced to type 'double'

I tried converting all the variables to "numeric", but it still doesn't work.

I'm going to post the code for those two datasets so you can reproduce it:

Libraries:

library(dplyr)
library(fastDummies)
library(missForest)
library(glmnet)

Data:

url <- 'https://archive.ics.uci.edu/ml/machine-learning-databases/credit-screening/crx.data'
crx <- read.csv(url, sep = ",", header = F)

Getting rid of null-values:

crx[crx == "?"] <- NA
crx <- type.convert(crx, as.is=FALSE)
crx.i <- missForest(as.data.frame(crx))
crx <- crx.i$ximp

Data transformations:

crx <- crx %>%
  rename(Gender = V1,
         Age = V2,
         Debt = V3,
         Married = V4,
         BankCustomer = V5,
         EducationLevel = V6,
         Ethnicity = V7,
         YearsEmployed = V8,
         PriorDefault = V9,
         Employed = V10,
         CreditScore = V11,
         DriversLicense = V12,
         Citizen = V13,
         ZipCode = V14,
         Income = V15,
         ApprovalStatus = V16)

crx = subset(crx, select = -ZipCode)

crx <- crx %>%
  mutate(ApprovalStatus = recode(ApprovalStatus,
                                 "+" = "1",
                                 "-" = "0"))

# Normalizing numeric variables:
crx$Age <- scale(crx$Age)
crx$Debt <- scale(crx$Debt)
crx$YearsEmployed <- scale(crx$YearsEmployed)
crx$CreditScore <- scale(crx$CreditScore)
crx$Income <- scale(crx$Income)

crx$Gender <- NULL
crx$DriversLicense <- NULL

Creation of dummy variables:

df <- dummy_cols(crx, remove_selected_columns = T)

df$ApprovalStatus_0 <- NULL
df$ApprovalStatus_1 <- NULL
df$Married_l <- NULL
df$BankCustomer_gg <- NULL

df$ApprovalStatus <- crx$ApprovalStatus

Creation of Training datasets and Test datasets:

X <- df %>% dplyr::select(-ApprovalStatus)
Y <- df$ApprovalStatus

X_Train <- X[0:590, ]
Y_Train <- Y[0:590]

X_Test <- X[591:nrow(X), ]
Y_Test <- Y[591:length(Y)]

And then trying to use glmnet:

glmnet(X_Train, Y_Train, family='binomial', alpha=0, type.measure='auc')

I did some research and found an article saying that everything has to be converted to the numeric class, so I tried converting the variables like this:

Y_Train <- as.numeric(Y_Train)
X_Train <- as.data.frame(apply(X_Train, 2, as.numeric))

It still doesn't work. What exactly am I doing wrong?

JMarcos87

Posted 2021-01-31T13:37:05.030

Reputation: 25

What is the data type here? glmnet needs a matrix as input. Try as.matrix() for both X and y – Peter – 2021-01-31T21:39:11.840

oh yes, I tried it and it worked – JMarcos87 – 2021-02-01T20:19:01.823

Answers


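glmnet requires matrix input for both $X$ and $y$, so you need to call as.matrix() on all model inputs. A data frame is a list of columns, which is why the coercion to "double" fails. Converting the columns to numeric and then wrapping the result in as.data.frame() again, as in the question, still leaves a list.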
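To see why the data frame is the problem, here is a minimal base-R sketch. The small data frame `X_df` is made up for illustration and only stands in for `X_Train`; the last step is expected to fail in the same way as the glmnet call, so it is wrapped in `tryCatch()`.

```r
# Toy stand-in for X_Train: a data frame of numeric columns (hypothetical data).
X_df <- data.frame(a = c(1, 2, 3), b = c(0, 1, 0))

is.list(X_df)     # a data frame is a list of columns
is.matrix(X_df)   # and it is not a matrix

# The conversion attempted in the question ends by wrapping the result
# back into a data frame, so the object is still a list.
X_again <- as.data.frame(apply(X_df, 2, as.numeric))
is.list(X_again)

# as.matrix() returns a genuine numeric matrix with storage mode "double".
X_mat <- as.matrix(X_df)
is.matrix(X_mat)
storage.mode(X_mat)

# Setting the storage mode directly on the data frame should reproduce the
# kind of error reported in the question; print the message if it does.
res <- tryCatch({ storage.mode(X_df) <- "double"; "no error" },
                error = function(e) conditionMessage(e))
print(res)
```

The point of the sketch is that the column types were never the issue: the container was. Any numeric data frame passes through `as.matrix()` cleanly, whereas a data frame with character or factor columns would come out as a character matrix and would need to be encoded numerically first.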

For further examples also see the Glmnet Vignette by Trevor Hastie and Junyang Qian.
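As a rough sketch of the fix, the following fits the Ridge model on a matrix. The data here are synthetic and only stand in for `X_Train` and `Y_Train` (the column names `x1`, `x2`, `x3` are invented for the example). Note also that `type.measure` is an argument of `cv.glmnet()`, so the sketch assumes cross-validation is what was intended by `type.measure = 'auc'`.

```r
library(glmnet)

set.seed(1)

# Synthetic stand-ins for X_Train / Y_Train (hypothetical data, not crx).
n <- 200
X_Train <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rbinom(n, 1, 0.5))
Y_Train <- factor(rbinom(n, 1, plogis(X_Train$x1 - X_Train$x2)))

# Convert the predictors from a data frame to a numeric matrix.
X_mat <- as.matrix(X_Train)

# Ridge-penalised logistic regression (alpha = 0) over a path of lambda values.
fit <- glmnet(X_mat, Y_Train, family = "binomial", alpha = 0)

# Cross-validated choice of lambda using AUC as the measure.
cv_fit <- cv.glmnet(X_mat, Y_Train, family = "binomial", alpha = 0,
                    type.measure = "auc")

# Predicted probabilities at the cross-validated lambda.
p <- predict(cv_fit, newx = X_mat, s = "lambda.min", type = "response")
head(p)
```

For `family = "binomial"` the response can be passed as a two-level factor, as done here. New data given to `predict()` via `newx` has to be a matrix as well, so `X_Test` would need the same `as.matrix()` treatment.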

Peter

Posted 2021-01-31T13:37:05.030

Reputation: 4 724