How to model a decimal response between 0 and 1 with a GLM in R


I am trying to model a response variable which is a proportion (so a response between 0 and 1, see picture for distribution).

Ideally I would like to model it directly as a decimal, without using the underlying counts.

So far I have been using a binomial family in R.

model <- glm(Response ~ 
                X1 +
                X2,
              data = Training_data,
              family = 'binomial')

I think the model is doing okay overall, but its predictions are poor when the observed ratio is exactly 1 (as you can see from the picture).
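For reference, here is a minimal sketch of how predictions like these can be produced. The data are simulated purely for illustration (the coefficients, noise level, and the rule that sets high values to exactly 1 are all assumptions, not my real data). With the default logit link, `predict(..., type = "response")` returns inverse-logit values, so the fitted proportions approach 1 but stay strictly below it.

```r
# Simulated stand-in for Training_data (hypothetical values, not the real data)
set.seed(1)
n <- 200
Training_data <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
eta <- 0.5 + 1.2 * Training_data$X1 - 0.8 * Training_data$X2
Training_data$Response <- plogis(eta + rnorm(n, sd = 0.5))

# Mimic a response with a spike at exactly 1
Training_data$Response[Training_data$Response > 0.9] <- 1

# family = "binomial" warns about non-integer successes for decimal responses;
# the warning is suppressed here only to keep the output short
model <- suppressWarnings(
  glm(Response ~ X1 + X2, data = Training_data, family = "binomial")
)

# Predictions on the response (proportion) scale
pred <- predict(model, newdata = Training_data, type = "response")

# Largest observed value is 1, but the largest prediction stays below 1
max(Training_data$Response)
range(pred)
```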

Is using a binomial distribution the wrong approach here?
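One alternative I have seen suggested for proportions without counts is `family = quasibinomial`. The sketch below, again on simulated data with made-up coefficients, compares it with the binomial fit: the coefficient estimates should be the same, but the quasibinomial family estimates a dispersion parameter and does not warn about non-integer responses. I am not certain it is the right choice for my data.

```r
# Simulated proportions (hypothetical data frame `dat`, for illustration only)
set.seed(42)
n <- 300
dat <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
dat$Response <- plogis(0.3 + dat$X1 - 0.5 * dat$X2 + rnorm(n, sd = 0.7))

# Binomial fit on a decimal response (emits a non-integer successes warning)
fit_binom <- suppressWarnings(
  glm(Response ~ X1 + X2, data = dat, family = binomial)
)

# Quasibinomial fit: same mean model, dispersion estimated from the data
fit_quasi <- glm(Response ~ X1 + X2, data = dat, family = quasibinomial)

# Point estimates agree; only the standard errors are rescaled
all.equal(coef(fit_binom), coef(fit_quasi))

# Estimated dispersion (fixed at 1 under the plain binomial family)
summary(fit_quasi)$dispersion
```

If the counts were available, the usual route would be to keep `family = binomial` and pass the number of trials through the `weights` argument of `glm`.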

Thanks for your help.

[Figure: Distribution of actual response and predicted response]

Jared Fowler

Posted 2020-04-14T06:10:26.523

Reputation: 1

This is not completely off topic, but in my opinion you will get better answers to this question on Cross Validated.

– lcrmorin – 2020-04-14T07:19:05.643

Thanks, I just posted there. – Jared Fowler – 2020-04-14T22:05:38.617

No answers