Is it possible to find a model that minimises both false positives and false negatives?



Is it possible to come up with a model that minimises both false positives and false negatives?

Minimising both is possible only down to a point, such as the Bayes error rate.


Posted 2018-07-15T21:23:11.733

Reputation: 31


In an effort to understand the confusion matrix for a machine learning algorithm, I have written this document for my own understanding. As I understand it, if we increase precision, recall will fall, and increasing recall will reduce precision. Increased precision means fewer false positives, and increased recall means fewer false negatives. According to my understanding, we cannot have a model that minimises both false negatives and false positives. Looking for comments from experts.

– vkj – 2018-07-15T21:26:57.857
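
The tension described above can be seen by sweeping the decision threshold of a scored classifier. A minimal sketch (the scores and labels below are made up for illustration): raising the threshold lowers false positives but raises false negatives, and vice versa.

```python
# Hypothetical classifier outputs: (score, true_label) pairs.
data = [(0.1, 0), (0.3, 0), (0.45, 1), (0.5, 0), (0.6, 1),
        (0.7, 0), (0.8, 1), (0.9, 1)]

def confusion(threshold):
    """Count false positives and false negatives at a given threshold."""
    fp = sum(1 for s, y in data if s >= threshold and y == 0)
    fn = sum(1 for s, y in data if s < threshold and y == 1)
    return fp, fn

for t in (0.2, 0.5, 0.75):
    fp, fn = confusion(t)
    print(f"threshold={t:.2f}  FP={fp}  FN={fn}")
# As the threshold rises, FP falls from 3 to 0 while FN climbs from 0 to 2.
```

No single threshold drives both counts to zero here, which is exactly the trade-off the question is about.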

A model that learns to predict better decreases false positives and false negatives together. – parvij – 2018-07-16T06:16:27.097

Those who believe this question is too broad and may have too many answers, please share your opinions and don't vote to close. This is a very good question. – Media – 2018-07-16T14:27:58.540

@vkj One of the approaches for finding a balanced operating point is to employ the F1 score, but I've not seen something that tries to minimise both in machine learning. There is a famous saying that there is not always a good model that describes your data entirely. In ML we have the bias/variance trade-off. As opposed to that, in deep learning we have other tools that may let you handle high-variance and high-bias problems without any conflict. But I guess your question may have good answers that I haven't come across :) – Media – 2018-07-16T14:34:18.513

I think this is too broad because false positives and false negatives are in tension. Minimising both is ideal, but there is always a trade-off at some level. The question also implies there is some minimum that can be achieved, but it's not clear what this refers to. – Sean Owen – 2018-07-16T21:32:52.637

@SeanOwen Actually you are right if you are speaking about tasks like regression and statistical approximation. In deep learning, on the other hand, it is not really true, in the sense that people rarely speak about that popular trade-off; they speak instead about increasing the size of the training data to achieve better learning. As for minimising, the limit can be interpreted as the Bayes error, which is the best that can be achieved. – Media – 2018-07-17T13:56:54.890



Yes and no, depending on what you mean by minimisation. When you say minimising $f$ and $g$ with respect to something, you are actually looking for a point which jointly minimises both. It does not mean that this point necessarily attains the minimum of $f$ or of $g$. So in this sense, yes.

But if you mean a point at which both of them are at their minimum, this is not realistic in practice, as decreasing one of them usually means increasing the other. Have a look at the plot below. FRR means False Rejection Rate, FAR means False Acceptance Rate and EER means Equal Error Rate. This is the terminology used in biometrics, an ML-based field studying person identification/recognition.

[Figure: FRR and FAR curves plotted against the decision threshold, crossing at the EER]

It illustrates what I explained above.
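
The crossing point in that plot can be sketched numerically. Assuming a verifier that outputs match scores (the genuine and impostor scores below are hypothetical), FRR is the fraction of genuine comparisons rejected and FAR the fraction of impostor comparisons accepted; scanning thresholds finds where the two curves meet:

```python
# Hypothetical match scores from a biometric verifier:
# genuine = same-person comparisons, impostor = different-person.
genuine  = [0.55, 0.6, 0.7, 0.8, 0.9, 0.95]
impostor = [0.1, 0.2, 0.3, 0.4, 0.5, 0.65]

def rates(threshold):
    frr = sum(s < threshold for s in genuine) / len(genuine)    # falsely rejected
    far = sum(s >= threshold for s in impostor) / len(impostor)  # falsely accepted
    return frr, far

# Scan thresholds; the EER is (approximately) where FRR and FAR cross.
best = min((abs(rates(t / 100)[0] - rates(t / 100)[1]), t / 100)
           for t in range(0, 101))
print("approx EER threshold:", best[1], "-> (FRR, FAR):", rates(best[1]))
```

Moving the threshold away from that crossing point in either direction lowers one rate only by raising the other, which is the trade-off the plot illustrates.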


You should also consider the relative cost of each error. In some use cases (including some biometric ones) you may sacrifice one error for the other. In those cases you minimise the error that must be small without prioritising the other one (of course you still try to keep the other as low as possible, but it is not the top priority).

For example, suppose I have a face-recognition system which opens the door of a highly confidential military site for permitted users. Should I try to keep the EER low? Certainly not. If the door sometimes fails to open for a permitted person, that's fine (at most they complain a bit), but the door MUST NOT open for someone who is not allowed. So here you care about keeping the FAR low.
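
One way to sketch that policy in code: instead of balancing the two rates, pick the lowest threshold whose measured FAR on held-out impostor scores is zero, and simply report the FRR (inconvenience for permitted users) that this choice costs. The scores below are hypothetical.

```python
# Hypothetical held-out scores for calibrating a high-security door.
genuine  = [0.55, 0.6, 0.7, 0.8, 0.9, 0.95]   # permitted users
impostor = [0.1, 0.2, 0.3, 0.4, 0.5, 0.65]    # everyone else

# Set the threshold just above the highest impostor score seen,
# so no impostor in the calibration set would be accepted.
threshold = max(impostor) + 1e-9

far = sum(s >= threshold for s in impostor) / len(impostor)  # should be 0
frr = sum(s < threshold for s in genuine) / len(genuine)     # the price paid
print(f"threshold={threshold:.3f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Note the asymmetry: FAR is driven to zero on the calibration data while FRR is merely reported, not optimised, which matches the priority described above.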

Hope it helps.

Kasra Manshaei

Posted 2018-07-15T21:23:11.733

Reputation: 5 323

Thanks for your answer. I just want to say that whenever you design a network which is big enough to memorise your data, you have a high-variance problem, which is somewhat the desired behaviour in the question. My opinion. – Media – 2018-07-16T14:43:56.430

Thanks for the comment @Media! You are right ... and one solution to avoid overfitting is statistical model selection, which I did not explain. A good and necessary point! Cheers. – Kasra Manshaei – 2018-07-16T17:35:46.837

You are welcome :)) – Media – 2018-07-16T17:50:16.540

Thanks. I asked this question because I have the same understanding. I recently came across a business requirement that says "false positives should be 0%, and false negatives should be 0%", and I was puzzled by this. People happily provided solutions too, and I know that is not possible. So I thought I should ask this question to make sure I was not missing something. – vkj – 2018-07-17T02:07:30.780

Your understanding is exactly right, my friend, and the requirement is certainly not reasonable. – Kasra Manshaei – 2018-07-17T09:08:42.663