If we decrease false negatives (i.e., select more positives), recall always increases, but precision may increase or decrease. Generally, for models better than random, precision and recall have an **inverse** relationship (@pythinker's answer), but for models worse than random, they have a **direct** relationship (@kbrose's example).

It is worth noting that we can artificially construct a sample on which a model that is better than random on the true distribution performs worse than random, so we are assuming that the sample resembles the true distribution.

### Recall

We have
$$TP = P - FN$$
therefore recall is
$$r = \frac{P-FN}{P} = 1- \frac{FN}{P},$$
which always increases as $FN$ decreases.
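As a quick sanity check (a minimal sketch with made-up numbers, not part of the derivation), $r = 1 - FN/P$ is monotone in $FN$ for a fixed $P$:

```
# Recall r = 1 - FN/P for a fixed number of positives P (illustrative numbers).
P = 10
recalls = [1 - fn / P for fn in range(P + 1)]  # FN = 0, 1, ..., P
# recall decreases monotonically as FN grows, i.e. increases as FN shrinks
assert all(a >= b for a, b in zip(recalls, recalls[1:]))
print(recalls[0], recalls[-1])  # 1.0 at FN = 0, 0.0 at FN = P
```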

### Precision

For precision, the relation is not as straightforward. Let's start with two examples.

**First case**: precision decreases when false negatives decrease:

```
label  model prediction
1      0.8
0      0.2
0      0.2
1      0.2
```

For threshold $0.5$ (false negative = $\{(1, 0.2)\}$),

$$p = \frac{1}{1+0}=1$$

For threshold $0.0$ (false negative = $\{\}$),

$$p = \frac{2}{2+2}=0.5$$

**Second case**: precision increases when false negatives decrease (the same as @kbrose's example):

```
label  model prediction
0      1.0
1      0.4
0      0.1
```

For threshold $0.5$ (false negative = $\{(1, 0.4)\}$),

$$p = \frac{0}{0+1}=0$$

For threshold $0.0$ (false negative = $\{\}$),

$$p = \frac{1}{1+2}=0.33$$
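Both cases can be verified in a few lines of code. `pr_at` below is a small helper (my own, not from any library) that computes precision and recall at a given threshold, treating scores at or above the threshold as positive predictions:

```
def pr_at(labels, scores, t):
    """Precision and recall when predicting positive for score >= t."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
    p = sum(labels)  # total number of positives
    precision = tp / (tp + fp) if tp + fp else float('nan')
    return precision, tp / p

# First case: lowering the threshold drops precision from 1 to 0.5
print(pr_at([1, 0, 0, 1], [0.8, 0.2, 0.2, 0.2], 0.5))  # (1.0, 0.5)
print(pr_at([1, 0, 0, 1], [0.8, 0.2, 0.2, 0.2], 0.0))  # (0.5, 1.0)

# Second case: lowering the threshold raises precision from 0 to 1/3
print(pr_at([0, 1, 0], [1.0, 0.4, 0.1], 0.5))  # (0.0, 0.0)
print(pr_at([0, 1, 0], [1.0, 0.4, 0.1], 0.0))  # (0.333..., 1.0)
```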

It is worth noting that the ROC curve for this case starts below the diagonal: at threshold $0.5$ we have $tpr = 0$ and $fpr = 0.5$.

### Analysis of precision based on ROC curve

When we lower the threshold, false negatives decrease and the true positive rate increases, which is equivalent to **moving to the right in the ROC plot**. I ran a simulation for better-than-random, random, and worse-than-random models, and plotted the ROC, recall, and precision curves:

As the plots show, by moving to the right, precision decreases for the better-than-random model, fluctuates substantially for the random model, and increases for the worse-than-random model, with slight fluctuations in all three cases. Therefore,

As recall increases, precision generally decreases if the model is better than random, and generally increases if the model is worse than random.
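This tendency can also be checked directly with `sklearn.metrics.precision_recall_curve`. The snippet below is a rough sketch on synthetic data of my own (not the simulation from this answer): for a better-than-random model, precision and recall are negatively correlated along the curve:

```
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(0)
n = 2000
y_true = rng.integers(0, 2, n)
# better-than-random scores: positively correlated with the true label
y_score = y_true + rng.normal(0, 1, n)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# drop the last point, which is fixed at (recall=0, precision=1) by convention
corr = np.corrcoef(recall[:-1], precision[:-1])[0, 1]
print(corr)  # negative: higher recall generally comes with lower precision
```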

Here is the code for simulation:

```
import numpy as np
from sklearn.metrics import roc_curve
from matplotlib import pyplot

np.random.seed(123)
count = 2000
P = int(count * 0.5)
N = count - P
# first half zero, second half one
y_true = np.concatenate((np.zeros((N, 1)), np.ones((P, 1))))
title = 'Better-than-random model'
# title = 'Random model'
# title = 'Worse-than-random model'
if title == 'Better-than-random model':
    # GOOD: model output increases from 0 to 1 with noise
    y_score = np.array([p + np.random.randint(-1000, 1000) / 3000
                        for p in np.arange(0, 1, 1.0 / count)]).reshape((-1, 1))
elif title == 'Random model':
    # RANDOM: model output is purely random
    y_score = np.array([np.random.randint(-1000, 1000) / 3000
                        for p in np.arange(0, 1, 1.0 / count)]).reshape((-1, 1))
elif title == 'Worse-than-random model':
    # SUB RANDOM: model output decreases from 0 to -1 (worse than random)
    y_score = np.array([-p + np.random.randint(-1000, 1000) / 1000
                        for p in np.arange(0, 1, 1.0 / count)]).reshape((-1, 1))
# calculate ROC (fpr, tpr) points
fpr, tpr, thresholds = roc_curve(y_true, y_score)
# calculate recall, precision, and accuracy for the corresponding thresholds
# recall = TP / P
recall = np.array([np.sum(y_true[y_score > t]) / P
                   for t in thresholds]).reshape((-1, 1))
# precision = TP / (TP + FP); guard against an empty selection at the
# top threshold, where roc_curve uses max(y_score) + 1
precision = np.array([np.sum(y_true[y_score > t])
                      / max(np.count_nonzero(y_score > t), 1)
                      for t in thresholds]).reshape((-1, 1))
# accuracy = (TP + TN) / (P + N)
accuracy = np.array([(np.sum(y_true[y_score > t]) + np.sum(1 - y_true[y_score < t]))
                     / len(y_score)
                     for t in thresholds]).reshape((-1, 1))
# sort performance measures from min tpr to max tpr
index = np.argsort(tpr)
tpr_sorted = tpr[index]
recall_sorted = recall[index]
precision_sorted = precision[index]
accuracy_sorted = accuracy[index]
# visualize
fig, ax = pyplot.subplots(3, 1)
fig.suptitle(title, fontsize=12)
line = np.arange(0, len(thresholds)) / len(thresholds)
ax[0].plot(fpr, tpr, label='ROC', color='purple')
ax[0].plot(line, line, '--', label='random', color='black')
ax[0].set_xlabel('fpr')
ax[0].legend(loc='center left', bbox_to_anchor=(1, 0.5))
ax[1].plot(line, recall, label='recall', color='blue')
ax[1].plot(line, precision, label='precision', color='red')
ax[1].plot(line, accuracy, label='accuracy', color='black')
ax[1].set_xlabel('1 - threshold')
ax[1].legend(loc='center left', bbox_to_anchor=(1, 0.5))
ax[2].plot(tpr_sorted, recall_sorted, label='recall', color='blue')
ax[2].plot(tpr_sorted, precision_sorted, label='precision', color='red')
ax[2].plot(tpr_sorted, accuracy_sorted, label='accuracy', color='black')
ax[2].set_xlabel('tpr (1 - fnr)')
ax[2].legend(loc='center left', bbox_to_anchor=(1, 0.5))
fig.tight_layout()
fig.subplots_adjust(top=0.88)
pyplot.show()
```

So when randomness completely rules, in practice it is observed that they generally have an inverse relationship. There are different situations, but can we say generally that if we increase precision we predict negative examples more accurately, and if we increase recall we predict positive examples more accurately? – tkarahan – 2019-04-12T08:13:30.577

@TolgaKarahan First we need to define "more accurately" in terms of TN, TP, etc. For example, "accuracy" accounts for both positives and negatives, i.e. (TP+TN) / (P+N); I added it to the plots, and it rises and then falls for better-than-random models. – Esmailian – 2019-04-12T11:15:32.827

I mean the ratio of correctly predicted labels to all labels for a specific class, like TP / P or TN / N. If I increase precision, does the model predict negative examples more accurately, i.e. does TN / N increase? – tkarahan – 2019-04-12T11:47:38.330

@TolgaKarahan Aha. For better-than-random models, an increase in precision means a decrease in recall (and vice versa), which is a decrease in TP/P (P = TP+FN). For TN/N: when the threshold is increased (recall decreases), both TP and FP decrease since we are selecting fewer positives, thus FP/N decreases and 1 - FP/N = TN/N increases. So the answer to your question is yes. – Esmailian – 2019-04-12T11:59:11.320

It's good. Finally, if I define TP / P as positive recall and TN / N as negative recall, then I suppose that by increasing precision I increase negative recall, and by increasing recall (which is the same thing) I also increase positive recall. So it is a matter of increasing negative or positive recall, and which one is more important to me. – tkarahan – 2019-04-12T12:09:14.027

@TolgaKarahan Exactly. – Esmailian – 2019-04-12T12:15:19.527

Thank you so much. It really helped to clarify the subject. – tkarahan – 2019-04-12T12:16:56.613