### ROC Curve

Most of us use the ROC curve to assess our binary classifiers everyday. Sometimes we take for granted its theoretical properties. In this post, I will take some time and analyze why the properties are what they are.

### 2 properties will be covered:

- The baseline is the diagonal line
- The area can be interpreted as how strong the classifier is

### 1. The baseline is the diagonal line

So as we know, the x-axis of the ROC curve is the False Positive Rate (FPR) and the y-axis is the True Positive Rate (TPR).

If we have a dataset with $\pi \in [0,1] $ positives, and $ 1 - \pi $ in the negatives, and we predict randomly with a positive rate of $ p $ and negative rate of $ 1 - p $, then we can calculate the TPR and FPR as a ratio.

Since TPR and FPR are both p, a random classifier (baseline) will have a ROC curve of slope 1 (the diagonal) and an AUC of 0.5.

### 2. The area can be interpreted as how strong the classifier is

Technically, the area is a bit different from I described.

Let’s go into why.

The integral means that for each negative example, count the number of positive examples with a higher score than this negative example.

If we have a perfect classifier, then all P positives will be scored higher than the negative example, so the integral will result in a maximum area of $ P * N$.

*Note: Intuitively, the perfect classifier has {TPR = 1 and FPR = 0}, which is the upper left point*

Combined together, the 2 terms give us a nice interpretation of the $A_{AUC} \in [0,1] $ as the classifier’s ability to discern positive and negative data.

### Conclusion

Unfortunately, I did not go over other properties such as linear correlation with **accuracy**, **pareto optimality** and relationships with the **calibration curve**. That’s for another day.

The 2 main properties outlined in this post make the ROC curve a fairly good way to compare binary classifiers. These are great theoretical advantages that other popular metrics (such as the **precision-recall** or the **calibration** curves) don’t have.