This document provides an overview of receiver operating characteristic (ROC) curves. It defines an ROC curve as a graphical plot that illustrates the performance of a binary classifier system by varying its discrimination threshold. An ROC curve plots the true positive rate against the false positive rate. The area under the ROC curve (AUC) provides a single measure of classifier performance, where an AUC of 1 represents a perfect classifier and 0.5 represents a random classifier. The document discusses how ROC curves can be used to compare multiple classifiers and select optimal threshold values to balance sensitivity and specificity.
• A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.
• The diagnostic performance of a test, i.e. the accuracy of a test in discriminating diseased cases from normal cases, is evaluated using Receiver Operating Characteristic (ROC) curve analysis.
• A Receiver Operating Characteristic (ROC) curve is a way to compare diagnostic tests. It is a plot of the true positive rate against the false positive rate.
Theory
• When you consider the results of a particular test in two populations, one population with a disease, the other population without the disease, you will rarely observe a perfect separation between the two groups. Instead, the distributions of the test results will overlap.
• For every possible cut-off point or criterion value you select to discriminate between the two populations, some cases with the disease will be correctly classified as positive (TP = True Positive fraction), but some cases with the disease will be classified as negative (FN = False Negative fraction). Likewise, some cases without the disease will be correctly classified as negative (TN = True Negative fraction), but some cases without the disease will be classified as positive (FP = False Positive fraction). The sketch below computes these four fractions at a given cut-off.
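As a concrete illustration, here is a minimal sketch of these four fractions for a chosen cut-off value. The function name, scores, and labels are made up for the example (Python with NumPy is assumed), not taken from the slides.

```python
import numpy as np

def confusion_fractions(scores, labels, threshold):
    # Hypothetical helper: labels use 1 = diseased, 0 = normal, and
    # scores at or above the cut-off are called positive.
    pred_pos = scores >= threshold
    diseased = labels == 1

    tp = np.sum(pred_pos & diseased) / np.sum(diseased)     # True Positive fraction
    fn = np.sum(~pred_pos & diseased) / np.sum(diseased)    # False Negative fraction
    tn = np.sum(~pred_pos & ~diseased) / np.sum(~diseased)  # True Negative fraction
    fp = np.sum(pred_pos & ~diseased) / np.sum(~diseased)   # False Positive fraction
    return tp, fn, tn, fp

# Made-up test results: the two groups overlap, so no cut-off is perfect.
scores = np.array([0.2, 0.6, 0.4, 0.8, 0.3, 0.7, 0.45, 0.9])
labels = np.array([0,   0,   1,   1,   0,   1,   1,    0])
print(confusion_fractions(scores, labels, threshold=0.5))  # (0.5, 0.5, 0.5, 0.5)
```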
History
• The name"Receiver Operating Characteristic" came from "Signal Detection
Theory" developed during World War II for the analysis of radar images.
• Radar operators had to decide whether a blip on the screen represented an
enemy target, a friendly ship, or just noise.
• Signal detection theory measures the ability of radar receiver operators to
make these important distinctions.
• Their ability to do so was called the Receiver Operating Characteristics.
• It was not until the 1970s that signal detection theory was recognized as useful for interpreting medical test results.
A ROC plot shows:
• The trade-off between sensitivity and specificity: moving the cut-off to decrease sensitivity results in an increase in specificity, and vice versa.
• Test accuracy: the closer the curve is to the top and left-hand borders, the more accurate the test. Likewise, the closer the curve is to the diagonal, the less accurate the test.
• A perfect test would go straight from the origin up to the top-left corner and then straight across to the top-right corner.
• The likelihood ratio, given by the slope (derivative) of the curve at any particular cut-point; a sketch of this follows below.
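Since the likelihood ratio corresponds to the local slope of the curve, a minimal sketch of that idea follows. The ROC points are hypothetical, and approximating the derivative by each segment's slope on a piecewise-linear curve is an assumption of this sketch, not the slides' own computation.

```python
import numpy as np

# Hypothetical ROC points (FPR, TPR), ordered by increasing FPR.
fpr = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
tpr = np.array([0.0, 0.5, 0.8, 0.95, 1.0])

# The likelihood ratio at a cut-point is the local slope dTPR/dFPR; on a
# piecewise-linear curve we approximate it by each segment's slope.
segment_lr = np.diff(tpr) / np.diff(fpr)
print(segment_lr)  # 5.0, 1.5, 0.5, 0.125 for these points
```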
ROC shows trade-offs between sensitivity and specificity
• The ROC plot is a model-wide evaluation measure that is based on
two basic evaluation measures – specificity and sensitivity.
• Specificity is a performance measure of the whole negative part of a
dataset, whereas sensitivity is a performance measure of the whole
positive part.
• The ROC plot uses 1 – specificity on the x-axis and sensitivity on the y-axis. The false positive rate (FPR) is identical to 1 – specificity, and the true positive rate (TPR) is identical to sensitivity, as the mapping below illustrates.
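A small sketch of this coordinate mapping, using made-up sensitivity/specificity pairs:

```python
# Hypothetical (sensitivity, specificity) pairs for four cut-offs.
pairs = [(0.30, 0.98), (0.60, 0.90), (0.80, 0.75), (0.95, 0.40)]

# Map each pair to a ROC point: x = FPR = 1 - specificity, y = TPR = sensitivity.
roc_points = [(1.0 - spec, sens) for sens, spec in pairs]
print(roc_points)
```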
Making a ROC curve by connecting ROC points
• A ROC point is a point with a pair of x and y values in the ROC space, where x is 1 – specificity and y is sensitivity.
• A ROC curve is created by connecting all ROC points of a classifier in the ROC space. Two adjacent ROC points can be connected by a straight line, and the curve starts at (0.0, 0.0) and ends at (1.0, 1.0); a threshold-sweep sketch follows below.
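A minimal sketch of this construction, assuming continuous scores where higher values indicate the positive class; the function name and the example data are illustrative, not from the slides.

```python
import numpy as np

def roc_points(scores, labels):
    # A sketch, not an optimized implementation: every distinct score is
    # tried as a cut-off, and one (FPR, TPR) point is recorded for each.
    pos, neg = labels == 1, labels == 0
    points = [(0.0, 0.0)]                       # the curve starts at the origin
    for t in np.unique(scores)[::-1]:           # cut-offs from high to low
        pred = scores >= t
        tpr = np.sum(pred & pos) / np.sum(pos)  # sensitivity
        fpr = np.sum(pred & neg) / np.sum(neg)  # 1 - specificity
        points.append((fpr, tpr))
    # At the lowest cut-off every case is positive, so the final point is (1.0, 1.0).
    return points

scores = np.array([0.2, 0.6, 0.4, 0.8, 0.3, 0.7, 0.45, 0.9])
labels = np.array([0, 0, 1, 1, 0, 1, 1, 0])
print(roc_points(scores, labels))
```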
An example of making a ROC curve
• We show a simple example of making a ROC curve by connecting several ROC points. Let us assume that we have calculated sensitivity and specificity values from multiple confusion matrices for four different threshold values.
• We first added four points that match the pairs of sensitivity and specificity values and then connected the points to create a ROC curve.
• The plot shows a ROC curve connecting four ROC points; a plotting sketch follows below.
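A plotting sketch of this example using matplotlib. The four (sensitivity, specificity) pairs are hypothetical, since the slide's actual table values are not reproduced in the text.

```python
import matplotlib.pyplot as plt

# Hypothetical (sensitivity, specificity) values for four thresholds.
sens = [0.30, 0.60, 0.80, 0.95]
spec = [0.98, 0.90, 0.75, 0.40]

fpr = [0.0] + [1 - s for s in spec] + [1.0]  # x-axis: 1 - specificity
tpr = [0.0] + sens + [1.0]                   # y-axis: sensitivity

plt.plot(fpr, tpr, marker="o")               # connect adjacent ROC points
plt.plot([0, 1], [0, 1], linestyle="--")     # random-classifier diagonal
plt.xlabel("1 - specificity (FPR)")
plt.ylabel("sensitivity (TPR)")
plt.title("ROC curve from four ROC points")
plt.show()
```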
Interpretation of ROC curves
a. A ROC curve of a random classifier
• A classifier with the random performance level always shows a straight line from the origin (0.0, 0.0) to the top-right corner (1.0, 1.0).
• The two areas separated by this ROC curve indicate a simple estimate of the performance level: ROC curves in the area containing the top-left corner (0.0, 1.0) indicate good performance levels, whereas ROC curves in the area containing the bottom-right corner (1.0, 0.0) indicate poor performance levels.
• The random classifier's ROC curve thus separates the ROC space into two areas, one for good and one for poor performance levels.
A ROC curve of a perfect classifier
• A classifier with the perfect performance level
shows a combination of two straight lines – from the
origin (0.0, 0.0) to the top left corner (0.0, 1.0) and
further to the top right corner (1.0, 1.0).
• It is important to note that classifiers with meaningful performance levels usually lie in the area between the random ROC curve (the baseline) and the perfect ROC curve.
ROC curves for multiple models
• Comparison of multiple classifiers is usually straightforward, especially when no curves cross each other. Curves close to the perfect ROC curve have a better performance level than ones closer to the baseline.
• For instance, when two ROC curves represent the performance levels of two classifiers A and B, and curve A lies closer to the top-left corner, classifier A clearly outperforms classifier B.
AUC (Area under the ROC curve) score
• Another advantage of using the ROC plot is a single measure called the AUC (area under the ROC curve) score. As the name indicates, it is the area under the curve calculated in the ROC space.
• One easy way to calculate the AUC score is the trapezoidal rule: add up the areas of all trapezoids under the curve. For example, if the areas of the three trapezoids 1, 2, and 3 are 0.0625, 0.15625, and 0.4375, the AUC score is 0.65625; the sketch below reproduces this arithmetic.
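The sketch below reproduces the quoted arithmetic with one set of ROC points consistent with the three trapezoid areas. The slide's exact coordinates are not in the text, so these points are a reconstruction.

```python
import numpy as np

# One set of ROC points whose trapezoids match the quoted areas.
fpr = np.array([0.0, 0.25, 0.50, 1.0])
tpr = np.array([0.0, 0.50, 0.75, 1.0])

# Trapezoidal rule: each segment contributes width * mean height.
areas = np.diff(fpr) * (tpr[:-1] + tpr[1:]) / 2
print(areas)        # the three trapezoid areas: 0.0625, 0.15625, 0.4375
print(areas.sum())  # 0.65625 (NumPy's trapezoid helper gives the same result)
```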
• Although the theoretical range of the AUC score is between 0 and 1, the actual scores of meaningful classifiers are greater than 0.5, which is the AUC score of a random classifier.
• As an example, consider four AUC scores: 1.0 for the classifier with the perfect performance level (P) and 0.5 for the classifier with the random performance level (R). The ROC curves clearly show that classifier A outperforms classifier B, which is also supported by their AUC scores (0.88 and 0.72); a library-based sketch of such a computation follows below.
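In practice, ROC points and AUC scores like these are rarely computed by hand. A common route is scikit-learn, as in the sketch below; the library choice and the synthetic data are assumptions of this example, not from the slides.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)

# Synthetic scores: diseased cases tend to score higher than normal
# cases, but the two distributions overlap (no perfect separation).
labels = np.concatenate([np.ones(500), np.zeros(500)])
scores = np.concatenate([rng.normal(1.0, 1.0, 500),   # diseased group
                         rng.normal(0.0, 1.0, 500)])  # normal group

fpr, tpr, thresholds = roc_curve(labels, scores)  # ROC points, ready to plot
print(roc_auc_score(labels, scores))              # roughly 0.76 for this setup
```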