Understanding Specificity and Sensitivity

Introduction

2020 was a tough year. Alex and I founded Celantur last March, ten days before Austria entered its first lockdown. Mario boldly joined us in May when the pandemic was in full swing. Thanks to the digital nature of our service and our remote-first culture, we are weathering the crisis comparatively well.

Our core business is the protection of everyone’s privacy, and yet there is a crucial aspect to it that we share with the fight against COVID-19.

Test negative or test positive?

In an ideal world, an observation or a test will always tell you the truth, i.e. your COVID-19 test is positive if you suffer from the disease, and negative if you do not.

Reality deviates significantly from this ideal. Sometimes you suffer from COVID-19 even if your test does not detect it, and sometimes you don’t have it even if your test shows the opposite. And sometimes, as Elon Musk will tell you, these mistakes happen too often.

False positives and false negatives

Let’s visualise the problem using a so-called confusion matrix:

Confusion matrix		Observation
Confusion matrix		Positive	Negative
Reality	Positive	😀 True positive (TP)	😡 False negative (FN)
Reality	Negative	😡 False positive (FP)	😀 True negative (TN)

If you observe something correctly, then it’s either a true positive, i.e. a correct positive test, or a true negative, i.e. a correct negative test.

If you observe something wrongly, then it’s either a false positive, e.g. a positive test despite the patient being coronavirus-free, or a false negative, e.g. a negative test despite the patient being infected.
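To make the four cells concrete, here is a minimal sketch that counts them from paired lists of reality and observation. The function name `confusion_counts` and the toy data are made up for illustration and are not part of any Celantur tooling.

```python
def confusion_counts(reality, observation):
    """Count TP, FN, FP and TN from paired booleans (True = positive)."""
    if len(reality) != len(observation):
        raise ValueError("reality and observation must have the same length")
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for actual, observed in zip(reality, observation):
        if actual and observed:
            counts["TP"] += 1   # infected, test positive
        elif actual:
            counts["FN"] += 1   # infected, test negative
        elif observed:
            counts["FP"] += 1   # healthy, test positive
        else:
            counts["TN"] += 1   # healthy, test negative
    return counts


# Hypothetical toy data: five people, True = infected / test positive.
reality = [True, True, False, False, False]
observation = [True, False, True, False, False]
print(confusion_counts(reality, observation))
```

Each person lands in exactly one cell, so the four counts always add up to the size of the population.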

Ideally, you minimise both forms of wrong observations, but in most cases, you have to make the cruel trade-off between fewer false negatives and fewer false positives.

Confusion matrix applied to real life

Let’s suppose a population of 1000 people, of which 200 are infected with COVID-19. In the first scenario, we have a very sensitive test which detects all the COVID-19 cases. Unfortunately, it also misidentifies 300 healthy patients as infected:

1st scenario		Observation
1st scenario		Positive	Negative
Reality	Positive	TP = 200	FN = 0
Reality	Negative	FP = 300	TN = 500

In the second scenario, we have a very specific test which doesn’t produce the above error, but it yields a positive detection only for 50 cases with the highest virus count, ignoring the remaining 150 infections:

2nd scenario		Observation
2nd scenario		Positive	Negative
Reality	Positive	TP = 50	FN = 150
Reality	Negative	FP = 0	TN = 800

Mathematically speaking, sensitivity is the number of true positives divided by the sum of true positives and false negatives:

\[ \text{sensitivity} = \frac{TP}{TP + FN} \]

And specificity is the number of true negatives divided by the sum of true negatives and false positives:

\[ \text{specificity} = \frac{TN}{TN + FP} \]

In the first scenario, the sensitivity is 100% and specificity is only 62.5%, whereas in the second scenario the sensitivity is only 25%, but specificity is 100%.

	Sensitivity	Specificity
1st scenario	100%	62.5%
2nd scenario	25%	100%
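The figures for both scenarios can be checked with a few lines of Python. This is only a sketch: the function names and the `scenarios` dictionary are our own naming, and the counts are the ones from the two scenarios above.

```python
def sensitivity(tp, fn):
    """True positives divided by all actually positive cases."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives divided by all actually negative cases."""
    return tn / (tn + fp)


# Counts taken from the two scenarios (1000 people, 200 infected).
scenarios = {
    "1st scenario": {"TP": 200, "FN": 0, "FP": 300, "TN": 500},
    "2nd scenario": {"TP": 50, "FN": 150, "FP": 0, "TN": 800},
}

for name, c in scenarios.items():
    sens = sensitivity(c["TP"], c["FN"])
    spec = specificity(c["TN"], c["FP"])
    print(f"{name}: sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

For the first scenario this works out to 200 / 200 = 100% and 500 / 800 = 62.5%, for the second to 50 / 200 = 25% and 800 / 800 = 100%.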

Summing it up, maximising sensitivity means reducing false negatives, and maximising specificity means reducing false positives.

sensitivity ↑ = ↓ false negatives	specificity ↑ = ↓ false positives

Different situations prioritise sensitivity and specificity differently.

For example, if donated blood is tested for sexually transmitted diseases, tests should have high sensitivity. Even though some blood samples will wrongly test positive, it is more important to prevent a patient from receiving tainted blood.

In the case of COVID-19, a high number of false positives would severely disrupt the social and economic lives of many people by unnecessarily quarantining them. Thus, very specific tests that reliably detect infectious cases are preferable.

Celantur: specific and sensitive

To protect your privacy, we remove personal data (e.g. faces and license plates) from images and videos, for which we first need to detect them. And here, we encounter the same fundamental problem as a medical testing facility: false negatives, e.g. missing a face, and false positives, e.g. mistaking a street sign for a license plate.

Figure: Section of the painting “The School of Athens” by Raphael.
A face detector could correctly identify a face (green TP), misidentify something as a face (FP), miss a face (red FN), or correctly ignore everything that is not a face, i.e. the rest of the image (TN).

Analogously, different applications demand different levels of sensitivity and specificity. Mapping a public square with many unaware passers-by requires the anonymisation to be very sensitive, even if it entails some false positives. On the other hand, mapping an industrial plant requires the anonymisation to be very specific to avoid machines and equipment being mistakenly blurred.
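One common way to move between "very sensitive" and "very specific" is to change the confidence threshold above which a detection is accepted. The sketch below assumes a detector that assigns a confidence score to each candidate region. The scores, labels and the function `rates_at_threshold` are hypothetical and do not describe how Celantur's models are actually tuned.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (sensitivity, specificity) when accepting scores >= threshold."""
    tp = fn = fp = tn = 0
    for score, is_face in zip(scores, labels):
        detected = score >= threshold
        if is_face and detected:
            tp += 1
        elif is_face:
            fn += 1
        elif detected:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical confidence scores for eight candidate regions;
# the first four really are faces, the last four are not.
scores = [0.95, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.10]
labels = [True, True, True, True, False, False, False, False]

# A low threshold favours sensitivity (public square),
# a high threshold favours specificity (industrial plant).
for threshold in (0.35, 0.75):
    sens, spec = rates_at_threshold(scores, labels, threshold)
    print(f"threshold {threshold}: sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

With this toy data, the low threshold finds every face but blurs one non-face, while the high threshold blurs no non-faces but misses half of the faces.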

Summary

Understanding the trade-off between specificity and sensitivity is crucial in fields as diverse as epidemiology and data privacy.
You have to consider the trade-off individually for each application, e.g. public square vs. industrial plant.
A confusion matrix can help you decide on the right trade-off for your project.

Do your image datasets contain personal data like faces or license plates?

Try Celantur automated image and video blurring. Respect individuals' privacy, comply with data privacy laws and avoid hefty fines. Give it a try!

About Celantur

At Celantur, we use several distinct machine learning models and sets of parameterisations in order to maximise sensitivity/specificity for each customer's individual use case.

✅ We anonymize all kinds of RGB-imagery: planar, panorama images and videos.

✅ Our cloud platform is capable of anonymizing around 200,000 panoramas and 24 hours of video per day.

✅ Industry-grade anonymization quality: detection rate up to 99%.

Expert in image processing optimization?

Take part in our performance engineering technical challenge and win a €150 Amazon voucher! Terms and conditions here.
