🔍 Statistical Methods

MAD vs Tukey: Choosing the Right Outlier Detection Method

Not all outlier detection methods are created equal. Learn when to use MAD (Median Absolute Deviation) versus Tukey's 1.5×IQR method, how each works, and which performs better for different data distributions.

Published: August 27, 2025
Reading Time: 14 minutes
Difficulty Level: Intermediate

1. What Are Outliers and Why Do They Matter?

Outliers are data points that deviate significantly from the rest of your dataset. They can represent:

  • Data entry errors: Typos, misplaced decimal points, or incorrect measurements
  • Rare events: Legitimate but unusual observations (e.g., a student scoring 100% on a difficult test)
  • Measurement errors: Equipment malfunctions or environmental factors
  • True anomalies: Real but exceptional values that require investigation

Detecting outliers is crucial because they can:

  • Skew your statistics: Outliers can dramatically affect the mean and standard deviation
  • Mislead your analysis: They can hide patterns or create false patterns
  • Require investigation: Understanding why outliers exist can reveal important insights
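
To make the first point concrete, here is a minimal sketch using Python's standard `statistics` module. The five-value sample and the added value of 100 are made up for illustration and do not come from any real dataset.

```python
import statistics

# Hypothetical sample, invented for illustration only.
clean = [10, 11, 12, 13, 14]
with_outlier = clean + [100]  # same data plus one extreme value

for label, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(
        f"{label:>12}: mean={statistics.mean(data):.2f} "
        f"stdev={statistics.stdev(data):.2f} "
        f"median={statistics.median(data):.2f}"
    )
```

The mean and standard deviation jump once the extreme value is added, while the median barely moves. That difference is why both methods in this article are built on medians and quartiles rather than on the mean.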

2. Tukey's 1.5×IQR Method Explained

Tukey's method (also called the 1.5×IQR rule) is the most commonly used outlier detection method for box plots. It was developed by John Tukey in the 1970s as part of exploratory data analysis.

How It Works

  1. Calculate Q1 (first quartile) and Q3 (third quartile)
  2. Calculate IQR (Interquartile Range) = Q3 - Q1
  3. Calculate the lower fence = Q1 - 1.5 × IQR
  4. Calculate the upper fence = Q3 + 1.5 × IQR
  5. Any data point < lower fence or > upper fence is considered an outlier

💡 Example

If Q1 = 20, Q3 = 40, then IQR = 20

Lower fence = 20 - 1.5 × 20 = -10

Upper fence = 40 + 1.5 × 20 = 70

Any value < -10 or > 70 is an outlier.
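
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not PlotNerd's implementation: the function names `tukey_fences` and `tukey_outliers` are made up here, the quartiles use the standard library's "inclusive" (linear interpolation) convention, and the sample data is hypothetical, chosen so that Q1 = 20 and Q3 = 40 as in the worked example.

```python
import statistics

def tukey_fences(data, k=1.5):
    """Return (lower_fence, upper_fence) for Tukey's k x IQR rule."""
    # Assumption: "inclusive" quartiles (linear interpolation).
    # Other quartile conventions give slightly different fences.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def tukey_outliers(data, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    lower, upper = tukey_fences(data, k)
    return [x for x in data if x < lower or x > upper]

# Hypothetical data, chosen so that Q1 = 20 and Q3 = 40.
sample = [5, 12, 20, 25, 30, 35, 40, 48, 95]

print(tukey_fences(sample))    # lower and upper fence
print(tukey_outliers(sample))  # values beyond the fences
```

Quartile definitions differ between tools, so fences computed elsewhere may not match these exactly on small samples.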

Pros and Cons

✅ Advantages

  • Simple and intuitive
  • Widely understood and accepted
  • Works well for symmetric data
  • Standard in box plot visualization
  • Fast to calculate

❌ Limitations

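  • Assumes a roughly symmetric distribution (both fences sit the same distance from their quartile)
  • Can flag too many points on the long tail of skewed data
  • Quartiles shift, and the fences widen, when a large share of the data (around a quarter or more) is contaminated
  • May miss outliers on the short tail of skewed distributions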

3. MAD (Median Absolute Deviation) Method Explained

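MAD (Median Absolute Deviation) is a robust outlier detection method that holds up better than Tukey's method when the data is skewed or contains many outliers. It measures spread as the median distance from the median rather than the distance between quartiles, so it tolerates up to about 50% contaminated points before breaking down, compared with about 25% for the IQR.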

How It Works

  1. Calculate the median of your data
  2. Calculate the absolute deviations from the median: |value - median|
  3. Calculate the MAD = median of absolute deviations
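  4. Calculate the modified Z-score for each value = 0.6745 × (value - median) / MAD
  5. Any point with |modified Z-score| > threshold (typically 3.5) is an outlier

The constant 0.6745 rescales the MAD so that it is comparable to a standard deviation for normally distributed data. The usual threshold of 3.5 assumes this scaling.

💡 Example

If median = 25, MAD = 5, threshold = 3.5

For a value of 45: modified Z-score = 0.6745 × (45 - 25) / 5 ≈ 2.70

Since |2.70| < 3.5, this value is not an outlier. A value is flagged only when it lies more than 3.5 × 5 / 0.6745 ≈ 25.9 from the median, that is, below about -0.9 or above about 50.9.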
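
The steps above can be sketched in Python as follows. This is an illustration, not PlotNerd's implementation: the function names are made up, the scaling constant 0.6745 is assumed (it is the constant normally paired with the 3.5 threshold in the modified Z-score), and the sample is hypothetical, chosen so that its median is 25 and its MAD is 5.

```python
import statistics

def modified_z_scores(data, scale=0.6745):
    """Return the modified Z-score of every value in data."""
    med = statistics.median(data)
    mad = statistics.median(abs(x - med) for x in data)
    if mad == 0:
        # Happens when more than half the values are identical;
        # the scores are undefined in that case.
        raise ValueError("MAD is zero; modified Z-scores are undefined")
    return [scale * (x - med) / mad for x in data]

def mad_outliers(data, threshold=3.5):
    """Return the values whose |modified Z-score| exceeds threshold."""
    scores = modified_z_scores(data)
    return [x for x, z in zip(data, scores) if abs(z) > threshold]

# Hypothetical data: median = 25, MAD = 5.
sample = [15, 20, 22, 25, 30, 45, 60]

for x, z in zip(sample, modified_z_scores(sample)):
    print(f"{x:>3}: {z:+.2f}")
print(mad_outliers(sample))
```

With the assumed constant and the 3.5 threshold, only 60 is flagged in this sample. The zero-MAD guard matters in practice: data with many repeated values can make the MAD collapse to zero.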

Pros and Cons

✅ Advantages

  • Robust to outliers (uses median, not mean)
  • Works well for skewed data
  • Less sensitive to extreme values
  • Better for asymmetric distributions
  • More accurate for non-normal data

❌ Limitations

  • Less well-known than Tukey's method
  • Slightly more complex to explain
  • Requires choosing a threshold (typically 3.5)
  • May be too conservative for some applications

4. Side-by-Side Comparison

Aspect     | Tukey (1.5×IQR)                      | MAD
Basis      | Quartiles (Q1, Q3)                   | Median and absolute deviations
Best For   | Symmetric, normal-like distributions | Skewed, asymmetric distributions
Robustness | Moderate (uses quartiles)            | High (uses median)
Complexity | Simple (easy to explain)             | Moderate (requires threshold)
Popularity | Very common (box plot standard)      | Less common (growing in use)
Threshold  | Fixed (1.5 × IQR)                    | Configurable (typically 3.5)

5. When to Use Each Method

✅ Use Tukey's Method When:

  • Your data is approximately symmetric
  • You're creating standard box plots
  • You need a simple, widely-understood method
  • Your audience expects traditional box plots
  • You're working with normally-distributed data
  • You want consistency with standard practices

✅ Use MAD Method When:

  • Your data is skewed or asymmetric
  • You have many outliers that might affect quartiles
  • You need a more robust method
  • You're working with non-normal distributions
  • You want better accuracy for skewed data
  • You're analyzing data with potential contamination

6. Practical Examples

Example 1: Symmetric Data (Tukey Preferred)

Scenario: Test scores from a well-designed exam (approximately normal distribution).

Data:

75, 78, 80, 82, 85, 87, 90, 92, 95, 98

Result: Neither method flags any of these scores. Both work well, but Tukey's method is simpler and more standard for this case.

→ Try this example in Outlier Calculator (switch between methods) →

Example 2: Skewed Data (MAD Preferred)

Scenario: Income data (right-skewed distribution with a few high earners).

Data:

30, 35, 40, 45, 50, 55, 60, 65, 70, 200

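Result: Both methods flag 200 in this sample: it lies well above Tukey's upper fence (roughly 100), and its modified Z-score is far above 3.5. The MAD method is the more robust choice here because its median-based scale would move far less if more high earners were added, whereas Tukey's Q3 and upper fence would climb.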

→ Try this example in Outlier Calculator (compare methods) →

Example 3: Data with Many Outliers

Scenario: Sensor readings with potential measurement errors.

Data:

12.1, 12.3, 12.5, 12.7, 12.9, 13.1, 13.3, 50.0, 55.0, 60.0

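Result: Tukey's method flags none of the three high readings here. With 30% of the data contaminated, Q3 is pulled up toward the bad values, so the upper fence lands above 60. The MAD method flags all three, because the median (13.0) and the MAD (about 0.6) are determined by the seven clean readings. This makes it better at detecting true outliers in contaminated data.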

→ Try this example in Outlier Calculator (test both methods) →
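
The three example datasets above can also be checked with a short script. This is a sketch under stated assumptions rather than a reproduction of PlotNerd's calculator: the function names are made up, the quartiles use the standard library's "inclusive" convention, and the modified Z-score uses the 0.6745 scaling constant with a 3.5 threshold.

```python
import statistics

def tukey_outliers(data, k=1.5):
    """Values outside Q1 - k*IQR and Q3 + k*IQR (inclusive quartiles)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return [x for x in data if x < q1 - k * iqr or x > q3 + k * iqr]

def mad_outliers(data, threshold=3.5, scale=0.6745):
    """Values whose |modified Z-score| exceeds threshold (assumes MAD > 0)."""
    med = statistics.median(data)
    mad = statistics.median(abs(x - med) for x in data)
    return [x for x in data if scale * abs(x - med) / mad > threshold]

# Data copied from Examples 1 to 3 above.
examples = {
    "test scores": [75, 78, 80, 82, 85, 87, 90, 92, 95, 98],
    "incomes": [30, 35, 40, 45, 50, 55, 60, 65, 70, 200],
    "sensor readings": [12.1, 12.3, 12.5, 12.7, 12.9, 13.1, 13.3, 50.0, 55.0, 60.0],
}

for name, data in examples.items():
    print(f"{name}: Tukey={tukey_outliers(data)} MAD={mad_outliers(data)}")
```

Run as written, the sketch reports no outliers for the test scores under either method, flags 200 under both methods for the incomes, and for the sensor readings flags all three high values under MAD but none under Tukey.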

7. FAQ

Q: Which method is more accurate?

A: Neither is universally more accurate. Tukey's method is better for symmetric, normal-like distributions, while the MAD method is better for skewed or asymmetric data. The "best" method depends on your data's distribution.

Q: Can I use both methods in PlotNerd?

A: Yes! PlotNerd's Outlier Calculator allows you to switch between Tukey and MAD methods in real-time. Simply select your preferred method from the dropdown in the results panel, and the chart will update instantly. This lets you compare how each method identifies outliers in your data.

Q: What's the MAD threshold in PlotNerd?

A: PlotNerd uses a default threshold of 3.5 for MAD outlier detection, which is the standard in statistical literature. This means any data point with a modified Z-score greater than 3.5 (in absolute value) is considered an outlier.

Q: Should I remove outliers after detecting them?

A: Not necessarily! Outliers can be legitimate data points that require investigation. Before removing them, consider:

  • Are they data entry errors? (If yes, correct or remove)
  • Are they rare but legitimate events? (Keep them, but note them)
  • Do they represent important insights? (Investigate further)
  • Do they significantly affect your analysis? (Consider robust methods)

Q: Can I use different methods for different groups in a grouped box plot?

A: For consistency, PlotNerd uses the same outlier detection method for all groups in a grouped box plot. This ensures fair comparison across groups. You can switch the method, but it will apply to all groups simultaneously.

8. Conclusion

Choosing between Tukey's 1.5×IQR and MAD outlier detection methods depends on your data's characteristics:

  • Use Tukey's method for symmetric, normal-like distributions and standard box plots
  • Use the MAD method for skewed, asymmetric data or when you need more robust outlier detection

With PlotNerd, you can easily compare both methods in real-time, seeing how each identifies outliers in your specific dataset. This helps you choose the most appropriate method for your analysis.

Ready to Test Both Methods?

Try PlotNerd's outlier detection calculator to see how Tukey and MAD methods compare on your data.
