What is the difference between MAD and Tukey outlier detection?

MAD (Median Absolute Deviation) measures how far each value lies from the median, scaled by the median absolute deviation, while the Tukey method builds fences from the IQR (interquartile range). MAD is more robust to extreme outliers and better suited to heavily skewed data, while Tukey is more intuitive and is the standard rule for box plots.

When should I use MAD instead of Tukey?

Use MAD for heavily skewed distributions, data with multiple outlier clusters, or when you need maximum robustness. Use Tukey for symmetric or mildly skewed data, when creating box plots, or when interpretability is important.

Is MAD better than IQR for outlier detection?

MAD is more robust than IQR-based methods: the median and MAD stay reliable until about half of the data is contaminated, while the quartiles behind the IQR break down at about a quarter. However, IQR is easier to explain and visualize via box plots. For most applications, try both methods and compare results.

How do I calculate MAD outliers?

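Calculate MAD = median(|x - median(x)|), then compute the modified Z-score for each value: 0.6745 × (x - median(x)) / MAD. The constant 0.6745 rescales the MAD so that it is comparable to a standard deviation for normally distributed data. Values whose modified Z-score exceeds 3.5 in absolute value are typically considered outliers.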

MAD vs Tukey: Choosing the Right Outlier Detection Method

Author: PlotNerd

1. What Are Outliers and Why Do They Matter?

Outliers are data points that deviate significantly from the rest of your dataset. They can represent:

Data entry errors: Typos, misplaced decimal points, or incorrect measurements
Rare events: Legitimate but unusual observations (e.g., a student scoring 100% on a difficult test)
Measurement errors: Equipment malfunctions or environmental factors
True anomalies: Real but exceptional values that require investigation

Detecting outliers is crucial because they can:

Skew your statistics: Outliers can dramatically affect the mean and standard deviation
Mislead your analysis: They can hide patterns or create false patterns
Require investigation: Understanding why outliers exist can reveal important insights
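To make the first point concrete, here is a minimal sketch using Python's standard `statistics` module. The numbers are illustrative values chosen for this demonstration, not measurements from a real study.

```python
from statistics import mean, median, stdev

# Illustrative values: nine ordinary observations, then one extreme value.
clean = [30, 35, 40, 45, 50, 55, 60, 65, 70]
with_outlier = clean + [200]

for label, data in (("without 200", clean), ("with 200", with_outlier)):
    print(
        f"{label}: mean={mean(data):.1f} "
        f"median={median(data):.1f} stdev={stdev(data):.1f}"
    )
```

A single extreme value moves the mean from 50 to 65 and more than triples the standard deviation, while the median only shifts from 50 to 52.5. That stability is why both methods below are built on the median and quartiles rather than the mean.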

2. Tukey's 1.5×IQR Method Explained

Tukey's method (also called the 1.5×IQR rule) is the most commonly used outlier detection method for box plots. It was developed by John Tukey in the 1970s as part of exploratory data analysis.

How It Works

Calculate Q1 (first quartile) and Q3 (third quartile)
Calculate IQR (Interquartile Range) = Q3 - Q1
Calculate the lower fence = Q1 - 1.5 × IQR
Calculate the upper fence = Q3 + 1.5 × IQR
Any data point < lower fence or > upper fence is considered an outlier

Example

If Q1 = 20, Q3 = 40, then IQR = 20

Lower fence = 20 - 1.5 × 20 = -10

Upper fence = 40 + 1.5 × 20 = 70

Any value < -10 or > 70 is an outlier.
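The fence arithmetic above can be sketched in a few lines of Python. The function names `tukey_fences` and `is_tukey_outlier` are hypothetical helpers written for this article, and the multiplier `k` is exposed as a parameter on the assumption that you may want something other than the conventional 1.5.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) Tukey fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_tukey_outlier(x, q1, q3, k=1.5):
    """True if x lies strictly beyond either fence."""
    lower, upper = tukey_fences(q1, q3, k)
    return x < lower or x > upper


print(tukey_fences(20, 40))          # (-10.0, 70.0)
print(is_tukey_outlier(75, 20, 40))  # True
print(is_tukey_outlier(70, 20, 40))  # False: on the fence, not beyond it
```

This sketch takes the quartiles as inputs because different tools compute Q1 and Q3 with slightly different conventions, which can shift the fences on small samples.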

Pros and Cons

Advantages

Simple and intuitive
Widely understood and accepted
Works well for symmetric data
Standard in box plot visualization
Fast to calculate

Limitations

Assumes symmetric distribution
Can flag too many points in skewed data
Can be masked by outlier clusters: quartiles break down once more than 25% of the data is contaminated
May miss outliers in skewed distributions

3. MAD (Median Absolute Deviation) Method Explained

MAD (Median Absolute Deviation) is a robust outlier detection method that works better than Tukey's method for skewed or asymmetric data. It's based on the median rather than quartiles, making it more resistant to outliers.

How It Works

Calculate the median of your data
Calculate the absolute deviations from the median: |value - median|
Calculate the MAD = median of absolute deviations
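Calculate the modified Z-score for each value: 0.6745 × (value - median) / MAD
Any point with |modified Z-score| > threshold (typically 3.5) is an outlier

Example

If median = 25, MAD = 5, threshold = 3.5

For a value of 45: modified Z-score = 0.6745 × (45 - 25) / 5 ≈ 2.70

Since |2.70| ≤ 3.5, this value is not an outlier.

For a value of 55: modified Z-score = 0.6745 × (55 - 25) / 5 ≈ 4.05

Since |4.05| > 3.5, this value is an outlier.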
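Here is a minimal Python sketch of the MAD steps, assuming the common convention that scales the score by 0.6745 (the formula quoted at the top of this article). The function names and the two small samples are hypothetical; the samples were chosen so that the median is 25 and the MAD is 5. The zero-MAD guard is also an assumption: when more than half of the values are identical, the MAD is 0 and the score is undefined.

```python
from statistics import median


def modified_z_scores(values, c=0.6745):
    """Modified Z-score of each value: c * (x - median) / MAD."""
    center = median(values)
    mad = median(abs(x - center) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-scores are undefined")
    return [c * (x - center) / mad for x in values]


def mad_outliers(values, threshold=3.5):
    """Values whose |modified Z-score| exceeds the threshold."""
    scores = modified_z_scores(values)
    return [x for x, z in zip(values, scores) if abs(z) > threshold]


# Hypothetical samples with median 25 and MAD 5.
print(mad_outliers([15, 20, 25, 30, 45]))  # [] (score for 45 is about 2.70)
print(mad_outliers([15, 20, 25, 30, 55]))  # [55] (score for 55 is about 4.05)
```

Without the 0.6745 factor, the score for 45 would be 4.0 and it would cross the 3.5 threshold, so check which convention your tool uses before comparing results.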

Pros and Cons

Advantages

Robust to outliers (uses median, not mean)
Works well for skewed data
Less sensitive to extreme values
Better for asymmetric distributions
More accurate for non-normal data

Limitations

Less well-known than Tukey's method
Slightly more complex to explain
Requires choosing a threshold (typically 3.5)
May be too conservative for some applications

4. Side-by-Side Comparison

Aspect	Tukey (1.5×IQR)	MAD
Basis	Quartiles (Q1, Q3)	Median and absolute deviations
Best For	Symmetric, normal-like distributions	Skewed, asymmetric distributions
Robustness	Moderate (quartiles tolerate up to 25% contamination)	High (median and MAD tolerate up to 50%)
Complexity	Simple (easy to explain)	Moderate (requires threshold)
Popularity	Very common (box plot standard)	Less common (growing in use)
Threshold	Fixed (1.5 × IQR)	Configurable (typically 3.5)

5. When to Use Each Method

Use Tukey's Method When:

Your data is approximately symmetric
You're creating standard box plots
You need a simple, widely-understood method
Your audience expects traditional box plots
You're working with normally-distributed data
You want consistency with standard practices

Use MAD Method When:

Your data is skewed or asymmetric
You have many outliers that might affect quartiles
You need a more robust method
You're working with non-normal distributions
You want better accuracy for skewed data
You're analyzing data with potential contamination

6. Practical Examples

Example 1: Symmetric Data (Tukey Preferred)

Scenario: Test scores from a well-designed exam (approximately normal distribution).

Data:

75, 78, 80, 82, 85, 87, 90, 92, 95, 98

Result: Neither method flags any of these scores, so both work well here. Tukey's method is simpler and more standard for this case.
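If you want to check this outside the calculator, the sketch below runs both detectors on the scores. It is an illustration under stated assumptions, not PlotNerd's implementation: the helper names are hypothetical, quartiles use the linear-interpolation ("inclusive") convention from Python's `statistics.quantiles`, and the modified Z-score uses the 0.6745 constant.

```python
from statistics import median, quantiles


def tukey_outliers(values, k=1.5):
    """Values outside the Tukey fences (linear-interpolated quartiles)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]


def mad_outliers(values, threshold=3.5, c=0.6745):
    """Values whose |modified Z-score| exceeds the threshold."""
    center = median(values)
    mad = median(abs(x - center) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-scores are undefined")
    return [x for x in values if abs(c * (x - center) / mad) > threshold]


scores = [75, 78, 80, 82, 85, 87, 90, 92, 95, 98]
print(tukey_outliers(scores))  # []
print(mad_outliers(scores))    # []
```

Other quartile conventions shift the fences slightly on a sample this small, but the scores sit far enough inside them that the result does not change.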


Example 2: Skewed Data (MAD Preferred)

Scenario: Income data (right-skewed distribution with a few high earners).

Data:

30, 35, 40, 45, 50, 55, 60, 65, 70, 200

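Result: Both methods flag 200 as an outlier. Its modified Z-score is about 7.96 (median 52.5, MAD 12.5), well above the 3.5 threshold, and it lies far above Tukey's upper fence. MAD remains the more robust choice as the long tail grows, since the median and MAD tolerate a larger share of extreme values than the quartiles do.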
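The numbers behind this example can be reproduced with a short sketch. It assumes linear-interpolated ("inclusive") quartiles and the 0.6745 scaling constant; a tool using a different quartile convention will report a slightly different fence.

```python
from statistics import median, quantiles

incomes = [30, 35, 40, 45, 50, 55, 60, 65, 70, 200]

# Tukey: upper fence from linear-interpolated quartiles.
q1, _, q3 = quantiles(incomes, n=4, method="inclusive")
upper_fence = q3 + 1.5 * (q3 - q1)

# MAD: modified Z-score of the largest value.
center = median(incomes)
mad = median(abs(x - center) for x in incomes)
score_200 = 0.6745 * (200 - center) / mad

print(upper_fence)          # 97.5
print(round(score_200, 2))  # 7.96
```

The value 200 is more than twice the upper fence and its score is more than twice the MAD threshold, so neither method is close to a borderline call on this dataset.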


Example 3: Data with Many Outliers

Scenario: Sensor readings with potential measurement errors.

Data:

12.1, 12.3, 12.5, 12.7, 12.9, 13.1, 13.3, 50.0, 55.0, 60.0

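Result: MAD method is more robust here. Three of the ten readings (30%) are contaminated, which is more than the quartiles can absorb: the upper quartile gets pulled toward the bad readings, the IQR inflates, and the fences can end up too wide to catch them. The median and MAD tolerate up to 50% contamination, so MAD still flags 50.0, 55.0, and 60.0.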
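The sketch below shows this masking effect on the sensor readings. As before, it assumes linear-interpolated ("inclusive") quartiles and the 0.6745 constant, so treat the exact fence value as dependent on that choice.

```python
from statistics import median, quantiles

readings = [12.1, 12.3, 12.5, 12.7, 12.9, 13.1, 13.3, 50.0, 55.0, 60.0]

# Tukey fences: Q3 falls between 13.3 and 50.0, so the IQR is inflated.
q1, _, q3 = quantiles(readings, n=4, method="inclusive")
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
flagged_by_tukey = [x for x in readings if x < lower or x > upper]

# MAD: the median and MAD are set by the seven clean readings.
center = median(readings)
mad = median(abs(x - center) for x in readings)
flagged_by_mad = [x for x in readings if abs(0.6745 * (x - center) / mad) > 3.5]

print(round(upper, 2))   # 83.24
print(flagged_by_tukey)  # []
print(flagged_by_mad)    # [50.0, 55.0, 60.0]
```

With this quartile convention the upper fence lands near 83, above all three suspect readings, so Tukey's rule reports nothing while MAD reports all three.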


7. FAQ

Q: Which method is more accurate?

A: Neither is universally more accurate. Tukey's method is better for symmetric, normal-like distributions, while the MAD method is better for skewed or asymmetric data. The "best" method depends on your data's distribution.

Q: Can I use both methods in PlotNerd?

A: Yes! PlotNerd's Outlier Calculator allows you to switch between Tukey and MAD methods in real-time. Simply select your preferred method from the dropdown in the results panel, and the chart will update instantly. This lets you compare how each method identifies outliers in your data.

Q: What's the MAD threshold in PlotNerd?

A: PlotNerd uses a default threshold of 3.5 for MAD outlier detection, the cutoff recommended by Iglewicz and Hoaglin and widely used in the statistical literature. This means any data point with a modified Z-score greater than 3.5 (in absolute value) is considered an outlier.

Q: Should I remove outliers after detecting them?

A: Not necessarily! Outliers can be legitimate data points that require investigation. Before removing them, consider:

Are they data entry errors? (If yes, correct or remove)
Are they rare but legitimate events? (Keep them, but note them)
Do they represent important insights? (Investigate further)
Do they significantly affect your analysis? (Consider robust methods)

Q: Can I use different methods for different groups in a grouped box plot?

A: For consistency, PlotNerd uses the same outlier detection method for all groups in a grouped box plot. This ensures fair comparison across groups. You can switch the method, but it will apply to all groups simultaneously.

8. Conclusion

Choosing between Tukey's 1.5×IQR and MAD outlier detection methods depends on your data's characteristics:

Use Tukey's method for symmetric, normal-like distributions and standard box plots
Use the MAD method for skewed, asymmetric data or when you need more robust outlier detection

With PlotNerd, you can easily compare both methods in real-time, seeing how each identifies outliers in your specific dataset. This helps you choose the most appropriate method for your analysis.

Ready to Test Both Methods?

Try PlotNerd's outlier detection calculator to see how Tukey and MAD methods compare on your data.
