Optimizing Classification Models for Marketing with the Expected Profit Curve

Juan Esteban de la Calle
8 min read · Jul 6, 2023


Introduction

In the bustling office of a contemporary marketing firm, analysts and strategists are immersed in a sea of data. With an upcoming marketing campaign in sight, the pressure is on to optimize every penny spent. The aim is clear — maximize the returns. The success of the campaign heavily relies on effectively targeting potential customers who are likely to respond positively. Machine learning classification models serve as the guiding compass for this voyage.

However, steering through the stormy waters of imbalanced datasets and skewed probabilities necessitates more than just accuracy or AUC-ROC.

This article provides you with the tools and insights necessary to not only predict customer behavior but also optimize profits by calibrating your classification models.

Generating Data and Training the Model

First, suppose we have a dataset that represents the information of potential customers. The data is inherently imbalanced, much like real-life marketing datasets where positive responses are usually scarce.
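If you don’t have such a dataset at hand, here is a minimal sketch for generating a comparable one with scikit-learn’s make_classification. The 5% positive rate, sample size, and feature counts are illustrative assumptions, not values from the original study:

from sklearn.datasets import make_classification
import pandas as pd

# Hypothetical stand-in for a real customer dataset:
# 20 features, ~5% positive responses to mimic the imbalance described above
X_synth, y_synth = make_classification(n_samples=50000, n_features=20,
                                       n_informative=8, weights=[0.95, 0.05],
                                       random_state=42)
data = pd.DataFrame(X_synth, columns=[f'feature_{i}' for i in range(20)])
data['product_opening'] = y_synth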

In this first step, we have a dataset built from real-life data about our customers and whether or not they acquired a product with us. This is a binary classification problem, and top-tier algorithms such as XGBoost and Random Forest are well suited to solve it.

We could use this template code (the bracketed column lists are placeholders to fill in):

import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.metrics import accuracy_score, roc_auc_score, make_scorer


# Columns which won't be used
cols_no_use = [...]


# Numeric columns
numeric_cols = [...]


# Categorical columns
cat_cols = [...]


# 'data' holds the customer records (loaded beforehand or generated as above)
y = data['product_opening']
X = data.drop(columns=cols_no_use + ['product_opening'])


# Categorical variable encoding
for col in cat_cols:
    lbl = LabelEncoder()
    X[col] = lbl.fit_transform(X[col].astype(str))


# Numerical variables scaling
scaler = StandardScaler()
X[numeric_cols] = scaler.fit_transform(X[numeric_cols])


# Train-test division
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=103)


# XGBoost classifier (training happens in the search below)
model = xgb.XGBClassifier(use_label_encoder=False)


# Define the hyperparameter search space
param_grid = {
'n_estimators': [50, 100, 200],
'learning_rate': [0.01, 0.1, 0.2],
'max_depth': [3, 4, 5],
'gamma': [0, 0.1, 0.2],
'subsample': [0.8, 1.0],
'colsample_bytree': [0.8, 1.0]
}


# Define the evaluation metric
scorer = make_scorer(accuracy_score)


# Define the randomized search object
random_search = RandomizedSearchCV(estimator=model, param_distributions=param_grid, scoring=scorer, n_iter=5, cv=3, verbose=1, random_state=42)


# Train the model using randomized search
random_search.fit(X_train, y_train)


# Get the best hyperparameters
best_params = random_search.best_params_
print(f'Best hyperparameters: {best_params}')


# Best model
best_model = random_search.best_estimator_


# Predict with our model
y_pred = best_model.predict(X_test)
probabilities = best_model.predict_proba(X_test)[:, 1]  # positive-class probabilities

Imagine that each feature represents information such as demographics, past buying behavior, and interactions with the brand.

Here, we have a model already. We could work with “probabilities”, but…

Evaluating the Model: Beyond Accuracy

When we talk about models, we often discuss accuracy, precision, recall, etc. But in the context of marketing, especially with imbalanced data, some metrics are more telling than others. Let’s compute and understand them.

from sklearn.metrics import roc_auc_score, f1_score, precision_score, recall_score, log_loss, brier_score_loss

# Arbitrary starting threshold: the mean of the predicted probabilities
threshold = np.mean(probabilities)

# Calculating metrics
auc_roc = roc_auc_score(y_test, probabilities)
f1 = f1_score(y_test, probabilities > threshold)
precision = precision_score(y_test, probabilities > threshold)
recall = recall_score(y_test, probabilities > threshold)
logloss = log_loss(y_test, probabilities)
calibration_loss = brier_score_loss(y_test, probabilities)  # Brier score

# Printing metrics
print(f'AUC-ROC: {auc_roc}')
print(f'F1 Score: {f1}')
print(f'Precision: {precision}')
print(f'Recall: {recall}')
print(f'Log Loss: {logloss}')
print(f'Calibration Loss: {calibration_loss}')

The output, with this arbitrary threshold:

AUC-ROC: 0.7829190606513905
F1 Score: 0.1130798969072165
Precision: 0.06108597285067873
Recall: 0.7597402597402597
Log Loss: 0.39972973157742686
Calibration Loss: 0.10952992554197918

AUC-ROC, F1 Score, Precision, and Recall offer insights into the model’s performance, but they carry no monetary context for a marketing campaign. Log Loss and Calibration Loss come closer to the financial side, since they measure the quality of the predicted probabilities themselves, and those probabilities are what we will attach money to.

The Calibration Predicament

Classification models like XGBoost and Random Forest tend to push probabilities toward extremes, a phenomenon known as model miscalibration. In simple terms, miscalibration means that the predicted probabilities of an event occurring are not well-aligned with the actual frequencies of that event. For instance, if a model predicts a customer’s likelihood of responding positively to a campaign with an 80% probability, we would expect about 80 out of 100 similar customers to respond positively. However, due to miscalibration, the actual number might be significantly higher or lower.

This miscalibration arises because models like XGBoost and Random Forest primarily focus on distinguishing between classes as clearly as possible. In the context of marketing, these algorithms try to separate potential responders from non-responders. In their effort to create this separation, they often produce probabilities that are close to 0 or 1, with very few in between. This is because boosting and bagging techniques involved in these algorithms focus more on accurate classification than on estimating well-calibrated probabilities.

In marketing, probabilities are often directly associated with financial aspects, such as costs and returns. If the probabilities are pushed towards extremes, the model might be too confident about certain predictions, leading to over-investing in campaigns targeting specific customers. On the other hand, the model could be too pessimistic regarding other prospects, resulting in missed opportunities.

Here’s why miscalibration can severely impact the expected profit calculations:

  1. Erroneous Budget Allocation: The budget may be heavily allocated to prospects where the model is too confident, potentially resulting in a lower return on investment.
  2. Missed Opportunities: Prospects that could have been profitable if targeted might be neglected due to the model underestimating their probabilities.
  3. Customer Experience and Brand Perception: Over-targeting certain customers based on overconfident predictions might annoy them, impacting customer experience and brand perception.
  4. Inaccurate Assessment of Campaign Effectiveness: Miscalibrated probabilities can lead to incorrect conclusions about the campaign’s effectiveness. Decision-makers might attribute the discrepancies to external factors without realizing that the model’s miscalibration is a contributing factor.

Miscalibrated probabilities can distort the financial aspects of marketing campaigns and cloud decision-making. This stresses the importance of calibrating the probabilities to more accurately reflect reality, ensuring a more effective allocation of resources and ultimately a better return on investment.
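To make the financial stakes concrete, here is a minimal sketch with made-up numbers: the same customer scored by an overconfident model and by a calibrated one, valued with the $50 benefit and $5 contact cost that are formalized in the next section:

# Illustrative numbers only: an overconfident model scores a customer at 0.80,
# while the customer's true response rate is closer to 0.30
benefit, cost = 50, -5

def expected_profit(p):
    # Expected profit of contacting one customer with response probability p
    return p * benefit + (1 - p) * cost

print(expected_profit(0.80))  # what the miscalibrated score promises: 39.0
print(expected_profit(0.30))  # what materializes on average: 11.5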

The Cost-Benefit Matrix Helps Us Make Decisions

Marketing decisions often have clear financial consequences. It is vital to link the model’s predictions with the financial outcomes. The Cost-Benefit Matrix helps achieve this by attributing costs and benefits to the possible outcomes.

The cost-benefit matrix used throughout this article:

                         Predicted: responds    Predicted: does not respond
Actually responds        +$50 (True Positive)   -$10 (False Negative)
Does not respond         -$5 (False Positive)   $0 (True Negative)

Here, if the model correctly predicts a positive response (True Positive), there is a benefit, say $50. A False Positive would incur a smaller cost, say $5. On the other hand, missing out on a potential customer (False Negative) might incur a cost of $10, whereas correctly identifying a non-responder (True Negative) doesn’t affect the finances.
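In code, this matrix can be kept as a simple dictionary keyed by (actual, predicted); the layout below is just one possible convention:

# Cost-benefit matrix: (actual, predicted) -> dollar value
cost_benefit = {
    (1, 1): 50,   # True Positive: responder targeted
    (0, 1): -5,   # False Positive: non-responder targeted
    (1, 0): -10,  # False Negative: responder missed
    (0, 0): 0,    # True Negative: non-responder correctly skipped
}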

Calibrating Probabilities with Platt Scaling

Now, let’s calibrate the probabilities using Platt Scaling and observe the differences in a Calibration Plot.

As important as it sounds, calibrating a model is as simple as using scikit-learn’s CalibratedClassifierCV class in Python.

from sklearn.calibration import CalibratedClassifierCV, calibration_curve
import matplotlib.pyplot as plt

# Calibrating the model; cv='prefit' wraps the already-trained classifier
# (ideally the calibrator is fit on data the model has not seen during training)
calibrated = CalibratedClassifierCV(best_model, method='sigmoid', cv='prefit')
calibrated.fit(X_train, y_train)

# Predict calibrated probabilities
calibrated_probabilities = calibrated.predict_proba(X_test)[:, 1]

# Brier scores before and after calibration (used in the plot annotations)
calibration_loss_before = brier_score_loss(y_test, probabilities)
calibration_loss_after = brier_score_loss(y_test, calibrated_probabilities)

# Plotting calibration curves with strategy='quantile'
prob_true, prob_pred = calibration_curve(y_test, probabilities, n_bins=50, strategy='quantile')
prob_true_calibrated, prob_pred_calibrated = calibration_curve(y_test, calibrated_probabilities, n_bins=50, strategy='quantile')

plt.figure(figsize=(8, 6))
plt.plot([0] + prob_pred.tolist() + [1], [0] + prob_true.tolist() + [1], marker='o', label='Before Calibration')
plt.plot([0] + prob_pred_calibrated.tolist() + [1], [0] + prob_true_calibrated.tolist() + [1], marker='x', label='After Calibration')
plt.plot([0, 1], [0, 1], linestyle='--', label='Perfectly Calibrated')
plt.xlabel('Predicted Probability')
plt.ylabel('True Probability')
plt.title('Calibration Plot')
plt.legend(loc='upper left')
plt.text(0.6, 0.07, f'Calibration Loss Before: {calibration_loss_before:.3f}', color='blue')
plt.text(0.6, 0.03, f'Calibration Loss After: {calibration_loss_after:.3f}', color='orange')
plt.show()
A model before and after being calibrated

Notice how the calibrated probabilities are closer to the line of perfect calibration. This is crucial for computing more realistic expected profits.

Histogram for probabilities and calibrated probabilities

In the histogram, we can see how calibration pulls the predicted probabilities toward the observed positive rate (the dotted blue line).
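A minimal sketch to reproduce such a histogram, assuming the dotted reference line marks the observed positive rate in the test set:

plt.figure(figsize=(8, 6))
plt.hist(probabilities, bins=50, alpha=0.5, label='Before Calibration')
plt.hist(calibrated_probabilities, bins=50, alpha=0.5, label='After Calibration')
plt.axvline(x=y_test.mean(), color='blue', linestyle=':', label='Observed positive rate')
plt.xlabel('Predicted Probability')
plt.ylabel('Count')
plt.title('Histogram of Predicted Probabilities')
plt.legend()
plt.show()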

Now, we have both components of the Expected Value equation we know from our statistics class.

Expected_Profit[i] = probability[i] * True_Positive_Benefit + (1 - probability[i]) * False_Positive_Cost

Plotting the Expected Profit Curve

Next, let’s compute the Expected Profit Curve, which shows how the cumulative expected profit changes as we target more customers, from the most promising to the least.

# Cost-benefit values from the matrix above
benefit = 50
cost = -5

# Expected profit when targeting customers from highest to lowest probability
sorted_probs = np.sort(calibrated_probabilities)[::-1]
cumulative_expected_profit = np.cumsum(sorted_probs * benefit + (1 - sorted_probs) * cost)

# Baseline without the model: every customer valued at the average probability
average_prob = np.mean(sorted_probs)
constant_profit = np.cumsum(np.ones_like(sorted_probs) * (average_prob * benefit + (1 - average_prob) * cost))

# Plotting the expected profit
plt.plot(range(len(sorted_probs)), cumulative_expected_profit, 'b--', label='Expected Profit with Model')
plt.plot(range(len(sorted_probs)), constant_profit, 'g-', label='Expected Profit without Model')
plt.axvline(x=np.argmax(cumulative_expected_profit), color='r', linestyle='--', label='Maximum Expected Profit')
plt.xlabel('Sample Size')
plt.ylabel('Expected Profit')
plt.title('Expected Profit Curve')
plt.legend()
plt.show()
Expected profit vs sample size

The blue dashed line represents the cumulative expected profit when targeting customers in descending order of predicted probability. The red dashed vertical line marks the sample size at which the expected profit peaks.

Computing the Optimal Threshold

Using the Expected Profit Curve, we can compute the optimal threshold that maximizes the expected profit.

# Finding the optimal threshold
optimal_index = np.argmax(cumulative_expected_profit)
optimal_threshold = sorted_probs[optimal_index]

print(f'Optimal Threshold: {optimal_threshold}')
Optimal Threshold: 0.09104249626398087

This threshold can be used as a decision boundary for classifying the customers into ‘target’ or ‘do not target’.
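As a sanity check, this empirical threshold agrees with the analytical break-even point of the expected profit formula: targeting a customer pays off when probability * 50 + (1 - probability) * (-5) > 0, that is, when the probability exceeds 5 / 55 ≈ 0.0909, essentially the 0.0910 found above.

benefit, cost = 50, -5
break_even = -cost / (benefit - cost)  # 5 / 55
print(f'Break-even probability: {break_even:.4f}')  # 0.0909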

Advantages and Disadvantages

Advantages:

  1. Incorporates financial aspects directly into the decision-making process.
  2. Helps in selecting an optimal threshold based on the monetary impact.

Disadvantages:

  1. Requires precise knowledge of the costs and benefits associated with decisions.
  2. Sensitive to changes in the cost-benefit matrix, and uncertainty in it.

Thought-provoking Questions

Now, let’s take this even further with some thought-provoking questions:

  1. What if the costs and benefits themselves were dynamic models? Imagine a scenario where the costs and benefits change according to market dynamics, competitors’ actions, customer behavior, or even customer variables. This adds another layer of complexity, as we would need to predict these parameters too.
  2. What if we also include the costs of False Negatives and the benefits of True Negatives? Usually the cost of a False Negative is neglected because it is seen as a mere missed opportunity. But what if not reaching out to a customer now means a competitor contacts them first, or that they would have been more receptive later? How would the model adapt? A sketch of how the full matrix would enter the decision follows below.
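For the second question, here is a minimal sketch under the assumption that all four cells of the matrix carry a value: the decision compares the expected profit of targeting a customer against the expected profit of skipping them.

def expected_profits(p, tp=50, fp=-5, fn=-10, tn=0):
    # Expected profit of each action for a customer with response probability p
    profit_if_targeted = p * tp + (1 - p) * fp
    profit_if_skipped = p * fn + (1 - p) * tn
    return profit_if_targeted, profit_if_skipped

# Target the customer only when targeting beats skipping
targeted, skipped = expected_profits(0.12)
should_target = targeted > skipped  # 1.6 > -1.2 -> True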

Conclusions

Navigating the complex landscape of marketing using machine learning classification models is much like a captain steering through tumultuous waters. The models provide a powerful means to predict customer behavior. However, it is essential to calibrate the model’s probabilities and anchor them to real-world financial outcomes.

By employing Platt Scaling for calibration and incorporating the Cost-Benefit Matrix into our analysis, we turn the raw predictions into actionable insights. The Expected Profit Curve and optimal threshold determination guide us in maximizing our returns.

But let’s not forget that the seas are ever-changing. Costs, benefits, and customer behavior are dynamic, and our models and strategies should be too. It is essential to continually adapt and learn, much like the machine learning models we employ.

As you set sail on your next marketing campaign, remember that the winds of data are at your back, but it’s the rudder of calibrated and financially informed decision-making that will steer you to bountiful shores.
