ai ml · beginner · 6m · 2025-07-19

Learning how to build a recommendation system from initial signals

Starting from a few early adopters of a product, how can we target a new set of users who are more likely to use it?

Building a Targeting System from Early Adopter Signals

Goal: Given a small set of early adopters, build a scoring model to identify users most likely to adopt a product.

Git Repo: github.com/dinesh-coderepo/targetting-system


🔑 Key Concepts at a Glance

| System | How It Works | Example |
|---|---|---|
| Recommendation | Learn from a user's own patterns → extend to similar items | "You watched X, try Y" |
| Targeting | Learn from early adopters' profiles → find similar non-adopters | "Users like your best customers" |
| Cold Start | Very few signals → traditional collaborative filtering fails | This blog's core challenge |

🏗️ System Architecture


🔧 Background & Prerequisites

1. Types of Recommendation Systems

| Approach | How It Works | Pros | Cons |
|---|---|---|---|
| User-based CF | Find similar users → recommend their preferences | Intuitive | Doesn't scale; sparse |
| Item-based CF | Find similar items → recommend to liking users | Stable | Needs interaction data |
| Matrix Factorization | Decompose user-item matrix into latent factors | Handles sparsity | Cold start problem |
| Content-Based | Match item features to user preferences | No cold start for items | Limited to feature quality |
| Hybrid | Combine CF + content-based | Best of both worlds | Complex to implement |

💡 Netflix, Spotify, and YouTube all use hybrid approaches combining multiple methods.
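To make the user-based CF row concrete, here is a minimal sketch of the idea: compute cosine similarity between users, then let similar users "vote" for items. The interaction matrix and the helper names (`cosine_similarity_matrix`, `recommend`) are made up for illustration and are not taken from the repo.

```python
import numpy as np

# Toy user-item matrix (rows: users, columns: items); 1 = interacted, 0 = no signal.
# The data is invented for illustration.
R = np.array([
    [1, 1, 0, 0],   # user 0
    [1, 1, 1, 0],   # user 1
    [0, 0, 1, 1],   # user 2
], dtype=float)

def cosine_similarity_matrix(M: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of M."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for empty rows
    unit = M / norms
    return unit @ unit.T

def recommend(R: np.ndarray, user: int, top_n: int = 1) -> list[int]:
    """Rank unseen items for `user` by similarity-weighted votes of other users."""
    sim = cosine_similarity_matrix(R)[user].copy()
    sim[user] = 0.0                  # a user should not vote for themselves
    scores = sim @ R                 # weighted sum of other users' interactions
    scores[R[user] > 0] = -np.inf    # hide items the user already has
    order = np.argsort(-scores)
    return [int(i) for i in order[:top_n]]

recommended = recommend(R, user=0)
print(recommended)
```

User 0 overlaps with user 1 but not with user 2, so the only unseen item that receives any vote is the one user 1 interacted with. With very few interactions most similarities collapse to zero, which is the sparsity problem listed in the table.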


2. The Cold Start Problem

This is the core challenge for this blog: having very few adopters means extreme data sparsity.

Solutions for targeting with few adopters:

  • 🔹 Feature similarity: Match non-adopters against adopter feature profiles
  • 🔹 Lookalike modeling: Find users who "look like" early adopters (demographics + behavior)
  • 🔹 Propensity scoring: Binary classifier, adopter (1) vs non-adopter (0)
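One simple way to combine the first two ideas is to score every user by cosine similarity to the average adopter profile. The sketch below assumes purely numeric features; the function name `lookalike_scores` and the toy numbers are hypothetical, and a centroid is only one of several possible lookalike definitions.

```python
import numpy as np

def lookalike_scores(features: np.ndarray, is_adopter: np.ndarray) -> np.ndarray:
    """Cosine similarity of every user to the mean (centroid) adopter profile.

    features   : (n_users, n_features) numeric matrix
    is_adopter : (n_users,) boolean mask of known early adopters
    """
    # Standardise so that no single feature dominates the similarity.
    mu = features.mean(axis=0)
    sigma = features.std(axis=0)
    sigma[sigma == 0] = 1.0
    Z = (features - mu) / sigma

    centroid = Z[is_adopter].mean(axis=0)
    denom = np.linalg.norm(Z, axis=1) * np.linalg.norm(centroid)
    denom[denom == 0] = 1.0          # guard against all-zero rows
    return (Z @ centroid) / denom

# Invented example: columns are logins_30d and features_used.
features = np.array([
    [30.0, 12.0],   # adopter
    [28.0, 10.0],   # adopter
    [27.0, 11.0],   # non-adopter who behaves like the adopters
    [2.0, 1.0],     # low-activity non-adopter
    [3.0, 0.0],     # low-activity non-adopter
])
is_adopter = np.array([True, True, False, False, False])

scores = lookalike_scores(features, is_adopter)
candidates = np.where(~is_adopter)[0]
best_candidate = int(candidates[np.argmax(scores[candidates])])
print(best_candidate, scores.round(2))
```

This needs no labels beyond the adopter set, which is why it is a reasonable fallback when there are too few adopters to train a classifier.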

3. Propensity / Targeting Model

The heart of this project: scoring every user by their likelihood to adopt.

Feature Categories:

| Category | Example Features |
|---|---|
| 🧑 Demographic | Age, location, job title, industry |
| 📊 Behavioral | Login frequency, feature usage, time spent, page views |
| 🤝 Social | Connections to existing adopters, team adoption rate |
| ⏱️ Temporal | Recency, frequency, monetary (RFM analysis) |

Model Choices:

| Model | When to Use |
|---|---|
| Logistic Regression | Baseline: interpretable and fast. Understand odds ratios. |
| Random Forest / XGBoost | Better accuracy, non-linear relationships, feature importance |
| Neural Networks | Large-scale datasets with many features |

โš ๏ธ Class Imbalance: If only 1% are adopters, naive models just predict "no" 99% of the time. Use SMOTE (oversampling), class weights, focal loss, or undersampling.


4. Evaluation Metrics

| Metric | What It Measures | Why It Matters |
|---|---|---|
| AUC-ROC | Discrimination ability across thresholds | Good threshold-free summary; pair it with PR-AUC when adopters are rare |
| Precision@K | Of top K predictions, how many are actual adopters | Directly measures targeting quality |
| Recall@K | Of all adopters, how many are in top K | Did we find most adopters? |
| Lift Chart | How much better than random selection | "Top 10% scored 5x more likely than random" |
| NDCG | Ranking quality with position weighting | Are true adopters ranked highest? |

⚠️ Never use accuracy with imbalanced data: it's misleading.

⚠️ Never split randomly: use time-based splits (train on past, test on future) to prevent data leakage.


5. Tools & Libraries

| Library | Purpose |
|---|---|
| scikit-learn | LogisticRegression, RandomForest, metrics, pipelines |
| xgboost / lightgbm | Gradient boosting for targeting models |
| surprise | Collaborative filtering (SVD, KNN, NMF) |
| lightfm | Hybrid recommendations (collaborative + content) |
| implicit | Implicit feedback models (ALS, BPR) |
| pandas + numpy | Data manipulation & feature engineering |
| matplotlib + seaborn | Visualization (lift charts, ROC curves) |

✅ TODO: Remaining Work

| # | Task | Priority |
|---|---|---|
| 1 | Implement basic collaborative filtering (user-item matrix, cosine similarity) | 🔴 High |
| 2 | Implement matrix factorization (SVD) with Surprise | 🔴 High |
| 3 | Build propensity model with logistic regression | 🔴 High |
| 4 | Feature engineering pipeline (behavioral + demographic) | 🔴 High |
| 5 | Handle class imbalance (SMOTE, class weights) | 🟡 Medium |
| 6 | Evaluate with AUC-ROC, lift charts, decile analysis | 🟡 Medium |
| 7 | Build cold-start fallback strategy | 🟡 Medium |
| 8 | Compare model approaches in a results table | 🟡 Medium |
| 9 | Add Mermaid architecture diagram of full targeting pipeline | 🟢 Low |
| 10 | Connect to Monolith paper learnings | 🟢 Low |

🔧 Reference Implementation: Propensity Model with Lookalike Scoring

A minimal but complete pipeline: given a tiny set of adopters, score every non-adopter for likelihood to adopt.

# targeting.py
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, average_precision_score

def build_dataset(users: pd.DataFrame, adopters: set[str]) -> tuple[pd.DataFrame, pd.Series]:
    """users has columns: user_id, age, logins_30d, features_used, tenure_days, team_size, industry.
       adopters is a set of user_ids that already converted."""
    df = users.copy()
    df["label"] = df["user_id"].isin(adopters).astype(int)
    y = df.pop("label")
    X = pd.get_dummies(df.drop(columns=["user_id"]), columns=["industry"], drop_first=True)
    return X, y

def train(X: pd.DataFrame, y: pd.Series):
    # Random stratified split for a quick baseline only; report final numbers
    # on the time-based split described later in this post.
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=42
    )
    # Fit the scaler on the training fold only so validation statistics don't leak.
    scaler = StandardScaler().fit(X_tr)
    X_tr_s = scaler.transform(X_tr)
    X_val_s = scaler.transform(X_val)

    # Baseline: logistic regression with class_weight for imbalance
    lr = LogisticRegression(class_weight="balanced", max_iter=500).fit(X_tr_s, y_tr)
    # Stronger: gradient boosting handles non-linear interactions
    gb = GradientBoostingClassifier(n_estimators=200, max_depth=3).fit(X_tr, y_tr)

    for name, model, X_eval in [("logreg", lr, X_val_s), ("gbm", gb, X_val)]:
        p = model.predict_proba(X_eval)[:, 1]
        print(f"{name}: AUC={roc_auc_score(y_val, p):.3f}  "
              f"PR-AUC={average_precision_score(y_val, p):.3f}")
    # The GBM is trained on unscaled features; the scaler is only needed for logreg.
    return gb, scaler

def score_and_rank(model, users: pd.DataFrame, adopters: set[str], top_k: int = 1000):
    """Score all non-adopters and return the top-K targets, highest score first."""
    non_adopters = users[~users["user_id"].isin(adopters)].copy()
    X = pd.get_dummies(non_adopters.drop(columns=["user_id"]), columns=["industry"])
    # Align with the columns seen at fit time. Encoding this subset on its own can
    # produce a different set of industry columns (or drop a different baseline),
    # so keep every dummy here and let reindex select the training columns.
    # feature_names_in_ is set when the model was fitted on a DataFrame.
    X = X.reindex(columns=model.feature_names_in_, fill_value=0)
    non_adopters["score"] = model.predict_proba(X)[:, 1]
    ranked = non_adopters.sort_values("score", ascending=False)

    base_rate = len(adopters) / len(users)
    top = ranked.head(top_k)
    # Lift = adoption rate in the top-K / random base rate.
    # True labels are not known for non-adopters, so measure lift on a held-out set.
    print(f"Base adoption rate: {base_rate:.3%} | Targeting top {top_k} users")
    return top[["user_id", "score"]]

Evaluating with a Proper Time-Based Split

Random splits leak future information. In targeting, the adopters at time T became adopters because of behaviour before T. Evaluate like this:

# Split by signup date, not randomly
cutoff = "2025-06-01"
train_users = users[users["signup_date"] < cutoff]
test_users  = users[users["signup_date"] >= cutoff]

# Adopters in each cohort
train_adopters = adopter_events.query("event_date < @cutoff")["user_id"].unique()
test_adopters  = adopter_events.query("event_date >= @cutoff")["user_id"].unique()
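To show how the split above turns into labelled train and test cohorts, here is a small end-to-end sketch on invented data. It assumes `signup_date` and `event_date` are datetime columns, as in the snippet. Note one consequence of this scheme: a user who signed up before the cutoff but adopted after it counts as a negative in training.

```python
import pandas as pd

# Invented users and adoption events.
users = pd.DataFrame({
    "user_id": ["u1", "u2", "u3", "u4"],
    "signup_date": pd.to_datetime(
        ["2025-03-01", "2025-04-15", "2025-06-10", "2025-07-01"]
    ),
})
adopter_events = pd.DataFrame({
    "user_id": ["u1", "u3"],
    "event_date": pd.to_datetime(["2025-05-20", "2025-06-25"]),
})

cutoff = pd.Timestamp("2025-06-01")

# Cohorts by signup date: train on the past, test on the future.
train_users = users[users["signup_date"] < cutoff].copy()
test_users = users[users["signup_date"] >= cutoff].copy()

# Adoption events on each side of the cutoff.
train_adopters = set(
    adopter_events.loc[adopter_events["event_date"] < cutoff, "user_id"]
)
test_adopters = set(
    adopter_events.loc[adopter_events["event_date"] >= cutoff, "user_id"]
)

# Label each cohort using only the events from its own period.
train_users["label"] = train_users["user_id"].isin(train_adopters).astype(int)
test_users["label"] = test_users["user_id"].isin(test_adopters).astype(int)

print(train_users[["user_id", "label"]])
print(test_users[["user_id", "label"]])
```

Features for the training cohort should likewise be computed only from activity before the cutoff, otherwise the leakage returns through the feature columns.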

Lift Chart: The Right Way to Present Results

def lift_chart(y_true, y_score, deciles=10):
    df = pd.DataFrame({"y": y_true, "p": y_score}).sort_values("p", ascending=False)
    df["decile"] = pd.qcut(df["p"].rank(method="first"), deciles, labels=False)
    base = df["y"].mean()
    table = df.groupby("decile")["y"].mean().rename("rate").to_frame()
    table["lift"] = table["rate"] / base
    return table.sort_index(ascending=False)

A healthy targeting model shows the top decile at 3–10× lift over baseline. If the top decile is only 1.5×, your features aren't predictive: go back to feature engineering before tuning the model.

When every TODO above is ticked and your lift chart shows ≥ 3× in the top decile on a time-based test set, flip this post to status: published.