E-commerce User & Marketing Analytics

RFM · K-means · Cohort analysis · KPI

Introduction

This project builds an end-to-end data processing and analysis pipeline on e-commerce advertising and user behavior data (user.csv, ad.csv, click.csv). It analyzes ad price ranges, click-through rate (CTR), and user demographics, and segments users by value with K-Means clustering and an RFM model, to support ad placement optimization and fine-grained operations.

The project first cleans and merges the three sources into a single analysis dataset keyed on userid and ad_id, handling missing values and outliers along the way. Product prices are binned and encoded into standardized price-range features. On this dataset, it computes the ad click-through rate for each price range and visualizes the distribution of user age, gender, and user tier within each range, showing how price relates to user structure and ad effectiveness.
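The join, price binning, and per-range click-through rate described above can be sketched as follows. This is a minimal illustration in plain Python rather than the project's actual code. The column names `price` and `clk`, the bin edges, and the bin labels are assumptions; only `userid` and `ad_id` come from the description.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical bin edges and labels; the project's actual cut points are not stated.
PRICE_EDGES = [50, 200, 1000]
PRICE_LABELS = ["<50", "50-199", "200-999", ">=1000"]


def price_bin(price):
    """Return the label of the price range that contains `price`."""
    return PRICE_LABELS[bisect_right(PRICE_EDGES, price)]


def ctr_by_price_bin(clicks, ads):
    """Join click rows to ads on ad_id and compute CTR per price range.

    clicks: iterable of dicts with keys userid, ad_id, clk (1 = clicked, 0 = not).
    ads: iterable of dicts with keys ad_id, price.
    Rows whose ad_id is unknown or whose price is missing are skipped.
    """
    price_of = {a["ad_id"]: a["price"] for a in ads if a.get("price") is not None}
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for row in clicks:
        price = price_of.get(row["ad_id"])
        if price is None:
            continue
        label = price_bin(price)
        shown[label] += 1
        clicked[label] += row["clk"]
    return {label: clicked[label] / shown[label] for label in shown}


# Tiny made-up sample, for illustration only.
ads = [
    {"ad_id": 1, "price": 30.0},
    {"ad_id": 2, "price": 450.0},
    {"ad_id": 3, "price": None},
]
clicks = [
    {"userid": 10, "ad_id": 1, "clk": 1},
    {"userid": 11, "ad_id": 1, "clk": 0},
    {"userid": 10, "ad_id": 2, "clk": 0},
    {"userid": 12, "ad_id": 2, "clk": 0},
    {"userid": 12, "ad_id": 3, "clk": 1},
]
print(ctr_by_price_bin(clicks, ads))
```

Skipping rows with a missing price is only one possible way to handle missing values. The description does not say whether such rows are dropped or imputed.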

Next, the project selects the top 100 ad slots by impressions and clicks and analyzes the average product price and price-range distribution of each slot. It then compares the 10 slots with the highest CTR against the 10 with the lowest, examining how high- and low-performing placements differ in user structure and product price.
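One way to read the ad-slot selection above is to rank slots by impressions, break ties by clicks, and then split the retained slots by CTR. The sketch below follows that reading. The field names `slot_id` and `clk` and both function names are hypothetical, and the description does not say how the two ranking criteria are actually combined.

```python
from collections import Counter


def slot_stats(clicks):
    """Aggregate impressions, clicks and CTR per ad slot.

    clicks: iterable of dicts with keys slot_id and clk (both names are assumptions).
    """
    impressions = Counter()
    click_count = Counter()
    for row in clicks:
        impressions[row["slot_id"]] += 1
        click_count[row["slot_id"]] += row["clk"]
    return {
        slot: {"impressions": n, "clicks": click_count[slot], "ctr": click_count[slot] / n}
        for slot, n in impressions.items()
    }


def top_and_bottom_by_ctr(stats, top_n=100, k=10):
    """Keep the top_n slots by (impressions, clicks), then return the k best and k worst by CTR."""
    busiest = sorted(
        stats,
        key=lambda s: (stats[s]["impressions"], stats[s]["clicks"]),
        reverse=True,
    )[:top_n]
    ranked = sorted(busiest, key=lambda s: stats[s]["ctr"], reverse=True)
    return ranked[:k], ranked[-k:]


# Made-up rows: slot D has a perfect CTR but only one impression.
rows = (
    [{"slot_id": "A", "clk": c} for c in (1, 1, 0, 0)]
    + [{"slot_id": "B", "clk": c} for c in (0, 0, 0)]
    + [{"slot_id": "C", "clk": c} for c in (1, 1)]
    + [{"slot_id": "D", "clk": 1}]
)
stats = slot_stats(rows)
best, worst = top_and_bottom_by_ctr(stats, top_n=3, k=1)
print(best, worst)
```

In this sample, slot D is excluded by the volume filter before CTR is compared, so a slot with very few impressions cannot top the CTR ranking. If fewer than 2k slots survive the filter, the two returned lists overlap.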

At the user level, the project builds a feature vector for each user from shopping tier, click behavior, and the average price of browsed products, and uses K-Means clustering to divide users into 5 groups. Combining these groups with the RFM model for value stratification yields five user personas: "Important Retain", "Important Develop", "Important Win-back", "General", and "Low-value". The age and gender distributions of each persona are then visualized.
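The clustering step can be sketched with a small standard-library implementation of Lloyd's algorithm, the usual K-Means procedure. It is an illustration, not the project's code, and a library implementation such as scikit-learn's `KMeans` would be the typical choice in practice. The feature order (shopping tier, click count, average browsed price), the z-score standardization, and the toy data are assumptions. The example uses k=2 to stay readable, whereas the project uses 5 clusters.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sds)) for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = tuple(mean(c) for c in zip(*members))
    return labels


# Made-up users: (shopping tier, click count, average browsed price).
users = [
    (1, 2, 20.0), (1, 3, 25.0), (1, 1, 22.0),
    (3, 20, 300.0), (3, 25, 320.0), (3, 22, 310.0),
]
labels = kmeans(standardize(users), k=2)
print(labels)
```

The sketch stops at cluster indices because the rule that maps clusters to the five RFM-based personas is not stated in the description.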

The project covers the full workflow: data preprocessing, metric construction, ad performance analysis, user profiling, and visualization. Its results can support ad pricing strategy, tiered user operations, and precision marketing decisions.