After graduating from Harvey Mudd College, I worked as an applied scientist intern at Amazon. On Amazon’s Customer Behavior Analytics (CBA) team, we collect and analyze customer behavior data, receiving and transforming customer events in real time. Several hundred teams at Amazon use the data we collect to make critical business decisions. My work here is supervised by my manager, Dinesh Mandalapu.
Deep Learning Model for Customer Segmentation
At Amazon’s CBA and other teams, key business decisions are informed by quantitative metrics derived from Weblab (i.e., A/B testing) experiments. From a business perspective, experiments usually target specific groups of customers rather than all customers at once. This suggests that relying on a single model risks poor generalizability: one model may not generalize well across a large and varied group of customers. With this motivation, my team and I built a customer segmentation model to discover heterogeneity among customers, so that other groups at CBA can use the resulting customer segments in their experiments.
For this project, I developed an autoencoder-based deep neural network architecture that learns customer embeddings and uses them to cluster high-dimensional customer data into segments driven by key business metrics. The model was developed with Python and MXNet Gluon. The proposed method improved clustering performance by 26% and is deployed in production to improve downstream customer impact estimation.
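As a rough illustration of the general idea (not the production model), the sketch below trains a tiny linear autoencoder in plain Python and then clusters the learned embeddings. Everything in it is an assumption made for illustration: the synthetic two-group data, the one-dimensional embedding, the hand-written gradient updates, and the use of k-means as the clustering step. The actual model was a deep network built with MXNet Gluon.

```python
import random

def make_customers(n_per_group, seed=0):
    """Synthetic stand-in for customer features: two groups with different means."""
    rng = random.Random(seed)
    low = [[rng.gauss(0.0, 0.1) for _ in range(4)] for _ in range(n_per_group)]
    high = [[rng.gauss(1.0, 0.1) for _ in range(4)] for _ in range(n_per_group)]
    return low + high

def encode(x, w_enc):
    # Linear encoder: compress one customer vector to a scalar embedding.
    return sum(w * xi for w, xi in zip(w_enc, x))

def decode(z, w_dec):
    # Linear decoder: expand the scalar embedding back to feature space.
    return [w * z for w in w_dec]

def reconstruction_loss(data, w_enc, w_dec):
    # Mean squared reconstruction error per customer.
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, w_enc), w_dec)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)

def train(data, epochs=200, lr=0.01, seed=1):
    """Plain SGD on the squared reconstruction error."""
    rng = random.Random(seed)
    dim = len(data[0])
    w_enc = [rng.uniform(0.05, 0.15) for _ in range(dim)]
    w_dec = [rng.uniform(0.05, 0.15) for _ in range(dim)]
    for _ in range(epochs):
        for x in data:
            z = encode(x, w_enc)
            err = [w * z - xi for w, xi in zip(w_dec, x)]        # x_hat - x
            grad_z = 2 * sum(e * w for e, w in zip(err, w_dec))  # dL/dz
            w_dec = [w - lr * 2 * e * z for w, e in zip(w_dec, err)]
            w_enc = [w - lr * grad_z * xi for w, xi in zip(w_enc, x)]
    return w_enc, w_dec

def kmeans_1d(values, iters=20):
    """Two-cluster k-means on scalar embeddings, initialized at min and max."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return labels

customers = make_customers(n_per_group=20)
w_enc0, w_dec0 = train(customers, epochs=0)   # untrained weights, same seed
w_enc, w_dec = train(customers, epochs=200)
loss_before = reconstruction_loss(customers, w_enc0, w_dec0)
loss_after = reconstruction_loss(customers, w_enc, w_dec)
embeddings = [encode(x, w_enc) for x in customers]
segments = kmeans_1d(embeddings)
print("loss before/after training:", loss_before, loss_after)
print("segment sizes:", segments.count(0), segments.count(1))
```

The point of the two-stage design is that clustering happens in the learned low-dimensional space rather than on the raw features, which is what makes the approach practical for high-dimensional data.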
Distributed Deep Learning
To handle very large amounts of customer data, I developed a pipeline for distributed neural network training and inference for use at Amazon’s CBA, built on Spark and Amazon EMR clusters. I also held tutorials on deep learning and distributed deep learning training/inference for my CBA team.
Last updated: Jan 17, 2022