Jampp specializes in boosting mobile sales for clients' apps via RTB advertising. Its platform runs on a real-time Big Data and Machine Learning infrastructure that captures and analyzes billions of data points per hour to maximize return on spend.
As technical leader of a five-person team, I was in charge of building heterogeneous ML systems to improve relevant business metrics and solve business problems. We addressed topics ranging from fraud and product recommendations to user segmentation and clustering, using unsupervised and supervised Machine/Deep Learning algorithms.
Some of the projects deployed were:
Deployed Deep Clustering algorithms on mobile user sessions, creating embeddings of user behaviour that characterize users and serve as input for clustering algorithms based on w2v embeddings. Result: a 10% increase in CPI in all tests.
Built a generic product recommendation tool for all mobile apps using the open-source Surprise library and Airflow. Improved CPA by 35% and CTR by 5% in all product tests.
Researched the state of the art on Randomized Controlled Trials (also known as A/B tests) for the ad industry. Documented and evangelized proper RCT operations among sales and business teams for the current ad-RCT product. The solution gave statistical validity to a 5% uplift in advertising spend.
Researched, modeled and deployed an anti-fraud algorithm using statistical and clustering methods to tackle publishers engaged in click-spamming and click-injection. The solution helped account managers save accounts that ran budgets in excess of one million dollars.
Researched, prototyped, developed and fully integrated a customer segmentation algorithm in Python. Tested Random Forest, Logistic Regression and Naive Bayes models to score users on their propensity to make an in-app purchase based on past behaviour. This project led to a clear increase in the company's RTB revenue stream.
Wrote, edited and reviewed the company’s technical blog posts on data science.
Implemented an internal data-science visualization web app with Bokeh. This portal lets non-technical users explore the Data team's products through rich interactive visualizations.
Helped design and implement the Airflow-based ETL pipeline for all products owned by the Data Science Team, running on the infrastructure built over Hive, Presto, S3 and MySQL.
Promoted the in-company use of SQL querying, Python and Jupyter Notebooks among non-technical users by providing hands-on training and online Slack discussions. This empowered users to build their own tailored analytics solutions to business-related problems without relying solely on the technical data teams.