How we’re using machine learning to fight shell selling

In this first in an occasional series, we’re taking a look at machine learning initiatives at WePay — the kinds of problems we use machine learning for, how we build technology to address them, and how the unique challenges of the payments industry shape our approach.

We thought the best introduction would be to look at an actual fraud problem we face, shell selling, and at how we built the algorithm we’re now using to solve it.

While fraud is a concern pretty much anywhere money is exchanged for a good or service, there are certain types of fraud that are unique to platforms — those services that act as an intermediary in a transaction for the purposes of making it easier.

And while it’s obvious when a merchant gets a bunch of payments from different cards at the same IP in a short amount of time, the fraudsters who perpetrate this crime tend to be more sophisticated than that.
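To make the "obvious" case concrete, here is a minimal sketch of a rule that flags a merchant and IP pair once it has seen several distinct cards within a short window. The field names (`merchant_id`, `ip`, `card`, `time`), the 30-minute window, and the three-card threshold are all illustrative assumptions, not values from WePay's system.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_obvious_bursts(payments, window=timedelta(minutes=30), min_cards=3):
    """Return (merchant_id, ip) pairs with >= min_cards distinct cards inside `window`.

    `payments` is an iterable of dicts with hypothetical keys:
    merchant_id, ip, card, time (a datetime).
    """
    by_key = defaultdict(list)
    for p in payments:
        by_key[(p["merchant_id"], p["ip"])].append((p["time"], p["card"]))

    flagged = set()
    for key, events in by_key.items():
        events.sort()  # chronological order
        start = 0
        for end in range(len(events)):
            # Advance the left edge until the span fits inside the window.
            while events[end][0] - events[start][0] > window:
                start += 1
            distinct_cards = {card for _, card in events[start:end + 1]}
            if len(distinct_cards) >= min_cards:
                flagged.add(key)
                break
    return flagged


t0 = datetime(2024, 1, 1, 12, 0)
payments = [
    {"merchant_id": "m1", "ip": "10.0.0.1", "card": "c1", "time": t0},
    {"merchant_id": "m1", "ip": "10.0.0.1", "card": "c2", "time": t0 + timedelta(minutes=5)},
    {"merchant_id": "m1", "ip": "10.0.0.1", "card": "c3", "time": t0 + timedelta(minutes=9)},
    {"merchant_id": "m2", "ip": "10.0.0.2", "card": "c4", "time": t0},
    {"merchant_id": "m2", "ip": "10.0.0.3", "card": "c5", "time": t0 + timedelta(hours=3)},
]
flagged = flag_obvious_bursts(payments)
```

A fixed rule like this only catches the crude pattern. A fraudster who spreads payments across IPs or over a longer period slips past it, which is the gap a learned model is meant to cover.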

Since shell selling is a common problem, and one that’s difficult for humans to spot, we decided to build a machine learning algorithm to help us catch it.

For things like fraud modeling, where you need to retrain constantly and deploy quickly, it offers a lot of advantages.

Getting back specifically to shell selling, we tested several algorithms before we settled on the one that gave us the best performance: Random Forest.
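As a rough illustration of what testing several algorithms can look like, the sketch below scores a few scikit-learn classifiers with stratified cross-validation on synthetic, imbalanced data. The candidate list, the synthetic dataset, and the average-precision metric are assumptions made for the example. The article names only Random Forest as the final choice and does not say which alternatives or metrics were used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for labeled merchant data: roughly 5% positives (fraud).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=6,
    weights=[0.95, 0.05], random_state=0,
)

# Hypothetical candidates; class_weight="balanced" compensates for the skew.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "decision_tree": DecisionTreeClassifier(class_weight="balanced", random_state=0),
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
}

# Stratified folds keep the fraud rate similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="average_precision").mean()
    for name, model in candidates.items()
}

best = max(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean average precision = {score:.3f}")
```

Average precision is used here instead of accuracy because, with so few fraudulent examples, a model that predicts "legitimate" for everything would still look highly accurate.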

Our machine learning pipeline follows a standard procedure, which includes data extraction, data cleaning, feature derivation, feature engineering and transformation, feature selection, model training, and model performance evaluation.
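The stages listed above map fairly naturally onto a scikit-learn `Pipeline`. The sketch below is one possible arrangement under stated assumptions: synthetic data stands in for extraction, median imputation for cleaning, and univariate selection (`SelectKBest`) for feature selection. Feature derivation is domain-specific and is not shown. None of these particular choices come from the article.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Data extraction (stand-in): synthetic, imbalanced data with ~5% positives.
X, y = make_classification(
    n_samples=3000, n_features=25, n_informative=8,
    weights=[0.95, 0.05], random_state=42,
)
# Simulate the missing values that a cleaning step has to handle.
rng = np.random.default_rng(42)
X[rng.random(X.shape) < 0.02] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

pipeline = Pipeline([
    ("clean", SimpleImputer(strategy="median")),   # data cleaning
    ("select", SelectKBest(f_classif, k=10)),      # feature selection
    ("model", RandomForestClassifier(              # model training
        n_estimators=200, class_weight="balanced", random_state=42
    )),
])
pipeline.fit(X_train, y_train)

# Model performance evaluation on held-out data.
y_pred = pipeline.predict(X_test)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Bundling the steps into one object means the same cleaning and selection are applied at training and prediction time, which helps when a model has to be retrained and redeployed often.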

It’s still largely humans who have to think of the ways that fraudsters can attack a payment system and write rules to block them, and it’s still an experienced professional who has to make the judgment call on whether to block a transaction when it falls in the gray area between “obvious fraud” and “obviously legitimate,” as it so often does.
