tech-econ

Fraud, Credit and Claim Risk

Detect fraudulent transactions, assess credit risk, and identify insurance claim fraud

2008 4871 cited

Isolation Forest

Fei Tony Liu, Kai Ming Ting, Zhi-Hua Zhou

Tree-based anomaly isolation with linear time complexity and low memory use via sub-sampling; widely used for fraud detection.
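The isolation idea in this entry can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the `(amount, hour)` transaction tuples, and the parameter defaults are assumptions made for the example. Points that random splits separate quickly get short average path lengths and therefore scores closer to 1.

```python
import math
import random


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # H(n-1) approximated with Euler's constant
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split on a random feature at a random threshold until points are isolated."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    low = min(p[dim] for p in points)
    high = max(p[dim] for p in points)
    if low == high:
        return {"size": len(points)}
    split = rng.uniform(low, high)
    return {
        "dim": dim,
        "split": split,
        "left": build_tree([p for p in points if p[dim] < split], depth + 1, max_depth, rng),
        "right": build_tree([p for p in points if p[dim] >= split], depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands, plus a correction for unsplit leaf points."""
    if "dim" not in node:
        return depth + average_path_length(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=64, seed=0):
    """Score each point as 2 ** (-mean path length / normalizer); higher is more anomalous."""
    if len(data) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(data, sample_size), 0, max_depth, rng) for _ in range(n_trees)]
    norm = average_path_length(sample_size)
    return [2 ** (-sum(path_length(x, t) for t in trees) / (n_trees * norm)) for x in data]


# Hypothetical transactions as (amount, hour of day); the last one is an injected outlier.
data_rng = random.Random(42)
transactions = [(data_rng.gauss(50, 10), data_rng.gauss(14, 2)) for _ in range(200)]
transactions.append((5000.0, 3.0))

scores = anomaly_scores(transactions)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print("most anomalous transaction:", transactions[most_anomalous])
```

Each tree is built on a small random sub-sample, which is why training cost stays low regardless of dataset size.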

2000 5035 cited

LOF: Identifying Density-Based Local Outliers

Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander

Introduced Local Outlier Factor assigning continuous 'degree of outlierness' for variable-density anomaly detection.
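The 'degree of outlierness' described here can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions: each point gets exactly `k` neighbours (ties at the k-distance are ignored), all points are distinct, and the grid data is made up for the example. A score near 1 means a point is about as dense as its neighbours; a much larger score marks a local outlier.

```python
import math


def lof_scores(points, k=3):
    """Local Outlier Factor for each point, using exactly k nearest neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    def local_reachability_density(i):
        # Reachability distance from i to j is never smaller than j's own k-distance.
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        return len(reach) / sum(reach)

    density = [local_reachability_density(i) for i in range(n)]
    # LOF is the average ratio of the neighbours' density to the point's own density.
    return [
        sum(density[j] for j in neighbors[i]) / (len(neighbors[i]) * density[i])
        for i in range(n)
    ]


# Hypothetical data: a 4x4 grid of regular points plus one distant point.
points = [(float(x), float(y)) for x in range(4) for y in range(4)] + [(10.0, 10.0)]
scores = lof_scores(points, k=3)
print("LOF of the distant point:", round(scores[-1], 2))
```

Because the score is a ratio of local densities, the same threshold can work in both sparse and dense regions of the data.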

2009 10461 cited

Anomaly Detection: A Survey

Varun Chandola, Arindam Banerjee, Vipin Kumar

Comprehensive taxonomy covering classification, nearest-neighbor, clustering, and statistical approaches.

2012 1843 cited

Isolation-Based Anomaly Detection

Fei Tony Liu, Kai Ming Ting, Zhi-Hua Zhou

Extended journal version with theoretical analysis; addresses masking and swamping effects and high-dimensional data.

2001 5890 cited

Estimating the Support of a High-Dimensional Distribution (One-Class SVM)

Bernhard Schölkopf, John C. Platt, John Shawe-Taylor, Alex J. Smola, Robert C. Williamson

One-class SVM for novelty detection; foundational method for fraud detection when only normal data is available.

2021 1245 cited

Deep Learning for Anomaly Detection: A Review

Guansong Pang, Chunhua Shen, Longbing Cao, Anton van den Hengel

Comprehensive survey of deep learning anomaly detection; covers autoencoders, GANs, and self-supervised approaches.

2009 987 cited

A Survey of Credit Card Fraud Detection Techniques: Data and Technique Oriented Perspective

Linda Delamaire, Hussein Abdou, John Pointon

Survey comparing neural networks, genetic algorithms, and expert systems for credit card fraud detection.

2018 789 cited

Credit Card Fraud Detection: A Realistic Modeling and a Novel Learning Strategy

Andrea Dal Pozzolo, Giacomo Boracchi, Olivier Caelen, Cesare Alippi, Gianluca Bontempi

Addresses realistic fraud challenges: extreme class imbalance, concept drift, and delayed feedback loops.

2025 12 cited

Fraud Detection in Healthcare Claims Using Machine Learning: A Systematic Review

Multiple authors

Comprehensive review analyzing ML techniques for health insurance fraud over two decades.

2023 14 cited

Insurance Fraud Detection: A Statistically Validated Network Approach

Michele Tumminello, Andrea Consiglio, Pietro Vassallo, Riccardo Cesari, Fabio Farabullini

Network-based approach using statistically validated networks to detect coordinated fraud rings.

2023 47 cited

Detecting Insurance Fraud Using Supervised and Unsupervised Machine Learning

Jens Debener, Volker Heinke, Johannes Kriebel

Field experiment showing supervised and unsupervised methods are complements, not substitutes.

2010 456 cited

OddBall: Spotting Anomalies in Weighted Graphs

Leman Akoglu, Mary McGlohon, Christos Faloutsos

Detects anomalies in weighted graphs using egonet features; foundational for fraud ring detection.

2016 378 cited

FRAUDAR: Bounding Graph Fraud in the Face of Camouflage

Bryan Hooi, Hyun Ah Song, Alex Beutel, Neil Shah, Kijung Shin, Christos Faloutsos

Detects dense subgraphs even when fraudsters add random edges to camouflage; handles lockstep behavior.
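A rough sketch of the dense-subgraph idea follows. It uses greedy peeling on a user-product graph, repeatedly dropping the least suspicious node and keeping the densest set seen. This is not the authors' code: the `1 / log(degree + c)` edge weight, the constant `c`, and the toy review graph are assumptions chosen for illustration. Down-weighting edges to popular products is what makes camouflage edges cheap to ignore.

```python
import math
from collections import defaultdict


def densest_block(edges, c=5.0):
    """Greedy peeling: return the node set with the highest weighted edge mass per node.

    edges is a list of unique (user, product) pairs.
    """
    product_degree = defaultdict(int)
    for _, product in edges:
        product_degree[product] += 1

    adjacency = defaultdict(dict)
    for user, product in edges:
        w = 1.0 / math.log(product_degree[product] + c)  # popular products count less
        adjacency[("u", user)][("p", product)] = w
        adjacency[("p", product)][("u", user)] = w

    alive = set(adjacency)
    weight = {v: sum(adjacency[v].values()) for v in alive}
    total = sum(weight.values()) / 2.0  # each edge was counted from both ends
    best_density, best_set = total / len(alive), set(alive)

    while len(alive) > 1:
        # sorted() makes tie-breaking deterministic across runs
        victim = min(sorted(alive), key=weight.__getitem__)
        alive.remove(victim)
        total -= weight[victim]
        for neighbor, w in adjacency[victim].items():
            if neighbor in alive:
                weight[neighbor] -= w
        density = total / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)
    return best_set, best_density


# Hypothetical review graph: four fake accounts hit three target products and
# add camouflage reviews of a popular product that honest users also review.
fakes = [f"f{i}" for i in range(4)]
targets = [f"x{i}" for i in range(3)]
honest = [f"h{i}" for i in range(8)]
edges = (
    [(f, x) for f in fakes for x in targets]
    + [(f, "popular") for f in fakes]
    + [(h, "popular") for h in honest]
    + [(h, f"niche{i % 2}") for i, h in enumerate(honest)]
)

block, density = densest_block(edges)
flagged_users = sorted(name for kind, name in block if kind == "u")
print("flagged users:", flagged_users)
```

In this toy graph the returned block keeps the fake accounts and their targets while the honest reviewers are peeled away, even though every fake account also reviewed the popular product.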

2018 267 cited

Heterogeneous Graph Neural Networks for Malicious Account Detection

Ziqi Liu, Chaochao Chen, Xinxing Yang, Jun Zhou, Xiaolong Li, Le Song

GEM model using heterogeneous graphs (users, devices, transactions) for Alipay fraud detection.

2018 345 cited

NetWalk: A Flexible Deep Embedding Approach for Anomaly Detection in Dynamic Networks

Wenchao Yu, Wei Cheng, Charu C. Aggarwal, Kai Zhang, Haifeng Chen, Wei Wang

Dynamic network embeddings for streaming anomaly detection; updates in O(1) per edge.

2021 156 cited

BotRGCN: Twitter Bot Detection with Relational Graph Convolutional Networks

Shangbin Feng, Herun Wan, Ningnan Wang, Minnan Luo

Relational GCN exploiting follower/friend graphs achieves SOTA on bot detection benchmarks.


Spam & Abuse

Detect fake accounts and abusive behavior

2017 852 cited

Online Human-Bot Interactions: Detection, Estimation, and Characterization

Onur Varol, Emilio Ferrara, Clayton A. Davis, Filippo Menczer, Alessandro Flammini

Foundational Botometer paper; a random forest on 1,000+ features, used to estimate that 9-15% of active Twitter accounts are bots.

2016 1428 cited

The Rise of Social Bots

Emilio Ferrara, Onur Varol, Clayton Davis, Filippo Menczer, Alessandro Flammini

Seminal paper defining social bots, detection challenges, and policy implications for platform manipulation.

2022 133 cited

Botometer 101: Social Bot Practicum for Computational Social Scientists

Kai-Cheng Yang, Emilio Ferrara, Filippo Menczer

Practitioner guide for Botometer v4 with CAP scores, threshold selection, and case study methodology.

2020 316 cited

Scalable and Generalizable Social Bot Detection through Data Selection

Kai-Cheng Yang, Onur Varol, Pik-Mai Hui, Filippo Menczer

Addresses cross-dataset generalization using ensemble of specialized classifiers.

2016 1245 cited

Fake It Till You Make It: Reputation, Competition, and Yelp Review Fraud

Michael Luca, Georgios Zervas

First large-scale study of fake reviews; 16% of Yelp reviews flagged as fake, increasing with competition.

2014 876 cited

Promotional Reviews: An Empirical Investigation of Online Review Manipulation

Dina Mayzlin, Yaniv Dover, Judith Chevalier

Compares TripAdvisor vs Expedia reviews; finds review manipulation concentrated among independent hotels.

2023 89 cited

A Survey on Fake Review Detection Techniques

Dongxing Shen, Shiwei Sun, Xingjie Huang, Jianwei Zhang, Qinghua Zheng

Comprehensive survey covering linguistic, behavioral, and graph-based fake review detection methods.


Content Moderation & Toxicity

Identify and remove harmful content

2017 2308 cited

Automated Hate Speech Detection and the Problem of Offensive Language

Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber

Foundational 3-class dataset (hate/offensive/neither) with 24K labeled tweets; standard benchmark.

2022 122 cited

A New Generation of Perspective API: Efficient Multilingual Character-level Transformers

Alyssa Lees, Vinh Q. Tran, Yi Tay, Jeffrey Sorensen, Jai Gupta, Donald Metzler, Lucy Vasserman

Technical architecture behind Google Jigsaw's Perspective API; handles obfuscation, code-switching, multilingual toxicity.

2018 633 cited

Measuring and Mitigating Unintended Bias in Text Classification

Lucas Dixon, John Li, Jeffrey Sorensen, Nithum Thain, Lucy Vasserman

Develops methods for measuring unintended identity-term bias in toxicity classifiers; foundational for fair ML.

2021 66 cited

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

Binny Mathew, Punyajoy Saha, Seid Muhie Yimam, Chris Biemann, Pawan Goyal, Animesh Mukherjee

First benchmark with rationale annotations for explainable hate speech detection across 20K posts.

2018 287 cited

Automatic Detection of Cyberbullying in Social Media Text

Cynthia Van Hee, Gilles Jacobs, Charlotte Brouckaert, Serena Cauberghs, Bart Desmet, Ann Loccufier, Véronique Hoste

Multi-label cyberbullying detection with fine-grained categories; benchmark on Dutch social media.

2019 234 cited

The Internet's Hidden Rules: An Empirical Study of Reddit Norm Violations at Micro, Meso, and Macro Scales

Eshaan Chandrasekharan, Mattia Samory, Shagun Jhaver, Hunter Charvat, Amy Bruckman, Cliff Lampe, Jacob Eisenstein, Eric Gilbert

Studies how subreddit norms shape behavior; users who posted in banned communities post less toxic content after the ban.


Account Security & Identity

Verify identities and secure accounts

2017 169 cited

Data Breaches, Phishing, or Malware? Understanding the Risks of Stolen Credentials

Kurt Thomas, Frank Li, et al.

Google study on account hijacking vectors; SMS verification blocks 100% of automated bots and 96% of bulk phishing.

2021

Risk-Based Authentication: Practical Deployments and Research Challenges

Stephan Wiefling, Luigi Lo Iacono, Markus Dürmuth

Analyzes RBA deployments at Google, Microsoft, Amazon; develops measurement framework.

2019 9 cited

Selective Graph Attention Networks for Account Takeover Detection

IEEE Conference

Graph neural networks modeling account-device-transaction relationships for ATO detection.

2020 45 cited

DeepAuth: Deep Learning Based Authentication for Anomaly Detection

Saleh Alowais, Khaled Elleithy

Deep learning approach combining behavioral biometrics with session features for continuous authentication.

2015 234 cited

Framing the Underground Economy: An Ecosystem of Underground Market Sellers and Operators

Kurt Thomas, Danny Yuxing Huang, David Wang, et al.

Maps the underground economy of stolen accounts; traces supply chain from compromise to monetization.


Coordinated Manipulation & Information Operations

Detect state-sponsored trolls and coordinated inauthentic behavior

2019 289 cited

Who Let The Trolls Out? Towards Understanding State-Sponsored Trolls

Savvas Zannettou, Tristan Caulfield, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, Jeremy Blackburn

Characterizes Russian IRA troll activity across platforms; develops detection methodology.

2019 234 cited

Disinformation as Collaborative Work: Surfacing the Participatory Nature of Strategic Information Operations

Kate Starbird, Ahmer Arif, Tom Wilson

Framework for understanding information operations as collaborative work; case study of 2016 election.

2018 198 cited

Characterizing Twitter Users Who Engage with Russian Internet Research Agency

Adam Badawy, Emilio Ferrara, Kristina Lerman

Analysis of 14M tweets by IRA; identifies patterns distinguishing troll engagement from organic users.

2021 87 cited

Exploring Content and Design Techniques in Coordinated Manipulation: A Survey

Stefano Nizzoli, Serena Tardelli, Marco Avvenuti, Stefano Cresci, Maurizio Tesconi

Comprehensive taxonomy of manipulation techniques across platforms and actor types.


Abuse Detection in Human Interaction

Detect personal attacks, harassment, and abusive language

2016 678 cited

Abusive Language Detection in Online User Content

Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, Yi Chang

Yahoo system combining n-grams, syntactic, and semantic features; production-scale abuse detection.

2017 456 cited

Deep Learning for Detecting Harassment in Social Media

Dawei Yin, Zhenzhen Xue, Liangjie Hong, Brian D. Davison, April Kontostathis, Lynne Edwards

CNN and RNN architectures for harassment detection; analyzes temporal patterns in abuse.

2017 567 cited

Ex Machina: Personal Attacks Seen at Scale

Ellery Wulczyn, Nithum Thain, Lucas Dixon

Wikipedia personal attack corpus with 100K+ labeled comments; crowdsourcing methodology for abuse annotation.

2017 289 cited

Deep Learning for User Comment Moderation

John Pavlopoulos, Prodromos Malakasiotis, Ion Androutsopoulos

RNN with attention for comment moderation; deployed at Greek news organization.


Privacy & Data Misuse

Understand economics of privacy and detect data misuse

2013 567 cited

What Is Privacy Worth?

Alessandro Acquisti, Leslie K. John, George Loewenstein

Experiments showing people value privacy but underestimate risks; foundational behavioral privacy study.

2016 1234 cited

The Economics of Privacy

Alessandro Acquisti, Curtis Taylor, Liad Wagman

JEL survey covering market structure, price discrimination, and welfare effects of privacy regulation.

2006 8900 cited

Differential Privacy

Cynthia Dwork

Foundational paper defining differential privacy; mathematical framework for privacy-preserving computation.
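One standard way to satisfy the differential privacy definition for a counting query is the Laplace mechanism, sketched below. The function names and the access-log records are hypothetical. The sketch assumes the query has sensitivity 1 (adding or removing one record changes the count by at most 1), and it is not hardened for production use.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count with Laplace noise of scale sensitivity / epsilon (sensitivity = 1)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(7)
# Hypothetical access log: (user_id, accessed_sensitive_field)
access_log = [(i, i % 5 == 0) for i in range(1000)]
noisy = private_count(access_log, lambda record: record[1], epsilon=0.5, rng=rng)
print(f"noisy count of sensitive accesses: {noisy:.1f}")
```

A smaller `epsilon` means more noise and a stronger privacy guarantee; the noise has zero mean, so repeated releases average toward the true count, which is why each release spends privacy budget.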

2017 89 cited

The Cost of Annoying Ads

Ginger Zhe Jin, Andrew Stivers

Studies tradeoff between ad intrusiveness and platform revenue; privacy implications of targeting.
