SQL Risk Lab

Risk-analysis queries, review queues, and marketplace-integrity triage.

A public-safe PostgreSQL lab for synthetic fraud, abuse, diversion, chargeback, shared-identifier, and fake-watchlist scenarios, paired with a real-data Olist marketplace case study.

The project separates synthetic platform-abuse scenarios from the Olist marketplace case study, then turns both into review queues and explainable flags.

The synthetic lab models accounts, orders, assets, activations, transfers, abuse reports, chargebacks, shared identifiers, fake watchlists, and deterministic review scoring. The query library moves from coverage checks into investigation patterns and explainable priority bands.

The Olist companion applies the same review-queue pattern to public data and writes eight output files for order integrity, seller rollups, route risk, low-review patterns, payment review, category risk, and summary metrics.

Data: Synthetic platform-risk data plus public Olist marketplace data.
SQL: 15 reusable queries using CTEs, joins, windows, intervals, views, and scoring logic.
Artifacts: Schema ERD, case walkthroughs, representative snippets, Olist results guide.
Position: Scores prioritize review. They are not conclusions or enforcement decisions.

99,441 Olist orders

3,095 Sellers

8,349 Medium-plus queued orders

1 High-priority seller

39 Medium-priority sellers

15 Synthetic-lab SQL queries

Workflow

Model Entities

Define accounts, orders, assets, activations, transfers, reports, chargebacks, identifiers, and watchlist tables.

Generate And Load

Create controlled synthetic records, validate coverage, and load PostgreSQL tables and views.

Investigate With SQL

Use query patterns for clusters, mismatches, velocity, transfers, reports, disputes, and movement paths.

Score Review Queue

Materialize explainable priority scores with flags and supporting counts for analyst review.

Validate With Public Data

Run the workflow on Olist marketplace data and document where the signals support review, not conclusions.

Two Data Tracks

Synthetic Risk Lab

Explicitly models fraud, abuse, diversion-style movement, shared identifiers, chargebacks, support-language clustering, and fake watchlist workflows with invented records only.

Olist Marketplace Case Study

Applies the review-queue pattern to public marketplace data for fulfillment, seller quality, low-review patterns, route risk, payment review, and category triage without fraud labels.

Reviewer Path

outputs/summary.md

Start Here

Compact summary of the lab, query library, generated review queue, and public-data case outputs.

docs/query_snippets.md

Query Snippets

Short SQL examples for shared identifiers, activation mismatch, transfers, chargebacks, and explainability.

docs/case_walkthroughs.md

Case Walkthroughs

Public-safe examples showing how raw rows become reviewable signals with flags and context.

sql/30_review_queue.sql

Scoring Output

Materialized queue logic that combines account and asset flags into transparent priority bands.

case_studies/.../results/

Olist Results

Generated CSV outputs for order queue, seller rollup, route risk, low-review, payment, and category review.

docs/assumptions_limitations.md

Interpretation Guardrails

Explains what the scores can and cannot support, especially for public data without abuse labels.

Representative Queries

03_shared_identifier_clusters.sql

Shared Identifier Clusters

Find accounts sharing payment, device, phone, email, address, support-language, or reseller identifiers.

05_ship_activate_country_mismatch.sql

Ship-To / Activate-In Mismatch

Compare fulfillment geography against latest activation location to surface records needing review.

06_bulk_orders_short_window.sql

Bulk Ordering

Detect concentrated ordering behavior inside short windows by account, country, channel, or node.

07_rapid_asset_transfers.sql

Rapid Transfers

Identify assets transferred shortly after purchase, especially when combined with location or report flags.

09_chargebacks_by_age_country_channel_node.sql

Chargeback Patterns

Group payment disputes by account age, country, order channel, and shipping node.

11_diversion_style_movement_paths.sql

Diversion-Style Movement

Trace legal purchase paths followed by suspicious transfer, activation, resale, or movement patterns.

15_risk_score_explainability.sql

Score Explainability

Show which flags contributed to a priority score so queued records remain reviewable.

v_order_integrity_queue

Olist Order Queue

Rank public marketplace orders by fulfillment, review, value, payment, and status signals.

CASE
  WHEN priority_score >= 90 THEN 'critical'
  WHEN priority_score >= 60 THEN 'high'
  WHEN priority_score >= 35 THEN 'medium'
  ELSE 'low'
END AS priority_band

Outputs

Review Queue

Priority-ranked account, asset, order, and seller records with explainable flags and supporting fields.

Seller Rollup

Seller-level aggregation for order volume, delivery friction, review outcomes, and triage priority.

Route Risk

Late-delivery and fulfillment-risk patterns by seller, customer geography, and route-like combinations.

Low-Review Patterns

Review outcomes connected to delivery delay, fulfillment status, and order complexity.

Payment Review

Installment and payment-pattern slices used for marketplace operations review, not fraud labeling.

Category Risk

Category-level fulfillment and customer-impact patterns for operational prioritization.

Case Walkthroughs

Short examples showing how raw records become reviewable signals without becoming enforcement findings.

Schema And Snippets

Schema ERD and representative query snippets for quick review of the data model and SQL patterns.

Scope And Limitations

Synthetic Lab

Synthetic scenarios demonstrate explicit fraud, abuse, diversion, and fake-watchlist workflows. All entities, identifiers, countries, resellers, and watchlist records are invented for public-safe analysis.

Olist Case Study

Olist contains public marketplace records, not confirmed fraud labels. The case study supports integrity and operations review: fulfillment, seller quality, low reviews, payments, categories, and triage.

Scoring

Scores are deterministic prioritization aids. They surface records for human review and do not establish misconduct or prescribe enforcement.

Public Safety

The project avoids real customer data, real sanctions lists, denied-party data, geolocation intelligence, or evasion guidance.

GitHub