Documentation for GSD (Generate Synthetic Data) - Fraud
Table of Contents
Introduction
Getting Started
Installation
First-Time Login
Generating Your First Dataset
Features
Risk Presets
Data Configuration
Streamlit UI Overview
Usage
Accessing Generated Data
Modifying Data
Billing and Subscription
Subscription Plan
Generation-Based Billing
Security and Compliance
GDPR Compliance
Legal Disclaimer
Affiliation with Galileo FT
Support
1. Introduction
GSD - Fraud is a synthetic data generation service designed for data scientists, engineers, and risk analysts to create realistic datasets for machine learning and analytics purposes. It generates datasets compatible with the Galileo FT base RDF spec, allowing users to test fraud detection models without using real customer data.
Note: GSD - Fraud is not affiliated with or endorsed by Galileo FT.
2. Getting Started
Installation
GSD - Fraud can be installed directly from the Snowflake Marketplace. Simply search for the application and follow the installation instructions provided.
First-Time Login
After installation, access GSD - Fraud through your Snowflake account. No additional setup is required.
Generating Your First Dataset
Open the GSD - Fraud interface in Snowflake.
Use the side panel to configure risk presets by adjusting the sliders that set the percentage of transactions with low-, medium-, and high-risk MCCs (merchant category codes).
Click the "Generate Data" button.
Access the generated Snowflake tables directly in your account.
3. Features
Risk Presets
Configure the risk-level distribution of transactions by adjusting the sliders that set the percentage of transactions with low-, medium-, and high-risk MCCs (merchant category codes).
Individual MCCs cannot be specified; only the risk-level distribution can be configured.
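The effect of a risk preset on dataset composition can be estimated ahead of a run. The following is a minimal sketch, not part of the application: the function name `expected_counts` is hypothetical, and it assumes the three slider values are shares of all transactions that sum to 100, which this documentation does not state explicitly.

```python
def expected_counts(total: int, low: float, medium: float, high: float) -> dict:
    """Estimate how many transactions fall in each MCC risk level.

    Assumption (not confirmed by the documentation): the three
    percentages are shares of all transactions and sum to 100.
    """
    shares = {"low": low, "medium": medium, "high": high}
    for level, pct in shares.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"{level} percentage must be between 0 and 100")
    if abs(sum(shares.values()) - 100) > 1e-9:
        raise ValueError("percentages must sum to 100")

    low_n = round(total * low / 100)
    medium_n = round(total * medium / 100)
    # Assign the remainder to the last level so the counts add up to total.
    high_n = total - low_n - medium_n
    return {"low": low_n, "medium": medium_n, "high": high_n}


if __name__ == "__main__":
    print(expected_counts(200_000, low=70, medium=20, high=10))
```

The actual generated mix may differ from these estimates if the generator samples risk levels randomly rather than allocating exact counts.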
Data Configuration
Generate datasets of either:
200k records (included once per month with the subscription).
1M records (available for $1,000 per generation).
Data generation details:
200k generation includes:
40k customers inside customer_master
1-3 cards per customer in card_account
200k authorized transactions in authorized_transactions
200k posted transactions in posted_transactions
1M generation includes:
200k customers inside customer_master
1-3 cards per customer in card_account
1M authorized transactions in authorized_transactions
1M posted transactions in posted_transactions
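The sizes listed above can be turned into a quick sanity check for a finished generation. This is an illustrative sketch only: `ExpectedSizes` and `expected_sizes` are hypothetical names, and the card range is simply derived from the stated 1-3 cards per customer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedSizes:
    customers: int                # rows in customer_master
    min_cards: int                # lower bound for rows in card_account
    max_cards: int                # upper bound for rows in card_account
    authorized_transactions: int  # rows in authorized_transactions
    posted_transactions: int      # rows in posted_transactions


# Customer counts documented for each dataset size.
CUSTOMERS_BY_SIZE = {200_000: 40_000, 1_000_000: 200_000}


def expected_sizes(records: int) -> ExpectedSizes:
    """Return the documented table sizes for a 200k or 1M generation."""
    if records not in CUSTOMERS_BY_SIZE:
        raise ValueError("records must be 200_000 or 1_000_000")
    customers = CUSTOMERS_BY_SIZE[records]
    return ExpectedSizes(
        customers=customers,
        min_cards=customers * 1,  # every customer has at least one card
        max_cards=customers * 3,  # and at most three
        authorized_transactions=records,
        posted_transactions=records,
    )


if __name__ == "__main__":
    print(expected_sizes(200_000))
```

For a 200k generation this gives between 40,000 and 120,000 rows in card_account; a row count outside that range would suggest an incomplete or modified dataset.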
Streamlit UI Overview
The application provides a Streamlit-based UI with a side panel for configuration and a "Generate Data" button.
4. Usage
Accessing Generated Data
Generated data is available as Snowflake tables.
No API endpoints are provided.
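Because the data is exposed only as tables, it is read with ordinary SQL. The sketch below builds row-count queries for the four tables named in this documentation. The database and schema in which the tables appear are not specified here, so `MY_DB` and `MY_SCHEMA` are placeholders, and `count_queries` is a hypothetical helper rather than anything shipped with the application.

```python
import re

# Table names as listed in this documentation.
GSD_TABLES = (
    "customer_master",
    "card_account",
    "authorized_transactions",
    "posted_transactions",
)

# Plain unquoted identifier: letters, digits, underscore, dollar sign.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def count_queries(database: str, schema: str) -> dict[str, str]:
    """Build one SELECT COUNT(*) statement per generated table.

    `database` and `schema` are assumptions supplied by the caller;
    check your own account for where the tables were created.
    """
    for name in (database, schema):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain unquoted identifier: {name!r}")
    return {
        table: f"SELECT COUNT(*) AS row_count FROM {database}.{schema}.{table}"
        for table in GSD_TABLES
    }


if __name__ == "__main__":
    for sql in count_queries("MY_DB", "MY_SCHEMA").values():
        print(sql + ";")
```

The printed statements can be pasted into a Snowflake worksheet or run through any Snowflake client.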
Modifying Data
Users are free to modify or analyze the data once generated.
5. Billing and Subscription
Subscription Plan
Monthly Subscription: $250/month includes one 200k record generation.
Additional Generations:
$250 for each 200k record dataset.
$1,000 for each 1M record dataset.
Billing Events
Each additional dataset triggers a billing event within Snowflake.
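The pricing above reduces to simple arithmetic. As a rough sketch (the function name `monthly_cost` is hypothetical, and Snowflake compute or storage charges, taxes, and any marketplace fees are not modeled), the charge for one month could be estimated as follows:

```python
SUBSCRIPTION_USD = 250       # monthly fee, includes one 200k generation
EXTRA_200K_USD = 250         # each additional 200k-record dataset
GENERATION_1M_USD = 1_000    # each 1M-record dataset


def monthly_cost(generations_200k: int, generations_1m: int) -> int:
    """Estimate one month's charge in USD from the documented prices."""
    if generations_200k < 0 or generations_1m < 0:
        raise ValueError("generation counts cannot be negative")
    # The first 200k generation each month is covered by the subscription.
    extra_200k = max(0, generations_200k - 1)
    return (
        SUBSCRIPTION_USD
        + extra_200k * EXTRA_200K_USD
        + generations_1m * GENERATION_1M_USD
    )


if __name__ == "__main__":
    # Three 200k datasets and two 1M datasets in one month.
    print(monthly_cost(3, 2))  # 250 + 2*250 + 2*1000 = 2750
```

Under these assumptions, a month with only the included 200k generation costs $250, and each billing event adds its listed price on top.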
6. Security and Compliance
GSD - Fraud complies with GDPR by default because it does not process any real input data.
7. Legal Disclaimer
GSD - Fraud is not affiliated with or endorsed by Galileo FT.
8. Support
Support Method: Email-based only.
Response Time: No guaranteed response time.