Documentation for GSD (Generate Synthetic Data) - Fraud

Table of Contents

  1. Introduction

  2. Getting Started
    Installation
    First-Time Login
    Generating Your First Dataset

  3. Features
    Risk Presets
    Data Configuration
    Streamlit UI Overview

  4. Usage
    Accessing Generated Data
    Modifying Data

  5. Billing and Subscription
    Subscription Plan
    Generation-Based Billing

  6. Security and Compliance
    GDPR Compliance

  7. Legal Disclaimer
    Affiliation with Galileo FT

  8. Support


1. Introduction


GSD - Fraud is a synthetic data generation service designed for data scientists, engineers, and risk analysts to create realistic datasets for machine learning and analytics purposes. It generates datasets compatible with the Galileo FT base RDF spec, allowing users to test fraud detection models without using real customer data.

Note: GSD - Fraud is not affiliated with or endorsed by Galileo FT.


2. Getting Started

Installation


GSD - Fraud can be installed directly from the Snowflake Marketplace. Simply search for the application and follow the installation instructions provided.


First-Time Login

After installation, access GSD - Fraud through your Snowflake account. No additional setup is required.


Generating Your First Dataset

  1. Open the GSD - Fraud interface in Snowflake.

  2. Use the side panel to configure risk presets by adjusting the sliders that set the percentage of low-, medium-, and high-risk MCCs (merchant category codes).

  3. Click the "Generate Data" button.

  4. Access the generated Snowflake tables directly in your account.


3. Features

Risk Presets

  • Configure the risk level distribution of transactions by adjusting the sliders that set the percentage of low-, medium-, and high-risk MCCs (merchant category codes).

  • Individual MCCs cannot be specified; only the distribution across risk levels can be set.
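
To illustrate what a risk preset means in terms of row counts, here is a minimal Python sketch. It assumes the three slider values are whole-number percentages that sum to 100, which this documentation does not state explicitly. The function name `expected_counts_per_tier` and the rounding rule are hypothetical and do not describe how GSD - Fraud allocates transactions internally.

```python
def expected_counts_per_tier(total_transactions, low_pct, medium_pct, high_pct):
    """Estimate how many transactions fall into each MCC risk tier.

    Assumption (not confirmed by the product documentation): the three
    percentages are whole numbers that sum to 100.
    """
    percentages = {"low": low_pct, "medium": medium_pct, "high": high_pct}
    if any(pct < 0 for pct in percentages.values()):
        raise ValueError("percentages must be non-negative")
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must sum to 100")

    # Integer division keeps every count whole.
    counts = {
        tier: total_transactions * pct // 100
        for tier, pct in percentages.items()
    }
    # Hypothetical rounding rule: any leftover rows go to the low tier.
    counts["low"] += total_transactions - sum(counts.values())
    return counts


if __name__ == "__main__":
    print(expected_counts_per_tier(200_000, 70, 20, 10))
```

With a 70/20/10 preset on a 200k generation, the sketch yields 140,000 low-risk, 40,000 medium-risk, and 20,000 high-risk transactions.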


Data Configuration

  • Generate datasets of either:
    200k records (included once per month with the subscription).
    1M records (available for $1,000 per generation).


Data generation details:

  • 200k generation includes:
    40k customers in customer_master
    1-3 cards per customer in card_account
    200k authorized transactions in authorized_transactions
    200k posted transactions in posted_transactions

  • 1M generation includes:
    200k customers in customer_master
    1-3 cards per customer in card_account
    1M authorized transactions in authorized_transactions
    1M posted transactions in posted_transactions
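
The figures above imply a range of row counts for each table, which can be handy as a sanity check after a generation. The following Python sketch derives those ranges. The constant and function names are hypothetical, and the sketch assumes the listed figures are exact rather than approximate.

```python
# Figures taken from the generation details in this documentation.
GENERATION_SIZES = {
    "200k": {"customers": 40_000, "transactions": 200_000},
    "1M": {"customers": 200_000, "transactions": 1_000_000},
}
MIN_CARDS_PER_CUSTOMER = 1
MAX_CARDS_PER_CUSTOMER = 3


def expected_row_ranges(size):
    """Return the (min, max) expected row count per table for a generation size."""
    spec = GENERATION_SIZES[size]
    customers = spec["customers"]
    transactions = spec["transactions"]
    return {
        "customer_master": (customers, customers),
        # Each customer holds between 1 and 3 cards.
        "card_account": (
            customers * MIN_CARDS_PER_CUSTOMER,
            customers * MAX_CARDS_PER_CUSTOMER,
        ),
        "authorized_transactions": (transactions, transactions),
        "posted_transactions": (transactions, transactions),
    }


if __name__ == "__main__":
    for table, bounds in expected_row_ranges("200k").items():
        print(table, bounds)
```

For a 200k generation, card_account should therefore hold between 40,000 and 120,000 rows; for a 1M generation, between 200,000 and 600,000. Both sizes average five transactions per customer.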


Streamlit UI Overview

  • Intuitive Streamlit-based UI with a side panel for configuration and a "Generate Data" button.


4. Usage

Accessing Generated Data

  • Generated data is available as Snowflake tables.

  • No API endpoints are provided.
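
Because the data is delivered as ordinary tables, a quick first check is to count the rows in each one. The Python sketch below only builds the SQL text; running it requires a Snowflake worksheet or a connector, which is not shown. The database and schema names (`MY_DB`, `MY_SCHEMA`) are placeholders, since this documentation does not say where the tables are created, and `row_count_queries` is a hypothetical helper.

```python
import re

# Table names as listed in this documentation.
GENERATED_TABLES = (
    "customer_master",
    "card_account",
    "authorized_transactions",
    "posted_transactions",
)

# Simple unquoted identifier: starts with a letter or underscore, then
# letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def row_count_queries(database, schema):
    """Build one SELECT COUNT(*) statement per generated table."""
    for part in (database, schema):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"not a simple unquoted identifier: {part!r}")
    return [
        f"SELECT COUNT(*) AS row_count FROM {database}.{schema}.{table};"
        for table in GENERATED_TABLES
    ]


if __name__ == "__main__":
    # MY_DB and MY_SCHEMA are placeholders; substitute your own location.
    for query in row_count_queries("MY_DB", "MY_SCHEMA"):
        print(query)
```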


Modifying Data

  • Users are free to modify or analyze the data once generated.


5. Billing and Subscription

Subscription Plan

  • Monthly Subscription: $250/month includes one 200k record generation.

  • Additional Generations:
    $250 for each 200k record dataset.
    $1,000 for each 1M record dataset.
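
As a worked example of the pricing above, the following Python sketch estimates one month's charges from the number of datasets generated. It covers only the GSD - Fraud fees listed here and assumes no other charges (for example Snowflake compute, storage, or taxes) are included. The function and constant names are hypothetical.

```python
SUBSCRIPTION_USD = 250        # monthly fee, includes one 200k generation
EXTRA_200K_USD = 250          # each additional 200k dataset
GENERATION_1M_USD = 1_000     # each 1M dataset
INCLUDED_200K_PER_MONTH = 1   # 200k generations covered by the subscription


def monthly_cost_usd(generations_200k, generations_1m):
    """Estimate one month's GSD - Fraud charges in US dollars."""
    if generations_200k < 0 or generations_1m < 0:
        raise ValueError("generation counts must be non-negative")
    # Only 200k generations beyond the included one are billed separately.
    extra_200k = max(0, generations_200k - INCLUDED_200K_PER_MONTH)
    return (
        SUBSCRIPTION_USD
        + extra_200k * EXTRA_200K_USD
        + generations_1m * GENERATION_1M_USD
    )


if __name__ == "__main__":
    print(monthly_cost_usd(3, 1))
```

For example, three 200k datasets and one 1M dataset in the same month come to $250 + 2 × $250 + $1,000 = $1,750.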


Generation-Based Billing

  • Each additional dataset triggers a billing event within Snowflake.


6. Security and Compliance

  • GSD - Fraud complies with GDPR by default because it processes no real input data; every generated record is synthetic.


7. Legal Disclaimer

  • GSD - Fraud is not affiliated with or endorsed by Galileo FT.


8. Support

  • Support Method: Email-based only.

  • Response Time: No guaranteed response time.
