Open Source · MIT License

OrgForge

Generate ground-truth enterprise workstreams to test AI agents.
From Git PRs and Confluence docs to Slack threads—all governed by a causal, deterministic engine.


Built by Aerie Security

The only widely used corporate dataset for RAG evaluation is the Enron corpus. It's from 2001. One company. In crisis. There is nothing else.

How It Works

The engine controls the facts.
The LLM writes the prose.

⚙️
Event-Driven State Machine
A strict day-by-day simulation loop. P1 incidents follow a rigid lifecycle: detected → investigating → fix_in_progress → resolved. Every artifact references the same ground truth.
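
The rigid lifecycle above can be pictured as a small state machine. What follows is a minimal sketch, not OrgForge's actual code: the `Incident` class, the `advance` method, and the `INC-001` identifier are hypothetical names chosen for illustration. Only the four state names come from the description above.

```python
from enum import Enum


class IncidentState(Enum):
    DETECTED = "detected"
    INVESTIGATING = "investigating"
    FIX_IN_PROGRESS = "fix_in_progress"
    RESOLVED = "resolved"


# Each state has exactly one legal successor; RESOLVED is terminal.
NEXT_STATE = {
    IncidentState.DETECTED: IncidentState.INVESTIGATING,
    IncidentState.INVESTIGATING: IncidentState.FIX_IN_PROGRESS,
    IncidentState.FIX_IN_PROGRESS: IncidentState.RESOLVED,
}


class Incident:
    """Hypothetical P1 incident that can only move forward one step at a time."""

    def __init__(self, incident_id: str, day: int):
        self.incident_id = incident_id
        self.state = IncidentState.DETECTED
        self.history = [(day, self.state)]  # (simulated day, state) pairs

    def advance(self, day: int) -> IncidentState:
        if self.state not in NEXT_STATE:
            raise ValueError(f"{self.incident_id} is already resolved")
        self.state = NEXT_STATE[self.state]
        self.history.append((day, self.state))
        return self.state


incident = Incident("INC-001", day=2)  # hypothetical ID and day
for day in (2, 3, 4):
    incident.advance(day)

print([state.value for _, state in incident.history])
# ['detected', 'investigating', 'fix_in_progress', 'resolved']
```

Because the transition table has no entry for `resolved`, a further `advance` call raises instead of silently rewriting history.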
🧠
SimEvent Ground Truth Bus
Every action emits a structured SimEvent. LLMs retrieve facts via hybrid vector search — so an email on Day 6 correctly references a ticket resolved on Day 4.
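
To make the "Day 6 email, Day 4 ticket" example concrete, here is a rough sketch of an append-only event log that only exposes events from the current simulated day or earlier. It is an illustration under assumptions: the `SimEvent` fields, the `EventLog` class, and the ticket IDs are invented here, and the scoring is plain keyword overlap standing in for the hybrid vector search the real engine is described as using.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SimEvent:
    day: int
    kind: str          # hypothetical label, e.g. "ticket_resolved"
    artifact_id: str   # hypothetical ID, e.g. "JIRA-101"
    facts: dict = field(default_factory=dict)


class EventLog:
    """Append-only log; lookups never see events from future days."""

    def __init__(self):
        self._events = []

    def emit(self, event: SimEvent) -> None:
        self._events.append(event)

    def visible_on(self, day: int) -> list:
        return [e for e in self._events if e.day <= day]

    def search(self, query: str, day: int, top_k: int = 3) -> list:
        # Toy relevance: count shared lowercase words (NOT vector search).
        terms = set(query.lower().split())
        scored = []
        for e in self.visible_on(day):
            text = f"{e.kind} {e.artifact_id} " + " ".join(map(str, e.facts.values()))
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append((overlap, e))
        # Best overlap first; among ties, the most recent event first.
        scored.sort(key=lambda pair: (-pair[0], -pair[1].day))
        return [e for _, e in scored[:top_k]]


log = EventLog()
log.emit(SimEvent(4, "ticket_resolved", "JIRA-101", {"summary": "checkout latency fixed"}))
log.emit(SimEvent(7, "ticket_opened", "JIRA-140", {"summary": "checkout page typo"}))

# An email drafted on Day 6 can cite the Day 4 ticket but not the Day 7 one.
hits = log.search("checkout latency", day=6)
print([e.artifact_id for e in hits])
```

The day filter is the part that matters for consistency: whatever retrieval method ranks the results, a writer on Day 6 is only ever handed facts that already exist.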
🕸️
Living Social Graph
NetworkX tracks weighted relationships. Edge weights decay daily and are boosted by collaboration. Burnout propagates via betweenness centrality. Escalations are routed along Dijkstra shortest paths computed over inverse edge weights, so strong relationships count as short hops.
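
The decay, boost, and routing ideas can be sketched without any dependencies. The version below deliberately swaps NetworkX for plain dictionaries and `heapq` so it runs anywhere; the `SocialGraph` class, the decay and boost constants, and the people's names are all assumptions made for illustration, and burnout propagation is left out.

```python
import heapq
from collections import defaultdict


class SocialGraph:
    """Undirected weighted graph; a higher weight means a stronger relationship."""

    def __init__(self, decay=0.95, boost=0.5, floor=0.05):
        self.weights = defaultdict(dict)
        self.decay, self.boost, self.floor = decay, boost, floor

    def connect(self, a, b, weight):
        if weight <= 0:
            raise ValueError("weight must be positive")
        self.weights[a][b] = self.weights[b][a] = weight

    def end_of_day(self):
        # Every relationship weakens a little, but never below the floor.
        for neighbours in self.weights.values():
            for other in neighbours:
                neighbours[other] = max(self.floor, neighbours[other] * self.decay)

    def collaborate(self, a, b):
        self.connect(a, b, self.weights[a].get(b, 0.0) + self.boost)

    def escalation_path(self, source, target):
        # Dijkstra where each hop costs 1 / weight, so strong ties are cheap.
        best, prev, heap = {source: 0.0}, {}, [(0.0, source)]
        while heap:
            cost, node = heapq.heappop(heap)
            if node == target:
                break
            if cost > best.get(node, float("inf")):
                continue  # stale heap entry
            for other, weight in self.weights[node].items():
                new_cost = cost + 1.0 / weight
                if new_cost < best.get(other, float("inf")):
                    best[other], prev[other] = new_cost, node
                    heapq.heappush(heap, (new_cost, other))
        if target not in best:
            return None
        path = [target]
        while path[-1] != source:
            path.append(prev[path[-1]])
        return path[::-1]


graph = SocialGraph()
graph.connect("dana", "lee", 2.0)
graph.connect("lee", "cto", 2.0)
graph.connect("dana", "cto", 0.4)   # weak direct tie

before = graph.escalation_path("dana", "cto")   # goes via lee
graph.collaborate("dana", "cto")
graph.collaborate("dana", "cto")                # direct tie is now 1.4
after = graph.escalation_path("dana", "cto")    # goes direct
print(before, after)
```

With NetworkX, the same route would typically come from storing `1 / weight` as an edge attribute (say `distance`) and calling `nx.dijkstra_path(G, source, target, weight="distance")`.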

What It Generates

Every artifact cross-references
the same ground truth.

OrgForge/confluence/general/CONF-MKT-001 · Written by Dana · Day 3 · Marketing

Spring Marathon Campaign Brief (CONF-MKT-001)

Introduction

The Spring Marathon Campaign is designed to engage and retain our key athlete and coach base by offering them exclusive access to the latest features of Apex Athletics' improved sports-tracking system. This campaign aims to leverage the momentum from the successful Project Titan deployment to highlight the benefits of our new system and increase user adoption.

Campaign Objectives

  • Awareness: Educate athletes and coaches about the enhanced features and benefits of the new system.
  • Engagement: Increase user engagement by providing exclusive access to advanced analytics and real-time performance tracking.
  • Retention: Secure long-term commitment from our user base by demonstrating the value of our modernized system.

Target Audience

  • Athletes: Professional and amateur athletes who rely on our system for training and performance analysis.
  • Coaches: Trainers and coaches who need accurate and real-time data to optimize athlete training programs.

Campaign Timeline

  • Q1 2024: Campaign Planning and Preparation
  • Q2 2024: Beta Testing and User Feedback
  • Q3 2024: Full Campaign Launch and Promotion
  • Q4 2024: Evaluation and Campaign Optimization

Generated by OrgForge simulation engine · Facts sourced from SimEvent log · No hallucination

Full Output

A configurable number of simulated days.
A complete synthetic corpus.

confluence/                Technical specs, OKRs, postmortems, retros
jira/                      Sprint tickets, P1 incidents, linked PRs
slack/channels/            Standups, incident alerts, engineering chatter
git/prs/                   Pull requests with graph-selected reviewers
emails/threads/            Incident escalations, sprint check-ins, HR comms
servers/logs/              AWS cost alerts, Snyk findings, CI output
simulation_snapshot.json   Full state: incidents, morale curve, all artifact IDs
simulation.log             Chronological debug log for the entire run
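
A quick way to sanity-check a finished run is to count the files under each top-level folder. The helper below is a sketch under assumptions: `count_artifacts` is a hypothetical function, not part of OrgForge, and the export root and file names are made up. To stay runnable anywhere, it builds a tiny fake export in a temporary directory rather than reading a real one.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_artifacts(export_root):
    """Count files under each top-level folder of an export directory."""
    root = Path(export_root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            parts = path.relative_to(root).parts
            # Files sitting directly in the root are grouped under ".".
            counts[parts[0] if len(parts) > 1 else "."] += 1
    return dict(counts)


# Fake export for demonstration only; these file names are invented.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("jira/TICKET-1.json", "slack/channels/eng.json",
                "slack/channels/incidents.json", "simulation.log"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("{}")
    summary = count_artifacts(root)

print(summary)
```

Pointing `count_artifacts` at a real export directory should give a per-folder tally in the same shape.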

For Developers

Up in three lines.

terminal
# Everything in Docker (recommended)
git clone https://github.com/aeriesec/orgforge
cd orgforge
docker compose up

local_cpu
Planner: Qwen 2.5 7B
Worker: Qwen 2.5 1.5B
Laptops · iteration

local_gpu
Planner: Llama 3.3 70B
Worker: Llama 3.1 8B
Offline · high quality

cloud
Planner: Claude 4.6 Sonnet
Worker: Llama 3.1 8B
Production datasets