
Event Taxonomy from Scratch: AI-Generated Tracking Plan in 1 Hour

What is an event taxonomy?

An event taxonomy is a controlled vocabulary that defines which user actions to track in a product, how to name them consistently, and what properties to attach to each event. It prevents analytics chaos — where the same action is tracked under multiple inconsistent names — and makes data usable for analysis without manual cleanup.

TL;DR

  • Without an event taxonomy, SaaS analytics accumulate 200+ events where 60% are duplicates or garbage within 6 months
  • Naming formula: Object + Action in past tense — 'Subscription Created', not 'create_subscription' or 'btn-clicked'
  • Three property tiers: required (timestamp, user_id, session_id), object-specific, and contextual
  • Typical SaaS needs 8–12 objects; more than 12 means the taxonomy is too granular
  • AI generates a 30-event tracking plan from a product description in under 1 hour using structured prompts

Most data in SaaS analytics systems is unfit for analysis. The problem isn’t the tools. The problem is the absence of an event taxonomy — a formalized structure for naming and classifying events. Without it, data becomes chaos: click_button, buttonClick, btn-clicked describe the same action three different ways. An analyst spends hours deciphering instead of analyzing.

This article covers how to design an event taxonomy from scratch, using AI to generate a tracking plan with 30 events in one hour. It covers taxonomy structure, naming conventions, prompts, a ready-to-use SaaS example, and a step-by-step implementation process.

What Is an Event Taxonomy

An event taxonomy is a controlled vocabulary of product events. It defines three things:

  1. What events to track. Not everything — only specific user actions that are meaningful for the business.
  2. How to name them. Consistent naming conventions that make events readable without documentation.
  3. What properties to send. A set of attributes (properties) for each event that provide context.

Without a taxonomy, the following happens:

  • Developers name events arbitrarily. One writes signup_completed, another user_signed_up, a third registration_done.
  • Properties aren’t standardized. One place uses plan_name, another planType, another subscription_tier.
  • Six months later, the system has 200+ events, 60% of which are duplicates or garbage.
  • An analyst can’t build a funnel because they don’t know which of the three registration events is current.

An event taxonomy solves this problem at the data architecture level. Designing it correctly once is cheaper than spending a month cleaning data later.

Three Levels of Structure

A working taxonomy is built on three levels: object, action, properties.

Level 1: Object

An object is a product entity the user interacts with. Examples: Account, Project, Invoice, Subscription, Report.

Objects form the top level of the hierarchy. A typical SaaS product needs 8–12 objects. More than 12 is a sign the taxonomy is too granular.

Level 2: Action

An action is what happens to an object. Standard action set for SaaS:

| Action | Meaning | Event example |
| --- | --- | --- |
| Created | Object was created | Project Created |
| Updated | Object was modified | Project Updated |
| Deleted | Object was removed | Project Deleted |
| Viewed | Object was viewed | Report Viewed |
| Completed | Process was completed | Onboarding Completed |
| Started | Process was initiated | Trial Started |
| Submitted | Form was submitted | Feedback Submitted |
| Exported | Data was exported | Report Exported |

Event naming formula: Object + Action (past tense). Subscription Created, not Create Subscription or subscription.create.

Level 3: Properties

Properties give an event context. Without them, Subscription Created is meaningless: you don’t know which plan, billing period, or acquisition source.

Properties fall into three categories:

Required. Sent with every event. Minimum set: timestamp, user_id, session_id.

Object-specific. Depend on the object. For Subscription Created: plan_name, billing_cycle, price, currency. For Report Exported: report_type, format, row_count.

Contextual. Describe circumstances: source (where they came from), device_type, referrer, experiment_id.
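As a concrete illustration, the three tiers can be assembled into one event payload. This is a minimal sketch, not a specific SDK's API; the `build_event` helper and its argument names are hypothetical:

```python
import time
import uuid

def build_event(name, user_id, session_id, object_props=None, context_props=None):
    """Assemble an event payload from the three property tiers."""
    payload = {
        # Tier 1: required properties, attached to every event
        "event": name,
        "timestamp": int(time.time()),
        "user_id": user_id,
        "session_id": session_id,
    }
    payload.update(object_props or {})   # Tier 2: object-specific
    payload.update(context_props or {})  # Tier 3: contextual
    return payload

event = build_event(
    "Subscription Created",
    user_id="u_123",
    session_id=str(uuid.uuid4()),
    object_props={"plan_name": "pro", "billing_cycle": "annual",
                  "price": 290.0, "currency": "USD"},
    context_props={"source": "pricing_page", "device_type": "desktop"},
)
```

Keeping the merge order fixed (required, then object-specific, then contextual) makes it obvious which tier wins if a key ever collides.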

Naming Conventions That Scale

Naming conventions determine how durable the taxonomy is. Without them, the system degrades with every new developer. Here’s a set of rules that works for products of any size.

Event Format: Object Action

✅ Subscription Created
✅ Report Exported
✅ Onboarding Step Completed

❌ created_subscription
❌ reportExported
❌ onboarding.step.completed

Title Case with spaces. Reads like natural language. Mixpanel, Amplitude, and PostHog support this format natively.

Past tense for actions. An event records something that already happened. Created, not Create. Viewed, not View.
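Both rules can be checked mechanically, for example in a pre-commit hook. The sketch below approximates "past tense" as a trailing -ed plus a small whitelist of irregular forms; that heuristic is an assumption, not a complete grammar check:

```python
import re

# Irregular past-tense forms that don't end in -ed (illustrative, extend as needed)
IRREGULAR_PAST = {"Sent", "Built", "Set", "Reset", "Read"}

def is_valid_event_name(name: str) -> bool:
    """Check 'Object Action' format: Title Case words, last word in past tense."""
    words = name.split(" ")
    if len(words) < 2:                                  # needs Object + Action
        return False
    if not all(re.fullmatch(r"[A-Z][a-z]+", w) for w in words):
        return False                                    # Title Case, letters only
    action = words[-1]
    return action.endswith("ed") or action in IRREGULAR_PAST

assert is_valid_event_name("Subscription Created")
assert not is_valid_event_name("create_subscription")   # wrong case and order
assert not is_valid_event_name("Clicked")               # verb without object
```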

Property Format: snake_case

✅ plan_name
✅ billing_cycle
✅ is_annual

❌ planName
❌ BillingCycle
❌ isAnnual

snake_case for properties. Compatible with all analytics platforms and consistent with SQL, which is what most data queries are written in.

Data Type Rules

| Rule | Example | Why |
| --- | --- | --- |
| Booleans start with is_ or has_ | is_annual, has_discount | Self-explanatory without docs |
| Counts end with _count | member_count, item_count | Distinguishes from IDs |
| Amounts include _amount and currency | total_amount, currency: "USD" | No ambiguity |
| Timestamps end with _at | created_at, expired_at | Distinguishes from dates |
| Identifiers end with _id | project_id, workspace_id | Not confused with values |
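These rules are also automatable. A minimal sketch of a property linter, with an illustrative rule set (the function name and messages are hypothetical):

```python
def lint_property(name, value):
    """Return a list of violations of the data type rules for one property."""
    issues = []
    if isinstance(value, bool) and not name.startswith(("is_", "has_")):
        issues.append(f"boolean '{name}' should start with is_ or has_")
    if name.endswith("_id") and not isinstance(value, str):
        issues.append(f"identifier '{name}' should be a string")
    if name.endswith("_count") and not isinstance(value, int):
        issues.append(f"count '{name}' should be an integer")
    if name.endswith("_amount") and not isinstance(value, (int, float)):
        issues.append(f"amount '{name}' should be numeric")
    return issues

assert lint_property("is_annual", True) == []
assert lint_property("annual", True)        # flags missing is_/has_ prefix
assert lint_property("member_count", "12")  # flags non-integer count
```

Running this over every property in the tracking plan catches convention drift before it reaches production.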

Forbidden Patterns

  • Verb events without objects. Clicked, Viewed, Submitted. What exactly? Always add the object.
  • Nested properties. plan.name, user.email. Flat structure is better for SQL queries and compatibility.
  • Array properties at root level. tags: ["a", "b"]. Most analytics systems can’t filter by array elements. Better: tag_count: 2, primary_tag: "a".
  • PII in properties. Email, name, phone number are not sent in events. Only user_id — everything else through user profiles.
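The array and PII rules can be enforced in one normalization pass before sending. A sketch, assuming plain-dict properties; `PII_KEYS` and `normalize_properties` are illustrative names, not a real SDK:

```python
# Keys that must never appear in event properties (illustrative list)
PII_KEYS = {"email", "name", "phone"}

def normalize_properties(props):
    clean = {}
    for key, value in props.items():
        if key in PII_KEYS:
            continue                                 # never send PII in events
        if isinstance(value, list):
            singular = key.rstrip("s")               # crude: tags -> tag
            clean[f"{singular}_count"] = len(value)  # arrays become counts
            if value:
                clean[f"primary_{singular}"] = value[0]
        else:
            clean[key.replace(".", "_")] = value     # plan.name -> plan_name
    return clean

result = normalize_properties({
    "tags": ["billing", "urgent"],
    "plan.name": "pro",
    "email": "user@example.com",
})
# result == {"tag_count": 2, "primary_tag": "billing", "plan_name": "pro"}
```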

Prompts for Generating a Tracking Plan

AI speeds up taxonomy design by 5–10x. Instead of manually walking through screens and features, a structured prompt generates a complete tracking plan.

Base Prompt

You are a product analytics architect. Task: design an event taxonomy
for a SaaS product.

PRODUCT: [name and one sentence about its function]
KEY FEATURES: [list of 5-8 main features]
BUSINESS MODEL: [freemium / trial / enterprise]
ANALYTICS PLATFORM: [Mixpanel / Amplitude / PostHog]

NAMING CONVENTIONS:
- Events: Title Case, Object + Action (past tense)
- Properties: snake_case
- Booleans: is_ / has_ prefix
- Counts: _count suffix
- Identifiers: _id suffix

REQUIREMENTS:
1. 25-35 events covering the full user lifecycle:
   - Acquisition (registration, onboarding)
   - Activation (first meaningful action)
   - Engagement (core product actions)
   - Revenue (subscription, payment)
   - Retention (return, reactivation)
2. For each event: name, description (1 line), list of properties
   with data types
3. Separately: list of super properties (sent with every event)
4. Separately: user profile properties

OUTPUT FORMAT: Markdown table.

Validation Prompt

After generating the plan, run it through a second prompt:

Check this tracking plan for common mistakes:

1. DUPLICATES: are there events describing the same action?
2. GAPS: which user lifecycle stages are not covered?
3. NAMING: do all events follow the "Object Action" format (Title Case,
   past tense)?
4. PROPERTIES: are there properties without data types? Any PII?
5. GRANULARITY: are there events that could be merged?
   Are there overly broad events that should be split?

[paste tracking plan]

This two-step process (generate + validate) mirrors the LLM-as-Judge pattern. One AI generates, another checks. The result is consistently better than a single-pass generation.

30 Events for a SaaS Product

A concrete example for a typical B2B SaaS: a project management platform on a freemium model, with projects, tasks, collaboration, reports, and integrations as the key features.

Super Properties (sent with every event)

| Property | Type | Description |
| --- | --- | --- |
| workspace_id | string | Workspace ID |
| plan_name | string | Current plan (free/pro/enterprise) |
| user_role | string | Role in workspace (owner/admin/member) |
| is_trial | boolean | User is on trial |
| session_id | string | Current session ID |
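Mechanically, super properties are registered once and merged into every subsequent event, similar to Mixpanel's client-side register() semantics. The `Tracker` class below is a hypothetical in-memory sketch, not a vendor SDK:

```python
class Tracker:
    """Minimal sketch of super-property handling; the network call is stubbed."""

    def __init__(self):
        self._super_props = {}
        self.sent = []

    def register(self, props):
        """Set super properties that accompany every future event."""
        self._super_props.update(props)

    def track(self, event_name, props=None):
        # Per-event props override super props on key collision
        payload = {**self._super_props, **(props or {}), "event": event_name}
        self.sent.append(payload)            # stand-in for sending over the network
        return payload

t = Tracker()
t.register({"workspace_id": "ws_42", "plan_name": "pro", "is_trial": False})
payload = t.track("Report Exported", {"report_type": "burndown", "format": "csv"})
# payload carries workspace_id and plan_name without repeating them at each call site
```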

Acquisition (5 events)

| # | Event | Description | Key Properties |
| --- | --- | --- | --- |
| 1 | Account Created | User completed registration | signup_method (string), referral_source (string), utm_source (string), utm_medium (string), utm_campaign (string) |
| 2 | Onboarding Started | User started the onboarding flow | onboarding_version (string) |
| 3 | Onboarding Step Completed | One onboarding step completed | step_name (string), step_number (int), is_skipped (boolean) |
| 4 | Onboarding Completed | Full onboarding finished | total_steps_completed (int), total_steps_skipped (int), duration_seconds (int) |
| 5 | Teammate Invited | Workspace invitation sent | invite_method (string), invitee_role (string) |

Activation (5 events)

| # | Event | Description | Key Properties |
| --- | --- | --- | --- |
| 6 | Project Created | First/new project created | project_type (string), is_template (boolean), template_name (string) |
| 7 | Task Created | Task created | project_id (string), has_due_date (boolean), has_assignee (boolean), priority (string) |
| 8 | Task Completed | Task marked as done | project_id (string), time_to_complete_hours (float), priority (string) |
| 9 | Integration Connected | External integration connected | integration_name (string), integration_category (string) |
| 10 | First Value Moment Reached | User reached the aha moment | trigger_event (string), days_since_signup (int) |

Engagement (10 events)

| # | Event | Description | Key Properties |
| --- | --- | --- | --- |
| 11 | Task Updated | Task parameters changed | project_id (string), fields_changed (string), change_count (int) |
| 12 | Task Assigned | Task assigned to a user | project_id (string), assignee_role (string), is_reassignment (boolean) |
| 13 | Comment Created | Comment added to a task | project_id (string), task_id (string), has_mention (boolean), has_attachment (boolean) |
| 14 | File Uploaded | File uploaded to project/task | file_type (string), file_size_kb (int), upload_context (string) |
| 15 | Report Viewed | Report/dashboard viewed | report_type (string), date_range (string) |
| 16 | Report Exported | Report exported | report_type (string), format (string), row_count (int) |
| 17 | Search Performed | Search executed | query_length (int), result_count (int), search_context (string) |
| 18 | Filter Applied | Filter applied to a list | filter_type (string), filter_value_count (int), context (string) |
| 19 | View Switched | Display mode switched | from_view (string), to_view (string) |
| 20 | Notification Clicked | Notification clicked | notification_type (string), channel (string), time_to_click_seconds (int) |

Revenue (6 events)

| # | Event | Description | Key Properties |
| --- | --- | --- | --- |
| 21 | Trial Started | Trial period activated | trial_duration_days (int), plan_name (string) |
| 22 | Trial Ended | Trial expired | converted (boolean), days_active (int), feature_usage_count (int) |
| 23 | Subscription Created | Paid subscription started | plan_name (string), billing_cycle (string), price_amount (float), currency (string), coupon_code (string) |
| 24 | Subscription Upgraded | Moved to a higher plan | from_plan (string), to_plan (string), upgrade_reason (string) |
| 25 | Subscription Downgraded | Moved to a lower plan | from_plan (string), to_plan (string), downgrade_reason (string) |
| 26 | Subscription Cancelled | Subscription cancelled | plan_name (string), cancel_reason (string), lifetime_days (int), feedback_text (string) |

Retention (4 events)

| # | Event | Description | Key Properties |
| --- | --- | --- | --- |
| 27 | Session Started | New session started | days_since_last_session (int), device_type (string), entry_page (string) |
| 28 | Feature Discovered | User used a feature for the first time | feature_name (string), discovery_context (string), days_since_signup (int) |
| 29 | Workspace Setting Changed | Workspace setting changed | setting_name (string), old_value (string), new_value (string) |
| 30 | Account Deleted | User deleted their account | delete_reason (string), lifetime_days (int), plan_at_deletion (string) |

User Profile Properties

Sent separately from events, updated on change:

| Property | Type | Description |
| --- | --- | --- |
| signup_date | date | Registration date |
| plan_name | string | Current plan |
| company_name | string | Company name |
| company_size | string | Size (1-10, 11-50, 51-200, 200+) |
| industry | string | Industry |
| total_projects_created | int | Total projects created (increment) |
| total_tasks_completed | int | Total tasks completed (increment) |
| last_active_at | datetime | Last activity |
| lifetime_value | float | Total payments |
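The "(increment)" note matters in implementation: counters are accumulated, not overwritten. A sketch of the set-versus-increment distinction with an in-memory store standing in for a platform's profile API (class and method names are hypothetical):

```python
class ProfileStore:
    """Minimal stand-in for an analytics platform's user profile API."""

    def __init__(self):
        self.profiles = {}

    def set(self, user_id, props):
        """Overwrite profile properties, e.g. plan_name on upgrade."""
        self.profiles.setdefault(user_id, {}).update(props)

    def increment(self, user_id, prop, by=1):
        """Accumulate a counter, e.g. total_tasks_completed."""
        profile = self.profiles.setdefault(user_id, {})
        profile[prop] = profile.get(prop, 0) + by

store = ProfileStore()
store.set("u_123", {"plan_name": "pro", "company_size": "11-50"})
store.increment("u_123", "total_tasks_completed")
store.increment("u_123", "total_tasks_completed")
# store.profiles["u_123"]["total_tasks_completed"] == 2
```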

From Prompt to Implementation in 1 Hour

Minutes 0–10: Prepare Input Data

Before running AI, gather three things:

  1. A list of product screens. Walk through the main sections, record the features. For an MVP, screenshots or a list of 5–8 points is enough.
  2. Business questions. What questions should analytics answer? “Where do users drop off?”, “Which features correlate with conversion to paid?”, “Which onboarding step causes the most drop-off?”
  3. Current state. If tracking already exists, export the list of current events. AI will help map old events to the new taxonomy.

Minutes 10–25: Generate the Taxonomy

Plug the data into the base prompt and run the generation. AI produces a first version in 2–3 minutes. At this stage:

  • Verify coverage of all lifecycle stages (acquisition through retention)
  • Confirm naming conventions are followed
  • Remove events that don’t answer any business question

Minutes 25–40: Validate and Refine

Run the result through the validation prompt. Typical findings:

  • Missing cancellation/downgrade events (revenue-critical)
  • Overly granular engagement events (can be merged)
  • Missing segmentation properties (plan, role)
  • No activation marker event (first value moment)

Make corrections, run validation again.

Minutes 40–50: Documentation

The final tracking plan should contain:

  1. Naming conventions. Naming rules (so new developers don’t break the schema).
  2. Events table. Name, description, properties with types.
  3. Super properties. What gets sent with every event.
  4. User profile properties. What gets updated in the profile.
  5. Governance rules. Who can add new events, the review process.

Documentation can be generated with AI too: feed it the final table and ask it to format it as an internal standard.

Minutes 50–60: Implementation Plan

Last step: the rollout plan. Prompt for generation:

Based on this tracking plan, create an implementation plan:

1. Prioritization: which events to implement first
   (criteria: business value of the questions they answer)
2. For each event: where in the code to call track()
3. QA checklist: how to verify an event is sent correctly

[paste tracking plan]

Common Design Mistakes

Too many events. 100+ events for an MVP leads to a data swamp. 25–35 events cover 90% of analytics questions. New events should only be added when a specific business question arises that can’t be answered with existing data.

Tracking UI actions instead of business actions. Button Clicked, Dropdown Opened, Modal Closed — that’s UX analytics, not product analytics. Track business actions: Task Created, not Create Task Button Clicked.

No governance. Without a process for adding new events, the taxonomy degrades within 3–6 months. Minimum governance: PR review for tracking code, where the reviewer checks for naming convention compliance.

Ignoring negative events. Subscription Cancelled, Account Deleted, Trial Ended (converted: false) — uncomfortable but critically important for the business. AI often misses them in the first iteration because it focuses on the happy path.

Same properties with different names. plan, plan_name, plan_type, subscription_plan — four names for one property across different events. Super properties solve this: define plan_name once, send automatically.

Tools for Managing the Tracking Plan

A tracking plan doesn’t live in Google Docs. For products with 2+ developers, you need a tool with versioning and validation.

Avo. A specialized tracking plan management tool. Visual editor, code generation, CI/CD validation (verifies that sent events match the schema). Free plan for small teams.

Amplitude Data. Built-in tracking plan in Amplitude. Automatically detects schema deviations, surfaces unexplained events.

Mixpanel Lexicon. Taxonomy management inside Mixpanel. Events can be hidden, renamed, and described without code changes.

Google Sheets + Git. For teams of 1–3 developers: a table with events in the repository. Cheap, versioned, reviewed via PR.

Monitoring Data Quality After Launch

A tracking plan without monitoring will be outdated within a month. Three metrics to track:

Schema compliance rate. Percentage of events conforming to the schema. Target: 99%+. If lower, someone is adding events outside the process.

Property fill rate. Percentage of events where required properties are populated. plan_name: null in 30% of Subscription Created events points to a code problem, not a taxonomy problem.

Event volume anomalies. A sharp drop or spike in a specific event’s volume indicates a bug. Set up alerts for deviations greater than 50% from the 7-day average.
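All three metrics reduce to simple computations over the event stream. A sketch with events as plain dicts and thresholds matching the text (99% compliance, 50% volume deviation); function names are illustrative:

```python
from statistics import mean

def schema_compliance_rate(events, known_names):
    """Share of events whose name appears in the tracking plan."""
    ok = sum(1 for e in events if e["event"] in known_names)
    return ok / len(events)

def property_fill_rate(events, event_name, prop):
    """Share of a given event's instances with the required property populated."""
    matching = [e for e in events if e["event"] == event_name]
    filled = sum(1 for e in matching if e.get(prop) is not None)
    return filled / len(matching)

def is_volume_anomaly(daily_counts, today_count, threshold=0.5):
    """Alert when today's volume deviates >50% from the 7-day average."""
    baseline = mean(daily_counts[-7:])
    return abs(today_count - baseline) / baseline > threshold

events = [
    {"event": "Subscription Created", "plan_name": "pro"},
    {"event": "Subscription Created", "plan_name": None},
    {"event": "buttonClick"},   # rogue event added outside the process
]
assert schema_compliance_rate(events, {"Subscription Created"}) < 0.99
assert property_fill_rate(events, "Subscription Created", "plan_name") == 0.5
assert is_volume_anomaly([100, 110, 90, 105, 95, 100, 100], 20)
```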

For monitoring LLM systems, including those that generate events, the observability tools described in the Langfuse guide apply. Same principle: log, version, track quality.

Getting Started

  1. List 5–8 key product features. Screens users see. Actions they pay for.
  2. Formulate 10 business questions. “Where do we lose users?”, “Which features do paying customers use?”, “Which acquisition channel gives the best retention?”
  3. Run the base prompt. Plug in the product and features, get a first tracking plan.
  4. Validate with the second prompt. Find gaps, duplicates, naming violations.
  5. Implement 10 events from acquisition and activation. Not all 30 at once. First iteration: registration, onboarding, first meaningful action, conversion to paid.
  6. Set up monitoring. Schema compliance, property fill rate, volume anomalies.
  7. Add events iteratively. A new event only when a business question arises that can’t be answered with current data.

An event taxonomy isn’t a one-time project. It’s a living document that grows with the product. AI reduces design time from days to an hour, but governance and monitoring remain the team’s responsibility.

FAQ

How should you handle events that span multiple objects — for example, when a user assigns a task to a teammate, which involves both Task and User objects?

Use the primary object as the event subject — the one whose state fundamentally changes. Task Assigned is correct because the task gains an assignee; the user’s state doesn’t change. Add the secondary object as a property: assignee_id, assignee_role. This keeps the Object + Action naming formula unambiguous and prevents combinatorial explosion (you’d otherwise need Task User Assignment Created, which describes a relationship rather than a business action). The rule of thumb: if the event appears on the Task’s timeline, the Task is the object.

What is the practical difference between schema compliance rate and property fill rate, and which matters more for a team of 3 developers?

Schema compliance rate measures whether events are arriving at all and are named correctly — it catches typos, rogue buttonClick events, and new events bypassing review. Property fill rate measures whether required fields inside valid events are populated — it catches bugs where plan_name is null because the frontend didn’t pass the subscription context. For small teams, property fill rate is more actionable: a compliance drop usually means a developer added an event outside the process (easy to spot in code review), while a fill rate drop means a runtime bug silently polluting weeks of data. Prioritize fill rate monitoring with daily alerts on any required property falling below 95%.

Can the same tracking plan work across Mixpanel, Amplitude, and PostHog, or do platform-specific constraints require separate taxonomies?

A single taxonomy works for all three with minor formatting notes. All three platforms support Title Case event names and snake_case properties natively. The main divergence is super properties: Mixpanel calls them “super properties” and applies them client-side, Amplitude uses “user properties” set via identify calls, and PostHog uses “person properties.” The events table stays identical; only the SDK implementation of super properties differs. Maintain one canonical tracking plan document and annotate per-platform implementation notes in a separate “SDK Notes” column rather than maintaining three separate taxonomies.