I want to delete some datasets. How do I do that?

Last updated: August 25, 2025

This article describes hny-dataset-cleanup.py, a Python script for deleting unwanted datasets in bulk from a Honeycomb environment. The script is available on GitHub:

https://github.com/honeycombio/field-utils/blob/main/tools/hny_dataset_cleanup_tool/hny-dataset-cleanup.py

Dataset Cleanup Tool (hny-dataset-cleanup.py)

Prerequisites

  • Python 3.11 or later

  • Python requests library (pip install requests)

  • Honeycomb API key with “Create Dataset” permission

Usage

python hny-dataset-cleanup.py -k API_KEY [options]

Required Arguments

  • -k, --api-key: Honeycomb API key.

Optional Arguments

  • -a, --api-host: Honeycomb API hostname (default: api.honeycomb.io).

  • -m, --mode: Type of datasets to target:

    • spammy: Datasets with spammy strings (potentially created during pentesting).

    • date: Datasets created on a specific date.

    • lastwritten: Datasets not written since a specified date.

  • --date YYYY-MM-DD: Required with date or lastwritten modes.

  • --dry-run: Preview datasets targeted without deleting.
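
For readers who want to see how the options above fit together, the following is a minimal sketch of an equivalent argparse declaration. It is not taken from the script itself: the function names build_parser and parse_args are hypothetical, and because this article does not state a default for --mode, the sketch leaves it unset.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags documented in this article (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="hny-dataset-cleanup.py",
        description="Sketch of the documented command-line interface.",
    )
    parser.add_argument("-k", "--api-key", required=True,
                        help="Honeycomb API key")
    parser.add_argument("-a", "--api-host", default="api.honeycomb.io",
                        help="Honeycomb API hostname")
    # Assumption: no default mode is documented, so none is set here.
    parser.add_argument("-m", "--mode",
                        choices=["spammy", "date", "lastwritten"],
                        help="Type of datasets to target")
    parser.add_argument("--date", metavar="YYYY-MM-DD",
                        help="Required with date or lastwritten modes")
    parser.add_argument("--dry-run", action="store_true",
                        help="Preview targeted datasets without deleting")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and enforce the documented --date requirement."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode in ("date", "lastwritten") and not args.date:
        parser.error("--date is required with -m date or -m lastwritten")
    return args
```

Calling parse_args(["-k", "YOUR_API_KEY", "-m", "spammy", "--dry-run"]) returns a namespace with dry_run set to True, while omitting --date in date or lastwritten mode exits with a usage error.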

Examples

  • Preview the removal of spammy datasets:

    python hny-dataset-cleanup.py -k YOUR_API_KEY -m spammy --dry-run
    
  • Delete datasets created on a specific date:

    python hny-dataset-cleanup.py -k YOUR_API_KEY -m date --date 2023-02-20
    
  • Remove datasets not written to since a specific date:

    python hny-dataset-cleanup.py -k YOUR_API_KEY -m lastwritten --date 2023-01-01
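
To make the two date-based modes concrete, here is a hedged sketch of how datasets might be selected. It is not the script's actual logic. It assumes each dataset is a dict with created_at and last_written_at timestamps in RFC 3339 form, that dates are compared in UTC, and that a dataset with no last_written_at value is skipped rather than deleted. The helper names parse_timestamp and select_datasets are hypothetical.

```python
from datetime import date, datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp such as 2023-02-20T10:00:00Z."""
    # Replace a trailing Z so older Python versions can also parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_datasets(datasets: list[dict], mode: str, target: date) -> list[dict]:
    """Return the datasets that the given date-based mode would target."""
    if mode not in ("date", "lastwritten"):
        raise ValueError(f"unsupported mode: {mode}")
    selected = []
    for dataset in datasets:
        if mode == "date":
            # Match datasets created on the target day.
            stamp = dataset.get("created_at")
            if stamp and parse_timestamp(stamp).date() == target:
                selected.append(dataset)
        else:
            # Match datasets whose last write is older than the target day.
            # Assumption: datasets that were never written to are skipped.
            stamp = dataset.get("last_written_at")
            if stamp and parse_timestamp(stamp).date() < target:
                selected.append(dataset)
    return selected
```

Under these assumptions, the date mode is an exact-day match on creation time, whereas lastwritten is a cutoff: everything last written before the given day is selected.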
    

Best Practices 

  • Always run with --dry-run first to verify which datasets are targeted.

  • Validate targets carefully, particularly when using regex patterns or date-based criteria.
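
One way to make the dry-run-first habit hard to skip is a small wrapper that previews, asks for confirmation, and only then deletes. The sketch below is an illustration and not part of the tool: build_command and cleanup_with_preview are hypothetical names, the script is assumed to sit in the current directory, and only the flags documented in this article are used.

```python
import subprocess
import sys


def build_command(api_key, mode, date=None, dry_run=True):
    """Build the argument list for one run of hny-dataset-cleanup.py."""
    cmd = [sys.executable, "hny-dataset-cleanup.py", "-k", api_key, "-m", mode]
    if date:
        cmd += ["--date", date]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def cleanup_with_preview(api_key, mode, date=None):
    """Run a dry run, ask for confirmation, then run the real deletion."""
    # Step 1: preview the targeted datasets without deleting anything.
    subprocess.run(build_command(api_key, mode, date, dry_run=True), check=True)

    # Step 2: require an explicit "yes" before anything is deleted.
    answer = input("Proceed with deletion? Type 'yes' to continue: ")
    if answer.strip().lower() != "yes":
        print("Aborted; nothing was deleted.")
        return False

    # Step 3: repeat the same command without --dry-run.
    subprocess.run(build_command(api_key, mode, date, dry_run=False), check=True)
    return True
```

Because both runs are built from the same arguments, the datasets shown in the preview are selected by the same criteria as the ones deleted afterwards.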