Matt Godbolt 4cb1416c2a Add gh_tool CLI for GitHub repository automation (#8170)
This PR adds a new Python CLI tool for automating GitHub repository
management tasks.

## Overview

The initial implementation provides duplicate issue detection using text
similarity analysis. This is the first step toward automating repository
triage tasks.

## Features

- **Click-based CLI** with subcommands for future extensibility
- **find-duplicates command** for detecting duplicate issues using text
similarity
- Uses **gh CLI** for GitHub API access (no token management needed)
- Text similarity using `difflib.SequenceMatcher` (ratio-based
algorithm)
- Configurable similarity threshold (default: 0.6)
- Progress bar for long-running comparisons
- Age filtering support (`--min-age` parameter)
- Standard Python src-layout with **uv** for dependency management
- **Comprehensive test suite** with pytest (integrated into CI)
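
As a minimal sketch of the ratio-based comparison listed above (not the tool's actual code): the helper names `similarity` and `is_duplicate`, the case and whitespace normalisation, and the use of `>=` against the threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

DEFAULT_THRESHOLD = 0.6  # the documented default for the similarity threshold


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings (hypothetical helper)."""
    # Lowercasing and stripping is an assumption; the PR does not say
    # whether the tool normalises text before comparing.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_duplicate(a: str, b: str, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Treat two texts as likely duplicates when the ratio reaches the threshold."""
    return similarity(a, b) >= threshold
```

Raising the threshold (for example to 0.85, as in the usage examples) trades recall for fewer false positives.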

## Project Structure

```
etc/scripts/gh_tool/
├── src/gh_tool/          # Main package
│   ├── cli.py            # Click-based CLI interface
│   └── duplicate_finder.py  # Core duplicate detection logic
├── tests/                # Test suite
│   └── test_duplicate_finder.py
├── docs/                 # Documentation
│   ├── TRIAGE-CRITERIA.md    # Triage guidelines from manual review
│   └── PHASE1-FINDINGS.md    # Historical analysis of 855 issues
├── pyproject.toml        # Package configuration
└── README.md             # Usage documentation
```

## Usage

```bash
cd etc/scripts/gh_tool
uv sync
uv run gh_tool find-duplicates /tmp/report.md
```

**Options:**
- `--threshold FLOAT` - Similarity threshold 0-1 (default: 0.6)
- `--state {all,open,closed}` - Issue state to check (default: open)
- `--min-age DAYS` - Only check issues older than N days (default: 0)
- `--limit INTEGER` - Maximum number of issues to fetch (default: 1000)
- `--repo TEXT` - GitHub repository in owner/repo format (default:
compiler-explorer/compiler-explorer)
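
The options above map fairly naturally onto a `gh issue list` invocation. The following is a sketch under stated assumptions rather than the tool's real implementation: the function names are hypothetical, and the choice of JSON fields (`number`, `title`, `createdAt`) is a guess at what a duplicate finder would need.

```python
import json
import subprocess
from datetime import datetime, timedelta, timezone


def build_gh_command(repo: str, state: str = "open", limit: int = 1000) -> list:
    """Build a `gh issue list` command line from the documented options."""
    return [
        "gh", "issue", "list",
        "--repo", repo,
        "--state", state,
        "--limit", str(limit),
        "--json", "number,title,createdAt",  # assumed field selection
    ]


def fetch_issues(repo: str, state: str = "open", limit: int = 1000) -> list:
    """Run gh (which must be installed and authenticated) and parse its JSON output."""
    result = subprocess.run(
        build_gh_command(repo, state, limit),
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def older_than(issues: list, min_age_days: int, now: datetime = None) -> list:
    """Keep only issues created at least `min_age_days` days ago (cf. --min-age)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    return [
        issue for issue in issues
        # gh reports timestamps such as "2025-01-01T00:00:00Z"
        if datetime.fromisoformat(issue["createdAt"].replace("Z", "+00:00")) <= cutoff
    ]
```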

**Example:**
```bash
# Find high-confidence duplicates in open issues
uv run gh_tool find-duplicates /tmp/report.md --threshold 0.85

# Check all issues older than 30 days
uv run gh_tool find-duplicates /tmp/report.md --state all --min-age 30
```

## Testing

The tool includes comprehensive test coverage:
- Unit tests for similarity calculation
- Integration tests for duplicate detection
- Edge case handling (transitive grouping, age filtering, threshold
sensitivity)
- Report generation validation
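
Transitive grouping, one of the edge cases listed above, means that if issue A matches B and B matches C, all three land in one group even when A and C fall below the threshold. The PR does not say how the tool implements this; one common approach is a union-find structure, sketched here with a hypothetical `group_transitively` function.

```python
def group_transitively(pairs: list) -> list:
    """Merge pairwise matches (issue_a, issue_b) into transitive groups."""
    parent = {}

    def find(x: int) -> int:
        # Register unseen issues as their own root, then walk up to the root,
        # shortening the path as we go (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two groups

    groups = {}
    for issue in list(parent):
        groups.setdefault(find(issue), []).append(issue)
    # Sort for deterministic output.
    return sorted(sorted(group) for group in groups.values())
```

For example, the pairs `(1, 2)`, `(2, 3)` and `(10, 11)` yield the groups `[1, 2, 3]` and `[10, 11]`.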

**Run tests:**
```bash
cd etc/scripts/gh_tool
uv run pytest -v
```

Tests are integrated into CI and run on every push.

## Documentation

- **`README.md`**: Complete usage guide with examples
- **`docs/TRIAGE-CRITERIA.md`**: Comprehensive triage guidelines
developed during manual review of 22+ issues
- **`docs/PHASE1-FINDINGS.md`**: Historical analysis context from
initial 855 issue review

## CI Integration

The tool is integrated into the GitHub Actions workflow:
- `uv` is installed via `astral-sh/setup-uv@v6`
- Tests run automatically on every push
- Ensures tool remains functional as codebase evolves

## Next Steps

Future enhancements planned for follow-up PRs:
- GitHub Action for automatic duplicate detection on new issues
- Additional automation tools (upstream health checker, label validator,
etc.)
- Automated triage reports

## Changes in this PR

- Core duplicate detection implementation
- Comprehensive test suite (192 lines)
- CI integration
- Complete documentation
- Example triage criteria and findings

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-07 14:50:22 -05:00


Compiler Explorer

Compiler Explorer is an interactive compiler exploration website. Edit code in C, C++, C#, F#, Rust, Go, D, Haskell, Swift, Pascal, ispc, Python, Java, or any of the other 30+ supported languages, and see how that code looks after being compiled in real time.

Bug Report · Compiler Request · Feature Request · Language Request · Library Request · Report Vulnerability

Overview

Multiple compilers are supported for each language, many different tools and visualizations are available, and the UI layout is configurable (thanks to GoldenLayout).

Try it out at godbolt.org, or run your own local instance. An overview of what the site lets you achieve, why it's useful, and how to use it is available here, or in this talk.

Compiler Explorer follows a Code of Conduct which aims to foster an open and welcoming environment.

Compiler Explorer was started in 2012 to show how C++ constructs are translated to assembly code. It started as a tmux session with vi running in one pane and watch gcc -S foo.cc -o - running in the other.

Since then, it has become a public website serving over 3,000,000 compilations per week.

You can financially support this project on Patreon, GitHub, PayPal, or by buying cool gear on the Compiler Explorer store.

Using Compiler Explorer

FAQ

There is now a FAQ section in the repository wiki. If your question is not present, please contact us as described below, so we can help you. If you find that the FAQ is lacking some important point, please feel free to contribute to it and/or ask us to clarify it.

Videos

Several videos showcase some features of Compiler Explorer:

A Road map is available which gives a little insight into the future plans for Compiler Explorer.

Developing

Compiler Explorer is written in TypeScript, on Node.js.

Assuming you have a compatible version of node installed, on Linux simply running make ought to get you up and running with an Explorer running on port 10240 on your local machine: http://localhost:10240/. If this doesn't work for you, please contact us, as we consider it important you can quickly and easily get running. Currently, Compiler Explorer requires node 20 or higher installed, either on the path or at NODE_DIR (an environment variable or make parameter).

Running with make EXTRA_ARGS='--language LANG' will allow you to load LANG exclusively, where LANG is one of the language ids/aliases defined in lib/languages.ts. For example, to only run Compiler Explorer with C++ support, you'd run make EXTRA_ARGS='--language c++'. You can supply multiple --language arguments to restrict to more than one language. The Makefile will automatically install all the third-party libraries needed to run, using npm to install server-side and client-side components.

For development, we suggest using make dev to enable some useful features, such as automatic reloading on file changes and shorter startup times.

You can also use npm run dev to run if make dev doesn't work on your machine.

When making UI changes, we recommend following the UI Testing Checklist to ensure all components work correctly.

Some languages need extra tools to demangle them, e.g. Rust, D, or Haskell. Such tools are kept separately in the tools repo.

Compiler Explorer is configured via files in the etc/config directory. Each value is written as key=value. Options in a {type}.local.properties file (where {type} is c++ or similar) override anything in the {type}.defaults.properties file. There is a .gitignore file to ignore *.local.* files, so these won't be checked into git, and you won't find yourself fighting with updated versions when you git pull. For more information see Adding a Compiler.
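
The override rule can be illustrated with a small sketch. This is not Compiler Explorer's own loader (which is written in TypeScript); it is a simplified Python illustration in which the function names, the whitespace trimming, and the treatment of `#` lines as comments are assumptions.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments (assumed syntax)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        props[key.strip()] = value.strip()
    return props


def effective_properties(defaults_text: str, local_text: str) -> dict:
    """Apply {type}.local.properties on top of {type}.defaults.properties."""
    merged = parse_properties(defaults_text)
    merged.update(parse_properties(local_text))  # local values win
    return merged
```

With defaults of `compilers=gdefault` and `defaultCompiler=gdefault`, and a local file containing only `compilers=mygcc`, the effective configuration keeps `defaultCompiler=gdefault` but uses `compilers=mygcc`. The key names here are illustrative.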

Check CONTRIBUTING.md for detailed information about how you can contribute to Compiler Explorer, and the docs folder for specific details regarding various things you might want to do, such as how to add new compilers or languages to the site.

Running a local instance

If you want to point it at your own GCC or similar binaries, either edit the etc/config/LANG.defaults.properties or else make a new one with the name LANG.local.properties, substituting LANG as needed. *.local.properties files have the highest priority when loading properties.

For a quick and easy way to add local compilers, use the CE Properties Wizard which automatically detects and configures compilers for 30+ languages. See Adding a Compiler for more details.

If you want to support multiple compilers and languages like godbolt.org, you can use the bin/ce_install install compilers command in the infra project to install all or some of the compilers. Compilers installed in this way can be loaded through the configuration in etc/config/*.amazon.properties. If you need to deploy in a completely offline environment, you may need to remove some parts of the configuration that are pulled from www.godbolt.ms@443.

When running in a corporate setting, the URL shortening service can be replaced by an internal one if the default storage driver isn't appropriate for your environment. To do this, add a new module in lib/shortener/myservice.js and set the urlShortenService variable in configuration. This module should export a single function; see the tinyurl module for an example.

RESTful API

There's a simple RESTful API that can be used to compile code to assembly and to list compilers.

You can find the API documentation here.
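
As a rough sketch of what calling the API might look like, the following builds (but does not send) two requests with Python's standard library. The endpoint paths and the shape of the JSON body are assumptions that should be checked against the API documentation; the compiler id passed in is a placeholder, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://godbolt.org/api"  # assumed base path of the public instance


def list_compilers_request(language: str = "") -> urllib.request.Request:
    """Build a GET request listing compilers, optionally for one language id."""
    url = f"{BASE_URL}/compilers" + (f"/{language}" if language else "")
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def compile_request(compiler_id: str, source: str, user_arguments: str = "") -> urllib.request.Request:
    """Build a POST request asking one compiler to compile `source` to asm."""
    body = json.dumps({
        "source": source,
        "options": {"userArguments": user_arguments},  # assumed body shape
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/compiler/{compiler_id}/compile",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
```

Passing either request to `urllib.request.urlopen` would perform the call; against a local instance, the base URL would instead point at http://localhost:10240/api.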

Contact us

We run a Compiler Explorer Discord, which is a place to discuss using or developing Compiler Explorer. We also have a presence on the cpplang Slack channel #compiler_explorer and we have a public mailing list.

There's a development channel on the Discord, and also a development mailing list.

Feel free to raise an issue on GitHub or email Matt directly for more help.

Official domains

Following are the official domains for Compiler Explorer:

The domains allow arbitrary subdomains, e.g., https://foo.godbolt.org/, which is convenient since each subdomain has an independent local state. Also, language subdomains such as https://rust.compiler-explorer.com/ will load with that language already selected.

Credits

Compiler Explorer is maintained by the awesome people listed in the AUTHORS file.

We would like to thank the contributors listed in the CONTRIBUTORS file, who have helped shape Compiler Explorer.

We would also like to especially thank these people for their contributions to Compiler Explorer:

Many amazing sponsors, both individuals and companies, have helped fund and promote Compiler Explorer.

Description
Run compilers interactively from your web browser and interact with the assembly
Readme BSD-2-Clause 118 MiB