# Discovery Report

The `discover` command generates three default discovery deliverables from the same inventory snapshot:

* `.bb2gh/reports/<workspace>/inventory.json`
* `.bb2gh/reports/<workspace>/discovery-report.html`
* `.bb2gh/reports/<workspace>/discovery-report.xlsx`

The HTML report provides a visual overview of your Bitbucket workspace, and the Excel workbook provides the same discovery data in a format that is easier to sort, filter, and annotate during migration workshops.

## Generating the Deliverables

The deliverables are generated by default when you run `discover`:

```bash
bb2gh discover --workspace my-company
```

By default, all deliverables are grouped under the workspace bundle directory:

* `.bb2gh/reports/my-company/inventory.json`
* `.bb2gh/reports/my-company/discovery-report.html`
* `.bb2gh/reports/my-company/discovery-report.xlsx`

If you run discovery with `--project CORE`, the defaults become:

* `.bb2gh/reports/my-company/core/inventory.json`
* `.bb2gh/reports/my-company/core/discovery-report.html`
* `.bb2gh/reports/my-company/core/discovery-report.xlsx`

At the beginning of each run, `bb2gh` prints the resolved output directory plus the exact inventory, HTML, and Excel paths, so you know where to find the generated files once the run completes.
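
The default layout described above can be reconstructed in a few lines of shell, which may help when scripting around the deliverables. This is a minimal sketch, not `bb2gh`'s own logic: the helper names `bundle_dir` and `default_paths` are hypothetical, and the slug rule (project key lowercased, as in `CORE` to `core`) is an assumption inferred from the example paths.

```bash
#!/usr/bin/env bash
# Sketch: rebuild the default deliverable paths for a workspace and
# an optional project. Helper names are hypothetical.

bundle_dir() {
  local workspace="$1" project="${2:-}"
  local dir=".bb2gh/reports/${workspace}"
  if [ -n "$project" ]; then
    # Assumption: the project slug is the project key in lowercase.
    dir="${dir}/$(printf '%s' "$project" | tr '[:upper:]' '[:lower:]')"
  fi
  printf '%s\n' "$dir"
}

default_paths() {
  local dir
  dir="$(bundle_dir "$@")"
  printf '%s\n' \
    "${dir}/inventory.json" \
    "${dir}/discovery-report.html" \
    "${dir}/discovery-report.xlsx"
}

# Lists the three paths for a project-filtered run.
default_paths my-company CORE
```

Treat the paths printed by `bb2gh` itself as authoritative; the sketch is only useful when you need the locations before or without running the command.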

### Custom output paths

```bash
bb2gh discover --workspace my-company --report-path ./my-report.html
bb2gh discover --workspace my-company --excel-path ./my-report.xlsx
```

Each of the three outputs can be overridden independently. Any deliverable you do not override keeps its default path under the workspace bundle directory.

### Disable HTML or Excel generation

```bash
bb2gh discover --workspace my-company --no-report
bb2gh discover --workspace my-company --no-excel
```

## HTML Report

### Summary Cards

The top of the report shows key metrics at a glance:

* **Total Repos** — number of repositories discovered
* **Total Size** — combined size across all repositories
* **Workspace Users** — number of workspace users when the source can provide it
* **Private / Public** — repository visibility breakdown
* **Archived** — number of archived repositories
* **Forks** — number of forked repositories
* **LFS Repos** — repositories using Git LFS
* **Large Files** — repositories with files exceeding 100 MB
* **With Pipelines** — repositories that have Bitbucket Pipelines configured
* **Auto-Convertible Steps** — ratio of pipeline steps that can be automatically converted to GitHub Actions

### Charts

Four charts visualize your workspace:

| Chart                             | What it shows                                                                      |
| --------------------------------- | ---------------------------------------------------------------------------------- |
| **Complexity Distribution**       | Donut chart of simple / moderate / complex repositories                            |
| **Pipeline Migration Complexity** | Donut chart of trivial / low / medium / high pipeline complexity                   |
| **Top Projects by Repo Count**    | Horizontal bar chart of the largest Bitbucket projects                             |
| **Size Distribution**             | Bar chart of repositories by size bucket (< 10 MB, 10-100 MB, 100 MB-1 GB, > 1 GB) |
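
The size buckets in the last row can be reproduced from a raw byte count, for example when cross-checking the chart against the `Size Bytes` column of the workbook. The function below is a sketch under stated assumptions: `size_bucket` is a hypothetical name, 1 MB is taken as 1024 × 1024 bytes, and a value that sits exactly on a boundary is placed in the larger bucket. The report's own rounding may differ.

```bash
#!/usr/bin/env bash
# Sketch: map a size in bytes to the chart's size buckets.
# Assumptions: binary megabytes; boundary values go to the upper bucket.

size_bucket() {
  local bytes="$1"
  local mb=$((1024 * 1024))
  if   [ "$bytes" -lt $((10 * mb)) ];   then echo "< 10 MB"
  elif [ "$bytes" -lt $((100 * mb)) ];  then echo "10-100 MB"
  elif [ "$bytes" -lt $((1024 * mb)) ]; then echo "100 MB-1 GB"
  else                                       echo "> 1 GB"
  fi
}

size_bucket 52428800     # 50 MB  -> 10-100 MB
size_bucket 2147483648   # 2 GB   -> > 1 GB
```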

When discovery runs with `--no-analyze`, the report shows an analysis-completeness notice and leaves analysis-derived fields blank instead of presenting synthetic zero values as real measurements. If workspace user count cannot be retrieved for the selected source, the report shows `Unavailable` instead of inventing a value.

### Projects Section

The project summary section rolls repositories up by Bitbucket project. It includes:

* repository count
* private / public split
* archived and fork totals
* total size
* complexity rollups
* repositories with pipelines
* warning counts

### Repository Table

The main table lists every repository with sortable columns:

* Name, project, size, branches, PRs
* Complexity level and score
* Pipeline migration complexity
* LFS status and large file count
* Default branch and last updated date

Use the controls above the table to:

* **Search** by repository name or description
* **Filter by project** to focus on specific Bitbucket projects
* **Filter by complexity** to find repositories needing extra attention
* **Filter by pipeline status** to identify CI/CD migration effort

### Warnings Panel

The warnings section lists the explicit repository warnings captured during discovery analysis. These rows match the `Warnings` sheet in the Excel workbook, which makes it easier to review the same follow-up items across both deliverables.

### Provenance Links

The HTML report includes the exact `bb2gh` version that generated it plus two links:

* product homepage: `https://bb2gh.dev`
* GitHub repository: `https://github.com/n8group-oss/bb2gh`

This makes shared documents self-identifying even when they are forwarded internally.

## Excel Workbook

The Excel workbook contains four sheets:

* **Summary** — an ADO-style branded worksheet with metadata, discovery totals, provenance hyperlinks, and an embedded project rollup table
* **Projects** — one row per derived Bitbucket project summary with human-readable total sizes
* **Repositories** — one row per discovered repository with both exact `Size Bytes` and readable `Size` columns
* **Warnings** — one row per explicit repository warning

The workbook is generated from the same shared projection as the HTML report, so project rollups, warning rows, and provenance information stay aligned. The workbook presentation follows the same design system as the ADO2GH discovery export: branded summary header, colored section blocks, styled headers, wrapped cells, freeze panes, tab colors, and semantic highlighting for migration complexity and warning severity.

When discovery runs with `--no-analyze`, the workbook leaves analysis-derived cells blank rather than using implicit defaults that could be misread as actual discovery results. If workspace user count is unavailable, the `Summary` sheet marks it as `Unavailable`.

## Opening the Files

After discovery finishes:

* open `discovery-report.html` directly in a browser for the interactive dashboard
* open `discovery-report.xlsx` in Excel or another spreadsheet viewer for sorting and workshop review
* keep `inventory.json` as the canonical machine-readable snapshot for later `plan` or other automation

Because the files are grouped by workspace, you can keep multiple discovery bundles side by side without renaming outputs manually.
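
Before opening or sharing a bundle, a quick check that all three files exist and are non-empty can catch an interrupted run. The following is a minimal sketch; `check_bundle` is a hypothetical helper, not part of `bb2gh`.

```bash
#!/usr/bin/env bash
# Sketch: confirm the three default deliverables are present in a bundle
# directory. Returns non-zero if any file is missing or empty.

check_bundle() {
  local dir="$1" missing=0 name
  for name in inventory.json discovery-report.html discovery-report.xlsx; do
    if [ -s "${dir}/${name}" ]; then
      echo "ok       ${dir}/${name}"
    else
      echo "missing  ${dir}/${name}"
      missing=1
    fi
  done
  return "$missing"
}

check_bundle .bb2gh/reports/my-company || echo "bundle incomplete"
```

If you ran discovery with `--no-report` or `--no-excel`, or with a custom output path, the corresponding file is expected to be absent from the bundle directory, so a "missing" line is not necessarily an error.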

## Verifying in Hatch

Use the Hatch environment for both code verification and a real discovery run.

### 1. Verify the implementation

Run the full repository presubmit:

```bash
hatch run all
```

For a faster discovery-focused pass, run only the relevant tests:

```bash
hatch run test tests/unit/test_discover.py tests/unit/test_discovery_deliverables.py tests/unit/test_reporting.py tests/unit/test_excel_export.py -v
```

### 2. Generate real deliverables

For Bitbucket Cloud:

```bash
export BB_USERNAME="<bitbucket-username>"
export BB_API_TOKEN="<bitbucket-api-token>"

hatch run bb2gh discover --workspace my-company --analyze
```

For Bitbucket Data Center:

```bash
export BB_DC_URL="https://bitbucket.example.com"
export BB_DC_TOKEN="<bitbucket-dc-token>"

hatch run bb2gh discover --workspace MYPROJECT --source bitbucket-dc --analyze
```

The command prints the resolved output directory, inventory path, HTML report path, and Excel workbook path before the discovery starts.

### 3. Verify the generated outputs

Check the JSON summary directly:

```bash
jq '.summary.workspace_user_count' .bb2gh/reports/my-company/inventory.json
```

For Bitbucket Cloud, `workspace_user_count` should be a number when member lookup succeeds. For Bitbucket Data Center, `workspace_user_count` should remain `null` because `bb2gh` does not currently have a correct workspace-scoped member source there.
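
The two expected outcomes can be told apart in a script by classifying the value that `jq` prints. This is a sketch only: `describe_user_count` is a hypothetical helper, and the wording of its output is illustrative.

```bash
#!/usr/bin/env bash
# Sketch: classify the workspace_user_count value extracted from inventory.json.

describe_user_count() {
  local value="$1"
  case "$value" in
    null|'')   echo "unavailable" ;;          # expected for Bitbucket Data Center
    *[!0-9]*)  echo "unexpected: ${value}" ;; # neither null nor a plain integer
    *)         echo "count: ${value}" ;;      # expected for Bitbucket Cloud
  esac
}

# Typical use, assuming jq is installed and the default bundle path:
#   describe_user_count "$(jq -r '.summary.workspace_user_count' \
#     .bb2gh/reports/my-company/inventory.json)"
describe_user_count 42
describe_user_count null
```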

Open the HTML report and verify:

* the `Workspace Users` summary card is present
* the `Projects` section appears above the repository table
* the warnings panel lists only explicit repository warning rows
* the header or footer shows the exact `bb2gh` version plus the product and repository links

Open the Excel workbook and verify:

* `Summary` contains the branded discovery header, provenance links, and `Workspace User Count`
* `Projects` contains one row per derived Bitbucket project and shows `Total Size` in human-readable units
* `Repositories` contains the same repository ordering and reporting data as the HTML projection, plus exact `Size Bytes` and readable `Size`
* `Warnings` contains only explicit repository warning rows

If you run a project-filtered discovery, repeat the same checks under `.bb2gh/reports/<workspace>/<project-slug>/`.

## Sharing the Deliverables

The HTML report is a single file with all styles, scripts, and data embedded inline. You can:

* Email it to stakeholders
* Upload it to a shared drive
* Open it on any machine without installing bb2gh
* Print it or save as PDF from your browser

> **Note**: The report contains repository names, sizes, and structural metadata. It does not contain any credentials, source code, or file contents.
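
If you want to confirm that a report is self-contained before sending it, one rough check is to look for tags that load a resource from a remote URL. The sketch below assumes nothing about how `bb2gh` builds the HTML: `list_external_assets` is a hypothetical helper, and the pattern is a heuristic that only covers double-quoted `src` and `href` attributes on a few common tags. Ordinary hyperlinks such as the provenance links are ignored.

```bash
#!/usr/bin/env bash
# Sketch: list tags in an HTML file that load a resource over http(s).
# Prints nothing when no such tag is found. Plain <a href> links are not matched.

list_external_assets() {
  grep -Eio '<(script|link|img|iframe)[^>]*(src|href)="https?://[^"]*"' "$1" || true
}

# Typical use, assuming the default bundle path:
#   list_external_assets .bb2gh/reports/my-company/discovery-report.html
```

An empty result is consistent with a fully inline report, but it is not proof: resources referenced from inline scripts or styles would not be detected by this pattern.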

## Next Steps

After reviewing the deliverables:

1. Identify repositories that need special handling (large files, complex pipelines)
2. Decide which repositories to include or exclude from migration
3. Create a [migration plan](/bb2gh/commands/plan.md) based on your findings

