Generating Real Release Notes from Minified Electron Apps


I maintain claude-desktop-debian, a build tool that repackages the Windows build of the Claude Desktop Electron app for Linux. Until recently, every release shipped with the same auto-generated note:

Claude Desktop Update

This release updates the packaged Claude Desktop version to 1.1.3363.

What's Changed

  • Updated Claude Desktop to version 1.1.3363

This release was automatically generated when a new Claude Desktop version was detected.

Users got a version number and nothing else. No explanation of what changed in the app, no packaging context, no heads-up about anything that might affect them specifically on Linux.

The upstream app ships as minified JavaScript inside an app.asar archive. There's no changelog. Diffing two versions by hand isn't practical. The code is bundled, minified, and full of single-letter identifiers that change between builds for reasons that have nothing to do with behavior.

I needed a tool that could do the mechanical work.

The Pipeline

I built that tool in a separate repo: aaddrick/claude-desktop-versions. The core is a Python script called compare-releases.py.

graph TD
    subgraph Extract
        direction LR
        A[Resolve tags] --> B[Download AppImages] --> C[Extract & unpack asar]
    end
    subgraph Diff
        direction LR
        D[Prettier beautify] --> E[Diff files] --> F[Filter noise]
    end
    subgraph Analyze
        direction LR
        G[Sonnet: deobfuscate] --> H[Sonnet: synthesize] --> I[summary.md]
    end
    Extract --> Diff --> Analyze

python scripts/compare-releases.py

I set it to auto-detect the two most recent releases with distinct upstream versions. To compare specific tags, pass them explicitly:

python scripts/compare-releases.py --old v1.3.10+claude1.1.3500 --new v1.3.11+claude1.1.3541
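The auto-detection step can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: it assumes tags look like the ones in the command above, and the names `upstream_version` and `pick_comparison_pair` are hypothetical.

```python
import re
from typing import Optional

# Assumed tag shape, based on the examples: v<packaging>+claude<upstream>
TAG_RE = re.compile(r"^v(?P<packaging>[\d.]+)\+claude(?P<upstream>[\d.]+)$")


def upstream_version(tag: str) -> Optional[str]:
    """Return the +claude portion of a tag, or None if the tag doesn't fit."""
    m = TAG_RE.match(tag)
    return m.group("upstream") if m else None


def pick_comparison_pair(tags_newest_first: list[str]) -> Optional[tuple[str, str]]:
    """Return (old, new): the newest tag and the nearest older tag
    whose upstream version differs from it."""
    parsed = [(t, upstream_version(t)) for t in tags_newest_first]
    parsed = [(t, u) for t, u in parsed if u is not None]
    if not parsed:
        return None
    new_tag, new_upstream = parsed[0]
    for tag, upstream in parsed[1:]:
        if upstream != new_upstream:
            return tag, new_tag
    return None  # every known tag shares one upstream version
```

Skipping tags that share the newest upstream version means a run of wrapper-only releases never gets compared against itself.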

The script downloads the AppImages for both versions from aaddrick/claude-desktop-debian using gh, extracts each one, unpacks the app.asar contents, and runs Prettier over the minified JavaScript.
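As a rough sketch, the extraction steps map onto a handful of CLI invocations. The function below only builds the command lists; it does not run them. The asset name and the location of app.asar inside the extracted AppImage are assumptions passed in as parameters, and `extraction_commands` is a hypothetical helper rather than something taken from the script.

```python
from pathlib import Path

REPO = "aaddrick/claude-desktop-debian"


def extraction_commands(
    tag: str, workdir: Path, appimage_name: str, asar_relpath: str
) -> list[list[str]]:
    """Build the commands for one release.

    appimage_name and asar_relpath are hypothetical inputs: the real asset
    name and the path of app.asar inside the AppImage are not given here.
    """
    dest = workdir / tag
    appimage = dest / appimage_name
    return [
        # Download a single release asset with the GitHub CLI.
        ["gh", "release", "download", tag, "--repo", REPO,
         "--pattern", appimage_name, "--dir", str(dest)],
        ["chmod", "+x", str(appimage)],
        # AppImages unpack themselves into ./squashfs-root, so this one
        # should be run with dest as the working directory.
        [str(appimage), "--appimage-extract"],
        ["npx", "@electron/asar", "extract",
         str(dest / "squashfs-root" / asar_relpath), str(dest / "app")],
        # Beautify in place so later diffs work line by line.
        ["npx", "prettier", "--write", str(dest / "app" / "**" / "*.js")],
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True, cwd=...)` in order, once for the old tag and once for the new one.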

Prettier doesn't make the code readable. It takes code compressed onto a handful of lines and spreads it across tens of thousands of lines instead. That's what you need for diffing: individual lines you can compare, grep through, and feed into analysis tools.

Vite content-hash renames are a real problem here. A file called main-DW5TxSpY.js in one release becomes main-Xk9mP2Qa.js in the next, even if the code barely changed. The script matches these as the same file rather than treating one as deleted and one as new.
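One way to do that matching is to strip the hash before comparing names. The sketch below assumes the hash is always eight URL-safe characters directly before the extension, as in the two example filenames; `logical_name` and `match_renamed` are hypothetical names, and the real script may match differently.

```python
import re
from collections import defaultdict

# Assumption: names look like <stem>-<8 URL-safe chars>.<ext>
HASH_SUFFIX = re.compile(r"-[A-Za-z0-9_-]{8}(?=\.[A-Za-z0-9]+$)")


def logical_name(filename: str) -> str:
    """Drop the content hash so main-DW5TxSpY.js and main-Xk9mP2Qa.js agree."""
    return HASH_SUFFIX.sub("", filename)


def match_renamed(old_files: list[str], new_files: list[str]) -> dict[str, str]:
    """Pair old and new files that share a logical name.

    Only unambiguous one-to-one matches are paired; anything else is left
    for the caller to treat as added or deleted.
    """
    old_by_name: dict[str, list[str]] = defaultdict(list)
    new_by_name: dict[str, list[str]] = defaultdict(list)
    for name in old_files:
        old_by_name[logical_name(name)].append(name)
    for name in new_files:
        new_by_name[logical_name(name)].append(name)

    pairs = {}
    for key, olds in old_by_name.items():
        news = new_by_name.get(key, [])
        if len(olds) == 1 and len(news) == 1:
            pairs[olds[0]] = news[0]
    return pairs
```

A pattern this loose will also strip an ordinary eight-character suffix such as `-settings`, so a real implementation would likely want a second check on content similarity before trusting a match.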

After matching files, I run difftastic on the changed ones. Difftastic is an AST-aware diff tool. It parses both files and compares the trees directly.

Minified code is full of short, arbitrary symbol names that change between builds: a bundler renaming h to b across a file produces thousands of changed lines that reflect nothing functional. Prettier doesn't guarantee identical line layout on every run either. Two semantically identical files can come out with slightly different arrangements after beautification.

An AST diff compares structure, not text, so differences in whitespace and line layout don't register as changes. Renamed identifiers can still surface as changed nodes, which is why the noise filtering described later runs on top of the diff.

When difftastic hits size limits on very large files, the script falls back to difflib. The result lands in compare-work/ as report.md and report.json.
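A minimal sketch of the difftastic-then-difflib fallback is shown below. How the script actually detects that difftastic has hit its limits isn't described, so this version assumes a missing binary, a non-zero exit, or a timeout are the triggers. The function names and the timeout value are illustrative.

```python
import difflib
import shutil
import subprocess
from pathlib import Path


def difflib_diff(old: Path, new: Path) -> str:
    """Plain line-based unified diff, used when the structural diff is unavailable."""
    old_lines = old.read_text(encoding="utf-8", errors="replace").splitlines(keepends=True)
    new_lines = new.read_text(encoding="utf-8", errors="replace").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile=str(old), tofile=str(new))
    )


def structural_diff(old: Path, new: Path, timeout_s: float = 120.0) -> str:
    """Try difftastic (the `difft` binary) first, then fall back to difflib.

    Assumption: a timeout or non-zero exit is treated as "too large".
    """
    if shutil.which("difft"):
        try:
            result = subprocess.run(
                ["difft", "--color", "never", str(old), str(new)],
                capture_output=True, text=True, timeout=timeout_s,
            )
            if result.returncode == 0:
                return result.stdout
        except subprocess.TimeoutExpired:
            pass  # fall through to the line-based diff
    return difflib_diff(old, new)
```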

Two Kinds of Releases

Not every release goes through the full pipeline. The version tag encodes which type it is.

The tag format is v{packaging}+claude{upstream}. CI looks at the +claude portion only to determine which type of release it is.

# previous release
v1.3.10+claude1.1.3500

# upstream release: +claude changed -> full pipeline
v1.3.11+claude1.1.3541

# wrapper release: +claude unchanged -> commit note only
v1.3.11+claude1.1.3500

When +claude changes, the upstream app updated. The full pipeline runs: AppImage extraction, diffing, noise filtering, Claude analysis, structured summary.

When only the v{} prefix changes, it's packaging only. The app code is identical. Running the diff pipeline would produce nothing useful, so CI generates a commit-based note instead:

# Wrapper Update: v1.3.11+claude1.1.3500

This release updates the wrapper/packaging only — the upstream Claude Desktop version is unchanged.

## Changes since v1.3.10+claude1.1.3500

- fix: correct arm64 AppImage extraction path (a3f82c1)
- chore: bump reprepro to handle re-uploads (b91044e)

CI figures out which type it is by scanning the recent release list for the previous tag with a different Claude version. If one exists, it's an upstream release. If every recent tag shares the same Claude version, it's wrapper-only.
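The simplest reading of that rule compares the +claude portion of the new tag with the tag immediately before it. The sketch below does only that. How far back the real CI scans isn't stated, and `release_type` and `wrapper_note_body` are hypothetical helpers, not the workflow's own code.

```python
import re

TAG_RE = re.compile(r"^v(?P<packaging>[\d.]+)\+claude(?P<upstream>[\d.]+)$")


def upstream_of(tag: str) -> str:
    """Extract the upstream Claude Desktop version from a release tag."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    return m.group("upstream")


def release_type(new_tag: str, previous_tag: str) -> str:
    """'upstream' if the +claude portion changed, otherwise 'wrapper'."""
    return "upstream" if upstream_of(new_tag) != upstream_of(previous_tag) else "wrapper"


def wrapper_note_body(new_tag: str, previous_tag: str, commits: list[tuple[str, str]]) -> str:
    """Render the heading and commit list of a wrapper-only note.

    commits is a list of (short_sha, subject) pairs, e.g. parsed from git log.
    The introductory sentence of the real note is left out for brevity.
    """
    lines = [f"# Wrapper Update: {new_tag}", "", f"## Changes since {previous_tag}", ""]
    lines += [f"- {subject} ({sha})" for sha, subject in commits]
    return "\n".join(lines)
```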

Noise Filtering

Even with AST-aware diffing, minified JavaScript produces diffs full of non-behavioral changes. I run a two-layer filter before handing anything to Claude.

Consider what a minifier pass over bundled Zod schema code produces: thousands of diff lines where identifiers got renamed and type-checking wrappers got reshuffled, none of it reflecting a behavioral change. Both filter layers exist to catch exactly that.

The first layer catches known build artifacts. I drop any hunk where every changed line matches one of these patterns:

  • UUIDs in the 8-4-4-4-12 hex format
  • Hex hashes between 32 and 64 characters
  • Sentry DSN fields and debug IDs
  • Version strings like "1.2.3"
  • Short commit hashes
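The first layer can be sketched as a list of regular expressions plus an "every changed line matches" check. The exact patterns in the script aren't shown, so the ones below are guesses at reasonable shapes, and the Sentry markers in particular are hypothetical.

```python
import re

ARTIFACT_PATTERNS = [
    # UUIDs in 8-4-4-4-12 hex format
    re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"),
    # Hex hashes between 32 and 64 characters
    re.compile(r"\b[0-9a-fA-F]{32,64}\b"),
    # Sentry DSN fields and debug IDs (marker names are assumptions)
    re.compile(r"sentry|debug_?id|\bdsn\b", re.IGNORECASE),
    # Quoted version strings like "1.2.3"
    re.compile(r"""["']\d+\.\d+\.\d+["']"""),
    # Short commit hashes: 7-12 hex chars containing both a digit and a letter
    re.compile(r"\b(?=[0-9a-f]*\d)(?=[0-9a-f]*[a-f])[0-9a-f]{7,12}\b"),
]


def is_artifact_line(line: str) -> bool:
    """True if the line contains at least one known build-artifact pattern."""
    return any(p.search(line) for p in ARTIFACT_PATTERNS)


def is_artifact_hunk(changed_lines: list[str]) -> bool:
    """True only when every non-blank changed line looks like a build artifact."""
    lines = [line for line in changed_lines if line.strip()]
    return bool(lines) and all(is_artifact_line(line) for line in lines)
```

Requiring every changed line to match keeps the filter conservative: a hunk that mixes a new UUID with a real code change is kept.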

The second layer handles minified identifier renames. When a bundler shuffles variable names (h becomes b, d becomes c), each affected line changes but the code's behavior doesn't. I normalize short identifiers on both sides of a hunk and compare them. Matching normalized versions means the hunk is a rename pass. It gets filtered out.
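A minimal sketch of that normalization: replace each short identifier with a placeholder numbered by order of first appearance, separately on each side, then compare the results. The two-character cutoff and the keyword list are assumptions, and the sketch ignores string literals, so it can misjudge hunks where only a one-letter string changed.

```python
import re

# Assumption: "short" means one or two characters. Names after a dot
# (property access) are left alone, since minifiers rarely rename those.
SHORT_IDENT = re.compile(r"(?<![\w$.])[A-Za-z_$][A-Za-z0-9_$]?(?![\w$])")
KEYWORDS = {"if", "in", "do", "of", "as"}


def normalize(code: str) -> str:
    """Rewrite short identifiers as _v0, _v1, ... in order of first appearance."""
    mapping: dict[str, str] = {}

    def repl(match: re.Match) -> str:
        name = match.group(0)
        if name in KEYWORDS:
            return name
        return mapping.setdefault(name, f"_v{len(mapping)}")

    return SHORT_IDENT.sub(repl, code)


def is_rename_only(old_code: str, new_code: str) -> bool:
    """True when both sides are identical after identifier normalization."""
    return normalize(old_code) == normalize(new_code)
```

Because the placeholders depend on first appearance rather than on the original name, a consistent rename produces identical output on both sides, while an inconsistent one (h used twice becoming b and c) does not.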

With both layers, Claude reads a diff of real behavioral changes.

What the Output Looks Like

I send each filtered hunk to Sonnet. Sonnet's job is to deobfuscate: it guesses meaningful names for the short identifiers based on usage context, string literals, and API call patterns. The output is structured JSON with three fields: a key_identifiers map from minified names to guessed meaningful names, a description of what the code does, and a change summary focused on the functional delta. Returning just the identifier map instead of fully rewritten code cut the output token count significantly. I checkpoint results to analysis-progress.json after each hunk, so a long run can resume if interrupted.
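The per-hunk record and the resume behaviour can be sketched like this. Only `key_identifiers` and the analysis-progress.json filename come from the description above; the other two field names, the hunk IDs, and the `analyze` callback are placeholders for whatever the script really uses.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable


@dataclass
class HunkAnalysis:
    key_identifiers: dict[str, str]  # minified name -> guessed meaningful name
    description: str                 # what the code does (field name assumed)
    change_summary: str              # the functional delta (field name assumed)


def load_checkpoint(path: Path) -> dict[str, HunkAnalysis]:
    """Read previously completed analyses, or return an empty dict."""
    if not path.exists():
        return {}
    raw = json.loads(path.read_text(encoding="utf-8"))
    return {hunk_id: HunkAnalysis(**fields) for hunk_id, fields in raw.items()}


def save_checkpoint(path: Path, done: dict[str, HunkAnalysis]) -> None:
    """Write via a temp file and rename, so an interrupted write can't corrupt it."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({k: asdict(v) for k, v in done.items()}, indent=2), encoding="utf-8")
    tmp.replace(path)


def run_with_checkpoint(
    hunks: dict[str, str], analyze: Callable[[str], HunkAnalysis], path: Path
) -> dict[str, HunkAnalysis]:
    """Analyze each hunk once, saving progress after every result."""
    done = load_checkpoint(path)
    for hunk_id, text in hunks.items():
        if hunk_id in done:
            continue  # finished in an earlier run
        done[hunk_id] = analyze(text)
        save_checkpoint(path, done)
    return done
```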

Large minified JS files are a practical problem here. A 4MB index.js exceeds Claude's context window and would fail silently if sent as one chunk. The script handles this with a three-tier approach. Hunks under 40KB go directly to Sonnet (Tier 1, the normal path). Oversized hunks get split on semicolons, normalized, and re-chunked at change-region boundaries (Tier 2). If that still doesn't fit, a string-extraction fallback pulls out string literals and API changes and sends those instead (Tier 3).
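The tier selection can be sketched as below. This is a simplification under stated assumptions: Tier 2 here only splits on semicolons and greedily re-packs, without the normalization or change-region boundaries the real script uses, and Tier 3 extracts string literals only. `plan_hunk` and its return shape are hypothetical.

```python
import re

TIER1_LIMIT = 40 * 1024  # 40KB, per the description above

STATEMENT = re.compile(r"[^;]*;|[^;]+$")
STRING_LITERAL = re.compile(r'"(?:[^"\\\n]|\\.)*"|\'(?:[^\'\\\n]|\\.)*\'')


def size(text: str) -> int:
    return len(text.encode("utf-8"))


def split_on_semicolons(text: str, max_bytes: int) -> list[str]:
    """Greedily pack semicolon-terminated pieces into chunks of at most max_bytes.

    Naive: semicolons inside string literals also count as split points.
    """
    chunks, current, current_size = [], [], 0
    for piece in STATEMENT.findall(text):
        n = size(piece)
        if current and current_size + n > max_bytes:
            chunks.append("".join(current))
            current, current_size = [], 0
        current.append(piece)
        current_size += n
    if current:
        chunks.append("".join(current))
    return chunks


def plan_hunk(hunk: str, limit: int = TIER1_LIMIT) -> tuple[int, list[str]]:
    """Return (tier, payloads) for one hunk."""
    if size(hunk) < limit:
        return 1, [hunk]                      # Tier 1: send as-is
    chunks = split_on_semicolons(hunk, limit)
    if all(size(c) <= limit for c in chunks):
        return 2, chunks                      # Tier 2: re-chunked
    literals = sorted(set(STRING_LITERAL.findall(hunk)))
    return 3, ["\n".join(literals)]           # Tier 3: string literals only
```

Tier 3 is reached only when a single statement is itself larger than the limit, which is the case a semicolon split cannot help with.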

Once all hunks are processed, a second Sonnet pass synthesizes the results. It receives the full set of per-hunk analyses: identifier maps, descriptions, and change summaries. At this stage the model no longer sees raw minified code, only guessed names and prose. For large diffs that exceed the single-prompt limit, there's an intermediate pass where each file gets its own summarization call first, then a final call synthesizes across all files.

Everything lands in compare-work/summary.md.

The summary follows a consistent structure:

| Section | Contents |
|---|---|
| Executive Summary | 2-3 sentences describing the release at a high level |
| Changes by Category | Grouped by functional area: UI, IPC, permissions, and so on |
| Wrapper/Packaging Changes | What changed in the Debian packaging specifically |
| Cost and Duration | Per-model token usage and total analysis time |

Here's an excerpt from a real release:

Local Sessions route
A new LocalSessions navigation destination (local_sessions) has been added to the app's deep-link routing system. The Settings page handler now also covers this new route, suggesting a new UI section for managing local CLI sessions is being surfaced to users.

DXT manifest v0.4: "uv" runtime support
The Desktop Extension manifest schema (version 0.4) now accepts "uv" as a valid server runtime type, alongside the existing "python", "node", and "binary".

That release ran 24 Sonnet calls and cost $3.37 CAD. The first version of this pipeline used Opus for synthesis and returned fully rewritten code from each hunk, which ran about $30/run. Simplifying the output schema to key_identifiers and switching the summary pass to Sonnet via --model sonnet is what brought it down. Practical enough to run on every release.

The CI Side

The packaging changes section comes from a plain git log of the wrapper repo. For a typical release, that might be:

881744a Update Claude Desktop download URLs to version 1.1.3647
e0bf73c docs: add IliyaBrook to contributor acknowledgments

Upstream release tags come from a separate scheduled workflow that runs once a day. It uses Playwright to scrape the official Claude Desktop download URLs and compares them against what's in build.sh. When the upstream version changes, the workflow commits the updated URLs, creates the tag, and creates an initial release with the boilerplate note. The tag push triggers ci.yml.

ci.yml builds deb, rpm, and AppImage packages for amd64 and arm64, creates the GitHub release, updates the APT and DNF repos, publishes the AUR package, and runs the release notes job. That job checks out claude-desktop-versions, runs compare-releases.py, appends the wrapper git log to summary.md, and updates the release body via gh release edit.
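The last two steps of that job, appending the wrapper log and updating the release body, might look roughly like this. The section heading, the file path, and the helper names are assumptions. The `git log` and `gh release edit` invocations are only built here, not executed.

```python
REPO = "aaddrick/claude-desktop-debian"


def git_log_command(old_tag: str, new_tag: str) -> list[str]:
    """One-line log of wrapper commits between two tags."""
    return ["git", "log", "--oneline", f"{old_tag}..{new_tag}"]


def append_wrapper_log(summary_md: str, git_log_lines: list[str]) -> str:
    """Append the packaging commits to the generated summary as a bullet list.

    The heading text mirrors the summary's section name; its markdown level
    is a guess.
    """
    section = ["", "## Wrapper/Packaging Changes", ""]
    section += [f"- {line}" for line in git_log_lines]
    return summary_md.rstrip("\n") + "\n" + "\n".join(section) + "\n"


def release_edit_command(tag: str, notes_path: str) -> list[str]:
    """Replace the release body with the contents of notes_path."""
    return ["gh", "release", "edit", tag, "--repo", REPO, "--notes-file", notes_path]
```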

The release notes job runs after the GitHub release is created, not before, so users can grab packages the moment the release goes live and the notes catch up afterwards. A release with significant upstream changes still takes time because there can be a lot of hunks. The per-hunk Claude calls run in parallel with four workers, which brought runtime down from about two hours to roughly ten minutes, though a large diff still means a lot of batches. The job runs with continue-on-error: true, so if it times out or fails, the release stands without updated notes.
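Running the per-hunk calls with four workers can be sketched with the standard library's thread pool. The `analyze` callback stands in for the actual Claude call, and collecting failures separately instead of aborting is a design assumption, not something confirmed for the real script.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable


def analyze_in_parallel(
    hunks: dict[str, str],
    analyze: Callable[[str], Any],
    workers: int = 4,
) -> tuple[dict[str, Any], dict[str, Exception]]:
    """Run analyze() over every hunk using a small thread pool.

    Returns (results, failures), both keyed by hunk ID, so that one failed
    call doesn't discard the work already done for the others.
    """
    results: dict[str, Any] = {}
    failures: dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(analyze, text): hunk_id for hunk_id, text in hunks.items()}
        for future in as_completed(futures):
            hunk_id = futures[future]
            try:
                results[hunk_id] = future.result()
            except Exception as exc:
                failures[hunk_id] = exc
    return results, failures
```

Threads fit here because each call spends nearly all its time waiting on the network, so the interpreter lock is not the bottleneck.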

Before this, users got a version number. Now they get the diff, interpreted.