Deduping reviewer findings without losing signal
How we sort by severity, dedupe by (file, line, lowercased title), and only post inline comments on lines that map to a unified-diff position. Plus: what we threw out and why.
Three reviewers in parallel produce three lists of findings. Naively merging them gets you duplicates, inconsistent severities, and inline comments on lines GitHub will refuse to attach. The aggregator does the boring middle work that turns three model outputs into one usable review.
We dedupe by (file, line, lowercased title). We tried fancier things — embedding similarity, suggestion overlap, semantic fingerprints — and they all lost more than they gained. Two reviewers flagging the same line with similar phrasing is the single strongest cross-reviewer signal we have. We do not want to merge it away by accident.
function dedupKey(f: Finding): string {
  return [f.file, f.line, f.title.toLowerCase()].join('::');
}

When two findings collide on the key, we keep the one with the higher severity, breaking ties on confidence. We attribute the kept finding to whichever reviewer sent it; the dropped reviewer's id is logged but not posted. Maintainers do not need to know that two AIs agreed; they need to read one comment.
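A minimal sketch of the collision rule, with an assumed Finding shape (field names here are illustrative, not the real type):

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

// Assumed shape; the real Finding type likely carries more fields.
interface Finding {
  file: string;
  line: number;
  title: string;
  severity: Severity;
  confidence: number; // 0..1
  reviewerId: string;
}

const RANK: Record<Severity, number> = { critical: 4, high: 3, medium: 2, low: 1 };

// On a key collision, keep the higher severity, then the higher confidence.
// The losing reviewer's id would be logged here, never posted.
function mergeByKey(findings: Finding[]): Finding[] {
  const kept = new Map<string, Finding>();
  for (const f of findings) {
    const key = [f.file, f.line, f.title.toLowerCase()].join('::');
    const prev = kept.get(key);
    const wins =
      !prev ||
      RANK[f.severity] > RANK[prev.severity] ||
      (RANK[f.severity] === RANK[prev.severity] && f.confidence > prev.confidence);
    if (wins) kept.set(key, f);
  }
  return [...kept.values()];
}
```

Note that the lowercased title makes "SQL Injection" and "sql injection" collide, which is exactly the cross-reviewer agreement we want to collapse into one comment.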
Sorting by the severity string is a footgun: alphabetically, "critical" comes first but "low" sorts ahead of "medium", so neither ascending nor descending string order is correct. We map to integers (critical=4, high=3, medium=2, low=1), sort descending, and use confidence as the tiebreak. The first 10 findings after sort and dedup become the inline comments; the rest survive in the database as run history.
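The sort-and-split step can be sketched like this; the function name and the exact shape are assumptions, but the ordering matches the integers above:

```typescript
// Illustrative rank map matching the integers in the text.
const SEVERITY_RANK: Record<string, number> = { critical: 4, high: 3, medium: 2, low: 1 };

interface Ranked {
  severity: string;
  confidence: number;
}

// Sort descending by severity rank, break ties on confidence, then split:
// the first `limit` findings become inline comments, the rest are kept
// only as run history.
function rankAndSplit<T extends Ranked>(findings: T[], limit = 10): { inline: T[]; history: T[] } {
  const sorted = [...findings].sort(
    (a, b) =>
      (SEVERITY_RANK[b.severity] ?? 0) - (SEVERITY_RANK[a.severity] ?? 0) ||
      b.confidence - a.confidence,
  );
  return { inline: sorted.slice(0, limit), history: sorted.slice(limit) };
}
```

Unknown severities rank as 0, so a reviewer that invents a new label sinks to the bottom instead of crashing the run.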
GitHub will only attach an inline comment to a line that exists as a position in the unified diff. That is not "any line in the file" — it is specifically the line numbers GitHub assigns to + and context lines inside @@ hunks. A reviewer can confidently flag line 412 of a 600-line file, and if line 412 was not in the diff, the comment will fail to post.
We solved this by parsing the patch in mapPatchLineToPosition and walking the @@ hunks ourselves. Every finding gets the position it would map to; the ones that do not map are logged in review_findings but stripped from the inline post. The summary at the top of the review still mentions the count so nothing is silently dropped.
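A sketch of what mapPatchLineToPosition might look like, assuming GitHub's position convention: position 1 is the first line below the first @@ header, and the count keeps increasing through every subsequent patch line, including later @@ headers. The patch string is assumed to be the `patch` field GitHub returns per file, which starts at the first @@:

```typescript
// Map a new-file line number to a unified-diff position, or null if the
// line is not part of the diff. Added (+) and context ( ) lines carry
// new-file line numbers; deleted (-) lines advance position only.
function mapPatchLineToPosition(patch: string, targetLine: number): number | null {
  let position = -1; // first @@ header lands on 0, first body line on 1
  let newLine = 0;   // current line number in the new file
  for (const raw of patch.split('\n')) {
    position++;
    const hunk = raw.match(/^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/);
    if (hunk) {
      newLine = parseInt(hunk[1], 10) - 1; // next +/context line is the hunk start
      continue;
    }
    if (raw.startsWith('+') || raw.startsWith(' ')) {
      newLine++;
      if (newLine === targetLine) return position;
    }
  }
  return null; // line 412 of a 600-line file that was not touched ends up here
}
```

A null here is exactly the "confidently flagged but unpostable" case from the previous paragraph: the finding stays in review_findings and is counted in the summary, but no inline comment is attempted.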
The aggregator is now under 200 lines and has been shipping for months without changes. That is mostly because the dedup key is dumb and the severity rank is an integer. The smartest version of this pipeline lives in the reviewers, not in the post-processing.