Agent Skill

Duplicate Record Review

Identify likely duplicate profiles or companies using available record evidence.

Version 1.0 · Updated June 2026 · SKILL.md · MIT

Overview

What this skill does

Duplicates creep into any list that is built from multiple sources or re-scraped, inflating counts and causing double outreach.

This skill groups likely duplicates using the evidence available, reports match confidence and which fields agree or conflict, and recommends a merge or manual review. It describes matches as likely rather than certain unless an exact deterministic key, such as an email address, matches.

When to use it

Best used for

  • De-duplicating a merged or re-scraped list
  • Catching the same company under different names
  • Preventing double outreach to one contact
  • Producing a reviewable merge plan

Know the limits

When not to use this skill

  • The records share no comparable identifiers
  • You need an irreversible automatic merge (always review)
  • The list is already known to be unique

Inputs

Provide these when prompted. The skill asks for anything missing before it runs.

Required

  • A list of company or profile records

Optional

  • Match keys to prioritize (domain, email)
  • A confidence threshold
  • Fields that must match exactly
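
As an illustration only, the optional inputs could be captured in a small settings object before a run. The names `ReviewOptions`, `priority_keys`, `confidence_threshold`, and `exact_match_fields` are hypothetical, and so is the 0 to 1 scale for the threshold; the skill does not prescribe a format.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewOptions:
    """Hypothetical container for the optional inputs listed above."""

    # Match keys to try first, in priority order (assumed default).
    priority_keys: list[str] = field(default_factory=lambda: ["email", "domain"])
    # Assumed 0..1 scale; groups scoring below it would go to manual review.
    confidence_threshold: float = 0.8
    # Fields that must be identical before two records can be grouped.
    exact_match_fields: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject thresholds outside the assumed scale early.
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0 and 1")


options = ReviewOptions(exact_match_fields=["domain"])
```

Stating these up front keeps a run reproducible: the same list with the same options should yield the same groups.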

Outputs

One record per duplicate group with a consistent, inspectable schema.

  • duplicate_group

    Records believed to refer to the same entity.

  • match_confidence

    Likelihood the group is a true duplicate.

  • matching_fields

    Fields that agree across the group.

  • conflicting_fields

    Fields that disagree.

  • merge_recommendation

    Suggested surviving record and merges.

  • manual_review_status

    Whether a human must decide.
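
If the output is consumed by a script, the schema above could be mirrored as a typed record. This is a sketch under assumptions: the class name `DuplicateGroupReport`, the helper `missing_fields`, and the choice of string and list types are not part of the skill, which only names the six fields.

```python
from typing import TypedDict


class DuplicateGroupReport(TypedDict):
    """One output record per duplicate group (field types are assumed)."""

    duplicate_group: list[str]      # identifiers of the grouped records
    match_confidence: str           # e.g. "high", with the evidence behind it
    matching_fields: list[str]      # fields that agree across the group
    conflicting_fields: list[str]   # fields that disagree
    merge_recommendation: str       # suggested survivor and merges
    manual_review_status: str       # whether a human must decide


# Field names in declaration order, taken from the class itself.
REQUIRED_FIELDS = tuple(DuplicateGroupReport.__annotations__)


def missing_fields(report: dict) -> list[str]:
    """Return the schema fields that are absent from a report."""
    return [name for name in REQUIRED_FIELDS if name not in report]
```

A check like `missing_fields(report)` makes it easy to spot a group that came back without, say, its `conflicting_fields`.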

Example

A likely duplicate pair.

Input

A: Acme | acme.com | Maria Chen | maria@acme.com
B: ACME Inc | acme.com | M. Chen | maria@acme.com

Output

duplicate_group: A + B
match_confidence: high (exact email + domain)
matching_fields: email; domain
conflicting_fields: company name (Acme vs ACME Inc); contact name (Maria Chen vs M. Chen)
merge_recommendation: Keep A; merge name variants from B
manual_review_status: optional (high confidence)

An exact email and domain make this a high-confidence duplicate; the only conflicts are name variants, so the merge is safe with light review.
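
The comparison behind this example can be approximated with a literal, case-insensitive field check. The sketch below is not the skill's actual logic: the column names `company`, `domain`, `name`, and `email` are assumed (the input has no header row), and treating an exact email match as the deterministic key is also an assumption. Because it compares strings literally, it flags the company name (Acme vs ACME Inc) as well as the contact name.

```python
# Assumed column names for the pipe-separated example input.
FIELDS = ("company", "domain", "name", "email")

record_a = dict(zip(FIELDS, ("Acme", "acme.com", "Maria Chen", "maria@acme.com")))
record_b = dict(zip(FIELDS, ("ACME Inc", "acme.com", "M. Chen", "maria@acme.com")))


def compare(a: dict, b: dict) -> dict:
    """Split fields into those that agree and those that differ."""
    matching, conflicting = [], []
    for name in FIELDS:
        # Ignore case and surrounding whitespace, nothing more.
        same = a[name].strip().lower() == b[name].strip().lower()
        (matching if same else conflicting).append(name)
    # Assumption: an exact email match is the deterministic key.
    confidence = "high" if "email" in matching else "needs review"
    return {
        "matching_fields": matching,
        "conflicting_fields": conflicting,
        "match_confidence": confidence,
    }


result = compare(record_a, record_b)
print(result)
```

Keeping the matching and conflicting lists separate is what lets a reviewer see at a glance why a pair was grouped and what still needs a decision.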

Setup

How to use the skill

General steps first, then notes for specific clients where verified.

  1. Download the file using the button below, or copy the Markdown.
  2. Place it in a directory named after the skill (e.g. skill-name/).
  3. Make sure the filename stays exactly SKILL.md.
  4. Add any references or assets included with the package.
  5. Load the skill into a compatible agent and provide the required inputs.

Claude Code

  1. Create a folder for the skill and save SKILL.md inside it.
  2. Place the folder where your project's skills are discovered.
  3. Reference the skill when you want it applied to your data.

Other compatible clients

  1. Confirm the client supports the open Agent Skills format.
  2. Load the SKILL.md file as instructed by that client.
  3. If skills are not auto-loaded, paste the Markdown as instructions.

Source

Full SKILL.md source

Read the rendered skill or copy the complete Markdown. The download is generated from this exact source.

Version 1.0 · SKILL.md · ~2 KB · MIT

Duplicate Record Review

Purpose

Identify likely duplicate profiles or companies using available record evidence.

When to use this skill

  • De-duplicating a merged or re-scraped list
  • Catching the same company under different names
  • Preventing double outreach to one contact
  • Producing a reviewable merge plan

When not to use this skill

  • The records share no comparable identifiers
  • You need an irreversible automatic merge (always review)
  • The list is already known to be unique

Required inputs

  • A list of company or profile records

Optional inputs

  • Match keys to prioritize (domain, email)
  • A confidence threshold
  • Fields that must match exactly

Rules

  1. Describe matches as likely unless an exact deterministic key matches.
  2. Report match confidence and the evidence behind it.
  3. Never auto-merge; recommend and flag for review.
  4. List conflicting fields explicitly.
  5. Prefer the most complete record as the survivor.
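
Rule 5 leaves "most complete" undefined. One possible reading, shown here as an assumption and not as the skill's definition, is to count populated fields and keep the earliest record on a tie. The function names `completeness` and `pick_survivor` are hypothetical.

```python
def completeness(record: dict) -> int:
    """Count fields that hold a non-empty value."""
    return sum(
        1
        for value in record.values()
        if value is not None and str(value).strip() != ""
    )


def pick_survivor(group: list[dict]) -> dict:
    """Suggest the record with the most populated fields.

    max() returns the first maximal item, so ties keep the earliest record.
    The result is only a recommendation; nothing is merged here.
    """
    return max(group, key=completeness)
```

A field count ignores quality (a full first name versus an initial, for example), so the suggestion still belongs in front of a reviewer.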

Process

  1. Compare records on available keys.
  2. Group likely duplicates.
  3. Assess confidence and list matching/conflicting fields.
  4. Recommend a survivor and merges.
  5. Set manual-review status.
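
Steps 1 and 2 can be sketched for the deterministic case, where records share an exact key. The example below uses a union-find structure so that records linked through different keys end up in one group. It is a minimal sketch, not the skill's implementation: `group_by_exact_keys` is a hypothetical name, and the default of matching on `email` only is an assumption (a shared domain suits company records, but would also group distinct colleagues in a profile list).

```python
from collections import defaultdict


def group_by_exact_keys(records: list[dict], keys=("email",)) -> list[list[int]]:
    """Group record indexes that share a non-empty value on any key."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen: dict = {}
    for index, record in enumerate(records):
        for key in keys:
            value = str(record.get(key) or "").strip().lower()
            if not value:
                continue  # empty values are not evidence of a match
            owner = first_seen.setdefault((key, value), index)
            parent[find(index)] = find(owner)

    groups = defaultdict(list)
    for index in range(len(records)):
        groups[find(index)].append(index)
    # Only groups with two or more records are duplicate candidates.
    return [members for members in groups.values() if len(members) > 1]
```

Records that match only on fuzzy evidence, such as similar names, are outside this sketch and would be reported as likely, not certain.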

Output format

Return one record per duplicate group with the following fields:

  • duplicate_group
  • match_confidence
  • matching_fields
  • conflicting_fields
  • merge_recommendation
  • manual_review_status

Validation

  • Confirm only exact key matches are called certain.
  • Confirm conflicts are listed for each group.
  • Confirm no merge is performed automatically.
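
These checks could be automated over the returned records. The sketch below rests on several assumptions: the confidence label `"certain"`, the set `DETERMINISTIC_KEYS`, and the idea of comparing record counts before and after the run to detect an applied merge are all illustrative, not defined by the skill.

```python
# Assumption: which keys count as exact, deterministic evidence.
DETERMINISTIC_KEYS = {"email", "domain"}


def validation_problems(report: dict, input_count: int, output_count: int) -> list[str]:
    """Return a list of validation failures for one duplicate-group report."""
    problems = []
    matching = set(report.get("matching_fields", []))
    # Check 1: only exact key matches may be called certain.
    if report.get("match_confidence") == "certain" and not matching & DETERMINISTIC_KEYS:
        problems.append("'certain' used without an exact key match")
    # Check 2: conflicts must be listed, even if the list is empty.
    if "conflicting_fields" not in report:
        problems.append("conflicting_fields is missing")
    # Check 3: the record list must be unchanged, since merging is a human step.
    if output_count != input_count:
        problems.append("record count changed, so a merge may have been applied")
    return problems
```

An empty result means the three checks passed; it does not mean the grouping itself is correct.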

Limitations

  • Without unique keys, matches are probabilistic.
  • Distinct entities can share names; review before merging.

Before you rely on it

Safety and limitations

  • Without unique keys, matches are probabilistic.
  • Distinct entities can share names; review before merging.
  • Review the output before acting on it.
  • Do not upload confidential datasets to an external model without authorization.
  • Outputs depend on the model and the source data and are not guaranteed to be accurate.

History

Changelog

  1. v1.0, June 2026
    • Initial release.

Questions

Agent Skill FAQ

How does it decide two records are duplicates?
It compares the available evidence (names, domains, emails, and other fields) and groups likely matches with a reason and merge guidance, leaving the final merge to you.

Will it merge records for me?
No. It recommends merges and flags review status; the actual merge stays a human decision to avoid data loss.

Do I need ProfileSpider to use this skill?
No. The skill works on any compatible data. ProfileSpider is one convenient way to produce that structured input.

Does running this skill send data to ProfileSpider?
No. Downloading or copying the file does not send any data to ProfileSpider. What happens afterward depends on the AI service you load it into.

Are Agent Skills the same as prompts?
No. A skill is a structured, reusable package (task, inputs, rules, process, and output format), so the workflow runs consistently and can be shared, versioned, and edited.
