Agent Skill

Company Data Normalization

Normalize company names, domains, industries, and locations into consistent values.

Version 1.0 · Updated June 2026 · SKILL.md · MIT · 6 min read

Overview

What this skill does

The same company appears as "ACME Inc.", "Acme", and "acme.com" across sources, which fragments reporting, joins, and de-duplication.

This skill produces canonical company fields (name, domain, industry, location), a confidence level for each record, and an explicit list of the fields it could not resolve, so downstream joins and analysis are reliable.

When to use it

Best used for

  • Standardizing company fields before a join or merge
  • Cleaning a multi-source company list
  • Preparing data for segmentation or analysis
  • Resolving inconsistent industry labels

Know the limits

When not to use this skill

  • You need to detect duplicates (use Duplicate Record Review)
  • The records contain no company identifiers
  • You need verified, authoritative registry data

Inputs

Provide these when prompted. The skill asks for anything missing before it runs.

Required

  • Company records

Optional

  • A canonical industry taxonomy
  • A location format standard
  • Known aliases to map

Outputs

One record per company with a consistent, inspectable schema.

  • normalized_company_name

    Canonical company name.

  • normalized_domain

    Clean root domain, no protocol.

  • standardized_industry

    Industry mapped to your taxonomy.

  • standardized_location

    Location in a consistent format.

  • normalization_confidence

    Per-record confidence level.

  • unresolved_fields

    Fields that could not be normalized.
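
As a rough illustration, the output record could be modeled as a small typed structure. This is only a sketch: the class name `NormalizedCompany` and the choice of `Optional[str]` for fields that may stay unresolved are assumptions, not part of the skill itself. Only the field names come from the list above.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class NormalizedCompany:
    """Hypothetical container mirroring the six output fields listed above."""

    normalized_company_name: Optional[str]
    normalized_domain: Optional[str]
    standardized_industry: Optional[str]
    standardized_location: Optional[str]
    normalization_confidence: str  # e.g. "high" or "low"
    unresolved_fields: List[str] = field(default_factory=list)


record = NormalizedCompany(
    normalized_company_name="Acme",
    normalized_domain="acme.com",
    standardized_industry="Software",
    standardized_location="San Francisco, CA, USA",
    normalization_confidence="high",
)
print(asdict(record))
```

Keeping unresolved values as `None` rather than a guessed string makes it easy to list them in `unresolved_fields` later.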

Example

One record normalized.

Input

company: ACME Inc.
website: https://www.acme.com/about
industry: software
location: SF

Output

normalized_company_name: Acme
normalized_domain: acme.com
standardized_industry: Software
standardized_location: San Francisco, CA, USA
normalization_confidence: high
unresolved_fields: none

Name, domain, industry, and location are now consistent and join-ready, with confidence reported so low-confidence rows can be reviewed.
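
The domain step in this example can be sketched with the Python standard library. The helper name `normalize_domain` is hypothetical, and the approach is deliberately naive: it strips the protocol, path, and a leading `www.`, but it does not reduce other subdomains (such as `blog.acme.com`) to a root domain, which would need a public suffix list.

```python
from urllib.parse import urlparse


def normalize_domain(website: str) -> str:
    """Return a lowercase host without protocol, path, or leading 'www.'."""
    value = website.strip().lower()
    # urlparse only recognizes the host when a scheme or leading "//" is present.
    if "://" not in value:
        value = "//" + value
    host = urlparse(value).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(normalize_domain("https://www.acme.com/about"))
```

An empty return value signals that no host could be extracted, so the caller can add the domain to the unresolved fields instead of guessing.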

Setup

How to use the skill

General steps first, then notes for specific clients where verified.

  1. Download the file using the button below, or copy the Markdown.
  2. Place it in a directory named after the skill (e.g. skill-name/).
  3. Make sure the filename stays exactly SKILL.md.
  4. Add any references or assets included with the package.
  5. Load the skill into a compatible agent and provide the required inputs.

Claude Code

  1. Create a folder for the skill and save SKILL.md inside it.
  2. Place the folder where your project's skills are discovered.
  3. Reference the skill when you want it applied to your data.

Other compatible clients

  1. Confirm the client supports the open Agent Skills format.
  2. Load the SKILL.md file as instructed by that client.
  3. If skills are not auto-loaded, paste the Markdown as instructions.

Source

Full SKILL.md source

Read the rendered skill or copy the complete Markdown. The download is generated from this exact source.

Version 1.0 · SKILL.md · ~2 KB · MIT

Company Data Normalization

Purpose

Normalize company names, domains, industries, and locations into consistent values.

When to use this skill

  • Standardizing company fields before a join or merge
  • Cleaning a multi-source company list
  • Preparing data for segmentation or analysis
  • Resolving inconsistent industry labels

When not to use this skill

  • You need to detect duplicates (use Duplicate Record Review)
  • The records contain no company identifiers
  • You need verified, authoritative registry data

Required inputs

  • Company records

Optional inputs

  • A canonical industry taxonomy
  • A location format standard
  • Known aliases to map

Rules

  1. Produce canonical values; do not invent unknown fields.
  2. Map industries only to the supplied taxonomy when given.
  3. Report a confidence level per record.
  4. List unresolved fields explicitly.
  5. Preserve originals alongside normalized values.

Process

  1. Parse each company record.
  2. Normalize name and domain.
  3. Map industry and standardize location.
  4. Assign confidence.
  5. List unresolved fields.
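
The last two steps, together with the rule about preserving originals, might look like the sketch below. The skill does not say how confidence is computed, so the thresholds here (and the "medium" level) are assumptions for illustration only. The function name `finalize` is likewise hypothetical.

```python
OUTPUT_FIELDS = (
    "normalized_company_name",
    "normalized_domain",
    "standardized_industry",
    "standardized_location",
)


def finalize(original: dict, normalized: dict) -> dict:
    """Attach confidence and unresolved fields, keeping the original record."""
    # A field counts as unresolved when it is missing, None, or empty.
    unresolved = [name for name in OUTPUT_FIELDS if not normalized.get(name)]

    # Assumed rule of thumb, not defined by the skill:
    # no gaps -> high, one gap -> medium, otherwise low.
    if not unresolved:
        confidence = "high"
    elif len(unresolved) == 1:
        confidence = "medium"
    else:
        confidence = "low"

    result = {name: normalized.get(name) or None for name in OUTPUT_FIELDS}
    result["normalization_confidence"] = confidence
    result["unresolved_fields"] = unresolved
    result["original"] = dict(original)  # originals preserved alongside
    return result
```

Unresolved fields are returned as `None` rather than filled in, which keeps the output consistent with the rule against inventing unknown values.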

Output format

Return one record per company with the following fields:

  • normalized_company_name
  • normalized_domain
  • standardized_industry
  • standardized_location
  • normalization_confidence
  • unresolved_fields

Validation

  • Confirm domains are root domains without protocol.
  • Confirm industries match the taxonomy when provided.
  • Confirm low-confidence rows are flagged.
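
These checks could be automated roughly as follows. This is a sketch under stated assumptions: the `needs_review` flag is a hypothetical way to mark flagged rows, and the pattern only confirms that the value looks like a bare hostname. Verifying that it is truly a root domain would require a public suffix list.

```python
import re
from typing import List, Optional, Set

# Bare hostname: dot-separated labels, no protocol, path, or port.
BARE_DOMAIN = re.compile(r"[a-z0-9-]+(\.[a-z0-9-]+)+")


def validate(record: dict, taxonomy: Optional[Set[str]] = None) -> List[str]:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []

    domain = record.get("normalized_domain")
    if domain is not None and (
        domain.startswith("www.") or not BARE_DOMAIN.fullmatch(domain)
    ):
        problems.append("normalized_domain is not a bare domain")

    industry = record.get("standardized_industry")
    if taxonomy is not None and industry is not None and industry not in taxonomy:
        problems.append("standardized_industry is not in the taxonomy")

    # "needs_review" is an assumed flag name, not part of the skill's schema.
    if record.get("normalization_confidence") == "low" and not record.get("needs_review"):
        problems.append("low-confidence row is not flagged for review")

    return problems
```

Returning a list of problems instead of raising on the first one lets a reviewer see every issue in a row at once.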

Limitations

  • Normalization is heuristic without an authoritative registry.
  • Ambiguous names may need manual disambiguation.

Before you rely on it

Safety and limitations

  • Normalization is heuristic without an authoritative registry.
  • Ambiguous names may need manual disambiguation.
  • Review the output before acting on it.
  • Do not upload confidential datasets to an external model without authorization.
  • Outputs depend on the model and the source data and are not guaranteed to be accurate.

History

Changelog

  1. v1.0, June 2026
    • Initial release.

Questions

Agent Skill FAQ

What does the confidence level mean?
How certain the normalization is for each record. Low-confidence records are flagged so you can review ambiguous names or domains instead of trusting a silent guess.

Does it merge duplicate companies?
No. It standardizes fields so duplicates become detectable; use Duplicate Record Review to merge them.

Do I need ProfileSpider to use this skill?
No. The skill works on any compatible data. ProfileSpider is one convenient way to produce that structured input.

Does running this skill send data to ProfileSpider?
No. Downloading or copying the file does not send any data to ProfileSpider. What happens afterward depends on the AI service you load it into.

Are Agent Skills the same as prompts?
No. A skill is a structured, reusable package (task, inputs, rules, process, and output format), so the workflow runs consistently and can be shared, versioned, and edited.
