Workflow

How to Scrape a Directory to CSV

Use ProfileSpider to turn a public online directory into a structured lead list. Open the directory in Chrome, extract visible listings, save them to a list, and export the results as CSV, Excel, or JSON.

6 steps · ~5 minutes · 1 credit per page scrape

Goal

What This Workflow Is For

Turn a public directory page into a clean spreadsheet you can filter, enrich, and export.

Use this workflow when you have a public directory page with repeated people, companies, members, vendors, agencies, speakers, partners, or local businesses and you want the data in a structured format.

Instead of manually copying names, companies, profile links, websites, and contact details into a spreadsheet, ProfileSpider reads the visible page in Chrome and turns the listings into rows you can save to a list.

This page covers the ProfileSpider workflow end to end: open the source page, extract the listings, save the results, optionally enrich missing details, and export the list as CSV, Excel, or JSON.

Prerequisites

Before You Start

Confirm the page and tooling match this workflow.

Before you start, make sure you have:

  • ProfileSpider installed in Chrome
  • A public directory page open in a normal Chrome tab
  • Listings, cards, rows, profiles, or companies visible on the page
  • A rough idea of the columns you want in your export, such as name, company, title, website, LinkedIn URL, email, and source URL

This workflow works best when the directory has repeated listings visible in the page HTML. If the data sits behind a login, is embedded in an image or PDF, or is loaded by a script that is blocked, extraction may be limited.

Fit

Best For / Not Ideal For

Set expectations before you install ProfileSpider or run an extraction.

Best for

  • Industry and association member directories
  • Conference speaker, sponsor, or exhibitor lists
  • Local business directories and chamber of commerce pages
  • Agency, consultant, SaaS, startup, or vendor directories
  • Marketplace pages with repeated company or profile cards
  • Partner directories and technology ecosystem pages

Not ideal for

  • Private directories you are not allowed to access
  • PDFs, screenshots, or scanned tables with no live HTML
  • Single-profile pages with no repeated rows or cards
  • Pages where the information only appears after complex in-page interactions
  • Directories where the visible page contains almost no useful fields

Steps

Step-by-Step Workflow

  1. Open the directory page in Chrome

    Go to the public directory page you want to extract. Wait until the listings are fully loaded and visible in the browser.

    ProfileSpider works from the page you can see in Chrome, so make sure the relevant rows, cards, or profiles are actually rendered before extracting.

  2. Open ProfileSpider

    Click the ProfileSpider extension icon. The extension will analyze the current page and prepare the extraction workflow.

  3. Review the fields you want to capture

    Check the columns you want in the output. Common fields for directory scraping include name, job title, company, website, LinkedIn URL, email, location, description, and source URL.

    The exact fields depend on what the directory exposes. ProfileSpider can structure visible data, but it cannot extract fields that are not present or discoverable on the page.

  4. Run the extraction

    Start the extraction. ProfileSpider turns the repeated listings on the page into structured rows. A normal page scrape uses one credit.

  5. Save the extracted rows to a list

    Save the results to a new or existing list. Use list names, tags, and notes to keep different directories, clients, campaigns, or niches organized.

    For example, you could save rows to a list named “Barcelona SaaS agencies”, “HR tech vendors”, or “Conference speakers 2026”.

  6. Export the directory data

    Export the saved list as CSV, Excel, or JSON. Use CSV or Excel for spreadsheet workflows, and JSON if you want to move the data into another tool or system.

Schema

What ProfileSpider Extracts

Default fields for this workflow. Add or remove columns before you extract.

  • Name: The person, company, vendor, member, speaker, or listing name shown in the directory.
  • Job Title: The person’s role or position, if the directory includes it.
  • Company: The company, organization, agency, or employer associated with the listing.
  • Website: The company or profile website linked from the directory, when available.
  • LinkedIn URL: A LinkedIn profile or company page URL if the directory links to one.
  • Email: An email address if it is visible on the directory page. Missing emails can be handled later with email finding where available.
  • Location: City, country, region, or service area if present in the listing.
  • Description: Short company, profile, or listing description if the page exposes one.
  • Source URL: The URL of the directory page or listing source, useful for verification and deduplication.
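If you process exports in code, the field list above can be modeled as a simple record type. The following is a minimal sketch: the class name `DirectoryRow` and its attribute names are hypothetical and chosen for illustration, not ProfileSpider's own schema. Treating only the name as required is also an assumption, since the other fields depend on what the directory exposes.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DirectoryRow:
    """One extracted listing; field names are illustrative, not ProfileSpider's."""
    name: str                          # assumed to be present on every listing
    job_title: Optional[str] = None
    company: Optional[str] = None
    website: Optional[str] = None
    linkedin_url: Optional[str] = None
    email: Optional[str] = None        # often missing until enrichment
    location: Optional[str] = None
    description: Optional[str] = None
    source_url: Optional[str] = None


# Fields the directory did not expose simply stay as None.
row = DirectoryRow(
    name="Aisha Carter",
    job_title="Founder",
    company="Cartergrove Labs",
    source_url="example-directory.com/startups",
)
print(asdict(row))
```

Keeping optional fields as `None` rather than empty strings makes it easy to tell "not on the page" apart from a value that was extracted.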

Output

Example Output

What a downloaded file looks like. Real exports are saved as .csv, .xlsx, or .json.

directory-leads-export.csv

Name | Title | Company | Website | LinkedIn | Email | Location | Source
Maria Chen | VP Engineering | Lumen Robotics | lumenrobotics.com | linkedin.com/in/mariachen | maria@lumenrobotics.com | Berlin, Germany | example-directory.com/robotics
James Patel | Head of Sales | Northwind Cloud | northwind.io | linkedin.com/in/jamespatel | james@northwind.io | London, UK | example-directory.com/cloud
Aisha Carter | Founder | Cartergrove Labs | cartergrove.com | linkedin.com/in/aishacarter | | Amsterdam, Netherlands | example-directory.com/startups
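Once downloaded, a CSV export like the example above can be read with Python's standard `csv` module. This is a rough sketch, assuming the header row matches the example columns (Name, Title, Company, Website, LinkedIn, Email, Location, Source); a real export may name its columns differently. `load_rows` is a hypothetical helper, not part of ProfileSpider.

```python
import csv
import io

# Sample data mirroring the example export; the column names are an assumption.
SAMPLE_CSV = """Name,Title,Company,Website,LinkedIn,Email,Location,Source
Maria Chen,VP Engineering,Lumen Robotics,lumenrobotics.com,linkedin.com/in/mariachen,maria@lumenrobotics.com,"Berlin, Germany",example-directory.com/robotics
Aisha Carter,Founder,Cartergrove Labs,cartergrove.com,linkedin.com/in/aishacarter,,"Amsterdam, Netherlands",example-directory.com/startups
"""


def load_rows(csv_file):
    """Read an exported CSV from a file-like object into a list of dicts."""
    reader = csv.DictReader(csv_file)
    # Trim stray whitespace and turn missing cells into empty strings.
    return [
        {key: (value or "").strip() for key, value in row.items() if key is not None}
        for row in reader
    ]


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["Name"], "|", row["Company"], "|", row["Email"] or "(no email)")
```

For a real file, open it with `open("directory-leads-export.csv", newline="", encoding="utf-8")` and pass the file object to `load_rows`. Quoted cells such as "Berlin, Germany" are handled by the `csv` module, which is why splitting lines on commas by hand is best avoided.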

Troubleshooting

Common Problems

The directory has multiple pages

Extract the first page, move to the next page manually, and save the next extraction to the same list. This keeps paginated directories organized in one place.

Some rows have missing fields

This usually means the field was not visible in the source listing. You can leave the cell empty, enrich the row later, or filter incomplete rows before export.
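If you prefer to separate incomplete rows after export rather than inside ProfileSpider, a few lines of Python are enough. This is a sketch under assumptions: the column names `Name` and `Email` follow the example export, and `split_by_completeness` is a hypothetical helper.

```python
def split_by_completeness(rows, required=("Name", "Email")):
    """Return (complete, incomplete) lists based on the required columns."""
    complete, incomplete = [], []
    for row in rows:
        # A row is complete only if every required column has a non-blank value.
        has_all = all((row.get(col) or "").strip() for col in required)
        (complete if has_all else incomplete).append(row)
    return complete, incomplete


rows = [
    {"Name": "Maria Chen", "Email": "maria@lumenrobotics.com"},
    {"Name": "Aisha Carter", "Email": ""},
]
complete, incomplete = split_by_completeness(rows)
print(len(complete), "complete;", len(incomplete), "to enrich later")
```

Keeping the incomplete rows in their own list, instead of discarding them, leaves them available for later enrichment.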

Rows look duplicated

Some pages repeat the same listing in featured sections, sidebars, or hidden page elements. Review the list, use deduplication if available, and keep the source URL column for verification.
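If duplicates survive into the exported file, they can also be removed afterwards. The sketch below assumes that two rows with the same name and company are the same listing, which may not hold for every directory. The column names follow the example export, and `dedupe_rows` is a hypothetical helper, not a ProfileSpider feature.

```python
def dedupe_rows(rows, key_columns=("Name", "Company")):
    """Keep the first occurrence of each listing, comparing case-insensitively."""
    seen = set()
    unique = []
    for row in rows:
        # Normalize whitespace and case so "james patel " matches "James Patel".
        key = tuple((row.get(col) or "").strip().casefold() for col in key_columns)
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique


rows = [
    {"Name": "James Patel", "Company": "Northwind Cloud", "Source": "example-directory.com/cloud"},
    {"Name": "james patel ", "Company": "Northwind Cloud", "Source": "example-directory.com/cloud"},
    {"Name": "Maria Chen", "Company": "Lumen Robotics", "Source": "example-directory.com/robotics"},
]
unique = dedupe_rows(rows)
print(len(rows), "rows in,", len(unique), "rows out")
```

Adding `"Source"` to `key_columns` makes the match stricter, which helps when different people with the same name appear on different directory pages.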

The page uses infinite scroll

Scroll until the listings you need are loaded, then run the extraction. For very long pages, extract in batches and save each batch to the same list.

The directory blocks scraping or hides content

Only extract data you can access normally in your browser. If the directory does not expose useful content in the visible page, ProfileSpider may not be able to structure it cleanly.

Questions

Common Questions

Can I scrape any directory to CSV with ProfileSpider?

ProfileSpider is designed for public web pages with visible listings, rows, cards, profiles, or companies. It works best when the directory has repeated items on the page. It is not intended for private data, inaccessible pages, scanned PDFs, or pages where the data is not visible in the browser.

Does ProfileSpider export to Excel as well as CSV?

Yes. ProfileSpider supports CSV, Excel, and JSON exports from saved lists.

Does this work on paginated directories?

Yes, but the safest workflow is usually to extract one page at a time and save each page into the same list. That gives you more control and makes it easier to review the output.

Can ProfileSpider find emails from a directory?

If email addresses are visible on the directory page, ProfileSpider can include them in the extracted rows. If emails are missing, you can use the email-finding workflow where available.

What should I do after exporting the directory?

Use the export for research, sales prospecting, recruiting, account lists, enrichment, or outreach preparation. For outreach, review and clean the data first so you do not send messages to irrelevant or incorrect contacts.
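As one example of cleaning before outreach, a basic shape check can flag rows whose email cell is empty or obviously malformed. This is a deliberately loose sketch: it does not verify that an address exists or accepts mail. The `Email` column name follows the example export, and `looks_like_email` is a hypothetical helper.

```python
import re

# Loose pattern: exactly one "@", no whitespace, and a dot in the domain part.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value):
    """Return True if the value has the basic shape of an email address."""
    return bool(EMAIL_PATTERN.match((value or "").strip()))


rows = [
    {"Name": "Maria Chen", "Email": "maria@lumenrobotics.com"},
    {"Name": "James Patel", "Email": "james@northwind"},
    {"Name": "Aisha Carter", "Email": ""},
]
# Collect the rows a person should look at before any message is sent.
needs_review = [row["Name"] for row in rows if not looks_like_email(row["Email"])]
print(needs_review)
```

A check like this only catches formatting problems, so relevance and accuracy still need a manual review.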

Ready to Extract Structured Leads?

Start free and see how quickly you can build a clean lead list.
