How to Scrape a Directory to CSV
Use ProfileSpider to turn a public online directory into a structured lead list. Open the directory in Chrome, extract visible listings, save them to a list, and export the results as CSV, Excel, or JSON.
Goal
What This Workflow Is For
Turn a public directory page into a clean spreadsheet you can filter, enrich, and export.
Use this workflow when you have a public directory page with repeated people, companies, members, vendors, agencies, speakers, partners, or local businesses and you want the data in a structured format.
Instead of manually copying names, companies, profile links, websites, and contact details into a spreadsheet, ProfileSpider reads the visible page in Chrome and turns the listings into rows you can save to a list.
This page is focused on the ProfileSpider workflow: open the source page, extract the listings, save the results, optionally enrich missing details, and export the list as CSV, Excel, or JSON.
Prerequisites
Before You Start
Confirm the page and tooling match this workflow.
Before you start, make sure you have:
- ProfileSpider installed in Chrome
- A public directory page open in a normal Chrome tab
- Listings, cards, rows, profiles, or companies visible on the page
- A rough idea of the columns you want in your export, such as name, company, title, website, LinkedIn URL, email, and source URL
This workflow works best when the directory has repeated listings visible in the page HTML. If the data sits behind a login, is embedded in an image or PDF, or is loaded by a script that is blocked, extraction may be limited.
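If you want a rough pre-check that a page has repeated listings in its HTML, the sketch below counts how often each CSS class appears in a saved copy of the page source. It is not how ProfileSpider detects listings; the `member-card` class name and the sample markup are hypothetical, and real directories will use their own class names.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts how often each CSS class appears on elements in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # One element can carry several space-separated classes.
                self.counts.update(value.split())


def repeated_classes(html, min_repeats=3):
    """Return {class_name: count} for classes used at least min_repeats times."""
    parser = ClassCounter()
    parser.feed(html)
    return {cls: n for cls, n in parser.counts.items() if n >= min_repeats}


# Hypothetical page source with three listing cards.
sample_html = """
<ul>
  <li class="member-card"><a href="/m/1">Lumen Robotics</a></li>
  <li class="member-card"><a href="/m/2">Northwind Cloud</a></li>
  <li class="member-card featured"><a href="/m/3">Cartergrove Labs</a></li>
</ul>
"""

print(repeated_classes(sample_html))  # {'member-card': 3}
```

A class that repeats many times is a hint, not proof, that the page contains listing cards. An empty result suggests the listings may be rendered from an image, PDF, or blocked script.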
Fit
Best For / Not Ideal For
Set expectations before you install ProfileSpider or run an extraction.
Best for
- Industry and association member directories
- Conference speaker, sponsor, or exhibitor lists
- Local business directories and chamber of commerce pages
- Agency, consultant, SaaS, startup, or vendor directories
- Marketplace pages with repeated company or profile cards
- Partner directories and technology ecosystem pages
Not ideal for
- Private directories you are not allowed to access
- PDFs, screenshots, or scanned tables with no live HTML
- Single-profile pages with no repeated rows or cards
- Pages where the information only appears after complex in-page interactions
- Directories where the visible page contains almost no useful fields
Steps
Step-by-Step Workflow
1. Open the directory page in Chrome
Go to the public directory page you want to extract. Wait until the listings are fully loaded and visible in the browser.
ProfileSpider works from the page you can see in Chrome, so make sure the relevant rows, cards, or profiles are actually rendered before extracting.
2. Open ProfileSpider
Click the ProfileSpider extension icon. The extension will analyze the current page and prepare the extraction workflow.
3. Review the fields you want to capture
Check the columns you want in the output. Common fields for directory scraping include name, job title, company, website, LinkedIn URL, email, location, description, and source URL.
The exact fields depend on what the directory exposes. ProfileSpider can structure visible data, but it cannot extract fields that are not present or discoverable on the page.
4. Run the extraction
Start the extraction. ProfileSpider turns the repeated listings on the page into structured rows. A normal page scrape uses one credit.
5. Save the extracted rows to a list
Save the results to a new or existing list. Use list names, tags, and notes to keep different directories, clients, campaigns, or niches organized.
For example, you could save rows to a list named “Barcelona SaaS agencies”, “HR tech vendors”, or “Conference speakers 2026”.
6. Export the directory data
Export the saved list as CSV, Excel, or JSON. Use CSV or Excel for spreadsheet workflows, and JSON if you want to move the data into another tool or system.
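Once you have a CSV export, a few lines of Python are enough to load it for further processing. The sketch below is a minimal example under assumptions: the column names and sample rows are hypothetical, and the actual header names in a ProfileSpider export may differ.

```python
import csv
import io
import json

# Hypothetical export contents. For a real file, read the text with
# open("export.csv", newline="", encoding="utf-8") instead.
SAMPLE_CSV = """Name,Title,Company,Website,Email,Location,Source
Maria Chen,VP Engineering,Lumen Robotics,lumenrobotics.com,maria@lumenrobotics.com,"Berlin, Germany",example-directory.com/robotics
Aisha Carter,Founder,Cartergrove Labs,cartergrove.com,,"Amsterdam, Netherlands",example-directory.com/startups
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


rows = load_rows(SAMPLE_CSV)
print(len(rows), "rows loaded")
# Show the first row as JSON, the shape another tool would typically consume.
print(json.dumps(rows[0], indent=2))
```

Using `csv.DictReader` rather than splitting on commas matters here because fields such as "Berlin, Germany" contain commas inside quotes.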
Schema
What ProfileSpider Extracts
Default fields for this workflow. Add or remove columns before you extract.
- Name: The person, company, vendor, member, speaker, or listing name shown in the directory.
- Job Title: The person’s role or position, if the directory includes it.
- Company: The company, organization, agency, or employer associated with the listing.
- Website: The company or profile website linked from the directory, when available.
- LinkedIn URL: A LinkedIn profile or company page URL if the directory links to one.
- Email: An email address if it is visible on the directory page. Missing emails can be handled later with email finding where available.
- Location: City, country, region, or service area if present in the listing.
- Description: Short company, profile, or listing description if the page exposes one.
- Source URL: The URL of the directory page or listing source, useful for verification and deduplication.
Output
Example Output
What a downloaded file looks like. Real exports are saved as .csv, .xlsx, or .json.
| Name | Title | Company | Website | LinkedIn URL | Email | Location | Source |
|---|---|---|---|---|---|---|---|
| Maria Chen | VP Engineering | Lumen Robotics | lumenrobotics.com | linkedin.com/in/mariachen | maria@lumenrobotics.com | Berlin, Germany | example-directory.com/robotics |
| James Patel | Head of Sales | Northwind Cloud | northwind.io | linkedin.com/in/jamespatel | james@northwind.io | London, UK | example-directory.com/cloud |
| Aisha Carter | Founder | Cartergrove Labs | cartergrove.com | linkedin.com/in/aishacarter | | Amsterdam, Netherlands | example-directory.com/startups |
Troubleshooting
Common Problems
The directory has multiple pages
Extract the first page, move to the next page manually, and save the next extraction to the same list. This keeps paginated directories organized in one place.
Some rows have missing fields
This usually means the field was not visible in the source listing. You can keep the column empty, enrich the row later, or filter incomplete rows before export.
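If you prefer to filter incomplete rows after export rather than inside ProfileSpider, the idea can be sketched in Python as follows. The column names (`Name`, `Email`) and the sample rows are assumptions for illustration, so adjust them to match your actual export.

```python
def split_by_completeness(rows, required=("Name", "Email")):
    """Split rows into (complete, incomplete) based on required columns.

    A row is complete when every required column is present and non-blank.
    """
    complete, incomplete = [], []
    for row in rows:
        if all((row.get(col) or "").strip() for col in required):
            complete.append(row)
        else:
            incomplete.append(row)
    return complete, incomplete


# Hypothetical rows, as they might look after loading an exported file.
sample_rows = [
    {"Name": "Maria Chen", "Email": "maria@lumenrobotics.com"},
    {"Name": "Aisha Carter", "Email": ""},
    {"Name": "James Patel", "Email": "james@northwind.io"},
]

complete, incomplete = split_by_completeness(sample_rows)
print(len(complete), "complete;", len(incomplete), "to enrich later")
```

Keeping the incomplete rows in a separate list, rather than discarding them, leaves the option of enriching them later.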
Rows look duplicated
Some pages repeat the same listing in featured sections, sidebars, or hidden page elements. Review the list, use deduplication if available, and keep the source URL column for verification.
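If duplicates remain in an exported file, they can also be removed afterwards. The sketch below keeps the first occurrence of each listing, matching on name and company case-insensitively. The column names and sample rows are hypothetical, and this is only one possible matching rule, not ProfileSpider's own deduplication logic.

```python
def dedupe(rows, key_fields=("Name", "Company")):
    """Return rows with later duplicates removed, preserving original order.

    Two rows count as duplicates when all key fields match after
    trimming whitespace and lowercasing.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple((row.get(f) or "").strip().lower() for f in key_fields)
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique


# Hypothetical rows: the second entry repeats the first from a "featured" block.
sample_rows = [
    {"Name": "Maria Chen", "Company": "Lumen Robotics", "Source": "example-directory.com/robotics"},
    {"Name": "maria chen ", "Company": "Lumen Robotics", "Source": "example-directory.com/featured"},
    {"Name": "James Patel", "Company": "Northwind Cloud", "Source": "example-directory.com/cloud"},
]

unique_rows = dedupe(sample_rows)
print(len(sample_rows) - len(unique_rows), "duplicate(s) removed")
```

The source URL is deliberately left out of the key, since the same listing can appear under different URLs. It stays in each row so you can check where a kept entry came from.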
The page uses infinite scroll
Scroll until the listings you need are loaded, then run the extraction. For very long pages, extract in batches and save each batch to the same list.
The directory blocks scraping or hides content
Only extract data you can access normally in your browser. If the useful content is not exposed in the visible page, ProfileSpider may not be able to structure it cleanly.
Questions