URL Extractor

Quickly extract URLs and domains from emails, web page content, HTML source, or uploaded TXT and CSV files. Supports automatic deduplication, removal of common tracking parameters (such as UTM), filtering by keyword or domain, and viewing stats. Export results as TXT or CSV. Handles uploads up to 2 MB and up to 500,000 characters of text per run. Everything is processed locally in your browser; data is never uploaded to a server.

How to use

Overview

URL Extractor pulls links out of unstructured text—emails, newsletters, copied web pages, HTML source, or log snippets. You do not need one URL per line.

Processing includes automatic deduplication, trailing punctuation cleanup, and optional removal of common tracking parameters. Filter by keywords or hostname, view domain counts, then copy or export. Everything runs locally in your browser with no server upload.
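
The extraction, trailing punctuation cleanup, and deduplication described above can be sketched in Python. This is only an illustration under stated assumptions, not the tool's actual code: the regular expression, the set of trailing characters, and the name `extract_urls` are all hypothetical.

```python
import re

# Assumed pattern: http(s) URLs running up to the next whitespace,
# quote, or angle bracket. The real tool may recognise more forms.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

# Assumed set of punctuation that often trails a URL in prose.
# Caveat: trimming ")" also shortens URLs that legitimately end in ")".
TRAILING = ".,;:!?)]}"


def extract_urls(text: str) -> list[str]:
    """Return unique URLs in first-seen order, trailing punctuation trimmed."""
    seen: set[str] = set()
    result: list[str] = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(TRAILING)
        if url and url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

Because the text is scanned as a plain string, a link inside `href="..."` and a link at the end of a sentence are found the same way; the second occurrence of an identical URL is dropped.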

Good for

  • Collecting links from email threads or marketing messages
  • Extracting URLs from pasted HTML or page source
  • Cleaning shared links before analysis or bookmarking
  • Getting a hostname list for SEO or competitor review
  • Exporting to a spreadsheet or another tool

For a list that already has one URL per line, use Text Deduplicator.

Steps

  1. Paste into Input or Upload a .txt / .csv file (try Sample for a demo)
  2. Review Extracted URLs and the stats below the panels
  3. Use Domain breakdown (top 50 by count) and click to filter
  4. Adjust output mode, sort, cleaning options, and filters
  5. Copy, or Download as TXT or CSV; Clear text removes input only; Reset also restores defaults

Options

| Option | What it does |
| --- | --- |
| Output: Full URL / Domain only | One full link per line, or hostname only |
| Sort: first seen / A–Z / Z–A | Order of the result list |
| Remove tracking params | Strip utm_*, fbclid, gclid, and similar query keys (default on) |
| Remove protocol | Drop https:// from displayed/exported URLs |
| Remove trailing slash | Removes `/` at the end of paths (e.g. `/blog/` → `/blog`). Pure domain or root URLs such as `https://example.com/` usually look the same with this option on. |
| Include / Exclude keywords | Keep or drop URLs containing any listed substring (OR) |
| Domain filter | Keep matching hostnames only (OR; includes subdomains) |
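
As a rough illustration of the Remove protocol and Remove trailing slash options, here is a minimal Python sketch. The function name `format_url` and its parameters are hypothetical, and how the tool itself treats root URLs is not spelled out here, so this sketch simply applies the same rule to every path.

```python
from urllib.parse import urlsplit, urlunsplit


def format_url(url: str, remove_protocol: bool = False,
               remove_trailing_slash: bool = False) -> str:
    """Apply the two display options to a single URL (assumed behaviour)."""
    parts = urlsplit(url)
    path = parts.path
    if remove_trailing_slash and path.endswith("/"):
        # Assumption: strip every trailing slash from the path component.
        path = path.rstrip("/")
    cleaned = urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )
    if remove_protocol:
        # Drop everything up to and including "://".
        cleaned = cleaned.split("://", 1)[-1]
    return cleaned
```

For example, `format_url("https://example.com/blog/", remove_trailing_slash=True)` gives `https://example.com/blog`, and adding `remove_protocol=True` gives `example.com/blog`.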

Export

| Method | Contents |
| --- | --- |
| Copy / TXT | One result per line (URL or domain, per output mode) |
| CSV | url, hostname, protocol columns with header (UTF-8) |
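
The CSV layout above can be reproduced with a short Python sketch. The function name `to_csv` is hypothetical, and the exact form of the protocol column (here `https` without a colon) is an assumption.

```python
import csv
import io
from urllib.parse import urlsplit


def to_csv(urls: list[str]) -> str:
    """Build CSV text with url, hostname, protocol columns and a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["url", "hostname", "protocol"])
    for url in urls:
        parts = urlsplit(url)
        # Assumption: protocol is written as the bare scheme, e.g. "https".
        writer.writerow([url, parts.hostname or "", parts.scheme])
    return buffer.getvalue()
```

Encode the returned string as UTF-8 when writing it to a file, so the output matches the encoding listed in the table.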

Limits and tips

  • Up to 500,000 characters per run; uploads must be 2 MB or smaller and decode to no more than 500,000 characters
  • Pasted HTML is scanned as text—not rendered in a browser
  • PDF and Word files are not supported; copy the text and paste instead
  • Path-only links such as `/api/user` are not extracted

Extraction is format-level only: it does not guarantee that every link is live or safe to open.

FAQ

Q: What does this tool do?

A: Finds URLs in mixed text or uploaded TXT/CSV, deduplicates them, and removes common tracking query parameters by default. Switch to domain-only output, filter by keywords or hostname, view stats, then copy or download TXT/CSV—all in your browser.

Q: How do I use it?

A: Follow these steps:

  1. Paste into Input on the left, or Upload a .txt / .csv file (UTF-8)
  2. Extracted links appear on the right; stats show matched, kept, deduped, domains, and HTTPS/HTTP counts
  3. Adjust output mode, sort, and cleaning options in Extraction settings
  4. Use Include / Exclude keywords or Domain filter to narrow results; click a domain chip to fill the filter
  5. Copy results, or Download as TXT or CSV; Clear text removes input only; Reset also restores defaults

Q: Can I upload a file?

A: Yes—.txt and .csv (UTF-8). Files up to 2 MB; decoded text up to 500,000 characters (same as paste). Over-limit inputs show an error and are not truncated.

Q: Why both a 2 MB and a 500,000-character limit?

A: 2 MB caps upload file size so the browser does not read huge files at once.

500,000 characters caps the text used for extraction (paste and decoded uploads share this limit).

A file can be under 2 MB yet exceed 500,000 characters after decoding; in that case, split the file into smaller parts.
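
The two checks can be pictured as a size test on the raw bytes followed by a length test on the decoded text. The Python sketch below is illustrative only: the name `decode_upload`, the reading of 2 MB as 2 × 1024 × 1024 bytes, and the way characters are counted are all assumptions.

```python
MAX_FILE_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" read as 2 MiB
MAX_CHARS = 500_000


def decode_upload(data: bytes) -> str:
    """Reject over-limit input instead of truncating it."""
    if len(data) > MAX_FILE_BYTES:
        raise ValueError("file exceeds 2 MB")
    text = data.decode("utf-8")
    # Assumption: characters are counted as Python code points; a browser
    # implementation may count UTF-16 code units instead.
    if len(text) > MAX_CHARS:
        raise ValueError("decoded text exceeds 500,000 characters")
    return text
```

A 600,000-byte file of plain ASCII passes the first check but fails the second, which is the case the answer above describes.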

Q: Input and output formats?

A: Input: any text, or full content of an uploaded TXT/CSV file.

Output: one item per line (full URL or hostname, depending on output mode).

CSV has url, hostname, and protocol columns with a header (UTF-8).

Q: What are tracking parameters?

A: Extra query strings added for analytics or ads, such as utm_source or fbclid. With Remove tracking params enabled (default), common ones are stripped from results. Turn it off to keep the original query string.
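
A minimal Python sketch of this kind of cleanup follows. It is not the tool's implementation: the key list covers only the examples named on this page (`utm_*`, `fbclid`, `gclid`), and the name `strip_tracking` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed key list: only the examples named on this page.
TRACKING_KEYS = {"fbclid", "gclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Remove tracking query keys and keep every other parameter in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_KEYS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Note: urlencode re-encodes the remaining values, so their escaping
    # may differ slightly from the original string.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

With this sketch, `https://example.com/page?id=7&utm_source=mail&fbclid=abc` becomes `https://example.com/page?id=7`.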

Q: Full URL vs Domain only?

A: Full URL keeps the complete link including path and query (after cleaning). Domain only outputs the hostname (e.g. blog.example.com)—useful when you care about where links point, not every unique page.
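
Domain-only output and the per-domain counts shown in Domain breakdown can be approximated in Python as below. The name `domain_breakdown` is hypothetical; the default of 50 mirrors the "top 50 by count" mentioned in the steps, and the tie-breaking order is an assumption.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_breakdown(urls: list[str], top: int = 50) -> list[tuple[str, int]]:
    """Count URLs per hostname and return the most frequent hostnames."""
    hosts = (urlsplit(url).hostname for url in urls)
    counts = Counter(host for host in hosts if host)
    # most_common sorts by count, highest first; ties keep first-seen order.
    return counts.most_common(top)
```

The first element of each pair is what Domain only output would show for the matching URLs.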

Q: How do Include, Exclude, and Domain filters work?

A: Enter one or more terms separated by commas, semicolons, or new lines (OR match, case-insensitive). Include keeps URLs containing any term; Exclude removes matches. Domain filter keeps URLs whose hostname matches (including subdomains). Click a chip in Domain breakdown to add it quickly.
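
The matching rules can be sketched in Python as follows. All names (`parse_terms`, `host_matches`, `apply_filters`) are hypothetical, and the subdomain rule (exact hostname, or hostname ending in `.` plus the term) is an assumption about what "including subdomains" means.

```python
import re
from urllib.parse import urlsplit


def parse_terms(raw: str) -> list[str]:
    """Split on commas, semicolons, or new lines; lowercase; drop blanks."""
    return [t.strip().lower() for t in re.split(r"[,;\n]+", raw) if t.strip()]


def host_matches(hostname: str, domain: str) -> bool:
    """Assumed rule: exact match, or any subdomain of the given domain."""
    return hostname == domain or hostname.endswith("." + domain)


def apply_filters(urls: list[str], include: str = "", exclude: str = "",
                  domains: str = "") -> list[str]:
    inc, exc, doms = parse_terms(include), parse_terms(exclude), parse_terms(domains)
    kept = []
    for url in urls:
        lowered = url.lower()
        host = (urlsplit(url).hostname or "").lower()
        if inc and not any(term in lowered for term in inc):
            continue  # Include: keep only URLs containing any term (OR)
        if any(term in lowered for term in exc):
            continue  # Exclude: drop URLs containing any term (OR)
        if doms and not any(host_matches(host, d) for d in doms):
            continue  # Domain filter: hostname must match one entry (OR)
        kept.append(url)
    return kept
```

Under this rule, a Domain filter of `example.com` keeps `blog.example.com` but not `notexample.com`, because the match is made on whole hostname labels rather than on raw substrings.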

Q: Can it extract links from HTML?

A: Yes, if you paste HTML source as plain text. The tool scans for URL-like strings (including href and src values). It does not render the page or parse the DOM, so unusual markup may be missed.

Q: Why are fewer URLs shown than expected?

A: Common reasons: active filters, links without a recognizable hostname, path-only strings like /api/user (not supported), or text that does not contain a full URL. Try relaxing filters or checking the source.

Q: Is my data uploaded?

A: No. Processing runs locally in your browser. Options may be saved in localStorage; input stays in sessionStorage for this tab. See the Privacy Notice below.

Q: How is this different from Text Deduplicator?

A: URL Extractor finds links inside mixed text. Text Deduplicator cleans lists that already have one URL per line. Use Text Deduplicator if you already have a plain list.
