Fetch-MCP: Playwright-Based MCP Server with Batch URL Fetching Support
64 points | 1 month ago | 5 comments | github.com | HN
Sulfide6416
1 month ago
Fetch-MCP is an MCP server built on Playwright, designed for efficient web page content fetching. It retrieves content from both static and dynamic websites using Playwright's headless browser capabilities. Key features include `fetch_url` for single-page retrieval and `fetch_urls` for high-performance batch fetching of multiple URLs in parallel. Fetch-MCP extracts the main content of a page, supports Markdown conversion, and is easily configurable, making it a good fit for developers who need robust and scalable web scraping.
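
To illustrate what batch fetching "in parallel" can look like, here is a minimal sketch of a bounded worker pool. It is not taken from the Fetch-MCP source: `fetchUrls`, `FetchOne`, `FetchResult`, and the `concurrency` parameter are hypothetical names, and the actual page fetch is injected so the sketch stays independent of any browser setup.

```typescript
// Hypothetical sketch; none of these names come from the Fetch-MCP code base.
type FetchOne = (url: string) => Promise<string>;

interface FetchResult {
  url: string;
  ok: boolean;
  content?: string;
  error?: string;
}

// Fetch many URLs with at most `concurrency` requests in flight.
// Results are returned in the same order as the input URLs.
async function fetchUrls(
  urls: string[],
  fetchOne: FetchOne,
  concurrency = 4,
): Promise<FetchResult[]> {
  const results: FetchResult[] = new Array(urls.length);
  let next = 0; // index of the next URL to claim

  async function worker(): Promise<void> {
    while (next < urls.length) {
      const i = next++; // claimed synchronously, so no two workers share an index
      const url = urls[i];
      try {
        results[i] = { url, ok: true, content: await fetchOne(url) };
      } catch (err) {
        // One failed page should not abort the whole batch.
        results[i] = { url, ok: false, error: String(err) };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With Playwright, `fetchOne` could open a page, call `page.goto(url)`, and return `page.content()`. Capping the number of workers keeps the number of open pages bounded, and per-URL error capture means one bad link does not fail the batch.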
andrethegiant
1 month ago
Check out https://pure.md for a REST API version of this.
wejick
1 month ago
Is there an example of how an agent can interact with MCP? I imagine it will replace or complement the Tools interface.
tuananh
1 month ago
The transport can be either stdio or SSE.
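
On the question of what the interaction looks like: MCP messages are JSON-RPC 2.0, and a client invokes a tool with a `tools/call` request. Below is a rough sketch of building such a request and framing it for the stdio transport, where each message is a single line of JSON. The `fetch_url` tool name comes from the post above; the `url` argument name and the helper names `buildToolCall` and `encodeForStdio` are assumptions for illustration. A real session would also start with an `initialize` handshake and typically a `tools/list` call, which are omitted here.

```typescript
// JSON-RPC 2.0 request shape that MCP uses to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds a tools/call request for the given tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Over stdio, messages are newline-delimited JSON written to the server's stdin.
function encodeForStdio(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

// The `url` argument name is an assumption; check the server's tool schema.
const request = buildToolCall(1, "fetch_url", { url: "https://example.com" });
const line = encodeForStdio(request);
```

The agent side would write `line` to the server process and read the matching response (same `id`) from its stdout. With SSE the message body is the same; only the delivery channel changes.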
tomjen3
1 month ago
Cool, but Playwright doesn’t use your cookies.

Increasingly I want to stop spending time on Twitter, but it’s also where the AI news drops first, and I can’t just scrape the data without being logged in.

If there were a way to have the AI go ahead and gather the data for me, that would be great.

omneity
1 month ago
This is something I am building. Herd[0] gives you a Puppeteer-like API over your own browser, in effect allowing you to use your existing session seamlessly for automation and data extraction (and avoid bot detection as a nice side effect).

0: https://herd.garden

aschobel
1 month ago
Playwright can actually use your existing browser cookies if you connect it through Chrome's debugging protocol. Launch Chrome with the flag:

--remote-debugging-port=9222

Then connect via CDP in Playwright like this:

const browser = await chromium.connectOverCDP('http://localhost:9222');
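
A slightly fuller sketch of the same idea, reusing the browser's existing context so that pages share its logged-in state. The small interfaces below are hypothetical stand-ins that cover only the Playwright calls used (`connectOverCDP`, `contexts`, `newPage`, `goto`, `content`, `close`); Playwright's real `chromium` object should fit the `CdpConnector` shape, and `fetchWithExistingSession` is a made-up name. Whether the first context actually carries your cookies depends on the profile Chrome was launched with.

```typescript
// Minimal structural types for the Playwright calls used below (illustrative).
interface PageLike {
  goto(url: string): Promise<unknown>;
  content(): Promise<string>;
  close(): Promise<void>;
}
interface ContextLike {
  newPage(): Promise<PageLike>;
}
interface BrowserLike {
  contexts(): ContextLike[];
  close(): Promise<void>;
}
interface CdpConnector {
  connectOverCDP(endpointURL: string): Promise<BrowserLike>;
}

// Connect to an already-running Chrome and fetch a page inside its
// existing context, so the page sees that context's cookies.
async function fetchWithExistingSession(
  connector: CdpConnector,
  endpoint: string,
  url: string,
): Promise<string> {
  const browser = await connector.connectOverCDP(endpoint);
  try {
    const context = browser.contexts()[0]; // the browser's default context
    if (!context) throw new Error("no browser context found");
    const page = await context.newPage();
    try {
      await page.goto(url);
      return await page.content(); // serialized HTML of the loaded page
    } finally {
      await page.close(); // close only the tab we opened
    }
  } finally {
    await browser.close(); // for a CDP connection this disconnects the client
  }
}
```

Usage would look like `fetchWithExistingSession(chromium, 'http://localhost:9222', someUrl)` after importing `chromium` from `playwright`.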

yonl
1 month ago
I agree with this point as well.

Speaking of implementation, I don’t mind if a browser extension forwards cookies from my browser to the automation (privacy and security are an issue of course, and I’d ideally want the cookies not to leave my device, but personally I’m okay with some trade-off).

dd36
1 month ago
Can’t you just have it log in?
chazeon
1 month ago
What's MCP?
DogRunner
1 month ago
A simple explanation can be seen here: https://www.youtube.com/watch?v=7j_NE6Pjv-E
pizza
1 month ago
Model Context Protocol
dSebastien
1 month ago
I shared some notes about it here. Well worth exploring right now: https://notes.dsebastien.net/30+Areas/33+Permanent+notes/33....
hi_hi
1 month ago
Thanks for this. I'm not familiar with MCP, but having (briefly) read your link, it appears to enable a use case I've been expecting: a chat window that replaces the entire website experience (probably better suited to larger, enterprise-style websites) and provides tailored information for a company or product.

Would you know if it's possible to use this approach to constrain an LLM to a specific context of information? For example, on the Microsoft site, any question related to CRMs would be answered with information about Dynamics but never Salesforce.
