
Show HN: I taught GPT-OSS-120B to see using Google Lens and OpenCV

12 points

1 hour ago

| 2 comments

I built an MCP server that gives any local LLM real Google search and now vision capabilities - no API keys needed.

  The latest feature: google_lens_detect uses OpenCV to find objects in an image, crops each one, and sends them to Google Lens for identification. GPT-OSS-120B, a text-only model with zero vision support, correctly identified an NVIDIA DGX Spark and a SanDisk USB drive from a desk photo.
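
  Roughly, the detect, crop, identify flow looks like the sketch below. It is not the project's code: a plain flood fill over a thresholded grayscale grid stands in for the OpenCV detection step so the snippet runs without dependencies, and `identify` is a hypothetical callback standing in for the Google Lens lookup. The names `find_boxes`, `crop`, and `detect_and_identify` are made up for illustration.

```python
from typing import Callable, List, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]
Image = List[List[int]]  # grayscale pixels, one list per row


def find_boxes(mask: Image) -> List[Box]:
    """Return the bounding box of each 4-connected foreground region."""
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill from this pixel, tracking the region's extent.
            stack = [(r, c)]
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom + 1, right + 1))
    return boxes


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def detect_and_identify(
    image: Image,
    identify: Callable[[Image], str],  # hypothetical stand-in for the Lens call
    threshold: int = 128,
) -> List[Tuple[Box, str]]:
    """Threshold the image, find objects, crop each, and label each crop."""
    mask = [[1 if px >= threshold else 0 for px in row] for row in image]
    return [(box, identify(crop(image, box))) for box in find_boxes(mask)]
```

  The text-only model never sees pixels in this flow. It only receives the list of (box, label) pairs as text, which is why a model with no vision support can still answer questions about the photo.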

  Also includes Google Search, News, Shopping, Scholar, Maps, Finance, Weather, Flights, Hotels, Translate, Images, Trends, and more. 17 tools total.

  Two commands: pip install noapi-google-search-mcp && playwright install chromium

  GitHub: https://github.com/VincentKaufmann/noapi-google-search-mcp
  PyPI: https://pypi.org/project/noapi-google-search-mcp/

Booyah!


N_Lens

5 minutes ago


Scraping Google directly like that looks like a TOS violation to me. While the concept of giving a text-only model 'pseudo vision' is clever, I think the solution in its current form is a bit fragile. SerpAPI, the Google Custom Search API, etc. exist for a reason; for anything beyond personal tinkering, this is a liability.


speedgoose

4 minutes ago


Isn’t SerpAPI itself a service that scrapes Google through residential proxies?


TZubiri

10 minutes ago


Have you tried Llama? In my experience it has been strictly better than GPT-OSS, but that might depend on how exactly it is used.