AI Scrape

Extract structured data from any web page without writing selectors.

POST /v1/web/ai-scrape

Point the scraper at a URL (or pass raw HTML), describe the fields you want with element_prompts, and get back typed JSON. The endpoint handles proxy rotation, auto-scrolling, and JavaScript-rendered pages.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | conditional | URL to scrape. One of url or html is required. |
| html | string | conditional | Raw HTML to scrape. |
| element_prompts | string[] or object | no | Natural-language names of fields to extract. Max 5. |
| selectors | string[] | no | Explicit CSS selectors, e.g. ["h2.title"]. |
| root_element_selector | string | no | CSS scope limiter. Defaults to "main". |
| scroll | boolean | no | Auto-scroll through the page before scraping. |
| page_position | number | no | Pagination index (1-based). |
| features | string[] | no | Any of "meta", "link". Defaults to both. |
| size_preset | string | no | "QVGA", "VGA", "HD", "FHD", "4K UHD", etc. |
| is_mobile | boolean | no | Use a mobile viewport. |
| width / height | number | no | Viewport dimensions in pixels. |
| force_rotate_proxy | boolean | no | Force a fresh proxy on every request. |
| goto_options | object | no | { timeout, wait_until }. |
| wait_for | object | no | { mode, value }. |
| http_headers | object | no | Custom HTTP headers. |
| cookies | object[] | no | Cookies to send with the request. |
| byo_proxy | object | no | Use your own proxy config. |
| advance_config | object | no | Capture console, network, cookies. |
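
The two constraints stated in the parameter list (one of url or html is required; element_prompts takes at most 5 entries) can be checked client-side before a request is sent. The following is a minimal Python sketch under stated assumptions: build_payload is a hypothetical helper, not part of any official Marob client, and it assumes that sending both url and html is acceptable and that the limit of 5 applies to the list form of element_prompts, neither of which the page specifies.

```python
from typing import Any, Optional, Sequence

MAX_ELEMENT_PROMPTS = 5  # "Max 5" per the parameter list


def build_payload(
    url: Optional[str] = None,
    html: Optional[str] = None,
    element_prompts: Optional[Sequence[str]] = None,
    **options: Any,
) -> dict:
    """Assemble a JSON body for POST /v1/web/ai-scrape (hypothetical helper)."""
    # The parameter list says one of url or html is required.
    if not url and not html:
        raise ValueError("one of url or html is required")
    if element_prompts is not None and len(element_prompts) > MAX_ELEMENT_PROMPTS:
        raise ValueError(
            f"element_prompts accepts at most {MAX_ELEMENT_PROMPTS} entries"
        )

    payload: dict = {}
    if url:
        payload["url"] = url
    if html:
        payload["html"] = html
    if element_prompts is not None:
        payload["element_prompts"] = list(element_prompts)
    # Optional parameters (scroll, is_mobile, page_position, ...) pass through
    # unchanged; None values are dropped so server-side defaults apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

For example, build_payload(url="https://news.ycombinator.com/show", element_prompts=["post_title"], scroll=True) yields a body with the url, element_prompts, and scroll keys only.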

Request

curl https://api.marob.ai/v1/web/ai-scrape \
  -H "Authorization: Bearer $MAROB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com/show",
    "element_prompts": ["post_title", "post_points", "post_author"]
  }'
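
The same request can be issued from Python using only the standard library. This is a sketch rather than an official client: make_request and ai_scrape are hypothetical names, the 60-second timeout is an arbitrary assumption, and the endpoint URL and headers are copied from the curl command above.

```python
import json
import urllib.request

API_URL = "https://api.marob.ai/v1/web/ai-scrape"


def make_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ai_scrape(payload: dict, api_key: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = make_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, assuming MAROB_API_KEY is set in the environment:
#   import os
#   result = ai_scrape(
#       {
#           "url": "https://news.ycombinator.com/show",
#           "element_prompts": ["post_title", "post_points", "post_author"],
#       },
#       os.environ["MAROB_API_KEY"],
#   )
```

Separating make_request from ai_scrape keeps the request construction inspectable without any network access.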

Response

{
  "success": true,
  "_usage": {
    "input_tokens": 512,
    "output_tokens": 1840,
    "inference_time_tokens": 2600,
    "total_tokens": 4952
  },
  "log_id": "log_01JABC...",
  "data": [
    {
      "key": "post_title",
      "selector": "span.titleline > a",
      "results": [
        { "html": "...", "text": "Show HN: Marob AI", "attributes": [] }
      ]
    }
  ],
  "page_position": 1,
  "page_position_length": 1,
  "context": { "post_title": ["Show HN: Marob AI"] },
  "selectors": { "post_title": ["span.titleline > a"] },
  "link": [
    { "href": "https://marob.ai", "text": "Marob AI", "type": "a" }
  ],
  "meta": {
    "title": "Show | Hacker News",
    "description": "",
    "keywords": "",
    "og_image": ""
  }
}
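
To work with the extracted values, the data array can be flattened into a mapping from each key to the text of its results. The sketch below relies only on the fields visible in the sample response above; texts_by_key is a hypothetical helper, and treating a false success flag as a failure is an assumption, since the error response shape is not documented here.

```python
from typing import Dict, List


def texts_by_key(response: dict) -> Dict[str, List[str]]:
    """Map each extracted key to the list of its result texts."""
    # Assumption: a falsy "success" flag signals a failed scrape.
    if not response.get("success"):
        raise RuntimeError(f"scrape failed (log_id={response.get('log_id')})")
    return {
        entry["key"]: [result["text"] for result in entry.get("results", [])]
        for entry in response.get("data", [])
    }
```

Applied to the sample response, this returns the same mapping that appears under context, which suggests context may already serve this purpose when only the text is needed.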
