AI Scrape
Extract structured data from any web page without writing selectors.
POST /v1/web/ai-scrape
Point the scraper at a URL (or pass raw HTML), describe the fields you want
with element_prompts, and get back typed JSON. The endpoint handles proxy
rotation, scrolling, and JavaScript-rendered pages.
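As a rough Python counterpart to this flow, the sketch below prepares and sends the POST using only the standard library. The endpoint URL, the Bearer header, and the JSON content type are taken from this page's Request example; the function names `build_request` and `ai_scrape` and the 60-second timeout are illustrative assumptions, not part of the API.

```python
import json
import urllib.request
from typing import Any, Dict

# Endpoint as shown in this page's Request example.
API_URL = "https://api.marob.ai/v1/web/ai-scrape"


def build_request(payload: Dict[str, Any], api_key: str) -> urllib.request.Request:
    """Prepare the POST request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ai_scrape(payload: Dict[str, Any], api_key: str, timeout: float = 60.0) -> Dict[str, Any]:
    """Send the request and decode the JSON body. Performs network I/O.

    The timeout value is an arbitrary choice for this sketch.
    """
    request = build_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would pass a body such as `{"url": "https://news.ycombinator.com/show", "element_prompts": ["post_title"]}` together with the key stored in the `MAROB_API_KEY` environment variable. Keeping request construction separate from sending makes the headers and body easy to inspect without touching the network.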
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | conditional | URL to scrape. One of url or html is required. |
| html | string | conditional | Raw HTML to scrape. |
| element_prompts | string[] \| object | no | Natural-language names of fields to extract. Max 5. |
| selectors | string[] | no | Explicit CSS selectors, e.g. ["h2.title"]. |
| root_element_selector | string | no | CSS scope limiter. Defaults to "main". |
| scroll | boolean | no | Auto-scroll through the page before scraping. |
| page_position | number | no | Pagination index (1-based). |
| features | string[] | no | Any of "meta", "link". Defaults to both. |
| size_preset | string | no | "QVGA", "VGA", "HD", "FHD", "4K UHD", etc. |
| is_mobile | boolean | no | Use a mobile viewport. |
| width / height | number | no | Viewport dimensions in pixels. |
| force_rotate_proxy | boolean | no | Force a fresh proxy on every request. |
| goto_options | object | no | { timeout, wait_until }. |
| wait_for | object | no | { mode, value }. |
| http_headers | object | no | Custom HTTP headers. |
| cookies | object[] | no | Cookies to send with the request. |
| byo_proxy | object | no | Use your own proxy config. |
| advance_config | object | no | Capture console, network, cookies. |
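The two constraints stated in the table (one of url or html is required, and element_prompts takes at most 5 entries) can be checked client-side before a request is sent. The following is a minimal sketch under assumptions: `build_payload` is a hypothetical helper, it covers only the list form of element_prompts because the object form is not described here, and it does not decide what should happen when both url and html are supplied, since the table does not say.

```python
from typing import Any, Dict, List, Optional

MAX_ELEMENT_PROMPTS = 5  # "Max 5" per the parameter table


def build_payload(
    url: Optional[str] = None,
    html: Optional[str] = None,
    element_prompts: Optional[List[str]] = None,
    **options: Any,
) -> Dict[str, Any]:
    """Assemble a request body, enforcing the constraints listed in the table."""
    if not url and not html:
        raise ValueError("one of url or html is required")
    if element_prompts is not None and len(element_prompts) > MAX_ELEMENT_PROMPTS:
        raise ValueError(
            f"element_prompts accepts at most {MAX_ELEMENT_PROMPTS} entries"
        )

    payload: Dict[str, Any] = {}
    if url:
        payload["url"] = url
    if html:
        payload["html"] = html
    if element_prompts:
        payload["element_prompts"] = element_prompts
    # Pass any other documented option (scroll, is_mobile, ...) through
    # unchanged, skipping options left as None.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

For example, `build_payload(url="https://news.ycombinator.com/show", element_prompts=["post_title"], scroll=True)` yields a body with exactly those three keys, while calling it with neither url nor html raises `ValueError`.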
Request
curl https://api.marob.ai/v1/web/ai-scrape \
  -H "Authorization: Bearer $MAROB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com/show",
    "element_prompts": ["post_title", "post_points", "post_author"]
  }'
Response
{
  "success": true,
  "_usage": {
    "input_tokens": 512,
    "output_tokens": 1840,
    "inference_time_tokens": 2600,
    "total_tokens": 4952
  },
  "log_id": "log_01JABC...",
  "data": [
    {
      "key": "post_title",
      "selector": "span.titleline > a",
      "results": [
        { "html": "...", "text": "Show HN: Marob AI", "attributes": [] }
      ]
    }
  ],
  "page_position": 1,
  "page_position_length": 1,
  "context": { "post_title": ["Show HN: Marob AI"] },
  "selectors": { "post_title": ["span.titleline > a"] },
  "link": [
    { "href": "https://marob.ai", "text": "Marob AI", "type": "a" }
  ],
  "meta": {
    "title": "Show | Hacker News",
    "description": "",
    "keywords": "",
    "og_image": ""
  }
}
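To read the extracted values out of a response shaped like the sample above, one option is to walk the data array and collect the text of each match per key. This is a sketch only: `texts_by_key` is a hypothetical helper, and it assumes that an unsuccessful call reports `"success": false`, which this page does not confirm.

```python
from typing import Any, Dict, List


def texts_by_key(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each extracted field name to the text of its matches."""
    # Assumption: a failed scrape is signalled by a falsy "success" field.
    if not response.get("success"):
        raise RuntimeError("scrape did not succeed")
    return {
        field["key"]: [match["text"] for match in field.get("results", [])]
        for field in response.get("data", [])
    }


# Trimmed copy of the sample response shown above.
sample = {
    "success": True,
    "data": [
        {
            "key": "post_title",
            "selector": "span.titleline > a",
            "results": [
                {"html": "...", "text": "Show HN: Marob AI", "attributes": []}
            ],
        }
    ],
    "context": {"post_title": ["Show HN: Marob AI"]},
}

titles = texts_by_key(sample)
```

For the sample, the result matches the response's own context field, so context may already be enough when only the text is needed. The data array additionally carries each match's html and attributes, along with the selector that produced it.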