
API to Search, Extract, Structure Web Data

Get clean data for your AI from any website and automate
your web workflows


One unified API for your AI

< Your AI Agent >
  • Start "research YC" workflow
  • Automate brand protection
  • Research donors in NYC
  • Find local businesses
  • Analyze brand visibility

[-- Data Layer --]
  • Research agents
  • Parsers - structured data
  • Data router
  • Automation engine
  • Click, fill forms
  • Distributed infra
  • Map/Crawl
  • VM sandboxes
  • Batches API

[-- Output --]

{
  "id": "request_56is5c9gyw",
  "created": 1317322740,
  "result": {
    "markdown_content": "# Ex",
    "json_content": {},
    "html_content": "<DOC>"
  }
}

Trusted by the best startups in the world

...and many more


Developer-centric
import requests
import json

API_URL = 'https://api.olostep.com/v1/answers'
API_KEY = '<your_token>'

headers = {
    'Authorization': f'Bearer {API_KEY}',
    'Content-Type': 'application/json'
}

data = {
    "task": "What is the latest book by J.K. Rowling?",
    "json": {
        "book_title": "",
        "author": "",
        "release_date": ""
    }
}

response = requests.post(API_URL, headers=headers, json=data)
result = response.json()

print(json.dumps(result, indent=4))
// Using native fetch API (Node.js v18+)
const API_URL = 'https://api.olostep.com/v1/answers';
const API_KEY = '<your_token>';

fetch(API_URL, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "task": "What is the latest book by J.K. Rowling?",
    "json": {
        "book_title": "",
        "author": "",
        "release_date": ""
    }
  })
})
  .then(response => response.json())
  .then(result => {
    console.log(JSON.stringify(result, null, 4));
  })
  .catch(error => console.error('Error:', error));
import requests

API_URL = 'https://api.olostep.com/v1/crawls'
API_KEY = '<token>'

headers = {'Authorization': f'Bearer {API_KEY}'}
data = {
    "start_url": "https://docs.stripe.com/api",
    "include_urls": ["/**"],
    "max_pages": 10
}

response = requests.post(API_URL, headers=headers, json=data)
result = response.json()

print(f"Crawl ID: {result['id']}")
print(f"URL: {result['start_url']}")
// Using native fetch API (Node.js v18+)
const API_URL = 'https://api.olostep.com/v1/crawls';
const API_KEY = '<token>';

fetch(API_URL, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "start_url": "https://docs.stripe.com/api",
    "include_urls": ["/**"],
    "max_pages": 10
  })
})
.then(response => response.json())
.then(result => {
  console.log(`Crawl ID: ${result.id}`);
  console.log(`URL: ${result.start_url}`);
})
.catch(error => console.error('Error:', error));
import requests

API_URL = 'https://api.olostep.com/v1/scrapes'
API_KEY = '<your_token>'

headers = {'Authorization': f'Bearer {API_KEY}'}
data = {"url_to_scrape": "https://github.com"}

response = requests.post(API_URL, headers=headers, json=data)
result = response.json()

print(f"Scrape ID: {result['id']}")
print(f"URL: {result['url_to_scrape']}")
// Using native fetch API (Node.js v18+)
const API_URL = 'https://api.olostep.com/v1/scrapes';
const API_KEY = '<your_token>';

fetch(API_URL, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "url_to_scrape": "https://github.com"
  })
})
.then(response => response.json())
.then(result => {
  console.log(`Scrape ID: ${result.id}`);
  console.log(`URL: ${result.url_to_scrape}`);
})
.catch(error => console.error('Error:', error));
import requests

API_URL = 'https://api.olostep.com/v1/agents' # endpoint available to select customers
API_KEY = '<token>'

headers = {'Authorization': f'Bearer {API_KEY}', 'Content-Type': 'application/json'}
data = {
    "prompt": '''
      Search every portfolio company from every fund from
      (https://www.vcsheet.com/funds) and return the results into a google sheet
      with the following columns (Fund Name, Fund Website
      URL, Fund LinkedIn URL, Portfolio Company Name, Portfolio
      Company URL, Portfolio Company LinkedIn URL). Run every week
      on Monday at 9:00 AM. Send an email to steve@example.com when
      new portfolio companies are added to any of these funds.
    ''',
    "model": "gpt-4.1"
}

response = requests.post(API_URL, headers=headers, json=data)
result = response.json()

print(f"Agent ID: {result['id']}")
print(f"Status: {result['status']}")
# You can then schedule this agent
// Using native fetch API (Node.js v18+)
const API_URL = 'https://api.olostep.com/v1/agents'; // endpoint available to select customers
const API_KEY = '<token>';

fetch(API_URL, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "prompt": `
      Search every portfolio company from every fund from
      (https://www.vcsheet.com/funds) and return the results into a google sheet
      with the following columns (Fund Name, Fund Website
      URL, Fund LinkedIn URL, Portfolio Company Name, Portfolio
      Company URL, Portfolio Company LinkedIn URL). Run every week
      on Monday at 9:00 AM. Send an email to steve@example.com when
      new portfolio companies are added to any of these funds.
    `,
    "model": "gpt-4.1"
  })
})
  .then(response => response.json())
  .then(result => {
    console.log(`Agent ID: ${result.id}`);
    console.log(`Status: ${result.status}`);
    // You can then schedule this agent
  })
  .catch(error => console.error('Error:', error));

Get the data in the format you want

Get Markdown, HTML, PDF or Structured JSON

Pass the URL to the API and retrieve the HTML, Markdown, PDF, or plain text of the website. You can also specify the schema to only get the structured, clean JSON data you want
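As a rough sketch, a scrape request can ask for a specific output format alongside a structured-JSON schema. The `formats` field below is an assumption for illustration; check the Olostep docs for the exact parameter names.

```python
# Hypothetical sketch: ask /v1/scrapes for Markdown plus structured JSON.
# The "formats" field name is an assumption, not the documented schema.

API_URL = "https://api.olostep.com/v1/scrapes"

def build_scrape_request(url: str) -> dict:
    """Build a request body that asks for Markdown and JSON output."""
    return {
        "url_to_scrape": url,
        "formats": ["markdown", "json"],  # assumed parameter name
    }

def run_scrape(api_key: str, url: str) -> dict:
    import requests  # third-party: pip install requests
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_scrape_request(url),
    )
    resp.raise_for_status()
    return resp.json()
```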

JS execution + residential IPs

Web-pages rendered in a browser

Every request gets full JS rendering, premium residential IP addresses, and proxy rotation to avoid bot detection

Crawl

Get all the data starting from a single URL

Multi-depth crawling lets you get clean markdown from all the subpages of a website. It also works without a sitemap (useful for doc websites, for example).

Get clean data

We handle the heavy lifting

Browser infra, rate limits and JS-rendered content

Crawling

Get the data from all subpages of a website. No sitemap required. This is useful if you are building an AI agent that needs specific context from a documentation website

Batches

You can submit from 100 to 100k URLs in a batch and have the content (markdown, HTML, raw PDFs or structured JSON) back in 5-7 mins. Useful for deep research agents, monitoring social media, and for aggregating data at scale
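A batch submission can be sketched roughly as below. The `/v1/batches` path and the `items` payload shape are assumptions based on the Batches API described here; consult the official docs for the exact field names.

```python
# Hedged sketch of submitting a batch of URLs. Endpoint path and
# payload shape are assumptions, not the documented Olostep schema.

API_URL = "https://api.olostep.com/v1/batches"

def build_batch(urls: list[str]) -> dict:
    """Package 100 to 100k URLs into one batch request body."""
    return {"items": [{"url": u} for u in urls]}  # assumed shape

def submit_batch(api_key: str, urls: list[str]) -> dict:
    import requests  # third-party: pip install requests
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_batch(urls),
    )
    resp.raise_for_status()
    return resp.json()
```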

Reliable

Get the content you want when you want it. All requests are done with a premium proxy

PDF parsing

Olostep can parse and output content from web-hosted PDFs, DOCX files, and more.

Actions

Click, type, fill forms, scroll, wait and more dynamically on websites
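An actions list for a dynamic scrape might look like the sketch below. The action names and the parameter they attach to are illustrative assumptions, not the documented Olostep schema.

```python
# Illustrative only: a possible "actions" list for a dynamic scrape.
# Action types and field names here are assumptions for illustration.

def build_actions_request(url: str) -> dict:
    """Attach a sequence of page interactions to a scrape request."""
    return {
        "url_to_scrape": url,
        "actions": [  # assumed parameter name
            {"type": "click", "selector": "#load-more"},
            {"type": "fill", "selector": "input[name=q]", "value": "docs"},
            {"type": "scroll", "direction": "down"},
            {"type": "wait", "milliseconds": 1000},
        ],
    }
```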

Most cost-effective API on the market

Pricing that Makes Sense

We want you to be able to build a business on top of Olostep.
Start for free. Scale with no worries.

Free: $0 (cost per 500 requests: $0)
• No credit card required
• 500 successful requests
• All requests are JS rendered + utilizing residential IP addresses
• Low rate limits

Starter: $9 per month (cost per 1K requests: $1.80)
• 5,000 successful requests/month
• Everything in Free Plan
• 150 concurrent requests

Standard: $99 USD per month (cost per 1K requests: $0.495)
• 200K successful requests/month
• Everything in Starter Plan
• 500 concurrent requests

Scale: $399 USD per month (cost per 1K requests: $0.399)
• 1 million successful requests/month
• Everything in Standard Plan
• AI-powered Browser Automations
Free: $0 per month
• 3,000 successful scrapes
• All requests are JS rendered + utilizing residential IP addresses

Starter: $29 per month
• 20K successful scrapes
• All requests are JS rendered + utilizing residential IP addresses

Standard: $99 USD per month
• 200K successful scrapes
• All requests are JS rendered + utilizing residential IP addresses

Scale: $399 USD per month
• 1 million successful scrapes
• All requests are JS rendered + utilizing residential IP addresses

Top-ups

Need flexibility or have spiky usage? You can buy credit packs, valid for 6 months.

Credit packs
• $20 for 10k credits
• $200 for 250k credits
• $1000 for 2M credits

Enterprise

Hundreds of millions of credits with enterprise-grade reliability. We offer custom discounts.
Contact Sales

Data tailored to your industry

Access clean, structured data that matters most to you, when it matters the most. Power search, deep research, AI Agents and your applications.

Deep Search

Access custom, hyper-specialized B2B indexes for your industry to search and extract comprehensive data beyond what general web indexes cover

Recruiting

Identify, research, and validate candidates faster with intelligence and data aggregated from top-quality profiles and specialist web sources.

Power AI applications

Get clean, structured data from any website as markdown, html, screenshot, etc. to power your AI application and workflows

Monitor the Web

Monitor any webpage for DOM changes, stock availability, price changes, job openings or fresh content. Run automatically on a schedule and get alerted

Automate data pipelines

Automate complex data pipelines with the /agents endpoint through natural language prompts. You can also pass your own internal knowledge as context

Deep research agents

Enable your agent to conduct deep research on large Web datasets.

Spreadsheet enrichment

Get real-time web data to enrich your spreadsheets and analyze data.

Lead generation

Research, enrich, validate and analyze leads. Enhance your sales data

Vertical AI search

Build industry specific search engines to turn data into an actionable resource.

AI Brand visibility

Monitor brands to help improve their AI visibility (Answer Engine Optimization).

Agentic Web automations

Enable AI Agents to automate tasks on the Web: fill forms, click on buttons, etc.

Customers

Trusted by world-class teams

Discover why the best teams in the world choose Olostep.
Read more customer stories

Michelle Julia
Co-founder & CEO Aurium

Olostep is the best!!! We automated entire data pipelines with just a prompt

Richard He
Co-founder & CEO Openmart

Olostep has become the default Web Layer infrastructure for our company

Max Brodeur-Urbas
Co-founder & CEO Gumloop

Olostep works like a charm! And your customer service is exceptional

Rob Hayes
Co-founder Merchkit

Olostep lets us turn any website into an API. Great product, great people

Brandon Cohen
Co-founder & CTO CivilGrid

I highly recommend Olostep, great product!

Co-founder & CEO Gedd.it

We verify coupon codes at scale. Love Olostep. It works on any e-commerce

Trevor West
Co-founder & CEO Podqi

Olostep is the best API to search, extract, and structure data from the Web. Happy to be customers

Rida Naveed
Co-founder Zecento

We use /batches combined with parsers and it's magical how we can get structured data deterministically at large scale

Kieran V.
Growth PlotsEvents

Olostep allowed us to search and structure events data across the Web

Paul Mit
Founder Foundbase

Reliable and cost-effective API for working with data. Congrats on the cool product

Questions?

Frequently asked questions

Have other questions? Get in touch via info@olostep.com

What is Olostep?

Olostep is a Web Data API that helps AI teams search, crawl, scrape and structure web data through a single, developer-friendly platform. Built for modern AI workflows, it makes it easier to turn public web content into clean, structured outputs for research, enrichment and automation.

From one-off extractions to high-volume data pipelines, Olostep gives teams a reliable and scalable way to collect web data without building and maintaining complex scraping infrastructure themselves.

Olostep also includes an Agent that lets users automate research workflows and generate structured outputs using natural language prompts, making it easier to move from manual research to scalable automation.

What is a Web Data API?

A Web Data API allows developers to extract, crawl and structure data from websites at scale. It handles rendering, anti-bot protection and parsing, returning clean outputs such as JSON or HTML for use in applications, analytics or AI workflows.

How does a Web Data API work?

A Web Data API processes requests by rendering web pages, handling anti-bot protections, extracting structured data and returning it in formats such as JSON or Markdown. This removes the need to manage scraping infrastructure manually.

What is the difference between crawling and scraping?

Crawling refers to discovering and navigating multiple pages across a website, while scraping focuses on extracting data from a specific page. A Web Data API typically supports both processes in a unified workflow.

What data formats can be returned?

Most Web Data APIs return data in structured formats such as JSON, as well as HTML, Markdown or raw content depending on the use case. Structured outputs are commonly used for automation and AI workflows.

Is web scraping legal?

Web scraping is legal in many cases, but depends on how the data is accessed and used. It is important to follow website terms of service, data privacy regulations and applicable laws when extracting web data.

What is counted as a request?

One request equals one webpage or one PDF processed. We do not charge separately for bandwidth, proxies, or data usage. All infrastructure costs are included in the price per request.

Does Olostep charge for failed requests?

No, Olostep does not charge for failed requests. You are only billed for successful requests, ensuring predictable and fair usage-based pricing.

For endpoints that involve LLM processing (such as the Answers API), any underlying model costs may still apply. However, Olostep itself only charges for requests that are successfully completed.

Which websites can Olostep access/interact?

Olostep can access and interact with most publicly available websites, including those that require JavaScript rendering.

If your use case involves authentication, cookies or logged-in sessions, you can get in touch at info@olostep.com to explore supported options.

Can Olostep support my high-volume requests?

Yes, Olostep is built to handle high-volume data extraction at scale, supporting up to billions of requests per month. With features like batch processing, distributed infrastructure and scalable workflows, it is designed for both growing teams and enterprise-level use cases.

How can I pay?

You can pay using Stripe payment links.

Why should I use Olostep?

Olostep is reliable (99.5% uptime), cost-effective (up to 70% cheaper), scalable, and flexible to work with your existing workflows and backend. It is one of the few platforms where you can create custom parsers to return deterministic results at scale in a cost-effective way. You can request features you need, and our team will work on adding them. You can also test Olostep for free to see if it fits your use case. Get your free API keys here: https://www.olostep.com/auth/

Can I switch plans after signing up?

Yes, you can switch plans at any time. Plans are pro-rated, meaning any unused value from your current plan is carried over to your new plan.

This ensures you don't pay twice for usage you've already covered, giving you flexibility as your needs grow.

Does Olostep offer a free trial?

Yes, Olostep includes a free plan with 500 requests to help you test the API before upgrading. Paid plans start from $9/month and include 5,000 credits per month.

This gives teams a low-risk way to evaluate Olostep's reliability, scalability and cost-effectiveness before moving to higher-volume usage.

Can I ask for a refund if I don't use it?

Yes. If you're not satisfied with the Olostep API or it doesn't end up being useful for your use case, you can email info@olostep.com to request a refund.

If you cancel after a period of non-use, Olostep can also refund the unused portion of your plan where applicable.

How does it return the results?

Olostep returns a request ID for future retrieval, along with the page content in Markdown and HTML. Depending on the endpoint and configuration, it can also return structured JSON through parsers or LLM-based extraction.

For the /answers endpoint, Olostep returns the generated answer, a JSON object based on your defined schema, and the sources used to produce the result.
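Handling that response can be sketched as below. The field names (`answer`, `sources`) follow the description above but are assumptions; verify them against the actual API response.

```python
# Sketch of reading an /answers response. Field names are assumed
# from the description above, not confirmed against the API reference.

def summarize_answer(payload: dict) -> str:
    """Return the generated answer with a count of its cited sources."""
    answer = payload.get("answer", "")
    sources = payload.get("sources", [])
    return f"{answer} (from {len(sources)} sources)"
```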

Can Olostep automate my data pipelines?

Yes, Olostep is designed to support automated data pipelines and research workflows on the web. With capabilities for searching, crawling, scraping, structuring data and running repeatable workflows, it can support a wide range of business and AI use cases.

If you have a specific workflow in mind, contact the team at info@olostep.com or via the Contact Sales page to discuss the best setup for your use case.

Who should use Olostep?

Olostep is built for AI startups, developers, AI engineers, data scientists and research teams that rely on web data to power products, enrich datasets and automate workflows.

It is especially useful for teams that need to search, crawl, scrape and structure web data for use cases such as market research, website monitoring, data enrichment, historical web analysis, LLM fine-tuning and grounding AI systems with real-world data.

By returning clean, structured outputs through a single API, Olostep makes it easier to plug web data into existing backends, pipelines and AI applications.

Can I extract data with a prompt?

Yes, Olostep lets you extract data using natural language prompts. If you already know the exact page you want to process, you can use the /scrapes endpoint with LLM extraction to describe the data you want returned.

For high-volume or deterministic extraction, Olostep's parsers are the better option, as they return structured JSON more consistently at scale.
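Attaching a parser to a scrape might look like the sketch below. The `parser` field and the parser id shown are hypothetical examples; parsers are defined per site, so see the Olostep docs for the real interface.

```python
# Hypothetical sketch: pair a scrape with a pre-built parser for
# deterministic JSON output. Field name and parser id are assumptions.

def build_parser_request(url: str, parser_id: str) -> dict:
    """Build a scrape body that routes the page through a named parser."""
    return {
        "url_to_scrape": url,
        "parser": {"id": parser_id},  # assumed field name
    }
```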

For more advanced workflows, such as searching for data, navigating across pages, handling pagination or validating results, the /agents endpoint can automatically carry out multi-step extraction based on your prompt.