# Web Browsing & Scraping Tools

Web Browsing and Scraping tools let AI Agents gather information from online sources: search engines, social media platforms, websites, and multimedia content. They cover both broad web searches and targeted data extraction.

## Available Tools

| Tool Name                  | Platform/Category | Description                                          |
| -------------------------- | ----------------- | ---------------------------------------------------- |
| Apollo Company Enrichment  | Business Data     | Search for company details in the Apollo database    |
| DuckDuckGo Search          | Search Engine     | Basic web search using DuckDuckGo                    |
| Tavily Search              | Search Engine     | Advanced semantic search with AI-optimized results   |
| LinkedIn Profile Scraper   | Social Media      | Extracts detailed information from LinkedIn profiles |
| Twitter Profile Scraper    | Social Media      | Collects data from Twitter/X user profiles           |
| Twitter Search Scraper     | Social Media      | Gathers tweets based on search terms                 |
| Website Scraper            | Web               | Extracts text content from web pages                 |
| Video Transcript Extractor | Multimedia        | Retrieves transcripts from video/audio content       |
| Wikipedia Search           | Knowledge Base    | Searches and retrieves Wikipedia article content     |

## Tool Details

### Apollo Company Enrichment

**Description:**
Enriches company data by retrieving detailed information from Apollo's comprehensive business database.

**System Tool ID:** `apollo_organization_enrichment`

**Arguments:**

| Name   | Required | Type   | Description                  |
| ------ | -------- | ------ | ---------------------------- |
| domain | Required | string | Company domain to search for |
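
The `domain` argument is described as a company domain, while inputs often arrive as full URLs. The sketch below shows one way to reduce a URL to a bare domain before building the arguments. It assumes the tool expects a hostname such as `example.com` without a `www.` prefix, which the table does not state; `to_domain` is a hypothetical helper, not part of the platform.

```python
from urllib.parse import urlparse


def to_domain(value: str) -> str:
    """Reduce a URL or hostname to a bare domain (hypothetical helper)."""
    value = value.strip()
    # urlparse only fills in the host when a scheme or a leading "//" is present.
    parsed = urlparse(value if "//" in value else "//" + value)
    host = (parsed.hostname or "").lower()
    # Assumption: the tool wants the domain without a leading "www.".
    if host.startswith("www."):
        host = host[4:]
    if not host:
        raise ValueError(f"no domain found in {value!r}")
    return host


# Arguments for apollo_organization_enrichment, built from a full URL.
arguments = {"domain": to_domain("https://www.example.com/about")}
```

Here `arguments` holds `{"domain": "example.com"}`. Raising on an empty host keeps a malformed input from reaching the tool as a blank `domain`.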

### DuckDuckGo Search

**Description:**
A basic web search tool that queries the DuckDuckGo search engine to find relevant web pages and information.

**System Tool ID:** `duckduckgo-search`

**Arguments:**

| Name       | Required              | Type   | Description                                |
| ---------- | --------------------- | ------ | ------------------------------------------ |
| maxResults | Optional (default: 5) | number | Maximum number of search results to return |

### Tavily Search

**Description:**
An advanced search tool that uses AI to provide more relevant and contextual search results.

**System Tool ID:** `tavily-search`

**Arguments:**

| Name       | Required              | Type   | Description                                |
| ---------- | --------------------- | ------ | ------------------------------------------ |
| maxResults | Optional (default: 5) | number | Maximum number of search results to return |

### LinkedIn Profile Scraper

**Description:**
Extracts comprehensive information from LinkedIn profiles including work experience, education, skills, and other public profile data.

**System Tool ID:** `linkedin_scrape_profiles_by_urls`

**Arguments:**

| Name        | Required | Type      | Description                              |
| ----------- | -------- | --------- | ---------------------------------------- |
| profileUrls | Required | string\[] | Array of LinkedIn profile URLs to scrape |
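
Because `profileUrls` takes an array, it can help to filter and deduplicate the list before passing it on. The following is a minimal sketch, assuming that only `linkedin.com/in/...` URLs count as profiles and that normalizing them to `https://www.linkedin.com` is acceptable; neither is stated by the tool. `clean_profile_urls` is a hypothetical helper.

```python
from urllib.parse import urlparse


def clean_profile_urls(urls):
    """Keep LinkedIn profile URLs only, normalized and deduplicated in order."""
    seen = set()
    cleaned = []
    for url in urls:
        parsed = urlparse(url.strip())
        host = (parsed.hostname or "").lower()
        path = parsed.path.rstrip("/")
        is_linkedin = host == "linkedin.com" or host.endswith(".linkedin.com")
        # Assumption: personal profiles live under the /in/ path.
        if not (is_linkedin and path.startswith("/in/")):
            continue
        normalized = f"https://www.linkedin.com{path}"
        if normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned


# Arguments for linkedin_scrape_profiles_by_urls.
arguments = {
    "profileUrls": clean_profile_urls([
        "https://www.linkedin.com/in/jane-doe/",
        "https://linkedin.com/in/jane-doe",
        "https://www.linkedin.com/company/acme",
    ])
}
```

The two spellings of the same profile collapse into one entry, and the company page is dropped.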

### Twitter Profile Scraper

**Description:**
Collects data from Twitter/X user profiles including tweets, profile information, and public metrics.

**System Tool ID:** `twitter_scrape_by_handles`

**Arguments:**

| Name           | Required                  | Type                                  | Description                        |
| -------------- | ------------------------- | ------------------------------------- | ---------------------------------- |
| twitterHandles | Required                  | string\[]                             | Array of Twitter handles to scrape |
| start          | Optional (default: '')    | string                                | Start date in YYYY-MM-DD format    |
| end            | Optional (default: '')    | string                                | End date in YYYY-MM-DD format      |
| sort           | Optional (default: 'Top') | enum: \['Latest', 'Top']              | Sort order of tweets               |
| maxItems       | Optional (default: '10')  | enum: \['5', '10', '25', '50', '100'] | Maximum number of items to return  |
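
The date format and enum constraints in the table above can be checked before the tool is called. The sketch below assumes the arguments are passed as a plain key-value object and that `start` should not be later than `end`; `build_twitter_profile_args` and `check_date` are hypothetical helpers, not part of the platform. Note that the `maxItems` values are listed as strings, so the sketch rejects the number `10`.

```python
import re
from datetime import date

SORT_VALUES = {"Latest", "Top"}
MAX_ITEMS_VALUES = {"5", "10", "25", "50", "100"}  # strings, as listed in the table
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_date(name, value):
    """Accept '' (the documented default) or a real date in YYYY-MM-DD form."""
    if value == "":
        return
    if not DATE_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must use YYYY-MM-DD, got {value!r}")
    date.fromisoformat(value)  # raises ValueError for dates such as 2024-02-30


def build_twitter_profile_args(handles, start="", end="", sort="Top", max_items="10"):
    """Validate inputs and return arguments for twitter_scrape_by_handles."""
    if not handles:
        raise ValueError("twitterHandles needs at least one handle")
    check_date("start", start)
    check_date("end", end)
    # Assumption: a start date after the end date is a caller mistake.
    if start and end and start > end:  # ISO dates sort correctly as strings
        raise ValueError("start must not be later than end")
    if sort not in SORT_VALUES:
        raise ValueError(f"sort must be one of {sorted(SORT_VALUES)}")
    if max_items not in MAX_ITEMS_VALUES:
        raise ValueError(f"maxItems must be one of {sorted(MAX_ITEMS_VALUES, key=int)}")
    return {
        "twitterHandles": list(handles),
        "start": start,
        "end": end,
        "sort": sort,
        "maxItems": max_items,
    }


arguments = build_twitter_profile_args(
    ["example_handle"], start="2024-01-01", end="2024-01-31", sort="Latest", max_items="25"
)
```

Calling the helper with only a list of handles returns the documented defaults: empty `start` and `end`, `sort` of `'Top'`, and `maxItems` of `'10'`.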

### Twitter Search Scraper

**Description:**
Searches and extracts tweets based on specific search terms or keywords.

**System Tool ID:** `twitter_scrape_by_search_terms`

**Arguments:**

| Name        | Required                  | Type                                  | Description                       |
| ----------- | ------------------------- | ------------------------------------- | --------------------------------- |
| searchTerms | Required                  | string\[]                             | Array of search terms             |
| start       | Optional (default: '')    | string                                | Start date in YYYY-MM-DD format   |
| end         | Optional (default: '')    | string                                | End date in YYYY-MM-DD format     |
| sort        | Optional (default: 'Top') | enum: \['Latest', 'Top']              | Sort order of tweets              |
| maxItems    | Optional (default: '10')  | enum: \['5', '10', '25', '50', '100'] | Maximum number of items to return |

### Website Scraper

**Description:**
Extracts visible text content from any webpage URL, making it ideal for gathering information from articles, blog posts, and other web content.

**System Tool ID:** `scrape_web_text`

**Arguments:**

| Name | Required | Type   | Description                  |
| ---- | -------- | ------ | ---------------------------- |
| url  | Required | string | URL of the webpage to scrape |

### Video Transcript Extractor

**Description:**
Retrieves transcripts from various online video and audio content across different platforms.

**System Tool ID:** `video_transcript`

**Arguments:**

| Name       | Required | Type   | Description                                         |
| ---------- | -------- | ------ | --------------------------------------------------- |
| video\_url | Required | string | URL of the video/audio content                      |
| language   | Optional | string | Language code for the transcript (e.g., 'en', 'ru') |

### Wikipedia Search

**Description:**
Searches Wikipedia articles and retrieves relevant content and information.

**System Tool ID:** `wikipedia-query-run`

**Arguments:**

| Name                | Required                 | Type   | Description                         |
| ------------------- | ------------------------ | ------ | ----------------------------------- |
| topKResults         | Optional (default: 3)    | number | Number of top results to return     |
| maxDocContentLength | Optional (default: 4000) | number | Maximum content length per document |
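
To illustrate how the two limits plausibly interact, the sketch below applies them to an in-memory list of article texts. It does not call the tool. It assumes that `topKResults` caps the number of articles and that `maxDocContentLength` is measured in characters, which the table does not specify; `apply_limits` is a hypothetical function.

```python
def apply_limits(documents, top_k_results=3, max_doc_content_length=4000):
    """Keep the first top_k_results documents and truncate each one."""
    return [doc[:max_doc_content_length] for doc in documents[:top_k_results]]


# Four stand-in articles of different lengths.
docs = ["a" * 5000, "b" * 100, "c" * 10, "d" * 10]
limited = apply_limits(docs)

# Upper bound on returned text with the default settings.
worst_case_chars = 3 * 4000
```

Under these assumptions the defaults bound the returned text at 12,000 characters, which is a useful figure when budgeting an agent's context window.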

<Tip>
  When using search and scraping tools, be mindful of rate limits and usage
  quotas that may apply to specific services.
</Tip>
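
One common way to stay within rate limits is to retry a rejected call with exponential backoff. The sketch below is a generic pattern rather than platform behavior: `RateLimitError` is a hypothetical exception standing in for whatever error a given service returns, and `call` is any zero-argument function that invokes a tool.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error for a call rejected because of rate limiting."""


def call_with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Delays grow as 1s, 2s, 4s, ... plus up to 25% random jitter
            # so that parallel agents do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
```

The `sleep` parameter exists so the waiting behavior can be replaced in tests. Retrying does not help once a usage quota is exhausted, so `max_attempts` is kept small.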
