docs/self-hosting/advanced/online-search.mdx
Lines changed: 123 additions & 5 deletions
@@ -14,16 +14,134 @@ tags:
# Configuring Online Search Functionality

-LobeChat supports configuring online search functionality for AI, allowing it to access the latest web information and provide more accurate and timely responses. The online search feature is based on the [SearXNG](https://github.com/searxng/searxng) search engine, which is a privacy-respecting metasearch engine that aggregates results from multiple search engines.
+LobeChat supports configuring **web search functionality** for AI, enabling it to retrieve real-time information from the internet to provide more accurate and up-to-date responses. Web search supports multiple search engine providers, including [SearXNG](https://github.com/searxng/searxng), [Search1API](https://www.search1api.com), [Google](https://programmablesearchengine.google.com), and [Brave](https://brave.com/search/api), among others.

-<Callout type={'info'}>
-  SearXNG is an open-source metasearch engine that can be self-hosted or accessed via public
-  instances. By configuring SearXNG, LobeChat enables AI to retrieve the latest internet
-  information, allowing it to answer time-sensitive questions and provide up-to-date news.
+<Callout type="info">
+  Web search allows AI to access time-sensitive content, such as the latest news, technology trends, or product information. You can deploy the open-source SearXNG yourself, or choose to integrate mainstream search services like Search1API, Google, Brave, etc., combining them freely based on your use case.
</Callout>

+By setting the search service environment variable `SEARCH_PROVIDERS` and the corresponding API Keys, LobeChat will query multiple sources and return the results. You can also configure crawler service environment variables such as `CRAWLER_IMPLS` (e.g., `browserless`, `firecrawl`, `tavily`, etc.) to extract webpage content, enhancing the capability of search + reading.
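
As a rough illustration of the paragraph above, a minimal `.env` fragment might look like the following. This is a sketch under stated assumptions, not a verified configuration: it assumes both variables accept a comma-separated list of identifiers, and the SearXNG URL is a placeholder.

```env
# Assumption: SEARCH_PROVIDERS accepts one or more provider identifiers, comma-separated.
SEARCH_PROVIDERS=searxng
# Placeholder URL; replace it with your own SearXNG instance.
SEARXNG_URL=https://searxng.example.com
# Assumption: CRAWLER_IMPLS is also a comma-separated list.
# `native` is documented without an API key.
CRAWLER_IMPLS=native
```

Starting with a single provider and the built-in crawler keeps the number of required keys at zero; further providers can be added once this baseline works.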
+
# Core Environment Variables

+## `CRAWLER_IMPLS`
+
+Configure available web crawlers for structured extraction of webpage content.
+| `browserless` | Headless browser crawler based on [Browserless](https://www.browserless.io/), suitable for rendering complex pages. | `BROWSERLESS_TOKEN` |
+| `exa` | Crawler capabilities provided by [Exa](https://exa.ai/), API required. | `EXA_API_KEY` |
+| `firecrawl` | [Firecrawl](https://firecrawl.dev/) headless browser API, ideal for modern websites. | `FIRECRAWL_API_KEY` |
+| `jina` | Crawler service from [Jina AI](https://jina.ai/), supports fast content summarization. | `JINA_READER_API_KEY` |
+| `native` | Built-in general-purpose crawler for standard web structures. | |
+| `search1api` | Page crawling capabilities from [Search1API](https://www.search1api.com), great for structured content extraction. | `SEARCH1API_CRAWL_API_KEY` |
+| `tavily` | Web scraping and summarization API from [Tavily](https://www.tavily.com/). | `TAVILY_API_KEY` |
+
+> 💡 Setting multiple crawlers increases success rate; the system will try different ones based on priority.
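
The tip above could be applied with a fragment like the one below. It is only a sketch: the comma-separated syntax and the idea that list order determines priority are assumptions not confirmed by the visible part of this diff, and the key value is a placeholder.

```env
# Assumption: values are comma-separated and tried in the order listed.
CRAWLER_IMPLS=native,jina
# Placeholder key for the `jina` crawler; `native` needs no key.
JINA_READER_API_KEY=your-jina-api-key
```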
+
+---
+
+## `SEARCH_PROVIDERS`
+
+Configure which search engine providers to use for web search.
+| `jina` | Semantic search provided by [Jina AI](https://jina.ai/). | `JINA_READER_API_KEY` |
+| `kagi` | Premium search API by [Kagi](https://kagi.com/), requires a subscription key. | `KAGI_API_KEY` |
+| `search1api` | Aggregated search capabilities from [Search1API](https://www.search1api.com). | `SEARCH1API_CRAWL_API_KEY` |
+| `searxng` | Use a self-hosted or public [SearXNG](https://searx.space/) instance. | `SEARXNG_URL` |
+| `tavily` | [Tavily](https://www.tavily.com/), offers fast web summaries and answers. | `TAVILY_API_KEY` |
+
+> ⚠️ Some search providers require you to apply for an API Key and configure it in your `.env` file.
+
+---
+
+## `BROWSERLESS_URL`
+
+Specifies the API endpoint for [Browserless](https://www.browserless.io/), used for web crawling tasks. Browserless is a browser automation platform based on Headless Chrome, ideal for rendering dynamic pages.
+
+```env
+BROWSERLESS_URL=https://chrome.browserless.io
+```
+
+> 📌 Usually used together with `CRAWLER_IMPLS=browserless`.
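
Taken together, a Browserless setup might be sketched as follows. The variable names come from the text and table above; the token value is a placeholder, and whether all three variables are strictly required together is an assumption.

```env
# Enable the Browserless crawler.
CRAWLER_IMPLS=browserless
# Endpoint from the example above; point it at your own instance if you self-host.
BROWSERLESS_URL=https://chrome.browserless.io
# Placeholder token value.
BROWSERLESS_TOKEN=your-browserless-token
```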
+
+---
+
+## `GOOGLE_PSE_ENGINE_ID`
+
+Configure the Search Engine ID for Google Programmable Search Engine (Google PSE), used to restrict the search scope. Must be used alongside `GOOGLE_PSE_API_KEY`.
+
+```env
+GOOGLE_PSE_ENGINE_ID=your-google-cx-id
+```
+
+> 🔑 How to get it: Visit [programmablesearchengine.google.com](https://programmablesearchengine.google.com/), create a search engine, and obtain the `cx` parameter.
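
Since the engine ID must be paired with `GOOGLE_PSE_API_KEY`, a combined fragment might look like this. It is a hedged sketch: both values are placeholders, and the provider identifier `google` is an assumption, since it does not appear in the table rows visible in this diff.

```env
# Placeholder API key, paired with the engine ID as described above.
GOOGLE_PSE_API_KEY=your-google-api-key
# The `cx` value of your Programmable Search Engine.
GOOGLE_PSE_ENGINE_ID=your-google-cx-id
# Assumption: `google` is the identifier that selects Google PSE.
SEARCH_PROVIDERS=google
```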
+
+---
+
+## `FIRECRAWL_URL`
+
+Sets the access URL for the [Firecrawl](https://firecrawl.dev/) API, used for web content scraping. Default value:
+
+```env
+FIRECRAWL_URL=https://api.firecrawl.dev/v1
+```
+
+> ⚙️ Usually does not need to be changed unless you’re using a self-hosted version or a proxy service.
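
For the self-hosted case mentioned above, an override might be sketched like this. The host, port, and path are hypothetical and assume the self-hosted instance exposes the same `/v1` path as the default endpoint; the key is a placeholder.

```env
# Enable the Firecrawl crawler.
CRAWLER_IMPLS=firecrawl
# Hypothetical self-hosted endpoint; adjust host, port, and path to your deployment.
FIRECRAWL_URL=http://localhost:3002/v1
# Placeholder key.
FIRECRAWL_API_KEY=your-firecrawl-api-key
```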
+
+---
+
+## `TAVILY_SEARCH_DEPTH`
+
+Configure the result depth for [Tavily](https://www.tavily.com/) searches.
+
+```env
+TAVILY_SEARCH_DEPTH=basic
+```
+
+Supported values:
+
+* `basic`: Fast search, returns brief results;
+* `advanced`: Deep search, returns more context and web page details.
+
+---
+
+## `TAVILY_EXTRACT_DEPTH`
+
+Configure how deeply Tavily extracts content from web pages.
+
+```env
+TAVILY_EXTRACT_DEPTH=basic
+```
+
+Supported values:
+
+* `basic`: Extracts basic info like title and content summary;
+* `advanced`: Extracts structured data, lists, charts, and more from web pages.
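
The two Tavily depth settings could be combined with the provider and crawler selection roughly as below. This is a sketch, assuming Tavily is used for both search and crawling with a single placeholder key; the rest of the file does not establish that `advanced` is the better choice for every workload.

```env
# Use Tavily for both searching and page extraction.
SEARCH_PROVIDERS=tavily
CRAWLER_IMPLS=tavily
# Placeholder key shared by both features, per the tables above.
TAVILY_API_KEY=your-tavily-api-key
# Trade speed for more context and page detail.
TAVILY_SEARCH_DEPTH=advanced
TAVILY_EXTRACT_DEPTH=advanced
```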
+
+---
+
## `SEARXNG_URL`

The URL of the SearXNG instance, which is a necessary configuration to enable the online search functionality. For example:
docs/self-hosting/advanced/online-search.zh-CN.mdx
Lines changed: 123 additions & 4 deletions
@@ -10,15 +10,134 @@ tags:
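# Configuring Online Search Functionality

-LobeChat supports configuring online search functionality for AI, which lets the AI obtain the latest web information and provide more accurate and timely answers. The online search functionality is based on the [SearXNG](https://github.com/searxng/searxng) search engine, a privacy-respecting metasearch engine that aggregates results from multiple search engines.
+LobeChat supports configuring **online search functionality** for AI, enabling it to obtain information from the internet in real time and provide more accurate, up-to-date answers. Online search supports multiple search engine providers, including [SearXNG](https://github.com/searxng/searxng), [Search1API](https://www.search1api.com), [Google](https://programmablesearchengine.google.com), [Brave](https://brave.com/search/api), and others.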