Apify
Apify is a cloud platform for web scraping and data extraction, which provides an ecosystem of more than a thousand ready-made apps called Actors for various scraping, crawling, and extraction use cases.
This integration enables you to run Actors on the Apify platform and load their results into LangChain, so you can feed your vector indexes with documents and data from the web, for example to generate answers from documentation sites, blogs, or knowledge bases.
Installation and Setup
- Install the Apify API client for Python with pip install apify-client
- Get your Apify API token and either set it as an environment variable (APIFY_API_TOKEN) or pass it to the ApifyWrapper constructor as apify_api_token.
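The two ways of supplying the token can be sketched as follows. This is a minimal illustration rather than part of the integration: resolve_apify_token and make_wrapper are hypothetical helpers, and only the APIFY_API_TOKEN variable and the apify_api_token constructor argument come from the setup steps above.

```python
import os


def resolve_apify_token(explicit_token=None):
    """Hypothetical helper: prefer an explicit token, else fall back to APIFY_API_TOKEN."""
    token = explicit_token or os.environ.get("APIFY_API_TOKEN")
    if not token:
        raise RuntimeError(
            "No Apify API token found: set APIFY_API_TOKEN or pass a token explicitly."
        )
    return token


def make_wrapper(explicit_token=None):
    """Hypothetical helper: build an ApifyWrapper with the token passed to the constructor."""
    # Imported lazily so resolve_apify_token stays usable without LangChain installed.
    from langchain_community.utilities import ApifyWrapper

    return ApifyWrapper(apify_api_token=resolve_apify_token(explicit_token))
```

Failing early with a clear message when no token is available tends to be easier to debug than an authentication error raised later during an Actor run.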
Utility
You can use the ApifyWrapper to run Actors on the Apify platform.
from langchain_community.utilities import ApifyWrapper
API Reference: ApifyWrapper
For more information on this wrapper, see the API reference.
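As a rough sketch of how the wrapper might be used, the example below runs an Actor and loads its output as LangChain documents. It rests on several assumptions not stated on this page: the Actor ID apify/website-content-crawler, its startUrls input shape, and the text and url fields on each result item. Adjust all of these to the Actor you actually run. The helper names are hypothetical.

```python
def build_run_input(start_url):
    """Build the Actor input (assumed shape for a crawler taking start URLs)."""
    return {"startUrls": [{"url": start_url}]}


def item_to_fields(item):
    """Extract page text and metadata from one result item (assumed 'text' and 'url' keys)."""
    return item.get("text") or "", {"source": item.get("url")}


def crawl_to_documents(start_url):
    """Run the Actor and return its results as a list of LangChain Documents."""
    # Imported lazily so the pure helpers above work without LangChain installed.
    from langchain_community.utilities import ApifyWrapper
    from langchain_core.documents import Document

    def to_document(item):
        text, metadata = item_to_fields(item)
        return Document(page_content=text, metadata=metadata)

    apify = ApifyWrapper()  # reads APIFY_API_TOKEN from the environment
    # call_actor waits for the run to finish and returns a loader over its dataset.
    loader = apify.call_actor(
        actor_id="apify/website-content-crawler",  # assumed Actor, swap in your own
        run_input=build_run_input(start_url),
        dataset_mapping_function=to_document,
    )
    return loader.load()
```

Calling crawl_to_documents("https://example.com") starts a real Actor run on your Apify account, so it requires a valid token and may take a while to complete.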
Document loader
You can also use the ApifyDatasetLoader to get data from an existing Apify dataset.
from langchain_community.document_loaders import ApifyDatasetLoader
API Reference: ApifyDatasetLoader
For a more detailed walkthrough of this loader, see this notebook.
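A minimal sketch of loading an existing dataset might look like the following. The dataset ID is a placeholder you must supply, and the default text and url field names are assumptions about the dataset's item schema. The helper names are hypothetical.

```python
def make_fields(item, text_key="text", source_key="url"):
    """Pick the page text and source out of one dataset item (field names are assumptions)."""
    return item.get(text_key) or "", {"source": item.get(source_key)}


def load_dataset_documents(dataset_id, text_key="text", source_key="url"):
    """Load every item of an Apify dataset as a LangChain Document."""
    # Imported lazily so make_fields works without LangChain installed.
    from langchain_community.document_loaders import ApifyDatasetLoader
    from langchain_core.documents import Document

    def to_document(item):
        text, metadata = make_fields(item, text_key, source_key)
        return Document(page_content=text, metadata=metadata)

    loader = ApifyDatasetLoader(
        dataset_id=dataset_id,  # ID of a dataset produced by an earlier Actor run
        dataset_mapping_function=to_document,
    )
    return loader.load()
```

Keeping the field names as parameters makes it easier to reuse the same loader code for datasets produced by different Actors, whose item schemas vary.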