Project
A "project" is a collection of parameters for extracting data from a URL (or list of URLs), along with the dataset produced by running the extraction with those parameters.
## Parameters
- Select DocType: Choose the type of document your target web pages represent. (Select "Custom" to define your own prompt.)
  - Custom Prompt: Provide specific instructions on what content you wish to extract. Datavist can also be used agentically, so that it tries to navigate the site to find the data requested in the prompt.
- Use Details Page?: This toggle determines whether a separate page contains detailed information. If enabled, the next field, "Max Detail Pages," sets a limit on the number of detail pages to process.
- Automatically Detect Keys?: This toggle enables automatic detection of key data points (properties) related to the Document Type on the target pages. If disabled, you can manually define properties to be extracted.
- Add Properties: Manually add additional data points (properties) to extract, separated by commas. Example: author, isbn, publisher.
- Use Pagination?: This toggle turns pagination on or off, and shows or hides the field for setting the number of pages to paginate.
  - Max Pagination Pages: If pagination is enabled, this field sets a limit on the number of paginated pages to process.
- Frequency: How often the URL(s) in a project should be monitored/crawled. Options: Once, Daily, Weekly, Monthly.
- Notifications: Configure email and webhook notifications for project updates (completion, errors).
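
To make the relationships between these parameters concrete, the sketch below models a project as a plain Python data structure. It is not Datavist's API: the class `ProjectConfig`, the helper `parse_properties`, every field name, the "Book" DocType, and the example URL are hypothetical names chosen for illustration. The validation rules are also assumptions inferred from the descriptions above (for example, that a page limit must be a positive number when its toggle is enabled).

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Frequency options as listed in the parameter description above.
FREQUENCIES = ("once", "daily", "weekly", "monthly")


def parse_properties(raw: str) -> List[str]:
    """Split a comma-separated property string into trimmed names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


@dataclass
class ProjectConfig:
    # Hypothetical field names that mirror the parameters listed above.
    urls: List[str]
    doc_type: str = "Custom"
    custom_prompt: Optional[str] = None
    use_details_page: bool = False
    max_detail_pages: Optional[int] = None
    auto_detect_keys: bool = True
    properties: List[str] = field(default_factory=list)
    use_pagination: bool = False
    max_pagination_pages: Optional[int] = None
    frequency: str = "once"
    notify_emails: List[str] = field(default_factory=list)
    webhook_url: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the config looks consistent."""
        problems = []
        if not self.urls:
            problems.append("at least one URL is required")
        # Assumption: choosing "Custom" only makes sense with a prompt.
        if self.doc_type == "Custom" and not self.custom_prompt:
            problems.append("a Custom DocType needs a custom prompt")
        # Assumption: an enabled toggle requires a positive page limit.
        if self.use_details_page and (self.max_detail_pages or 0) < 1:
            problems.append("max_detail_pages must be a positive number")
        if self.use_pagination and (self.max_pagination_pages or 0) < 1:
            problems.append("max_pagination_pages must be a positive number")
        # Assumption: with auto-detection off, properties must be listed manually.
        if not self.auto_detect_keys and not self.properties:
            problems.append("add properties when key auto-detection is off")
        if self.frequency not in FREQUENCIES:
            problems.append("frequency must be one of: " + ", ".join(FREQUENCIES))
        return problems


config = ProjectConfig(
    urls=["https://example.com/books"],  # placeholder URL
    doc_type="Book",                     # hypothetical DocType name
    auto_detect_keys=False,
    properties=parse_properties("author, isbn, publisher"),
    use_pagination=True,
    max_pagination_pages=5,
    frequency="weekly",
)
print(config.validate())  # prints []
```

Returning a list of problems instead of raising on the first one mirrors how a settings form would typically report all invalid fields at once; that design choice is also an assumption, not something the product documents.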