User Guide

Agent Builder: Getting Started

Create multi-step AI pipelines, run them against your corpus, and deliver findings by email or webhook. This guide walks you through every step.

Open Agent Builder Technical Reference

Getting Started

Agent Builder lets you chain CaveauAI's AI tools — corpus search, LLM chat, data transforms, email, and webhooks — into automated multi-step pipelines. Define a workflow once, then run it manually or on a cron schedule. Every run produces a full audit trail with per-step output you can review.

Prerequisites

  • A CaveauAI account on any plan (Starter, Professional, Enterprise, or Beta)
  • At least one corpus with uploaded documents — the agent searches your data, not the internet
  • A modern browser (Chrome, Firefox, Safari, or Edge)

Accessing Agent Builder

Log in to your CaveauAI dashboard at ai.bluenotelogic.com. In the left sidebar, click the Agents link (look for the sparkle icon).

Tip: The Agents page shows a green health indicator when the pipeline engine is online and ready to accept runs. If you see a red indicator, check back in a few minutes.

Creating Your First Agent

The fastest way to get started is to pick a template. Templates are pre-configured pipelines you can customize.

Step by step

  1. Click the "New Agent" button in the top-right corner of the Agents page
  2. Enter a name for your agent (e.g., "GDPR Weekly Brief")
  3. Select a template from the dropdown:
    • Research Assistant — search + analyze + email (best for getting started)
    • Compliance Monitor — multi-topic search + analysis + email alerts
    • Document Processor — search + analyze + email or webhook
    • Custom — blank canvas, add steps manually
  4. Configure the template-specific fields (see table below)
  5. Click "Create Agent"

Tip: Start with Research Assistant on "quick" mode for the fastest first result. Quick mode runs a single search pass and takes 2–5 seconds.

Configuration fields

| Field | Required | Description |
| --- | --- | --- |
| Name | Yes | A descriptive label for your agent |
| Template | Yes | Determines the step pipeline (Research, Compliance, Document, Custom) |
| Research topic | Yes | The search query sent to your corpus (e.g., "GDPR data processor obligations") |
| Depth | No | Research Assistant only: quick (1 pass), standard (2 passes), deep (3 passes) |
| Email recipients | No | Comma-separated email addresses for the digest delivery step |
| Webhook URL | No | HTTPS endpoint for the webhook delivery step (must be a public URL) |
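Taken together, these fields can be captured as a simple structure with basic validation. The following Python sketch is illustrative only: the field names follow the table above, but the actual configuration format CaveauAI stores is not documented in this guide.

```python
# Hypothetical representation of an agent configuration, using the
# fields from the table above. The real CaveauAI storage format is
# not shown in this guide; this is for illustration only.
agent_config = {
    "name": "GDPR Weekly Brief",                # required
    "template": "research_assistant",           # required
    "research_topic": "GDPR data processor obligations",  # required
    "depth": "quick",                           # optional: quick | standard | deep
    "email_recipients": ["legal@example.com"],  # optional
    "webhook_url": None,                        # optional: public HTTPS URL if set
}

def validate(config: dict) -> list:
    """Return a list of validation problems (empty list means OK)."""
    problems = []
    for field in ("name", "template", "research_topic"):
        if not config.get(field):
            problems.append("missing required field: " + field)
    url = config.get("webhook_url")
    if url and not url.startswith("https://"):
        problems.append("webhook_url must be HTTPS")
    return problems
```

Running `validate(agent_config)` on the example returns an empty list; omitting a required field or using a plain-HTTP webhook URL produces a corresponding problem message.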

Template depth comparison

| Depth | Steps | Search Passes | Typical Duration |
| --- | --- | --- | --- |
| Quick | 2 (search + chat) | 1 | 2–5 seconds |
| Standard | 4 (search + chat + transform + email) | 2 | 5–15 seconds |
| Deep | 5 (search ×3 + chat + transform + email) | 3 | 15–45 seconds |

Running an Agent

Once your agent is created, trigger a run manually to see it in action.

Manual run

  1. On the Agents page, find your agent in the list
  2. Click the "Run Now" button
  3. Watch the status badge change: pending → running → completed
  4. Click the run row to expand the per-step details

Real example: GDPR Research Bot

This is an actual test run against the Norwegian Legal Intelligence corpus (280,000+ document chunks covering employment law, GDPR, and AI regulation):

Test run details: Research Assistant template, quick mode, topic: "GDPR data processor obligations under Norwegian law"

Step 1 — Search: The corpus search returns 10 ranked passages from the NLI corpus. Results include excerpts from EU AI Act provisions, Norwegian personopplysningsloven (Personal Data Act) sections, and employment law guidance. Each result carries a relevance score and source citation.

Step 2 — Chat: The AI model receives the search results and the analysis prompt. It produces a structured response identifying key obligations:

Step 2 output (excerpt): "Data processor obligations under Norwegian law derive primarily from the GDPR (Regulation 2016/679) as incorporated through the EEA Agreement, supplemented by the Norwegian Personal Data Act (personopplysningsloven). Key obligations include: 1. Processing only on documented instructions from the controller 2. Ensuring confidentiality of personal data 3. Implementing appropriate technical and organisational measures 4. Assisting the controller with data subject requests..."

Sources: 8 corpus documents cited
Duration: 2.1 seconds (full pipeline)

The entire 2-step quick pipeline completed in 2.1 seconds. Deep mode (3 search passes + analysis + transform + email) typically takes 15–45 seconds depending on corpus size.

Note: Every finding includes source citations you can verify against the original documents in your corpus. The AI does not fabricate references — all sources come from the search step's actual results.

Understanding Results

Each run produces a detailed record you can inspect down to the individual step level.

Run states

| Status | Meaning |
| --- | --- |
| pending | Run is queued and waiting for an execution slot (max 3 concurrent runs) |
| running | Pipeline is actively executing steps in sequence |
| completed | All steps finished successfully; results are available |
| failed | A step encountered an error; check step details for the specific failure |

Per-step output

Click into a completed run to see the breakdown for each step:

  • input_data — what was sent to this step (query text, prompt, template variables)
  • output_data — what the step produced (search results, AI response, formatted content)
  • duration_ms — execution time for this step in milliseconds
  • api_calls — number of external API calls made (search, chat, email, webhook)

For search steps, the output includes results_count and relevance scores for each passage. For chat steps, you get the full AI response text, the model used, and source citations derived from the search context.
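Because every step exposes duration_ms and api_calls, a completed run can also be summarised programmatically. A minimal sketch, assuming the per-step records are available as a list of dicts shaped like the fields above (the "name" key and the exact JSON layout are assumptions, not a documented API):

```python
# Sketch: summarise a completed run from its per-step records.
# The dict keys mirror the fields listed above (duration_ms,
# api_calls, output_data); the exact shape is an assumption.
def summarise_run(steps: list) -> dict:
    total_ms = sum(s["duration_ms"] for s in steps)
    total_calls = sum(s.get("api_calls", 0) for s in steps)
    slowest = max(steps, key=lambda s: s["duration_ms"])
    return {
        "steps": len(steps),
        "total_ms": total_ms,
        "api_calls": total_calls,
        "slowest_step": slowest["name"],
    }

# Example records modelled on the quick-mode GDPR run above.
run = [
    {"name": "search", "duration_ms": 800, "api_calls": 1,
     "output_data": {"results_count": 10}},
    {"name": "chat", "duration_ms": 1300, "api_calls": 1,
     "output_data": {"response": "..."}},
]
print(summarise_run(run))
# {'steps': 2, 'total_ms': 2100, 'api_calls': 2, 'slowest_step': 'chat'}
```

A summary like this makes it easy to spot which step dominates the run time when tuning an agent.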

Tip: Use the run details to debug underperforming agents. If the search step returns few results, try broadening your query or uploading more documents. If the chat step is vague, refine the analysis prompt.

Scheduling

Instead of running agents manually, set a cron schedule so they execute automatically at fixed intervals.

Setting up a schedule

  1. Open your agent's detail view (click the agent name)
  2. In the Schedule section, enter a cron expression
  3. Select a timezone (e.g., Europe/Oslo)
  4. Click "Update Agent" to save

Cron expression examples

| Expression | Meaning |
| --- | --- |
| 0 8 * * 1 | Every Monday at 08:00 |
| 0 9 * * * | Every day at 09:00 |
| 0 */6 * * * | Every 6 hours |
| 0 8 1 * * | First of the month at 08:00 |
| 0 8 * * 1-5 | Weekdays at 08:00 |
| 0 20 * * 0 | Every Sunday at 20:00 |

Cron format: minute hour day-of-month month day-of-week

Tip: Weekly Monday morning (0 8 * * 1) is the most common schedule for compliance monitoring and research digests. Your team gets findings at the start of the week.
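If you want to sanity-check an expression before saving it, the five-field matching logic can be sketched in a few lines. This simplified matcher supports *, */n, ranges, and comma lists; it is an illustration, not CaveauAI's actual scheduler.

```python
# Minimal matcher for the minute/hour/day-of-month/month/day-of-week
# cron format described above. Handles "*", "*/n", "a-b", and comma
# lists; a real cron implementation covers more syntax.
def field_matches(field: str, value: int) -> bool:
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:
            lo, hi = map(int, part.split("-"))
            if lo <= value <= hi:
                return True
        elif int(part) == value:
            return True
    return False

def cron_matches(expr, minute, hour, dom, month, dow):
    fields = expr.split()  # minute hour day-of-month month day-of-week
    return all(field_matches(f, v)
               for f, v in zip(fields, (minute, hour, dom, month, dow)))

# "0 8 * * 1" fires Mondays (dow=1) at 08:00
print(cron_matches("0 8 * * 1", minute=0, hour=8, dom=15, month=4, dow=1))    # True
print(cron_matches("0 8 * * 1-5", minute=0, hour=8, dom=15, month=4, dow=6))  # False
```

This is handy for checking that a weekday range like 1-5 excludes the day you expect before committing the schedule.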

Minimum intervals by plan

| Plan | Minimum Schedule Interval |
| --- | --- |
| Starter | Every 6 hours |
| Professional | Every 1 hour |
| Enterprise | Every 15 minutes |
| Beta | Every 15 minutes |

Advanced: Custom Pipelines

When templates don't fit your use case, select "Custom" as the template type and build your pipeline from scratch. You can chain any combination of the five step types in any order.

Template variable syntax

Steps can reference the output of previous steps using double-brace template variables:

  • {{steps.search_step.output}} ← Output from a step named "search_step"
  • {{steps.chat_step.output}} ← Output from a step named "chat_step"
  • {{inputs.question}} ← Agent input parameter
  • {{run.date}} ← Current run timestamp
  • {{agent.name}} ← Agent display name

Variables are resolved at execution time. If a referenced step hasn't run yet or doesn't exist, the variable resolves to an empty string.
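The resolution rule (unknown references become empty strings) can be sketched with a small substitution function. The namespaces follow the list above; the implementation is illustrative, not CaveauAI's actual template engine.

```python
import re

# Sketch of double-brace variable resolution as described above:
# a missing step or variable resolves to an empty string.
VAR = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")

def resolve(template: str, context: dict) -> str:
    def lookup(match):
        value = context
        for key in match.group(1).split("."):
            if not isinstance(value, dict) or key not in value:
                return ""  # unknown reference resolves to empty string
            value = value[key]
        return str(value)
    return VAR.sub(lookup, template)

ctx = {"steps": {"search_step": {"output": "10 passages found"}},
       "agent": {"name": "GDPR Weekly Brief"}}
print(resolve("Agent {{agent.name}}: {{steps.search_step.output}}", ctx))
# Agent GDPR Weekly Brief: 10 passages found
print(resolve("{{steps.missing.output}}", ctx))  # prints an empty line
```

Note how the second call silently yields nothing; this is why a chat step that references a misnamed search step can receive no context at all.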

Example: Weekly Competitive Intelligence Brief

A 4-step custom pipeline that searches your market intelligence corpus, identifies threats, formats an executive summary, and posts it to Slack:

| # | Step Name | Type | Configuration |
| --- | --- | --- | --- |
| 1 | market_search | search | Query: "competitive product launches and market shifts" |
| 2 | threat_analysis | chat | Prompt: "Analyze these findings and identify the top 3 competitive threats: {{steps.market_search.output}}" |
| 3 | exec_summary | transform | Template: format as executive briefing with bullet points |
| 4 | slack_post | webhook | URL: https://hooks.slack.com/services/T.../B.../xxx, payload: {{steps.exec_summary.output}} |

Note: Maximum 3 concurrent runs across all your agents. If 3 runs are already in progress, the next run queues as "pending" until a slot opens.
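The final webhook step in a pipeline like this boils down to an HTTPS POST with a JSON body. A minimal sketch using Python's standard library; the "text" payload key follows Slack's incoming-webhook convention, and the exact payload CaveauAI sends is an assumption.

```python
import json
import urllib.request

# Sketch of a webhook delivery step: POST a formatted summary as JSON
# to the configured endpoint (here, a Slack incoming webhook). The
# "text" key is Slack's convention; the pipeline's actual payload
# shape is an assumption.
def build_payload(summary: str) -> bytes:
    return json.dumps({"text": summary}).encode("utf-8")

def post_webhook(url: str, summary: str) -> int:
    req = urllib.request.Request(
        url,
        data=build_payload(summary),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status  # 200 means the endpoint accepted the message
```

If a webhook step fails, the HTTP status returned here is the same signal surfaced in the step's output_data, which is the first place to look when debugging.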

Troubleshooting

Common issues and how to resolve them:

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| Run stuck on pending | 3 concurrent runs already active | Wait for a running agent to finish, or cancel an existing run |
| Search step returns no results | Corpus is empty or query doesn't match any documents | Check your corpus has uploaded documents; try a broader search query |
| Email not received | Spam filter or incorrect address | Check spam/junk folder; verify the email address in agent config |
| Webhook step fails | Target URL unreachable or returned error | Verify the endpoint is public and accepts POST; check step output for HTTP status |
| Chat step produces empty output | Model timeout or no context provided | Ensure the search step ran first and produced results; retry the run |
| Schedule not firing | Invalid cron expression or engine offline | Validate cron syntax; check the health indicator on the Agents page |

Important: When a run fails, check the run details and expand the specific step that errored. The step output contains the error message and HTTP status code, which usually points directly at the issue.

Need help?

If you're stuck or have questions about configuring an agent for your specific use case, reach out. We can help design the right pipeline for your workflow.

Contact Support Technical Reference
