
AI & LLM

Nodes for working with large language models: running prompts, evaluating outputs, guarding against unsafe content, and routing by semantic similarity.


AI Agent

What: Sends prompts to an LLM provider and returns the response.

Providers: OpenAI (GPT-4 / 3.5), Anthropic (Claude Opus / Sonnet / Haiku), Google (Gemini Pro), Ollama (local).

Inputs: A prompt string, or a structured messages array for multi-turn conversations.

Outputs:

  • response, the LLM's text output.
  • usage, token counts (input, output, total).
  • cost, estimated cost based on the provider's per-token pricing.
  • finishReason, why generation ended: stop, length, tool_calls, etc.
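
The cost output is an estimate derived from the usage token counts. A minimal sketch of that arithmetic, assuming made-up per-million-token prices: the `PRICES_PER_MILLION` table, the `Usage` class, and the `estimate_cost` helper are illustrative names, not Flero internals or real provider rates.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per one million tokens; real rates vary by provider and model.
PRICES_PER_MILLION = {
    "example-model": {"input": 3.00, "output": 15.00},
}

@dataclass
class Usage:
    input: int
    output: int

    @property
    def total(self) -> int:
        return self.input + self.output

def estimate_cost(model: str, usage: Usage) -> float:
    """Estimate cost in USD from token counts and per-token pricing."""
    price = PRICES_PER_MILLION[model]
    return (usage.input * price["input"] + usage.output * price["output"]) / 1_000_000

usage = Usage(input=1200, output=300)
cost = estimate_cost("example-model", usage)  # 1200 input + 300 output tokens
```

Input and output tokens are priced separately because providers typically charge different rates for each.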

Configuration:

Field                                     Notes
Provider                                  One of OpenAI / Anthropic / Google / Ollama
Credential                                API key for the provider
Model                                     Specific model name (e.g. claude-opus-4-7)
System instructions                       Role definition prepended to the conversation
Prompt / Messages                         Either a single prompt or an array of {role, content} objects
Max tokens                                Output cap
Temperature                               Sampling randomness, 0–1
Top-p, Top-k, Presence/Frequency penalty  Other sampling controls
Stop sequences                            Strings that, if generated, halt generation
Stream                                    If true, returns chunks incrementally (useful for long outputs)
Enable memory                             Persist conversation state across runs
Memory key                                Conversation identifier
Max memory messages                       Cap on the number of prior messages re-sent as context
Tools                                     Function definitions the LLM can call
Tool choice                               auto, none, or force a specific tool
Response format                           text (default) or json_object with optional JSON Schema

Example (simple summarisation):

Provider:           anthropic
Model:              claude-opus-4-7
System:             You are a concise news summarizer.
Prompt:             Summarise this article in 2 sentences: {{ $node["Fetch article"].json.body.content }}
Temperature:        0.3
Max tokens:         300

Example (multi-turn with memory):

Provider:           openai
Model:              gpt-4
System:             You are a helpful CRM assistant.
Messages:           [
                      { role: "user", content: {{ $trigger.body.message }} }
                    ]
Enable memory:      true
Memory key:         {{ $trigger.body.conversationId }}
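
With memory enabled, earlier messages for the same memory key are replayed on later runs, bounded by Max memory messages. The sketch below shows one way such a bounded history could behave. `MemoryStore` is a hypothetical in-process stand-in, not how Flero actually persists conversation state.

```python
from collections import defaultdict, deque

class MemoryStore:
    """Hypothetical in-process store: one bounded message history per memory key."""

    def __init__(self, max_messages: int):
        # A deque with maxlen silently drops the oldest entry once it is full.
        self._histories = defaultdict(lambda: deque(maxlen=max_messages))

    def append(self, key: str, role: str, content: str) -> None:
        self._histories[key].append({"role": role, "content": content})

    def messages(self, key: str) -> list:
        return list(self._histories[key])

store = MemoryStore(max_messages=4)
for i in range(6):
    store.append("conv-42", "user", f"message {i}")

history = store.messages("conv-42")  # only the 4 most recent messages remain
```

Dropping the oldest messages first keeps the per-turn token count bounded, at the price of the model forgetting the start of long conversations.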

Vision input (images):

Prompt:             Describe what's in this product photo.
Images:             [ {{ $node["Download image"].json.base64 }} ]

Prompt Template

What: Builds a prompt from a reusable template with placeholders. Use this when the same prompt structure repeats across nodes or workflows.

Configuration: A template string with {{ variable }} placeholders and a variables object that fills them.
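
Placeholder filling can be pictured as a plain string substitution. The following is a rough sketch under that assumption. The `render` function and its strict handling of missing variables are illustrative choices, not the node's documented behaviour.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render(template: str, variables: dict) -> str:
    """Replace each {{ name }} placeholder with its value; fail loudly on missing names."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])
    return PLACEHOLDER.sub(substitute, template)

prompt = render(
    "Translate the following text into {{ language }}: {{text}}",
    {"language": "French", "text": "Good morning"},
)
```

Raising on a missing variable, rather than leaving the placeholder in place, avoids sending a half-filled prompt to the model.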

Built-in templates: summarize, code_review, translate, extract_json, qa, creative_writing, data_analysis, email_draft.


AI Eval

What: Uses an LLM to score another LLM's output against criteria. Output is a structured score / verdict.

Use case: Quality checks (was the AI's answer correct?), compliance evaluation (did the response stay on-topic?), A/B testing prompts.

Outputs: score (number), passed (bool), rationale (text).
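
To illustrate how the three outputs relate, here is a sketch that turns a judging model's reply into a verdict. It assumes the judge returns JSON with a `score` between 0 and 1 and a `rationale` string, and that `passed` comes from a threshold. Both the reply shape and the 0.7 default are assumptions, not documented behaviour.

```python
import json
from dataclasses import dataclass

@dataclass
class Verdict:
    score: float
    passed: bool
    rationale: str

def to_verdict(evaluator_reply: str, threshold: float = 0.7) -> Verdict:
    """Parse a JSON reply from the judging model and apply a pass threshold."""
    data = json.loads(evaluator_reply)
    score = float(data["score"])
    return Verdict(
        score=score,
        passed=score >= threshold,
        rationale=str(data.get("rationale", "")),
    )

verdict = to_verdict('{"score": 0.82, "rationale": "Accurate and on-topic."}')
```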


AI Guardrails

What: Filters AI outputs against safety / compliance rules before they leave the workflow.

Checks:

  • PII redaction (emails, phone numbers, credit cards, SSNs).
  • Toxicity / safety classifier.
  • Topic guard (must / must not mention specific topics).
  • Length and format constraints.

Outputs: output (the filtered text) and blocked (true if a guard tripped, with details on which rule).
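
As a rough illustration of the output / blocked pair, the sketch below redacts two kinds of PII with regular expressions and blocks on a length constraint. The patterns are deliberately simplistic and the `apply_guardrails` function is hypothetical. It also assumes that redaction alone does not count as a tripped guard, which the node may treat differently.

```python
import re

# Deliberately simple patterns for illustration; production PII detection needs far more care.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def apply_guardrails(text: str, max_length: int = 500) -> dict:
    """Redact PII matches, then block the text if it exceeds max_length."""
    output = text
    for label, pattern in PII_PATTERNS.items():
        output = pattern.sub(f"[REDACTED {label.upper()}]", output)
    tripped = []
    if len(output) > max_length:
        tripped.append("max_length")
    return {"output": output, "blocked": bool(tripped), "rules": tripped}

result = apply_guardrails("Contact jane.doe@example.com, SSN 123-45-6789.")
```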


Semantic Router

What: Routes data by semantic similarity. The node computes an embedding of the input, compares it to the embeddings of known intents, and routes to the output port of the closest match.

Use case: Intent classification ("is the user asking for help, sales info, or refund?"), smart fan-out in support chatbots.
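
The routing step can be sketched as a nearest-neighbour lookup using cosine similarity. The three-dimensional vectors below are toy stand-ins for real embeddings, and the `min_similarity` fallback is an assumption added for illustration, not a documented setting.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def route(input_vec, intents, min_similarity=0.5):
    """Return the name of the closest intent, or 'fallback' if nothing is similar enough."""
    best_name, best_score = "fallback", min_similarity
    for name, vec in intents.items():
        score = cosine(input_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Toy 3-dimensional vectors standing in for real embeddings.
INTENTS = {
    "help": [0.9, 0.1, 0.0],
    "sales": [0.1, 0.9, 0.1],
    "refund": [0.0, 0.2, 0.9],
}

port = route([0.1, 0.3, 0.8], INTENTS)  # closest to the "refund" vector
```

A similarity floor is worth considering in practice, since the closest intent may still be a poor match for off-topic input.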


Learning

What: Persists workflow signals to a feedback store that can later be used to tune prompts or routing decisions. Intended for advanced use.


Workflow Assistant

What: Powers the in-editor AI builder. Available as a node for advanced cases (e.g. building a chatbot that itself can build workflows). Most users interact with this via the AI builder, not directly.


Token Cost Guard

What: A hard ceiling on LLM token spend per run. Tracks usage across AI Agent nodes and aborts the workflow if the budget is exceeded.

Configuration:

Field             Notes
Budget (tokens)   Max input + output tokens for this run
Budget (USD)      Alternative: ceiling on cost in USD
Action            abort (fail), route to error port, or warn-only

Use case: Production workflows where a runaway LLM could rack up significant cost.
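
The budget check can be pictured as a running total that is compared against the ceiling after each AI node reports its usage. This is a minimal sketch with hypothetical names (`TokenCostGuard`, `BudgetExceeded`, `record`) covering only the abort and warn actions. It is not the node's real implementation.

```python
class BudgetExceeded(Exception):
    """Raised when cumulative token usage passes the configured budget."""

class TokenCostGuard:
    """Hypothetical guard: accumulates usage after each AI node and checks the budget."""

    def __init__(self, budget_tokens: int, action: str = "abort"):
        self.budget_tokens = budget_tokens
        self.action = action  # "abort" or "warn"
        self.used = 0
        self.warnings = []

    def record(self, node: str, input_tokens: int, output_tokens: int) -> None:
        self.used += input_tokens + output_tokens
        if self.used > self.budget_tokens:
            message = f"{node}: {self.used} tokens used, budget is {self.budget_tokens}"
            if self.action == "abort":
                raise BudgetExceeded(message)
            self.warnings.append(message)

guard = TokenCostGuard(budget_tokens=2000)
guard.record("Summarise", input_tokens=900, output_tokens=300)  # 1200 used, within budget
try:
    guard.record("Classify", input_tokens=700, output_tokens=400)  # 2300 used, over budget
    aborted = False
except BudgetExceeded:
    aborted = True
```

Because usage is only known once a node has finished, a check of this kind fires after the tokens have already been spent.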


Tips & gotchas

  • Model names change frequently. Use the provider's docs for the current model ID. Flero updates its picklist regularly but isn't authoritative.
  • JSON output mode is not guaranteed to be parseable, even with the response format set to json_object. Supply a JSON Schema in responseFormat and follow up with a Data Parser if you need a hard guarantee.
  • Temperature 0 is not deterministic across providers; some variability remains. Use seeded sampling where the provider supports it (some OpenAI models).
  • Memory has a cost. Every prior message is re-sent on every turn. Cap maxMemoryMessages to a reasonable number (5–10 for support bots).
  • Vision input is large. Images consume many tokens, so budget accordingly. Resize images before attaching them when possible.
  • Tool calling is provider-specific. Tool definitions written for OpenAI may need tweaking for Anthropic or Google. Test before going live.
  • Cost guard runs after each AI node, so a single oversized prompt can still exceed the budget by one node's worth. Set the guard slightly under your real cap.
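
On the JSON output tip: a downstream step can parse defensively and report why a response was rejected. The sketch below is one possible approach, not the Data Parser node's behaviour. The `parse_json_response` helper, its code-fence stripping, and the `required_keys` check are all illustrative assumptions.

```python
import json

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap JSON in

def parse_json_response(raw: str, required_keys: tuple = ()):
    """Return (data, error). Strips a Markdown code fence if present, then validates keys."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag) and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    missing = [key for key in required_keys if key not in data]
    if missing:
        return None, f"missing keys: {', '.join(missing)}"
    return data, None

data, error = parse_json_response('{"intent": "refund", "confidence": 0.9}', ("intent",))
```

Returning an error string instead of raising makes it straightforward to route failed parses to a retry or error branch.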

