Prompt Engineering for LangGraph Agents: Lessons from a Finance Chat Bot

A practical breakdown of how to structure system prompts for a routed multi-node LangGraph agent, using the insight-chat graph of a household finance assistant as a real example.

Background

The insight-chat graph is a LangGraphJS agent that handles natural language queries about household finances. It routes each user turn into one of three branches — general, read, or write — each with its own system prompt and tool set.

The key architectural insight: instead of one monolithic 70-line prompt sent on every turn, each branch gets a focused prompt containing only what it needs.

The 6 Prompt Components

A well-designed system prompt is composed of distinct layers, each serving a specific purpose.

1. Identity / Role

"You are Finance Family Chat, a careful household finance assistant."

LLMs have no persistent state — every call is a fresh session. Declaring a role at the top anchors the model's behavior: which domain knowledge to activate, what tone to use, and how cautious to be.

Without it, the model defaults to a generic helpful assistant that may not apply domain-appropriate judgment.

2. Context / Grounding

"Today's date in Asia/Ho_Chi_Minh is 2026-06-15."

An LLM's knowledge is frozen at its training cutoff. It does not know today's date, the user's timezone, or any runtime state. Grounding injects the dynamic facts the model needs to reason correctly.

Common grounding data:

Current date / time (especially when timezone matters)
User profile or account state
Business-specific constants

In this codebase, currentDateInVietnam() computes the date at call time and interpolates it into the prompt — so every turn has an accurate temporal anchor.
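The article does not show the body of currentDateInVietnam(), so the following is only a minimal sketch of how such a helper could work. The optional `now` parameter and the groundingLine() wrapper are assumptions added for illustration and testability.

```typescript
// Sketch only: the real currentDateInVietnam() is not shown in this article.
// The optional `now` parameter is an assumption that makes the helper testable.
function currentDateInVietnam(now: Date = new Date()): string {
  // formatToParts avoids relying on any locale's default date layout.
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: "Asia/Ho_Chi_Minh",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(now);

  const pick = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";

  return `${pick("year")}-${pick("month")}-${pick("day")}`;
}

// Hypothetical wrapper: builds the grounding sentence at call time.
function groundingLine(now: Date = new Date()): string {
  return `Today's date in Asia/Ho_Chi_Minh is ${currentDateInVietnam(now)}.`;
}
```

Computing the date inside the target timezone matters near midnight: 18:30 UTC on 14 June is already 15 June in Vietnam (UTC+7), so a UTC-based date would be one day behind.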

3. Constraints

"- Answer in the user's language."
"- Use native tool calls only. Never write XML/function tags."
"- This is a read-only branch: you cannot create, update, or delete data."

LLMs tend to be "helpfully wrong" — fabricating data, writing <function=...> tags instead of actual tool calls, or answering in the wrong language. Constraints are behavioral guardrails.

Three subtypes:

Subtype	Example
Output format	`"Format VND as 109.500.000đ"`
Behavior	`"Do not invent balances or totals"`
Safety boundary	`"Do not retry if cancelled=true"`

The most valuable constraints are often negative ("never do X"), because they target the failure modes you have already observed.

4. Instructions / Procedure

"- For budgets, call list_active_categories first when you need a categoryId."
"- Prefer report tools over list_transactions for structured questions."
"- When calling no-argument tools, always send an empty object {}."

The model knows it has tools but doesn't know the right order or strategy for using them. Instructions encode explicit business logic as a procedure the model should follow.

Key distinction:

Constraints = "don't do X"
Instructions = "do X this way"

The {} rule for no-argument tools is a good example of a non-obvious instruction: some LLMs omit the arguments field when a tool takes no parameters, which can break tool-call parsing. A one-line rule heads off most of these failures.
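A prompt rule like this can be paired with a defensive check on the application side. The sketch below is not taken from the insight-chat graph: normalizeToolArgs is a hypothetical helper, and it assumes tool arguments arrive either as a JSON string or as an already-parsed value.

```typescript
type ToolArgs = Record<string, unknown>;

// Hypothetical helper: coerce missing or empty tool arguments to {}.
// Throws on malformed JSON and on non-object values such as arrays or numbers.
function normalizeToolArgs(raw: unknown): ToolArgs {
  let value: unknown = raw;

  // Some providers send arguments as a JSON string; parse it first.
  if (typeof value === "string") {
    value = value.trim() === "" ? undefined : JSON.parse(value);
  }

  // A missing arguments field is treated as "no arguments".
  if (value === undefined || value === null) return {};

  if (typeof value === "object" && !Array.isArray(value)) {
    return value as ToolArgs;
  }
  throw new TypeError("Tool arguments must be a JSON object");
}
```

The prompt rule and the normalizer cover different layers: the rule reduces how often the model omits the field, and the check keeps a stray omission from reaching the tool.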

5. Output Schema / Format

"- Use markdown tables for tabular data with 2–4 concise insights."
"- Format dates as 05/06/2026 08:00, not raw timezone names."
"- Use bullet lists for short totals. Format caveats as blockquotes starting with Lưu ý:."

LLMs produce free-form text by default. If your UI needs to render structured markdown or your users expect a consistent presentation, you must specify the format explicitly.

Concrete examples beat abstract descriptions. "05/06/2026 08:00" is more reliable than "format as day/month/year hour:minute" — the model anchors to the literal sample.
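One way to keep literal samples accurate is to generate them in code and interpolate them into the prompt. The helpers below (formatVnd, formatDateTime, outputFormatRules) are hypothetical names for illustration, not functions from the codebase described here.

```typescript
// Hypothetical helper: integer VND with "." as the thousands separator.
function formatVnd(amount: number): string {
  const digits = String(Math.round(amount));
  // Insert a dot before every group of three digits counted from the right.
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ".") + "đ";
}

// Hypothetical helper: dd/MM/yyyy HH:mm in the given IANA timezone.
function formatDateTime(date: Date, timeZone: string = "Asia/Ho_Chi_Minh"): string {
  const parts = new Intl.DateTimeFormat("en-GB", {
    timeZone,
    day: "2-digit",
    month: "2-digit",
    year: "numeric",
    hour: "2-digit",
    minute: "2-digit",
    hour12: false, // 24-hour clock
  }).formatToParts(date);

  const pick = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";

  return `${pick("day")}/${pick("month")}/${pick("year")} ${pick("hour")}:${pick("minute")}`;
}

// Builds format rules that carry a literal sample rather than a description.
function outputFormatRules(sampleAmount: number, sampleDate: Date): string[] {
  return [
    `- Format VND as ${formatVnd(sampleAmount)}.`,
    `- Format dates as ${formatDateTime(sampleDate)}, not raw timezone names.`,
  ];
}
```

Calling outputFormatRules(109500000, new Date("2026-06-05T01:00:00Z")) yields rules containing the same samples quoted earlier, 109.500.000đ and 05/06/2026 08:00.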

6. Fallback / Edge Case Handling

"When unsure between read and write, prefer read."
"If unsure about a derived metric, omit it instead of guessing."
"If asset kind is unclear, ask the user."

LLMs must make a decision even with ambiguous input. Without explicit fallback rules, the model fills the gap with unpredictable behavior. Define the default for every uncertain case you can anticipate.

The "prefer read over write" rule is a safety-first default: write tools have side effects, so when intent is ambiguous the read branch is the safer failure mode. A mistaken read costs one unnecessary query, while a mistaken write can change stored data.
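The same default can be mirrored in code when interpreting the router's reply. This is a minimal sketch under the assumption that the router returns a single label as plain text; parseRoute is a hypothetical name, and sending unrecognized output to read is an illustrative choice, not something the article specifies.

```typescript
type Route = "general" | "read" | "write";

// Hypothetical parser for the router's reply.
// Anything that is not an exact label falls back to "read", never "write".
function parseRoute(rawModelOutput: string): Route {
  const label = rawModelOutput.trim().toLowerCase();
  if (label === "general" || label === "read" || label === "write") {
    return label;
  }
  return "read";
}
```

With this shape, the write branch is only reachable when the router answers with exactly that label, so a malformed or hedged reply cannot trigger side effects.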

Prompt Splitting: One Prompt per Route

The Problem with a Monolithic Prompt

The original graph had a single ~70-line prompt combining read formatting rules, write safety rules, and every available tool — sent on every turn regardless of intent.

Every turn → [read rules] + [write rules] + [all tools] → LLM

Problems:

The LLM receives write-safety instructions even when it is only reading data → noise
All tools are visible on every turn → the LLM is more likely to select the wrong tool
Higher token cost per turn

The Split Approach

router node     → routerSystemPrompt()   (classify only, no baseHeader)
general_chat    → generalSystemPrompt()  (no tools, no data rules)
read_chat       → readSystemPrompt()     (read tools + formatting rules)
write_chat      → writeSystemPrompt()    (write tools + safety rules + API constraints)

Each branch ships only the instructions and tools it actually needs.

baseHeader()  (Identity + Context + 2 core Constraints)
     │
     ├─► generalSystemPrompt()  = baseHeader + no-data constraints + fallback
     ├─► readSystemPrompt()     = baseHeader + tool instructions + output schema + read boundary
     └─► writeSystemPrompt()    = baseHeader + write safety + edge cases + API constraints

The router prompt intentionally skips baseHeader() because it only needs to classify intent. Extra lines add tokens, and therefore latency, and can distract the model from the classification task.
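The composition above can be sketched as plain string builders. The function names come from the article, but the bodies are assumptions: the prompt lines are the ones quoted in earlier sections, the choice of which two constraints live in baseHeader() is a guess, the `today` parameter stands in for the date helper, and the wording of the router and general prompts is illustrative.

```typescript
// Shared layers: Identity + Context + 2 core Constraints.
// Assumption: which two constraints are "core" is not stated in the article.
function baseHeader(today: string): string {
  return [
    "You are Finance Family Chat, a careful household finance assistant.",
    `Today's date in Asia/Ho_Chi_Minh is ${today}.`,
    "- Answer in the user's language.",
    "- Use native tool calls only. Never write XML/function tags.",
  ].join("\n");
}

// No tools and no data rules. The first line below is illustrative wording.
function generalSystemPrompt(today: string): string {
  return [
    baseHeader(today),
    "- You have no data tools in this branch.",
    "- Do not invent balances or totals.",
  ].join("\n");
}

// Tool instructions + output schema + read boundary.
function readSystemPrompt(today: string): string {
  return [
    baseHeader(today),
    "- For budgets, call list_active_categories first when you need a categoryId.",
    "- Prefer report tools over list_transactions for structured questions.",
    "- Use markdown tables for tabular data with 2–4 concise insights.",
    "- This is a read-only branch: you cannot create, update, or delete data.",
  ].join("\n");
}

// Write safety + edge cases.
function writeSystemPrompt(today: string): string {
  return [
    baseHeader(today),
    "- Do not retry if cancelled=true.",
    "- If asset kind is unclear, ask the user.",
  ].join("\n");
}

// Classifier only: deliberately does not include baseHeader().
function routerSystemPrompt(): string {
  return [
    "Classify the user's message as exactly one of: general, read, write.",
    "When unsure between read and write, prefer read.",
    "Reply with the label only.",
  ].join("\n");
}
```

Keeping the shared layers in one function means a change to the identity line or the date format is made once and reaches every branch, while the router stays independent of it.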

Mapping Components to Each Node

Component	general	read	write	router
Identity / Role	✓	✓	✓	(minimal)
Context / Grounding	✓	✓	✓	—
Constraints	✓	✓	✓	—
Instructions / Procedure	—	✓	✓	✓
Output Schema	—	✓	✓	—
Fallback / Edge cases	✓	✓	✓	✓

Key Takeaways

Split prompts by intent branch — send each node only what it needs. Noise in the prompt degrades accuracy.
Concrete examples outperform abstract rules — "109.500.000đ" is more reliable than "Vietnamese thousands separator".
Negative constraints are your most important rules — they encode failure modes you've already discovered.
Grounding is dynamic — inject runtime facts (date, user state) rather than baking in stale values.
Always define a fallback — LLMs must decide even with ambiguous input. Explicit defaults ("prefer read", "omit if unsure") keep behavior predictable.
Keep classifier prompts minimal — router/classifier nodes should have the shortest possible prompts. Their job is one-shot classification, not reasoning.