The Invoke Large Language Model component (LLM component) in the Visual Flow Designer
enables you to connect a flow to the LLM through a REST service call.
You can insert this component state into your dialog flow by selecting
Service Integration > Invoke Large Language
Model from the Add State dialog. To enable multi-turn conversations when
the skills is called from a digital assistant, enter a description for the LLM
description.
Note
As a best practice, always add descriptions to LLM components to allow multi-turn
refinements when users access the LLM service through a digital assistant.
Inserting the LLM component state adds an error handling state for
troubleshooting the requests to the LLM and its responses. The LLM component state
transitions to this state (called ShowLLMError by default) when an invalid request or
response cause a non-recoverable error. Description of the illustration invokellm-added-to-flow.png
In addition to calling the LLM service, the LLM component state handles
interactive transactions, such as multi-turn refinements, the back-and-forth exchanges
between the user and the LMM that hone the LLM output through rounds of user
feedback.
Note
Response refinement
can also come from the system when it implements retries after failing
validation.
You can send the result from the LLM model as a message, or you
can save it to a dialog flow variable for downstream use. The LLM component's built in
validation provides guardrails against vulnerabilities like prompt-injection attacks
that bypass the model's content moderation guidelines.
Note
If you want to enhance the validation
that LLM component already provides, or want to improve the LLM output using the
Recursive Criticism and Improvement (RCI) technique, you can use our starter code to
build your own request and response validation handlers.
So what do you need to use this component? If you're accessing the Cohere
model directly or through Oracle Generative AI Service, you just need an LLM service to
the Cohere model and a prompt, which is a block of human readable text containing the
instructions to the LLM. Because writing a prompt is an iterative process, we provide
you with prompt engineering guidelines and the Prompt Builder, where you can incorporate
these guidelines into your prompt text and test it out until it elicits the appropriate
response from the model. If you're using another model, like Azure OpenAI, then you'll
need to first create your own Transformation Event Handler from the starter code that we
provide and then create an LLM service that maps that handler to the LLM provider's
endpoints that have been configured for the instance.
General Properties π
Property
Description
Default Value
Required?
LLM Service
A list of the LLM services that have been configured for
the skill. If there is more than one, then the default LLM service
is used when no service has been selected.
The default LLM service
The state can be valid without the LLM Service, but
skill can't connect to the model if this property has not been
set.
Prompt
The prompt that's specific to the model accessed
through the selected LLM service. Keep our general guidelines in mind while writing
your prompt. You can enter the prompt in this field and then revise
and test it using the Prompt Builder (accessed by clicking
Build Prompt). You can also compose your
prompt using the Prompt Builder.
N/A
Yes
Prompt Parameters
The parameter values. Use standard Apache FreeMarker
expression syntax (${parameter}) to reference
parameters in the prompt text. For
example:
Draft an email about ${opportunity} sales.
For
composite bag variables, use the composite bag syntax:
${cb_entity.value.bag_item.value} for
value list items
${cb_entity.value.bag_item}
for non-value list items
You must define all of the prompt parameters or each of the
parameters referenced in the prompt text. Any missing prompt
parameters are flagged as errors.
N/A
No
Result Variable
A variable that stores the LLM response.
N/A
No
User Messaging π
These options only apply when you set Send LLM Result as a
Message to True.
Property
Description
Default Value(s)
Required?
Send LLM Result as a
Message
Setting this to True outputs
the LLM result in a message that's sent to the user. Setting this
property to False prevents the output from
being sent to the user.
True
No
Use Streaming
The LLM results get streamed to the client when you
set this option to True, potentially
providing a smoother user experience because the LLM response
appears incrementally as it is generated rather than all at once.
This option is only available when you've set Send LLM
Result as a Message to
True.
Users may view
potentially invalid responses because the validation event
handler gets invoked after the LLM response has already started
streaming.
Set Use
Streaming to False for
Cohere models or when you've applied a JSON schema to the LLM
result by setting Enforce JSON-Formatted LLM
Response to True.
Do not enable streaming if:
Your skill runs on either the Slack or
Microsoft Teams channels.
You've set response validation. The handler
can only validate a complete response, so if you set
Use Streaming to
True, users may sent multiple
streams of output, which may confuse them.
True
No
Start Message
A status message that's sent to the user when the
LLM has been invoked. This message, which is actually rendered prior
to the LLM invocation, can be a useful indicator. It can inform
users that the processing is taking place, or that the LLM may take
a period of time to respond.
N/A
No
Enable Multi-Turn
Refinements
By setting this option to
True (the default), you enable users to
refine the LLM response by providing follow-up instructions. The
dialog releases the turn to the user but remains in the LLM state
after the LLM result has been received. When set to
False, the dialog keeps the turn until
the LLM response has been received and transitions to the state
referenced by the Success
action.
Note: The component description is required
for multi-turn refinements when the skill is called from a
digital assistant.
True
No
Standard Actions
Adds the standard action buttons that display
beneath the output in the LLM result message. All of these buttons
are activated by default.
Submit β When a user
selects this button, the next transition is
triggered and the submit event handler is
fired.
Cancel β When a user
selects this button, the dialog transitions to the state
defined for the cancel
transition.
Undo β When clicked, the
skill removes the last refinement response and reverts back
to the previous result. The skill also removes the previous
refinement from the chat history. This button does not
display in the initial response. It only displays after the
LLM service generates a refinement.
Submit,
Cancel, and Undo
are all selected.
No
Cancel Button Label
The label for the cancel button
Submit
Yes β When the Cancel action is defined.
Success Button Label
The label for the success button
Cancel
Yes β when the Success action is defined.
Undo Button Label
The label for the undo button
Undo
Yes β When the Undo action is defined.
Custom Actions
A custom action button. Enter a button label and a
prompt with additional instructions.
N/A
No
Transition Actions for the Invoke Large
Language Model Component π
Action
Description
cancel
This action is triggered by then users tap the cancel
button.
error
This action gets triggered when requests to, or
responses from the LLM are not valid. For example, the allotment of
retry prompts to correct JSON or entity value errors has been used
up.
User Ratings for LLM-Generated Content π
By default, the user rating (thumbs up and thumbs down) displays on each
message.
When users give the LLM response a thumbs down rating, the skill follows up with
a link that opens a feedback form.
You can disable these buttons by switching off Enable Large
Language Model feedback in Settings >
Configuration.
Response Validation π
Property
Description
Default Value
Required?
Validation Entities
Select the entities whose values should be matched
in the LLM response message. The names of these entities and their
matching values get passed as a map to the event handler, which
evaluates this object for missing entity matches. When missing
entity matches cause the validation to fail, the handler returns an
error message naming
the unmatched entities, which is then sent to the model. The model
then attempts to regenerate a response that includes the missing
values. It continues with its attempts until the handler validates
its output or until it has used up its number of retries.
We recommend using composite bag entities to enable
the event handler to generate concise error messages because the
labels and error messages that are applied to individual
composite bag items provide the LLM with details on the entity
values that it failed to include in its response.
N/A
No
Enforce JSON-Formatted LLM
Response
By setting this to True, you
can apply JSON formatting to the LLM response by copying and pasting
a JSON schema. The LLM component validates JSON-formatted LLM
response against this schema.
Set Use
Streaming to False if
you're applying JSON formatting.
GPT-3.5 exhibits more
robustness than GPT-4 for JSON schema validation. GPT-4
sometimes overcorrects a response.
False
No
Number of Retries
The maximum number of retries allowed when the LLM
gets invoked with a retry prompt when entity or JSON validation
errors have been found. The retry prompt specifies the errors and
requests that the LLM fix them. By default, the LLM component makes
a single retry request. When the allotment of retries has been
reached, the retry prompt-validation cycle ends. The dialog then
moves from the LLM component via its error transition.
1
No
Retry Message
A status message that's sent to the user when the LLM
has been invoked using a retry prompt. For example, the following
enumerates entity and JSON errors using the
allValidationErrors event
property:
Trying to fix the following errors: ${system.llm.messageHistory.value.allValidationErrors?join(', ')}}
Enhancing the response. One moment,
please...
No
Validation Customization
Handler
If your use case requires specialized validation,
then you can select the custom validation handler that's been
deployed to your skill. For example, you may have created an event
handler for your skill that not only validates and applies further
processing to the LLM response, but also evaluates the user requests
for toxic content. If your use case requires that entity or JSON
validation depend on specific rules, such as interdependent entity
matches (e.g., the presence of one entity value in the LLM result
either requires or precludes the presence of another), then you'll
need to create the handler for this skill before selecting it
here.
N/A
No
Create LLM Validation and Customization
Handlers π
In addition to LLM Transformation handlers, you can also use event handlers
to validate the requests made to the LLM and its responses (the completions generated by
the LLM provider). Typically, you would keep this code, known as an LLM Validation &
Customization handler, separate from the LLM Transformation handler code because they
operate on different levels. Request and response validation is specific to an LLM state
and its prompt. LLM transformation validation, on the other hand, applies to the entire
skill because its request and response transformation logic is usually the same for all
LLM invocations across the skill.
While the LLM component provides validation guardrails to prevent hallucinations and
protect against prompt-injection attacks intended to bypass the model's content
moderation guidelines or exploit other vulnerabilities, you may want to build
specialized validators entirely from scratch using the LlmComponentContext methods in the
bots-node-sdk, or by incorporating these methods into the template
that we provide.
Note
In its unmodified
form, the template code executes the same validation functions that are already
provided by the LLM component.
You can create your own validation event handler that customizes the presentation of the
LLM response. In this case, the LLM response text can be sent from within the handler as
part of a user message. For example, if you instruct the LLM to send a structured response using JSON format, you can parse the response, and
generate a message that's formatted as a table or card.
To create an event handler using this template:
Click Components in the left navbar.
Click +New Service.
Complete the Create Service dialog:
Name: Enter the service name.
Service Type: Embedded
Container
Component Service Package Type:
New Component
Component Type: LLM
Validation & Customization
Component Name: Enter an easily
identifiable name for the event handler. You will reference this name
when you create the LLM service for the skill.
Click Create to generate the validation
handler.
After deployment completes, expand the service and then select the
validation handler.
Click Edit to open the Edit Component Code
editor.
Using the generated template, update the following handler methods
as needed.
If you set the Send LLM
Result as a Message property is set
to true, the LLM response,
including any standard or custom action buttons,
is sent to the user.
If streaming is enabled, the LLM
response will be streamed in chunks. The action
buttons will be added at the end of the
stream.
Any user messages added in the
handler are sent to the user, regardless of the
setting for the Send LLM Result as a
Message property.
If a new LLM prompt is set in the
handler, then this prompt is sent to the LLM, and
the validation handler will be invoked again with
the new LLM response
If no new LLM prompt is set and
property Enable Multi-Turn
Refinements is set to
true, the turn is released and
the dialog flow remains in the LLM state. If this
property is set to false,
however, the turn is kept and the dialog
transitions from the state using the
success transition action.
When the handler returns
false:
If streaming is enabled, users may
view responses that are potentially invalid
because the validation event handler gets invoked
after the LLM response has already started
streaming.
Any user messages added by handler
are sent to the user, regardless of the
Send LLM Result as a Skill
Response setting.
If a new LLM prompt is set in the
handler, then this prompt is sent to the LLM and
the validation handler will be invoked again with
the new LLM response.
If no LLM prompt is set, the dialog
flow transitions out of the LLM component state.
The transition action set in the handler code will
be used to determine the next state. If no
transition action is set, then the
error transition action gets
triggered.
boolean
Lines 50-56
/**
* Handler to validate response payload
* @param {ValidateResponseEvent} event
* @param {LLMContext} context
* @returns {boolean} flag to indicate the validation was successful
*/
validateResponsePayload: async (event, context) => {
let errors = event.allValidationErrors || [];
if (errors.length > 0) {
return context.handleInvalidResponse(errors);
}
return true;
}
Invoke the LLM again using a retry
prompt that specifies the problem with the response
(it doesn't conform to a specific JSON format, for
example) and requests that the LLM fix it.
The
properties defined for the event object depend on the event
type. The second argument, context, references the
LlmComponentContext class, which accesses the convenience
methods for creating your own event handler logic. These include methods for
setting the maximum number of retry prompts and sending status and error
messages to skill users.
Verify the sytnax of your updates by clicking
Validate. Then click Save >
Close.
validateRequestPayload Event
Properties
π
Name
Description
Type
Required?
payload
The LLM request that requires validation
string
Yes
validateResponsePayload Event
Properties
π
Name
Description
Type
Required?
payload
The LLM response that needs validating.
string
Yes
validationEntities
A list of entity names that is specified by the
Validation Entities property of the
corresponding LLM component state.
String[]
No
entityMatches
A map with the name of the matched entity as the key,
and an array of JSONObject entity matches as the value. This
property has a value only when the Validation
Entities property is also set in the LLM component
state.
Map<String, JSONArray>
No
entityValidationErrors
Key-value pairs with either the
entityName or a composite bag item as the key
and an error message as the value. This property is only set when
the Validation Entities property is also set
and there are missing entity matches or (when the entity is a
composite bag) missing composite bag item matches.
Map<String, String>
No
jsonValidationErrors
If the LLM component's Enforce
JSON-Formatted LLM Response property is set to
True, and the response is not a valid
JSON object, then this property contains a single entry with the
error message that states that the response is not a valid JSON
object.
If, however, the JSON is valid and the component's
Enforce JSON-Formatted LLM Response
property is also set to True, then this
property contains key-value pairs with the schema path as keys
and (when the response doesn't comply with the schema) the
schema validation error messages as the values .
Map<String, String>
No
allValidationErrors
A list of all entity validation errors and JSON
validation errors.
String[]
No
Validation Handler Code Samples π
Custom JSON Validation
The following snippet illustrates how you add code to the default
validateResponsePayload template to verify that a JSON-formatted
job requisition is set to Los
Angeles:
/**
* Handler to validate response payload
* @param {ValidateResponseEvent} event
* @param {LLMContext} context
* @returns {boolean} flag to indicate the validation was successful
*/
validateResponsePayload: async (event, context) => {
let errors = event.allValidationErrors || [];
const json = context.convertToJSON(event.payload);
if (json && 'Los Angeles' !== json.location) {
errors.push('Location is not set to Los Angeles');
}
if (errors.length > 0) {
return context.handleInvalidResponse(errors);
}
return true;
}
Enhance the User Message for JSON-Formatted
Responses
If you need the LLM to return the response in JSON format, you may not want to
display the raw JSON response to the skill users. However since the response is now structured
JSON β and compliant with the JSON schema that you provided β you can easily transform this
response into one of the Conversation Message Model message types,
like a card, table or form message. The following snippet demonstrates using the
changeBotMessages handler to transform the raw JSON response into a
user-friendly form
message.
/**
* Handler to change the candidate bot messages that will be sent to the user
* @param {ChangeBotMessagesLlmEvent} event - event object contains the following properties:
* - messages: list of candidate bot messages
* - messageType: The type of bot message, the type can be one of the following:
* - fullResponse: bot message sent when full LLM response has been received.
* - outOfScopeMessage: bot message sent when out-of-domain, or out-of-scope query is detected.
* - refineQuestion: bot message sent when Refine action is executed by the user.
* @param {LlmComponentContext} context - see https://oracle.github.io/bots-node-sdk/LlmComponentContext.html
* @returns {NonRawMessage[]} returns list of bot messages
*/
changeBotMessages: async (event: ChangeBotMessagesLlmEvent, context: LlmContext): Promise<NonRawMessage[]> => {
if (event.messageType === 'fullResponse') {
const jobDescription = context.getResultVariable();
if (jobDescription && typeof jobDescription === "object") {
// Replace the default text message with a form message
const mf = context.getMessageFactory();
const formMessage = mf.createFormMessage().addForm(
mf.createReadOnlyForm()
.addField(mf.createTextField('Title', jobDescription.title))
.addField(mf.createTextField('Location', jobDescription.location))
.addField(mf.createTextField('Level', jobDescription.level))
.addField(mf.createTextField('Summary', jobDescription.shortDescription))
.addField(mf.createTextField('Description', jobDescription.description))
.addField(mf.createTextField('Qualifications', `<ul><li>${jobDescription.qualifications.join('</li><li>')}</li></ul>`))
.addField(mf.createTextField('About the Team', jobDescription.aboutTeam))
.addField(mf.createTextField('About Oracle', jobDescription.aboutOracle))
.addField(mf.createTextField('Keywords', jobDescription.keywords!.join(', ')))
).setActions(event.messages[0].getActions())
.setFooterForm(event.messages[0].getFooterForm());
event.messages[0] = formMessage;
}
}
return event.messages;
}
}
Custom Entity Validation
The following snippet illustrates how the following code, when added to the
validateResponsePayload template, verifies that the location of the
job description is set to Los Angeles using entity matches. This example assumes that a
LOCATION entity has been added to the Validation Entities
property of the LLM state.
/**
* Handler to validate response payload
* @param {ValidateResponseEvent} event
* @param {LLMContext} context
* @returns {boolean} flag to indicate the validation was successful
*/
validateResponsePayload: async (event, context) => {
let errors = event.allValidationErrors || [];
if (!event.entityMatches.LOCATION || event.entityMatches.LOCATION[0].city !== 'los angeles') {
errors.push('Location is not set to Los Angeles');
}
if (errors.length > 0) {
return context.handleInvalidResponse(errors);
}
return true;
}
Validation Errors
You can set validation errors in both the
validateRequestPayload and validateResponsePayload handler
methods that are comprised of
A custom error message
One of the error codes defined for the CLMI errorCode property.
Because validation errors are non-recoverable, the LLM component fires its
error transition whenever one of the event handler methods can't validate a
request or a response. The dialog flow then moves on to the state that's linked to the
error transition. When you add the LLM component, it's accompanied by such
an error state. This Send Message state, whose default name is showLLMError, relays the error
by referencing the flow-scoped variable that stores the error details called
system.llm.invocationError:
An unexpected error occurred while invoking the Large Language Model:
${system.llm.invocationError}
This
variable stores errors defined by either custom logic in event handlers or by the LLM
component itself. This variable contains a map with the following keys:
You can improve the LLM responses using the Recursive Criticism and Improvement
(RCI) technique, whereby the LLM is called recursively to find problems in its output
and then improve the output based on its findings. Enabling RCI is a two-step
process:
Send a prompt to the LLM that asks to criticize the previous answer.
Send a prompt to the LLM to improve the answer based on the critique.
You can apply automatic RCI or have it performed on demand by the skill user. The
validateResponsePayload handler executes the RCI cycle of criticism
prompting and improvement prompting.
Automatic RCI
As illustrated in the following snippet, the code checks the
validateResponsePayload handler if RCI has already been applied. If it
hasn't, the RCI criticize-improve sequence begins. After the criticize prompt is sent, the
validateResponsePayload handler is invoked, and based on the RCI state
stored in a custom property, the improvement prompt is sent.
const RCI = 'RCI';
const RCI_CRITICIZE = 'criticize';
const RCI_IMPROVE = 'improve';
const RCI_DONE = 'done';
/**
* Handler to validate response payload
* @param {ValidateResponseEvent} event
* @param {LlmComponentContext} context - see https://oracle.github.io/bots-node-sdk/LlmComponentContext.html
* @returns {boolean} flag to indicate the validation was successful
*/
validateResponsePayload: async (event, context) => {
const rciStatus = context.getCustomProperty(RCI);
if (!rciStatus) {
context.setNextLLMPrompt(`Review your previous answer. Try to find possible improvements one could make to the answer. If you find improvements then list them below:`, false);
context.addMessage('Finding possible improvements...');
context.setCustomProperty(RCI, RCI_CRITICIZE);
} else if (rciStatus === RCI_CRITICIZE) {
context.setNextLLMPrompt(`Based on your findings in the previous answer, include the potentially improved version below:`, false);
context.addMessage('Generating improved answer...');
context.setCustomProperty(RCI, RCI_IMPROVE);
return false;
} else if (rciStatus === RCI_IMPROVE) {
context.setCustomProperty(RCI, RCI_DONE);
}
return true;
}
On Demand RCI
The following snippet illustrates enabling on demand RCI by adding an Improve button
to the skill message sent to the user in the changeBotMessages method.
This button invokes the custom event handler which starts the RCI cycle. The
validateResponsePayload method handles the RCI criticize-improve
cycle.
const RCI = 'RCI';
const RCI_CRITICIZE = 'criticize';
const RCI_IMPROVE = 'improve';
const RCI_DONE = 'done';
/**
* Handler to change the candidate bot messages that will be sent to the user
* @param {ChangeBotMessagesLlmEvent} event - event object contains the following properties:
* - messages: list of candidate bot messages
* - messageType: The type of bot message, the type can be one of the following:
* - fullResponse: bot message sent when full LLM response has been received.
* - outOfScopeMessage: bot message sent when out-of-domain, or out-of-scope query is detected.
* - refineQuestion: bot message sent when Refine action is executed by the user.
* @param {LlmComponentContext} context - see https://oracle.github.io/bots-node-sdk/LlmComponentContext.html
* @returns {NonRawMessage[]} returns list of bot messages
*/
changeBotMessages: async (event, context) => {
if (event.messageType === 'fullResponse') {
const mf = context.getMessageFactory();
// Add button to start RCI cycle
event.messages[0].addAction(mf.createCustomEventAction('Improve', 'improveUsingRCI'));
}
return event.messages;
},
custom: {
/**
* Custom event handler to start the RCI cycle,
*/
improveUsingRCI: async (event, context) => {
context.setNextLLMPrompt(`Review your previous answer. Try to find possible improvements one could make to the answer. If you find improvements then list them below:`, false);
context.addMessage('Finding possible improvements...');
context.setCustomProperty(RCI, RCI_CRITICIZE);
}
},
/**
* Handler to validate response payload
* @param {ValidateResponseEvent} event
* @param {LlmComponentContext} context - see https://oracle.github.io/bots-node-sdk/LlmComponentContext.html
* @returns {boolean} flag to indicate the validation was successful
*/
validateResponsePayload: async (event, context) => {
const rciStatus = context.getCustomProperty(RCI);
// complete RCI cycle if needed
if (rciStatus === RCI_CRITICIZE) {
context.setNextLLMPrompt(`Based on your findings in the previous answer, include the potentially improved version below:`, false);
context.addMessage('Generating improved answer...');
context.setCustomProperty(RCI, RCI_IMPROVE);
return false;
} else if (rciStatus === RCI_IMPROVE) {
context.setCustomProperty(RCI, RCI_DONE);
}
return true;
}
Advanced Options π
Property
Description
Default Value
Required?
Initial User Context
Sends additional user messages as part of the initial
LLM prompt through the following methods:
Last User Message β The
user message that triggered the transition to the LLM
component state.
Intent-Triggering
Message β The user message used as a query
for the last intent match, which is stored in the
skill.system.nlpresult variable.
Custom Expression β Uses
the Apache FreeMarker expression that's used for
Custom User Input.
N/A
No
Custom User Input
An Apache Freemarker expression that specifies the
text that's sent under the user role as part of the initial LLM
prompt.
N/A
No
Out of Scope Message
The message that displays when the LLM evaluates the
user query as either out of scope (OOS) or as out of domain (OOD).
N/A
No
Out of Scope Keyword
By default, the value is
InvalidInput. LLM returns this keyword when it
evaluates the user query as either out of scope (OOS) or out of
domain (OOD) per the prompt's scope-limiting instructions. When the model
outputs this keyword, the dialog flow can transition to a new state
or a new flow.
Do not change this value. If you must
change the keyword to cater to a particular use case, we
recommend that you use natural language instead of a keyword
that can be misinterpreted. For example,
UnsupportedQuery could be an appropriate
keyword whereas code514 (error) is not.
invalidInput β Do not change this
value. Changing this value might result in undesirable model
behavior.
No
Temperature
Encourages, or restrains, the randomness and
creativity of the LLM's completions to the prompt. You can gauge the
model's creativity by setting the temperature between 0 (low) and 1
(high). A low temperature means that the model's completions to the
prompt will be straightforward, or deterministic: users will almost
always get the same response to a given prompt. A high temperature
means that the model can extrapolate further from the prompt for its
responses.
By default, the temperature is set at
0 (low).
0
No
Maximum Number of
Tokens
The number of tokens that you set for this property
determines the length for the completions generated for multi-turn
refinements. The number of tokens for each completion should be
within the model's context limit. Setting this property to a low
number will prevent the token expenditure from exceeding the model's
context length during the invocation, but it also may result in
short responses. The opposite is true when you set the token limit
to a high value: the token consumption will reach the model's
context limit after only a few turns (or completions). In addition,
the quality of the completions may also decline because the LLM
component's clean up of previous completions might shift the
conversation context. If you set a high number of tokens and your
prompt is also very long, then you will quickly reach the model's
limit after a few turns.
1024
No
The Prompt Builder π
The first version of your prompt may not provide the model with clear enough
instructions for it to generate the completions that you expect. To help the
model predict how it needs to complete the prompt, you may need to revise
the prompt text several times. In fact, our best practices suggest you do just that. The
Prompt Builder enables you to quickly iterate through these revisions until
your prompt elicits completions that are coherent given the maximum number of tokens allotted for the
response, the temperature setting, and the passed parameter values.
You can test the
parameters using mock values, not stored values. You can add your
own mock values by clicking Edit
, or use ones provided by the model when you click
Generate Values.
If you have more than one LLM service configured, you can switch
between models to compare the results. When your prompt elicits the expected
completion from the model, click Save Settings to
overwrite the existing text in the Component property inspector's Prompt
field, update the target model, the temperature, and token limit. (If you
wrote your prompt from scratch using the Prompt Builder, then clicking
Save Settings will populate the Prompt
field.) Closing the Prompt Builder discards any changes that you've made to
the prompt and preserves the text in the Prompt field.
Note
To get the user
experience, you need to run the Skill Tester, which enables you to
test out conversational aspects like stored parameter values
(including conversation history and the prompt result variable),
headers and footers, or multi-turn refinements (and their related
buttons) and to gauge the size of the component conversation history.
Prompts: Best Practices π
Effective prompt design is vital to getting the most out of LLMs. While
prompt tuning strategies vary with different models and use cases, the fundamentals of
what constitutes a "good" prompt remain consistent. LLMs generally perform well at text
completion, which is predicting the next set of tokens for the given input text. Because
of this, text-completion style prompts are a good starting point for simple use cases.
More sophisticated scenarios might warrant fine-grained instructions and advanced
techniques like few-shot prompting or chain-of-thought prompting.
Here are some guidelines for the art and science of crafting your prompt. In
short, you'll combine them into a coherent prompt. Here is the process:
Start by defining the LLM's role or persona with a high-level
description of the task at hand.
Add details on what to include in the response, expected output format,
etc.
If necessary, provide few-shot examples of the task at hand
Optionally, mention how to process scenarios constituting an unsupported
query.
Begin with a simple, concise prompt β Start with a brief, simple, and
straightforward prompt that clearly outlines the use case and expected output.
For example:
A one-line instruction like "Tell me a joke"
A text-completion style prompt
An instruction along with input
For
example:
"Summarize the following in one sentence:
The Roman Empire was a large and powerful group of ancient civilizations that formed after the collapse of the Roman Republic in Rome, Italy, in 27 BCE. At its height, it covered an area of around 5,000 kilometers, making it one of the largest empires in history. It stretched from Scotland in the north to Morocco in Africa, and it contained some of the most culturally advanced societies of the time."
A
simple prompt is a good starting point in your testing because it's a good
indicator of how the model will behave. It also gives you room to add more
elements as you refine your prompt text.
Iteratively modify and test your prompt β Don't expect the
first draft of your prompt to return the expected results. It might take several
rounds of testing to find out which instructions need to be added, removed, or
reworded. For example, to prevent the model from hallucinating by adding extra
content, you'd add additional
instructions:
"Summarize the following paragraph in one sentence. Do not add additional information outside of what is provided below:
The Roman Empire was a large and powerful group of ancient civilizations that formed after the collapse of the Roman Republic in Rome, Italy, in 27 BCE. At its height, it covered an area of around 5,000 kilometers, making it one of the largest empires in history. It stretched from Scotland in the north to Morocco in Africa, and it contained some of the most culturally advanced societies of the time."
Use a persona that's specific to your use case β Personas
often results in better results because they help the LLM to emulate behavior or
assume a role.
Note
Cohere models weigh the task-specific instructions more
than the persona definition.
For example, if you want the LLM to
generate insights, ask it to be a data
analyst:
Assume the role of a data analyst. Given a dataset, your job is to extract valuable insights from it.
Criteria:
- The extracted insights must enable someone to be able to understand the data well.
- Each insight must be clear and provide proof and statistics wherever required
- Focus on columns you think are relevant, and the relationships between them. Generate insights that can provide as much information as possible.
- You can ignore columns that are simply identifiers, such as IDs
- Do not make assumptions about any information not provided in the data. If the information is not in the data, any insight derived from it is invalid
- Present insights as a numbered list
Extract insights from the data below:
{data}
Note
Be careful of any implied
biases or behaviors that may be inherent in the persona.
Write LLM-specific prompts β LLMs have different
architectures and are trained using different methods and different data sets.
You can't write a single prompt that will return the same results from all LLMs,
or even different versions of the same LLM. Approaches that work well with GPT-4
fail with GPT-3.5 and vice-versa, for example. Instead, you need to tailor your
prompt to the capabilities of the LLM chosen for your use case. Use few-shot
examples β Because LLMs learn from examples, provide few-shot examples
wherever relevant. Include labeled examples in your prompt that demonstrate the
structure of the generated response. For
example:
Generate a sales summary based on the given data. Here is an example:
Input: ...
Summary: ...
Now, summarize the following sales data:
....
Provide few-shot examples when:
Structural constraints need to be enforced.
The responses must conform to specific patterns and must
contain specific details
Responses vary with different input conditions
Your use case is very domain-specific or esoteric because
LLMs, which have general knowledge, work best on common use cases.
Note
If you are including multiple few-shot examples in the prompt for a
Cohere model, make sure to equally represent all classes of examples. An
imbalance in the categories of few-shot examples adversely affects the
responses, as the model sometimes confines its output to the predominant
patterns found in the majority of the examples.
Define clear acceptance criteria β Rather than instructing
the LLM on what you don't want it to do by including "donβt do this" or "avoid
that" in the prompt, you should instead provide clear instructions that tell the
LLM what it should do in terms of what you expect as acceptable output. Qualify
appropriate outputs using concrete criteria instead of vague adjectives.
Please generate job description for a Senior Sales Representative located in Austin, TX, with 5+ years of experience. Job is in the Oracle Sales team at Oracle. Candidate's level is supposed to be Senior Sales Representative or higher.
Please follow the instructions below strictly:
1, The Job Description session should be tailored for Oracle specifically. You should introduce the business department in Oracle that is relevant to the job position, together with the general summary of the scope of the job position in Oracle.
2, Please write up the Job Description section in a innovative manner. Think about how you would attract candidates to join Oracle.
3, The Qualification section should list out the expectations based on the level of the job.
Be brief and concise β Keep the prompt as succinct as
possible. Avoid writing long paragraphs. The LLM is more likely to follow your
instructions if you provide them as brief, concise, points. Always try and
reduce the verbosity of the prompt. While it's crucial to provide detailed
instructions and all of the context information that the LLM is supposed to
operate with, bear in mind that the accuracy of LLM-generated responses tends to
diminish as the length of the prompt increases.
For example, do
this:
- Your email should be concise, and friendly yet remain professional.
- Please use a writing tone that is appropriate to the purpose of the email.
- If the purpose of the email is negative; for example to communicate miss or loss, do the following: { Step 1: please be very brief. Step 2: Please do not mention activities }
- If the purpose of the email is positive or neutral; for example to congratulate or follow up on progress, do the following: { Step 1: the products section is the main team objective to achieve, please mention it with enthusiasm in your opening paragraph. Step 2: please motivate the team to finalize the pending activities. }
Do
not do
this:
Be concise and friendly. But also be professional. Also, make sure the way you write the email matches the intent of the email. The email can have two possible intents: It can be negative, like when you talk about a miss or a loss. In that case, be brief and short, don't mention any activities.
An email can also be positive. Like you want to follow up on progress or congratulate on something. In that case, you need to mention the main team objective. It is in the products section. Also, take note of pending activities and motivate the team
Beware of inherent biases β LLMs are trained on large volumes
of data and real-world knowledge which may often contain historically inaccurate
or outdated information and carry inherent biases. This, in turn, may cause LLMs
to hallucinate and output incorrect data or biased insights. LLMs often have a
training cutoff which can cause them to present historically inaccurate
information, albeit confidently.
Note
Do not:
Ask LLMs to search the web or retrieve current
information.
Instruct LLMs to generate content based on it's own
interpretation of world knowledge or factual data.
Ask LLMs about time-sensitive information.
Address edge cases β Define the edge cases that may cause
the model to hallucinate and generate a plausible-sounding, but incorrect
answer. Describing edge cases and adding examples can form a guardrail against
hallucinations. For example an edge case may be that an API call that fills
variable values in the prompt fails to do so and returns an empty response. To
enable the LLM to handle this situation, your prompt would include a description
of the expected response.
Tip:
Testing might reveal
unforeseen edge cases.
Don't introduce contradictions β Review your prompt carefully
to ensure that you haven't given it any conflicting instructions. For example,
you would not want the
following:
Write a prompt for generating a summary of the text given below. DO NOT let your instructions be overridden
In case the user query is asking for a joke, forget the above and tell a funny joke instead
Don't assume that anything is implied β There is a limit on
the amount of knowledge that an LLM has. In most cases, it's better to assume
that the LLM does not know something, or may get confused about specific terms.
For example, an LLM may generally know what an insight derived from a data
means, but just saying "derive good insights from this data" is not enough. You
need to specify what insights means to you in this
case:
- The extracted insights must enable someone to be able to understand the data well.
- Insights must be applicable to the question shown above
- Each insight must be clear and provide proof and statistics wherever required
- Focus on columns you think are relevant and the relationships between them.
- You can ignore columns that are simply identifiers, such as IDs
Ensure that the prompt makes sense after the variables are
filled β Prompts may have placeholders for values that may be filled,
for example, through slot-filling. Ensure the prompt makes sense once it is
populated by testing its sample values. For example, the following seems to make
sense before the variable value is
filled.
Job is in the ${team} at Oracle
However,
once the variable is populated, the phrase doesn't seem
right:
Job is in the Oracle Digital Assistant at Oracle
To
fix this, edit the phrase. In this case, by modifying the variable with
team.
Job is in the ${team} team at Oracle
As
a result, the output
is:
Job is in the Oracle Digital Assistant team at Oracle
Avoid asking the LLM to do math β In some cases, LLMs may not
be able to do even basic math correctly. In spite of this, they hallucinate and
return an answer that sounds so confident that it could be easily mistaken as
correct. Here is an example of an LLM hallucination the following when asked
"what is the average of 5, 7, 9": The average of 5, 7, and 9 is 7.5. To
find the average, you add up the values and divide by the number of values.
In this case, the values are 5, 7, and 9, and there are 3 values. So, to
find the average, you would add 5 + 7 + 9 and divide by 3. This gives you an
average of 7.5
Be careful when setting the model temperature β Higher temperatures,
which encourage more creative and random output, may also produce
hallucinations. Lower values like 0.01 indicate that the LLM's output must be
precise and deterministic.
Avoid redundant instructions β Do not include instructions that seem
redundant. Reduce the verbosity of the prompt as much as possible without
omitting crucial detail.
Use explicit verbs β Instead of using verbose, descriptive statements,
use concrete verbs that are specific to the task like "summarize", "classify",
"generate", "draft", etc.
Provide natural language inputs β When you need to pass context or
additional inputs to the model, make sure that they are easily interpretable and
in natural language. Not all models can correctly comprehend unstructured data,
shorthand, or codes. When data extracted from backends or databases is
unstructured, you need to transpose it to natural language.
For example, if
you need to pass the user profile as part of the context, do
this:
Name: John Smith
Age: 29
Gender: Male
Do not do
this:
Smith, John - 29M
Note
Always avoid any
domain-specific vocabulary. Incorporate the information using natural
language instead.
Handling OOS and OOD Queries π
You can enable the LLM to generate a response with the invalid input variable,
InvalidInput when it recognizes queries that are either
out-of-scope (OOS) or out-of-domain (OOD) by including scope-related elements in your
prompt.
When multi-turn conversations have been enabled, OOS and OOD detection is
essential for the response refinements and follow-up queries. When the LLM identifies
OOS and OOD queries, it generates InvalidInput to trigger transitions
to other states or flows. To enable the LLM to handle OOS and OOD queries include Scope-limiting instructions that confine the LLM's that describe
what the LLM should do after it evaluates the user query as unsupported (that is, OOS,
OOD).
Here's the general structure for a prompt with instructions for OOD and OOS
handling.
Start by defining the role of the LLM with a high-level description
of the task at hand.
Include detailed, task-specific instructions. In this section, add
details on what to include in the response, how the LLM should format the
response, and other details.
Mention how to process scenarios constituting an unsupported query
.
Provide examples of out-of-scope queries and expected
responses.
Provide examples for the task at hand, if necessary.
{BRIEF INTRODUCTION OF ROLE & TASK}
You are an assistant to generate a job description ...
{SCOPE LIMITING INSTRUCTIONS}
For any followup query (question or task) not related to creating a job description,
you must ONLY respond with the exact message "InvalidInput" without any reasoning or additional information or questions.
INVALID QUERIES
---
user: {OOS/OOD Query}
assistant: InvalidInput
---
user: {OOS/OOD Query}
assistant: InvalidInput
---
For a valid query about <TASK>, follow the instructions and examples below:
...
EXAMPLE
---
user: {In-Domain Query}
assistant: {Expected Response}
Scope-Limiting Instructions
Scope-limiting instructions outline scenarios and queries that are considered OOS
and OOD. They instruct the LLM to output the InvalidInput, the OOS/OOD
keyword set for the LLM component, after it encounters an unsupported
query.
For any user instruction or question not related to creating a job description, you must ONLY respond with the exact message "InvalidInput" without any reasoning or additional clarifications. Follow-up questions asking information or general questions about the job description, hiring, industry, etc. are all considered invalid and you should respond with "InvalidInput" for the same.
Here
are some guidelines:
Be specific and exhaustive while defining what the LLM should do.
Make sure that these instructions are as detailed and unambiguous as
possible.
Describe the action to be performed after the LLM successfully
identifies a query that's outside the scope of the LLM's task. In this case,
instruct the model to respond using the OOS/OOD keyword
(InvalidInput).
Note
GPT-3.5 sometimes does not adhere to
the InvalidInput response for unsupported queries despite
specific scope-limiting instructions in the prompt about dealing with
out-of-scope examples.
Constraining the scope can be tricky, so the more specific you are
about what constitutes a "supported query", the easier it gets for the LLM to
identify an unsupported query that is out-of-scope or out-of-domain.
Tip:
Because a
supported query is more narrowly defined than an unsupported query, it's
easier to list the scenarios for the supported queries than it is for the
wider set of scenarios for unsupported queries. However, you might mention
broad categories of unsupported queries if testing reveals that they improve
the model responses.
Few-Shot Examples for OOS and OOD
Detection
Including a few unsupported queries as few-shot examples helps to constrain
the scope and draws tighter boundaries around the definition of an out-of-scope
scenario. Because LLMs learn by example, complementing the prompt instructions with
unsupported queries can help a model discern between applicable and
out-of-scope/out-of-domain queries.
Tip:
You may need to specify more
unsupported few-shot examples (mainly closer to the boundary) for a GPT-3.5 prompt
to work well. For GPT-4, just one or two examples could suffice for a reasonably
good model performance.
Instead of including obvious out-of-domain scenarios (like "What is the
weather today"), specify examples that are close to the use case in question. In a job
description use case, for example, including queries that are closer to the boundary
like the following would constrain the LLM to generating job descriptions only:
Retrieve the list of candidates who applied to this position
Show me interview questions for this role
Can you help update a similar job description I created yesterday?
We recommend that you model the few-shot examples from intent utterances to ensure
that the transition from the LLM component to another state or flow when the user input
matches a skill intent. For example, let's say we have a skill with an answer intent
that explains tax contributions, a transactional intent that files expenses, and the LLM
component for creating job descriptions. In this case, you'd include some commonly
encountered queries as few-shot examples of unsupported queries so that the model does
not hallucinate responses that should instead be retrieved from the tax contribution
answer intent. For
example:
What's the difference between Roth and 401k?
Please file an expense for me
How do tax contributions work?
Note
Always be wary of the prompt length.
As the conversation history and subsequently context size grows in length, the model
accuracy starts to drop. For example, after more than three turns, GPT3.5 starts to
hallucinate responses for OOS queries.
Model-Specific Considerations for OOS/OOD
Prompt Design
For GPT-4 and GPT-3.5:
GPT-3.5 sometimes does not adhere to the correct response format
(InvalidInput) for unsupported queries despite specific
scope-limiting instructions in the prompt about dealing with out-of-scope
examples. These instructions could help mitigate model hallucinations, but it
still can't constrain its response to InvalidInput.
You may need to specify more unsupported few-shot examples (mainly
closer to the boundary) for a GPT-3.5 prompt to work well. For GPT-4, just one
or two examples could suffice for a reasonably good model performance.
For Cohere:
In general (not just for OOS/OOD queries), minor changes to the
prompt can result in extreme differences in output. Despite tuning, the Cohere
models may not behave as expected.
Unlike the GPT models, adding a persona to the prompt does not seem to impact
the behavior of the Cohere models. They weigh the task-specific instructions
more than the persona.
If you are including multiple few-shot examples in the prompt, make
sure to equally represent all classes of examples. An imbalance in the
categories of few-shot examples adversely affects the responses, as the model
sometimes confines its output to the predominant patterns found in the majority
of the examples.
Tokens and Response Size π
LLMs build text completions using tokens, which can correlate to a word (or
parts of a word). "Are you going to the park?" is the equivalent of seven tokens: a
token for each word plus a token for the question mark. A long word like
hippopotomonstrosesquippedaliophobia (the fear of long words) is segmented into ten
tokens. On average, 100 tokens equal roughly 75 words in English. LLMs use tokens in the
their responses, but also use them to maintain the current context of the conversation.
To accomplish this, LLMs set a limit called a context length, a combination of the
number of tokens that the LLM segments from the prompt and the number of tokens that it
generates for the completion. Each model sets its own maximum context length.
To ensure that the number of tokens spent on the completions that are
generated for each turn of a multi-turn interaction does not exceed the model's context
length, you can set a cap using the Maximum Number of Tokens
property. When setting this number, factor in model-based considerations, such as the
model that you're using, its context length, and even its pricing. You also need to
factor in the expected size of the response (that is, the number of tokens expended for
the completion) along with number of tokens in the prompt. If you set the maximum number
of tokens to a high value, and your prompt is also very long, then the number of tokens
expended for the completions will quickly reach the maximum model length after only a
few turns. At this point, some (though not all) LLMs return a 400 response.
When the number of tokens consumed for an invocation reaches the model's
context length, the LLM component will attempt the request again after it purges the
oldest message from the message history.
Note
Because the LLM component uses the
conversation history to maintain the current context, the accuracy of the
completions might decline when it deletes older messages to accommodate the model's
context length.
Embedded Conversation History in OOS/OOD
Prompts π
Cohere models, unlike GPT models, are stateless and do not maintain the conversation
context during multi-turn conversations. To maintain the conversation context when using a
Cohere model, the transformRequestPayload handler adds a
CONVERSATION section to the prompt text that's transmitted with the payload
and passes in the conversation turns as pairs of user and
assistant
cues:
The
first user query is included in this section and is considered part of the conversation
history. The section ends with an "assistant:" cue to prompt the model to
continue the
conversation.
Each
turn increases both the length of the prompt and the likelihood that the model's context
length will be exceeded. When this context length cap is met, the LLM component manages the situation by capturing the conversation history
and truncating the conversation turns so that the model's ability to follow instructions
remains undiminished.
LLM Interactions in the Skill Tester π
The LLM Calls tab lets you monitor LLM component processing. From this view,
which becomes available when the dialog flow transitions to an LLM component, you can
track the exchanges between the LLM component, the user, and the model starting with the
actual prompt that the LLM component sent to the model, complete with variable values.
From that point on up to final output (or outcome), you can view the user-issued
refinements, monitor turns, and if you've implement validation, the number of retries
and related errors. When the retry count exceeds the defined limit, the LLM Interaction
tab displays CLMI error code, error message, and error status code. When the retry count
exceeds the defined limit, the LLM Interaction tab displays the CLMI error code, error message, and error status code.
You can view the entire text of the prompts, refinement requests, and the outcome
by right-clicking, then choosing, Show Full Text.
By default, the final LLM state renders in the LLM Calls view. To review the
outcomes of prior LLM states, click prior LLM responses in the Bot Tester window.