AI Agent doesn't store the Tool usages in memory #14361

Open
fjrdomingues opened this issue Apr 2, 2025 · 19 comments

@fjrdomingues

Bug Description

The current implementation of the AI Agent and the Memory nodes stores only the input and output messages, not the Tool messages.

Why is this important?
Have you noticed the agent claiming it called a tool when it didn't? Models have their flaws, but this problem greatly aggravates them. The context window fills with exchanges where the user asks for an action, the AI replies with success, and the user responds with positive feedback. Without the tool messages, the LLM learns this pattern and repeats it, so the next time it won't call a tool and will instead just reply directly to the user.

I have a fork with a working suggestion on how to fix it: fjrdomingues@1af9450
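
For illustration, a minimal sketch of the underlying LangChain JS pattern (the wiring here is assumed for the example, not n8n's actual code): the memory only ever receives the final input/output pair, so the intermediate tool messages are dropped.

// Minimal sketch, assuming LangChain JS BufferMemory semantics.
// saveContext persists only the input/output pair, so the intermediate
// AIMessage carrying tool_calls and the matching ToolMessage from the agent
// loop never reach memory.
import { BufferMemory } from "langchain/memory";

const memory = new BufferMemory({ returnMessages: true });

// Roughly what happens after each agent run:
await memory.saveContext(
  { input: "Please save this contact" },      // stored as a HumanMessage
  { output: "Done, the contact was saved." }, // stored as the final AIMessage
);

// The tool-call request and the tool's result were never passed in, so the
// next turn's history shows only "user asks -> assistant claims success".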

To Reproduce

Use the Simple Memory node or the Postgres one (the only ones I tested) and inspect the messages that were saved. The tool calls are always an empty array, both on save and on load.

Expected behavior

The tool_calls array should be populated when saving memories.
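
For illustration (shapes assumed from LangChain's StoredMessage serialization; the values are made up), here is roughly what gets saved today versus what should be saved:

// What the memory nodes persist today: tool_calls is always empty.
const savedToday = {
  type: "ai",
  data: { content: "I created the contact for you.", tool_calls: [] },
};

// What should be persisted so later turns can see the call and its result:
const expected = [
  {
    type: "ai",
    data: {
      content: "",
      tool_calls: [{ id: "call_1", name: "create_contact", args: { name: "Ada" } }],
    },
  },
  {
    type: "tool",
    data: { content: '{"contactId":"c_123"}', tool_call_id: "call_1" },
  },
];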

Operating System

NA

n8n Version

1.83.2

Node.js Version

20.18.3

Database

PostgreSQL

Execution mode

main (default)

@Joffcom
Member

Joffcom commented Apr 2, 2025

Hey @fjrdomingues,

We have created an internal ticket to look into this which we will be tracking as "GHC-1434"

@Joffcom Joffcom added the in linear Issue or PR has been created in Linear for internal review label Apr 2, 2025
@Joffcom
Member

Joffcom commented Apr 2, 2025

Hey @fjrdomingues,

Is this a bug or an enhancement request?

@Joffcom Joffcom added the Needs Feedback Waiting for further input or clarification. label Apr 2, 2025
@fjrdomingues
Author

Hey @Joffcom, if there's such a category then it may fit better as an enhancement request.

@Joffcom Joffcom removed the Needs Feedback Waiting for further input or clarification. label Apr 2, 2025
@davidsula

I totally agree. I want my AI to remember the output of a tool call, which currently stays empty. For now, agents only seem to remember tool outputs within the turn in which the tool was called.
AND YOU ARE SO RIGHT. My AI agent will call the tool the first two times it is meant to, and then eventually just stop when it should still be calling it, because it learns from the pattern that there were no tool calls before, even though there were.

@GuillaumeRoy

I'd suggest this should rank much higher than "enhancement request". Consider this scenario:

  • You're interacting with an agent. It creates a record in a data store at your request via a tool. Let's say, saving a contact. This returns a contact id.
  • N messages later in the memory window, you ask the AI to add an email address to that contact.
  • With the current state of things, it has no knowledge of the id returned by the tool and will require a search first to reacquire that id.

When working directly with langchain, that tool output including the id would be in the memory context and no round-trip would be necessary.

This is significantly holding back agent tool usage via n8n, IMHO. Working with langchain directly this was never an issue, though persisting tool output does introduce concerns around managing tool verbosity versus filling the context with a lot of tokens.
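
For comparison, a sketch of the history plain LangChain JS would keep in this scenario (the message classes are real; save_contact, update_contact, and the ids are hypothetical):

import { HumanMessage, AIMessage, ToolMessage } from "@langchain/core/messages";

const history = [
  new HumanMessage("Save John Doe as a contact"),
  new AIMessage({
    content: "",
    tool_calls: [{ id: "call_1", name: "save_contact", args: { name: "John Doe" } }],
  }),
  new ToolMessage({ content: '{"contactId":"c_123"}', tool_call_id: "call_1" }),
  new AIMessage("Saved John Doe."),
  // ... N messages later ...
  new HumanMessage("Add john@example.com to that contact"),
];

// Because the ToolMessage with contactId "c_123" is still in the window, the
// model can call update_contact for that id directly. n8n's memory drops the
// two middle messages, forcing the extra search round-trip described above.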

@Merlin-Richter

Yes, this is a bug for sure.

Many applications of AI agents don't work at the moment because of this issue, and many more problems come downstream from it.

Multi-turn Agents with tools are currently not working.

@GuillaumeRoy

Right now I'm working around this by having the tools manually insert into memory using the memory manager, but it's an ugly patch with many pitfalls.

Not sure what the proper etiquette is here to get an update and/or more eyes on this?

@GuillaumeRoy

Hi @Joffcom,

Just want to bring this to your attention. In addition to the example I've given above, I am now seeing multiple models hallucinating tool calls due to this shortcoming.

  • User request leads to a tool call.
  • Agent executes tool call (nothing persisted to memory, neither input nor output).
  • Agent replies to user to indicate tool was called.
  • User requests another tool call.
  • Agent does not see the prior tool call's input and output, only the conversation about it. From that pattern, it learns that replying about a tool call (without making it) is sufficient to satisfy the user request.
  • Agent replies to the user to indicate tool was called, without having actually called the tool, as this fits with the precedent established in its memory.

IMHO this is totally kneecapping n8n's agents when compared to straight Langchain/Langgraph implementations.

@civilcoder55

Totally agree with you @GuillaumeRoy

@davidsula

Hi @Joffcom,

Just want to bring this to your attention. In addition to the example I've given above, I am now seeing multiple models hallucinating tool calls due to this shortcoming.

  • User request leads to a tool call.
  • Agent executes tool call (nothing persisted to memory, neither input nor output).
  • Agent replies to user to indicate tool was called.
  • User requests another tool call.
  • Agent does not see the prior tool call's input and output, only the conversation about it. From that pattern, it learns that replying about a tool call (without making it) is sufficient to satisfy the user request.
  • Agent replies to the user to indicate tool was called, without having actually called the tool, as this fits with the precedent established in its memory.

IMHO this is totally kneecapping n8n's agents when compared to straight Langchain/Langgraph implementations.

A perfect explanation of the issue. Alongside this, it would be better for the AI to remember the output of a tool call: the tool output can contain information the model doesn't include in its initial reply but that may be relevant later on.

@caiosa1337

I encountered a similar issue while building an appointment scheduling system via API using an AI agent in n8n. At one point, the agent would call a tool that returned crucial data like available staff IDs and time slots. Initially, everything worked fine — the AI had access to those values in the current turn and could make correct suggestions.

The problem started when the customer confirmed an option in a later turn. At that point, the AI no longer had access to the previous tool response, and it started "making up" IDs and times because that data was no longer available in context.

The root cause is that tool outputs are not automatically persisted or injected into the prompt context across turns. And since n8n lets us configure the memory window, even if the tool response were saved to memory, it could be forgotten quickly as the conversation grows.

My solution was to manually inject all essential data (like IDs, times, names) into the prompt so it would remain available across turns. This worked and made the system more reliable — but it increased prompt complexity and maintenance.

It would be extremely useful if n8n provided an option to persist tool outputs directly into the prompt context, as a kind of “prompt extension,” without relying solely on memory. That would reduce complexity and help avoid fragile behavior in multi-turn flows.
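
A minimal sketch of that workaround (a hypothetical helper, not an n8n feature): pin the essential tool outputs and re-inject them into the system prompt on every turn, so they survive regardless of the memory window.

interface PinnedFact {
  key: string;
  value: string;
}

// Re-build the system prompt each turn from the pinned facts.
function buildSystemPrompt(base: string, facts: PinnedFact[]): string {
  if (facts.length === 0) return base;
  const pinned = facts.map((f) => `- ${f.key}: ${f.value}`).join("\n");
  return `${base}\n\nKnown values from earlier tool calls (authoritative, never invent):\n${pinned}`;
}

// After the scheduling tool returns, store the essentials and re-inject them:
const prompt = buildSystemPrompt("You are a scheduling assistant.", [
  { key: "staffId", value: "st_42" },
  { key: "slot", value: "2025-05-02T10:00" },
]);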

@Merlin-Richter

This needs to be fixed asap. I am currently doing a workaround with normal HTTP requests and firestore. It's so ugly, lol.
But the AI Agent is currently unusable with tools.

@gradox2020

Without this fix, Agent AI with tools is not usable in real workflows. The agent does not retain tool output and therefore cannot reliably act on previous results. Please consider increasing the priority of this issue. Thank you!

@kshwetabh

Right now I'm working around this by having the tools manually insert into memory using the memory manager, but it's an ugly patch with many pitfalls.

Not sure what the proper etiquette is here to get an update and/or more eyes on this?

Can you please provide more details on how you achieved this using the memory manager? I tried but failed pathetically.

@GuillaumeRoy

Using Redis memory and workflow tool nodes, at the end of the subworkflow I was injecting a message directly into the redis context using the memory manager insert functionality. It kinda sucks because 1) limited applicability 2) manual 3) still can lead to hallucinations 4) message ordering and type is not exactly what it should be.

I ended up implementing my own memory layer as a stopgap and I'm moving serious agent use cases away from n8n 💔
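
Roughly what that Memory Manager insert amounts to at the LangChain JS level (assumed setup; the session id, URL, and message text are illustrative):

import { RedisChatMessageHistory } from "@langchain/redis";
import { AIMessage } from "@langchain/core/messages";

const history = new RedisChatMessageHistory({
  sessionId: "chat-123", // must match the session key the agent's memory uses
  config: { url: "redis://localhost:6379" },
});

// Appended as a plain AIMessage rather than a proper tool-call/ToolMessage
// pair, which is why the message ordering and type end up not exactly what
// they should be.
await history.addMessage(
  new AIMessage('Tool save_contact returned: {"contactId":"c_123"}'),
);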

@Nervo24

Nervo24 commented May 15, 2025

Using Redis memory and workflow tool nodes, at the end of the subworkflow I was injecting a message directly into the redis context using the memory manager insert functionality. It kinda sucks because 1) limited applicability 2) manual 3) still can lead to hallucinations 4) message ordering and type is not exactly what it should be.

I ended up implementing my own memory layer as a stopgap and I'm moving serious agent use cases away from n8n 💔

I tried the same with PostgreSQL, but when inserting the data into memory at the end of the sub-workflow, the tool response ends up inserted before the user prompt.

@ptr-bloch

Is there any movement on this?
The problem runs a bit deeper than what's described here about the model learning from previous messages: some models call the same tool multiple times, once with each new user request, perhaps because they see that the previous conversation doesn't mention the tool usage required by the system prompt. So instead of one call, I get extra calls with each agent execution.

@GuillaumeRoy

@ptr-bloch I went out on a limb and reached out to a member of the n8n team directly over LinkedIn on May 2nd to bring this issue and thread to their attention, and learned that "AI Squad (...) already looking into it and discussing".

@aouicher

aouicher commented May 31, 2025

I've tested different LLM providers, and there seems to be an issue when the provider responds with:

[
    {
      "response": {
        "generations": [
          [
            {
              "text": "bla bla bla bla bla bla bla bla bla bla",
              "generationInfo": {
                "prompt": 0,
                "completion": 0,
                "finish_reason": "tool_calls",
                "system_fingerprint": "fp",
                "model_name": "custom_model"
              }
            }
          ]
        ]
      },
      "tokenUsage": {
        "completionTokens": 328,
        "promptTokens": 2852,
        "totalTokens": 3180
      }
    }
  ]

It seems to work when the response looks like this instead:

[
    {
      "response": {
        "generations": [
          [
            {
              "text": "bla bla bla bla bla bla bla bla bla bla",
              "generationInfo": {
                "finish_reason": "tool_calls"
              }
            }
          ]
        ]
      },
      "tokenUsage": {
        "completionTokens": 328,
        "promptTokens": 2852,
        "totalTokens": 3180
      }
    }
  ]

One other thing: for each failing call to the chat model, the only memory action afterwards is loadMemoryVariables; saveContext is never called. When it works, saveContext is called.
