Building Persistent Agent Memory: Cross-Project Knowledge Graphs


title: "Why your AI agent keeps forgetting everything (and how to fix it with a cross-project memory graph)"

tags: [ai, programming, tutorial, opensource]

Last week, we watched an agent fix the same bug twice in two different repos.

First repo: it learned that X-Internal-Token should never be logged.
Second repo: same framework, same middleware pattern, same mistake... and the agent happily suggested logging it again.

That’s when the real problem clicked: most “agent memory” today is just a nicer scratchpad. It remembers a chat thread, maybe a repo, maybe a session. But it doesn’t build durable knowledge that survives across projects.

If you’re using agents for coding, support, ops, or security work, that gets expensive fast. Repeated mistakes. Repeated discovery. Repeated prompts.

The fix isn’t “more context window.” It’s persistent memory with structure.

The idea: store facts, not transcripts

A raw conversation log is hard to reuse. A knowledge graph is much easier to query, because it splits what the agent learned into three parts:

  • Entities: repos, services, APIs, owners, incidents, secrets, libraries
  • Relationships: depends_on, owned_by, caused_incident, uses_pattern, replaced_by
  • Evidence: commit, file path, PR, ticket, runbook, test failure

Instead of asking:

“What happened in that other project again?”

your agent can ask:

“Show me all services using this auth middleware”
“What incidents were caused by this package?”
“Has any repo already solved this migration?”

That’s a much better shape for long-term memory.
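To make those questions concrete, here is a minimal sketch of facts stored as (subject, relation, object) triples, each with an evidence pointer. The store layout, the query helper, the "auth-middleware" pattern name, and the evidence strings are all assumptions for illustration, not a prescribed format.

```javascript
// Hypothetical in-memory fact store. Each fact is a triple plus evidence.
// Evidence values are placeholders, not real paths.
const facts = [
  { subject: "billing-api", relation: "uses_pattern", object: "auth-middleware", evidence: "placeholder: file path" },
  { subject: "auth-service", relation: "uses_pattern", object: "auth-middleware", evidence: "placeholder: commit SHA" },
  { subject: "admin-api", relation: "uses_pattern", object: "redact-auth-headers", evidence: "placeholder: PR link" },
];

// Return every fact whose fields match all key/value pairs in the pattern.
function query(pattern) {
  return facts.filter((fact) =>
    Object.entries(pattern).every(([key, value]) => fact[key] === value)
  );
}

// "Show me all services using this auth middleware"
const users = query({ relation: "uses_pattern", object: "auth-middleware" })
  .map((fact) => fact.subject);

console.log(users);
```

A transcript would need a model to re-read and re-interpret it. Here the same question is a plain filter over structured records.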

What cross-project memory actually looks like

Here’s a simple version:

[Repo: billing-api] ----uses----> [Lib: express-rate-limit]
       |                               |
       | caused_incident               | replaced_by
       v                               v
[Incident: 2025-02-12]          [Lib: rate-limiter-flexible]

[Repo: auth-service] ----uses----> [Lib: express-rate-limit]
[Repo: admin-api] ----uses----> [Pattern: redact auth headers]

Now your agent can infer useful things:

  • “billing-api and auth-service share the same risky dependency”
  • “admin-api already has the safer logging pattern”
  • “Migration advice exists somewhere else in the org”

That’s the difference between memory and search: search returns matching documents, while memory connects facts learned in one repo to another.
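The first inference can be sketched as a small traversal over the edges in the diagram. The edge-list layout and the findExposedRepos helper are assumptions. The rule it applies (a repo is flagged if it shares a library with a repo that had an incident) is a deliberately rough heuristic, since the diagram does not say the library caused the incident.

```javascript
// Edges copied from the diagram above, as [from, relation, to].
const edges = [
  ["billing-api", "uses", "express-rate-limit"],
  ["billing-api", "caused_incident", "incident-2025-02-12"],
  ["express-rate-limit", "replaced_by", "rate-limiter-flexible"],
  ["auth-service", "uses", "express-rate-limit"],
  ["admin-api", "uses", "redact-auth-headers"],
];

// Hypothetical helper: flag repos that share a library with an incident-linked repo.
function findExposedRepos(edges) {
  const findings = [];
  const incidentRepos = edges
    .filter(([, relation]) => relation === "caused_incident")
    .map(([from]) => from);

  for (const repo of incidentRepos) {
    const libs = edges
      .filter(([from, relation]) => from === repo && relation === "uses")
      .map(([, , to]) => to);

    for (const lib of libs) {
      // Look up a known replacement so the agent can suggest a migration target.
      const replacement = edges.find(
        ([from, relation]) => from === lib && relation === "replaced_by"
      );
      for (const [other, relation, to] of edges) {
        if (relation === "uses" && to === lib && other !== repo) {
          findings.push({
            repo: other,
            sharedWith: repo,
            lib,
            replacement: replacement ? replacement[2] : null,
          });
        }
      }
    }
  }
  return findings;
}

const findings = findExposedRepos(edges);
console.log(findings);
```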

A practical schema that works

Don’t overcomplicate the first version. You usually need just 3 node types:

  1. Resource
    Repo, service, package, document, endpoint

  2. Fact
    “This repo uses package X”, “This endpoint requires approval”, “This pattern caused incident Y”

  3. Evidence
    Commit SHA, file path, issue URL, ticket, test output

A useful rule: if the agent can’t point to evidence, don’t let it promote the claim to durable memory.

That one rule cuts down a lot of hallucinated “organizational knowledge.”
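Here is one way that rule could look in code: a gate that refuses to promote a fact unless it carries at least one evidence reference. The promoteFact name, the fact shape, and the evidence fields (kind, ref) are assumptions, and the ref values are placeholders.

```javascript
// Hypothetical durable store: only facts with evidence end up here.
const durableMemory = [];

// Promote a fact only if it has at least one evidence entry with a non-empty ref.
function promoteFact(fact) {
  const evidence = Array.isArray(fact.evidence)
    ? fact.evidence.filter((item) => item && typeof item.ref === "string" && item.ref.length > 0)
    : [];

  if (evidence.length === 0) {
    return { accepted: false, reason: "no evidence attached" };
  }
  durableMemory.push({ ...fact, evidence });
  return { accepted: true };
}

const backed = promoteFact({
  resource: "billing-api",
  claim: "uses express-rate-limit",
  evidence: [{ kind: "file", ref: "placeholder: path to manifest" }],
});

const unbacked = promoteFact({
  resource: "team-a",
  claim: "prefers OAuth everywhere",
  evidence: [],
});

console.log(backed, unbacked, durableMemory.length);
```

The gate only checks that evidence exists. Verifying that the referenced commit or file actually supports the claim is a separate step this sketch does not attempt.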

Keep writes narrow and reviews optional

A common failure mode is letting agents write broad summaries like:

“Team A prefers OAuth everywhere.”

That’s vague and hard to verify.

Better:

  • auth-service -> uses -> oauth2-proxy
  • incident-184 -> caused_by -> missing token audience check
  • payments-api -> owned_by -> platform-team

Small facts are easier to merge, dedupe, and revoke.

If the memory is sensitive or operationally important, route writes through a policy check or human approval. Honestly, if you already use Open Policy Agent (OPA) for policy decisions, it’s a solid fit here too.
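A rough sketch of narrow writes: facts are keyed by their triple so duplicates collapse, revocation is a single delete, and sensitive relations wait for approval. The list of sensitive relations and the approve callback are assumptions. The callback stands in for whatever policy engine or human review step you use and does not model any real policy API.

```javascript
// Assumption: which relations count as sensitive is your call; this list is illustrative.
const SENSITIVE_RELATIONS = new Set(["owned_by", "caused_by"]);

// One key per triple makes dedupe and revoke trivial.
function factKey({ subject, relation, object }) {
  return [subject, relation, object].join(" -> ");
}

// `approve` is a hypothetical hook for a policy check or human approval.
function writeFact(store, fact, approve = () => false) {
  const key = factKey(fact);
  if (store.has(key)) return "duplicate";
  if (SENSITIVE_RELATIONS.has(fact.relation) && !approve(fact)) {
    return "pending_approval";
  }
  store.set(key, fact);
  return "written";
}

function revokeFact(store, fact) {
  return store.delete(factKey(fact));
}

const store = new Map();
const usesFact = { subject: "auth-service", relation: "uses", object: "oauth2-proxy" };
const ownerFact = { subject: "payments-api", relation: "owned_by", object: "platform-team" };

const results = [
  writeFact(store, usesFact),              // plain fact, no approval needed
  writeFact(store, usesFact),              // same triple again
  writeFact(store, ownerFact),             // sensitive, nobody approved
  writeFact(store, ownerFact, () => true), // sensitive, approved
];
const revoked = revokeFact(store, usesFact);

console.log(results, revoked, store.size);
```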

One runnable example

This is a tiny local graph using graphlib in Node. It’s not production memory infra, but it shows the shape.

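Install the library:

npm install graphlib

Then run this script with Node:

const { Graph } = require("graphlib");

// Directed graph: edges point from the source node to the thing it relates to.
const g = new Graph({ directed: true });

// Nodes carry a small label object describing their type.
g.setNode("billing-api", { type: "repo" });
g.setNode("express-rate-limit", { type: "lib" });
g.setNode("incident-184", { type: "incident" });

// The third argument to setEdge is the edge label (the relationship name).
g.setEdge("billing-api", "express-rate-limit", "uses");
g.setEdge("incident-184", "express-rate-limit", "caused_by");

// successors(v) lists the nodes that v points to.
console.log("billing-api ->", g.successors("billing-api"));
console.log("incident-184 ->", g.successors("incident-184"));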

Output:

billing-api -> [ 'express-rate-limit' ]
incident-184 -> [ 'express-rate-limit' ]

The next step is obvious: persist this somewhere real, attach evidence, and expose retrieval to your agent through a tool/API.
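As a first step toward that, here is a minimal sketch that attaches evidence to each edge and round-trips the graph through a JSON file. It uses plain objects and Node's built-in fs module rather than graphlib so it runs without extra installs. The file layout, helper names, and evidence strings are assumptions, and a JSON file is a stand-in for a real database.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Same shape as the example above, with a placeholder evidence field per edge.
const graph = {
  nodes: {
    "billing-api": { type: "repo" },
    "express-rate-limit": { type: "lib" },
    "incident-184": { type: "incident" },
  },
  edges: [
    { from: "billing-api", to: "express-rate-limit", label: "uses", evidence: "placeholder: file path" },
    { from: "incident-184", to: "express-rate-limit", label: "caused_by", evidence: "placeholder: ticket" },
  ],
};

function saveGraph(file, graph) {
  fs.writeFileSync(file, JSON.stringify(graph, null, 2));
}

function loadGraph(file) {
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

// Retrieval helper an agent tool could call: outgoing edges with their evidence.
function outgoing(graph, node) {
  return graph.edges.filter((edge) => edge.from === node);
}

// Write to a fresh temp directory so the sketch does not touch your project.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "agent-memory-"));
const file = path.join(dir, "graph.json");

saveGraph(file, graph);
const restored = loadGraph(file);
console.log(outgoing(restored, "billing-api"));
```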

Retrieval matters more than storage

A giant graph is useless if your agent can’t query it well.

Good retrieval prompts look like this:

  • “Before changing auth code, fetch incidents and prior fixes related to this middleware”
  • “When suggesting a dependency, check whether it was deprecated elsewhere”
  • “Before writing logs, retrieve redaction rules from similar services”

The trick is to make memory retrieval event-driven, not optional. Don’t rely on the model to “remember to remember.”

In practice, trigger graph lookups on:

  • dependency changes
  • auth or secret handling
  • incident-linked files
  • repeated task types across repos
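One way to make lookups event-driven is a small rule table that maps changed files to the memory queries that must run before the agent acts. The trigger names, file patterns, and incident-linked path below are assumptions for illustration. Real rules would come from your own repos and incident history.

```javascript
// Hypothetical set of files previously linked to incidents.
const INCIDENT_LINKED_FILES = new Set(["src/middleware/rateLimit.js"]);

// Each trigger pairs a file matcher with the lookup it should force.
const TRIGGERS = [
  {
    name: "dependency change",
    matches: (file) => /(^|\/)(package\.json|package-lock\.json)$/.test(file),
  },
  {
    name: "auth or secret handling",
    matches: (file) => /auth|secret|token/i.test(file),
  },
  {
    name: "incident-linked file",
    matches: (file) => INCIDENT_LINKED_FILES.has(file),
  },
];

// Return the lookups that must run before the agent touches these files.
function requiredLookups(changedFiles) {
  const names = new Set();
  for (const file of changedFiles) {
    for (const trigger of TRIGGERS) {
      if (trigger.matches(file)) names.add(trigger.name);
    }
  }
  return [...names];
}

const lookups = requiredLookups(["package.json", "src/middleware/auth.js", "README.md"]);
console.log(lookups);
```

Because the harness decides which lookups run, the model never has to decide whether to consult memory in the first place.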

Security note: memory becomes infrastructure fast

The second you make memory cross-project, you also make it sensitive.

Now your graph may reveal:

  • internal architecture
  • incident history
  • ownership maps
  • secrets-related patterns
  • privileged workflows

So treat memory like production infrastructure:

  • scope read/write access
  • attach provenance to every fact
  • expire or revoke stale claims
  • log who wrote what
  • separate public vs internal memory

This is also why agent identity and authorization start to matter. If multiple agents can read and write memory, you need to know which agent learned what, and who allowed it.
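A minimal sketch of those controls together: reads filtered by agent scope and expiry, writes gated by scope, and every write attempt logged with the agent's identity. The scope names, agent ids, and record fields are assumptions, not a real authorization model.

```javascript
// Fixed clock so the example is deterministic.
const NOW = Date.parse("2025-06-01T00:00:00Z");

const memory = [
  { claim: "admin-api uses_pattern redact-auth-headers", visibility: "public",
    writtenBy: "agent-a", expiresAt: Date.parse("2026-01-01T00:00:00Z") },
  { claim: "payments-api owned_by platform-team", visibility: "internal",
    writtenBy: "agent-b", expiresAt: Date.parse("2026-01-01T00:00:00Z") },
  { claim: "billing-api uses express-rate-limit", visibility: "internal",
    writtenBy: "agent-b", expiresAt: Date.parse("2025-03-01T00:00:00Z") }, // already stale
];

// Drop expired facts, then hide internal facts from agents without the scope.
function readableFacts(memory, agent, now) {
  return memory.filter(
    (fact) =>
      fact.expiresAt > now &&
      (fact.visibility === "public" || agent.scopes.includes("read:internal"))
  );
}

// Record every write attempt, allowed or not, with who made it.
function writeFact(memory, auditLog, agent, fact, now) {
  const allowed = agent.scopes.includes("write");
  auditLog.push({ agent: agent.id, action: allowed ? "write" : "write_denied", claim: fact.claim, at: now });
  if (allowed) memory.push({ ...fact, writtenBy: agent.id });
  return allowed;
}

const externalAgent = { id: "agent-ext", scopes: [] };
const internalAgent = { id: "agent-int", scopes: ["read:internal", "write"] };
const auditLog = [];

const externalView = readableFacts(memory, externalAgent, NOW).map((f) => f.claim);
const internalView = readableFacts(memory, internalAgent, NOW).map((f) => f.claim);
const denied = writeFact(memory, auditLog, externalAgent,
  { claim: "example claim", visibility: "public", expiresAt: NOW + 1000 }, NOW);

console.log(externalView, internalView, denied, auditLog);
```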

Try it yourself

If you’re building agents with memory, a few free tools may help:

  • Want to check your MCP server? Try https://tools.authora.dev
  • Run npx @authora/agent-audit to scan your codebase for agent security issues
  • Add a verified badge to your agent: https://passport.authora.dev
  • Check out https://github.com/authora-dev/awesome-agent-security for more resources

The main lesson

The one thing nobody tells you about agent memory is this:

If memory isn’t structured, verified, and reusable across projects, it’s just a longer chat log.

The teams getting real leverage from agents aren’t only giving them more tokens. They’re giving them durable facts, evidence, and the ability to connect lessons learned in one codebase to another.

How are you handling persistent agent memory today: vector DB, graph, plain docs, or something else? Drop your approach below.

-- Authora team

This post was created with AI assistance.
