
Persistence

ParaLLeM saves every LLM response to a local datastore keyed by a hash of the request content. On subsequent runs, a matching hash returns the cached response instantly — no API call is made.

How caching works

Every call to ask_llm computes a SHA-256 hash of:

  • The system prompt (instructions)
  • All input documents (strings, images, function call outputs, …)
  • Any additional salt terms (see below)
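The content hash can be sketched roughly as follows (illustrative only; request_hash and the exact byte layout are assumptions, not ParaLLeM's actual implementation):

```python
import hashlib

def request_hash(instructions, documents, salt_terms=()):
    """Sketch: SHA-256 over instructions, documents, and any salt terms."""
    h = hashlib.sha256()
    h.update(instructions.encode("utf-8"))
    for doc in documents:
        h.update(doc.encode("utf-8"))
    for term in salt_terms:
        h.update(str(term).encode("utf-8"))
    return h.hexdigest()

# Identical content yields the identical key; any extra salt term changes it.
k1 = request_hash("Be concise.", ["Name a prime number."])
k2 = request_hash("Be concise.", ["Name a prime number."])
k3 = request_hash("Be concise.", ["Name a prime number."], salt_terms=[1])
```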

If ParaLLeM has already seen this hash, the stored response is returned immediately. Otherwise, a request is sent to the provider and the response is stored.

with pllm.resume_directory(".pllm/myproject", provider="openai", strategy="sync") as orch:
    with orch.agent() as agt:
        resp = agt.ask_llm("Name a prime number.")
        agt.print(resp.final_answer)  # live on first run, instant on subsequent runs
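The lookup-or-call behaviour amounts to the following sketch, where a plain dict stands in for the on-disk datastore (ask and call_provider are hypothetical names, not ParaLLeM's API):

```python
import hashlib

datastore = {}       # stands in for the local on-disk datastore
provider_calls = 0   # counts live API requests

def call_provider(prompt):
    """Placeholder for a real provider request."""
    global provider_calls
    provider_calls += 1
    return f"response to: {prompt}"

def ask(prompt):
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key in datastore:
        return datastore[key]      # cache hit: no API call
    resp = call_provider(prompt)   # cache miss: live request
    datastore[key] = resp
    return resp

first = ask("Name a prime number.")   # live call
second = ask("Name a prime number.")  # served from the datastore
```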

Hashing

What is and isn't hashed

By default, only the message content is hashed: the instructions (system prompt) and all input documents. Configuration that is not included in the hash:

  • Model name / LLM identity
  • Tool definitions
  • Provider type

This means that if you change the model but keep the same prompt, the cached response from the old model is returned. Use hash_by or salt to avoid this.

hash_by

hash_by is a list of named terms to fold into the hash. Currently the only supported value is "llm", which appends the model identity string before hashing.

agt.ask_llm("Name a prime.", hash_by=["llm"])

Now switching from gpt-4o to gpt-4o-mini produces a different hash and a separate cache entry.
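The effect of hash_by=["llm"] can be pictured as appending the model identity to the hashed content before digesting (a sketch; key is a hypothetical helper):

```python
import hashlib

def key(content, model=None):
    # With hash_by=["llm"], the model identity is folded into the hash.
    data = content if model is None else content + model
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

# Without the model in the hash, both models would share one cache entry;
# with it, each model gets its own.
shared = key("Name a prime.") == key("Name a prime.")
separate = key("Name a prime.", "gpt-4o") != key("Name a prime.", "gpt-4o-mini")
```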

salt

salt is a free-form string that distinguishes otherwise identical content. For example, if you are unhappy with a cached result, pass a salt to bypass it and request a fresh response.

agt.ask_llm("Name a prime.")
agt.ask_llm("Name a prime.", salt="1")  # different hash, so no collision

Use salt when you want to force a fresh response without clearing the whole cache — for example when tool definitions change (which are not hashed):

agt.ask_llm(prompt, tools=my_tools, salt="tools-v2")
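salt enters the hash the same way, so any distinct salt yields a distinct cache entry (illustrative sketch, not the library's actual byte layout):

```python
import hashlib

def key(content, salt=None):
    # Sketch: the salt is folded into the hashed content when present.
    data = content if salt is None else f"{content}|{salt}"
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

# Same prompt, different salt: separate cache entries.
plain = key("Name a prime.")
salted = key("Name a prime.", salt="tools-v2")
```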

ignore_cache and rewrite_cache

Parameter           Effect
ignore_cache=True   Always call the provider, ignoring any stored response.
rewrite_cache=True  Always call the provider and overwrite the stored response with the new one.

Both are set on resume_directory:

pllm.resume_directory(".pllm/myproject", provider="openai", ignore_cache=True)
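The two flags differ in whether the fresh response is written back; their semantics can be sketched with an in-memory store (resolve is a hypothetical helper, and whether ignore_cache also writes back is an assumption the real library may not share):

```python
store = {"k": "cached"}

def resolve(key, call_provider, ignore_cache=False, rewrite_cache=False):
    if ignore_cache:
        return call_provider()           # fresh call; stored entry left as-is (assumed)
    if rewrite_cache or key not in store:
        store[key] = call_provider()     # fresh call; result overwrites the entry
    return store[key]

a = resolve("k", lambda: "fresh")                      # default: cached value
b = resolve("k", lambda: "fresh", ignore_cache=True)   # always a fresh call
c = resolve("k", lambda: "fresh", rewrite_cache=True)  # fresh call, and stored
```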