Hacker News | staranjeet's comments

Hey, yes. It remembers whatever is important from the previous chats.


I'm looking to use, or build, a system that memorizes conversations and stores them in the RAG system. Example conversation:

====
Bot: wassup?

Me: I have some more thoughts on Project X. They will be rambly so please also create an edited version as well as the usual synopsis. I will say 'I'm finished' when I've finished.

Bot: ok hit me

Me: bla bla bla bla etc etc. I'm finished.

Bot: this looks like part of the introduction text of Project X, is that correct?

Me: yes. What meta tags do you suggest? Etc.
====

I'm assuming that a custom GPT or equivalent is necessary to set out the 'terms of engagement' and agent objectives. Can you offer any advice about building such a system, and how mem0 could help?


Thanks for your question! Currently, we process data in the US and are not yet fully GDPR-compliant, but we're actively working on it. We also plan to offer a Europe-based data processing option soon. Your feedback on this is welcome!


thanks


Hey, that would be great. Let's chat.


Thanks for your question!

Claude Prompt Caching and Mem0's memory system have several key differences:

1. Purpose and duration: Claude's cache is designed for short-term memory, clearing every 5 minutes. In contrast, Mem0 is built for long-term information storage, retaining data indefinitely unless instructed otherwise.

2. Flexibility and control: Mem0 offers more flexibility, allowing developers to update, delete, or modify stored information as needed. Claude's cache is more static: new information creates additional entries rather than updating existing ones.

3. Content management: Claude has minimum length requirements for caching (1024 tokens for Sonnet, 2048 for Haiku). Mem0 can handle information of any length, from short facts to longer contexts.

4. Customization: Developers have greater control over Mem0's memory management, including options for prioritizing or deprioritizing information based on relevance or time. Claude's caching system offers less direct control.

5. Information retrieval: Mem0 is designed for more precise and targeted information retrieval, while Claude's cache works with broader contextual blocks.

These differences reflect the distinct purposes of each system. Claude's cache aims to maintain recent context in ongoing conversations, while Mem0 is built to serve as a flexible, long-term knowledge base for AI applications.
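
To make the contrast concrete, here's a minimal sketch of the mutable side using the open-source mem0 Python client. Method names follow the project docs and may vary by version; the return shape of search in particular has changed between releases.

    # Long-lived, mutable memories with the mem0 client.
    # Assumes an LLM/embedding backend is configured, e.g. OPENAI_API_KEY.
    from mem0 import Memory

    m = Memory()

    # Store a fact indefinitely, scoped to a user.
    m.add("Alice prefers terse, bullet-point summaries", user_id="alice")

    # Targeted retrieval: fetch only memories relevant to the query,
    # rather than replaying a cached context block.
    hits = m.search("How should I format replies to Alice?", user_id="alice")

    # Unlike a prompt cache, stored memories can be edited or removed.
    mem_id = hits["results"][0]["id"]  # result shape varies by version
    m.update(mem_id, data="Alice now prefers short prose paragraphs")
    m.delete(mem_id)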


Mem0 currently handles outdated or irrelevant memories by:

1. Automatically deprioritizing older memories when new, contradictory information is added.

2. Adjusting memory relevance based on changing contexts.

We're working on improving this system to give developers more control. Future plans include:

1. Time-based decay of unused memories (sketched below)

2. Customizable relevance scoring

3. Manual removal options for obsolete information

These improvements aim to create a more flexible "forgetting" mechanism, allowing AI applications to maintain up-to-date and relevant knowledge bases over time.
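
As a rough illustration of the time-based decay idea (this is not a current Mem0 API, just a hypothetical sketch of how such scoring could work):

    # Hypothetical: combine retrieval similarity with exponential time
    # decay so that long-unused memories fade out of results.
    import math
    import time

    HALF_LIFE_DAYS = 30  # illustrative: unused memories lose half their weight monthly

    def decayed_score(similarity: float, last_used_ts: float) -> float:
        age_days = (time.time() - last_used_ts) / 86_400
        decay = math.exp(-math.log(2) * age_days / HALF_LIFE_DAYS)
        return similarity * decay

    # A strong match (0.9) untouched for 90 days ranks below a weaker
    # match (0.6) that was used yesterday.
    old = decayed_score(0.9, time.time() - 90 * 86_400)
    fresh = decayed_score(0.6, time.time() - 1 * 86_400)
    assert fresh > old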

We're open to user feedback on how to best implement these features in practical applications.


Thanks for your question!

Vector databases are typically used for storing embeddings and are great for tasks like similarity search. However, they are generally read-only and don't natively support the concept of time or state transitions. Let's take the example of tracking the state of a task from your todo list in a vector database:

You might store the task's states like:

Task 1 in backlog
Task 1 in progress
Task 1 canceled

But there's no concept of "latest state" or memory of how the task evolved over time. You'd have to store multiple versions and manually track changes.

With a memory-enabled system like Mem0, you could track Task 1 (current state: in progress) with a memory of its previous states (backlog, canceled, etc.). This gives your AI app a more stateful understanding of the world, allowing it to update and reflect the current context automatically.
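
In code, the difference might look like this (same caveat as above: mem0 method names follow the docs and may vary by version):

    # Stateful tracking with mem0: a contradictory new fact reconciles
    # the existing memory instead of piling up near-duplicate vectors.
    from mem0 import Memory

    m = Memory()
    m.add("Task 1 is in the backlog", user_id="todo-app")

    # Later, the state changes.
    m.add("Task 1 is now in progress", user_id="todo-app")

    # Retrieval reflects the latest state rather than competing entries.
    print(m.search("What is the current state of Task 1?", user_id="todo-app"))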

Traditional databases, on the other hand, are designed for structured, relational data with fixed schemas, like customer information in a table. These are great for handling transactional data but aren't optimal for cases where the data is unstructured.

As mentioned in the post, we use a hybrid datastore approach that handles these cases effectively, and that's where the graph aspect comes into the picture.


> However, they are generally read-only

What??


This post is about learnings from running a RAG application in production.

Here are the learnings:

• Always customise your prompt.
• Set soft & hard limits on your LLM cost before launching any project.
• Choose the LLM model wisely.
• Context length matters a lot.
• Cache your queries (a minimal sketch follows this list).
• Have a router to choose the LLM model wisely.
• Have a UI to see all queries, answers, context & metrics like response time.
• Memory management in chat is painful.
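
For the caching point, the minimal version is an exact-match cache in front of the LLM call; a semantic cache keyed on embeddings catches paraphrases too, but this shows the cost saving:

    # Exact-match query cache: identical questions never hit the LLM twice.
    import hashlib

    _cache: dict[str, str] = {}

    def cached_answer(query: str, llm_call) -> str:
        key = hashlib.sha256(query.strip().lower().encode()).hexdigest()
        if key in _cache:
            return _cache[key]    # saved LLM round trip, saved cost
        answer = llm_call(query)  # llm_call: whatever client function you use
        _cache[key] = answer
        return answer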


You can use embedchain[1] to connect various data sources and then get a RAG application running locally and in production very easily. Embedchain is an open source RAG framework, and it follows a conventional but configurable approach.

The conventional approach suits software engineers who may be less familiar with AI. The configurable approach suits ML engineers who have more sophisticated use cases and want to configure chunking, indexing, and retrieval strategies.

[1]: https://github.com/embedchain/embedchain
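
A minimal example of the conventional path, per the embedchain README at the time (the API may have changed since):

    # Conventional path: add sources, then query. Assumes OPENAI_API_KEY
    # is set in the environment.
    from embedchain import App

    app = App()
    app.add("https://en.wikipedia.org/wiki/Elon_Musk")  # any supported source
    print(app.query("What companies does Elon Musk run?"))

    # Configurable path: pass a YAML config to control chunking,
    # indexing, and retrieval, e.g.:
    # app = App.from_config(config_path="config.yaml")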


Yes, this is the most intuitive and easiest-to-use framework for building RAG applications. Do give it a try!


OpenAI, across all its properties, draws 1.5B monthly users (source: Similarweb).

Despite this already high traffic, there are a couple of SEO opportunities OpenAI could implement to increase its traffic further. This post covers those strategies and their implementation in detail.

