What are your best practices when using Embeddings, RAG, and Retrieval?
Hi there,
I recently started building LLM applications. While talking to developers in the field, I got overwhelmed by all the tools and services available:
- Different embedding models: some consider ada-002 SOTA, others don't
- Embedding pipeline providers
- Chunking and cleaning
- Injecting up-to-date knowledge bases (RAG)
- Indexing
- Retrieval
(I won't even start on the dozens of different vector DB providers.)
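To make the terms concrete, here is roughly how I picture the chunking, indexing, and retrieval steps fitting together. This is only a toy sketch under loud assumptions: the bag-of-words `embed` function stands in for a real embedding model, the "index" is a plain Python list rather than a vector DB, and all function names (`chunk_text`, `embed`, `build_index`, `retrieve`) are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=8, overlap=2):
    """Split text into overlapping windows of `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(docs, size=8, overlap=2):
    """Chunk every document and store (chunk, vector) pairs in a list."""
    return [
        (chunk, embed(chunk))
        for doc in docs
        for chunk in chunk_text(doc, size, overlap)
    ]


def retrieve(index, query, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


if __name__ == "__main__":
    docs = [
        "Vector databases store embeddings and support nearest neighbour search.",
        "Chunking splits long documents into smaller passages before embedding.",
    ]
    index = build_index(docs)
    # In a RAG setup, the retrieved chunks would be pasted into the LLM prompt.
    print(retrieve(index, "how does chunking split documents", k=1))
```

In a real stack, each of these functions is where one of the tools above would plug in (an embedding API for `embed`, a vector DB for the list), which is exactly where the choices get overwhelming.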
I'd love to collect best practices for common pain points. Can we create a high-quality thread with the following?
- What is your tool stack for LLM applications?
- What problems did you experience?
- How did you solve them? (If unsolved, just write "looking for a solution".)