How to Improve AI Embedding Retrieval


Deeplearning.ai Tutorial

Advanced Retrieval


  • Understand how to improve Retrieval Augmented Generation (RAG) applications

Key takeaways

Pitfalls of Retrieval

  • Distractors are retrieved chunks that are not actually relevant to the query
  • They occur because simple vector retrieval has no context other than mapping the query into the vector space and finding its nearest neighbours
  • Distractors can heavily impact the LLM's reasoning
  • The resulting suboptimal responses are also difficult to debug
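
To make the distractor problem concrete, here is a minimal sketch of plain nearest-neighbour retrieval. The 2D "embeddings" and the chunk texts are made-up toy values, not output from a real embedding model. The point is only that the second result is close in the vector space without answering the question.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k):
    """Return the k (text, score) pairs whose vectors are closest to the query."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy data (hypothetical): each chunk is (text, embedding).
chunks = [
    ("Revenue grew 12% in 2023.", [0.9, 0.1]),
    ("Revenue recognition policy.", [0.8, 0.3]),  # nearby, but off-topic: a distractor
    ("Office locations.", [0.1, 0.9]),
]
query_vec = [1.0, 0.0]  # pretend embedding of "How much did revenue grow?"

results = top_k(query_vec, chunks, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

Nothing in the similarity score tells you that the second chunk is unhelpful, which is part of why these failures are hard to debug.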

Improving queries

  • Query expansion is the simplest way to improve query results
  • Expansion with generated answers is an approach in which we use the LLM to imagine an answer to the query and then use that result as the vector query for the real answers
  • You are basically asking the model to hallucinate in order to get a helpful input for the vector query
  • Expansion with multiple queries is an approach in which we use the LLM to generate similar queries to feed into the vector search. Each generated query is run as its own vector query, so once you receive all the results from the database there is a required extra step to de-duplicate them
  • The system prompts for these are very important. The arXiv paper examples are the best starting point, but you must experiment
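
The two expansion approaches above can be sketched roughly as follows. Everything here is an assumption for illustration: `fake_llm` and `fake_search` are hypothetical stand-ins for a real LLM call and a real vector database query, the prompts are placeholders rather than the ones from the papers, and prepending the original query to the imagined answer is one possible choice, not the only one.

```python
def expand_with_generated_answer(query, llm):
    """Ask the LLM to imagine an answer, then build the text to embed and search with."""
    imagined = llm(f"Write a plausible answer to: {query}")
    # Assumption: keep the original query alongside the imagined answer.
    return f"{query} {imagined}"

def expand_with_multiple_queries(query, llm, search, n_results=5):
    """Run the original and LLM-generated queries separately, then de-duplicate."""
    prompt = f"Suggest related questions, one per line, for: {query}"
    related = [line.strip() for line in llm(prompt).splitlines() if line.strip()]
    seen = set()
    merged = []
    for q in [query] + related:
        for doc in search(q, n_results):
            if doc not in seen:  # the required de-duplication step
                seen.add(doc)
                merged.append(doc)
    return merged

# Hypothetical stand-ins so the sketch runs without any external service.
def fake_llm(prompt):
    if prompt.startswith("Write"):
        return "Revenue rose because of cloud sales."
    return "What drove revenue growth?\nHow did cloud sales perform?"

FAKE_INDEX = {
    "What was the revenue?": ["doc_a", "doc_b"],
    "What drove revenue growth?": ["doc_b", "doc_c"],
    "How did cloud sales perform?": ["doc_c", "doc_d"],
}

def fake_search(query, n_results):
    return FAKE_INDEX.get(query, [])[:n_results]

joint_query = expand_with_generated_answer("What was the revenue?", fake_llm)
merged = expand_with_multiple_queries("What was the revenue?", fake_llm, fake_search)
print(joint_query)
print(merged)
```

Three queries returned six hits between them, but only four distinct documents survive the de-duplication.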

Cross-encoder Re-ranking


  • Re-ranking is a way to score results against a particular query and order them by that score
  • One use of re-ranking is to get more information out of the long tail of results
  • For this to be useful you need to ask the vector query for more results than you plan to keep
  • This allows the returned results to be more than just whatever is closest in the vector space
  • You can combine re-ranking with multiple query expansion to get the most relevant results out of the long tail
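
A rough sketch of the re-ranking step, under stated assumptions: `word_overlap_score` is a hypothetical stand-in for a real cross-encoder, used here only so the example runs without downloading a model, and the candidate list is toy data standing in for an over-fetched vector query result.

```python
def rerank(query, candidates, score, top_n):
    """Score every (query, candidate) pair and keep the top_n highest-scoring candidates."""
    scored = [(doc, score(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

def word_overlap_score(query, doc):
    """Hypothetical stand-in for a cross-encoder: fraction of query words found in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words)

# Pretend the vector query returned these four candidates in this order.
candidates = [
    "company history and founders",
    "offices and locations",
    "total revenue for the year",
    "revenue by region for the year",
]

best = rerank("total revenue for the year", candidates, word_overlap_score, top_n=2)
for doc, score in best:
    print(f"{score:.2f}  {doc}")
```

Here the two useful results sat at positions three and four of the vector query output and would have been lost with a smaller fetch. With a real model, the `score` argument would wrap something like the `predict` method of a sentence-transformers `CrossEncoder`, which takes (query, document) pairs.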

Embedding Adapters

  • Adapters are a way to alter the query embedding directly to produce better results
  • This requires a lightweight model
  • This uses feedback from users
  • The user feedback is the dataset that will train the lightweight model
  • You can use LLMs to supplement and create the initial training data for this model
  • This uses cosine similarity, which compares only the direction of two vectors and gives a score between -1 and 1; each training pair is labelled 1 (relevant) or -1 (not relevant)
  • The adapter is trained with MSE loss between that label and the cosine similarity of the adapted query embedding and the document embedding
  • It's the size of a single linear layer of a neural network, meaning it can be trained incredibly fast
  • So fast that it could be possible for individuals, companies, etc. to have their own adapter specific to their feedback
  • This embedding adapter is essentially stretching or squeezing the space of the embeddings along the dimensions that are most relevant to their queries
  • Instead of a single linear layer trained on cosine similarity you could train a larger neural network to do this, which gives it more room to transform the space
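
A minimal sketch of the adapter idea, assuming a square matrix applied to the query embedding and trained with MSE loss against labels of 1 or -1. The embeddings and labels are toy values, and the gradient is estimated with finite differences purely to keep the example dependency-free. A real implementation would more likely use an autograd library.

```python
import math

def matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)

def mse_loss(matrix, examples):
    """Mean squared error between cosine(adapter @ query, doc) and the feedback label."""
    errors = [(cosine(matvec(matrix, q), d) - label) ** 2 for q, d, label in examples]
    return sum(errors) / len(errors)

def train_adapter(examples, dim, steps=200, lr=0.1, eps=1e-4):
    """Gradient descent on the adapter matrix, starting from the identity."""
    matrix = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(dim)]
        for i in range(dim):
            for j in range(dim):
                original = matrix[i][j]
                matrix[i][j] = original + eps
                loss_up = mse_loss(matrix, examples)
                matrix[i][j] = original - eps
                loss_down = mse_loss(matrix, examples)
                matrix[i][j] = original
                grad[i][j] = (loss_up - loss_down) / (2 * eps)  # central difference
        for i in range(dim):
            for j in range(dim):
                matrix[i][j] -= lr * grad[i][j]
    return matrix

# Toy feedback data (hypothetical): (query embedding, document embedding, label).
examples = [
    ([1.0, 0.0], [0.6, 0.8], 1),   # user said this result was relevant
    ([0.0, 1.0], [0.1, 0.9], -1),  # user said this result was not relevant
]

identity = [[1.0, 0.0], [0.0, 1.0]]
loss_before = mse_loss(identity, examples)
adapter = train_adapter(examples, dim=2)
loss_after = mse_loss(adapter, examples)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

At query time the trained matrix would be applied to the query embedding before the vector search, leaving the stored document embeddings untouched.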

Adapted Queries

Random Notes

  • Chunk overlap is a powerful hyperparameter to test
  • Recursive character splitting alone isn't sufficient for chunking: some chunks will be larger than the token context window, meaning some text will be ignored. In this case use a sentence tokeniser to chunk further
  • Vectors can be projected to 2D space using UMAP
  • 2D representations don't preserve the full multi-dimensional structure of the vectors but are much easier to visualise
  • You can fine-tune your own embedding model using data similar to that used for the embedding adapters
  • You can fine-tune the LLM to expect retrieved results and reason with them (RAFT, RA-DIT)
  • There is a lot of experimentation currently around using deep models and transformers to improve chunking
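
The first two chunking notes above can be sketched as follows. This is a simplified illustration with assumptions: whitespace-separated words stand in for real model tokens, and a regular expression stands in for a proper sentence tokeniser.

```python
import re

def split_with_overlap(text, chunk_size, overlap):
    """Character chunks of at most chunk_size, each sharing `overlap` characters with the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    last_start = max(len(text) - overlap, 1)
    pieces = [text[i:i + chunk_size] for i in range(0, last_start, step)]
    return [piece for piece in pieces if piece]

def count_tokens(text):
    # Assumption: whitespace words approximate the embedding model's tokens.
    return len(text.split())

def fit_to_token_limit(chunks, max_tokens):
    """Split any chunk over max_tokens at sentence boundaries, packing sentences greedily."""
    result = []
    for chunk in chunks:
        if count_tokens(chunk) <= max_tokens:
            result.append(chunk)
            continue
        current = []
        for sentence in re.split(r"(?<=[.!?])\s+", chunk.strip()):
            if current and count_tokens(" ".join(current + [sentence])) > max_tokens:
                result.append(" ".join(current))
                current = []
            # Note: a single sentence longer than max_tokens is still kept whole.
            current.append(sentence)
        if current:
            result.append(" ".join(current))
    return result

overlapped = split_with_overlap("abcdefghij", chunk_size=4, overlap=2)
print(overlapped)

chunks = [
    "Short one.",
    "First sentence here. Second sentence here. Third sentence here.",
]
fitted = fit_to_token_limit(chunks, max_tokens=6)
print(fitted)
```

Both `chunk_size` and `overlap` are worth sweeping when you evaluate retrieval quality, since the best values depend on the documents.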

© 2024 Cub Digital. All Rights Reserved.