
Precision Examined: Evaluating OpenAI’s Latest Embedding Model | by Priya Dwivedi | Jan, 2024

by Narnia

Doing cool things with data!

Image generated by DALLE3

OpenAI recently unveiled two new state-of-the-art text embedding models, text-embedding-3-small and text-embedding-3-large, poised to dethrone the venerable text-embedding-ada-002. In this post, we’ll dig into the capabilities of these new models and put one to the test in a hands-on RAG application.

Specifically, we’ll:

  • Dive into the details and potential of the new OpenAI text embedders
  • Showcase text-embedding-3-small by building a Retrieval-Augmented Generation (RAG) demo
  • Race it against the popular open source BGE-small model on a RAG dataset
  • Use TruLens to transparently evaluate the RAG models under the hood

I’m eagerly anticipating impressive gains from text-embedding-3-small thanks to its more extensive training. However, BGE-small has proven formidable, so this matchup should reveal real tradeoffs to consider. The TruLens evaluation will also uncover insights that no benchmark alone can provide.

Let’s get hands-on with the next evolution of foundation models! The face-off begins now…

At Opal AI, we seek to make modern language and generative technologies accessible and impactful for organizations of all sizes. Contact us to learn more.

OpenAI has unveiled two new text embedding models as upgrades over December 2022’s text-embedding-ada-002: text-embedding-3-small and text-embedding-3-large. These new foundation models boast stronger benchmark performance and lower pricing.
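As a rough illustration of what using one of these models looks like, here is a minimal sketch of requesting embeddings and comparing two texts by cosine similarity. It assumes the v1 OpenAI Python SDK, where a client exposes `client.embeddings.create(model=..., input=...)`. The `FakeClient` class, its toy 2-d vectors, and the helper names `embed_texts` and `cosine_similarity` are hypothetical stand-ins added here so the example runs offline without an API key; they are not part of the SDK.

```python
import math
from types import SimpleNamespace


def embed_texts(client, texts, model="text-embedding-3-small"):
    """Return one embedding vector per input text.

    `client` is assumed to behave like `openai.OpenAI()` from the v1 SDK.
    """
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class FakeClient:
    """Hypothetical offline stand-in that mimics the response shape only."""

    def __init__(self):
        self.embeddings = SimpleNamespace(create=self._create)

    def _create(self, model, input):
        # Toy "embedding": [character count, number of spaces].
        data = [
            SimpleNamespace(embedding=[float(len(t)), float(t.count(" "))])
            for t in input
        ]
        return SimpleNamespace(data=data, model=model)


# With the real SDK this would be:
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
client = FakeClient()

vectors = embed_texts(client, ["hello world", "hello there world"])
similarity = cosine_similarity(vectors[0], vectors[1])
```

Passing the client in as an argument keeps the helper easy to test and makes it straightforward to swap in a different embedder later, such as a locally hosted BGE-small model wrapped to return the same shape.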

On benchmarks, text-embedding-3-small improves the average multilingual retrieval score on MIRACL by 12.6 points over ada-002 while slightly improving the average score on the English-language MTEB benchmark. At the same time it’s 5x cheaper. The larger text-embedding-3-large shows even more dramatic gains: 23.5 points on MIRACL and 3.6 points on MTEB. The table below compares their performance.
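To make the size of those gains concrete, the short sketch below computes both the absolute gain (in score points) and the relative gain (as a percentage of the ada-002 score). The average scores in `SCORES` are the figures as I recall them from OpenAI’s announcement, so treat them as assumptions and check them against the original source. The function name `gains_over_baseline` is just an illustrative helper.

```python
# Average benchmark scores, recalled from OpenAI's announcement (assumption:
# verify against the original source before relying on them).
SCORES = {
    "text-embedding-ada-002": {"MIRACL": 31.4, "MTEB": 61.0},
    "text-embedding-3-small": {"MIRACL": 44.0, "MTEB": 62.3},
    "text-embedding-3-large": {"MIRACL": 54.9, "MTEB": 64.6},
}
BASELINE = "text-embedding-ada-002"


def gains_over_baseline(model, benchmark):
    """Return (absolute gain in points, relative gain in percent)."""
    base = SCORES[BASELINE][benchmark]
    new = SCORES[model][benchmark]
    points = round(new - base, 1)
    relative_pct = round(100 * (new - base) / base, 1)
    return points, relative_pct


for model in ("text-embedding-3-small", "text-embedding-3-large"):
    for benchmark in ("MIRACL", "MTEB"):
        points, relative_pct = gains_over_baseline(model, benchmark)
        print(f"{model} on {benchmark}: +{points} points ({relative_pct}% relative)")
```

The distinction matters when reading headline numbers: a gain of a dozen points on a benchmark where the baseline scores in the low thirties is a much larger relative jump than a few points on a benchmark where the baseline is already above sixty.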
