Curated Resource

Using ChatGPT for Question Answering on Your Own Data

my notes

"leverage ChatGPT to answer natural language questions on a variety of text repositories... [via a] combination of embeddings, vector search, and prompt engineering...

  • Embeddings are mathematical representations of words, phrases, or even entire documents as vectors ... sequences [with] similar meaning are “close together” in a high-dimensional space... encode the semantic meaning ...
  • Vector databases provide an efficient way to store and search for embeddings... designed to perform similarity searches... quickly identify the most relevant matches for a given query...
  • Prompt engineering ... guide the behavior of [LLMs] by carefully crafting input prompts."
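To make "close together" concrete, here is a minimal sketch of a similarity search using cosine similarity, one common closeness measure (the post may use a different one). The three-dimensional vectors and their values are invented purely for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions. The helper names `cosine_similarity` and `most_similar` are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made "embeddings" (assumed values, NOT produced by any real model).
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def most_similar(query_vector, vectors):
    """Return the stored keys ranked by similarity to the query vector, best first."""
    return sorted(
        vectors,
        key=lambda key: cosine_similarity(query_vector, vectors[key]),
        reverse=True,
    )


ranking = most_similar(embeddings["cat"], embeddings)
print(ranking)
```

A vector database does essentially this ranking, but with indexing structures that avoid comparing the query against every stored vector.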

The author then sets out how to use "open-source Python package Langchain to ... streamline the following process":

  • preprocess your data
  • create embeddings of it "using a pre-trained language model like ChatGPT"
  • put them in a vector database
  • translate user queries into an embedding using the same model
  • "Perform a similarity search in the vector database to identify the most relevant matches...
  • Craft a prompt that combines the user’s query" with the text returned from the vector database, and feed it to the LLM
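The steps above can be sketched end to end in plain Python. This is not the post's LangChain code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector database, and the final call to the LLM is left out. All function names, the sample chunks, and the prompt wording are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Embed each preprocessed chunk and keep it next to its vector (the 'vector database')."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index, query, k=2):
    """Embed the query with the same function and return the k most similar chunks."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Combine the user's query with the retrieved text into a single prompt."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical, already-preprocessed chunks of "your own data".
chunks = [
    "Refunds are processed within 14 days of the return request.",
    "Our office is closed on public holidays.",
    "Shipping to Europe takes five business days.",
]
index = build_index(chunks)
query = "How long do refunds take?"
top = search(index, query, k=1)
prompt = build_prompt(query, top)
print(prompt)  # this string is what would be sent to the LLM
```

The point of the design is that the LLM only ever sees the few chunks most relevant to the question, so the underlying repository can be far larger than what fits in a single prompt.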


The above notes were curated from the full post
