This patent describes a system in which a large language model (LLM) answers user queries using a custom corpus of documents. The system receives a user query, selects one or more external applications, and retrieves relevant documents from the custom corpus based on the query and, optionally, a context vector or precomputed document embeddings. The LLM then generates a response conditioned on the retrieved documents, and the response is displayed on the user's client device. Flowcharts and diagrams illustrate the process, including the interactions among the client device, the natural language response system, and the external applications that access the document embeddings and the custom corpus.
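
The retrieve-then-generate flow summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`embed`, `retrieve`, `answer`), the toy bag-of-words embedding, the sample corpus, and the `llm` callable are all hypothetical stand-ins chosen for illustration. The selection of external applications and the optional context vector are omitted for brevity.

```python
import math
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (assumption: stands in for a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(
    query: str, corpus: list[str], doc_embeddings: list[Counter], k: int = 2
) -> list[str]:
    """Return the k corpus documents most similar to the query."""
    q = embed(query)
    ranked = sorted(
        range(len(corpus)),
        key=lambda i: cosine(q, doc_embeddings[i]),
        reverse=True,
    )
    return [corpus[i] for i in ranked[:k]]


def answer(
    query: str,
    corpus: list[str],
    doc_embeddings: list[Counter],
    llm: Callable[[str], str],
    k: int = 2,
) -> str:
    """Condition the (hypothetical) LLM call on the retrieved documents."""
    docs = retrieve(query, corpus, doc_embeddings, k)
    context = "\n".join(f"- {d}" for d in docs)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)


# Hypothetical custom corpus with embeddings precomputed once, ahead of query time.
corpus = [
    "Refunds are issued within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Shipping to Canada takes five business days.",
]
doc_embeddings = [embed(d) for d in corpus]
```

Passing an identity function as `llm` (for example `answer(query, corpus, doc_embeddings, llm=lambda p: p)`) returns the assembled prompt, which makes it easy to inspect which documents the response would be conditioned on. In a real deployment the `llm` argument would wrap an actual model call, and the embeddings would come from a learned model rather than word counts.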