#7 MLOps & AI/ML

Episode 7: The Conductor – Guiding and Refining LLM-Generated Content

How do you ensure the LLM's output is what you intended? We look at the anatomy of an ideal "completion," including the preamble, recognizable start and end markers, and the postscript. We also explore logprobs, how to assess the quality of generated content, using LLMs for classification, critical points in the prompt, and model selection.

Transcript

(Narrator): Welcome to Chapter 7: Taming the Model. This chapter marks the end of Part II of the book, focusing on core prompt-engineering techniques. Building on assembling the prompt, Chapter 7 is about making sure the LLM provides the response you want.
(Narrator): We'll start by looking at completion formats and how to ensure your completions end appropriately, along with some tricks that use logprobs. Then, we'll step back to consider how to choose the right model to invoke.
(Narrator): Just like we looked at the anatomy of a prompt in Chapter 6, Chapter 7 examines the Anatomy of the Ideal Completion.
(Narrator): The completion often starts with a preamble. This initial text can sometimes set the stage helpfully, aiding reasoning or chain-of-thought processes. However, LLMs, especially RLHF-trained models, can sometimes include unnecessary "fluff" in the preamble. You can try to manage this fluff using techniques like providing instructions with few-shot examples or reformatting the prompt. Sometimes, you might even need to banish fluff into a separate section that's easy to parse out.
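
As a sketch of that last idea (the wording and tag names here are illustrative, not from the book), a prompt can route working notes into a labeled section and confine the real answer between fixed markers:

```python
# Hypothetical prompt template: fluff goes under "Reasoning:", while the
# answer itself sits between <answer> tags that are easy to parse out.
PROMPT_TEMPLATE = """\
Answer the question below.
First, write any working notes on a line starting with "Reasoning:".
Then give only the final answer between <answer> and </answer> tags.

Question: {question}
"""

prompt = PROMPT_TEMPLATE.format(question="What year was Python 3.0 released?")
```
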
(Narrator): It's crucial to have a recognizable start and end for your main answer within the completion. This helps you easily extract the relevant part and filter out any preamble or postscript. The document structure you choose (like Markdown or structured formats) can help with this.
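
A minimal extraction sketch, assuming the completion uses the `<answer>` markers from the hypothetical template above:

```python
import re

def extract_answer(completion: str) -> str | None:
    """Return the text between <answer> markers, dropping any preamble
    before the opening tag and any postscript after the closing one."""
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return match.group(1).strip() if match else None

completion = (
    "Reasoning: Python 3.0 shipped in December 2008.\n"
    "<answer>2008</answer>\n"
    "Hope that helps!"
)
print(extract_answer(completion))  # -> 2008
```
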
(Narrator): You also want to control the length of the completion. Every token generated costs time and compute. Identifying a recognizable end point allows you to stop generation precisely when the useful part of the response is complete, conserving resources.
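
Most completion APIs accept stop sequences for exactly this purpose. A minimal sketch using the OpenAI Python client, reusing the `prompt` string built in the first sketch above (the model name is an illustrative choice):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": prompt}],
    stop=["</answer>"],   # generation halts at the recognizable end point
    max_tokens=200,       # hard ceiling as a backstop
)
# Note: the stop sequence itself is not included in the returned text.
print(response.choices[0].message.content)
```
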
(Narrator): Moving beyond the text itself, Chapter 7 introduces Logprobs. Log probabilities indicate the model's confidence in choosing each specific token. By examining logprobs, you can gain insight into the model's certainty about the text it is generating. Logprobs can even help detect anomalies like typos, which often have very low (more negative) logprob values. Negative single-digit logprobs are somewhat common, but negative double-digit ones usually indicate something unusual is happening.
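
A sketch of inspecting logprobs with the OpenAI Python client (the model choice and the -10 threshold are illustrative):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": "Spell the word 'necessary'."}],
    logprobs=True,        # attach a log probability to every output token
)

# Single-digit negative logprobs are routine; double-digit ones suggest
# the model was forced into an unlikely token (e.g., echoing a typo).
for token_info in response.choices[0].logprobs.content:
    if token_info.logprob < -10:
        print(f"low-confidence token: {token_info.token!r} ({token_info.logprob:.1f})")
```
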
(Narrator): The chapter also addresses the important decision of Choosing the Model. You might use a commercial service, an open-source model, or even fine-tune your own model.
(Narrator): Fine-tuning involves continuing the model's training process on a specific dataset to adjust its parameters.

Full fine-tuning or continued pre-training adjusts all parameters and is suitable for training the model on an entirely new domain, but it requires tens of thousands of documents and takes weeks or months.

Parameter-efficient fine-tuning (such as LoRA) is better for teaching the model prior expectations within an existing domain, how to interpret information, and how to follow a fixed format. It takes days and requires hundreds or thousands of documents; a minimal sketch follows this list.

Soft prompting trains a small set of continuous prompt-embedding vectors that are prepended to the input, leaving the model's own weights frozen.
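
A minimal LoRA sketch using Hugging Face's peft and transformers libraries (the base model and hyperparameters are illustrative, not from the book):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # illustrative

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```
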
(Narrator): A significant benefit of fine-tuning is that it can bake instructions and examples into the model's parameters. This means you can often remove static prompt context, general explanations, instructions, and few-shot prompting from your prompt, as the fine-tuned model has already learned this behavior. In this way, fine-tuning is seen as a continuation of prompt engineering. When using a fine-tuned model, you should shape your prompt to look like the beginning of the documents used for fine-tuning, not the original training documents.
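
As a hypothetical illustration: if every fine-tuning document opened with the same header structure, the inference-time prompt should reproduce that opening and stop right where the trained-in continuation begins:

```python
# Illustrative only: mirrors the (hypothetical) structure of the
# fine-tuning documents, so the model continues with a "Resolution".
FINETUNED_PROMPT = """\
### Ticket
{ticket_text}

### Resolution
"""

prompt = FINETUNED_PROMPT.format(ticket_text="App crashes on launch after the latest update.")
```
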
(Narrator): Taming the model can be challenging. But by clearly defining the completion you want and using techniques like managing preambles and postscripts, ensuring recognizable starts and ends, using logprobs for insight, and potentially choosing or fine-tuning the model, you gain control over the LLM's output.
(Narrator): This concludes the core prompt-engineering techniques covered in the book.

Sources

Prompt Engineering for LLMs by John Berryman and Albert Ziegler