Large language models (LLMs) are trained on massive, publicly available text datasets comprising trillions of tokens, which makes them excel at general language tasks such as next-token prediction. However, LLMs often struggle with domain-specific prompts, exhibiting reduced accuracy or generating inaccurate information (hallucinations), because they lack sufficient subject matter expertise. Two primary approaches exist for augmenting an LLM's knowledge: Retrieval-Augmented Generation (RAG) and fine-tuning. This presentation focuses on fine-tuning smaller LLMs on domain-specific instruct datasets using the LoRA (Low-Rank Adaptation) technique on Gaudi hardware, leveraging publicly available LLMs and datasets from the Hugging Face Hub. While it is also possible to fine-tune LLMs on plain text sourced from documents, articles, and other materials, this demonstration uses instruct-style datasets.
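To make the appeal of LoRA concrete, the sketch below illustrates the core idea in plain NumPy: the pretrained weight matrix stays frozen, and only a pair of small low-rank matrices is trained, so the trainable parameter count drops dramatically. The dimensions and hyperparameters here are illustrative assumptions, not values from any particular model; in practice a library such as Hugging Face PEFT handles this for you.

```python
import numpy as np

# Hypothetical dimensions for illustration; real LLM layers are much larger.
d_in, d_out, rank = 1024, 1024, 8
alpha = 16  # LoRA scaling hyperparameter

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight

# LoRA adapters: B starts at zero, so training begins exactly at the
# pretrained model's behavior.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))

def lora_forward(x):
    # Frozen path plus scaled low-rank update: W x + (alpha / rank) * B A x
    return W @ x + (alpha / rank) * (B @ (A @ x))

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs full fine-tune: {full_params}")
print(f"reduction: {full_params / lora_params:.0f}x")
```

With these toy dimensions, LoRA trains 16,384 parameters instead of the roughly one million in the full matrix, a 64x reduction; the same ratio scales to real transformer layers, which is what makes fine-tuning feasible on a single accelerator.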



