Large language models (LLMs) are trained on massive, publicly available text datasets comprising trillions of tokens, which makes them excel at general language tasks such as next-token prediction. However, LLMs often struggle with domain-specific prompts, exhibiting reduced accuracy or generating inaccurate information (hallucinations), because they lack sufficient subject matter expertise. Two primary approaches exist for augmenting an LLM's knowledge: Retrieval-Augmented Generation (RAG) and fine-tuning. This presentation focuses on fine-tuning smaller LLMs on domain-specific instruct datasets using the LoRA (Low-Rank Adaptation) technique on Gaudi hardware, leveraging publicly available LLMs and datasets from the Hugging Face Hub. While it is also possible to fine-tune LLMs on plain text sourced from documents, articles, and other materials, this demonstration uses instruct-style datasets.
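To make the appeal of LoRA concrete, the sketch below illustrates the core idea in plain NumPy: the pretrained weight matrix stays frozen, and only a pair of small low-rank matrices is trained, so the trainable parameter count drops dramatically. The dimensions and hyperparameters here are illustrative assumptions, not values from any particular model; in practice a library such as Hugging Face PEFT handles this for you.

```python
import numpy as np

# Hypothetical dimensions for illustration; real LLM layers are much larger.
d_in, d_out, rank = 1024, 1024, 8
alpha = 16  # LoRA scaling hyperparameter

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight

# LoRA adapters: B starts at zero, so training begins exactly at the
# pretrained model's behavior.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))

def lora_forward(x):
    # Frozen path plus scaled low-rank update: W x + (alpha / rank) * B A x
    return W @ x + (alpha / rank) * (B @ (A @ x))

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs full fine-tune: {full_params}")
print(f"reduction: {full_params / lora_params:.0f}x")
```

With these toy dimensions, LoRA trains 16,384 parameters instead of the roughly one million in the full matrix, a 64x reduction; the same ratio scales to real transformer layers, which is what makes fine-tuning feasible on a single accelerator.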



