
Chunking Homework Exercise

Jupyter notebook with document chunking exercise using semantic chunking.
Homework Instructions:

In this exercise, you will implement an advanced RAG system that finds legal cases in large documents. You will practice loading a large PDF of more than 1,000 pages, creating an encoder tuned for legal text, splitting the document with semantic chunking, uploading the chunks into a vector database, and searching the index with a filter on jurisdiction. The exercise notebook has some of its code missing. Complete the code by following the instructions in the notebook, then answer the questions below using the data shown in the output cells.
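To make the two central ideas concrete before you open the notebook, here is a minimal sketch of semantic chunking followed by a jurisdiction-filtered search. It is not the notebook's code: the `embed` function is a toy bag-of-words stand-in for the legal-domain encoder, the in-memory list stands in for the vector database, and the function names, sample sentences, jurisdiction labels, and the 0.3 threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real encoder (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(sentences, threshold=0.3):
    """Start a new chunk when adjacent sentences are less similar than the threshold."""
    if not sentences:
        return []
    chunks, current = [], [sentences[0]]
    for prev, sent in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(sent)) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks


def search(index, query, jurisdiction, top_k=1):
    """Rank only the records whose metadata matches the jurisdiction filter."""
    candidates = [r for r in index if r["jurisdiction"] == jurisdiction]
    q = embed(query)
    ranked = sorted(candidates, key=lambda r: cosine(q, embed(r["text"])), reverse=True)
    return ranked[:top_k]


# Hypothetical sample data, not taken from the exercise PDF.
sentences = [
    "The tenant breached the lease agreement.",
    "The lease agreement required monthly rent payments.",
    "The defendant was charged with burglary.",
    "The burglary charge was later dismissed.",
]
chunks = semantic_chunks(sentences)

# Each stored record pairs a chunk with filterable metadata.
index = [
    {"text": chunks[0], "jurisdiction": "NY"},
    {"text": chunks[1], "jurisdiction": "CA"},
]
for hit in search(index, "burglary charge", jurisdiction="CA"):
    print(hit["jurisdiction"], "|", hit["text"])
```

The filter is applied before ranking, so a chunk from another jurisdiction can never be returned, however similar it is to the query. A real vector database typically takes such a filter as part of the query itself; the notebook shows the exact call for the database used in the course.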

The notebook (“/exercise/02_advanced_chunking_exercise.ipynb”) is in the exercise folder of the GitHub repository.

This article is from the free online course Advanced Retrieval-Augmented Generation (RAG) for Large Language Models.

Created by
FutureLearn - Learning For Life
