- Official link: https://15445.courses.cs.cmu.edu/fall2023/project1/
- Time spent: 11 days
- Leaderboard: 40 (2024-10-16)
This project consists of three subtasks: implementing the LRU-K algorithm, the Disk Scheduler, and the Buffer Pool Manager. The first two tasks took five days, and the last one took another five days. Overall, the main challenge was understanding the algorithms and class relationships; the function comments are quite helpful, and if you just follow them, you'll definitely pass `SampleTest`.
First is the LRU-K Replacer, which requires implementing an LRU-style algorithm that compares only the time of the K-th most recent access; the element whose K-th most recent access is earliest is evicted. If all elements have fewer than K accesses, the one with the earliest last access is evicted. The replacer uses a doubly linked list to store each node's access history and a hash table to map `frame_id` to nodes. Once you understand this, implementation is straightforward.
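Here is a minimal sketch of that eviction rule, assuming a simplified `FrameNode` with timestamps stored newest-first; `PickVictim` and the tie-break for frames with fewer than K accesses are illustrative assumptions, not the actual BusTub classes (the handout defines the exact rule).

```cpp
#include <cstdint>
#include <deque>
#include <optional>
#include <unordered_map>

using frame_id_t = int32_t;

struct FrameNode {
  std::deque<uint64_t> history;  // access timestamps, newest at the front
  bool evictable{false};
};

// Pick a victim frame: frames with fewer than k recorded accesses are evicted
// first (compared here by their most recent access, as described above; check
// the handout for the exact tie-break); otherwise the frame whose K-th most
// recent access is oldest wins.
std::optional<frame_id_t> PickVictim(
    const std::unordered_map<frame_id_t, FrameNode> &frames, size_t k) {
  std::optional<frame_id_t> victim;
  bool victim_inf = false;   // does the current victim have < k accesses?
  uint64_t victim_key = 0;

  for (const auto &[fid, node] : frames) {
    if (!node.evictable || node.history.empty()) {
      continue;
    }
    bool inf = node.history.size() < k;
    uint64_t key = inf ? node.history.front() : node.history[k - 1];
    bool better = !victim.has_value() ||
                  (inf && !victim_inf) ||
                  (inf == victim_inf && key < victim_key);
    if (better) {
      victim = fid;
      victim_inf = inf;
      victim_key = key;
    }
  }
  return victim;
}
```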
The second task is the `DiskScheduler`. This part is relatively easy: the `DiskScheduler` uses a worker thread to process read/write requests serially, with a queue to store the requests. This section taught me how to use `std::promise`.
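Below is a minimal sketch of that single-worker pattern, assuming a simplified `DiskRequest` that carries a `std::promise<bool>`; the class and member names are illustrative, not the exact BusTub API, and a `std::nullopt` sentinel is just one way to stop the worker.

```cpp
#include <condition_variable>
#include <future>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>

struct DiskRequest {
  bool is_write;
  char *data;
  int page_id;
  std::promise<bool> callback;  // fulfilled when the request completes
};

class SimpleDiskScheduler {
 public:
  SimpleDiskScheduler() : worker_([this] { Run(); }) {}

  ~SimpleDiskScheduler() {
    Schedule(std::nullopt);  // sentinel tells the worker to exit
    worker_.join();
  }

  void Schedule(std::optional<DiskRequest> r) {
    {
      std::lock_guard<std::mutex> lk(mu_);
      queue_.push(std::move(r));
    }
    cv_.notify_one();
  }

 private:
  void Run() {
    while (true) {
      std::unique_lock<std::mutex> lk(mu_);
      cv_.wait(lk, [this] { return !queue_.empty(); });
      auto r = std::move(queue_.front());
      queue_.pop();
      lk.unlock();
      if (!r.has_value()) {
        return;  // shutdown sentinel
      }
      // ... perform the actual read/write via the disk manager here ...
      r->callback.set_value(true);  // wake up whoever waits on the future
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::optional<DiskRequest>> queue_;
  std::thread worker_;
};

// Example use (hypothetical): the caller keeps the future and blocks on it.
//   SimpleDiskScheduler sched;
//   std::promise<bool> done;
//   auto fut = done.get_future();
//   sched.Schedule(DiskRequest{/*is_write=*/false, buf, /*page_id=*/0, std::move(done)});
//   fut.get();  // returns once the worker has serviced the request
```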
The last part, of course, is the `BufferPoolManager`, which took me several days 🙈. The reason was that I didn't understand what a `Page` and a `Frame` actually were, or what `Page *pages_;` was storing. I managed to score 74 on Gradescope without understanding what `frame_id` was, which is pretty ridiculous 😂. To clarify: a frame is simply the location in memory where a page is stored; in code, the index into the `pages_` array is the `frame_id`, and that's it. The page itself can be anywhere, on disk or in memory. If you want to load a page into memory, you store it in one of the slots of the `pages_` array; in other words, you select a frame in memory and place the page there. You can see this in the `BufferPoolManager` constructor when initializing `free_list_`: the maximum `frame_id` is `pool_size_ - 1`, matching the size of `pages_` in the `BufferPoolManager`. Once you understand this, implementation is much easier if you follow the comments.
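A minimal sketch of that relationship, with simplified names mirroring the discussion above (the real `BufferPoolManager` has many more members and methods): the frames are just the slots of `pages_`, and `free_list_` starts out holding every `frame_id` from `0` to `pool_size_ - 1`.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

using frame_id_t = int32_t;
using page_id_t = int32_t;

struct Page { /* page data, pin count, dirty flag, ... */ };

class SimpleBufferPool {
 public:
  explicit SimpleBufferPool(size_t pool_size) : pool_size_(pool_size) {
    pages_ = new Page[pool_size_];  // one in-memory slot (frame) per element
    // Every slot starts out free, so frame_ids run from 0 to pool_size_ - 1.
    for (size_t i = 0; i < pool_size_; ++i) {
      free_list_.push_back(static_cast<frame_id_t>(i));
    }
  }
  ~SimpleBufferPool() { delete[] pages_; }

  // Loading a page means picking a frame (a slot in pages_) and reading the
  // page's bytes into it; page_table_ remembers which frame holds which page.
  // Page *FetchPage(page_id_t page_id);

 private:
  size_t pool_size_;
  Page *pages_;                                            // the frames themselves
  std::list<frame_id_t> free_list_;                        // frames holding no page
  std::unordered_map<page_id_t, frame_id_t> page_table_;   // page -> frame
};
```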
Leaderboard
I made some simple optimizations recently, which bumped my leaderboard rank from 497 to 40. Here’s a record of what I did:
- 2024-07-21 #Rank 497 - No optimization
- 2024-08-14 #Rank 267 - optim evict: The original `LRUKNode` could record history, but the history had no fixed size. I added an `AddHistory` method; when adding a history record, if the existing records exceed `k_`, I `pop_back` the last one. This way, when you need the K-th most recent access, you can just use `node.history_.back()` (see the first sketch after this list).
- 2024-08-31 #Rank 35 - add multiple io queue: Previously, the `DiskScheduler` only had a single `request_queue_`, which meant the consumer thread blocked the whole queue on every request. I split `request_queue_` into `bucket` queues, each with its own lock; when enqueuing a request, the queue and lock index is `page_id % bucket` (see the second sketch after this list). By reducing lock granularity, my rank improved significantly.
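The first sketch shows the bounded-history idea, assuming a simplified node type; `AddHistory` keeps at most `k_` timestamps, so the K-th most recent access (when it exists) is always `history_.back()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>

struct BoundedHistoryNode {
  explicit BoundedHistoryNode(size_t k) : k_(k) {}

  void AddHistory(uint64_t timestamp) {
    history_.push_front(timestamp);  // newest access at the front
    if (history_.size() > k_) {
      history_.pop_back();           // drop anything older than the k-th access
    }
  }

  // With the trimming above, no traversal is needed during eviction:
  // the K-th most recent access is simply history_.back().
  std::list<uint64_t> history_;
  size_t k_;
};
```

The second sketch shows the bucketed request queue, assuming a hypothetical `kBuckets` constant and a stripped-down `Request` type; each bucket has its own mutex, so enqueues for different pages rarely contend on the same lock.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <queue>

constexpr size_t kBuckets = 8;  // assumed bucket count

struct Request { int page_id; /* data, promise, ... */ };

class ShardedRequestQueue {
 public:
  void Enqueue(Request r) {
    size_t idx = static_cast<size_t>(r.page_id) % kBuckets;  // page_id % bucket
    std::lock_guard<std::mutex> lk(locks_[idx]);
    queues_[idx].push(std::move(r));
  }

 private:
  std::array<std::mutex, kBuckets> locks_;
  std::array<std::queue<Request>, kBuckets> queues_;
};
```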