Projects - search

Filter by:

Limit to posts tagged with one or more of:

In this project, students will evaluate a recently proposed CDC algorithm and compare it against state-of-the-art techniques. Working in teams, students will use open-source implementations and real-world datasets to conduct benchmarking experiments and measure metrics such as chunking throughput, chunk-size distributions, and deduplication efficiency. Team members will focus on complementary tasks, including experiment design, dataset preparation, benchmarking, data analysis, and visualization. The primary goal during the program is to evaluate the algorithm and understand its strengths and limitations. Students who continue beyond the program may also explore integrating the algorithm into an open-source benchmarking framework and investigating further improvements to chunking techniques.


Tags: Systems, Algorithms, C/C++, Go, Python, All Years