Candidate: Hadi Omidi
Date: February 12, 2024
Time: 1:00 PM
Location: Online via Teams
Supervisors: Seyed Majid Zahedi & Nachiket Kapre
All are welcome!
Abstract:
The performance of machine learning applications depends heavily on the underlying hardware architecture, including its computational power, scalability, memory, and storage capabilities. These hardware choices significantly affect the efficiency and effectiveness of machine learning systems. Resource-intensive programs compete for system resources and cause delays, while inefficient resource usage can saturate resources and harm the user experience. Because resource demands vary across applications, systems rely on resource sharing, which lets applications acquire resources dynamically as needed and promotes efficient utilization. However, resource allocation strategies often prioritize performance and can overlook fairness among users or applications, especially in shared environments. Balancing performance optimization against fair resource allocation is a complex challenge that requires mechanisms that encourage resource sharing, prevent envy, and ensure a fair distribution of resources. Mechanisms with these properties promote collaboration, reduce dissatisfaction among participants, and account for the well-being of everyone in the system. This research introduces MARS, a resource allocation mechanism that addresses shortcomings of traditional methods. MARS prioritizes both fairness and efficiency in resource distribution: it uses a token-based mechanism to ensure fairness, and it captures individual preferences through thresholds learned with an Actor-Critic method to improve efficiency. In simulations with 40 accelerators and 20 agents across different environments, MARS achieves a 1.2× performance improvement over standard approaches. This study contributes by shedding light on the challenges of resource allocation in heterogeneous systems and by providing a practical solution in MARS.