TY - GEN
T1 - Web-based multi-user concurrent job scheduling system on the shared computing resource objects
AU - Habuza, Tetiana
AU - Khalil, Khaled
AU - Zaki, Nazar
AU - Alnajjar, Fady
AU - Gochoo, Munkhjargal
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/11/17
Y1 - 2020/11/17
N2 - In this paper, we propose a user-friendly job and resource allocation manager for an ML server. We introduce several distinctive features of the designed system, such as protection of users' sensitive data, automatic cleaning of unused information, securing of the host OS via environment virtualization (containers), and direct access to the container via SSH. The proposed web-based tool allows users to request and allocate resources on a server and to monitor the progress of their tasks. It is designed to simplify access to servers, particularly ML servers, and to allocate computational resources while satisfying data security concerns. The tool also relieves system administrators from manually allocating resources to users and monitoring progress. The tool is user-friendly and transparent, so both the system administrator and the user can view all jobs in progress and find the best allocation for their tasks. The implementation code, deployment instructions, and supplementary files are available at https://github.com/UAEUIRI/DGX1-scheduler.
AB - In this paper, we propose a user-friendly job and resource allocation manager for an ML server. We introduce several distinctive features of the designed system, such as protection of users' sensitive data, automatic cleaning of unused information, securing of the host OS via environment virtualization (containers), and direct access to the container via SSH. The proposed web-based tool allows users to request and allocate resources on a server and to monitor the progress of their tasks. It is designed to simplify access to servers, particularly ML servers, and to allocate computational resources while satisfying data security concerns. The tool also relieves system administrators from manually allocating resources to users and monitoring progress. The tool is user-friendly and transparent, so both the system administrator and the user can view all jobs in progress and find the best allocation for their tasks. The implementation code, deployment instructions, and supplementary files are available at https://github.com/UAEUIRI/DGX1-scheduler.
KW - AI
KW - DGX server
KW - Deep learning server
KW - job scheduling system
UR - http://www.scopus.com/inward/record.url?scp=85099473198&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099473198&partnerID=8YFLogxK
U2 - 10.1109/IIT50501.2020.9299110
DO - 10.1109/IIT50501.2020.9299110
M3 - Conference contribution
AN - SCOPUS:85099473198
T3 - Proceedings of the 2020 14th International Conference on Innovations in Information Technology, IIT 2020
SP - 221
EP - 226
BT - Proceedings of the 2020 14th International Conference on Innovations in Information Technology, IIT 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 14th International Conference on Innovations in Information Technology, IIT 2020
Y2 - 17 November 2020 through 18 November 2020
ER -