Web-based multi-user concurrent job scheduling system on the shared computing resource objects

Tetiana Habuza, Khaled Khalil, Nazar Zaki, Fady Alnajjar, Munkhjargal Gochoo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

In this paper, we propose a user-friendly job and resource allocation manager for an ML server. We introduce some unique features of the designed system, such as protection of users' sensitive data, automatic cleaning of unused information, securing the host OS via environment virtualization (containers), and direct access to the container via SSH. The proposed web-based tool allows users to request and allocate resources on a server and monitor the progress of their tasks. It is designed to simplify access to servers, particularly ML servers, and to allocate computational resources while satisfying data security concerns. The proposed tool also relieves system administrators from manually allocating resources to users and monitoring their progress. The tool is user-friendly and transparent, so the system administrator and the users can view all jobs in progress and find the best allocation for their tasks. The implementation code, deployment instructions, and supplementary files are available at https://github.com/UAEUIRI/DGX1-scheduler.
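The request, allocate, and monitor cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that lives in the repository linked above); it is a toy first-come-first-served allocator written for illustration only. The names `Job`, `GpuScheduler`, `submit`, `release`, and `status` are hypothetical, and the default of 8 GPUs is an assumption about the target server.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare jobs by identity, not by field values
class Job:
    user: str
    gpus_needed: int
    assigned: list = field(default_factory=list)  # GPU ids held while running


class GpuScheduler:
    """Toy first-come-first-served allocator for a fixed set of GPUs.

    Hypothetical illustration; not the scheduler shipped with the paper.
    """

    def __init__(self, total_gpus=8):  # 8 is an assumed server size
        self.total_gpus = total_gpus
        self.free = list(range(total_gpus))  # ids of idle GPUs
        self.running = []                    # jobs currently holding GPUs
        self.waiting = deque()               # jobs queued in arrival order

    def submit(self, user, gpus_needed):
        """Queue a request and start it immediately if enough GPUs are idle."""
        if not 1 <= gpus_needed <= self.total_gpus:
            raise ValueError("request must be between 1 and total_gpus")
        job = Job(user, gpus_needed)
        self.waiting.append(job)
        self._dispatch()
        return job

    def release(self, job):
        """Return a running job's GPUs to the pool and start queued jobs."""
        self.running.remove(job)
        self.free.extend(job.assigned)
        self.free.sort()
        job.assigned = []
        self._dispatch()

    def _dispatch(self):
        # Strict FIFO: stop at the first queued job that does not fit.
        while self.waiting and self.waiting[0].gpus_needed <= len(self.free):
            job = self.waiting.popleft()
            job.assigned = [self.free.pop(0) for _ in range(job.gpus_needed)]
            self.running.append(job)

    def status(self):
        """Snapshot that both users and administrators could inspect."""
        return {
            "free": list(self.free),
            "running": [(j.user, list(j.assigned)) for j in self.running],
            "waiting": [(j.user, j.gpus_needed) for j in self.waiting],
        }
```

In this sketch a request that does not fit simply waits until an earlier job calls `release`; a real deployment would also need the container isolation, SSH access, and data cleanup that the abstract lists, none of which is modeled here.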

Original language: English
Title of host publication: Proceedings of the 2020 14th International Conference on Innovations in Information Technology, IIT 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 221-226
Number of pages: 6
ISBN (Electronic): 9781728181844
DOIs
Publication status: Published - Nov 17 2020
Event: 14th International Conference on Innovations in Information Technology, IIT 2020 - Al Ain, United Arab Emirates
Duration: Nov 17 2020 to Nov 18 2020

Publication series

Name: Proceedings of the 2020 14th International Conference on Innovations in Information Technology, IIT 2020

Conference

Conference: 14th International Conference on Innovations in Information Technology, IIT 2020
Country/Territory: United Arab Emirates
City: Al Ain
Period: 11/17/20 to 11/18/20

Keywords

  • AI
  • DGX server
  • Deep learning server
  • job scheduling system

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
