Senior Research Fellow
Durham University, School of Engineering and Computing Sciences
Born in Pakistan, Ibad Kureshi completed his O and A Levels at Karachi Grammar School and then began a BSc in Computer Engineering at the Lahore University of Management Sciences. He started a career in the media industry in 2005 and worked at Dawnnews TV (part of the largest media group in Pakistan) as an Associate Producer from 2007 to 2008. Moving to the UK in 2008, Ibad completed his BEng (Hons) in Electronic Engineering and Computer Systems, an MSc (Research) and a PhD in the High Performance Computing Research Group, all at the University of Huddersfield, in 2009, 2011 and 2015 respectively.
While at the University of Huddersfield, Ibad served as the Senior System Administrator for the HPC Resource Centre from 2009 to 2013 and undertook his PhD between 2011 and 2014. Currently he is a Senior Research Fellow with the Institute of Advanced Research Computing and the School of Engineering and Computing Sciences at Durham University.
Other than computing, Ibad’s passions include cycling, dogs, drumming and photography.
Durham University, School of Engineering and Computing Sciences
Durham University, Institute of Advanced Research Computing
University of Huddersfield, School of Computing and Engineering
University of Huddersfield, HPC Resource Centre
University Centre Blackburn College, VET Program
Ph.D. High Performance Computing
University of Huddersfield
HPC Research Group
Master by Research in High Performance Computing
University of Huddersfield
HPC Research Group
Bachelor of Engineering in Electronic Engineering and Computer Science
University of Huddersfield
My research interests lie in the theory, implementation and application of advanced and innovative computer architectures and data manipulation. On the theoretical front my particular interests are in scheduling strategies, system performance profiling and algorithm scalability. On the applied front, my primary focus is middleware design; user workflow and workload management systems; parallelising algorithms; visualisation of 'Big Data' problems; and large-scale system management tools.
My research master's evaluated techniques for delivering Research Computing Infrastructure (RCI) and generated a best-practice guide. My PhD research looked at novel methods to make job schedulers more intelligent by allowing them to make resource allocation decisions using application performance data and heuristic data as cost metrics.
Recently I have taken an interest in Smart Cities and am looking at applying HPC, Visualisation and Big Data strategies to improve urban administration and development. In particular I am interested in the mapping and management of public transport systems in regions where public transport is currently provided in an ad-hoc manner.
Working within Tectre Enterprise Solutions to devise software that models datacentre components for HPC systems. Information such as power and cooling requirements will be used to generate designs and reports to better inform purchase and provisioning decisions.
The University of Huddersfield is an early adopter of the Moonshot project, linking HPC systems with the existing RADIUS servers that serve eduroam.
Developing an in-house MapReduce system to run on the existing cycle-stealing Condor pool.
The project aims to avoid the need for dedicated Hadoop resources and to reduce dependence on commercial providers (e.g. Amazon).
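To illustrate the shape of such a system, here is a minimal word-count style sketch in Python: each map task could run as an ordinary Condor job over one chunk of input, with a final reduce job merging the partial results. The file layout and function names are hypothetical, not the project's actual interface.

```python
# Minimal map/reduce sketch (hypothetical layout; the real system packages
# each phase as jobs submitted to the cycle-stealing Condor pool).
import collections
import glob
import pickle

def map_task(input_path, partial_path):
    """Count words in one input chunk and write a partial result."""
    counts = collections.Counter()
    with open(input_path) as f:
        for line in f:
            counts.update(line.split())
    with open(partial_path, "wb") as f:
        pickle.dump(counts, f)

def reduce_task(partial_glob, output_path):
    """Merge the partial counts produced by all map jobs."""
    total = collections.Counter()
    for path in glob.glob(partial_glob):
        with open(path, "rb") as f:
            total.update(pickle.load(f))
    with open(output_path, "w") as f:
        for word, n in total.most_common():
            f.write(f"{word}\t{n}\n")
```

Because each map task touches only its own chunk, the jobs are embarrassingly parallel and fit the Condor pool's opportunistic scheduling model.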
Large Condor pools whose machines have 500-1000 GB local hard drives are a prime source for storage solutions. While duplication has its flaws, error-correcting codes (like those used in network communication) are being implemented to harness this dormant resource.
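As a simplified illustration of the idea (not the codes actually being implemented), the sketch below uses a single XOR parity block, which can rebuild any one lost chunk; the project targets stronger error-correcting codes than this.

```python
# Single-parity erasure coding sketch: k data chunks plus one XOR parity
# chunk, tolerating the loss of any single chunk (illustration only).

def encode(data: bytes, k: int):
    """Split data into k equal-sized chunks plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
    parity = bytes(size)
    for chunk in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return chunks, parity

def rebuild(surviving_chunks, parity):
    """Recover the single missing chunk by XOR-ing parity with the survivors."""
    missing = bytearray(parity)
    for chunk in surviving_chunks:
        missing = bytearray(a ^ b for a, b in zip(missing, chunk))
    return bytes(missing)

# Example: lose chunk 2 of 4 and rebuild it from the other three plus parity.
chunks, parity = encode(b"telemetry or render output worth keeping", 4)
assert rebuild(chunks[:2] + chunks[3:], parity) == chunks[2]
```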
Developing mechanisms for real-time processing and visualisation of data from X-ray tomography and particle velocimetry devices (to name a few), through an HPC/GPGPU system, to a purpose-built 3D visualisation suite.
Created a mechanism for different HPC managers on different platforms to communicate with each other and share compute end points. A working system was deployed and used for two years. The system included a Windows HPC 2008 head node and a Linux/Torque head node sharing 64 compute nodes.
The system was later enhanced to use virtualised environments on the compute nodes.
Investigated methods of deploying cycle stealing software to harness idle campus resources. Integrated the system with power management features to ensure IT carbon reduction targets were still met.
Added the use of virtualisation to provide different run-time environments.
Designed portals and plugins to work with popular mechanical engineering software packages (e.g. Fluent). This made adoption of HPC technologies easier for undergraduates and hid the complexities of UNIX clusters from the average mechanical engineering user.
Designed a portal system to allow Art students to render their 3D sequences on a cycle stealing render farm and the dedicated on campus HPC systems.
Using grid technologies typically employed for national and international connectivity, a grid system was deployed at the University of Huddersfield. This linked the University's disparate resources and allowed users to scale beyond on-campus resources.
Designed a software solution with mechanical engineers to assess vehicular telemetry data, identifying the portions that matched the New European Driving Cycle (NEDC) standard. The final software took advantage of multicore processors, traditional HPC and HTC systems.
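A hedged sketch of the matching step, assuming a sampled speed trace and a reference NEDC segment at the same sample rate (the threshold and names are illustrative, not the delivered tool):

```python
import numpy as np

def matching_windows(telemetry, reference, tolerance=2.0):
    """Return start indices where the recorded speed trace tracks the
    reference profile to within `tolerance` (RMS deviation, same units)."""
    telemetry = np.asarray(telemetry, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = len(reference)
    hits = []
    for start in range(len(telemetry) - n + 1):
        window = telemetry[start:start + n]
        rmse = np.sqrt(np.mean((window - reference) ** 2))
        if rmse < tolerance:
            hits.append(start)
    return hits
```

Each candidate window is independent, which is what makes this kind of workload straightforward to spread across cores and across HTC resources.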
Investigated, deployed and maintained an Infrastructure as a Service private cloud at the University of Huddersfield. Worked with Software Design teams and faculty members from various disciplines to build Platform and Software services.
Working with the Systems Research Institute at the Polish Academy of Sciences to integrate agents, in the form of the AiG system, with existing grid middleware such as gLite and Unicore.
Based on prior experience of linking job schedulers, this project aims to link TORQUE, Condor and LSF to allow users a single point of submission with a single job description language. This trusted grid also allows for surging within the established private cloud.
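As an illustration of the single-description idea (field names and directives chosen for the sketch, not the project's actual schema), the same generic job could be rendered for two of the target schedulers:

```python
# Hypothetical generic job description rendered as a TORQUE (PBS) script and
# a Condor submit file; the real system also targets LSF and handles staging,
# accounting and authentication, which are omitted here.
job = {
    "name": "lid_driven_cavity",
    "executable": "./solver",
    "arguments": "input.dat",
    "cores": 16,
    "walltime": "02:00:00",
}

def to_pbs(job):
    return "\n".join([
        "#!/bin/bash",
        f"#PBS -N {job['name']}",
        f"#PBS -l nodes=1:ppn={job['cores']}",
        f"#PBS -l walltime={job['walltime']}",
        "cd $PBS_O_WORKDIR",
        f"{job['executable']} {job['arguments']}",
    ])

def to_condor(job):
    return "\n".join([
        "universe     = vanilla",
        f"executable   = {job['executable']}",
        f"arguments    = {job['arguments']}",
        f"request_cpus = {job['cores']}",
        "queue",
    ])

print(to_pbs(job))
print(to_condor(job))
```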
Working with the University of Glasgow on converting an existing SPARQL/RDF tool for detecting errors in National Health Service surgical records to Hadoop. The aim is to reduce the computation time required and make the software scalable. This project was started through involvement in the National Grid Service and the Software Sustainability Institute's Collaborations Workshop held in Oxford in 2012.
Below is a list of publications (journal papers, conference papers and theses) and presentations (posters/abstracts). There are two distinct themes within the publications: the first covers my PhD research work and the second the side projects that I have undertaken. A majority of the side-project papers were written in collaboration with colleagues from the High Performance Computing Research Group and the HPC Resource Centre at the University of Huddersfield. Keywords in each abstract indicate the theme.
The campus grid architectures currently available are considered to be overly complex. We have focused on HTCondor (High Throughput Condor), one of the most popular middlewares among UK universities, and are proposing a new system for unifying campus grid resources. This new system, PBStoCondor, is capable of interfacing with Linux-based systems within the campus grid and automatically determining the best resource for a given job. The system does not require additional effort from users and administrators of the campus grid resources. We have compared real usage data with PBStoCondor system simulation data, and the results show a close match. The proposed system will enable better utilisation of campus grid resources and will not require modification of users' workflows.
High-Performance Computing (HPC) and the ability to process large amounts of data are of paramount importance for UK business and the economy, as outlined by Rt Hon David Willetts MP at the HPC and Big Data conference in February 2014. However, there is a shortage of skills and available training in HPC to prepare and expand the workforce for HPC and Big Data research and development. Currently, HPC skills are acquired mainly by students and staff taking part in HPC-related research projects, MSc courses, and at dedicated training centres such as Edinburgh University's EPCC. Few UK universities teach HPC, Clusters and Grid Computing courses at the undergraduate level. To address the issue of skills shortages in HPC it is essential to provide teaching and training as part of both postgraduate and undergraduate courses. The design and development of such courses is challenging, since the technologies and software in the fields of large-scale distributed systems such as Cluster, Cloud and Grid computing are undergoing continuous change. Students completing the HPC courses should be proficient in these evolving technologies and equipped with practical and theoretical skills for future jobs in this fast-developing area. In this paper we present our experience in developing the HPC, Cluster and Grid modules, including a review of existing HPC courses offered at UK universities. The topics covered in the modules are described, as well as the coursework project based on practical laboratory work. We conclude with an evaluation based on our experience over the last ten years in developing and delivering the HPC modules on undergraduate courses, with suggestions for future work.
In this paper a system for storing and querying medical RDF data using Hadoop is developed. This approach enables us to create an inherently parallel framework that will scale the workload across a cluster. Unlike existing solutions, our framework uses highly optimised joining strategies to enable the completion of eight separate SPARQL queries, comprised of over eighty distinct joins, in only two Map/Reduce iterations. Results are presented comparing an optimised version of our solution against Jena TDB, demonstrating the superior performance of our system and its viability for assessing the quality of medical data.
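For readers unfamiliar with the underlying mechanism, the toy sketch below shows a textbook reduce-side join over RDF triples in plain Python. The predicates are invented, and the paper's contribution of packing over eighty joins into two Map/Reduce passes is not reproduced here.

```python
# Textbook reduce-side join on RDF triples (toy data, invented predicates).
from collections import defaultdict

triples = [
    ("patient1", "hasProcedure", "op42"),
    ("op42", "performedOn", "2012-03-01"),
]

def map_phase(triples):
    """Key each triple on the join value: the object of hasProcedure and the
    subject of performedOn share the procedure identifier."""
    for s, p, o in triples:
        if p == "hasProcedure":
            yield o, ("left", s)
        elif p == "performedOn":
            yield s, ("right", o)

def reduce_phase(mapped):
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    for key, values in groups.items():
        lefts = [v for tag, v in values if tag == "left"]
        rights = [v for tag, v in values if tag == "right"]
        for patient in lefts:
            for date in rights:
                yield patient, key, date   # patient, procedure, date

print(list(reduce_phase(map_phase(triples))))
```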
Computational science and complex system administration rely on being able to model user interactions. When it comes to managing HPC, HTC and grid systems, user workloads - their job submission behaviour - are an important metric when designing systems or scheduling algorithms. Most simulators are either inflexible or tied in to proprietary scheduling systems. For system administrators, being able to model how a scheduling algorithm behaves or how modifying system configurations can affect job completion rates is critical. Within computer science research many algorithms are presented with no real description or verification of behaviour. In this paper we present the Cluster Discrete Event Simulator (CDES) as a strong candidate for HPC workload simulation. Built around an open framework, CDES can take system definitions and multi-platform real usage logs, and can be interfaced with any scheduling algorithm through the use of an API. CDES has been tested against three years of usage logs from a production-level HPC system and verified to greater than 95% accuracy.
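The simulator's core idea can be conveyed with a toy discrete-event model: jobs arrive, wait for enough free nodes, and run to completion under a FIFO policy. This is only a sketch of the approach, not CDES or its API.

```python
# Toy discrete-event cluster simulator with FIFO scheduling (illustration
# only; assumes every job fits on the cluster).
import heapq

def simulate(jobs, total_nodes):
    """jobs: list of (submit_time, nodes_needed, runtime).
    Returns (job_index, start_time, finish_time) tuples."""
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][0])  # by submit time
    running = []                     # min-heap of (finish_time, nodes_held)
    queue, schedule = [], []
    free, clock, i = total_nodes, 0.0, 0
    while i < len(order) or queue or running:
        while running and running[0][0] <= clock:            # free finished jobs
            _, held = heapq.heappop(running)
            free += held
        while i < len(order) and jobs[order[i]][0] <= clock:  # accept arrivals
            queue.append(order[i])
            i += 1
        while queue and jobs[queue[0]][1] <= free:            # start jobs FIFO
            idx = queue.pop(0)
            _, need, runtime = jobs[idx]
            free -= need
            heapq.heappush(running, (clock + runtime, need))
            schedule.append((idx, clock, clock + runtime))
        nxt = ([running[0][0]] if running else []) + \
              ([jobs[order[i]][0]] if i < len(order) else [])
        if nxt:
            clock = max(clock, min(nxt))                      # jump to next event
    return schedule

# Example: three jobs competing for a 4-node system.
print(simulate([(0, 2, 10), (0, 4, 5), (1, 1, 1)], total_nodes=4))
```

Replaying real submission logs through such a loop, with the scheduling policy swapped behind an interface, is the general pattern the paper describes.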
In this paper we present a framework for developing an intelligent job management and scheduling system that utilizes application-specific benchmarks to mould jobs onto available resources. In an attempt to achieve the seemingly irreconcilable goals of maximum usage and minimum turnaround time, this research aims to adapt an open-framework benchmarking scheme to supply information to a mouldable job scheduler. In a green-IT-obsessed world, hardware efficiency and usage of computer systems become essential. With an average computer rack consuming between 7 and 25 kW, it is essential that resources be utilized in the most optimum way possible. Currently the batch schedulers employed to manage these multi-user multi-application environments are nothing more than matchmaking and service level agreement (SLA) enforcing tools. These management systems rely on user-prescribed parameters that can lead to over- or under-booking of compute resources. System administrators strive to get maximum “usage efficiency” from the systems by manual fine-tuning and restricting queues. Existing mouldable scheduling strategies utilize scalability characteristics, which are inherently two-dimensional and cannot provide predictable scheduling information. In this paper we have considered existing benchmarking schemes and tools, schedulers and scheduling strategies, and elastic computational environments. We propose a novel job management system that will extract the performance characteristics of an application, with an associated dataset and workload, to devise optimal resource allocations and scheduling decisions. As we move towards an era where on-demand computing becomes the fifth utility, the end product of this research will cope with elastic computational environments.
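As a worked illustration of using performance data as a cost metric (a toy Amdahl's-law model, not the benchmark-driven scheme the paper proposes), one can pick the core count whose predicted runtime is lowest while still fitting a core-hour budget:

```python
# Toy moulding decision: Amdahl's law as a stand-in for measured benchmarks.

def amdahl_runtime(t1, serial_fraction, p):
    """Predicted runtime in seconds on p cores for a job taking t1 serially."""
    return t1 * (serial_fraction + (1 - serial_fraction) / p)

def choose_cores(t1, serial_fraction, max_cores, max_core_hours):
    """Pick the core count with the lowest predicted runtime within budget."""
    best = 1
    for p in range(1, max_cores + 1):
        runtime = amdahl_runtime(t1, serial_fraction, p)
        within_budget = runtime * p / 3600 <= max_core_hours
        if within_budget and runtime < amdahl_runtime(t1, serial_fraction, best):
            best = p
    return best

# e.g. a 2-hour serial job with a 5% serial fraction, capped at 8 core-hours.
print(choose_cores(t1=7200, serial_fraction=0.05, max_cores=64, max_core_hours=8))
```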
Cloud computing, which evolved from grid computing, virtualisation and automation, has the potential to deliver a variety of services to the end user via the Internet. Using the Web to deliver Infrastructure, Software and Platform as a Service (IaaS/SaaS/PaaS) has the benefit of reducing the cost of investment in the internal resources of an organisation. It also provides greater flexibility and scalability in the utilisation of the resources. There are different cloud deployment models - public, private, community and hybrid clouds. This paper presents the results of research and development work in deploying a private cloud using OpenStack at the University of Huddersfield, UK, integrated into the University campus Grid QGG. The aim of our research is to use a private cloud to improve the High Performance Computing (HPC) research infrastructure. This will lead to a flexible and scalable resource for research, teaching and assessment. As a result of our work we have deployed the private QGG-cloud and devised a decision matrix and the mechanisms required to expand HPC clusters into the cloud, maximising the resource utilisation efficiency of the cloud. As part of the teaching and assessment of computing courses an Automated Formative Assessment (AFA) system was implemented in the QGG-Cloud. The system utilises the cloud’s flexibility and scalability to assign and reconfigure the required resources for different tasks in the AFA. Furthermore, the throughput characteristics of assessment workflows were investigated and analysed so that the requirements for cloud-based provisioning can be adequately met.
Over the last decade Grid Computing Technology, an innovative extension of distributed computing, has become an enabler for computing resource sharing among the participants in "Virtual Organisations" (VO) [1]. Although there exist enormous research efforts on grid-based collaboration technologies, most of them are concentrated on large research and business institutions. In this paper we consider the adoption of Grid Computing Technology in a VO of small to medium Further Education (FE) and Higher Education (HE) institutions. We concentrate on the resource sharing among the campuses of the University of Huddersfield in Yorkshire and colleges in Lancashire, UK, enabled by the Grid. Within this context, it is important to focus on standards that support resource and information sharing, and on toolkits and middleware solutions that would promote Grid adoption among the FE/HE institutions in the Virtual HE organisation.
The cloud computing paradigm promises to deliver hardware to the end user at a low cost with an easy-to-use interface via the internet. This paper outlines an effort at the University of Huddersfield to deploy a private Infrastructure as a Service cloud to enhance the student learning experience. The paper covers the deployment methods and configurations for OpenStack along with the security provisions that were taken to deliver computer hardware. The rationale behind the provision of virtual hardware and OS configurations is described in detail, supported by examples. This paper also covers how the resource has been used within the taught courses as a Virtual Laboratory, and in research projects. A second use case of the cloud, for Automated Formative Assessment (AFA) using JClouds and Chef for Continuous Integration, is presented. The AFA deployment is an example of a Software as a Service offering that has been added on to the IaaS cloud. This development has led to an increase in freedom for the student.
In an effort to deliver HPC services to the research community at the University of Huddersfield, many grid middlewares have been deployed in parallel to assess their effectiveness and efficiency along with their user-friendliness. With a disparate community of researchers spanning, but not limited to, 3D art designers, architects, biologists, chemists, computer scientists, criminologists, engineers (electrical and mechanical) and physicists, no single solution works well. As HPC is delivered as a centralised service, an ideal solution would be one that meets a majority of the needs most of the time. The scenario is further complicated by the fact that the HPC service delivered at the University of Huddersfield comprises several small high performance clusters, a high throughput computing service, several storage resources and a shared HPC service hosted off-site.
Grid computing has, in recent history, become an invaluable tool for scientific research. As grid middleware has matured, considerations have extended beyond the core functionality, towards greater usability. The aim of this paper is to consider how resources that are available to the users across the Queensgate Grid (QGG) at the University of Huddersfield (UoH), could be accessed with the help of an ontology-driven interface.
The interface is a part of the Agent in Grid (AiG) project under development at the Systems Research Institute of the Polish Academy of Sciences (SRIPAS). It is to be customized and integrated with the UoH computing environment. The overarching goal is to help users of the grid infrastructure. The secondary goals are: (i) to improve the performance of the system, and (ii) to equalize the distribution of work among resources. Results presented in this paper include the new ontology that is being developed for the grid at the UoH, and a description of issues encountered during the development of a scenario in which a user searches for an appropriate resource within the Unicore grid middleware and submits a job to be executed on that resource.
In this paper we present a cluster middleware, designed to implement a Linux-Windows hybrid HPC cluster, which not only holds the characteristics of both operating systems but also accepts and schedules jobs in both environments. Beowulf clusters have become an economical and practical choice for small and medium-sized institutions to provide High Performance Computing (HPC) resources. The HPC resources are required for running simulations, image rendering and other calculations, and to support software requiring a specific operating system. To support such software, small-scale computer clusters would have to be divided into two or more clusters if each is to run a single operating system. x86 virtualisation technology would help run multiple operating systems on one computer, but only with the latest hardware, which many legacy Beowulf clusters do not have. To aid the institutions who rely on legacy, non-virtualisation-supported facilities rather than high-end HPC resources, we have developed and deployed a bi-stable hybrid system built around Linux CentOS 5.5 with the improved OSCAR middleware, and Windows Server 2008 with Windows HPC 2008 R2. This hybrid cluster is utilised as part of the University of Huddersfield campus grid.
High Throughput Computing (HTC) systems are designed to utilise available resources on a network of idle machines in an institution or organization by cycle stealing. It provides an additional ‘free’ resource from the existing computing and networking infrastructure for modelling and simulation requiring a large number of small jobs, such as applications from biology, chemistry, physics, and digital signal processing. At the University of Huddersfield, there are thousands of idle laboratory machines that could be used to run serial/parallel jobs by cycle stealing. Our HTC system, implemented in Condor [1], is part of the Queensgate Campus Grid (QGG) [2] that consists of a number of dedicated departmental and university computer clusters.
Condor is an excellent HTC tool that excels in cycle stealing and job scheduling on idle machines. However, only idle, powered-on machines can be used from a networked pool. Many organizations deploy power saving mechanisms to try to reduce energy consumption in their systems, and power down idle resources using rigid and inflexible power management policies. The University of Huddersfield Computing Services use the Energy Star EZ GPO power saving tool, which runs as a Windows service and detects how long the computer has been idle. It then allows the computer to first turn off the screen and then go into hibernation.
Our research and development work is focused on implementing an HTC system using Condor that works within the “green IT” policy of a higher education institution, conforming to green IT challenges for a multi-platform, multi-discipline user/resource base. This system will allow Condor to turn on machines that may have gone to sleep due to lack of usage when there is a large queue of pending jobs. The decision to utilise dormant resources will be made on a variety of factors such as job priority, job requirements, user priority, time of day, flocking options, queue conditions etc. Good-practice scheduling policies would need to be devised that work within this “green IT” pool.
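One standard primitive such a system could build on is Wake-on-LAN; the sketch below sends a magic packet to a hibernating worker. The MAC and broadcast addresses are placeholders, and the Condor-side policy logic is not shown.

```python
# Wake-on-LAN magic packet: 6 bytes of 0xFF followed by the target MAC
# repeated 16 times, sent as a UDP broadcast (placeholder addresses).
import socket

def wake(mac="00:11:22:33:44:55", broadcast="192.168.1.255", port=9):
    payload = bytes.fromhex("FF" * 6 + mac.replace(":", "") * 16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(payload, (broadcast, port))
```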
The advent of open source software, leading to Beowulf clusters, has enabled small to medium-sized Higher and Further Education institutions to remove the “computational power” factor from research ventures. In an effort to catch up with leading universities in the realm of research, many universities are investing in small departmental HPC clusters to help with simulations, renders and calculations. These small HE/FE institutions have in the past benefited from cheaper software and operating system licenses. This raises the question as to which platform, Linux or Windows, should be implemented on the cluster. As smaller and medium-sized universities move into research, many Linux-based applications and codes better suit their research needs, but the teaching base still keeps departments tied to Windows-based applications. In such institutions, where it is usually recycled machines that are linked to form the clusters, it is not often feasible to set up more than one cluster.
This paper proposes a method to implement a Linux-Windows hybrid HPC cluster that seamlessly and automatically accepts and schedules jobs in both domains. Using Linux CentOS 5.4 with the OSCAR 5.2 beta 2 middleware, alongside Windows Server 2008 and Windows HPC 2008 R2 (beta), a bi-stable hybrid system has been deployed at the University of Huddersfield. This hybrid cluster is known as the Queensgate Cluster. We also examine innovative solutions and practices that are currently being followed in the academic world as well as those that have been recommended by Microsoft® Corp.
I teach at both undergraduate and postgraduate levels. I have predominantly taught courses related to Computer Systems and Computer System Architectures, but I also teach Signal Processing and its applications in Engineering and Music Technology. I am always interested in supervising short projects (final year or masters) within the parallel and distributed computing field. I also consider projects related to embedded engineering and instrumentation as it relates to the management of large-scale compute systems.
In this module students critically evaluate the development of software solutions across existing and emerging technology areas. The module is divided into four parts; I teach the cloud computing technology area. Through the use of relevant case studies, students understand and apply fundamental principles of applied system solutions to a range of real-world problems. Specifically in Cloud Computing, the course covers the applications, challenges and demands that drive developments, and the architectures along with the service models and current technologies.
This beginner session provides an introduction to front-end web development, covering the basics of HTML/CSS and jQuery. You will learn how to design the layout and format of a webpage as well as how to code collaboratively using GitHub.
In this module students are introduced to Computer Cluster, Cloud and Grid technologies and applications. Semester one focuses on the fundamental components of Cluster environments, such as Commodity Components for Clusters, Network Services/Communication software, Cluster Middleware, Resource management, and Programming Environments. In semester two, students study the fundamental components of Grid environments, such as Authentication, Authorization, Resource access, and Resource discovery. The hands-on laboratory exercises provide the necessary practical experience with Cluster and Grid middleware software required to construct Cluster and Grid applications.
The module combines the theory of signal processing and analysis of discrete time systems with practical aspects of Digital Signal Processing (DSP) applied to the design of digital filters. Semester one focuses on signal processing operations and analysis in the time and frequency domains, and on digital FIR and IIR filter design and simulation using MATLAB. In semester two students implement their digital filter design using a DSP software and hardware development system. A range of DSP design case studies, e.g. audio filters, two-dimensional filters and adaptive filters, is used to illustrate typical DSP applications through practical laboratory work.
Virtual instruments represent a change from traditional hardware-centred instrumentation systems to software-centred systems that use the computing power, display, and connectivity capabilities of desktop computers and workstations. With virtual instruments, engineers and scientists build measurement and automation systems that suit their needs exactly (user-defined) instead of being limited by traditional fixed-function instruments (manufacturer-defined). In this module students learn fundamentals of programming in LabVIEW and acquire skills to design effective solutions to variety of instrumentation problems. The laboratory exercises provide the necessary practical experience required to design and develop computer-based systems to emulate a range of instruments.
Many existing and future computer-based applications impose exceptional demands on performance that traditional predominantly single-processor systems cannot offer. Large-scale computational simulations for scientific and engineering applications now routinely require highly parallel computers. In this module you will learn about Parallel Computer Architectures, Legacy and Current Parallel Computers, trends in Supercomputers and Software Issues in Parallel Computing; you will be introduced to Computer Cluster, Cloud and Grid technologies and applications. Students study the fundamental components of Cluster environments, such as Commodity Components for Clusters, Network Services/Communication software, Cluster Middleware, Resource management, and Programming Environments. The hands-on laboratory exercises provide the necessary practical experience with Cluster middleware software required to construct Cluster applications.
The module combines the theory of signal processing and analysis of audio systems with practical aspects of Digital Signal Processing (DSP). Students learn about digital filter design, Digital Signal Processors (DSPs) and their applications in audio systems. Semester one focuses on signal processing operations and analysis in the time and frequency domains, and on digital FIR and IIR filter design and simulation using MATLAB and LabVIEW. In semester two students apply their digital filter designs to create artificial digital audio effects, using a DSP software and hardware development system. Case studies are used to illustrate typical audio DSP applications.
In this module students are introduced to MATLAB and SIMULINK software to enable modelling of the dynamic response of instruments, devices and systems to different types of input - for example thermometers, DC motors, electronic filters and suspension systems. Students learn how Laplace Transforms are used to simulate processes and how they are used in the design of controllers for controlling the output of complex systems - such as position control systems. Students design simple controllers for various processes using Proportional and Integral control and learn how to determine whether such systems are likely to become unstable. Further analysis techniques such as Discrete Fourier Transforms are also taught.
I would be happy to talk to you if you need my assistance in your research or studies, or if you need some system administration support for your company (coffee is on you ;) ). I usually operate an open-door policy, so feel free to queue up if you see me in the office. My formal lecture timetable is posted outside my office.
You can find me at my office in the Christopherson building, School of ECS at Durham University. The office is E291, Institute of Advanced Research Computing. I am in my office every day from 10:00 am until 4:00 pm, but you may wish to call ahead to arrange an appointment.