Systems Optimizations for Learning and Processing on Large Scale Graphs
Author: Morteza Ramezani
Publisher:
Total Pages: 0
Release: 2022
ISBN-10: OCLC:1367865150
ISBN-13:
Rating: 4/5
Download or read book Systems Optimizations for Learning and Processing on Large Scale Graphs, written by Morteza Ramezani and released in 2022. Available in PDF, EPUB and Kindle. Book excerpt:

Graphs are powerful data representations favored in many computation domains, and graph-based data structures have grown in popularity in recent years. However, processing large graphs and learning useful information from them are both data- and compute-intensive. Unlike traditional data structures, where data independence allows for more parallelism and optimization, the dependencies between nodes in a graph lead to a larger memory footprint, limited parallelism, privacy concerns, and excessive random memory accesses. This wide range of problems makes the design of optimized systems for graph-related tasks quite challenging.

Among these tasks, graph processing systems and graph learning have gained increasing attention in recent years. In graph processing systems, the objective is to design and implement an efficient system for running graph analytic tasks such as graph traversal or node ranking. On the other hand, as deep neural networks have proven successful in a diverse range of applications involving images and sequences, several efforts have emerged to generalize deep neural networks to graphs. In either case, however, the special characteristics of graphs noted above lead to extensive memory requirements, inefficient memory access latency, and a communication burden that hinder system performance. They also limit the use of ever-popular distributed settings for this type of data, as new challenges arise when adapting distributed systems to graphs.

In this dissertation, we address these system limitations for processing and learning on graphs by introducing novel techniques inspired by both systems and theoretical considerations. More specifically, we exploit the approximation tolerance of graph processing applications to design and develop an approximate graph processing system, motivated by both approximate computing methods and approximate graph algorithms. We introduce a novel framework which, unlike existing systems, does not require significant pre-processing time while maintaining output accuracy.

In the second part of this dissertation, we focus on learning on graphs, and more specifically on graph neural networks. We investigate the main bottlenecks in training graph neural networks in both centralized and distributed settings. While the underlying root of the inefficiency is the same in both cases, in centralized systems such as CPU-GPU platforms the limited memory capacity of the GPU leads to a significant decrease in training performance. We propose a new sampling technique that takes advantage of this architecture and the unique structure of graphs to compensate for the memory limitation. Furthermore, for distributed learning systems, we first examine the effect of using existing platforms for learning on graphs and show that current techniques can lead to a significant reduction in model performance. We then propose a novel approach that improves the accuracy of learning on graphs in distributed systems by leveraging a global correction technique.
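
The excerpt's first contribution, an approximate graph processing system, rests on the observation that many graph analytic tasks such as node ranking tolerate some approximation. The sketch below is a minimal illustration of that trade-off, not the dissertation's framework: it runs a plain power-iteration PageRank on a randomly sparsified copy of the edge list and compares it against the exact result. The sparsification scheme, keep probability, and random graph generator are all assumptions made for this example.

```python
# Illustrative only: trade accuracy for work by sparsifying the graph
# before running PageRank. Not the dissertation's actual system.
import random
from collections import defaultdict

def pagerank(edges, num_nodes, damping=0.85, iters=50):
    """Power-iteration PageRank over a directed edge list."""
    out_neighbors = defaultdict(list)
    for src, dst in edges:
        out_neighbors[src].append(dst)
    rank = [1.0 / num_nodes] * num_nodes
    for _ in range(iters):
        new_rank = [(1.0 - damping) / num_nodes] * num_nodes
        for src in range(num_nodes):
            targets = out_neighbors[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Dangling node: spread its rank uniformly.
                for dst in range(num_nodes):
                    new_rank[dst] += damping * rank[src] / num_nodes
        rank = new_rank
    return rank

def sparsify(edges, keep_prob):
    """Keep each edge independently with probability keep_prob."""
    return [e for e in edges if random.random() < keep_prob]

if __name__ == "__main__":
    random.seed(0)
    n = 200
    edges = [(random.randrange(n), random.randrange(n)) for _ in range(4000)]
    exact = pagerank(edges, n)
    approx = pagerank(sparsify(edges, keep_prob=0.5), n)  # roughly half the work per iteration
    max_err = max(abs(a - b) for a, b in zip(exact, approx))
    print(f"max absolute rank error with 50% of edges: {max_err:.4f}")
```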
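
For the centralized CPU-GPU setting, the excerpt proposes a sampling technique to work around the GPU's limited memory. The dissertation's specific sampler is not described in the excerpt, so the sketch below shows only the generic idea such techniques build on: capping the per-node fan-out when expanding a minibatch's multi-hop neighborhood, so that the computation graph whose features must be copied to the GPU stays within a predictable size. The fan-out, hop count, and random-graph setup are assumptions.

```python
# Illustrative only: generic fan-out-limited neighbor sampling, not the
# specific sampling technique proposed in the dissertation.
import random
from collections import defaultdict

def build_adjacency(edges):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)  # treat the graph as undirected for aggregation
    return adj

def sample_computation_graph(adj, batch_nodes, fanout=5, hops=2):
    """Return the nodes needed for `hops` layers of aggregation,
    sampling at most `fanout` neighbors per node at each hop."""
    frontier = set(batch_nodes)
    sampled = set(batch_nodes)
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            neighbors = adj[node]
            if len(neighbors) > fanout:
                neighbors = random.sample(neighbors, fanout)
            next_frontier.update(neighbors)
        sampled |= next_frontier
        frontier = next_frontier
    return sampled

if __name__ == "__main__":
    random.seed(0)
    n = 10_000
    edges = [(random.randrange(n), random.randrange(n)) for _ in range(50_000)]
    adj = build_adjacency(edges)
    batch = random.sample(range(n), 512)
    needed = sample_computation_graph(adj, batch, fanout=5, hops=2)
    # Only the features of `needed` nodes would be copied to the GPU for this step.
    print(f"full graph: {n} nodes; sampled computation graph: {len(needed)} nodes")
```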
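
The distributed part of the excerpt mentions a global correction technique but gives no details, so the following is one speculative, purely illustrative reading: workers aggregate neighbor features using only the edges inside their own partition (cheap, no communication), and an occasional aggregation over the full edge set corrects the drift introduced by the dropped cross-partition edges. The mean-aggregation model, the partitioning, and the correction schedule are this example's assumptions, not the dissertation's method.

```python
# Illustrative only: partition-local aggregation with a periodic full-graph
# "correction" pass; an assumed stand-in, not the dissertation's technique.
import numpy as np

def mean_aggregate(features, edges, num_nodes):
    """One round of mean aggregation of neighbor features (with self-loops)."""
    acc = features.copy()                      # self contribution
    count = np.ones(num_nodes)
    for u, v in edges:
        acc[u] += features[v]
        acc[v] += features[u]
        count[u] += 1
        count[v] += 1
    return acc / count[:, None]

def partitioned_training(features, edges, partition, rounds=8, correction_every=4):
    local_edges = [(u, v) for u, v in edges if partition[u] == partition[v]]
    h = features.copy()
    for step in range(1, rounds + 1):
        if step % correction_every == 0:
            h = mean_aggregate(h, edges, len(h))        # global correction
        else:
            h = mean_aggregate(h, local_edges, len(h))  # cheap local step
    return h

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 1000, 16
    edges = [(int(a), int(b)) for a, b in rng.integers(0, n, size=(5000, 2))]
    partition = rng.integers(0, 4, size=n)              # 4 workers
    feats = rng.normal(size=(n, d))
    corrected = partitioned_training(feats, edges, partition)
    local_only = partitioned_training(feats, edges, partition, correction_every=10**9)
    full = feats.copy()
    for _ in range(8):
        full = mean_aggregate(full, edges, n)
    print("error vs. full-graph result, local only:", float(np.linalg.norm(local_only - full)))
    print("error vs. full-graph result, corrected :", float(np.linalg.norm(corrected - full)))
```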