Qizhen Zhang
I am an Assistant Professor of Computer Science at the University of Toronto, where I lead the Far Data Lab (FDL). I am broadly interested in data management and computer systems. My research bridges cloud data processing systems and data center networks to address emerging challenges in hyperscale data processing.
Email, CV, Google Scholar, LinkedIn
I am actively looking for motivated students to work on cloud data systems, data center networking, and interactions between ML and systems. Come join us at FDL. If you are interested, please apply to UofT CS and mention my name. Feel free to drop me an email.
Oct. 2024 | DPDPU is accepted at CIDR 2025 |
Sep. 2024 | I finally moved to Toronto |
Jul. 2024 | Vision paper on data processing with DPUs is available on arXiv |
Jun. 2024 | DDS is accepted at VLDB 2024 |
Feb. 2024 | I am visiting the National University of Singapore |
Feb. 2024 | Vision paper on cloud-native LMs is available on arXiv |
May 2023 | Cowbird is accepted at SIGCOMM 2023 |
Mar. 2023 | I will teach memory-disaggregated DBMSs at SIGMOD 2023 |
Oct. 2022 | TeShu is accepted at CIDR 2023 |
Aug. 2022 | I am now at Microsoft Research |
Aug. 2022 | FlexChain is accepted at VLDB 2023 |
Older News |
Aug. 2022 | I defended my dissertation and received my Ph.D. |
Jun. 2022 | I will join the University of Toronto as an Assistant Professor in Fall 2023 |
Mar. 2022 | My work is recognized with Penn's best CS dissertation award (Rubinoff Award) |
Dec. 2021 | TELEPORT is accepted at SIGMOD 2022 (happening in Philly next year!) |
Oct. 2021 | Redy is accepted at VLDB 2022 |
Oct. 2021 | CompuCache is accepted at CIDR 2022 |
Apr. 2021 | MimicNet is accepted at SIGCOMM 2021 |
Apr. 2020 | Understanding DBMSs in DDCs is accepted at VLDB 2020 |
Oct. 2019 | Rethinking data processing systems in DDCs is accepted at CIDR 2020 |
Mar. 2019 | I am selected as the 2019-2020 Jonathan M. Smith Fellow |
Nov. 2018 | GraphRex is accepted at SIGMOD 2019 |
Aug. 2017 | Predicting startup crowdfunding success is accepted at CIKM 2017 |
Jul. 2017 | Analyzing the performance and cost of graph systems is accepted at SoCC 2017 |
Aug. 2016 | I started Ph.D. studies at the University of Pennsylvania |
Today's largest data processing workloads are hosted in cloud data centers. Due to unprecedented data growth, these workloads have ballooned to hyperscale, encompassing billions to trillions of data items and hundreds to thousands of machines per query. Enabling and growing alongside these workloads are highly scalable data center networks that connect up to hundreds of thousands of servers. At hyperscale, the classic layered designs are no longer sustainable: without knowing how their data is transferred in the network, applications can make egregiously poor execution decisions, and the cloud infrastructure performs poorly unless the interfaces and services it exposes to applications are rethought. Rather than optimizing these massive layers in silos, I build systems across them with principled network-centric designs for efficient hyperscale data processing.
I have also worked on other aspects of data processing, including data science applications with machine learning [CIKM '17] and the cost efficiency of data processing [SoCC '17]. Recently, I proposed CompuCache [CIDR '22], a new cloud service that exploits spot VMs for remote caching and memory-intensive compute offloading.
* students I advise
DPDPU: Data Processing with DPUs [arXiv] Jiasheng Hu*, Philip Bernstein, Jialin Li, Qizhen Zhang Conference on Innovative Data Systems Research, CIDR 2025, to appear
DDS: DPU-optimized Disaggregated Storage Qizhen Zhang, Philip Bernstein, Badrish Chandramouli, Jiasheng Hu*, Yiming Zheng* International Conference on Very Large Data Bases, VLDB 2024
Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz, Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang, Qizhen Zhang, Tianyi Zhou, Danyang Zhuo CoRR, arXiv:2401.12230, 2024
Cowbird: Freeing CPUs to Compute by Offloading the Disaggregation of Memory Xinyi Chen, Liangcheng Yu, Vincent Liu, Qizhen Zhang Annual Conference of the ACM Special Interest Group on Data Communication, SIGCOMM 2023
Disaggregated Database Systems Jianguo Wang, Qizhen Zhang ACM International Conference on Management of Data, SIGMOD 2023 Tutorial
FlexChain: An Elastic Disaggregated Blockchain Chenyuan Wu, Mohammad Javad Amiri, Jared Asch, Heena Nagda, Qizhen Zhang, Boon Thau Loo International Conference on Very Large Data Bases, VLDB 2023
Templating Shuffles Qizhen Zhang, Jiacheng Wu, Ang Chen, Vincent Liu, Boon Thau Loo Conference on Innovative Data Systems Research, CIDR 2023
Optimizing Data-intensive Systems in Disaggregated Data Centers with TELEPORT Qizhen Zhang, Xinyi Chen, Sidharth Sankhe, Zhilei Zheng, Ke Zhong, Sebastian Angel, Ang Chen, Vincent Liu, Boon Thau Loo ACM International Conference on Management of Data, SIGMOD 2022
Redy: Remote Dynamic Memory Cache Qizhen Zhang, Philip Bernstein, Daniel Berger, Badrish Chandramouli International Conference on Very Large Data Bases, VLDB 2022
CompuCache: Remote Computable Caching using Spot VMs Qizhen Zhang, Philip Bernstein, Daniel Berger, Badrish Chandramouli, Vincent Liu, Boon Thau Loo Conference on Innovative Data Systems Research, CIDR 2022
MimicNet: Fast Performance Estimates for Data Center Networks with Machine Learning Qizhen Zhang, Kelvin K.W. Ng, Charles W. Kazer, Shen Yan, João Sedoc, Vincent Liu Annual Conference of the ACM Special Interest Group on Data Communication, SIGCOMM 2021
Understanding the Effect of Data Center Resource Disaggregation on Production DBMSs Qizhen Zhang, Yifan Cai, Xinyi Chen, Sebastian Angel, Ang Chen, Vincent Liu, Boon Thau Loo International Conference on Very Large Data Bases, VLDB 2020
Rethinking Data Management Systems for Disaggregated Data Centers Qizhen Zhang, Yifan Cai, Sebastian Angel, Ang Chen, Vincent Liu, Boon Thau Loo Conference on Innovative Data Systems Research, CIDR 2020
Optimizing Declarative Graph Queries at Large Scale Qizhen Zhang, Akash Acharya, Hongzhi Chen, Simran Arora, Ang Chen, Vincent Liu, Boon Thau Loo ACM International Conference on Management of Data, SIGMOD 2019
Predicting Startup Crowdfunding Success through Longitudinal Social Engagement Analysis Qizhen Zhang, Tengyuan Ye, Meryem Essaidi, Shivani Agarwal, Vincent Liu, Boon Thau Loo ACM International Conference on Information and Knowledge Management, CIKM 2017
Architectural Implications on the Performance and Cost of Graph Analytics Systems Qizhen Zhang, Hongzhi Chen, Da Yan, James Cheng, Boon Thau Loo, Purushotham Bangalore ACM Symposium on Cloud Computing, SoCC 2017
Quegel: A General-Purpose System for Querying Big Graphs Qizhen Zhang, Da Yan, James Cheng Proceedings of the 2016 International Conference on Management of Data, SIGMOD 2016 Demo
A General-Purpose Query-Centric Framework for Querying Big Graphs Da Yan, James Cheng, M. Tamer Özsu, Fan Yang, Yi Lu, John C. S. Lui, Qizhen Zhang, Wilfred Ng International Conference on Very Large Data Bases, VLDB 2016
A Shapley Value Approach for Cost Allocation in the Cloud Qizhen Zhang, Haoran Wang, Yang Chen, Tao Qin, Ying Yan, Thomas Moscibroda ACM Symposium on Cloud Computing, SoCC 2015 Poster
CSCC43: Introduction to Databases, University of Toronto, Winter 2025 / Summer 2024
CSC2531: Cloud-native Data Management Systems, University of Toronto, Winter 2025
CSC2229: Topics in Computer Networks: Cloud Computing, University of Toronto, Winter 2024
Program Committee: SIGMOD 2025, VLDB 2025, ICDE 2025, SIGMOD 2024, EuroSys 2024, EDBT 2024, SoCC 2023, CIKM 2023, ICDE 2023, SoCC 2022, CIKM 2022
Session Chair: VLDB 2023, Northwest Database Society (NWDS) Annual Meeting 2023, SoCC 2022, SIGMOD 2022
Journal Reviewer: IEEE/ACM Transactions on Networking 2023, IEEE Transactions on Knowledge and Data Engineering 2023, IEEE/ACM Transactions on Networking 2022, IEEE Transactions on Parallel and Distributed Systems 2022, IEEE Transactions on Knowledge and Data Engineering 2022
Organizer: FDL Reading Group, University of Toronto, Fall 2024; DSL Seminar, University of Pennsylvania, Spring 2017 and Fall 2017
Offloading the Tax of Disaggregation Invited talk, National University of Singapore, Singapore, March 2024
Memory-disaggregated Database Systems SIGMOD 2023, Seattle, Washington, United States, June 2023
Templating Shuffles Northwest Database Society (NWDS) Annual Meeting 2023, Redmond, Washington, United States, May 2023
Redy: Remote Dynamic Memory Cache VLDB 2022, Virtual, September 2022
Optimizing Data-intensive Systems in Disaggregated Data Centers with TELEPORT SIGMOD 2022, Philadelphia, Pennsylvania, United States, June 2022
Hyperscale Data Processing with Network-centric Designs Job talk, February - April 2022: CMU, HKUST, NUS, Ohio State U., Simon Fraser U., UC Irvine, UIUC, U. Minnesota Twin Cities, U. Toronto, U. Virginia, U. Waterloo
CompuCache: Remote Computable Caching using Spot VMs CIDR 2022, Virtual, January 2022
MimicNet: Fast Performance Estimates for Data Center Networks with Machine Learning SIGCOMM 2021, Virtual, August 2021
Understanding the Effect of Data Center Resource Disaggregation on Production DBMSs VLDB 2020, Virtual, September 2020; Microsoft Research, Virtual, August 2020
Rethinking Data Management Systems for Disaggregated Data Centers CIDR 2020, Amsterdam, Netherlands, January 2020
Optimizing Declarative Graph Queries at Large Scale Microsoft Research, Redmond, Washington, United States, August 2019; SIGMOD 2019, Amsterdam, Netherlands, July 2019
Predicting Startup Crowdfunding Success through Longitudinal Social Engagement Analysis CIKM 2017, Singapore, November 2017
Architectural Implications on the Performance and Cost of Graph Analytics Systems SoCC 2017, Santa Clara, California, United States, September 2017
Microsoft Research, Redmond, August 2022 - August 2023 Cloud data management
Microsoft Research, Redmond, Summer 2021 CompuCache: fast and cheap compute offloading for remote memory caching with spot VMs
Microsoft Research, Redmond, Summer 2020 Redy: high-performance RDMA-accessible caching with remote dynamic memory
Microsoft Research, Redmond, Summer 2019 Large-scale SQL query optimization with a focus on search completeness and efficiency
NEC Labs, America, Summer 2018 Software behavior analysis based on provenance graphs for anomaly detection
Microsoft Research Asia, September 2014 - June 2015 Resource cost allocation for multi-tenant clouds with game theory
Last edit in October 2024