osdi 2021 accepted papersosdi 2021 accepted papers

Submitted papers must be no longer than 12 single-spaced 8.5 x 11 pages, including figures and tables, plus as many pages as needed for references, using 10-point type on 12-point (single-spaced) leading, two-column format, Times Roman or a similar font, within a text block 7 wide x 9 deep. Table of Contents | The copyback-aware block allocation considers different copy costs at different copy paths within the SSD. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. Most existing schedulers expect users to specify the number of resources for each job, often leading to inefficient resource use. USENIX discourages program co-chairs from submitting papers to the conferences they organize, although they are allowed to do so. First, Fluffy mutates and executes multi-transaction test cases to find consensus bugs which cannot be found using existing fuzzers for Ethereum. All papers will be available online to registered attendees before the conference. Zeph executes privacy-adhering data transformations in real-time and scales to thousands of data sources, allowing it to support large-scale low-latency data stream analytics. We built an FPGA prototype of the nanoPU fast path by modifying an open-source RISC-V CPU, and evaluated its performance using cycle-accurate simulations on AWS FPGAs. For conference information, see: . Grand Rapids, Michigan, United States . The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. However, a plethora of recent data breaches show that even widely trusted service providers can be compromised. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. Her specialties include network routing protocols and network security. HotCRP.com signin Sign in using your HotCRP.com account. This yielded 6% fewer TLB miss stalls, and 26% reduction in memory wasted due to fragmentation. See the USENIX Conference Submissions Policy for details. This post is for recording some notes from a few OSDI'21 papers that I got fun. USENIX, like other scientific and technical conferences and journals, prohibits these practices and may, on the recommendation of a program chair, take action against authors who have committed them. In contrast, CLP achieves significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable or even better than Elasticsearch and Splunk Enterprise. For instance, the following are not sufficient grounds to specify a conflict with a PC member: they have reviewed the work before, they are employed by your competitor, they are your personal friend, they were your post-doc advisor or advisee, or they had the same advisor as you. We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. J.P. Morgan AI Research partners with applied data analytics teams across the firm as well as with leading academic institutions globally. Proceedings Cover | USENIX new Date().getFullYear()>document.write(new Date().getFullYear()); Grants for Black Computer Science Students Application, Propose an interesting, compelling solution, Demonstrate the practicality and benefits of the solution, Clearly describe the paper's contributions, Clearly articulate the advances beyond previous work. Abstract registrations that do not provide sufficient information to understand the topic and contribution (e.g., empty abstracts, placeholder abstracts, or trivial abstracts) will be rejected, thereby precluding paper submission. Web pages today commonly include large amounts of JavaScript code in order to offer users a dynamic experience. Mothy received a PhD in 1995 from the Computer Laboratory of the University of Cambridge, where he was a principal designer and builder of the Nemesis OS. In addition, CLP outperforms Elasticsearch and Splunk Enterprise's log ingestion performance by over 13x, and we show CLP scales to petabytes of logs. VLDB 2021: Venue Tivoli Hotel & Congress Center Arni Magnussons Gade 2 1577 Copenhagen, Denmark +45 3268 4300 In-person attendees can purchase tickets for the park / gardens with a 15% discount, which is a special offer by Tivoli Hotel & Congress Center to VLDB 2021 attendees. We present selective profiling, a technique that locates data locality problems with low-enough overhead that is suitable for production use. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 1416, 2021. Dorylus is up to 3.8 faster and 10.7 cheaper compared to existing sampling-based systems. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. The 20th ACM Workshop on Hot Topics in Networks (HotNets 2021) will bring together researchers in computer networks and systems to engage in a lively debate on the theory and practice of computer networking. This motivates the need for a new approach to data privacy that can provide strong assurance and control to users. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. The experimental results show that Penglai can support 1,000s enclave instances running concurrently and scale up to 512GB secure memory with both encryption and integrity protection. One important reason for the high cost is, as we observe in this paper, that many sanitizer checks are redundant the same safety property is repeatedly checked leading to unnecessarily wasted computing resources. SanRazor adopts a novel hybrid approach it captures both dynamic code coverage and static data dependencies of checks, and uses the extracted information to perform a redundant check analysis. In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. Existing frameworks optimize tensor programs by applying fully equivalent transformations, which maintain equivalence on every element of output tensors. Concurrency control algorithms are key determinants of the performance of in-memory databases. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. Unfortunately, because devices lack the semantic information about which I/O requests are latency-sensitive, these heuristics can sometimes lead to disastrous results. PLDI is a premier forum for programming language research, broadly construed, including design, implementation, theory, applications, and performance. OSDI'21 accepted 31 papers and 26 papers participated in the AE, a significant increase in the participate ratio: 84%, compared to OSDI'20 (70%) and SOSP'19 (61%). KEVIN combines a fast, lightweight, and POSIX compliant file system with a key-value storage device that performs in-storage indexing. We observe that scalability challenges in training GNNs are fundamentally different from that in training classical deep neural networks and distributed graph processing; and that commonly used techniques, such as intelligent partitioning of the graph do not yield desired results. OSDI is "a premier forum for discussing the design, implementation, and implications of systems software." A total of six research papers from the department were accepted to the . Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive class of datacenter applications: those that utilize many small Remote Procedure Calls (RPCs) with very short (s-scale) processing times. For more details on the submission process, and for templates to use with LaTeX, Word, etc., authors should consult the detailed submission requirements. Marius is open-sourced at www.marius-project.org. Jason Mohoney and Roger Waleffe, University of WisconsinMadison; Henry Xu, University of Maryland, College Park; Theodoros Rekatsinas and Shivaram Venkataraman, University of WisconsinMadison. If you have any questions about conflicts, please contact the program co-chairs. We develop MAGE, an execution engine for SC that efficiently runs SC computations that do not fit in memory. Just using Lambdas on top of CPU servers offers up to 2.75 more performance-per-dollar than training only with CPU servers. 23 artifacts received the Artifacts Functional badge (88%). Our evaluation on the SPEC benchmarks shows that SanRazor can reduce the overhead of sanitizers significantly, from 73.8% to 28.062.0% for AddressSanitizer, and from 160.1% to 36.6124.4% for UndefinedBehaviorSanitizer (depending on the applied reduction scheme). Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. Second, Fluffy uses multiple existing Ethereum clients that independently implement the specification as cross-referencing oracles. Registering abstracts a week before paper submission is an essential part of the paper-reviewing process, as PC members use this time to identify which papers they are qualified to review. Timothy Roscoe is a Full Professor in the Systems Group of the Computer Science Department at ETH Zurich, where he works on operating systems, networks, and distributed systems, and is currently head of department. For conference information, . Paper abstracts and proceedings front matter are available to everyone now. Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lcuyer, Microsoft Research. The device then "calibrates" its interrupts to completions of latency-sensitive requests. DeSearch uses trusted hardware to build a network of workers that execute a pipeline of small search engine tasks (crawl, index, aggregate, rank, query). Please identify yourself as a presenter and include your mailing address in your email. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. An evaluation of Addra on a cluster of 80 machines on AWS demonstrates that it can serve 32K users with a 99-th percentile message latency of 726 msa 7 improvement over a prior system for text messaging in the same threat model. In this paper, we present Vegito, a distributed in-memory HTAP system that embraces freshness and performance with the following three techniques: (1) a lightweight gossip-style scheme to apply logs on backups consistently; (2) a block-based design for multi-version columnar backups; (3) a two-phase concurrent updating mechanism for the tree-based index of backups. Compared to a state-of-the-art fuzzer, Fluffy improves the fuzzing throughput by 510 and the code coverage by 2.7 with various optimizations: in-process fuzzing, fuzzing harnesses for Ethereum clients, and semantic-aware mutation that reduces erroneous test cases. Amy Tai, VMware Research; Igor Smolyar, Technion Israel Institute of Technology; Michael Wei, VMware Research; Dan Tsafrir, Technion Israel Institute of Technology and VMware Research. Accepted papers will be allowed 14 pages in the proceedings, plus references. DMon speeds up PostgreSQL, one of the most popular database systems, by 6.64% on average (up to 17.48%). Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. Session Chairs: Deniz Altinbken, Google, and Rashmi Vinayak, Carnegie Mellon University, Tanvir Ahmed Khan and Ian Neal, University of Michigan; Gilles Pokam, Intel Corporation; Barzan Mozafari and Baris Kasikci, University of Michigan. GoJournals goal is to bring the advantages of journaling for code to specs and proofs. As the emerging trend of graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings). We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture. Notification of conditional accept/reject for revisions: 3 March 2022. If you are uncertain about how to anonymize your submission, please contact the program co-chairs, osdi21chairs@usenix.org, well in advance of the submission deadline. Ethereum is the second-largest blockchain platform next to Bitcoin. The blockchain community considers this hard fork the greatest challenge since the infamous 2016 DAO hack. Prior or concurrent publication in non-peer-reviewed contexts, like arXiv.org, technical reports, talks, and social media posts, is permitted. By submitting a paper, you agree that at least one of the authors will attend the conference to present it. We implement DeSearch for two existing decentralized services that handle over 80 million records and 240 GBs of data, and show that DeSearch can scale horizontally with the number of workers and can process 128 million search queries per day. She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. In 2023 I started another two-year term on the . We present case studies and end-to-end applications that show how Storm lets developers specify diverse policies while centralizing the trusted code to under 1% of the application, and statically enforces security with modest type annotation overhead, and no run-time cost. In experiments with real DL jobs and with trace-driven simulations, Pollux reduces average job completion times by 37-50% relative to state-of-the-art DL schedulers, even when they are provided with ideal resource and training configurations for every job. If your paper is accepted and you need an invitation letter to apply for a visa to attend the conference, please contact conference@usenix.org as soon as possible. We also propose two file system techniques for ZNS+-aware LFS. Session Chairs: Nadav Amit, VMware Research Group, and Ada Gavrilovska, Georgia Institute of Technology, Stephen Ibanez, Alex Mallery, Serhat Arslan, and Theo Jepsen, Stanford University; Muhammad Shahbaz, Purdue University; Changhoon Kim and Nick McKeown, Stanford University. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. Qing Wang, Youyou Lu, Junru Li, and Jiwu Shu, Tsinghua University. Research Impact Score 9.24. . Jiachen Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Ding Ding, Department of Computer Science, New York University; Huan Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Conrad Christensen, Department of Computer Science, New York University; Zhaoguo Wang and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Jinyang Li, Department of Computer Science, New York University. We present Storm, a web framework that allows developers to build MVC applications with compile-time enforcement of centrally specified data-dependent security policies. The NAL eliminates remote PM accesses to hot items without inducing extra local PM accesses. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. We focus on NVMe storage devices and show that it is natural to express these semantics in the kernel and the application and only requires a modest two-bit change to the device interface. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. USENIX NSDI, 2021 Acceptance Rate: 15.99% Fluid: Resource-Aware Hyperparameter Tuning Engine P. Yu*, J. Liu*, M. Chowdhury (*Equal contribution) MLSys, 2021 Acceptance Rate: 23.53% NetLock: Fast, Centralized Lock Management Using Programmable Switches Z. Yu, Y. Zhang, V. Braverman, M. Chowdhury, X. Jin ACM SIGCOMM, 2020 Acceptance Rate: 21.6% All the times listed below are in Pacific Daylight Time (PDT). With the help of thousands of Lambda threads, Dorylus scales GNN training to billion-edge graphs. This paper describes the design, implementation, and evaluation of Addra, the first system for voice communication that hides metadata over fully untrusted infrastructure and scales to tens of thousands of users. This year, there were only 2 accepted papers from UK institutes. Alas, existing profiling techniques incur high overhead when used to identify data locality problems and cannot be deployed in production, where programs may exhibit previously-unseen performance problems. All submissions will be treated as confidential prior to publication on the USENIX OSDI 21 website; rejected submissions will be permanently treated as confidential. We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. Such centralized engines are in a perfect position to censor content and violate users privacy, undermining some of the key tenets behind decentralization. Our evaluation shows that NrOS scales to 96 cores with performance that nearly always dominates Linux at scale, in some cases by orders of magnitude, while retaining much of the simplicity of a sequential kernel. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. Editor in charge: Daniel Petrolia . Despite their extensive use for debugging and vulnerability discovery, sanitizer checks often induce a high runtime cost. This approach misses possible optimization opportunities as transformations that only preserve equivalence on subsets of the output tensors are excluded. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. The papers will be available online to everyone beginning on the first day of the conference, July 14, 2021. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. Paper Submission Information All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. To achieve low overhead, selective profiling gathers runtime execution information selectively and incrementally. A graph neural network (GNN) enables deep learning on structured graph data. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. Furthermore, to enable automatic runtime optimization, GNNAdvisor incorporates a lightweight analytical model for an effective design parameter search. Tej Chajed, MIT CSAIL; Joseph Tassarotti, Boston College; Mark Theng, MIT CSAIL; Ralf Jung, MPI-SWS; M. Frans Kaashoek and Nickolai Zeldovich, MIT CSAIL. OSDI '21 Technical Sessions All the times listed below are in Pacific Daylight Time (PDT). Of the 26 submitted artifacts: 26 artifacts received the Artifacts Available badge (100%). Submissions may include as many additional pages as needed for references but not for appendices. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, vendors and teachers of operating system technology. Consensus bugs are extremely rare but can be exploited for network split and theft, which cause reliability and security-critical issues in the Ethereum ecosystem. Shaghayegh Mardani, UCLA; Ayush Goel, University of Michigan; Ronny Ko, Harvard University; Harsha V. Madhyastha, University of Michigan; Ravi Netravali, Princeton University. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. Kernel code requires manual memory management and type-unsafe code and must efficiently handle complex, asynchronous events. This is especially true for DPF over Rnyi DP, a highly composable form of DP. Jiang Zhang, University of Southern California; Shuai Wang, HKUST; Manuel Rigger, Pinjia He, and Zhendong Su, ETH Zurich. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences. In addition, increasing CPU core counts further complicate kernel development. The chairs will review paper conflicts to ensure the integrity of the reviewing process, adding or removing conflicts if necessary. Responses should be limited to clarifying the submitted work. We implement a variant of a log-structured merge tree in the storage device that not only indexes file objects, but also supports transactions and manages physical storage space. Fan Lai, Xiangfeng Zhu, Harsha V. Madhyastha, and Mosharaf Chowdhury, University of Michigan. We present TEMERAIRE, a hugepage-aware enhancement of TCMALLOC to reduce CPU overheads in the applications code. Pollux improves scheduling performance in deep learning (DL) clusters by adaptively co-optimizing inter-dependent factors both at the per-job level and at the cluster-wide level. The wire-to-wire RPC response time through the nanoPU is just 69ns, an order of magnitude quicker than the best-of-breed, low latency, commercial NICs. One classical approach is to increase the efficiency of an allocator to minimize the cycles spent in the allocator code. The program co-chairs will use this information at their discretion to preserve the anonymity of the review process without jeopardizing the outcome of the current OSDI submission. Submitted November 12, 2021 Accepted January 20, 2022. Camera-ready submission (all accepted papers): 2 April 2021; Main conference program: 27-28 April 2021; All deadline times are . Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees. We demonstrate that KEVIN reduces the amount of I/O traffic between the host and the device, and remains particularly robust as the system ages and the data become fragmented. Reviews will be available for response on Wednesday, March 3, 2021. . Authors may use this for content that may be of interest to some readers but is peripheral to the main technical contributions of the paper. Authors may submit a response to those reviews until Friday, March 5, 2021. Even the little publishable OS work that is not based on Linux still assumes the same simplistic hardware model (essentially a multiprocessor VAX) that bears little resemblance to modern reality. Currently, for large graphs, CPU servers offer the best performance-per-dollar over GPU servers. Instead of choosing among a small number of known algorithms, our approach searches in a "policy space" of fine-grained actions, resulting in novel algorithms that can outperform existing algorithms by specializing to a given workload. Using selective profiling, we build DMon, a system that can automatically locate data locality problems in production, identify access patterns that hurt locality, and repair such patterns using targeted optimizations. Youngseok Yang, Seoul National University; Taesoo Kim, Georgia Institute of Technology; Byung-Gon Chun, Seoul National University and FriendliAI. (Visa applications can take at least 30 working days to process.)

Why Is My Backup Camera Upside Down Dodge Journey, Guy And Ralna Divorce Cause, Air Force Brigadier General Promotion List 2022, What You Talkin Bout Willis Gif With Sound, Articles O

osdi 2021 accepted papers