June 24, 2024
We have released a benchmark for algorithm synthesis, ASAC, which consists of problems from Chinese national competitive programming contests. Compared with existing benchmarks, our benchmark is not only more difficult, but also consists of logic formalizations of the problems so that the logic-based synthesis and verification is possible. The paper will appear at the demo track of FSE'24 in the next month.
June 14, 2024Proving Functional Program Equivalence via Directed Lemma Synthesis was accepted at FM'24. This study originates from the fact that the program synthesized by our algorithm synthesziers are too complex, and cannot be automatically verified by existing verifiers. This paper identifies that the key to the proof is find useful lemmas, and these lemmas can be then synthesized by program synthesizers. Therefore, our algorithm synthesizer can be used to verify its synthesized programs. The resulted system is 21 times faster than CVC4-Ind, the SOTA verifier.
April 1, 2024Superfusion: Eliminating Intermediate Data Structures via Inductive Synthesis was accepted at PLDI'24. In this paper, we generalize the previous AutoLifter approach into Superfusion that optimizes programs by eliminating intermediate data structures, automating the fusion procedure in functional programming. The implemented system, SuFu, not only achieves comparable performance to AutoLifter on D&C-like algorithm synthesis problems, but also significantly outperforms existing approaches on fusion optimization and structural recursive program synthesis problems.
February 13, 2024
Our ET tool won the first place in the most participated Java Functional Erros track in the first international competition for Automated Program Repair. ET consists of our two existing approaches: the patch validator ExpressAPR and the patch generator Tare. ET successfully repaired 6 bugs with no incorrect patch, outperforming the baseline of calling ChatGPT (LLMR) and all other participated tools.
January 26, 2024Accelerating Patch Validation for Program Repair with Interception-Based Execution Scheduling was accepted at IEEE Transactions on Software Engineering. This paper explains the techniques we used behind our tool ExpressAPR for fast patch validation, which is 137.1x faster than plain validation and 8.8X faster than UniAPR, the previous SOTA tool. We hope this tool could facilitate future APR research and tool development.
January 25, 2024
Cooperating with DeepSeek (a subsidiary of High-Flyer AI), my student Qihao Zhu has leaded the development of the currently best open source base LLM for code, DeepSeek-Coder. Some of the ideas in our existing research, such as presenting the related information together to help model learn language rules, have helped the development of this model. We have published a technical report about the training details. Qihao is on job market (expected to graudate at this June) and we look forward to your offer.
January 1, 2024Decomposition-Based Synthesis for Applying D&C-Like Algorithmic Paradigms was accepted at TOPLAS. This is the second published paper of our currently focused project on algorithm synthesis, following our previous paper on synthesizing dynamic programming, yet actually finished earlier. This paper identifies that multiple algorithm classes, such as D&C, incremental computation, segment trees, etc, have a similar structure, and proposes a new algorithm to synthesize algorithms in this class. The number of problems solved by the synthesis algorithm in our experiment is twice that of existing general synthesis algorithm, and outperforms existing semi-automatic approaches for D&C.
July 31, 2023A Probabilistic Delta Debugging Approach for Abstract Syntax Trees was accepted at ISSRE'23. This paper is a follow-up of our ESEC/FSE'21 paper on probabilistic debugging, and builds a probabilitc model for abstract syntactic trees, enabling more efficient delta debugging for context-free languages such as programming languages.
July 1, 2023Synthesizing Efficient Memoization Algorithms was accepted at OOPSLA'23. This is the first published paper of our currently focused project on algorithm synthesis started three years ago. Given a formal specification, algorithm synthesis aims to apply algorithmic paradigms such as D&C and dynamic programming to synthesize efficient programs. This paper focus on synthesizing dynamic programming algorithms. Another paper on synthesizing D&C and similar algorithms has also been drafted and is available on arxiv.
March 30, 2023
I received the First-Class Award on Science from the Chinese Institute of Electronics as the 1st co-winner for the research achievement "data-driven software testing and repair". This is the only first-class award in science from CIE in the domain of software in recent three years.
February 24, 2023Improving Oracle-Guided Inductive Synthesis by Efficient Question Selection was accepted at OOPSLA'23. In this paper, we propose a new approximation model based on random semantics to efficiently select the optimal question to be asked to the oracle in OGIS, improving the performance of any OGIS program synthesis approach in both interactive and non-interactive scenarios. This is a follow-up work of our PLDI'20 work on interactive synthesis paper.
December 9, 2022
Two papers were accepted at ICSE23.
Tare: Type-Aware Neural Program Repair describes a new approach which guides neural networks to learn typing rules, and based on which we implemented a program repair approach significantly outperforming existing program repair approaches. Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects is a follow-up work of our FSE20 paper that not only better detects potential bugs in neural achitectures, but also confirms these potential bugs with tests and suggests fixes.
August 30, 2022Toward Actionable Testing of Deep Learning Models was accepted at Science China Information Sciences. This is a position paper calls for actionable testing approaches for deep learning systems.
April 21, 2022
Two papers accepted at IJCAI 2022. Grape: Grammar Preserving Rule Embedding presents an neural embedding technique for grammar rules, similar to Word2Vec for words, and could be used in many downstream applications such as program generation. Different from Word2Vec, our embedding technique takes the definitions of the grammar rules into consideration, embedding both the structure and the content of the rules. Lyra: A Benchmark for Turducken-Style Code Generation presents a new program generation benchmark that requires to generate a program in two programming languages at one time.
January 7, 2022
The videos of my talk on algorithm synthesis and my tutorial on inductive program synthesis (both in Chinese) are available in the CCF digital library. Please check the invited talks page for the links.
October 11, 2021L2S: a Framework for Synthesizing the Most Probable Program under a Specification was accepted at TOSEM. This paper presents L2S framework, which guides program synthesis with probabilities and captures the core underline ideas of many our other papers. This work started as a generalization and extension of our ACS approach at the end of 2016. In 2018 we wrapped up the initial ideas and experiment results as a workshop paper. Finally, after almost five years' work, after numerous invited talks and keynotes at workshops and conferences about L2S, and after the publications of multiple other papers based on the framework, we finally managed to publish the full L2S framework. The final paper has 44 pages excluding the appendix.
September 30, 2021
I received the CCF-IEEE CS Young Computer Scientists Award. This award is given to at most 5 young computer scientists in China under the age of 40 each year. Thank nominators and the award committee for the nomination and recognition.
August 31, 2021Generalizable Synthesis Through Unification was accepted at OOPSLA'21. In this paper, we borrow the concept of "Occam Learning" from the machine learning community and build the first Occam solver based on the "synthesis through unification" framework whose generalizability is theoretically guaranteed by Occam learning.
August 6, 2021
I received the second ESEC/FSE Distinguished Reviewer Award from ESEC/FSE 2021. Thanks for the recognition.
August 2, 2021Faster Mutation Analysis with Fewer Processes and Smaller Overheads was accepted at ASE'21. This is a follow-up work of our ISSTA'17 paper and further improves the efficiency of mutation analysis, the analysis problem exists in many applications such as mutation testing and patch validation.
July 1, 2021Probabilistic Delta Debugging was awarded ACM SIGSOFT Distinguished Paper Award. Thanks for the recognition.
June 15, 2021Interactive Patch Filtering as Debugging Aid was accepted at ICSME'21. Program repair tools are usually assume to have a high precision to be useful. In this paper we show that, with proper interaction tool support, the program repair tool with a low precision can also be useful. This opens many possibilities for future program repair research, such as increasing the recall without worrying at the risk of lowering precision. This is a follow-up work of our PLDI'20 paper on interactive repair/synthesis.
May 21, 2021Probabilistic Delta Debugging and A Syntax-Guided Edit Decoder for Neural Program Repair were accepted at ESEC/FSE'21. The former proposes the first delta debugging based on probabilistic modeling and significantly outperforms the classic ddmin algorithm. The latter proposes the first neural program repair approach that outperforms traditional non-neural program repair approaches.
December 1, 2020
I received the NASAC Software Innovation Award for Young Researchers and ESEC/FSE 2020 Distinguished Reviewer Award. Thanks for the recognition from the award committees and the nominators!
June 2, 2020
I was recognized as a distinguished reviewer by ICSE 2020. Thank all subreviewers who contributed to the reviews. Thank the ICSE organization committee for the recognition.
March 31, 2020Question Selection for Interactive Program Synthesis was accepted at PLDI'20. Unlike most existing research studies how computers answer questions from pepople, this paper studies how computers ask questions to people for program synthesis. This is the second paper in our new project on interactive program synthesis/repair, following the ISSRE'19 paper.
August 16, 2019
I have received the early career award from NSFC. This is a Chinese version of the NSF career award in US, but is only given to researchers under the age of 38 and is very competitive. Among all researchers who mainly work in software engineering in China, only Dan Hao and He Jiang have received this award before.
March 15, 2018Learning to Synthesize was accepted at GI'18. This paper proposes a framework for synthesizing a program that has a high probability in a given context.
May 10, 2017Faster Mutation Analysis via Equivalence Modulo States was accepted at ISSTA'17. Many program analysis tasks require to execute many similar versions of a program, including but not limited to mutation testing, generate-and-validate program repair, mutation-based fault localization, and product line testing. Our approach accelerates these analyses by sharing the redundant executions.
December 13, 2016Precise Condition Synthesis for Program Repair was accepted at ICSE'17. Following our previous attempt in our ASE'15 paper to increase the precision of defect repair (less incorrect patches), in this paper we try to generate precise patches for a more general category of defects: incorrect conditions.
April 2, 2016 We have discovered a mistake in the data presentation of our ICST paper "empirical evaluation of test coverage for functional programs". Unfortuately, it is already too late to update the camera-ready version, so please download the corrected version here. Compared with the published version, Figure 3 and Table II are updated, while all discussions and conclusions are the same.
December 21, 2015 Recently we started a new project on compiler testing. The ICSE'16 paper presents an empirical comparison of the mainstream compiler testing techniques, where we used a new method to measure test effectiveness to make the comparison possible. The ICST'16 paper presents a low-cost test prioritization approach that has the potential of accelerating compiler testing, even for the non-regressional cases.
December 1, 2015 As a member of a big team on software-defined cloud management, I received the the First-Class Award on Scientific and Technological Progress, Ministry of Education, one of the most pretigeous award from Ministry fo Education, China.
July 19, 2015Fixing Recurring Crash Bugs via Analyzing Q&A Sites was accepted at ASE'15. Within our knowledge, this is the first paper that leverages Internet resources to fix bugs. Our approach achieved high accuracy in producing correct patches, successfully avoiding the over-fitting problem in automatic bug repair.
July 2, 2015Inner Oracles: Input-Specific Assertions on Internal States was accepted at the new idea track of ESEC/FSE'15. In this paper, we propose the novel concept of inner oracle, which asserts the inner state of one test case. We believe inner oracles are useful in many testing scenarios.
July 2, 2014 Recently I started to work on problems in testing and debugging of general programs, and here are two new papers. The ICSME'14 paper is about the automatic association of bug reports to source files, reporting two new heuristics to improve the accuracy of existing techniques. The ASE'14 paper reports the first approach to automatically inferring metamorphic relations from programs.
August 1, 2013 Our project proposal on improving the quality of safety-critical software system has been approved. This five year project is supported by the young scientist fund under the national basic research program, and I am the principal investigator. The young scientist fund under the national basic research program is one of the most competitive fund for young scientists in China. Every year only 1~3 projects are granted in the area of computer science. Our project is the first one in the area of software development.
May 30, 2013 Our paper about priority-based fixing was accepted at SPLC 2013. This work enhances our ICSE'12 work about range fix, reducing the number of options presented to the user by inferring implicit priorities from user interactions.
April 12, 2013 Our empirical study on web API evolution was accepted at ICWS 2013. This is the first step of our new project on web API evolution, which, hopefully, will eventually make it easier to migrate clients for API update.
April 18, 2012 I have joined Peking University faculty as an assistant professor under the "Young talents plan". This is a newly-designed position to match the tenure-track system used in North America.
January 27, 2012Generating Range Fixes for Software Configuration was accepted at ICSE'12. This paper proposes a new type of fix, range fix, to handle the inconsistency fixing problem arised from our survey. An automated algorithm for generating range fix is also given. In the sense of general inconsistency handling, range fixes introduce an interative fixing process to resolving inconsistencies, which is a complement of our existing work on fully automatic fixing.
November 30, 2011 I have finished my postdoc and left University of Waterloo. Now I temporarily work as an indepent researcher while waiting for a new job offer.
September 29, 2011 We have conducted an online survey about what challenges are faced by the users of modern configuration tools. The result was published here as a technical report.
July 25, 2011 I will serve as a PC member of BX 2012. Please consider submitting a papper.
January 5, 2011Synchronizing concurrent model updates based on bidirectional transformation was accepted at Software and Systems Modeling and is available online. This is an extended version of our ICMT'09 paper. This version adds discussion on history ignorance law and connects it with constant complement. I recommend to read even if you have read the conference version.
September 28, 2009 I have graduated from the University of Tokyo. My thesis is available here.
September 17, 2009 We have released a library for helping integrate the Haskell-based synchronizer into Java projects. Please refer to here for more information.
July 1, 2009 A new paper describing the upcoming Beanbag 1.0 language was accepted by ESEC/FSE'09.
April 30, 2009 Atenea group has released reSynch, a UML synchronization tool developed using Beanbag. Visit here for more information.
March 16, 2009 Beanbag 0.2.1 has released. Added an output to the sync command and the resync command to show only effective update on the data. Also fixed a bug with incremental synchronization.
December 12, 2008 We have published a technical report about the Beanbag language.
November 11, 2008 Our ATL-based synchronization tool has been named as SyncATL.
September 30, 2008 We have released a new version of Beanbag. The new version includes a new and easy-to-write Beanbag language, an interactive console and two example programs.
September 5, 2008 Our on-site synchronization project has been named as Beanbag. This name comes from a Japanese traditional game of the same name where the players tries to keep several beanbag consistent within a short time constraint.
October 10, 2007 Our paper is accepted by ASE 2007!