Keynote

Comprehending and Empowering Developers by Mining Interaction Data

David Lo, Singapore Management University, Singapore

Abstract

Developer interaction data is a rich trove of information that can be analysed and mined to better comprehend developer activities, which in turn sheds light on their pain points and needs, and to empower developers to perform their tasks more effectively and efficiently. Interaction data can be obtained by tracking developer activities in the IDE and in related tools that developers regularly use to accomplish their day-to-day tasks (e.g., the web browser). Mining and analysis of developer interaction data is a relatively young topic in the area of mining software repositories (MSR). As such, it is an exciting time, with many challenges to address and opportunities to tap.

This talk highlights several of our recent works, which are part of the exciting community-wide effort to tackle the challenges and take advantage of the potential of mining developer interaction data. In particular, this talk, which is divided into three parts, highlights how developer interaction data can be mined to (1) comprehend how developers understand programs, (2) comprehend how developers search the web, and (3) empower developers to create interactive video tutorials. In the first part, I describe our field study of program comprehension in practice, which analyses developer interaction data within and outside the IDE across a total of seven real projects, involving 78 professional developers and amounting to 3,148 working hours. In the second part, I describe our mixed-method study, based on collecting search queries from 60 developers' interaction data and surveying 235 software engineers from more than 21 countries across five continents, to understand what developers frequently search for and which search tasks they often find challenging. In the final part, I describe our work that designs and builds a programming video tutorial authoring system that leverages operating-system-level instrumentation to log workflow history while tutorial authors are creating programming videos, and the corresponding tutorial watching system that enhances the learning experience of video tutorials by providing programming-specific workflow history and timeline-based browsing interactions.

Aside from methodologies and findings, the talk also discusses opportunities in terms of open technical problems and the potential benefits of mining developer interaction data. Hopefully, the talk will inspire attendees to continue innovating in this exciting research topic.

Bio

David Lo (PhD, National University of Singapore, 2008) is an Associate Professor in the School of Information Systems, Singapore Management University. His research interests lie at the intersection of software engineering, data science, and cybersecurity. He has more than 10 years of research experience and more than 200 publications across the three areas. He was named a Lee Foundation and Lee Kong Chian Fellow for Research Excellence by the Singapore Management University in 2009 and 2018, respectively, and has received a number of international research and service awards, including multiple ACM distinguished paper awards for his work on software analytics. He has served as general and program co-chair and as a steering committee member of a number of prestigious international conferences (e.g., IEEE/ACM International Conference on Automated Software Engineering), and as an editorial board member of a number of high-quality journals (e.g., Empirical Software Engineering). For more information, please visit his website.