
Assistant Professor

School of Computing, Queen's University
635 Goodwin Hall, Kingston, Ontario, Canada K7L2N8

yuan dot tian at cs dot queensu dot ca

Google Scholar

Yuan Tian

I'm an Assistant Professor in the School of Computing at Queen's University, Kingston, Ontario.

I am looking for master's and Ph.D. students who are excited about data mining and software engineering.

Before joining Queen's, I was a data scientist at the Living Analytics Research Centre (LARC), Singapore Management University (SMU). I received my Ph.D. degree in Information Systems from SMU in May 2017 and a bachelor's degree in Computer Science from Zhejiang University in 2012. I visited Carnegie Mellon University in 2015 and Inria Paris in 2013.

My research focuses on three domains: Data Mining, Software Engineering, and Social Science. My short-term research goal is to help people gain insights from messy data stored in all kinds of software repositories, and to propose context-aware, data-driven approaches that improve the efficiency and capabilities of various stakeholders in the software development process.


News

  • April-19: Awarded a Discovery Launch Supplement grant ($12,500) from NSERC/CRSNG.
  • April-19: Awarded a Discovery grant ($33,000/year for 5 years) from NSERC/CRSNG.
  • Dec-18: NVIDIA GPU Grant approved.
  • Dec-18: Paper "PatchNet: A Tool for Deep Patch Classification" has been accepted by the International Conference on Software Engineering (ICSE), Demo track.
  • Oct-18: Joined Queen's University, Kingston, Canada as an assistant professor.
  • Aug-18: Our paper "Recommending Who to Follow in the Software Engineering Twitter Space" has been accepted by Transactions on Software Engineering and Methodology (TOSEM).

Selected Research Topics

APIBot: Question Answering Bot for API Documentation

To address the daunting task of finding information about APIs, we constructed a question answering (QA) bot called APIBot. Applying well-established general-purpose QA systems to API documentation poses three key challenges: (1) the API QA process needs to consider domain-specific patterns and software-specific terms, (2) much of the semantics of API documentation is hidden in its implicit structure, and (3) general-purpose QA bots require a large amount of manually created training data that is not necessarily available for API documentation. APIBot addresses these challenges by introducing novel components on top of SiriusQA, a general-purpose QA system. An empirical evaluation of APIBot on 92 API questions showed that APIBot can achieve a Hit@5 score of 0.706 (i.e., the correct answer is among the top five answers returned about 70% of the time). This paper appears in the ASE17 proceedings. Access it here.
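
For readers unfamiliar with the metric, the following is a minimal sketch of how a Hit@k score can be computed. It is an illustration only, not APIBot's evaluation code: the function name `hit_at_k` and the toy questions and answers are hypothetical, and the sketch assumes exactly one correct answer per question.

```python
def hit_at_k(ranked_answers, correct_answers, k=5):
    """Fraction of questions whose correct answer is among the top-k returned answers.

    ranked_answers:  one ranked list of candidate answers per question (best first).
    correct_answers: the single correct answer for each question (an assumption
                     of this sketch; real evaluations may allow several).
    """
    if not correct_answers:
        return 0.0
    hits = sum(
        1
        for ranked, correct in zip(ranked_answers, correct_answers)
        if correct in ranked[:k]
    )
    return hits / len(correct_answers)


# Hypothetical toy data: three questions, each with a ranked answer list.
ranked = [
    ["a1", "a2", "a3", "a4", "a5", "a6"],  # correct answer "a2" is at rank 2
    ["b1", "b2", "b3", "b4", "b5", "b6"],  # correct answer "b6" is at rank 6
    ["c1", "c2", "c3", "c4", "c5", "c6"],  # correct answer "c5" is at rank 5
]
gold = ["a2", "b6", "c5"]

score = hit_at_k(ranked, gold, k=5)  # two of the three questions count as hits
```

Under this definition, a score of about 0.706 on 92 questions would correspond to roughly 65 questions with the correct answer in the top five (65/92 ≈ 0.7065).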


Automated Software Bug Triage

Software systems are often released with bugs due to system complexity and inadequate testing. To help developers effectively address and manage bugs, bug tracking systems such as Bugzilla and JIRA are adopted to manage the life cycle of a bug through bug reports. Most of the information related to bugs is stored in software repositories, e.g., bug tracking systems, version control repositories, and mailing list archives. These repositories contain a wealth of valuable information, which could be mined to automate the bug management process and thus save developers time and effort. In the past, we have shown how historical bug data could help automate the bug triage process, including duplicate bug report detection (refer to our CSMR12 paper), bug report prioritization (refer to our EMSE15 paper), and bug report assignment (refer to our ICPC16 paper).


Mining Social Media for Software Engineering

Unlike traditional media, microblogs tend to emphasize recency and informality of content. Many tweets are more personal and opinionated than traditional news reports. Thus, by analyzing microblogs, one can get real-time information about what people are interested in or how they feel about a particular topic. To support developers in collecting software engineering related content, we built a microblog observatory that aggregates more than 58,000 Twitter feeds, captures software-related tweets, and computes trends across topics and time points (refer to our MSR12 tool paper). We also applied a recent event detection algorithm to find hot topics related to software engineering on Twitter (refer to our ICSME15 paper). To extract software engineering related content from Twitter, we performed a preliminary study in our MSR12 paper to investigate the feasibility of automatically classifying microblogs into two categories: relevant and irrelevant to engineering software systems. Following this work, we proposed a novel approach named NIRMAL (refer to our SANER15 paper), which automatically identifies software-relevant tweets from a collection or stream of tweets based on language modelling. Recently, we proposed a new approach to find and rank URLs harvested from Twitter based on their informativeness and relevance to a domain of interest (refer to our SANER17 paper).
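
To give a rough intuition for relevance scoring based on language modelling, here is a minimal sketch. It is not NIRMAL's implementation; see the SANER15 paper for the actual approach. The sketch assumes a unigram model with add-one smoothing trained on a small software-related corpus, and ranks tweets by perplexity so that lower perplexity is treated as more software-relevant. All function names and sample texts are hypothetical.

```python
import math
from collections import Counter


def train_unigram(corpus):
    """Count word frequencies over a list of software-related texts."""
    counts = Counter(word for text in corpus for word in text.lower().split())
    return counts, sum(counts.values())


def perplexity(text, counts, total):
    """Per-word perplexity of `text` under an add-one-smoothed unigram model."""
    words = text.lower().split()
    if not words:
        return float("inf")
    vocab = len(counts) + 1  # one extra slot shared by all unseen words
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab)) for w in words)
    return math.exp(-log_prob / len(words))


def rank_by_relevance(tweets, counts, total):
    """Order tweets from most to least software-like (lowest perplexity first)."""
    return sorted(tweets, key=lambda t: perplexity(t, counts, total))


# Hypothetical training corpus and tweets, for illustration only.
corpus = [
    "fix null pointer exception in java",
    "python unit test fails on build server",
]
tweets = [
    "lovely sunny weekend picnic",
    "java null pointer exception",
]

counts, total = train_unigram(corpus)
ranked_tweets = rank_by_relevance(tweets, counts, total)
```

In this toy setup the tweet that shares vocabulary with the training corpus receives the lower perplexity and is ranked first. A practical system would use a much larger corpus and a stronger model than unigrams.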


Teaching

  • CS351 Advanced Data Analytics, Winter 2019: website
  • CS235 Data Structure, Winter 2019: website

Services

Organization Committee
  • Workshops Co-chair: 27th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), London, Canada, 2020
  • Social Media Co-chair: 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, 2020
  • Co-chair: Consortium for Software Engineering Research 2019 Spring Meeting, Montréal, Canada, 2019
Program Committee
  • 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), San Diego, California, USA, 2019
  • 19th IEEE International Conference on Software Quality, Reliability, and Security (QRS), Sofia, Bulgaria, 2019
  • 23rd International Conference on Evaluation and Assessment in Software Engineering (EASE), Short Papers Track, Copenhagen, Denmark, 2019
  • 26th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Hangzhou, China, 2019
  • 25th Asia-Pacific Software Engineering Conference (APSEC), Nara, Japan, 2018
  • 25th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), ERA Track, Campobasso, Italy, 2018
  • 22nd International Conference on Evaluation and Assessment in Software Engineering (EASE), Short Papers Track, Christchurch, New Zealand, 2018
  • 10th International Conference on Collaboration Technologies (CollabTech), Setubal, Portugal, 2018
Journal Referee
  • IEEE Transactions on Software Engineering (TSE)
  • ACM Transactions on Software Engineering and Methodology (TOSEM)
  • ACM Transactions on Information Systems (TOIS)
  • Empirical Software Engineering (EMSE)
  • Transactions on Intelligent Systems and Technology (TIST)
  • Transactions on Services Computing (TSC)
  • IEEE Transactions on Reliability (TR)
  • Information and Software Technology (IST)
  • Software Testing, Verification and Reliability (STVR)
  • Journal of Software: Evolution and Process (JSMS)

Awards

  • Best Paper Award, SANER ERA track, 2017.
  • Delegate of SMU to the 2017 Royal Society Commonwealth Science Conference, invited by the National Research Foundation (NRF) of Singapore, 2017.
  • SMU Presidential Doctoral Fellowship, 2015-2016.
  • Best Paper Award nomination, ICSM, 2013.
  • Travel Grant for Female Students, provided by Google and ICSM, 2013.
  • Temasek Foundation Leadership Enrichment and Regional Networking Scholarship, 2012.