Sillito, J.; Murphy, G. C. & De Volder, K. Asking and Answering Questions during a Programming Change Task. IEEE Transactions on Software Engineering, IEEE CS Press, 2008, 34, 434-451

Abstract

Little is known about the specific kinds of questions programmers ask when evolving a code base and how well existing tools support those questions. To better support the activity of programming, answers are needed to three broad research questions: (1) what does a programmer need to know about a code base when evolving a software system? (2) how does a programmer go about finding that information? and (3) how well do existing tools support programmers in answering those questions? We undertook two qualitative studies of programmers performing change tasks to provide answers to these questions. In this paper, we report on an analysis of the data from these two user studies. This paper makes three key contributions. The first contribution is a catalog of 44 types of questions programmers ask during software evolution tasks. The second contribution is a description of the observed behavior around answering those questions. The third contribution is a description of how existing deployed and proposed tools do, and do not, support answering programmers' questions.

Comments

Yann-Gaël Guéhéneuc, 2014/01/08

In this paper, the authors seek to identify the typical questions asked by developers during real change tasks. Knowing such questions could help design tool support that is more effective than today's, and could also help refine models of program comprehension. The paper thus nicely complements previous work on models of program comprehension, programmers' questions, and empirical studies of change. It includes a very interesting list of references but unfortunately does not relate the cited previous work in detail to the 44 questions discussed in the paper. In particular, when it describes other work on programmers' questions, it does not explicitly relate previous works by:

It would have been interesting to relate all these different sets of questions and to identify their intersections, redundancies, and gaps. In particular, there seems to be little interest in the behaviour… Other interesting questions pertain to the possible errors made when mapping (top-down) or grouping (bottom-up) concepts and to the stopping condition. Few authors seem to have studied the errors that developers make when understanding programs (and why they make them). Other open questions include relating the different sets of questions to comprehension theories/models, such as the theory of cognitive support or distributed cognition.

The authors observed two sets of developers: pair programmers in an artificial environment (pairs of students performing change tasks on a system unknown to them) and developers in a real environment (professional developers performing changes on a system that their company develops). They use a grounded-theory analysis to code the collected audio recordings (and other data) and, as categories emerge, to perform further selective sampling and gather more variation on the categories. “The aim here is to build rather than test [a] theory and the specific result of this [analysis] is a theoretical understanding of the situation of interest that is grounded in the data collected.”
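
As a rough illustration of this grounded-theory coding step, the hypothetical Python sketch below tags transcript segments with analyst-assigned codes and groups them into emerging categories; the segments, codes, and category names are invented for illustration and are not taken from the authors' actual coding scheme.

    from collections import defaultdict

    # Hypothetical transcript segments paired with the codes an analyst
    # might assign during open coding; none of these come from the study.
    coded_segments = [
        ("Where is the login check done?", ["finding-initial-focus-point"]),
        ("What calls this validate() method?", ["expanding-focus-point"]),
        ("So the session token flows through here?", ["understanding-subgraph"]),
        ("Which other screens use this widget?", ["expanding-focus-point"]),
    ]

    # Group segments under the emerging categories; in the actual analysis,
    # further selective sampling would add data until the categories stabilise.
    categories = defaultdict(list)
    for segment, codes in coded_segments:
        for code in codes:
            categories[code].append(segment)

    for code, segments in sorted(categories.items()):
        print(f"{code}: {len(segments)} segment(s)")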

Although some explanations are provided, I am confused by the choice of pairs of programmers in the artificial environment. In both environments, the limits of the think-aloud method used are not discussed: in particular, the problems of thinking aloud and of social desirability. These limits could explain the lack of “more involved” questions regarding design choices (not the how/what but the why) and the behaviour of the systems.

The authors consider source code as a graph of entities and categorise the questions as finding focus points, expanding focus points, understanding a subgraph, and asking questions over groups of subgraphs. Surprisingly, they mention neither design patterns as typical cases of “subgraphs” nor Sim's structured transition graphs to explain the “jumps” that developers make between categories. The 44 questions are as follows:
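
(The full list of 44 questions is not reproduced here.) As a minimal, hypothetical sketch of the “source code as a graph of entities” view and of the first three question categories, the code below models entities as nodes, relations as labelled edges, and each category as a simple graph operation; the entity names and relations are invented for illustration and are not the authors' formulation.

    from collections import defaultdict

    class EntityGraph:
        """Source code viewed as a graph: entities as nodes, relations as edges."""

        def __init__(self):
            self.edges = defaultdict(set)  # entity -> set of (relation, entity)

        def add(self, src, relation, dst):
            self.edges[src].add((relation, dst))

        def find_focus_points(self, predicate):
            # "Finding focus points": locate entities matching some criterion.
            return [e for e in list(self.edges) if predicate(e)]

        def expand(self, entity):
            # "Expanding focus points": follow relations out of a known entity.
            return self.edges[entity]

        def subgraph(self, seed):
            # "Understanding a subgraph": collect everything reachable from a seed.
            seen, stack = set(), [seed]
            while stack:
                entity = stack.pop()
                if entity not in seen:
                    seen.add(entity)
                    stack.extend(dst for _, dst in self.edges[entity])
            return seen

    g = EntityGraph()
    g.add("LoginController.handle", "calls", "Session.validate")
    g.add("Session.validate", "reads", "Session.token")

    print(g.find_focus_points(lambda e: "Login" in e))  # finding focus points
    print(g.expand("LoginController.handle"))           # expanding a focus point
    print(g.subgraph("LoginController.handle"))         # understanding a subgraph

The fourth category, questions over groups of subgraphs, would then compare or relate several such reachable sets.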

At the end, the authors report the frequencies of the questions, but among groups of developers, not among sessions: “[i]f a specific question is asked repeatedly in a session, it is counted only once […]”, which undermines the usefulness of the numbers: it would have been much more interesting to know how often a question is asked in general! Also, the authors identify tool support for the questions but remain quite general; an in-depth study would be required. They conclude that “[b]ecause tools typically [provide] only limited support for defining the scope over which to operate, [developers] end up asking questions more globally than they intend and, so, the result sets presented will include many irrelevant items”, which is “[an opportunity] for tools to make use of the larger context to help [developers] more effectively scope their questions and to determine what is relevant to their higher level questions”.
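
To make the counting concern concrete, the hypothetical sketch below contrasts the paper's scheme (a question type counted at most once per session) with counting every occurrence, which is what the comment above argues would be more informative; the session identifiers and question labels are invented and do not correspond to the paper's numbering.

    from collections import Counter

    # Hypothetical observations: (session id, question type asked).
    observations = [
        (1, "Q2"), (1, "Q2"), (1, "Q7"),
        (2, "Q2"), (2, "Q13"), (2, "Q13"), (2, "Q13"),
    ]

    # Paper-style counting: a question type counts at most once per session.
    per_session = Counter(q for _, q in set(observations))

    # Occurrence counting: how often each question type was asked overall.
    per_occurrence = Counter(q for _, q in observations)

    print("sessions in which asked:", dict(per_session))    # Q2 in 2 sessions, Q7 in 1, Q13 in 1
    print("times asked overall:   ", dict(per_occurrence))  # Q2 asked 3 times, Q7 once, Q13 3 times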