My research lies at the intersection of human-computer interaction (HCI) and software engineering (SE): I apply theories and methods from HCI to problems in SE. My Ph.D. research explores the applicability of Information Foraging Theory when there are multiple variants (or versions) of information.

My recent projects are:

    1. Variations foraging
    2. Information seeking in version control history
    3. Supporting end-user programmers

Variations foraging for exploratory programming

For ill-defined problems, people explore multiple solutions and alternatives. For example, see the picture below: these are different options a programmer explored for a game’s UI.


These are the different options I explored for a variations foraging conference talk (meta!).


As you can see, people doing such creative, exploratory work save different copies of their solutions, or “variants”, so that they can compare them, backtrack, or reuse bits and pieces in the future. Unfortunately, variants are a mess to deal with: for example, I had no idea which slide deck had what in it, and I had to manually open each one to find the deck containing the slide I needed. The problem is that finding information across variants is hard, and there can be too many of them (even a few slide decks were not easy!).

The goal of my research is to understand how programmers work with variants, so that we can build better exploratory programming, or in general, creativity support tools. For this, I use Information Foraging Theory (IFT) to understand what goes on in programmers’ heads while dealing with variants.

To build the theoretical foundations (or, as I call it, the variations foraging theory), my collaborators and I conducted user studies to see how programmers worked with variants, formed hypotheses about what was going on in their heads (that made them do what they did), and then built computational models of what goes on in the programmers’ heads. If a model, based on what is going on in the programmers’ heads, could predict the programmers’ actual code navigations, then we win: our hypotheses are supported. Otherwise, we also win: we have more work to do to refine our hypotheses and models.
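As a rough illustration of what "predicting code navigations" can mean in practice, here is a minimal sketch of one common way to score a predictive model against a logged navigation sequence: a top-k hit rate. This is not the evaluation procedure from the papers; the function names (`top_k_hit_rate`, `most_recent_first`), the toy recency baseline, and the sample navigation log are all assumptions made up for this example.

```python
from typing import Callable, List, Sequence


def top_k_hit_rate(
    navigations: Sequence[str],
    predict: Callable[[Sequence[str]], List[str]],
    k: int = 10,
) -> float:
    """Fraction of navigation steps whose actual next location
    appears among the model's top-k ranked predictions."""
    hits = 0
    total = 0
    for i in range(1, len(navigations)):
        history = navigations[:i]   # what the programmer has visited so far
        actual = navigations[i]     # where the programmer actually went next
        ranked = predict(history)   # model's guesses, best first
        if actual in ranked[:k]:
            hits += 1
        total += 1
    return hits / total if total else 0.0


def most_recent_first(history: Sequence[str]) -> List[str]:
    """Hypothetical baseline model: guess that the programmer returns
    to previously visited locations, most recently visited first."""
    seen: List[str] = []
    for location in reversed(history):
        if location not in seen:
            seen.append(location)
    return seen


if __name__ == "__main__":
    # Made-up navigation log: each entry is a code location the programmer visited.
    log = ["a", "b", "a", "c", "b"]
    print(top_k_hit_rate(log, most_recent_first, k=3))
```

A model built from hypotheses about programmers' foraging behavior would replace `most_recent_first`; the higher its hit rate relative to simple baselines like this one, the more support the underlying hypotheses have.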

The user study result was published at CHI and won a Best Paper Award. Paper (a CHI’16 best paper), Preprint, Talk.

Our first computational model, PFIS-V, was published at CHI’17. Paper, Preprint, Talk, Source code.

The second computational model, PFIS-H, is an active manuscript — I hope to finish it up and send it off to a journal in the coming weeks!

Currently, I am looking into how I can take these theoretical foundations to build a variations support tool.

Information seeking in version control history

Developers often go to their project’s version control history looking for answers to questions like “who wrote this code”, “for what requirements is this code here”, and “why is this code implemented in this weird way”, to name a few. Unfortunately, it is hard to find anything in information sources like these, and I believe we can do much better with their information design.

So, we started off by looking at what developers were really doing with their version control history, so that we could identify pain points and better support those activities. Through interviews and surveys, we conducted the first study to characterize the whats, hows, and whys of developers’ information seeking in version control tools.

The result is what we call the “three-lens model” for software history. The idea is that there are three parts to software history: uncommitted changes, recent commits, and old commits. Developers go to these parts of history with different intents and seek information in very different ways, just as people use different lenses to see different kinds of things. Tool builders need to keep these three lenses in mind while building version control tools. This paper got an ICSME’15 Best Paper Award. Preprint.

Next, we drilled deeper into the three lenses and asked ourselves “why” questions: why do developers really need the information, why do developers go about finding information the way they do, why are the problems developers face really problems, and so on. Theories are very good at answering such “why” questions, so we brought in Information Foraging Theory as an analytical framework.

Among the interesting things we found: in the old history (archaeological lens), developers foraged in the ways traditional IFT predicts. In the recent history (awareness lens), developers foraged very differently: they spent less effort, aiming for a high-level (not in-depth) idea of what changed, so that they could make their lives easier in the future (e.g., avoid merge conflicts).

In the area of uncommitted changes (immediate lens), the devil is in committing changes. In doing so, developers had to consider how to make things easier for future use (e.g., split changes into commits, write good commit messages), but they often failed at it because different situations and people needed different commit sizes, commit messages, etc. In other words, producers can produce only one thing, whereas consumers might need all kinds of things (or the producer produces one thing and the consumer needs something else). This is an open problem to investigate in IFT’s cost-value framework.

This work is mostly in my MS thesis. A journal paper is currently under revision.

Supporting end-user programmers

I am under an NDA and not quite allowed to talk about the details of this study. However, I worked on an extension to Calculation View, a new way of editing and working with formulas in spreadsheets. Original Calculation View paper by Sarkar et al.