Wednesday, October 21, 2015

VL/HCC Day 3

I will state again, I love VL/HCC :)
If ever given the opportunity to go (if you do any human centric or visual language research), GO! It's valuable in so many ways...even the talks you think you won't be interested in, you end up loving. Like today's keynote...I thought, not another blocks talk...and then it happened. And I was glad I was there.


The keynote today was titled "Taking Stock of Blocks: Promises and Challenges of Blocks Programming Languages", given by Franklyn Turbak of the Wellesley College Computer Science Department. I was going to write a post about it, but Felienne already did a great job, so check it out!

One thing I found interesting is that there is a long history of blocks-based programming. Here is the lineage discussed in the talk:

Blox (Glinert, 1986) - first time puzzle pieces were used to represent code
LogoBlocks (Begel, 1996)
Alice (Pausch et al., 2001) - 3D animations; evolved from Python to drag and drop
PicoBlocks (Bonta, Silverman, et al., 2006) - microprocessor for robotics; passes the "Lucite Test" - imagine constructing it out of physical blocks; also has an extension language
Scratch (Resnick et al., 2007) - best of Alice and PicoBlocks
StarLogo TNG (Roque, Wendel, et al., 2007) - created the OpenBlocks framework for others to build languages
BYOB/Snap! (Harvey et al., 2008) - has first-class functions
App Inventor Classic - clunky
Blockly (Fraser, 2012) - JavaScript-based (embedded in the web browser); mutators = edit blocks with another mini block language
App Inventor 2 (2013) - local variables, improved parameters
PencilCode (Bau, 2013) - toggle between blocks and text in an interesting way
Droplets (Bau, 2014) - same blocks/text toggling idea

Languages with physical blocks (cool!)
Robot Park
Tangible Kindergarten

As the conference comes to an end, I think about how grateful I am for the experiences I've had and continue to have as a PhD student. I met and connected with one of my research/blogging idols (Felienne - we even had drinks together! :D), made some new unexpected connections with other researchers that I should definitely know (Mark Guzdial, Caitlin Kelleher, Ronald Metoyer to name a few), and even made the realization that I've advanced in my field/area, as I know more and more of the people I encounter at these conferences (and they actually like my research!!). It's venues like VL/HCC where I feel like I get the most value as a researcher -- I'm walking away more ready and confident than I came. And that's alright :).

Favorite talks (from the first session -- took the second half to visit family :D):

A Syntax-Directed Keyboard Extension for Writing Source Code on Touchscreen Devices
Islam Almusaly and Ronald Metoyer
(screenshot)

Adapting Higher-order List Operators for Blocks Programming
Soojin Kim and Franklyn Turbak
PHOLOs - "Pseudo-Higher-Order Operators"

Hub Map: A new approach for visualizing traffic data sets with multi-attribute link data
Andrew Simmons, Iman Avazpour, Hai L. Vu, Rajesh Vasa
perfect for venue location (ATL known for traffic)


Interesting papers (missed the talk):
Natural Language Programming: Designing Effective Environments for Novices
Judith Good and Katherine Howland

A Principled Evaluation for a Principled Idea Garden
Will Jernigan, Amber Horvath, Michael Lee, Margaret Burnett, Taylor Cuilty, Sandeep Kuttal, Anicia Peters, Irwin Kwan, Faezeh Bahmani, Andrew Ko

Enabling Independent Learning of Programming Concepts through Programming Completion Puzzles
Kyle J. Harms, Noah Rowlett, Caitlin Kelleher






Tuesday, October 20, 2015

VL/HCC Day 2 - Keynote and other memorable talks

Another great day at VL/HCC :).

To start, there was an amazing keynote given by Georgia Tech's own Mark Guzdial, a CS Education legend, titled "Requirements for a Computing-Literate Society". The focus of the talk was the challenges of working toward a computing-literate society and how we can re-invent CS education to accomplish this goal. I must say, as an African American female from South Carolina, where CS education is almost non-existent in K-12 education and almost completely unrelatable beyond that, I could completely understand and relate to everything he said. Because I loved it SO much, I took some notes to share some of the major points.

WHY SHOULD WE CARE?
    Mark made a few great points as to why we should care about the advancement and spread of CS education and knowledge. The two predominant ones (that for sure stood out) are that 1) CS is the study of process/problem solving, which impacts everyone, and 2) learning and understanding computer science gives people the ability to express themselves in ways they couldn't without programs/automation.

WHAT ARE THE CHALLENGES TO ACCOMPLISHING THIS GOAL?
   One of the obvious and most talked about challenges is access to computer science courses or resources for learning CS, as well as accommodations for diversity -- though there are initiatives, such as CODE2040, aimed at increasing diversity and access to computing resources, there is still a gap leading to lower participation. This is especially true for underrepresented minorities like myself.

Another challenge, which I had never thought about, was what Mark called the "inverse Lake Wobegon effect" -- in other words, we think we know more than we do. This theory suggests that the way things are currently done, we only see the top half. The top half, which I would NOT have included myself in, would be, for example, students with CS courses in high school. Those with access are the most privileged, meaning they are the ones that get noticed by universities, or even at the university level get noticed by professors and other students as being the "real deal", while people like me fall by the wayside. Thank goodness for my mentor who wouldn't let that happen - #thepowerofmentors




The last challenge he discussed was the unanswered questions that policy-makers continue to ask. To segue into this discussion, Mark started with initiatives such as "Georgia Computes!" and CAITE that aim to inspire students to study CS. He also alluded to some of the differences across states that make it difficult to make concrete, uniform improvements -- these differences also relate to unanswered questions that could affect how we improve and facilitate CS education for the masses. For example, states vary on opinions of AP CS courses (apparently to some, access to advanced education is considered 'elitist'), what CS is, and whether to require CS. From this, Mark noted other questions that are unanswered, and extremely relevant, such as whether the CS requirements are really CS. For this he used South Carolina as an example, which hit home for me. I was shocked when I heard that the SC curriculum requires CS for graduation. It was a few years back when I graduated, and although I did take the one and only programming class offered by my high school, I knew nothing of a CS requirement or anything close to it. According to some of his findings, one of the questions that needs to be answered is 'what kind of CS can we teach to everyone?' -- especially considering how important a high school degree is. We don't want people failing to graduate because they can't pass a CS course, but at the same time, typing is not Computer Science >.>.



One of the more interesting findings presented is that although initiatives like "Georgia Computes!" seem to be effective overall, they have different effects on different minorities/populations. But why?? For example, when observing the effects for Black students in CS (per the graph he showed), there is an almost completely flat line, meaning there is no increase in participation. This all showcases the need for more research and exploration into why minorities, such as African Americans, choose to study CS and how we can get them engaged and keep them engaged. Fortunately, Mark had some ideas for that as well :)

WHAT ARE SOME POSSIBLE SOLUTIONS?




   Mark discussed two possible solutions, each of which takes on a different aspect of the CS education problem.

1) The Role of Context - For many, the problem is that CS seems irrelevant to them or their lives. I can say for me, if I hadn't discovered that I could fiddle with my MySpace page and have that be considered CS, I might have thought the same thing! But many are asking, how can computing be considered irrelevant? Easy. The context in which we teach/introduce CS is critical, especially for younger audiences who may have preconceived (incorrect) notions of what CS is. For example, some of the problems we are first confronted with when learning CS are the Tower of Hanoi or the Fibonacci sequence. How many of us can say that what we do now is at all relevant or related to either of these?? I know I sure can't. Increasing interest could be as simple as teaching in a relevant context (e.g., robots and digital media).
An example of this that he spoke about is called Glitch Game Testers, started by Betsy DiSalvo and Amy Bruckman. This program hires African American males as game testers and has them think about computing more deeply than the curriculum requires. Turns out, all of them finished high school and over half continued on to take computing classes post-secondary. This further showcases the importance of making CS relatable and relevant.



2) Understanding CS Teachers' Needs - An important foundation for CS education is of course the educators! Mark suggests that teachers need a sense of identity, which takes the form of confidence in their ability, as well as a sense of community with role models to look up to (which I relate to completely). Disciplinary Commons is a group that was started to bring together CS teachers to talk about classes; it's not big, but it's a step in the right direction. There is also a need for more professional learning on how to teach CS; too often we assume that to teach someone about CS, you have to be damn near a software developer. But is that true? Mark posed this question, and it's another one of those things you don't think about until someone else says it. An example he used is that one of the most successful CS teachers actually focuses less on coding and more on writing assignments. Because the goal is for students to learn and understand what CS is. If it's something they're interested in, the rest will follow.

Although my research area isn't CS Education, I was extremely moved by this talk and I hope the work continues and gets the publicity and backing it needs to really make a difference for states like my home.


As I did yesterday, here are some of the more memorable talks from today (again, in my opinion :D):

Supporting Exploratory Data Analysis with Live Programming
Robert DeLine and Danyel Fisher
Related:
Tempe -- web app for live exploration and analysis of data

Tempe: Live Scripting for Live Data (short paper on technology from full paper above)
Robert DeLine, Danyel Fisher, Badrish Chandramouli, Jonathan Goldstein, Michael Barnett, James Terwilliger, and John Wernsing

Jeeves – A Visual Programming Environment for Mobile Experience Sampling
Daniel Rough, Aaron Quigley
replacing paper "diaries" and expensive or difficult apps with Jeeves (using visual programming)
great/engaging slide set!

Detecting Problematic Lookup Functions in Spreadsheets
Felienne Hermans, Efthimia Aivaloglou, Bas Jansen
discusses usage of and problems with lookup functions in Excel
<3 presentation style (less than 1 minute summary at end)

Interactive Visual Machine Learning in Spreadsheets
Advait Sarkar, Mateja Jamnik, Alan Blackwell and Martin Spott
BrainCel v0.2 - spreadsheets and visualizations to help end users use and understand machine learning

Extending Scratch: New Pathways into Programming 
Sayamindu Dasgupta, Shane Clements, Abdulrahman Y. Idlbi, Chris Willis-Ford and Mitchel Resnick
Scratch Extension System, toolkit for anyone to extend Scratch language and capabilities
Resources:
http://scratchx.org
http://wiki.scratch.mit.edu/wiki/Scratch_Extension_Protocol_(2.0)
Considerations:
maintaining a low barrier to entry, consistency with other blocks/conventions, the right level of abstraction, and choosing the right extension

Strengthening Collaborative Groups Through Art-Mediated Self-Expression
Mengyao Zhao, Yi Wang, David Redmiles
Building interpersonal relationships between local and remote team members via art with Doodled "Us" - collaborative doodle system

Understanding Triggers for Clarification Requests in Community-Based Software Help Forums
Nathaniel Hudson, Parmit K. Chilana, Xiaoyu Guo, Jason Day, and Edmund Liu
What causes people to ask clarifying questions to improve Q&A site experiences --> design interventions to make Q&A sites more efficient



I will try to post as much as I can tomorrow - it's my last day so I really want to visit family that's down here. One of the downsides of the PhD is you really don't get to see family and friends as much as you like so advantage I will take of this :)
Until next time!

Monday, October 19, 2015

VL/HCC 2015 - Atlanta, GA (GC & Day 1)

Whew!

After traveling to Houston for Grace Hopper (post to come soon about the happenings there), and now to ATL for VL/HCC (IEEE Symposium on Visual Languages and Human Centric Computing) I'm pretty exhausted. BUT not too exhausted to share some of the greatness that has gone down since I've gotten here for the conference :).

To start, the GC (Graduate Consortium, for those who don't know) was amazing. I found myself comparing it to last year, and I can honestly say it just keeps getting better. Now, this may be because my research gets more refined, which makes it more suitable for feedback, but I feel like I really got some ideas that are gonna push me in the right direction. On top of the great feedback I got, I met and connected with some AMAZING PhD students at various stages of their careers. Despite all having unique research interests and directions, we were all able to provide insights to improve each other's work (and even potentially reference each other's work).

Now, for the first day of the conference. The theme this year is "Computational Thinking and Computer Science Education"...one thing I love about VL/HCC is that it's a smaller venue, so it's a lot more personal. I walked in for the intro and first session and found a seat next to a friendly looking female...of course I asked if the seat was taken and then proceeded to introduce myself. Come to find out, I was sitting next to none other than Felienne Hermans of Delft University of Technology in the Netherlands! The cool thing about it is I've been virtually stalking her ever since she wrote a blog post about my first conference paper/presentation 2 years ago...and now I've met her face to face. And it feels awesome. The most awesome part is I got to talk to her about my dissertation research, and she loved the idea! We did some brainstorming and she even had some work she's going to pass my way related to it. Yet another boost of confidence for my dissertation research :).

It would have been hard to ruin my day after how it started; fortunately, I didn't have to worry about that. The day was filled with interesting talks related to computational thinking and computer science education. Although all the talks were great, here are some of my favorites from the day:

Tutorons: Generating Context-Relevant, On-Demand Explanations and Demonstrations of Online Code - context relevant explanations of code (browser add-on)
Andrew Head, Codanda Appachu, Marti A. Hearst, Bjorn Hartmann
Resources:
http://www.tutorons.com/


Codepourri: Creating Visual Coding Tutorials Using A Volunteer Crowd Of Learners - crowdsourced visual tutorials for learning to program (using crowd of learners)
Mitchell Gordon and Philip J. Guo

Personality and Intrinsic Motivational Factors in EUP - model that predicts motivation based on personality profiles
Saeed Aghaee, Alan F. Blackwell, David Stillwell, Michal Kosinski
Terms:
1. bricoleurism - like to build things, tinkering with stuff
2. technophilia - love of tech/new tech
3. artistry - enjoy experimenting with creative ideas
Resources:
MyPersonality dataset
Big Five Inventory Personality Test
Related:
Facebook 'likes' to predict personality profiles
Computer better predictor of personality than people

Scientists Tell Stories About Seeking Help with Programming - "war stories" to determine help seeking behaviors of EU scientists [qualitative study]
Brian Frey and Carolyn Seaman

Facilitating Testing and Debugging of Markov Decision Processes with Interactive Visualizations
Sean McGregor, Hailey Buckingham, Thomas G. Dietterich, Rachel Houtman, Claire Montgomery, Ronald Metoyer
Resources:
MDPVis.github.io

A Study of Interactive Code Annotation for Access Control Vulnerabilities
Tyler Thomas, Bill Chu, Heather Lipford, Justin Smith, Emerson Murphy-Hill
Memorable Quote: "Ain't nobody wanna be hacked"

Codechella: program for interactive and collaborative tutoring/building mental models - simulate in-person help in an online platform
Philip J. Guo, Jeffery White, Renan Zanelatto
Resources:
Demos: pgbovine.net/rosetta
Live: pythontutor.com

Semantic Zooming of Code Change History
Youngseok Yoon and Brad A. Myers


That's all for now - I'll try to post again for day 2 and will definitely be posting about GHC soon. Until next time!







Friday, October 9, 2015

JGit Growing Pains

So, I've been using JGit (I believe a previous blog post talks about this) for about a year now...I need someone to explain to me why I'm just now learning how the revert command works in the API!?

Let me break it down...JGit allows you to manipulate repositories from Java. Cool shit right? However, whether from my lack of experience with JGit or my (at the time) noob-ish knowledge of Git, I am just now learning that revert does not do what I would expect. Take the following code for example...

// Find the commit whose hash matches currentHash, then call revert on it.
for (RevCommit rev : revisions) {
    if (ObjectId.toString(rev.getId()).equals(currentHash)) {
        git.revert().include(rev).call();
    }
}

Now, as most developers do, I went to the internet to find out how to revert a repository from Java, and this is what I found. Not much more documentation than what you're looking at right here...As I recalled, the way revert worked was that you could revert your repository to any given revision -- because of this, I made the silly assumption that the code above would revert the repository to the RevCommit I passed in. Except I was wrong. And I didn't realize I was wrong until the revision I was analyzing did not match up with the revision I was diffing in my code from a file add...

Today, one year later, I have been scouring the internet to find out how this sequence of method calls works together and how, specifically, the include(RevCommit) method works. Does include mean this is the revision I want to revert to? Does it mean I want to revert the changes I made at this revision, going back to the previous revision? Do I need to use more than one include call to revert back more than one revision? All of these are questions I have that unfortunately the interwebs is doing a not-so-great job of answering.

For those curious, revert (in this context) actually works by reverting the changes introduced by the RevCommit passed in. So whatever revision gets passed in has its changes undone, taking you back to the revision before...which meant, due to my misunderstanding, I was basically doing everything backwards. I'm almost embarrassed it took me so long to realize this problem; until I remember that if there were some clear documentation out there that told me this information, I could have discovered this issue earlier (or prevented it altogether). For those of you thinking "that couldn't have made that big of a difference"...once I got the revert process correct, I had to restructure the way my code works for some of the new detectors I implemented, as well as to detect any sort of pattern removal from the repository.

So my word of advice? If you're using JGit to analyze source code at each revision in a repository and care about the revision at which source code was introduced, know that the revert call should always come AFTER analysis if you care at all about analysis and revisions matching up (which I do). That makes it sound intuitive...but is it?
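To make that concrete, here's a minimal sketch of the analyze-then-revert loop, assuming commits are walked newest to oldest (analyzeWorkingTree is a hypothetical stand-in for whatever analysis you're actually running):

import java.io.File;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.revwalk.RevCommit;

public class HistoryAnalyzer {
    // revert() undoes the changes *introduced by* the given commit, stepping
    // the working tree back toward its parent -- so analyze FIRST, then revert.
    public void analyzeHistory(Git git, Iterable<RevCommit> newestFirst) throws Exception {
        File workTree = git.getRepository().getWorkTree();
        for (RevCommit rev : newestFirst) {
            analyzeWorkingTree(workTree);     // analyze the state at this revision
            git.revert().include(rev).call(); // then step back past this commit
        }
    }

    private void analyzeWorkingTree(File workTree) {
        // ...whatever analysis you care about goes here (hypothetical hook)...
    }
}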

Hopefully this helps someone else struggling with this particular feature of JGit. Until next time!

Sunday, October 4, 2015

DLF Retreat - Day 3 and Retrospective

Welp, the first annual DLF Retreat has officially come to an end.

For our last and final day, we spent a few hours discussing each of our individual research endeavors in order to provide feedback to one another. A few interesting things came from this, for me anyway. One, I feel like I have a better understanding of what others are doing, as well as what I am doing and how I might conduct my research. Something I've been told, and am really starting to realize is true, is that everyone's path to the PhD is different. Two, I realized how important, if not vital, it is to get other perspectives on the research you do; there are some things that others know or that others see that you might not. And finally, I feel more confident in the work I'm doing and the feasibility of doing big things with a small, simple idea.

After that, we headed back...but first made a much deserved stop at a local donut shop. I got a maple bacon donut and a Turtle Mocha Iced Coffee...yes, I said turtle. It was pretty damn good, though it had nothing on the donut :). The donut shop also doubled as our place of reflection on the weekend. I think we all felt good about what we did and the ideas we came up with. Most of what we did we all thought went well; however, some points for improvement we discussed included having a shorter retreat, including new students in the mix (possibly having smaller, more frequent retreats for the whole lab), and possibly even incorporating our significant others (who are obviously also affected by our decision to pursue a PhD). We actually had some productive conversations on the way back about work/life balance with significant others; this is something I personally always struggle with. My boyfriend is a little older than me and is much more ready to have kids than I am. If I weren't getting my PhD I probably would have started a family by now, but we decided together it would be best to wait (although he definitely makes it known that he wouldn't mind trying now). These types of decisions vary by relationship, but it's definitely an important aspect of the PhD to consider.

Now that I'm back home, part of me is hype about digging back into my research and going back into the lab with a new perspective, new ideas, and plans for moving forward...and part of me wants to sit and stare at a wall and do nothing for the rest of the evening. Maybe I can just put together a plan for next week so I feel productive :)

Either way, I think this weekend was a huge success. I can't wait to see the changes put into place. Now we just have to hope the rest of the lab doesn't kill us for switching shit up :P

B signing off. Until next time!


Saturday, October 3, 2015

DLF Retreat - Day 2

The first full day of the retreat (and last full day), day 2, and I'm thinking it won't be as difficult to convince people we did things after all :). 

The day started with pancakes...the best way to start any day. After breakfast, we got straight into it with some post-breakfast exercises. The first exercise was to think about the strengths and weaknesses of our lab. Now, this sounds easy enough...but the challenge is separating your own strengths and weaknesses from those of the group. For example, one of my strengths is that I'm social, so I make new connections easily. However, the same might not be true for everyone in the lab, therefore that wouldn't necessarily apply as a lab strength. This exercise was especially useful because although we were focusing on improving the lab, improving the lab means improving each of us individually. And since it's a small group (4 of us), it was easier to have ad-hoc discussions about each. It was also nice that those involved in this exercise are senior students in our lab, meaning they've been around for a bit and know the ins and outs of how the lab is currently run.

The next exercise, which followed almost directly from this one, was looking at opportunities for our lab that we may be missing (based on location, resources, our skill sets, etc.) and threats that could prevent our lab from being world-renowned (and of course from graduating students in a timely manner). This exercise was an interesting one; we found that the first go around, it wasn't completely clear to all of us what exactly was expected. One of the weaknesses of our lab that a couple of us mentioned was the ability to speak up when you don't agree or understand something. We sat there for 5+ minutes as most of us wrote little to nothing. It wasn't until our advisor said "okay let's take a step back. there's not much writing going on here" that we actually said "yeah I didn't get the question". Once that was out in the open, and clarification was made, the exercise went much more smoothly and we got some good discussions out of it (and ideas for improvement). General advice for PhD students: Speak up! Especially when you don't agree or understand. Time is precious; no need to waste it being timid (or confused).

The last exercise for the day also followed directly from the others; we were told to think about the weaknesses we discussed earlier and brainstorm on ways to improve. Now, of course when you get a group of researchers together to come up with problems they're going to come up with a laundry list, so we had quite a few weaknesses (or shall I say "areas for improvement") that we came up with. Therefore, we each picked one weakness that we thought would be important to deal with and brainstormed on each. Of course, being the brilliant minds we are, we came up with some interesting ways to move forward in the lab and build both our lab brand and our individual brands. The test now is, will we carry all the things we discussed and decided on this weekend back and propagate them throughout the lab/group...only time will tell! And of course I will try to share some of the big changes we make as we make them.

This is B from the DLF, signing off. It's time to watch some COPS before hitting the sack.
In the mean time, enjoy some of the dope photos we took during our trip :)





Friday, October 2, 2015

The First Annual DLF Retreat - Day 1

I am spending the weekend at Topsail Beach, NC with my advisor and 2 other senior students in our lab. You may be asking yourself 1) why are you at the beach with your colleagues and 2) did you know there's a hurricane coming that way? Our fearless leader decided it would be a good idea for us all to spend a weekend together AWAY from the lab to brainstorm ideas for our research and future publications. I must say, we are off to a pretty great start. I will also add that the trip was planned wayyyy before anyone knew Joaquin was headed this way. :)

Today was the first day of the retreat; the weather was wet and gross, as it has been for the past 2-3 weeks, but we made the best of it. Our first stop on the way in was the Duplin Winery (gotta get the trip started right!). Duplin is a local wine, which I had heard of (and of course drank) but had no clue originated in NC. We got to taste various red and white wines for free -- and of course we had to buy some as well! We even got complimentary homemade crackers, which were actually pretty good. Before leaving the winery, in true researcher spirit, we had a brainstorming session. I shall not divulge the awesome ideas we came up with, but know the DLF is making moves ;).



Once we made it to our place of residence for the weekend, we ate some lunch (which our advisor so thoughtfully packed) and then moved on to our first official activity of the weekend. For this activity, we had to think about our career path (where we see ourselves in 5-10 years) and answer various questions about that path, such as its advantages and disadvantages, measures of success, and how it might evolve. Even if we aren't sure what we want to do for a career, the exercise helped us think about what it is we truly want to do and why. More concretely, it helped us think about how we can better prepare ourselves (and others in our lab) now for the future, whatever it holds. Retreat or not, I recommend this activity or anything similar to researchers/research groups. Even if the discussion is informal, it can be informative, thought-provoking, and lead to more detailed discussions.

Now after all this heavy stuff, of course we had to toss in some leisure; the rain gave us a break so we took full advantage. Despite the lack of sun, it was still nice out on the beach. We even got to see some surfers try to take advantage of some of the waves coming in!  After the beach, we had a lovely gourmet dinner prepared by Dr. E -- corn dogs, tater tots, and salad :P. So delicious!




We probably should have stopped there though, since once we continued exploring we found a post-apocalyptic arcade, complete with broken ceiling pieces on the floor, a closed "surf shop" where you are supposed to be able to rent fun water gear, and  an indoor pool and hot tub that was not only dark but surrounded by a moat of water. Condo = nice. Beach = wonderful. Everything else = meh.
After exploring the beach and the "resort" we're staying at, rather than trying to play arcade games around the broken and water damaged ceiling, we had some quality bonding time while watching COPS on Spike TV. Yes. COPS. Don't you love it? :)

Day 1 has come to an end...I'm pretty excited to see what day 2 brings. I'll try to post tomorrow on our festivities. Until next time! :D

Thursday, October 1, 2015

Tricorder: Building A Program Analysis Ecosystem

In this paper, the authors provide an overview of Tricorder, the program analysis platform currently in use at Google, and present a set of guiding principles that went into creating the platform.

Article link

Abstract

"Static analysis tools help developers find bugs, improve code readability, and ensure consistent style across a project. However, these tools can be difficult to smoothly integrate with each other and into the developer workflow, particularly when scaling to large codebases. We present TRICORDER, a program analysis platform aimed at building a data-driven ecosystem around program analysis. We present a set of guiding principles for our program analysis tools and a scalable architecture for an analysis platform implementing these principles. We include an empirical, in-situ evaluation of the tool as it is used by developers across Google that shows the usefulness and impact of the platform."

Thoughts

The authors present their design decisions and lessons learned from building the program analysis platform Tricorder.

TRICORDER ARCHITECTURE

The Tricorder platform accepts as input the code to be analyzed and outputs inline comments on the code. The platform is realized as a microservice architecture; the authors argue that thinking in terms of services encourages scalability, modularity, and resilience (in case of node failure). The analysis workers are designed to be replicated and stateless in order to make the system robust and scalable. The results appear in the code review as comments (Google terms these robot comments, or robocomments for short).
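To make the stateless-worker idea concrete, here's a toy sketch of what such an analyzer might look like as an interface (all names here are hypothetical illustrations, not Tricorder's actual API):

import java.util.List;

// Each analyzer is a pure function from a code snapshot to a list of review
// comments; workers hold no state between requests, so they can be freely
// replicated behind the service layer.
interface Analyzer {
    List<RobotComment> analyze(CodeSnapshot snapshot);
}

// The code under review, as handed to an analysis worker.
class CodeSnapshot {
    final String path;
    final String contents;
    CodeSnapshot(String path, String contents) { this.path = path; this.contents = contents; }
}

// One inline finding, surfaced in code review as a "robocomment".
class RobotComment {
    final String path;
    final int line;
    final String message;
    RobotComment(String path, int line, String message) {
        this.path = path; this.line = line; this.message = message;
    }
}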

GOOGLE PHILOSOPHY ON PROGRAM ANALYSIS

  1. No false positives
    "No" may be an over statement as admitted by the authors. The aim is to reduce the number of false positives to the minimum. Authors also define a effective false-positives as a measure of number of reports that a user chose not to take any action.
  2. Empower users to contribute 
    The insight is to leverage the knowledge of the masses to build a robust system. Google actively encourages users of the Tricorder platform to write new analyzers. Since the users (Google employees) typically work with a variety of programming languages and custom APIs, the authors seek to leverage the vast domain knowledge of those users to write analyzers. The Tricorder platform enforces quality by reserving the right to remove an analyzer from the system based on its performance (e.g., how many users are ignoring its warnings, whether the warnings are annoying, whether the analyzer takes too many resources...). The authors argue that analyzer writers typically take pride in their work, so the analyzers are generally of high quality. Issues reported against the analyzers are typically resolved fairly quickly.
  3. Make data-driven usability improvements 
    The idea is to avoid arbitrary design and implementation decisions and instead ground them in empirical evidence. Enhancements and corrections to analyzers are made based on user feedback.
  4. Workflow integration is key
    The key idea is that analysis should integrate with the user's workflow rather than the user having to go out of their way to perform it. For instance, a standalone analyzer is less likely to be invoked by developers than an analysis mechanism that integrates with their IDE.
  5. Project customization, not user customization 
    Past experience at Google showed that allowing user-specific customization caused discrepancies within and across teams, and resulted in declining usage of tools. The authors observed that often, when a developer using a tool abandoned a project, team members would check in code containing warnings that tool would have found. However, the Tricorder platform does offer limited project-based customization: for instance, a team may choose to run an optional analysis by default, and analyses that are not applicable can be disabled for that team (a toy sketch of this idea follows below).
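As a toy illustration of project-level (rather than per-user) customization, a per-project configuration might conceptually look something like this (hypothetical names and format, not Tricorder's actual configuration):

import java.util.Set;

// Checked in with the project, so every team member sees the same analyzers:
// no per-user drift within or across the team.
class ProjectAnalysisConfig {
    // optional analyzers this team chooses to run by default (hypothetical names)
    Set<String> optionalAnalyzers = Set.of("AndroidLintAnalyzer");
    // analyzers that don't apply to this project, disabled for everyone
    Set<String> disabledAnalyzers = Set.of("JsCompilerAnalyzer");
}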

Evaluation

One thing to point out about this paper is that the focus is less on the technical aspects of the platform and more on the principles that went into developing Tricorder and how well the tool implements each. This was nice; the platform took years to develop and I'm sure there are various technical pieces. It would have been too much to try to go into technical detail, so kudos on finding a good balance. At a high level, the guiding principles presented in the Tricorder paper make sense; some have been documented in existing literature, while others have been experimented with within Google. For example, false positives and workflow integration have been discussed in the literature as major deterrents for potential static analysis tool users. The authors give sufficient background and include references to the existing works, though some may have expected Google to take all the credit for the ideas presented :).

Although there is an evaluation of the platform, it leaves something to be desired.
For example, they kept track of how often developers using the tool clicked NOT USEFUL or PLEASE FIX to determine the usability of the tool. However, the numbers for these clicks per day are surprisingly low for a company the size of Google. If this tool is available to Google employees, do the low numbers mean people are not using the tool, or are they just not clicking the buttons?
The work would benefit from a more in-depth evaluation, perhaps including some qualitative findings regarding how people use the platform and how often in comparison to other tools that are available or have been used in the past.

Closing Points

The publication that came from this work is overall an informative one that makes contributions to both the research and technical communities. Besides the fact that the paper is well written (which I should note, since unfortunately good writing is harder to come by nowadays), there are a few other reasons why we believe this paper got accepted:

  1. The design guidelines for Tricorder are strong; again, some are not supported by existing literature, but there is support for each design decision made.
  2. There are design decisions at all (guidelines that can be replicated are always good).
  3. The project spans years of work with lots of data.
  4. It's Google!
  5. It's a relevant topic that people care about (and developers can relate to) in research and industry.

Overall, well done Google! :)

_______________________________________________________
Guest writer: Dr. Rahul Pandita, a postdoctoral researcher at NC State University in the Realsearch group advised by Dr. Laurie Williams.