Research | Timothy J. Aveni

Publications

Additional scholarship

OmniFill: Domain-Agnostic Form Filling Suggestions Using Multi-Faceted Context

Timothy J. Aveni, Armando Fox, Björn Hartmann

arXiv preprint

Passive Haptic Learning for Computer Stenography

Timothy J. Aveni

Undergraduate thesis

Ph.D. work

These days, I'm a Ph.D. student studying human-computer interaction in the Berkeley Institute of Design group, studying under Björn Hartmann and Armando Fox.

Recently, I have been interested in systems, interfaces, and techniques that tackle the “unwritten code” of the world; what are all the tiny scripts, customizations, and analyses that no one ever bothers to implement? How would the world look different if we weren't limited to running just the code that has been hand-crafted by software engineers? What if code were economical to write even if it had only ten users in the world, each running it once a year? What about one user, as a one-off?

Even as a programmer, I have a mountain of code that might make my life easier or more interesting, if only I had the time and energy to write it. For non-programmers, it's not just a question of speed; plenty of tasks would require that the user learn a whole new skillset to instruct the computer capably.

To me, a special subset of users are the improvisers, people who use computers for highly dynamic tasks. How do we empower teachers to adapt their classroom technology to a student's clever question? How can artists define and then immediately use a new creative tool? How does a livestreamer modify their on-screen automations dynamically to engage with trends in their chat? When the task is not fixed, neither should be the available computing tools.

Some highly-customizable software can address particular task domains pretty flexibly, but the most expressive interfaces end up being the programming languages. Naturally, one approach is to teach people to code, something I've spent a lot of my time doing. I'm fond of approaches that give people a better intuition for how to specify their intent to their computers and of high-level languages that enable end-user programming.

But another option is to change the nature of intent specification. If computers can figure out what we want them to do, maybe we don't need to spend so much time specifying our intent through code. I'm interested in end-user-accessible program synthesis tools, especially programming-by-example/demonstration.

Recently, large language models have given us new power in building tools that can leverage flexible specifications of user intent. One form of such flexibility is, of course, that we can speak to LLMs directly in natural language. Other flexibility is possible, though. My OmniFill work explores this, demonstrating that many tasks can be inferred by an LLM-backed system even without task-specific configuration or a natural language prompt, provided the language model can access context relevant to the task specification. Ply, a visual programming system I built, provides a lens into ways the LLM can communicate back to the user about the nature of their programs, using visuals rather than just code or text.

I'm interested especially in tools that are practical to deploy in the real world. That means we need to handle that “last twenty percent” of the way to a useful solution, handling the things that come up out in people's lives, the stuff that makes or breaks the utility of a tool. Often, the messiness of the real world, the presence of exceptions and imperfections in actual computer use, means that the almost-useful research prototypes go unused.

I don't think this is just a problem for the industry people implementing the research to fix; I think there are research questions here about how to span that final gap. My hunch (and my experience!) is that LLMs, with how flexibly they can take input in varying formats and language, may be a big step in making it practical to build systems that are robust to the entropy of practical use cases. You’ll see reflections of this attitude in OmniFill, reagent, and Ply. That's not to say that LLMs solve this problem on their own; we’ll need to pair these LLM-backed systems with new interactions.

2016 – 2019: Undergraduate Research

While I was an undergraduate at Georgia Tech, I did research under Thad Starner, working on developing Passive Haptic Learning (PHL) as a solution for introducing users to computer stenography.

With guidance from Caitlyn Seim, I designed user studies, then built hardware, firmware, and software to power these studies. I ran tests, analyzed the results, iterated (a lot), and finally developed a working system. This work was the subject of my undergraduate thesis as well as a conference publication:

T. J. Aveni, C. Seim and T. Starner, “A preliminary apparatus and teaching structure for passive tactile training of stenography,” 2019 IEEE World Haptics Conference (WHC), Tokyo, Japan, 2019, pp. 383-388.

Below is a video of a presentation I gave on the topic at the Georgia Tech UROP Symposium in Spring 2018. This is from an earlier version of the work, before we had refined and finalized the system.

I also assisted with the development of another paper by developing some of the core Android code used in its user studies:

C. Seim, R. Pontes, S. Kadiveti, Z. Adamjee, A. Cochran, T. Aveni, P. Presti, T. Starner. “Towards Haptic Learning on a Smartwatch,” ISWC ’18. ACM, New York, NY, USA, 2018