Harmonia Opportunities
Are you an outstanding upper division undergraduate looking for CS 199 units? A graduate students looking for exciting research opportunities? The Harmonia group has a number of interesting projects available to enterprising graduate and undergraduate students. Some of these projects are a matter of straightforward (and not so straightforward) software engineering; others constitute genuine research problems.
Most of these projects require experience with programming language design and implementation, as the techniques in this field are central to understanding the research goals of the Harmonia project. Each project below lists more specific prerequisites and requirements.
If interested, please contact Marat Boshernitsan (maratb@cs.berkeley.edu) and we will be very happy to consider you for one of these projects.
- Eclipse Integration
- Automated Name Resolution Analysis
- Harmonia Visual Studio .NET Port
- Harmonia Java Port
- Structural Preprocessing of C and C++
- Visualization of LR/GLR Parsing for Grammar Debugging
- Pretty Printer Based on General-Purpose Tree Pattern Matcher
Eclipse Integration
Prerequisites: strong knowledge of Java, CS160, CS164 helpful, but not required
To take full advantage of the Harmonia framework as an embeddable program analysis library, it needs to be integrated into a front-end tool such as an editor. We have successfully connected Harmonia to the XEmacs editor by building a Harmonia-mode that provides language-sensitive editing services in XEmacs. We are now creating a more advanced programming environment by embedding the Harmonia framework in Eclipse, an extensible public-domain IDE written in Java.
We have finished the basic integration and are now looking for an advanced undergraduate student to complete the work and polish the interface. Initially, we will be enhancing Eclipse's own editor with the Harmonia language services. This project may extend into building an alternative program editor with advanced layout and editing features similar to the CodeProcessor.
Automated Name Resolution Analysis
Prerequisites: CS164
Compilers are typically divided into several phases: lexing, parsing, semantics, optimization and code generation. Over the past 40 years, lexing, parsing and code generation have been automated, meaning that a language designer can create a domain-specific specification that omits implementation details.
In this project, we are automating name resolution, which is the first part of a compiler's static semantic analysis. This analysis entails matching every use of a variable with its definition, matching function calls to function declarations, and constructing the class inheritance hierarchies, all using an implementation-free name resolution specification. We will be adapting a formalism called an Inheritance Graph to the task of creating an inheritance graph for Cool, one of the languages used in CS164.
We are looking for a student who did well in CS164 (when taught by Prof. Aiken or Prof. Necula) to create the inheritance graph for the Cool language, as well as design and implement a language-independent specification for inheritance graphs for all programming languages.
Harmonia Visual Studio .NET Port
Prerequisites: experience coding in C++ on Windows, good understanding of Windows compiling/linking/debugging (apps and dlls)
Since a significant population of programmers uses Windows, it would be beneficial to port Harmonia to Windows to use in several applications such as XEmacs, Eclipse, and IBM ViaVoice. Harmonia was developed using g++ (2.95 and 3.2) on Solaris and Linux, and we made an initial Windows port using g++ 2.95. However, g++ is a second-class citizen on Windows, and its debugging support and binaries are lacking when compared to Visual Studio .NET.
We are looking for an advanced undergraduate (or just a good C++ coder) to port Harmonia to Visual Studio .NET. We've done about half of the work, just as a proof of concept; now we'd like to finish it. .NET's C++ dialect is slightly different than g++, which makes this project challenging (in a good way).
Harmonia Java Port
Prerequisites: experience coding in C++ and Java
Harmonia is written in C++ and takes advantage of quite a number of C++-specific features, such as multiple inheritance, #includes, and direct access to memory. We want to rewrite Harmonia in Java, converting all C++ idioms to their Java equivalents. There is quite a bit of code in Harmonia (~100 Klines), but with the power of Perl, we envision the hard part of the project to be the design of the conversion, not its enactment.
We are looking for an undergraduate with a good working knowledge of both C++ and Java to do this port, under supervision of a Ph.D. student in our research group.
Structural Preprocessing of C and C++
Prerequisites: CS164
One of the goals of Harmonia is to provide language-based services for a variety of languages. However, C and C++ present challenges to our incremental lexer and parser because of the preprocessor. Preprocessor macros are textual, not lexical or syntactic, which means that small changes in a macro may result in the entire file needing to be reanalyzed.
A structural macro is the one that operates on complete syntactic structures (like LISP macros), rather than sequences of characters or tokens. It turns out that with some intelligence, most non-structural macros can be converted to their structural equivalents. We are looking for ways to (mostly) automate that process. The outcome of this project will be a transforming pre-processing tool whose input will be a C source file with not-necessarily-structural macros and whose output will another C source file with as many macros as possible converted to the new style (i.e. the resulting file will still be susceptible to traditional pre-processing and compilation). Additionally, you will augment the Harmonia C grammar with the new macro syntax so that our parser can process the output of your transformation tool without running the C-preprocessor on it.
We are looking for an undergraduate who has experience hacking C compilers (or a graduate student who has this experience or is looking for it) to explore ways to convert C preprocessor macros into structural macros that mesh better with our incremental analysis algorithms. For a graduate student, this could turn into a very nice Master's project.
Visualization of LR/GLR Parsing for Grammar Debugging
Prerequisites: CS164, CS160 or InfoViz experience preferred, but not required
One of the greatest challenges in developing a Harmonia language module is debugging the lexical and syntactic specification of the programming language. Not only does grammar debugging require a good grasp of the LR and GLR table construction, but also a solid understanding of the parsing algorithm as well as certain quirks particular to Harmonia parsers. Our current solution is to use the debugging trace generated by the parser, which is both cryptic and not very useful. We would like to have a graduate student or an advanced undergraduate build a tool for visualizing the parsing process in the Harmonia framework. While we have come across a number of articles on visualizing LR parsing, GLR parsing presents an entirely new challenge due to a much more complicated algorithm and data structures. Thus, the development of a visualization for the GLR parser is very likely to lead to a publication.
This project may also extend to the visualization of the static semantic and other program analyses.
Pretty Printer Based on General-Purpose Tree Pattern Matcher
Prerequisites: CS172 or CS164, strong C++ skills
When embedded in an editor, the Harmonia framework provides a multitude of language-based services such as instantaneous error detection, syntactic and semantics searches and navigation, and many others. Another useful service is language-aware source code formatting. In contrast to the ad hoc syntax highlighting and pretty-printing algorithms used by traditional program editors, the Harmonia framework can provide better and, with some effort, more visually pleasing results by utilizing rich syntactic and semantic information available to the system.
We have built preliminary prototypes of language-based formatting services. These prototypes are currently used in our harmonia-mode application. A full-scale implementation requires building a general-purpose tree pattern matching engine and adapting the existing formatting tools to utilize its capabilities. We have started coding tree-based pattern matcher and would like to have an advanced undergraduate complete this project.