Open Graph Drawing Framework
current version:
v.2015.05 (Baobab)

# GSoC 2014 - Project Ideas

This page is for a Google Summer of Code program in the past. Do not apply for projects in this list!

OGDF has been accepted as a mentoring organization for the Google Summer of Code™ (GSoc) 2014! The student application period is from March 10 to March 21.

Our project ideas for OGDF range from projects that shall improve its usability to projects touching topics of current scientific research in graph drawing. For all these projects, participating students should be interested in the visualization of graphs and have some theoretical background in graph algorithms.

This is a collection of project ideas we propose for GSoC 2014. Each project idea links to its detailed description below. If you have further project ideas and or want to discuss some of our proposed project please contact us by email (`gsoc@ogdf.net`), or directly contact the responsible mentor(s).

Please also refer to our GSoC FAQ for frequently asked questions. Please use our application template when applying for a project with OGDF.

• General Data Structures and Algorithms
• Applications and Interfaces

## Detailed Project Descriptions

### Generators for Different Random Graph Models

Brief description: There are many different random graph models that are used throughout theory. (Keywords are: Erdős-Rényi (G(n,p), G(n,m)), geometric, G(n,r), Pairing, Bender-Canfield, Békéssy, Preferential Attachment, Watts-Strogatz, Norros-Reittu, Chung-Lu, hyperbolic.) The task is to rework and extend the set of efficient random graph generators in the OGDF. Every generator should come with a documentation including a brief summary of the properties of the graph model.

Expected result: different random graph generators and documentation about them

Rationale: Every good graph framework should have a standard set of random graph generators with known properties.

Prerequisites: experience with random graph models, C++, Subversion

Skill level: high/medium

Mentor: Stephan Beyer

### Basic Linear Algebra Support

Brief description: Basic data structures for vectors and matrices are used in many graph layout algorithms that apply some kind of numerical optimization. In OGDF, more and more such algorithms have been implemented in the last years, including energy- and constraint-based drawing algorithms. Since OGDF does not provide vector/matrix classes out-of-the-box, such functionality has been implemented as needed for the different algorithms. A much better approach is to provide dedicated OGDF classes for basic linear algebra in OGDF's basic interface.

The goal of this project is to provide powerful and efficient vector and matrix classes, that provide the obvious basic functionality (like addition, subtraction, scalar multiplication, inner product, vector product, matrix transposition) as well as more complex operations (triangularization, inverse, Gaussian elimination, Hermitian) and different storage models (e.g. sparse matrix). The implementations should focus on efficiency (through specializations and smart usage of new C++11 features) and parallelization for larger instances.

Expected result: Classes `Vector` and `Matrix` supported the operations / functions mentioned above; specialized types for fixed size vectors and matrices, as well as sparse representations; parallelized versions of selected operations / functions for larger vectors / matrices.

Rationale: Basic vector and matrix operations are used more and more in OGDF's layout algorithms, like in energy- and constraint-based layout algorithms. Having some general classes for basic linear algebra would simplify the implementation of such algorithms a lot.

Prerequisites: good knowledge in linear algebra, multi-threaded implementations, C++11, Subversion

Skill level: medium

Mentor: Carsten Gutwenger

### Search Trees and Priority Queues

Brief description: Though OGDF provides some common basic data structures like priority queues (implemented by binary heaps) or dictionaries (implemented by hashing), much more flexibility is desired: E.g. it should be possible to choose how a priority queue is implemented (instead of binary heaps possible implementations include Fibonacci heaps, balanced search trees and many more) and implementing data structures (like red-black-trees) should be usable directly, thereby giving algorithms better control with respect to runtime etc.

The goal of this project is clearly separate abstract data types (like priority queues and dictionaries) from implementing data structures, and identifying and providing various implementing data structures (some are already in OGDF and just need to be adapted, others like different kinds of balanced search trees and heaps have to be implemented from scratch). An important design goal is to focus on efficiency, i.e. the additional flexibility shall be achieved with no or very few runtime overhead.

Expected result: Implementation of data types `PriorityQueue` and `Dictionary` which can use different implementing data structures (for efficiency reasons the implementing class should be a template parameter); implementation (or adaption of existing classes) of data structures that can be used as implementing data structures for priority queues and dictionaries. Data structures of interest include (but are not limited to) the following. Dictionaries: red-black trees, (a,b)-trees, AVL-trees, BB[α] trees, hashing, skiplists; priority queues: Fibonacci heaps, implicit/explicit binary heaps, pairing heaps, binomial heaps, hot queues.

Remark: The implementation shall be based on the new C++11 standard (e.g. the classes shall make use of / provide move semantics, initializer lists, or emplace methods). There is a new branch of OGDF based on C++-11, which also includes a `SortedSequence` data type (currently implemented using skiplists). This data type can also make use of the implementing data structures for dictionaries, so a reasonable extension would be to include sorted sequences as well.

Rationale: Many implementations want to use data structures like red-black trees directly, or it should be possible to choose the implementation of a priority queue (e.g. Fibonacci heaps give the best theoretical bounds for Dijkstra, but other implementations are way better in practice).

Literature and Web Resources:

Prerequisites: good knowledge in basic algorithms and data structures, C++11, Subversion

Skill level: medium to high

Mentor: Carsten Gutwenger

### Preprocessing for Steiner Tree Problems

Brief description: The Steiner tree problem is a classical optimization problem on weighted graphs: Given a subset of special vertices T the objective is to find a set of edges connecting T with minimum cost. Thereby, vertices not in T may be used, or not. Although this problem is NP-hard it can be solved efficiently in practice. One main reason is the existence of powerful preprocessing methods for reducing the instance size. The implementation of some of these methods is the goal of this project.

Expected result: Several preprocessing heuristics improving the performance of already existing algorithms for solving the Steiner tree problem (exactly).

Rationale: The current DIMACS Implementation Challenge is dedicated to Steiner Tree Problems. Preprocessing techniques are important for improving the performance of Steiner tree algorithms in practice. This project will help to make the state-of-the-art in Steiner tree algorithms available in OGDF.

Literature:

Prerequisites: good knowledge in graph algorithms, C++, Subversion

Skill level: medium to high

Mentor: Stephan Beyer

### Computation of Treewidth

Brief description: The treewidth measures how similar a given graph is to a tree. The corresponding tree decomposition is a structure that (I'm simplifying here a bit) hides the graphs detail such that it globally looks like a tree. This hiding is done by putting vertices into “bags” according to several rules (a bag contains multiple vertices, a vertex may be copied into multiple bags). The required bag size is the treewidth. The reason for this decomposition is that several NP-complete problems turn out to be efficiently solvable for graphs that require only “small” bags. However, already computing the treewidth of a graph is NP-complete. There exist, however, practically good heuristics which construct a tree decomposition close to the optimum.

Expected result:

• An efficient tree decomposition datastructure, to be used by treewidth-based algorithms. This datastructure should be able to transform itself into a lean tree decomposition, when demanded.
• An efficient heuristic (in fact, several variants thereof) to compute a good tree decomposition.
• A sample application (e.g., vertex cover) using the tree decomposition.

Rationale: There are many applications for tree-width based algorithms, and hence it would be great if OGDF would be able to facilitate these. In the broader scheme, the goal is, e.g., also to have algorithms for optimal tree decompositions.

Literature:

Prerequisites: good knowledge in graph algorithms, C++, Subversion

Skill level: high

Mentor: Markus Chimani

### Node Overlap Removal

Brief description: While some graph layout algorithms naturally avoid node overlap due to the way they compute a layout, most instances of the most popular algorithmic approach, force-directed layout methods, due not take care of this problem automatically. Thus, resulting layouts of these methods may exhibit a large number of overlaps that make it difficult to perceive the graph's structure well and to nicely identify individual nodes. There however exists methods that allow to remove node overlaps in an existing layout without compromising the layout quality much. These can be run as a postprocessing step after a layout method that are prone to creating node overlaps.

Expected result: In this project a selected method out of the existing node overlap removal strategies should be implemented.

Rationale: A node overlap removal can largely improve the quality of graph layouts from force-directed methods.

Literature: Emden Gansner, Yifan Hu: Efficient, Proximity-Preserving Node Overlap Removal, JGAA vol. 14, no. 1, pp. 53-74, 2010

Prerequisites: C++, Subversion, good knowledge in algorithms and data structures

Skill level: medium

Mentor: Carsten Gutwenger, Stephan Beyer

### Improved Packing of Connected Components

Brief description: If a graph is not connected, it is a wise idea to draw each component separately, and then arrange the resulting layouts for the components such that the area of the final drawing is minimized, or a desired aspect ratio is optimized. This problem can be seen as a packing problem of 2D-geometric objects. OGDF provides a simple algorithm for this task, which is used as a subroutine by many drawing algorithm. However, there are much more advanced techniques for solving such packing problems, e.g., Polyomino packing as proposed by Freivalds et al. The goal of this project is to implement advanced packing algorithms for connected components within the OGDF. Further own ideas are welcome. E.g., in some cases, components may also be rotated or nested.

Expected result: An implementation of the Polyomino Packing algorithm and other approaches / new ideas as a `CCLayoutPackModule` in OGDF.

Rationale: This shall improve the used drawing area for many layout algorithms if the graph contains many connected components.

Literature:

Prerequisites: C++, Subversion, good knowledge in algorithms and data structures

Skill level: medium

Mentor: Stephan Beyer

### Orthogonal Grid Layout

Brief description: Though OGDF contains a state-of-the-art general orthogonal layout algorithm (in particular designed for nodes with different sizes), it lacks traditional grid-based methods. Many algorithms and data structures required for these methods are already contained in OGDF, e.g., min-cost flow, orthogonal representations, compaction algorithms. Aim of this project is to implement the classical Giotto algorithm by Tamassia et al. and the alternative quasi-orthogonal approach by Klau and Mutzel. The latter approach allows edges to be routed non-orthogonally around a vertex, so that vertices with degree greater than 4 can be placed on a single grid point, whereas the former expands high-degree vertices so that they span several grid points.

Expected result: New grid layout modules for Giotto and quasi-orthogonal layout.

Rationale: Orthogonal grid layout algorithms are classical planar graph drawing methods. Since our goal is to also provide a wide variety of established graph drawing methods in OGDF, adding these algorithms is desirable.

Literature:

Prerequisites: C++, Subversion, good knowledge in graph algorithms (in particular min-cost flow) and planar graphs

Skill level: medium to high

Mentor: Karsten Klein

### Extending the Framework for Constraint-Based Drawing

Brief description: OGDF contains a powerful framework for the so-called constraint-based drawing, where a layout is not computed by an algorithmic approach that constructs the drawing, but a descriptive approach, where the requested features of the drawing are specified in form of constraints, and then a solver is used to solve the resulting system of constraints. This approach has found considerable attention in research recently, and it very interesting for layout computation in areas with a large number of requirements for the resulting drawing, e.g. network visualization in the life sciences. In the current state, the framework only supports a small number of constraints.

Expected result: The goal of this project is to add support for two further constrains to the set of constraints supported by the constraint-based framework in OGDF.

Rationale: This projects strengthens the constraint-based part of OGDF, making it better applicable for graph drawing in practice.

Prerequisites: C++, Subversion, good understanding of geometric principles and basic maths

Skill level: high

Mentor:

### Python Bindings for OGDF

Brief description: Python bindings shall make some main functionality of OGDF (like graph objects, graph and layout algorithms) available for use in Python programs. One way to achieve this is by using SWIG, which even supports other target languages.

Expected result: A basic but extendable framework, providing some main functionality, including a tutorial on how to use it.

Rationale: Providing easy to use interfaces is crucial for every software library. Though OGDF provides a wealth of powerful data structures and algorithms, accessing them through C++ code is a big problem for many users. Making OGDF's functionality easily usable from a scripting based programming language like Python will not only make OGDF accessible for many more users, but also adds many benefits for experienced programmers.

Prerequisites: Python, C++, Subversion

Skill level: high

Possible Mentors: Markus Chimani, Stephan Beyer

### Extending OGML support for Cytoscape

Brief description: OGML, the Open Graph Markup Language, is an open, XML-based, format to store graphs and their annotations. It has been developed in the OGDF project as a successor of the GML format in order to overcome the drawbacks of this outdated format, including the lack of mechanisms to store data annotations stemming from practical application areas and to specify advanced structural concepts like clusters. While OGML has a flexible extension mechanism, it still provides explicit basic features that allow to directly map properties needed in graph drawing, making it the format of choice when annotated networks need to be stored for visualization purposes. Though the format was developed within the OGDF project, it is an independent format that can freely be used within any other project to store or transfer graph or network data. Cytoscape is an open source platform for complex network analysis and visualization. Cytoscape allows to apply several automatic layout methods and features a number of export formats. There is a simple Cytoscape OGML plugin that allows to import/export basic features of a network, such as the nodes and edges. However, many features, including most of the features that control the visual appearance, as line style, and edge bends, are not handled. This leads to an annoying change in appearance e.g. when exported data is reimported.

Expected result: The goal of this project is to complete the OGML support for Cytoscape for all relevant features by adding the missing features to the import and export capabilities, an XML schema for OGML and a (non final) documentation of the basic structure are available.

Rationale: This projects combines the two open source projects OGDF and Cytoscape. It provides a useful extension to Cytoscape and at the same time facilitates to exchange complex network data together with relevant annotations, allowing easy access to OGDF's sophisticated layout methods.

Prerequisites: Java, XML, knowledge of Cytoscape and file parsing is an advantage

Skill level: easy/medium

Mentor: Karsten Klein