\chapter{Design}\label{ch:design}

We now explain the design of Birdvisu in depth. First, we explain some important decisions and present the overall structure of the project, then we look into individual parts of the program.

Birdvisu is implemented in Python, using PySide6, the official bindings for Qt6, for drawing on screen. We decided to use Qt because it provides a lot of pre-made widgets and tools, and since it is widely used, it is easy to find help for it on the Internet. The decision to use Python was not hard either. Not only does Qt have official bindings for it, but we also use the language very often and are thus comfortable writing in it. We do not expect the potential slowness of Python to be an issue, because for handling graphics we are using Qt, which is written in C++. Also, as we have analysed in section~\ref{s:areas}, we expect the topologies to be quite small.

The project consists of three main parts: data collection, annotation and presentation. The data collection part is tasked with finding out the current topology and creating a usable representation of such topologies and their combinations. In the annotation part, we add additional information to the topologies, like the difference from the expectation or graph properties of the topology. Finally, when we have all the needed information, we draw the topology on the screen, highlighting the relevant information.

\section{Recurring and general patterns}

Birdvisu's data structures make heavy use of dictionaries and sets, because we do not handle much data that would need to be processed in any particular order. While this allows us to perform set operations quickly, it requires us to provide hashable keys. We have decided to embrace this requirement and use rather complex frozen dataclasses, which can hold as much of the required data as possible, as long as we can re-create that data. This can be illustrated by our use of vertices in a topology.
There are two objects: a VertexID and the Vertex itself. VertexID is the hashable part and Vertex provides additional information, like incident edges, which are not hashable. The topology then has a dictionary from the VertexIDs to Vertices, providing complete data. However, the VertexID already contains information like which version of IP it belongs to, whether it represents a router, and all the possible IP addresses and identifiers related to the vertex. It is sufficient for Vertex objects to only contain sets of edges and references to the related topology and VertexID. (In the next section, we will see that the type of the vertex is also stored in Vertex, but that is really everything.)

The other thing we decided to reuse was the format of BIRD's topology output. We call the format \uv{ospffile} and extended it by allowing comments (after an octothorpe, i.e. \verb|#|). Empty lines also do not seem to be significant. These are quality-of-life improvements for cases when ospffiles are edited by hand. Apart from storing topologies, we intend to use ospffiles for describing basic styles. Therefore, our implementation in \verb|birdvisu.ospffile| only constructs the tree of strings and does not try to understand it. Our module provides an API similar to that of the \verb|json| or \verb|marshal| modules, even though it cannot represent arbitrary types.

\section{Data collection: providers and parsing}

This part of the project deals with processing topologies. The core object of this part is a TopologyV3\footnote{The \uv{V3} suffix is sometimes impractical to keep, so we will sometimes shorten the class name to just \uv{Topology}. It denotes the same object.}. While the Topologies can be created manually by adding the vertices and edges, we expect that topologies will usually be retrieved from other sources, such as saved ospffiles or running BIRD processes. This is made possible by implementing a TopologyProvider.
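
To make the \uv{tree of strings} idea behind the ospffile format more concrete, the following is a minimal sketch of a \verb|loads|-like function. It is not the actual code of \verb|birdvisu.ospffile|: we assume here that nesting is expressed by leading tab characters, and the representation of the tree as \verb|(directive, children)| pairs is purely illustrative.

```python
def loads(text):
    """Parse ospffile-like text into a list of (directive, children) pairs.

    Assumptions of this sketch: nesting is given by leading tabs,
    everything after '#' is a comment, and empty lines are ignored.
    """
    root = []
    # stack[d] is the list that receives directives found at depth d
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comment and trailing space
        if not line.strip():
            continue  # empty or comment-only line
        directive = line.lstrip("\t")
        depth = len(line) - len(directive)
        if depth > len(stack) - 1:
            raise ValueError(f"unexpected indentation: {raw!r}")
        del stack[depth + 1:]  # leave any deeper blocks
        children = []
        stack[depth].append((directive, children))
        stack.append(children)
    return root


sample = (
    "area 0.0.0.0\n"
    "\n"
    "\trouter 10.0.0.1  # hand-written comment\n"
    "\t\tdistance 0\n"
    "\tnetwork 10.0.0.0/24\n"
)
tree = loads(sample)
```

The parser never interprets the directives themselves, which is what would allow the same format to carry both topologies and style descriptions.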
Representing a topology turns out to be a somewhat complicated problem for the following reasons:
\begin{itemize}
\item The topology edges need to be directed. OSPF allows a shortest path from A to B to differ from the shortest path in the opposite direction.
\item It can have a very general shape, so we cannot rely on common patterns. For example, routers can be connected to other routers using point-to-point or virtual links, not just networks.
\item The objects are shape-shifting. A transit network may become stub or change the designated router and we want to be able to understand the change as best as possible.
\item The topology is not necessarily a simple graph, because multiple links may lead from a single router to the same network. However, we strongly believe that the maximum number of parallel edges is quite low, so most of the theory for simple graphs is still applicable.
\item For completeness, we note here again that the shortest paths from a single vertex form a DAG, even though the OSPF specifications speak of them as trees. (Negative edge costs are, fortunately, not permitted.)
\end{itemize}

Given the above requirements and lessons learned in section~\ref{s:net-unusual}, we need to find a representation of vertices that is both powerful enough to uniquely describe a particular vertex and flexible enough to let us easily detect its metamorphoses. Table~\ref{tab:vertexid} shows which information we can use for each type of object. We see that networks in particular are hard to represent, because the ID of the DR may change and it might be the only distinguishing property in case of a split network.
\bgroup
\def\yes{\checkmark}
\def\chg{$\bullet$}
\begin{table}[h]
\centering
\begin{tabular}{cccccc}\hline
Object & Address & RID & DR ID & IF ID & Notes \\\hline
\verb|router| & -- & \yes & -- & -- &\\
\verb|xrouter| & -- & \yes & -- & -- &\\
\verb|vlink| & -- & \yes & -- & -- & Peer is a \verb|router|\\
\verb|network| & v2:\yes,v3:$*$ & -- & \chg & v3:\chg &\\
\verb|external| & \yes & -- & -- & -- &\\
\verb|xnetwork| & \yes & -- & -- & -- &\\
\verb|stubnet| & \yes & \yes & -- & -- &\\
\end{tabular}
\caption{Information determining each object of a topology. $*$ means it may or may not be known, \chg\ denotes an attribute that may change. Columns in order: whether it has an address assigned, relevant router ID, ID of designated router, interface number of the DR.}
\label{tab:vertexid}
\end{table}
\egroup

We decided to aim for correctness, so whenever any of the attributes of an object change, we consider it to be a separate object. This may create some false positives, but we think that is better than potential false negatives, which could hide some issues. Also, when the infrastructure works correctly, the designated router should only change in case of outage. Therefore, it might actually be useful to notify the user when a network has an unexpected designated router even when it is otherwise healthy. However, we provide a way to find objects by partial information using VertexFinder objects, which allows heuristics to match different objects.

The information mentioned in table~\ref{tab:vertexid} serves as the main part of the VertexID. However, we want the VertexID to identify the same object even after it transforms to another kind of object, so instead of using the object type, we only note whether the object is a router or a network, since this property stays the same even for changed objects.
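
The identity rules above can be sketched as follows. This is only an illustration under stated assumptions, not the actual Birdvisu code: all field names are hypothetical, a single address string stands in for the full set of addresses, and the \verb|find| helper is a simplified stand-in for the VertexFinder objects.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class VertexID:
    """Hashable identity of a vertex (hypothetical field names)."""
    ip_version: int
    is_router: bool                      # router vs. network, not the exact type
    address: Optional[str] = None
    router_id: Optional[str] = None
    dr_id: Optional[str] = None
    discriminator: Optional[str] = None  # opaque, e.g. the DR's interface ID


@dataclass
class Vertex:
    """Mutable, unhashable companion data of a VertexID."""
    id: VertexID
    kind: str                            # e.g. "router", "network", "stubnet"
    incoming: set = field(default_factory=set)
    outgoing: set = field(default_factory=set)


def find(vertices, **known):
    """Return the VertexIDs whose fields equal all the given values."""
    return {
        vid for vid in vertices
        if all(getattr(vid, name) == value for name, value in known.items())
    }


# The same network before and after its designated router changed.
before = VertexID(ip_version=4, is_router=False,
                  address="10.0.0.0/24", dr_id="10.0.0.1")
after = VertexID(ip_version=4, is_router=False,
                 address="10.0.0.0/24", dr_id="10.0.0.2")
topology = {before: Vertex(before, "network"), after: Vertex(after, "network")}
same_address = find(topology, address="10.0.0.0/24")
```

In this sketch, the two identifiers compare as different, so the change of the designated router is visible, while a lookup by address alone still returns both of them.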
The code is also oblivious to the fact that the interface ID is a number and what it means -- we use it as an opaque \uv{discriminator} and do not even bother with parsing it from a string.

The VertexIDs are supposed to be descriptors of objective vertex state, so they do not belong to any particular TopologyV3. Instead, they can be used to track actual Vertices across multiple Topologies.

Apart from VertexIDs, the TopologyV3 also consists of the additional data in Vertex objects and Edges. The Vertex objects, as noted above, contain only a set of incoming and outgoing edges, references to their TopologyV3 and VertexID objects, and the actual type of the object the vertex represents (i.e. the first column of the table).

An Edge knows the source and target VertexID, its cost and the number of parallel edges with these properties. If the Edge was determined by a virtual link, it is marked as virtual. This is needed because both Vertices are regular routers, so the information about the virtual link cannot be stored in them. Note that an Edge does not need to belong to any Topology, since it only contains factual data. Whether an Edge is in the topology is recorded only in the incident Vertices.

A Topology can be marked as \uv{frozen}. This denotes an intent that it really should not be modified, because other code might rely on the particular shape of the Topology. However, making the Topology truly immutable would be impractical in Python, so we opted for this approach. In case our solution turns out to be prone to accidental modification of the Topology, we will deploy additional countermeasures against that.

Frozen Topologies also allow us to stack them, creating a union of the original Topologies. This way, a single Topology can be used in the visualisation, while keeping the original information. This mechanism is fully generic, but was mainly invented to allow merging the reference (expected) topology with the actual one (i.e.
the current state of the system). The ancestors are stored by a string label in a dictionary of the Topology. Subclassing TopologyV3 into a StackedTopology would probably be a cleaner design, but we did not employ this approach, since the only difference is the state of one dictionary.

The TopologyProviders are not very interesting, but are important nevertheless. There are a few caveats with parsing topologies from the ospffile format. First, the edges from routers to networks can only be resolved after the networks are known, since the network's level-2 block contains information not present in the level-3 directive for the router (namely, the designated router for OSPFv2 networks and the set of addresses for OSPFv3).

Since BIRD may be running more than one instance of OSPF, the BirdSocketTopologyProvider contains an ad-hoc parser of the response to the \texttt{show protocols} command, which seems to be a reliable way to list running instances. Moreover, BIRD does not seem to expose any way to determine the version of OSPF. So far, we think it is sufficient to guess from the \texttt{network} directives, since they seem to contain a hyphen if and only if the dump is from an OSPFv3 instance. (The source code of BIRD suggests that under some circumstances, brackets can appear even in an OSPFv2 dump, so brackets cannot be used as the distinguishing feature.)

\section{Annotations}

\XXX{scoping, annotator creation, advantages of storing data in annotations and not vertices. Annotator protocol and possibility of export. Various uses of annotators: enhancing, analysis, visualisation. heapq}

\section{Visualisation}

\XXX{Layouting (nonexistent), why not graphviz, why not consensual metrics, how we are re-using annotations internally. Saving layouts}