\chapter{\X{Analysis?}}

To avoid as many problems as possible when visualising and analysing an OSPF topology, we need to understand several aspects of networking, the OSPF protocol and its implementation in BIRD. However, this thesis does not aim to be a complete guide to either OSPF or BIRD, so we describe only enough detail to be able to reason about the design of the implementation.

\section{OSPF overview}

OSPF is a link-state routing protocol, which means that each router tries to learn the whole topology of the network and finds the best paths using a shortest-path algorithm. Usually, Dijkstra's algorithm \X{ref?} is used. OSPF was designed to provide dynamic routing in a whole autonomous system, but running it on a much smaller scale is also possible.

OSPF requires routers to share information about the state of the topology in messages called \emph{Link State Advertisements} (LSAs for short). The LSAs describe which network segments are adjacent \X{term} to which router on which interfaces, and which routers can be reached on those interfaces. To distinguish between routers, the network administrator \X{term} assigns each router a 32-bit number called the \emph{Router ID}. Router IDs are usually written in dotted-quad notation \X{glos}.

The LSAs are flooded throughout the system so that all routers learn the current topology. To avoid overloading the system with this routing traffic alone, OSPF provides several mechanisms for minimising the number of exchanged messages. First, each network elects a \emph{designated router} (DR), which coordinates the exchange of messages in that network segment. Second, the system can be partitioned into \emph{areas}.
That way, the frequent LSAs are only flooded throughout a limited set of networks; the \emph{area border routers} (ABRs) send accumulated LSAs into other areas.\footnote{These inter-area LSAs are called \uv{Summary LSAs}, but there is no requirement that the routing information is actually aggregated.} These LSAs do not describe the topology of the originating area. The area 0.0.0.0 is called the \emph{backbone area} and all other areas must be adjacent to it, so that it has all the routing information. When this is impractical, OSPF allows two routers to be connected using a \emph{virtual link}. From the routing perspective, this is a point-to-point connection between the routers in the backbone area, which allows LSAs to be forwarded through other areas even when these LSAs would not normally leave those areas.

Several types of networks emerge in OSPF topologies. \emph{Transit} networks are used for forwarding packets within an area. \emph{Stub} networks have only one router and can therefore only deliver packets originating from or destined for that network. To represent routes outside the area and outside the whole system, OSPF recognises \emph{extra-area} and \emph{external} networks, respectively. A router can also be adjacent to an extra-area router through a point-to-point link. Figure~\ref{fig:nettypes} shows this classification visually.

\begin{figure}[h]
\centering
\includegraphics[width=12cm]{../img/types-of-networks.pdf}
\caption{Types of networks as seen from the cyan area.}
\label{fig:nettypes}
\end{figure}

Once a router has complete information about the topology of an area, it constructs a graph representation of the network and calculates the shortest path DAG\footnote{In the specifications, it is called the shortest path \emph{tree}, but there may be multiple shortest paths and the router is supposed to use all of them.} rooted at that router.
This DAG is then used for finding shortest paths to routers and networks in that area, including the external, extra-area and stub networks adjacent to that area. OSPF specifies that the graph has all the networks and routers as vertices. Directed edges lead from each router to each incident network with the configured cost, and from each transit network to its incident routers with cost 0 (except when the two-part metric\cite{rfc8042} is implemented). No edges start at the external, extra-area or stub networks, so the shortest path DAG calculation does not find paths through them. The cost of the edge to an external network can be of two types. A type 1 cost is specified in the same units as the internal costs, whereas a type 2 cost is considered larger than any internal or type 1 cost.

The OSPF family of routing protocols has undergone a long evolution since the first specification in 1989\cite{rfc1131}. Two versions of the protocol are currently in use: versions 2 and 3. While the basic idea remains the same, OSPFv2 can only handle IPv4 systems. Although OSPFv3 claims to be network-protocol-independent, it is usually used only with IPv6 systems and, in fact, features like virtual links can only be used with that network protocol\cite{rfc5838}. Both OSPF versions have numerous extensions, as can be seen from the number of RFCs that update the base specifications\cite{rfc2328,rfc5340}. Therefore, we do not implement the protocol ourselves, but rather look for a suitable routing daemon \X{glos} to determine the current topology.

\section{Routing daemon selection}

Although we were mostly determined to use BIRD\cite{bird} from the start, since we already had some experience with it, we present here a short summary of the other possibilities. Note that the particular choice does not affect interoperability with other routers as long as the chosen routing daemon supports the extensions used in the network system.

There are several implementations of both versions of the OSPF protocol.
However, many of them are tied to specific router hardware, which makes it impractical to connect them to a graphical visualisation. Moreover, even evaluating the feasibility would require us to obtain that hardware. Therefore, we only consider hardware-independent solutions. While we are aware of several software implementations, many of these do not seem to be developed anymore (Quagga\cite{quagga}, XORP\cite{xorp}, OpenOSPFd\cite{openospfd}). Apart from BIRD, we found only FRRouting\cite{frr} to be maintained, meaning that it had a release in the past year. While being maintained is not a strict requirement, it would allow us to keep using that implementation in case OSPF is extended again. However, even BIRD does not implement all the extensions, for example, the two-part metric\cite{rfc8042}.

\section{BIRD interface}

The BIRD daemon is controlled through a UNIX domain socket using a line-based text protocol slightly resembling SMTP\X{cite?}. The client sends commands to the daemon, which replies with responses. A response may be long and possibly formatted as a table. This interface is primarily aimed at human users, so a rather simple client, \texttt{birdc}, is provided in BIRD's package\X{glos?}. While a machine-readable protocol is mentioned in the \texttt{doc/roadmap.md} file in BIRD's source code\cite{bird-src}, it is not implemented, so we need to interface using the socket. This has the following consequences, most of which are not very pleasant:
\begin{itemize}
\item The responses to different commands often have completely different formats. This necessitates creating a dedicated parsing routine for each kind of command we want to use.
\item There is no guarantee that the output will not change between versions. We might need to follow BIRD's development in order to be aware of possible changes.
\item The output does not contain all details of BIRD's state.
For example, we cannot retrieve the shortest path DAG directly from BIRD, nor see the details of the individual LSAs.
\item There is no way to get notified when the topology changes.
\end{itemize}

BIRD provides only a few commands that deal with OSPF:
\begin{itemize}
\item \texttt{show ospf} shows a simple summary of the running OSPF instances, such as which areas they are in or how many LSAs BIRD currently considers.
\item \texttt{show ospf interface} describes the current status of the individual local interfaces: their configuration, the designated routers for the incident networks, etc.
\item \texttt{show ospf neighbors} provides details about the state of communication with adjacent routers.
\item \texttt{show ospf lsadb} returns details about known LSAs. Unfortunately, these are low-level details like checksums and sequence numbers, not information about networks or routers.
\item \texttt{show ospf state} shows an overall view of the OSPF graph representation: present routers and networks, costs of links, distances, \dots
\item \texttt{show ospf topology} seems to provide only a subset of the output of \texttt{show ospf state}. For example, it does not provide information about any non-transit networks.
\end{itemize}

Even though some of the commands accept additional parameters, parsing the output of the \texttt{show ospf state} command is the only viable way of obtaining a topology description. The following subsection describes the syntax of the response to this command.

\subsection{Retrieving the OSPF state}
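
As a minimal sketch of how the command could be issued without \texttt{birdc}, the following Python fragment talks to the control socket directly. It is an illustration only, not part of the implementation described in this thesis. The socket path and all function names are assumptions, as is the reply framing: each line is taken to start with a four-digit code followed by \texttt{-} when more lines follow or by a space on the last line, a line starting with a single space is taken to continue the previous code, and the daemon is assumed to send a greeting when the client connects.

\begin{verbatim}
```python
import re
import socket

# Assumption: the control socket path differs between installations;
# this default is only a guess and should be passed in explicitly.
DEFAULT_SOCKET_PATH = "/run/bird/bird.ctl"

# Assumed framing: four-digit code, then "-" (more lines follow)
# or " " (last line of the reply), then the text.
_CODED_LINE = re.compile(r"^(\d{4})([- ])(.*)$")


def parse_reply_line(line, previous_code=None):
    """Split one reply line into (code, is_last, text)."""
    match = _CODED_LINE.match(line)
    if match:
        code, separator, text = match.groups()
        return code, separator == " ", text
    if line.startswith(" ") and previous_code is not None:
        # Continuation: a single space stands for the repeated prefix.
        return previous_code, False, line[1:]
    raise ValueError(f"unexpected reply line: {line!r}")


def read_reply(stream):
    """Read one whole reply from an iterable of text lines."""
    texts = []
    code = None
    for raw in stream:
        code, is_last, text = parse_reply_line(raw.rstrip("\n"), code)
        texts.append(text)
        if is_last:
            return code, texts
    raise ConnectionError("connection closed before the reply ended")


def show_ospf_state(path=DEFAULT_SOCKET_PATH):
    """Send `show ospf state`; return (final_code, text_lines)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        with sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
            read_reply(stream)  # assumed greeting sent on connect
            stream.write("show ospf state\n")
            stream.flush()
            return read_reply(stream)
```
\end{verbatim}

Only \texttt{show\_ospf\_state} needs a running daemon; \texttt{read\_reply} works on any iterable of lines, so the framing logic can be exercised on canned input. Asynchronous messages and error codes are deliberately not handled in this sketch.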