\chapter{Analysis}\label{ch:analysis}

In order to avoid as many problems as possible when visualising and analysing an OSPF topology, we need to understand several aspects of networking, the OSPF protocol and its implementation in BIRD. However, this thesis does not aim to be a complete guide to either OSPF or BIRD, so for the sake of brevity we skip many details which do not influence our project.

To aid in testing Birdvisu and experimenting with routing, we will also present a small side-project called Gennet in this chapter. Using it, we will examine the behaviour of network splits and multiple links, as seen by the BIRD routing daemon.

\section{OSPF overview}

OSPF is a link-state routing protocol, which means that each router tries to learn the whole topology of the network and then finds the best paths using a shortest-path algorithm. Usually, Dijkstra's algorithm \X{ref?} is used. OSPF was designed to provide dynamic routing in a whole autonomous system, but running it on a much smaller scale is also possible.

OSPF requires routers to share information about the state of the topology in messages called \emph{Link State Advertisements} (or LSAs for short). The LSAs describe which network segments are incident to which router on which interfaces, and with which routers it is possible to communicate on those interfaces. To distinguish between routers, each router is assigned a 32-bit number called \emph{Router ID} by the network administrator. Router IDs are usually written in a quad-dotted notation \X{glos}.

The LSAs are flooded throughout the system in order to let all the routers know about the current topology. To avoid overloading the system with this routing traffic alone, OSPF defines several mechanisms for minimising the number of exchanged messages. First, each network elects a \emph{designated router} (or DR) which coordinates the exchange of messages in that network segment.
Second, the system can be partitioned into \emph{areas}. That way, the frequent LSAs are only flooded throughout a limited set of networks; the \emph{area border routers} (ABRs) send accumulated LSAs into other areas.\footnote{These inter-area LSAs are called \uv{Summary LSAs} in OSPFv2, but there is no requirement that the routing information is actually aggregated.} These LSAs do not describe the topology of the originating area. The area 0.0.0.0 is called the \emph{backbone area} and all other areas must be incident to it, so that it has all the routing information. When this is impractical, OSPF allows two routers to be connected using a \emph{virtual link}. From the routing perspective, this is a point-to-point connection between the routers in the backbone area, which allows LSAs to be forwarded through other areas even when these LSAs would not normally leave those areas.

There are several types of networks that emerge in OSPF topologies. The \emph{transit} networks are used for forwarding packets in an area. \emph{Stub} networks only have one router and therefore can only deliver packets originating from or destined for that network. For representing routes outside of the area and outside of the whole system, OSPF recognises \emph{extra-area} and \emph{external} networks, respectively. It is also possible for a router to be adjacent to an extra-area router through a point-to-point link. Figure~\ref{fig:nettypes} shows the same classification visually.
\begin{figure}[h]
\centering
% full width and hope it is readable…
\includegraphics[width=\textwidth]{../img/types-of-networks.pdf}
\caption{Types of networks as seen from the cyan area.}
\label{fig:nettypes}
\end{figure}

Once the router has complete information about the topology of an area, it constructs a graph representation of the network and calculates the shortest path DAG\footnote{In the specifications, it is called the shortest path \emph{tree}, but there may be multiple shortest paths and the router is supposed to use all of them.} rooted at that router. This DAG is then used for finding shortest paths to routers and networks in that area, including the external, extra-area and stub networks adjacent to that area.

OSPF specifies that the graph has all the networks and routers as vertices. Directed edges lead from each router to each incident network with the configured cost, and from each transit network to its incident routers with cost 0 (except when the two-part metric\cite{rfc8042} is implemented). There are no edges starting at the external, extra-area or stub networks, so that the shortest path DAG calculation does not find paths through them. The cost of the edge to an external network can be of two types. A Type 1 cost is specified in the same units as the internal costs, whereas a Type 2 cost is considered larger than any internal or Type 1 cost.

The OSPF family of routing protocols has undergone a long evolution since the first specification in 1989\cite{rfc1131}. There are currently two versions of the protocol in use: versions 2 and 3. While the basic idea is still the same, OSPFv2 can only handle IPv4 systems. Although OSPFv3 claims to be network-protocol-independent, it is usually only used with IPv6 systems and, in fact, features like virtual links can only be used with that network protocol\cite{rfc5838}. Both OSPF versions have numerous extensions, as can be seen from the number of RFCs that update the base specifications\cite{rfc2328,rfc5340}.
Therefore, we do not implement the protocol ourselves, but rather rely on a suitable routing daemon \X{glos} to determine the current topology.

\section{Routing daemon selection}

While we were mostly determined to use BIRD\cite{bird} from the start, since we already had some experience with it, let us present a short summary of the other possibilities. Note that the particular choice does not affect interoperability with other routers as long as the chosen routing daemon supports the extensions used in the network system.

There are several implementations of both versions of the OSPF protocol. However, many of them are tied to specific router hardware, which makes it impractical to connect them to a graphical visualisation. Moreover, even evaluating the feasibility would require us to obtain the specific hardware. Therefore, we only consider hardware-independent solutions.

While we are aware of several software implementations, many of these do not seem to be developed anymore (Quagga\cite{quagga}, XORP\cite{xorp}, OpenOSPFd\cite{openospfd}). Apart from BIRD, we only found FRRouting\cite{frr} to be maintained, meaning that it had a release in the past year. While being maintained is not a strict requirement, it would allow us to keep using that implementation in case OSPF is extended again. However, even BIRD does not implement all the extensions, for example, the two-part metric\cite{rfc8042}.

\section{BIRD interface}

The BIRD daemon is controlled through a UNIX domain socket using a line-based text protocol slightly resembling SMTP\X{cite?}. The client may send commands to the daemon, which provides responses. A response may be long and possibly formatted into a table. This interface is primarily aimed at human users, so a rather simple client, \texttt{birdc}, is provided in BIRD's package\X{glos?}.
While there is a note about a machine-readable protocol in the \texttt{doc/roadmap.md} file in BIRD's source code\cite{bird-src}, it is not implemented, so we will need to interface using the socket. This has the following consequences, most of which are not very pleasant:
\begin{itemize}
\item The responses to different commands often have completely different formats. This necessitates creating a dedicated parsing routine for each kind of command we want to use.
\item There is no guarantee that the output will not change between versions. We might need to follow BIRD's development in order to be aware of possible changes.
\item The output does not contain all details of BIRD's state. For example, we cannot retrieve the shortest path DAG directly from BIRD, nor see the details of the individual LSAs.
\item There is no way to get notified when the topology changes.
\end{itemize}

BIRD provides only a few commands that deal with OSPF:
\begin{itemize}
\item \texttt{show ospf} shows a simple summary of the running OSPF instances, such as which areas they are in or how many LSAs BIRD currently considers.
\item \texttt{show ospf interface} describes the current status of the individual local interfaces: their configuration, designated routers for the incident network, etc.
\item \texttt{show ospf neighbors} provides details about the state of communication with adjacent routers.
\item \texttt{show ospf lsadb} returns details about known LSAs. Unfortunately, this contains only low-level information like checksums and sequence numbers, but not details about networks or routers.
\item \texttt{show ospf state} shows an overall view of the OSPF graph representation: present routers and networks, costs of links, distances, \dots
\item \texttt{show ospf topology} seems to only provide a subset of the output of \texttt{show ospf state}. For example, it does not provide information about any non-transit networks.
\end{itemize}

Even though some of the commands accept additional parameters, parsing the output of the \texttt{show ospf state} command is still the only viable option for getting a topology description. The following subsection describes the syntax of the response to this command.

\subsection{Retrieving the OSPF state}

Let us look in depth at the \texttt{show ospf state} command, since we will be using it and the format of its output a lot.

The command has two optional parameters. First, the flag \texttt{all} may be added to show details not only from the reachable part of the system, but from all the known and non-expired LSAs. The difference between the two topologies can be used to discover network problems even without configuring the expected state.

The second parameter is the name of the OSPF instance. It is only required when BIRD is running multiple instances simultaneously. This is unfortunately quite common, because in dual-stack\X{glos} systems a separate instance of OSPF needs to be configured for each IP version.

The output of the command is a tree of lines representing the topology itself. Children of a directive are indented by one more tab. An example output is shown in listing~\ref{lst:ospffile}.

\begin{lstlisting}[float=h,label=lst:ospffile,caption=Example OSPFv2 state output]
area 0.0.0.1
	router 203.0.113.1
		distance 20
		network 203.0.113.0/26 metric 10
		xnetwork 203.0.113.64/26 metric 10
		xrouter 203.0.113.42 metric 10
	router 201.0.113.2
		distance 0
		network 203.0.113.0/26 metric 20
		external 0.0.0.0/0 metric 60
		stubnet 201.0.113.128/25 metric 200
	network 203.0.113.0/26
		dr 203.0.113.1
		distance 20
		router 201.0.113.1
		router 201.0.113.2
\end{lstlisting}

The tree as output by BIRD\footnote{The format was determined by experimentation and by inspecting \texttt{proto/ospf/ospf.c} in BIRD's source code\cite{bird-src}.} has three levels, which we call top-level, level-2 and level-3.
The top level only contains directives of the form \texttt{area AreaID}, with the AreaID written in the quad-dotted notation.

Level-2 lists all the routers and networks in the area. The format differs between OSPFv2 and OSPFv3. While routers are always referred to by their router IDs (again, quad-dotted), networks are addressed by their IPv4 addresses (CIDR notation) in OSPFv2 dumps, but by the designated router ID and interface number in OSPFv3 dumps: \verb|network [203.0.113.1-23]|.

The third level describes details of the respective router or network and all the incident objects. There is always the distance from us (i.e. the router we asked for the dump), or the word \texttt{unreachable} if the object is not reachable. There is also a level-3 line for each incident host and network. An overview of the \uv{tags} (the first words) and parameters is provided in table~\ref{tab:ospf-incidences}. Note that networks can only be incident to routers, while routers may be incident to anything. The incidences of routers also have a metric, given after the word \texttt{metric} or, in the case of a Type 2 external cost, after \texttt{metric2}.

\begin{table}[h]
\centering
\begin{tabular}{lcc}\hline
Incidence type & tag & parameter \\\hline
Transit network & \texttt{network} & Same as on level-2 \\
Router & \texttt{router} & Router ID \\
Extra-area router & \texttt{xrouter} & Router ID \\
Extra-area network & \texttt{xnetwork} & IP address (CIDR) \\
External network & \texttt{external} & IP address (CIDR) \\
Stub network & \texttt{stubnet} & IP address (CIDR) \\
Virtual link & \texttt{vlink} & Peer router ID \\\hline
\end{tabular}
\caption{Level-3 lines describing incidences}
\label{tab:ospf-incidences}
\end{table}

For networks, additional details are also provided on level-3. In OSPFv2, the designated router ID is given in the \texttt{dr} directive; similarly, OSPFv3 may provide networks with zero or more \texttt{address} lines with CIDR addresses.
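The three-level structure described above can be read with a short routine. The following Python sketch is only an illustration under the assumptions stated in this section (one tab per nesting level, the tag being the first word of each line, costs following the words \texttt{metric} or \texttt{metric2}). The names \texttt{Directive}, \texttt{parse\_state} and \texttt{incidences} are hypothetical and do not refer to Birdvisu's actual code.

```python
from dataclasses import dataclass, field

# Tags of level-3 lines that describe incidences (see the table above).
INCIDENCE_TAGS = {"network", "router", "xrouter", "xnetwork",
                  "external", "stubnet", "vlink"}


@dataclass
class Directive:
    """One line of the dump: its tag, the remaining words and nested lines."""
    tag: str
    args: list[str]
    children: list["Directive"] = field(default_factory=list)


def parse_state(text: str) -> list[Directive]:
    """Build a tree from tab-indented lines and return the top-level directives."""
    roots: list[Directive] = []
    stack: list[Directive] = []      # stack[i] is the open directive at depth i
    for raw in text.splitlines():
        if not raw.strip():
            continue                 # tolerate blank lines
        depth = len(raw) - len(raw.lstrip("\t"))
        if depth > len(stack):
            raise ValueError(f"line indented too deeply: {raw!r}")
        words = raw.split()
        node = Directive(words[0], words[1:])
        del stack[depth:]            # close directives at this depth or deeper
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots


def incidences(block: Directive):
    """Yield (tag, parameter, metric keyword, cost) for level-3 incidence lines."""
    for child in block.children:
        if child.tag not in INCIDENCE_TAGS:
            continue                 # e.g. distance, dr, address
        keyword, cost = None, None
        for word in ("metric", "metric2"):
            if word in child.args:
                keyword = word
                cost = int(child.args[child.args.index(word) + 1])
        yield child.tag, child.args[0], keyword, cost
```

Lines of a network block carry no cost, so the sketch reports \texttt{None} for them. Any additional words that BIRD may print on an incidence line are simply ignored here, which is a simplification rather than a statement about the real output.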
An example of a network block in OSPFv3 is in listing~\ref{lst:ospfnet}.

\begin{lstlisting}[float=h,label=lst:ospfnet,caption=Example OSPFv3 network block]
network [198.51.100.1-16]
	distance 10
	router 198.51.100.1
	router 198.51.100.2
	address 2001:db8:b00:7::/64
\end{lstlisting}

One of the nice properties of BIRD's output is that whenever there is a level-3 incidence line for object B in a level-2 block of object A, there exists an edge from A to B in the topology used by Dijkstra's algorithm. This fact will later simplify parsing.

\section{Test network system: Gennet}

To help test Birdvisu and understand network behaviour, we created a simple set of scripts called Gennet. Since it was mainly written to aid Birdvisu, we provide it as attachment~\ref{att:gennet} of this thesis.

Gennet is a network generator. Using a hard-coded configuration and a set of Jinja2\cite{jinja2} templates, it provides a semi-automatic way of creating several virtual machines (their disk images and startup scripts) and the configuration to connect them using software bridges\X{term?}. This allows the state of the network to be changed from the host operating system, which simulates various network conditions.

We fully admit that Gennet is really just a quick hack. However, since it was created specifically to aid the development of Birdvisu and because it provides a reproducible environment, we think it makes sense to attach it to this thesis. The particular choice of technologies (Jinja2, Python, Bash, QEMU and Alpine Linux)\X{refs!} is driven solely by our previous experience with them and should not affect the behaviour of the generated system in any way.

%Gennet is used as follows: the user must first generate a base Linux image,
%\verb|dummydisk.img|. Then, startup scripts and configuration files for the
%individual machines are created in the \verb|output/| directory. These files
%are then added into the copies of the base image using the
%\verb|gen_disks.sh| script. Now it is possible to create the bridges using
%\verb|output/gen_bridges.sh| and allow QEMU to attach VMs to these bridges by
%appending the contents of \verb|output/bridge.conf| into
%\verb|/etc/qemu/bridge.conf|. Finally, machines may be started, either by
%running \verb|qemu.sh| in the output directory for each machine, or using
%\verb|./manage_all.sh start| as a shortcut.

Using Gennet generally involves creating a base disk image and configuration for the individual machines, embedding this configuration into the base image, configuring the bridges and finally starting the required machines. The process is explained in detail in Gennet's \texttt{README.md} file.
% I hope it is OK to just link this. I do not want to reword this again, since
% this is completely unimportant and nobody should care unless they need to use
% it, in which case they must read the README anyways.

When used without changing its configuration, Gennet creates a topology of 10 routers (A--I, X) and 7 networks (numbered), as shown in figure~\ref{fig:gennet}. We expect the user to provide some network connectivity to network 7 and to configure the machine X manually. We use this exact topology as a base for our experimentation.

\begin{figure}[h]
\centering
\includegraphics[width=\textwidth]{../img/gennet.pdf}
\caption{The topology of the default Gennet}
\label{fig:gennet}
\end{figure}

The default Gennet assigns addresses as follows: The networks are given addresses 172.23.$n$.0/24 and fdce:73a4:b00:$n$::/64, where $n$ is the number of the network. Routers are assigned router IDs of 172.23.100.$r$, where $r$ is the position of that router in lexicographic order (A gets a 1, B is 2, \dots, I becomes 9, X is 10). The IP addresses of the routers have the same number in the last octet (e.g., the IPv6 address of router X in network 6 is fdce:73a4:b00:6::a), and all routers have an IP address in each incident network. The costs of all links are 10 by default.
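The addressing scheme above can be restated as a short Python sketch using the standard \texttt{ipaddress} module. The helper names (\texttt{router\_number}, \texttt{router\_id}, \texttt{network\_prefixes}, \texttt{router\_addresses}) are ours and purely illustrative; they are not part of Gennet itself.

```python
import ipaddress

# Routers in lexicographic order: A is 1, B is 2, ..., I is 9, X is 10.
ROUTERS = "ABCDEFGHIX"


def router_number(name: str) -> int:
    """Return r, the 1-based position of the router in lexicographic order."""
    return ROUTERS.index(name) + 1


def router_id(name: str) -> ipaddress.IPv4Address:
    """Router ID 172.23.100.r."""
    return ipaddress.IPv4Address(f"172.23.100.{router_number(name)}")


def network_prefixes(n: int):
    """IPv4 and IPv6 prefixes of network n.

    The network number is written in decimal, which coincides with the
    hexadecimal IPv6 group for the seven default networks.
    """
    return (ipaddress.ip_network(f"172.23.{n}.0/24"),
            ipaddress.ip_network(f"fdce:73a4:b00:{n}::/64"))


def router_addresses(name: str, n: int):
    """IPv4 and IPv6 addresses of a router in network n (host part equals r)."""
    r = router_number(name)
    v4net, v6net = network_prefixes(n)
    return v4net[r], v6net[r]
```

For router X in network 6, \texttt{router\_addresses("X", 6)} reproduces the IPv6 address fdce:73a4:b00:6::a given in the text.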
\section{Unusual network states}\label{s:net-unusual}

Now that we have a basic understanding of BIRD and a network system for testing, we examine how both versions of OSPF react to various unusual conditions in the system. We focus on the behaviour of BIRD in the following scenarios:
\begin{itemize}
\item Network split: Some hosts stop being able to communicate with some other hosts in the same network. This is often caused by a broken cable or a switch malfunction.
\item Multiple links to the same network: It might make sense to connect a single router to the same network using multiple links, possibly with different costs. This provides redundancy and helps, e.g., to avoid network splits.
\item Network with multiple addresses: Sometimes, a network may have more than one address (prefix). This may be either intentional or a result of accidentally joining networks which should be separate.
\end{itemize}

Unfortunately, the behaviour of BIRD in these states often differs considerably depending on the version of OSPF.

\subsection{Network splits}\label{s:net-split}

Network splits are of particular interest to administrators. Not only are they symptoms of a broken part of the infrastructure, but every netsplit also inherently means that some addresses from the split network cannot be reached by some hosts, because there is no hint as to which segment a packet should be delivered to. Splits can also be tricky to spot using topology-unaware approaches, because if a link connecting two switches fails, all ports on hosts are still up and the traffic in the split network might not change if no traffic was routed through the broken link.

When a split occurs, at least one segment stops being able to communicate with the network's designated router. Either this segment has only one router and becomes a stub, or it has more routers and a new designated router is elected. (OSPF cannot detect a network split that leaves a separated segment without any router.
Monitoring of host reachability is therefore still useful.)

The representation of a split network in BIRD's output is straightforward: some routers may become attached to a \texttt{stubnet} instead of a \texttt{network}, and more level-2 network blocks can appear, one for each segment that has elected a DR. For OSPFv3, this forms a valid topology, since the level-3 network directives are derived from the designated router ID and its interface number, identifying the network uniquely. However, since OSPFv2 identifies a network in the output using the shared address, it is not immediately obvious which of the segments a router is connected to. Luckily, this can be deduced from the level-2 network blocks, because they provide information both about the segment's DR and about the incident routers.

\cons A network address is not sufficient to identify a network or stubnet. To do that, we also need to know either the designated router ID, in the case of transit networks, or the router the network is connected to, in the case of stubnets.

\subsection{Multiple links}

BIRD's implementation\footnote{We are not sure whether this is the correct behaviour.} of both versions of OSPF seems to announce two copies of the same network throughout the area if the designated router is connected to that network using multiple links. This is not an issue for routing, because using any of the copies results in the packet being sent to the right network, but it is unfortunate for visualisation. In the OSPFv2 dumps there is no way to differentiate the two copies, since the DR's interface number is not exposed, so we can only merge them into one network, which solves the problem. On the other hand, in OSPFv3 the interface number may be the only information that distinguishes different networks, since the networks do not need to have any addresses assigned. The only safe way is therefore to visualise both copies just in case.
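The identification rules from this and the previous subsection can be summarised in a small Python sketch. It is our illustration only: the type \texttt{NetworkKey} and the three helper functions are hypothetical names, and the sketch assumes that the relevant fields (address, DR, attached router, OSPFv3 identifier) have already been extracted from the dump.

```python
from typing import NamedTuple, Optional


class NetworkKey(NamedTuple):
    """Hypothetical identity of a network vertex built from dump fields."""
    kind: str             # "network" (transit) or "stubnet"
    ident: str            # OSPFv2: CIDR address; OSPFv3: "[DR-interface]" as printed
    owner: Optional[str]  # OSPFv2 transit: DR router ID; stubnet: attached router ID


def transit_key_v2(address: str, dr: str) -> NetworkKey:
    # The address together with the DR tells split segments apart. Duplicate
    # copies announced by one DR share the key and are therefore merged.
    return NetworkKey("network", address, dr)


def stubnet_key(address: str, router: str) -> NetworkKey:
    # A stubnet is identified by its address and the router it hangs off.
    return NetworkKey("stubnet", address, router)


def transit_key_v3(ident: str) -> NetworkKey:
    # The DR ID and interface number already identify the network uniquely.
    return NetworkKey("network", ident, None)


# Two segments of a split OSPFv2 network elect different DRs: two vertices.
split = {transit_key_v2("203.0.113.0/26", "203.0.113.1"),
         transit_key_v2("203.0.113.0/26", "203.0.113.2")}

# Two copies announced through the same DR collapse into a single vertex.
merged = {transit_key_v2("203.0.113.0/26", "203.0.113.1"),
          transit_key_v2("203.0.113.0/26", "203.0.113.1")}

# OSPFv3 copies differ in the interface number, so both are kept.
copies_v3 = {transit_key_v3("[198.51.100.1-16]"),
             transit_key_v3("[198.51.100.1-17]")}
```

Storing the keys in a set makes the OSPFv2 merge implicit, while OSPFv3 copies with different interface numbers remain separate vertices.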
(There also seems to be a bug where the OSPFv3 dump does not contain the level-2 block of the multiply-connected router on a neighbouring router. We have not been able to explain this behaviour, but it does not seem to propagate to other routers nor affect packet forwarding, so we decided to ignore this peculiarity and debug it later. A simple workaround is to add another router between the actual network and Birdvisu, which is always possible by using unnumbered networks.)\label{ss:bug}

\subsection{Multiple addresses in a single network}

In OSPFv3, a single network can be assigned zero or more addresses. Therefore, from its point of view, this is not an unusual state.

\cons For OSPFv3, the set of all addresses must be considered to determine whether the network has changed or not.

OSPFv2 treats each of the addresses as a separate network, ignoring other routers. When this is intended, it should not cause problems, and unintentionally merged networks will not interact. However, we cannot detect this state across the area unless there is another change in the topology (for example, if this is caused by a cable connected to a wrong port, the network will probably be a stub).

\section{Area structure}

It is also worth considering the expected size and structure of the OSPF areas where Birdvisu might run. While this is up to the administrators, who may be very creative, there are some limits to such creativity.

The largest system which can be spanned by a single OSPF instance is the whole autonomous system (AS). The largest ASes only have on the order of several hundred thousand routers\cite{as-topologies}. The average degree also seems to be rather low.

We can derive another limit from IPv4 address allocations. A /8 block (i.e. Class A) has 16 777 216 addresses. Very few ISPs would be assigned such a large block, but they might be using the 10.0.0.0/8 private block.
Even if an ISP wanted to use all those addresses, the majority of them would likely not be assigned to routers, but to end devices that actually provide \uv{useful} services. Those devices are also likely to be grouped into non-trivial networks, thus reducing the number of vertices (i.e. routers and networks) in the OSPF topology. (While IPv6 allocates many more addresses, we assume that the overall topology will not be different from the IPv4 one.)

It is probably not practical to have a single OSPF area span all the routers in a large AS, since any link state change results in an LSA being flooded throughout the area. While we cannot be sure about particular administrative decisions, given the observations above, we expect that a single area contains at most a few thousand vertices and probably far fewer than that.