
\chapter{Analysis}\label{ch:analysis}

In order to avoid as many problems as possible when visualising and
analysing an OSPF topology, we need to understand several aspects of networking,
the OSPF protocol and its implementation in BIRD. However, this thesis does not
aim to be a complete guide to either OSPF or BIRD, so for the sake of brevity
we skip many details that do not influence our project.

To aid in testing Birdvisu and experimenting with routing, we will also present
a small side-project called Gennet in this chapter. Using it, we will
understand the behaviour of network splits and multiple links, as seen by the
BIRD routing daemon.

\section{OSPF overview}

OSPF~\cite{rfc2328,rfc5340} is a link-state routing protocol, which means that routers try to understand
the whole topology of the network and find the best path using an algorithm for
finding the shortest paths. Usually, Dijkstra's algorithm is used.
OSPF was designed to provide dynamic routing in an entire autonomous system, but
running it on a much smaller scale is also possible.

OSPF requires routers to share information about the state of the topology in
messages called \emph{Link State Advertisements} (or LSAs for short). The LSAs
contain information about which network segments are incident to which
router on which interfaces and with which routers it is possible to communicate
on those interfaces. To distinguish between routers, each router is assigned a
32-bit number called \emph{Router ID} by the network administrator.
Router IDs are usually written in a quad-dotted notation.
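
As a small illustration of this notation (a sketch using Python's standard
\texttt{ipaddress} module; the function names and the sample value are ours and
purely illustrative), a Router ID can be converted between its 32-bit and
quad-dotted forms as follows:

\begin{lstlisting}[language=Python]
import ipaddress

def router_id_to_str(rid: int) -> str:
    """Format a 32-bit Router ID in quad-dotted notation."""
    return str(ipaddress.IPv4Address(rid))

def router_id_from_str(text: str) -> int:
    """Parse a quad-dotted Router ID back into a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))

print(router_id_to_str(0xCB007101))   # 203.0.113.1
\end{lstlisting}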

The LSAs are flooded throughout the system in order to let all the routers know
about the current topology. To avoid overloading the system with this
routing traffic alone, OSPF employs several mechanisms for minimising the number of
exchanged messages. First, each network elects a \emph{designated router} (or
DR), which coordinates the exchange of messages in that network segment.
Second, the system can be partitioned into \emph{areas}. That way, the frequent
LSAs are only flooded throughout a limited set of networks; the \emph{area
border routers} (ABRs) send accumulated LSAs into other areas.\footnote{These
inter-area LSAs are called \uv{Summary LSAs} in OSPFv2, but there is no requirement that
the routing information is actually aggregated.} These LSAs do not describe the
topology of the originating area.

The area 0.0.0.0 is called the \emph{backbone area} and all other areas must be
incident to it, so that it has all the routing information. When this is
impractical, OSPF allows two routers to be connected using a \emph{virtual
link}. From the routing perspective, this is a point-to-point connection
between the routers in the backbone area, which allows LSAs to be forwarded through
other areas even when these LSAs would not normally leave those areas.

There are several types of networks that emerge in OSPF topologies. The
\emph{transit} networks are used for forwarding packets in an area. \emph{Stub}
networks only have one router and therefore can only deliver packets
originating from or destined for that network. For representing routes outside
of the area and outside of the whole system, OSPF recognises \emph{extra-area} and
\emph{external} networks, respectively. It is also possible for a router to be
adjacent to an extra-area router through a point-to-point link.
Figure~\ref{fig:nettypes} shows the same classification visually.

\begin{figure}[h]
	\centering
	% full width and hope it is readable…
	\includegraphics[width=\textwidth]{../img/types-of-networks.pdf}
	\caption{Types of networks as seen from the cyan area.}
	\label{fig:nettypes}
\end{figure}

Once the router has complete information about the topology of an area, it
constructs a graph representation of the network and calculates the shortest path
DAG\footnote{In the specifications, it is called the shortest path \emph{tree},
but there may be multiple shortest paths and the router is supposed to use all
of them.} rooted at that router. This DAG is then used for finding shortest
paths to routers and networks in that area, including the external, extra-area
and stub networks adjacent to that area. OSPF specifies that the graph has all
the networks and routers as vertices. Directed edges lead from each router to
each incident network with the configured cost, and from each transit network to
its incident routers with cost 0 (except when the two-part metric~\cite{rfc8042} is
implemented). There are no edges starting at the external, extra-area or stub
networks, so that the shortest path DAG calculation does not find paths
through them.
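
The calculation described above can be sketched in Python as follows. This is
only an illustration under our own assumptions, not the procedure from the
specification and not BIRD's implementation: the function name, the
adjacency-list representation and the vertex names are hypothetical, and we
assume that the graph contains no zero-cost cycles (which holds when every
edge from a router to a network has a positive cost).

\begin{lstlisting}[language=Python]
import heapq

def shortest_path_dag(edges, root):
    """Dijkstra's algorithm that remembers every equal-cost predecessor.

    edges maps a vertex to a list of (neighbour, cost) pairs; vertices
    without outgoing edges (such as stub networks) may be left out.
    Returns (dist, preds): preds[v] lists all predecessors of v that lie
    on some shortest path from root, which together form the DAG.
    """
    dist = {root: 0}
    preds = {root: []}
    done = set()
    queue = [(0, root)]
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue                      # stale queue entry
        done.add(u)
        for v, cost in edges.get(u, []):
            nd = d + cost
            if v not in dist or nd < dist[v]:
                dist[v], preds[v] = nd, [u]
                heapq.heappush(queue, (nd, v))
            elif nd == dist[v] and u not in preds[v]:
                preds[v].append(u)        # another equally short path
    return dist, preds

# Two routers joined by two transit networks (hypothetical names).
graph = {
    "R1": [("N1", 10), ("N2", 10)],
    "R2": [("N1", 10), ("N2", 10), ("S", 5)],   # S is a stub network
    "N1": [("R1", 0), ("R2", 0)],
    "N2": [("R1", 0), ("R2", 0)],
}
dist, preds = shortest_path_dag(graph, "R1")
print(dist["R2"], sorted(preds["R2"]))    # 10 ['N1', 'N2']
\end{lstlisting}

In this example, R2 is reachable through both networks at the same cost, so it
has two predecessors, and the result is a DAG rather than a tree. The stub
network S has no outgoing edges and therefore never acts as a predecessor.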

The cost of the edge to an external network can be of two types. A type~1 cost
uses the same units as the internal costs, whereas any type~2 cost is larger than
all internal or type~1 costs.
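
The ordering implied by the two cost types can be sketched as a sort key. This
is our illustration rather than text taken from the specification: the function
name and its parameters are hypothetical, and breaking ties between equal
type~2 costs by the internal cost is our reading of the OSPFv2
specification~\cite{rfc2328}.

\begin{lstlisting}[language=Python]
def external_route_key(internal_cost, external_cost, metric_type):
    """Sort key for external routes; a smaller key is preferred."""
    if metric_type == 1:
        # Type 1 uses the same units as internal costs, so they add up.
        return (0, internal_cost + external_cost, 0)
    # Type 2 dominates any internal cost; the internal cost
    # (assumption, see above) only breaks ties.
    return (1, external_cost, internal_cost)

# An expensive type 1 route still beats a cheap type 2 route.
print(external_route_key(100, 1000, 1) < external_route_key(1, 1, 2))  # True
\end{lstlisting}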

The OSPF family of routing protocols has undergone a long evolution since the
first specification in 1989~\cite{rfc1131}. There are currently two versions of
the protocol in use -- versions 2 and 3. While the basic idea is still the
same, OSPFv2 can only handle IPv4 systems. Although OSPFv3 claims to be
network-protocol-independent, it is usually only used with IPv6 systems and, in
fact, features like virtual links can only be used with that network
protocol~\cite{rfc5838}.

Both OSPF versions have numerous extensions, as can be seen by the number of
RFCs that update the base specifications~\cite{rfc2328,rfc5340}. Therefore, we
do not implement the protocol ourselves, but rather use a suitable routing
daemon to determine the current topology for us.

Many improvements of the protocol only affect the topology construction (e.g.
NSSA areas~\cite{rfc3101}) or change the data exchange between routers
(multi-instance extensions~\cite{rfc6549}, authentication~\cite{rfc5709}, \dots).
By extracting the topology from a routing daemon, we can support many OSPF
extensions for free. For this reason, it is mostly sufficient to only consider
the base specifications of OSPF.

\section{Routing daemon selection}

While we were mostly determined to use BIRD~\cite{bird} from the start, because we already
had some experience with it, let us present here a short summary of other
possibilities. Note that the particular choice does not affect interoperability
with other routers as long as the chosen routing daemon supports extensions
used in the network system.

There are several implementations of both versions of the OSPF protocol.
However, many of them are tied to specific router hardware, which makes them
impractical to connect to a graphical visualisation. Moreover, even evaluating
the feasibility would require us to obtain the specific hardware. Therefore, we
only consider hardware-independent solutions.

While we are aware of several software implementations, many of these do not
seem to be developed anymore (Quagga~\cite{quagga}, XORP~\cite{xorp},
OpenOSPFd~\cite{openospfd}). Apart from BIRD, we only found FRRouting~\cite{frr}
to be maintained, meaning that a new version has been released in the past year. While being
maintained is not a strict requirement, it would allow us to use that
implementation in case OSPF is extended again.

However, even BIRD does not implement all the extensions, for example, the
two-part metric~\cite{rfc8042}.

\section{BIRD interface}

The BIRD daemon is controlled through a UNIX domain socket using a text
line-based protocol slightly resembling SMTP. The client may send
commands to the daemon, which provides responses. The response may be long and
possibly formatted into a table. This interface is primarily aimed at human
users, so a rather simple client, \texttt{birdc}, is provided in BIRD's
package.
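
A minimal sketch of such a client in Python is shown below. It rests on several
assumptions that are not taken from this text and should be checked against the
BIRD version in use: the socket path \texttt{/run/bird/bird.ctl}, and the reply
framing in which each line starts with a four-digit code followed by a minus
sign (more lines follow) or a space (last line of the reply), while a line
starting with a space continues the previous code. The function names are
hypothetical.

\begin{lstlisting}[language=Python]
import socket

def read_reply(reader):
    """Collect the lines of one reply from a text stream."""
    lines = []
    while True:
        raw = reader.readline()
        if not raw:
            break                         # connection closed
        line = raw.rstrip("\n")
        lines.append(line)
        # Assumed framing: "NNNN " marks the last line of a reply.
        if len(line) >= 5 and line[:4].isdigit() and line[4] == " ":
            break
    return lines

def query(command, path="/run/bird/bird.ctl"):
    """Send one command and return the raw reply lines."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        with sock.makefile("r", encoding="utf-8", newline="\n") as reader:
            read_reply(reader)            # skip the greeting
            sock.sendall((command + "\n").encode("utf-8"))
            return read_reply(reader)
\end{lstlisting}

Under these assumptions, a call such as \texttt{query("show ospf state")}
returns the raw lines, still carrying their reply codes, which have to be
stripped before any further processing.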

While there is a note of a machine-readable protocol in the
\texttt{doc/roadmap.md} file in BIRD's source code~\cite{bird-src}, it has not been
implemented, so we will need to interface using the socket. This has the following
consequences, most of which are not very pleasant:

\begin{itemize}
\item The responses to different commands often have completely different
formats. This necessitates creating dedicated parsing routines for each kind
of command we want to use.
\item There is no guarantee that the output will not change between versions.
We might need to follow BIRD's development in order to be aware of possible
changes.
\item The output does not contain all details of BIRD's state. For example, we
cannot retrieve the shortest path DAG directly from BIRD, nor see the details
of the individual LSAs.
\item There is no way to get notified when the topology changes.
\end{itemize}

BIRD provides only a few commands that deal with OSPF:

\begin{itemize}
\item \texttt{show ospf} shows a simple summary of the running instances of
	OSPF, such as which areas they are incident to or how many LSAs BIRD currently
	considers.
\item \texttt{show ospf interface} describes the current status of the individual
	local interfaces: their configuration, designated routers for the incident
	network, etc.
\item \texttt{show ospf neighbors} provides details about the state of
	communication with neighbour routers.
\item \texttt{show ospf lsadb} returns the details about known LSAs. Unfortunately,
	this contains low-level information like checksums and sequence numbers,
	but not details about networks or routers.
\item \texttt{show ospf state} shows an overall view of the OSPF graph
	representation: present routers and networks, costs of links, distances, \dots
\item \texttt{show ospf topology} seems to only provide a subset of the output
	of \texttt{show ospf state}. For example, it does not provide information about
	any non-transit networks.
\end{itemize}

Even though some of the commands can have more parameters, parsing the output
of the \texttt{show ospf state} command is still the only viable option for
obtaining a topology description. The following subsection describes the syntax of
the response to this command.

\subsection{Retrieving the OSPF state}\label{ss:ospffile}

Let us look in depth at the \texttt{show ospf state} command, since we will be
using it and the format of its output a lot.

The command has two optional parameters. First, the flag \texttt{all} may be
added to show details not only about the reachable part of the system, but about all
the known and non-expired LSAs. The difference between the two topologies can be
used to discover network problems even without configuring the expected state.

The second parameter is the name of the OSPF instance. It is only required when
BIRD is running multiple instances simultaneously. This is unfortunately quite
common, because in dual-stack systems there need to be two separate
instances of OSPF, each configured for a different IP version.

The output of the command is a tree of lines representing the topology itself.
Children of a directive are indented by one more tab.  An example output is
shown in listing~\ref{lst:ospffile}.

\begin{lstlisting}[float=h,label=lst:ospffile,caption=Example OSPFv2 state output]

area 0.0.0.1

	router 203.0.113.1
		distance 20
		network 203.0.113.0/26 metric 10
		xnetwork 203.0.113.64/26 metric 10
		xrouter 203.0.113.42 metric 10

	router 203.0.113.2
		distance 0
		network 203.0.113.0/26 metric 20
		external 0.0.0.0/0 metric 60
		stubnet 203.0.113.128/25 metric 200

	network 203.0.113.0/26
		dr 203.0.113.1
		distance 20
		router 203.0.113.1
		router 203.0.113.2
\end{lstlisting}

The tree as output by BIRD\footnote{The format was determined by
experimentation and inspecting of \texttt{proto/ospf/ospf.c} in BIRD's source
code~\cite{bird-src}.} has three levels, which we call top-level, level-2
and level-3. The top level only contains directives of the form \texttt{area
AreaID}, with the AreaID being written in the quad-dotted notation.

Level-2 lists all the routers and networks in the area. The notation differs
between OSPFv2 and OSPFv3. While routers are always identified by their
router IDs (again, quad-dotted), networks in OSPFv2 dumps are identified by
their IPv4 addresses (CIDR notation), whereas in OSPFv3 dumps they are
identified by the designated router ID and interface number:
\verb|network [203.0.113.1-23]|.

The third level describes details of the respective router/network and all the
incident objects. There is always the distance from us (i.e. the router we
asked for the dump), or the word \texttt{unreachable} if it is not reachable.
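
The three-level structure can be read into a nested data structure with a few
lines of Python. The following is a minimal sketch, not Birdvisu's actual
parser: the function name is hypothetical, and we assume that the reply codes
of the control protocol have already been stripped, so that each line starts
with its indentation tabs.

\begin{lstlisting}[language=Python]
def parse_tree(text):
    """Parse tab-indented lines into nested (directive, children) pairs."""
    root = []
    stack = [root]                 # stack[d] collects the nodes of depth d
    for line in text.splitlines():
        if not line.strip():
            continue               # blank lines only separate blocks
        depth = len(line) - len(line.lstrip("\t"))
        if depth >= len(stack):
            raise ValueError("line is indented too deeply: %r" % line)
        node = (line.strip(), [])
        del stack[depth + 1:]      # close all deeper blocks
        stack[depth].append(node)
        stack.append(node[1])
    return root

sample = (
    "area 0.0.0.1\n"
    "\n"
    "\trouter 203.0.113.1\n"
    "\t\tdistance 20\n"
    "\t\tnetwork 203.0.113.0/26 metric 10\n"
    "\n"
    "\tnetwork 203.0.113.0/26\n"
    "\t\tdr 203.0.113.1\n"
)
tree = parse_tree(sample)
print(tree[0][0])                  # area 0.0.0.1
\end{lstlisting}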

There is also a level-3 line for each incident router and network. An overview of
the \uv{tags} (the first words) and parameters is provided in
table~\ref{tab:ospf-incidences}. Note that networks can only be incident to
routers, while routers may be incident to anything. The incidences of routers
also have a metric, given after the word \texttt{metric}, or, in the case of a
type~2 external cost, \texttt{metric2}.

\begin{table}[h]
	\centering
	\begin{tabular}{lcc}\hline
		Incidence type     & tag               & parameter \\\hline
		Transit network    & \texttt{network}  & Same as on level-2 \\
		Router             & \texttt{router}   & Router ID \\
		Extra-area router  & \texttt{xrouter}  & Router ID \\
		Extra-area network & \texttt{xnetwork} & IP address (CIDR) \\
		External network   & \texttt{external} & IP address (CIDR) \\
		Stub network       & \texttt{stubnet}  & IP address (CIDR) \\
		Virtual link       & \texttt{vlink}    & Peer router ID \\\hline
	\end{tabular}
	\caption{Level-3 lines describing incidences}
	\label{tab:ospf-incidences}
\end{table}

For networks, additional details are also provided on level-3. For OSPFv2, the
designated router ID is given in the \texttt{dr} directive. Similarly, OSPFv3
may provide the networks with zero or more \texttt{address} lines with CIDR
addresses. An example of a network block in OSPFv3 is in
listing~\ref{lst:ospfnet}.

\begin{lstlisting}[float=h,label=lst:ospfnet,caption=Example OSPFv3 network block]
	network [198.51.100.1-16]
		distance 10
		router 198.51.100.1
		router 198.51.100.2
		address 2001:db8:b00:7::/64
\end{lstlisting}

One of the nice properties of BIRD's output is that whenever there is a level-3
incidence line for object B in a level-2 block of object A, there exists an
edge from A to B in the topology used by Dijkstra's algorithm. This fact
will later simplify parsing.
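
To illustrate this property, the following Python sketch turns the incidence
lines directly into directed edges. It is an illustration only: the function
and constant names are hypothetical, the input is assumed to be free of reply
codes, and any words following the metric value are ignored.

\begin{lstlisting}[language=Python]
INCIDENCE_TAGS = {"network", "router", "xrouter", "xnetwork",
                  "external", "stubnet", "vlink"}

def incidence_edges(text):
    """Yield (source, tag, target, metric_kind, cost) per incidence line."""
    source = None
    for line in text.splitlines():
        depth = len(line) - len(line.lstrip("\t"))
        words = line.split()
        if not words:
            continue
        if depth == 1:
            source = " ".join(words)      # the enclosing level-2 block
        elif depth == 2 and words[0] in INCIDENCE_TAGS:
            # Incidences of networks carry no metric at all.
            target, kind, cost = words[1:], None, None
            for i in range(2, len(words) - 1):
                if words[i] in ("metric", "metric2"):
                    target = words[1:i]
                    kind, cost = words[i], int(words[i + 1])
                    break
            yield source, words[0], " ".join(target), kind, cost

sample = (
    "area 0.0.0.1\n"
    "\trouter 203.0.113.2\n"
    "\t\tdistance 0\n"
    "\t\tnetwork 203.0.113.0/26 metric 20\n"
    "\t\texternal 0.0.0.0/0 metric2 60\n"
    "\tnetwork 203.0.113.0/26\n"
    "\t\tdr 203.0.113.1\n"
    "\t\trouter 203.0.113.2\n"
)
edges = list(incidence_edges(sample))
print(len(edges))                         # 3
\end{lstlisting}

Lines such as \texttt{distance} or \texttt{dr} are skipped, because they
describe the block itself rather than an edge of the graph.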

\section{Test network system: Gennet}\label{s:gennet}

To help test Birdvisu and understand network behaviour, we created a simple set
of scripts called Gennet. Since it was mainly written to aid Birdvisu, we
provide it as attachment~\ref{att:gennet} of this thesis.

Gennet is a network generator. Using a hard-coded configuration and a set of
Jinja2~\cite{jinja2} templates, it provides a semi-automatic way of
creating several virtual machines (their disk images and startup scripts) and
configuration to connect them using software bridges. This allows the state
of the network to be changed from the host operating system, simulating
various network conditions.

We fully admit that Gennet is really just a quick hack. However, since it was
created specifically to aid the development of Birdvisu and because it provides
a reproducible environment, we think it makes sense to attach it to this thesis. The
particular choice of technologies (Jinja2, Python, Bash, QEMU and Alpine
Linux) is driven solely by our previous experience with them and
should not affect the behaviour of the generated system in any way.

%Gennet is used as follows: the user must first generate a base Linux image,
%\verb|dummydisk.img|. Then, startup scripts and configuration files for the
%individual machines are created in the \verb|output/| directory. These files
%are then added into the copies of the base image using the
%\verb|gen_disks.sh| script. Now it is possible to create the bridges using
%\verb|output/gen_bridges.sh| and allow QEMU to attach VMs to these bridges by
%appending the contents of \verb|output/bridge.conf| into
%\verb|/etc/qemu/bridge.conf|. Finally, machines may be started, either by
%running \verb|qemu.sh| in the output directory for each machine, or using
%\verb|./manage_all.sh start| as a shortcut.
Using Gennet generally involves creating a base disk image and configuration
for individual machines, embedding this configuration into the base image,
configuring the bridges and finally starting the required machines. The process
is explained in detail in Gennet's \texttt{README.md} file.
% I hope it is OK to just link this. I do not want to reword this again, since
% this is completely unimportant and nobody should care unless they need to use
% it, in which case they must read the README anyways.

When used without changing Gennet's configuration, it creates a topology of 10
routers (A--I, X) and 7 networks (numbered), as shown in
figure~\ref{fig:gennet}. We expect the user to provide some network
connectivity to network 7 and configure the machine X manually. We use
this exact topology as a base for our experimentation.

\begin{figure}[h]
	\centering
	\includegraphics[width=\textwidth]{../img/gennet.pdf}
	\caption{The topology of the default Gennet}
	\label{fig:gennet}
\end{figure}

The default Gennet assigns addresses as follows: The networks are given
addresses 172.23.$n$.0/24 and fdce:73a4:b00:$n$::/64, where $n$ is the number
of the network. Routers are assigned router IDs of 172.23.100.$r$, where $r$ is
the position of that router in lexicographic order (A gets 1, B gets 2, \dots, I
gets 9, X gets 10). The IP addresses of the routers end with the same number
(e.g., the IPv6 address of router X in network 6 is fdce:73a4:b00:6::a), and all
routers have an IP address in each incident network. The costs of all links are
10 by default.
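
The addressing scheme can be reproduced with Python's standard
\texttt{ipaddress} module. The sketch below is ours and is not part of Gennet;
the function names are hypothetical.

\begin{lstlisting}[language=Python]
import ipaddress

ROUTERS = "ABCDEFGHIX"   # lexicographic order: A is 1, ..., I is 9, X is 10

def router_id(name):
    """Router ID 172.23.100.r of the router with the given letter."""
    r = ROUTERS.index(name) + 1
    return ipaddress.IPv4Address("172.23.100.%d" % r)

def interface_addresses(name, n):
    """IPv4 and IPv6 address of a router in network number n."""
    r = ROUTERS.index(name) + 1
    # The default networks are numbered 1-7, so decimal and
    # hexadecimal notation of n agree in the IPv6 prefix.
    v4 = ipaddress.IPv4Network("172.23.%d.0/24" % n)
    v6 = ipaddress.IPv6Network("fdce:73a4:b00:%d::/64" % n)
    # The router's number is the offset of the address in the network.
    return v4[r], v6[r]

print(*interface_addresses("X", 6))   # 172.23.6.10 fdce:73a4:b00:6::a
\end{lstlisting}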


\section{Unusual network states}\label{s:net-unusual}

Now that we have a basic understanding of BIRD and a network system for testing,
we examine how both versions of OSPF react to various unusual conditions in the
system.

We focus on the behaviour of BIRD in the following scenarios:

\begin{itemize}
	\item Network split: Some hosts in a network lose the ability to
		communicate with other hosts in the same network. This is often
		caused by a broken cable or switch malfunction.
	\item Multiple links to the same network: It might make sense to connect a
		single router to the same network using multiple links, possibly with
		different costs. This provides redundancy and helps e.g. avoid network
		splits.
	\item Network with multiple addresses: Sometimes, a network may have more
		than one address (prefix). This may be either intentional or a result of
		accidental joining of networks which should be separate.
\end{itemize}

Unfortunately, the behaviour of BIRD in these states is often very different
depending on the version of OSPF.

\subsection{Network splits}\label{s:net-split}

Network splits are of particular interest to administrators. Not only are they
symptoms of a broken part of the infrastructure, but also every netsplit
inherently means that some addresses from the split network cannot be reached
by some hosts, because there is no hint as to which segment a packet
should be delivered to.

Splits can also be tricky to spot for systems using topology-unaware approaches,
because if a link connecting two switches fails, all ports on hosts are still
up and the traffic in the split network might not change if no traffic was
routed through the broken link.

When a split occurs, at least one segment stops being able to communicate with
the network's designated router. Either this segment has only one router and
becomes a stub network, or it has more routers and a new designated router is elected.
(OSPF cannot detect when a network splits and there is no router in a separated
segment. Monitoring of host reachability is therefore still useful.)

The representation of a split network in BIRD's output is straightforward: some
routers may become attached to a \texttt{stubnet}, instead of a
\texttt{network} and more level-2 network blocks can appear, one for each
segment that has elected a DR.

For OSPFv3, this forms a valid topology, since the level-3 network directives
are derived from the designated router ID and its interface number, identifying
the network uniquely. However, since OSPFv2 identifies a network using the
shared address in the output, it is not immediately obvious which of the
segments the router is connected to. Luckily, this can be deduced from level-2
network blocks, because they provide information both about the segment's DR
and about incident routers.

\cons A network address is not sufficient to identify a network or stubnet. To
do that, we either need to also know a designated router ID, in case of transit
networks, or which router the network is connected to, for stubnets.
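
For OSPFv2, the deduction from the level-2 network blocks can be sketched as
follows. The function name and the tuple-based representation of the blocks are
hypothetical and serve only as an illustration.

\begin{lstlisting}[language=Python]
def segment_of(router_id, address, network_blocks):
    """Return the (address, DR) key of the segment a router is attached to.

    network_blocks is a list of (address, dr_id, router_ids) tuples, one
    per level-2 network block; None is returned if no block matches.
    """
    for block_address, dr_id, router_ids in network_blocks:
        if block_address == address and router_id in router_ids:
            return (block_address, dr_id)
    return None

# A split network: both segments share the address but not the DR.
blocks = [
    ("203.0.113.0/26", "203.0.113.1", {"203.0.113.1", "203.0.113.2"}),
    ("203.0.113.0/26", "203.0.113.3", {"203.0.113.3", "203.0.113.4"}),
]
print(segment_of("203.0.113.4", "203.0.113.0/26", blocks))
# ('203.0.113.0/26', '203.0.113.3')
\end{lstlisting}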

\subsection{Multiple links}

BIRD's implementation\footnote{We are not sure whether this is the correct
behaviour.} of both versions of OSPF seems to announce two copies of the same
network throughout the area, if the designated router is connected to that
network using multiple links. This is not an issue for routing, because using
any of the copies results in the packet being sent to the right network, but is
an unfortunate behaviour for visualisation. In the OSPFv2 dumps there is no way
to differentiate the two copies, since the DR's interface number is not
exposed, so we can only merge them into one network, solving the problem.

On the other hand, in OSPFv3 the interface number may be the only information
used to differentiate different networks, since the networks do not need to
have any addresses assigned. The only safe way is therefore to visualise both
copies just in case.

(There also seems to be a bug where the OSPFv3 dump on a neighbouring router
does not contain the level-2 block of the multiply-connected router. We have not
found an explanation for this behaviour, but it does not seem to propagate to other routers nor
affect packet forwarding, so we decided to ignore this peculiarity and debug it
later. A simple workaround is to add another router between the actual network
and Birdvisu, which is always possible by using unnumbered networks.)\label{ss:bug}

\subsection{Multiple addresses in a single network}

In OSPFv3, a single network can be assigned zero or more addresses. Therefore,
from its point of view it is not an unusual state. \cons For OSPFv3, the set of
all addresses must be considered to determine whether the network has changed
or not.

OSPFv2 treats each of the addresses as a separate network, ignoring other
routers. When this is intended, it should not cause problems, and unintended
merges of networks will not interact. However, we cannot detect this state
across the area, unless there is another change in topology (for example, if
this is caused by a cable connected to a wrong port, the network will probably
be stub).

\section{Area structure}\label{s:areas}

It is also worth considering the expected size and structure of the OSPF areas
where Birdvisu might run. While this is up to the administrators, who may be
very creative, there are some limits to such creativity.

The largest system which can be spanned by a single OSPF instance is the whole
autonomous system (AS). The largest ASes have only several hundred
thousand routers~\cite{as-topologies}. The average degree also seems to be rather
low.

We can derive another limit from IPv4 address allocations. A /8 block (i.e.
Class A) has 16 777 216 addresses. Very few ISPs would be assigned such a large
block, but they might be using the 10.0.0.0/8 private block. Even if an ISP
wants to use all those addresses, the majority of them will likely not be assigned
to routers, but to some end devices that actually provide \uv{useful} services.
Those devices are also likely to be grouped into non-trivial networks, thus
reducing the number of vertices (i.e. routers and networks) in the OSPF
topology. (While IPv6 allocates many more addresses, we assume that the overall
topology will not be different from the IPv4 one.)

It is probably not practical to have a single OSPF area span all the routers in
a large AS, since any link state change results in an LSA being flooded
throughout the area.

While we cannot be sure about particular administrative decisions, given the
observations above, we expect that a single area contains at most a few thousand
vertices and probably far fewer than that.