diff --git a/en/chap01.tex b/en/chap01.tex index e781386..1806180 100644 --- a/en/chap01.tex +++ b/en/chap01.tex @@ -9,11 +9,11 @@ this we derive a set of properties Birdvisu should fulfil. \section{Existing approaches to network monitoring} -Several approaches to network monitoring and status visalisation already exist. +Several approaches to network monitoring and status visualisation already exist. These can be approximately split into several types: visualisation of existing data, traffic visualisation, host monitoring systems and integrated system management platforms. Here we introduce them shortly and explain their -potential disadvantages, compared to visualisation of routing data. +potential disadvantages, compared to visualisation of routing information. \subsection{Visualisation of existing data} @@ -44,7 +44,7 @@ collection are available, so this differentiates them from the previous group. \subsection{Host monitoring systems} -Projects in tis category do not necessarily consider the whole network system, +Projects in this category do not necessarily consider the whole network system, but check state of individual hosts. It is possible for them to run locally, as is the case for Plotnetcfg~\cite{plotnetcfg}, or check the host over the network (CaLStats~\cite{calstats}, Icinga~\cite{icinga}), but they do not provide @@ -68,7 +68,7 @@ usually server based with web interface~\cite{nodewatcher-paper}. \subsection{Topolograph} -We are aware of the only project that would allow visualisation of OSPF +We are aware of only one project that would allow visualisation of OSPF topology, Topolograph~\cite{topolograph}. While it does not collect its own data, its companion project, Ospfwatcher~\cite{ospfwatcher}, is able to retrieve current topology data from a Quagga~\cite{quagga} routing daemon. The deployment @@ -81,7 +81,7 @@ Firefox version 116.0b2). 
\subsection{Summary of existing tools} As we have seen, various projects are available, but they often have some -important disadvantage. Table~\ref{t:comparison1} summarizes available projects. +important disadvantages. Table~\ref{t:comparison1} summarizes known approaches. \bgroup \def\yes{\checkmark} @@ -118,7 +118,7 @@ be running on any laptop, Birdvisu might also be a helpful tool for admins in the field trying to fix broken infrastructure. The primary motivation for implementing Birdvisu was the need of the author, -who has a dynamically routed system that switches uplinks depending on which +who has a dynamically routed system that switches uplinks depending on which of them currently work. % Pls don't tell the dorms :-) \section{Goals of Birdvisu} diff --git a/en/chap02.tex b/en/chap02.tex index 00a9c66..f7391cf 100644 --- a/en/chap02.tex +++ b/en/chap02.tex @@ -1,7 +1,7 @@ \chapter{Analysis}\label{ch:analysis} In order to avoid as many problems as possible when visualising and -analysing a OSPF topology, we need to understand several aspects of networking, +analysing an OSPF topology, we need to understand several aspects of networking, the OSPF protocol and its implementation in BIRD. However, the aim of this thesis is not to be a complete guide to OSPF nor BIRD, therefore, for the sake of brevity, we skip many details which do not influence our project. @@ -15,7 +15,7 @@ BIRD routing daemon. OSPF~\cite{rfc2328,rfc5340} is a link-state routing protocol, which means that routers try to understand the whole topology of the network and find the best path using an algorithm for -finding the shortest paths. Usually, Dijkstra's algorithm \X{ref?} is used. +finding the shortest paths. Usually, Dijkstra's algorithm is used. OSPF was designed to provide dynamic routing in an entire autonomous system, but running it on a much smaller scale is also possible. 
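For illustration only, the shortest-path step mentioned in the hunk above can be sketched with a minimal version of Dijkstra's algorithm. The adjacency-dictionary layout, the function name `shortest_distances` and the router names are assumptions made for this example; real OSPF implementations additionally handle areas, external routes and equal-cost paths.

```python
import heapq


def shortest_distances(graph, source):
    """Return the cheapest cost from source to every reachable vertex.

    graph maps a vertex to a dict {neighbour: non-negative edge cost}.
    This layout is hypothetical and chosen only for the example.
    """
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, vertex = heapq.heappop(queue)
        if cost > dist[vertex]:
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, weight in graph.get(vertex, {}).items():
            candidate = cost + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return dist


# Hypothetical three-router topology with directed link costs.
example_graph = {
    "r1": {"r2": 10, "r3": 1},
    "r3": {"r2": 2},
    "r2": {},
}
distances = shortest_distances(example_graph, "r1")
```

In this sketch the direct link from `r1` to `r2` is more expensive than the detour through `r3`, so the detour determines the distance to `r2`.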
@@ -25,7 +25,7 @@ contain information about which network segments are incident to which router on which interfaces and with which routers it is possible to communicate on those interfaces. To distinguish between routers, each router is assigned a 32-bit number called \emph{Router ID} by the network administrator. -Router IDs are usually written in a quad-dotted notation \X{glos}. +Router IDs are usually written in a quad-dotted notation. The LSAs are flooded throughout the system in order to let all the routers know about the current topology. To avoid overloading the system just with this @@ -43,7 +43,7 @@ The area 0.0.0.0 is called the \emph{backbone area} and all other areas must be incident to it, so that it has all the routing information. When this is impractical, OSPF allows two routers to be connected using a \emph{virtual link}. From the routing perspective, this is a point-to-point connection -between the routers in the backbone area. which allows to forward LSAs through +between the routers in the backbone area, which allows forwarding LSAs through other areas even when these LSAs would not normally leave those areas. There are several types of networks that emerge in OSPF topologies. The @@ -77,48 +77,48 @@ implemented). There are no edges starting at the external, extra-area or stub networks, so that the shortest path DAG calculation does not find paths through them. -The cost of the edge to an external network can be of two types. Type 1 cost is -specified in the same units as the internal costs, Type 2 cost is larger than -any internal or type 1 cost. +The cost of the edge to an external network can be of two types. A type~1 cost +uses the same units as the internal costs, whereas any type~2 cost is larger than +all internal or type~1 costs. The OSPF family of routing protocols has undergone long evolution since the -first specification in 1989~\cite{rfc1131}, There are currently two versions of +first specification in 1989~\cite{rfc1131}. 
There are currently two versions of the protocol in use -- versions 2 and 3. While the basic idea is still the same, OSPFv2 can only handle IPv4 systems. Although OSPFv3 claims to be -network-protocol-independent, it is usually only used with IPv6 systems and in +network-protocol-independent, it is usually only used with IPv6 systems and, in fact, features like virtual links can only be used with that network protocol~\cite{rfc5838}. Both OSPF versions have numerous extensions, as can be seen by the number of RFCs that update the base specifications~\cite{rfc2328,rfc5340}. Therefore, we do not implement the protocol ourself, but rather find a suitable routing -daemon \X{glos} to determine the current topology. +daemon to determine the current topology for us. Many improvements of the protocol only affect the topology construction (e.g. NSSA areas~\cite{rfc3101}) or change the data exchange between routers -(Multi-instance extensions~\cite{rfc6549}, authentication~\cite{rfc5709}, \dots). +(multi-instance extensions~\cite{rfc6549}, authentication~\cite{rfc5709}, \dots). By extracting the topology from a routing daemon, we can support many OSPF extensions for free. For this reason, it is mostly sufficient to only consider the base specifications of OSPF. \section{Routing daemon selection} -While we were mostly determined to use BIRD~\cite{bird} from the start, since we already +While we were mostly determined to use BIRD~\cite{bird} from the start, because we already had some experience with it, let us present here a short summary of other possibilities. Note that the particular choice does not affect interoperability with other routers as long as the chosen routing daemon supports extensions used in the network system. There are several implementations of both versions of the OSPF protocol. -However, many of them are tied to the specific router hardware, which makes it -impractical to connect to a graphical visualisation. 
Moreover, even evaulating +However, many of them are tied to the specific router hardware, which makes them +impractical to connect to a graphical visualisation. Moreover, even evaluating the feasibility would require us to obtain the specific hardware. Therefore, we only consider hardware-independent solutions. While we are aware of several software implementations, many of these do not seem to be developed anymore (Quagga~\cite{quagga}, XORP~\cite{xorp}, OpenOSPFd~\cite{openospfd}). Apart from BIRD, we only found FRRouting~\cite{frr} -to be maintained, meaning that it had a release in the past year. While being +to be maintained, meaning that a new version has been released in the past year. While being maintained is not a strict requirement, it would allow us to use that implementation in case OSPF is extended again. @@ -132,15 +132,15 @@ line-based protocol slightly resembling SMTP. The client may send commands to the daemon, which provides responses. The response may be long and possibly formatted into a table. This interface is primarily aimed at human users, so a rather simple client, \texttt{birdc}, is provided in the BIRD's -package\X{glos?}. +package. While there is a note of a machine-readable protocol in the -\texttt{doc/roadmap.md} file in BIRD's source code~\cite{bird-src}, it is not +\texttt{doc/roadmap.md} file in BIRD's source code~\cite{bird-src}, it has not been implemented, so we will need to interface using the socket. This has following consequences, most of which are not very pleasant: \begin{itemize} -\item The responses to different formats often have completely different +\item The responses to different commands often have completely different formats. This necessitates creating a dedicated parsing routines for each kind of command we want to use. \item There is no guarantee that the output will not change between versions. 
@@ -156,20 +156,20 @@ BIRD provides only a few commands that deal with OSPF: \begin{itemize} \item \texttt{show ospf} shows a simple summary of the running instances of - OSPF, like which areas are they in or how many LSAs does BIRD currently + OSPF, like which areas are they incident to or how many LSAs does BIRD currently consider. -\item \texttt{show ospf interface} describes a current status of the individual +\item \texttt{show ospf interface} describes the current status of the individual local interfaces: their configuration, designated routers for the incident network, etc. \item \texttt{show ospf neighbors} provides details about the state of - communication with adjacent routers. -\item \texttt{show ospf lsadb} returns details about known LSAs. Unfortunately, + communication with neighbour routers. +\item \texttt{show ospf lsadb} returns the details about known LSAs. Unfortunately, this contains low-level information like checksums and sequence numbers, but not details about networks or routers. -\item \texttt{show ospf state} shows an overal view of the ospf graph +\item \texttt{show ospf state} shows an overall view of the OSPF graph representation: present routers and networks, costs of links, distances, \dots \item \texttt{show ospf topology} seems to only provide a subset of the output - of \texttt{show ospf state}. For example, it does not provide info about + of \texttt{show ospf state}. For example, it does not provide information about any non-transit networks. \end{itemize} @@ -184,14 +184,14 @@ Let us look in depth at the \texttt{show ospf state} command, since we will be using it and the format of its output a lot. The command has two optional parameters. First, the flag \texttt{all} may be -added to show details not only the reachable part of the system, but from all +added to show details not only about the reachable part of the system, but from all the known and non-expired LSAs. 
The difference between the topologies can be used to discover network problems even without configuring the expected state. The second parameter is a name of the OSPF instance. It is only required when BIRD is running multiple instances simultaneously. This is unfortunately quite -common, because in dual-stack\X{glos} systems there needs to be a separate -instance of OSPF configured for each IP version. +common, because in dual-stack systems there need to be two separate +instances of OSPF, each configured for a different IP version. The output of the command is a tree of lines representing the topology itself. Children of a directive are indented by one more tab. An example output is @@ -223,7 +223,7 @@ area 0.0.0.1 The tree as output by BIRD\footnote{The format was determined by experimentation and inspecting of \texttt{proto/ospf/ospf.c} in BIRD's source code~\cite{bird-src}.} has three levels, we call them top-level, level-2 -and level-3. The top level only contains directives of form \texttt{area +and level-3. The top level only contains directives of the form \texttt{area AreaID}, with the AreaID being written in the quad-dotted notation. On level-2 are mentioned all the routers and networks in the area. This is @@ -240,7 +240,7 @@ There is also a level-3 line for each incident host and network. Overview of the \uv{tags} (the first words) and parameters is provided by table~\ref{tab:ospf-incidences}. Note that networks can only be incident to routers, while routers may be incident to anything. The incidences of routers -have also a metric, after the word \texttt{metric}, or, in case of Type 2 +have also a metric, after the word \texttt{metric}, or, in case of type~2 external cost, \texttt{metric2}. \begin{table}[h] @@ -278,7 +278,7 @@ incidence line for object B in a level-2 block of object A, there exists an edge from A to B in the topology used by the Dijkstra's algorithm. This fact will later simplify parsing. 
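As a minimal sketch of how the tab-indented tree described above could be read, the following Python function turns such text into nested `(line, children)` pairs without interpreting the directives. The function name `parse_tree` and the sample input are hypothetical; they are not taken from Birdvisu or BIRD, and the sample only mimics the three levels (area, router or network, incidence) with invented values.

```python
def parse_tree(text):
    """Parse tab-indented lines into a list of (line, children) pairs.

    Every level of nesting is expected to add exactly one leading tab.
    """
    root = []
    # stack[d] is the list that lines at depth d are appended to.
    stack = [root]
    for raw in text.splitlines():
        if not raw.strip():
            continue  # blank lines carry no information
        stripped = raw.lstrip("\t")
        depth = len(raw) - len(stripped)
        if depth >= len(stack):
            raise ValueError(f"unexpected indentation: {raw!r}")
        del stack[depth + 1:]  # leave any deeper, now finished, blocks
        children = []
        stack[depth].append((stripped.rstrip(), children))
        stack.append(children)
    return root


# Invented sample that only imitates the shape of the output.
SAMPLE = (
    "area 0.0.0.0\n"
    "\n"
    "\trouter 192.0.2.1\n"
    "\t\tdistance 0\n"
    "\t\tnetwork 198.51.100.0/24 metric 10\n"
    "\tnetwork 198.51.100.0/24\n"
    "\t\trouter 192.0.2.1\n"
)
tree = parse_tree(SAMPLE)
```

Keeping the result as a plain tree of strings defers all interpretation of tags such as `router` or `metric` to a later step, so the same reader can serve any file that uses this indentation scheme.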
-\section{Test network system: Gennet} +\section{Test network system: Gennet}\label{s:gennet} To help test Birdvisu and understand network behaviour, we created a simple set of scripts called Gennet. Sice it was mainly written to aid Birdvisu, we @@ -287,7 +287,7 @@ provide it as attachment~\ref{att:gennet} of this thesis. Gennet is a network generator. Using a hard-coded configuration and a set of Jinja2~\cite{jinja2} templates, it provides a semi-automatic way of creating several virtual machines (their disk images and startup scripts) and -configuration to connect them using software bridges\X{term?}. This will allow +configuration to connect them using software bridges. This will allow changing the state from the host operating system, simulating various network conditions. @@ -295,7 +295,7 @@ We fully admit that Gennet is really just a quick hack. However, since it was created specifically to aid the development of Birdvisu and because it provides a reproducible environment, we think it makes sense to attach it to this thesis. The particular choice of technologies (Jinja2, Python, Bash, QEMU and Alpine -Linux)\X{refs!} is driven solely by our previous experience with them and +Linux) is driven solely by our previous experience with them and should not affect the behaviour of the generated system in any way. %Gennet is used as follows: the user must first generate a base Linux image, @@ -351,7 +351,7 @@ We focus on behaviour of BIRD in following scenarios: \item Network split: Hosts in the same network stop being able to communicate with some other host in the same network. This is often caused by a broken cable or switch malfunction. - \item Multiple links to same networks: It might make sense to connect a + \item Multiple links to the same network: It might make sense to connect a single router to the same network using multiple links, possibly with different costs. This provides redundancy and helps e.g. avoid network splits. 
@@ -368,8 +368,8 @@ depending on the version of OSPF. Network splits are of particular interest to administrators. Not only are they symptoms of a broken part of the infrastructure, but also every netsplit inherently means that some addresses from the split network can not be reached -by some hosts, because there is no hint, to which segment a packet -should be delivered. +by some hosts, because there is no hint which segment a packet +should be delivered to. Splits can also be tricky to spot from systems using topology-unaware approaches, because if a link connecting two switches fails, all ports on hosts are still diff --git a/en/chap03.tex b/en/chap03.tex index f256574..8eaa0c0 100644 --- a/en/chap03.tex +++ b/en/chap03.tex @@ -4,12 +4,12 @@ We now explain the design of Birdvisu in depth. First, we explain some important decisions and present the overall structure of the project, then we look into individual parts of the program. -Birdvisu is implemented in Python, using PySide6, the official bindings for -Qt6, for drawing on screen. We decided to use Qt, because it provides a lot of +Birdvisu is implemented in Python~\cite{python}, using PySide6, the official bindings for +Qt6~\cite{qt6}, for drawing on screen. We decided to use Qt, because it provides a lot of pre-made widgets and tools and since it is widely used, it is easy to find help for it on the Internet. The decision to use Python was not hard either. Not only Qt has official bindings for it, but we use the language very often and -thus are comfortable writing in it. We do not expect the potential slowness of +thus are comfortable writing software in it. We do not expect the potential slowness of Python to be an issue, because for handling graphics we are using Qt, which is written in C++. Also, as we have analysed in section~\ref{s:areas}, we expect the topologies to be quite small. @@ -37,7 +37,7 @@ This can be illustrated on our usage of vertices in topology. 
There are two objects: a VertexID, and the Vertex itself. VertexID is the hashable part and Vertex provides additional information, like incident edges, which are not hashable. The topology then has a dictionary from the VertexIDs to Vertices, -providing complete data. +providing the complete data. However, the VertexID already contains information like what version of IP it belongs in, whether it represents a router and all the possible IP addresses @@ -46,16 +46,17 @@ only contain sets of edges and references to the related topology and VertexID. (In the next section, we will see that a type of the vertex is also stored in Vertex, but that is really everything.) -The other thing we decided to reuse, was the format of BIRD's topology output. -We call the format \uv{ospffile} and extended it by allowing comments (after an +The other thing we decided to reuse was the format of BIRD's topology output. +We call the format \uv{ospffile} and have extended it by allowing comments (after an octothorpe, i.e. \verb|#|). Also, empty lines do not seem to be of relevance. These are quality-of-life improvements for cases when ospffiles are edited by hand. +\clubpenalty=1000 Apart from storing topologies, we intend to use ospffiles for description of basic styles. Therefore, our implementation in \verb|birdvisu.ospffile| only constructs the tree of strings and does not try to understand it. Our module -provides API similar to the one of \verb|json| or \verb|marshall| module, even +provides an API similar to the one of \verb|json| or \verb|marshal| modules, even though it cannot represent arbitrary types. \section{Data collection: providers and parsing} @@ -143,7 +144,7 @@ actual Vertices across multiple Topologies. Apart from VertexIDs, the TopologyV3 also consists of the additional data in Vertex objects and Edges. 
The Vertex objects, as noted above, contain only a set of incoming and outgoing edges, references to their TopologyV3 and VertexID -objects and the actual type of the object the vertex represents (i.e. the first +objects and the actual type of the object the vertex represents (i.e.\ the first column of the table). An Edge knows the source and target VertexID, its cost and the number of @@ -165,7 +166,7 @@ Frozen Topologies also allow us to stack them, creating a union of the original Topologies. This way, a single Topology can be used in the visualisation, while keeping the original information. This mechanism is fully generic, but was mainly invented to allow merging the reference (expected) topology with the -actual one (i.e. the current state of the system). The ancestors are stored by +actual one (i.e.\ the current state of the system). The ancestors are stored by a string label in a dictionary of the Topology. While subclassing TopologyV3 into a StackedTopology would probably be a cleaner design, since the only difference is a state of one dictionary, we did not employ this approach. @@ -211,7 +212,7 @@ Annotations do not need to take other Annotations into account, because AnnotatedTopology stores Annotations from different Annotators separately. The Annotators are a tiny bit more interesting. While these objects are -basically a wrapper around the \verb|annotate()| method, which takes an +basically wrappers around the \verb|annotate()| method, which takes an AnnotatedTopology and returns an Annotation, there are few twists to it. First, an Annotator object is intended to be created by the respective @@ -272,11 +273,14 @@ strategy might be employed to tame the memory usage. Also, the Annotators could be run dynamically when the Annotation is requested, but our current approach does not need this functionality, so it is not implemented at the moment. 
+\newpage + \section{Visualisation} The visualisation is split into two parts: computing the appearance and actually showing the result. For the former we reuse the Annotator infrastructure. The latter is handled by Qt's Graphics view framework. +\widowpenalty=10000 The appearance is described by a styling dictionary. For vertices, it contains a position and a highlighting colour. Edges can have a colour, line width and a @@ -289,7 +293,7 @@ they only tag vertices and edges with styling dictionaries. This provides something similar to an interface, helping to uncouple the style from the specific Annotator that provided the respective data. Each Annotator which provides data worth showing has a companion StyleAnnotator to provide the -respective style. When drawing, we pick one StyleAnnotator and highlight data +respective style. When drawing, we pick one StyleAnnotator and highlight the graph according to it. The current approach avoids mixing styles from multiple Annotators, which might @@ -393,7 +397,7 @@ used, because it is always cheaper to use the light edge. (We are aware that this is not true for asymmetrically configured point-to-point links, but we do not think they are commonly deployed.) -Since we will be able to use edges in sets, we need a canonical hashable +Since we want to be able to use edges in sets, we need a canonical hashable representation. For that, we implement a total ordering on VertexIDs, which allows us to use pairs of the VertexIDs in ascending order to reference the edge. There are currently no specific requirements for the ordering to satisfy. diff --git a/en/chap04.tex b/en/chap04.tex index 6af867d..5109e0a 100644 --- a/en/chap04.tex +++ b/en/chap04.tex @@ -18,7 +18,7 @@ future improvements. The only strict dependency of Birdvisu is Python3.10 or newer. While the project depends on the PySide6 library for the user interface, it can be downloaded -as a wheel\X{glos} from PyPI. 
Of course, using a system-wide installation of +as a wheel from PyPI. Of course, using a system-wide installation of PySide6 is also possible. Birdvisu can read the topology information from files, so it is not necessary @@ -29,8 +29,9 @@ to run on other types of operating systems, since we have only used cross-platform libraries. There are several ways to start the program. The recommended method is to -install the project into a Python virtual environment, but running the -\verb|visu.py| script also works when PySide6 is installed in the system.\X{uuuuuuuuuuu!!} +install the project into a Python virtual environment and use the \verb|visu| +command. Running the \verb|visu.py| script also works when PySide6 is installed +in the system, without requiring the installation step. Once the program is started, the user is presented with an empty canvas. Using the \emph{Topology} menu, it is possible to load both a reference and current diff --git a/en/chap05.tex b/en/chap05.tex index 4712186..fb28048 100644 --- a/en/chap05.tex +++ b/en/chap05.tex @@ -34,7 +34,7 @@ Naturally, Gennet can be connected to the home network. At that point, our approach to laying out vertices starts feeling suboptimal, because edges cross unnecessarily often. The connections are clear, however, and this can be alleviated by using a fixed layout in a file. In the future, this could be -addressed by using some force-based approach for the automatic layout. +addressed by using a force-based approach for the automatic layout. \section{Department of Applied Mathematics} @@ -43,10 +43,10 @@ Again, the main purpose is to address containers and virtual machines in the system. The topology consists of 5 routers and about 27 networks, most of which are stub. -Again, the main issue is the automatic vertex layout. Also, since most of the +As in the previous case, the main issue is the automatic vertex layout. 
Also, since most of the networks are stub (and often only contain a single host), the graph could position these networks near each other, or even collapse them into one vertex. -Birdvisu does not unfortunately support that at the moment. +Birdvisu does not unfortunately support this at the moment. \section{Czela.net} @@ -79,7 +79,7 @@ We did not notice any performance issues when dealing with the topology. Unfortunately, we did not have the opportunity to test in a large network (the Czela.net's one is the largest we tested), so we do not really know the limits -of capabilities. Those are even hard to guess, because while we are trying to +of Birdvisu. Those are even hard to guess, because while we are trying to use rather fast algorithms, they are implemented in Python, which can sometimes be quite slow, and more importantly, does not support threads well. The heavy use of hash tables and indirection can also impair performance. diff --git a/en/intro.tex b/en/intro.tex index 0f8cbc5..978e8e8 100644 --- a/en/intro.tex +++ b/en/intro.tex @@ -36,11 +36,11 @@ shared network, and conversely, that two networks are neighbours when they share a router. We will use the word \emph{network system} (or just \emph{system}) to describe a -set of network managed by a single administrator, which is intended to forward +set of networks managed by a single administrator, which is intended to forward packets across it. This can mean the whole autonomous system\footnote{This is where we borrowed the word \uv{system} from.}, but often this is a much smaller part. When we speak about routing using OSPF, a system is only the set of -networks that run a single instance of OSPF. +networks that run a single instance of the OSPF protocol. By the term \emph{IP} we mean the Internet Protocol of either version 4 or 6. When it is important to distinguish, we explicitly write \emph{IPv4} or @@ -104,11 +104,11 @@ Figure~\ref{fig:basic-icons}. 
\addcontentsline{toc}{section}{Structure of the thesis} In the \hyperref[ch:motivation]{first} chapter, we explore various approaches of visualising and -monitoring a network system. Then, in chapter \ref{ch:analysis}, we understand the +monitoring a network system. Then, in chapter~\ref{ch:analysis}, we understand the behaviour of relevant networking technologies. From that, we derive a design -for Bidrvisu in chapter \ref{ch:design}. Chapter \ref{ch:usage} explains usage -of the program. At the end, we demonstrate how the project works in network -systems of various sizes. +for Birdvisu in chapter~\ref{ch:design}. Chapter~\ref{ch:usage} explains the usage +of the program. At the \hyperref[ch:evaluation]{end}, we discuss how the +project copes with topologies of network systems of various sizes. %- motivation % -tgt audience of birdvisu diff --git a/en/thesis.tex b/en/thesis.tex index 4d59f91..bf8778d 100644 --- a/en/thesis.tex +++ b/en/thesis.tex @@ -24,8 +24,50 @@ \include{glossary} \chapwithtoc{List of Abbreviations}\label{ch:abbrs} -\XXX{In mathematical theses, it could be better to move the list of abbreviations to the beginning of the thesis.} -\XXX{TODO} + +\bgroup +\parindent=0pt +\begin{small} + +ABR -- Area border router + +API -- Application programming interface + +AS -- Autonomous system + +BIRD -- BIRD Internet Routing Daemon -- BIRD Internet Routing Daemon Inter\dots + +CIDR -- Classless inter-domain routing + +DAG -- Directed acyclic graph + +DR -- Designated router + +GUI -- Graphical user interface + +IP -- Internet protocol + +ISP -- Internet service provider + +LSA -- Link state advertisement + +OSPF -- Open shortest path first + +PTP -- Point-to-point + +PyPI -- Python package index + +RFC -- Request for comments + +SMTP -- Simple mail transfer protocol + +UI -- User interface + +UNIX is not an abbreviation. + +\end{small} +\egroup + \appendix \chapter{Attachments}