From ea67b53df08320008468ad723cbe23a2797547ab Mon Sep 17 00:00:00 2001 From: Pavel 'LEdoian' Turinsky Date: Tue, 18 Jul 2023 16:12:04 +0200 Subject: [PATCH] Write some ch3 lol --- en/chap03.tex | 167 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 162 insertions(+), 5 deletions(-) diff --git a/en/chap03.tex b/en/chap03.tex index 2e01327..34c1756 100644 --- a/en/chap03.tex +++ b/en/chap03.tex @@ -24,18 +24,175 @@ topology on the screen, highlighting the relevant information. \section{Recurring and general patterns} -\XXX{dictionaries everywhere, hashable recipes. ospffile with comments as a reusable format. Format of VertexID} +Birdvisu's data structures make heavy use of dictionaries and sets, because we +do not handle much data that would need to be processed in any particular +order. While this allows us to perform set operations quickly, it requires us +to provide hashable keys. + +We have decided to embrace this requirement and use rather complex frozen +dataclasses, which can hold as much of the required data as possible, as long +as we can re-create that data. + +This can be illustrated by our use of vertices in a topology. There are two +objects: a VertexID, and the Vertex itself. VertexID is the hashable part and +Vertex provides additional information, like incident edges, which are not +hashable. The topology then has a dictionary from the VertexIDs to Vertices, +providing complete data. + +However, the VertexID already contains information like which version of IP it +belongs to, whether it represents a router, and all the possible IP addresses +and identifiers related to the vertex. It is sufficient for Vertex objects to +only contain sets of edges and references to the related topology and VertexID. +(In the next section, we will see that the type of the vertex is also stored in +Vertex, but that is really everything.) + +The other thing we decided to reuse was the format of BIRD's topology output.
+We call the format \uv{ospffile} and have extended it by allowing comments (after an +octothorpe, i.e. \verb|#|). Also, empty lines do not seem to be of relevance. +These are quality-of-life improvements for cases when ospffiles are edited by +hand. + +Apart from storing topologies, we intend to use ospffiles for the description of +basic styles. Therefore, our implementation in \verb|birdvisu.ospffile| only +constructs the tree of strings and does not try to understand it. Our module +provides an API similar to that of the \verb|json| or \verb|marshal| modules, even +though it cannot represent arbitrary types. \section{Data collection: providers and parsing} -\XXX{sub-parts, why a topology is not a graph, stacking topologies with -multiedges, fake-freezing, why is everything static. Graph representation, selection of BIRD's -instance} +This part of the project deals with processing topologies. The core object of +this part is a TopologyV3\footnote{The \uv{V3} suffix is sometimes impractical +to keep, so we will sometimes shorten the class name only to \uv{Topology}. It +denotes the same object.}. While the Topologies can be created manually by +adding the vertices and edges, we expect topologies to be retrieved from other +sources like saved ospffiles or running BIRD processes. This is made possible +by implementing a TopologyProvider. + +Representing a topology turns out to be a somewhat complicated problem for the following reasons: +\begin{itemize} + \item The topology edges need to be directed. OSPF allows a shortest path + from A to B to differ from the one in the other direction. + \item It can have a very general shape, so we cannot rely on common + patterns. For example, routers can be connected to other routers using + point-to-point or virtual links, not just networks. + \item The objects are shape-shifting. A transit network may become stub or + change the designated router and we want to be able to understand the + change as best as possible.
+ \item The topology is not necessarily a graph, because multiple links may + lead from a single router to the same network. However, we strongly + believe that the maximum number of parallel edges is quite low, so most + of the theory for simple graphs is still applicable. + \item For completeness, we note here again that the shortest paths from a + single vertex form a DAG, even though the OSPF specifications speak of + them as trees. (Negative edges are, fortunately, not permitted.) +\end{itemize} + +Given the above requirements and lessons learned in +section~\ref{s:net-unusual}, we need to find a representation of vertices that +is both powerful enough to uniquely describe a particular vertex and flexible +enough to allow us to easily detect its metamorphoses. Table~\ref{tab:vertexid} +shows which information we can use for each type of object. We see that +networks in particular are hard to represent, because the ID of the DR may +change and it might be the only distinguishing property in case of a split +network. + +\bgroup +\def\yes{\checkmark} +\def\chg{$\bullet$} +\begin{table}[h] + \centering + \begin{tabular}{cccccc}\hline + Object & Address & RID & DR ID & IF ID & Notes \\\hline + \verb|router| & -- & \yes & -- & -- &\\ + \verb|xrouter| & -- & \yes & -- & -- &\\ + \verb|vlink| & -- & \yes & -- & -- & Peer is a \verb|router|\\ + \verb|network| & v2:\yes,v3:$*$ & -- & \chg & v3:\chg &\\ + \verb|external| & \yes & -- & -- & -- &\\ + \verb|xnetwork| & \yes & -- & -- & -- &\\ + \verb|stubnet| & \yes & \yes & -- & -- &\\ + \end{tabular} + \caption{Information determining each object of a topology. $*$ means it + may or may not be known, \chg\ denotes an attribute that may change.
Columns in + order: whether it has an assigned address, the relevant router ID, the ID of the designated router, the interface number of the DR.} + \label{tab:vertexid} +\end{table} +\egroup + +We decided to aim for correctness, so whenever any of the attributes of an object +changes, we consider it to be a separate object. This may create some false +positives, but we think that is better than potential false negatives, +which could hide some issues. Also, when the infrastructure works correctly, +the designated router should only change in case of an outage. Therefore, it might +actually be useful to notify the user when a network has an unexpected +designated router even when it is otherwise healthy. However, we provide a way +to find objects by partial information, using the VertexFinder objects, which +allows heuristics to match different objects. + +The information mentioned in table~\ref{tab:vertexid} serves as the main part +of the VertexID. However, we want the VertexID to identify the same object even +after it transforms to another kind of object, so instead of using the object +type, we only note whether the object is a router or a network, since this +property stays the same even for changed objects. The code is also oblivious to +the fact that the interface ID is a number and what it means -- we use it as an +opaque \uv{discriminator} and do not even bother with parsing it from a string. + +The VertexIDs are supposed to be descriptors of objective vertex state, so they +do not belong to any particular TopologyV3. Instead, they can be used to track +actual Vertices across multiple Topologies. + +Apart from VertexIDs, the TopologyV3 also consists of the additional data in +Vertex objects and Edges. The Vertex objects, as noted above, contain only a +set of incoming and outgoing edges, references to their TopologyV3 and VertexID +objects and the actual type of the object the vertex represents (i.e. the first +column of the table).
+ +An Edge knows the source and target VertexID, its cost and the number of +parallel edges with these properties. If the Edge was determined by a virtual +link, it is marked as virtual. This is needed because both Vertices are +regular routers, so the information about the virtual link cannot be stored in +them. Note that an Edge does not need to belong to any Topology, since it only +contains factual data. The information about whether an Edge is in the topology is +stored only in the incident Vertices. + +A Topology can be marked as \uv{frozen}. This denotes an intent that it really +should not be modified, because other code might rely on the particular shape +of the Topology. However, making the Topology truly immutable would be +impractical in Python, so we opted for this approach. In case our solution +turns out to be prone to accidental modification of the Topology, we will +deploy additional countermeasures against that. + +Frozen Topologies also allow us to stack them, creating a union of the original +Topologies. This way, a single Topology can be used in the visualisation, while +keeping the original information. This mechanism is fully generic, but was +mainly invented to allow merging the reference (expected) topology with the +actual one (i.e. the current state of the system). The ancestors are stored by +a string label in a dictionary of the Topology. While subclassing TopologyV3 +into a StackedTopology would probably be a cleaner design, since the only +difference is the state of one dictionary, we did not employ this approach. + +The TopologyProviders are not very interesting, but are important nevertheless. +There are a few caveats with parsing topologies from the ospffile format.
+First, the edges from routers to networks can only be resolved after the +networks are known, since the network's level-2 block contains information not +present in the level-3 directive for the router (namely, the designated router +for OSPFv2 networks and the set of addresses for OSPFv3). + +Since BIRD may be running more than one instance of OSPF, the +BirdSocketTopologyProvider contains an ad-hoc parser of the response to the +\texttt{show protocols} command, which seems to be a reliable way to list +running instances. + +Moreover, BIRD does not seem to expose any way to determine the version of +OSPF. So far, we think it is sufficient to guess from the \texttt{network} +directives, since they seem to contain a hyphen if and only if the dump is from +an OSPFv3 instance. (The source code of BIRD suggests that under some +circumstances, brackets can appear even in an OSPFv2 dump, so relying on them is not a +possibility.) \section{Annotations} \XXX{scoping, annotator creation, advantages of storing data in annotations and -not vertices. Annotator protocol and posibility of export. Various uses of annotators: enhancing, analysis, visualisation} +not vertices. Annotator protocol and possibility of export. Various uses of annotators: enhancing, analysis, visualisation. heapq} \section{Visualisation}