Help us improve your experience.

Let us know what you think.

Do you have time for a two-minute survey?

 
 

Graph

Graph Overview

Apstra uses the Graph model to represent a single source of truth regarding infrastructure, policies, constraints etc. This Graph model is subject to constant change and we can query it for various reasons. It is represented as a graph. All information about the network is modeled as nodes and relationships between them.

Every object in a graph has a unique ID. Nodes have a type (a string) and a set of additional properties based on a particular type. For example, all switches in our system are represented by nodes of type system and can have a property role which determines which role in the network it is assigned (spine/leaf/server). Physical and logical switch ports are represented by an interface node, which also has a property called if_type.

Relationships between different nodes are represented as graph edges which we call relationships. Relationships are directed, meaning each relationship has a source node and a target node. Relationships also have a type which determines which additional properties particular relationship can have. E.g. system nodes have relationships of type hosted_interfaces towards interface nodes.

A set of possible node and relationship types is determined by a graph schema. The schema defines which properties nodes and relationships of particular type can have along with types of those properties (string/integer/boolean/etc) and constraints. We use and maintain an open source schema library, Lollipop, that allows flexible customization of value types.

Going back to the graph representing a single source of truth, one of the most challenging aspects was how to reason about it in the presence of change, coming from both the operator and the managed system. In order to support this we developed what we call Live Query mechanism which has three essential components:

  • Query Specification
  • Change Notification
  • Notification Processing

Having modeled our domain model as a graph, you can find particular patterns (subgraphs) in a graph by running searches on the graph specified by graph queries. The language to express the query is conceptually based on Gremlin, an open source graph traversal language. We also have parsers for queries expressed in another language - Cypher, which is a query language used by popular graph database neo4j.

Query Specification

You start with a node() and then keep chaining method calls, alternating between matching relationships and nodes:

The query above translated in english reads something like: starting from a node of type system, traverse any outgoing relationship that reaches node of type interface, and from that node traverse all outgoing relationship that lead to node of type `link.

At any point you can add extra constraints:

Notice role=`spine` argument, it will select only system nodes that have role property set to spine.

Same with if_type property for interface nodes.

That query will select all system nodes that have role either spine or leaf and interface nodes that have if_type anything but ip (ne means not equal).

You can also add cross-object conditions which can be arbitrary Python functions:

You refer to objects by giving them names and using those names as argument names for your constraint function (of course you can override that but it makes a convenient default behavior). So, in example above it will take two interface nodes named if1 and if2, pass them into given where function and filter out those paths, for which function returns False. Don't worry about where you place your constraint: it will be applied during search as soon as all objects referenced by constraint are available.

Now, you have a single path, you can use it to do searches. However, sometimes you might want to have a query more complex than a single path. To support that, query DSL allows you to define multiple paths in the same query, separated by comma(s):

This match() function creates a grouping of paths. All objects that share the same name in different paths will actually be referring to the same object. Also, match() allows adding more constraints on objects with where(). You can do a distinct search on particular objects and it will ensure that each combination of values is seen only once in results:

This matches a chain of a -> b -> c nodes. If two nodes a and c are connected through more than one node of type b, the result will still contain only one (a, c) pair.

There is another convenient pattern to use when writing queries: you separate your structure from your criteria:

Query engine will optimize that query into:

No cartesian product, no unnecessary steps.

Change Notification

Ok, now you have a graph query defined. What does a notification result look like? Each result will be a dictionary mapping a name that you have defined for a query object to object found. E.g. for following query

results will look like {'a': <node type='a'>, 'c': <node type='c'>}. Notice, only named objects are present (there is no <node type='b'> in results, although that node is present in query because it does not have a name).

You register a query to be monitored and a callback to execute if something will change. Later, if someone will modify the graph being monitored, it will detect that new graph updates caused new query results to appear, or old results to disappear or update. The response executes the callback that is associated with the query. The callback receives the whole path from the query as a response, and a specific action (added/updated/removed) to execute.

Notification Processing

When the result is passed to the processing (callback) function, from there you can specify reasoning logic. This could really be anything, from generating logs, errors, to rendering configurations, or running semantic validations. You could also modify the graph itself, using graph APIs and some other piece of logic may react to changes you made. This way, you can enforce the graph as a single source of truth while it also serves as a logical communication channel between pieces of your application logic. The Graph API consists of three parts:

Graph management - methods to add/update/remove stuff in a graph. add_node(), set_node(), del_node(), get_node()add_relationship(), set_relationship(), del_relationship(), get_relationship(), commit() Query get_nodes()get_relationships() Observable interface add_observer(),remove_observer()

Graph management APIs are self-explanatory. add_node() creates new node, set_node() updates properties of existing node, and del_node() deletes a node.

commit() is used to signal that all updates to the graph are complete and they can be propagated to all listeners.

Relationships have similar API.

The observable interface allows you to add/remove observers - objects that implement notification a callback interface. Notification callback consists of three methods:

  • on_node() - called when any node/relationship is added, removed or updated
  • on_relationship() - called when any node/relationship is added, removed or updated
  • on_graph() - called when the graph is committed

The Query API is the heart of our graph API and is what powers all searching. Both get_nodes() and get_relationships() allow you to search for corresponding objects in a graph. Arguments to those functions are constraints on searched objects.

E.g. get_nodes() returns you all nodes in a graph, get_nodes(type='system') returns you all system nodes, get_nodes(type='system', role='spine') allows you to constrain returned nodes to those having particular property values. Values for each argument could be either a plain value or a special property matcher object. If the value is a plain value, the corresponding result object should have its property equal to the given plain value. Property matchers allow you to express a more complex criterias, e.g. not equal, less than, one of given values and so on:

Note:

The example below is for directly using Graph python. For demonstration purposes, you can replace graph.get_nodes with node in the Graph explorer. This specific example will not work on the Apstra web interface.

In your graph schema you can define custom indexes for particular node/relationship types and the methods get_nodes() and get_relationships() pick the best index for each particular combination of constraints passed to minimize search time.

Results of get_nodes()/get_relationships() are special iterator objects. You can iterate over them and they will yield all found graph objects. You can also use APIs that those iterators provide to navigate those result sets. E.g. get_nodes() returns you a NodeIterator object which has methods out() and in_(). You can use those to get an iterator over all outgoing or incoming relationship from each node in the original result set. Then, you can use those to get nodes on the other end of those relationships and continue from them. You can also pass property constraints to those methods the same way you can do for get_nodes() and get_relationships().

The code in the example above finds all nodes with type system and role spine and then finds all their loopback interfaces.

Putting It All Together

Thequery below is an example of an internal rule that Apstra can use to derive telemetry expectations -- for example, link and interface status. The @rule will insert a callback to process_spine_leaf_link, in which case we write to telemetry expectations.

Convenience Functions

To avoid creating complex where() clauses when building a graph query, use convenience functions, available from the Apstra web interface.

  1. From the blueprint navigate to the Staged view or Active view, then click the GraphQL API Explorer button (top-right >_). The graph explorer opens in a new tab.

  2. Type a graph query on the left. See function descriptions below.

  3. From the Action drop-down list, select qe.

  4. Click the Execute Query button (looks like a play button) to see results.

Functions

The Query Engine describes a number of helpful functions:

match(*path_queries)

This function returns a QueryBuilder object containing each result of a matched query. This is generally a useful shortcut for grouping multiple match queries together.

These two queries are not a 'path' together (no intended relationship). Notice the comma to separate out arguments. This query will return all of the leafs and spines together.

node(self, type=None, name=None, id=None, **properties)

  • Parameters
    • type (str or None) - Type of node to search for
    • name (str or None) - Sets the name of the property matcher in the results
    • id (str or None) - Matches a specific node by node ID in the graph
    • properties (dict or None) - Any additional keyword arguments or additional property matcher convenience functions to be used
  • Returns - Query builder object for chaining queries
  • Return type - QueryBuilder

While both a function, this is an alias for the PathQueryBuilder nodes -- see below.

iterate()

  • Returns - generator
  • Return type: generator

Iterate gives you a generator function that you can use to iterate on individual path queries as if it were a list. For example:

PathQueryBuilder Nodes

node(self, type=None, name=None, id=None, **properties)

This function describes specific graph node, but is also a shortcut for beginning a path query from a specific node. The result of a `node() call returns a path query object. When querying a path, you usually want to specify a node `type`: for example node('system') would return a system node.

  • Parameters
    • type (str or None) - Type of node to search for
    • name (str or None) - Sets the name of the property matcher in the results
    • id (str or None) - Matches a specific node by node ID in the graph
    • properties (dict or None) - Any additional keyword arguments or additional property matcher convenience functions to be used
  • Returns - Query builder object for chaining queries
  • Return type - QueryBuilder

If you want to use the node in your query results, you need to name it --node('system', name='device'). Furthermore, if you want to match specific kwarg properties, you can directly specify the match requirements -

node('system', name='device', role='leaf')

out(type=None, id=None, name=None, **properties)

Traverses a relationship in the 'out' direction according to a graph schema. Acceptable parameters are the type of relationship (for example, interfaces), the specific name of a relationship, the id of a relationship, or other property matches that must match exactly given as keyword arguments.

  • Parameters
    • type (str or None) - Type of node relationship to search for
    • id (str or None) - Matches a specific relationship by relationship ID in the graph
    • name (str or None) - Matches a specific relationship by named relationship

For example:

in_(type=None, id=None, name=None, **properties)

Traverses a relationship in the 'in' direction. Sets current node to relationship source node. Acceptable parameters are the type of relationship (for example, interfaces), the specific name of a relationship, the id of a relationship, or other property matches that must match exactly given as keyword arguments.

  • Parameters
    • type (str or None) - Type of node relationship to search for
    • id (str or None) - Matches a specific relationship by relationship ID in the graph
    • name (str or None) - Matches a specific relationship by named relationship
    • properties (dict or None) - Matches relationships by any further kwargs or functions

where(predicate, names=None)

Allows you to specify a callback function against the graph results as a filter or constraint. The predicate is a callback (usually lambda function) run against the entire query result. where() can be used directly on an a path query result.

  • Parameters
    • predicate (callback) - Callback function to run against all nodes in graph
    • names (str or None) - If names are given they are passed to callback function for match

enure_different(*names)

Allows a user to ensure two different named nodes in the graph are not the same. This is helpful for relationships that may be bidirectional and could match on their own source nodes. Consider the query:

  • Parameters
    • names (tuple or list) - A list of names to ensure return different nodes or relationships from the graph

The last line could be functionally equivalent to the where() function with a lambda callback function

Property matchers

Property matches can be run on graph query objects directly - usually used within a node() function. Property matches allow for a few functions.

eq(value)

Ensures the property value of the node matches exactly the results of the eq(value) function.

  • Parameters
    • value - Property to match for equality

Which is similar to simply setting a value as a kwarg on a node object:

Returns:

ne(value)

Not-equals. Ensures the property value of the node does NOT match results of ne(value) function

  • Parameters
    • value - Value to ensure for inequality condition

Similar to:

gt(value)

Greater-than. Ensures the property of the node is greater than the results of gt(value) function.

  • Parameters
    • value - Ensure property function is greater than this value

ge(value)

Greater-than or Equal To. Ensures the property of the node is greater than or equal to results of ge().

  • Parameters: value - Ensure property function is greater than or equal to this value

lt(value)

Less-than. Ensures the property of the node is less than the results of lt(value).

  • Parameters
    • value - Ensure property function is less than this value

Similar to:

le(value)

Less-than or Equal to. Ensures the property is less than, or equal to the results of le(value) function.

  • Parameters
    • value - Ensures given value is less than or equal to property function

Similar to:

is_in(value)

Is in (list). Check if the property is in a given list or set containing items is_in(value).

  • Parameters
    • value (list) - Ensure given property is in this list

Similar to:

not_in(value)

Is not in (list). Check if the property is NOT in a given list or set containing items not_in(value).

  • Parameters
    • value (list) - List Value to ensure property matcher is not in

Similar to:

is_none()

A query that expects is_none expects this particular attribute to be specifically None.

Similar to:

not_none()

A matcher that expects this attribute to have a value.

Similar to:

Apstra Graph Datastore

The Apstra graph datastore is an in-memory graph database. The log file size is checked periodically, and when a blueprint change is committed. If the graph datastore reaches 100MB or more, a new graph datastore checkpoint file is generated. The database itself does not remove any graph datastore persistence logs or checkpoint files. Apstra provides clean-up tools for the main graph datastore.

Valid graph datastore persistence file groups contain four files: log, log-valid, checkpoint, and checkpoint-valid. Valid files are the effective indicators for log and checkpoint files. The name of each persistence file has three parts: basename, id, and extension.

  • basename - derived from the main graph datastore partition name.
  • id - a unix timestamp obtained from gettimeofday. Seconds and microseconds in the timestamp are separated by a "-". A persistence file group can be identified by id. The timestamp can also help to determine the generated time sequence of persistence file groups.
  • extension - log, log-valid, checkpoint, or checkpoint-valid.