netdisco/DEVELOPING.txt

DEVELOPER NOTES
    This document aims to help developers understand the intent and design
    of the code within Netdisco. Patches and feedback are always welcome :-)

Introduction
    This release of Netdisco is built as a Dancer application, and uses many
    modern technologies and techniques. Hopefully this will make the code
    easier to manage and maintain in the long term.

    Although Dancer is a web application framework, it provides very useful
    tools for command line applications as well, namely configuration file
    management and database connection management. We make use of these
    features in the daemon and deployment scripts.

    Overall the application tries to be as self-contained as possible
    without also needing an excessive number of CPAN modules to be
    installed. However, Modern Perl techniques have made dependency
    management almost a non-issue, and Netdisco can be installed by and run
    completely within an unprivileged user's account, apart from the
    PostgreSQL database setup.

    Finally the other core component of Netdisco is now a DBIx::Class layer
    for database access. This means there is no SQL anywhere in the code,
    but more importantly, we can re-use the same complex queries in
    different parts of Netdisco.

    The rest of this document discusses each "interesting" area of the
    Netdisco codebase, hopefully in enough detail that you can get hacking
    yourself :-)

Versioning
    This is Netdisco major version 2. The minor version has six digits,
    which are split into two components of three digits each. It's unlikely
    that the major version number (2) will increment. Each "feature" release
    to CPAN will increment the first three digits of the minor version. Each
    "bug fix" release will increment the second three digits of the minor
    version.

    Stable releases will have an even "feature" number. Beta releases will
    have an odd "feature" number and also a suffix with an underscore, to
    prevent CPAN indexing the distribution. Some examples:

     2.002002     - "feature" release 2, "bug fix" release 2
     2.002003     - another bug was found and fixed, hence "bug fix" release 3
     2.003000_001 - first beta for the next "feature" release
     2.003000_002 - second beta
     2.004001     - the next "feature" release
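
    As a rough illustration of the scheme, a version string can be split
    into its components as below. This is only a sketch: "parse_version"
    is a hypothetical helper and not part of Netdisco.

    ```perl
    use strict;
    use warnings;

    # Split a Netdisco-style version string into its parts.
    # parse_version() is a hypothetical helper, not part of Netdisco.
    sub parse_version {
        my ($version) = @_;
        my ($major, $feature, $bugfix, $beta) =
            $version =~ m/^(\d+)\.(\d{3})(\d{3})(?:_(\d+))?$/
            or return;
        return {
            major   => $major + 0,
            feature => $feature + 0,
            bugfix  => $bugfix + 0,
            beta    => defined $beta ? $beta + 0 : undef,
            # stable means an even "feature" number and no beta suffix
            stable  => ($feature % 2 == 0 && !defined $beta) ? 1 : 0,
        };
    }

    for my $v (qw(2.002003 2.003000_002 2.004001)) {
        my $p = parse_version($v);
        printf "%-13s feature=%d bugfix=%d %s\n",
            $v, $p->{feature}, $p->{bugfix},
            $p->{stable} ? 'stable' : 'beta ' . $p->{beta};
    }
    ```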

Global Configuration
    Dancer uses YAML as its standard configuration file format, which is
    flexible enough for our needs, yet still simple to edit for the user. We
    no longer need a parser as in the old version of Netdisco.

    At the top of scripts you'll usually see something like:

     use App::Netdisco;
     use Dancer ':script';

    First, this uses "App::Netdisco", which is almost nothing more than a
    placeholder module (contains no actual application code). What it does
    is set several environment variables in order to locate the
    configuration files.

    Then, when we call "use Dancer" these environment variables are used
    to load two YAML files: "config.yml" and "<environment>.yml" where
    "<environment>" is typically either "production" or "development".

    The concept of "environments" allows us to have some shared "master"
    config between all instances of the application ("config.yml"), and then
    settings for specific circumstances. Typically this might be logging
    levels, for example. The default file which "App::Netdisco" loads is
    "development.yml" but you can override it by setting the
    "DANCER_ENVIRONMENT" environment variable.

    Dancer loads the config using YAML, merging data from the two files.
    Config is made available via Dancer's "setting('foo')" subroutine, which
    is exported. So now the "foo" setting in either config file is easily
    accessed.
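
    To picture the merge, here is a small sketch using plain Perl hashes in
    place of the parsed YAML. It assumes that a value in the environment
    file overrides the same key in "config.yml". The "merge_config" routine
    and the setting names are illustrative only; in Netdisco, Dancer does
    this work for us.

    ```perl
    use strict;
    use warnings;

    # Recursively merge two configuration hashes; values in $override win.
    # merge_config() is a hypothetical stand-in for what Dancer does for us.
    sub merge_config {
        my ($base, $override) = @_;
        my %merged = %$base;
        for my $key (keys %$override) {
            if (ref($merged{$key}) eq 'HASH' and ref($override->{$key}) eq 'HASH') {
                $merged{$key} = merge_config($merged{$key}, $override->{$key});
            }
            else {
                $merged{$key} = $override->{$key};
            }
        }
        return \%merged;
    }

    # Stand-ins for the parsed contents of config.yml and development.yml.
    # The setting names and values are made up for this example.
    my $config_yml = {
        log      => 'warning',
        database => { name => 'netdisco', host => 'localhost' },
    };
    my $development_yml = {
        log      => 'debug',
        database => { host => 'db.example.com' },
    };

    my $settings = merge_config($config_yml, $development_yml);
    print "log level: $settings->{log}\n";
    print "db host:   $settings->{database}{host}\n";
    print "db name:   $settings->{database}{name}\n";
    ```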

    Another line commonly seen in scripts is this:

     use Dancer::Plugin::DBIC 'schema';

    This plugin saves a lot of effort by taking some database connection
    parameters from the configuration file, and instantiating DBIx::Class
    database connections with them. The connections are managed
    transparently so all we need to do to access the Netdisco database, with
    no additional setup, is:

     schema('netdisco')->resultset(...)->search({...});

DBIx::Class Layer
    DBIx::Class, or DBIC for short, is an Object-Relational Mapper. This
    means it abstracts away the SQL of database calls, presenting a Perl
    object for each table, set of results from a query, table row, etc. The
    advantage is that it can generate really smart SQL queries, and these
    queries can be re-used throughout the application.

    The DBIC layer for Netdisco is based at App::Netdisco::DB. This is the
    global schema class and below that, under App::Netdisco::DB::Result is a
    class for each table in the database. These contain metadata on the
    columns but also several handy "helper" queries which can be called.
    There are also "ResultSet" classes which provide additional "pre-canned"
    queries.

    Netdisco's DBIx::Class layer has excellent documentation which you are
    encouraged to read, particularly if you find it difficult to sleep.

  Results and ResultSets
    In DBIC a "Result" is a table and a "ResultSet" is a set of rows
    retrieved from the table as a result of a query (which might be all the
    rows, of course). This is why we have two types of DBIC class. Items in
    the "Result" generally relate to the single table directly, and simply.
    In the "ResultSet" class are more complex search modifiers which might
    synthesize new "columns" of data (e.g. formatting a timestamp) or
    subroutines which accept parameters to customize the query.

    However, regardless of the actual class name, you access them in the
    same way. For example the "device" table has an
    App::Netdisco::DB::Result::Device class and also an
    App::Netdisco::DB::ResultSet::Device class. DBIC merges the two:

     schema('netdisco')->resultset('Device')->get_models;

  Virtual Tables (VIEWs)
    Where we want to simplify our application code even further we can
    either install a VIEW in PostgreSQL, or use DBIx::Class to synthesize
    the view on-the-fly. Put simply, it uses the VIEW definition as the
    basis of an SQL query, yet in the application we treat it as a real
    table like any other.

    Some good examples are a fake table of only the active Nodes (as opposed
    to all nodes), or the more complex list of all ports which are connected
    together ("DeviceLink").

    All these tables live under the App::Netdisco::DB::Result::Virtual
    namespace, and so you access them like so (for the "ActiveNode"
    example):

     schema('netdisco')->resultset('Virtual::ActiveNode')->count;

  Versioning and Deployment
    To manage the Netdisco schema in PostgreSQL we use DBIx::Class's
    deployment feature. This attaches a version to the schema and provides
    all the code to check the current version and do whatever is necessary
    to upgrade. The schema version is stored in a new table called
    "dbix_class_schema_versions", although you should never touch it.

    The "netdisco-db-deploy" script included in the distribution performs
    the following services:

     * Installs the dbix_class_schema_versions table
     * Upgrades the schema to the current distribution's version

    This works both on an empty, new database, and a legacy database from
    the existing Netdisco release, in a non-destructive way. For further
    information see DBIx::Class::Schema::Versioned and the
    "netdisco-db-deploy" script.

    The files used for the upgrades are shipped with this distribution and
    stored in the ".../App/Netdisco/DB/schema_versions" directory. They are
    generated using the "nd-dbic-versions" script which also ships with the
    distribution.

  Foreign Key Constraints
    We have not yet deployed any FK constraints into the Netdisco schema.
    This is partly because the current poller inserts and deletes entries
    from the database in an order which would violate such constraints, but
    also because some of the archiving features of Netdisco might not be
    compatible anyway.

    Regardless, a lack of FK constraints doesn't upset DBIx::Class. The
    constraints can easily be deployed in a future release of Netdisco.

Web Application
    The Netdisco web app is a "classic" Dancer app, using most of the
    bundled features which make development really easy. Dancer is based on
    Ruby's Sinatra framework. Its style is for many "helper" subroutines to
    be exported into the application namespace, to do things such as access
    request parameters, navigate around the "handler" subroutines, manage
    response headers, and so on.

    Pretty much anything you want to do in a web application has been
    wrapped up by Dancer into a neat helper routine that does the heavy
    lifting. This includes configuration and database connection management,
    as was discussed above. Also, templates can be executed and Netdisco
    uses the venerable Template::Toolkit engine for this.

    Like most web frameworks Dancer has a concept of "handlers" which are
    subroutines to which a specific web request is routed. For example if
    the user asks for "/device" with some parameters, the request ends up
    at the App::Netdisco::Web::Device package's "get '/device'" handler.
    All this is done automatically by Dancer according to some simple rules.
    There are also "wrapper" subroutines which we use to do tasks such as
    setting up data lookup tables, and handling authentication.

    Dancer also supports AJAX very well, and it is used to retrieve most of
    the data in the Netdisco web application in a dynamic way, to respond to
    search queries and avoid lengthy page reloads. You will see the handlers
    for AJAX look similar to those for GET requests but do not use
    Template::Toolkit templates.

    Compared to the current Netdisco, the handler routines are very small.
    This is because (a) they don't include any HTML - this is delegated to a
    template, and (b) they don't include any SQL - this is delegated to
    DBIx::Class. Small routines are more manageable, and easier to maintain.
    You'll also notice use of modules such as Net::MAC and NetAddr::IP::Lite
    to simplify and make more robust the handling of data.

  Running the Web App
    Dancer apps conform to the "PSGI" standard interface for web
    applications, which makes for easy deployment under many stacks such as
    Apache, FCGI, etc. See Dancer::Deployment for more detail.

    At a minimum Netdisco can run from within its own user area as an
    unprivileged user, and ships with a simple web server engine (see the
    user docs for instructions). The "netdisco-web" script uses
    Daemon::Control to daemonize this simple web server so you can
    fire-and-forget the Netdisco web app without much trouble at all. This
    script in turn calls "netdisco-web-fg" which is the real Dancer
    application, that runs in the foreground if called on its own.

    All web app code lives below App::Netdisco::Web, but there are also some
    helper routines in App::Netdisco::Util::Web (for example sorting device
    port names).

  Authentication
    Dancer includes (of course) good session management using cookies and a
    memory database. You should change this to a disk database if using a
    proper forking web server installation so that sessions are available to
    all instances.

    Session and authentication code lives in App::Netdisco::Web::AuthN. It
    is fully backwards compatible with the existing Netdisco user
    management, making use of the database users and their MD5 passwords.
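
    As a sketch of what such a check involves, the routine below compares
    a supplied password against a stored digest. It assumes the stored
    value is an unsalted hex MD5 digest, and "password_ok" is a
    hypothetical name; the real code in App::Netdisco::Web::AuthN may
    differ.

    ```perl
    use strict;
    use warnings;
    use Digest::MD5 'md5_hex';

    # Return 1 if the supplied password matches the stored digest, else 0.
    # Assumes an unsalted hex MD5 digest; password_ok() is hypothetical.
    sub password_ok {
        my ($supplied, $stored_digest) = @_;
        return 0 unless defined $supplied and defined $stored_digest;
        return md5_hex($supplied) eq lc $stored_digest ? 1 : 0;
    }

    my $stored = md5_hex('s3cret');   # stand-in for the value in the database
    print password_ok('s3cret', $stored) ? "login ok\n" : "login failed\n";
    print password_ok('wrong',  $stored) ? "login ok\n" : "login failed\n";
    ```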

    There is also support for unauthenticated access to the web app (for
    instance if you have some kind of external authentication, or simply
    trust everyone).

  Templates
    In the "share/views" folder of this distribution you'll find all the
    Template::Toolkit template files, with ".tt" extensions. Dancer first
    loads "share/views/layouts/main.tt" which is the main page wrapper, that
    has the HTML header and so on. It then loads other templates for
    sections of the page body. This is a typical Template::Toolkit "wrapper"
    configuration, as noted by the "[% content %]" call within "main.tt"
    that loads the template you actually specified in your Dancer handler.

    All templates (and Javascript and Stylesheets) are shipped in the
    App::Netdisco distribution and located automatically by the application
    (using the environment variables which App::Netdisco set up). The user
    doesn't have to copy or install any files.

    There's a template for the homepage called "index.tt", then separate
    templates for searching, displaying device details, and showing
    inventory. These are, pretty much, all that Netdisco ever does.

    Each of these pages is designed in a deliberately similar way, with
    re-used features. They each can have a "sidebar" with a search form (or
    additional search parameters). They also can have a tabbed interface for
    sub-topics.

    Here's where it gets interesting. Up till now the page content has been
    your typical synchronous page load (a single page comprised of many
    templates) in response to a GET request. However the content of the tabs
    is not within this. Each tab has its content dynamically retrieved via
    an AJAX request back to the web application. Javascript triggers this
    automatically on page load.

    This feature allows the user to search and search again, each time
    refreshing the data they see in the tab but without reloading the
    complete page with all its static furniture. AJAX can, of course, return
    any MIME type, not only JSON but also HTML content as in this case. The
    templates for the tabs are organised below "share/views/ajax/..." in the
    distribution.

  Stylesheets
    The main style for Netdisco uses Twitter Bootstrap, which is a modern
    library of CSS and javascript used on many websites. It does a lot of
    heavy lifting, providing simple CSS classes for all of the standard web
    page furniture (forms, tables, etc). Check out the documentation at the
    Twitter Bootstrap web site for more information.

    These stylesheets are of course customised with our own "netdisco.css".
    We try to name all CSS classes with a prefix "nd_" so as to be
    distinct from Twitter Bootstrap and any other active styles.

    All stylesheets are located in the "share/public/css" folder of the
    distribution and, like the templates, are automatically located and
    served by the Netdisco application. You can also choose to serve this
    content statically via Apache/etc for high traffic sites.

    Although Twitter Bootstrap ships with its own set of icons, we use an
    alternative library called Fontawesome. This plugs in easily to
    Bootstrap and provides a wider range of scalable vector icons which
    are easy to use.

  Javascript
    Of course many parts of the Netdisco site use Javascript, beginning with
    retrieving the page tab content itself. The standard library in use is
    jQuery, and the latest version is shipped with this distribution.

    Many parts of the Netdisco site have small Javascript routines. The code
    for these, using jQuery as mentioned, lives in two places. The main
    "netdisco.js" file is loaded once in the page HTML header, and lives in
    "share/public/javascripts/netdisco.js". There's also a
    "netdisco_portcontrol.js" which is included only if the current user has
    Port Control rights.

    Netdisco also has Javascript routines specific to the device search or
    device details pages, and these files are located in
    "share/views/js/..." because they're loaded within the page body by the
    templates. These files contain a function "inner_view_processing" which
    is called each time AJAX delivers new content into a tab in the page
    (think of it like a callback, perhaps).

    Also in the "share/public/javascripts/..." folder are the other public
    libraries loaded by the Netdisco application:

    The Toastr library is used for "Growl"-like notifications which appear
    in the corner of the web browser and then fade away. These notify the
    user of successful background job submission, and job results.

    The d3 library is a graphics toolkit used to display the NetMap feature.
    This works differently from the old Netdisco in that everything is
    generated on-the-fly using SQL queries ("DeviceLinks" resultset) and
    this d3 library for rendering.

    Finally Twitter Bootstrap also ships with a toolkit of helpful
    Javascript driven features such as the tooltips and collapsers.

Job Daemon
    The old Netdisco has a job control daemon which processes "port control"
    actions and also manual requests for device polling. The new Netdisco
    also has a daemon, which is a separate process with its own set of
    libraries, independent of the web application. However, it still makes
    use of the Dancer configuration and database connection management
    features mentioned above.

    The job daemon is backwards compatible with the old Netdisco database
    job requests table, although it doesn't yet log results in the same way.
    Most important, it cannot yet poll any devices for discovery or
    macsuck/arpnip, although that's next on the list!

    All code for the job daemon lives under the App::Netdisco::Daemon
    namespace and like the rest of Netdisco is broken down into manageable
    chunks.

  Running the Job Daemon
    Like the web application, the job daemon is fully self contained and
    runs via two simple scripts shipped with the distribution - one for
    foreground and one for background execution (see the user docs for
    instructions).

    The "netdisco-daemon" script uses Daemon::Control to daemonize so you
    can fire-and-forget the Netdisco job daemon without much trouble at all.
    This script in turn calls "netdisco-daemon-fg" which is the real
    application, that runs in the foreground if called on its own.

  Daemon Engineering
    The job daemon is based on the MCE library, which handles the forking
    and management of child processes doing the actual work. This actually
    runs in the foreground unless wrapped with Daemon::Control, as mentioned
    above. MCE handles three flavours of "worker" for different tasks.

    One goal that we had designing the daemon was that sites should be able
    to run many instances on different servers, with different processing
    capacities. This is both to take advantage of more processor capability
    and to deal with security zones where you might only be able to
    manage a subset of devices from certain locations. Netdisco has always
    coped well with this via its "discover_*" and similar configuration, and
    the separate poller process.

    So, the single Manager "worker" in the daemon is responsible for
    contacting the central Netdisco database and booking out jobs which it's
    able to service according to the local configuration settings. Jobs are
    "locked" in the central queue and then copied to a local job queue
    within the daemon.
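
    The lock-and-copy step might look roughly like the sketch below, which
    uses an array of hashes in place of the central database table. The
    "book_jobs" routine and the "locked_by" field are hypothetical names.
    The limit argument reflects the rule that the Manager never books out
    more jobs than it has workers available.

    ```perl
    use strict;
    use warnings;

    # Book out jobs from the central queue, never taking more than there
    # are idle workers. "locked_by" is a hypothetical name for however the
    # central queue marks a job as booked by a daemon.
    sub book_jobs {
        my ($central, $daemon, $idle_workers) = @_;
        my @booked;
        for my $job (@$central) {
            last if @booked >= $idle_workers;
            next if defined $job->{locked_by};   # another daemon has it
            $job->{locked_by} = $daemon;         # "lock" in the central queue
            push @booked, { %$job };             # copy for the local queue
        }
        return @booked;
    }

    # Five unlocked jobs standing in for the central job queue.
    my @central_queue = map { +{ id => $_, locked_by => undef } } 1 .. 5;

    my @server_a = book_jobs(\@central_queue, 'server-a', 2);
    my @server_b = book_jobs(\@central_queue, 'server-b', 4);
    printf "server-a booked %d, server-b booked %d\n",
        scalar @server_a, scalar @server_b;
    ```

    Here the second daemon has four idle workers but receives only the
    three jobs which the first daemon left unlocked. With a real database
    the locking would need to happen atomically, which this sketch does
    not attempt.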

    Along with the Manager we start zero or more of two other types of
    worker. Some jobs such as port control are "interactive" and the user
    typically wants quick feedback on the results. Others such as polling
    are background tasks which can take more time and are less schedule
    sensitive. So as not to starve the "interactive" jobs of workers we have
    two types of worker.

    The Interactive worker picks jobs from the local job queue relating to
    device and port reconfiguration only. It submits results directly back
    to the central Netdisco database.

    The Poller worker (not yet written!) will similarly pick jobs from the
    local queue, this time relating to device discovery and polling.
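
    One way to picture the split between the two worker types is the
    sketch below, where each worker takes only the jobs meant for its type
    from the local queue. The action names and the "take_jobs" routine are
    assumptions made for illustration, and an array stands in for the
    local queue.

    ```perl
    use strict;
    use warnings;

    # Which worker type services which job action. The action names here
    # are illustrative assumptions, not taken from the Netdisco source.
    my %worker_for = (
        portcontrol => 'Interactive',
        discover    => 'Poller',
        macsuck     => 'Poller',
        arpnip      => 'Poller',
    );

    # Take up to $limit jobs for one worker type off the local queue,
    # leaving all other jobs queued in their original order.
    sub take_jobs {
        my ($queue, $type, $limit) = @_;
        my (@taken, @left);
        for my $job (@$queue) {
            my $wanted = ($worker_for{ $job->{action} } // '') eq $type;
            if ($wanted and @taken < $limit) { push @taken, $job }
            else                             { push @left,  $job }
        }
        @$queue = @left;
        return @taken;
    }

    my @local_queue = (
        { id => 1, action => 'discover'    },
        { id => 2, action => 'portcontrol' },
        { id => 3, action => 'macsuck'     },
    );

    my @interactive = take_jobs(\@local_queue, 'Interactive', 1);
    print "Interactive worker took job $_->{id}\n" for @interactive;
    print scalar(@local_queue), " job(s) left for Poller workers\n";
    ```

    Because the Interactive worker skips over polling jobs, a port control
    request does not have to wait behind them in the queue.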

    There is support in the daemon for the workers to pick more than one job
    at a time from the local queue, in case we decide this is worth doing.
    However the Manager won't ever book out more jobs from the central
    Netdisco job queue than it has workers available (so as not to hog jobs
    for itself against other daemons on other servers). The user is free to
    configure the number of Interactive and Poller workers in their
    "config.yml" file (zero or more of each).

  SNMP::Info
    The daemon obviously needs to use SNMP::Info for device control. All the
    code for this has been factored out into the App::Netdisco::Util
    namespace.

    The App::Netdisco::Util::Connect package provides for the creation of
    SNMP::Info objects along with connection tests. So far, SNMPv3 is not
    supported. To enable trace logging of the SNMP::Info object simply set
    the "INFO_TRACE" environment variable to a true value. The Connect
    library also provides routines to map interface and PoE IDs.

    Configuration for SNMP::Info comes from the YAML files, of course. This
    means that our "mibhome" and "mibdirs" settings are now in YAML format.
    In particular, the "mibdirs" list is a real list within the
    configuration.

    Other libraries will be added to this namespace in due course, as we add
    more functionality to the Job Daemon.

  DBIx::Class Layer
    The local job queue for each Job Daemon is actually an SQLite database
    running in memory. This makes the queue management code a little more
    elegant. The schema for this is of course DBIx::Class using Dancer
    connection management, and lives in App::Netdisco::Daemon::DB.

    There is currently only one table, the port control job queue, in
    App::Netdisco::Daemon::DB::Result::Admin. It's likely this name will
    change in the future.