
=head1 DEVELOPER NOTES

This document aims to help developers understand the intent and design of the
code within Netdisco. Patches and feedback are always welcome :-)

=head1 Introduction

This release of Netdisco is built as a L<Dancer> application, and uses many
modern technologies and techniques. Hopefully this will make the code easier
to manage and maintain in the long term.

Although Dancer is a web application framework, it provides very useful tools
for command line applications as well, namely configuration file management
and database connection management. We make use of these features in the
daemon and deployment scripts.

Overall the application tries to be as self-contained as possible without also
needing an excessive number of CPAN modules to be installed. However, Modern
Perl techniques have made dependency management almost a non-issue, and
Netdisco can be installed by and run completely within an unprivileged user's
account, apart from the PostgreSQL database setup.

Finally, the other core component of Netdisco is now a L<DBIx::Class> layer
for database access. This means there is no SQL anywhere in the code but,
more importantly, we can re-use the same complex queries in different parts of
Netdisco.

The rest of this document discusses each "interesting" area of the Netdisco
codebase, hopefully in enough detail that you can get hacking yourself :-)

=head1 Versioning

This is Netdisco major version 2. The minor version has six digits, which are
split into two components of three digits each. It's unlikely that the major
version number (2) will increment. Each "feature" release to CPAN will
increment the first three digits of the minor version. Each "bug fix" release
will increment the second three digits of the minor version.

Stable releases will have an even "feature" number. Beta releases will have an
odd "feature" number and also a suffix with an underscore, to prevent CPAN
indexing the distribution. Some examples:

 2.002002     - "feature" release 2, "bug fix" release 2
 2.002003     - another bug was found and fixed, hence "bug fix" release 3
 2.003000_001 - first beta for the next "feature" release
 2.003000_002 - second beta
 2.004001     - the next "feature" release
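
The scheme above can be decoded mechanically. The following is a minimal
sketch using only core Perl; the C<parse_version> helper and the keys of the
hash it returns are hypothetical names used for illustration, and are not part
of Netdisco.

 use strict;
 use warnings;

 # Hypothetical helper: split a Netdisco version string into its parts.
 # Returns nothing if the string does not match the scheme.
 sub parse_version {
     my ($version) = @_;
     my ($major, $feature, $bugfix, $beta) =
         $version =~ m/^(\d+)\.(\d{3})(\d{3})(?:_(\d+))?$/
         or return;
     return {
         major   => $major + 0,
         feature => $feature + 0,
         bugfix  => $bugfix + 0,
         beta    => defined $beta ? $beta + 0 : undef,
         # stable releases have an even "feature" number and no suffix
         stable  => ($feature % 2 == 0 and not defined $beta) ? 1 : 0,
     };
 }

 my $v = parse_version('2.003000_001');
 printf "feature %d, bug fix %d, beta %s\n",
     $v->{feature}, $v->{bugfix}, $v->{beta} // 'n/a';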


=head1 Global Configuration

Dancer uses YAML as its standard configuration file format, which is flexible
enough for our needs, yet still simple to edit for the user. We no longer need
a parser as in the old version of Netdisco.

At the top of scripts you'll usually see something like:

 use App::Netdisco;
 use Dancer ':script';

First, this uses C<App::Netdisco>, which is almost nothing more than a
placeholder module (contains no actual application code). What it does is set
several environment variables in order to locate the configuration files.

Then, when we call "C<use Dancer>" these environment variables are used to
load two YAML files: C<config.yml> and C<< <environment>.yml >> where
C<< <environment> >> is typically either C<production> or C<development>.

The concept of "environments" allows us to have some shared "master" config
between all instances of the application (C<config.yml>), and then settings
for specific circumstances. Typically this might be logging levels, for
example. The default file which C<App::Netdisco> loads is C<development.yml>
but you can override it by setting the "C<DANCER_ENVIRONMENT>" environment
variable.

Dancer loads the config using YAML, merging data from the two files. Config is
made available via Dancer's C<setting('foo')> subroutine, which is exported.
So now the C<foo> setting in either config file is easily accessed.
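
Putting these pieces together, a script that reads one setting might look like
the sketch below. The C<foo> key is the same placeholder used above, not a
real Netdisco setting.

 use strict;
 use warnings;

 use App::Netdisco;       # sets the environment variables to locate config
 use Dancer ':script';    # loads config.yml and <environment>.yml

 # 'foo' is a placeholder; substitute a key from your own config files
 my $foo = setting('foo');
 print defined $foo ? "foo is set to: $foo\n" : "foo is not set\n";

Running the same script with C<DANCER_ENVIRONMENT> set to C<production> would
merge C<production.yml> over C<config.yml> instead of C<development.yml>.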

Another line commonly seen in scripts is this:

 use Dancer::Plugin::DBIC 'schema';

This plugin saves a lot of effort by taking some database connection
parameters from the configuration file, and instantiating DBIx::Class database
connections with them. The connections are managed transparently so all we
need to do to access the Netdisco database, with no additional setup, is:

 schema('netdisco')->resultset(...)->search({...});
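
As a slightly fuller sketch, the script below lists devices from one vendor.
The column names used here (C<vendor>, C<name>, C<ip>) are assumptions about
the C<device> table made for illustration; check the Result class for the real
column list.

 use strict;
 use warnings;

 use App::Netdisco;
 use Dancer ':script';
 use Dancer::Plugin::DBIC 'schema';

 # Column names (vendor, name, ip) are assumptions about the device table
 my $devices = schema('netdisco')->resultset('Device')->search(
     { vendor   => 'cisco' },
     { order_by => 'name' },
 );

 while (my $device = $devices->next) {
     printf "%s (%s)\n", $device->name // '(unnamed)', $device->ip;
 }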


=head1 DBIx::Class Layer

DBIx::Class, or DBIC for short, is an Object-Relational Mapper. This means it
abstracts away the SQL of database calls, presenting a Perl object for each
table, set of results from a query, table row, etc. The advantage is that it
can generate really smart SQL queries, and these queries can be re-used
throughout the application.

The DBIC layer for Netdisco is based at L<App::Netdisco::DB>. This is the
global schema class and below that, under L<App::Netdisco::DB::Result> is a
class for each table in the database. These contain metadata on the columns
but also several handy "helper" queries which can be called.  There are also
C<ResultSet> classes which provide additional "pre-canned" queries.

Netdisco's DBIx::Class layer has excellent documentation which you are
encouraged to read, particularly if you find it difficult to sleep.

=head2 Results and ResultSets

In DBIC a C<Result> is a table and a C<ResultSet> is a set of rows retrieved
from the table as a result of a query (which might be all the rows, of
course). This is why we have two types of DBIC class. Items in the C<Result>
class generally relate to the single table directly, and simply. The
C<ResultSet> class holds more complex search modifiers which might synthesize
new "columns" of data (e.g. formatting a timestamp), or subroutines which
accept parameters to customize the query.

However, regardless of the actual class name, you access them in the same way.
For example the C<device> table has an L<App::Netdisco::DB::Result::Device>
class and also an L<App::Netdisco::DB::ResultSet::Device> class. DBIC merges
the two:

 schema('netdisco')->resultset('Device')->get_models;
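
To illustrate the kind of method that belongs in a C<ResultSet> class, here is
a minimal sketch. The package name, the C<with_vendor> method and the
C<vendor> column are all hypothetical; this is not code from Netdisco.

 package My::Schema::ResultSet::Device;   # hypothetical namespace

 use strict;
 use warnings;
 use base 'DBIx::Class::ResultSet';

 # Hypothetical search modifier: accepts a parameter and returns a new
 # ResultSet, so calls can be chained before any SQL is executed.
 sub with_vendor {
     my ($rs, $vendor) = @_;
     return $rs->search_rs({ vendor => $vendor });
 }

 1;

Because C<search_rs> always returns a ResultSet, the method can be chained
with others, for example
C<< $schema->resultset('Device')->with_vendor('cisco')->count >>.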

=head2 Virtual Tables (VIEWs)

Where we want to simplify our application code even further we can either
install a VIEW in PostgreSQL, or use DBIx::Class to synthesize the view
on-the-fly. Put simply, it uses the VIEW definition as the basis of an SQL
query, yet in the application we treat it as a real table like any other.

Some good examples are a fake table of only the active Nodes (as opposed to
all nodes), or the more complex list of all ports which are connected together
(C<DeviceLink>).

All these tables live under the
L<App::Netdisco::DB::Result::Virtual> namespace, and so you
access them like so (for the C<ActiveNode> example):

 schema('netdisco')->resultset('Virtual::ActiveNode')->count;
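
A view synthesized by DBIx::Class is declared in the Result class itself,
using L<DBIx::Class::ResultSource::View>. The sketch below shows the general
shape; the package name, the view name, the column list and the SQL are
assumptions for illustration and may not match the real C<ActiveNode> class.

 package My::Schema::Result::Virtual::ActiveNode;   # hypothetical namespace

 use strict;
 use warnings;
 use base 'DBIx::Class::Core';

 # Use the View result source instead of an ordinary table
 __PACKAGE__->table_class('DBIx::Class::ResultSource::View');
 __PACKAGE__->table('active_node');

 # is_virtual means the SQL below is used as a subquery at run time,
 # rather than being installed as a VIEW in the database
 __PACKAGE__->result_source_instance->is_virtual(1);
 __PACKAGE__->result_source_instance->view_definition(
     'SELECT mac, switch, port FROM node WHERE active'
 );

 # Columns exposed by the view (names are assumptions)
 __PACKAGE__->add_columns(qw/ mac switch port /);

 1;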

=head2 Versioning and Deployment

To manage the Netdisco schema in PostgreSQL we use DBIx::Class's deployment
feature. This attaches a version to the schema and provides all the code to
check the current version and do whatever is necessary to upgrade.
The schema version is stored in a new table called
C<dbix_class_schema_versions>, although you should never touch it.

The C<netdisco-db-deploy> script included in the distribution performs the
following services:

 * Installs the dbix_class_schema_versions table
 * Upgrades the schema to the current distribution's version

This works both on an empty, new database, and a legacy database from the
existing Netdisco release, in a non-destructive way. For further information
see L<DBIx::Class::Schema::Versioned> and the C<netdisco-db-deploy> script.

The files used for the upgrades are shipped with this distribution and stored
in the C<.../App/Netdisco/DB/schema_versions> directory. They are generated
using the C<nd-dbic-versions> script which also ships with the distribution.

=head2 Foreign Key Constraints

We have not yet deployed any FK constraints into the Netdisco schema. This is
partly because the current poller inserts and deletes entries from the
database in an order which would violate such constraints, but also because
some of the archiving features of Netdisco might not be compatible anyway.

Regardless, a lack of FK constraints doesn't upset DBIx::Class. The
constraints can easily be deployed in a future release of Netdisco.


=head1 Web Application

The Netdisco web app is a "classic" Dancer app, using most of the bundled
features which make development really easy. Dancer is based on Ruby's Sinatra
framework. Its style is for many "helper" subroutines to be exported into the
application namespace, to do things such as access request parameters,
navigate around the "handler" subroutines, manage response headers, and so on.

Pretty much anything you want to do in a web application has been wrapped up
by Dancer into a neat helper routine that does the heavy lifting. This
includes configuration and database connection management, as was discussed
above. Also, templates can be executed and Netdisco uses the venerable
L<Template::Toolkit> engine for this.

Like most web frameworks Dancer has a concept of "handlers" which are
subroutines to which a specific web request is routed. For example if the user
asks for "C</device>" with some parameters, the request ends up at the
L<App::Netdisco::Web::Device> package's "C<get '/device'>" handler. All this
is done automatically by Dancer according to some simple rules. There are also
"wrapper" subroutines which we use to do tasks such as setting up data lookup
tables, and handling authentication.
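
The following sketch shows the general shape of a handler and a "wrapper" in
Dancer. The package name, the lookup table, the template name and the use of
C<ip> as the lookup key are assumptions for illustration; the real code in
L<App::Netdisco::Web::Device> will differ.

 package My::Web::Device;   # hypothetical package name

 use strict;
 use warnings;
 use Dancer ':syntax';
 use Dancer::Plugin::DBIC 'schema';

 # A "wrapper" which runs before every handler
 hook 'before' => sub {
     # stash a lookup table for later use (illustrative data only)
     var 'port_speeds' => { 100 => '100 Mbps', 1000 => '1 Gbps' };
 };

 # Handler for GET /device?ip=...
 get '/device' => sub {
     my $ip = param('ip')
         or return send_error('missing ip parameter', 400);

     # assumes the device table is keyed on the IP address
     my $device = schema('netdisco')->resultset('Device')->find($ip)
         or return send_error('device not found', 404);

     # render the "device" template inside the main layout
     template 'device', { device => $device };
 };

 1;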

Dancer also supports AJAX very well, and it is used to retrieve most of the
data in the Netdisco web application in a dynamic way, to respond to search
queries and avoid lengthy page reloads. You will see the handlers for AJAX
look similar to those for GET requests but do not use Template::Toolkit
templates.

Compared to the current Netdisco, the handler routines are very small. This is
because (a) they don't include any HTML - this is delegated to a template, and
(b) they don't include any SQL - this is delegated to DBIx::Class. Small
routines are more manageable, and easier to maintain. You'll also notice use
of modules such as L<Net::MAC> and L<NetAddr::IP::Lite> to simplify the
handling of data and make it more robust.

=head2 Running the Web App

Dancer apps conform to the "PSGI" standard interface for web applications,
which makes for easy deployment under many stacks such as Apache, FCGI, etc.
See L<Dancer::Deployment> for more detail.

At a minimum Netdisco can run from within its own user area as an unprivileged
user, and ships with a simple web server engine (see the user docs for
instructions). The C<netdisco-web> script uses L<Daemon::Control> to daemonize
this simple web server so you can fire-and-forget the Netdisco web app without
much trouble at all. This script in turn calls C<netdisco-web-fg>, which is
the real Dancer application and runs in the foreground if called on its own.

All web app code lives below L<App::Netdisco::Web>, but there are also some
helper routines in L<App::Netdisco::Util::Web> (for example sorting device
port names).

=head2 Authentication

Dancer includes (of course) good session management using cookies and a memory
database. You should change this to a disk database if using a proper forking
web server installation so that sessions are available to all instances.
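
One way to switch to disk-backed sessions is Dancer's bundled YAML session
engine, sketched below. The directory is an illustrative path only, and the
same two settings can equally be placed in the YAML configuration files.

 use Dancer ':syntax';

 # 'YAML' is Dancer's bundled file-based session engine; the directory
 # below is an illustrative path, not a Netdisco default
 set session     => 'YAML';
 set session_dir => '/var/tmp/netdisco-sessions';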

Session and authentication code lives in L<App::Netdisco::Web::AuthN>. It is
fully backwards compatible with the existing Netdisco user management, making
use of the database users and their MD5 passwords.
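
Checking a password against a stored MD5 digest can be sketched with the core
L<Digest::MD5> module. This assumes the stored value is an unsalted,
hex-encoded digest, which is an assumption made for illustration; the
C<password_matches> helper is hypothetical and is not the code in
L<App::Netdisco::Web::AuthN>.

 use strict;
 use warnings;
 use Digest::MD5 'md5_hex';

 # Hypothetical helper: compare a submitted password with a stored digest.
 # Assumes the digest is unsalted and hex-encoded.
 sub password_matches {
     my ($submitted, $stored_digest) = @_;
     return 0 unless defined $submitted and defined $stored_digest;
     return md5_hex($submitted) eq lc $stored_digest ? 1 : 0;
 }

 my $stored = md5_hex('s3cret');   # stands in for the value in the database
 print password_matches('s3cret', $stored) ? "ok\n" : "denied\n";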

There is also support for unauthenticated access to the web app (for instance
if you have some kind of external authentication, or simply trust everyone).

=head2 Templates

In the C<share/views> folder of this distribution you'll find all the
Template::Toolkit template files, with C<.tt> extensions. Dancer first loads
C<share/views/layouts/main.tt> which is the main page wrapper, that has the HTML
header and so on. It then loads other templates for sections of the page body.
This is a typical Template::Toolkit "wrapper" configuration, as noted by the
C<[% content %]> call within C<main.tt> that loads the template you actually
specified in your Dancer handler.

All templates (and Javascript and Stylesheets) are shipped in this
distribution and located automatically by the application (using the
environment variables which L<App::Netdisco> sets up). The user doesn't have
to copy or install any files.

There's a template for the homepage called C<index.tt>, then separate
templates for searching, displaying device details, and showing inventory.
These are, pretty much, all that Netdisco ever does.

Each of these pages is designed in a deliberately similar way, with re-used
features. They each can have a "sidebar" with a search form (or additional
search parameters). They also can have a tabbed interface for sub-topics.

Here's where it gets interesting. Up to now the page content has been your
typical synchronous page load (a single page composed of many templates) in
response to a GET request. However the content of the tabs is not within this.
Each tab has its content dynamically retrieved via an AJAX request back to the
web application. Javascript triggers this automatically on page load.

This feature allows the user to search and search again, each time refreshing
the data they see in the tab but without reloading the complete page with all
its static furniture. AJAX can, of course, return any MIME type, not only JSON
but also HTML content as in this case. The templates for the tabs are
organised below C<share/views/ajax/...> in the distribution.

=head2 Stylesheets

The main style for Netdisco uses Twitter Bootstrap, which is a stylish modern
library of styles and javascript used on many websites. It does a lot of heavy
lifting, providing simple CSS classes for all of the standard web page
furniture (forms, tables, etc). Check out the documentation at the Twitter
Bootstrap web site for more information.

These stylesheets are of course customised with our own C<netdisco.css>. We
try to name all CSS classes with a prefix "C<nd_>" so as to be distinct from
Twitter Bootstrap and any other active styles.

All stylesheets are located in the C<share/public/css> folder of the
distribution and, like the templates, are automatically located and served by
the Netdisco application. You can also choose to serve this content statically
via Apache/etc for high traffic sites.

Although Twitter Bootstrap ships with its own set of icons, we use an
alternative library called Font Awesome. This plugs in easily to Bootstrap and
provides a wider range of scalable vector icons which are easy to use.

=head2 Javascript

Of course many parts of the Netdisco site use Javascript, beginning with
retrieving the page tab content itself. The standard library in use is jQuery,
and the latest version is shipped with this distribution.

Many parts of the Netdisco site have small Javascript routines. The code for
these, using jQuery as mentioned, lives in two places. The main C<netdisco.js>
file is loaded once in the page HTML header, and lives in
C<share/public/javascripts/netdisco.js>. There's also a
C<netdisco_portcontrol.js> which is included only if the current user has Port
Control rights.

Netdisco also has Javascript routines specific to the device search or device
details pages, and these files are located in C<share/views/js/...> because
they're loaded within the page body by the templates. These files contain a
function C<inner_view_processing> which is called each time AJAX delivers new
content into a tab in the page (think of it like a callback, perhaps).

Also in the C<share/public/javascripts/...> folder are the other public
libraries loaded by the Netdisco application:

The Toastr library is used for "Growl"-like notifications which appear in the
corner of the web browser and then fade away. These notify the user of
successful background job submission, and job results.

The d3 library is a graphics toolkit used to display the NetMap feature. This
works differently from the old Netdisco in that everything is generated
on-the-fly using SQL queries (C<DeviceLinks> resultset) and this d3 library
for rendering.

Finally Twitter Bootstrap also ships with a toolkit of helpful Javascript
driven features such as the tooltips and collapsers.


=head1 Job Daemon

The old Netdisco has a job control daemon which processes "port control"
actions and also manual requests for device polling. The new Netdisco also has
a daemon, and it is a truly separate process, with its own set of libraries,
from the web application. However, it still makes use of the Dancer
configuration and database connection management features mentioned above.

The job daemon is backwards compatible with the old Netdisco database job
requests table, although it doesn't yet log results in the same way. Most
important, it cannot yet poll any devices for discovery or macsuck/arpnip,
although that's next on the list!

All code for the job daemon lives under the L<App::Netdisco::Daemon> namespace
and like the rest of Netdisco is broken down into manageable chunks.

=head2 Running the Job Daemon

Like the web application, the job daemon is fully self contained and runs via
two simple scripts shipped with the distribution - one for foreground and one
for background execution (see the user docs for instructions).

The C<netdisco-daemon> script uses L<Daemon::Control> to daemonize so you can
fire-and-forget the Netdisco job daemon without much trouble at all. This
script in turn calls C<netdisco-daemon-fg>, which is the real application and
runs in the foreground if called on its own.

=head2 Daemon Engineering

The job daemon is based on the L<MCE> library, which handles the forking and
management of child processes doing the actual work. This actually runs in the
foreground unless wrapped with Daemon::Control, as mentioned above. MCE
handles three flavours of "worker" for different tasks.

One goal that we had designing the daemon was that sites should be able to run
many instances on different servers, with different processing capacities.
This is both to take advantage of more processor capability, and to deal
with security zones where you might only be able to manage a subset of devices
from certain locations. Netdisco has always coped well with this via its
C<discover_*> and similar configuration, and the separate poller process.

So, the single Manager "worker" in the daemon is responsible for contacting
the central Netdisco database and booking out jobs which it's able to service
according to the local configuration settings. Jobs are "locked" in the
central queue and then copied to a local job queue within the daemon.

Along with the Manager we start zero or more of two other types of worker.
Some jobs such as port control are "interactive" and the user typically wants
quick feedback on the results. Others such as polling are background tasks
which can take more time and are less schedule sensitive. So as not to starve
the "interactive" jobs of workers we have two types of worker.

The Interactive worker picks jobs from the local job queue relating to device
and port reconfiguration only. It submits results directly back to the central
Netdisco database.

The Poller worker (not yet written!) similarly picks jobs from the local
queue, this time relating to device discovery and polling.

There is support in the daemon for the workers to pick more than one job at a
time from the local queue, in case we decide this is worth doing. However the
Manager won't ever book out more jobs from the central Netdisco job queue than
it has workers available (so as not to hog jobs for itself against other
daemons on other servers). The user is free to configure the number of
Interactive and Poller workers in their C<config.yml> file (zero or more of
each).
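
The booking rule described above can be modelled in a few lines of core Perl.
This is a simplified sketch and not the daemon's actual code: the C<book_jobs>
subroutine, the job hash layout and the worker type names are all
hypothetical.

 use strict;
 use warnings;

 # Hypothetical model of the Manager's booking rule: take at most as many
 # jobs from the central queue as there are idle workers of each type.
 sub book_jobs {
     my ($central_queue, $idle) = @_;   # arrayref of jobs, hashref of counts
     my %free = %$idle;                 # copy, so the caller's data is untouched
     my (@booked, @remaining);

     for my $job (@$central_queue) {
         if (($free{ $job->{type} } // 0) > 0) {
             $free{ $job->{type} }--;
             push @booked, $job;        # would be "locked" centrally
         }
         else {
             push @remaining, $job;     # left for daemons on other servers
         }
     }
     return (\@booked, \@remaining);
 }

 my @central = (
     { id => 1, type => 'interactive' },
     { id => 2, type => 'poller' },
     { id => 3, type => 'interactive' },
 );
 my ($booked, $left) = book_jobs(\@central, { interactive => 1, poller => 0 });
 printf "booked: %s; left: %s\n",
     join(',', map { $_->{id} } @$booked),
     join(',', map { $_->{id} } @$left);

With one Interactive worker and no Poller workers, only the first interactive
job is booked; the other two stay in the central queue for another daemon.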

=head2 SNMP::Info

The daemon obviously needs to use L<SNMP::Info> for device control. All the
code for this has been factored out into the L<App::Netdisco::Util> namespace.

The L<App::Netdisco::Util::Connect> package provides for the creation of
SNMP::Info objects along with connection tests. So far, SNMPv3 is not
supported. To enable trace logging of the SNMP::Info object simply set the
C<INFO_TRACE> environment variable to a true value.  The Connect library also
provides routines to map interface and PoE IDs.

Configuration for SNMP::Info comes from the YAML files, of course. This means
that our C<mibhome> and C<mibdirs> settings are now in YAML format. In
particular, the C<mibdirs> list is a real list within the configuration.
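
Because C<mibdirs> is a real list, it can be processed directly once the
config is loaded. The sketch below uses a plain hash as a stand-in for the
parsed YAML, and assumes that each C<mibdirs> entry names a directory below
C<mibhome>; both the values and that relationship are assumptions made for
illustration.

 use strict;
 use warnings;
 use File::Spec;

 # Stand-in for the parsed YAML config; values are illustrative only
 my %config = (
     mibhome => '/usr/local/netdisco/mibs',
     mibdirs => [qw/ rfc net-snmp cisco /],
 );

 # Assumption: each mibdirs entry names a directory below mibhome
 sub mib_search_path {
     my ($conf) = @_;
     return map { File::Spec->catdir($conf->{mibhome}, $_) }
                @{ $conf->{mibdirs} || [] };
 }

 print "$_\n" for mib_search_path(\%config);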

Other libraries will be added to this namespace in due course, as we add more
functionality to the Job Daemon.

=head2 DBIx::Class Layer

The local job queue for each Job Daemon is actually an SQLite database running
in memory. This makes the queue management code a little more elegant. The
schema for this is of course DBIx::Class using Dancer connection management,
and lives in L<App::Netdisco::Daemon::DB>.
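
To show what an in-memory SQLite database looks like in practice, here is a
small sketch using plain L<DBI> with L<DBD::SQLite> rather than DBIx::Class.
The table and column names are illustrative and are not the daemon's real
schema.

 use strict;
 use warnings;
 use DBI;

 # An in-memory SQLite database exists only for the life of the handle
 my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
     { RaiseError => 1, AutoCommit => 1 });

 # Table and column names here are illustrative, not Netdisco's schema
 $dbh->do('CREATE TABLE job_queue (job INTEGER PRIMARY KEY, action TEXT)');
 $dbh->do('INSERT INTO job_queue (job, action) VALUES (?, ?)',
     undef, 1, 'portcontrol');

 my ($count) = $dbh->selectrow_array('SELECT count(*) FROM job_queue');
 print "queued jobs: $count\n";

Nothing is written to disk, so the queue starts empty every time the daemon
process is restarted.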

There is currently only one table, the port control job queue, in
L<App::Netdisco::Daemon::DB::Result::Admin>. It's likely this name will change
in the future.

=cut