Blogs

Released: Netomata Config Generator (NCG), version 0.10.2

I just posted a slightly revised version of Netomata Config Generator (NCG), version 0.10.2. The only difference from the recently-released version 0.10.1 is a small change in which Ruby libraries we depend upon, because of a reorganization of one of the commonly-used Ruby libraries.

Basically, NCG makes heavy use of the Ruby Dictionary class, which used to be part of the Facets library; however, as of the recent Facets 2.9.0 release, the Dictionary class is no longer included in Facets, and you have to get it from the Ruby Hashery library instead.

If you've got a working NCG installation, this shouldn't affect you, unless you update your Ruby libraries so that you only have Facets 2.9.0 or later. A simple "gem update" to update your libraries would install the new version of Facets, but wouldn't delete the old version (unless you also did a "gem cleanup"), and Ruby will happily continue to take the Dictionary class from the older version of Facets if it's still there.

If you do break your NCG installation by updating Facets to 2.9.0 and then deleting the older version(s), it's easy to add the older version back simply by doing "gem install -v 2.8.4 facets". Your Ruby installation will prefer Facets 2.9.0, but will fall back to Facets 2.8.4 for stuff that it can't find in 2.9.0 (including the "Dictionary" class that NCG depends upon).

As always, the latest version of NCG can be downloaded from http://www.netomata.com/tools/ncg

Thanks to Bryan Wann for bringing this problem to our attention, and suggesting the fix.

Catch me working Emergency Services at Decompression

This Sunday (10 Oct 2010), I'll be volunteering at Burning Man's "Decompression" street faire in San Francisco with the Emergency Services Department (ESD). If you're at Decompression, stop by and say hi; I should be at the ESD/Medical booth most of the time, particularly from 6pm to midnight (I'll be on duty, but I don't expect to be too busy to chat). If you want to get a taste of Burning Man, and you're in the Bay Area, come to Decompression!

Although I'm mostly known professionally for my work in networking, automation, firewalls, and IT infrastructure, I also have a strong interest in emergency services. Many of my networking colleagues aren't aware of it, but I've got nearly two decades of volunteer experience in air search and rescue (with Civil Air Patrol, as everything from a search pilot through an incident commander) and community disaster preparedness (with Community Emergency Response Teams in Mountain View, San Francisco, and Alameda, California, as both a team member and an instructor).

My most recent emergency services volunteer work has been with Burning Man's Emergency Services Department. This year at Burning Man, I worked as a 911 dispatcher. We dispatch medical, fire, and mental health responses for the event's 50,000 participants, as well as coordinate with several local, state, and federal law enforcement agencies. As dispatchers, the resources at our fingertips included over a dozen pieces of apparatus (fire engines, water tenders, medical quick response vehicles, and various hybrid fire/medical vehicles), two medical aid stations, and several dozen personnel on duty at any given time; through allied agencies, we could also call on a fully-equipped emergency medical clinic, several ambulances, dozens of Black Rock Rangers (Burning Man's non-confrontational community mediators), and dozens of law enforcement personnel.

Occasionally, I've had the opportunity to merge my "professional" and "volunteer" lives, such as when I worked as a disaster relief volunteer designing and deploying free wireless Internet access after Hurricane Katrina, or when I've given my "Incident Command for IT: What We Can Learn from the Fire Department" presentations.

Released: Netomata Config Generator (NCG), version 0.10.1

I'm pleased to announce the release of Netomata Config Generator (NCG), version 0.10.1.

Netomata Config Generator (NCG) creates complete, ready-to-install config files for network devices and services from a common light-weight model of your network. Because these config files are generated programmatically (rather than by hand), and generated from a shared model (rather than being managed separately for each device or service), they are more likely to be consistent and complete, which makes your network more reliable, easier to troubleshoot, and easier to expand in both size and functionality.

The inputs to NCG are a model describing your network (neto and neto_table files), and templates (ncg files) for the config files of the various devices (routers, switches, load balancers, firewalls, etc.) and services (SNMP, DNS, DHCP, etc.) that you want to generate config files for. From these inputs, NCG produces complete, consistent, ready-to-install config files for those devices and services.

For more information about Netomata and the philosophy behind NCG, see

New Features

Significant new features of this release include:

  • .neto file processing has been changed so that [% ... %] ERB template processing can be used on any line, not just in "key = value" assignments.
  • A new "<" header line directive has been added for .neto_table files, which is analogous to an include directive in a .neto file. This provides a convenient way to apply many lines of directives in .neto format to each line of data in the .neto_table file (sort of like a subroutine). For example, the following in a list of .neto_table header lines:
    
            < key filename [ ... ]
    
    tells ncg to process the .neto format file(s) named by "filename ..." in the context of "key". This is equivalent to the .neto format construct of
    
            key {
                include filename [ ... ]
            }
    
  • Assigning a value of "-" (either via a "key = value" statement in a .neto file, or in a column of data in a .neto_table file) will cause a key to be deleted.
  • Several new utility methods have been added, to do things like manipulate netmasks in IP addresses and sort lists of interfaces into their "natural" order.
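
To illustrate what "natural" order means for interface names, here is a minimal Python sketch of the general idea. It is not NCG's actual utility method (NCG is written in Ruby), and the function names and sample interface names are made up for the example. The trick is to split each name into text and number chunks, so that the numbers compare as numbers rather than as strings:

```python
import re


def natural_key(name):
    """Split a name into alternating text and integer chunks for sorting."""
    # re.split with a capturing group keeps the digit runs, so the result
    # always alternates text, number, text, number, ...
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", name)]


def natural_sort(names):
    """Return the names sorted so that, e.g., port 2 comes before port 10."""
    return sorted(names, key=natural_key)


if __name__ == "__main__":
    # Hypothetical interface names, for illustration only.
    interfaces = ["Gi1/0/10", "Gi1/0/2", "Gi1/0/1"]
    print(sorted(interfaces))        # plain string sort puts 10 before 2
    print(natural_sort(interfaces))  # natural sort puts 2 before 10
```

A plain string sort would put "Gi1/0/10" ahead of "Gi1/0/2", which is almost never what you want when you're generating per-interface config stanzas.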

For a full list of changes, see the RELEASE_NOTES file, which is both within the release and available separately in the download section of the Netomata Config Generator (NCG) page.

Downloading

NCG can be downloaded from the Netomata Config Generator (NCG) page.

NCG requires Ruby 1.8.6 (it may work under other versions, but hasn't been tested with them) and the "Facets" Ruby Gem. NCG should work just fine on pretty much any UNIX/Linux-based operating system which has Ruby 1.8.6 available. The REQUIREMENTS file (included in the distribution, or available via the "Files" section of the NCG page) details how to install the necessary prerequisites on a variety of different platforms.

Documentation

Please see the Netomata Config Generator (NCG) program documentation pages.

License

NCG is released as free open source software under the GNU General Public License, version 3. Please see the Netomata Config Generator (NCG) License page for full details. If you would like to discuss alternative licensing terms, please email license@netomata.com.

Status

The current release should definitely be considered experimental in nature. Commands and file formats are all subject to change, as we work out what paradigms work best and figure out how to get the most out of this tool. We'll try to limit future changes that break backwards compatibility, but we can't promise that at this stage; right now, we believe that it's more important to experiment with how best to build and use this tool, rather than to carve anything in stone.

To make an analogy to programming languages, this release is like the interpreter for a language that is so new that the standard libraries for the language haven't been developed yet. The basic language capability is there, but the functionality and leverage normally provided by standard libraries doesn't exist yet.

Join us!

Please download NCG, give it a try, and join the community that's working to make this a useful and valuable tool for improving the reliability and scalability of networks like yours!

Netomata releases web-based Config Review Tool

Once you've used a tool like the Netomata Config Generator (NCG) to generate configs for a bunch of devices on your network, how do you convince yourself that those new configs are complete and correct and ready to deploy? How do you determine that the newly-generated configs differ from the old configs in only the ways that you want, and that you haven't inadvertently introduced unintended changes?

Wouldn't it be great if you could, say, compare the newly-generated configs to the original (hand-created) configs for those devices, or to the previous generated configs? And how cool would it be if there was some sort of "approval" mechanism wrapped around this, so that you could easily identify the files that had been reviewed and approved as good-to-go for installation?

We've got a tool for you!

We've just released the Netomata Config Review Tool, which addresses these issues. It is a simple web-based tool for reviewing NCG-generated config files and approving them for installation on devices. It is written in Ruby as a web CGI program; it should work fine on any web server that supports CGI programs, such as Apache. We're releasing it as open source under a GPLv3 license (the same as NCG).

This tool is an outgrowth of a recent consulting project that we did for Netflix, helping them install NCG and set it up to generate configs for the routers at their dozens of shipping hubs throughout the USA. We'd love to do a project like this for your organization, too!

How it works

For each device, the tool keeps track of 3 config files (if they exist):

  • Original: the config that the device was originally running (which was presumably created by hand)
  • Generated: the most recent config generated by NCG
  • Approved: the most recent generated config that has been "approved" via this process

For each device, this tool lets you:

  • View the Original, Generated, and (if it exists) Approved config
  • See diffs between pairs of configs:
    • Original => Generated
    • Generated => Approved
    • Original => Approved
  • Approve a Generated config (i.e., make it the Approved config for the device)
  • Unapprove a currently approved config (i.e., delete the Approved config for the device)
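
To give a feel for the "diff between pairs of configs" step, here is a rough Python sketch using the standard difflib module. This is not the tool's own code (the tool is a Ruby CGI program); the function name, labels, and sample config lines are all assumptions made up for the example:

```python
import difflib


def config_diff(old_text, new_text, old_label="Original", new_label="Generated"):
    """Return a unified diff (as a list of lines) between two config texts."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # the input lines carry no newlines, so add none here
    ))


if __name__ == "__main__":
    # Hypothetical config fragments, for illustration only.
    original = "hostname rtr1\nsnmp-server community public RO\n"
    generated = "hostname rtr1\nsnmp-server community s3cret RO\n"
    for line in config_diff(original, generated):
        print(line)
```

An empty result means the two configs are identical; anything else is exactly the set of changes a reviewer would need to look at before approving.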

The tool does not (yet) install approved configs on devices; the assumption is that you will use a tool such as RANCID to do that, from the files in the "approved" directory.

How to get it

You can read all about it, see screen shots, and download the code at http://www.netomata.com/wiki/config_review_tool

Presenting a free webinar about Network Automation, Wed 23 June 2010

On Wed 23 June 2010, I'll be presenting a 30-minute overview of network automation benefits and tools as part of a free online webinar produced by SearchNetworking, entitled Optimizing and Managing the Dynamic Enterprise Network:

Today, as more applications and IT functions converge on the network infrastructure, user expectations are higher than ever. The advent of cloud computing and virtualization demands a solid yet flexible network that can instantly adjust to changing conditions. Unfortunately, many IT departments today find themselves facing this technology challenge with lean networking teams and low budgets. That makes choosing the right network management and optimization tools critical.

In this free, one-day virtual seminar our experts will cover how to rethink your management strategy and implement techniques that allow networking teams to understand performance, make the most of the infrastructure, and offload low-level tasks so that they can focus on improving performance and making progress.

Attend and gain insight on how to:

  • Manage your network in the age of the dynamic network
  • Ensure application performance on the WAN
  • Use network automation to make your network more cost-effective, reliable, and flexible

And much more!

My part of the webinar is scheduled to start at 10:30am PDT (1:30pm EDT). After the webinar, I'll be online to answer questions from the audience.

Seats still available at *free* DevOps Day conference, Fri 25 June 2010 in Mountain View

A group of folks who are active in the emerging "devops" field are putting together DevOps Day, a free one-day conference on Friday 25 June 2010, in Mountain View, CA, hosted by LinkedIn:

DevOps Day is an open event for discussing all topics around improving the interaction between what is traditionally considered development activity and that which is traditionally considered operations activity.

...

DevOps Day US is a single-track conference organized around a series of panels where open discussion amongst all conference participants is encouraged.

This is a one-day "hmm, we're all facing similar issues; let's get together and talk about this" event being put together by practitioners, not a "conference" being sponsored by folks who are trying to sell you something. I expect it to be more like an extended user group meeting than anything else, and I'm looking forward to some very interesting discussions.

Planned discussions include:

  • Your mileage may vary: Experiences and lessons learned facing DevOps problems in the IT trenches (even if they weren’t calling it DevOps!). The good, the bad, the surprises, and ideas for the future.
  • Infrastructure as code: Automation is essential to DevOps. The infrastructure as code concept drives many of today’s cutting edge automation techniques. What is it all about? Where are its limitations?
  • Changing culture to enable DevOps: Changing tools is easy when compared to changing people and processes. How can we cultivate an organization’s culture to identify and solve DevOps problems?
  • Does the Cloud need DevOps? Does DevOps need the Cloud?: Examining the role that cloud technologies can play in solving DevOps problems and the role that DevOps solutions can play in getting the most out of cloud technologies.
  • DevOps requires visibility: monitoring, testing, and performance: Examining the (often overlooked) role of monitoring and testing techniques in solving DevOps problems.
  • DevOps outside of Web Operations: Much of the public discussion about DevOps focuses on Web Operations. This panel is about taking the lessons of DevOps to other types of IT.
  • Making the business case: We know that solving DevOps problems improves your business operations and improves the bottom line, but how do you explain that to your CEO or CFO? How do you get the executives to buy in and invest in DevOps solutions?

Expected participants include Luke Kanies (creator of Puppet) and Adam Jacob (creator of Chef), as well as practitioners from organizations such as LinkedIn, Shopzilla, Etsy, Cisco, ITA Software, and Tripwire.

All in all, it's a very interesting topic, and this looks like it will be a fascinating event. I'll be there, and I hope to see you there too!

Speaking about Automating Network Configuration at NANOG in SF, Sun 13 Jun 2010

This quarter's NANOG meeting is in San Francisco, and I'll be presenting a 90-minute tutorial on Automating Network Configuration:

You've been using tools like Puppet and cfengine to corral the complexity on your servers. You revel in the scalability, reliability, and ease of maintenance of doing it The Right Way. You don't fear the next change because you know the tools will just get it Right. But you still tremble at an 'enable' prompt, hoping you remembered all the bits that need to be twiddled, on all the networking devices everywhere. Is your DNS tied on straight - both ways? Is it all *really* being monitored by Nagios? As your network's complexity increases, so do the errors, inconsistencies, and omissions caused by manual configuration, and brokenness abounds. But wait - there's a way out of the swamp! Come hear Brent Chapman as he reveals methods and tools for automating the mind-numbing task of configuring network devices and services. Among other things, he'll talk about his cool new open source Netomata Config Generator, which addresses some of these problems.

Brent Chapman is the founder, CEO, and technical lead of Netomata, Inc. He is the coauthor of the highly regarded O'Reilly & Associates book Building Internet Firewalls. He is also the founder of the Firewalls, List-Managers, and Network-Automation Internet mailing lists, and the creator of the Majordomo mailing list management package. In 2004, Brent was honored with the annual SAGE Outstanding Achievement Award 'for outstanding sustained contributions to the community of system administrators'. He has been a frequent and popular speaker at USENIX, LISA, BayLISA, and many other events over the past 15 years.

I expect to be there for the full NANOG meeting, from Sun 13 Jun 2010 through Wed 16 Jun 2010; if you're there, too, I hope you'll come to my talk, or at least catch me and say hello.

And if you haven't registered for NANOG yet, it's not too late... As the NANOG web site says:

NANOG49 will feature presentations on networking advancements and techniques, educational tutorials, interesting tracks, and more. Whether you are new to the networking profession or a seasoned veteran, NANOG49 will educate and inform with a full agenda of interesting topics.

I highly recommend it, and I hope to see you there!

O'Reilly offering 25% Memorial Day discount for Velocity conference

The O'Reilly Velocity conference is only in its third year, but it has rapidly become one of my favorite events. If you do web operations or architecture, I'd say it's a "must do" conference; the amount of info you'll pick up in 2 short days (3 if you attend the workshops) is amazing.

Even better, O'Reilly has just announced a special 25% discount on registration, good from now through Memorial Day weekend (until Tue 1 Jun 2010); just use the discount code "MEMORIALDAY" when you register.

I hope to see you there!

ZipTie versus RANCID

Someone recently asked me to share my thoughts on ZipTie (now officially known as "AlterPoint NetworkAuthority Inventory" or "AlterPoint NAI") versus RANCID as network configuration management tools.

To begin with, what are these tools?

RANCID is a command line tool which handles configuration communications with various types of networking devices (most major brands of routers, switches, load balancers, firewalls, etc.). You can use it to copy config files to and from devices, or to execute a series of commands on the device. Essentially, RANCID pretends to be a human user of the device's command line interface, and you give RANCID a simple "script" to follow in dealing with the device (i.e., "when you see the 'login:' prompt, send 'admin'; then, when you see the 'password:' prompt, send 'opensesame'; then, when you see the 'alibabascave>' prompt, send 'enable'; then ..."). RANCID is sometimes used by itself, but more often used as a building block in larger, custom-built automated network management systems; people use it in conjunction with tools to manage an archive of config files (such as CVSweb), or in conjunction with tools to programmatically generate config files (such as our own Netomata Config Generator (NCG) tool), or in a wide variety of other ways.

ZipTie, on the other hand, has a slick web-based user interface, and is designed to be a complete "environment" for managing the devices on your network. According to its web page:

NetworkAuthority Inventory provides continuous discovery and tracking of your network devices. Using a simple, web-based interface you can backup and restore device configurations, detect configuration changes and compare configurations between devices. NetworkAuthority Inventory generates an accurate, real-time, detailed view of every device in your network and keeps it up to date.

So, what are the key differences between RANCID and ZipTie?

  • As already discussed, RANCID is a command line based tool that can also be used from shell scripts and other programs, while ZipTie is a web-based tool that is designed for interactive use (there are ways to drive ZipTie programmatically, but that's not its main purpose).
  • ZipTie includes a "discovery" mode, to find the manageable devices on your network; with RANCID, you have to tell it what you want it to manage.
  • Both ZipTie and RANCID will move configs to and from network devices. ZipTie gives you a web interface to do that, while RANCID is command line driven. Which of those is "better" depends on your situation, and your team's skills and preferences.
  • ZipTie has lots of different reports and graphs and such; RANCID has none of that.
  • ZipTie is largely self-contained; it probably already does most of what you might want, and there are various extensions (some provided by AlterPoint, and some by the community) to make it do even more, but integrating it with other tools might be more challenging. RANCID, on the other hand, does very little (just moves configs on and off devices, really, although you can also use it to run scripted commands on those devices) by itself, but is easier to integrate with other systems that you're building yourself.
  • ZipTie has a cool "compare config" tool that shows you how two config files (from different devices, or from different times on the same device) differ. With RANCID, you have to extract the right versions of the right files from CVS and then compare them yourself with "diff".
  • RANCID is some pretty ugly Perl code; it's hack piled upon hack atop other hack, haphazardly and occasionally supported by its user community, most of whom are excellent network engineers but only so-so programmers. ZipTie, on the other hand, is developed and supported by professional programmers at a "real" company, which uses it as the core of their money-making product, so they have a strong incentive to maintain and improve it. The flip side of that is the whole "open source versus commercial" debate; RANCID is open source, and ZipTie is commercial, although the basic package (which might be enough to meet your needs) is free.

So, essentially, I suggest the following approach to comparing these two tools for your situation:

  • Try ZipTie, to see if it does what you need, since it's already got so much functionality built-in (discovery, graphs, reports, config comparisons, etc.)
  • If ZipTie and its various add-ons don't do what you need, and you feel that you need to build your own solution, then building it on top of RANCID probably makes sense.

So you think you know traceroute...

Most network engineers and sysadmins would probably say that they're intimately familiar with 'traceroute', and consider it one of their fundamental network troubleshooting tools... I certainly do. But you might be amazed to learn, as I did, how much you don't know about traceroute.

Richard Steenbergen of nLayer Communications, Inc., did an excellent presentation on traceroute at this month's NANOG (North American Network Operators Group) meeting:

Among other things, this presentation shows you:

  • How traceroute works
  • What you can learn from the DNS hostnames returned by traceroute
    • Where the ISP/carrier boundaries are
    • Where the equipment is located, geographically (do you know what a CLLI code is?)
    • What type of equipment the ISP/carrier is using
  • What the round trip times reported by traceroute really mean
  • How you can be led astray by ICMP prioritization, rate limiting, asymmetric paths, and load balancing

One of the coolest tricks I learned from this presentation: to find out more about what's at the other end of a hop that appears to be a point-to-point link, assume that the IP address you see is one of the two usable addresses in a /30 subnet (as is commonly assigned to point-to-point links), and do a DNS reverse lookup of the other address in the /30.

This is useful, for example, in figuring out which egress port a packet went out on, since traceroute normally only shows you the ingress ports for each device along the way. For example, let's say I was looking at the following traceroute output, and wanted to know the egress port on router #3, as the packet moved to router #4:

brent% traceroute www.google.com
traceroute: Warning: www.google.com has multiple addresses; using 208.67.219.230
traceroute to google.navigation.opendns.com (208.67.219.230), 64 hops max, 40 byte packets
 1  192.168.0.1 (192.168.0.1)  3.145 ms  2.573 ms  2.382 ms
 2  75-101-29-1.dsl.static.sonic.net (75.101.29.1)  9.555 ms  9.054 ms  9.089 ms
 3  127.at-X-X-X.gw3.200p-sf.sonic.net (208.106.96.193)  9.510 ms  9.871 ms  9.194 ms
 4  200.ge-0-1-0.gw.equinix-sj.sonic.net (64.142.0.210)  11.965 ms  11.870 ms  11.839 ms
 5  0.as0.gw2.equinix-sj.sonic.net (64.142.0.150)  11.928 ms  12.519 ms  12.394 ms
 6  GigabitEthernet3-1.GW2.SJC7.ALTER.NET (157.130.194.17)  11.360 ms  16.257 ms  11.268 ms
 7  0.so-0-0-1.XL4.SJC7.ALTER.NET (152.63.51.50)  11.729 ms  11.679 ms  11.403 ms
 8  0.so-7-0-0.XL2.PAO1.ALTER.NET (152.63.113.21)  14.775 ms  17.455 ms 0.so-5-0-0.XL2.PAO1.ALTER.NET (152.63.48.9)  15.548 ms
 9  POS7-0.GW6.PAO1.ALTER.NET (152.63.55.14)  12.886 ms  13.143 ms  13.029 ms
10  65.203.37.46 (65.203.37.46)  13.517 ms  14.708 ms  16.566 ms
11  * * *
12  * * *
^C

To find out more about router #3's egress port, I look at the IP address for router #4 (64.142.0.210) and figure out what would be the other IP address in the same /30 (64.142.0.209). Hint: a /30 contains four addresses, of which the first (network) and last (broadcast) aren't assigned to interfaces, so the lower address in a /30 pair always ends in an odd number, and the higher address always ends in an even number. If the address you know ends in an odd number, the other address in the same /30 is going to be the next higher number, and if the address you know is even, the other is going to be the next lower number. Then I do a DNS reverse lookup of that address:

brent% dig -x 64.142.0.209

; <<>> DiG 9.4.3-P3 <<>> -x 64.142.0.209
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49382
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;209.0.142.64.in-addr.arpa.	IN	PTR

;; ANSWER SECTION:
209.0.142.64.in-addr.arpa. 259200 IN	PTR	200.ge-6-3-0.gw3.200p-sf.sonic.net.

;; Query time: 31 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Fri Nov 13 09:42:05 2009
;; MSG SIZE  rcvd: 91
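
If you'd rather not do the odd/even arithmetic in your head, the same trick is easy to script. Here's a minimal Python sketch using the standard ipaddress and socket modules; the function names are my own invention for this example, and it assumes (as above) that the link really is numbered out of a /30:

```python
import ipaddress
import socket


def slash30_peer(address):
    """Return the other usable address in the /30 containing `address`."""
    ip = ipaddress.IPv4Address(address)
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.IPv4Network(f"{address}/30", strict=False)
    hosts = list(network.hosts())  # the two usable addresses in the /30
    if ip not in hosts:
        raise ValueError(f"{address} is the network or broadcast address of {network}")
    peer = hosts[0] if ip == hosts[1] else hosts[1]
    return str(peer)


def reverse_lookup(address):
    """Return the PTR hostname for `address`, or None if there isn't one."""
    try:
        return socket.gethostbyaddr(address)[0]
    except OSError:
        return None


if __name__ == "__main__":
    peer = slash30_peer("64.142.0.210")  # router #4's ingress address
    print(peer, reverse_lookup(peer))    # the lookup needs working DNS
```

Keep in mind that not every point-to-point link uses a /30, so treat whatever hostname comes back as a clue rather than proof.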

Another handy tip from the presentation is that, since light travels through fiber optic cable at about 200 km (or 125 miles, if you prefer) per millisecond, each 1 ms of delay shown by traceroute (which, remember, is round trip delay) should represent about 100 km (62.5 mi) of distance if the delay were due entirely to the distance travelled (i.e., no queuing or processing delays). Using that fact, you can see that 40ms for a packet to go from San Francisco to New York (about 2500 miles, or 4000km) would be "normal", but 40ms for a packet to go from San Francisco to San Jose (about 50 miles, or 80km) would indicate a problem; it should take the packet less than 1ms to cover that distance and back, so something else (congestion or processing delays, for example) must account for the other 39ms.
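
That rule of thumb is simple enough to turn into a quick sanity check. The Python sketch below just encodes the arithmetic above; the 200 km/ms figure comes from the presentation, while the function names are assumptions made up for this example:

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in fiber, per the rule of thumb


def min_round_trip_ms(distance_km):
    """Best-case round trip time if delay were due to distance alone."""
    return 2 * distance_km / FIBER_KM_PER_MS


def unexplained_delay_ms(observed_rtt_ms, distance_km):
    """Portion of an observed RTT that distance alone can't account for."""
    return max(0.0, observed_rtt_ms - min_round_trip_ms(distance_km))


if __name__ == "__main__":
    # San Francisco to New York, about 4000 km: 40 ms is all distance.
    print(min_round_trip_ms(4000), unexplained_delay_ms(40, 4000))
    # San Francisco to San Jose, about 80 km: almost all of 40 ms is something else.
    print(min_round_trip_ms(80), unexplained_delay_ms(40, 80))
```

Real fiber paths are rarely straight lines between cities, so the best-case number is a floor, not a prediction.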

There's a lot more in this presentation, about more complex issues such as

  • how the way in which routers handle traceroute packets can produce biased results (most routers handle traceroute packets much more slowly than they handle "real" data packets, which can make things look much worse than they are)
  • how asymmetric paths can lead you astray (traceroute only shows you the path to a system, but if you're pulling lots of bytes from the system, as would typically be the case with a remote server, you probably care more about the path back from the system, which might be totally different)
  • how using MPLS, which is increasingly common in carrier networks, can lead to very confusing round-trip times in traceroute

Anyway, if you ever use traceroute, I highly recommend that you review this excellent presentation. I think you'll be pleasantly surprised at how much you learn.

Thanks to Strata Chalup of Virtual.net for bringing this very informative presentation to my attention.
