
Blogs

So you think you know traceroute...

Most network engineers and sysadmins would probably say that they're intimately familiar with 'traceroute', and consider it one of their fundamental network troubleshooting tools... I certainly do. But you might be amazed to learn, as I did, how much you don't know about traceroute.

Richard Steenbergen of nLayer Communications, Inc., gave an excellent presentation on traceroute at this month's NANOG (North American Network Operators Group) meeting.

Among other things, this presentation shows you:

  • How traceroute works
  • What you can learn from the DNS hostnames returned by traceroute
    • Where the ISP/carrier boundaries are
    • Where the equipment is located, geographically (do you know what a CLLI code is?)
    • What type of equipment the ISP/carrier is using
  • What the round trip times reported by traceroute really mean
  • How you can be led astray by ICMP prioritization, rate limiting, asymmetric paths, and load balancing

One of the coolest tricks I learned from this presentation: to find out more about what's at the other end of a hop that appears to be a point-to-point link, assume that the IP address you see is one of the two usable addresses in a /30 subnet (the size commonly assigned to point-to-point links), and do a DNS reverse lookup of the other address in the /30.

This is useful, for example, in figuring out which egress port a packet went out on, since traceroute normally shows you only the ingress port of each device along the way. Let's say I was looking at the following traceroute output, and wanted to know the egress port on router #3 as the packet moved on to router #4:

brent% traceroute www.google.com
traceroute: Warning: www.google.com has multiple addresses; using 208.67.219.230
traceroute to google.navigation.opendns.com (208.67.219.230), 64 hops max, 40 byte packets
 1  192.168.0.1 (192.168.0.1)  3.145 ms  2.573 ms  2.382 ms
 2  75-101-29-1.dsl.static.sonic.net (75.101.29.1)  9.555 ms  9.054 ms  9.089 ms
 3  127.at-X-X-X.gw3.200p-sf.sonic.net (208.106.96.193)  9.510 ms  9.871 ms  9.194 ms
 4  200.ge-0-1-0.gw.equinix-sj.sonic.net (64.142.0.210)  11.965 ms  11.870 ms  11.839 ms
 5  0.as0.gw2.equinix-sj.sonic.net (64.142.0.150)  11.928 ms  12.519 ms  12.394 ms
 6  GigabitEthernet3-1.GW2.SJC7.ALTER.NET (157.130.194.17)  11.360 ms  16.257 ms  11.268 ms
 7  0.so-0-0-1.XL4.SJC7.ALTER.NET (152.63.51.50)  11.729 ms  11.679 ms  11.403 ms
 8  0.so-7-0-0.XL2.PAO1.ALTER.NET (152.63.113.21)  14.775 ms  17.455 ms 0.so-5-0-0.XL2.PAO1.ALTER.NET (152.63.48.9)  15.548 ms
 9  POS7-0.GW6.PAO1.ALTER.NET (152.63.55.14)  12.886 ms  13.143 ms  13.029 ms
10  65.203.37.46 (65.203.37.46)  13.517 ms  14.708 ms  16.566 ms
11  * * *
12  * * *
^C

To find out more about router #3's egress port, I look at the IP address reported for router #4 (64.142.0.210) and figure out the other usable address in the same /30 (64.142.0.209). Hint: the lower address in a /30 pair always ends in an odd number and the higher address always ends in an even number, so if the address you know is odd, the other address in the /30 is the next number up; if it's even, the other address is the next number down. Then I do a DNS reverse lookup of that address:

brent% dig -x 64.142.0.209

; <<>> DiG 9.4.3-P3 <<>> -x 64.142.0.209
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49382
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;209.0.142.64.in-addr.arpa.	IN	PTR

;; ANSWER SECTION:
209.0.142.64.in-addr.arpa. 259200 IN	PTR	200.ge-6-3-0.gw3.200p-sf.sonic.net.

;; Query time: 31 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Fri Nov 13 09:42:05 2009
;; MSG SIZE  rcvd: 91
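
If you find yourself using this trick a lot, it's easy to script. Here's a minimal sketch in Bourne shell (my own illustration, not from the presentation; it assumes the address really is one of the two usable addresses of a /30):

#!/bin/sh
# p2p-peer: given one usable address of a /30 point-to-point link,
# compute the other usable address and reverse-resolve it.
# Usage: p2p-peer 64.142.0.210
ip=$1
last=${ip##*.}      # last octet
prefix=${ip%.*}     # first three octets
if [ $((last % 2)) -eq 1 ]; then
    peer="$prefix.$((last + 1))"    # odd: peer is the next address up
else
    peer="$prefix.$((last - 1))"    # even: peer is the next address down
fi
echo "$peer"
dig +short -x "$peer"

Running "p2p-peer 64.142.0.210" prints 64.142.0.209 followed by the same PTR record shown in the dig output above.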

Another handy tip from the presentation is that, since light travels through fiber optic cable at about 200 km (or 125 miles, if you prefer) per millisecond, each 1 ms of delay shown by traceroute (which, remember, is round trip delay) should represent about 100 km (62.5 mi) of distance if the delay were due entirely to the distance travelled (i.e., no queuing or processing delays). Using that fact, you can see that 40 ms for a packet to go from San Francisco to New York (about 2500 miles, or 4000 km) would be "normal", but 40 ms for a packet to go from San Francisco to San Jose (about 50 miles, or 80 km) would indicate a problem; it should take the packet less than 1 ms to cover that distance and back, so something else (congestion or processing delays, for example) must account for the other 39 ms.
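
If you want to apply this rule of thumb at the command line, a trivial helper (my own, not from the presentation) does the arithmetic:

#!/bin/sh
# min-rtt: lowest plausible round-trip time (in ms) over fiber for a
# given one-way distance in km, assuming ~200 km/ms propagation in
# fiber and zero queuing or processing delay.
# Usage: min-rtt 4000
echo "scale=1; 2 * $1 / 200" | bc

"min-rtt 4000" prints 40.0 for the San Francisco to New York case, and "min-rtt 80" prints .8 for San Francisco to San Jose; a reported round trip time far above the helper's answer points to queuing, processing delay, or a longer path than you think.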

There's a lot more in this presentation, about more complex issues such as:

  • how the way in which routers handle traceroute packets can produce biased results (most routers handle traceroute packets much more slowly than they handle "real" data packets, which can make things look much worse than they are)
  • how asymmetric paths can lead you astray (traceroute only shows you the path to a system, but if you're pulling lots of bytes from the system, as would typically be the case with a remote server, you probably care more about the path back from the system, which might be totally different)
  • how using MPLS, which is increasingly common in carrier networks, can lead to very confusing round-trip times in traceroute

Anyway, if you ever use traceroute, I highly recommend that you review this excellent presentation. I think you'll be pleasantly surprised at how much you learn.

Thanks to Strata Chalup of Virtual.net for bringing this very informative presentation to my attention.

Quadruple whammy for IT as a startup grows

At some point during their growth, usually around the 50-100 employee stage, most startups face a "quadruple whammy" of IT infrastructure challenges. If the startup doesn't recognize that this is happening (or, better yet, anticipate and prevent it from happening), IT can quickly become a major drag on the startup's continued growth.

Early on, a startup's IT needs are generally handled internally on an ad hoc basis by a de facto IT team of various personnel, acting in addition to their primary responsibilities as engineers, managers, and so forth. This works fine for a while, often for several years. At some point, though, as the startup continues to grow, several factors all come together:

  • The IT workload is increasing, as the number of employees, offices, and customers all increase simultaneously.
  • Expectations for the company's IT infrastructure are also increasing, faster even than the numerical growth of the company might suggest. As the company grows, everyone (new hires, long-time employees, management, customers, investors, regulators, etc.) expects more of the company's IT infrastructure, and becomes less tolerant of deficiencies that were accepted earlier on.
  • The members of the de facto IT team are all getting busier with their "real" jobs, leaving less time to "help out" with IT, at exactly the time when IT problems are becoming more complex because of the company's growth and rising expectations.
  • New hires are less able to fulfill their own IT needs, both because of changes in the nature of hiring over time (i.e., early hires in a startup tend to be more self-sufficient, while later hires have a higher expectation of what is already in place), and because of the ever-growing complexity of the company's IT infrastructure.

As a result of these factors, bad things start happening:

  • Routine IT infrastructure requests (moves, adds, changes, and troubleshooting) are an increasing burden on the de facto IT infrastructure team, all of whom have other primary responsibilities, to the point where the IT work is beginning to interfere with those other responsibilities.
  • Despite the best efforts of the de facto IT infrastructure team, the IT infrastructure isn’t living up to expectations throughout the company, and is beginning to become an obstacle.
  • Many IT infrastructure decisions are being made in an expedient and ad hoc fashion, without adequate contemplation of future needs, growth path, maintainability, and so forth, due to lack of a coherent IT infrastructure architecture and road map.

Essentially, at this point, the startup needs to put in place the framework of IT architectures, systems, processes, and people that will enable its IT infrastructure to facilitate the company's growth, rather than impede that growth.

Netomata's staff have helped many startups through this transition; if this situation sounds all too familiar to you, contact us, and we can help you too!

Perils of treating network management as a second-class service

Too many organizations treat network management as a "nice to have" part of their operational toolkit, rather than a "must-have" capability. You can usually get away with this for a while, but eventually your luck runs out...

Last week, I related an all-too-typical tale of woe about how a startup suffered an all-day customer-visible outage because of a network problem, explaining how network automation could have shortened the outage from hours to minutes. Well, it turns out that lack of network automation wasn't their only problem...

As it happened, at the time of the outage they had no network management capability at all: their sole network management host had suffered a disk failure several days earlier, and they hadn't gotten around to restoring it because it was "just the network management system".

Unfortunately for them and their customers, the failed system that was "just the network management system" would have:

  • enabled them to detect the failing ethernet switch (which was the root cause of the outage) much sooner, perhaps even before the switch totally failed, because that was where they were running their network status and performance monitoring tools such as Nagios and MRTG.
  • helped them diagnose the switch failure much more quickly, once the outage began, by referring to those same network status and performance monitoring tools.
  • quickly and efficiently paged everybody on the operations team when the outage began, instead of diverting somebody (who could otherwise have been working on resolving the problem) to alert everybody by phone, because their paging system was part of the status monitoring tool.
  • helped them quickly swap out the failed switch with a replacement, because the failed switch's last-saved configuration was backed up on the network management system.

In retrospect, I'm sure they wish that they had engineered "just the network management system" with the same level of service reliability as their customer-visible "production" systems. I'm sure they wish that they had treated the failure of "just the network management system" with the same sort of urgency as they would a failure of one of their customer-visible "production" systems.

Once the network management system failed, they were living on borrowed time. When something else failed (i.e., the ethernet switch), they were severely hampered in their ability to detect and deal with that failure, which resulted in an extended customer-visible outage. Even though the network management system isn't itself customer-visible, it is an essential part of providing a reliable service, and needs to be treated as such.

Netomata can help you avoid problems like this with your network, while making your network more cost-effective, reliable, and flexible; please contact us to discuss how.

How network automation could have shortened an all-day customer-visible outage

A friend of mine recently related a tale of woe about network problems at his startup, a cloud service provider. Unfortunately, because they lacked a network automation system, they suffered a day-long customer-visible service outage; if they'd had an appropriate network automation system, they could have dealt with the problem in less than an hour.

It all started with a failing Ethernet switch, one of the pair of core switches in their data center installation. The failing switch would simply drop its 10Gb Ethernet connection to the other core switch, with no warning and no explanation. They tried the obvious quick fixes (try a different port on the failing switch, try a different cable between the switches, etc.), with no success; no matter what they tried, they couldn't resurrect the connection to the other core switch.

For various reasons, a drop-in replacement switch wasn't immediately available. After a physical inspection, counting open and used ports on both switches, they determined that they had just enough open ports on the working switch to allow them to re-home all the connections from the failing switch. "All" they needed to do was configure those ports on the working switch, along with associated VLAN definitions, access control lists, and so forth. Essentially, they needed to merge the functionality from the two switch configs (failing and working) into a single switch config.

Manual Pain and Suffering

Unfortunately, they had to do this configuration work by hand, because they weren't using an automated configuration management tool such as NCG. Moving two dozen port configurations (plus associated VLAN definitions, access control lists, and so forth) from one switch to another by hand poses a number of problems:

  • The process is slow and error prone; it took them quite a while (many hours) and several iterations to get it right.
  • The process is complicated by inconsistencies and artifacts from past manual configuration of the devices. For example, they discovered that some of the nominally-unused ports on the "working" switch had been grouped into a port-channel group; they had to take time to understand that, figure out whether it was still needed or not, and then clean up those ports and associated virtual interfaces.
  • The process is risky. While they were making these changes on the working switch, they were risking inadvertently disrupting what was left of their network if they accidentally typo'd a command or applied something to the wrong port.
  • The process is intricate. The changes on the switch necessitate other changes beyond the switch. Even once they had the switch reconfigured, for example, they still needed to update their monitoring systems to monitor all the newly-activated ports on the switch. Since updating the monitoring systems is also a manual process, it too is slow, error-prone, and complicated.

Automated Nirvana

If they had been using an automated configuration management tool such as NCG, they could have been back in service much sooner (probably in less than an hour), with a much higher degree of confidence in the new config for the remaining switch.

A hypothetical automated configuration management system for their network would probably have the following characteristics:

  • A data file for each switch, describing the switch and listing its ports (see the sketch following this list). Each port would probably be described by a single line in this file, containing the following information about the port:
    • name -- e.g., "GigabitEthernet0/3/1".
    • class -- What is this port used for? E.g., is it an inter-switch trunk carrying all VLANs? An access port on a particular VLAN? An unused port?
    • description -- a human-meaningful word or phrase describing the port, for use in interface labels, usage graphs, and so forth.
  • A set of master config templates for the switches. Since the two core switches are similar in make/model and in function, the same master config template would likely be used for both, thus ensuring consistency between the two switches.
  • A set of sub-templates for particular classes of ports on the switches; for instance, given the classes described above, you would have sub-templates for the "trunk", "access", and "unused" classes. In addition to making the appropriate settings for a particular class of port, these sub-templates would also make any necessary additions to related things such as access control lists.
  • A set of templates for configuring the monitoring system (or systems), such as MRTG, Nagios, or similar. These would be used to generate monitoring configs that completely and correctly correspond to the switch configs.
  • An automated mechanism for getting configs onto the switches, such as RANCID or ZipTie.
  • A revision control mechanism such as RCS, CVS, Subversion or Git, to provide a history of the templates and data files that are inputs to the config generation process, as well as of the generated and installed configs.
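
For concreteness, here's what the per-switch port list described above might look like. This is a purely hypothetical format, invented for illustration (it is not actual NCG syntax), with made-up port names, classes, and descriptions:

# core-sw1 port list (hypothetical illustrative format)
# name                  class    description
GigabitEthernet0/1      trunk    "uplink to core-sw2"
GigabitEthernet0/3/1    access   "web farm, VLAN 10"
GigabitEthernet0/3/2    unused   ""

The master template would walk this list, expanding the "trunk", "access", or "unused" sub-template for each port, and the same data would drive the generated monitoring configs, so the switch configs and the monitoring configs can't drift out of sync.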

Here are the steps they could have followed instead of doing everything by hand, had they been using such an automated system:

  1. Review the switch port lists to simply count the number of ports used on the failing switch and the number of ports available on the remaining switch, to quickly determine that there were enough open ports available on the remaining switch to accommodate everything.
  2. Edit the "port" list for the remaining switch, cutting and pasting the lines from the list for failing switch, and making minor adjustments as necessary (in particular, to port names, since it's unlikely that the open ports on the remaining switch exactly correspond to the used ports on the failing switch).
  3. Generate the new config file for the remaining switch, as well as all dependent config files (i.e., for the monitoring systems).
  4. Inspect the newly-generated config files for reasonability, likely by comparing them to the previously-generated config files from before this change.
  5. Install the newly-generated config files on the relevant systems, using tools such as RANCID or ZipTie.
  6. Check all the updates into the revision control system (RCS, CVS, Subversion, Git, or whatever) so that there's a record of changes and a fallback position.

Comparison of manual and automated results

Using network automation tools such as NCG, RANCID, and ZipTie:

  • The incident could have been resolved in less than an hour, rather than stretching into a multi-hour outage while everything was redone by hand.
  • You could be much more confident that the resulting configs were complete, consistent, and correct.
  • All related configurations (i.e., the switch configurations and the monitoring system configurations) could be updated together, maintaining consistency between them.

In my experience, it only takes a week or two of work to use open source tools to assemble a network automation system for an existing network such as this (i.e., a handful of related switches and associated monitoring systems, all of which you already have working manually-created configs for).

Hopefully, my friend's company will see the light, and automate their network management so that they're better prepared for next time; maybe they'll even offer me a consulting contract to help them get there... ;-)

Please contact us to discuss how Netomata can help you avoid problems like this with your network, while making your network more cost-effective, reliable, and flexible.

Speaking about Automating Network Configuration at SASAG (Seattle Area Sysadmin Guild), next Thu 10 Sep 09

I'll be speaking about "Automating Network Configuration" at this month's Seattle Area Sysadmin Guild (SASAG) meeting, which is next Thursday evening (10 Sep 09) at 7:00pm in Room 403 of the EE1 (Electrical Engineering) building on the University of Washington campus.

Here's the description of the talk:

Automating Network Configuration

You've been using tools like Puppet and cfengine to corral the complexity on your servers. You revel in the scalability, reliability, and ease of maintenance of doing it The Right Way. You don't fear the next change because you know the tools will just get it Right. But you still tremble at an "enable" prompt, hoping you remembered all the bits that need to be twiddled, on all the networking devices everywhere. Is your DNS tied on straight - both ways? Is it all *really* being monitored by Nagios? As your network's complexity increases, so do the errors, inconsistencies, and omissions caused by manual configuration, and brokenness abounds. But wait - there's a way out of the swamp! Come hear world-renowned networking expert and popular BayLISA speaker Brent Chapman as he reveals methods and tools for automating the mind-numbing task of configuring network devices and services. Among other things, he'll talk about his cool new open source "Netomata Config Generator", which addresses some of these problems.

Brent Chapman is the founder, CEO, and technical lead of Netomata, Inc. He is the coauthor of the highly regarded O'Reilly & Associates book Building Internet Firewalls. He is also the founder of the Firewalls, List-Managers, and Network-Automation Internet mailing lists, and the creator of the Majordomo mailing list management package. In 2004, Brent was honored with the annual SAGE Outstanding Achievement Award "for outstanding sustained contributions to the community of system administrators". He has spoken at BayLISA numerous times over the past 15 years.

SASAG meetings are free and open to the public.

I hope to see you there!

Upcoming Talks

I'll also be presenting a full-day tutorial on Automating Network Configuration and Management at this year's USENIX LISA conference, on Mon 2 Nov 09 in Baltimore.

Please email me (brent@netomata.com) if you're interested in scheduling a presentation for your user group, company, or other organization.

Speaking about Automating Network Configuration at BayLISA, this Thu 20 Aug 09

I've just accepted a last-minute opportunity to speak about "Automating Network Configuration" at this month's BayLISA meeting, which is this Thursday evening (20 Aug 09) at 7:30pm at the LinkedIn offices at 2027 Stierlin Ct. in Mountain View, CA.

Here's the description of the talk:

Automating Network Configuration

You've been using tools like Puppet and cfengine to corral the complexity on your servers. You revel in the scalability, reliability, and ease of maintenance of doing it The Right Way. You don't fear the next change because you know the tools will just get it Right. But you still tremble at an "enable" prompt, hoping you remembered all the bits that need to be twiddled, on all the networking devices everywhere. Is your DNS tied on straight - both ways? Is it all *really* being monitored by Nagios? As your network's complexity increases, so do the errors, inconsistencies, and omissions caused by manual configuration, and brokenness abounds. But wait - there's a way out of the swamp! Come hear world-renowned networking expert and popular BayLISA speaker Brent Chapman as he reveals methods and tools for automating the mind-numbing task of configuring network devices and services. Among other things, he'll talk about his cool new open source "Netomata Config Generator", which addresses some of these problems.

Brent Chapman is the founder, CEO, and technical lead of Netomata, Inc. He is the coauthor of the highly regarded O'Reilly & Associates book Building Internet Firewalls. He is also the founder of the Firewalls, List-Managers, and Network-Automation Internet mailing lists, and the creator of the Majordomo mailing list management package. In 2004, Brent was honored with the annual SAGE Outstanding Achievement Award "for outstanding sustained contributions to the community of system administrators". He has spoken at BayLISA numerous times over the past 15 years.

I hope to see you there!

BayLISA meetings are free and open to the public. To help them know how many attendees to prepare for, BayLISA requests RSVPs by email to rsvp@baylisa.org.

Upcoming Talks

I'll be giving this same "Automating Network Configuration" talk in Seattle on the evening of Thu 10 Sep 09 for SASAG (the Seattle Area System Administrators Guild).

I'll also be presenting a full-day tutorial on Automating Network Configuration and Management at this year's USENIX LISA conference, on Mon 2 Nov 09 in Baltimore.

Please email me (brent@netomata.com) if you're interested in scheduling a presentation for your user group, company, or other organization.

Full-day network automation tutorial at USENIX LISA Conference, Mon 2 Nov 09

I'll be presenting a full-day tutorial on "Automating Network Configuration and Management" at the 2009 USENIX LISA Conference, on Monday 2 November 2009 in Baltimore MD.

For over 20 years, the Large Installation System Administration Conference (LISA) has been the must-attend system administration conference. It offers an unparalleled opportunity to meet and mingle with the leaders of the system administration industry.

-- Adam Moskowitz, LISA 2009 Program Chair

I'll also be doing a "Guru is In" Q&A session on Network Management on Thursday afternoon at the conference.

I hope I'll see you there!

Details about the "Automating Network Configuration and Management" Tutorial

Who should attend: Network and system administrators who want to bring the benefits of automated configuration and management to their networks. These benefits include consistency, reliability, repeatability, and scalability; the automation techniques covered apply to the whole range of network devices (routers, switches, load balancers, firewalls, etc.) and services (SNMP status and performance monitoring, DNS, DHCP, ACLs, routing, etc.). Students should already be generally familiar with networking fundamentals (addressing, naming, routing, etc.), the roles and basic methods of operation of common network devices and services, and how these devices and services are typically configured and managed by hand; this tutorial isn't going to teach you what a firewall is or how it works, for example, but it will teach you how to automate the configuration and management of a typical firewall.

This tutorial introduces students to a variety of network automation principles and practices, as well as to specific network automation tools such as Netomata Config Generator (NCG) for generating device/service config files, RANCID and ZipTie for managing configs on devices, and Nagios and MRTG for SNMP network status and performance monitoring. In addition, the tutorial shows how to integrate these network automation tools with host automation tools such as Puppet and Cfengine.

Take back to work: Effective techniques for automating the configuration and management of common network devices and services, as well as approaches to getting the most out of automation and arguments to convince peers, managers, and executives that automation is worth the effort.

Topics include:

  • Benefits of automation
  • Aspects of automation
    • Keeping track of what is connected to your network, and how
    • Generating configs
    • Getting configs to and from devices
    • Change management and control
    • Principles of automation
    • Levels of automation
  • Tools
    • RANCID
    • ZipTie
    • NCG (Netomata Config Generator)
    • Vendor-specific device configuration tools
  • Automating configuration of network devices
    • Routers
    • Switches
    • Firewalls
    • Load balancers
    • PDUs
  • Automating configuration of network services
    • SNMP status monitoring (e.g., Nagios)
    • SNMP trend monitoring (e.g., MRTG)
    • DNS
    • DHCP
    • ACLs
    • VLANs
    • VPNs
  • Integration with host automation systems, such as Puppet and Cfengine
  • Best practices, pearls of wisdom, tips and tricks
  • Emerging trends and special circumstances
    • Virtualization
    • Cloud computing (including public, private, and hybrid clouds)
    • QA labs, testbeds, and development environments
    • IPv6
    • COBIT
    • ITIL
  • Strategies for promoting automation in your organization
    • Arguments to convince management to support automation
    • Arguments to convince staff to support automation
    • Methods for gradually automating existing networks


Using NCG to configure the IETF meeting network

I just got back from two weeks in Sweden, where I was helping the IETF (Internet Engineering Task Force) use Netomata Config Generator (NCG) to set up and manage the LAN for their thrice-annual meeting.

The IETF is the key infrastructure standards body for the Internet. They meet for a week approximately every 4 months, in locations around the world. There were about 1100 attendees at last week's meeting in Stockholm, Sweden.

Because of both the nature of its work and the nature of its attendees, IETF meetings require a fairly heavy-duty LAN and significant Internet connectivity, well beyond what most meeting facilities can supply directly; on a typical day during the meeting, for example, we were pulling down an average of about 40 Mb/s from the Internet, with peaks to 60+ Mb/s (and we were sending an average of about 12 Mb/s, with peaks to 26 Mb/s). The net was organized into a dozen or so VLANs and about 8 WiFi SSIDs, spread out across 5 floors of the Stockholm City Conference Centre and surrounding facilities. We were using IPv6 on all the VLANs (and only IPv6 on one of them), BGP to our upstream ISP, SNMP, VoIP, streaming audio and video, and a wide variety of other "challenging" protocols; it's a complex set of requirements.

Physically, the meeting's network consisted of a pair of Juniper M7100 edge/core routers, 20 or so Cisco 3560 and 3750 switches, and about 35 Cisco 1250 wireless access points (WAPs), plus associated server infrastructure (a handful of Linux servers running as virtual machines on a couple of VMware hosts, for stuff like DNS, DHCP, RANCID, and Cacti). Connections were mostly Cat5 copper, with a few fiber runs to some of the more distant switches.

I'm told that this was actually a relatively small and quiet net for an IETF meeting, compared to the past several meetings; the number of attendees was down (because of the economy), and the meeting venue was fairly compact (which means we needed fewer switches and WAPs to cover it).

The IETF meeting network has some unique characteristics:

  • Because the meeting's organizers pay for every day that they're using the meeting facility (including setup days), the IETF networking team only has a couple of days before the meeting to get the network fully installed and configured.
  • Because the meeting is only a week long, time is critical when it comes to making changes to the network; on this network, expectations for time to complete user change requests are measured in minutes, versus hours or even days on a more "typical" enterprise network.
  • Because they build a network like this every 4 months, the IETF networking team has a standard network design that they use for each meeting, with variations for local circumstances (more or fewer APs or switches to handle the venue, different upstream ISP and BGP arrangements, timezone and other localization changes, etc.).

Netomata Config Generator (NCG) makes it easy to make the changes needed for a particular meeting's network, then generate complete, consistent, ready-to-install config files for all the routers, switches, and WAPs, as well as DNS data files, MRTG monitoring system config files, and RANCID config files for all of the network infrastructure.

After making a change to the network design (adding a new device, changing some parameter for an existing device, changing a parameter for all devices of a particular type, or whatever), NCG could regenerate all the config files for all the devices and services in about a minute.

To understand the value of using NCG, consider:

  • How much time it would have taken to manually configure 2 routers, 20 switches, and 35 WAPs
  • How many errors, omissions, and inconsistencies there would have been in those manually-created configurations
    • How much time would have been taken up trouble-shooting the problems created by those errors, omissions, and inconsistencies
    • How much of an impact those problems would have had on the network's users
  • How difficult and time-consuming it would have been to go back and make a change across all those devices

Our "Benefits of Automating Network Configuration" page discusses these benefits in more detail.

I got involved with this project through Jim Martin, who is the NOC Team Lead on the IETF networking team, as well as a well-regarded network architecture consultant. A few months ago, we were discussing NCG and how it might be useful for some of his projects, and one thing led to another, culminating in this trip to Sweden. I really enjoyed working with the IETF networking team, as well as visiting a wonderful part of the world that I'd never been to before. Thanks, Jim!

Netomata can make your network more cost-effective, reliable, and flexible! Please contact us today to discuss how we can help.

Generating Cisco "type 5" keys (MD5 hashes) using openssl

Cisco IOS always stores "enable secret" (and "username ... secret") passwords as MD5 hashes, known as "type 5" encryption, so that the clear-text passwords you type aren't visible in the saved config.

If you'd rather generate the MD5 hashes yourself, so that you never write a clear-text password into a config file that you're creating or editing, it turns out that you can use the openssl command (which is available for most UNIX, Linux, and Mac systems) to do so. For full details and instructions, see our http://www.netomata.com/wiki/cisco_tips#md5_hashes wiki page.
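
As a quick illustration (the wiki page has the full details): openssl's passwd command with the -1 option produces the MD5-crypt format that Cisco "type 5" secrets use. Run it with no password argument and it prompts for the password, so the clear text never lands in your shell history:

brent% openssl passwd -1
Password:
Verifying - Password:
$1$<salt>$<hash>

You can then paste the resulting $1$... string into the config you're editing, in a line such as "enable secret 5 $1$<salt>$<hash>" (the <salt> and <hash> placeholders stand in for the actual output).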

At the O'Reilly Velocity Conference, Tue-Wed 23-24 Jun 09 in San Jose

I'm at the O'Reilly Velocity Conference ("the Web Performance and Operations conference") in San Jose today and tomorrow (Tue-Wed 23-24 Jun 09). If you're here too, I hope you'll catch me and say hello...


