17th NMRG-Meeting (November 14th, 2004, UC Davis)
=================================================
Participants:
-------------
# Aiko Pras
(University of Twente, The Netherlands)
# Alexander Clemm
(Cisco, USA)
# Bert Wijnen
(Lucent)
# Dave Perkins
(Trapeze Networks, USA)
# Felix Wu
(UC Davis, USA)
# James Won-Ki Hong
(Postech, Korea)
# Jürgen Schönwälder
(International University Bremen, Germany)
# Lisandro Granville
(Federal University of Rio Grande do Sul, Brazil)
# Luciano Paschoal Gaspary
(Universidade do Vale do Rio dos Sinos, Brazil)
# Olivier Festor
(LORIA-INRIA, France)
# Radu State
(LORIA-INRIA, France)
# Raouf Boutaba
(University of Waterloo, Canada)
# Wes Hardaker
(Sparta, USA)
Minutes: Luciano (notes taken by Luciano, Wes and Jürgen)
Agenda:
-------
1. IAB Review (Jürgen)
2. SMI Metrics and Analysis (Jürgen)
3. Web Services for Management (Aiko)
4. Performance Evaluation for Management Frameworks (Olivier)
5. SNMP Measurements (Aiko and Jürgen)
6. Wrap Up
1. IAB Review
-------------
- Jürgen met with the IAB at the last IETF meeting for an IAB review
of the NMRG. Most of the IAB was present. The meeting was about
1 hour long.
- Slides reviewed: [1]
- Some of the discussions held regarding Jürgen's presentation are
summarized below.
- What do we understand by web services? Is it the same as SOAP over
HTTP? Or more than just SOAP? What is the real motivation for using
web services in network management?
- Distributed Network Management vs. classic centralized network
management. Routing area folks surely believed NM should be
distributed. Some people want NM that is fully automatic
(e.g. automatic problem determination and root cause analysis for
arbitrary technologies). Some people suggested focusing on managing
networks rather than devices.
- Lack of useful tools to manage networks. Discussion about usability
and lack of usability. How can the NMRG help? Internal discussion of
open source tools vs. commercial tools. Agreement that many
operators like open source stuff, of course.
- Discussion about SIP and about the integration and management of
SIP networks. Some believe that deployment of those networks is
increasing and that their management will become problematic in the
very near future as those app-to-app networks become larger and
larger. Problem space is not fully understood yet. Need for tools to
manage SIP networks. There are some applications to manage H.323
networks, but they are quite rare for SIP.
- Discussion held about IAB's lack of feedback on the issues related
to more general IRTF issues. What can the NMRG do? How can we get in
touch better with other groups? Jürgen suggested the promotion of a
workshop, where different research groups could present and discuss
their research interests. The workshop could be co-located with
another conference/workshop/meeting (e.g. INFOCOM, SIGCOMM, IM, IETF
meeting). Wes proposed the workshop could include all IRTF groups.
Bert suggested that we should start that if we feel it is needed,
rather than just making the suggestion. Jürgen said much of the
issue is getting interest. Jürgen would like greater interaction
with other groups, but his polling of those groups hasn't shown
much similar interest. Discussion on what the goals of better
communication should be. Should the chairs get together? Should the
groups as a whole get together? Should we try to get better about
publishing research results as documents which are marked as coming
from those groups (RFCs aren't marked as research work very well)?
Discussion about whether some people would come or not. Bert wanted
to know what we could do to facilitate interaction with the
operation/network management areas in the IETF. How can we push the
results back better? Bert and Jürgen had a discussion about what
IRTF groups should do vs IETF groups. Bert mentioned that
standardization work belongs in the IETF; "play with", "experiment
with", etc should be in the IETF. In some cases, when the parties are
very divided about the best way forward, the work should go to a
research group. Jürgen discussed some of the topics that have groups
in both areas. Bert believes that some people like to get stuff into
the IRTF groups just to publish RFCs through a different
mechanism. Pure research should stay in universities and similar
environments, and the IRTF should sit between that and the IETF.
2. SMI Metrics and Analysis
---------------------------
- Jürgen presented the results of his investigation on SMI metrics and
analysis. He wanted to simply collect some statistics and better
understand them. How many MIB objects are out there? How many are
writable? Which features are used heavily, and which are not? What's
the time scale of those modules? Typical size of a module? Size of
values?
- Presentation of results: [2]
- Discussion about how to obtain more MIB module sets.
- 2002 has the biggest spike in published/republished modules. IETF
starts early; 2001 has a big dip; 2004 looks fairly big. Cisco:
2002 was a huge spike; 2004 looks low. Enterasys: started later with
SMIv2 (1998); last few years are fairly big. Juniper: huge spike in
2002/2003; started 1998; 2004 looks quite low.
- Republishing: standards bodies republish modules more slowly.
- In some companies MIB modules are revised very often, while in
others they are not. Aiko asked if the low rate of MIB module
revision is the result of a policy or of conservative practice.
- Type usage: Dave commented that the BITS construct is frequently
not understood and people use integers instead (when they really
want BITS).
- Jürgen highlighted that there seems to be a correlation between
Cisco and IETF regarding the distribution of objects based on the
MAX-ACCESS attribute.
- It would be nice to see how policy affected publication of
particular object types/accessibility.
- Dave asked if the high number of indexes found in some tables is
due to filtering mechanisms.
- Discussion on object sizes and whether or not 484 octets is
realistic. What would be a realistic minimum max message size today?
Many doubt whether 484 is currently a reasonable minimum max message
size.
- Discussion on how to calculate MIB module complexity (e.g. via
description size, reference usage, etc). How to estimate that
numerically? Bert wanted to use the results of a complexity guess to
assign more or fewer MIB module reviewers. It was understood that the
complexity may come from ineptness, protocol complexity, and other
sources and thus it would be hard to figure out where it came
from. But some argued it would still be useful to look at the
results.
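To make the 484-octet discussion concrete, the following Python
sketch estimates the BER-encoded size of a single variable binding
that carries an OCTET STRING value. It is an illustration added to
these minutes, not something presented at the meeting. The function
names are hypothetical, and the SNMP message and PDU headers are
deliberately ignored, so the numbers are only a lower bound on what a
real message needs.

```python
def base128_len(value: int) -> int:
    """Octets needed for one OID sub-identifier in base-128 form."""
    n = 1
    while value >= 128:
        value >>= 7
        n += 1
    return n


def ber_length_octets(content_len: int) -> int:
    """Octets used by a definite-form BER length field."""
    if content_len < 128:
        return 1  # short form
    n = 0
    while content_len > 0:
        content_len >>= 8
        n += 1
    return 1 + n  # long form: one prefix octet plus the length itself


def ber_oid_size(oid: str) -> int:
    """Total size (tag + length + content) of a BER-encoded OID."""
    arcs = [int(a) for a in oid.strip(".").split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # The first two arcs share one sub-identifier: 40 * first + second.
    content = base128_len(40 * arcs[0] + arcs[1])
    content += sum(base128_len(a) for a in arcs[2:])
    return 1 + ber_length_octets(content) + content


def ber_octet_string_size(value_len: int) -> int:
    """Total size of a BER-encoded OCTET STRING of value_len octets."""
    return 1 + ber_length_octets(value_len) + value_len


def varbind_size(oid: str, value_len: int) -> int:
    """Size of one varbind: SEQUENCE { OID, OCTET STRING }."""
    inner = ber_oid_size(oid) + ber_octet_string_size(value_len)
    return 1 + ber_length_octets(inner) + inner
```

Summing varbind_size over the objects of a table row gives a rough
idea of how quickly a row with long index values or long strings
approaches the 484-octet limit.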
3. Web Services for Management
------------------------------
- Aiko presented some slides comparing the performance of SNMP and web
services for retrieving network management data. Three metrics have
been considered: bandwidth, latency (round-trip time), and CPU usage.
- Presentation of results: [3]
- SNMP bandwidth: formulas were derived for GET/BULK/etc; packet
lengths were calculated for SNMPv1/2c and not SNMPv3.
- Data was collected from a fairly large number of devices to
determine what the average OID size and average data size were.
- CPU time shows that web service compression takes more time than XML
encoding, which takes slightly more time than BER encoding.
- Data retrieval in Net-SNMP 5.0.9 is very CPU intensive and costs
much more CPU time than web service compression. In addition, web
service retrieval is fairly efficient. There was a lot of discussion
about how the two architectures were implemented and why Net-SNMP is
poorly designed.
- The main conclusions of Aiko's presentation were: (a) SNMP is better
for a small number of objects; (b) compressed web services are better
for a large number of objects; (c) XML versus BER may not be the
main issue; (d) data retrieval is frequently problematic; (e)
different SNMP agents perform quite differently.
- One aspect discussed was how much building SNMP agents on a
sub-agent architecture impacts their performance. Caching and
implementing functionality at the kernel level were mentioned as
ways to improve agent performance.
- Some discussion was held about how web services interface with the
kernel vs. how SNMP agents do by default. Most web service requests
would be organized as get-this-large-collection, and it would not
occur to those programmers to throw away the data retrieved from the
kernel, unlike SNMP (where requests frequently re-pull the data).
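The caching idea raised in the discussion above can be sketched in a
few lines of Python. This is a hypothetical illustration, not how
Net-SNMP or any agent discussed at the meeting works: the class name
TTLCache, the fetch callback, and the time-to-live value are all
assumptions. The point is only that repeated requests within a short
window reuse one expensive kernel read instead of re-pulling the data.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Remember fetched values for ttl_seconds before fetching again."""

    def __init__(
        self,
        fetch: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch        # expensive call, e.g. a kernel read
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can fake time
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # still fresh: no new fetch
        value = self._fetch(key)   # missing or stale: fetch and store
        self._entries[key] = (now, value)
        return value
```

An agent would wrap its table retrieval in such a cache so that the
successive GETNEXT requests of one table walk share a single
snapshot. The trade-off is staleness, bounded by the chosen
time-to-live.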
4. Performance Evaluation for Management Frameworks
---------------------------------------------------
- Olivier presented his research interest in understanding
management traffic patterns (e.g. what is the common behavior of
managers, what are the traffic patterns the manager generates, what
are the usual polling rates). He claimed it is difficult to compare
different investigations because they use parameters with
non-homogeneous meanings. Can we agree on certain specific metrics?
Something similar to IPPM in the context of network management.
- The understanding of management traffic patterns (e.g. algorithms
used by management applications) could help, for example, in the
definition of simulation models (as already available for other
research topics in networking). Wouldn't it be cool to have an ns2
tool which generates management traffic and which simulates agents?
- Raouf liked the idea of defining measurement parameters and
scenarios in order to evaluate the quality of management solutions.
- A common model for network management simulations. Note the risks of
using a common simulation model without explicit continued
validation. For example polling: adaptive polling, event-driven
polling, usual polling rates, etc.
- Monitoring of a few health-type metrics. Can we more formally define
those patterns? There was some disagreement about whether it would
be possible to standardize on those things and if there was such a
commonality.
- There was understanding that simulation models are frequently broken
and do not appropriately reflect reality much of the time. Many
people also inherit other people's models without validating whether
they are correct for their experiments.
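As one example of a polling behavior that a common simulation model
would have to capture, the following Python sketch shows a simple
adaptive polling rule. It is an assumption made for illustration, not
a model proposed at the meeting: the function name next_interval, the
10% change threshold, and the interval bounds are all hypothetical.
The manager polls faster while a value is changing and backs off
while it is stable.

```python
def next_interval(
    current: float,
    previous_value: float,
    new_value: float,
    *,
    threshold: float = 0.1,
    min_interval: float = 5.0,
    max_interval: float = 300.0,
) -> float:
    """Return the next polling interval in seconds.

    The interval is halved when the relative change between two
    consecutive samples exceeds threshold, and doubled otherwise.
    The result is clamped to [min_interval, max_interval].
    """
    # Guard against division by zero when the previous sample was 0.
    base = max(abs(previous_value), 1e-9)
    change = abs(new_value - previous_value) / base
    proposed = current / 2 if change > threshold else current * 2
    return min(max(proposed, min_interval), max_interval)
```

Any such rule would itself need validation against real traces
before being used in a simulation, which is the risk noted above.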
5. SNMP Measurements
--------------------
- Another approach to understanding SNMP dynamics (e.g. what OIDs
are requested, how smart are management applications, etc) was
introduced by Jürgen. The idea is to collect SNMP traffic from real
production networks.
- Jürgen showed a tool that is able to read traffic traces (pcap
format) and export SNMP packets as an XML document. It can also
suppress sensitive data and anonymize the content of each packet
(e.g. IP source/destination addresses and SNMP communities).
- The idea is to ask network operators to run the tool and provide XML
files containing the traffic traces.
- Lots of discussion on the legal difficulties in doing this. How can
we get around it? Bert/Wes objected that even when the content of
the packets is anonymized it would be possible to infer the network
topology.
- Different legislation may apply to a single traffic trace. Note
that user data may be involved in SNMP payload. Generic data sources
may be useless for collecting traces since the collectors might be
attached to the wrong places. Need to attach probes close to
management stations.
- In summary, having the data to look at would be nice, but it will
be difficult to get it and it would be representative of only a
single environment.
- Felix briefly described a Homeland Security project, which aims at
providing a framework to collect classes of data that are
interesting and/or that people want collected. There are
universities and companies supporting the project (e.g. University
of Michigan, University of Washington, Internet2, CAIDA, Verio, and
Equinix).
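The anonymization step described for the trace tool can be sketched
as follows in Python. This is not the tool Jürgen showed; the
function names and the keyed-hash approach are assumptions made for
illustration. Each IP address and community string is replaced by a
value derived from an HMAC with a secret key held by the operator, so
the same input always maps to the same pseudonym within a trace.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Map an IPv4 address to a keyed pseudonymous IPv4 address.

    The mapping is consistent for a given key but does not preserve
    prefixes, so subnet structure is not carried over.
    """
    packed = ipaddress.IPv4Address(addr).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    # Use the first four digest octets as the replacement address.
    return str(ipaddress.IPv4Address(digest[:4]))


def anonymize_community(community: str, key: bytes) -> str:
    """Replace an SNMP community string with a keyed pseudonym."""
    data = community.encode("utf-8")
    digest = hmac.new(key, data, hashlib.sha256).hexdigest()
    return "community-" + digest[:12]
```

Because the mapping is consistent, who talks to whom remains visible
in the anonymized trace. That is exactly the topology-inference
concern raised by Bert and Wes, and this sketch does not address it.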
6. Wrap Up
----------
- Two documents are going to be produced: (a) metrics for network
management (Olivier, Aiko, James, and Raouf); (b) measurements
guidelines (Aiko, Jürgen, and Olivier).
- Discussion of when to hold future meetings. A few were suggested
(IM/Nice, IETF/Paris, DSOM/Barcelona), but nothing definite has been
selected yet.
References
----------
[1] NMRG Status Report '2004
[2] Characterization of SNMP MIB Modules
[3] Web Service for Management - How is Performance?