17th NMRG-Meeting (November 14th, 2004, UC Davis)
=================================================

Participants:
-------------

# Aiko Pras 
  (University of Twente, The Netherlands)
# Alexander Clemm 
  (Cisco, USA)
# Bert Wijnen 
  (Lucent)
# Dave Perkins 
  (Trapeze Networks, USA)
# Felix Wu 
  (UC Davis, USA)
# James Won-Ki Hong 
  (Postech, Korea)
# Jürgen Schönwälder 
  (International University Bremen, Germany)
# Lisandro Granville 
  (Federal University of Rio Grande do Sul, Brazil)
# Luciano Paschoal Gaspary 
  (Universidade do Vale do Rio dos Sinos, Brazil)
# Olivier Festor 
  (LORIA-INRIA, France)
# Radu State 
  (LORIA-INRIA, France)
# Raouf Boutaba 
  (University of Waterloo, Canada)
# Wes Hardaker 
  (Sparta, USA)

Minutes: Luciano (notes taken by Luciano, Wes and Jürgen)

Agenda:
-------

1. IAB Review (Jürgen)
2. SMI Metrics and Analysis (Jürgen)
3. Web Services for Management (Aiko)
4. Performance Evaluation for Management Frameworks (Olivier)
5. SNMP Measurements (Aiko and Jürgen)
6. Wrap Up

1. IAB Review
-------------

- Jürgen met with the IAB at the last IETF meeting for an IAB review
  of the NMRG. Most of the IAB was present. The meeting lasted about
  one hour.

- Slides reviewed: [1]

- Some of the discussions held regarding Jürgen's presentation are
  summarized below.

- What do we understand by web services? Is it the same as SOAP over
  HTTP? Or more than just SOAP? What is the real motivation for using
  web services in network management?

- Distributed Network Management vs. classic centralized network
  management.  Routing area folks surely believed NM should be
  distributed.  Some people want NM that is fully automatic
  (e.g. automatic problem determination and root cause analysis for
  arbitrary technologies).  Some people suggested to focus on managing
  networks rather than devices.

- Lack of useful tools to manage networks. Discussion about usability
  and non-usability. How can the NMRG help? Internal discussion about
  open source tools vs. commercial tools. Agreement that many
  operators like open source stuff, of course.

- Discussion about SIP and about the integration and management of SIP
  networks. Some believe that deployment of those networks is
  increasing and that their management will become problematic in the
  very near future as those app-to-app networks become larger and
  larger. The problem space is not fully understood yet. There is a
  need for tools to manage SIP networks. There are some applications
  to manage H.323 networks, but they are quite rare for SIP.

- Discussion held about IAB's lack of feedback on the issues related
  to more general IRTF issues. What can the NMRG do? How can we get in
  touch better with other groups? Jürgen suggested the promotion of a
  workshop, where different research groups could present and discuss
  their research interests. The workshop could be co-located with
  another conference, workshop, or meeting (e.g. INFOCOM, SIGCOMM, IM,
  or an IETF meeting). Wes proposed the workshop could include all
  IRTF groups.
  Bert suggested that we should start that if we feel it is needed,
  rather than just making the suggestion. Jürgen said much of the
  issue is getting interest. Jürgen would like greater interaction
  with other research groups, but his polling of those groups hasn't
  shown much similar interest. Discussion on what the goals of better
  communication should be. Should the chairs get together? Should the
  groups as a whole get together? Should we try to get better about
  publishing research results as documents which are marked as coming
  from those groups (RFCs aren't marked as research work very well)?
  Discussion about whether some people would come or not.  Bert wanted
  to know what we could do to facilitate interaction with the
  operation/network management areas in the IETF. How can we push the
  results back better? Bert and Jürgen had a discussion about what
  IRTF groups should do vs. IETF groups. Bert mentioned that
  standardization work belongs in the IETF; "play with", "experiment
  with", etc. should be in the IETF. In some cases, when the parties
  are very divided about the best way forward, the work should go to a
  research group. Jürgen discussed some of the topics that have groups
  in both areas. Bert believes that some people like to get stuff into
  the IRTF groups just to publish RFCs through a different
  mechanism. Pure research should stay in universities and similar
  environments, and the IRTF should sit between that and the IETF.

2. SMI Metrics and Analysis
---------------------------

- Jürgen presented the results of his investigation on SMI metrics and
  analysis. He wanted to simply collect some statistics and better
  understand them. How many MIB objects are out there? How many are
  writable? What features are being used heavily, what are not? What's
  the time scale of those modules? Typical size of a module? Size of
  values?

- Presentation of results: [2]

- Discussion about how to obtain more MIB module sets.

- 2002 has the biggest spike in published/republished modules. IETF:
  starts early; 2001 has a big dip; 2004 looks fairly big. Cisco:
  2002 was a huge spike; 2004 looks low. Enterasys: started later with
  SMIv2 (1998); last few years are fairly big. Juniper: huge spike in
  2002/2003; started in 1998; 2004 looks quite low.

- Republishing: standards bodies republish modules more slowly.

- In some companies MIB modules are revised very often, while in
  others they are not. Aiko asked if the low rate of MIB module
  revision is the result of a policy or of conservative practice.

- Type usage: Dave commented that BITS is frequently not understood
  and people use integers instead (when they really want BITS).

- Jürgen highlighted that there seems to exist a correlation between
  Cisco and IETF regarding the distribution of objects based on the
  MAX-ACCESS attribute.

- It would be nice to see how policy affected publication of
  particular object types/accessibility.

- Dave asked if the high number of indexes found in some tables is
  due to filtering mechanisms.

- Discussion on object sizes and whether or not 484 octets is
  realistic. What would be a realistic minimum maximum message size
  today? Many doubt that 484 octets is still a reasonable minimum
  maximum message size.

- Discussion on how to calculate MIB module complexity (e.g. via
  description size, reference usage, etc.). How to estimate that
  numerically? Bert wanted to use the results of a complexity estimate
  to assign more or fewer MIB module reviewers. It was understood that
  the complexity may come from ineptness, protocol complexity, and
  other sources, and thus it would be hard to figure out where it came
  from. But some argued it would still be useful to look at the
  results.
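
  The statistics and complexity proxies discussed in this section
  (object counts, MAX-ACCESS distribution, description size, reference
  usage) can be approximated with a few regular expressions. The
  following is a minimal sketch only, not the tool used for the
  presented results: the sample module text, the object names, and the
  function module_metrics are hypothetical, and a real analysis would
  use a proper SMI parser rather than regular expressions.

```python
import re
from collections import Counter

# Hypothetical SMIv2 fragment used only for illustration.
SAMPLE_MODULE = '''
exampleIfCount OBJECT-TYPE
    SYNTAX      Integer32
    MAX-ACCESS  read-only
    STATUS      current
    DESCRIPTION "Number of interfaces."
    ::= { exampleObjects 1 }

exampleIfAdmin OBJECT-TYPE
    SYNTAX      INTEGER { up(1), down(2) }
    MAX-ACCESS  read-write
    STATUS      current
    DESCRIPTION "Desired interface state."
    REFERENCE   "Hypothetical reference text."
    ::= { exampleObjects 2 }
'''

# MAX-ACCESS values that allow a manager to modify the object.
WRITABLE = {"read-write", "read-create"}


def module_metrics(text):
    """Return crude size and complexity proxies for one module's text."""
    # Match definitions only, not the OBJECT-TYPE macro name in IMPORTS.
    objects = re.findall(
        r"^\s*[a-z][A-Za-z0-9-]*\s+OBJECT-TYPE\b", text, re.M)
    access = Counter(re.findall(r"MAX-ACCESS\s+([a-z-]+)", text))
    descriptions = re.findall(r'DESCRIPTION\s+"([^"]*)"', text)
    references = re.findall(r'REFERENCE\s+"', text)
    return {
        "object_types": len(objects),
        "max_access": dict(access),
        "writable": sum(access[a] for a in WRITABLE),
        "description_chars": sum(len(d) for d in descriptions),
        "references": len(references),
    }


metrics = module_metrics(SAMPLE_MODULE)
print(metrics)
```

  Whether description size or reference count says anything about
  review effort is exactly the open question raised in the discussion;
  the sketch only shows that such numbers are cheap to collect.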

3. Web Services for Management
------------------------------

- Aiko presented some slides comparing the performance of SNMP and web
  services for retrieving network management data. Three metrics were
  considered: bandwidth, latency (round-trip time), and CPU usage.

- Presentation of results: [3]

- SNMP bandwidth: formulas were derived for GET/BULK/etc.; packet
  lengths were calculated for SNMPv1/2c and not SNMPv3.

- Data was collected from a fairly large number of devices to
  determine the average OID size and the average data size.

- CPU time shows that web service compression takes more time than XML
  encoding, which takes slightly more time than BER encoding.

- The CPU time Net-SNMP 5.0.9 needs to retrieve the data is very high,
  much higher than the time for web service compression. Web service
  retrieval, in contrast, is fairly efficient. There was a lot of
  discussion about how the architecture of the two was done and why
  Net-SNMP is poorly designed.

- The main conclusions of Aiko's presentation were: (a) SNMP is better
  for a small number of objects; (b) compressed web services are
  better for a large number of objects; (c) XML versus BER may not be
  the main issue; (d) data retrieval is frequently problematic; (e)
  different SNMP agents perform quite differently.

- Another aspect discussed was how much building SNMP agents on a
  sub-agent architecture impacts their performance. Caching and
  implementing functionality at the kernel level were mentioned as
  ways to improve agent performance.

- Some discussion was held about how web services interface with the
  kernel vs. how SNMP agents do so by default. Most web service
  requests would be organized as get-this-large-collection, and it
  would not even occur to those programmers not to keep the data
  retrieved from the kernel, unlike with SNMP (where requests
  frequently re-pull the data).
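
  The caching idea mentioned above can be illustrated with a small
  time-to-live cache in front of an expensive bulk read. This is a
  sketch under stated assumptions, not how Net-SNMP or any measured
  agent works: CachedSource, read_kernel_table, and the 5 second
  lifetime are hypothetical, and the "kernel table" is a plain
  dictionary.

```python
import time


class CachedSource:
    """Serve per-object reads from one cached bulk fetch."""

    def __init__(self, fetch, ttl_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch          # expensive call returning a dict
        self._ttl = ttl_seconds      # how long a snapshot stays valid
        self._clock = clock          # injectable for testing
        self._data = None
        self._stamp = None

    def get(self, key):
        now = self._clock()
        # Refresh only when there is no snapshot or it has expired.
        if self._stamp is None or now - self._stamp >= self._ttl:
            self._data = self._fetch()
            self._stamp = now
        return self._data[key]


fetch_count = {"n": 0}


def read_kernel_table():
    """Hypothetical stand-in for an expensive kernel table read."""
    fetch_count["n"] += 1
    return {"ifInOctets.1": 1000, "ifInOctets.2": 2000}


source = CachedSource(read_kernel_table, ttl_seconds=5.0)
first = source.get("ifInOctets.1")
second = source.get("ifInOctets.2")
print(first, second, fetch_count["n"])
```

  Two per-object requests inside the lifetime trigger a single bulk
  read. The trade-off is staleness: values can be up to ttl_seconds
  old, which may or may not be acceptable for a given object.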

4. Performance Evaluation for Management Frameworks
---------------------------------------------------

- Olivier presented his research interest in understanding
  management traffic patterns (e.g. what is the common behavior of
  managers, what are the traffic patterns the manager generates, what
  are the usual polling rates). He claimed it is difficult to compare
  different investigations because they use parameters with
  non-homogeneous meanings. Can we agree on certain specific metrics?
  Something similar to IPPM in the context of network management.

- The understanding of management traffic patterns (e.g.  algorithms
  used by management applications) could help, for example, in the
  definition of simulation models (as already available for other
  research topics in networking). Wouldn't it be cool to have an ns2
  tool which generates management traffic and which simulates agents?

- Raouf liked the idea of defining measurement parameters and
  scenarios in order to evaluate the quality of management solutions.

- A common model for network management simulations. Note the risks of
  using a common simulation model without explicit continued
  validation. For example polling: adaptive polling, event-driven
  polling, usual polling rates, etc.

- Monitoring of a few health-type metrics. Can we more formally define
  those patterns?  There was some disagreement about whether it would
  be possible to standardize on those things and if there was such a
  commonality.

- There was agreement that simulation models are frequently broken
  and do not appropriately reflect reality much of the time. Many
  people also inherit other people's models without validating whether
  they are correct for their experiments.
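
  As a concrete instance of what a simulation model would have to pin
  down, the sketch below implements one possible adaptive polling
  policy: halve the interval after an observed change, double it when
  the value is stable, and clamp to fixed bounds. The policy, the
  bounds, and the names next_interval and poll_schedule are
  assumptions chosen for illustration; no particular policy was
  presented or agreed on at the meeting.

```python
def next_interval(current, changed, minimum=5.0, maximum=300.0):
    """Halve the interval after a change, double it otherwise."""
    proposed = current / 2 if changed else current * 2
    # Keep the interval within the configured bounds (seconds).
    return max(minimum, min(maximum, proposed))


def poll_schedule(values, start_interval=60.0):
    """Return the interval chosen after each polled value.

    The first poll has nothing to compare against and is treated
    as stable.
    """
    interval = start_interval
    previous = None
    schedule = []
    for value in values:
        changed = previous is not None and value != previous
        interval = next_interval(interval, changed)
        schedule.append(interval)
        previous = value
    return schedule


schedule = poll_schedule([10, 10, 10, 50, 90, 90])
print(schedule)  # [120.0, 240.0, 300.0, 150.0, 75.0, 150.0]
```

  Even this tiny model has four free parameters (start interval,
  bounds, and the halving/doubling factor), which illustrates why
  results from different studies are hard to compare without agreed
  metrics and validated models.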

5. SNMP Measurements
--------------------

- Another approach to understanding SNMP dynamics (e.g. what OIDs are
  requested, how smart management applications are, etc.) was
  introduced by Jürgen. The idea is to collect SNMP traffic from real
  production networks.

- Jürgen showed a tool that can read traffic traces (pcap format) and
  export SNMP packets to an XML document. It can also suppress
  sensitive data and anonymize packet contents (e.g. IP
  source/destination addresses and SNMP communities).

- The idea is to ask network operators to run the tool and provide XML
  files containing the traffic traces.

- Lots of discussion on the legal difficulties in doing this. How can
  we get around them? Bert/Wes objected that even with the packet
  contents anonymized it would be possible to infer the network
  topology.

- The laws of different jurisdictions may apply to a single traffic
  trace. Note that user data may be contained in the SNMP payload.
  Generic data sources may be useless for collecting traces since the
  collectors might be attached in the wrong places. Probes need to be
  attached close to management stations.

- In summary, having the data to look at would be nice, but it will
  be difficult to get, and each trace would be representative of only
  a single environment.

- Felix briefly described a Homeland Security project, which aims at
  providing a framework to collect classes of data that are
  interesting and/or that people want collected. Universities and
  companies support the project (e.g. University of Michigan,
  University of Washington, Internet2, CAIDA, Verio, and Equinix).
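
  The anonymization step described for the trace tool earlier in this
  section (IP addresses and SNMP communities) can be sketched with a
  keyed hash, so that the same input always maps to the same
  pseudonym. This is an illustrative sketch and not the tool that was
  shown: the packet dictionary layout, the secret, and the function
  names are hypothetical, and the address mapping is not
  prefix-preserving.

```python
import hashlib
import hmac
import ipaddress


def pseudonym(secret, value, length=8):
    """Return a short, stable pseudonym for a string value."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def anonymize_ipv4(secret, address):
    """Map an IPv4 address to another IPv4 address, consistently."""
    ipaddress.IPv4Address(address)  # raises ValueError if malformed
    digest = hmac.new(secret, address.encode("ascii"), hashlib.sha256)
    # The first four digest bytes are read as a packed IPv4 address.
    return str(ipaddress.IPv4Address(digest.digest()[:4]))


def anonymize_packet(secret, packet):
    """Anonymize addresses and community; keep the OIDs untouched."""
    return {
        "src": anonymize_ipv4(secret, packet["src"]),
        "dst": anonymize_ipv4(secret, packet["dst"]),
        "community": pseudonym(secret, packet["community"]),
        "oids": list(packet["oids"]),
    }


SECRET = b"hypothetical-site-secret"  # would be kept by the operator
packet = {
    "src": "192.0.2.10",
    "dst": "192.0.2.20",
    "community": "public",
    "oids": ["1.3.6.1.2.1.1.3.0"],
}
anonymized = anonymize_packet(SECRET, packet)
print(anonymized)
```

  Because the mapping is consistent, who talks to whom is preserved
  in the anonymized trace. That is useful for analysis, and it is
  also why the concern raised above about inferring topology still
  applies.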

6. Wrap Up
----------

- Two documents are going to be produced: (a) metrics for network
  management (Olivier, Aiko, James, and Raouf); (b) measurement
  guidelines (Aiko, Jürgen, and Olivier).

- Discussion of when to hold future meetings. A few were suggested
  (IM/Nice, IETF/Paris, DSOM/Barcelona), but nothing definite has
  been selected yet.

References
----------

[1] NMRG Status Report '2004 
    <http://www.ibr.cs.tu-bs.de/projects/nmrg/>

[2] Characterization of SNMP MIB Modules 
    <http://www.ibr.cs.tu-bs.de/projects/nmrg/meetings/2004/davis/>

[3] Web Service for Management - How is Performance?
    <http://www.ibr.cs.tu-bs.de/projects/nmrg/meetings/2004/davis/>