17th NMRG-Meeting (November 14th, 2004, UC Davis)
=================================================

Participants:
-------------

# Aiko Pras (University of Twente, The Netherlands)
# Alexander Clemm (Cisco, USA)
# Bert Wijnen (Lucent)
# Dave Perkins (Trapeze Networks, USA)
# Felix Wu (UC Davis, USA)
# James Won-Ki Hong (Postech, Korea)
# Jürgen Schönwälder (International University Bremen, Germany)
# Lisandro Granville (Federal University of Rio Grande do Sul, Brazil)
# Luciano Paschoal Gaspary (Universidade do Vale do Rio dos Sinos, Brazil)
# Olivier Festor (LORIA-INRIA, France)
# Radu State (LORIA-INRIA, France)
# Raouf Boutaba (University of Waterloo, Canada)
# Wes Hardaker (Sparta, USA)

Minutes: Luciano (notes taken by Luciano, Wes and Jürgen)

Agenda:
-------

1. IAB Review (Jürgen)
2. SMI Metrics and Analysis (Jürgen)
3. Web Services for Management (Aiko)
4. Performance Evaluation for Management Frameworks (Olivier)
5. SNMP Measurements (Aiko and Jürgen)
6. Wrap Up

1. IAB Review
-------------

- Jürgen met with the IAB at the last IETF meeting for an IAB review
  of the NMRG. Most of the IAB was present. The meeting lasted about
  one hour.
- Slides reviewed: [1]
- Some of the discussions held regarding Jürgen's presentation are
  summarized below.
- What do we understand by web services? Is it the same as SOAP over
  HTTP? Or more than just SOAP? What is the real motivation for using
  web services in network management?
- Distributed network management vs. classic centralized network
  management. Routing area folks clearly believed that NM should be
  distributed. Some people want NM that is fully automatic (e.g.
  automatic problem determination and root cause analysis for
  arbitrary technologies). Some people suggested focusing on managing
  networks rather than devices.
- Lack of useful tools to manage networks. Discussion about usability
  (and the lack of it). How can the NMRG help? Internal discussion of
  open source vs. commercial tools.
  Agreement that many operators like open source stuff, of course.
- Discussion about SIP and about the integration and management of
  SIP networks. Some believe that deployment of those networks is
  increasing and that their management will become problematic in the
  very near future as those app-to-app networks grow larger and
  larger. The problem space is not fully understood yet. There is a
  need for tools to manage SIP networks. There are some applications
  to manage H.323 networks, but they are quite rare for SIP.
- Discussion about the IAB's lack of feedback on more general IRTF
  issues. What can the NMRG do? How can we get in better touch with
  other groups?

  Jürgen suggested holding a workshop where different research groups
  could present and discuss their research interests. The workshop
  could be co-located with another conference/workshop/meeting (e.g.
  INFOCOM, SIGCOMM, IM, IETF meeting). Wes proposed that the workshop
  could include all IRTF groups. Bert said that, if we feel it is
  needed, we should start it ourselves rather than just make the
  suggestion. Jürgen said much of the issue is generating interest.
  Jürgen would like greater interaction with other groups, but his
  polling of the other groups has not shown much similar interest.

  Discussion on what the goals of better communication should be.
  Should the chairs get together? Should the groups as a whole get
  together? Should we try to get better at publishing research
  results as documents which are marked as coming from those groups
  (RFCs aren't marked as research work very well)? Discussion about
  whether some people would come or not.

  Bert wanted to know what we could do to facilitate interaction with
  the operations/network management areas in the IETF. How can we
  push the results back better?

  Bert and Jürgen had a discussion about what IRTF groups should do
  vs. IETF groups. Bert mentioned that standardization work belongs
  in the IETF; "play with", "experiment with", etc. should be in the
  IRTF.
  In some cases, when the parties are very divided about the best way
  forward, the work should go to a research group. Jürgen discussed
  some of the topics that have groups in both areas. Bert believes
  that some people like to get stuff into the IRTF groups just to
  publish RFCs through a different mechanism. Pure research should
  stay in universities and similar environments, and the IRTF should
  sit between that and the IETF.

2. SMI Metrics and Analysis
---------------------------

- Jürgen presented the results of his investigation on SMI metrics
  and analysis. He simply wanted to collect some statistics and
  understand them better. How many MIB objects are out there? How
  many are writable? Which features are used heavily and which are
  not? What's the time scale of those modules? Typical size of a
  module? Size of values?
- Presentation of results: [2]
- Discussion about how to obtain more MIB module sets.
- 2002 has the biggest spike in published/republished modules.
  IETF: starts early; 2001 has a big dip; 2004 looks fairly big.
  Cisco: 2002 was a huge spike; 2004 looks low.
  Enterasys: started later with SMIv2 (98); last few years are
  fairly big.
  Juniper: huge spike in 2002/2003; started 98; 2004 looks quite low.
- Republication: standards bodies republish more slowly.
- In some companies MIB modules are revised very often, while in
  others they are not. Aiko asked whether the low rate of MIB module
  revision is the result of a policy or of conservative practice.
- Type usage: Dave commented that bits are frequently not understood
  and people use integers instead (when they really want bits).
- Jürgen highlighted that there seems to be a correlation between
  Cisco and IETF regarding the distribution of objects based on the
  MAX-ACCESS attribute.
- It would be nice to see how policy affected the publication of
  particular object types/accessibility.
- Dave asked whether the high number of indexes found in some tables
  is due to filtering mechanisms.
- Discussion on object sizes and whether or not 484 is realistic.
  What would currently be a realistic minimum for the maximum message
  size? Many doubt that 484 is still a reasonable minimum.
- Discussion on how to calculate MIB module complexity (e.g. via
  description size, reference usage, etc.). How to estimate that
  numerically? Bert wanted to use the results of a complexity guess
  to assign more or fewer MIB module reviewers. It was understood
  that the complexity may come from ineptness, protocol complexity,
  and other sources, and that it would therefore be hard to figure
  out where it came from. But some argued it would still be useful to
  look at the results.

3. Web Services for Management
------------------------------

- Aiko presented some slides comparing the performance of SNMP and
  web services for retrieving network management data. Three metrics
  were considered: bandwidth, latency (round-trip time), and CPU
  usage.
- Presentation of results: [3]
- SNMP bandwidth: calculated formulas for GET/BULK/etc.; packet
  lengths calculated using SNMPv1/2c and not SNMPv3.
- Data was collected from a fairly large number of devices to
  determine the average OID size and the average data size.
- CPU time shows that web service compression takes more time than
  XML encoding, which takes slightly more time than BER encoding.
- For Net-SNMP 5.0.9, getting the data is very CPU intensive and
  costs much more CPU time than the web service compression. Web
  service retrieval, by comparison, is fairly efficient. There was a
  lot of discussion about how the architecture of the two
  implementations was done and why Net-SNMP is poorly designed.
- The main conclusions of Aiko's presentation were:
  (a) SNMP is better for a small number of objects;
  (b) compressed web services are better for a large number of
      objects;
  (c) XML versus BER may not be the main issue;
  (d) data retrieval is frequently problematic;
  (e) different SNMP agents perform quite differently.
- Another aspect discussed was how much building SNMP agents on a
  sub-agent architecture impacts their performance. Caching and
  implementing functionality at the kernel level were mentioned as
  alternatives to improve agent performance.
- Some discussion was held about how web services interface with the
  kernel vs. how SNMP agents would by default. Most web service
  requests would be organized as get-this-large-collection, so it
  would not even occur to those programmers not to remember the data
  retrieved from the kernel, unlike SNMP (where requests frequently
  re-pull the data).

4. Performance Evaluation for Management Frameworks
---------------------------------------------------

- Olivier presented his research interest in understanding management
  traffic patterns (e.g. what is the common behavior of managers,
  what traffic patterns does the manager generate, what are the usual
  polling rates). He claimed it is difficult to compare different
  investigations because they use parameters with non-homogeneous
  meanings. Can we agree on certain specific metrics? Something
  similar to IPPM in the context of network management.
- The understanding of management traffic patterns (e.g. algorithms
  used by management applications) could help, for example, in the
  definition of simulation models (as already available for other
  research topics in networking). Wouldn't it be cool to have an ns2
  tool which generates management traffic and which simulates agents?
- Raouf liked the idea of defining measurement parameters and
  scenarios in order to evaluate the quality of management solutions.
- A common model for network management simulations. Note the risks
  of using a common simulation model without explicit continued
  validation. For example polling: adaptive polling, event-driven
  polling, usual polling rates, etc.
- Monitoring of a few health-type metrics. Can we more formally
  define those patterns?
  There was some disagreement about whether it would be possible to
  standardize on those things and whether there was such a
  commonality.
- There was understanding that simulation models are frequently
  broken and do not appropriately reflect reality much of the time.
  Many people also inherit other people's models without validating
  whether they are correct for their experiments.

5. SNMP Measurements
--------------------

- Jürgen introduced another approach to understanding SNMP dynamics
  (e.g. which OIDs are requested, how smart management applications
  are, etc.). The idea is to collect SNMP traffic from real
  production networks.
- Jürgen showed a tool that is able to read traffic traces (pcap
  format) and export SNMP packets as an XML document. It can also
  suppress sensitive data and anonymize the content of each packet
  (e.g. IP source/destination addresses and SNMP communities).
- The idea is to ask network operators to run the tool and provide
  XML files containing the traffic traces.
- Lots of discussion on the legal difficulties in doing this. How can
  we get around it? Bert/Wes complained that even with the content of
  the packets anonymized it would be possible to infer the network
  topology.
- Different legislation may apply to a single traffic trace. Note
  that user data may be involved in SNMP payload. Generic data
  sources may be useless for collecting traces since the collectors
  might be attached in the wrong places. Need to attach probes close
  to management stations.
- In summary, having the data to look at would be nice, but it will
  be difficult to get and it would be representative of only a single
  environment.
- Felix briefly described a Homeland Security project, which aims at
  providing a framework to collect classes of data that are
  interesting and/or that people want collected. There are
  universities and companies supporting the project (e.g. University
  of Michigan, University of Washington, Internet2, CAIDA, Verio, and
  Equinix).

6. Wrap Up
----------

- Two documents are going to be produced:
  (a) metrics for network management (Olivier, Aiko, James, and
      Raouf);
  (b) measurements guidelines (Aiko, Jürgen, and Olivier).
- Discussion of when to hold future meetings. A few were suggested
  (IM/Nice, IETF/Paris, DSOM/Barcelona), but nothing definitive has
  been selected yet.

References
----------

[1] NMRG Status Report '2004
[2] Characterization of SNMP MIB Modules
[3] Web Service for Management - How is Performance?