Attendees
---------

J.P. Martin-Flatin       Friday
Bert Wijnen              Friday, Saturday
Aiko Pras                Friday, Saturday
Luca Deri                Friday, Saturday
Juergen Schoenwaelder    Friday, Saturday
Markus Brunner           Friday, Saturday
Szabolcs Boros           Friday, Saturday
Frank Strauss            Friday, Saturday
Dave Perkins             Friday, Saturday (via phone)

0 Agenda Discussion:
--------------------

- We will focus on "visions" and do alarms (shortly) when Dave is awake.
- Bert suggests producing an Internet Draft as input for the IAB meeting.

1 Status Reports:
-----------------

- SNMP over TCP is in the hands of the RFC editor with some minor
  comments; the EOS call passed and no major objections were raised.
- SNMP payload compression needs more work:
  o finalize and update the ID (Juergen)
  o check the EOS working group
  o Bert willing to review, others showed little interest
- SNMP subtree retrieval (consensus to drop it)
  o tell EOS that the NMRG drops work on subtree retrieval since it
    falls into the EOS charter (Juergen)
- Information modeling vs. data modeling (Aiko instead of Szabolcs)
  o Aiko Pras volunteers to write a paper summarizing the NMRG meeting
    in December 2000.
  o Where to publish? Aiko does not like to have too many articles
    written by him in Simple Times.
  o Jean-Philippe published two papers at DBTel 2001 and NOMS 2002, and
    has a paper for the IEEE Communications Magazine under review
    (accepted with minor modifications).
  o Aiko's first draft is due 1st June, a more complete draft paper
    before the summer holidays (whatever this means to Aiko)
  o IEEE Communications Surveys and Tutorials might be another candidate
    for publication
  o Probably also a good RFC
- SMIng:
  o currently two proposals in the SMIng working group, NMRG SMIng and
    SMI-DS
  o general idea (according to Bert) is to merge the proposals
  o stimulate discussion on the SMIng WG mailing list rather than
    discuss things in the NMRG
- SMI to XML and SMI to XML schema discussion:
  o Juergen presents a short example explaining what the difference is
  o Aiko says whether he is interested depends on the vision
  o Jean-Philippe thinks automatic translation is not enough
  o Who is interested in continuing the basic mapping? unclear - deferred
  o Who is interested in continuing the schema mapping? unclear - deferred
  o Luca is interested in having a proxy from SNMP to XML

3. Network Management Visions
-----------------------------

o Luca presents his vision:
  - security should be done outside of management protocols
  - protocol should be ASCII and human readable text
  - XML is fine since it is easy to translate
  - management protocols should be useful for configuration and
    monitoring
  - simple query language support (filtering)
  - Frank asks whether XPath can be used for filtering; Luca does not
    know
o Aiko presents his vision:
  - more and more unskilled network "operators" (home networks)
  - every house will have Internet devices and we have to think about
    making this work
  - management must be simple in the sense that it is something
    everybody understands and knows
  - web services are the technologies that will be used by everyone and
    network management should use them as well
  - how can we continue to use our value (MIBs) in such an environment?
o Bert presents his vision:
  - we will be in the same or even a deeper mess
  - too many people are not willing to agree on standards
  - too much focus on end-to-end quality of service
  - self-managing devices are desired (scripts in some form or other)
  - plug-and-play protocols and self-management protocols
  - configuration (example: cloning of configurations)
  - configuration synchronization between devices and central databases
  - combined data models for CLI and management protocols would be a
    good thing
  - allow vendors to easily define their own data models to address the
    need for management interfaces before standards time
  - higher levels of service management (CORBA, XML)
  - perhaps this should be handled by organizations outside of the IETF
  - doubts that policies will play a big role
o Frank presents his vision:
  - domain experts lack management knowledge and the process of writing
    management definitions is heavy
  - SNMP was OK 12 years ago but the world has changed quite a bit since
    then
  - evolutionary vs. revolutionary approach? no answer, tends to prefer
    evolution on widely deployed technologies
  - expectation: small steps forward, keep using SNMP for monitoring
  - processing data is more important than moving data
  - use of some web service technologies, focus on processing and
    storage, not necessarily communication of data
o Jean-Philippe presents his vision:
  - need to refocus on the openness of management
  - can we avoid vendor lock-in?
  - today most platforms are largely proprietary
  - open architecture on top of Web services for management: new
    opportunity to open management platforms again (time window: 2-3
    years)
  - information modeling is a major problem to be solved
  - IETF business model worked fine with simple data models, but does
    not work anymore with more complex data models (because of a lack of
    good modelers)
  - more complex data models (UIMs) may be done by other organizations,
    while simple data models by IETF/DMTF
  - need progress on configuration (automation; LDAP directories are not
    the way to go; what are good repositories?; what about
    distribution?; are XML repositories a way forward?)
o Juergen presents his vision:
  - get rid of special purpose technologies where appropriate
  - expect programs on the devices
  - management protocols will play a less important role
  - standards for XML schemas will not work, since it is too easy to map
    between schemas
  - loose coupling rather than strict coupling (let people define their
    own interfaces and make adaptations really cheap)
  - standards between management applications do not work
  - creation of timely standards is a serious issue
  - coordination is a major problem between organizations and WGs
o Markus presents his vision:
  - more management in the future
  - no belief in integrated networks; considers IP another service of
    the network
  - standards between management systems/components (QoS and billing for
    example)
  - IP only works because there is a well managed underlying network; if
    you take away the underlying network, IP management has to fill the
    hole
  - accounting and billing is an important management function
  - handle mobility
o Szabolcs presents his vision:
  - we can't predict the future
  - we still need management, but probably less management
  - move focus from element management to service management
o Later, Frank presents a second vision (and Aiko explains it ;-):

    [ASCII diagram lost in extraction; its surviving labels are "API",
    "SNMP" and "..."]

  - At a very high abstraction level we just move data around between
    devices and databases plus some translation in between.
  - Does this really help us?
o Dave Perkins presents his vision:
  - economic evolution has brought people back to using established
    technologies and there is less experimentation with new stuff
  - three market areas with very different management needs:
    + service provider market
    + large enterprise market
    + home and small enterprise market
  - web based management stuff plays a big role in home and small
    business markets: ease of use, plug and play, automatic firewall
    configurations
  - large enterprises have a hard time separating management of devices
    and management of networks, good at application management
  - service providers focused on networks, strong need for
    "exception-based" management, alarm stuff is a key issue in
    "exception-based" management
  - push to simplify things
  - important to have standards, we will continue to have separate
    standards, enterprise and ISP market will call for standardization
    to make it easier to connect devices

Discussion:

- Aiko starts to talk about cyclic models: generic management protocols
  (manager/agents) versus specific management protocols (DHCP, OSPF,
  ...), and the trend to create specific protocols once you understand
  the problem.
- Jean-Philippe steps in to organize things and proposes the following
  categories:
  o context changes
  o configuration
  o automated management
  o data modeling
  o vendor lock-in vs. open management
  o XML and Web Services
  o accounting & billing
  o standardization process problems

Configuration management:
-------------------------

+ management protocols should be useful for configuration and monitoring
+ configuration (example: cloning of configurations)
+ configuration synchronization between devices and central databases
+ need progress on configuration: automation, LDAP directories are too
  slow, what are good repository technologies for config data?, how
  should these repositories be distributed?, are XML repositories the
  way to go?

Is there a need for integrated monitoring and configuration?

- Integration at the data model level strongly desirable, not
  necessarily (but preferably) also at the protocol level.
- Monitoring is really monitoring of status and monitoring of
  statistics. Status monitoring needs to be exception based while
  statistics require some bulking and offline support.
- With multiple data models, instance identification will become a
  serious problem. A single data model avoids that problem.
- Need to be clear about where the authoritative copy of a piece of
  configuration is.
- There might be different views on the same data (e.g., show the
  members of a group or the groups someone is a member of)
- Make distinctions between status, statistics and configuration data.
  Lifetimes of values are predictable for configuration, status depends
  on the operation of the device, statistics depend on the traffic
  going through the device.
- Referential integrity across separate configuration operations.

How do we keep configuration databases in synchronization?

- Does regular comparison of configuration data suffice to check for
  synchronization?
- Do we need reliable configuration change events?
- Juergen says this is needed for scalability since regular
  configuration dumps won't scale and introduce high delays.
- Requires identifying which objects are configuration objects.
- With SNMP/SMI this is not possible (SMI does not identify
  configuration objects, SNMP requires referential integrity so you
  can't dump data and feed it into another device).

Should devices log configuration changes locally?

- Infrastructure should support this and leave it to operators to
  decide whether they use it.
- Are uncontrolled changes not just a security problem? Dave says it is
  a process problem that does not go away.
- If it is an organizational problem, can or should we solve that by
  adding technology?
- Juergen remarks that in system administration, people check for
  unwanted changes on files etc. even though they in principle should
  not happen. There is a real problem which needs to be addressed.
- Aiko is not convinced that fixing what he sees as an organizational
  problem with management technology makes sense.
- Lack of transactions is probably not the main problem with current
  configuration mechanisms.
- Actually some CLIs have some transaction models (allow only one user
  in write mode, commit changes when leaving a mode).

Can Web services provide the services to clone configurations?

- Luca says that this is possible with basically all protocols unless
  there are length restrictions for operations (such as in SNMP).
- What are the semantics? Merge changes or reset and configure from a
  well-known state? Both semantics should be supported.

Data Modeling:
--------------

+ combined data models for CLI and management protocols would be a good
  thing
+ more complex data models (UIMs) may be done by other organizations,
  while simple data models by IETF/DMTF
+ allow vendors to easily define their own data models to address the
  need for management interfaces before standards time
+ information modeling is a major problem to be solved

Should we use the same technology for managing a variety of things?

- Jean-Philippe believes that we should accept that there will be
  multiple data modeling languages, because de facto there are multiple
  standards organizations: IETF, DMTF, TMF, etc.
- Within the IETF, integration on a single data model would be a good
  thing.
- Aiko sees benefits in having a single technology to manage various
  aspects of various devices.
- There are devices which internally have a common data model and
  others that do not. Do we know which ones are better to
  manage/configure? Might have benefits for the vendors in the longer
  lifetime of a product (less testing etc.). User benefits are probably
  better and more consistent documentation.
- People sometimes prefer to implement CLIs by doing modeling with C
  data structures that are closely aligned with the internal data
  structures and the hardware. This does not require abstracting from
  the internals.

Context:
--------

+ every house will have Internet devices and we have to think about
  making this work
+ management must be simple in the sense that it is something everybody
  understands and knows

- All things that could be done reasonably with SNMP have already been
  done.
- Juniper and Extreme boxes are already to a large extent programmable.

Web Services:
-------------

- Aiko is convinced that Web services will be the big thing.
- Juergen says there will be another technology tomorrow.
- What are Web Services? SOAP, WSDL
- Aiko suggests mapping MIB modules - or even better the information
  model behind them - to WSDL.
- Juergen asks whether changing the protocol really gives us better
  management applications.
- Luca agrees that web services will be everywhere but is not sure that
  we will use them for everything.
- Will web services not again get us into the trouble of being unable to
  copy configurations due to sequencing problems?
- Jean-Philippe believes that Web services have a good chance to
  succeed.

What are the reasons for using Web Services?

- They will be available on all systems.
- They will be a standard technology, not specific to management.
- Programming applications will be easy.
- If it succeeds, then using it makes things easier.
- Is it because the W3C and Microsoft/IBM are behind it?
- Skills and training advantages for common technologies.

Standardization:
----------------

+ standards between management applications do not work
+ creation of timely standards is a serious issue
+ coordination is a major problem between organizations and WGs

What about Telcordia? Why are things so slow in the IETF?

- Distinguish between MIBs and management infrastructure (protocol and
  data modeling infrastructure).
- All things are not moving fast enough:
  - process too complicated these days
  - lack of motivation to actually work together
- At the moment, the protocol is not moving at all, modeling is moving
  but too slowly, some MIBs are doing fine while there are many others
  that are not doing fine.
- Lack of direction (COPS vs. SNMP, PCIM vs. SNMP, ...)
- Progress is being made more in groups where code is being written,
  while groups that basically do paper work are usually not moving that
  fast.
- Some WGs are forced to produce MIBs but there is no real interest in
  the WGs and thus they move slowly.
- Tendency to over-engineer MIBs.
- Where does the complexity come in? Is the IETF process a root cause
  of overly complex things?
- Too many overly specialized MIBs might be a problem.
- Can the NMRG make a list of examples where we have gone overboard?
- Aiko says that SNMPv3 security is a good example.
- A decision to use SNMP only for monitoring would greatly simplify
  things.

Can we collect measurement data about what is really being used on
networks?

- RMON technology has been outdated for 5 years, NeTraMet technology is
  outdated now. There is more interest in application level
  measurements rather than generic IP or transport level stuff.
- Is RMON being used in practice? Probably in enterprise networks? We
  do not know - there is a need for measurements.
- The time between versions is too long.
- Even simple additions to standard MIBs (e.g., adding system load to
  HOST-RESOURCES-MIB) are so time consuming that people do not even try.
- Even with other technologies, the problem remains the same.

What can be expected to be reasonably standardized?

- Some people are more interested in getting their names on RFCs or
  making their ideas an official standard than in trying to agree on
  things for the common good.
- Operators have to take the initiative to force standards on vendors.
  But operators are not pushing this. Without this, no standards will
  happen.
- Previously, there were many small companies that were willing to
  standardize. Big leaders do not desire standards as much.

Why management standards?

- Standards are useful if they create a market.
- Management standards usually do not create a market.
- Humans can adapt and there is no need to standardize management
  interfaces that are used by humans. Standards are only needed when
  programs process management information.

Who needs standards?

Automated Management:
---------------------

- custom control programs (scripts) running on network devices
- exception based management
- self-configuration

Exception-based Management:
---------------------------

- Dave presents some slides on what he calls exception-based management
  (appended below)
- Dave says that people started trying to use the existing
  notification-type stuff for alarms. It is problematic to distinguish
  ordinary state changes from alarms. Perceived severity to classify
  alarms. Alarm handling systems so far are quite static(?).
- Juergen prefers that problems with the proposals in the DISMAN WG be
  discussed in that WG. The NMRG should discuss just the general view.
- Dave points to a problem using notifications: There is no sequence
  number in the notification to refer to explicit events. Bert answers
  that adding a counter object as a sequence number would not be a big
  deal.
  Some discussion starts since the counter would be a counter for each
  notification receiver target.
- Juergen says that we should not try to engineer the solution. However,
  Dave's presentation is fine to get a common understanding of what
  exception based management is.

Future of the NMRG:
-------------------

- Dave's minimum input: IETF should standardize existing implemented
  technology. NMRG should try new ideas out and then go to working
  groups. Dave wants the NMRG to do much more applied research.
- Who is going to pay for that?
- Jean-Philippe says there are a number of U.S. universities who get a
  lot of hardware and software for free from vendors, while European
  universities rarely get such funding from vendors (they only get
  discounts). This makes it difficult to do more applied research.
- Dave suggests working from open source routers rather than real
  commercial routers.
- Get new people on board.
- Next meeting on Web Services for management organized by Aiko Pras,
  probably in September in Osnabrueck.
- More work together is needed.
- Start a project to do management measurements.

Action Items:
-------------

- SNMP over TCP revision (Juergen)
- SNMP compression (Juergen)
- Information vs. data modeling (Aiko)
  (first draft should be available before the IAB meeting)
  (second draft should be available before the summer break)
- NMRG meeting on Web Services for management (Aiko)
- SNMP usage measurements (Aiko, Juergen, Luca, Frank, ...)
- SMI to XML and XML schema mappings (?)
- Invitation of new people (Juergen)

Appendix:
=========

Exception-based Management

David T. Perkins
Riverstone Networks
April 20, 2002

The Need:
---------

- A significant part of management is tracking the status of network
  elements (NEs)
- When the status indicates there is a problem at an NE, corrective
  action must be taken
- In the overwhelming majority of cases, there are no problems

How to Determine Problem Existence:
-----------------------------------

- The methods:
  (a) manager polls agent in the NE and looks at values returned
      (direct status or computed status)
  (b) agent in NE self-determines that there is a problem and reports it
      to a manager

Polling Pros and Cons:
----------------------

- Pros:
  + manager controls what is used to determine the status
  + manager can use values from more than one NE to determine network
    status
- Cons:
  - most of the time all is well, so polling adds "unneeded traffic" to
    the network
  - on average, the lag time between a problem occurring and the manager
    detecting it is 1/2 the poll interval
  - the number of NEs that is managed by a manager is limited by the
    amount of generated traffic and the analysis of returned values

Benefits of Exception Reporting:
--------------------------------

- Significantly reduces the amount of network management traffic
- Reduces processing of data on a manager
- Allows a manager to support a much greater number of NEs (lowers
  management costs)
- A manager is more responsive to problems, since the time lag is
  "eliminated"

Required Mechanisms for Exception-based Management:
---------------------------------------------------

- Management protocol operations to report exceptions (problems)
- An agent maintained list of current problems with efficient retrieval
- An agent maintained log of raised and cleared problems with efficient
  retrieval
- A language to describe (define) problems

Why Current Problem List and Log:
---------------------------------

- Communication between manager and agent is over "unreliable" network
- Even with reliable network, manager can drop reports due to overload
  or restart
- Current problem list and log allow a manager to synchronize with an
  NE's problem status

Types of Problems:
------------------

- Stateful:
  - condition
- Incidents:
  - ?

Alarms:
-------

- An alarm is an announcement calling attention to a circumstance or
  event
- The start of a condition is an event
- The termination of a condition is an event
- The detection of an incident is an event

ITU-T Alarms:
-------------

- Those defined in X.733
  - communication: procedures and/or processes required to convey
    information from one point to another
  - quality of service: a degradation in the quality of a service
  - processing error: a software or processing fault
  - equipment alarm: an equipment fault
  - environmental alarm: a condition relating to an enclosure in which
    the equipment resides
- Those defined in X.736
  - security alarm: a security violation

Security Alarm Types:
---------------------

- integrity violation: an indication that information may have been
  illegally modified, inserted or deleted
- operational violation: an indication that the provision of the
  requested service was not possible due to the unavailability,
  malfunction or incorrect invocation of the service
- physical violation: an indication that a physical resource has been
  violated in a way that suggests a security attack
- security service or mechanism violation: an indication that a security
  attack has been detected by a security service or mechanism
- time domain violation: an indication that an event has occurred at an
  unexpected or prohibited time

Real World Use of Alarms:
-------------------------

- Real world usage is different (more complex)
- Use of acknowledgement
- Alarm with snooze button
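
Illustrative sketch (not part of the slides or the meeting):
-------------------------------------------------------------

The slides on required mechanisms argue that an agent-maintained current
problem list and a log of raised and cleared problems let a manager
resynchronize after lost reports. The Python sketch below is one possible
reading of that idea, under stated assumptions: the class names (Agent,
Manager, LogEntry), the problem identifiers and the per-agent sequence
number are all hypothetical, and the sequence number in particular is an
assumption inspired by the counter discussion in the minutes, not a
mechanism defined in the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEntry:
    seq: int       # hypothetical per-agent sequence number (assumption)
    problem: str   # hypothetical problem identifier
    raised: bool   # True = problem raised, False = problem cleared


@dataclass
class Agent:
    """Agent side: current problem list plus a log of raise/clear events."""
    current: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def _record(self, problem, raised):
        entry = LogEntry(len(self.log) + 1, problem, raised)
        self.log.append(entry)
        return entry

    def raise_problem(self, problem):
        self.current.add(problem)
        return self._record(problem, True)

    def clear_problem(self, problem):
        self.current.discard(problem)
        return self._record(problem, False)

    def log_since(self, seq):
        """Return log entries whose sequence number is greater than seq."""
        return self.log[seq:]


@dataclass
class Manager:
    """Manager side: mirrors the agent's problem status from reports."""
    known: set = field(default_factory=set)
    last_seq: int = 0

    def _apply(self, entry):
        if entry.raised:
            self.known.add(entry.problem)
        else:
            self.known.discard(entry.problem)
        self.last_seq = entry.seq

    def receive(self, entry, agent):
        # A gap in sequence numbers means at least one report was lost,
        # so replay the missing part of the agent's log instead.
        if entry.seq != self.last_seq + 1:
            self.resync(agent)
        else:
            self._apply(entry)

    def resync(self, agent):
        for entry in agent.log_since(self.last_seq):
            self._apply(entry)


agent, manager = Agent(), Manager()
manager.receive(agent.raise_problem("linkDown:if3"), agent)
agent.raise_problem("fanFailure")        # report lost in transit
agent.clear_problem("linkDown:if3")      # report lost in transit
manager.receive(agent.raise_problem("tempHigh"), agent)  # gap detected
print(sorted(manager.known))             # ['fanFailure', 'tempHigh']
```

After the gap is detected, the manager replays log entries 2 to 4 and ends
up with the same problem set as the agent, without any periodic polling.
A real design would also have to bound the log size and handle agent
restarts, which this sketch ignores.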