Minutes of the Meetings of the Network Management Research Group at IETF46

Reported by David Harrington.

Sunday 11-7-99

Attendees:
  David Harrington, Cabletron Systems
  Dave Levi, Nortel Networks
  Ron Sprenkels, University of Twente
  David Partain, Ericsson
  Andy Bierman, Cisco
  Jeff Case, SNMP Research
  Steve Moulton, SNMP Research
  Dave Thaler, Microsoft
  Dave Perkins, Tollbridge and SnmpInfo
  Jon Saperia, JDS Consulting
  Shawn Routhier, ISI
  Juergen Schoenwalder, University of Braunschweig
  Eric Schoenfelder, Gaertner Datensysteme
  Bert Wijnen, IBM
  Keith McCloghrie, Cisco

Unsigned 64 for smiv2

Keith and Andy explained the hcdata proposal. We discussed whether the problem needed to be solved for v1 as well. If we move v2c to standard, it will be easier to migrate from v1 to smiv2.

Opaque encoding allows us to implement without new encoding. However, opaque encoding gives more opportunities to get the encoding wrong. It would take way too long to get opaque into deployment. We could add new encodings just as easily and they're better. No matter what the outcome, it will probably look like two 32-bit values. Whether it's encoded as opaque, an overloaded c64, or a new tag, for most developers it doesn't matter what the decision is. Some disagree with overlaying a signed on top of an unsigned; overlaying an unsigned over an unsigned is the best we can do.

It looks like a general solution when it is a hack. We shouldn't sell it as a general solution. We should explain this as a specific hack to resolve the problem. We should warn everybody not to use this in other cases. We shouldn't say hack outside this room.

It's important to understand the history behind this. We wrote a rule to avoid abuse of c64, but couldn't write a rule to avoid abuse of uint64 and signed64, so we chose not to have uint64 and signed64. We need to advance RMON, and this is the best we can do. We need a 64-bit snapshot and a 64-bit delta. We need to worry about negative deltas.
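The difference between a delta on a wrapping counter and a delta between two unsigned 64-bit snapshots can be sketched in Python as follows. This is only an illustration of the arithmetic behind the "negative deltas" remark; the function names are hypothetical and nothing here is taken from the hcdata proposal itself.

```python
U64_MODULUS = 1 << 64  # unsigned 64-bit values live in the range 0 .. 2**64 - 1


def counter_delta(previous: int, current: int) -> int:
    """Delta between two samples of a counter that only increases and wraps.

    Modular subtraction gives the right answer across a single wrap,
    so the result is never negative.
    """
    return (current - previous) % U64_MODULUS


def snapshot_delta(previous: int, current: int) -> int:
    """Signed change between two unsigned 64-bit snapshot values.

    A snapshot (gauge-like) value may go down, so the result can be negative.
    The difference of two unsigned 64-bit values ranges from -(2**64 - 1) to
    2**64 - 1, which does not fit in a signed 64-bit integer in general.
    """
    return current - previous


if __name__ == "__main__":
    print(counter_delta(2**64 - 5, 10))   # counter wrapped between samples
    print(snapshot_delta(100, 40))        # snapshot value decreased
```

Python integers are unbounded, so the sketch sidesteps the question of how a negative delta would be carried in a fixed-width encoding; that question is exactly what the discussion above leaves open.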
We have tolerated RMON, so maybe we should localize the issue to RMON, but people import from RMON. Is the concern that this TC would be used beyond RMON, or that it would set a bad precedent? Both are concerns at different levels. If the TC was used outside RMON, would that be a big problem? Some would not be comfortable with wording that said the IETF couldn't make up its mind, that this is the current decision, and that the decision may be rolled back later. We must decide now whether it's legal or not, and decide the issue once and for all.

High capacity is more important than RMON only. We need this in high-speed networks. If we limit this to RMON only, another WG will come to the same problem 6 months down the road and want to use the TC.

RMON also defines a zero-based counter 64, with a gentleman's agreement. That was also called illegal. We need to determine what is legal and what is not. It deals with a TC expanding semantics. However, a zero-based counter wouldn't affect the semantics that counters must always increase, while uint64 does. People will define everything as zero-based 64-bit counters because they wouldn't need to do two reads and calculate a delta, if the counter is initially zero. 32-bit zero-based counters are based on gauge; zero-based64 is based on counter64.

  How many think this is illegal? All.
  Is this the best we can do? Yes.
  Is it allowed outside RMON? Yes.
  Is the general technique going to be allowed? No.

Should it be allowed in more than RMON? I think this must be resolved for all high-speed networks. Where else is it needed? DISMAN. Can we agree it cannot be done other than Unsigned64 and Gauge64? We shouldn't make the decision for the future. We shouldn't constrain future SNMP decisions on this. This should be only for the current problem and only u64 and g64. Elevating it to a general-purpose proposal, as hcdata did, is wrong. There is middle ground: define a Counter64snapshot and a Counter64delta, and let Andy and Keith decide the names.
Action item: Andy will rewrite the TC doc with an explanation and appropriately named TCs. We should also include text that it may break some existing implementations, and that it may change in the future. I want to be able to point out to NMS vendors that counting on encoding for type determination is a bad practice. Does this text go in hcRMON or in a general document? By putting the module name in front of the TC, it may localize the problem better.

Consensus: We are all agreeing to do something illegal. We should put it in a separate document, document that it is illegal, and remove the contentious wording. We don't want it to say it's illegal, just that it is "not strictly legal". We can come up with some flowery text that others must not use this. Andy will write an Informational document with 2 TCs. We need a new module identity.

ZeroBased64:

It has been argued that zerobased64 is illegal because making it zero-based is not consistent with the underlying "not initialized to 0". Both approaches are taking away semantics. Not really - a zero-based counter is merely a constraint on the base. You can never put a range on a counter; by putting a range on it, the deltas no longer work. ZeroBased64 has the same problem.

[here we started spinning wheels for a while]

A counter32 today can be constrained to start at zero. Such counters are contained in MIBs currently under review. The review is based on the assumption that zero-based counters are acceptable.

[here we did some RFC lookup]

RFC2578 says "Counters have no defined initial value, and thus, a single value of a Counter has (in general) no information content. ... A DEFVAL clause is not allowed for objects with a SYNTAX clause of Counter32." There is an identically worded constraint for Counter64.

RFC2021: "ZeroBased32 ... will be set to zero(0) on creation ..."

This needs working group consensus; we need to be sure.

Action item: Andy will add a third TC in the document with the other two.
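The practical difference between the two quoted definitions can be sketched in Python. This is an illustrative sketch only: the class name and interface are hypothetical, and the only assumption is the one debated above, namely that a zero-based counter is known to have started at zero when it was created.

```python
from typing import Optional

U64_MODULUS = 1 << 64  # 64-bit counters wrap modulo 2**64


class CounterReader:
    """Turns raw counter samples into counts (hypothetical helper)."""

    def __init__(self, zero_based: bool) -> None:
        self.zero_based = zero_based
        self.last: Optional[int] = None

    def sample(self, value: int) -> Optional[int]:
        """Return the count for the interval ending at this sample.

        For an ordinary counter the first sample carries no information,
        so None is returned until a second read allows a delta.
        For a zero-based counter the first sample is already the count
        since the object was created.
        """
        if self.last is None:
            result = value if self.zero_based else None
        else:
            result = (value - self.last) % U64_MODULUS
        self.last = value
        return result


if __name__ == "__main__":
    ordinary = CounterReader(zero_based=False)
    print(ordinary.sample(500), ordinary.sample(650))

    zero_based = CounterReader(zero_based=True)
    print(zero_based.sample(500))
```

After the first sample both kinds of counter are handled identically, which is the sense in which a zero-based counter is "merely a constraint on the base".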
The short-term proposal will be presented in the Ops-Area meeting. Andy and Keith will submit the document. There will be a 4-week last call, so we can have a short-term solution by year-end.

DPerkins requested that the Ops-Area chair charter a new committee to resolve the long-term problem of new data types. Jeff Case suggests we should discuss the big picture before dispatching little groups. The meeting chair asks the group whether they wish to discuss vision or small items first. The group chooses vision.

Vision

We went around the table and asked for each person's three most important issues.

Top three things:

DPerkins thinks we need a long term solution for data types, we need a better solution for bulk transfer, and we need to support operations in PDUs.

Steve Moulton: no submissions.

Eric: nothing to say.

Keith says we need support for new data-types. He elects to keep his options open for other issues.

Andy's concerns are not related to SNMP. We need a new mgmt architecture. We need a more scalable, better delegation model. We expect the high level to have an understanding of low level stuff. We need to layer the architecture better, to hide the details. Plus we need OID compression, as a way to get data faster. Also, the COPS-PR vs. SNMP debate needs to be settled.

Ron: nothing.

Juergen Schoenwalder thinks the SMI is a prime concern. Problems need to be resolved by making the SMI more powerful. We need new data types, operations, etc. It does not necessarily mean things like aggregate types. We need bulk transfer.

Bert expressed his concerns. SNMP is not object-oriented. Policy-based mgmt needs some real attention; how does it fit into the mgmt picture? DMTF is invading the IETF with their CIM model; do we want this? What do we see for 5 years from now? Is SNMP still viable? Should we embark on something new? Are they (DMTF) leading us? Should the mgmt architecture be COPS plus SNMP?

Dave Levi expresses the view that the COPS vs.
SNMP debate will drive some of the details. If SNMP wins, then bulk transfer is the most important. Shawn agrees with most of these, but doesn't know the priority order. Jon Saperia expressed the need for a complete information data model. We must work on common goals that matter operationally. Some local work appears important, but operationally the benefit isn't as obvious. We need to work on solutions to customer problems rather than some detailed MIB. Maybe we need a minimum mgmt document. Jeff Case thinks stability, perceived stability, completeness, and deployment is the most important. We need new spins on the protocol, the MIBs, and the SMI. The MIB must contain more standard configuration and control, not just monitoring. We need to standardize more objects to encourage people to use standard approaches. The SMI needs new data types, etc. David Partain reiterates Dave Levi's comments. The COPS-PR vs. SNMP debate is the most important concern. What do we learn from that debate? Dave Harrington observed the need for snmpv1 table read efficiency for all those operators that will continue using snmpv1. We need better business case justifications - why should vendors implement? We need more orientation to customer demands. Compatibility with existing stuff is also critical. People won't do SETs. People used security as an excuse. With snmpv3, they still don't do SETs. SETs really are a bitch to do. People need to write scripts or something. App needs to be powerful enough that non-SNMP person can easily configure it to do what they want. There have been financial reasons why they don't do them, rather than SNMP issues. SNMP is low-level, which is important for monitoring, and for particular types of SETs. To do SETs at a higher level, you must be aware of the different requirements of monitoring and for doing SETs. COPS-PR and SNMP approach the tradeoff differences differently. Some businesses need multiple mgmt systems. 
We agree that expressing things at a higher level of abstraction is very important. That can be done with SMI as is, and we need to write a set of recommendations. However, that doesn't allow you to get rid of RowStatus. ----- If SNMPv3 is not being deployed, maybe there's something wrong with it. What we need to do is publish coexistence docs, because lack of [a coexistence strategy] is preventing deployment of v3. Another problem with getting SNMPv3 deployed is the fact that Cisco has a tree in their source, and customers must choose between stable routing and snmpv3. Not all issues are feature sets in specs. [Then a discussion broke out] they want traps, with data, and the ability to turn off traps they don't want. We need to move from tech-driven to customer driven. We need to do better than to send lots of little-bitty pieces of data to a human. We need to convince people to use DISMAN, and we need to standardize a script language to make it possible for operators to do what they want. We need to address problems on a timely basis. We need to pick up the pace. There are real problems and we need to solve them for people on a timely basis. We need more smarts everywhere. Plug and play has to be real; we need to get configuration down to [???]. There is a big disconnect between what customers think is important and what's discussed in these meetings. SNMP purity isn't important to the customers. They know to have remote intelligence, security is needed. That will cause some deployment of SNMPv3. We must not send mixed signals. The chair raised the issue of the agenda for tomorrow's session. Should we discuss the COPS vs. SNMP debate before we go into the BOF, so we don't give an impression of a catfight in that meeting? 
NMRG meeting Monday 11-8-99

Attendees:
  Shawn Routhier, Integrated Systems
  Jon Caron, Cabletron
  Keith McCloghrie, Cisco
  Ron Sprenkels, University of Twente
  Jeff Case, SNMP Research
  Steve Moulton, SNMP Research
  Jon Saperia, JDS Consulting
  Dave Levi, Nortel Networks
  Glen Waters, Nortel Networks
  Dave Perkins, SnmpInfo and Tollbridge
  John Seligson, Nortel Networks
  Juergen Schoenwalder, University of Braunschweig
  Dave Partain, Ericsson
  Dave Harrington, Cabletron
  Bert Wijnen, IBM
  Andy Bierman, Cisco

---

What is the goal for the meeting? We discuss setting out the goals for both COPS-PR and SNMP and try to find commonality. It may be that we want to have two protocols as they can then be tuned better for their specific pieces. Perhaps the 13 requirements in the mumble docs are too fine grained; maybe we need to look at a higher level and try to have one framework. One framework may still have multiple protocols. One seamless framework might be better.

Dave Harrington was going to draw the architecture from rfc2571 and ask Keith if that's what he means, but Keith has a different vision and is now drawing the slide. Dave asks Jeff if he has thought about what a COPS like thing would look like in the SNMP world, and whether he could also put up a drawing, so we could compare the visions.

A picture drawn by Jeff Case: This would not be duplication and would be a good thing to do both.

 +----------+    +----------+       +----------+
 | CLI GUI  |    | SNMP GUI |       |Policy GUI|
 +----------+    +----------+       +----------+
      |               |                  |
      |               |                  |
      |         +------------+     +------------+   +-----+
      |         | Mgt Station|     | Pol Server |   | Rep |
      |         +------------+     +------------+   +-----+
      |               |                  |
      |               |                  |
      |               |                  |
      |               |              +-------+
      |               |              |  PDP  |
      |               |              +-------+
      |               |                  |
      |               |        +---------+
      |               |        |
+-----|---------------|--------|-----+
|     |               |        |     |
|  +-----+   +------------+ +-----+  |
|  | CLI |   | SNMP Agent | | PEP |  |
|  +-----+   +------------+ +-----+  |
|     |               |        |     |
|                ...
|    /--------------------------/    |
|   /           Data           /     |
|  /--------------------------/      |
|                                    |
+------------------------------------+

A picture drawn by Keith McCloghrie. Note that this picture includes the image that one policy may affect multiple end unit entities.

+----------+   +----------+        +----------+   +----------+
|  Other   |   |   GUI    |--------| Directory|   |   SNMP   |
+----------+   +----------+        +----------+   +----------+
     |              |                   |              |
     ----------+    |      +------------+       +------+
               |    |      |                    |
   +------------------------------------------------------+
   |                                                      |
   |                                                      |
   |                                                      |
   |      +--------------------+   +--------------------+ |
   |      |        PDP         |   |    SNMP Manager    | |
   |      +--------------------+   +--------------------+ |
   |                                                      |
   |                                                      |
   |                                                      |
   +------------------------------------------------------+
              |                |                |
                               +-------+        |
            (CLI)          (SNMP)      |        |
                                    (COPS)      |
                              +-----------------+
              |               |        |
        +------------------------------------+
        |                                    |
        |  +-----+   +------------+ +-----+  |
        |  | CLI |   | SNMP Agent | | PEP |  |
        |  +-----+   +------------+ +-----+  |
        |     |               |        |     |
        |                ...
        |    /--------------------------/    |
        |   /           Data           /     |
        |  /--------------------------/      |
        |                                    |
        +------------------------------------+

There was a debate as to the purpose of the pictures. A distinction was made between what problem we are trying to solve vs. the drawings which are more or less what we are currently doing. Dave H. was attempting to establish some common ground, to understand which pieces of the two pictures could be merged, and which pieces could not be merged. Part of the desire for pictures was an attempt to get a better understanding of what is happening.

Jon attempts to draw a picture without any specific technologies.

            management system
       ----------------------------
        cloud 1             cloud2
       s1, s2, s3           s1, b1

  management apps:  configuration, fault,   (element specific)
                    netwide                 (policy based elements)

---

The authors of the mumble doc agreed on the requirement of network wide configuration (policy stuff etc.). It also points out that a network will need element configuration to get fault, and other, information.
This leads to a suggestion for COPS for policy level and SNMP for element level. So what happens for element level information that must be aggregated, such as the number of errors, number of packets across the backbone, etc.? Element information will be used to determine if and how the policies were carried out. So the proposal being put forth by the COPS-PR proponents is to use SNMP for aggregation of element information (statistics etc.) and use COPS for policy distribution - is that correct? Yes. An example of gathering aggregate statistics using SNMP: collecting statistics on backbone interfaces - show all interfaces on the backbone that have more than 50% utilization...

There is a concern that other people in the COPS area may want to use COPS instead of SNMP for element information. Some people might push for either pure SNMP or pure COPS to do all management, both policy and element. The IETF can use the deployment club to try to convince people that, as SNMP already exists, it is what should be used for element management. Having a single framework is attractive; the SNMP framework is already deployed. But there's COPS-PR stuff already deployed. What metric is used to make decisions - deployment? Political clout? If deployment is the metric, then SNMP should win for statistics. There is no real deployment for policy management so we have some leeway.

SETs are hard and perhaps we can make things easier by changing the requirements - such as eliminating the possibility of multiple managers. This could also be done using SNMP. The use of TCP allows us to achieve some of the new requirements, but having SNMP run over both TCP and UDP would lead to two protocols. Doing the same things as in COPS would be harder and more complicated in SNMP. By narrowing focus, COPS-PR optimizes the solution for one part of the problem. However, doing the optimization for only one part of the problem may make the larger problem harder, i.e.
one could win the battle (COPS and policy), but lose the war (network management). The concern is understood, but we may need to solve a small section and let other groups solve some of the other problems. We can put more smarts into the SNMP agent including some of the features, such as single-request row-creation. The MIB question is a detail we should not be discussing; we should be discussing at a higher level. This is a major question, the fact that an agent can be smarter may allow us to do much different things / more efficiently in SNMP. Fixing one thing by adding another GUI (COPS) may make things worse. Currently operators must use CLI, and SNMP etc., but if we add another then we will have multiple GUIs. Do we want to drive towards multiple protocols? The borderline between COPS configuration and element configuration is not a hard line; different people will think of the borderline as being at different places, so by adding COPS we will allow even more ways of configuring the same thing. This is undesirable. The cost of deployment may cause SNMP based policy configuration to not be deployed, i.e. MIBs may not be deployed. Customers want cross machine information, response time, availability etc. We currently don't have much (if any) MIBs to do this; only a small number of MIBs supply some cross machine (network wide) information and configuration. Not much has been done with configuration due to security. Why would people deploy COPS if they won't deploy SNMP sets? The deployment time for COPS-PR is less. Possibly non-technical reasons such as org charts and who can/will control the work might affect deployment decisions. Due to a lack of consistent configuration interfaces, it isn't worthwhile to write the applications to do configuration. There is a lack of standard configuration MIBs. There are some problems in the SNMP protocol if we would like to use it for configuration management. 
We should either do all configuration management via SNMP or expect that essentially all configuration management would happen via COPS. How far should the non-use of SNMP SETs be carried? SETs would still be allowed but not for configuration management, so for example if we choose COPS we should get rid of the remote configuration MIBs for SNMP. Because of the rfc2571 framework, we should be able to add a new application to SNMP that would support the policy stuff, possibly with new verbs and maybe new types etc. The real win in the COPS stuff is the delegation model not in the other optimizations, such that the highest level manager would not need to understand everything. This leads to a suggestion that the agent needs to be more intelligent. Possibly, we are mixing two concepts: The first is the high level view - an instance could be device specific vs. role specific. The second is the delegation - the distribution of policy (or other information) from the top level to all affected items, or to a central item that then can distribute the information to other affected items. This gets us on the slippery slope of too fine a granularity. This is a MIB design question not a new technology. Some people are pushing for a PIB that looks like the MIB. In Diffserv, they don't want to have multiple versions. As an example of the problem, assume a router with four blocks. We are allowed to build things out of these four blocks that are fine for a router on the edge, but are not necessarily useful in all routers, such as might be used within Diffserv. Some working groups won't be specifying high level MIBs due to a lack of motivation. So you're saying that COPS & PIBs are being proposed to overcome a procedural problem? Sort of. We could fix this by sending people to the right places. However, we can't do the work for them. Would it be acceptable if we convinced people to write multiple MIBs, one for element type information, and one at a higher (network) level? 
That would not be adequate; there are other items as well. The lack of progress in SNMP has been a problem. The simplicity of COPS is good, as is the limited bits on the wire. We could allow the use of PIBs as MIBs when a PDP is not involved; the PIBs can be manipulated via SNMP, though the byte count is higher. We would need to get the work done. As was mentioned, stability and perceived stability are important. We need to figure out how we would do some of these things. A new application might avoid the stability problems. However, that wouldn't address the byte count problems etc. We might need a new, more efficient underlying transport mechanism. We would get less of the optimizations of COPS and we might be incompatible.

There is trouble getting SNMPv3 deployed. Why would COPS be easier to deploy? There is no deployed base and the customers don't have any happiness to worry about. It is amusing that the SNMP folks think that COPS will get out there faster and that the COPS folks think that they will get things out faster if they are attached to SNMP. One comment was more about customers, the other more about vendors.

There followed a discussion of how the policy stuff would be presented (as an SNMP application or as something else). The use of SNMP has the perception that it would take longer and still be non-optimal due to the baggage associated with older versions. Some would prefer to use SNMP but we would want it soon. Would having a set of SNMP specs available in 6 months be acceptable?

Here is an attempt at a compromise. COPS already exists for QoS, so one could argue that including Diffserv is not adding a new protocol; meanwhile SNMP could go off and try to implement something else that might help. Sometime later, we could compare to see where we are, whether the SNMP solution is better or worse than the COPS solution, and whether one or both should be continued.
The COPS-PR promoters support forming a working group for using SNMP for policy management as long as the IESG would not block development of COPS for provisioning, with a review in the future. Notice that this may be a problem in the future if the decision is to kill COPS, as people are unlikely to want to let the IESG decide; the COPS folks may want to let the market decide. Another comment was that things take too long. We agree that COPS optimizes some pieces (PDP to PEP), some of which (perhaps all) we might get by working on SNMP. Such an optimization might cause problems elsewhere.

Jon talks about the RAP schedule. The Diffserv PIB has no current home due to the question of COPS vs. SNMP. This was done by the WG chairs, and it is not in the current charter anywhere. The Diffserv folks aren't all that interested in writing MIBs or PIBs to do the provisioning. Any required MIBs or PIBs should be done in the technology working group.

Where is the sopi language defined? It is currently not defined anywhere. A new document should be coming soon.

Are there concerns about parallel development? Yes, deployment wins and so whatever gets deployed first probably is not going to be replaced. A suggestion is to get a group together to work on an SNMP version and see if they can get it done in 4 (or whatever) months, and compare them then. Would these be on the standards track? Perhaps both should start on the experimental or informational track. There seems to be general agreement on that. We might get a better product if all effort were focused on one product. Not necessarily, competition is a good thing. This would not be duplication and would be a good thing to do both. Things going on the proposed track would send the wrong impression; experimental would be a good thing.

There is a BOF on Thursday to help the IESG make up its mind. Do we recommend trimming the amount of time spent on requirements in the BOF?
Spending more time in the BOF on the type of discussion we have had here would be a good thing. Bert discussed some of what should happen in the BOF.

We have reached consensus on some things:

1) MIB design impacts complexity of implementation,
2) both high level and low level views are needed,
3) agents can do more, and
4) we have a lot to learn for managing network-wide behavior.

An example of implementation complexity and MIB design is given.

* set ifoperstat.0 to down
* set ifoperstat of "foo" to down
* set ifoperstat of "foo*" to down

What is our recommendation? Many agree that if we don't do anything to improve SNMP for this use, COPS should win. There is a proposal that both should go to experimental while SNMP should attempt to fix the problems that have been identified. Going to experimental implies that one of them may be dropped in the future, probably when things would go to proposed. A decision was promised so we should have either a decision or perhaps a time period.

Who (or what group) should be the "author" of the recommendation? Bert suggests the NMRG; there don't seem to be any problems with that. Do we have a recommendation? It is not clear we have one.

Recommendations:

1) Find people to start work on the documents that will need to be discussed. A list of potential volunteers is captured.
2) Create a design team (either official or independent) (not in snmpv3 WG).
3) Create a working group at some point. Don't open the wish list to everybody until we make some progress. We would accept COPS people if they want to bring the proposals together.
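The three ifoperstat requests in the implementation-complexity example differ in how much the agent has to resolve on the manager's behalf: an instance index, an exact name, or a name pattern. A minimal Python sketch of that resolution step follows. The interface table, the function names, and the use of shell-style patterns are all assumptions made for illustration; the minutes do not say how the "foo*" form would be defined.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Union

# Hypothetical interface table kept by an agent: instance index -> name.
INTERFACES: Dict[int, str] = {0: "lo", 1: "foo", 2: "foo1", 3: "bar"}


def resolve_targets(selector: Union[int, str]) -> List[int]:
    """Map a selector onto the interface indexes it refers to.

    An int selects one instance by index (the ifoperstat.0 case).
    A string is matched against interface names as a shell-style
    pattern, which covers both the exact name "foo" and "foo*".
    """
    if isinstance(selector, int):
        return [selector] if selector in INTERFACES else []
    return sorted(i for i, name in INTERFACES.items()
                  if fnmatchcase(name, selector))


def set_oper_status(status: Dict[int, str],
                    selector: Union[int, str],
                    value: str) -> List[int]:
    """Apply one value to every selected interface; return what was touched."""
    targets = resolve_targets(selector)
    for index in targets:
        status[index] = value
    return targets


if __name__ == "__main__":
    status = {index: "up" for index in INTERFACES}
    for selector in (0, "foo", "foo*"):
        print(selector, set_oper_status(status, selector, "down"))
```

With index-only addressing the name lookup and the loop live in every management application; with the pattern form they move into the agent, which is one reading of consensus points 1) and 3) above.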