15th NMRG-Meeting (January 8th, 2004, IU Bremen)
================================================

Participants:
-------------

# Marcus Brunner (NEC Europe, Germany)
# Luca Deri (ntop.org, Italy)
# Olivier Festor (LORIA-INRIA, France)
# Torsten Klie (TU Braunschweig, Germany)
# Aad van Moorsel (?)
# George Pavlou (University of Surrey, England)
# Aiko Pras (University of Twente, The Netherlands)
# Juergen Quittek (NEC Europe, Germany)
# Juergen Schoenwaelder (International University Bremen, Germany)
# Radu State (LORIA-INRIA, France)
# Frank Strauss (TU Braunschweig, Germany)

Minutes: Torsten

1. Path-coupled signaling for traffic measurement (Marcus)
----------------------------------------------------------

Passive and active measurement technologies are available for
measuring hop-by-hop properties of traffic along its path through the
Internet. Passive technologies can measure these properties
accurately, but configuring them for the measurement of a particular
traffic flow at all hops requires significant overhead for measurement
configuration. This problem does not apply to active measurements,
such as traceroute, because probing packets automatically follow the
same path as the traffic flow to be measured. However, active
techniques measure properties/conditions of the injected traffic,
which may differ from those of the traffic of interest.

Marcus showed an approach [1] that tries to combine the two ways of
measurement. It uses signaling for configuring passive hop-by-hop
measurements along the path of a traffic flow of interest.

Implementations use a pre-standard IETF NSIS protocol. 

Discussion:

AP: What do you consider a high speed link?
MB: A link with more than 1 Gbit/s.

AP: Why is using signaling for measurement configuration better than
    using traceroute for example?
MB: Traceroute gives you the location of the current flow which may
    change. With signaling, you measure only what you really are
    interested in.
AP: Are these route changes a real issue or are they just
    theoretically a problem?
MB: They are a real problem but not much looked at.
JQ: It is simpler because you do not have to implement the transport
    or the basic security level. Another reason is that it must be
    simpler because it is meant to be an end user technology and not for
    someone who is monitoring the network anyway. So it is important
    not to require any knowledge of the topology.

LD: What are your probes?
MB: There are 3 implementations, one on Linux (kernel space), one on
    Linux (user space) and another one on an IXP node.

LD: Is it possible to apply this to a real network? It looks more like
    a configuration for routers. So why yet another protocol instead
    of a configuration in a MIB?
JQ: We are only interested in having it on routers. Dedicated probes are
    not our focus. The basic idea of the NSIS protocol is to have
    signaling support on routers.
JQ: There are problems with load balancing (data and signaling may
    have different paths). The same with MPLS. There are scalability
    problems as well (like in RSVP).

MB: In order to save memory the number of managed flows can be
    restricted.

AP: How do you do time-stamping?
MB: There is GPS in the routers.
AP: What about routers "without daylight"?
MB: The problem has not been solved. Maybe using NTP can help. This is
    definitely an issue.
AP: What about the accuracy without GPS?
JS: It depends on the accuracy of the kernel mechanisms (more
    information on the NTP web-site [2]).

JS: You have an implementation of the NSIS signaling protocol?
MB: It is just a pre-standard prototype implementation.
JS: Does the protocol specify how you detect route changes? Or is it
    an implementation issue?
JQ: It works on a refresh basis. You have to refresh your routing
    configuration regularly and when the routing changes the refresh
    is also rerouted.
MB: There is no agreement so far. It depends on the situation (for
    example mobile vs. more stable environment).
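
The refresh-based behaviour JQ describes can be illustrated with a small
soft-state table. This is only a sketch under assumptions: the class
name, the 30-second lifetime and the flow identifier are hypothetical
and are not taken from the NSIS drafts or from the prototype.

```python
class SoftStateTable:
    """Per-hop measurement state that expires unless it is refreshed."""

    def __init__(self, lifetime=30.0):  # lifetime in seconds (assumed value)
        self.lifetime = lifetime
        self.expires = {}  # flow id -> expiry time

    def refresh(self, flow_id, now):
        """Install or renew state when a signaling refresh passes this hop."""
        self.expires[flow_id] = now + self.lifetime

    def purge(self, now):
        """Drop state for flows whose refreshes no longer arrive here."""
        stale = [f for f, t in self.expires.items() if t <= now]
        for f in stale:
            del self.expires[f]
        return stale

    def active(self, flow_id, now):
        return self.expires.get(flow_id, 0.0) > now


# Hop A stays on the path; hop B is bypassed after a route change at t=20.
hop_a, hop_b = SoftStateTable(), SoftStateTable()
for t in (0.0, 10.0):
    hop_a.refresh("flow-1", t)
    hop_b.refresh("flow-1", t)
for t in (20.0, 30.0, 40.0):
    hop_a.refresh("flow-1", t)  # refreshes are now routed away from hop B
removed_at_b = hop_b.purge(now=45.0)
```

In this sketch, the hop that no longer sees refreshes removes the
measurement state on its own, so no explicit teardown message is needed
after a route change.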

JS: Are there other fundamental differences between NSIS and RSVP?
MB: The most fundamental change was the split into the generic part
    and application-oriented part. RSVP is targeted to QoS which has
    been given up. Multicast is not included. 
JS: How about security?
MB: Security is another hot topic in NSIS and SIP. It is tied to the
    business model and the trust relationships.

OF: If it is not possible to do measurements on all hops in the path,
    is it possible to go back (like in RSVP) and make a proposal to do
    measurements every two hops, for example?
MB: You could implement that. It is part of the signaling/measurement
    application. The base NSIS protocol allows you to do that.

OF: Can collection be done with signaling as well?
MB: Yes.

MB: In the future we want to look at what kinds of measurements can be
    done this way and what are the benefits.
AP: The most attractive idea is that you can give these kinds of
    services to your end users without the need of giving them access
    to the internals of your network.
JS: I think the end user model has scalability problems. I think it
    could be used as a service for neighboring ISPs because you do not
    have to give them insight into your network topology.
MB: It can be the end user. If it is the service, the user pays for
    it, so the price will solve the scalability problem.

AP: A disadvantage of the approach is that the measurement has to be
    done within the router. Is there not a problem with the computing
    power?  Signaling may increase the load. This may be an obstacle
    for deployment.
JQ: The routers have an "overload brake" (so that the router stops
    signaling and measurement when a certain load threshold has been
    reached). There may be a problem with availability, but the
    service is not meant to be available for all time.

2. Improving Passive Packet Capture: Beyond Device Polling (Luca)
-----------------------------------------------------------------

Passive packet capture is necessary for many activities including
network debugging and monitoring. With the advent of fast gigabit
networks, packet capture is becoming a problem even on PCs due to the
poor performance of popular OSs. The introduction of device polling
has improved the capture process quite a bit but not really solved the
problem. The problem with polling is that the time-stamps may not be
accurate if polling is too slow. If polling is fast, very little CPU
is left for user space applications.

Luca showed an approach [3] to passive packet capture that, combined
with device polling, improves performance further and allows packets to
be captured at (almost) wire speed on fast machines. He proposed a
packet ring data structure in the network driver into which incoming
packets are copied. The packets are not queued into kernel data
structures.
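
The packet ring idea can be sketched as a fixed-size circular buffer
that the driver fills and the capture application drains. The sketch
below is a simplified user-space model, not the design from [3]: the
class and method names are hypothetical, and the policy of dropping
(and counting) packets when the ring is full is an assumption.

```python
class PacketRing:
    """Fixed-size ring: the driver writes slots, the application reads them."""

    def __init__(self, slots):
        self.buf = [None] * slots
        self.head = 0      # next slot the driver writes
        self.tail = 0      # next slot the application reads
        self.count = 0     # slots currently filled
        self.dropped = 0   # packets lost because the ring was full

    def insert(self, packet):
        """Receive path: copy the packet into the ring instead of a kernel queue."""
        if self.count == len(self.buf):
            self.dropped += 1  # reader too slow: record the loss here
            return False
        self.buf[self.head] = packet
        self.head = (self.head + 1) % len(self.buf)
        self.count += 1
        return True

    def read(self):
        """Capture application: returns None if the ring is empty."""
        if self.count == 0:
            return None
        packet = self.buf[self.tail]
        self.tail = (self.tail + 1) % len(self.buf)
        self.count -= 1
        return packet


ring = PacketRing(slots=4)
for i in range(6):  # burst of 6 packets into 4 slots
    ring.insert(b"pkt%d" % i)
first = ring.read()
```

Keeping a drop counter at the ring makes it visible that loss happened
at this stage rather than further up the stack.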

It is important to know where packet loss occurs (driver, kernel, or
user-space).

Discussion:

AP: I have obtained different measurement results with a 1GBit
    card. Why are the differences in packet loss between a 1GBit card
    and a 100MBit card so high?
LD: The logic of the faster cards is much more efficient so interrupts
    are raised less often. 10GBit cards are even more efficient. 

JQ: The point is that heavy packet loss only occurs
    with a long sequence of small packets, so it does not happen
    that often. Thus, libpcap works in many cases but does not always
    capture all packets.
AP: That was our observation: when you have a DNS attack, then you
    lose data. However, we do not care about counting packets if we
    get a DNS attack.
LD: There is another issue that you should consider. When you lose
    packets you should also know where you lose packets (kernel
    level, driver level or user space level). If tcpdump, for example,
    tells you that you do not have packet loss it does not mean that
    you do not lose packets.

JS: What are the differences between the implementations of the
    different OSes? 
LD: Linux uses soft interrupts which lead to low performance. Windows
    uses optimized network drivers and deferred interrupts to achieve
    better performance.

MB: Are there differences in the implementation between Linux 2.4 and
    2.6?
LD: Not much.

JS: How much work is it to adapt the device drivers?
LD: Very little. You just have to modify the driver to call my
    function instead of netif_rx and to disable the transmission.

MB: Why do you not implement the function at the beginning of the
    chain in the kernel before the packet is filled into the whole
    structures? Then it would be generally available to each network
    card.
LD: Such a generic approach should be doable. However, it is much
    easier to just override the system call in the driver.

3. Solving the middle-box problem (Juergen Q.)
----------------------------------------------

Firewalls and NATs are middle-boxes and integral components of the
Internet infrastructure but they are also obstacles for many
communication services including IP telephony, video conferencing,
etc. Several alternative approaches for overcoming this problem are
currently under investigation.

Juergen showed [4] three of them: (1) controlling middle-boxes by more or
less central entities like 'call agents', as investigated by the IETF
MIDCOM WG [5], (2) path-coupled signaling between terminals and
middle-boxes, as investigated by the NSIS WG [6], and (3) smart middle-boxes
configuring themselves based on observed signaling messages. A
comparison of advantages and disadvantages of the approaches shows
that in different scenarios, different approaches are preferable.

The first approach is a telco-style solution. It is widely understood
because it is close to gateway controllers. 

The MIDCOM WG was chartered to select an existing protocol. They
selected SNMPv3 as the appropriate protocol for configuring firewalls
and NATs. The reason was that SNMPv3 is a full standard. However, some
people claim that SNMP was not designed for that purpose.

The second approach is path coupled signaling, as described above in
Marcus' presentation. The terminals are enabled to open pin holes. The
topic is addressed by the NSIS WG. If there were a transport protocol
for signaling, it would progress more quickly. This approach is the only
one that will work if the SIP signaling path is different from the
data path. 

The third approach is the smart middle-box approach. A smart middle-box
is a device that is smart enough to handle everything. There is a
small-office/home-office firewall available from Cisco. However,
Juergen did not test it, because there are no specifications
available. A main problem here is that the firewall must be able to
support new signaling protocols. Another issue is that the signaling
must be path coupled with the session that shall be established. There
should be some policy control. Juergen implemented a modular firewall
on NetBSD with loadable kernel modules which extend ipfilter with new
rules.

The current conclusion is that all three approaches are useful and
needed in different environments. A telecommunication operator will
like to have the call-agent approach. At home, users definitely want
to have smart middle-boxes. In some other scenarios, path coupled
signaling is the best solution. However, probably not all three
approaches will survive.


Discussion:

AP: What happens when you forget to close the pin holes?
JQ: For approaches 1 and 2, there are timeouts. I would call it
    "middle-box control" rather than "middle-box management" because you
    do not configure permanent state.

AP: Is the smart middle-box approach not the best solution in theory?
JQ: All approaches have advantages and disadvantages. The problem is
    that networks are usually too complicated. For example, the
    signaling and the voice data stream do not necessarily take the
    same way. I also consider it the best solution,
    except for this disadvantage. Another problem is that it needs
    quite a lot of computing power on the firewall or NAT because it
    has to be aware of all the protocols involved. 
AP: Is the functionality not included in iptables? So implementing it
    should be quite simple.
JQ: Yes, but you also need to implement the protocol parser.

AP: [related to approach 1] What happens if the gateway is owned by the
    end user (for example at home networks)? The end user will have to
    trust his operator. Furthermore, you need authentication etc.
JS: I completely disagree, the firewall does not want to trust the IP
    phone.
AP: I was thinking about the firewall at my home. There will be more
    home user firewalls than firewalls of companies. At home, I trust
    my IP phone.
JQ: It is possible to have your own SIP server which controls the
    personal firewall. The call-agent will only control firewalls
    which are somewhere else (SIP proxy chaining).
AP: Then I will need a SIP proxy at home, but I do not want another
    box there.
JQ: It is possible to run the SIP proxy as a process on your PC (such
    as your personal firewall) or in a box that already provides
    firewall functionality.
AP: If you put the functionality into the same box, what is the
    difference from a smart middle-box?
JQ: In that case, the two approaches are almost merged. However, if
    you look at companies where you have 50 users but only a single
    SIP server in one firewall it is fine.

AP: People who have ADSL at home will start using IP telephony, so
    this issue will become important.
JQ: Firewalls must be extended with SIP proxy servers (or opened).
JS: What happens to your personal firewall if you get an incoming call
    that wants to go through your firewall?
JQ: You will listen on the SIP port and then you will have to open
    your firewall.
OF: Today, in France, if you use the current offers on VoIP over ADSL,
    you have the phone access directly on the box that you have
    received from your provider. You cannot put a firewall between
    your VoIP signaling. 
AP: ADSL is often offered by other companies than your telephone
    company, who are competitors, so they try to give you locked
    solutions so that you cannot switch the operator.

OF: There are applications which configure your firewall using UPnP.
JQ: UPnP is also one of the protocols here.
MB: Our approach was not targeted to the home user with one firewall
    and one phone, but UPnP is.
JQ: UPnP is not well suited if you have a larger office.


LD: What about security?
JQ: If a remote phone says "open your firewall for me" the incoming
    call will have to be acknowledged. This can be done via
    signaling. In order to be more secure the protocol could be
    extended to restrict the opened UDP traffic to the source address of
    the calling phone.
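
The proposed restriction can be pictured as an optional source-address
field on the pinhole rule. The following is a minimal sketch with
hypothetical names and documentation-range addresses; it does not
reflect any particular firewall or the protocol discussed in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pinhole:
    """A temporary firewall opening for inbound UDP media traffic."""
    dst_addr: str
    dst_port: int
    src_addr: Optional[str] = None  # None: any source may use the pinhole

    def permits(self, src_addr, dst_addr, dst_port):
        if (dst_addr, dst_port) != (self.dst_addr, self.dst_port):
            return False
        return self.src_addr is None or self.src_addr == src_addr


# Unrestricted pinhole versus one bound to the calling phone's address.
open_hole = Pinhole("192.0.2.10", 40000)
restricted = Pinhole("192.0.2.10", 40000, src_addr="198.51.100.7")

stranger_open = open_hole.permits("203.0.113.9", "192.0.2.10", 40000)
stranger_restricted = restricted.permits("203.0.113.9", "192.0.2.10", 40000)
caller_restricted = restricted.permits("198.51.100.7", "192.0.2.10", 40000)
```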

LD: With IPv6, you will not have NAT but you will have firewalls so
    your approach will be still useful.
JQ: NAT will still be used with IPv6.
GP & LD: NAT will disappear.
JQ: NAT will still be used because huge companies such as IBM, Sun, HP
    use NAT although they have enough IP addresses. However, it will
    not be used as intensively as today.
AP: If you have a lot of machines at home - Morris said yesterday that
    there might be 100000 machines at home - your network is easier
    to manage using NAT.
LD: You can connect to all machines using the same address.

AP: Why does the MIDCOM WG not use NETCONF instead of SNMPv3?
JQ: It is a fast and dynamic configuration issue which you do on a
    call-by-call basis. I am not sure if NETCONF is the right tool here.
AP: No, it is not.
JS: Another reason is that SNMPv3 is a full standard. Formally, it is
    better to use something that already exists than something that
    might exist some time.

JQ: SNMP was not designed to do these kinds of things.
AP: SNMP was designed to do that but it is not happening.
JQ: If it is not happening, there is probably something wrong with the
    design.
JS: It is completely contradictory that you do a protocol
    evaluation which concludes that SNMP is the right choice and
    afterwards without really knowing the details you find out that
    SNMP was not really designed for the purpose.
JQ: It can serve the purpose but that does not mean that it has to be
    specifically designed to serve the purpose. The other protocols
    were also not designed to serve the purpose (COPS, for example).
JS: What was the technical argument behind the statement "SNMP was not
    designed for that purpose"?
JQ: Transactions are possible with SNMP but not really
    convenient. It is possible to do everything in a single set
    operation, but only if it fits.
AP: How much data do you usually need? Maybe it will fit.
JS: It might fit but it is a problem of the problem. With the security
    enabled, the space in a set operation will be quite limited.

AP: What about scalability? If the security features of SNMPv3 are
    used, key management will become very complicated.
JQ: Key management is a problem anyway, also with the other protocols
    that were considered.

AP: What about the future of the WG?
JQ: The WG will probably be closed soon after a MIB has been released.

AP: Is the constraint of not defining a new protocol a good argument
    for smart middle-boxes? If you are not allowed to define a
    protocol, do not define and use a protocol.
JQ: We defined a simple protocol as an Internet draft. However, the
    area director said we first have to define a MIB and then we are
    allowed to publish the new protocol as an informational RFC.

OF: [related to approach 3] The intelligence of the firewall must be
    configured somehow. Is this possible?
AP: The firmware of a firewall can be updated in the same way as usual
    operating systems.

AP: What protocols can be used?
JQ: ftp, h.323, SIP, rtcp
AP: Which ones have been done?
JQ: None, but ftp should be quite simple to do.
AP: ftp has.
MB: No, iptables does not support ftp.
JQ: Well, ftp is simple to implement and we used ipfilters and not
    iptables.
 
AP: Was UPnP not specifically designed to solve the mentioned problems?
JQ: UPnP is the best choice for the single computer home
    environment. Therefore, it is also discussed in the MIDCOM WG. But
    there are scenarios where UPnP is not sufficient.


4. Using Distributed Object Technologies for Network Management (George)
------------------------------------------------------------------------

The use of distributed object technologies (DOT) for network management
was intensively researched in the mid- to late 1990s. The X/Open-NMF
JIDM produced guidelines for translating SNMP SMI and OSI-SM GDMO
models to CORBA IDL and using CORBA as the access mechanism. This
approach was never adopted per se, but variations of it have
been used, mostly in telecommunication environments. It has recently
become evident that a semantic rather than syntactic approach for
converting SNMP SMI and GDMO models to distributed object interface
specifications is the way forward. George's presentation [7] reviewed the
state of the art in using distributed object technologies for network
management and proposed a framework that circumvents their usual
problems, potentially making it possible to adopt distributed objects for
Internet management.

For all cases other than table retrieval, a plain Get request is
sufficient. However, this is not true for SNMPv1 because of the lack
of proper error handling.

A disadvantage of DOT is that the default is to have one get method
per object attribute. In case of large object populations, this leads
to sub-optimal information retrieval. George therefore proposes a
different mapping. For all attributes (i.e. properties) use one method
per object. Dynamic counters and probably time attributes should be
grouped.
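
The difference between the two mappings can be sketched by counting
remote invocations. The example is illustrative only: the class, its
attributes and the values are hypothetical, and the per-attribute
getters of the default mapping are modelled by a single accessor that
is invoked once per attribute.

```python
class ManagedInterface:
    """Toy managed object that counts how many remote invocations it receives."""

    def __init__(self):
        self.calls = 0
        self._props = {"name": "eth0", "mtu": 1500, "speed": 100_000_000}
        self._counters = {"in_octets": 1200, "out_octets": 3400}

    # Default DOT mapping: one invocation per attribute.
    def get_attribute(self, attr):
        self.calls += 1
        return {**self._props, **self._counters}[attr]

    # Proposed mapping: one method returns all properties of the object ...
    def get_properties(self):
        self.calls += 1
        return dict(self._props)

    # ... and the dynamic counters are grouped behind a second method.
    def get_counters(self):
        self.calls += 1
        return dict(self._counters)


per_attribute = ManagedInterface()
values = {a: per_attribute.get_attribute(a)
          for a in ("name", "mtu", "speed", "in_octets", "out_octets")}

grouped = ManagedInterface()
grouped_values = {**grouped.get_properties(), **grouped.get_counters()}
```

Both clients end up with the same data, but the grouped mapping needs
two invocations per object instead of one per attribute, which matters
once the object population is large.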


Discussion:

JS: Some of my code does not use Get at all.
GP: Why do you use GetNext to retrieve single instance objects?
JS: The reason is that when you ask for a list of objects and one
    object is not present then your whole get operation fails.
AP: This is only true for SNMPv1.
JS: Well, I am talking to real agents. Anyway, it is possible to
    generate stubs out of MIBs.
GP: Yes, that is what WS people do. BTW, is there any SNMP API with
    stubs?
JS: Yes, I have written one (smidump -f scli). The SNMP protocol is 
    very simple and it requires some engineering on top of it to make
    it usable. The interesting thing is, that in all the years with
    SNMP people have not done this.
GP: I completely agree.
AP: We as researchers keep making the same mistakes. When we talk
    about simple, we talk about simple design. But if you want to have
    something simple it should be simple to use. The main advantage of
    WS is that it is easy to use because you get it for free
    everywhere.
GP: If you take WS with a plain SOAP API it is very difficult to use.

AP: Why should there not be a Get operation in the WSDL file that
    retrieves all the data in a large XML file on which XPath and
    XQuery can be used to select more specific data?
GP: The proposal is not only applicable to WS. It is transparent to
    all DOTs.
AP: There is another advantage. This is easy to parse with existing
    software. With MS Excel, for example, you can write it down in
    four lines. If you have an XML document you will have the XML
    handling yourself. For users, this is far more difficult.
AvM: HP decided to use a grid service approach because they think
     there will not be consensus on the grouping of objects. That is
     why they prefer to deal with large XML documents. In general, the
     XML based approach seems to be more useful in a large scale
     operator environment where experts do the management.
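
The alternative AP raises, retrieving one large XML document and
selecting from it on the manager side, might look like the following
sketch. The XML layout is a hypothetical encoding that borrows IF-MIB
column names; Python's xml.etree.ElementTree is used here, which
supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply of a single "get everything" operation.
doc = """
<ifTable>
  <ifEntry><ifIndex>1</ifIndex><ifDescr>lo</ifDescr><ifOperStatus>up</ifOperStatus></ifEntry>
  <ifEntry><ifIndex>2</ifIndex><ifDescr>eth0</ifDescr><ifOperStatus>up</ifOperStatus></ifEntry>
  <ifEntry><ifIndex>3</ifIndex><ifDescr>eth1</ifDescr><ifOperStatus>down</ifOperStatus></ifEntry>
</ifTable>
"""

root = ET.fromstring(doc)

# XPath-style predicate: entries whose ifOperStatus child has the text "up".
up_entries = root.findall("./ifEntry[ifOperStatus='up']")
up_names = [entry.findtext("ifDescr") for entry in up_entries]
```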

GP: Network management is not that different from other distributed
    systems topics.
AP: Management is not different from anything else so we should use
    the same technologies. However, we may have our specialized WSDL
    files.
AvM: The question is what should be standardized.
AP: The question is what will be used. I think that the users will
    decide for what they find easy to use. George's approach is
    simpler to use than shipping large XML documents. However, if you
    manage sophisticated machines within an operator environment where
    you have skilled people with the XML approach you are more
    powerful. But if we have 100000 computers per human being we
    cannot manage them by programming with XML documents. It does not
    scale anymore.

AvM: Why do you consider WS to have strong typing?
GP: On the SOAP level, it is loose typing. But if you use stubs, you
    will get strong typing.

AP: Is it better to ship entire tables or is it better to retrieve
    single entries?
GP: If you ship entire tables it is possible to have faster
    agents. With the retrieval of single entries it is possible to save
    bandwidth. However, bandwidth is not an issue these days. If you
    need to get single objects you can implement it. However, then you
    will need naming etc. and it will get complicated.

AP: Would it not be useful to have a possibility to retrieve an entire
    table and specifying arguments that select rows which have certain
    properties?
GP: I was trying to keep it simple. It should be done on the manager's
    side.

GP: Why did MIB designers invent linked replies? Why did they not
    include all the data in a single big reply?
AP: Because the reply could consume MBs of data.
JS: Because the model allows asking multiple agents in one
    request. That makes it impossible to put everything into one
    reply.
OF: If you have an application level routing on distinguished names,
    the request can be forwarded to different agents depending on the
    prefix.

JS: [shows his TCP-MIB API, which has been generated by a compiler]
    I agree with the statement that for real management applications
    the API is the key. In my API, there is a stub function to which you
    pass a mask and which then retrieves the desired data from the
    MIB. I allow the application to choose the data. For read/write
    stuff, such as the tcp table, for example, you get another stub
    function that retrieves the whole table. You can mask some columns
    if you want to. Another stub function is "get one entry". With
    this API you can write management applications without knowing
    anything about SNMP. The compiler was written in 2000. However,
    people are still programming with Get and GetNext API calls.
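
The mask-based stub style JS describes can be sketched as follows. This
is not the generated API itself: the flag names, the function name and
the in-memory dictionary standing in for the agent are all
hypothetical, and a real stub would issue SNMP requests instead.

```python
from enum import IntFlag


class TcpMask(IntFlag):
    """One bit per scalar the application may ask for (hypothetical names)."""
    RTO_MIN = 1
    RTO_MAX = 2
    MAX_CONN = 4
    CURR_ESTAB = 8


# Stand-in for the remote agent; values are made up for the example.
FAKE_AGENT = {
    TcpMask.RTO_MIN: 200,
    TcpMask.RTO_MAX: 120000,
    TcpMask.MAX_CONN: -1,
    TcpMask.CURR_ESTAB: 3,
}


def get_tcp(mask):
    """Return only the scalars selected by the mask, keyed by name."""
    return {flag.name.lower(): FAKE_AGENT[flag]
            for flag in TcpMask if flag & mask}


selected = get_tcp(TcpMask.RTO_MIN | TcpMask.CURR_ESTAB)
```

The application states what it wants through the mask and receives
plain values back, without touching Get or GetNext directly.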

AP: I think vendors will include WS into their devices like they have
    put web servers for manual configuration. Probably, these WS will
    not be standardized.


5. Performance Evaluation of Web Services as Management Technology (George)
---------------------------------------------------------------------------

Web Services has recently emerged as an XML-based technology for
distributed access to Internet services. A careful examination reveals
that Web Services is a technology with many similarities to
distributed objects, so it could also be used for network
management. This could possibly be done through the framework
presented in the previous talk, which avoids potential scalability
problems. In this presentation, George first identifies the similarities of
WS and distributed object technologies. He then examines the usability
and suitability of WS for network management and presents a performance
evaluation of selected scenarios in comparison to SNMP and CORBA.

Aiko did similar measurements but got different results.

Discussion:

GP: We used WASP and not gSOAP, because gSOAP is highly optimized
    software and there is no highly optimized software for CORBA. To
    ensure a fair comparison, we used a "lighter" API.
AP: This is an important point. With WS you can get highly optimized
    software for free.

AvM: WS are not OO technology, at least not at the moment (WSDL 1.1)
     because inheritance and statefulness are missing. Maybe those will
     be added in WSDL 1.2 or 2.0. It would be better to
     compare CORBA with Grid Services (OGSI and OGSA) because WS is
     service oriented technology whereas GS is OO technology.

AP: It is an interesting observation that sometimes it takes longer to
    retrieve values from kernel space than to do the entire protocol
    handling. So protocol handling often is only a minor issue. If you
    use NET-SNMP, for example, which does not do any caching, it will
    retrieve value after value out of the interface table even when
    you look at the whole table. Therefore, if you make a Get on 100
    values from the interface table you will make 100 single kernel
    polls.
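
The effect AP describes can be made concrete by counting kernel polls
with and without a per-request snapshot of the table. The classes below
are a hypothetical model for illustration and are not NET-SNMP code.

```python
class KernelTable:
    """Stand-in for the kernel interface table; counts how often it is polled."""

    def __init__(self, rows):
        self.rows = rows
        self.polls = 0

    def read_all(self):
        self.polls += 1
        return list(self.rows)


class UncachedAgent:
    def __init__(self, kernel):
        self.kernel = kernel

    def get(self, index):
        return self.kernel.read_all()[index]  # one kernel poll per value


class CachedAgent:
    def __init__(self, kernel):
        self.kernel = kernel
        self.snapshot = None

    def begin_request(self):
        self.snapshot = None  # invalidate the snapshot for each new request

    def get(self, index):
        if self.snapshot is None:
            self.snapshot = self.kernel.read_all()  # single poll per request
        return self.snapshot[index]


kernel_a = KernelTable(list(range(100)))
uncached = UncachedAgent(kernel_a)
values_a = [uncached.get(i) for i in range(100)]

kernel_b = KernelTable(list(range(100)))
cached = CachedAgent(kernel_b)
cached.begin_request()
values_b = [cached.get(i) for i in range(100)]
```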

GP: We started our implementations in Java. Later, we recoded them in
    C++ due to the overhead of Java.
AP: We did some implementations in Java because we wanted to run the
    applications on mobile phones. The problem with Java is that it is
    very difficult to get low level information about the amount of
    memory that you use.

AP: I have different measurement results that lead me to different
    conclusions. One reason is that George used hard-wired data with
    the TCP implementation (e.g. counters).  So, in my figures, SNMP
    is much worse than anything else.
JS: This is due to a bad behavior of the SNMP implementation which may
    lead to a large number of get requests. There are different results
    with non-hard-wired data.
AvM: SNMP is bad in your cases.
JS: No, the implementation is bad. The quality of the SNMP
    implementations after 12 years of SNMP is totally disappointing,
    so we should stop.

AP: WS without compression leads to a very large amount of data that has
    to be shipped, much more than SNMP or CORBA use. Compressed WS are
    very good with respect to network usage.

AP: When using compression with WS, the efficiency increases if more
    data is retrieved. My conclusion is that it is impossible to
    conclude which technology is better in general because it depends
    on the use case and the used software packages. If you just want
    to retrieve sysUpTime from a router, do it via SNMP. If you want
    to retrieve the entire interface table for 500 customers that are
    connected to your ADSL multiplexer, forget about SNMP, because
    compressed WS are far more efficient. My measurements also show
    that the time for the protocol is negligible compared to the time
    of the data retrieval.
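
Why compression pays off more as more data is retrieved can be seen in
a small experiment with repetitive XML. This sketch uses zlib on a
hypothetical XML encoding of an interface table; it does not reproduce
the measurements discussed here, and the element names and values are
made up.

```python
import zlib


def interface_table_xml(rows):
    """Build a hypothetical XML interface table with the given number of rows."""
    entries = "".join(
        f"<ifEntry><ifIndex>{i}</ifIndex><ifDescr>adsl{i}</ifDescr>"
        f"<ifInOctets>{i * 1000}</ifInOctets></ifEntry>"
        for i in range(1, rows + 1)
    )
    return f"<ifTable>{entries}</ifTable>".encode()


def compression_ratio(rows):
    """Compressed size divided by raw size (smaller is better)."""
    raw = interface_table_xml(rows)
    return len(zlib.compress(raw)) / len(raw)


small = compression_ratio(1)    # a single row
large = compression_ratio(500)  # a whole table, e.g. 500 customers
```

The repeated tag names dominate a large table, so the compressor has far
more redundancy to remove there than in a single-row reply.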

JS: It is not valid to make comparisons with bad
    implementations. 
AP: We want to compare what is practically available and we do not
    care about theoretical comparisons.


References
----------
[1] http://www.ibr.cs.tu-bs.de/projects/nmrg/meetings/2004/bremen/brunner.pdf
[2] http://www.ntp.org
[3] http://www.ibr.cs.tu-bs.de/projects/nmrg/meetings/2004/bremen/deri.pdf
[4] http://www.ibr.cs.tu-bs.de/projects/nmrg/meetings/2004/bremen/quittek.pdf
[5] http://www.ietf.org/html.charters/midcom-charter.html
[6] http://www.ietf.org/html.charters/nsis-charter.html
[7] http://www.ibr.cs.tu-bs.de/projects/nmrg/meetings/2004/bremen/pavlou1.pdf