Re: [tkined] SNMP to SQL

From: Eddie Corns (E.Corns@ed.ac.uk)
Date: Wed Feb 23 2000 - 19:35:11 MET


   Date: Wed, 23 Feb 2000 14:45:23 +0100 (CET)
   From: Szokoli Gabor <szocske@vaskutya.sch.bme.hu>

   I'm talking to two people at the same time here, trying to cut down on my
   traffic.

> Actually you don't know or care which order it puts the data into the
> database table - you only care about the order it comes out when you
> need to process or report on it.
   I get Eddie's original point: records could be saved one after another,
   in the order they arrive, and that would keep things ordered properly,
   so there would be no need for an ORDER BY in the query, and everyone would
   be content.
   With a sophisticated SQL DB, we get the same results, but a piece of
   information is lost that would allow major optimisations: even if the
   table is indexed properly (I'm not completely familiar with SQL terms
   here), the DB will find out at each and every INSERT, by hard work, that
   "heck, this new record goes after the last one, and not in between
   somewhere!"
   But we knew it all the time, and if we could say in SQL "this table will
   be loaded in the order the index wants it", we could save the trouble.
   I mean, the results would be the same of course, but it would take less
   time.

More or less, but it's not so much keeping the records in order for later use
as the time taken to load in the batch of data that matters. Once the
data is in the relational table we may still want to look at it in a different
ORDER. I accept now that this is a perfectly acceptable approach, and my
earlier doubts about needing a whole RDBMS were probably a result of thinking
that this was a kludgey thing to do.

> >In practical terms, my first attempt to update the database by doing
> >an UPDATE
> >command for each poll required several hours for each iteration, however once

   What exactly were you doing with what amount of records in which RDBMS?
   Are we talking about simply adding new records to the DB?

Since I update the 'rolled up' data for the yearly tables etc. directly rather
than cascading from 15min -> daily -> .. -> yearly, for each of the 8,000+
entries I was doing 1 UPDATE of _every_ roll-up period plus 1 INSERT for the
15min data. Now of course I only write the 15min polls to a file, which gets
imported, then at midnight work out the daily counts and update all the
roll-up tables. All the work I've done was with MySQL. I think it's probably
the fastest free one (it claims to be a lot faster than PostgreSQL) and has
pretty good functionality.
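
In case it's useful, this is roughly what the poll-to-file step looks like;
the spool file path, column layout and table name below are invented for the
sketch, not my real schema:

package require Tnm 2.1

# Append one tab-separated line per interface to a spool file;
# the (host, ifIndex, ifDescr) layout is made up for this example.
set out [open /tmp/poll15min.tsv a]
foreach addr $argv {
    set s [snmp session -address $addr]
    catch {$s walk x "ifIndex ifDescr" {
        puts $out "$addr\t[lindex [lindex $x 0] 2]\t[lindex [lindex $x 1] 2]"
    }}
    $s destroy
}
close $out

# At midnight the whole file then goes in with one bulk load, e.g. in
# MySQL (table name poll15min again made up):
#   LOAD DATA LOCAL INFILE '/tmp/poll15min.tsv'
#   INTO TABLE poll15min (host, ifindex, descr);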

> >I converted to outputting the data to a file then importing it, it
> >only took a
> >few seconds! I assume Stuart was noticing similarly huge differences.
> >
> Not so huge as that, but as someone pointed out the SQL*Loader is the
> fastest way to get stuff into a table.
   I'm showing my lack of experience again:
   is SQL*Loader a DB-vendor-independent way to load a table from a file?

I think any sensible SQL DB would have to have a file import facility,
though the loader tool itself is vendor specific (SQL*Loader is Oracle's;
MySQL's equivalent is LOAD DATA INFILE).

> >
> This is intriguing. Would you be willing to post a snippet of code
> where you do the parallel polls? Do you have any thoughts on the
> resources (memory, cpu) used compared to a straight walk?

   I'd be interested, too!

I did about 8 versions in total; here are some of the more interesting ones:

1)

package require Tnm 2.1

# Walk ifDescr on every host given on the command line, counting the
# interfaces found; each walk is synchronous.
set ifcnt 0
foreach addr $argv {
    set snconn [snmp session -address $addr]
    catch {$snconn walk x ifDescr {
        incr ifcnt
    }}
    $snconn destroy
}
puts $ifcnt

The simplest version: it just walks every host given on the command line,
incrementing a counter to ensure that it did actually reach them all. This
takes 1min 10sec with a given list of hosts, from which I removed all those
that took a while to respond (leaving 83 hosts).

2)

This version extracted ifNumber and did that number of getnexts; it took
about 1min 15sec. All done synchronously.
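
A sketch of what this looked like (reconstructed, not the exact code I ran):

package require Tnm 2.1

proc mib_value blist {
    return [lindex [lindex $blist 0] 2]
}

set ifcnt 0
foreach addr $argv {
    set snconn [snmp session -address $addr]
    if {![catch {$snconn get ifNumber.0} num]} {
        # Step through the ifDescr column with one synchronous getnext
        # per interface, following the OID of each reply.
        set next ifDescr
        for {set i 0} {$i < [mib_value $num]} {incr i} {
            if {[catch {$snconn getnext $next} vbl]} break
            set next [lindex [lindex $vbl 0] 0]
            incr ifcnt
        }
    }
    $snconn destroy
}
puts $ifcnt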

3)

This just did an explicit get on ifDescr.$ind, where ind was looped from 0 to
ifNumber; this took about 1min 1sec. Again synchronous.
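
The per-host body was essentially just (a sketch; this sits inside the
foreach over hosts):

    if {[catch {$snconn get ifNumber.0} num]} continue
    set cnt [mib_value $num]
    for {set ind 0} {$ind <= $cnt} {incr ind} {
        # One synchronous get per interface.
        if {![catch {$snconn get ifDescr.$ind}]} {incr ifcnt}
    }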

4)

This was like 3 but using asynchronous gets, e.g.

  $snconn get ifDescr.$ind {incr ifcnt}

Again about 1min 1sec.
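
My guess at why the asynchrony gained nothing here: the overall structure
was unchanged, still one host at a time, presumably with a wait before each
session was destroyed. Something like:

package require Tnm 2.1

proc mib_value blist {
    return [lindex [lindex $blist 0] 2]
}

set ifcnt 0
foreach addr $argv {
    set snconn [snmp session -address $addr]
    if {[catch {$snconn get ifNumber.0} num]} {$snconn destroy; continue}
    set cnt [mib_value $num]
    for {set ind 0} {$ind <= $cnt} {incr ind} {
        $snconn get ifDescr.$ind {incr ifcnt}    ;# asynchronous get
    }
    snmp wait           ;# but we still drain each host before the next
    $snconn destroy
}
puts $ifcnt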

5)

package require Tnm 2.1

proc mib_value blist {
    return [lindex [lindex $blist 0] 2]
}

set ifcnt 0
# One session per host, created up front.
foreach addr $argv {
    set snconn($addr) [snmp session -address $addr]
}
# Fire off every get asynchronously; each response just bumps the counter.
foreach addr $argv {
    if {[catch {$snconn($addr) get ifNumber.0} num]} continue
    set cnt [mib_value $num]
    set ind 0
    while {$ind <= $cnt} {
        $snconn($addr) get ifDescr.$ind {incr ifcnt}
        incr ind
    }
}
snmp wait
foreach addr $argv {
    $snconn($addr) destroy
}
puts "$ifcnt"

Creates a session for each host, fires out every asynchronous get (but
note it does all of one machine before doing the next), then waits for all
responses to be handled. This takes about 45sec.

6)

package require Tnm 2.1

proc mib_value blist {
    return [lindex [lindex $blist 0] 2]
}

# Send the ind'th get to every host before moving on to index ind+1,
# interleaving the requests across the hosts.
proc dohosts {} {
    global snmpQ snconn maxind
    for {set ind 0} {$ind <= $maxind} {incr ind} {
        foreach addr [array names snmpQ] {
            catch {$snconn($addr) get [lindex $snmpQ($addr) $ind] "procpoll $addr $ind"}
        }
    }
}
proc procpoll {addr ind} {
    global ifcnt
    incr ifcnt
}

set ifcnt 0
foreach addr $argv {
    set snconn($addr) [snmp session -address $addr]
}
# Build a queue of variables to fetch for each host, remembering the
# length of the longest queue.
set maxind 0
foreach addr $argv {
    if {[catch {$snconn($addr) get ifNumber.0} num]} continue
    set cnt [mib_value $num]
    if {$cnt > $maxind} {set maxind $cnt}
    set ind 0
    while {$ind <= $cnt} {
        lappend snmpQ($addr) ifDescr.$ind
        incr ind
    }
}
dohosts

snmp wait
foreach addr $argv {
    $snconn($addr) destroy
}
puts "$ifcnt"

Like 5) but here it does the first get for each host before going back to do
the second get for each host etc. This one takes about 24sec.

7) the pièce de résistance

package require Tnm 2.1

proc mib_value blist {
    return [lindex [lindex $blist 0] 2]
}

# Send the next queued get for this host.
proc dohost addr {
    global snmpQ snmpQind snconn
    catch {$snconn($addr) get [lindex $snmpQ($addr) $snmpQind($addr)] "procpoll $addr"}
}
# Each response triggers the next request for the same host, so there is
# never more than one request in flight per host.
proc procpoll addr {
    global snmpQ snmpQind snconn ifcnt
    incr ifcnt
    incr snmpQind($addr)
    if {$snmpQind($addr) < [llength $snmpQ($addr)]} {
        dohost $addr
    }
}

set ifcnt 0
foreach addr $argv {
    set snconn($addr) [snmp session -address $addr]
}
# Build the per-host queue of variables to fetch.
foreach addr $argv {
    if {[catch {$snconn($addr) get ifNumber.0} num]} continue
    set cnt [mib_value $num]
    set ind 0
    while {$ind <= $cnt} {
        lappend snmpQ($addr) ifDescr.$ind
        incr ind
    }
    set snmpQind($addr) 0
}
# Kick off one request per host; the callbacks keep each pipeline going.
foreach addr $argv {
    dohost $addr
}
snmp wait
foreach addr $argv {
    $snconn($addr) destroy
}
puts "$ifcnt"

This sends the first get to every host but uses the response as a trigger to
send the next one to that host. This takes about 8sec.

---

Note I didn't think too much about boundary conditions; some of those
<= $cnt should perhaps be <, etc. (I'll check this before putting it to
use).

I used ifDescr simply because that's what I'm using in the actual system.
What I need to do now to complete the checks is to use the same code to
walk a very large tree (the whole MIB) on just one device.

I had always expected number 6 to be the fastest (though I realise it
would probably not scale to doing a large tree). I'm sure someone will
come up with an explanation; my own guess is that firing off lots of
requests at once risks dropped UDP packets and retransmission timeouts,
whereas 7 never has more than one request outstanding per host.

I think that number 7 actually also has the best compromise on resources,
since the number of outstanding requests is strictly bounded by the number
of devices.

Eddie


