Re: [tkined] SNMP to SQL

From: saustin_deja@my-deja.com
Date: Mon Feb 21 2000 - 20:18:51 MET


 


On Fri, 18 Feb 2000 12:02:05 Juergen Schoenwaelder wrote:

> I have a few questions that fit nicely in this thread:
>
> - What are the database schemas you are using? Are they derived from
>   polled MIB tables or are they more generic?

Well, our architecture is meant to be distributed and scalable - in other words, I might have several poller/data reduction components all feeding into a centralized data warehouse. So the schema for each is a little different.

But the basic idea is to build a generic measurement collection and reporting system that knows as little as possible about the type of measurements it contains.

User-defined configuration data determine how a measurement is categorized for processing, storage and reporting. For example, you might call ifInOctets a throughput measurement; other types of measurements could be classified as errors, etc.

I define "data reduction" to mean the process of turning the raw measurement into a database entity that can show up on a report. In my case, this means a number of steps.

From polling I have the IP address of the element, the OID for each instance of the MIB variable I've walked (mib_var_name.instance), the time of the PDU, and the value of the instance. So this information is the basis of the schema for my "raw measurement" table. (I can either use my own scotty poller script, or the poller from Tivoli NetView MLM.)
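In rough SQL terms, the raw table looks something like this (just a sketch; the table and column names here are illustrative, not our exact DDL):

  -- One row per MIB variable instance per poll, straight from the poller
  -- flat file (illustrative names, not the exact production schema)
  CREATE TABLE raw_measurement (
      element_ip  VARCHAR2(15)  NOT NULL,  -- IP address of the polled element
      mib_oid     VARCHAR2(128) NOT NULL,  -- mib_var_name.instance from the walk
      poll_time   DATE          NOT NULL,  -- time the PDU was received
      poll_value  NUMBER        NOT NULL   -- value of the instance
  );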

I use a scotty script with various "mib" commands to convert the instance information from the OID and combine it with the element IP into what we call a "Measurement Instance ID." This "MIID" is a string of the form IP:shelf-slot-port:circuit_id and is meant to be generic for ATM, FR and Ethernet switches. Each vendor seems to make up its own mapping of instance index to shelf-slot-port:circuit_id, so I have Tcl/Tnm functions that dissect the OID and do the mapping.

Once I have the "MIID", it becomes the means for identifying a measurement source throughout the whole system (i.e. primary key or part of primary key with things like timestamp, MIB name, etc.). If I want to group measurements on the same switch, I can just extract the IP portion of the MIID. Same for slot, port, etc.
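To give a flavour of what that grouping looks like in SQL (table and column names are only illustrative): since the IP is everything before the first ':' in the MIID, a query like the following rolls measurements up per switch:

  -- Group measurements per switch by pulling the IP portion out of the
  -- MIID string IP:shelf-slot-port:circuit_id (illustrative names)
  SELECT SUBSTR(miid, 1, INSTR(miid, ':') - 1) AS element_ip,
         COUNT(*)                              AS samples
    FROM measurement
   GROUP BY SUBSTR(miid, 1, INSTR(miid, ':') - 1);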

I have one table of this schema into which the newest measurements are SQL*Loader'd (flat file import). This schema contains MIID, MIB name, measurement timestamp, and value.
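After data reduction the loaded table keys on the MIID rather than the raw OID; roughly (again, illustrative names only):

  -- Staging table the flat files are SQL*Loader'd into (illustrative names)
  CREATE TABLE new_measurement (
      miid        VARCHAR2(64) NOT NULL,  -- IP:shelf-slot-port:circuit_id
      mib_name    VARCHAR2(64) NOT NULL,  -- e.g. ifInOctets
      meas_time   DATE         NOT NULL,  -- measurement timestamp
      meas_value  NUMBER       NOT NULL   -- raw counter value
  );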

There's a matching table like this called "last_recent_measurement" that allows me to compute the value deltas and the time interval since the last measurement. As a new measurement comes in, I compute the deltas, then store the new measurement in the last_recent_measurement table.
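A minimal sketch of that step, assuming hypothetical new_measurement and last_recent_measurement tables like the ones above (counter wrap and first-time samples are ignored here):

  -- Compute value delta and elapsed time against the previous sample for
  -- the same MIID/variable (counter wrap and first-time samples ignored)
  SELECT n.miid,
         n.mib_name,
         n.meas_time,
         n.meas_value - l.meas_value          AS value_delta,
         (n.meas_time - l.meas_time) * 86400  AS interval_secs  -- DATE arithmetic is in days
    FROM new_measurement n,
         last_recent_measurement l
   WHERE l.miid     = n.miid
     AND l.mib_name = n.mib_name;

  -- ...then replace the stored sample with the newest one
  UPDATE last_recent_measurement l
     SET (meas_time, meas_value) =
         (SELECT n.meas_time, n.meas_value
            FROM new_measurement n
           WHERE n.miid = l.miid AND n.mib_name = l.mib_name)
   WHERE EXISTS (SELECT 1
                   FROM new_measurement n
                  WHERE n.miid = l.miid AND n.mib_name = l.mib_name);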

For some measurements I also compute an average (delta value divided by time interval), then if it's a throughput measurement, compute utilization by dividing that result by the maximum throughput of the channel (obtained from a manually-populated configuration table).
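In other words, something along these lines (measurement_delta and channel_config are hypothetical names standing in for the delta results above and our configuration table):

  -- Average = delta / interval; for throughput measurements, utilization =
  -- average / maximum channel throughput from the config table
  SELECT d.miid,
         d.value_delta / d.interval_secs                       AS avg_rate,
         (d.value_delta / d.interval_secs) / c.max_throughput  AS utilization
    FROM measurement_delta d,
         channel_config    c
   WHERE c.miid = d.miid
     AND d.interval_secs > 0;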

So for each input measurement, I save the MIID, timestamp, value, time interval, average and utilization. You can see how the measurement has grown from just the Counter32!

Later these values are exported to the data warehouse, which uses a "smarter" (from a relational point of view) scheme to normalize the data. Basically we save 32 days of individual measurements. Reports are done at a minimum of 1 hour roll-ups, and we save the rolled-up measurements for longer time intervals in other tables.

Rolling up involves re-computing the average and utilization from the individual samples (not as an average of rolled-up measurements) and also saving the peak (and time of peak) over the reporting interval. So while I'm getting a reduction in the number of samples, I'm saving more fields.
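One way to sketch the hourly roll-up in Oracle SQL, assuming the per-sample table carries value_delta, interval_secs and avg_rate columns as above (the KEEP/DENSE_RANK form for the time of peak assumes a release that supports it; names are illustrative):

  -- One-hour roll-up: the average is recomputed from the raw deltas and
  -- intervals (not averaged from averages), and the peak sample and its
  -- timestamp are kept for the interval
  SELECT miid,
         mib_name,
         TRUNC(meas_time, 'HH')                                   AS hour_start,
         SUM(value_delta) / SUM(interval_secs)                    AS avg_rate,
         MAX(avg_rate)                                            AS peak_rate,
         MAX(meas_time) KEEP (DENSE_RANK LAST ORDER BY avg_rate)  AS peak_time
    FROM measurement
   GROUP BY miid, mib_name, TRUNC(meas_time, 'HH');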

> - How do you represent the polling time and the "context" (e.g. SNMP
>   agent address) in your database?

Basically each measurement has to keep its timestamp (from rx'ing the PDU), measurement source (the MIID) and measurement name (MIB name) with it for the entire time it exists in the database.

> - How compact is the data actually stored on disk (e.g. 100000 polls
>   of variables xyz take m MByte of disk space)?

Funny you should ask - the customer has just asked the same question. We're beginning to do the analysis now, but I'm expecting it to be something like a linear function:

    y = mx + b

where

    y = total database space required
    x = number of measurements in the database
    b = space required just to support the database (for stored procs, schemas, etc.)
    m = coefficient of measurement storage overhead

"m" captures all the additional storage required to keep the measurement in the database, so this would include storing its MIID, timestamp, MIB name, statistics, etc.

Because of the way we've normalized (again, using the relational database definition of that word) the data, I believe "m" is a random variable, not a constant. For example, we save only the MIID, MIB name and timestamp of a zero measurement (so we know the measurement was made), but not any of the other statistics. So depending on how many zero measurements you have, that will affect what "y" comes out to be.

I will use a mixture of analysis and statistics (i.e. guesswork) to characterize what "m" looks like.
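One crude empirical check on Oracle (a sketch only - the segment names are hypothetical, and it lumps table and index space together) is to divide the space used by the measurement segments by the number of stored measurements:

  -- Rough bytes-per-measurement figure: space used by the measurement
  -- table and its index, divided by the row count (hypothetical names)
  SELECT SUM(s.bytes) / m.cnt AS bytes_per_measurement
    FROM user_segments s,
         (SELECT COUNT(*) AS cnt FROM measurement) m
   WHERE s.segment_name IN ('MEASUREMENT', 'MEASUREMENT_PK')
   GROUP BY m.cnt;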

> - How do you make the polling engine reliable, e.g. do you fire up a
>   new poller every once in a while or do you rely on the poller being
>   reliable and able to actually handle all error situations?

Basically the same as Eddie, that is, we trust you implicitly!! Actually, we were requested by the customer not to put the poller to sleep between polling intervals, so we use cron to kick off a script that does the polling, data reduction and transfer to the warehouse. So one set of measurements is completely processed during each polling interval, with a brand new scotty script each time.

However, if we use the MLM poller it runs as a daemon, and in that case we're trusting Tivoli... it seems to be reliable but in my opinion is very limited (for example, it can't poll sysUpTime or any Counter64 variables).

Incidentally we are using Oracle 8i as our database and Oracle App Server as the reports interface. This means all our menus and reports are available in the user's Web browser.

We provide both spreadsheet-style reports and rudimentary line graphs. I'd be interested in Web-based tools for Oracle that could do a better job with the graphs, if anyone has suggestions.

Cheers,
Stuart Austin



