Date: Tue, 22 Feb 2000 12:08:53 +0200
From: Alexios Zavras <email@example.com>
Frank Harper wrote [edited]:
> What I'm wondering, is why aren't people using RRDtool (round robin database).
> I'm not an RRD expert, but I think it's highly optimized for just this kind of
> time-series data...
Yes, it is. Its sole disadvantage, from what I've read and from toying
with MRTG, is that it keeps compacting data. You get all the measurements
for recent timestamps, plus averages for measurements further back,
over ever-increasing time periods. If that's what you need (like MRTG),
it's ideal. However, there are cases where you need *all* data kept,
in an ever-increasing database (more like a log :-).
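
To make the compaction behaviour concrete, here is a minimal sketch of the
general round-robin idea. It is not RRDtool's actual code or file format; the
class name RoundRobinArchive and its parameters are made up for illustration.
Each archive holds a fixed number of rows, and each row is the average of
`step` consecutive samples, so old rows are overwritten rather than kept.

```python
from collections import deque


class RoundRobinArchive:
    """Hypothetical fixed-size archive: each row averages `step` samples."""

    def __init__(self, rows, step):
        # deque(maxlen=...) silently drops the oldest row when full,
        # which is the "round robin" part of the idea.
        self.rows = deque(maxlen=rows)
        self.step = step
        self._pending = []

    def update(self, value):
        # Collect samples until a full step is available, then store
        # their average as one consolidated row.
        self._pending.append(value)
        if len(self._pending) == self.step:
            self.rows.append(sum(self._pending) / self.step)
            self._pending.clear()


# Fine-grained archive for recent data, coarse archive reaching further back.
recent = RoundRobinArchive(rows=4, step=1)
older = RoundRobinArchive(rows=2, step=3)

for sample in range(1, 13):
    recent.update(sample)
    older.update(sample)

print(list(recent.rows))  # last 4 raw samples
print(list(older.rows))   # last 2 three-sample averages
```

Total storage never grows, which is exactly why this layout does not suit the
"keep everything, like a log" case.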
At the risk of boring everyone with my usual ranting on this, consider also
how adaptable a solution is. For instance, although I briefly considered the
idea of distributed collection agents before this discussion, I never
considered making the definition of the collected data more general than using
OIDs (which is very remiss of me since it's exactly the sort of thing I
complain about - I can only hope I would have noticed it later :). Having
had it brought to my attention now, I can think about how I might implement
this at a later stage. This will (hopefully) just involve redefining a few
tables, then some slight modifications to the code that updates/queries them.
However, if I'd chosen RRDtool (which I did look at) or something similar, I
would have had to hack a large collection of C code, which would have taken an
order of magnitude longer and would have to be redone for every change, ...
On another note, clarifying what I've said about temporal databases,
I agree that simple timestamps are easily managed as keys. Once you
get to timeranges, though, things get complicated. In any case,
you have to implement the time primitives yourself.
For timestamps (e.g. ts1 and ts2), you only have a predicate:
    earlier(ts1,ts2)    ts1 < ts2
(well, and probably equal).
For timeranges (e.g. tr1), you have:
    inside(ts1,tr1)     (ts1 > tr1.start) AND (ts1 < tr1.end)
and primitives involving two timeranges, like:
    inside(tr1,tr2)     (tr1.start > tr2.start) AND (tr1.end < tr2.end)
but then of course you also have functions like timerange union, with
    u.start = MIN(tr1.start, tr2.start) and u.end = MAX(tr1.end, tr2.end)
(for intersection, swap MIN and MAX), but there are also non-overlapping
timeranges, and so on.
I'm not sure what database organization should be used to facilitate
operations like these.
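
As a rough sketch of the primitives listed above, assuming timestamps are
plain comparable values (integers here) and a timerange is just a start/end
pair: the TimeRange class and all function names below are hypothetical, not
taken from any existing temporal database. Comparisons are strict, as in the
predicates above. A union is returned as a list, because two non-overlapping
ranges cannot be merged into one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TimeRange:
    start: int
    end: int


def earlier(ts1: int, ts2: int) -> bool:
    return ts1 < ts2


def ts_inside(ts: int, tr: TimeRange) -> bool:
    # Timestamp strictly inside a range.
    return tr.start < ts < tr.end


def tr_inside(tr1: TimeRange, tr2: TimeRange) -> bool:
    # Range tr1 strictly inside range tr2.
    return tr1.start > tr2.start and tr1.end < tr2.end


def overlaps(tr1: TimeRange, tr2: TimeRange) -> bool:
    return tr1.start < tr2.end and tr2.start < tr1.end


def intersection(tr1: TimeRange, tr2: TimeRange) -> Optional[TimeRange]:
    # Later start, earlier end; None when the ranges do not overlap.
    if not overlaps(tr1, tr2):
        return None
    return TimeRange(max(tr1.start, tr2.start), min(tr1.end, tr2.end))


def union(tr1: TimeRange, tr2: TimeRange) -> List[TimeRange]:
    # Earlier start, later end when overlapping; otherwise both ranges
    # are kept separately, ordered by start.
    if overlaps(tr1, tr2):
        return [TimeRange(min(tr1.start, tr2.start), max(tr1.end, tr2.end))]
    return sorted([tr1, tr2], key=lambda r: r.start)
```

In a relational table, a range like this would typically be stored as two
columns, with the predicates above becoming WHERE clauses on them; whether
that can be indexed efficiently is the open question.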
Not sure what you're getting at here, or rather what you want to achieve
from this. In the way I chose to store the data, each row has a single
timestamp, but this also represents an implicit range: in table INTVAL it
represents the data over the last 15 minutes, in table MONTHLY it represents
the data for that whole month, etc. In the raw (not rolled up) data, then, a
timestamp is serving two functions.
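
The implicit-range idea can be sketched as a lookup from table name and
timestamp to the range that row stands for. The table names INTVAL and MONTHLY
come from the description above; the function implied_range is hypothetical,
and it assumes that an INTVAL timestamp marks the end of its 15-minute
interval and that a MONTHLY timestamp may fall anywhere in its month.

```python
from datetime import datetime, timedelta


def implied_range(table, ts):
    """Return (start, end) of the period a row's single timestamp covers."""
    if table == "INTVAL":
        # Assumption: the timestamp closes the 15-minute interval.
        return ts - timedelta(minutes=15), ts
    if table == "MONTHLY":
        # Whole calendar month containing the timestamp.
        start = ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        if start.month == 12:
            end = start.replace(year=start.year + 1, month=1)
        else:
            end = start.replace(month=start.month + 1)
        return start, end
    raise ValueError("unknown table: " + table)


print(implied_range("INTVAL", datetime(2000, 2, 22, 12, 15)))
print(implied_range("MONTHLY", datetime(2000, 2, 22, 12, 15)))
```

Deriving the range this way means it never has to be stored, at the price of
the table name carrying part of the meaning of each row.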