[ILUG] benefits of raw i/o
David Murphy
drjolt+ilug at redbrick.dcu.ie
Wed Jun 14 13:18:13 IST 2000
Quoting <Pine.LNX.4.21.0006140309460.1255-100000 at fogarty.jakma.org>
by Paul Jakma <paul at clubi.ie>:
> serial means more or less the same as sequential (to me). but to be
> specific i mean:
> serial I/O == raw I/O == character device I/O.
That sounds like 'sequential raw I/O' to me.
> no data buffering, no filesystem, no vfs. (no vat...). Just a char
> device that plugs you straight into the device driver.
Yes, raw I/O. I think we agree on what it is. Of course, you might
have an LVM in between the application and the scsi device drivers, if
you're doing raw access to a stripe or whatever.
> I've been arguing that the metadata cache is not the cause of
> slowness,
[...]
> The slowness is in the data cache.
Yes. Never been arguing otherwise.
> From the point of view of it, it sees that within a range of blocks
> ( range*blocksize >> allowed data cache) the usage pattern is
> extremely complex (big database). In order for the data cache to
> correctly predict that usage pattern it must have unacceptably
> complex heuristics.. better then for the data cache to get
> completely out of the way -> raw I/O.
No. Data cache out of the way can be raw devices, or 'direct I/O',
that is I/O to a filesystem without caching. Direct I/O allows a
tradeoff between the performance advantages of uncached access and the
management advantages of filesystems.
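
To make 'direct I/O' a bit more concrete, here is a rough sketch in
Python. It assumes a Linux-style O_DIRECT open flag, which is my
assumption and not something from this thread; other systems spell
uncached access differently. read_first_block and BLOCK are made-up
names for illustration, and the 4096-byte alignment is a guess that a
real device may not share. If uncached access is refused, the sketch
quietly falls back to ordinary cached I/O.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment and transfer size; real devices may differ


def read_first_block(path):
    """Read up to BLOCK bytes from the start of path, uncached if possible."""
    # An anonymous mmap is page-aligned, which uncached reads usually require.
    buf = mmap.mmap(-1, BLOCK)
    # Try with O_DIRECT first (if this platform has it), then without.
    for extra in (getattr(os, "O_DIRECT", 0), 0):
        try:
            fd = os.open(path, os.O_RDONLY | extra)
        except OSError:
            continue  # e.g. the filesystem rejects O_DIRECT
        try:
            n = os.readv(fd, [buf])  # read straight into the aligned buffer
            return bytes(buf[:n])
        except OSError:
            continue  # uncached read refused; retry through the cache
        finally:
            os.close(fd)
    raise OSError("could not read " + path)
```

The file still has a name, permissions and a place in the filesystem,
which is the management half of the tradeoff; only the caching is
taken out of the path.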
> > It was developed because, while raw disks are the ultimate in
> > performance, they are more work to administer than filesystems,
> > for obvious reasons.
> never having worked with raw I/O: in what way is it more difficult
> to maintain? i would have thought easier. You just point oracle at a
> raw I/O logical volume and forget about it until oracle starts
> telling you that it's running short, at which point you either
> extend the LV or give it a fresh lv.
Those are LVM issues, and are solved by having an LVM. An example
filesystem issue would be backup - it's kinda awkward to do an
incremental backup of a raw device.
> > Ah, but you don't, 'cos it uses extents, not indirect blocks and
> > fragments and things, and extents can be big.
> but inside the extent you must surely still use fragments/blocks?
> with the same dereferencing overhead as always. ('cept now the
> extent is an extra layer). there must be some layer of finer grained
> access inside the extent, otherwise what happens when the VFS says
> "AcmeFS, give me these blocks"? Does the FS say "uhmm.. here's a
> nice big 256MB blob of data"
My limited understanding is that an extent consists of a starting
point, and a number of blocks, i.e., "File lotsofdata starts 98349349
blocks in, and lasts for 3843438473848347 blocks."
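
As a sketch of that (start, length) idea, and only under my limited
understanding: if a file is a short sorted list of extents, finding
the device block for any block of the file is one lookup plus some
arithmetic, with no chain of indirect blocks to follow. Extent and
physical_block are hypothetical names, not taken from any real
filesystem, and the numbers are invented.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    logical_start: int   # first file-relative block this extent covers
    physical_start: int  # device block where the extent begins
    length: int          # number of contiguous blocks


def physical_block(extents, logical):
    """Map a file-relative block number to a device block number.

    extents must be sorted by logical_start and must not overlap.
    """
    starts = [e.logical_start for e in extents]
    i = bisect_right(starts, logical) - 1  # last extent starting at or before
    if i < 0:
        raise ValueError("block %d is not mapped" % logical)
    e = extents[i]
    if logical >= e.logical_start + e.length:
        raise ValueError("block %d is not mapped" % logical)
    return e.physical_start + (logical - e.logical_start)


# "File lotsofdata starts 98349349 blocks in", plus a second, smaller extent.
lotsofdata = [Extent(0, 98349349, 1000), Extent(1000, 5000, 200)]
```

So when asked for "these blocks", the filesystem can still hand back
individual blocks; it just computes where they are instead of
dereferencing per-block pointers.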
> eg SGI XFS is extent based (it calls them "allocation groups") , but
> afaik it still uses more traditional metadata such as superblocks,
> blocks, indirect blocks, fragments, directories, inodes within each
> extent/group.
You're describing a UFS with some extent-like features. Sun have made
similar modifications to UFS in Solaris, but neither is the same as an
extent-based filesystem, where the allocation is done with extents.
--
When asked if it is true that he uses his wheelchair as a weapon he will reply:
"That's a malicious rumour. I'll run over anyone who repeats it."
Stephen Hawking - [http://www.smh.com.au/news/0001/07/features/features1.html]
David Murphy - For PGP public key, send mail with Subject: send-pgp-key