[ale] Any MySQL Experts around?
Jason Etheridge
phasefx at magusaptus.com
Sat Feb 14 22:12:12 EST 2004
> I'm thinking of having all my server logs (including NT/2K/2K3) record
> into a MySQL db for better reporting. Does anyone know if MySQL can
> handle 100,000+ inserts per day? Also, if it's handling this many
> inserts, how good is the SELECT performance for the automated
> reporting? Any thoughts, pointers?
Disclaimer: I'm not a real DBA and the tuple calculus I learned in
school actually tripped me up in the real world :D
The largest table I work with in mysql has roughly 8 million records
using full-text indexing. Each row contains a little over 260 bytes of
data. Every morning I turn indexing off and do a LOAD DATA REPLACE
touching every record, changing a good bit of them, and adding a
thousand or so new records. It can take around 3 hours to do that and
reindex on an old Sun E10K (I forget how many processors; maybe a dozen
for that domain). Queries are fast after that, but I use READ LOCKs
and there are no updates during the day. I'm not reporting on live
data.
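To give a rough idea, the morning load looks something like this sketch
(the table and file names here are made up, and I'm assuming a MyISAM
table):

    -- stop updating the non-unique indexes during the load
    ALTER TABLE log_archive DISABLE KEYS;

    -- bulk load, replacing rows whose primary key already exists
    LOAD DATA INFILE '/tmp/daily_dump.txt'
      REPLACE INTO TABLE log_archive
      FIELDS TERMINATED BY '\t'
      LINES TERMINATED BY '\n';

    -- this is where the long re-index happens
    ALTER TABLE log_archive ENABLE KEYS;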
If you don't use full-text indexing, and instead use LIKE statements on
large fields, searching is going to be pretty slow with that many
records, especially if you can't lock the tables because you're doing
updates. If you do use full-text indexing, I'd worry about 100,000+
inserts a day and concurrent searching. Full-text updates work best
with indexing disabled, and even then, when you go to re-enable indexing,
it'll do it via REPAIR BY SORTING, which could take a long time if you
don't have enough RAM to do it without thrashing the disks.
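The difference shows up in how the query is written; table and column
names below are only for illustration:

    -- LIKE with a leading wildcard can't use an index, so every row
    -- gets scanned
    SELECT id, message FROM logs
     WHERE message LIKE '%connection refused%';

    -- MATCH ... AGAINST uses the FULLTEXT index, assuming one exists
    -- on the message column
    SELECT id, message FROM logs
     WHERE MATCH (message) AGAINST ('connection refused');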
What sort of reporting do you want to do, and what sort of logs are they? If
you just want to store key statistical information rather than entire
log entries, that would be better. But for storing actual logs, I'm
not sure relational databases are the best tool. For log-based
reporting, I currently use a kludge: the system parses specific data
out of the log entries for just the months we're interested in and
loads it into temporary tables for the current report. This only works
well for
one or two months in our case because of the arbitrary timeout value I
set. For anything really large, like a year's worth of transactions, I
just run the logs through a perl script and forget mysql.
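The kludge is roughly the following, with made-up table and column
names, loading one month of pre-parsed entries at a time:

    CREATE TEMPORARY TABLE report_jan (
        stamp   DATETIME,
        host    VARCHAR(64),
        status  SMALLINT,
        bytes   INT
    );

    LOAD DATA INFILE '/tmp/parsed_2004_01.txt'
      INTO TABLE report_jan
      FIELDS TERMINATED BY '\t';

    SELECT host, COUNT(*) AS hits, SUM(bytes) AS total_bytes
      FROM report_jan
     GROUP BY host;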
I have another table without full-text indexing that gets searched and
updated throughout the day, with about a million records. But nothing
is added except in the mornings, just updated. I'll try to find some
stats for that if you want.
Best thing to do is get a test machine and play around with some sample
loads, and make sure to allocate a lot of filehandles for mysql's
process space.
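For the filehandle part, I mean something along these lines in my.cnf
(the numbers are only a starting point to experiment with):

    [mysqld]
    open_files_limit = 8192
    table_cache      = 512

plus whatever ulimit the OS needs so mysqld can actually open that many
descriptors.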
-- Jason