#1
Large Volume Mail Server Recommendations
Hey guys,

I work for a small company that's growing very, very rapidly. We provide a mail service to our customer base, but we've found that we're running into performance issues as the user base grows. The bottleneck appears to be seek time: a user accesses their mailbox, grabs a few hundred KB of data, and goes away. What is the right way to deal with this situation? We've broken our SAN up into many smaller arrays and are moving users onto those smaller drive sets, but I was interested in hearing how the users of this group would do it.

Here's our basic setup:
- Hewlett-Packard SAN array
- 2 Gb Fibre Channel connections
- 73 GB, 15K U320 drives

From our metrics we're way under capacity on disk I/O bandwidth, but the percentage of CPU time during which I/O requests are issued to the device is way high. Just wondering what the pros would suggest...

Michael
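As a rough sanity check on the seek-time theory, the mismatch can be sketched in a few lines. All drive figures below are assumptions typical of the era's hardware, not measurements from the poster's array:

```python
# Back-of-envelope model of why small random reads exhaust a 15K drive's
# seek budget long before its bandwidth. All figures are illustrative
# assumptions, not measurements from the actual SAN.

SEEK_MS = 3.5    # avg seek + half-rotation for a 15K U320 drive (assumed)
SEQ_MB_S = 70.0  # sustained sequential read rate (assumed)
READ_KB = 300    # "a few hundred KB" per mailbox access

def ops_per_sec(read_kb=READ_KB, seek_ms=SEEK_MS, seq_mb_s=SEQ_MB_S):
    """Random mailbox reads a single spindle can serve per second."""
    transfer_ms = read_kb / (seq_mb_s * 1024) * 1000
    return 1000.0 / (seek_ms + transfer_ms)

def bandwidth_mb_s(read_kb=READ_KB):
    """Bandwidth actually consumed at that op rate."""
    return ops_per_sec(read_kb) * read_kb / 1024

if __name__ == "__main__":
    print(f"~{ops_per_sec():.0f} mailbox reads/s per spindle")
    print(f"~{bandwidth_mb_s():.1f} MB/s used of a 2 Gb (~200 MB/s) link")
```

Under these assumptions a spindle saturates on seeks while using only a fraction of the Fibre link, which matches the "bandwidth fine, device utilization high" symptom.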
#2
Previously Michael wrote:
> [full quote of post #1 snipped]

I think the problem is probably not your hardware. At the OS level you need something that does good caching and is as memory-efficient as possible. You also need a filesystem that works well with your access patterns. On the software side you need software that is as efficient as possible: does few seeks, loads fast or, better, resides in memory completely and permanently, and generally treats the most common access pattern not as a major undertaking but as something to be done fast and efficiently. The high CPU load basically means that the SAN is not the bottleneck.

One important factor is that flat files are far more efficient than any kind of DB or DB-like structure. If each mail is stored in an individual file (just to illustrate the point), you pay maybe 2-3 seeks per mail. If the customer has a single mail file per account, it can be read completely once and then rewritten once. Taking into account that you get 100 KB...500 KB of linear reading in the time of one seek on a good filesystem, a single mailbox file is probably several orders of magnitude faster than individual files, if the machine this is running on has enough RAM.

One other problem is that your SAN is not much more efficient than a bunch of IDE disks with software RAID. Maybe half an order of magnitude, if that. On the software, filesystem, and OS side there is usually far more room for optimisation. Another effect is that a SAN does not perform better than a single disk on seeks if it is RAID-5. It does so with regard to throughput, but not with regard to access time. I think you already know this.

The other thing you need to think about is distribution; the task is very well suited to it. Maybe put RAID-1 pairs of these disks in high-quality PCs with FreeBSD or Linux and forget about SANs completely, except for additional backup which the customer never accesses. A SAN is ideal for large amounts of data with few requests. You seem to need small amounts of data with a lot of requests and possibly some CPU load. So assign as few disks as possible to each CPU and get more CPUs. Then you can add a new machine for every 1000 customers or the like (no idea how high this number would really be), and your bottleneck becomes mail handling on the SMTP side (possibly a forward server that does no disk access for the individual forward, keeping the customer DB completely in memory) and log-in (one or a number of log-in servers that forward the individual log-ins or requests to the storage servers, also doing practically no disk access). This way you should be able to scale pretty fast.

Just some thoughts on the problem. What you urgently need is a strategic plan that allows scaling for a longer time and keeps cost per customer as constant as possible.
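Arno's flat-file arithmetic can be put into a quick estimate. The seek time, messages-per-mailbox, and message sizes below are assumed round numbers chosen only to illustrate the argument:

```python
# Cost model for the point above: per-message files pay 2-3 seeks each,
# while a single mailbox file pays one seek plus a linear read. All
# figures are illustrative assumptions.

SEEK_MS = 8.0             # seek + rotational latency, commodity disk (assumed)
LINEAR_KB_PER_SEEK = 300  # "100KB...500KB of linear reading per seek time"

def read_as_individual_files(n_msgs, seeks_per_msg=2.5):
    """Milliseconds to read a mailbox stored one-file-per-message."""
    return n_msgs * seeks_per_msg * SEEK_MS

def read_as_single_file(n_msgs, avg_msg_kb=10):
    """Milliseconds to read the same mailbox as one contiguous file."""
    total_kb = n_msgs * avg_msg_kb
    linear_ms = total_kb / LINEAR_KB_PER_SEEK * SEEK_MS
    return SEEK_MS + linear_ms

if __name__ == "__main__":
    for n in (100, 1000):
        ratio = read_as_individual_files(n) / read_as_single_file(n)
        print(f"{n:5d} messages: single mailbox file ~{ratio:.0f}x faster")
```

With these numbers the single-file layout comes out one to two orders of magnitude ahead, and the gap widens as mailboxes grow, consistent with the claim above.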
Arno
--
For email address: lastname AT tik DOT ee DOT ethz DOT ch
GnuPG: ID:1E25338F FP:0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
"The more corrupt the state, the more numerous the laws" - Tacitus
#3
"Arno Wagner" wrote in message
... [snip]

Thanks for taking the time to reply, Arno.

> I think that the problem is probably not your hardware. On the OS level
> you need something that does good caching and needs to be as memory
> efficient as possible. You also need a filesystem that works well with
> your access patterns.

The filesystem we use is ext3. I've seen a couple of threads on filesystem comparisons, but I was wondering what people here recommend. For all of my own servers I always use reiserfs, but the system I inherited here is ext3. From the numbers I've read, reiserfs is faster, but has anyone here seen it make a difference?

> On the software side you need software that is as efficient as possible
> [snip] The high CPU load basically means that the SAN is not the
> bottleneck.

We use a commercial POP3/IMAP solution for user access. Maybe there's something better, I'm not sure.

> One important factor is that flat files are far more efficient than any
> kind of DB or DB like structure. [snip] a single mailbox file is
> probably several orders of magnitude faster than individual files, if
> the machine this is running on has enough RAM.

The machine has plenty of RAM, and we use a single mail file for each folder.

> Another effect is that a SAN does not perform better in seeks than a
> single disk if it is RAID5. [snip] I think you already know this.

You're right, we know this. But it's interesting you bring this up, because what we've done is...

> Maybe put RAID-1 pairs of these disks in high-quality PCs with FreeBSD
> or Linux and forget about SANs completely

...create 14 RAID-1 pairs on the SAN hardware and mount them on our mail server (running Linux). Adding more pairs does reduce our disk load, but we've found that we're not scaling at a really good rate. It's getting unwieldy very fast.

> except for additional backup which the customer never accesses. [snip]
> So assign as few disks as possible to each CPU and get more CPUs. Then
> you can add a new machine for every 1000 customers or the like (no idea
> how high this number would be)

This would be unreasonable, I think. But maybe we should think about it.

[snipped SMTP stuff]

We have SMTP flowing smooth like butter...

> Just some thoughts on the problem. What you urgently need is a strategic
> plan that allows scaling for a longer time and keeps cost per customer
> as constant as possible.

Oh, I know it. ;-) I went from a large company to this small, fast-growing one, and I find the problems presented to me far more challenging. They don't have the resources to throw unlimited money at any problem like my last company did, but they're creative, caring, and really driven. I want to help them more.

So we can work on partitioning the users out to more machines. We have the RAID-1 pairs. What about the filesystem? Do you (or anyone else here) have an opinion on the best filesystem for the somewhat random mail server access pattern?

Michael
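On the partitioning question in the thread, one way to keep the user-to-pair mapping from getting unwieldy is a stable hash. A minimal sketch; the mount names and the choice of MD5 are hypothetical, any stable hash would do:

```python
# Sketch of stable user-to-volume partitioning across 14 RAID-1 pairs,
# as discussed above. Mount-point names are hypothetical.

import hashlib

VOLUMES = [f"/mail/pair{i:02d}" for i in range(14)]  # assumed mount points

def volume_for(user: str, volumes=VOLUMES) -> str:
    """Map a username to one volume. hashlib is used instead of hash()
    because Python randomizes hash() per process, and the mapping must
    be stable across restarts and hosts."""
    digest = hashlib.md5(user.lower().encode()).hexdigest()
    return volumes[int(digest, 16) % len(volumes)]

if __name__ == "__main__":
    # The same user (case-insensitively) always lands on the same pair.
    assert volume_for("michael") == volume_for("MICHAEL")
    print(volume_for("michael"))
```

One caveat on the design: the modulo mapping reshuffles most users whenever a pair is added, so if pairs are added often, a consistent-hashing scheme would be a better fit than plain modulo.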