hosting for 20 million pageviews per month

webmasterbeta

New Member
We moved our classifieds site to Mosso on 1.6.2008.
The site was previously hosted on 2 dedicated servers (Core 2 Duo E6750 | 4GB RAM | 250GB HDD SATA x2 3ware, at $200 each).
We currently get maybe 8-10 million page views per month. In the highest-traffic period (between 10:00 and 18:00) both servers were at maybe 75% load.
Mosso is introducing a new billing system based on compute cycles, and it's still not clear how much they will bill us.
We expect 15-20 million page views per month next year, so we must decide whether to build a cluster with a load balancer and 4 servers or stay with Mosso. We don't have the resources and knowledge to manage the configuration and clustering; that's why we moved to Mosso. But if this turns out to be too expensive, we need another solution (at the moment it looks like the compute cycle billing method will raise our expenses from $400 per month to $4,000-$5,000, and we cannot spend more than $1,000-$1,200 per month on hosting).
We already tried to cluster, but MySQL was always the problem.
Any suggestions about services, pricing, or companies who can manage the hosting?

I would suggest alphared DOT com.

If you were doing 10Mil/day on those two servers, you should be able to split the load among a few dual quad-core servers: 1-2 for web, 1 for MySQL + 1 MySQL replicated. Hard to say exactly, though, without knowing the MySQL usage etc.

We don't have the resources and know-how to prepare servers. Every 60 days one of our servers crashed for some reason, which is really stressful. The reason we moved to a "cloud computing" company like Mosso was to have 100% uptime and have them manage the hosting. A good solution for us would be a hosting company that builds the solution we need, including managing the DB optimization, the clustering, security, and server optimization.
Is the problem for clustering only MySQL? What about finding a company that can manage MySQL clustering for you? There are plenty of good companies for this.

20M page views per month isn't that much (less than 1M per day). Mosso/Rackspace do charge a premium, but your budget of $2k should easily cover 4 mid-range servers and some load balancing. Add in a hardware load balancer and you'll need to increase your budget a little, but a jump from $400 to $4000 with no warning sounds unfair. Do you have an intensive database? Does this need to be clustered?

You can get a good scalable cluster for ~$1100/mth with a load balancer. If the configuration is simple, management costs could be included in that. Don't forget the bandwidth cost, though.

The application is provided by a third-party supplier, and a lot of customization was done over the last 2 years. We have an intensive DB, and it needs to be clustered.

Do you happen to know by any chance how intensive your database really is? Also, how large is it?

There are maybe 30-60 queries per page load.
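
To put the figures in this thread side by side, here is a rough back-of-the-envelope sketch. It assumes a 30-day month and perfectly even traffic, which is an assumption and not something stated by the posters; real peak rates between 10:00 and 18:00 will be noticeably higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rates(page_views_per_month, queries_per_page, days_per_month=30):
    """Return (page views per day, page views per second, queries per second).

    All values are monthly averages; days_per_month=30 is an assumption.
    """
    views_per_day = page_views_per_month / days_per_month
    views_per_second = views_per_day / SECONDS_PER_DAY
    queries_per_second = views_per_second * queries_per_page
    return views_per_day, views_per_second, queries_per_second


# 20M page views per month at the 30 and 60 queries per page mentioned above.
for queries_per_page in (30, 60):
    per_day, per_sec, qps = average_rates(20_000_000, queries_per_page)
    print(f"{queries_per_page} queries/page: {per_day:,.0f} views/day, "
          f"{per_sec:.1f} views/s, {qps:.0f} queries/s on average")
```

Under these assumptions, 20M page views per month works out to roughly 670,000 page views per day, or about 7.7 per second, and somewhere between 230 and 460 database queries per second on average.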

Hello,

You probably do not require MySQL load balancing for what you are describing, though it's not a bad idea to do it. Some pretty serious sites run on a tried and tested setup of 2 or more load-balanced web nodes sharing storage on some sort of SAN, plus 1 heavy MySQL server. For example, this is the configuration Slashdot employs:

http://slashdot.org/faq/tech.shtml

As you can see, they have a single MySQL server serving 8 web server nodes. If you have the right MySQL server, locked down tight, accessible only to the web server nodes, with very redundant hardware, you will get very close to 100% uptime. And since MySQL data alone tends to be relatively small, you can recover from a complete disaster in 30-45 minutes (or less). You could of course replicate and load balance the MySQL service across 2 lesser nodes, but overall reliability will not improve significantly, and you will lose performance compared with a single "super" MySQL node.

What I would recommend is the following:

load balancer(s)
2* web server nodes (dual xeon, 2-4 GB, 2*small scsi, raid1 for OS)
1* SAN NFS (or GFS or whatever file system) storage space (shared or dedicated)
1* mysql/email server node (2*dual or quad core xeon, 4-8GB RAM, 2*146 GB (or whatever size) SAS drives, redundant and/or hot swap everything: fans, power supplies, ram, CPU, drives, nics)

You could of course also load balance MySQL. Based on what you have said, I am not certain you need it at this point, but there is nothing wrong with extra redundancy, so it's a cost:benefit decision you will ultimately need to make.

$2k/month would be on the lower end for a proper setup, but it's not that far off the mark and depends on whether you are willing to share some services (i.e. load balancer(s), SAN) or whether you want everything dedicated...

Hope this helps...

I agree with Andrew. Perhaps MySQL isn't the problem. You definitely need some hardware load balancing and a simple web cluster using a moderately powerful SAN. 20+ million hits a month isn't too much by any means, and a moderate solution can give you the reliability and performance you need, all within your budget.

MySQL runs on the Core 2 Duo E6750, 4GB RAM server, which is at maybe 75-90% during high-traffic periods. We would also like to have redundancy.

What kind of drives do you have on there?

MySQL is all I/O and RAM (more so than processors). You could have 10 such processors in the unit, and if you are using SATA drives instead of 15k SCSI or SAS, you will still get I/O bottlenecks and have loads shoot up on you...

As I said, you can probably get more redundancy (i.e. fewer overall points of failure) from a single high-end MySQL server node if done properly than from 2 lower-end MySQL nodes. Having said this, MySQL clustering is pretty easy to set up, so if that's the route you want to go, I am sure you will not have any issue doing so...

Don't discount the processor: for something DB intensive, nothing less than 4 cores total. I've seen MySQL at 80% CPU usage on a C2D (4GB RAM) with 150-200 qps while I/O wait was within reasonable targets.
I should note that part of the reason was quite a few inefficient queries. It also depends on your read-to-write ratio, as table locking will drive you nuts. As I understand it, a faster CPU would help in that scenario.

Granted, but based on what has been posted here, I am almost sure I/O is driving up the load on this box...

You are correct though, you can't discount that. CPU certainly matters for DB service, which is why I ultimately suggested dual quad-core CPUs...

We're using SATA drives. Perhaps it is worth trying SCSI drives, but I thought the performance gap between SATA and SCSI isn't that big nowadays.
We tried MySQL Cluster. But because the storage is volatile (in RAM), if both servers go down, everything since the last backup is lost...

Before moving to Mosso we were preparing for load balancing: one master MySQL server replicated to the web server nodes (which would also run MySQL servers for SELECTs, as a connection over a socket is faster than over TCP/IP).

One NFS server was serving the filesystem to the web server nodes.

But the problem is that if anything goes wrong with the MySQL servers, somebody has to resync the databases.

As our application was built by a third-party supplier (coded in an old-fashioned way: no logic, no structure), I don't know whether the other node could handle all the requests at peak time if one web node went down...

But then came the idea of Mosso.

I wanted to apologize for my inadvertent reply several days ago soliciting our services at NationalNet. I will be more careful to post in the correct forum going forward and to abide by the rules and policies as outlined in the terms and conditions.

How much BW are you pushing total?

At the moment maybe 800-1200 GB per month.

"We tried mysql cluster. But because the storage is volatile (in ram), if both servers go down data is lost to the last backup..."
That's not quite accurate. While MySQL Cluster 5.0.x only offered memory-based tables, MySQL Cluster 5.1.x (or 6.2.x in the updated versioning system) supports disk-based tables. And even with in-memory tables, it uses two disk-based flushing mechanisms (global checkpoints to an incremental redo log and local checkpoints with full data); by default global checkpoints occur every few seconds, and you can lower that interval if need be.

@lockbull
Thanks for the info! Didn't know that. Still on the 5.0 version :-)
So if it really happens that the whole MySQL cluster goes down, you can restart the nodes without any other manual intervention?

Yes, with the same caveats as any service interruption (a power outage that corrupts your disks will have the same effect regardless of the storage engine). There is a lot of tweaking that you can do when it comes to the redo logs / checkpoints to increase resiliency to unexpected catastrophic failure of the cluster. Here's some good info:

http://johanandersson.blogspot.com/2007/05/good-configuration.html

If you're actually using MySQL Cluster 5.0.x in a production environment, I would highly recommend that you consider moving to 5.1.23-cluster-6.2.x. There are a variety of reasons for that; check the MySQL website for details on what improvements 6.2.x offers. MySQL Cluster has been spun off from the standard MySQL versioning due to its higher rate of updates; the latest beta version is up to 6.3.x.

MySQL Cluster / NDB is not for all SQL deployments. It works well when performing simple SELECT statements but is not a good choice if you are using any type of complex JOINs. The technology originated at Ericsson, where it was used mainly for telecom-related workloads, before it came to MySQL. In the future, with on-disk storage of SQL data in NDB, it will be a better choice for general deployments.

What you would benefit from is running your MySQL masters in an HA configuration (RHCS) and replicating your data to read slaves. Separate your reads and writes in your application so that the slaves and the master serve different purposes. If application mods are not possible, look into using MySQL Proxy to route your queries appropriately. I can help you with a network design if you'd like; just send me a PM.

Nice Page Views. :eek:

Having a host spend a little bit of time getting to know a few more specifics and work out a solution would probably be best, but I'd recommend the following:
1 - Full eval by potential provider
2 - 4 servers, 2x web, 2x MySQL
3 - Basic master/slave replication between the MySQL servers
4 - Split reads across both MySQL servers, isolate writes to the master
5 - Load-balance/round-robin between the 2 web servers
With this, tuning/optimizing both Apache and MySQL would be key. Knowing more details, such as whether you can make the app split reads and writes, whether it is heavier on reads or writes, which storage engine(s) are being used, etc., would be critical in finding the best solution.

TBH, I think you could quite easily serve 20M PVs off a pair of servers, one web, one MySQL, if things are optimised correctly. Do you really need to be doing 60 queries per page? I'm highly doubtful of that: most customers we've come across that are doing something like that can reduce it to 10-20% of that amount. Not to mention optimising the MySQL configuration itself, which can have an absolutely massive impact on performance.

Hello,

Certainly, MySQL Cluster changed from v5.0 to v5.1. In v5.0 the database ran in memory only, whereas v5.1 introduced tablespaces, with the capability of storing non-indexed data on disk. However, you still have limitations with MySQL clustering: features like full-text searching and transaction isolation levels higher than READ COMMITTED are not available. This may or may not matter to the original poster.

Honestly, from what I have read here, the OP does not require mysql clustering. However, if they want redundancy in mysql, they should consider mysql replication and load balancing in combination vs mysql clustering.

The obvious problem with traditional MySQL replication is that the application needs to be "replication aware", i.e. write to the master and read from the slave. However, it is possible, and not overly complex, to build a MySQL replication system where all nodes act as a master and a slave at the same time. This means that the user does not need to worry about their code and can use the solution just like a standard single MySQL server.
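
As a minimal sketch of what "replication aware" means in application code, the router below sends SELECT statements to the slaves in round-robin order and everything else to the master. The class name and the server names are hypothetical, it only picks a server name rather than opening real connections, and it ignores replication lag (a read that must see a just-written row would still have to go to the master).

```python
import itertools


class ReadWriteRouter:
    """Choose a server name for each SQL statement (illustrative only)."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._slaves = itertools.cycle(slaves or [master])

    def route(self, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._slaves)  # reads rotate across the slaves
        return self.master             # writes and everything else


# Hypothetical server names, not taken from the thread.
router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
for statement in ("SELECT * FROM ads", "SELECT * FROM users",
                  "UPDATE ads SET views = views + 1"):
    print(router.route(statement), "<-", statement)
```

MySQL Proxy, mentioned earlier in the thread, aims to do this kind of routing outside the application when the code cannot be changed.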

At the end of the day, this particular customer doesn't need the expense of adding additional nodes for MySQL replication or clustering. But heck, if they find value in the cost of setting this up, then it certainly is something that should be considered. Probably overkill, but I guess you can never overkill on peace of mind...

Your database server is working overtime and you're using SATA. This is a clear case of needing to upgrade to SAS and to make sure you use a quality controller. Database servers use I/O like crazy. Upgrading the hard drive setup will help a lot, I reckon. Even if you do need to cluster the database, make sure you use quality drive arrays; it will be worth it.

I don't think the budget allows for an optimal MySQL Cluster deployment. A master/slave setup should be sufficient for your traffic. I would enable slow query logging on your master. Do you have the source code for the application? We can continue to throw solutions your way, but until we have a thorough understanding of your application we won't be able to make an intelligent recommendation. We don't know for sure that the DB is the bottleneck; could it be the front-end servers? Could it be something as simple as DNS resolvers? Or maybe a slow query that examines 2 million rows?

bmetraux,

I work in the R&D department at Mosso. The first thing I would like to remind you is that managed web hosting (involving the management of clustered solutions) is more expensive than unmanaged dedicated servers. Comparing apples to oranges will result in setting unrealistic expectations for your web hosting budget if you really need managed hosting.

Mosso makes complex hosting in a clustered site easy. This is valuable for those (like you) who have busy sites, but don't have the system administration and engineering resources in-house to take a pile of ordinary servers and make it into a top notch solution.

Because I work at Mosso, I was able to actually research the specific resource needs of your particular classifieds site. Your site relies heavily on dynamically generated content from a database. Your application is CPU bound. That's why your Compute Cycles consumption is pushing you into the $4K/mo range for hosting this site.

As a point of reference, if you wanted to host this site on Dual Xeon 2.8 GHz servers with 4GB RAM and 10K SAS/SCSI drives in RAID-1, you would need about four of them to have a site that performs well under load. If you want to be able to survive sudden bursts of traffic, you should probably have a couple extra to accommodate that. Between four and six servers total is probably about right for 20M page views of your application.

Unmanaged dedicated servers in this class would cost you about $400/mo each ($1600-$2400/mo total), and you will also need a load balancing solution, which will likely cost on the order of $100/mo per server, though this varies depending on where you host. Total budget for unmanaged servers is between $2000/mo and $3000/mo.

For managed dedicated servers in this class, you are looking at a monthly total of about $800/server which brings a solution with load balancing to between $4K/mo and $6K/mo.
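
The arithmetic behind these ranges can be checked with a short sketch. It uses the per-server prices quoted in this post and the four-to-six server estimate; applying the same $100/mo load balancing figure to the managed case is an assumption, since the post only gives a rounded total there.

```python
def monthly_cost(servers, per_server, load_balancing_per_server=100):
    """Total monthly cost in dollars for a number of identical servers.

    load_balancing_per_server=100 is the rough figure quoted in the post.
    """
    return servers * (per_server + load_balancing_per_server)


for label, per_server in (("unmanaged", 400), ("managed", 800)):
    low = monthly_cost(4, per_server)
    high = monthly_cost(6, per_server)
    print(f"{label}: ${low}-${high} per month for 4-6 servers")
```

Under those assumptions the unmanaged case comes to $2000-$3000 per month, matching the post, and the managed case to $3600-$5400, which the post rounds to between $4K and $6K.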

Based on my calculations, your fees at Mosso for a 20M _page_ view month would be about $4K. You can host 20M _hits_ of your site for about $400/mo. With Mosso, you can scale up rapidly (without notice) if you need to, and the resources are already there for you to use. Plus you can depend on a staff who exclusively specializes in the management of clustered systems.

If you compare apples to apples, making a move to traditional dedicated servers probably does not make sense unless you are willing to sacrifice performance and try to jam this site onto an underpowered configuration.

As I see it, you have these options:

1) Optimize your application to use less server resources.

-or-

2) Reset your expectations about what the budget should be for hosting this site.

-or-

3) Buy the best servers you can for what your budget allows, and just hope it works. If you get a lot of traffic all of a sudden, it will simply slow down, or maybe choke.

Specific Recommendations

A) Arrange for some software development and engineering work on your site software so that dynamic content is automatically rendered to static content as it changes on an event driven basis, so that the web server mostly delivers content from static HTML files that are dynamically updated rather than dynamically generated HTML content from a database back-end. At Mosso, deploying this strategy can reduce your fees to about 10% of what it costs to host a fully dynamic site at scale.

B) Optimize your SQL queries, and put sensible limits on what data your visitors can search. Make sure your database is optimally indexed for the type of searching your visitors do. This can significantly reduce the resource requirements for hosting your site.

C) If you don't have the ability to optimize the site you may be able to front-end it with a custom front-end caching solution that may approximate the solution from above. Contact Mike Welsh at Mosso if you want to explore this option.

D) If you don't have the ability to improve the software that runs your site, then you must reset your expectations about what it will cost to host this site at scale. If you want it to perform well, you're going to need to allocate more than $1200/mo.
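
Recommendation A could look roughly like the following sketch: whenever a listing is saved, its page is rendered once and written to a static file that the web server can deliver without touching the database. All names here (`render_listing`, `on_listing_saved`, the `listing-<id>.html` layout) are hypothetical and not taken from the actual application, and a temporary directory stands in for the web server's document root.

```python
import html
import tempfile
from pathlib import Path


def render_listing(listing):
    """Build the HTML for one listing (escaped to keep the markup valid)."""
    title = html.escape(listing["title"])
    body = html.escape(listing["body"])
    return (f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1><p>{body}</p></body></html>")


def on_listing_saved(listing, static_root):
    """Call after a listing is created or edited: rewrite its static page."""
    target = Path(static_root) / f"listing-{int(listing['id'])}.html"
    tmp = target.with_suffix(".tmp")
    tmp.write_text(render_listing(listing), encoding="utf-8")
    tmp.replace(target)  # rename into place so readers never see a partial file
    return target


# Hypothetical usage: a temp dir stands in for the document root.
static_root = tempfile.mkdtemp()
page = on_listing_saved(
    {"id": 42, "title": "Bike <used>", "body": "Good condition"}, static_root)
print(page.read_text(encoding="utf-8"))
```

The database is then only involved when content changes, not on every page view.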

Parting Thoughts

As a point of reference, a typical dynamic application with a database can fit comfortably on a single server with a load of about 2.5 million page views. Servers with lots of CPU cores extend that a bit, sometimes to 5 or 10 million. If you are willing to accept reduced performance, sometimes you can pack on several multiples of the traffic workload and get away with it, but perhaps some of your visitors might sometimes wait 5-20+ seconds for a page to load.

With the right combination of software development and system engineering, it's possible to have a dynamic application that serves 40-50+ million page views per month per server. In order to get this performance, you need careful application of caching technology for application opcode and HTML content, highly efficient web server software, proactive content compression, and a highly optimized database. It just takes a tremendous amount of profiling, tweaking, and engineering to get to these levels.
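
As one small illustration of caching HTML content, here is a minimal in-process cache with a time-to-live. It is only a sketch under assumptions: the class names are hypothetical, the renderer stands in for a page that would normally issue 30-60 database queries, and a real deployment would more likely use a shared cache or a caching proxy in front of the web servers.

```python
import time


class PageCache:
    """Keep rendered pages in memory for ttl_seconds (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # url -> (expires_at, html)

    def get_or_render(self, url, render):
        now = self.clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]            # still fresh: no database work
        page = render(url)             # expired or missing: render again
        self._entries[url] = (now + self.ttl, page)
        return page


class CountingRenderer:
    """Stand-in for the expensive, database-backed page renderer."""

    def __init__(self):
        self.calls = 0

    def __call__(self, url):
        self.calls += 1
        return f"<html><body>Listings for {url}</body></html>"


renderer = CountingRenderer()
cache = PageCache(ttl_seconds=60)
for _ in range(1000):
    cache.get_or_render("/cars", renderer)
print("renders needed for 1000 requests:", renderer.calls)
```

The trade-off is staleness: visitors may see a page that is up to `ttl_seconds` old, which is usually acceptable for category and search pages but has to be decided per page type.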

The bottom line here is that I've looked at your application in a production capacity, and confirmed that it demands a good deal of system resources. It's simply not going to work well slapped onto $1K worth of web hosting unless it's well optimized. If you optimize it, you might as well host it at Mosso, because it will fall within your budget then.

Mosso values your business, and we want you to have the best possible hosting solution for your site.
 