Intel Woodcrest, AMD's Opteron and Sun's UltraSparc T1: Server CPU Shoot-out
by Johan De Gelas on June 7, 2006 12:00 PM EST
Posted in IT Computing
MySQL Results: Scaling
Back to our main subject: astute readers have probably already noticed a weird anomaly, so let us analyze it further. If you look closely at both sets of measurements, quad-core and dual-core x86, you'll notice that the scaling is negative. To make this clearer, we averaged the results for all concurrency levels of 5 and higher.
MySQL Linux (Queries/s) | Sun T1 4/8 cores 1 GHz | MSI K2-102A2M Opteron 275 | Xeon 5160 Woodcrest 3 GHz | MSI K2-102A2M Opteron 280 |
Average Dual-core (T1: quad-core) | 362 | 749 | 996 | 805 |
Average Quad-core (T1: octal-core) | 433 | 590 | 904 | 622 |
Speedup Dual to Quad | 20% | -21% | -9% | -23% |
This is nothing short of amazing. It looks like an anomaly, but it is not: these benchmarks have been checked, verified and checked again, and they are accurate. The x86 systems running Linux perform better with two cores than with four, while the T1 running Solaris actually improves its performance going from 4 to 8 cores.
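The "Speedup Dual to Quad" row can be re-derived from the two average rows. The snippet below is only a sketch of that arithmetic, assuming the speedup is the relative change in average queries per second when the core count doubles, rounded to a whole percent. The function and variable names are ours, not part of the benchmark setup.

```python
# Averages (queries/s) copied from the MySQL Linux table:
# (average with fewer cores, average with twice as many cores).
averages = {
    "Sun T1 4/8 cores 1 GHz": (362, 433),
    "MSI K2-102A2M Opteron 275": (749, 590),
    "Xeon 5160 Woodcrest 3 GHz": (996, 904),
    "MSI K2-102A2M Opteron 280": (805, 622),
}


def speedup_percent(fewer_cores_qps, more_cores_qps):
    """Relative change in throughput when the core count doubles, in percent."""
    return (more_cores_qps / fewer_cores_qps - 1.0) * 100.0


# Round to whole percentages, as the table does.
speedups = {
    name: round(speedup_percent(fewer, more))
    for name, (fewer, more) in averages.items()
}

for name, pct in speedups.items():
    print(f"{name}: {pct:+d}%")
```

A negative value means the configuration with twice as many cores delivered fewer queries per second on average.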
So who is guilty? Linux or the Opteron system? We had to test with Solaris on the Opteron to be sure. However, the ServerWorks chipset of our MSI 1U server was not supported by x86 Solaris, so we went back to our homebuilt server, based on the MSI K8N Master2-FAR.
MySQL Solaris (Queries/s) | Sun T1 4/8 cores 1 GHz | Opteron 280 Solaris | Opteron 280 Linux |
Average Dual-core (T1: quad-core) | 362 | 456 | 799 |
Average Quad-core (T1: octal-core) | 433 | 605 | 625 |
Speedup Dual to Quad | 20% | 33% | -22% |
And this puts the performance of our UltraSparc T1 in a whole different perspective. First of all, it is clear that while MySQL might not be the most scalable database, the current Linux kernel is not helping matters. We did tweak the Linux kernel in two ways: the 2.6.15 kernel was optimized for either Intel's or AMD's architecture, and the AMD build also got NUMA support.
So what is going on here? Our MySQL guru (P. Zaitsev) told us that in some circumstances MySQL might cause trouble for the Linux mutex (mutual exclusion) implementation: "mutex ping-pong". The mutex implementation makes sure that a thread cannot access data in main memory while that data is locked by another thread.
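For readers unfamiliar with mutexes, here is a minimal sketch of the mutual exclusion described above, written with Python's standard `threading.Lock`. It is an illustration only: it does not use MySQL's or the Linux kernel's mutex code, it does not reproduce the "mutex ping-pong" behavior, and the names `counter`, `worker` and `run` are ours.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # the mutex guarding `counter`


def worker(iterations):
    """Increment the shared counter, taking the lock for every update."""
    global counter
    for _ in range(iterations):
        # Only one thread at a time can hold the lock; the others block here.
        with counter_lock:
            counter += 1


def run(num_threads, iterations):
    """Start `num_threads` workers, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=worker, args=(iterations,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


total = run(num_threads=4, iterations=10_000)
print(total)
```

Because every update happens while the lock is held, no increment is lost. The price is that all four threads queue up on the same lock, which is the kind of contention that can limit scaling when more cores are added.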
However, it seems to be more of a MySQL problem than a Linux one, as other databases like DB2 scale very well on Linux. For DB2 under the same load, we saw a performance increase of no less than 80-85% when going from two to four cores. Also, with some loads the bad scaling kicks in later than with our "Select dominated" load. Intel's performance labs told us that they ran into the same problem.
These issues are not as severe as the problems we encountered with MySQL on Mac OS X. Note that Apple seems to have recognized the problem and to offer a workaround. We'll report back with other MySQL workloads to investigate the MySQL scaling problem further.
PostgreSQL Results
PostgreSQL 8.0.7, another open source database, uses processes and not threads to deal with connections. The consequence is that the benchmark numbers are a lot more stable: once each core is busy with its process, you get almost maximum performance. In other words, the results didn't change much between 5, 10 and 25 concurrent users. To keep things simple, we only list the numbers with 20 users, which give peak performance. The queries per second numbers at 5 and 25 were only a few percent lower. We did not include the T2000 Sun Server as the optimal PostgreSQL configuration is still under investigation.
PostgreSQL 8.0.7 (Queries/s) | |
DL385 1 x Opteron 280 | 517 |
Intel 2 x Xeon "Irwindale" 3.6 GHz | 448 |
MSI 1U 1 x Opteron 275 | 490 |
MSI 1U 1 x Opteron 280 | 524 |
Intel 1 x Xeon 5160 WC 3 GHz | 673 |
Another clear victory for Woodcrest. On the Opteron, every 10% increase in clock speed seems to result in a 7% performance increase. So if we extrapolate, a 3 GHz Opteron would arrive at about 616 queries per second.
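The extrapolation can be retraced with a few lines of arithmetic. This sketch assumes the MSI 1U results from the table and clock speeds of 2.2 GHz for the Opteron 275 and 2.4 GHz for the Opteron 280. It also assumes the 7%-per-10% rule of thumb keeps holding linearly up to 3 GHz, which is an extrapolation and not a measurement.

```python
# MSI 1U results from the PostgreSQL table; the clock speeds are assumed
# to be 2.2 GHz (Opteron 275) and 2.4 GHz (Opteron 280).
opteron_275 = {"ghz": 2.2, "qps": 490}
opteron_280 = {"ghz": 2.4, "qps": 524}

# Measured gains going from the 275 to the 280.
clock_gain = opteron_280["ghz"] / opteron_275["ghz"] - 1.0  # about 9.1%
perf_gain = opteron_280["qps"] / opteron_275["qps"] - 1.0   # about 6.9%

# Rule of thumb from the text: 7% more performance per 10% more clock.
scaling = 0.7

# Project from the Opteron 280 to a hypothetical 3 GHz part.
target_ghz = 3.0
target_clock_gain = target_ghz / opteron_280["ghz"] - 1.0   # 25%
projected_qps = opteron_280["qps"] * (1.0 + scaling * target_clock_gain)

print(f"measured: {perf_gain:.1%} performance for {clock_gain:.1%} clock")
print(f"projected at {target_ghz} GHz: {round(projected_qps)} queries/s")
```

The measured ratio between the two Opterons works out to roughly 0.76, so rounding it to 0.7 makes the projection slightly conservative.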
91 Comments
Questar - Thursday, June 8, 2006 - link
Why? Because AMD got creamed?
ashyanbhog - Thursday, June 8, 2006 - link
and Intel woodcrest may have fantastic performance when compared to earlier xeons,but Intel is 3 years late to the party, Opteron was here in 2003!
also remember, woodcrest is a brand new design from PIII base, manufactured on 65nm process. It is still to make its debut in the market and be available in volumes. Amd its indeed nice to see it being compared to a 3 year old design manufactued on 90nm process.
AMD still has two product launches to come this year. Move to DDR2 for opterons which should cut some power usage for the total system AND introduction of products manufactured on 65nm at the fag end of the year. Will woodcrest and conroe still retain their performance margins then? if not, for how many months or weeks has Intel grabbed this "performance crown"?
zsdersw - Thursday, June 8, 2006 - link
Consider the following:
- If comparisons could be made between new products from both companies (i.e., Woodcrest versus K8L), they would be made. In the game of leapfrog that we have between AMD and Intel, the comparisons will always be between existing tech and new tech. Will you be pointing out how AMD is "late to the party" when they release their new stuff?
- Making its debut and availability in volume is an issue for both AMD and Intel. It's not a valid point unless you make it across the board.
- 65nm will allow clock speeds of Opterons/A64's to increase.. but Conroe/Woodcrest speeds will be increasing as well.
ashyanbhog - Thursday, June 8, 2006 - link
not because AMD got creamed!
a 35 billion$ dollar turnover company (Intel) is bound to make a comeback one day.
it Anandtech's review setup, its full of holes
the mysql benchmark on Dual Dual core opterons where they see a 30% drop against single core dual processor numbers in this becnhmark contradicts their own earlier benchmark where they see a 10% performance increase.
http://www.anandtech.com/IT/showdoc.aspx?i=2447&am...
they also use a substandard MSI motherboard in one of the Opteron systems and fail to mention which system was used for the benchmarks
mistakes like this, genuine or intentional, are rife throughout the review report
the whole thing looks like the rig was setup to push the performance diff b/w woodcrest and Opterons to the max,
why would anybody two months to tweak settings before they publish the review!
Questar - Thursday, June 8, 2006 - link
Why? Because AMD got creamed?
duploxxx - Thursday, June 8, 2006 - link
yeah right its a workstation motherboard it uses an nforce controller so maybe they rate it as server board it still is a budget board used for workstations, not a real server board or server chipset like they used on the intel woodcrest.
check the servers like sun galaxy and hp dl385 they have amd chipsets... big difference.
the nforce has a shared memory bus...
zsdersw - Thursday, June 8, 2006 - link
Yeah, that's one of the 3 Opteron servers. At any rate, the MSI board is a basic server board.. it's still a server board.
duploxxx - Thursday, June 8, 2006 - link
yeah they have done 1 real bench with an hp. all other benches were done with the 2 MSI basic boards...still waiting for the wintel benches
wolaris - Thursday, June 8, 2006 - link
In corporate environments, no-one with any hardware budget at all runs webserver and database on the same machine, as it hurts both performance and reliability. This affects T1 most, as its low clock speed and simple cores are not meant for database workloads.
I think that you should run web serving tests using common, high-performance Opteron DB server and separate webservers, as it would be the case in real-world scenarios.
MrKaz - Thursday, June 8, 2006 - link
So Power consuming of the new Intel processor on .65nm at already high clock speed of 3.0Ghz is already consuming more than the older AMD Opteron on .90nm 2.8Ghz and DDR.
When AMD releases socket F will go DDR2 (less power) and better .90nm samples (lower power). So then "new" Intel is already getting beaten...
And those tests where done with Cool&Quite?
Also don’t forget this tests where done with Woodcrest 3.0Ghz VS Opteron 2.2Ghz and 2.4Ghz, so when AMD releases the 2.8Ghz and 3.0Ghz with socket F the performance lead of Intel will vanish…
I think the biggest surprise here is how bad Xeon (P4) was (IS!!), and people keep buying it.