Free Republic
Browse · Search
General/Chat
Topics · Post Article


Facebook trapped in MySQL ‘fate worse than death’
Giga OM ^ | July 7, 2011 | Derrick Harris

Posted on 07/07/2011 8:55:49 PM PDT by TenthAmendmentChampion

According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to “a fate worse than death,” and the only way out is “bite the bullet and rewrite everything.”

Not that it’s necessarily Facebook’s fault, though. Stonebraker says the social network’s predicament is all too common among web startups that start small and grow to epic proportions.

During an interview this week, Stonebraker explained to me that Facebook has split its MySQL database into 4,000 shards in order to handle the site’s massive data volume, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve. I’m checking with Facebook to verify the accuracy of those numbers, but Facebook’s history with MySQL is no mystery.

The oft-quoted statistic from 2008 is that the site had 1,800 servers dedicated to MySQL and 805 servers dedicated to memcached, although multiple MySQL shards and memcached instances can run on a single server. Facebook even maintains a MySQL at Facebook page dedicated to updating readers on the progress of its extensive work to make the database scale along with the site...
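For readers unfamiliar with the setup the excerpt describes, here is a minimal sketch of sharding plus a read-through cache. It is an illustration only, not Facebook's actual design: the names `shard_for` and `ShardedStore` are hypothetical, plain dictionaries stand in for the MySQL shards and for memcached, and the 4,000 figure is the unverified number quoted above.

```python
import hashlib

NUM_SHARDS = 4000  # figure quoted in the article; unverified


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy model: dicts stand in for database shards and for the cache."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]
        self.cache = {}
        self.db_reads = 0  # counts reads that reached a shard

    def put(self, user_id: int, profile: dict) -> None:
        shard = self.shards[shard_for(user_id, len(self.shards))]
        shard[user_id] = profile
        self.cache.pop(user_id, None)  # invalidate any stale cached copy

    def get(self, user_id: int):
        if user_id in self.cache:
            return self.cache[user_id]  # cache hit, no shard read
        self.db_reads += 1
        shard = self.shards[shard_for(user_id, len(self.shards))]
        value = shard.get(user_id)
        if value is not None:
            self.cache[user_id] = value  # populate the cache on a miss
        return value
```

Repeated reads of the same key are served from the cache, which is why a heavily read site puts so many cache instances in front of its database. The catch with this layout is that anything needing data from several shards at once has to be assembled in application code.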

(Excerpt) Read more at gigaom.com ...


TOPICS: Business/Economy; Computers/Internet
KEYWORDS: bsd; facebook; linux; mysql; opensource; socialnetwork; unix

BTTT


101 posted on 07/08/2011 7:30:03 AM PDT by DollyCali (Don't tell God how big your storm is... tell your storm how BIG your God is!)
[ Post Reply | Private Reply | To 1 | View Replies]

To: NVDave

thanks for the trip down memory lane.


102 posted on 07/08/2011 8:14:34 AM PDT by NonValueAdded (From her lips to the voters' ears: Debbie Wasserman Schultz: "We own the economy" June 15, 2011)
[ Post Reply | Private Reply | To 26 | View Replies]

To: andyk; smokingfrog
I bet this post is stored in a MySQL database.

I'll bet it is, but then it's fairly straightforward and relatively small. You have users, threads, posts, mails and the connections between them. Everything's text, no BLOBs. We also have maybe a few hundred thousand users at most, not several hundred million. We probably have tens of millions of records in the "posts" table, but that's not a problem with modern hardware.
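To show how small that kind of schema is, here is a rough sketch. It is a guess at the general shape, not the site's real schema: every table and column name is made up, and it uses Python's built-in sqlite3 with an in-memory database instead of MySQL so that it runs anywhere.

```python
import sqlite3


def build_forum_db() -> sqlite3.Connection:
    """Create a hypothetical text-only forum schema in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(
        """
        CREATE TABLE users (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE threads (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER NOT NULL REFERENCES users(id)
        );
        CREATE TABLE posts (
            id        INTEGER PRIMARY KEY,
            thread_id INTEGER NOT NULL REFERENCES threads(id),
            author_id INTEGER NOT NULL REFERENCES users(id),
            reply_to  INTEGER REFERENCES posts(id),
            body      TEXT NOT NULL
        );
        CREATE TABLE mails (
            id           INTEGER PRIMARY KEY,
            sender_id    INTEGER NOT NULL REFERENCES users(id),
            recipient_id INTEGER NOT NULL REFERENCES users(id),
            body         TEXT NOT NULL
        );
        -- Listing one thread in posting order is the common query.
        CREATE INDEX posts_by_thread ON posts (thread_id, id);
        """
    )
    return conn
```

With an index on (thread_id, id), pulling one page of a thread touches only a handful of rows, which is why tens of millions of posts is not a problem for this kind of workload.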

Google probably runs the largest commercial database, but it's highly customized. The proprietary Linux modifications and the database were all designed for this. It's highly distributed by the core design, and fault tolerant in that it doesn't care if some search data is lost since it will be rebuilt on later crawls.

103 posted on 07/08/2011 9:07:01 AM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 30 | View Replies]

To: Salamander
Mack - ah - DAME - eee - ah!

Ooooooooooh!
104 posted on 07/08/2011 9:11:46 AM PDT by shibumi (The man who never alters his opinion is like standing water and breeds reptiles of the mind - Blake)
[ Post Reply | Private Reply | To 77 | View Replies]

To: kingu

Does the company you work for understand that fixing this problem cannot wait until the system breaks? That it takes time to move stuff over to a more robust system?

I ask, because I’ve been in a similar situation and the company wouldn’t listen when I told them they needed to pay attention and start looking at upgrades now, rather than at the last minute.


105 posted on 07/08/2011 9:20:23 AM PDT by stylin_geek (Never underestimate the power of government to distort markets)
[ Post Reply | Private Reply | To 10 | View Replies]

To: Greysard
If some customers need more performance they can always send a check to Steve Ballmer.

Just a bit of warning, writing a check to get Standard or Enterprise doesn't just mean you get to access more memory and CPUs and can have bigger databases. It's much better than in the MSDE days, but there are still other differences that can affect how you design your database. For example, off the top of my head, the full versions also give indexed views, transparent encryption (so you don't have to encrypt yourself), table and index partitioning and database mail. Luckily, you can now get full text search in the Express Edition too if you download the version with advanced services.

Still, it's much better than moving from mySQL to a top-line system. FTR, I don't hate mySQL, since I've used it quite a bit on small systems I know won't get big. I just wouldn't trust it with the multi-terabytes I've managed on other systems.

106 posted on 07/08/2011 9:39:08 AM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 43 | View Replies]

To: re_nortex
I remember the day the AOL masses were unleashed upon the internet. You could see a visible change in the culture of the net overnight.

BTW, I still have links for some gopherspace that is still out there and running.

107 posted on 07/08/2011 9:40:28 AM PDT by zeugma (The only thing in the social security trust fund is your children and grandchildren's sweat.)
[ Post Reply | Private Reply | To 56 | View Replies]

To: smokingfrog

Translation?

They have to demolish the building and put up another which looks exactly the same but which won’t spontaneously implode like the first one is about to.


108 posted on 07/08/2011 9:41:44 AM PDT by ctdonath2
[ Post Reply | Private Reply | To 4 | View Replies]

To: re_nortex
..and then along came AOL followed by the "September that Never Ended".

Weird Al is a genius,

And postin' "Me too!" like some brain-dead AOL-er
I should do the world a favor and cap you like Old Yeller
You're just about as useless as jpegs to Helen Keller

109 posted on 07/08/2011 9:43:55 AM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 56 | View Replies]

To: stylin_geek

Indeed, it does understand this, but it is not of great concern, because the company will most likely be closing down before the end of the year due to eBay’s pricing changes. Unfortunately, we’re also taking down several others at the same time when we stop buying from them.


110 posted on 07/08/2011 9:54:49 AM PDT by kingu (Everything starts with slashing the size and scope of the federal government.)
[ Post Reply | Private Reply | To 105 | View Replies]

To: kingu

Okay, whip it out...

I had one with 20 million+ main records, but all text so it was only a few hundred gigabytes. Tweaking the indexes (regular and full-text) for fast searching was the key there. Had another that used lots of BLOBs, ran to terabytes in over a million records. That just sucked. I left before SQL Server 2008 came available with its ability to hold BLOBs outside the database files.

But then I have friends who have dealt with a lot more than that. One ran a mainframe with two of those robotic tape silos. He’s probably juggling petabytes by now. There’s always a bigger fish.


111 posted on 07/08/2011 9:56:43 AM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 64 | View Replies]

To: glock rocks

“Regular inner join, or do you plan to get kinky with a left outer join?”


112 posted on 07/08/2011 10:00:25 AM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 63 | View Replies]

To: TenthAmendmentChampion

It’s got 2 Americas and change worth of users. Somehow I don’t think mySQL is holding them back very much.


113 posted on 07/08/2011 10:00:28 AM PDT by discostu (Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Cronos
And don’t get me started on RAID settings...

Wouldn't it be fun if we found out Facebook was running RAID 5 or 6?

114 posted on 07/08/2011 10:05:53 AM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 83 | View Replies]

To: BuckeyeTexan
... locking the DB down to one user and introducing a few sleep statements here and there ...

LOL. I really like that.

115 posted on 07/08/2011 10:08:57 AM PDT by ken in texas (Can't Afford a Tagline... send money.)
[ Post Reply | Private Reply | To 96 | View Replies]

To: antiRepublicrat

never used Raid 6 — how does that differ from 5?


116 posted on 07/08/2011 10:16:53 AM PDT by Cronos ( W Szczebrzeszynie chrzaszcz brzmi w trzcinie I Szczebrzeszyn z tego slynie.)
[ Post Reply | Private Reply | To 114 | View Replies]

To: Cronos
never used Raid 6 — how does that differ from 5?

It's 5 with an extra parity drive. It's supposed to eliminate that exposure to data loss while a RAID 5 is rebuilding after losing a drive. Basically, with RAID 6 and a hot spare, odds are you'll always have one level of redundancy.
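A minimal sketch of that trade-off, assuming equal-size drives. The function name `raid_summary` is made up for illustration, and the model is simplified: real arrays spread parity across all drives, so "parity drives" here means the capacity equivalent given up to parity.

```python
def raid_summary(level: int, drives: int, drive_tb: float) -> dict:
    """Usable capacity and tolerated drive failures for RAID 5 or RAID 6."""
    parity_drives = {5: 1, 6: 2}  # capacity equivalent spent on parity
    if level not in parity_drives:
        raise ValueError("this sketch only covers RAID 5 and RAID 6")
    parity = parity_drives[level]
    if drives < parity + 2:  # RAID 5 needs at least 3 drives, RAID 6 at least 4
        raise ValueError("not enough drives for this RAID level")
    return {
        "usable_tb": (drives - parity) * drive_tb,
        "failures_tolerated": parity,
    }
```

For the same set of drives, RAID 6 gives up one more drive's worth of space than RAID 5 and in exchange survives a second failure, including one that hits while the first failed drive is still rebuilding.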

117 posted on 07/08/2011 10:30:05 AM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 116 | View Replies]

To: LivingNet
Personally, I use Cache almost every day and as far as I am concerned it is the greatest database and programming language ever...just don't know why it doesn't catch on.

Cache....I think that's a multi-value Pick variant isn't it? If so, it is an extremely strong database but I haven't worked specifically in the Cache development environment. We are a Universe shop (for now anyway) and a guy that used to work with me went to a Cache shop and LOVES it.
118 posted on 07/08/2011 12:28:06 PM PDT by copaliscrossing (Progressives are Socialists)
[ Post Reply | Private Reply | To 44 | View Replies]

To: JRandomFreeper

Bingo!

Your *ENTIRE* reply hit the nail on the head.
I’ve seen (more often heard, rather than seen) this type of issue occur when forethought and growth are not engineered in at the beginning.


119 posted on 07/08/2011 1:04:23 PM PDT by Verbosus (/* No Comment */)
[ Post Reply | Private Reply | To 13 | View Replies]

To: NVDave

Wow.

Yet another person thinks like I do.

(I’m not fond of C++ either. Give me “C”. Pure and powerful K&R “C”.)


120 posted on 07/08/2011 1:10:25 PM PDT by Verbosus (/* No Comment */)
[ Post Reply | Private Reply | To 26 | View Replies]

To: antiRepublicrat

Sounds like someone finally cottoned onto what Honeywell did on their drive systems under CP-6.


121 posted on 07/08/2011 1:23:53 PM PDT by NVDave
[ Post Reply | Private Reply | To 117 | View Replies]

To: NVDave

I don’t really like RAID 6 though. For a small installation of five drives, you might as well add one more and get RAID 10. For larger installations the big problem comes in drive rebuild times. With very large drives, the rebuild of one drive of a RAID 5 can take so long the risk of failure during the rebuild becomes more and more significant. RAID 6 just pushes the problem off a bit.
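The arithmetic behind both points can be sketched quickly. The function names are made up for illustration, and the rebuild estimate is a best case that assumes the whole drive is rewritten at a constant sustained rate; real rebuilds on a busy array usually run slower.

```python
def raid6_usable_tb(drives: int, drive_tb: float) -> float:
    """RAID 6 gives up two drives' worth of capacity to parity."""
    if drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drives - 2) * drive_tb


def raid10_usable_tb(drives: int, drive_tb: float) -> float:
    """RAID 10 mirrors every drive, so half the raw capacity is usable."""
    if drives < 4 or drives % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return (drives // 2) * drive_tb


def rebuild_hours(drive_tb: float, mb_per_s: float) -> float:
    """Best-case hours to rewrite one whole drive at a sustained rate."""
    seconds = (drive_tb * 1e12) / (mb_per_s * 1e6)
    return seconds / 3600
```

Five drives in RAID 6 and six drives in RAID 10 come out to the same usable space (three drives' worth), which is the case for adding the sixth drive. The rebuild estimate grows linearly with drive size, so bigger drives mean a longer window in which another failure can hit.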


122 posted on 07/08/2011 1:38:48 PM PDT by antiRepublicrat
[ Post Reply | Private Reply | To 121 | View Replies]

To: antiRepublicrat

ok — I work with datawarehouses so most of our stuff is RAID 1 or sometimes 5


123 posted on 07/08/2011 1:44:48 PM PDT by Cronos ( W Szczebrzeszynie chrzaszcz brzmi w trzcinie I Szczebrzeszyn z tego slynie.)
[ Post Reply | Private Reply | To 117 | View Replies]

To: antiRepublicrat

A growing number of shops are dumping Wintel server farms (and even IBM RS6000 type farms) for Linux instances running on mainframes.


124 posted on 07/08/2011 2:31:02 PM PDT by djf ("Life is never fair...And perhaps it is a good thing for most of us that it is not." Oscar Wilde)
[ Post Reply | Private Reply | To 98 | View Replies]

To: antiRepublicrat
You have users, threads, posts, mails and the connections between them.

LOL
125 posted on 07/08/2011 6:31:50 PM PDT by andyk (Interstate != Intrastate)
[ Post Reply | Private Reply | To 103 | View Replies]

To: mad_as_he$$
Feel free to disagree, but I think you need to improve your knowledge of Facebook's internal systems. There are several papers and presentations on infoq.com and highscalability.com if you're so inclined. Both sites also have architecture information on other web scale companies.

I'm sure at one time hundreds of gigs per day was impressive. A year ago, Facebook had roughly 130TB of logs per day. I'm sure that's gone up.
126 posted on 07/10/2011 10:22:21 PM PDT by tfecw (It's for the children)
[ Post Reply | Private Reply | To 99 | View Replies]

To: tfecw
Once controlled structure is achieved, housekeeping is pure execution. I think the FB problem is that it was started as a dorm room operation, not by professional computer guys, and then never upgraded to deal with large amounts of data. I have two guys who spend all day every day looking at architecture and DB design. They are worth their weight in gold since they are true system gurus. FB seems to have missed that.
Probably didn't make my point as clear as I could have. In data management the foundation is everything and worth the initial pain. We start a project today with a very large customer still trying to run a relatively complex operation on Access of all things. Our biggest obstacle is the IT Dept that wrote the application years ago and has been nursing it all this time. Should be real interesting in a painful sort of way!
127 posted on 07/11/2011 4:57:38 AM PDT by mad_as_he$$
[ Post Reply | Private Reply | To 126 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson