Google open sources 'Protocol Buffers'

Google has open sourced an internal development tool called 'Protocol Buffers', a data description language that forms a basic part of the operation of the company's vast computing cluster.

The tool, which has been in use at Google for several years, is what the company uses to encode almost any sort of structured information that needs to be passed across the network or stored on disk, Google open-source programs manager Chris DiBona said in a blog post announcing the move.

Protocol Buffers could be useful for other organisations that need an efficient way to move structured data around a network, for instance in large clusters or datacentres, DiBona said.

Google uses thousands of data formats for networked messages, and XML is simply too cumbersome to use as an encoding method for it all, Google software engineer Kenton Varda explained in a separate blog post. "As nice as XML is, it isn't going to be efficient enough for this scale," he wrote. "When all of your machines and network links are running at capacity, XML is an extremely expensive proposition."

Various other methods exist for passing encoded data over networks, but Google found that none of them suited its particular need, which was for a system optimised for efficiency over everything else, Varda said. Protocol Buffers is a sort of interface definition language (IDL), but IDLs have a reputation for being over-complicated, Varda said.

"One of Protocol Buffers' major design goals is simplicity," he wrote. "By sticking to a simple lists-and-records model that solves the majority of problems, and resisting the desire to chase diminishing returns, we believe we have created something that is powerful without being bloated."
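As a rough illustration of what a lists-and-records model can look like, the sketch below uses plain Python dataclasses rather than Protocol Buffers itself. The `Person` and `PhoneNumber` records and all of their fields are hypothetical names chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhoneNumber:
    # A record: a fixed set of named, typed fields.
    number: str
    kind: str = "mobile"


@dataclass
class Person:
    # Another record, holding scalar fields plus a list of nested records.
    name: str
    id: int
    phones: List[PhoneNumber] = field(default_factory=list)


# Build one record and append two entries to its list field.
person = Person(name="Ada", id=42)
person.phones.append(PhoneNumber("555-0100"))
person.phones.append(PhoneNumber("555-0199", kind="work"))
```

Everything in such a model is either a record with named fields or a list of values, which is the kind of restriction Varda describes as keeping the system simple.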

He estimated that the system is at least an order of magnitude faster than XML, while other Google documentation said Protocol Buffers can be parsed 20 to 100 times faster. The binary files produced by Protocol Buffers are three to 10 times smaller than comparable XML files, Google said.

Google released an FAQ detailing Protocol Buffers, along with source code for the Java, Python, and C++ protocol buffer compilers.

Google admitted that the system is comparable to long-established projects such as JavaScript Object Notation (JSON), which is often used in Ajax web programming. But JSON, like XML, is a human-readable text format, rather than a binary format such as Protocol Buffers, a fact that reduces JSON's efficiency, Google said.
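The difference between a text format and a binary one can be sketched in a few lines of Python. The example below is a hand-rolled illustration that follows the tag-and-varint layout of the Protocol Buffers wire format for a three-field record; it is not Google's library, and the record contents, field numbers and helper function names are assumptions made for this example.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    # Key is (field_number << 3) | wire type 0 (varint), then the value.
    return encode_varint(field_number << 3) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    # Key uses wire type 2 (length-delimited), then length, then UTF-8 bytes.
    data = text.encode("utf-8")
    key = encode_varint((field_number << 3) | 2)
    return key + encode_varint(len(data)) + data


# Hypothetical record used only for this comparison.
record = {"name": "Ada", "id": 150, "email": "ada@example.com"}

as_json = json.dumps(record).encode("utf-8")
as_binary = (
    encode_string_field(1, record["name"])
    + encode_int_field(2, record["id"])
    + encode_string_field(3, record["email"])
)

print(len(as_json), len(as_binary))
```

The binary form is shorter mainly because field names are replaced by small numeric tags and integers are packed rather than spelled out as digits. A single tiny record like this says nothing about Google's quoted figures, which compare against XML at much larger scale.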

Even so, Google was criticised on some fronts for creating its own system from scratch and ignoring existing approaches. David Golightly, user experience developer lead for Zillow.com, argued the textual syntax used in Protocol Buffers could have been made interoperable with an existing text-based format.

"I'm always just a little disappointed when someone goes about creating their own new textual format syntax on arbitrary grounds, rather than adapting an existing format to their needs," he said in a blog post.

Google is not the first to open source its internal data interchange system: Protocol Buffers is very similar to the Thrift framework, developed by Facebook and now an open-source project in the Apache Software Foundation Incubator. Thrift, however, differs in that it describes services rather than pure data.

Talkback

I guess I should be tickled to be quoted here, but instead I'm slightly appalled. The "blog post" in question was not my blog, but rather Simon Willison's; I merely left a comment on the article. My comments were taken out of context. I have no beef with Protocol Buffers per se, and one of the later commenters in the story answered the questions I had about the textual representation of the format in Google's API docs. Furthermore, my job title wasn't present anywhere in my comments; the writer evidently followed the link to my personal blog (http://davidgolightly.blogspot.com), found my job title, and quoted me out of context without contacting me for permission first. While I realize that getting content off the internet is more useless than trying to get urine out of a tainted swimming pool, I would have more respect for the writer's journalistic integrity had I been contacted for a more apropos comment, rather than having my words taken out of context without my foreknowledge.

Protocol Buffers. They're great, try some.

-- David Golightly

David Golightly 11 July, 2008 22:15