The Evolution of Google File System (GFS)

First came the Google File System, described in this 2003 paper (PDF):

We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.

This conversation between Marshall Kirk McKusick and Sean Quinlan covers some of the strengths of GFS, as well as what led to the next version of the Google File System:

Because GFS was designed initially to enable a crawling and indexing system, throughput was everything. […] While these instances, where you have to provide for failover and error recovery, may have been acceptable in the batch situation, they're definitely not OK from a latency point of view for a user-facing application. Another issue here is that there are places in the design where we've tried to optimize for throughput by dumping thousands of operations into a queue and then just processing through them. That leads to fine throughput, but it's not great for latency. You can easily get into situations where you might be stuck for seconds at a time in a queue just waiting to get to the head of the queue.
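
To make the queueing point in that quote concrete, here is a minimal Python sketch. It is not how GFS works internally: it assumes a single FIFO queue, one worker, and a fixed per-operation service time of 1 ms, all of which are hypothetical numbers chosen for illustration. The function names are made up for this example.

```python
def fifo_wait_times_ms(num_ops, service_time_ms):
    """Time each operation spends waiting before it reaches the head of the queue.

    Assumes all operations are enqueued at once and handled one at a time,
    in order, by a single worker (an illustrative model, not GFS itself).
    """
    return [position * service_time_ms for position in range(num_ops)]


def throughput_ops_per_s(num_ops, service_time_ms):
    """Operations completed per second once the whole queue has been drained."""
    total_time_ms = num_ops * service_time_ms
    return 1000.0 * num_ops / total_time_ms


# Hypothetical workload: 5000 queued operations, 1 ms of work each.
waits = fifo_wait_times_ms(num_ops=5000, service_time_ms=1)

print("throughput (ops/s):", throughput_ops_per_s(5000, 1))
print("wait of first op (ms):", waits[0])
print("wait of last op (ms):", waits[-1])
```

Under these assumptions the worker stays fully busy, so throughput does not depend on queue depth, but the operation at the back of the queue waits almost five seconds before any work is done on it. That is acceptable for a batch crawl and painful for a user-facing request.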

In this ZDNet interview, Urs Hölzle confirms the migration to the new version of the filesystem:

[…] most applications don't use [Google File System (GFS)] today. In fact, we're phasing out GFS in favour of the next-generation file system that is very similar, but it's not GFS anymore. It scales better and has better latency properties as well.

And according to the same interview, Google is already looking into building another version of its distributed filesystem, one that would take advantage of flash memory:

I think three years from now we'll try to retire that because flash memory is coming and faster networks and faster CPUs are on the way and that will change how we want to do things.

Original title and link: The Evolution of Google File System (GFS) (NoSQL database©myNoSQL)
