GFS -> IPFS tex

Juan Batiz-Benet 11 years ago
commit 810ddd40f3
1 changed files with 47 additions and 49 deletions
papers/ipfs/ipfs.tex

@@ -12,9 +12,7 @@
 
 \begin{document}
 
-% \conferenceinfo{WOODSTOCK}{'97 El Paso, Texas USA}
-
-\title{Galactic File System}
+\title{IPFS - Towards The Permanent Web (DRAFT 2)}
 \subtitle{}
 
 \numberofauthors{1}
@@ -39,16 +37,16 @@
 \maketitle
 \begin{abstract}
 The Galactic File System is a peer-to-peer distributed file system capable of
-sharing the same files with millions of nodes. GFS combines a distributed
+sharing the same files with millions of nodes. IPFS combines a distributed
 hashtable, cryptographic techniques, merkle trees, content-addressable
 storage, bittorrent, and tag-based filesystems to build a single massive
-file system shared between peers. GFS has no single point of failure, and
+file system shared between peers. IPFS has no single point of failure, and
 nodes do not need to trust each other.
 \end{abstract}
 
 \section{Introduction}
 
-[Motivate GFS. Introduce problems. Describe BitTorrent existing problems (
+[Motivate IPFS. Introduce problems. Describe BitTorrent existing problems (
 multiple files. one swarm. sloppy dht implementation.) Describe version
 control efforts. Propose potential combinations of good ideas.]
 
@@ -63,15 +61,15 @@ Ori,
 Coral]
 
 This paper introduces
-GFS, a novel peer-to-peer version-controlled filesystem;
-and BitSwap, the novel peer-to-peer block exchange protocol serving GFS.
+IPFS, a novel peer-to-peer version-controlled filesystem;
+and BitSwap, the novel peer-to-peer block exchange protocol serving IPFS.
 
 The rest of the paper is organized as follows.
 Section 2 describes the design of the filesystem.
 Section 3 evaluates various facets of the system under benchmark and common
 workloads.
-Section 4 presents and evaluates a world-wide deployment of GFS.
-Section 5 describes existing and potential applications of GFS.
+Section 4 presents and evaluates a world-wide deployment of IPFS.
+Section 5 describes existing and potential applications of IPFS.
 Section 6 discusses related and future work.
 
 Notation Notes:
@@ -83,9 +81,9 @@ Notation Notes:
 
 \section{Design}
 
-\subsection{GFS Nodes}
+\subsection{IPFS Nodes}
 
-GFS is a distributed file system where all nodes are the same. They are
+IPFS is a distributed file system where all nodes are the same. They are
 identified by a \texttt{NodeId}, the cryptographic hash of a public-key
 (note that \textit{checksum} will henceforth refer specifically to cryptographic
 hashes of an object). Nodes also store their public and private keys. Clients
@@ -107,8 +105,8 @@ accrued benefits. It is recommended that nodes remain the same.
 
 
 Together, the
-nodes store the GFS files in local storage, and send files to each other.
-GFS implements its features by combining several subsystems with many
+nodes store the IPFS files in local storage, and send files to each other.
+IPFS implements its features by combining several subsystems with many
 desirable properties:
 
 \begin{enumerate}
@@ -127,12 +125,12 @@ desirable properties:
 
 These subsystems are not independent. They are well integrated and leverage
 their blended properties. However, it is useful to describe them separately,
-building the system from the bottom up. Note that all GFS nodes are identical,
+building the system from the bottom up. Note that all IPFS nodes are identical,
 and run the same program.
 
 \subsection{Distributed Sloppy Hash Table}
 
-First, GFS nodes implement a DSHT based on Kademlia and Coral to coordinate
+First, IPFS nodes implement a DSHT based on Kademlia and Coral to coordinate
 and identify which nodes can serve a particular block of data.
 
 \subsubsection{Kademlia DHT}
@@ -158,7 +156,7 @@ Kademlia is a DHT that provides:
 
 While some peer-to-peer filesystems store data blocks directly in DHTs,
 this ``wastes storage and bandwidth, as data must be stored at nodes where it
-is not needed''. Instead, GFS stores a list of peers that can provide the data block.
+is not needed''. Instead, IPFS stores a list of peers that can provide the data block.
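
Editor's sketch of the provider-list idea in the changed line above, assuming a hypothetical in-memory table in place of the real DSHT. The table maps a block's key to the set of peers that can serve it, rather than to the block's bytes. The names `ProviderTable`, `add_provider`, and `get_providers` are illustrative and do not appear in the paper.

```python
from collections import defaultdict


class ProviderTable:
    """Hypothetical stand-in for one node's slice of the DHT."""

    def __init__(self):
        # block key -> set of NodeIds that claim to hold the block
        self._providers = defaultdict(set)

    def add_provider(self, key: str, node_id: str) -> None:
        """Record that node_id can serve the block stored under key."""
        self._providers[key].add(node_id)

    def get_providers(self, key: str) -> set:
        """Return a copy of the provider set so callers cannot mutate the table."""
        return set(self._providers.get(key, set()))


table = ProviderTable()
table.add_provider("block-key-1", "node-a")
table.add_provider("block-key-1", "node-b")
```

Only the small provider records travel through the table; the block bytes stay with the peers that actually want them.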
 
 
 \subsubsection{Coral DSHT}
 
@@ -189,15 +187,15 @@ Coral extends Kademlia in three particularly important ways:
 \end{enumerate}
 
 
-\subsubsection{GFS DSHT}
+\subsubsection{IPFS DSHT}
 
-The GFS DSHT supports four RPC calls:
+The IPFS DSHT supports four RPC calls:
 
 
 
 \subsection{Block Exchange - BitSwap Protocol}
 
-The exchange of data in GFS happens by exchanging blocks with peers using a
+The exchange of data in IPFS happens by exchanging blocks with peers using a
 BitTorrent inspired protocol: BitSwap. Like BitTorrent, BitSwap peers are
 looking to acquire a set of blocks, and have blocks to offer in exchange.
 Unlike BitTorrent, BitSwap is not limited to the blocks in one torrent.
@@ -453,7 +451,7 @@ the sender. Both receiver and sender should update their ledgers accordingly,
 though the sender is either malfunctioning or attacking the receiver. Note that
 BitSwap expects to operate on a reliable transmission channel, so data errors
 -- which could lead to incorrect penalization of an honest sender -- are
-expected to be caught before the data is given to BitSwap. GFS uses the uTP
+expected to be caught before the data is given to BitSwap. IPFS uses the uTP
 protocol.
 
 \paragraph{Peer.close(Bool)}
@@ -491,12 +489,12 @@ the future, if it is useful to do so.
 
 \subsection{Object Model}
 
-The DHT and BitSwap allow GFS to form a massive peer-to-peer system for storing
+The DHT and BitSwap allow IPFS to form a massive peer-to-peer system for storing
 and distributing blocks quickly and robustly to users.
-GFS builds a filesystem out of this efficient block distribution system,
+IPFS builds a filesystem out of this efficient block distribution system,
 constructing files and directories out of blocks.
 
-Files in GFS are represented as a collection of inter-related objects, like in
+Files in IPFS are represented as a collection of inter-related objects, like in
 the version control system Git. Each object is addressed by the cryptographic
 hash of its contents (\texttt{Checksum}). The file objects are:
 
@@ -524,10 +522,10 @@ Notes:
 
 The \texttt{Block} object contains an addressable unit of data, and
 represents a file.
-GFS Blocks are like Git blobs or filesystem data blocks. They store the
+IPFS Blocks are like Git blobs or filesystem data blocks. They store the
 users' data. (The name \textit{block} is preferred over \textit{blob}, as the
-Git-inspired view of a \textit{blob} as a \textit{file} breaks down in GFS.
-GFS files can be represented by both \texttt{lists} and \texttt{blocks}.)
+Git-inspired view of a \textit{blob} as a \textit{file} breaks down in IPFS.
+IPFS files can be represented by both \texttt{lists} and \texttt{blocks}.)
 Format:
 \begin{verbatim}
 block <size>
@@ -539,9 +537,9 @@ block <size>
 \subsubsection{List Object}
 
 The \texttt{List} object represents a large or de-duplicated file made up of
-several GFS \texttt{Blocks} concatenated together. \texttt{Lists} contain
+several IPFS \texttt{Blocks} concatenated together. \texttt{Lists} contain
 an ordered sequence of \texttt{block} or \texttt{list} objects.
-In a sense, the GFS \texttt{List} functions like a filesystem file with
+In a sense, the IPFS \texttt{List} functions like a filesystem file with
 indirect blocks. Since \texttt{lists} can contain other \texttt{lists}, topologies including linked lists and balanced trees are possible. Directed graphs where the same node appears in multiple places allow in-file deduplication. Cycles are not possible (enforced by hash addressing).
 Format:
 \begin{verbatim}
@@ -554,7 +552,7 @@ list <num objects> <size varint>
 
 \subsubsection{Tree Object}
 
-The \texttt{tree} object in GFS is similar to Git trees: it represents a
+The \texttt{tree} object in IPFS is similar to Git trees: it represents a
 directory, a list of checksums and names. The checksums reference \texttt{blob}
 or other \texttt{tree} objects. Note that traditional path naming
 is implemented entirely by the \texttt{tree} objects. \texttt{Blocks} and
@@ -569,7 +567,7 @@ tree <num objects> <size varint>
 
 \subsubsection{Commit Object}
 
-The \texttt{commit} object in GFS is similar to Git's. It represents a
+The \texttt{commit} object in IPFS is similar to Git's. It represents a
 snapshot in the version history of a \texttt{tree}. Note that user
 addresses are NodeIds (the hash of the public key).
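
Editor's sketch of how a user address could be derived and checked, given that a NodeId is the hash of a public key. The paper text shown here does not name the hash function or the encoding, so SHA-256 and hex output are assumptions, and `node_id` and `owns` are hypothetical helper names.

```python
import hashlib


def node_id(public_key: bytes) -> str:
    """Derive an identifier from public-key bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(public_key).hexdigest()


def owns(claimed_id: str, public_key: bytes) -> bool:
    """Check that a presented public key matches a claimed NodeId."""
    return node_id(public_key) == claimed_id


example_key = b"example public key bytes"  # placeholder, not a real key
example_id = node_id(example_key)
```

Because the address is derived from the key, anyone holding the public key can confirm that a commit's author address matches it without consulting a central registry.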
 
 
@@ -592,17 +590,17 @@ it references are accessible, all preceding versions are retrievable and the
 full history of the filesystem changes can be accessed. This is a consequence
 of the \texttt{Git} object model and the graph it forms.
 
-The full power of the \texttt{Git} version control tools is available to GFS
+The full power of the \texttt{Git} version control tools is available to IPFS
 users. The object model is compatible (though not the same). The standard
-\texttt{Git} tools can be used on the \texttt{GFS} object graph after a
+\texttt{Git} tools can be used on the \texttt{IPFS} object graph after a
 conversion. Additionally, a fork of the tools is under development that will
 allow users to use them directly without conversion.
 
 \subsubsection{Object-level Cryptography}
 
-GFS is equipped to handle object-level cryptographic operations. Any additional
+IPFS is equipped to handle object-level cryptographic operations. Any additional
 bytes are appended to the bottom of the object. This changes the object's hash
-(defining a different object, as it should). GFS exposes an API that
+(defining a different object, as it should). IPFS exposes an API that
 automatically verifies signatures or decrypts data.
 
 \begin{itemize}
@@ -612,7 +610,7 @@ automatically verifies signatures or decrypts data.
 
 \subsubsection{Merkle Trees}
 
-The object model in GFS forms a \textit{Merkle Tree}, which provides GFS with
+The object model in IPFS forms a \textit{Merkle Tree}, which provides IPFS with
 useful properties:
 
 \begin{enumerate}
@@ -634,7 +632,7 @@ useful properties:
 
 \subsubsection{Filesystem Paths}
 
-GFS exposes a slash-delimited path-based API. Paths work the same as in any
+IPFS exposes a slash-delimited path-based API. Paths work the same as in any
 traditional UNIX filesystem. Path subcomponents have different meanings per
 object:
 
@@ -772,11 +770,11 @@ This is mitigated by:
 \begin{itemize}
   \item \textbf{tree caching}: since all objects are hash-addressed, they
         can be cached indefinitely. Additionally, \texttt{trees} tend to be
-        small in size so GFS prioritizes caching them over \texttt{blocks}.
+        small in size so IPFS prioritizes caching them over \texttt{blocks}.
   \item \textbf{flattened trees}: for any given \texttt{tree}, a special
         \texttt{flattened tree} can be constructed to list all objects
         reachable from the \texttt{tree}. Figure \ref{flattened-ttt111} shows
-        an example of a flattened tree. While GFS does not construct flattened
+        an example of a flattened tree. While IPFS does not construct flattened
         trees by default, it provides a function for users. For example,
 \end{itemize}
 
@@ -796,13 +794,13 @@ This is mitigated by:
 
 \subsubsection{Publishing Objects}
 
-GFS is globally distributed. It is designed to allow the files of millions
+IPFS is globally distributed. It is designed to allow the files of millions
 of users to coexist together. The \textbf{DHT} with content-hash addressing
 allows publishing objects in a fair, secure, and entirely distributed way.
 Anyone can publish an object by simply adding its key to the DHT, adding
 themselves as a peer, and giving other users the object's hash.
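
Editor's sketch of the publishing steps described above, under stated assumptions: plain dictionaries stand in for the DHT and for each peer's storage, SHA-256 stands in for the unspecified content hash, and `publish` and `fetch` are hypothetical names rather than the paper's API.

```python
import hashlib

dht = {}          # object hash -> set of peer ids (stand-in for the DHT)
local_store = {}  # (peer id, object hash) -> bytes (stand-in for each peer's disk)


def publish(peer_id: str, data: bytes) -> str:
    """Add the object's key to the DHT and list peer_id as a provider."""
    key = hashlib.sha256(data).hexdigest()
    local_store[(peer_id, key)] = data
    dht.setdefault(key, set()).add(peer_id)
    return key  # the publisher hands this hash to other users


def fetch(key: str) -> bytes:
    """Retrieve an object by hash, verifying whatever a provider returns."""
    for peer_id in sorted(dht.get(key, ())):
        data = local_store[(peer_id, key)]
        # Content addressing lets the reader check the bytes against the key.
        if hashlib.sha256(data).hexdigest() == key:
            return data
    raise KeyError(key)


shared_hash = publish("node-a", b"hello ipfs")
```

The reader needs to trust only the hash it was given, not the peer that serves the bytes, which is what makes publishing open to anyone.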
 
 
-Additionally, the GFS root directory supports special functionality to
+Additionally, the IPFS root directory supports special functionality to
 allow namespacing and naming objects in a fair, secure, and distributed
 manner.
 \begin{itemize}
@@ -816,9 +814,9 @@ manner.
         a user can publish a \texttt{tree} or \texttt{commit} under their
         name, and others can verify it by checking the signature matches.
 
-  \item[(c)] If \texttt{/<domain>} is a valid domain name, GFS
-        looks up key \texttt{gfs} in its \texttt{DNS TXT} record. GFS
-        interprets the value as either an object hash or another GFS path:
+  \item[(c)] If \texttt{/<domain>} is a valid domain name, IPFS
+        looks up key \texttt{gfs} in its \texttt{DNS TXT} record. IPFS
+        interprets the value as either an object hash or another IPFS path:
         \begin{verbatim}
   # this DNS TXT record
   fs.benet.ai. TXT "gfs=/aabbccddeeffgg ..."
@@ -832,15 +830,15 @@ manner.
 
 \subsection{Local Objects}
 
-GFS clients require some \textit{local storage}, an external system
-on which to store and retrieve local raw data for the objects GFS manages.
+IPFS clients require some \textit{local storage}, an external system
+on which to store and retrieve local raw data for the objects IPFS manages.
 The type of storage depends on the node's use case.
 In most cases, this is simply a portion of disk space (either managed by
-the native filesystem, or directly by the GFS client). In others, non-
+the native filesystem, or directly by the IPFS client). In others, non-
 persistent caches for example, this storage is just a portion of RAM.
 
-Ultimately, all blocks available in GFS are in some node's
-\textit{local storage}. And when nodes open files with GFS, the objects are
+Ultimately, all blocks available in IPFS are in some node's
+\textit{local storage}. And when nodes open files with IPFS, the objects are
 downloaded and stored locally, at least temporarily. This provides
 fast lookup for some configurable amount of time thereafter.
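
Editor's sketch of the "configurable amount of time" behaviour, assuming a RAM-backed store with a fixed retention window. The class name `LocalStore`, its `ttl_seconds` parameter, and the injectable `clock` are hypothetical; the paper does not specify an eviction policy.

```python
import time


class LocalStore:
    """Hypothetical RAM-backed local storage with a retention window."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._objects = {}  # object hash -> (expiry time, bytes)

    def put(self, key: str, data: bytes) -> None:
        """Store a downloaded object and stamp it with an expiry time."""
        self._objects[key] = (self._clock() + self._ttl, data)

    def get(self, key: str):
        """Return the object if still retained, else None."""
        entry = self._objects.get(key)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._objects[key]  # expired: caller must fetch from peers again
            return None
        return data
```

Passing the clock in as a parameter keeps the retention logic testable without sleeping; a disk-backed node could keep the same interface and swap the dictionary for files.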