peers, the download process will pull from many of them at the same time. The
current encoding parameters require 3 shares to be retrieved for each
segment, which means that up to 3 peers will be used simultaneously. For
-larger networks, 25-of-100 encoding is preferred, meaning 25 peers can be
-used simultaneously. This allows the download process to use the sum of the
+larger networks, 8-of-22 encoding could be used, meaning 8 peers can be used
+simultaneously. This allows the download process to use the sum of the
available peers' upload bandwidths, resulting in downloads that take full
advantage of the common 8x disparity between download and upload bandwidth on
modern ADSL lines.
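The bandwidth arithmetic above can be sketched in a few lines. This is a hypothetical helper, not Tahoe code; the function name and the example numbers are assumptions:

```python
def aggregate_download_time(file_mib, peer_upload_mbit, k):
    """With k-of-n encoding each share is 1/k of the file, and the downloader
    pulls k shares in parallel, so the effective rate is roughly the sum of
    k peers' upload rates."""
    rate_mib_per_s = k * peer_upload_mbit / 8  # Mbit/s -> MiB/s (approx.)
    return file_mib / rate_mib_per_s

# e.g. a 100 MiB file from peers with 1 Mbit/s upstream: a single peer
# would take ~800 s, while 3 peers in parallel take about a third of that.
```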
many tradeoffs.
First, some terms: the erasure-coding algorithm is described as K-out-of-N
-(for this release, the default values are K=25 and N=100). Each grid will
-have some number of peers; this number will rise and fall over time as peers
-join, drop out, come back, and leave forever. Files are of various sizes,
-some are popular, others are rare. Peers have various capacities, variable
+(for this release, the default values are K=3 and N=10). Each grid will have
+some number of peers; this number will rise and fall over time as peers join,
+drop out, come back, and leave forever. Files are of various sizes, some are
+popular, others are rare. Peers have various capacities, variable
upload/download bandwidths, and network latency. Most of the mathematical
models that look at peer failure assume some average (and independent)
probability 'P' of a given peer being available: this can be high (servers
machines means more reliability), and is limited by overhead (proportional to
numshares or log(numshares)) and the encoding technology in use (Reed-Solomon
only permits 256 shares total). It is also constrained by the amount of data
-we want to send to each host. For estimating purposes, think of 100 shares
-out of which we need 25 to reconstruct the file.
+we want to send to each host. For estimating purposes, think of 10 shares
+out of which we need 3 to reconstruct the file.
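The K-out-of-N property can be demonstrated with a toy systematic code: treat the k data words as the first k evaluations of a degree-(k-1) polynomial and hand out further evaluations as parity shares. This sketch uses the prime field GF(257) for readability; real Reed-Solomon codecs of the kind the text refers to work over GF(2^8):

```python
P = 257  # small prime field, chosen only to keep the arithmetic readable

def _lagrange(points, x0):
    """Evaluate at x0 the unique polynomial through the given (x, y) points."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num = den = 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x0 - xm) % P
                den = den * (xj - xm) % P
        # pow(den, P-2, P) is the modular inverse of den (Fermat's little theorem)
        total = (total + yj * num * pow(den, P - 2, P)) % P
    return total

def make_shares(data, n):
    """data: k field elements. Shares are the degree-(k-1) polynomial through
    (0, data[0]) .. (k-1, data[k-1]) evaluated at x = 0..n-1, so the first k
    shares are the data itself (a systematic code)."""
    points = list(enumerate(data))
    return [(x, _lagrange(points, x)) for x in range(n)]

def recover(any_k_shares, k):
    """Any k distinct shares determine the polynomial; re-evaluate at 0..k-1."""
    return [_lagrange(any_k_shares[:k], x) for x in range(k)]
```

With k=3 and n=10, any three of the ten shares reconstruct the data, which is exactly the "need 3 of 10" estimate used above.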
The encoder starts by cutting the original file into segments. All segments
except the last are of equal size. The segment size is chosen to constrain
class Encoder(object):
implements(IEncoder)
- NEEDED_SHARES = 25
- SHARES_OF_HAPPINESS = 75
- TOTAL_SHARES = 100
+ NEEDED_SHARES = 3
+ SHARES_OF_HAPPINESS = 7
+ TOTAL_SHARES = 10
MAX_SEGMENT_SIZE = 1*MiB
def __init__(self, options={}):
# memory footprint: we only hold a tiny piece of the plaintext at any
# given time. We build up a segment's worth of crypttext, then hand
- # it to the encoder. Assuming 25-of-100 encoding (4x expansion) and
- # 2MiB max_segment_size, we get a peak memory footprint of 5*2MiB =
- # 10MiB. Lowering max_segment_size to, say, 100KiB would drop the
- # footprint to 500KiB at the expense of more hash-tree overhead.
+ # it to the encoder. Assuming 3-of-10 encoding (3.3x expansion) and
+ # 2MiB max_segment_size, we get a peak memory footprint of 4.3*2MiB =
+ # 8.6MiB. Lowering max_segment_size to, say, 100KiB would drop the
+ # footprint to 430KiB at the expense of more hash-tree overhead.
d = self._gather_data(self.required_shares, input_piece_size,
crypttext_segment_hasher)
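The footprint arithmetic in the comment above can be checked with a few lines. This is a hypothetical helper for estimation only, not part of the Encoder:

```python
MiB = 2 ** 20

def peak_footprint(k, n, max_segment_size):
    """Peak memory while encoding one segment: the segment itself plus its
    n/k-expanded shares, i.e. (1 + n/k) * segment size."""
    return (1 + n / k) * max_segment_size

# 3-of-10 with 2 MiB segments -> roughly 4.3 * 2 MiB = 8.6 MiB, as the
# comment states; 100 KiB segments would drop this to about 430 KiB.
```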
be created.
Encoding parameters can be set in three ways. 1: The Encoder class
- provides defaults (25/75/100). 2: the Encoder can be constructed with
+ provides defaults (3/7/10). 2: the Encoder can be constructed with
an 'options' dictionary, in which the
'needed_and_happy_and_total_shares' key can be a (k,d,n) tuple. 3:
set_params((k,d,n)) can be called.
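The three-way precedence can be sketched as follows. This is a simplified, hypothetical class for illustration, not the actual Encoder:

```python
class EncoderParams:
    """Sketch of the three ways to set encoding parameters: class defaults,
    an options dict at construction, and an explicit set_params() call."""
    DEFAULTS = (3, 7, 10)  # 1: (needed, happy, total) class defaults

    def __init__(self, options=None):
        options = options or {}  # avoid a shared mutable default argument
        # 2: an options dict at construction overrides the class defaults
        self.params = options.get("needed_and_happy_and_total_shares",
                                  self.DEFAULTS)

    def set_params(self, params):
        # 3: an explicit set_params() call overrides both of the above
        self.params = params
```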