From: Brian Warner Date: Mon, 23 Mar 2009 23:08:20 +0000 (-0700) Subject: move GC docs out of proposed/, since it's all implemented now. Add reference to confi... X-Git-Tag: allmydata-tahoe-1.4.0~35 X-Git-Url: https://git.rkrishnan.org/pf/content/en/footer/reliability?a=commitdiff_plain;h=b5d4972a7a28574762665f89bb119c51f54a3efa;p=tahoe-lafs%2Ftahoe-lafs.git move GC docs out of proposed/, since it's all implemented now. Add reference to configuration.txt . Add expire.*= suggestions to tahoe.cfg . --- diff --git a/docs/configuration.txt b/docs/configuration.txt index 6df08624..994f9ac9 100644 --- a/docs/configuration.txt +++ b/docs/configuration.txt @@ -310,6 +310,17 @@ reserved_space = (str, optional) "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing. +expire.enabled = +expire.mode = +expire.override_lease_duration = +expire.cutoff_date = +expire.immutable = +expire.mutable = + + These settings control garbage-collection, in which the server will delete + shares that no longer have an up-to-date lease on them. Please see the + neighboring "garbage-collection.txt" document for full details. + == Running A Helper == diff --git a/docs/garbage-collection.txt b/docs/garbage-collection.txt new file mode 100644 index 00000000..4cacadfb --- /dev/null +++ b/docs/garbage-collection.txt @@ -0,0 +1,280 @@ += Garbage Collection in Tahoe = + +When a file or directory in the virtual filesystem is no longer referenced, +the space that its shares occupied on each storage server can be freed, +making room for other shares. Tahoe currently uses a garbage collection +("GC") mechanism to implement this space-reclamation process. Each share has +one or more "leases", which are managed by clients who want the +file/directory to be retained. The storage server accepts each share for a +pre-defined period of time, and is allowed to delete the share if all of the +leases are cancelled or allowed to expire. + +Garbage collection is not enabled by default: storage servers will not delete +shares without being explicitly configured to do so. When GC is enabled, +clients are responsible for renewing their leases on a periodic basis at +least frequently enough to prevent any of the leases from expiring before the +next renewal pass. + +There are several tradeoffs to be considered when choosing the renewal timer +and the lease duration, and there is no single optimal pair of values. See +the "lease-tradeoffs.svg" diagram to get an idea for the tradeoffs involved. +If lease renewal occurs quickly and with 100% reliability, than any renewal +time that is shorter than the lease duration will suffice, but a larger ratio +of duration-over-renewal-time will be more robust in the face of occasional +delays or failures. + +The current recommended values for a small Tahoe grid are to renew the leases +once a week, and to give each lease a duration of 31 days. Renewing leases +can be expected to take about one second per file/directory, depending upon +the number of servers and the network speeds involved. Note that in the +current release, the server code enforces a 31 day lease duration: there is +not yet a way for the client to request a different duration (however the +server can use the "expire.override_lease_duration" configuration setting to +increase or decrease the effective duration to something other than 31 days). + +== Client-side Renewal == + +If all of the files and directories which you care about are reachable from a +single starting point (usually referred to as a "rootcap"), and you store +that rootcap as an alias (via "tahoe create-alias"), then the simplest way to +renew these leases is with the following CLI command: + + tahoe deep-check --add-lease ALIAS: + +This will recursively walk every directory under the given alias and renew +the leases on all files and directories. (You may want to add a --repair flag +to perform repair at the same time). Simply run this command once a week (or +whatever other renewal period your grid recommends) and make sure it +completes successfully. As a side effect, a manifest of all unique files and +directories will be emitted to stdout, as well as a summary of file sizes and +counts. It may be useful to track these statistics over time. + +Note that newly uploaded files (and newly created directories) get an initial +lease too: the --add-lease process is only needed to ensure that all older +objects have up-to-date leases on them. + +For larger systems (such as a commercial grid), a separate "maintenance +daemon" is under development. This daemon will acquire manifests from +rootcaps on a periodic basis, keep track of checker results, manage +lease-addition, and prioritize repair needs, using multiple worker nodes to +perform these jobs in parallel. Eventually, this daemon will be made +appropriate for use by individual users as well, and may be incorporated +directly into the client node. + +== Server Side Expiration == + +Expiration must be explicitly enabled on each storage server, since the +default behavior is to never expire shares. Expiration is enabled by adding +config keys to the "[storage]" section of the tahoe.cfg file (as described +below) and restarting the server node. + +Each lease has two parameters: a create/renew timestamp and a duration. The +timestamp is updated when the share is first uploaded (i.e. the file or +directory is created), and updated again each time the lease is renewed (i.e. +"tahoe check --add-lease" is performed). The duration is currently fixed at +31 days, and the "nominal lease expiration time" is simply $duration seconds +after the $create_renew timestamp. (In a future release of Tahoe, the client +will get to request a specific duration, and the server will accept or reject +the request depending upon its local configuration, so that servers can +achieve better control over their storage obligations). + +The lease-expiration code has two modes of operation. The first is age-based: +leases are expired when their age is greater than their duration. This is the +preferred mode: as long as clients consistently update their leases on a +periodic basis, and that period is shorter than the lease duration, then all +active files and directories will be preserved, and the garbage will +collected in a timely fashion. + +Since there is not yet a way for clients to request a lease duration of other +than 31 days, there is a tahoe.cfg setting to override the duration of all +leases. If, for example, this alternative duration is set to 60 days, then +clients could safely renew their leases with an add-lease operation perhaps +once every 50 days: even though nominally their leases would expire 31 days +after the renewal, the server would not actually expire the leases until 60 +days after renewal. + +The other mode is an absolute-date-cutoff: it compares the create/renew +timestamp against some absolute date, and expires any lease which was not +created or renewed since the cutoff date. If all clients have performed an +add-lease some time after March 20th, you could tell the storage server to +expire all leases that were created or last renewed on March 19th or earlier. +This is most useful if you have a manual (non-periodic) add-lease process. +Note that there is not much point to running a storage server in this mode +for a long period of time: once the lease-checker has examined all shares and +expired whatever it is going to expire, the second and subsequent passes are +not going to find any new leases to remove. + +The tahoe.cfg file uses the following keys to control lease expiration: + +[storage] + +expire.enabled = (boolean, optional) + + If this is True, the storage server will delete shares on which all leases + have expired. Other controls dictate when leases are considered to have + expired. The default is False. + +expire.mode = (string, "age" or "cutoff-date", required if expiration enabled) + + If this string is "age", the age-based expiration scheme is used, and the + "expire.override_lease_duration" setting can be provided to influence the + lease ages. If it is "cutoff-date", the absolute-date-cutoff mode is used, + and the "expire.cutoff_date" setting must be provided to specify the cutoff + date. The mode setting currently has no default: you must provide a value. + + In a future release, this setting is likely to default to "age", but in this + release it was deemed safer to require an explicit mode specification. + +expire.override_lease_duration = (duration string, optional) + + When age-based expiration is in use, a lease will be expired if its + "lease.create_renew" timestamp plus its "lease.duration" time is + earlier/older than the current time. This key, if present, overrides the + duration value for all leases, changing the algorithm from: + + if (lease.create_renew_timestamp + lease.duration) < now: + expire_lease() + + to: + + if (lease.create_renew_timestamp + override_lease_duration) < now: + expire_lease() + + The value of this setting is a "duration string", which is a number of days, + months, or years, followed by a units suffix, and optionally separated by a + space, such as one of the following: + + 7days + 31day + 60 days + 2mo + 3 month + 12 months + 2years + + This key is meant to compensate for the fact that clients do not yet have + the ability to ask for leases that last longer than 31 days. A grid which + wants to use faster or slower GC than a 31-day lease timer permits can use + this parameter to implement it. The current fixed 31-day lease duration + makes the server behave as if "lease.override_lease_duration = 31days" had + been passed. + + This key is only valid when age-based expiration is in use (i.e. when + "expire.mode = age" is used). It will be rejected if cutoff-date expiration + is in use. + +expire.cutoff_date = (date string, required if mode=cutoff-date) + + When cutoff-date expiration is in use, a lease will be expired if its + create/renew timestamp is older than the cutoff date. This string will be a + date in the following format: + + 2009-01-16 (January 16th, 2009) + 2008-02-02 + 2007-12-25 + + The actual cutoff time shall be midnight UTC at the beginning of the given + day. Lease timers should naturally be generous enough to not depend upon + differences in timezone: there should be at least a few days between the + last renewal time and the cutoff date. + + This key is only valid when cutoff-based expiration is in use (i.e. when + "expire.mode = cutoff-date"). It will be rejected if age-based expiration is + in use. + +expire.immutable = (boolean, optional) + + If this is False, then immutable shares will never be deleted, even if their + leases have expired. This can be used in special situations to perform GC on + mutable files but not immutable ones. The default is True. + +expire.mutable = (boolean, optional) + + If this is False, then mutable shares will never be deleted, even if their + leases have expired. This can be used in special situations to perform GC on + immutable files but not mutable ones. The default is True. + +== Expiration Progress == + +In the current release, leases are stored as metadata in each share file, and +no separate database is maintained. As a result, checking and expiring leases +on a large server may require multiple reads from each of several million +share files. This process can take a long time and be very disk-intensive, so +a "share crawler" is used. The crawler limits the amount of time looking at +shares to a reasonable percentage of the storage server's overall usage: by +default it uses no more than 10% CPU, and yields to other code after 100ms. A +typical server with 1.1M shares was observed to take 3.5 days to perform this +rate-limited crawl through the whole set of shares, with expiration disabled. +It is expected to take perhaps 4 or 5 days to do the crawl with expiration +turned on. + +The crawler's status is displayed on the "Storage Server Status Page", a web +page dedicated to the storage server. This page resides at $NODEURL/storage, +and there is a link to it from the front "welcome" page. The "Lease +Expiration crawler" section of the status page shows the progress of the +current crawler cycle, expected completion time, amount of space recovered, +and details of how many shares have been examined. + +The crawler's state is persistent: restarting the node will not cause it to +lose significant progress. The state file is located in two files +($BASEDIR/storage/lease_checker.state and lease_checker.history), and the +crawler can be forcibly reset by stopping the node, deleting these two files, +then restarting the node. + +== Future Directions == + +Tahoe's GC mechanism is undergoing significant changes. The global +mark-and-sweep garbage-collection scheme can require considerable network +traffic for large grids, interfering with the bandwidth available for regular +uploads and downloads (and for non-Tahoe users of the network). + +A preferable method might be to have a timer-per-client instead of a +timer-per-lease: the leases would not be expired until/unless the client had +not checked in with the server for a pre-determined duration. This would +reduce the network traffic considerably (one message per week instead of +thousands), but retain the same general failure characteristics. + +In addition, using timers is not fail-safe (from the client's point of view), +in that a client which leaves the network for an extended period of time may +return to discover that all of their files have been garbage-collected. (It +*is* fail-safe from the server's point of view, in that a server is not +obligated to provide disk space in perpetuity to an unresponsive client). It +may be useful to create a "renewal agent" to which a client can pass a list +of renewal-caps: the agent then takes the responsibility for keeping these +leases renewed, so the client can go offline safely. Of course, this requires +a certain amount of coordination: the renewal agent should not be keeping +files alive that the client has actually deleted. The client can send the +renewal-agent a manifest of renewal caps, and each new manifest should +replace the previous set. + +The GC mechanism is also not immediate: a client which deletes a file will +nevertheless be consuming extra disk space (and might be charged or otherwise +held accountable for it) until the ex-file's leases finally expire on their +own. If the client is certain that they've removed their last reference to +the file, they could accelerate the GC process by cancelling their lease. The +current storage server API provides a method to cancel a lease, but the +client must be careful to coordinate with anyone else who might be +referencing the same lease (perhaps a second directory in the same virtual +drive), otherwise they might accidentally remove a lease that should have +been retained. + +In the current release, these leases are each associated with a single "node +secret" (stored in $BASEDIR/private/secret), which is used to generate +renewal- and cancel- secrets for each lease. Two nodes with different secrets +will produce separate leases, and will not be able to renew or cancel each +others' leases. + +Once the Accounting project is in place, leases will be scoped by a +sub-delegatable "account id" instead of a node secret, so clients will be able +to manage multiple leases per file. In addition, servers will be able to +identify which shares are leased by which clients, so that clients can safely +reconcile their idea of which files/directories are active against the +server's list, and explicitly cancel leases on objects that aren't on the +active list. + +By reducing the size of the "lease scope", the coordination problem is made +easier. In general, mark-and-sweep is easier to implement (it requires mere +vigilance, rather than coordination), so unless the space used by deleted +files is not expiring fast enough, the renew/expire timed lease approach is +recommended. + diff --git a/docs/proposed/garbage-collection.txt b/docs/proposed/garbage-collection.txt deleted file mode 100644 index 4cacadfb..00000000 --- a/docs/proposed/garbage-collection.txt +++ /dev/null @@ -1,280 +0,0 @@ -= Garbage Collection in Tahoe = - -When a file or directory in the virtual filesystem is no longer referenced, -the space that its shares occupied on each storage server can be freed, -making room for other shares. Tahoe currently uses a garbage collection -("GC") mechanism to implement this space-reclamation process. Each share has -one or more "leases", which are managed by clients who want the -file/directory to be retained. The storage server accepts each share for a -pre-defined period of time, and is allowed to delete the share if all of the -leases are cancelled or allowed to expire. - -Garbage collection is not enabled by default: storage servers will not delete -shares without being explicitly configured to do so. When GC is enabled, -clients are responsible for renewing their leases on a periodic basis at -least frequently enough to prevent any of the leases from expiring before the -next renewal pass. - -There are several tradeoffs to be considered when choosing the renewal timer -and the lease duration, and there is no single optimal pair of values. See -the "lease-tradeoffs.svg" diagram to get an idea for the tradeoffs involved. -If lease renewal occurs quickly and with 100% reliability, than any renewal -time that is shorter than the lease duration will suffice, but a larger ratio -of duration-over-renewal-time will be more robust in the face of occasional -delays or failures. - -The current recommended values for a small Tahoe grid are to renew the leases -once a week, and to give each lease a duration of 31 days. Renewing leases -can be expected to take about one second per file/directory, depending upon -the number of servers and the network speeds involved. Note that in the -current release, the server code enforces a 31 day lease duration: there is -not yet a way for the client to request a different duration (however the -server can use the "expire.override_lease_duration" configuration setting to -increase or decrease the effective duration to something other than 31 days). - -== Client-side Renewal == - -If all of the files and directories which you care about are reachable from a -single starting point (usually referred to as a "rootcap"), and you store -that rootcap as an alias (via "tahoe create-alias"), then the simplest way to -renew these leases is with the following CLI command: - - tahoe deep-check --add-lease ALIAS: - -This will recursively walk every directory under the given alias and renew -the leases on all files and directories. (You may want to add a --repair flag -to perform repair at the same time). Simply run this command once a week (or -whatever other renewal period your grid recommends) and make sure it -completes successfully. As a side effect, a manifest of all unique files and -directories will be emitted to stdout, as well as a summary of file sizes and -counts. It may be useful to track these statistics over time. - -Note that newly uploaded files (and newly created directories) get an initial -lease too: the --add-lease process is only needed to ensure that all older -objects have up-to-date leases on them. - -For larger systems (such as a commercial grid), a separate "maintenance -daemon" is under development. This daemon will acquire manifests from -rootcaps on a periodic basis, keep track of checker results, manage -lease-addition, and prioritize repair needs, using multiple worker nodes to -perform these jobs in parallel. Eventually, this daemon will be made -appropriate for use by individual users as well, and may be incorporated -directly into the client node. - -== Server Side Expiration == - -Expiration must be explicitly enabled on each storage server, since the -default behavior is to never expire shares. Expiration is enabled by adding -config keys to the "[storage]" section of the tahoe.cfg file (as described -below) and restarting the server node. - -Each lease has two parameters: a create/renew timestamp and a duration. The -timestamp is updated when the share is first uploaded (i.e. the file or -directory is created), and updated again each time the lease is renewed (i.e. -"tahoe check --add-lease" is performed). The duration is currently fixed at -31 days, and the "nominal lease expiration time" is simply $duration seconds -after the $create_renew timestamp. (In a future release of Tahoe, the client -will get to request a specific duration, and the server will accept or reject -the request depending upon its local configuration, so that servers can -achieve better control over their storage obligations). - -The lease-expiration code has two modes of operation. The first is age-based: -leases are expired when their age is greater than their duration. This is the -preferred mode: as long as clients consistently update their leases on a -periodic basis, and that period is shorter than the lease duration, then all -active files and directories will be preserved, and the garbage will -collected in a timely fashion. - -Since there is not yet a way for clients to request a lease duration of other -than 31 days, there is a tahoe.cfg setting to override the duration of all -leases. If, for example, this alternative duration is set to 60 days, then -clients could safely renew their leases with an add-lease operation perhaps -once every 50 days: even though nominally their leases would expire 31 days -after the renewal, the server would not actually expire the leases until 60 -days after renewal. - -The other mode is an absolute-date-cutoff: it compares the create/renew -timestamp against some absolute date, and expires any lease which was not -created or renewed since the cutoff date. If all clients have performed an -add-lease some time after March 20th, you could tell the storage server to -expire all leases that were created or last renewed on March 19th or earlier. -This is most useful if you have a manual (non-periodic) add-lease process. -Note that there is not much point to running a storage server in this mode -for a long period of time: once the lease-checker has examined all shares and -expired whatever it is going to expire, the second and subsequent passes are -not going to find any new leases to remove. - -The tahoe.cfg file uses the following keys to control lease expiration: - -[storage] - -expire.enabled = (boolean, optional) - - If this is True, the storage server will delete shares on which all leases - have expired. Other controls dictate when leases are considered to have - expired. The default is False. - -expire.mode = (string, "age" or "cutoff-date", required if expiration enabled) - - If this string is "age", the age-based expiration scheme is used, and the - "expire.override_lease_duration" setting can be provided to influence the - lease ages. If it is "cutoff-date", the absolute-date-cutoff mode is used, - and the "expire.cutoff_date" setting must be provided to specify the cutoff - date. The mode setting currently has no default: you must provide a value. - - In a future release, this setting is likely to default to "age", but in this - release it was deemed safer to require an explicit mode specification. - -expire.override_lease_duration = (duration string, optional) - - When age-based expiration is in use, a lease will be expired if its - "lease.create_renew" timestamp plus its "lease.duration" time is - earlier/older than the current time. This key, if present, overrides the - duration value for all leases, changing the algorithm from: - - if (lease.create_renew_timestamp + lease.duration) < now: - expire_lease() - - to: - - if (lease.create_renew_timestamp + override_lease_duration) < now: - expire_lease() - - The value of this setting is a "duration string", which is a number of days, - months, or years, followed by a units suffix, and optionally separated by a - space, such as one of the following: - - 7days - 31day - 60 days - 2mo - 3 month - 12 months - 2years - - This key is meant to compensate for the fact that clients do not yet have - the ability to ask for leases that last longer than 31 days. A grid which - wants to use faster or slower GC than a 31-day lease timer permits can use - this parameter to implement it. The current fixed 31-day lease duration - makes the server behave as if "lease.override_lease_duration = 31days" had - been passed. - - This key is only valid when age-based expiration is in use (i.e. when - "expire.mode = age" is used). It will be rejected if cutoff-date expiration - is in use. - -expire.cutoff_date = (date string, required if mode=cutoff-date) - - When cutoff-date expiration is in use, a lease will be expired if its - create/renew timestamp is older than the cutoff date. This string will be a - date in the following format: - - 2009-01-16 (January 16th, 2009) - 2008-02-02 - 2007-12-25 - - The actual cutoff time shall be midnight UTC at the beginning of the given - day. Lease timers should naturally be generous enough to not depend upon - differences in timezone: there should be at least a few days between the - last renewal time and the cutoff date. - - This key is only valid when cutoff-based expiration is in use (i.e. when - "expire.mode = cutoff-date"). It will be rejected if age-based expiration is - in use. - -expire.immutable = (boolean, optional) - - If this is False, then immutable shares will never be deleted, even if their - leases have expired. This can be used in special situations to perform GC on - mutable files but not immutable ones. The default is True. - -expire.mutable = (boolean, optional) - - If this is False, then mutable shares will never be deleted, even if their - leases have expired. This can be used in special situations to perform GC on - immutable files but not mutable ones. The default is True. - -== Expiration Progress == - -In the current release, leases are stored as metadata in each share file, and -no separate database is maintained. As a result, checking and expiring leases -on a large server may require multiple reads from each of several million -share files. This process can take a long time and be very disk-intensive, so -a "share crawler" is used. The crawler limits the amount of time looking at -shares to a reasonable percentage of the storage server's overall usage: by -default it uses no more than 10% CPU, and yields to other code after 100ms. A -typical server with 1.1M shares was observed to take 3.5 days to perform this -rate-limited crawl through the whole set of shares, with expiration disabled. -It is expected to take perhaps 4 or 5 days to do the crawl with expiration -turned on. - -The crawler's status is displayed on the "Storage Server Status Page", a web -page dedicated to the storage server. This page resides at $NODEURL/storage, -and there is a link to it from the front "welcome" page. The "Lease -Expiration crawler" section of the status page shows the progress of the -current crawler cycle, expected completion time, amount of space recovered, -and details of how many shares have been examined. - -The crawler's state is persistent: restarting the node will not cause it to -lose significant progress. The state file is located in two files -($BASEDIR/storage/lease_checker.state and lease_checker.history), and the -crawler can be forcibly reset by stopping the node, deleting these two files, -then restarting the node. - -== Future Directions == - -Tahoe's GC mechanism is undergoing significant changes. The global -mark-and-sweep garbage-collection scheme can require considerable network -traffic for large grids, interfering with the bandwidth available for regular -uploads and downloads (and for non-Tahoe users of the network). - -A preferable method might be to have a timer-per-client instead of a -timer-per-lease: the leases would not be expired until/unless the client had -not checked in with the server for a pre-determined duration. This would -reduce the network traffic considerably (one message per week instead of -thousands), but retain the same general failure characteristics. - -In addition, using timers is not fail-safe (from the client's point of view), -in that a client which leaves the network for an extended period of time may -return to discover that all of their files have been garbage-collected. (It -*is* fail-safe from the server's point of view, in that a server is not -obligated to provide disk space in perpetuity to an unresponsive client). It -may be useful to create a "renewal agent" to which a client can pass a list -of renewal-caps: the agent then takes the responsibility for keeping these -leases renewed, so the client can go offline safely. Of course, this requires -a certain amount of coordination: the renewal agent should not be keeping -files alive that the client has actually deleted. The client can send the -renewal-agent a manifest of renewal caps, and each new manifest should -replace the previous set. - -The GC mechanism is also not immediate: a client which deletes a file will -nevertheless be consuming extra disk space (and might be charged or otherwise -held accountable for it) until the ex-file's leases finally expire on their -own. If the client is certain that they've removed their last reference to -the file, they could accelerate the GC process by cancelling their lease. The -current storage server API provides a method to cancel a lease, but the -client must be careful to coordinate with anyone else who might be -referencing the same lease (perhaps a second directory in the same virtual -drive), otherwise they might accidentally remove a lease that should have -been retained. - -In the current release, these leases are each associated with a single "node -secret" (stored in $BASEDIR/private/secret), which is used to generate -renewal- and cancel- secrets for each lease. Two nodes with different secrets -will produce separate leases, and will not be able to renew or cancel each -others' leases. - -Once the Accounting project is in place, leases will be scoped by a -sub-delegatable "account id" instead of a node secret, so clients will be able -to manage multiple leases per file. In addition, servers will be able to -identify which shares are leased by which clients, so that clients can safely -reconcile their idea of which files/directories are active against the -server's list, and explicitly cancel leases on objects that aren't on the -active list. - -By reducing the size of the "lease scope", the coordination problem is made -easier. In general, mark-and-sweep is easier to implement (it requires mere -vigilance, rather than coordination), so unless the space used by deleted -files is not expiring fast enough, the renew/expire timed lease approach is -recommended. - diff --git a/src/allmydata/scripts/create_node.py b/src/allmydata/scripts/create_node.py index b1d1c86c..7a20ded1 100644 --- a/src/allmydata/scripts/create_node.py +++ b/src/allmydata/scripts/create_node.py @@ -115,6 +115,8 @@ def create_client(basedir, config, out=sys.stdout, err=sys.stderr): c.write("enabled = %s\n" % boolstr[storage_enabled]) c.write("#readonly =\n") c.write("#reserved_space =\n") + c.write("#expire.enabled =\n") + c.write("#expire.mode =\n") c.write("\n") c.write("[helper]\n")