Given a list of nodeids and a 'convergence' file, create a bunch of files
that will (when encoded at k=1, N=1) be uploaded to specific nodeids.

Run it as follows:

 make-canary-files.py -c PATH/TO/convergence -n PATH/TO/nodeids -k 1 -N 1

It will create a directory named 'canaries', with one file per nodeid named
'$NODEID-$NICKNAME.txt', each containing some random text.
The 'nodeids' file should contain one base32 nodeid per line, followed by an
optional nickname, like:

 5yyqu2hbvbh3rgtsgxrmmg4g77b6p3yo server12
 vb7vm2mneyid5jbyvcbk2wb5icdhwtun server13
The resulting 'canaries/5yyqu2hbvbh3rgtsgxrmmg4g77b6p3yo-server12.txt' file
will, when uploaded with the given (convergence,k,N) pair, have its first
share placed on the 5yyq/server12 storage server. If N>1, the other shares
will of course be placed elsewhere.
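Placement follows Tahoe's permuted-ring rule: servers are sorted by the hash
of the storage index concatenated with each nodeid, and the first share goes
to the first server in that order. Here is a minimal sketch of just that
ordering rule, using plain hashlib in place of the Tahoe internals (the
nodeids and storage indexes below are made-up examples):

```python
import hashlib

def permuted_order(storage_index, nodeids):
    # sort servers by sha1(SI + nodeid); the first entry receives share 0
    return sorted(nodeids, key=lambda n: hashlib.sha1(storage_index + n).digest())

nodes = [b"server-a", b"server-b", b"server-c"]
# different storage indexes generally produce different orderings, which is
# why varying a file's random contents moves its first share around the grid
print(permuted_order(b"si-one", nodes))
print(permuted_order(b"si-two", nodes))
```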
This tool can be used to construct a set of 'canary' files, which can then
be uploaded to storage servers and later downloaded to test a grid's health.
If you are able to download the canary for server12 via some tahoe node X,
then the following properties are known to be true:

 node X is running, and has established a connection to server12
 server12 is running, and returning data for at least the given file
Using k=1/N=1 creates a separate test for each server. The test process is
then to download the whole directory of files (perhaps with a t=deep-check
operation).
Alternatively, you could upload with the usual k=3/N=10 and then move/delete
shares to put all N shares on a single server.
Note that any changes to the nodeid list will affect the placement of shares.
Shares should be uploaded with the same nodeid list as this tool used when
constructing the files.
Also note that this tool uses the Tahoe codebase, so it should be run on a
system where Tahoe is installed, or in a source tree with setup.py, like this:

 setup.py run_with_pythonpath -p -c 'misc/make-canary-files.py ARGS..'
import os, hashlib

from twisted.python import usage
from allmydata.immutable import upload
from allmydata.util import base32


class Options(usage.Options):
    optParameters = [
        ("convergence", "c", None, "path to NODEDIR/private/convergence"),
        ("nodeids", "n", None, "path to file with one base32 nodeid per line"),
        ("k", "k", 1, "number of necessary shares, defaults to 1", int),
        ("N", "N", 1, "number of total shares, defaults to 1", int),
        ]
    optFlags = [
        ("verbose", "v", "Be noisy"),
        ]

opts = Options()
opts.parseOptions()
verbose = bool(opts["verbose"])

nodes = {}
for line in open(opts["nodeids"], "r").readlines():
    line = line.strip()
    if not line or line.startswith("#"):
        continue
    pieces = line.split(None, 1)
    if len(pieces) == 1:
        # the nickname is optional
        pieces = (pieces[0], None)
    nodeid_s, nickname = pieces
    nodeid = base32.a2b(nodeid_s)
    nodes[nodeid] = nickname
if opts["k"] != 3 or opts["N"] != 10:
    print "note: using non-default k/N requires patching the Tahoe code"
    print "src/allmydata/client.py line 55, DEFAULT_ENCODING_PARAMETERS"
convergence_file = os.path.expanduser(opts["convergence"])
convergence_s = open(convergence_file, "rb").read().strip()
convergence = base32.a2b(convergence_s)
def get_permuted_peers(key):
    # permute the serverlist the same way an upload would: sort the known
    # nodeids by the hash of (storage_index + nodeid)
    results = []
    for nodeid in nodes:
        permuted = hashlib.sha1(key + nodeid).digest()
        results.append((permuted, nodeid))
    results.sort()
    return [r[1] for r in results]
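The search performed below is plain rejection sampling: keep drawing random
suffixes until the target server happens to sort first in the permuted list.
A self-contained sketch of the same idea, with hashlib standing in for the
real storage-index computation (the hash choice and server names here are
illustrative, not Tahoe's actual formats):

```python
import hashlib, os

def find_suffix_for_target(target, nodeids, max_tries=100000):
    # try random suffixes until `target` sorts first in the permuted list;
    # the expected number of tries is about len(nodeids)
    for attempt in range(1, max_tries + 1):
        suffix = os.urandom(10)
        si = hashlib.sha256(suffix).digest()  # stand-in storage index
        first = min(nodeids, key=lambda n: hashlib.sha1(si + n).digest())
        if first == target:
            return suffix, attempt
    raise RuntimeError("gave up after %d tries" % max_tries)

servers = [b"server-a", b"server-b", b"server-c", b"server-d"]
suffix, tries = find_suffix_for_target(servers[0], servers)
print(tries)  # usually a handful, on the order of len(servers)
```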
def find_share_for_target(target):
    target_s = base32.b2a(target)
    prefix = "The first share of this file will be placed on " + target_s + "\n"
    prefix += "This data is random: "
    attempts = 0
    while True:
        attempts += 1
        suffix = base32.b2a(os.urandom(10))
        if verbose: print " trying", suffix,
        data = prefix + suffix + "\n"
        assert len(data) > 55 # large enough to avoid LIT files
        # now, what storage index will this get?
        u = upload.Data(data, convergence)
        eu = upload.EncryptAnUploadable(u)
        d = eu.get_storage_index() # this happens to run synchronously
        def _got_si(si, data=data):
            if verbose: print "SI", base32.b2a(si),
            peerlist = get_permuted_peers(si)
            if peerlist[0] == target:
                # this file's first share lands on the target server
                if verbose: print " yay!"
                fn = base32.b2a(target)
                if nodes[target]:
                    nickname = nodes[target].replace("/", "_")
                    fn += "-" + nickname
                fn += ".txt"
                fn = os.path.join("canaries", fn)
                open(fn, "w").write(data)
                return True
            # nope, must try again
            if verbose: print " boo"
            return False
        d.addCallback(_got_si)
        # get sneaky and look inside the Deferred for the synchronous result
        if d.result:
            return attempts
os.mkdir("canaries")
attempts = []
for target in nodes:
    target_s = base32.b2a(target)
    print "working on", target_s
    attempts.append(find_share_for_target(target))
print "%d attempts total, avg %.1f per target, max %d" % \
      (sum(attempts), 1.0 * sum(attempts) / len(nodes), max(attempts))
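# A note on the statistics printed above: with S servers and a well-mixed
# hash, each random suffix puts the target first with probability about 1/S,
# so attempts-per-target follows a geometric distribution with mean S. A
# quick seeded simulation of that expectation (S here is an arbitrary
# example size, not taken from any real grid):

```python
import random

random.seed(1234)  # fixed seed so the run is repeatable
S = 20  # hypothetical number of servers

def one_search():
    # count trials until a 1-in-S event occurs (Geometric, mean S)
    attempts = 1
    while random.randrange(S) != 0:
        attempts += 1
    return attempts

samples = [one_search() for _ in range(2000)]
avg = sum(samples) / len(samples)
print(avg)  # close to S
```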